Introduction to Programming 2025/2026 (CC1024) - DCC/FCUP

Practical Class #06 - Correction and Efficiency (03/11 to 07/11)

Exercises for submission

In what concerns the "exercises during classes" component of the evaluation, the exercises that you can submit for this class are:

[IP059] Cinnamon Jumping Game (weight 25%) (main exercise)
[IP060] Experimental Stability (weight 25%) (main exercise)
[IP061] Wallet Watcher (weight 25%) (main exercise)
[IP062] Rocket Pair Protocol (weight 25%) (main exercise)
[IP063] Neural Activity (weight 25%) (extra exercise)
[IP064] Comic Dilemma (weight 25%) (extra exercise)
[IP065] Bacterial Cultures (weight 25%) (challenge exercise)
[IP066] Alphabetical Obsession (weight 25%) (challenge exercise)

Deadline for submission: November 22nd (submit to IP's Mooshak)

You are encouraged to talk to the professors and your colleagues if you encounter difficulties. However, any more direct help that you have received from other colleagues or LLMs should be acknowledged in the comments of the code you submit.
After the deadline the problems will still be available on Mooshak, but the submissions will not count towards your grade.
Each practical class is worth 10% of your final component for exercices during class. Since there will be 11 classes with submissions, you can achieve maximum grade even without doing all classes.
For a problem to count you must pass all tests (that is, to have an accepted). Even if you solve all problems, the maximum on one class is 100%.
To obtain 100% it will always be enough to solve the main exercises.

With the exercises in this class you will develop the following skills:

Explore the notion of correctness in algorithms
Understand some of the most common errors of beginner coders
Explore the notion of efficiency in algorithms
Learn some new algorithmic techniques such as sliding windows and prefix sums
Gain insight into the efficiency of some Python data structures such as sets
Work on your problem solving skills

1) About Correction

The first set of exercises will be a bit different from usual. We will give some wrong code and will ask you to point out what is wrong and correct it, to expose you to some common types of errors. Some of these pieces of code are actual answers from students who thought their code was correct... but it wasn't 😺

We know that Python allows other types of solutions for some of the exercises, but for the sake of understanding, try to just do a minimal change to the code to make it correct and keep the core ideas of the algorithms in there.

You should first try to solve by yourself but afterwards you can (and should) check the answer provided.

The following function should write all integers between \(a\) and \(b\), inclusive.
The coder had the following idea: "I just need to iterate between \(a\) and \(b\) and print the values"
Can you see what is wrong? Can you guess what a call to all_numbers(2,6) would output? How can you correct it?
```
def all_numbers(a, b):
    for i in range(a, b):
        print(i)
```
Answer

- The range instruction goes from \(a\) to \(b-1\). It should be range(a, b + 1)
- This is known as an "off-by-one" error
The following function should return True if the given string is a palindrome and False otherwise. A palindrome is a word that reads the same backwards or forwards. For instance "abba" and "madam" are palindromes.
The coder had the following idea: "I just need to compare the first with the last character, the second with the penultimate, etc. If any of these pairs are different, then I can safely return False. If every pair I compared the same, I have a palindrome!"
Can you see what is wrong? Can you guess what a call to palindrome("abba") would result in? How can you correct it?
```
def palindrome(s):
    for i in range(len(s)):
        if s[i] != s[len(s) - i]:
            return False
    return True
      
```
Answer

- If the string \(s\) has length \(n\), this is comparing \(s[0]\) with \(s[n]\), \(s[1]\) with \(s[n-1]\), \(s[2]\) with \(s[n-2]\), etc.
- The problem is that the positions of the string are \(0, 1, 2, \ldots, n-1\), so position \(n\) does not exist and this would result in a IndexError: string index out of range
- We should actually compare \(s[0]\) with \(s[n-1]\), \(s[1]\) with \(s[n-2]\), \(s[2]\) with \(s[n-3]\), etc
- We can correct by changing the comparison line to if s[i] != s[len(s) - i - 1]:
- This is again an example of an "off-by-one" error
- We could also only traverse to half of the length of the string (can you see why?) but as it is, the function would still work (only doing double the amount of actual comparisons needed)
The following function should return the amount of even numbers in a given list.
The coder had the following idea: "I just need to keep a variable with the amount of even numbers already seen, iterate the list and increment each time I see another even number, returning the amount in the end."
Can you see what is wrong? Can you guess what a call to count_even([1,2,3,4,5,6,7]) would produce? How can you correct it?
```
def count_even(lst):
    for x in lst:
        if x % 2 == 0:
            count += 1
    return count
```
Answer

- The call would result in a UnboundLocalError: local variable 'count' referenced before assignment since the \(count\) variable was not assigned before being used
- We can correct by adding count = 0 before the for
- This is an example of an initialization error
The following function should return the maximum value of a list of integers.
The coder had the following idea: "I just need to keep a variable with current maximum, change it if a value bigger than the previous maximum exists, returning that maximum value in the end."
What happens if the values in the list are all negative? Can you guess what a call to mymax([-1, -2, -3, -1]) would return? How can you correct it?
```
def mymax(lst):
    answer = 0
    for x in lst:
        if x > answer:
            answer = x
    return answer
```
Answer

- Since \(answer\) is initialized to zero, this would be the value returned when all the elements are negative
- We can correct by adding changing that line to answer = lst[0], as the first element is our first guess for the maximum
- You might ask: what if the list is empty? In that case, the maximum is not defined, and the statement would need to define what it wants for the empty list case
- If you know the range of the integer in the list is \([a,b]\), you could also initialize to any value smaller than \(a\) but notice that our previous solution is much more generic and would work also for finding other values (e.g. minimum) and/or when you don't even know the range beforehand
- This is another example of an initialization error and one of the reasons why you should always pay attention to the limits of the variables given in the problem
The following function should execute a right shift on the given list, that is, rotate all the elements to the right (e.g. [2,4,6,8] should become [8,2,4,6]). It should not return anything, but actually change the list itself.
The coder had the following idea: "I create a new list called answer, position all elements shifted to the right on it and then I just assign the new list to the old list."
Can you guess what running this code would print? Why isn't the list changed outside the function? What is the error? How can you correct it?
```
def shift_right(lst):
    answer = [lst[-1]]
    for i in range(1, len(lst)):
        answer.append(lst[i-1])
    lst = answer
    print("lst inside function:", lst)

# Example usage
lst = [2,4,6,8]
shift_right(lst)
print("lst outside function:", lst)
```
Answer

- \(answer\) does have the desired answer, but it is a new list
- \(lst\) is just a function local variable that "references"/"points" to the actual list
- The assignment instruction lst = answer does not copy the contents of \(answer\) to \(lst\), but simply points the local variable \(lst\) to the new list (keeping the old list intact)
- If we actually wanted to copy to the old list, instead of the assignment we could for instance empty the old list (lst.clear()) and then copy the contents (lst.extend(answer) would extend \(lst\) with all the elements of \(answer\)
- Note that this would actually need to iterate through all the elements one by one and "copy" them to the old list. We could have actually changed the list directly, as we will see on the next exercise.
The function for this exercise has the same goal as the previous one, that is, execute a right shift on the given list, changing the list itself.
Now the coder wants to directly change the list and has the following idea: "I first store the last element (that I will later put on the first position) and then every element becomes the element that was on the previous position."
Can you guess what running this code would print? What is the error? How can you correct it?
```
def shift_right(lst):
    last = lst[-1]
    for i in range(1, len(lst)):
        lst[i] = lst[i-1]
    lst[0] = last

# Example usage
lst = [2,4,6,8]
shift_right(lst)
print(lst)
```
Answer

- It would print [8, 2, 2, 2]
- This is a problem with the order of operations
- The first element would be copied to the second position; afterwards it would be copied to the third position and so on
- We can correct it by simply changing the order in which the assignments are made; for example, we could change the for to for i in range(len(lst)-1, 0, -1): to go from the back to the front
The following function should return True if all the elements of the list are positive or False otherwise.
The coder had the following idea: "I iterate all the elements to check if the elements are positive and return True if the are or False otherwise."
Can you think about what kind of lists would make this code fail? Can you correct it?
```
def all_positive(lst):
    for x in lst:
        if x>0:
            return True
    return False
```
Answer

- This is a problem with logic of the algorithm
- Since the code returns as soon as it finds a positive number, it would return True for all lists with at least one positive number (even if the others are not)
- We can correct it by "inverting" the logic: if we find any non-positive number, we can safely return False and in the end, if nothing was non-positive, then it means everything is positive and therefore we can return True

Wrapping up:

There are of course many more possible errors in the correction, but this alerts you to some common mistakes when you are starting your journey on algorithms and coding:

off-by-one errors are bugs in your algorithm when some number differs from its intended value by 1. For example, it can be because you forgot to reach the last element, or you go one element beyond the end of your data structure. Always be careful with the border cases and what happens with the first/last elements.
initialization errors are bugs in which you forget to initialize something, or initialize it wrongly. If you use any variables to contains some kind of answer that you keep changing, always think about what a suitable value should it have at the start and pay attention to the constraints that are given on the problem statement that limit the possible input values.
order of operation errors are bugs in which you need to take care in about the order in which you execute the instructions, because they affect the instructions that come afterwards.
logic errors are bugs in which your logic is mistaken. Maybe you have an and when you should have a or or vice-versa, or maybe you forgot about some detail.
There are a myriad of ways in which your code might fail for some specific king of input, so always test your code and try to think about harder instances that could be counter-examples for your algorithm.
Remember that in Python, function arguments are always "passed by assignment", so inside a function your local variables are references to the object passed. If the object is mutable, you can change its contents by interacting with it, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.

Now let's go back to a more "traditional" class with some exercises that you can submit on Mooshak (however with a bit more guidance from your side than usual).

This week's theme for the exercises is The Big Bang Theory.

2) Ok, you have correct code, but is it efficient?

Start by carefully reading the following problem and make sure you fully understand what is asked:

[IP059] Cinnamon Jumping Game

A first naive solution

At first glance, you can simply just simulate the jumping game, and a correct code could look like this:

def jump(k, a, b):
    pos = 0
    for i in range(k):
        if i%2 == 0: pos += a
        else: pos -= b
    return pos

This actually solves the example function calls and all "small" inputs. Save this code to a file and submit to Mooshak. You will get a... Time Limit Exceeded.

What is the problem with this code? The execution time it takes depends on the \(k\), as we are doing a for cycle with \(k\) iterations. More than that, it takes time that is in linear proportion to \(k\), that is, if you double the number \(k\), you will need double the iterations and your code will roughly take double the execution time (you will later learn on other courses that this is a \(\mathcal{O}(k)\) algorithm, roughly meaning that it takes linear time on \(k\)). And in this problem \(k\) can be as large as \(10^9\), which is too high for this kind of solution...

A better efficient solution

So how can you solve this? Think a bit about the problem and try to think on a solution that does not depend on \(k\). If you get stuck you can have a look at the hints below, that is hidden so it is not spoiling your joy in solving the problem ☺️.
After you solve, test your solution and get your deserved Accepted for a solution that takes constant time and does not depend on \(k\) (what you will learn later to be a \(\mathcal{O}(1)\) solution)

Hints

- Instead of doing \(+a -b\) for \(k/2\) times, you can simply multiply \(a-b\) by \(k/2\), right?
- You just need to be careful about \(k\) being odd or even, to that you know if the last iteration is on \(+a\) or \(-b\)

3) Now for something algorithmically more interesting (sliding windows)

Start by carefully reading the following problem and make sure you fully understand what is asked:

[IP060] Experimental Stability

A first naive solution

Let's again think about a straight forward (naive) solution, that tries all possible sequences of size \(k\) and for each of them counts the number of positive values:

def stable(k, lst):
    best = 0
    for i in range(len(lst) - k + 1):
        count = 0
        for v in lst[i: i+k]:
            if v > 0:
                count += 1
        best = max(best, count)
    return best

Start by understanding the provided code:

The outer cycle goes through all possible "start" positions of a \(k\)-segment
The inner cycle traverses one \(k\)-segment
\(best\) represents the maximum number of positive values on a \(k\)-segment until now
\(count\) represents the number of positive values on the current \(k\)-segment
Why to we use range(len(lst) - k + 1). Until what position can a \(segment\) start?
What does the best = max(best, count) do? We could have also just used an if, right?

After you are sure you understood, you can save to file and submit to Mooshak. You will get a... Time Limit Exceeded (TLE), yet again 😇.

Why is this happening? Now it might not be so clear why we are getting (TLE)...

Measuring execution time

For the purposes of learning, we will do a more in-depth analysis. First, we will define a function to create random lists on demand on \(n\) integers on the range \([a,b]\):
(you can remember about pseudo-randomness and the random library by looking at lecture T11)

import random

# generate a random list of n integers in the range [a,b]
def random_list(n, a, b):
    lst = []
    for _ in range(n):
        lst.append(random.randint(a,b))
    return lst

Try it a little bit to get a hang of how it works. Here are two example calls to generate 10 numbers between 1 and 9. Notice how each call will produce a different list:

>>> random_list(10, 1, 9)
[9, 5, 9, 9, 2, 1, 4, 6, 3, 5]
>>> random_list(10, 1, 9)
[4, 2, 4, 8, 8, 4, 5, 5, 8, 5]

Now we can try to run our previous code on lists of different sizes and see how much time they take. We will experiment on fixing \(k = n/2\) and see how the execution time grows as we increase the size of the list, that we will call \(n\).

Download the following python script: ip060_experiment.py

Try to understand what it is doing and run it on your computer. You should get an output similar to:

n = 100, time = 0.000s 0.00x
n = 200, time = 0.000s 3.80x
n = 400, time = 0.002s 3.83x
n = 800, time = 0.007s 3.93x
n = 1600, time = 0.028s 4.01x
n = 3200, time = 0.106s 3.80x
n = 6400, time = 0.412s 3.89x
n = 12800, time = 1.640s 3.98x
n = 25600, time = 6.898s 4.21x
n = 51200, time = 26.031s 3.77x

Roughly speaking, each time we double \(n\), the execution time grows 4x and it not just the double! This means the execution time grows quadratically on \(n\) (later you will learn that this is a \(\mathcal{O}(n^2)\) algorithm).

This is due to the fact that for this case (\(k=n/2\)) we have roughly \(n/2\) segments, each one of them with \(n/2\) elements... And so we have a first outer cycle going roughly for \(n/2\) iterations, and for each of them we have the inner cycle going for \(n/2\) iterations, so we need \((n/2)^2 = n^2/4\) iterations, which is precisely something that grows quadratically on \(n\)...

This growth ratio as we increase the size input is precisely what algorithmic efficiency is all about. Even if you have a computer twice as fast, 10x times faster or even 100x times faster, it doesn't change the fundamental fact that the execution time will grow quadratically and inevitably, for a big enough input, your quadratic algorithm will be worse than an algorithm that for instance would depend just linearly on the size of the input.

A better efficient solution

So yes, we need a linear time algorithm for this problem. Can you think about one? What do we change each time our considered sequence moves one position to the right? Do we really need to compute again "from scratch" the number of positive values in that sequence?

Again, try to think a bit about the problem and think on a solution that just depend linearly on \(n\). We again provide hints if you need them, and even if you solve the problem by yourself, you should look at the afterwards.
After you solve, test your solution and get your deserved Accepted for an efficient solution.

If you "plug in" your better solution to the previous experiment code, you should get something like the following, showcasing that the growth ratio of execution time has gone down to 2x instead of 4x:

n = 100, time = 0.000s 0.00x
n = 200, time = 0.000s 1.62x
n = 400, time = 0.000s 2.00x
n = 800, time = 0.000s 2.03x
n = 1600, time = 0.000s 2.05x
n = 3200, time = 0.001s 2.00x
n = 6400, time = 0.001s 2.04x
n = 12800, time = 0.002s 1.99x
n = 25600, time = 0.004s 2.00x
n = 51200, time = 0.008s 2.00x

Hints

- Start by processing the initial segment of size \(k\), counting the number of positive values
- Now, to process the next segment, instead of counting again \(k\) elements, you just need to consider two values changes:
. You remove one element from the segment (the "old" first one); if it was positive, you can now decrement \(count\)
. You add one element to the segment (the "new" last one); if it is positive, you can now increment \(count\)

.........
.........
.........
.........
.........
.........

This technique is known as a sliding window and you use it whenever you have some king of "segment" moving through the data and you want to consider only what changes as you are moving it

4) Another cool algorithmic technique (prefix sums)

Start by carefully reading the following problem and make sure you fully understand what is asked:

[IP061] Wallet Watcher

A first naive solution

Notice how here you are not asked just for a function, but you need a full program reading the input and writing the output.

Again, at first glance you can just process each query by traversing all positions between \(a\) and \(b\) and a correct code could like this:

n = int(input())
lst = list(map(int, input().split()))

q = int(input())
for _ in range(q):
    a, b = map(int, input().split())
    mysum = 0
    for i in range(a-1, b):
        mysum += lst[i]        
    print(mysum)

Understand the code and then test with the sample input. The code works and seems correct but if you submit to Mooshak you will get a... Time Limit Exceeded.

What is the problem? For each query we might need to traverse the entire list, in the worst case we might need \(q \times n\) iterations to get the answer. But both \(n\) and \(q\) can go up to \(50\,000\) which is too high for such a naive solution...

Don't be fooled to think that calling Python's sum function will change what is happening "behind the curtains". For instance, the previous code could have also be written in the following way:

n = int(input())
lst = list(map(int, input().split()))

q = int(input())
for _ in range(q):
    a, b = map(int, input().split())
    mysum = sum(lst[a-1:b])
    print(mysum)

Now the inner cycle is reduced to an instruction (and really "optimized" one), but you are still fundamentally iterating through all positions of the (sub)list for each query, and the growth ratio of the execution time (remember that?) is still the same ⏱️

A better efficient solution

So how can you solve the problem?

A very interesting concept is the one of "prefix sums" (also known as cumulative sums).

The prefix sums are the total sums up to the corresponding positions. For the example input:

    Position i:   0    1    2    3    4    5    6    7    8    9   10
  Array Values:      100  -50  200  300 -150  400 -300  100  200   50
    Prefix Sum:   0  100   50  250  550  400  800  500  600  800  850

If we have stored the prefix sums (which can be computed in linear time on the size of the list\), finding the sum of a given interval can now be done in constant time!
For example, if \(psum[]\) stores the prefix sum, the sum between positions \(a\) and \(b\) is equal to \(psum[b] - psum[a-1]\). Can you see why?

This technique can be used to help solve many problems more efficiently, including helping Penny. If the prefix sums are calculated right after reading the money operations, we can answer each query with a simple subtraction, and the entire code will have time proportional to \(n + q\) instead of \(n \times q\) which is a big change!

Implement this idea using as a basis the previous code, test it and submit to get your well deserved Accepted! 😎

5) One last (rocket) journey into the planet of efficiency

Start by carefully reading the following problem and make sure you fully understand what is asked:

[IP062] Rocket Pair Protocol

A first naive solution

Back to functions, you are now seeing the pattern of first seeing a naive inefficient solution, and the improving it. How can we search for duplicates? Here is an example possible solution:

def rocket_pair(lst, x):
    for a in lst:
        for b in lst:
            if a != b and a+b == x:
                return True
    return False

We try all possible pairs \(a,b\) and check if they are different (avoiding using twice the same element - remember the elements are distinct) and if they sum to our desired target. As soon as we find one match, we can safely return \(True\). If no match is found, than the answer is \(False\).

Seems like a solid approach, and if we are lucky to find the sum, we can even return "quickly", but in the worst case the answer is \(False\) and we still need to check all pairs.

You could think on actually avoid trying twice each pair (as we test \(a,b\) and \(b,a\)), but that would only cut the time in two, not changing the growth ratio of the execution time, which grows quadratically on the size of the list, since we have a quadratic number of pairs. Because the size of the list can be \(50,000\) we are doomed to have a Time Limit Exceeded with this approach...

A better efficient solution

So how can you solve the problem?

On the lectures (T10) you already discussed the set data structure and you had a quick discussion on its efficiency (using hash functions) and how, on average, it provides constant time for adding an element, removing an element and checking if an element is on the set.

How can we use this to solve our task? Try to come up with an algorithm in which one of the loops of the naive solution is essentially transformed into a set membership operation.

If you get stuck you can have a look at the hints below 😉

After you solve, test your solution and get your deserved Accepted for an efficient solution.

Hints

- Traverse all the items of the list and keep a set of all previous values (adding them one by one to the set)
- For each new value \(v\), before you add it to the set you only need to check if \(x-v\) exists on the set, since that implies that there is a pair that sums to \(x\)
- This membership operation takes constant time on average, so your solution now only depends linearly on the size of the list 😎

6) Extra exercises for consolidating your knowledge [extra]

Loved the main exercises and want more adventure in the universe of efficiency and Big Bang Theory? Here are more exercises to sharpen your skills.

Read the statements, code and try to submit Accepted solutions to all of the following problems. Don't forget to test first on your computer!

[IP063] Neural Activity (oh no, more sliding windows!)
[IP064] Comic Dilemma (hum... sets again?)

7) Challenge exercises [challenge]

You've made your way Pasadena's finest minds, helping Raj, Leonard, Penny, Howard, Amy and Stuart. But no... two final bosses remain: the deceptively sweet yet terrifying Bernadette, and the ultimate challenge, Sheldon Cooper, whose logic defies not only code, but the very laws of physics themselves!

Bazinga! Let's see if your Python can handle that. 🧠🐍

Read the statements, code and try to submit Accepted solutions to the following problems. Don't forget to test first on your computer!

[IP065] Bacterial Cultures (2D prefix sums?)
[IP066] Alphabetical Obsession (advanced sliding windows?)

Happy coding! 😊