Memory Model

We've now seen the importance of space in algorithms. Ideally, we want to limit how much space our programs use, and also optimize our programs so that they use the fast parts of the memory hierarchy. However, before we can do that, we have to know how Python programs store data. Let's take a moment to understand the memory model that Python uses.

Everything is an object

In P2, you learned that objects are programming constructs that have some internal data (or state) and behavior. They allow us to group pieces of data together into a single entity, and associate functions with them. Functions that are internal to a specific object are referred to as methods.

In Python, all data is represented as objects. Simple values such as integers, floats, and booleans are all objects. Strings, lists, and dictionaries are all objects too. Everything is an object!

The stack and the heap

When a Python program executes, all of its data (including variables and values) are stored in main memory. The memory that a Python program can use is split into multiple parts. We will focus on two of these parts: the runtime stack and the dynamic memory heap, often called just the stack and the heap.

The stack

The stack is where local variables and function parameters are stored. When a function is called, a new stack frame is added to the stack. The stack frame includes enough room for all of the local variables and parameters that the function may use. When the function call returns, the stack frame is removed from the stack. For example, consider the following program:

def adder(a, b):
    total = a + b
    return total

if __name__ == "__main__":
    x = 5
    sum = adder(x, x)
    print(sum)

This is what the stack would look like while the adder() function executes:

A rectangle to represent stack memory, split between two stack frames: one for main and one for adder. Each stack frame contains the local variables that are inside the function.

Figure: Two stack frames and the local variables within them.

The heap

Objects are stored in a region of memory known as the heap. Remember that all values in Python are objects, so all Python values are indeed stored on the heap. Objects remain on the heap until they are no longer used by the program. At that point, their memory is reclaimed through a process called garbage collection.

Putting the ideas of a stack and a heap together, we can draw diagrams for the memory layout of a Python program. Take for example this program:

if __name__ == "__main__":
    a = 5
    greeting = "Hello, world!"
    print(greeting[:a])

When the print() statement executes, there are two variables on the stack (a and greeting), which reference two separate objects on the heap:

Memory divided into two parts: a stack region and a heap region. The stack region contains two variables, each of which holds a memory address of an object in the heap region.

Figure: Variables hold references to objects on the heap.

Strictly speaking, even the string resulting from the expression greeting[:a] will be an object on the heap, but we omitted it for simplicity.

This shows us that a variable is simply a container for a reference to an object. In other words, the value inside of a variable is just the memory address of where an object resides on the heap. Often, we represent these types of references using arrows and abstract away the actual memory addresses:

The same as the previous image: two variables in the stack referencing two objects on the heap. This time, there are arrows pointing from each variable to an object instead of memory addresses.

Figure: References can be visualized as arrows.

Mutability

In Python, there are two kinds of objects: mutable objects and immutable objects.

Immutable objects can not be modified by the program. All of Python's primitive data types are immutable: strings, integers, floats, booleans, and even tuples.

On the other hand, lists, dictionaries, and user-defined object types are mutable -- their contents can be modified.

So what happens when you need to, for example, change the value of a variable that holds an integer? Watch the video below to see how memory diagrams reflect how mutable and immutable objects behave.

Tracing code using diagrams

When trying to understand what a code fragment is doing, it can be useful to trace the code using diagrams. You can write the diagrams on paper by hand, or you can use tools to help you do the tracing, such as Python Tutor.

Here's a fragment of code that modifies some lists and then prints them:

Watch the video below to see an example of how to trace through this code using Python Tutor.

Check your understanding

Try tracing the following code fragment by drawing a picture, or using Python Tutor to visualize the execution. What is printed in the end? After you're done tracing the code, execute the code to see the output and reveal the solution to see memory diagrams.

Click to reveal the answer.

Answer.

[2, 2, 3]
[2, 2, 3]
[6, 5, 6]

After line 6, this is the picture in memory:

lst_x and lst_y reference the same list on the heap, while lst_z references a separate list

In other words, lst_x and lst_y reference the same list in memory on the heap. Therefore, changes made during the call to modify(lst_x) change the first element of the list to 3, but immediatley after returning, lst_y[0] = lst_y[1] changes the first element to be a 2. These changes affect the same list, referenced by both lst_x and lst_y. Therefore, after line 9, this is the picture in memory:

After modifying both lst_x and lst_y, both changes are reflected in the same list in memory

The final modification changes lst_z, meaning that this is the final picture in memory:

State of memory after changing lst_z

Data Structures and Algorithms