Course Overview
Welcome to Data Structures and Algorithms!
Course Description
This course teaches the fundamentals of data structures and introduces students to the implementation and analysis of algorithms.
Students start by examining the basic linear data structures: linked lists, arrays, stacks, and queues. They learn how to build these structures from scratch, represent algorithms using pseudo-code, and translate algorithms into running programs. They apply these algorithms to real-life applications to understand complexity and performance tradeoffs. Students will also learn how to develop algorithms for sorting and searching, use iteration and recursion for repetition, and make tradeoffs between the approaches. They will learn to estimate the efficiency of algorithms, and practice writing and refining algorithms in Python.
This course emphasizes big-picture understanding and practical problem-solving in preparation for technical interviews and professional practice. Throughout the course, students will solve common practice problems, and participate in peer interview sessions. As part of their formative assessments, they will also deepen their understanding of these topics and practice technical communication by writing technical blog posts.
Topics
- Python Memory Model
- ADTs
- Algorithm Analysis
- Searching and Sorting
- Recursion
- Linked Lists
- Lists, Stacks, and Queues
- Trees
How the course works
There are multiple ways you'll learn in this course:
- Read and engage with the materials on this site
- Attend live class and complete the activities in class
- Answer practice exercises to try out the concepts
- Complete assignments and exams to demonstrate what you have learned
Active engagement is necessary for success in the course! You should try to write lots of programs, so that you can explore the concepts in a variety of ways.
You are encouraged to seek out additional practice problems outside of the practice problems included in the course.
Learning Outcomes
By the end of the course, students will be able to:
- Describe the implementation and performance of fundamental data structures
- Describe underlying data structures upon which more complex structures are built
- Design and analyze recursive algorithms
- Implement several searching and sorting algorithms including insertion-sort, merge-sort and heap-sort
- Solve common algorithmic technical interview problems
Instructor
- Cody Doucette
- cody.doucette@kibo.school
Please contact on Discord first with questions about the course.
Live Class Time
- Thursdays, 6:00 PM GMT
Office Hours
- Fridays, 3:00 PM GMT
Core Reading List
- Miller, B. Ranum, D. (2013). Problem Solving with Algorithms and Data Structures using Python. Franklin Beedle Publishers. Chapters 1-5 (183pp)
- Skiena, S. (2020). The Algorithm Design Manual. Springer; 3rd edition.
Supplemental Reading List
- Goodrich, M. (2013). Data Structures & Algorithms in Python. Chapters 1 - 7 (294 pp).
- Leetcode. Practice Problems.
Live Classes
Each week, you will have a live class (see course overview for time). You are required to attend the live class sessions.
Video recordings and resources for the class will be posted after the class each week. If you have technical difficulties or are occasionally unable to attend the live class, please be sure to watch the recording as quickly as possible so that you do not fall behind.
Week | Topic | Resources | Recording |
---|---|---|---|
1 | Goals and Building Blocks | Slides | YouTube |
2 | Algorithm Analysis and Sorting | Slides | YouTube |
3 | Recursion I | Slides | YouTube |
4 | Recursion II | Slides | YouTube |
5 | Linked Lists | Slides | YouTube |
6 | Midterm week | ||
7 | Lists, Stacks, & Queues | Slides | YouTube |
8 | Trees | Slides | YouTube |
9 | BSTs | Slides | YouTube |
Assessments & Grading
You overall course grade is composed of these weighted elements:
- Assignments: 40%
- Midterm exam: 10%
- Final exam: 25%
- Quizzes: 15%
- Peer interviews: 10%
Assignments
During some weeks, you'll be given a programming assignment to test the new skills and concepts you've learned. The assignments will be challenging, and it's recommended that you complete the material and practice problems before you start the assignment each week. The assignments will be due on Sunday nights and will be submitted and graded on Gradescope. There will be five assignments overall.
Exams
There will be a midterm exam in week 6 of the course and a final exam in week 10 of the course. The exams will be administered online during our regular class meeting time. The exams are timed and are to be completed individually.
Quizzes
After each week's material, there will be a short quiz on Gradescope to check that you understand the material. There are a total of eight quizzes overall -- in all weeks except weeks 6 and 10. The quizzes must be submitted an hour before our class meeting each week (Thursdays at 5:00 PM GMT).
Peer Interviews
You will be participating in peer technical interviews. As a part of these peer interviews, you will be required to submit reflections on Gradescope. Peer interview reflections are due on Sunday nights. These interview questions are similar to what you might encounter during coding interviews for software development jobs. Our goal with these questions is to help prepare you for technical interviews in the software industry.
Late Policy
Assignments and peer interview reflections can be submitted late under Kibo's late policy:
- Submissions within two weeks of the deadline incur a 10% penalty.
- Submissions beyond two weeks late incur a 20% penalty.
The final submission deadline is the last day of the academic term.
Note that the late policy only applies to assignments and peer interview reflections. Quizzes must be submitted before the live class each week, and no late submissions will be accepted.
Students facing exceptional circumstances in terms of submitting work on time should immediately contact their Kibo advisor.
Getting Help
If you have any trouble understanding the concepts or stuck on a problem, we expect you to reach out for help!
Below are the different ways to get help in this class.
Discord Channel
The first place to go is always the course's help channel on Discord. Share your question there so that your Instructor and your peers can help as soon as we can. Peers should jump in and help answer questions (see the Getting and Giving Help sections for some guidelines).
Message your Instructor on Discord
If your question doesn't get resolved within 24 hours on Discord, you can reach out to your instructor directly via Discord DM or Email.
Office Hours
There will be weekly office hours with your Instructor and your TA. Please make use of them!
Tips on Asking Good Questions
Asking effective questions is a crucial skill for any computer science student. Here are some guidelines to help structure questions effectively:
-
Be Specific:
- Clearly state the problem or concept you're struggling with.
- Avoid vague or broad questions. The more specific you are, the easier it is for others to help.
-
Provide Context:
- Include relevant details about your environment, programming language, tools, and any error messages you're encountering.
- Explain what you're trying to achieve and any steps you've already taken to solve the problem.
-
Show Your Work:
- If your question involves code, provide a minimal, complete, verifiable, and reproducible example (a "MCVE") that demonstrates the issue.
- Highlight the specific lines or sections where you believe the problem lies.
-
Highlight Error Messages:
- If you're getting error messages, include them in your question. Understanding the error is often crucial to finding a solution.
-
Research First:
- Demonstrate that you've made an effort to solve the problem on your own. Share what you've found in your research and explain why it didn't fully solve your issue.
-
Use Clear Language:
- Clearly articulate your question. Avoid jargon or overly technical terms if you're unsure of their meaning.
- Proofread your question to ensure it's grammatically correct and easy to understand.
-
Be Patient and Respectful:
- Be patient while waiting for a response.
- Show gratitude when someone helps you, and be open to feedback.
-
Ask for Understanding, Not Just Solutions:
- Instead of just asking for the solution, try to understand the underlying concepts. This will help you learn and become more self-sufficient in problem-solving.
-
Provide Updates:
- If you make progress or find a solution on your own, share it with those who are helping you. It not only shows gratitude but also helps others who might have a similar issue.
Remember, effective communication is key to getting the help you need both in school and professionally. Following these guidelines will not only help you in receiving quality assistance but will also contribute to a positive and collaborative community experience.
Screenshots
It’s often helpful to include a screenshot with your question. Here’s how:
- Windows: press the Windows key + Print Screen key
- the screenshot will be saved to the Pictures > Screenshots folder
- alternatively: press the Windows key + Shift + S to open the snipping tool
- Mac: press the Command key + Shift key + 4
- it will save to your desktop, and show as a thumbnail
Giving Help
Providing help to peers in a way that fosters learning and collaboration while maintaining academic integrity is crucial. Here are some guidelines that a computer science university student can follow:
-
Understand University Policies:
- Familiarize yourself with Kibo's Academic Honesty and Integrity Policy. This policy is designed to protect the value of your degree, which is ultimately determined by the ability of our graduates to apply their knowledge and skills to develop high quality solutions to challenging problems--not their grades!
-
Encourage Independent Learning:
- Rather than giving direct answers, guide your peers to resources, references, or methodologies that can help them solve the problem on their own. Encourage them to understand the concepts rather than just finding the correct solution. Work through examples that are different from the assignments or practice problems provide in the course to demonstrate the concepts.
-
Collaborate, Don't Complete:
- Collaborate on ideas and concepts, but avoid completing assignments or projects for others. Provide suggestions, share insights, and discuss approaches without doing the work for them or showing your work to them.
-
Set Boundaries:
- Make it clear that you're willing to help with understanding concepts and problem-solving, but you won't assist in any activity that violates academic integrity policies.
-
Use Group Study Sessions:
- Participate in group study sessions where everyone can contribute and learn together. This way, ideas are shared, but each individual is responsible for their own understanding and work.
-
Be Mindful of Collaboration Tools:
- If using collaboration tools like version control systems or shared documents, make sure that contributions are clear and well-documented. Clearly delineate individual contributions to avoid confusion.
-
Refer to Resources:
- Direct your peers to relevant textbooks, online resources, or documentation. Learning to find and use resources is an essential skill, and guiding them toward these materials can be immensely helpful both in the moment and your career.
-
Ask Probing Questions:
- Instead of providing direct answers, ask questions that guide your peers to think critically about the problem. This helps them develop problem-solving skills.
-
Be Transparent:
- If you're unsure about the appropriateness of your assistance, it's better to seek guidance from professors or teaching assistants. Be transparent about the level of help you're providing.
-
Promote Honesty:
- Encourage your peers to take pride in their work and to be honest about the level of help they received. Acknowledging assistance is a key aspect of academic integrity.
Remember, the goal is to create an environment where students can learn from each other (after all, we are better together) while we develop our individual skills and understanding of the subject matter.
Academic Integrity
When you turn in any work that is graded, you are representing that the work is your own. Copying work from another student or from an online resource and submitting it is plagiarism. Using generative AI tools such as ChatGPT to help you understand concepts (i.e., as though it is your own personal tutor) is valuable. However, you should not submit work generated by these tools as though it is your own work. Remember, the activities we assign are your opportunity to prove to yourself (and to us) that you understand the concepts. Using these tools to generate answers to assignments may help you in the short-term, but not in the long-term.
As a reminder of Kibo's academic honesty and integrity policy: Any student found to be committing academic misconduct will be subject to disciplinary action including dismissal.
Disciplinary action may include:
- Failing the assignment
- Failing the course
- Dismissal from Kibo
For more information about what counts as plagiarism and tips for working with integrity, review the "What is Plagiarism?" Video and Slides.
The full Kibo policy on Academic Honesty and Integrity Policy is available here.
Course Tools
In this course, we are using these tools to work on code. If you haven't set up your laptop and installed the software yet, follow the guide in https://github.com/kiboschool/setup-guides.
- Github is a website that hosts code. We'll use it as a place to keep our project and assignment code.
- Github Classroom is a tool for assigning individual and team projects on Github.
- VSCode is your code editor. It's where you'll write code to solve programming assignments. It also has an embedded terminal, where you can run Python to try out your code and test that it works.
- Anchor is Kibo's Learning Management System (LMS). You will access your course content through this website, track your progress, and see your grades through this site.
- Gradescope is a grading platform. We'll use it to track assignment submissions and give you feedback on your work.
- Woolf is our accreditation partner. We'll track work there too, so that you get credit towards your degree.
Goals and Building Blocks
Welcome to week 1! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- conceptualize space and time in algorithms with real-world examples
- convert between units of space and time in computer science
- measure the running time of algorithms using the Python
timeit
library - diagram the memory hierarchy of modern computers
- understand the purpose of abstract data types for specifying data structure properties
- identify the basics of the coding interview process
What's due this week
- Week 1 Quiz (on Gradescope)
- Assignment 1
Data structures and algorithms
In Programming 1 and 2, you were introduced to the concepts of data structures and algorithms, so let's start by reminding ourselves about what we know so far. Then, we will discuss why we should care about data structures and algorithms -- including why we are dedicating an entire course to study them!
What is a data structure?
Many problems require organizing and using data to solve them. For example, an international shipping company might be interested in keeping track of all of the previous orders it has fulfilled, and at any time, calculating the top N greatest orders by monetary value. There could be thousands, tens of thousands, hundreds of thousands of orders -- or more -- and these calculations need to be performed quickly. Data structures are constructs that help us organize data to solve these types of problems.
To make solving these problems easier, some programming languages have built-in data structures in the language itself. For example, Python has lists, dictionaries, and tuples to organize data. In previous courses, you used these constructs to complete tasks like count the frequencies of words and organize phone books.
Here's a formal definition of data structure:
Notice in the definition of data structure that we mention efficiency. In this course, we will be quite concerned with writing efficient algorithms; that is, algorithms that take up as few resources (time and space) as possible. But what is an algorithm?
What is an algorithm?
In P1, you learned that an algorithm is simply a set of instructions to perform a computation, and that definition still holds true. In this course, however, we will expand upon our knowledge of algorithms in three key ways:
- We will study advanced techniques for writing algorithms (e.g., recursion)
- We will develop formalisms for describing the efficiency of algorithms (i.e., big-O notation)
- We will learn how to write algorithms that complement particular data structures (e.g., stacks, queues, and trees) to achieve efficient computations
With these new tools and techniques at our disposal, we will be much better placed to tackle data-based problems.
Why study data structures and algorithms?
You might be wondering why there is an entire course dedicated to data structures and algorithms. The case for studying Python in the P1 and P2 courses was clear. You learned numerous skills for designing, coding, and debugging solutions in Python to various problems. However, you won't learn many new Python language features in this class. So how is this going to be useful?
Many of the computational problems you will face in the real world will be data-oriented and require the use of data structures. You will need to be able to solve data-oriented problems efficiently. Some approaches to solving such problems are too expensive and are therefore not feasible.
This course gives you the tools necessary to analyze data-based problems and devise an efficient solution. Even when presented with a problem that you have never encountered before, you will be able to reason about the best way to solve it based on the set of principles that we will learn.
Space and time efficiency
When we talk about the efficiency of an algorithm, we are concerned with how the algorithm uses two resources: space and time.

Time efficiency
The time efficiency of an algorithm refers to the amount of computational time that an algorithm takes to complete its task. It is typically, though not always, the most constrained resource for solving a problem.
For example, if a hacker tried to guess the password for your movie streaming account, the main challenge would be the amount of time it takes. The hacker could guess every possible combination (or permutation) of letters, numbers, and special characters in every possible length, but that would take a long time.
Let's say that you could make the assumption that passwords are always ten characters long, and only consist of letters from the Latin alphabet (A...Z,a...z) and Arabic numerals (0...9). That still means there are 1062 possible password combinations!
Guessing those at a rate of one million guesses per second would still take you quadrillions of quadrillions of years to guess them all. We try to avoid such brute-force algorithms, since they often take too long to compute. But soon, we will have the tools to evaluate the time efficiency of algorithms to judge which ones are efficient and which are not.
Space efficiency
In addition to time, we are also often concerned with the amount of space that an algorithm requires to perform its computation. In computer science, the space efficiency of an algorithm refers to the amount of computer memory the algorithm needs to perform its task.
Many tasks do not require a significant amount of memory beyond what is given to them as input. For example, consider a program that computes n factorial, or n! = n * (n-1) * (n-2) ... * 2 * 1
:
def factorial(n):
product = 1
for i in range(1, n + 1):
product *= i
return product
Notice that this function does not require much memory -- just a few local variables (n
, product
, and i
) to help compute the result.
On the other hand, recall this implementation of insertion sort from P1:
def insertion_sort(a_list):
results = [a_list[0]]
for item in a_list:
if item >= results[-1]:
results.append(item)
continue
for i in range(len(results)):
if item <= results[i]:
results.insert(i, item)
break
return results
In addition to the list passed in as a parameter (a_list
), another list named results
is also formed as part of the execution of the function. results
ultimately contains all of the elements of a_list
, put into sorted order. This means that if a_list
has 1 million elements, results
will also have 1 million elements! This is a significant amount of memory required by the algorithm, and must be taken into account when discussing its space efficiency.
Note: there are alternate implementations of the insertion sort algorithm that do not require using an extra list.
Space-time tradeoffs
We now know that we can evaluate algorithms in terms of their space and time efficiencies. It's also important to note that often there is a tradeoff between space and time resources used in an algorithm.
For example, say that we wanted to write a program that was capable of calculating n!
for all positive integers n <= 10
. We could use the same algorithm from above, perhaps with some extra error-checking:
def factorial(n):
if n < 1 or n > 10:
print("n must be positive and <= 10")
return 0 # error
product = 1
for i in range(1, n + 1):
product *= i
return product
Let's think about other ways we could compute this result. Instead of calculating the result when the program is run, we could instead precompute the values of n!
for all integers 1 <= n <= 10
and store those in a list. Then, all the function would have to do is return the entry at the appropriate index of the list:
def factorial_precomputed(n):
if n < 1 or n > 10:
print("n must be positive and <= 10")
return 0 # error
factorials = [1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]
# n! is stored in index n - 1
return factorials[n - 1]
factorial_precomputed()
will take fewer steps to execute than factorial()
because it doesn't need to use a for
loop to actually calculate the value of n!
. Therefore, it will be faster. However, it uses more space than factorial()
because it needs an extra list of size n
.
We have used some extra space in order to save time -- a space-time tradeoff! Precomputing a result and performing a simple lookup when needed is a very common form of a space-time tradeoff.
Summary
- Algorithms can be evaluated in terms of how much computational time they take to complete
- Algorithms can also be evaluated in terms of how much memory they consume
- Sometimes, there is a tradeoff between the amount of time and space an algorithm takes
The memory hierarchy
While we're talking about space and memory, it's worth mentioning that not all memory in computers is equal! In fact, there is a memory hierarchy.
Computers have different types of memory available for use, each with different physical properties. Together, they compose a memory hierarchy. This image illustrates their relative speeds, costs, and capacities:

Registers and Caches
At the top of the diagram we have types of memory that reside on the CPU itself, such as registers and caches. The CPU can retrieve data from registers and caches very quickly, but space on the CPU is limited, so together they represent only a small fraction of the available memory on a computer. Because of this, the data held in registers and caches is generally limited to only the data that was very recently used or will soon be used by the programs being executed by the CPU.
Main Memory
The next level down in the hierarchy is primary storage, or main memory. Physically, it typically consists of what's known as random access memory (RAM) devices. Compared to registers and caches, RAM is much less expensive and can therefore be much larger in size, but it takes the CPU longer to fetch data from main memory. However, the data held in main memory generally represents all of the data of programs that are currently being executed.
Secondary storage
After RAM, the next level in the memory hierarchy is secondary storage, often implemented using magnetic hard disk drives (HDDs) or solid state drives (SSDs). Secondary storage is persistent, meaning that the data in secondary storage remains there even when the computer is turned off. This is in contrast to caches and main memory, whose contents are cleared when the computer is powered off. Secondary storage is much larger than main memory, but it also takes much longer for the CPU to fetch data from secondary storage. This is because in order to use it, the data stored on disk must first be moved into main memory through a process known as paging.
Network-based storage
The last level of the memory hierarchy that we will consider is network-based storage. This represents all data that is not stored locally on a computer, and that must be requested and fetched over a network. For example, downloading a file from cloud storage over the Internet is a form of network-based storage. The time to send a request for data over the Internet and then download that file is considerably larger than secondary storage, but you can store much more, since you are accessing files that could potentially be stored on many different computers, or servers.
There are other levels and details within this hierarchy that we have not touched on, but this view of memory is sufficient for our purposes.
Units of space and time
Now that we know algorithmic efficiency is measured along the axes of space and time, it's also helpful to know how these resources are measured.
Measuring time
In the password-guessing example, we took a computationally intensive problem and thought about how long it would take to complete, which allowed us to measure the result in terms of quadrillions of years. However, time-based measurements of computer operations are typically on a much smaller scale:
Unit | Value | Example |
---|---|---|
Second (s) | 1 second | Downloading a medium-sized PDF file |
Millisecond (ms) | 10-3 seconds | Opening a PDF file on your computer |
Microsecond (us) | 10-6 seconds | Response time for an LCD monitor |
Nanosecond (ns) | 10-9 seconds | CPU executing 100 instructions |
In other words, the following are equivalent:
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns
1 s = 1,000 ms = 1,000,000 us = 1,000,000,000 ns
The amount of time it takes for a process to complete is called its latency. We give some example latency measurements at the end of this lesson.
Measuring space
We have specific units for measuring the amount of space that data takes up. The most basic unit to measure volume of data is the bit (binary digit) -- a 0 or 1. Groups of 8 bits are often referred to as a byte.
For example, the ASCII encoding of characters assigns 8 bits (1 byte) for each Latin alphabet character. The letter 'A'
has the binary encoding 01000001
.
Larger groups of bits and bytes use the metric prefixes. For example:
Name | Value | General terms | Example |
---|---|---|---|
byte | 1 byte | One byte | One ASCII character |
Kilobyte (KB) | 103 bytes | One thousand bytes | 2-3 paragraphs of text |
Megabyte (MB) | 106 bytes | One million bytes | Picture taken by a smartphone |
Gigabyte (GB) | 109 bytes | One billion bytes | A movie downloaded to your device |
Terabyte (TB) | 1012 bytes | One trillion bytes | 1,000 movies downloaded to your device |
⚠️ But be careful! Always be aware of whether you're measuring something in terms of bits or bytes. Typically, values measured in bits are represented with a lowercase "b" or "bits," e.g., 100 Kb or 100 Kbits. Values measured in bytes are usually represented with an uppercase "B" or "bytes," e.g., 100 KB or 100 Kbytes. If precision matters for your task, it's better to explicitly use "bits" or "bytes" to clearly communicate.
Base 2 and base 10
As shown above, the prefixes "kilo," "mega," "giga," and "tera" correspond to powers of 10: 103, 106, 109, and 1012. However, when we say that a computer has 8 GB of RAM, we actually do not mean that it has 8 x 109 bytes!
This is because volumes of memory and storage are conventionally measured in powers of 2 (binary), not powers of 10 (decimal). Quite luckily, some powers of 2 are almost equal to some powers of 10, so the difference is often insignificant. Units that are based on powers of 2 have slightly different prefixes:
Value (in bytes) | Name | Approximate Decimal Equivalent |
---|---|---|
1 | byte | 1 |
210 = 1,024 | Kibibyte (KiB) | 103 = 1,000 |
220 = 1,048,576 | Mibibyte (MiB) | 106 = 1,000,000 |
230 = 1,073,741,824 | Gibibyte (GiB) | 109 = 1,000,000,000 |
240 = 1,099,511,600,000 | Tebibyte (TiB) | 1012 = 1,000,000,000,000 |
Because the difference between, say, a KB and a KiB is so small, "KB" could be used to mean either one. Again: if precision matters, be overtly clear about which unit you're using.
To make matters even more confusing, telecommunications systems measure properties such as Internet bandwidth using the power of 10 system. A 1 Gbps connection means that 1 x 109 bits of information can be transmitted in one second.
Why the difference? Well, telecommunications and storage technologies evolved separately and without a consistent standard, leading to different conventions for units that continues today!
Putting it all together
Now that we know about the memory hierarchy and the different ways of measuring space and time, we can ground our understanding of these concepts with some concrete numbers:
Process | Latency | Note |
---|---|---|
Reading data from cache | 1 ns | E.g., fetching an integer variable's value |
Reading data from main memory | 100 ns | |
Reading 1 MB from main memory | 250,000 ns = 250 us | |
Reading 1 MB from disk | 20,000,000 ns = 20,000 us = 20 ms | 80x slower than main memory |
Sending data from US to Europe and back | 150,000,000 ns = 150,000 us = 150 ms | Just one packet! Downloading an entire file would take longer |
Knowing these exact figures is not crucial to being able to write algorithms and use data structures. But it's helpful to know the scale at which we're working, and how the physical components of a computer will affect our programs.
Memory Model
We've now seen the importance of space in algorithms. Ideally, we want to limit how much space our programs use, and also optimize our programs so that they use the fast parts of the memory hierarchy. However, before we can do that, we have to know how Python programs store data. Let's take a moment to understand the memory model that Python uses.
Everything is an object
In P2, you learned that objects are programming constructs that have some internal data (or state) and behavior. They allow us to group pieces of data together into a single entity, and associate functions with them. Functions that are internal to a specific object are referred to as methods.
In Python, all data is represented as objects. Simple values such as integers, floats, and booleans are all objects. Strings, lists, and dictionaries are all objects too. Everything is an object!
The stack and the heap
When a Python program executes, all of its data (including variables and values) are stored in main memory. The memory that a Python program can use is split into multiple parts. We will focus on two of these parts: the runtime stack and the dynamic memory heap, often called just the stack and the heap.
The stack
The stack is where local variables and function parameters are stored. When a function is called, a new stack frame is added to the stack. The stack frame includes enough room for all of the local variables and parameters that the function may use. When the function call returns, the stack frame is removed from the stack. For example, consider the following program:
def adder(a, b):
total = a + b
return total
if __name__ == "__main__":
x = 5
sum = adder(x, x)
print(sum)
This is what the stack would look like while the adder()
function executes:

The heap
Objects are stored in a region of memory known as the heap. Remember that all values in Python are objects, so all Python values are indeed stored on the heap. Objects remain on the heap until they are no longer used by the program. At that point, their memory is reclaimed through a process called garbage collection.
Putting the ideas of a stack and a heap together, we can draw diagrams for the memory layout of a Python program. Take for example this program:
if __name__ == "__main__":
a = 5
greeting = "Hello, world!"
print(greeting[:a])
When the print()
statement executes, there are two variables on the stack (a
and greeting
), which reference two separate objects on the heap:

Strictly speaking, even the string resulting from the expression
greeting[:a]
will be an object on the heap, but we omitted it for simplicity.
This shows us that a variable is simply a container for a reference to an object. In other words, the value inside of a variable is just the memory address of where an object resides on the heap. Often, we represent these types of references using arrows and abstract away the actual memory addresses:

Mutability
In Python, there are two kinds of objects: mutable objects and immutable objects.
Immutable objects can not be modified by the program. All of Python's primitive data types are immutable: strings, integers, floats, booleans, and even tuples.
On the other hand, lists, dictionaries, and user-defined object types are mutable -- their contents can be modified.
So what happens when you need to, for example, change the value of a variable that holds an integer? Watch the video below to see how memory diagrams reflect how mutable and immutable objects behave.
Tracing code using diagrams
When trying to understand what a code fragment is doing, it can be useful to trace the code using diagrams. You can write the diagrams on paper by hand, or you can use tools to help you do the tracing, such as Python Tutor.
Here's a fragment of code that modifies some lists and then prints them:
Watch the video below to see an example of how to trace through this code using Python Tutor.
Abstract data types
Let's now leave behind efficiency and metrics for the moment, and touch on another building block that will resurface repeatedly in this course: abstract data types, or ADTs.
What are ADTs?
To summarize, ADTs define the semantics of how a data structure should behave, including the operations that the data structure should support. However, an ADT doesn't define how the data structure should be implemented -- it is only an abstract description.
ADTs and Python
Now that we know what an ADT is in general terms, let's think about how to use ADTs in code.
Python does not have a specific language feature that represents an ADT. When we need to implement an ADT using a Python class, we will include the name of the ADT in the class name. This way, we will have a hint about the expected behavior of objects of that class.
For example, if we wanted to create a list-based implementation of a tuple as described in the video, it might look like this:
class ListTuple:
def __init__(self, a, b):
self.items = [a, b]
def get_item_a(self):
return self.items[0]
def get_item_b(self):
return self.items[1]
def set_item_a(self, new_a):
self.items[0] = new_a
def get_item_b(self, new_b):
self.items[1] = new_b
Under the hood, the tuple abstraction is implemented as a Python list. On the other hand, if we wanted to create a variable-based representation of a tuple, it would look like this:
class TwoVariableTuple:
def __init__(self, a, b):
self.a = a
self.b = b
def get_item_a(self):
return self.a
def get_item_b(self):
return self.b
def set_item_a(self, new_a):
self.a = new_a
def get_item_b(self, new_b):
self.b = new_b
Both ListTuple
and TwoVariableTuple
provide the operations of a tuple, just implemented in different ways!
You might be wondering at this point why we would need to have multiple implementations of an ADT if they provide the same behavior. The answer is efficiency. One implementation of an ADT might be more efficient than another implementation of an ADT. Once we see more fundamental ways of organizing data other than lists, the differences in efficiency in how we implement ADTs will become clearer.
Interviewing
As we round out our first week together, let's highlight one more goal for this term: gaining the skills and experience needed to be successful in technical interviews.
A common practice in the tech industry is to conduct a coding interview as a part of the overall application process for software engineering and other computer science-related roles. In these interviews, a candidate is given a computational problem to solve, and is asked to code a solution in real time while the interviewer observes. This is quite different from a typical behavioral interview, and can be intimidating to those looking to enter the industry for the first time.
This course is the perfect opportunity to hone interviewing skills because most of the questions posed in these coding interviews are based on data structures and algorithms.
Watch the following video about the coding interview process at Google. Take note of how often data structures and algorithms are mentioned!
Throughout this term, you will progressively learn the skills you need to succeed in interviews, and practice those skills in peer interviews.
Practice: Goals and Building Blocks
💡 This is your chance to put what you've learned into action.
Try solving these practice challenges to check that you understand the concepts. No submission is necessary for practice exercises.
Why practice?
Practice coding helps you become a great coder. These practice problems aren't graded, but that doesn't mean they aren't important.
You should aim to practice a lot, especially with unfamiliar concepts. Spread practice over multiple days to take advantage of the spacing effect, which helps you retain new knowledge.
More about practice
Practice helps you understand what you know, and what you don't know. It can be easy to trick yourself into thinking you understand something when you do not -- or that you don't understand when you do. Practicing by writing code or debugging code will help you find out what you really understand, and where you are still confused.
Practice helps build confidence in your coding. The more programs you write, and the more problems you solve, the more you learn that you are a capable coder and problem-solver.
Practice doesn't always feel good - sometimes you'll be stumped! But, practice shouldn't feel super frustrating either. If you find yourself getting angry at yourself or the code, it's a good time to take a break and ask for help.
The solutions to each challenge are available, and you can view a video of the solution below each challenge.
- Try to go through the whole challenge without using the solution.
- If you can’t do the challenge without looking the solution, it means you don’t understand the material well enough yet.
- Try the next practice challenges without looking at the solution. If you need more practice challenges, reach out on Discord.
1. Finding the Average of a List
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
2. Working with Units of Space and Time
A photographer has a collection of 10,500 photos. She learns from her camera's manual that each photograph taken by the camera is 2.5 MB. Each photo also has 250 KB of metadata that needs to be stored alongside the photo. She currently has the photos in a cloud storage account that allows her to have 100 GB of data. However, she is worried that she might be reaching the limit of the account and is considering storing the collection on her local computer instead.
-
How close is the photographer to filling up her cloud storage account?
-
The photographer's computer has 40 GB of available disk space. What percentage of her collection would she be able to store on her computer?
-
The photographer learns that it takes 2000 ms to download each photo from the cloud storage account to her computer. At that speed, how many minutes would it take to download the entire collection?
-
The photographer also learns that her Internet uplink speed is 512 Kbps. Approximately how long would it take for her to upload 10 photos to her cloud storage account?
-
After reading the documentation of her editing software, the photographer finds that the software can comfortably accommodate 0.25 GiB of photo data in main memory at a time, and doesn't need to take into account the metadata associated with the photos. Approximately how many photos can she edit simultaneously?
Open the spoiler below to see the full solution.
Make sure you give yourself enough time to solve the practice without looking at the solution. It is really important for your learning.
Solution
-
Since the application being used in this problem is storage, it’s probably better to assume that the units are in terms of base 2. In other words, we could say that 1 KB = 1024 B, etc. However, since we’re just looking for rough numbers, we can use the base 10 system, i.e. 1 KB = 1000 B. If that’s the case, then 250 KB of metadata is equivalent to .25 MB. Let’s add that to the size of the photo itself to get 2.75 MB of data overall for each picture. If the photographer has 10,500 photos, that’s 10,500 * 2.75 MB = 28,875 MB = 28.875 GB of data. Therefore, she has used less than 29% of the available space on her cloud storage account.
-
Since her photo collection would only take up 28.875 GB of storage, she would be able to store 100% of it with more than 10 GB of storage space left over.
-
To download all 10,500 photos, it would take 10,500 * 2000 ms = 21000000 ms = 21000 seconds = 350 minutes.
-
Let’s convert the Internet uplink speed to MBps so that we have units that can be directly calculated with a 2.75 MB photo. 512 Kbps = 64 KBps (since there are 8 bits in a byte). We then have 64 KBps = .064 MBps. Therefore, .064 MBps * 2.75 MB means that every photo can be uploaded in .176 seconds. To upload 10 photos would therefore take 1.76 seconds.
-
Since this problem specifically mentions GiB, let’s use base 2 units and assume that each photo is really 2.5 MiB. We don’t need to include the metadata. 0.25 GiB = 250 MiB, so the editing software can hold 250 MiB / 2.5 MiB = 100 photos at a time.
3. Tracing Memory Modifications
Consider the Python code below. Trace the execution of the code by creating a memory diagram showing the stack and the heap, or use Python Tutor. What is output the by the print
statement? Only execute the code after you have attempted the exercise.
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
4. Tracing Memory Modifications with Functions
Consider the Python code below. Trace the execution of the code by creating a memory diagram showing the stack and the heap, or use Python Tutor. What is output the by the print
statement? Only execute the code after you have attempted the exercise.
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
5. Implementing a Dictionary ADT
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
Week 1 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 1
Due: January 14 at 11:59 PM GMT
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
Implementing a Set ADT
This week we saw multiple examples of ADTs, including tuples, dictionaries, and sets. These three ADTs happen to be built into Python as language features, but we've defined our own versions to illustrate how ADTs can be implemented.
This week we also learned about the various levels of the memory hierarchy, and saw the relative speeds at which you can access data from those levels.
In this assignment, you will implement an ADT that represents a mathematical set. You will implement it in three different ways: by storing the elements in a list, storing the elements in a sorted list, and storing the elements on disk. You will also benchmark the performance of these implementations.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Week 1 Interview
During the first two weeks of the term, two special sessions will be held in which the instructor will perform an interview with a student. These special sessions are separate from the regular class meeting and are optional. However, the goal of them is to provide you with some experience and expectations for the routine of a technical interview.
The instructor will seek out volunteers to be the interviewees for these special sessions.
- Question Guide
- Slides
- Video
- Who is the Interviewer: Cody
Algorithm Analysis and Sorting
Welcome to week 2! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Categorize algorithms according to their big-O efficiency classes
- Compose algorithms that search efficiently through sorted data
- Analyze worst case and best case behavior of algorithms
- Trace the execution of sorting algorithms
- Derive the efficiency of algorithms experimentally
What's due this week
- Quiz 2 (on Gradescope)
- Assignment 2
Big-O Notation
We are now ready to develop a formalism for classifying algorithms according to their time and space efficiencies. This formalism is known as big-O notation.
What is big-O notation?
Big-O notation can be used to describe an algorithm's time and space efficiency. It describes an upper bound on the amount of resources that will be required for the algorithm as a function of the input size (n
) to the algorithm.
Time efficiency is sometimes known as time complexity or running time. Space efficiency is sometimes known as space complexity.
Watch the video below to learn about big-O notation and to hear an example where knowing the efficiency of an algorithm may have helped a software developer devise a better solution:
To summarize, here are some of the most common efficiency classes, sorted by slowest to fastest:
O(n!)
("factorial time")O(cn)
wherec
is some constant ("exponential time")O(nc)
wherec
is some constant >2 ("polynomial time")O(n2)
("quadratic time")O(nlogn)
("linearithmic time" or "log-linear time")O(n)
("linear time")O(logn)
("logarithmic time")O(1)
("constant time")
For example, we would say that as the input size n
grows, an O(n)
algorithm is faster than an O(n2)
algorithm.
What is big-O measuring?
We mentioned above that a big-O expression defines an upper bound on the amount of resources used by an algorithm. But how do we measure the amount of resources?
For time efficiency, recall from last week we are concerned with the computational time of the algorithm. The computational time is determined by the number of operations that the algorithm performs. When a computer program is executed, the code of the program is broken down into a series of simple operations, for example:
- Adding two numbers together
- Comparing two numbers
- Fetching a value from main memory
- Assigning a value to a variable
- And many others!
Taken together, these fundamental operations compose an entire computer program. For some programs, we are actually able to derive an exact formula for the number of operations it performs. For example, we might be able to say that an algorithm performs 3n - 3
move operations, where n
is the size of the input. To derive the big-O running time from this exact formula, we can use a couple of rules.
The rules of big-O notation
Rule 1: lower-order terms are ignored
Keep in mind that big-O defines what happens as n
increases -- in theory, to infinity. Therefore, when analyzing the efficiency of algorithms, we think of n
as a very large number.
When n
is large, mathematical expressions involving n
are dominated by their largest term (i.e., the term with the highest degree or exponent):
n | n^2/2 | n/2 | n^2/2 - n/2 | |
---|---|---|---|---|
10 | 50 | 5 | 45 | |
100 | 5,000 | 50 | 4,950 | |
10,000 | 50,000,000 | 5,000 | 49,995,000 |
Therefore, in big-O notation we ignore all lower-order terms and focus only on the highest order term: n2 + nlogn - n
becomes simply O(n2)
.
Rule 2: coefficients are ignored
For the same reason, we typically ignore coefficients, even on the largest term. As n
grows very large, coefficients are not significant. Even a coefficient such as 1/1,000,000,000,000
is insignificant when n
grows toward infinity!
For example, n2/2 + n = O(n2)
because we can ignore the lower-order term (n
) and can also ignore the coefficient (1/2
).
Big-O and Worst-Case Analysis
An algorithm can behave differently depending on the contents of its input. For example, consider this algorithm:
def print_odd_len_list(lst):
# if the list has an even number of elements
if len(lst) % 2 == 0:
return
for item in lst:
print(item)
Clearly, the if
statement here impacts how long this algorithm takes to run. If the given lst
has an odd number of elements, the function prints all of the elements, one-by-one. However, if lst
has an even number of elements, the function simply returns without doing any printing. How do we analyze the effect that conditional execution (i.e., if
statements) has on the running time of an algorithm?
We can break our analysis down into different cases. For print_odd_len_list()
, the best case occurs when the list is an even length, because there is the least amount of work to do. For even-length lists, the function is O(1) -- no matter how long the list is, it can always immediately return!
In the worst case, an odd-length list is given. In this case, the function iterates over the entire list, and is therefore O(n)
.
When we're analyzing algorithms, we are typically interested in the worst case behavior when devising our big-O expression. This allows us to ignore conditional execution and just focus on the overall structure of the algorithm. This worst-case analysis is for theoretical purposes only. In real-world applications, we do consider how frequently the best case would occur, along with whether there would be an average case (i.e., somewhere between best and worst).
Still, when someone asks for the big-O analysis of an algorithm, unless otherwise specified they are asking for the performance of the algorithm in the worst case.
Big-O and Space
We can also use big-O notation to express the amount of extra space that an algorithm uses. By "extra" space, we mean memory resources that are needed to complete the algorithm, excluding the input.
In some algorithms, the only extra memory resources that are needed are some local variables, such as loop indexes or a constant number of other variables on the stack. These algorithms have a space complexity of O(1)
.
Other algorithms require a more significant amount of extra space. For example, an algorithm that makes a copy of a list has a space complexity of O(n)
, since the amount of extra space needed by the algorithm linearly increases as the size n
of the input list increases.
Summary
We've covered a lot of ground in this lesson. Watch the video below to see a quick summary of big-O notation, and make sure that your understanding aligns with what it describes before moving on to the next lesson.
Heuristics for Finding Big-O Expressions
In this lesson, we will learn some heuristics, or rules of thumb, for deriving the big-O notation of algorithms by inspecting the number and nature of loops (i.e., for
loops and while
loops) in the algorithm.
But why loops? We focus our analysis on the presence of loops because loops are one of the most common techniques for iterating over the input to an algorithm. A loop over the entire input of size n
already guarantees that your algorithm is performing at least O(n)
operations -- just in order to iterate over the collection!
Loops up to n
Probably the simplest case for this type of analysis is when you have a loop that iterates according to some expression involving n
. For example, this algorithm is O(n)
:
def print_lst(lst):
n = len(lst)
for i in range(n):
print(lst[i])
Remember that the algorithm doesn't have to iterate over all n
items in order to be O(n)
. Since lower order terms and coefficients are ignored for the purposes of big-O, we would have an algorithm like this:
def my_strange_algorithm(lst):
n = len(lst)
for i in range(n / 1000000):
print(lst[i])
my_strange_algorithm()
is also O(n)
. We know this because the loop iterates n * 1/100000
times, but since for sufficiently large values of n
the coefficient 1/1000000
will not matter, we can ignore it and end up with O(n)
.
Note that this same kind of analysis applies to while
loops as well, as long as the number of iterations is on the order of the size of the input:
def print_lst(lst):
n = len(lst)
i = 0
while i < n:
print(lst[i])
i += 1
Loops up to a constant value
Some loops do not iterate with respect to the size of the input collection. For example, you might have an algorithm like the following:
def print_first_three(lst):
for i in range(3):
print(lst[i])
No matter what the size of the input list is -- could be five elements or one million elements -- this function will only ever perform three iterations. Therefore, this algorithm is constant time: O(1)
.
Back-to-back loops
Sometimes, algorithms contain loops that are serial or "back-to-back." When this happens, the operations (and big-O expressions) sum together. For example:
def serial_loops(lst):
for i in range(len(lst)):
print(lst[i])
for i in range(len(lst), -1, -1):
print(lst[i])
The first for
loop, which prints the list, is O(n)
. The second loop, which prints the list backwards, is also O(n)
. Putting these two together, O(n) + O(n) = O(2 * n) = O(n)
.
Nested loops
If loops are composed in a nested fashion, the number of operations are multiplied, and therefore the running times are multiplied. For example, consider these nested loops:
def nested_loops(lst):
for i in range(len(lst)):
for j in range(len(lst) - 5):
if i < j:
print(lst[i])
In this case, for every one iteration of the outer (i
) loop, the inner (j
) loop iterates O(n)
times. The outer loop itself iterates O(n)
times. Therefore, the if
statement inside of the nested loops is evaluated O(n) * O(n) = O(n^2)
times.
A couple of things to note:
- Strictly speaking, the inner loop iterates exactly
n - 5
times, but this isO(n)
. - The conditional statement (
if i < j
) doesn't have any effect on the running time. Remember that for the purposes of big-O, we are mostly concerned with the worst case behavior, so we will assume (even if it's not true) thati < j
is always true and the body of theif
is executed on every iteration.
However, it could also be the case that the inner loop looks slightly different:
def nested_loops_2(lst):
for i in range(len(lst)):
for j in range(i):
print(j)
Notice that the inner loop iterates up to i
, the index of of the outer loop. What's the big-O notation of the algorithm now? Let's count the number of iterations of the inner loop to find out.
- On the 1st iteration of the outer loop (when
i
equals 0), the inner loop will iterate 0 times. - On the 2nd iteration of the outer loop (when
i
equals 1), the inner loop will iterate 1 time. - On the 3rd iteration of the outer loop (when
i
equals 2), the inner loop will iterate 2 times. - On the final iteration of the outer loop (when
i
equalsn
- 1), the inner loop will iteraten
- 1 times.
If you sum the number of operations across all of these iterations you get:
1 + 2 + 3 + ... + (n - 3) + (n - 2) + (n - 1)
This is actually a well-known arithmetic sum that has a closed form expression:

n
integers.This means that nested_loops_2()
performs exactly n2/2 - n/2
comparisons for a list of size n
, which is O(n2)
.
Loops containing functions
When deriving the big-O notation by inspecting loops, you must take into the account the running time of any functions nested inside the loops as well. For example, it might be tempting to think of this function as O(n2)
:
def nested_loops_func(lst):
for i in range(len(lst)):
for j in range(len(lst)):
x = some_function(i, j)
print(x)
However, we must also consider the running time of some_function()
! Say that it is implemented as the following:
def some_function(times, factor):
total = 0
for k in range(times):
total += factor
return total
It turns out that some_function()
is O(n)
itself. Therefore, nesting it inside of two nested loops makes nested_loop_func()
O(n3)
overall.
Loops on different inputs
Finally, it's worth noting that when an algorithm has multiple inputs, we use different symbols to represent their runtimes. For example, this function has two inputs:
def multiple_inputs(lst1, lst2):
for item in lst1:
print(item)
for item in lst2:
print(item)
Since we don't know if lst1
and lst2
have the same lengths, we can't use n
to represent the length of both of them. Instead, we can use n
to represent the size of lst1
, and m
to represent the size of lst2
. Therefore, this algorithm has a running time of O(n + m)
.
Summary
Here's a video that summarizes big-O notation and some of the techniques discussed above:
Searching
Let's now put our algorithm analysis skills into practice in the context of searching.
Searching is a very common task in computer science. When we store data inside of data structures, we often need to search for and extract that information at a later point in time. In some applications, searches occur very frequently and need to be quickly executed.
For example, a large e-commerce website receives searches for products with keywords ("medium blue t-shirts"), and needs to search through their catalog of millions of products to return relevant results. The website may receive thousands of such requests every second!
Let's take a look at two methods of searching for data in a Python list: linear search and binary search. By the end of this week, we'll be able to compare the efficiency of these two algorithms.
Data structures such as trees and hash tables provide the ability to perform data lookups very fast -- but we're not quite there yet! We'll stick to using Python lists for now.
Linear search
A linear search simply iterates sequentially through a collection of items. The search checks every element of the collection, until either the desired item is found, or there are no more left items to inspect.
Here's an example of a linear search. The function below checks whether a list contains a given number:
def contains_num(lst, num):
for item in lst:
if item == num:
return True
return False
From our big-O heuristics, we know that this function contains a single for
loop that iterates over all items of the list, and so its running time is O(n)
.
Let's modify our linear searching algorithm. What could we do if the list was sorted -- with all of items of the list ordered according to their value, least to greatest? Could we improve the algorithm?
If the list were sorted, we could stop our search early if we were sure that the item was not in the list (the code in the elif
):
def contains_num(lst, num):
for item in lst:
if item == num:
return True
elif item > num:
# we are past the point where num
# would have been in the list
return False
return False
Since the list is sorted, if we've passed the point in the list where num
should have been, we know that it doesn't appear in the list. Try tracing this newest version of the function with the list [2, 4, 7, 8, 9]
while searching for the number 5
.
We have leveraged the fact that the list is in sorted order to improve our algorithm in some cases when num
does not appear in the list.
This optimization is nice, but it doesn't improve on the big-O running time of the original algorithm. When the item is not in the list, we still on average have to search through approximately half of the list, and n / 2 = O(n)
.
In addition, this optimization does not improve our algorithm at all when the item does appear in the list. For that, we can use binary search.
Binary search
One of the exercises from the P1 course involved writing a number guessing game, in which a program chooses a random number between 1 and 100, and the player guesses the number. If they guess the number, they win. If they guess too low or too high, the program tells them whether the guess was too high or too low and lets them keep guessing.
We mentioned in P1 that binary search would be a good algorithm to use in this problem, since it allows us to cut the space of what we're searching for in half with each guess. For the number guessing game, 50 would be a good initial guess, since the program would tell us whether the number was lower or higher, allowing us to then either ignore the numbers 1-49 or the numbers 51-100. Repeating this technique by guessing the "middle" number at each stage is binary search, which ends up being much faster than linear search!
Watch the video below to see an overview of binary search.
As the presenter mentioned, the number of items inspected by binary search is more like log2n
, compared to the n
items inspected by linear search. For this reason, binary search has a running time of O(logn)
, which is much more efficient than O(n)
!
It's worth emphasizing that binary search only works if the collection is sorted, and that sortedness doesn't come for free. As items are inserted into the collection, care must be taken to insert it in sorted order, or else we wouldn't be able to use binary search.
Now that we have an understanding of binary search in general, watch the following video to develop an understanding of how binary search could be implemented using a while
loop:
Summary
If a collection is sorted, then binary search can be a much faster way of performing a lookup than linear search. Sorting collections provides many benefits, so let's turn our attention to sorting as a topic next.
Sorting
We now know that sorting can be useful in some algorithms -- for example, if a list is sorted, then binary search can be used to make a search more efficient. But sorting is actually a much more general topic with many applications in the world around us. Watch the following video, which provides an overview of the goal of sorting and compares a few sorting algorithms:
As mentioned by the video, books are a great example of an item that can be sorted. In this class, for simplicity we will primarily focus on sorting integers, but the techniques that we will cover generalize to sorting other types of data as well (for example, strings).
Why sorting?
There are a few reasons why we're focusing on sorting algorithms for the rest of the lessons in this week, including:
- Sorting algorithms are all around us in our daily lives.
- Sorting algorithms are relatively easy to understand when learning how to analyze the efficiency of algorithms.
- There are multiple simple algorithms that take different approaches to sorting lists, which enables us to compare and contrast their techniques and efficiencies. For example, some algorithms (like selection sort and insertion sort) order the numbers in a list by comparing the numbers of the list directly to each other. Other algorithms (like radix sort) don't compare the numbers of the list at all, but rather distribute the numbers to buckets based on the value of the numbers themselves.
Our discussion of sorting algorithms is in part motivated by our desire to learn about how to analyze the efficiency of algorithms. Let's talk more about how to analyze the efficiency of sorting algorithms next.
Comparisons and moves
We learned in an earlier lesson that big-O notation represents an approximate upper bound on the number of operations that a computer program will execute in terms of the input size n
. For sorting algorithms specifically, the number of overall operations is dominated by two tasks: (1) comparing elements of the list and (2) moving elements of the list around to different positions. Therefore, we will focus our time analysis on the number of comparisons and moves that a sorting algorithm performs.
For example, this would be considered a comparison between two elements of a list:
if lst[i] < lst[i - 1]:
And the following would be considered a move into one of the positions of the list:
lst[i] = lst[i + 1]
In the next lesson, we will learn the selection sort algorithm, count the number of comparisons and moves it performs, and use those metrics to derive its big-O running time.
Selection Sort
The first sorting algorithm that we will learn is selection sort. Watch the following video to learn how the algorithm works.
In summary, to sort a list, selection sort performs the following steps:
- Find the minimum element in the list.
- Swap the minimum element in the list with the element at index 0. The minimum element is now in its final sorted position.
- Find the minimum element in the remaining portion of the list (starting from index 1).
- Swap this element with the element at index 1. The second-to-minimum element is now in its final sorted position.
- Continue finding the next-minimum element in the remaining portions of the list until you have swapped all elements into their final sorted positions.
Let's now think about how we could implement this algorithm in Python.
Swapping elements
As you can see, a key part of selection sort is swapping elements. How can we accomplish that in Python?
Let's try defining the swap()
function to swap two elements in a list. We'll make use of a variable temp
to allow the swap to happen without overwriting one of the element's values before we can use it:
def swap(elem1, elem2):
temp = elem1
elem1 = elem2
elem2 = temp
lst = [5, 8, 2, 1, 9]
swap(lst[0], lst[3])
print(lst)
As you can see, we will need to take a slightly different approach. Here's a second, improved version of the swap()
function:
def swap(lst, i, j):
temp = lst[i]
lst[i] = lst[j]
lst[j] = temp
lst = [5, 8, 2, 1, 9]
swap(lst, 0, 3)
print(lst)
Selection sort in Python
Let's see how we can use the swap()
function to implement selection sort in Python:
Counting comparisons and moves
Here again is the code for selection sort:
def index_smallest(lst, start):
index_min = start
for i in range(start + 1, len(lst)):
if lst[i] < lst[index_min]:
index_min = i
return index_min
def selection_sort(lst):
for i in range(len(lst) - 1):
j = index_smallest(lst, i)
swap(lst, i, j)
Let's try counting the number of comparisons C(n)
that selection sort makes for a list of size n
. To sort a list of n
elements, selection sort performs n - 1
passes over the list.
- On the first pass, it performs
n - 1
comparisons to find the smallest element. - On the second pass, it performs
n - 2
comparisons to find the smallest element. - ...
- On the
n - 1
st pass, it performs 1 comparison to find the smallest element.
If you sum the comparisons across all of these passes, you would get:
C(n) = 1 + 2 + 3 + ... + (n - 3) + (n - 2) + (n - 1)
This is actually the same as the arithmetic sum that we saw in the heuristics of big-O lesson. There, we learned that this sum is equal to n2/2 - n/2
, meaning that selection sort performs O(n2)
comparisons.
What about the number of moves? Well again we know that selection sort performs n - 1
passes, during each of which it performs one swap()
operation. The swap()
function performs three moves to shift values into a temporary variable and into the spots in the list, meaning that selection sort always performs exactly 3n - 3 = O(n)
moves.
Big-O notation
To summarize, we have the following exact expressions for the number of comparisons C(n)
and the number of moves M(n)
that selection sort performs for a list of size n
:
C(n) = n2/2 - n/2 = O(n2)
M(n) = 3n - n = O(n)
Using big-O notation, the number of comparisons is O(n2)
. This means that if we doubled the size of the input list from n
to 2n
, we would also expect the number of comparisons to approximately quadruple, since (2n)2 = 4n2
.
On the other hand, the number of moves in selection sort is O(n)
. If we double the length n
of the list to sort to 2n
, we would expect the number of moves to approximately double.
During the course of its execution, selection sort performs O(n2)
comparisons + O(n)
moves. If we follow the same general principle of dropping lower order terms, then the number of comparisons dominates the number of moves and the overall running time of selection sort is O(n2)
.
Insertion Sort
Let's continue our tour of sorting algorithms with insertion sort. Watch the video below for an overview of the algorithm:
To summarize, here are the steps of insertion sort:
-
Start with the element at index 1. Compare the element with all elements to its left (in this case, just the element at index 0). If the element at index 1 is less than the element at index 0, then slide the element at index 1 over so that the elements have switched places. The first two elements of the list are now sorted with respect to each other.
-
Consider the element at index 2. Again compare it with all elements to its left, sliding it over as necessary to maintain sortedness among the first three elements in the list. The first three elements of the list are now sorted with respect to each other.
-
Continue throughout the length of the list, taking one element at a time and finding its appropriate place in the sorted part of the list. During some iterations, an element may not need to be pushed left at all (if it is greater than the first element to its left).
Insertion sort in Python
Let's see how insertion sort can be implemented in Python:
Big-O analysis of insertion sort
Let's take another look at the code for insertion sort:
def insertion_sort(lst):
for i in range(1, len(lst)):
to_insert = lst[i]
j = i
while j > 0 and to_insert < lst[j - 1]:
lst[j] = lst[j - 1]
j -= 1
lst[j] = to_insert
Unlike selection sort, with insertion sort there is a difference between the best case and worst case behavior of the algorithm.
Best case analysis
Let's trace through what would happen if the list given to insertion sort was already sorted:
Since the body of the while
loop is never executed, we can essentially ignore it for the purposes of our big-O analysis. This essentially leaves us with a single for
loop to execute. We can therefore say that in the best case (a fully sorted list), the running time of insertion sort is only O(n)
!
Worst case analysis
The worst case scenario for insertion sort occurs when the algorithm is given a reverse sorted list as input. This is an inefficient scenario for insertion sort because it means that each element is compared to all elements to its left on every iteration of the algorithm.
Let's count the number of comparisons in this case:
lst[1]
is compared to 1 element (lst[0]
)lst[2]
is compared to 2 elements (lst[0]
andlst[1]
)- ...
lst[n - 1]
is compared ton
- 1 elements
Therefore, the number of comparisons C(n) = 1 + 2 + ... + (n - 2) + (n - 1)
. We've seen this pattern before! It's again an O(n2)
algorithm.
Average case analysis
In the average case, the input list is neither sorted nor reverse sorted, but simply a random collection of numbers. In this scenario, some elements may need to be shifted left many times, some elements may need to be shifted left only a few times, and some elements may not need to be shifted left at all.
Under this scenario, the nested while
loop is running a non-negligible number of times, and so we basically have nested loops, leaving us with an O(n^2)
algorithm in general.
Space complexity
How much extra space is needed for algorithms like selection sort and insertion sort?
Both of these algorithms are in-place sorting algorithms, meaning that the sorting can be done within the input list itself. We may need to use a small number of temporary variables to do swapping or to keep track of which value to insert, but this is a constant amount of extra memory. Therefore, the space complexity of both selection sort and insertion sort are O(1)
.
Radix sort
Both of the sorts we've considered so far are comparative sorting algorithms. They sort the lists by comparing elements of the list against each other to find their relative orderings, and puts them into position based on that ordering.
But that's not the only approach to sorting! One alternative is distributive sorting, which distributes the values of the list into positions based on an inherent characteristics of the values themselves.
One example of this is radix sort. Watch the video below to see how radix sort works:
In summary, the algorithm is as follows:
-
Place the values of the list into buckets 0-9 according to their least significant digit (the "ones" digit). At this point, the values are sorted with respect to their ones digits only.
-
Iterate over the values in the buckets, starting from the 0 bucket and proceeding all the way to the 9 bucket. Place the values of the list into buckets 0-9 according to their next significant digit (the "tens" digit). At this point, the values are sorted with respect to both their tens and ones digits.
-
Continue iterating over the values in the buckets (from 0 to 9), consider the next significant digit for each pass. Stop when there are no more digits to consider.
-
Format the buckets back into a list and return the list.
Runtime analysis
For random lists, both selection sort and insertion sort are O(n2)
algorithms. Although we won't go into the proof here, it can actually be proven that comparative algorithms can do, at best, O(nlogn)
. But can we do better with a non-comparative sort like radix sort?
Let's think about the running time of radix sort by defining some variables:
m
= number of digitsn
= length of the list
Using these variables, we can define the running time of radix sort to be O(m * n)
. This is because the algorithm follows this pseudocode:
for each of m digits:
for each of n values:
place each value into a bucket according to its ith digit
In other words, radix sort performs m
passes, each of which processes all n
values of the list. Since these for
loops are nested, we can use our big-O heuristics to see that the running times of these loops would be multiplied to yield a running time of O(m * n)
.
In many cases, we can consider the maximum number of digits to be a constant, since numbers often have a finite representation in computer programs (e.g., 32-bit integers). Therefore, radix sort is often considered to be a linear-time O(n)
sort!
Note that the algorithm requires a list of buckets at each step to distribute the values into, meaning that the sort is not in-place and requires O(n)
extra space.
Practice: Algorithm Analysis and Sorting
💡 This is your chance to put what you've learned into action.
Try solving these practice challenges to check that you understand the concepts. No submission is necessary for practice exercises.
Why practice?
Practice coding helps you become a great coder. These practice problems aren't graded, but that doesn't mean they aren't important.
You should aim to practice a lot, especially with unfamiliar concepts. Spread practice over multiple days to take advantage of the spacing effect, which helps you retain new knowledge.
More about practice
Practice helps you understand what you know, and what you don't know. It can be easy to trick yourself into thinking you understand something when you do not -- or that you don't understand when you do. Practicing by writing code or debugging code will help you find out what you really understand, and where you are still confused.
Practice helps build confidence in your coding. The more programs you write, and the more problems you solve, the more you learn that you are a capable coder and problem-solver.
Practice doesn't always feel good - sometimes you'll be stumped! But, practice shouldn't feel super frustrating either. If you find yourself getting angry at yourself or the code, it's a good time to take a break and ask for help.
The solutions to each challenge are available, and you can view a video of the solution below each challenge.
- Try to go through the whole challenge without using the solution.
- If you can’t do the challenge without looking the solution, it means you don’t understand the material well enough yet.
- Try the next practice challenges without looking at the solution. If you need more practice challenges, reach out on Discord.
1. Sorting lists using various sorts
Consider the following list:
[19, 81, 67, 21, 86, 42, 6, 30, 16]
- Sort the list using selection sort. Show the contents of the list after each pass of the outer loop.
- Sort the list using insertion sort. Show the contents of the list after each pass of the outer loop.
- Sort the list using radix sort. Show the contents of the list after each round of bucketing.
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution video.
2. Deriving big-O running time
Suppose you want to count the number of duplicate items in a list. For example, if given the list [4, 1, 9, 1, 1, 4, 6, 8]
, there would be three duplicates: two extra 1s and one extra 4.
Consider the following algorithm to count the number of duplicates:
def num_dups_1(lst):
num_dups = 0
for i in range(len(lst)):
for j in range(i + 1, len(lst)):
if lst[i] == lst[j]:
num_dups += 1
break
return num_dups
This algorithm looks at every element of the list and considers all elements to its right. If it finds a duplicate, it adds one to the count. To avoid double-counting duplicates, it will break
the inner loop once a duplicate is found and continue on to the next element of the outer loop.
- What's the worst case big-O time efficiency of
num_dups_1()
in terms of the length of the list? What's the big-O space efficiency?
Here's an alternative way of implementing this algorithm, which first sorts the input using radix sort:
def num_dups_2(lst):
sorted_lst = radix_sort(lst)
num_dups = 0
for i in range(1, len(lst)):
if lst[i] == lst[i - 1]:
num_dups += 1
return num_dups
This version of the algorithm first sorts the list using radix sort, and then uses a for
loop to count the number of duplicates as it iterates over the sorted list.
-
What's the worst-case big-O time efficiency of
num_dups_2()
in terms of the length of the list? What's the big-O space efficiency? Be sure to include the time and space efficiencies of radix sort as discussed in this week's lessons as a part of your answer. -
According to your analysis, which version of the algorithm is more efficient?
Solution Video
3. Searching for duplicates using binary search
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
4. Experimentally deriving the running time of insertion sort
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
5. Optimizing selection sort
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
Week 2 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 2
Due: January 21 at 11:59 PM GMT
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
Implementing and Analyzing Sorting Algorithms
This week we learned about how to sort collections of data using various algorithms. We also learned how to analyze the efficiency of these algorithms using big-O notation.
In this assignment, you will analyze an unnamed sorting algorithm by counting the number of operations the sorting algorithm makes. You will then take those measurements and use them to derive the big-O notations of the algorithm under different conditions. Finally, you will implement radix sort and analyze it empirically as well.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Week 2 Interviews
During the first two weeks of the term, two special sessions will be held in which the instructor will perform an interview with a student. These special sessions are separate from the regular class meeting and are optional. However, the goal of them is to provide you with some experience and expectations for the routine of a technical interview.
The instructor will seek out volunteers to be the interviewees for these special sessions.
- Question Guide
- Video
- Who is the Interviewer: Cody
Reading
Read the selection sort and insertion sort sections of Chapter 3 of An Open Guide to Data Structures and Algorithms. You don't need to read the sections about merge sort or quicksort, as we'll cover those topics later.
Recursion I
Welcome to week 3! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Implement algorithms using recursion
- Visualize and trace the execution of a recursive algorithm
- Compute the time and space complexities of recursive algorithms
- Avoid common pitfalls of writing recursive programs
What's due this week
- Week 3 Quiz on Thursday (on Gradescope)
- Get started on your peer interviews (reflection not due until next week)
A Loop Alternative
So far, we have used loops to process data structures such as Python lists and dictionaries. But that's not the only way! We can also use an approach known as recursion.
We'll define recursion as a programming technique in the next lesson. For now, let's just get our brains working recursively.
Counting your position in a queue
Imagine that you're in a queue for a sporting event, and want to know what your position is in the queue. Maybe the venue has only a limited number of seats, and you're wondering whether you'll be able to get a seat.
One approach to finding out your position in the queue would be to simply count the number of people in front of you. For each person in the queue, you add one to a running sum. Once you've counted all the people in the queue, add one more to the sum to get your position in the queue.
This technique is analogous to using a loop to iterate over some data, and it works great. But let's try thinking about this problem a little bit differently.
Instead of counting all of the people yourself, you realize that if you knew the position in the queue of the person in front of you, you could just add one to calculate your own position in the queue. Genius!

You ask Kisha what her position in the queue is. She's almost as far back as you are, so she in turn asks Amari what his position in the queue is. Amari asks the person in front of him, so on and so forth, until we reach the beginning of the queue. There, Emmanuel asks Zuri what her position in the queue is. Zuri knows for a fact that she is first in the queue, so she tells Emmanuel that she is in position 1.

Emmanuel takes this information and adds one to the count, so that he can tell the person behind him that he is in position 2 in the queue. Each person repeats this process, until eventually Kisha tells you that she is in position 536 in the queue. You take this information and add one to the count to get your final answer: you are in position 537!

This is a recursive solution to the problem. Let's now more carefully define what we mean by recursion.
What is recursion?
Here's one definition of recursion:
Definition.
Recursion is a problem solving technique in which you define a problem in terms of a simpler version of itself.
In the above example, we were able to define the "position in queue" problem in terms of a smaller version of itself:
Recursive case
position_in_queue(me) = position_in_queue(person in front of me) + 1
Through repeatedly applying this logic throughout the queue, we eventually found that position_in_queue(person in front of me)
was 536, leaving us with an easy calculation to find that position_in_queue(me)
is 537.
Another way of thinking about this approach is that we took a problem of size n
(there are n
people in the queue), and decomposed it into a problem of size (n - 1
).
Technically, we actually only applied this logic for everyone in the queue except the person at the front of the queue. In that case, the problem was simple enough for us to solve without needing to do any extra work. For the first person in the queue, we know that:
Base case
position_in_queue(first person in queue) = 1
This is known as the base case -- the case for which the problem is small enough or simple enough that we can define the answer without needing to break the problem down recursively. For all of the other people in the queue, we used the recursive case formula defined above.
Other applications of recursion
There are many natural applications of recursion in the world around us. For example, geometric patterns can be defined in terms of smaller version of themselves, such as Sierpinski's triangle:

Some tasks, such as climbing a ladder, can be defined in terms of recursion: climbing to rung n
of a ladder is equivalent to climbing to rung n - 1
and then climbing one additional rung.
Some mathematical formulas, such as the Fibonacci sequence or the factorial function, are defined recursively. For example, the factorial function (n!
) is defined as:
n! = n * (n - 1)!
You might be wondering at this point why we need to use recursion when loops seem perfectly suitable for our needs. The next lesson will address this question.
Why Use Recursion?
So far, thinking about solving problems recursively may seem more difficult than using loops. However, there are many problems that are more naturally (and more elegantly) solved using recursion. You will soon find that certain algorithms (e.g., binary search) and certain data structures (e.g., linked lists and trees) are easier to think about recursively.
Searching through a file directory is one example of a problem that is more easily solved recursively. Watch the below video to see why trying to print all of the files and folders from your computer is not simple to do using loops:
So how would we do this using recursion? We'll do this as a practice exercise at the end of the week (practice exercise 5).
Now that we know more about why recursion can be useful, let's learn the basics of how we can use recursion as a programming technique.
Recursion in Python
We now turn our attention to using recursion in Python. Watch the following video to learn how to implement the factorial function both iteratively (using loops) and recursively in Python:
To summarize, a recursive definition of factorial in Python could look like this:
def factorial(n):
if n < 2:
# base case
return 1
else:
# recursive case
return n * factorial(n - 1)
During the video, the presenter claimed that recursion "does not scale up like iteration" and "requires more memory." Both of these statements are generally true. To see why, let's talk about how recursion works with respect to our memory model.
Recursion in memory
Back in week 1, we defined the runtime stack as the part of main memory where local variables and function parameters are stored. We also had the following definition of a stack frame:
Definition.
When a function is called, a new stack frame is added to the stack. The stack frame includes enough room for all of the local variables and parameters that the function may use. When the function call returns, the stack frame is removed from the stack.
Each time a recursive function calls itself, another frame is added to the stack. This continues until the base case is reached. In the case of the factorial function, this is what the stack would look like at the moment that the base case is reached:

Notice that each stack frame contains its own n
variable. We say that each function call has its own state, or variables that are local to its stack frame, which make the overall computation possible.
Once the base case (factorial(0)
) is reached, it returns 1
to the previous function call (factorial(2)
). In fact, each recursive call will receive a return value, multiply it by n
(whatever n
is in that stack frame), and return the result to the previous recursive caller.
Every time the recursion function is called, a new frame is added to the stack, and each one of these frames takes up memory. Contrast this with a loop-based approach, which can often use a single loop index variable (such as i
). For this reason, recursion can indeed be more expensive in terms of space than iteration. We'll discuss this (in)efficiency in terms of big-O notation in a later lesson.
Tracing Recursive Functions
Tracing recursive functions can be an effective technique for understanding their behavior and identifying bugs.
Using print statements
One common approach to tracing recursive functions is to use print statements to output the input arguments and intermediate results at each step of the recursion. For example, consider the following recursive function that calculates the factorial of a positive integer:
def factorial(n):
if n == 0:
result = 1
else:
sub_result = factorial(n - 1)
result = n * sub_result
return result
To trace this function using print statements, we can add statements to output the input argument and the intermediate result at each step of the recursion. We can also add an indent
parameter, which we can use to change the level of indentation of the print statement to more clearly illustrate the nested recursive calls:
def factorial(n, indent):
print(indent + f"factorial({n})")
if n == 0:
result = 1
else:
sub_result = factorial(n - 1, indent + ' ')
result = n * sub_result
print(indent + f"result = {result}")
return result
Note that with each recursive call, the level of indentation is increased (indent + ' '
). The level of indentation increases the length of the whitespace string to more clearly show which recursive calls are nested within each other.
This approach can be useful for small and simple functions, but may be cumbersome for complex recursion.
Try stepping through this code as factorial(5, '')
executes (''
is the initial level of indentation):
Using diagrams
You may have noticed in the previous example in Python Tutor that the runtime stack changed as the recursion executed. It can indeed be helpful to visualize the stack as the recursion executes -- this can help us understand that each call to the recursive function has its own stack frame and state. It also illustrates that after the base case is reached, all of the recursive calls must be "returned through" in order for the recursive process to complete.
Watch the following video to see an example of tracing a recursive function with a stack diagram:
Tracing example
Below is another example of a recursive function, that counts up from a
to b
:
Try tracing the execution of count(1, 5)
. You should see the output as:
1
2
3
4
5
Now consider the following modification, which swaps the two lines in the else
clause:
def count(a, b):
if a >= b:
print(a)
else:
# below statements are reversed
count(a + 1, b)
print(a)
What do you think will be the output of count(1, 5)
with this modification? To figure it out, try drawing the runtime stack while tracing the execution of the code. Is anything different about the output? If needed, try putting the code into Python Tutor.
Infinite Recursion
One potential pitfall when writing recursive functions is to ensure you're not calling the recursive function infinitely. This can happen if your base case is missing or incorrectly defined.
For example, this program attempts to count down from a given integer, two numbers at a time:
def count_down_by_two(n):
if n == 0:
# base case
print(n)
else:
print(n)
count_down_by_two(n - 2)
Say that we call count_down_by_two(8)
. Everything works as expected:
count_down_by_two(8):
print(8)
call count_down_by_two(6)
count_down_by_two(6):
print(6)
call count_down_by_two(4)
count_down_by_two(4):
print(4)
call count_down_by_two(2)
count_down_by_two(2):
print(2)
call count_down_by_two(0)
count_down_by_two(0):
print(0)
return (base case)
All of the recursive calls then return, and the overall output is:
8
6
4
2
0
Check your understanding of the execution of this code by stepping through it yourself using Python Tutor:
But what if we call count_down_by_two(5)
? Before playing the video below, try to predict what the result will be by tracing the execution of the count_down_by_two()
function defined above. Only proceed to the video and code below after making your prediction.
Clearly, we need a more comprehensive base case -- one that is more robust to the different patterns of execution. Here's one way you could correct the issue (by using <=
in the base case):
def count_down_by_two(n):
if n <= 0:
# base case
# if we've gone negative, just print 0 and stop
print(0)
else:
print(n)
count_down_by_two(n - 2)
Now that we know a bit about writing recursive methods, let's learn about how to trace their execution to gain a better understanding of how they work.
Fibonacci Sequence
One classic example of a problem that can be solved using recursion is the computation of the Fibonacci sequence. The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, starting from 0 and 1. The sequence can be expressed mathematically as follows:
fib(0) = 0
fib(1) = 1
fib(n) = fib(n-1) + fib(n-2) for n > 1
For example, the first 10 terms of the Fibonacci sequence are:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34
The Fibonacci sequence is significant in mathematics and computer science because it is a ubiquitous example of a recursive sequence, and is often found in natural phenomena such as the growth patterns of plants. The sequence also exhibits a wide range of interesting mathematical properties, such as the golden ratio, which has many applications in art, architecture, and design.
Watch the following video to learn more about how the Fibonacci sequence appears in many aspects of our world:
Implementing Fibonacci
Based on the mathematical definition of the Fibonacci sequence above, try writing a recursive function that implements the nth Fibonacci number. Don't forget the base case(s)!
When satisfied with your effort at writing the function, watch the following video to see one potential solution:
Click to reveal the video.
Note the example at the end of the video, which shows that it seems to take many function calls to calculate even a small Fibonacci number, such as fibonacci(6)
. Let's take a closer look at the efficiency of the recursive implementation of the Fibonacci function in the next lesson.
Recursion and Efficiency
We left off in the last lesson with the observation that when calculating fibonacci(6)
, it seems to take more recursive function calls than we might have expected. Let's more precisely define the efficiency of the recursive implementation of calculating the nth Fibonacci number.
To help us do so, let's draw a call tree -- a diagram that illustrates how the recursive functions are called.
As illustrated in the video, calculating fibonacci(4)
required us to call fibonacci(1)
three times! As we calculate larger and larger Fibonacci numbers, the number of recursive calls that we need to make is going to grow considerably:
Fibonacci number n | Number of function calls needed |
---|---|
3 | 5 |
4 | 9 |
5 | 17 |
... | ... |
10 | 109 |
20 | 10946 |
30 | 1346269 |
As the size of the input (nth Fibonacci number) increases, the number of function calls increases exponentially (O(2n)
).
Since each function call takes constant time to execute (i.e., O(1)
), the total running time of the recursive implementation of the Fibonacci sequence can be expressed as the product of the number of function calls and the running time per call. This gives a total running time of O(2n) * O(1) = O(2n)
.
This is our first example of an exponential time algorithm. This is much slower than any of the algorithms we have seen so far. Before this, the slowest algorithms we had seen were selection sort and insertion sort (in the worst case), which were both O(n2)
.
Calculating big-O for recursive functions
When we learned heuristics for computing the big-O notation for loop-based algorithms, we said that you can multiply running times when operations are nested inside of loops. For example:
def print_list(lst):
for item in lst:
print(lst)
Here, we perform a constant time operation (a print statement) inside of a linear (O(n)
) loop, so the overall running time is O(1) * O(n) = O(n)
.
The same principle applies for recursive functions, except instead of counting the number of iterations of a loop, we are counting the number of function calls. This is because the number of function calls determines the number of times the body of the function is executed.
Consider the following count_down()
function, which counts down from n
to 0:
def count_down(n):
if n <= 0:
print(n)
else:
print(n)
count(n - 1)
How many function calls does count_down()
make to count down from n
? We can organize our analysis using another table:
n | Number of function calls needed |
---|---|
1 | 2 |
2 | 3 |
3 | 4 |
4 | 5 |
... | ... |
For example, to count down from n
= 1, we need to call count_down(1)
, which itself calls count_down(0)
, which is the base case. That required two function calls.
As we can see, counting down from n
takes exactly n + 1 = O(n)
function calls. We also know that there is a constant amount of work done for each function call -- just a print statement. Therefore, the big-O expression for count_down(n)
is O(n) * O(1) = O(n)
.
Space efficiency
We also need to consider the space complexity of recursive functions. For this, we again need to think about the number of function calls that are made with respect to the size of the input n
.
In the count_down()
example above, we reasoned that counting down from n
to 0 requires O(n)
function calls. Each one of these function calls requires adding a stack frame to the runtime stack, and each stack frame requires extra memory to store the local variables of the function.
To calculate how much extra memory we'll need, we need to think about how many frames will be on the stack when the base case is reached. This is the point at which we have the most frames on the stack throughout the execution of the algorithm. In the case of count_down()
, we have n + 1
frames on the stack, so the space complexity is O(n)
.
Contrast the recursive approach with an iterative count_down()
function below:
def count_down_iter(n):
for i in range (n, -1, -1):
print(i)
In terms of time complexity, this is an O(n)
algorithm -- just like the recursive version. However, in terms of space complexity, this algorithm is O(1)
, since it uses only a constant amount of extra memory.
In general, recursive algorithms may require more memory than their equivalent iterative counterparts. But space complexity is only one factor when considering whether to write an algorithm recursively or iteratively. Often times, the recursive version of the algorithm is easier to write or has better time efficiency.
Summary
The time complexity of a recursive function is determined in part by the number of times a recursive call is made for an input of size n
. As we've seen, this can lead to running times such as O(2n)
or O(n)
. In the near future, we will see examples of recursive functions that run in time O(logn)
and O(nlogn)
as well.
The space complexity of a recursive function is also determined in part by the number of times the function is invoked. This can generally lead to recursive algorithms requiring more memory than iterative algorithms, but this does not necessarily mean recursive algorithms are prohibitively expensive.
Practice: Recursion I
💡 This is your chance to put what you've learned into action.
Try solving these practice challenges to check that you understand the concepts. No submission is necessary for practice exercises.
Why practice?
Practice coding helps you become a great coder. These practice problems aren't graded, but that doesn't mean they aren't important.
You should aim to practice a lot, especially with unfamiliar concepts. Spread practice over multiple days to take advantage of the spacing effect, which helps you retain new knowledge.
More about practice
Practice helps you understand what you know, and what you don't know. It can be easy to trick yourself into thinking you understand something when you do not -- or that you don't understand when you do. Practicing by writing code or debugging code will help you find out what you really understand, and where you are still confused.
Practice helps build confidence in your coding. The more programs you write, and the more problems you solve, the more you learn that you are a capable coder and problem-solver.
Practice doesn't always feel good - sometimes you'll be stumped! But, practice shouldn't feel super frustrating either. If you find yourself getting angry at yourself or the code, it's a good time to take a break and ask for help.
The solutions to each challenge are available, and you can view a video of the solution below each challenge.
- Try to go through the whole challenge without using the solution.
- If you can’t do the challenge without looking the solution, it means you don’t understand the material well enough yet.
- Try the next practice challenges without looking at the solution. If you need more practice challenges, reach out on Discord.
1. Recursion and the stack
Consider the below definition of the Fibonacci function:
def fibonacci(n):
if n < 2:
return 1
return fibonacci(n - 1) + fibonacci(n - 2)
-
What is the fourth function call made to the
fibonacci()
function when executingfibonacci(4)
? Considerfibonacci(4)
to be the first function call in the process. -
How many stack frames are on the stack during the fourth function call of
fibonacci()
from (1)? Exclude any stack frames that may have been on the stack beforefibonacci(4)
was called. -
What is the fifth function call made to the
fibonacci()
function when executingfibonacci(4)
? Considerfibonacci(4)
to be the first function call in the process. -
How many stack frames are on the stack during the fifth function call of
fibonacci()
from (3)? Exclude any stack frames that may have been on the stack beforefibonacci(4)
was called.
Solution video.
2. Tracing a recursive function
Consider the following recursive algorithm:
def mystery(x, y)
if x * y == 0:
return x
else:
return y + mystery(x - 1, y - 2)
-
Trace the execution of
mystery(5, 6)
using one of the techniques from the lessons, including drawing a call tree or runtime stack, or by writing down the sequence of recursive calls and return values. -
What value is ultimately returned by
mystery(5, 6)
? -
How many stack frames are on the stack when the base case is reached? Exclude from your count any stack frames that may have been on the stack before
mystery(5, 6)
was called. -
Is it possible for this function to produce infinite recursion? If not, explain why. If yes, give example values of
x
andy
that would lead to infinite recursion.
Solution Video
3. Implementing various recursive functions
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
4. Recursively printing a file
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
5. Recursively walking a directory tree
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
Week 3 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Week 3 Peer Interviews
Overview
For weeks 3 and 4, the class will be split into two pools of students, Group A and Group B. Each student in Group A will be paired with a student in Group B.
During this two-week period, you and your partner should conduct two 30 minute interviews, in which you are each once the interviewer and once the interviewee.
Each week, an "Interview Question of the Week" guide will be distributed via Google Drive to either Group A or Group B. The guide will contain all of the information needed to conduct the interview, including general instructions, a coding question, a sample solution, and proposed follow-up questions.
After the interview is completed, both partners will be asked to submit a reflection about their experience.
We will assign new partners in weeks 7-8.
Interview instructions
For all interviews, you should make sure you have read the interview instructions. After each interview, both the interviewer and interviewee should submit a reflection. The format for the reflection is described in the aforementioned document.
Here is the schedule of peer interviews and question guides for the term. Note: you must be logged-in to your
@kibo.school
account and in the appropriate group to access the question guide for a given week.
Week | Resources | Who is the interviewer? |
---|---|---|
3 | Question Guide | Group A |
Grading
The peer interviews will contribute 10% to your overall grade. This grade is based on
- completion of all four interviews,
- your reflections submitted for each interview.
You can find and submit the interview reflection in Gradescope.
FAQs
- In what setting will the interview take place?
The interview can take place as an online meeting. Meeting in person is not necessary, but is allowed if both partners agree.
- What happens if I'm unable to complete an interview in a given week?
There are opportunities for makeup interviews throughout the semester.
- What happens if my partner is unable to participate in an interview for a given week?
You can either do a makeup interview later in the semester, or find another peer who is willing to conduct the interview with you during the week (only in cases where your partner is the interviewer).
- I don't feel confident in my coding skills, and I'm afraid I won't do well during the peer interview.
In this class (and often in the real world), you are not strictly judged on your ability to produce a bug-free, highly efficient implementation of an algorithm. You are mostly judged on your thinking process and ability to work your way through the problem.
For the peer interviews, you will not be graded on the correctness of your solution, and will earn full credit as long as you complete the interview and submit a thoughtful reflection.
Also: we are here to learn! We are doing this process to give you experience interviewing, so that when you go to do a real interview, you already have some interview experience. We should all be mindful of this and help and encourage each other.
- My partner was unprepared for the interview and it hurt my experience. What can I do?
This is something you should discuss in your submitted reflection. Everyone should come prepared to every interview. Even though these are peer interviews, they are being conducted to help each other gain experience in this important part of becoming a software developer, and being unprepared will diminish this experience for your partner.
- As the interviewee, can I see the solution before I attempt the peer interview?
Please do not seek out the interview question or solution before attempting the interview. If you do, you will not get the full benefit of the exercise, which will only hurt your interviewing skills in the long run.
- How do I volunteer to be interviewed during weeks 1 and 2?
The instructor will provide information about finding volunteers at the start of the term.
- Does the interview have to last the full 30 minutes?
No, but it should last at least 20 minutes, and ideally 25-30 minutes.
- I have another question about the peer interviews. Who can I ask?
You can ask in the class Discord, or email the instructor directly.
Reading
Read Chapter 2 (up to but not including the section titled "More Examples of Recursive Algorithms") of An Open Guide to Data Structures and Algorithms.
Recursion II
Welcome to week 4! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Trace the execution of recursive algorithms on strings and lists
- Write recursive algorithms from scratch on strings and lists
- Apply divide-and-conquer recursive techniques to sort lists
- Conceptualize how Web crawlers index the Internet
What's due this week
- Week 4 Quiz on Thursday (on Gradescope)
- Week 3 Interview Reflection on Sunday
String Recursion
Last week, we learned that recursion is a powerful programming technique, and we used it to solve numerical problems: finding the nth Fibonacci number, calculating the factorial function, etc. This week, we will use recursion on lists and strings to solve more complex problems.
Just like numerical recursion, string recursion involves breaking down a problem into smaller subproblems and solving them recursively. We can think of strings as collections of n
characters, and can make the problem smaller at each step by solving a problem of size n - 1
, then solving a problem of n - 2
, etc., all the way down to a base case.
In this lesson, we will explore three common applications of string recursion in Python: printing a string, finding the length of a string, and appending a character to a string.
Printing a String
Let's use recursion to print a string, one character at a time. Thinking recursively, we can define the problem of printing a string as follows:
print(string of length n) = print(one character), then print(string of length n - 1)
Here's a recursive function that does this:
def print_string(s):
if len(s) == 0:
return
else:
print(s[0])
print_string(s[1:])
Notice a few things:
- The base case occurs when the function is given the empty string (
''
). In that case, there's nothing left to print, so we can simply return. - We make a recursive call to
print_string()
with everything but the first character using a slice,s[1:]
. - The function has no return value. Recursive functions may or may not have a return value. In this case,
print_string()
can accomplish its task without returning a specific value.
Finding the length of a string
Watch the following video to learn how to write a recursive function that calculates the length of a string.
In contrast to print_string()
above, the recursive_str_len()
function that was written in the video returns a number. This return value is built up through the return statements in the series of recursive calls.
Replacing a character in a string
Watch the following video to see how to write a recursive function that replaces a character in a string.
Efficiency
Note that for each of the above examples -- printing a string, finding the length of a string, and replacing characters in a string -- we needed to recurse over the length of the string. This is analogous to using a for
loop to iterate over the characters of the string. Therefore, all three of these algorithms are O(n)
.
List Recursion
Recursion over lists is very similar to recursion over strings. Since lists in Python can be indexed and sliced like strings, the same principles that we learned in the last lesson also apply to lists.
A mystery recursion function
Here's an example of a recursive function that operates on a list:
def mystery(lst):
if len(lst) == 0:
return 0
else:
return lst[0] + mystery(lst[1:])
Storing the recursive return value
Consider the following recursive function, which counts the even numbers in a list:
def count_even_numbers(lst):
if len(lst) == 0:
return 0
else:
if lst[0] % 2 == 0:
return 1 + count_even_numbers(lst[1:])
else:
return count_even_numbers(lst[1:])
For example, count_even_numbers([1, 2, 3, 4, 5])
would evaluate to 2
, since there are two even numbers in that list.
This function checks whether the number at the first position of the list l[0]
is even, and if it is, returns 1 plus the count of even numbers in the rest of the list (excluding the first character).
If the first number is not even, the function just returns the count of numbers in the rest of the list.
Here's an alternative way of implementing this function:
def count_even_numbers(lst):
if len(lst) == 0:
return 0
else:
count_in_rest = count_even_numbers(lst[1:])
if lst[0] % 2 == 0:
return 1 + count_in_rest
else:
return count_in_rest
In this version of the function, we made the recursive call first, stored its return value in a variable, and then used that variable when deciding what to return. This approach can be useful to simplify the code of your recursive functions and to clarify the logic.
Recursive binary search
At this point, we have mentioned binary search in multiple lessons, and even implemented it iteratively. Now, we are ready to implement a recursive version of binary search.
As before, binary search is suited for searching for a value in O(logn)
time when the collection is sorted. Watch the following video to see how it is implemented.
Here's the version of binary search from the video:
def binary_search(lst, val, low, high):
if low > high:
return None
else:
mid = (low + high) // 2
if lst[mid] == val:
return mid
elif val < lst[mid]:
return binary_search(lst, val, low, mid - 1)
else:
return binary_search(lst, val, mid + 1, high)
Reminder:
//
in Python performs integer division, which will truncate any fractional part of the result. For example,7 // 2
is3
.
Merge Sort
Let's now revisit the topic of sorting. Previously, we talked about selection sort, insertion sort, and radix sort, and compared and contrasted their efficiencies.
Now that we know recursion, let's learn an algorithm that recursively sorts a list. One such algorithm is merge sort, which uses a divide-and-consquer technique to sort a list in time O(nlogn)
.
Watch the following video to learn how merge sort works:
To summarize, merge sort has two phases:
-
Start with a list of
n
elements. Repeatedly split the list in half using recursion until you are left with sublists of size 1. This is the base case, since sublists of size 1 are, naturally, sorted. -
Merge the sublists of size 1 pairwise, into sorted sublists of size 2. Repeat this process, merging the sublists of size 2 pairwise into sublists of size 4, so on and so forth. Repeat until at last you merge the two sublists of size
n / 2
into one fully sorted list of sizen
.
Merge sort in Python
Watch the following video to step through the implementation of merge sort:
Here's the code written in the video:
def merge_sort(lst):
if len(lst) > 1:
left_lst = lst[:len(lst) // 2]
right_lst = lst[len(lst) // 2:]
# Recursion
merge_sort(left_lst)
merge_sort(right_lst)
# Merge left_lst and right_lst
i = 0 # left_lst index
j = 0 # right_lst index
k = 0 # merged list index
while i < len(left_lst) and j < len(right_lst):
if left_lst[i] < right_lst[i]:
lst[k] = left_lst[i]
i += 1
else:
lst[k] = right_lst[j]
j += 1
k += 1
# Fill in remaining elements from left_lst
while i < len(left_lst):
lst[k] = left_lst[i]
i += 1
k += 1
# Fill in remaining elements from right_lst
while j < len(right_lst):
lst[k] = right_lst[j]
j += 1
k += 1
A few notes:
- After the recursive calls to
merge_sort()
return, bothleft_lst
andright_lst
are sorted, and can therefore be merged into a single sorted list using the merging algorithm. - Only one of bodies of the last two
while
loops will be executed, depending on which half (left or right) still has elements. - The base is when the condition
if len(lst) > 1
is false. Lists that have a length of <= 1 are already sorted, so no further recursion is needed and the function doesn't need to do anything -- it can just return.
Efficiency
Let's now talk about the efficiency of merge sort, both in terms of space and time:
To summarize, merge sort runs in time O(nlogn)
in all cases, which is much better than selection sort and insertion sort, which typically run in time O(n2)
. However, merge sort requires O(n)
extra space, as it is not an in-place sorting algorithm.
Recursive Web Crawling
Let's now turn to a different application of recursion: Web page crawling.
When we first introduced recursion, we discussed the problem of recursively searching through a filesystem. We said that recursion is a natural choice for this task, since you don't know ahead of time how many directories you may need to search through. Recursing through the levels of directories avoids the need for many levels of nested loops.
Crawling Web pages is a very similar task. On the Web, pages often contain links, which are connections to other pages. When you click on a link, your browser brings you to the page that is the target of the link. That page may itself also be filled with links to other pages. You can continue crawling through links in this manner.
Web crawler algorithm
Let's think about how we would write a Web crawler. In order to fetch pages from the Web and follow links, we would need to use a Python library to help us with these tasks. Instead of using Python, though, for the moment let's just take a look at some pseudocode to describe how the Web crawler could work:
function crawl(url, visited_links):
if url in visited_links:
return
response = send_request(url)
links = extract_links(response.content)
visited_links.append(url)
for link in links:
if is_valid_link(link):
crawl(link, visited_links)
In this example, crawl()
is the main function that recursively crawls websites. The function takes two parameters: url
is the starting URL, and visited_links
is a list of all the URLs that have already been visited.
The function first checks if the current URL has already been visited, and if so, it returns. Otherwise, it sends a request to the URL using a function send_request()
and extracts all the links from the response content using a function extract_links()
. The current URL is then added to the list of visited links.
Finally, the function recursively visits each link by calling itself with the new URL. The function is_valid_link()
is a function that checks if the link is valid, such as checking if it starts with "http."
Limiting recursion depth
In a massive network of pages like the Web, there could be millions and millions of pages and links to visit. In fact, there are so many pages on the Web that if you create a new website, Google estimates that it could take weeks for its Web crawlers to reach your Web page and add it to Google's search index.
For a simple version of a Web crawler, it might make sense to stop the crawler at a certain depth of recursion. By depth of recursion, we mean the number of recursive calls currently on the runtime stack. Following a link causes the Web crawler to increase its depth of recursion, and returning from a recursive call decreases the depth.
For our Web crawler, we may wish to be able to say "crawl all Web pages starting from mywebsite.com, up to a recursion depth of 10." This would allow our Web crawler to perform its task within some limits, hopefully therefore completing in a reasonable amount of time.
How would this affect our algorithm? First of all, we would need another parameter to track the current level of depth of recursion. We will call it depth
, and give it a default value of 0. We would also need to update this parameter -- each time we make a recursive call, we need to add one to depth
:
function crawl(url, visited_links=[], depth=0):
if url in visited_links:
return
response = send_request(url)
links = extract_links(response.content)
visited_links.append(url)
for link in links:
if is_valid_link(link):
crawl(link, visited_links, depth + 1)
Note:
visited_links
anddepth
have default parameter values. If either one is not specified whencrawl()
is called, then they will be assigned a default value:visited_links
will be the empty list, anddepth
will be 0.
Practice: Recursion II
💡 This is your chance to put what you've learned into action.
Try solving these practice challenges to check that you understand the concepts. No submission is necessary for practice exercises.
Why practice?
Practice coding helps you become a great coder. These practice problems aren't graded, but that doesn't mean they aren't important.
You should aim to practice a lot, especially with unfamiliar concepts. Spread practice over multiple days to take advantage of the spacing effect, which helps you retain new knowledge.
More about practice
Practice helps you understand what you know, and what you don't know. It can be easy to trick yourself into thinking you understand something when you do not -- or that you don't understand when you do. Practicing by writing code or debugging code will help you find out what you really understand, and where you are still confused.
Practice helps build confidence in your coding. The more programs you write, and the more problems you solve, the more you learn that you are a capable coder and problem-solver.
Practice doesn't always feel good - sometimes you'll be stumped! But, practice shouldn't feel super frustrating either. If you find yourself getting angry at yourself or the code, it's a good time to take a break and ask for help.
The solutions to each challenge are available, and you can view a video of the solution below each challenge.
- Try to go through the whole challenge without using the solution.
- If you can’t do the challenge without looking the solution, it means you don’t understand the material well enough yet.
- Try the next practice challenges without looking at the solution. If you need more practice challenges, reach out on Discord.
1. Sorting a list using merge sort
Consider the following list:
[19, 81, 67, 21, 42, 6, 30, 16]
- Sort the list using merge sort. Show the contents of the list as the list is split and then merged together.
- What's the state of the list after the fourth merge?
Solution video.
2. Recursive string functions
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
3. Recursion list functions
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
Week 4 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Week 4 Peer Interviews
Overview
For weeks 3 and 4, the class will be split into two pools of students, Group A and Group B. Each student in Group A will be paired with a student in Group B.
During this two-week period, you and your partner should conduct two 30 minute interviews, in which you are each once the interviewer and once the interviewee.
Each week, an "Interview Question of the Week" guide will be distributed via Google Drive to either Group A or Group B. The guide will contain all of the information needed to conduct the interview, including general instructions, a coding question, a sample solution, and proposed follow-up questions.
After the interview is completed, both partners will be asked to submit a reflection about their experience.
We will assign new partners in weeks 7-8.
Interview instructions
For all interviews, you should make sure you have read the interview instructions. After each interview, both the interviewer and interviewee should submit a reflection. The format for the reflection is described in the aforementioned document.
Here is the schedule of peer interviews and question guides for the term. Note: you must be logged-in to your
@kibo.school
account and in the appropriate group to access the question guide for a given week.
Week | Resources | Who is the interviewer? |
---|---|---|
4 | Question Guide | Group B |
Grading
The peer interviews will contribute 10% to your overall grade. This grade is based on
- completion of all four interviews,
- your reflections submitted for each interview.
You can find and submit the interview reflection in Gradescope.
FAQs
- In what setting will the interview take place?
The interview can take place as an online meeting. Meeting in person is not necessary, but is allowed if both partners agree.
- What happens if I'm unable to complete an interview in a given week?
There are opportunities for makeup interviews throughout the semester.
- What happens if my partner is unable to participate in an interview for a given week?
You can either do a makeup interview later in the semester, or find another peer who is willing to conduct the interview with you during the week (only in cases where your partner is the interviewer).
- I don't feel confident in my coding skills, and I'm afraid I won't do well during the peer interview.
In this class (and often in the real world), you are not strictly judged on your ability to produce a bug-free, highly efficient implementation of an algorithm. You are mostly judged on your thinking process and ability to work your way through the problem.
For the peer interviews, you will not be graded on the correctness of your solution, and will earn full credit as long as you complete the interview and submit a thoughtful reflection.
Also: we are here to learn! We are doing this process to give you experience interviewing, so that when you go to do a real interview, you already have some interview experience. We should all be mindful of this and help and encourage each other.
- My partner was unprepared for the interview and it hurt my experience. What can I do?
This is something you should discuss in your submitted reflection. Everyone should come prepared to every interview. Even though these are peer interviews, they are being conducted to help each other gain experience in this important part of becoming a software developer, and being unprepared will diminish this experience for your partner.
- As the interviewee, can I see the solution before I attempt the peer interview?
Please do not seek out the interview question or solution before attempting the interview. If you do, you will not get the full benefit of the exercise, which will only hurt your interviewing skills in the long run.
- How do I volunteer to be interviewed during weeks 1 and 2?
The instructor will provide information about finding volunteers at the start of the term.
- Does the interview have to last the full 30 minutes?
No, but it should last at least 20 minutes, and ideally 25-30 minutes.
- I have another question about the peer interviews. Who can I ask?
You can ask in the class Discord, or email the instructor directly.
Reading
Read Chapter 2, starting from "More Examples of Recursive Algorithms" up to but not including the section titled "Tail-Call Optimization," of An Open Guide to Data Structures and Algorithms.
Linked Lists
Welcome to week 5! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Compare and contrast arrays and linked lists
- Choose the appropriate data structure for some given constraints
- Trace algorithms on reference-based data structures, like linked-lists
- Write algorithms to build, traverse, and modify linked lists
What's due this week
- Week 5 Quiz on Thursday (on Gradescope)
- Assignment 3
Arrays
Up to this point, we have used Python lists as our main data structure. Lists may seem like a simple collection of data, but there's actually a simpler sequential data structure available for us to study: the array.
An array is a fixed-length data sequence that contains elements of a single type. The elements of an array are stored contiguously in memory (i.e., in adjacent memory locations). A sample array of four-byte integers (and some sample memory addresses) is shown below:

Aside. Although Python doesn't have a built-in array type like some other languages (e.g., Java and C++), there is an
array
library:import array arr = array.array('i', [1, 2, 3]) # creates an array of three integers print(arr)
However, we will not be using the
array
package in this class.
Accessing items in an array
A useful property of arrays is that they allow efficient random access of elements. In other words, we can access any element i
of an array in O(1)
time. Typically, programming languages use notation like arr[i]
to get element i
of an array named arr
.
Note the requirement that the array only contains elements of a single type. This is actually what makes it possible for accessing arr[i]
to be an O(1)
operation. If we know the memory address of the start of the array and also know the size of each element, then we can use arithmetic to quickly compute the location of element i
in memory:
memory_address(arr[i]) = memory_address(arr) + (i * array_item_size)
For example, each element in the above array is a four-byte integer, so we can quickly compute the memory address of the element at index 6:
memory_address(arr[6]) = 1040 + (6 * 4)
memory_address(arr[6]) = 1064
If the elements were of different types, they might be different sizes (e.g., eight-byte floating point numbers, 1 byte ASCII characters, etc.), and therefore we wouldn't be able to quickly compute the memory address of arr[i]
.
Adding items to an array
The main disadvantage of arrays is that they are fixed-length, so they cannot be easily resized. If you need to store more elements than the capacity of an array would allow, you would need to allocate a new, larger array and copy over all of the elements from the old array.
Python lists vs. arrays
You might at this point be thinking that arrays sound very similar to Python lists. They are indeed similar, but not exactly the same. Arrays are a more general concept, and are used across different programming languages. Python lists are -- of course -- specific to Python, and are actually a data structure that is implemented using arrays!
There are a couple of differences between arrays and Python lists:
- Python lists can contain different data types
- Python lists are extendable -- with a caveat!
Lists can contain different data types
In Python, you can create lists that have different types of data:
lst = ['hello', 1, False, 'world', 5.0]
But if lists use arrays under the hood, and arrays can only hold a single type of data, how is this possible? The trick is that in Python, all data types are objects. Therefore, Python lists are really arrays of references to objects:

Each reference to an object is the same size, so we can still use simple arithmetic to compute the memory address of location i
in a Python list.
Lists are "extendable"
Python lists can seemingly grow indefinitely. If we start with a collection of data in a list, we can later add elements to the list using append()
:
lst = [1, 2, 3]
lst.append(4)
lst.append(5)
...
lst.append(99)
But again -- how can this be true? If Python lists are implemented using arrays, and arrays have a fixed length, then how can we grow lists indefinitely?
The answer is that occasionally, append()
needs to resize the underlying array in order to add another element. For example, if we create a Python list with three elements:
lst = [1, 2, 3]
Under the hood, Python might actually create an array that has enough capacity for six elements:
lst = [1, 2, 3] # stored in an array [1, 2, 3, , , ]
The program would then be able to make three calls to append()
before the underlying array is full. But on the fourth call to append()
, the underlying array would need to be resized (e.g., by doubling its length) in order to accomodate the new element:
lst.append(4) # array now [1, 2, 3, 4, , ]
lst.append(5) # array now [1, 2, 3, 4, 5, ]
lst.append(6) # array now [1, 2, 3, 4, 5, 6]
lst.append(7) # array now [1, 2, 3, 4, 5, 6, 7, , , , , ]
Linked Lists
Linked lists are a fundamental alternative to arrays for building data structures. Instead of storing the items of the collection contiguously in a fixed amount of memory (as in arrays), a linked list stores the collection in separate nodes:

Each node is an object that contains two items: a value (in the example above, an integer value) and a "next" pointer -- i.e., a reference to the next node in the list. The last node in the list has a next pointer value of None
.
The linked list as a whole is represented by a variable (head
in the example above) that holds a reference to the first node in the list -- the "head" of the linked list.
Linked lists in memory
The nodes of the linked lists don't need to be stored consecutively in the heap region of memory. In fact, they can be stored at mostly arbitrary memory locations. This is the reason that each node needs to store the memory address of the next node in the list. To illustrate this concept, here's another view of the same linked list, but instead of using arrows to represent memory addresses, we will write the (made up) memory addresses:

This is in contrast to arrays, which as we learned in the previous lesson, store elements in consecutive memory locations:

This difference in memory representation between arrays and linked lists has profound effects on how they are used.
Features of linked lists
Because each linked list is composed of a sequence of separate nodes, they can grow indefinitely (provided there is enough memory) -- they don't need to be resized when they reach capacity, as arrays do.
In addition, it's easy to insert or delete an item in a linked list, since there is no need to "shift over" items when you do so. Instead, the list can be modified by just altering the references in the list. For example, to add the 23 node to the list, we can simply alter the next pointer of the 91 node:

However, linked lists also have some disadvantages. Since they are not stored in memory contiguously, they do not provide random access. In order find element i
in the list, the list must be traversed starting from the head. This makes fetching element i
an O(n)
operation. In addition, the links themselves between nodes take up memory.
Arrays vs. linked lists
Often, we need to choose a data structure based on whether it is implemented using an array or a linked list. In those situations, it's helpful to compare and contrast the two choices as follows:
- Arrays provide the ability to access elements in
O(1)
time, but it's expensive to insert an item in the middle of an array (since elements need to be shifted over) and to increase the size of an array (bothO(n)
operations). - Linked lists provide the ability to indefinitely extend the collection, but require
O(n)
time to access an element.
In general, arrays are preferred when the data structures is being used to do mostly read operations -- i.e., the data structure doesn't frequently change. For a data structure that often needs to change, it might be better to use a linked list.
Summary
Watch the below video to see a summary of linked lists:
Building Linked Lists
To begin our study of linked lists in Python, let's start with the algorithm for constructing a linked list from scratch.
A linked list string
To guide our study of linked lists, we will work with a string that is represented as a linked list of characters. Each node will contain a single character, and the linked list as a whole represents a string:

In Python, strings are implemented using arrays of characters. However, using a linked list representation of a string will allow us to learn the basics of manipulating linked data structures with a familiar data type.
The linked list as a whole will be represented using an LLString
object. Each individual node of the list will be represented using a Node
object. We can define Node
to be a class nested within the LLString
class:
class LLString:
class Node:
def __init__(self, val, next):
self.val = val
self.next = next
def __init__(self):
self.head = None
The LLString
constructor simply initializes the head
of the linked list to None
. In order to actually add some nodes to the linked list, let's write a new method in this class.
Adding a node to a linked list
Watch the video below to see how to add a node to the beginning of a linked list:
Appending a node to a linked list
What if we wanted to add a node to the end of the list? We have two options:
- Starting at
self.head
, iterate all the way to the end of the list and add a node. - Maintain a reference to the end of the list (the tail), so that we can quickly add a new node.
We will see how to iterate down a linked list to do (1) in the next lesson. For now, let's focus on (2). To start, we will need to create self.tail
in the LLString
constructor, and also make sure that the prepend()
function sets it when the first node is added to the list:
class LLString:
class Node:
def __init__(self, val, next):
self.val = val
self.next = next
def __init__(self):
self.head = None
self.tail = None
def prepend(self, new_val):
new_node = LLString.Node(new_val, self.head)
self.head = new_node
if self.tail is None:
self.tail = new_node
Now that we have self.tail
, watch the video below to see how to add a node to the end of a linked list:
Converting a string to a linked list
To make our work with LLString
easier, let's update the constructor so that it converts a Python string to an LLList
. To do so, we can make use of our new append()
method:
def __init__(self, s):
self.head = None
self.tail = None
for char in s:
self.append(char)
With this, we can create new linked lists with a single call:
str1 = LLString('hello')
Putting it all together, here's our current version of the LLString
class:
class LLString:
class Node:
def __init__(self, val, next):
self.val = val
self.next = next
def __init__(self, s):
self.head = None
self.tail = None
for char in s:
self.append(char)
def prepend(self, new_val):
new_node = LLString.Node(new_val, self.head)
self.head = new_node
if self.tail is None:
self.tail = new_node
def append(self, new_val):
new_node = LLString.Node(new_val, None)
if self.tail is not None:
self.tail.next = new_node
self.tail = new_node
if self.head is None:
self.head = new_node
Traversing a Linked List
So far, we have written algorithms to build linked lists, which have only operated at the beginning or the end of a linked list. However, many algorithms require us to iterate over (or traverse) an entire sequential data structure. So how is that done with linked lists?
Printing an LLString
For a first example of traversing a list, let's see how we could use iteration to print an LLString
:
As shown in the video, this method of traversing a list is a common programming pattern. Here's a template that you can use:
def traversal(self):
trav = self.head
while trav is not None:
# ... do something here
trav = trav.next
Recursively printing an LLString
Linked lists are a naturally recursive data structure, since each linked list is either:
- Empty
- A single node, followed by a linked list
For this reason, it is often easier to write recursive functions to operate on linked lists. For example, watch the following video to see how to recursively print a linked list:
Finding the length of a list
Your turn: either iteratively or recursively, write a method length()
that returns the length of the linked list. If the list is empty, the function should return 0.
Click to reveal the answer.
Here's the iterative version:
def length(self):
trav = self.head
count = 0
while trav is not None:
count += 1
trav = trav.next
return count
And here's the recursive version:
def length(self):
return self.__length(self.head)
def __length(self, node):
if node is None:
return 0
else:
return 1 + self.__length(node.next)
Modifying a Linked List
We've now learned how to build and traverse a linked list, but there is still one broad class of algorithms we still need to talk about: modifying an existing linked list.
Deleting a node from a linked list
Watch the following video to see how to delete a node from a linked list:
Inserting a node in a linked list
Instead of adding a new node to the beginning or end of a linked list, we may wish to add a node in the middle of the list -- at position i
.
Using what you learned from the node deletion video above, try implementing the insert()
function below. The function should create a new node with the value new_val
, and insert the node into position i
of the linked list. For example, if the list contains the string 'rat'
, then insert('f', 2)
should change the linked list so that it contains the string 'raft'
.
If there is no position i
of the list (i.e., the list is not long enough), the function should just add the node at the end of the list.
Remember to update self.head
(and as an extra challenge, self.tail
) if needed.
Click to reveal the answer.
Here's a solution:
def insert(self, new_val, i):
new_node = LLString.Node(new_val, None)
trav = self.head
prev = None
if i < 0:
return
if trav is None:
# list is empty
self.head = new_node
self.tail = new_node
return
num = 0
while num < i:
prev = trav
trav = trav.next
num += 1
if trav is None:
# add to the end
prev.next = new_node
self.tail = new_node
return
if i == 0:
new_node.next = self.head
self.head = new_node
else:
new_node.next = trav
prev.next = new_node
Doubly Linked Lists
A common variation of linked list is the doubly linked list. In this version, each node has both a next
pointer as well as a pointer to the previous node, often called prev
pointer. This makes it possible to traverse through the list in both directions, which makes some algorithms (such as deleting a node) easier to implement.
Watch the following video to learn more about doubly-linked lists:
Note: there is a brief mention in the video of how to create a doubly linked list node in the C programming language using a
struct
. You can ignore this segment and just focus on the general ideas presented about doubly linked lists.
Practice: Linked Lists
💡 This is your chance to put what you've learned into action.
Try solving these practice challenges to check that you understand the concepts. No submission is necessary for practice exercises.
Why practice?
Practice coding helps you become a great coder. These practice problems aren't graded, but that doesn't mean they aren't important.
You should aim to practice a lot, especially with unfamiliar concepts. Spread practice over multiple days to take advantage of the spacing effect, which helps you retain new knowledge.
More about practice
Practice helps you understand what you know, and what you don't know. It can be easy to trick yourself into thinking you understand something when you do not -- or that you don't understand when you do. Practicing by writing code or debugging code will help you find out what you really understand, and where you are still confused.
Practice helps build confidence in your coding. The more programs you write, and the more problems you solve, the more you learn that you are a capable coder and problem-solver.
Practice doesn't always feel good - sometimes you'll be stumped! But, practice shouldn't feel super frustrating either. If you find yourself getting angry at yourself or the code, it's a good time to take a break and ask for help.
The solutions to each challenge are available, and you can view a video of the solution below each challenge.
- Try to go through the whole challenge without using the solution.
- If you can’t do the challenge without looking the solution, it means you don’t understand the material well enough yet.
- Try the next practice challenges without looking at the solution. If you need more practice challenges, reach out on Discord.
1. Working with linked list references
Consider the following linked list:

- What is the value of
temp.next
? - What does
temp.next.val
evaluate to? - What does
head.next.next == temp
evaluate to? - What are some ways we can access the character 'o'?
- How do the following statements change the diagram?
temp.next = ref
temp = ref.next.next
head = head.next
ref = None
Solution video.
2. Writing LLString
functions
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
3. Working with doubly linked lists
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
4. Implementing the set ADT using a linked list
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
Week 5 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 3
Due date: February 11
📌 This is an individual project. You are expected to work independently.
If you get stuck, confused, or have trouble with the project, you should use the #help-dsa channel in Discord or message an instructor. Remember not to spoil the project for others - do not post code samples from your work to public spaces.
Writing LLString
Methods
This week, we learned about linked lists and wrote various methods for the LLString
class.
In this assignment, you will write various functions that implement a linked list version of a string, including trimming a string, making a copy of a string, and replacing characters in a string. You will use both iteration and recursion, and ensure that your functions handle edge cases such as empty lists.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Statement on Usage of AI Tools
The list below details acceptable and unacceptable uses of AI tools in Kibo courses, including generative AI tools like ChatGPT and AI tools integrated into code editors like CoPilot. For use cases not listed below, inquire about their acceptability on the course's Discord Help channel.
Acceptable Uses:
- Generating Ideas: Assisting in brainstorming topics for essays, projects, or presentations.
- Research Aid: Helping to find sources, papers, or literature for further reading and reference.
- Explaining Concepts: Providing explanations of complex concepts in simpler terms or different perspectives.
- Outlining: Aiding in creating outlines of topics to build structure for essays or reports.
- Drafting Assistance: Creating initial drafts of supplementary writing material where the main artifact being produced is not a written item.
- Creative Inspiration: Generating creative ideas for projects in arts, literature, or design.
- Technical Guidance: Offering guidance on coding problems, mathematical equations, or scientific principles.
- Proofreading: Assisting in grammar and spelling checks for written assignments.
Unacceptable Uses:
- AI-Generated Work: Submitting work significantly generated by AI as one's own in essays, reports, code, or other assignments.
- Plagiarism: Using AI to rephrase or copy content without proper citation or acknowledgment.
- Test and Exam Assistance: Using AI tools during tests or exams unless explicitly allowed by the instructor.
- Drafting Assistance: Creating initial drafts of written materials for general educations courses.
- Complete Dependence: Relying solely on AI for understanding core concepts without engaging in personal study or critical thinking (AI tools can often be wrong).
Important Notes:
- Students must disclose the use of AI tools in their assignments and provide either shareable links or exports (if the tool doesn't support sharing) showing their use of the tools on the assignment.
- AI tools might not strictly adhere to instructions or might include unrequested information. Always review the output to ensure accuracy and relevance.
- Work generated through AI should be critically reviewed and substantially modified to ensure originality and understanding.
Reading
Read Chapter 5 of An Open Guide to Data Structures and Algorithms.
Midterm Preparation
We will have our midterm on Thursday, February 15 at 6:00 PM GMT. Here are the details:
Date and time
- Held during Week 6 during our normal class time: Thursday, February 15 at 6:00 PM GMT
- You will get the link to the midterm ahead of time, but it won't be enabled until 6:00 PM GMT the day of the exam
Material
- Everything that was covered in weeks 1 - 4 of the class
- This includes lesson material, live classes, practice problems, assignments
Format
- Administered online using a tool called OnlineExamMaker
- You can prepare and use any amount of notes on paper, but no other resources on the computer or Internet are allowed
- 45 minute exam from the time you begin
- Be sure to start the test between 6:00 - 6:15 PM GMT (it will close after that point)
- Exam will be composed of nine multiple choice questions, shown one at a time
- You can't go back and change your answer after moving on to the next question
- We won't be using Zoom during the exam; you can just access the exam individually when you're ready
- Plan to have scratch paper for tracing/diagramming etc.
Proctoring software
- The exam website is also a proctoring service
- It will ask you to turn your camera on, and will take some pictures at random during the test
- It will flag any activity such as closing the test, changing programs, switching tabs, etc. If you do this too many times, it will close the exam and you will not be able to continue. Once you start the exam, you should expect to stay on the exam page and complete each question without any interruptions or using your computer for anything else.
- It disables copy and paste during the test
- It gives us the option to monitor your activity
Exam environment
- Plan to be in a test-taking environment
- Quiet, mostly private
Practice exam
- A practice exam is available here so that you can get used to the software and format.
Questions
If you have any questions about the exam, please ask in #dsa-help on Discord or reach out to the instructor.
Week 6
Welcome to week 6! In this week, we take a break from learning new material to focus on the midterm exam.
What's due this week
- Midterm exam on Thursday
Midterm Exam
We will have our midterm on Thursday, February 15 at 6:00 PM GMT. Here are the details:
Link to Exam
You can access the exam on OnlineExamMaker here: https://t.onlineexammaker.com/doexam/BZpOYapqOE9.html
Date and time
- Held during Week 6 during our normal class time: Thursday, February 15 at 6:00 PM GMT
- You can access the link to the midterm ahead of time, but it won't be enabled until 6:00 PM GMT the day of the exam
Material
- Everything that was covered in weeks 1 - 4 of the class
- This includes lesson material, live classes, practice problems, assignments
Format
- Administered online using a tool called OnlineExamMaker
- You can prepare and use any amount of notes on paper, but no other resources on the computer or Internet are allowed
- 45 minute exam from the time you begin
- Be sure to start the test between 6:00 - 6:15 PM GMT (it will close after that point)
- Exam will be composed of nine multiple choice questions, shown one at a time
- You can't go back and change your answer after moving on to the next question
- We won't be using Zoom during the exam; you can just access the exam individually when you're ready
- Plan to have scratch paper for tracing/diagramming etc.
Proctoring software
- The exam website is also a proctoring service
- It will ask you to turn your camera on, and will take some pictures at random during the test
- It will flag any activity such as closing the test, changing programs, switching tabs, etc. If you do this too many times, it will close the exam and you will not be able to continue. Once you start the exam, you should expect to stay on the exam page and complete each question without any interruptions or using your computer for anything else.
- It disables copy and paste during the test
- It gives us the option to monitor your activity
Exam environment
- Plan to be in a test-taking environment
- Quiet, mostly private
Practice exam
- A practice exam is available here so that you can get used to the software and format.
Questions
If you have any questions leading up to the exam, please ask in #dsa-help on Discord or reach out to the instructor.
Lists, Stacks, and Queues
Welcome to week 7! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Describe the abstractions provided by lists, stacks, and queues
- Explain the reasoning for using an iterator
- Use existing Python data types as stacks and queues
What's due this week
- Week 7 Quiz on Thursday (on Gradescope)
- Get started on your peer interviews (reflection not due until next week)
Lists
We spent the first half of the course building our toolbox with the knowledge, skills, and programming constructs needed to create and use data structures, including:
- The memory model of Python
- Big-O notation
- Recursion
- Arrays and linked lists
With this knowledge, we are finally ready to describe some new ADTs! We will start with a "list."
The list ADT
A list is a data structure in which items can be inserted, accessed, and removed from any position. In other words, a list has the following properties:
- Order matters: you can ask to insert, get, or remove an element at a specific position
i
- Duplicates are allowed: there is no restriction regarding duplicates
It might be helpful to contrast the idea of a list with the set ADT, in which order did not matter (we could only grab a random element) and duplicates were not allowed.
To go along with our ADT, we can specify some operations that we want lists to support:
get_item(i)
: get the item at positioni
add_item(item, i)
: add the specified item at positioni
remove_item(i)
: remove the item at positioni
length()
: get the number of items in the listis_full()
: test if the list already has the maximum number of itemsto_string()
: return a string representation of the list
The methods for the list ADT above are just defined for our purposes in this class -- there is no specific set of methods that a "list" is required to support in general. For example, Python's built-in list data structure is similar to this definition of the list ADT. However, it doesn't have any concept of being "full" (and so doesn't have a is_full()
method) and also has other functionality, such as slicing.
Linked lists vs. lists vs. Python lists
At this point we have learned about a few entities that are all called lists, so let's take a moment to remember the differences between them.
A linked list is a building block data structure in which data is connected using nodes and references. It can used to implement various ADTs.
A list is an ADT that represents a sequential collection of data where order matters and duplicates are allowed. It could be implemented in various ways, including using an array or a linked list.
A Python list is a Python-specific implementation of the list ADT, which uses an array under the hood along with some additional functionality (e.g., slicing).
Implementing a list using an array
After creating the list ADT, the next question to ask is how should we implement it? Since a list is a sequential data structure, it's natural to consider an array.
Of course, since arrays don't exist as a built-in feature of Python, we will use Python lists. But recall that Python lists are implemented using arrays, so we need to keep that in mind as we use Python lists to implement the list ADT.
Watch the video below to see how the list ADT could be implemented using a Python list (i.e., an array):
Implementing a list using a linked list
The other natural choice for implementing the list ADT would be to use a linked list. Watch the video below to see how this would work:
Note: The class name
LLList
was chosen because it is the linked list (LL) implementation of a list.
Summary
Let's summarize the tradeoffs between the array and linked list implementations of the list ADT:
ArrayList | LLList | |
---|---|---|
get_item() | O(1) in all cases | Best: O(1) (get_item(0) )Average/Worst: O(n) (getting item at middle/end) |
add_item() | Best: O(1) (adding to end)Average/Worst: O(n) (adding to beginning/middle) | Best: O(1) (add_item(0) )Average/Worst: O(n) (adding to middle/end) |
Iterators
Let's now use one of our list implementations to solve a problem: counting the number of times a given item appears in a list.
We can write a function to do so. However, let's write it from the perspective of a program that it outside of the ArrayList
or LLList
class:
from lllist import LLList
import random
def count(list_impl, item):
total = 0
for i in range(list_impl.length()):
list_item = list_impl.get_item(i)
if list_item == item:
total += 1
return total
# create a new LLList
my_list = LLList()
# add one million random numbers to the list
for i in range(10000):
my_list.add_item(random.randint(1, 100), 0)
# count how many times 13 is in the list
print(count(my_list, 13))
This program creates a new list, adds one million random numbers to it, and then counts the number of times 13 appears in the list.
However, there's something surprising about this function: it is inefficient! To see why, watch the following video:
To summarize, the count()
function as written above is efficient when the list that is passed in as a parameter is an ArrayList
. However, when an LLList
is passed in as a parameter, the function isn't very efficient.
To fix this issue, we'll need to use an iterator.
Iterators
An iterator is an object that can be used to maintain state about the current position of a traversal over a data structure. The main benefit of an iterator is that it allows functions outside of the data structure's class to efficiently iterate over an instance of a data structure.
Iterators typically have just two methods:
has_next()
: whether there is another item in the traversalnext()
: returns the next item and advances the iterator
Functions can then create iterators and call these methods. For example, the count()
method could be changed to use an iterator:
def count(list_impl, item):
total = 0
it = list_impl.iterator()
while it.has_next():
list_item = it.next()
if list_item == item:
total += 1
return total
In order for each implementation of the list ADT to have an iterator, we can add an iterator()
method to the list of expected methods:
iterator()
: returns an iterator for the list
Making an efficient LLList
iterator
Watch the video below to see how to make an efficient iterator for the LLList
class:
By using the iterator, we've improved the time complexity of the count()
function from O(n2)
to O(n)
!
Stacks
The next ADT that we will consider is a stack.
A stack is another sequential data structure, but unlike a list, it does not provide the ability to insert or access an element at an arbitrary position i
. Instead, elements can only be added or removed at one end (the "top"), and you can only access the item that is currently at the top:
To learn more about stacks, watch the following video:
To summarize, stacks provide a last in, first out (LIFO) abstraction -- the most recently pushed item (last in) will be the first one popped (first out) during a removal. The stack ADT provides us with the following functions:
push()
: add an item to the top of the stackpop()
: remove the item at the top of the stacktop()
: get the item at the top of the stack, but don't remove itis_empty()
: test if the stack is emptyis_full()
: test if the stack is full
Note: there is no need for an iterator for stacks. Stacks don't provide the abstraction that you can iterate over their contents -- you can only access the item at the top.
Stacks have many real-world applications, including:
- The runtime stack, on which stack frames are pushed and popped
- Evaluating mathematical expressions and programming language expressions
- Supporting "undo" operations in software (go back to the most recent state)
Implementing a stack using an array
As with lists, we can think about how a stack could be implemented as an array. Watch the following video to see a sample implementation:
Implementing a stack using a linked list
We can also efficiently implement stacks using a linked list. Watch the following video to see how:
Summary
Let's summarize the tradeoffs between the array and linked list implementations of the stack ADT:
ArrayStack | LLStack | |
---|---|---|
push() | O(1) (on average) | O(1) |
pop() | O(1) | O(1) |
Space efficiency | O(m) (m is number of expected items) | O(n) (n is number of items in stack) |
Queues
The final ADT that we will consider this week is the queue.
A queue is yet another sequential data structure. The main abstraction that it provides is a first in, first out (FIFO) policy: items are added at the rear and removed from the front, and you can only access the item that is currently at the front.
To learn more about queues, watch the following video:
To summarize, the queue ADT provides the following methods:
enqueue()
: add an item at the rear of the queuedequeue()
: remove the item at the front of the queuefront()
: get the item at the front of the queue, but don't remove itis_empty()
: test if the queue is emptyis_full()
: test if the queue is full
Note: just like with stacks, there is no need for an iterator for a queue. It doesn't provide an iterable abstraction.
Since the FIFO abstraction is considered a "fair" policy in many real-world situations (e.g., "first come, first served"), queues have many applications across computer science and related operational fields. Any time a "request" of some kind needs to be served, it can be put in a queue, such as:
- Waiting in a virtual queue to buy tickets
- A Web server processing incoming HTTP requests for pages
- A printer serving requests for print jobs
Implementing a queue using a linked list
Implementing a queue using a linked list can be easily done by maintaining both a front
and rear
reference to a linked list. Watch the following video to see how:
A queue can also efficiently be implemented using an array, but the exact implementation of it is saved as a practice exercise for this week.
Summary
Although we haven't yet covered the array implementation of a queue (you can find that as a practice problem), we can preview the efficiencies of the implementations below:
ArrayQueue (see practice exercises) | LLQueue | |
---|---|---|
enqueue() | O(1) | O(1) |
dequeue() | O(1) | O(1) |
Space efficiency | O(m) (m is number of expected items) | O(n) (n is number of items in stack) |
Stacks & Queues in Python
Now that we've learned about stacks and queues in general, let's think about how you might actually use stacks and queues in the context of Python programs.
Although we built various versions of these data structures for educational purposes, you likely wouldn't actually use them in a program. Instead, Python provides some implementations that you may find useful.
Stacks in Python
A Python list provides the ability to function as a stack. Strictly speaking, a stack doesn't have access to arbitrary elements of the list, as a Python list does. However, a stack can be simulated by only using methods that operate at the end of the Python list -- the "top" of the "stack."
To push an element onto the stack, use the Python lists's append()
method. As we've seen many times before, this adds an element to the end of the list. In some cases, this can require the underlying array to be resized, but in most cases it is an efficient operation.
To pop an element from the stack, use the pop()
method. This removes and returns the last element of the Python list.
Here's an example that pushes some elements on a stack and then pops them:
Queues in Python
In theory, a Python list could also be used to implement a queue. The append()
method could again be used to insert items at the rear of the "queue," and then the pop()
method could be used with an optional parameter to remove the item at index 0 (the front of the "queue"):
However, this approach is not efficient, since calling pop(0)
requires shifting all elements of the Python list to the left.
Instead of using a Python list, you can use a deque
from the collections
library when you need a queue in Python.
A "deque" is a double-ended queue -- a queue that supports efficient enqueue and dequeue operations from either end of the collection. To use it in the traditional sense of a queue, use append()
to enqueue items at the rear of the queue, and popleft()
to dequeue items from the front of the queue:
Practice: Lists, Stacks, and Queues
💡 This is your chance to put what you've learned into action.
Try solving these practice challenges to check that you understand the concepts. No submission is necessary for practice exercises.
Why practice?
Practice coding helps you become a great coder. These practice problems aren't graded, but that doesn't mean they aren't important.
You should aim to practice a lot, especially with unfamiliar concepts. Spread practice over multiple days to take advantage of the spacing effect, which helps you retain new knowledge.
More about practice
Practice helps you understand what you know, and what you don't know. It can be easy to trick yourself into thinking you understand something when you do not -- or that you don't understand when you do. Practicing by writing code or debugging code will help you find out what you really understand, and where you are still confused.
Practice helps build confidence in your coding. The more programs you write, and the more problems you solve, the more you learn that you are a capable coder and problem-solver.
Practice doesn't always feel good - sometimes you'll be stumped! But, practice shouldn't feel super frustrating either. If you find yourself getting angry at yourself or the code, it's a good time to take a break and ask for help.
The solutions to each challenge are available, and you can view a video of the solution below each challenge.
- Try to go through the whole challenge without using the solution.
- If you can’t do the challenge without looking the solution, it means you don’t understand the material well enough yet.
- Try the next practice challenges without looking at the solution. If you need more practice challenges, reach out on Discord.
1. Stack and queue puzzles
-
Suppose that an intermixed sequence of push and pop operations are performed on an initially empty LIFO stack. The pushes push the integers 0 through 9 in order, and the pops print out the return value. Which of the following sequences could not occur?
a.
4 3 2 1 0 9 8 7 6 5
b.4 6 8 7 5 3 2 9 0 1
c.2 5 6 7 4 8 9 3 1 0
d.4 3 2 1 0 5 6 7 8 9
-
A letter means push and an asterisk means pop in the following sequence. Give the sequence of values returned by the pop operations when this sequence of operations is performed on an initially empty LIFO stack.
G * O C L * * I M * B K * * * I B * * O * *
-
A letter means enqueue and an asterisk means dequeue in the following sequence. Give the sequence of values returned by the dequeue operations when this sequence of operations is performed on an initially empty FIFO queue.
G * O C L * * I M * B K * * * I B * * O * *
Solution video.
2. Implementing the ArrayQueue
class
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
3. Implementing a queue using two stacks
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
4. Using iterators
▶️ Access the practice exercise on GitHub Classroom
Watch the video below to see the full solution.
Make sure you give yourself enough time to solve the practice without watching the video. It is really important for your learning.
Solution Video
Week 7 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Week 7 Peer Interviews
Interview instructions
For all interviews, you should make sure you have read the interview instructions. After each interview, both the interviewer and interviewee should submit a reflection. The format for the reflection is described in the aforementioned document.
Here is the schedule of peer interviews and question guides for the term. Note: you must be logged-in to your
@kibo.school
account and in the appropriate group to access the question guide for a given week.
Week | Resources | Who is the interviewer? |
---|---|---|
7 | Question Guide | Group A |
Grading
The peer interviews will contribute 10% to your overall grade. This grade is based on
- completion of all four interviews,
- your reflections submitted for each interview.
You can find and submit the interview reflection in Gradescope.
FAQs
- In what setting will the interview take place?
The interview can take place as an online meeting. Meeting in person is not necessary, but is allowed if both partners agree.
- What happens if I'm unable to complete an interview in a given week?
There are opportunities for makeup interviews throughout the semester.
- What happens if my partner is unable to participate in an interview for a given week?
You can either do a makeup interview later in the semester, or find another peer who is willing to conduct the interview with you during the week (only in cases where your partner is the interviewer).
- I don't feel confident in my coding skills, and I'm afraid I won't do well during the peer interview.
In this class (and often in the real world), you are not strictly judged on your ability to produce a bug-free, highly efficient implementation of an algorithm. You are mostly judged on your thinking process and ability to work your way through the problem.
For the peer interviews, you will not be graded on the correctness of your solution, and will earn full credit as long as you complete the interview and submit a thoughtful reflection.
Also: we are here to learn! We are doing this process to give you experience interviewing, so that when you go to do a real interview, you already have some interview experience. We should all be mindful of this and help and encourage each other.
- My partner was unprepared for the interview and it hurt my experience. What can I do?
This is something you should discuss in your submitted reflection. Everyone should come prepared to every interview. Even though these are peer interviews, they are being conducted to help each other gain experience in this important part of becoming a software developer, and being unprepared will diminish this experience for your partner.
- As the interviewee, can I see the solution before I attempt the peer interview?
Please do not seek out the interview question or solution before attempting the interview. If you do, you will not get the full benefit of the exercise, which will only hurt your interviewing skills in the long run.
- How do I volunteer to be interviewed during weeks 1 and 2?
The instructor will provide information about finding volunteers at the start of the term.
- Does the interview have to last the full 30 minutes?
No, but it should last at least 20 minutes, and ideally 25-30 minutes.
- I have another question about the peer interviews. Who can I ask?
You can ask in the class Discord, or email the instructor directly.
Reading
Read Chapter 6 of An Open Guide to Data Structures and Algorithms.
Trees
Welcome to week 8! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Describe the characteristics of a tree data structure using new terminology
- Write algorithms that operate on trees and explain their time complexity
- Perform a variety of tree traversal algorithms
- Construct and use Huffman trees to encode and compress text
What's due this week
- Week 8 Quiz on Thursday (on Gradescope)
- Peer Interview 3 Reflection on Sunday
- Assignment 4 on Sunday
Tree Basics
Up to this point, the data structures that we have considered have been linear -- they have modeled data as being in a sequence. However, there can be non-linear relationships in data as well -- for example, some relationships in data are hierarchical.
Examples of hierarchical relationships include:
- Managers and employees in an organization
- Ancestors and descendants in a family tree
- Directories and files in filesystems
Hierarchical data can be represented using a tree. Watch the following video to learn about trees as a data structure:
To summarize:
- Trees are (typically) linked data structures composed of nodes and edges (also known as links).
- At the "top" of the tree, there is a root node.
- Each node has zero or more child nodes. Nodes that don't have any children are leaf nodes. Nodes that have at least one child are interior nodes.
- Each node (except for the root) has a parent node.
- Trees are naturally recursive data structures. Each tree is a root node connected to subtrees starting from the root node's children.
- The depth of a node is the number of links on the path from the root to the node. The height of a tree is the maximum depth among all nodes.
- A common type of tree is a binary tree, in which nodes have two child pointers to left and right children.
Trees in Python
Now that we knows the basics of a tree data structure, we can start to think about how we would actually use one in Python.
Just as we saw with linked lists, there is also no built-in tree data structure in Python. There are tree libraries that you can import, but we won't do so in this class. Instead, it's instructive for us to build a tree class from first principles.
The LinkedTree
class
We will create a LinkedTree
class to represent a binary tree, where each node in the tree has a value and two possible child pointers: a left and a right. To do so, we will need our tree to be composed of tree nodes, which we will create using a nested, inner class:
class LinkedTree:
class TreeNode:
def __init__(self, val, left=None, right=None):
self.val = val
self.left = left
self.right = right
def __init__(self, root):
self.root = root
Notice that each TreeNode
will have a value, a (possibly None
) left pointer, and a (possibly None
) right pointer.
Building a LinkedTree
Just like with linked lists, we could create some TreeNode
objects, link them together manually, and use one of them as the root of a new LinkedTree
. For example:
n1 = TreeNode(5)
n2 = TreeNode(10)
n3 = TreeNode(7)
n1.left = n2
n1.right = n3
tree = LinkedTree(n1)
After constructing this tree, this is what the picture would look like in memory:

Note that we represent each TreeNode
as a box with three parts: a left child pointer, a value, and a right child pointer.
Building a tree with random values
We could also write a method to construct a random tree for us. Watch the video below to see how to build of height h
filled with random values:
Counting the number of nodes in a tree
With our LinkedTree
class, we can now write algorithms that operate on the tree. See the below implementation of a num_nodes()
method, which finds the number of nodes in a tree:
def __num_nodes(self, node):
if node is None:
return 0
return 1 + self.__num_nodes(node.left) + self.__num_nodes(node.right)
def num_nodes(self):
return self.__num_nodes(self.root)
The basic algorithm is that the count of nodes in the tree is equivalent to 1 (for the root node) + the number of nodes in the left subtree + the number of nodes in the subtree. When applied recursively, this will count the total number of nodes in the tree.
Note: It is much easier to write recursive algorithms on trees than iterative algorithms, since we need to be able to traverse up and down the tree. Making recursive calls and then returning from those recursive calls models this up and down traversal naturally. Therefore, we generally won't write iterative algorithms on trees, with some exceptions.
As always, we are interested in finding the time complexity of our algorithm. For num_nodes()
, we make one recursive call for each node in the tree. As the number of nodes in the tree increases, the number of recursive calls increases linearly. Therefore, the running time of num_nodes()
is O(n)
.
Finding the depth of a node
Watch the video below to see how to write a method that finds the depth of a node:
Tree Traversals
In this lesson, we will think about how to traverse over a tree in a pre-defined, standard way.
All of the previous data structures that we have considered were linear, and iterating over a linear data structure has an obvious beginning and end. However, with a tree, the order in which we visit the nodes of the tree in an algorithm is not obvious. For example, consider the following tree:

One way of traversing this tree would be in level-order: the nodes are visited one level at a time, from top to bottom. A level-order traversal would therefore visit the nodes in the order 1 2 3 4 5 6 7 8 9
.
However, we've already written an algorithm that does not visit the nodes in level-order! The num_nodes()
method from the previous lesson, for example, accounted for the root node in the count, then recursively visited the left subtree, and then recursively visited the right subtree:
def __num_nodes(self, node):
if node is None:
return 0
return 1 + self.__num_nodes(node.left) + self.__num_nodes(node.right)
For the above tree, __num_nodes()
would have visited the nodes in the order 1 2 4 5 7 8 3 6 9
.
Often, it does not matter the order in which the nodes are visited. For some scenarios, such as evaluating an algebraic expression that is modeled as a tree, the order does matter. Therefore, there are some well-known traversal orders, described below.
Pre-order traversal
To summarize, a pre-order traversal of a binary tree first visits the current node, then visits the current node's left subtree, and then recursively visits the current node's right subtree:
def __preorder_print(self, cur):
if cur is None:
return
print(cur.val)
self.__preorder_print(cur.left)
self.__preorder_print(cur.right)
def preorder_print(self):
self.__preorder_print(self.root)
In-order traversal
An in-order traversal of a binary tree first visits the left subtree recursively, then visits the current node, and then visits the right subtree recursively:
def __inorder_print(self, cur):
if cur is None:
return
self.__inorder_print(cur.left)
print(cur.val)
self.__inorder_print(cur.right)
def inorder_print(self):
self.__inorder_print(self.root)
Post-order traversal
An post-order traversal of a binary tree first visits the left subtree recursively, then visits the right subtree recursively, and then visits the current node:
def __postorder_print(self, cur):
if cur is None:
return
self.__postorder_print(cur.left)
self.__postorder_print(cur.right)
print(cur.val)
def postorder_print(self):
self.__postorder_print(self.root)
Level Order Traversal
We mentioned in the previous lesson that visiting the nodes of a tree level-by-level, top to bottom is known as a level-order traversal. However, we haven't yet talked about how a level-order traversal is implemented.
Pre-order, in-order, and post-order traversals have a natural recursive implementation. A level-order traversal is actually easier to implement iteratively, but we will also need the help of queue to do it!
Implementing a level-order traversal
The basic idea behind a level-order traversal is that we want to visit all of the nodes at level h
before we visit the nodes at level h + 1
. To do so, we can enqueue the children of each node as we encounter them:
from collections import deque
def levelorder_print(self):
q = deque()
q.append(self.root)
while len(q) > 0:
node = q.popleft()
print(node.val)
if node.left is not None:
q.append(node.left)
if node.right is not None:
q.append(node.right)
Watch the following video to see how a level-order traversal works:
Huffman Trees
One application of trees is Huffman coding -- building a tree (a Huffman tree) to encode and decode text. Often, Huffman coding also allows text to be compressed to take up less space for storage or transmission.
Watch the following video to see an overview of Huffman coding and the algorithm for building and using a Huffman tree:
To summarize, text characters typically take 1 byte (8 bits) of space each. However, in many cases, documents can be compressed so that each character takes fewer than 8 bits. By building a Huffman tree based on the frequencies of characters in a document, high-frequency characters can get a short encoding (e.g., 3 bits) and low-frequency characters can get a relatively longer encoding (e.g., 6 bits). This will allow the document as a whole to be reduced in size, and then can be decoded using the Huffman tree.
Building and using Huffman tree
Watch the video below for an explanation of (1) building a Huffman tree, (2) using a Huffman tree to compress some text, and (3) using a Huffman tree to decompress some text:
Week 8 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 4
Due date: March 3
📌 This is an individual project. You are expected to work independently.
If you get stuck, confused, or have trouble with the project, you should use the #help-dsa channel in Discord or message an instructor. Remember not to spoil the project for others - do not post code samples from your work to public spaces.
Compression Using Huffman Trees
You've been learning to use data structures to solve problems. In this assignment, you'll solve a practical problem: compressing and decompressing files.
You will create a program to compress and decompress files using Huffman coding. You will construct a data structure based on the characters in a file, then use that data structure to compress a document and store it. You will then write the code to decompress the file using the Huffman coding from the encoding step. The program will output the amount of space saved using the compression algorithm.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Statement on Usage of AI Tools
The list below details acceptable and unacceptable uses of AI tools in Kibo courses, including generative AI tools like ChatGPT and AI tools integrated into code editors like CoPilot. For use cases not listed below, inquire about their acceptability on the course's Discord Help channel.
Acceptable Uses:
- Generating Ideas: Assisting in brainstorming topics for essays, projects, or presentations.
- Research Aid: Helping to find sources, papers, or literature for further reading and reference.
- Explaining Concepts: Providing explanations of complex concepts in simpler terms or different perspectives.
- Outlining: Aiding in creating outlines of topics to build structure for essays or reports.
- Drafting Assistance: Creating initial drafts of supplementary writing material where the main artifact being produced is not a written item.
- Creative Inspiration: Generating creative ideas for projects in arts, literature, or design.
- Technical Guidance: Offering guidance on coding problems, mathematical equations, or scientific principles.
- Proofreading: Assisting in grammar and spelling checks for written assignments.
Unacceptable Uses:
- AI-Generated Work: Submitting work significantly generated by AI as one's own in essays, reports, code, or other assignments.
- Plagiarism: Using AI to rephrase or copy content without proper citation or acknowledgment.
- Test and Exam Assistance: Using AI tools during tests or exams unless explicitly allowed by the instructor.
- Drafting Assistance: Creating initial drafts of written materials for general educations courses.
- Complete Dependence: Relying solely on AI for understanding core concepts without engaging in personal study or critical thinking (AI tools can often be wrong).
Important Notes:
- Students must disclose the use of AI tools in their assignments and provide either shareable links or exports (if the tool doesn't support sharing) showing their use of the tools on the assignment.
- AI tools might not strictly adhere to instructions or might include unrequested information. Always review the output to ensure accuracy and relevance.
- Work generated through AI should be critically reviewed and substantially modified to ensure originality and understanding.
Week 8 Peer Interviews
Interview instructions
For all interviews, you should make sure you have read the interview instructions. After each interview, both the interviewer and interviewee should submit a reflection. The format for the reflection is described in the aforementioned document.
Here is the schedule of peer interviews and question guides for the term. Note: you must be logged-in to your
@kibo.school
account and in the appropriate group to access the question guide for a given week.
Week | Resources | Who is the interviewer? |
---|---|---|
8 | Question Guide | Group B |
Grading
The peer interviews will contribute 10% to your overall grade. This grade is based on
- completion of all four interviews,
- your reflections submitted for each interview.
You can find and submit the interview reflection in Gradescope.
FAQs
- In what setting will the interview take place?
The interview can take place as an online meeting. Meeting in person is not necessary, but is allowed if both partners agree.
- What happens if I'm unable to complete an interview in a given week?
There are opportunities for makeup interviews throughout the semester.
- What happens if my partner is unable to participate in an interview for a given week?
You can either do a makeup interview later in the semester, or find another peer who is willing to conduct the interview with you during the week (only in cases where your partner is the interviewer).
- I don't feel confident in my coding skills, and I'm afraid I won't do well during the peer interview.
In this class (and often in the real world), you are not strictly judged on your ability to produce a bug-free, highly efficient implementation of an algorithm. You are mostly judged on your thinking process and ability to work your way through the problem.
For the peer interviews, you will not be graded on the correctness of your solution, and will earn full credit as long as you complete the interview and submit a thoughtful reflection.
Also: we are here to learn! We are doing this process to give you experience interviewing, so that when you go to do a real interview, you already have some interview experience. We should all be mindful of this and help and encourage each other.
- My partner was unprepared for the interview and it hurt my experience. What can I do?
This is something you should discuss in your submitted reflection. Everyone should come prepared to every interview. Even though these are peer interviews, they are being conducted to help each other gain experience in this important part of becoming a software developer, and being unprepared will diminish this experience for your partner.
- As the interviewee, can I see the solution before I attempt the peer interview?
Please do not seek out the interview question or solution before attempting the interview. If you do, you will not get the full benefit of the exercise, which will only hurt your interviewing skills in the long run.
- How do I volunteer to be interviewed during weeks 1 and 2?
The instructor will provide information about finding volunteers at the start of the term.
- Does the interview have to last the full 30 minutes?
No, but it should last at least 20 minutes, and ideally 25-30 minutes.
- I have another question about the peer interviews. Who can I ask?
You can ask in the class Discord, or email the instructor directly.
Binary Search Trees
Welcome to week 9! Watch the video below to get an overview of the coming week.
Learning objectives
By the end of this week, you will be able to:
- Explain the concepts of data dictionaries and binary search trees (BSTs)
- Trace the execution of searching, insertion, and deletion in a BST
- Write efficient procedures in Python that operate on BSTs
- Explain the importance of balance in a search tree
- Construct and use 2-3 trees as a balanced BST
What's due this week
- Week 9 Quiz on Thursday (on Gradescope)
- Peer Interview 4 Reflection on Sunday
- Assignment 5 on Sunday
Binary Search Trees
Last week, we learned about binary trees. We will now spend some time talking about a specific type of binary tree: a binary search tree.
A binary search tree (BST) is a binary tree in which every node has a key k
, and where the keys are ordered according to the search-tree property:
Search-tree property
For each node
k
(k
is the key):
- All nodes in
k
's left subtree are <k
- All nodes in
k
's right subtree are >k
Here's a visualization of the search-tree property:

The tree below is a binary search tree, since each node obeys the search-tree property:

Ordering the tree turns out to be very valuable, as it will allow us to write efficient tree algorithms. However, before we discuss this efficiency aspect, let's talk about another feature of BSTs: they are an example of a data dictionary.
Data dictionaries
A dictionary is an ADT that stores (key, value) pairs, where each key can appear at most once. It supports the following operations:
insert(key, value)
: insert a new (key, value) pairdelete(key)
: given a key, delete the corresponding (key, value) pairlookup(key)
: lookup the value associated with the given key
Other names for a data dictionary include a map (maps keys to values) and associative array ("array" of keys associated with values).
A binary search tree is an implementation of a dictionary. Each node in the tree has a key (recall: the BST is ordered by the keys) and also contains a value. Often, the values associated with each node are not shown, as in the case of the example BST from earlier in the lesson.
Overview of BSTs
Watch the video below to see an overview of BSTs:
Search in BSTs
We mentioned in the previous lesson that ordering keys using the search-tree property allows us to write efficient tree algorithms. The foremost example of this is searching for the node with a given key (also known as a lookup or a get).
Since a BST is a dictionary, we often want to know: what is the item (or value) associated with a given key k
? The search algorithm pseudocode is as follows:
- If
k
is equal to the root node's key, return the root node's item - Else if
k
is less than the root node's key, search in the left subtree - Else (
k
is greater than the root node's key), search in the right subtree
This logic can be applied recursively at each level of the tree. Since we know the ordering of the keys, we can ignore entire subtrees where k
could not possibly be! This allows the search algorithm to follow a single path, starting from the root node and ending at the node with the key k
(if it exists).
Implementing search in a BST
Watch the video below to see the definition of a LinkedBST
class that we will use for illustrating a binary search tree, as well as the implementation of the search()
method:
Insertion in BSTs
Now that we have seen how to search in a BST, let's think about how to construct a BST. In other words, how is a (key, value) pair inserted into a tree?
Here's the pseudocode for the algorithm to insert a new (key, value) pair (k, v)
:
- Search for
k
in the BST. If it already exists, return without modifying the tree. - If we don't find a node with key
k
, then the last node in the search will become the parent P of the new node N.- If
k
is less than P's key, then N will become P's new left child. - Otherwise (
k
is greater than P's key) N will become P's new right child.
- If
After the insertion algorithm completes, the BST will maintain the search-tree property!
Note: different BST variants may handle the case of
k
already existing differently. For example, they may choose to overwrite the value associated withk
, or may maintain a list of values for each key.
Implementing insertion in BSTs
Watch the video below to see how to implement the insertion algorithm on a BST:
Deletion in BSTs
Sometimes, we may wish to delete a (key, value) pair from a BST. When we do so, we must be careful to maintain the search-tree property even after the deletion is complete.
When deleting a node, we must start at the root of the tree and traverse the path to the node to be deleted. If the node is found, then there are three different scenarios to consider: when the node to delete has no children (a leaf node), when it has one child, and when it has two children.
Case 1: node to delete is a leaf node
This is the simplest case. When we want to delete a leaf node X, we simply need to delete X's parent's reference to X. For example, say that we wanted to delete the 7 node in the tree below:

To do so, we only need to set the 5 node's right child reference to None
.
Case 2: the node to delete has one child
When the node to delete (X) has one child, we take the parent's reference to X and make it instead point to X's child. For example, if we wanted to delete the 13 node, then we would change the child pointer of the parent (the 9 node) so that it references the 10 node:

Case 3: the node to delete has two children
When deleting a node with two children, we need to be careful. Once the node is deleted, we need the BST to still obey the search-tree property for all nodes and subtrees.
To delete a node X with two children:
- Find the smallest (minimum) node in X's right subtree. Call this node Y.
- Replace X's key and value with the key and value from Y.
- Delete the original Y.
For example, if we wanted to delete the 3 node, we would first replace it with the 4 node (since the 4 node is the smallest in the 3 node's right subtree). Then, we delete the original 4 node:

In this case, the original 4 node had no children, and so it was easy to delete -- you could consider an example of a Case 1 deletion. However, it's also possible that the smallest node in the right subtree has a child. For example, if we wanted to delete the root (the 8 node), then we would replace it with the 9 node. When deleting the original 9 node, it is a Case 2 deletion, and we therefore set the new 9's right child to be the 13 node:

Efficiency of BSTs
Up to this point, we have looked at three of the main BST algorithms: search, insertion, and deletion. However, we haven't remarked on their efficiencies yet. Actually, the way we analyze all of their time complexities will be the same, so it makes sense to discuss it all at once.
Before we talk about the efficiency of these BST algorithms, let's think first about the efficiency of other tree algorithms that we've seen.
Efficiency of "regular" tree algorithms
Last week, when we considered non-search trees, we basically saw two types of algorithms:
-
Algorithms that always need to visit every node in the tree (i.e., always
O(n)
). Tree traversals and counting the number of nodes fall into this category. -
Algorithms that can sometimes terminate before visiting every node. For example, when trying to find a depth of a node, in the best case the node we're searching for is at the root of the tree (
O(1)
). In the worst case, we need to look at all nodes (O(n)
).
With BSTs, we have a third common type of algorithm:
- Algorithms that can focus on a particular path in the tree. They don't need to look at all nodes -- they follow a path from the root down to a particular node.
Efficiency of BST algorithms
With binary search trees, the search-tree property often allows algorithms to ignore entire subtrees and proceed down a particular path of the tree. For example, during search()
, we were able to focus on a particular path down only the parts of the tree where it was possible for the given key to be found.
In the worst case, whenever we were searching, inserting, or deleting a node, we had to traverse from the root node all the way down to a leaf node. In other words, we traversed a path throughout the height h
of the tree. Therefore, we could say that search, insertion, and deletion are all O(h)
algorithms in the worst case.
Balanced vs. unbalanced trees
We now know that in the worst case, searching and the related algorithms for BSTs is O(h)
, where h
is the height of the tree. Ideally, the height of the tree is approximately logn
, where n
is the number of nodes in the tree:

The height of this tree is logn
because of the branching structure of the tree. Each interior node has a left and right child, which essentially divides the number of nodes to fill out the tree by two at each node. Therefore, to be specific about the base of the logarithm, the height is approximatelylog2n
levels.
However, not all BSTs are balanced in this manner. Depending on how the keys of the BST are inserted, the tree may become unbalanced. For example, if we inserted all of the keys of a BST in sorted order, we would end up with a tree that has only right children:

In this case, the height h
is exactly n - 1
= O(n)
. Therefore, the BST algorithms would all be O(n)
!
Summary
To summarize, here are the time complexities of the three main BST methods:
Search | Insertion | Deletion | |
---|---|---|---|
Best Case | O(1) | O(1) | O(1) |
Average Case (balanced tree) | O(logn) | O(logn) | O(logn) |
Worst Case (unbalanced tree) | O(n) | O(n) | O(n) |
Clearly, it would be beneficial to avoid the worst case performance of unbalanced trees. We look at balanced BSTs next.
2-3 Trees
Binary search trees generally provide logarithmic-time search, insertion, and deletion. However, as we saw in the previous lesson, BSTs can become unbalanced, which causes the efficiency of those algorithms to drop to linear time.
To prevent this issue, there are BST variants that guarantee that the tree stays balanced. There are many such balanced BST variants, but we will focus on just one: 2-3 trees.
2-3 trees
A 2-3 tree ("two-three tree") is a tree in which:
- All nodes have subtrees of equal height (perfect balance)
- Each node is either:
- A 2-node ("two-node"), which contains a data item and 0 or 2 children
- A 3-node ("three-node"), which contains a data item and 0 or 3 children
- The keys form a search tree
For example, the following is a 2-3 tree:

Search in 2-3 trees
Watch the following video to see how to search for a key in a 2-3 tree:
Insertion in 2-3 trees
Watch the following video to see how to insert a new key in a 2-3 tree:
Efficiency of 2-3 trees
Since a 2-3 tree is always perfectly balanced, it always has a height h = ~log2n
. Therefore, search, insertion, and deletion are all O(logn)
algorithms in the worst case -- there is no possibility for them to devolve into O(n)
!
Week 9 Quiz
Complete this week's quiz using the following link: Gradescope
Submitting Your Work
Your work must be submitted Anchor for degree credit and to Gradescope for grading.
For coding tasks involving Github Classroom:
- Ensure that you
commit
andpush
your local code changes to your remote repository. (Note: In general, you should commit and push frequently, so that you have a backup of your work, so that there is evidence that you did your own work, and so that you can return to a previous state easily.) - Upload your submission to Gradescope via the appropriate submission link by selecting the correct GitHub repository from the drop-down list.
- Export a zip archive of your GitHub repository by visiting your repo on GitHub, clicking on
the green
Code
button, and selecting "Download Zip". - Upload the zip file of your repository to Anchor using the form below.
For cases where you answer questions on Gradescope:
- Complete the work in Gradescope by navigating to the appropriate link.
- Export it as a pdf using th Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor using the form below.
For any work completed outside of GitHub or Gradescope:
- Take either screen captures of your work or export a pdf showing your complete work.
- Submit the materials to Gradescope via the appropriate submission link for the course.
- Upload the screen captures or pdf files to Anchor using the form below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 5
Due date: March 10
📌 This is an individual project. You are expected to work independently.
If you get stuck, confused, or have trouble with the project, you should use the #help-dsa channel in Discord or message an instructor. Remember not to spoil the project for others - do not post code samples from your work to public spaces.
Implementing BST functions
In this assignment, you will practice manipulating memory and references by implementing BST functions and an iterator. You will practice working with trees and implementing tree-manipulation functions efficiently.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Statement on Usage of AI Tools
The list below details acceptable and unacceptable uses of AI tools in Kibo courses, including generative AI tools like ChatGPT and AI tools integrated into code editors like CoPilot. For use cases not listed below, inquire about their acceptability on the course's Discord Help channel.
Acceptable Uses:
- Generating Ideas: Assisting in brainstorming topics for essays, projects, or presentations.
- Research Aid: Helping to find sources, papers, or literature for further reading and reference.
- Explaining Concepts: Providing explanations of complex concepts in simpler terms or different perspectives.
- Outlining: Aiding in creating outlines of topics to build structure for essays or reports.
- Drafting Assistance: Creating initial drafts of supplementary writing material where the main artifact being produced is not a written item.
- Creative Inspiration: Generating creative ideas for projects in arts, literature, or design.
- Technical Guidance: Offering guidance on coding problems, mathematical equations, or scientific principles.
- Proofreading: Assisting in grammar and spelling checks for written assignments.
Unacceptable Uses:
- AI-Generated Work: Submitting work significantly generated by AI as one's own in essays, reports, code, or other assignments.
- Plagiarism: Using AI to rephrase or copy content without proper citation or acknowledgment.
- Test and Exam Assistance: Using AI tools during tests or exams unless explicitly allowed by the instructor.
- Drafting Assistance: Creating initial drafts of written materials for general educations courses.
- Complete Dependence: Relying solely on AI for understanding core concepts without engaging in personal study or critical thinking (AI tools can often be wrong).
Important Notes:
- Students must disclose the use of AI tools in their assignments and provide either shareable links or exports (if the tool doesn't support sharing) showing their use of the tools on the assignment.
- AI tools might not strictly adhere to instructions or might include unrequested information. Always review the output to ensure accuracy and relevance.
- Work generated through AI should be critically reviewed and substantially modified to ensure originality and understanding.
Reading
Read Chapter 8 of An Open Guide to Data Structures and Algorithms.
Wrap Up
Welcome to week 10! This is the final week of the term. Congratulations on nearing the end!
There is no new material this week. Instead, your last task is to study for the final exam and take it on Thursday.
What's due this week
- Final exam on Thursday
- Any outstanding assignments are due by the last day of the term (Friday)
Final Exam
We will have our final exam on Thursday, March 14 at 6:00 PM GMT. Here are the details:
Link to Exam
You can access the exam on OnlineExamMaker here (link to come soon): https://t.onlineexammaker.com/doexam/GpWeERZEy6N.html.
Date and time
- Held during Week 10 during our normal class time: Thursday, March 14 at 6:00 PM GMT
- You can access the link to the final exam ahead of time, but it won't be enabled until 6:00 PM GMT the day of the exam
Material
- Everything that was covered during the term, with an emphasis on weeks 5 - 9
- This includes lesson material, live classes, practice problems, assignments
Format
- Administered online using a tool called OnlineExamMaker
- You can prepare and use any amount of notes on paper, but no other resources on the computer or Internet are allowed
- 75 minute exam from the time you begin
- Be sure to start the test between 6:00 - 6:15 PM GMT (it will close after that point)
- Exam will be composed of 20 multiple choice questions, shown one at a time
- You can't go back and change your answer after moving on to the next question
- We won't be using Zoom during the exam; you can just access the exam individually when you're ready
- Plan to have scratch paper for tracing/diagramming etc.
Proctoring software
- The exam website is also a proctoring service
- It will ask you to turn your camera on, and will take some pictures at random during the test
- It will flag any activity such as closing the test, changing programs, switching tabs, etc. If you do this too many times, it will close the exam and you will not be able to continue. Once you start the exam, you should expect to stay on the exam page and complete each question without any interruptions or using your computer for anything else.
- It disables copy and paste during the test
- It gives us the option to monitor your activity
Exam environment
- Plan to be in a test-taking environment
- Quiet, mostly private
Questions
If you have any questions leading up to the exam, please ask in #dsa-help on Discord or reach out to the instructor.