Course Overview
Welcome to Data Structures and Algorithms 2!
Course Description
This course builds on Data Structures & Algorithms 1. Students will explore non-linear data structures, and implement and analyze advanced algorithms.
The course begins with a brief review of basic data structures and algorithms. Students learn about advanced data structures including heaps, priority queues, and hash tables. Students then build on their knowledge of graph theory to implement graph algorithms, exploring topics like finding shortest paths in graphs and applying these algorithms to maps, social networks, and other real-life problems. Other key topics include divide and conquer, recursion, greedy algorithms, dynamic programming, computability theory, and case studies in algorithm design.
The course emphasizes big-picture understanding and practical problem-solving in preparation for technical interviews and professional practice. Students will solve common algorithmic problems, and optionally participate in mock interview sessions.
Topics
- Hash tables
- Heaps and priority queues
- State-space search
- Greedy algorithms
- Graphs
- Computability
- Divide-and-conquer algorithms
- Dynamic programming
How the course works
There are multiple ways you'll learn in this course:
- Read and engage with the materials on this site
- Attend live class and complete the activities in class
- Answer practice exercises to try out the concepts
- Complete assignments and projects to demonstrate what you have learned
Active engagement is necessary for success in the course! You should try to write lots of programs, so that you can explore the concepts in a variety of ways.
You are encouraged to seek out additional practice problems outside of the practice problems included in the course.
Learning Outcomes
By the end of the course, students will be able to:
- Describe the implementation and performance of fundamental data structures
- Describe underlying data structures upon which more complex structures are built
- Implement and evaluate runtime of a variety of algorithms, including divide-and-conquer, greedy, and dynamic programming algorithms
- Solve common algorithmic technical interview problems
Instructor
- Cody Doucette
- cody.doucette@kibo.school
Please contact the instructor on Discord first with questions about the course.
Live Class Time
- Thursdays, 6:00 PM GMT
Office Hours
- Fridays, 7:00 PM GMT
Core Reading List
- Bible, P., Moser, L., & Scarlato, M. (2023). An Open Guide to Data Structures and Algorithms. PALNI.
- Miller, B., Ranum, D. (2013). Problem Solving with Algorithms and Data Structures using Python. Franklin Beedle Publishers. Chapters 6-9 (180pp)
- Goodrich, M. (2013). Data Structures & Algorithms in Python. Chapters 8-15 (418pp)
Supplemental Reading List
- Skiena, S. (2011). The Algorithm Design Manual. Springer; 2nd edition
- Skiena, S. (2020) Analysis of Algorithms, videos and Lecture Notes
- Roughgarden, T. (2017). Algorithms Illuminated Series. Soundlikeyourself Publishing
- Roughgarden, T. (2017). Algorithms 1 (YouTube)
- Leetcode. Practice Problems
Live Classes
Each week, you will have a live class (see course overview for time). You are required to attend the live class sessions.
Video recordings and resources for the class will be posted after the classes each week. If you have technical difficulties or are occasionally unable to attend the live class, please be sure to watch the recording as quickly as possible so that you do not fall behind.
Week | Topic | Live Class | Materials |
---|---|---|---|
1 | Python and Data Structures | YouTube | Slides |
2 | Hash Tables | YouTube | Slides |
3 | Priority Queues and Heaps | YouTube | Slides |
4 | Dynamic Programming | YouTube | Slides |
5 | Midterm Week | No class | |
6 | Graphs I | YouTube | Slides |
7 | Graphs II | YouTube | Slides |
8 | Advanced Algorithms | YouTube | Slides |
9 | Intractable Problems | YouTube | Slides |
10 | Final Exam Week | No class | |
Assessments & Grading
Your overall course grade is composed of these weighted elements:
- Weekly Quizzes (8 total): 10%
- Programming Assignments (5 total): 40%
- Midterm Exam: 25%
- Final Exam: 25%
Weekly Quizzes
In each week that we have new material, there will be a short quiz in Gradescope on the material covered in the lessons. These quizzes are due shortly before the start of the live class. No late submissions are accepted.
These quizzes will make up 10% of your overall course grade.
- Quizzes will be released in Gradescope each week.
- Quizzes are due shortly before the start of each week's live class.
- Submit each quiz in Gradescope, then upload a PDF of your quiz to Anchor.
Programming Assignments
During some weeks, you'll be given an assignment to test the new skills and concepts you learned. The assignments will be challenging, and it's recommended that you complete the material and practice problems before you start the assignment each week.
There will be five assignments overall.
Midterm Exam
There will be a midterm exam held during week 5 of the term, covering the material in weeks 1-4.
Final Exam
There will be a final exam held during week 10 of the term, covering the material in weeks 6-9.
Late Policy
Programming assignments can be submitted late under Kibo's late policy:
- Submissions within two weeks of the deadline incur a 10% penalty.
- Submissions beyond two weeks late incur a 20% penalty.
The final submission deadline is the last day of the academic term.
Note that the late policy only applies to programming assignments. Quizzes must be submitted before the live class each week, and no late submissions will be accepted. The midterm exam and final exam must be taken at the assigned time.
Students facing exceptional circumstances in terms of submitting work on time should immediately contact their Kibo advisor.
Getting Help
If you have any trouble understanding the concepts or are stuck on a problem, we expect you to reach out for help!
Below are the different ways to get help in this class.
Discord Channel
The first place to go is always the course's help channel on Discord. Share your question there so that your instructor and your peers can help as soon as possible. Peers should jump in and help answer questions (see the Getting and Giving Help sections for some guidelines).
Message your Instructor on Discord
If your question doesn't get resolved within 24 hours on Discord, you can reach out to your instructor directly via Discord DM or Email.
Office Hours
There will be weekly office hours with your Instructor and your TA. Please make use of them!
Tips on Asking Good Questions
Asking effective questions is a crucial skill for any computer science student. Here are some guidelines to help structure questions effectively:
- Be Specific:
  - Clearly state the problem or concept you're struggling with.
  - Avoid vague or broad questions. The more specific you are, the easier it is for others to help.
- Provide Context:
  - Include relevant details about your environment, programming language, tools, and any error messages you're encountering.
  - Explain what you're trying to achieve and any steps you've already taken to solve the problem.
- Show Your Work:
  - If your question involves code, provide a minimal, complete, verifiable, and reproducible example (an "MCVE") that demonstrates the issue.
  - Highlight the specific lines or sections where you believe the problem lies.
- Highlight Error Messages:
  - If you're getting error messages, include them in your question. Understanding the error is often crucial to finding a solution.
- Research First:
  - Demonstrate that you've made an effort to solve the problem on your own. Share what you've found in your research and explain why it didn't fully solve your issue.
- Use Clear Language:
  - Clearly articulate your question. Avoid jargon or overly technical terms if you're unsure of their meaning.
  - Proofread your question to ensure it's grammatically correct and easy to understand.
- Be Patient and Respectful:
  - Be patient while waiting for a response.
  - Show gratitude when someone helps you, and be open to feedback.
- Ask for Understanding, Not Just Solutions:
  - Instead of just asking for the solution, try to understand the underlying concepts. This will help you learn and become more self-sufficient in problem-solving.
- Provide Updates:
  - If you make progress or find a solution on your own, share it with those who are helping you. It not only shows gratitude but also helps others who might have a similar issue.
Remember, effective communication is key to getting the help you need both in school and professionally. Following these guidelines will not only help you in receiving quality assistance but will also contribute to a positive and collaborative community experience.
Screenshots
It’s often helpful to include a screenshot with your question. Here’s how:
- Windows: press the Windows key + Print Screen key
- the screenshot will be saved to the Pictures > Screenshots folder
- alternatively: press the Windows key + Shift + S to open the snipping tool
- Mac: press the Command key + Shift key + 4
- it will save to your desktop, and show as a thumbnail
Giving Help
Providing help to peers in a way that fosters learning and collaboration while maintaining academic integrity is crucial. Here are some guidelines that a computer science university student can follow:
- Understand University Policies:
  - Familiarize yourself with Kibo's Academic Honesty and Integrity Policy. This policy is designed to protect the value of your degree, which is ultimately determined by the ability of our graduates to apply their knowledge and skills to develop high-quality solutions to challenging problems -- not by their grades!
- Encourage Independent Learning:
  - Rather than giving direct answers, guide your peers to resources, references, or methodologies that can help them solve the problem on their own. Encourage them to understand the concepts rather than just finding the correct solution. Work through examples that are different from the assignments or practice problems provided in the course to demonstrate the concepts.
- Collaborate, Don't Complete:
  - Collaborate on ideas and concepts, but avoid completing assignments or projects for others. Provide suggestions, share insights, and discuss approaches without doing the work for them or showing your work to them.
- Set Boundaries:
  - Make it clear that you're willing to help with understanding concepts and problem-solving, but you won't assist in any activity that violates academic integrity policies.
- Use Group Study Sessions:
  - Participate in group study sessions where everyone can contribute and learn together. This way, ideas are shared, but each individual is responsible for their own understanding and work.
- Be Mindful of Collaboration Tools:
  - If using collaboration tools like version control systems or shared documents, make sure that contributions are clear and well-documented. Clearly delineate individual contributions to avoid confusion.
- Refer to Resources:
  - Direct your peers to relevant textbooks, online resources, or documentation. Learning to find and use resources is an essential skill, and guiding them toward these materials can be immensely helpful both in the moment and in their careers.
- Ask Probing Questions:
  - Instead of providing direct answers, ask questions that guide your peers to think critically about the problem. This helps them develop problem-solving skills.
- Be Transparent:
  - If you're unsure about the appropriateness of your assistance, it's better to seek guidance from professors or teaching assistants. Be transparent about the level of help you're providing.
- Promote Honesty:
  - Encourage your peers to take pride in their work and to be honest about the level of help they received. Acknowledging assistance is a key aspect of academic integrity.
Remember, the goal is to create an environment where students can learn from each other (after all, we are better together) while we develop our individual skills and understanding of the subject matter.
Academic Integrity
When you turn in any work that is graded, you are representing that the work is your own. Copying work from another student or from an online resource and submitting it is plagiarism. Using generative AI tools such as ChatGPT to help you understand concepts (i.e., as though it is your own personal tutor) is valuable. However, you should not submit work generated by these tools as though it is your own work. Remember, the activities we assign are your opportunity to prove to yourself (and to us) that you understand the concepts. Using these tools to generate answers to assignments may help you in the short-term, but not in the long-term.
As a reminder of Kibo's academic honesty and integrity policy: Any student found to be committing academic misconduct will be subject to disciplinary action including dismissal.
Disciplinary action may include:
- Failing the assignment
- Failing the course
- Dismissal from Kibo
For more information about what counts as plagiarism and tips for working with integrity, review the "What is Plagiarism?" Video and Slides.
The full Kibo policy on Academic Honesty and Integrity Policy is available here.
Course Tools
In this course, we are using these tools to work on code. If you haven't set up your laptop and installed the software yet, follow the guide in https://github.com/kiboschool/setup-guides.
- GitHub is a website that hosts code. We'll use it as a place to keep our project and assignment code.
- GitHub Classroom is a tool for assigning individual and team projects on Github.
- Visual Studio Code is an Integrated Development Environment (IDE) that has many plugins which can extend the features and capabilities of the application. Take time to learn how to use VS Code (and other key tools), because you will ultimately save enormous amounts of time.
- Anchor is Kibo's Learning Management System (LMS). You will access your course content through this website, track your progress, and see your grades through this site.
- Gradescope is a grading platform. We'll use it to track assignment submissions and give you feedback on your work.
- Woolf is our accreditation partner. We'll track work there too, so that you get credit towards your degree.
Python and Data Structures Review
Welcome to week 1!
Learning objectives
By the end of this week, you will be able to:
- recall the differences between arrays and linked data structures in terms of memory allocation, insertion, and deletion operations.
- analyze algorithms for their efficiencies in terms of both time and space.
- develop iterative and recursive algorithms that read and modify linked data structures.
- apply Python's built-in data structures to efficiently solve problems.
What's due this week
- Week 1 Quiz (on Gradescope), due Thursday before class starts
- Assignment 1, due Sunday
Data Structures
Before continuing our study of data structures and algorithms, let's first think back to what we learned in the previous course. There, we learned about a few different linear data structures, including arrays, linked lists, stacks, and queues.
We think of arrays and linked lists as building block data structures; we often use them to implement other data structures with some kind of abstraction, such as stacks and queues.
How we choose to implement data structures, including deciding between using an array or a linked list, depends on what properties we are looking for and what efficiency is required. Check out the video below to remind yourself about these data structures:
To summarize:
- Big-O notation provides us with a language to analyze the efficiency of algorithms, both in terms of time and space. Big-O classes include `O(1)` (constant), `O(log n)`, `O(n)`, `O(n^2)`, and `O(c^n)` (in order from fastest to slowest).
- An array is a fixed-size collection of items that provides random access to its elements.
- A linked list is an extensible collection of items that does not provide random access to its elements.
- A queue is a data structure with a first-in, first-out (FIFO) abstraction.
- A stack is a data structure with a last-in, first-out (LIFO) abstraction.
Data structures in Python
As we learn about these data structures, we implement them in Python or use existing Python constructs.
Array
Python doesn't have a built-in array data type. However, it does have a list data type, which we use to simulate an array. Under the hood, Python lists are actually implemented as arrays, so we have to be careful and recognize that occasionally operations on a Python list might be expensive (for example, inserting an item in the middle of a list).
Linked list
When we need to use a linked list, we create a special `Node` class that contains at least a piece of data and a reference to the next node in the list. We then create instances of the `Node` class and connect them to form a linked list.
For example, here's a portion of the `LLString` class that we used in a previous semester, which includes `Node` as a nested, inner class and also has a `prepend()` method to add a node to the head of the list.
```python
class LLString:
    class Node:
        def __init__(self, val, next):
            self.val = val
            self.next = next

    def __init__(self):
        self.head = None

    def prepend(self, new_val):
        new_node = LLString.Node(new_val, self.head)
        self.head = new_node
```
Queue
Python provides a `deque` object in its `collections` library that we can use as a queue data structure. A `deque` (pronounced "deck"; it stands for double-ended queue) provides `O(1)` enqueue and dequeue operations via the `append()` and `popleft()` methods, respectively:
Stack
The easiest way to simulate a stack in Python is to use a list. To push items onto the stack, we can use the `append()` method, and to pop items from the top of the stack we can use the (you guessed it) `pop()` method. Both of these are `O(1)` operations.
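For example, here's a quick sketch of both in action (the values are just for illustration):

```python
from collections import deque

# Queue: enqueue with append(), dequeue with popleft() -- both O(1)
queue = deque()
queue.append('a')
queue.append('b')
print(queue.popleft())   # 'a' (first in, first out)

# Stack: push with append(), pop with pop() -- both O(1)
stack = []
stack.append('a')
stack.append('b')
print(stack.pop())       # 'b' (last in, first out)
```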
Arrays vs. Linked Lists
A classic tradeoff in computer science is whether to implement your data structure using an array or a linked data structure. Let's remind ourselves of some of the space and time tradeoffs between the two, since we will continue to use them in building new data structures in the coming weeks.
Watch the following video to remind yourself of the similarities and differences between arrays and linked lists:
To summarize:
Arrays
- Allow random (`O(1)`) access to elements, which makes them amenable to techniques like binary search (since it requires jumping to specific points in the array)
- Need to be stored in contiguous memory, so they are typically not resizable
- If you need to add more items to the array than its capacity would allow, you would need to create a new, larger array and copy all of the elements over (`O(n)`)
Linked lists
- Linked lists don't allow random access to elements; to find an item in the list, you generally must start at the head of the list and traverse as far as needed (`O(n)`)
- Linked lists don't need to be stored in contiguous memory. Instead, individual nodes can be stored anywhere on the memory heap, with pointers connecting nodes. Therefore, linked lists are not fixed-length
- Adding to the head or tail of the list is generally `O(1)`
Efficiency and Big-O Notation
With every new data structure and algorithm that we have learned, we have also discussed the efficiency implications. We will continue to do so in this course, and will also learn new ways of evaluating the efficiency of algorithms in terms of time and space.
We have been using big-O notation to describe efficiency. To remind yourself of the basics of big-O notation, watch the video below:
To summarize:
- Big-O notation is a measure of an algorithm's efficiency
- It describes the efficiency as a function of the size of the input, `n`
- When given an expression that represents the number of operations of an algorithm, the corresponding big-O notation can be calculated by ignoring lower-order terms and constants
- Other heuristics can be used to derive big-O expressions, such as analyzing loops
- It's important to remember that big-O notation is a simplified way of describing efficiency that focuses on the worst-case behavior of an algorithm; in practice, we care about all cases
Recursion
Many problems in computer science can be solved by repeatedly breaking down a problem into a slightly smaller problem, since once you know the solution to the smaller problem, it's easier to solve the original problem.
In the previous course, we learned about how to use recursion to solve such problems. We saw that recursion is a natural and elegant way to write algorithms that operate on data structures such as arrays, linked lists, and trees.
Watch the video below to remind yourself about how to think recursively and write a recursive function:
Recursion on data structures
Let's now take a look at the `length()` method, which uses recursion to find the length of a linked list:
```python
class LinkedList:
    class Node:
        def __init__(self, val, next):
            self.val = val
            self.next = next

    def __init__(self):
        self.head = None

    def __length(self, head):
        if head is None:
            return 0
        length_of_rest = self.__length(head.next)
        return length_of_rest + 1

    def length(self):
        return self.__length(self.head)
```
This is the style in which we have been writing recursive functions. Notice a few things in particular:
- We have a recursive helper method, which uses a parameter `head` to track where we currently are in the list.
- We have a base case that checks whether `head` is `None`; if it is, we have reached the end of the list.
- We make a recursive call, passing in `head.next` to advance to the next node in the list. We store the return value in a variable `length_of_rest`.
- Once we know the length of the rest of the list, we simply need to add one and return the result.
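As a quick illustration (using made-up values), we can build a small list by hand and call the method:

```python
# Build the list 3 -> 5 -> 7 by linking nodes directly, then measure it.
lst = LinkedList()
lst.head = LinkedList.Node(3, LinkedList.Node(5, LinkedList.Node(7, None)))
print(lst.length())   # 3
```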
Dictionaries and trees
At the very end of the previous course, we left off by describing the concept of data dictionaries -- data structures that are purpose-built for storing and looking-up data. There are many different kinds of data dictionaries, and many forms of search algorithms, including linear search, binary search, and binary search trees.
Linear search
Perhaps the simplest way of implementing a data dictionary would be to store all of our data in an array, and iterate over that array when searching for an item. However, if we don't know anything about the position of a desired item, then we have no choice except to look at all items, which makes linear search an `O(n)` algorithm.
Binary search
However, we can search much faster if we keep the items of the array in sorted order. If the items are ordered, then we can use binary search to more quickly "jump" to positions in the array. This significantly cuts down on the number of items we need to inspect when searching, which makes binary search an `O(log n)` algorithm.
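For reference, here's a standard iterative binary search on a sorted Python list, illustrating the `O(log n)` lookup described above (a generic sketch, not code from the course materials):

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid            # found: return the index
        elif items[mid] < target:
            low = mid + 1         # target must be in the right half
        else:
            high = mid - 1        # target must be in the left half
    return -1                     # not present

print(binary_search([3, 8, 15, 23, 42], 15))  # 2
```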
However, keeping the items of the array in sorted order comes at a cost. When we need to insert a new item, we need to move all elements that come after the newly-inserted item. This takes `O(n)` time, which is not ideal.
Binary search trees
In order to achieve `O(log n)` lookup time in addition to `O(log n)` insertion time, we can use binary search trees. By using a linked data structure that branches at each junction, we can more quickly perform insertions while still getting the benefits of binary search.
With balanced binary search trees, we can additionally guarantee logarithmic time lookup and insertion algorithms. Otherwise, "standard" BSTs can become unbalanced in some cases, leading to linear time complexities.
What's next?
With linear search, binary search, and binary search trees, our current understanding of the different implementations of data dictionaries at the moment is summarized by this table:
data structure | searching for an item | inserting an item |
---|---|---|
a list implemented using an array | O(log n) using binary search | O(n) |
a list implemented using a linked list | O(n) using linear search | O(n) |
binary search tree | O(log n) in best and average cases; O(n) in worst case | O(log n) in best and average cases; O(n) in worst case |
balanced binary search trees (e.g., 2-3 trees) | O(log n) in all cases | O(log n) in all cases |
Logarithmic algorithms are quite fast for data dictionaries, and are certainly a significant improvement over linear algorithms. However, we can actually do even better with the help of hash tables. Believe it or not, we will actually be able to implement search and insertion in `O(1)` time! This will be the topic of our next week of study.
Week 1 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 1
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa2 channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
Python and Data Structures Primer
In this assignment, you will use the data structures knowledge and Python skills that you have gained from previous courses to implement a series of functions.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Statement on Usage of AI Tools
The list below details acceptable and unacceptable uses of AI tools in Kibo courses, including generative AI tools like ChatGPT and AI tools integrated into code editors like CoPilot. For use cases not listed below, inquire about their acceptability on the course's Discord Help channel.
Acceptable Uses:
- Generating Ideas: Assisting in brainstorming topics for essays, projects, or presentations.
- Research Aid: Helping to find sources, papers, or literature for further reading and reference.
- Explaining Concepts: Providing explanations of complex concepts in simpler terms or different perspectives.
- Outlining: Aiding in creating outlines of topics to build structure for essays or reports.
- Drafting Assistance: Creating initial drafts of supplementary writing material where the main artifact being produced is not a written item.
- Creative Inspiration: Generating creative ideas for projects in arts, literature, or design.
- Technical Guidance: Offering guidance on coding problems, mathematical equations, or scientific principles.
- Proofreading: Assisting in grammar and spelling checks for written assignments.
Unacceptable Uses:
- AI-Generated Work: Submitting work significantly generated by AI as one's own in essays, reports, code, or other assignments.
- Plagiarism: Using AI to rephrase or copy content without proper citation or acknowledgment.
- Test and Exam Assistance: Using AI tools during tests or exams unless explicitly allowed by the instructor.
- Drafting Assistance: Creating initial drafts of written materials for general education courses.
- Complete Dependence: Relying solely on AI for understanding core concepts without engaging in personal study or critical thinking (AI tools can often be wrong).
Important Notes:
- Students must disclose the use of AI tools in their assignments and provide either shareable links or exports (if the tool doesn't support sharing) showing their use of the tools on the assignment.
- AI tools might not strictly adhere to instructions or might include unrequested information. Always review the output to ensure accuracy and relevance.
- Work generated through AI should be critically reviewed and substantially modified to ensure originality and understanding.
Hash Tables
Welcome to week 2!
Learning objectives
By the end of this week, you will be able to:
- trace, compare, and contrast various collision resolution methods, and evaluate their impact on hash table performance.
- apply the concept of hash tables to solve problems involving efficient data lookup and storage.
- compare the effectiveness of hash tables to other data dictionaries, taking into account time and space complexity.
What's due this week
- Week 2 Quiz (on Gradescope), due Thursday before class starts
- Assignment 2, due Sunday
Direct-Address Tables
At the end of last week, we discussed the characteristics of various implementations of data dictionaries. The most efficient variant we have seen so far is balanced binary search trees, which guarantee `O(log n)` lookup and insertion performance. But can we do better?
Ultimately, we will see that hash tables allow us to achieve even better performance than balanced BSTs. However, before we talk about hash tables, we need to discuss direct-address tables.
Direct-address tables
Say we are trying to insert some (key, value) pairs into an initially empty array of length `n`. If we know that each key is in the range `[0, n - 1]`, then we can simply use the key as the index into the array!
For example, say that our (key, value) pairs represent football players. Each key is a player's jersey number, and each value is the player's name. If we can assume that jersey numbers are in the range 0-99, then we can easily insert all of the players into an array of length 100:

To look up the name of a player, we can use their jersey number to index the array and find the name in `O(1)` time.
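Here's a tiny sketch of that idea in Python (the player names are made up for illustration):

```python
# A minimal direct-address table: keys are assumed to be integers in 0-99.
table = [None] * 100

def insert(jersey_number, name):
    table[jersey_number] = name      # the key itself is the index

def lookup(jersey_number):
    return table[jersey_number]      # O(1) access by key

insert(10, "Messi")      # hypothetical players, just for the example
insert(7, "Ronaldo")
print(lookup(10))        # Messi
```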
However, there are a few issues with this approach to storing (key, value) pairs:
- In many cases, we don't know the key space, or range of potential keys, ahead of time. For example, the key could be any number, not just a number in the range 0-99.
- Even if we knew the range of the keys, we have to make our array large enough to fit the maximum value in the range. In other words, if the maximum possible key value was 1,000,000, we would need to make our array large enough to hold 1,000,000 items, even if we were only expecting 1,000 items in total.
Because of these issues related to the data types and ranges of values of keys, directly using keys to index an array won't work in the general case. However, we can achieve a similar result using hash functions.
Hash Tables
In the previous lesson, we saw that there were multiple issues with trying to index a table directly using keys, including the fact that keys can be any value (including strings!).
To solve this, hash tables provide us with the ability to map a key of any data type to an array index. Watch the two videos below to learn about how hash functions work:
To summarize:
- A hash function `h(x)` maps a key `x` to a hash value, which is an integer.
- A key can be mapped to an array position using the expression `h(x) % n`, where `n` is the size of the array.
- An array that uses hash functions to manage (key, value) pairs is a hash table.
- Ideally, hash functions map keys to integers uniformly at random.
- Typically, performing a key lookup in a hash table is a constant time (`O(1)`) operation.
- Since a hash function may map two different keys to the same table position, we need a way to deal with collisions.
Separate chaining
As we previously saw, a hash table can have a collision if two different keys have the same hash value. More formally, we say that two keys $x$ and $y$ collide if $h(x) = h(y)$ where $x \neq y$.
Therefore, we may sometimes run into a situation where we are trying to insert key $y$ into the table, but the key $x$ already occupies the bucket where $y$ belongs! One technique for resolving a collision is known as separate chaining.
Buckets of linked lists
In a separate chaining hash table, the table is represented as an array of buckets, where each bucket contains a linked list of entries:

Now, two (or more) keys that have the same hash value are inserted into the corresponding bucket's linked list. For example, `cat`, `cow`, and `camel` are all hashed to the bucket at index 2 and are therefore stored in a linked list.
Note: from this point on, for simplicity our hash table diagrams will only show the keys that are used to organize the table. However, don't forget that typically, a hash table is used to store (key, value) pairs as a data dictionary.
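To make the structure concrete, here's a minimal separate-chaining sketch in Python (the class name, bucket count, and use of Python's built-in `hash()` are assumptions, and plain lists stand in for the per-bucket linked lists):

```python
class ChainedHashTable:
    def __init__(self, num_buckets=11):
        # each bucket is a list of (key, value) pairs
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: update value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # otherwise chain a new entry

    def lookup(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None                       # not found
```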
Building a hash table
Watch the following video to see how to build a separate chaining hash table from a list of keys:
Analyzing separate chaining
Previously, we saw that in many cases, looking up a key in a hash table is `O(1)`. However, collisions complicate the lookup operation. In separate chaining, when we hash a key to a bucket, we then need to walk down the linked list in that bucket to find the key (if it's present).
Worst case analysis
In the worst case, all $n$ keys that are in the table have been hashed to the same bucket, creating a linked list in that bucket that is $n$ nodes long. Looking up a key in this list would therefore take `O(n)` time in the worst case, since we would need to start at the beginning of this list and walk down it until we find the key we're looking for.
Average case analysis
But what about the average case? Is it typically closer to the best case `O(1)` performance, or closer to the worst case `O(n)` performance? To answer this question, we first need to define the load factor of a hash table.
The load factor ($\alpha$, "alpha") of a hash table is the main way to measure how full the hash table is. It is defined as the ratio of the number of entries in the table ($n$) to the overall number of buckets in the table ($m$):
$\alpha = n/m$
For example, the load factor of a hash table of length 100 with 50 items in it is $\alpha = 0.5$.
When looking up a key in a separate-chaining hash table, the average length of the linked lists is the load factor, since it represents $n$ elements spread out over $m$ slots. Therefore, you can think of the average case lookup performance in a separate chaining hash table as `O(n/m)`.
However, as long as the load factor is kept below 1 (i.e., the table is not too full), the time complexity is generally better than `O(log n)`, and approaches `O(1)`.
This is yet another classic time-space tradeoff in computer science. Larger hash tables take up more memory, but have a lower load factor, which means that the linked lists in buckets are shorter, which leads to faster lookup times. Smaller hash tables conserve space, but have longer linked lists and therefore longer lookup times.
Open Addressing I
Separate chaining is just one technique for resolving collisions in hash tables. The other main technique is known as open addressing.
In open addressing, the keys are stored directly in the entries of the hash table (not in buckets of linked lists). If a collision occurs when trying to insert a new key, a different, open entry in the table must be found instead. The process of finding an open entry is known as probing, and we'll consider three different techniques: linear probing, quadratic probing, and double hashing.
Linear probing
In linear probing, whenever a collision occurs in the hash table, we look sequentially in the table for the next open entry, one index at a time. In other words, as long as there are collisions, we continue to look for the next empty cell according to this sequence (wrapping around as needed):
h(x), h(x) + 1, h(x) + 2, h(x) + 3, ...
Watch the following video to see an example of linear probing:
If there is an open cell in the table (and if the table size is a prime number), linear probing will eventually find it. The disadvantage, however, is that linear probing doesn't tend to spread keys throughout the table very well. It can lead to clusters of occupied cells, which leads to longer subsequent probes when looking up values.
To try to correct this behavior, we can consider quadratic probing.
Quadratic probing
Quadratic probing skips increasingly larger sequences of cells when looking for an open position. It uses the following sequence of probes (wrapping around as necessary):
h(x), h(x) + 1^2, h(x) + 2^2, h(x) + 3^2, ...
Watch the following video to see an example of quadratic probing:
The advantage of quadratic probing is that it spreads the keys of the table further apart, which creates smaller clusters of occupied cells compared to linear probing. However, the disadvantage is that quadratic probing is not necessarily guaranteed to find an open cell. Because the probe sequence increases quadratically, there is no mathematical guarantee that we will eventually land on an empty cell.
Therefore, we'll consider one more probing scheme in the next lesson: double hashing.
Open Addressing II
The final probing technique for resolving collisions that we'll consider is double hashing, which actually uses two different hash functions.
Double hashing
In double hashing, we use two hash functions instead of one. When there is a collision when using our first hash function, `h1(x)`, we'll add in increasingly larger factors of our second function, `h2(x)`. The probe sequence is as follows:
h1(x), h1(x) + 1*h2(x), h1(x) + 2*h2(x), h1(x) + 3*h2(x), ...
Because we are repeatedly adding in multiples of `h2(x)`, we must pick an `h2` that can never output 0 -- otherwise, the probe sequence would not change.
Watch the video below to see how double hashing uses two different hash functions to find an empty space in the table:
With double hashing, we get the best of both worlds: a probing scheme that spreads the keys throughout the table like quadratic probing does, and is also guaranteed to find an open position (if one exists, and the table size is a prime number) like linear probing does.
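To make the three probe sequences concrete, here's a small sketch (`h1`, `h2`, the table size `m`, and the toy hash function in the usage example are all assumptions, not functions from the course):

```python
def linear_probe(h1, m, x):
    """Return a function giving the i-th slot visited by linear probing."""
    return lambda i: (h1(x) + i) % m

def quadratic_probe(h1, m, x):
    """Return a function giving the i-th slot visited by quadratic probing."""
    return lambda i: (h1(x) + i * i) % m

def double_hash_probe(h1, h2, m, x):
    """Return a function giving the i-th slot for double hashing (h2(x) != 0)."""
    return lambda i: (h1(x) + i * h2(x)) % m

# Example: the first few linear-probe slots for a toy hash function.
h1 = lambda s: len(s)                  # illustrative only
probe = linear_probe(h1, 11, "cat")
print([probe(i) for i in range(4)])    # [3, 4, 5, 6]
```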
Analysis of open addressing
In many cases, looking up a key in an open addressing hash table is `O(1)`, especially if there are no collisions. However, when there are collisions, we need to utilize one of the probing algorithms, which can inflate the time complexity.
Worst case analysis
In the worst case, the table is full and the key we are looking for (or inserting) is not yet in the hash table. In that case, the probing algorithms will lead us to probe all $m$ slots in the table to ensure that the key is not present, which is `O(m)`. This is because in the worst case, we hash a key to a slot in the hash table, but as we probe through the table, we continue to find full slots until we have checked every slot.
Average case analysis
In the average case, when hashing a key to a slot in the table, we might encounter some small number of collisions during the probing algorithm. However, as long as the load factor is not too high (say, $\alpha < 1/2$), we will be able to find the key (or be sure it's not present) within a constant number of probes. Therefore, similar to separate chaining, under decent conditions (a good hash function and a table that's not too full), we can expect performance that approaches `O(1)`.
Hash Deletion
In the past few lessons, we covered how to insert new items into separate chaining and open addressing hash tables. Let's also now take a look at how to delete an item from a hash table.
Separate chaining
Deleting a key `x` is relatively simple in a separate chaining hash table. The algorithm is:
- Compute `h(x)` to find the bucket where `x` belongs.
- Use the linked list deletion algorithm to delete the node with key `x`.
Open addressing
Deletion in an open addressing hash table is a little bit more challenging, since it can leave gaps in the table that interfere with the probing process. Watch the following video to see why deletion can create gaps, and how to fix it:
To summarize:
- When deleting an item, assign a special "removed" value to that slot in the table.
- For example, "empty" spaces may contain -1 and "removed" spaces may contain -2 as sentinel values
- When performing lookups, we need to skip past any removed slots and continue searching
- When performing insertions, we can insert into either empty or removed slots (whichever comes first), after checking to make sure the key to be inserted doesn't already exist
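Here's a minimal sketch of open-addressing deletion with sentinel values, assuming linear probing and non-negative integer keys (the `EMPTY`/`REMOVED` constants and the `delete` function are illustrative, not a required interface):

```python
EMPTY = -1     # slot never used
REMOVED = -2   # slot previously held a key that was deleted

def delete(table, key, h):
    """Mark key as removed in an open-addressing table using linear probing."""
    m = len(table)
    i = h(key) % m
    for _ in range(m):
        if table[i] == EMPTY:
            return False             # an empty slot means the key isn't here
        if table[i] == key:
            table[i] = REMOVED       # leave a "removed" marker, not a gap
            return True
        i = (i + 1) % m              # keep probing past removed/occupied slots
    return False
```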
Example Hash Function
Throughout these lessons, we have been using relatively simple hash functions that just correspond to the length of a string or the index of the string's first character. However, these aren't necessarily good hash functions for use in actual applications. Remember that hash functions should (more or less) map keys to hash values uniformly at random -- we wouldn't want every word that starts with `'a'` to map to the same hash value!
The following video shows how to define a hash function that can be used on strings:
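As one illustration of the idea (not necessarily the exact function from the video), a simple polynomial string hash looks like this; the multiplier 31 is a common but arbitrary choice:

```python
def hash_string(s, table_size):
    """Map a string to a bucket index by mixing in each character."""
    h = 0
    for ch in s:
        h = (h * 31 + ord(ch)) % table_size
    return h

print(hash_string("apple", 11))
print(hash_string("avocado", 11))   # words starting with 'a' still spread out
```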
Applications
Hash functions are critical for hash tables as data structures, but they are also a more general mechanism and useful for a variety of applications, especially for security. Optionally, check out the following videos about how hash functions are used in various contexts.
Note: these are not applications of hash tables, just applications of hash functions.
Hash functions and file authenticity
Hash functions and passwords
Hash functions and cryptographic ledgers (blockchain)
Reading
Read about hashing and hash tables in Chapter 7 of An Open Guide to Data Structures and Algorithms.
Practice
Here is a selection of practice problems from LeetCode involving hash tables (in Python, dictionaries):
Week 2 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 2
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa2 channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
Hash Tables
In this assignment, you will complete a series of exercises that test your knowledge of how to use and build hash table algorithms.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Statement on Usage of AI Tools
The list below details acceptable and unacceptable uses of AI tools in Kibo courses, including generative AI tools like ChatGPT and AI tools integrated into code editors like CoPilot. For use cases not listed below, inquire about their acceptability on the course's Discord Help channel.
Acceptable Uses:
- Generating Ideas: Assisting in brainstorming topics for essays, projects, or presentations.
- Research Aid: Helping to find sources, papers, or literature for further reading and reference.
- Explaining Concepts: Providing explanations of complex concepts in simpler terms or different perspectives.
- Outlining: Aiding in creating outlines of topics to build structure for essays or reports.
- Drafting Assistance: Creating initial drafts of supplementary writing material where the main artifact being produced is not a written item.
- Creative Inspiration: Generating creative ideas for projects in arts, literature, or design.
- Technical Guidance: Offering guidance on coding problems, mathematical equations, or scientific principles.
- Proofreading: Assisting in grammar and spelling checks for written assignments.
Unacceptable Uses:
- AI-Generated Work: Submitting work significantly generated by AI as one's own in essays, reports, code, or other assignments.
- Plagiarism: Using AI to rephrase or copy content without proper citation or acknowledgment.
- Test and Exam Assistance: Using AI tools during tests or exams unless explicitly allowed by the instructor.
- Drafting Assistance: Creating initial drafts of written materials for general education courses.
- Complete Dependence: Relying solely on AI for understanding core concepts without engaging in personal study or critical thinking (AI tools can often be wrong).
Important Notes:
- Students must disclose the use of AI tools in their assignments and provide either shareable links or exports (if the tool doesn't support sharing) showing their use of the tools on the assignment.
- AI tools might not strictly adhere to instructions or might include unrequested information. Always review the output to ensure accuracy and relevance.
- Work generated through AI should be critically reviewed and substantially modified to ensure originality and understanding.
Priority Queues and Heaps
Welcome to week 3!
Learning objectives
By the end of this week, you will be able to:
- Describe how the heap structure allows efficient insertion and deletion of elements while maintaining its ordering property.
- Trace, compare, and contrast heapsort with other sorting algorithms.
- Implement solutions to problems by applying priority queues and heaps.
What's due this week
- Week 3 Quiz (on Gradescope), due Thursday before class starts
Priority Queues
Now that we've finished talking about data dictionaries in the form of search trees and hash tables, let's turn our attention to a new kind of ADT: the priority queue.
A priority queue is a data structure where each element has an associated priority. The priority of an element determines the order in which the element will be removed from the queue. In a max priority queue, the element with the maximum priority would be the first to be removed, the element with the next maximum priority would be removed second, etc. For example:

There is also a min priority queue variation, where the element with the minimum priority would be the first to be removed, etc.
Priority queues are useful for a number of applications, including:
- Scheduling a shared resource, such as the CPU: give some processes a higher priority so that they will be scheduled first
- Huffman encoding: at every step of forming the Huffman tree, you need to know the next two nodes with the highest frequencies; this could be implemented using a priority queue.
- Traffic prioritization: when network bandwidth is limited, packets are prioritized according to their priority, and higher priority packets are sent first
Operations of a PQ
The priority queue ADT generally calls for the following operations:
- `insert(item, priority)`: insert an item with a given priority into the queue
- `remove()`: remove the maximum priority item from the queue
- `peek()`: return the maximum priority element (and/or the priority itself) without actually removing the element
Notice that these operations are very similar to the operations of the queue and stack ADTs that we learned about in a previous course. Indeed, a priority queue is a generalization of the concepts of queues and stacks -- it just makes the priority of each element explicit. In a stack, the priority of an element is related to how recently it was added -- more recently added items have higher priorities.
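As a quick illustration, Python's built-in `heapq` module can already serve as a priority queue. `heapq` implements a min-heap, so one common trick (shown here) is to negate priorities to get max-first behavior; the tasks are made up for the example:

```python
import heapq

pq = []
heapq.heappush(pq, (-3, "low priority task"))
heapq.heappush(pq, (-10, "urgent task"))
heapq.heappush(pq, (-7, "important task"))

while pq:
    priority, item = heapq.heappop(pq)
    print(-priority, item)   # prints 10 urgent, then 7 important, then 3 low
```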
Implementing a PQ
There are various ways that we could implement the PQ ADT using the data structures we have at our disposal. Here are some options:
Unsorted array or linked list
With an unsorted array or linked list, every time the `insert()` operation is called, we could just append the new item onto the end of the data structure. Generally speaking, this should take only `O(1)` time, although in the case of an array we might need to occasionally resize the array.
However, when we go to `remove()` the element with the maximum priority, we would need to do an `O(n)` pass over the array or linked list to (a) find the element and (b) remove it, closing any gaps in the elements by shifting as needed. Even though insertion is very efficient, removal is quite inefficient.
Sorted array or linked list
Perhaps if we maintain a sorted list, then we could do better! If we kept an array or linked list sorted by the priorities, then we would always know that the maximum priority is at one end of the list (the beginning or the end, depending on how you want to implement it). We could therefore easily implement the `remove()` operation in `O(1)` time. However, insertion now pays the price: to keep the list sorted, `insert()` must find the new item's correct position and shift or traverse elements, which takes `O(n)` time.
Linear data structures seem to always require `O(n)` time, one way or another. Could we do better with a binary search tree?
Balanced BST
If we use a balanced BST, with priorities as the keys and items as the values, then we could implement the priority queue in logarithmic time: it takes `O(log n)` time to insert a new (priority, item) pair. Removal also takes `O(log n)` time: although we could always maintain a reference to the maximum priority in the tree for quick access, removing it requires re-balancing the tree, which takes `O(log n)` time. While this is better in some ways than the `O(n)` time required by the linear data structures, it's still not ideal.
Instead, what we need is a new abstraction: a heap.
Heaps
A heap is a type of tree data structure that satisfies the heap property: every node's key is greater than or equal to the keys of its children.
Here are some example heaps:

Note that with the heap property, it necessarily means that the maximum value in the heap is at the root of the tree. This makes it a very natural data structure to use when implementing a priority queue, since the highest priority element is readily available!
The most common way of implementing a heap is by using a complete binary tree.
Complete binary tree
A complete binary tree (note: not a binary search tree) is a binary tree of height h where:
- levels 0 through h - 1 are fully occupied
- in level h, there are no "gaps" to the left of any node
In essence, a complete tree is one where the tree is filled with nodes from top to bottom, left to right, and every level is full of nodes except for possibly the last. Here are some examples of complete trees:

And here are some examples of trees that are not complete (where the node that's missing to prevent it from being complete is highlighted):

Representing a complete binary tree
In our previous data structures course, we represented trees as a linked data structure: every node in a tree had some data element, as well as pointers to its left child, right child, and parent. This works great in the general case for representing trees.
However, when using complete trees (as we are for heaps), we can actually have a simpler representation that just uses an array. This makes it a lot simpler to manage and manipulate the heap.
To represent a complete tree as an array, the tree's nodes are stored in the order given by a level-order traversal: top to bottom, left to right:
![A complete tree showing how it can be interpreted using an array. The root node shows a[0] and its left and right children show a[1] and a[2], respectively. The node with a[1] has a[3] and a[4] as its left and right child, etc.](/images/week-03/dsa2-week3-heap-array.png)

Once the tree is represented as an array (or in our case, a list in Python), you can use the following formulas to navigate the tree:
- The root node is in `lst[0]`
- Given the node in `lst[i]`:
  - its left child is in `lst[2*i + 1]`
  - its right child is in `lst[2*i + 2]`
  - its parent is in `lst[(i - 1) // 2]` (using integer division)
For example, consider this complete tree:

- the left child of the node in `lst[1]` is in `lst[2*1 + 1] = lst[3]`
- the left child of the node in `lst[2]` is in `lst[2*2 + 1] = lst[5]`
- the right child of the node in `lst[3]` is in `lst[2*3 + 2] = lst[8]`
- the parent of the node in `lst[4]` is in `lst[(4 - 1) // 2] = lst[1]`
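These formulas translate directly into code. Here's a small sketch (the list values are arbitrary example data, not necessarily the tree from the figure):

```python
def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

def parent(i):
    return (i - 1) // 2   # integer division

lst = [32, 28, 20, 25, 15, 19, 9, 7, 18, 3]   # example complete-tree contents
print(lst[left(1)], lst[right(3)], lst[parent(4)])
```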
Summary
To summarize, we are now looking to implement the priority queue ADT using a max-at-top heap. Most often, heaps are complete trees, which are implemented as arrays due to the simplicity of the representation.
Watch the following video to see a summary of heaps and complete trees:
Heap Insertion
Let's talk about how to insert items into a heap. To do so, let's define a `Heap` class that we can add functionality to, and start it with a list of items and a counter for the current number of items:
```python
class Heap:
    def __init__(self):
        self.contents = []
        self.num_items = 0
```
Recall that since heaps are complete trees, they have a convenient representation as a list. Therefore, we'll avoid creating a linked data structure and instead implement the heap using a list.
Algorithm
Watch the video below to see how to insert an item into a heap:
To summarize:
- The new item initially gets put into the end of the list (a leaf node).
- The value in that leaf node is "sifted up" in the heap until it obeys the heap property.
Efficiency
When inserting an item into the heap, we first place the item in the last position in the list in constant time. However, we then need to run the `sift_up()` algorithm, which in the worst case pushes the item from a leaf node all the way up to the root.
Since the heap is a complete tree, we know that it is also balanced, and therefore we can say that in the worst case, an item is sifted up throughout the height of the tree, where `h = O(log n)`. Each of the swaps that occurs during the sift operation takes constant time.
Therefore, insertion in a heap is `O(log n)`.
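The insertion algorithm relies on a sift-up helper. Here's one possible sketch, written as if it were a method of the `Heap` class above (the method name and details are assumptions; the course implementation may differ):

```python
    def __sift_up(self, i):
        """Swap the item at index i with its parent until the heap property holds."""
        while i > 0:
            parent = (i - 1) // 2
            if self.contents[i] <= self.contents[parent]:
                break                      # parent is already >= the new item
            self.contents[i], self.contents[parent] = \
                self.contents[parent], self.contents[i]
            i = parent
```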
Now that we know how to insert items into a heap, let's take a step back and look at the overall heap construction algorithm and its efficiency. The big-O notation for building a heap may surprise you!
Heap Removal
The `remove()` operation of a heap is extremely important -- usually, the reason why we keep the items in a heap is so that we can efficiently remove them when ready.
The removal algorithm is quite similar to the insertion algorithm, in the sense that we will need to do some sifting to keep the heap in order. However, this time we'll be sifting down instead of sifting up.
Algorithm
Watch the video below to see how to remove the maximum item from a heap:
To summarize, when removing the maximum item from the heap, we can:
- Make a copy of the largest item
- Move the last item in the heap to the root
- "Sift down" the new root item until it is >= its children (or it's a leaf)
- Note: when a parent is less than both of its children, we need to swap the parent with the greater of the two children.
- Return the largest item
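The "sift down" step relies on a helper like the one sketched below, written as if it were a method of the `Heap` class (the exact name and details are assumptions):

```python
    def __sift_down(self, i):
        """Push the item at index i down until it is >= its children or a leaf."""
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < self.num_items and self.contents[left] > self.contents[largest]:
                largest = left
            if right < self.num_items and self.contents[right] > self.contents[largest]:
                largest = right
            if largest == i:
                return                     # heap property holds at this node
            self.contents[i], self.contents[largest] = \
                self.contents[largest], self.contents[i]
            i = largest                    # continue sifting from the child
```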
Efficiency
Rather than discuss the efficiency explicitly, let's put your analytical skills to the test:
Heap Construction
Say that we have a list of numbers that we want to convert into a heap:
[15, 23, 39, 14, 8, 45, 18, 27]
To build a heap, we could take this list of random numbers and insert them, one at a time, into an initially empty heap. We know from the previous lesson that each insertion takes `O(log n)` time, and if there are `n` elements, then in the worst case this process takes `n` insertions * `O(log n)` per insertion = `O(n log n)` time.
However, this algorithm is not optimal. We can actually do better than `O(n log n)`!
Linear-time heap construction
An alternative method to building a heap is to start with all of the elements we want to insert in a complete tree. For example, we would represent the above list as the following complete tree:
![A complete tree with the following list added in level order: [15, 23, 39, 14, 8, 45, 18, 27].](/images/week-03/dsa2-week3-buildheap.png)
Then, we use the following algorithm:
- Find the parent of the last child node.
- Sift down the parent as much as needed to obey the heap property. Remember that if both children are greater than the parent, the parent should be swapped with the greater of the two.
- Proceed to the rest of the nodes in the same level, from right to left. Sift them down as needed.
- Proceed to the next level up in the complete tree, moving from right to left and sifting them down as needed.
- Continue through the rest of the remaining levels.
Watch the following video to see an example of this algorithm:
This algorithm actually works in `O(n)` time! But how?
Explanation of efficiency
Let's make an informal argument for why this heap construction method only takes linear time.
Half the nodes are leaf nodes
First, let's think about how many nodes there are in each level of a complete tree:
Level | Number of nodes |
---|---|
0 | 1 |
1 | 2 |
2 | 4 |
3 | 8 |
... | ... |
h | 2^h |
Also, a complete tree of height h
has 2(h + 1) - 1
nodes total. For example, a tree of height 2 (which has three levels) has 7 nodes total: 1 in the first level, 2 in the second level, and 4 in the third level.
Therefore, it stands to reason that out of the 2^(h + 1) - 1 total nodes in a tree, 2^h
of them are in the last level (leaf nodes). For example, in a tree of height 4, there are 31 total nodes and 16 of them are leaf nodes!
This means that in a complete tree, roughly half of the nodes are leaf nodes. This is important because in the heap construction algorithm, we skip all of the leaf nodes -- we don't need to consider them for sifting down. We start with the parent of the last leaf node.
Most nodes are not sifted far
We showed above that approximately half (~n/2
) of the nodes don't need to be sifted at all because they're leaf nodes. Proceeding with that logic, there are ~n/4
parents of leaf nodes, and although they have to be considered for sifting, they are only sifted at most one level (down into a leaf node).
In fact, the further you work your way up in the tree, there are fewer and fewer nodes that need to be sifted throughout the height of the tree. The root node is the only node that needs to potentially be sifted throughout the full height h
of the tree!
In other words, the time spent sifting a few elements through the full height of the tree is offset by the many nodes that are sifted only a small number of levels or not at all. Although we won't work through the full mathematical argument in this lesson, it turns out that the total running time of heap construction is O(n)
!
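For the curious, here is a sketch of why those savings add up to linear time (this goes a bit beyond what the lesson requires): level $$j$$ of the tree holds $$2^j$$ nodes, and a node at level $$j$$ is sifted down at most $$h - j$$ levels, so the total number of swaps is at most

$$\sum_{j=0}^{h} 2^j (h - j) = 2^h \sum_{k=0}^{h} \frac{k}{2^k} \le 2 \cdot 2^h \le 2n$$

using the facts that $$\sum_{k \ge 0} k / 2^k = 2$$ and that a complete tree of height $$h$$ has at least $$2^h$$ nodes.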
Code for heap construction
Up to this point, we have been using a definition of the Heap
class that has a fairly simple constructor:
class Heap:
    def __init__(self):
self.contents = []
self.num_items = 0
However, we could also use a version of the constructor that accepts a list contents
as a parameter, and builds a heap based on that list. For example:
def __init__(self, contents):
self.contents = contents
    self.num_items = len(contents)
self.__make_heap()
def __make_heap(self):
if len(self.contents) == 0:
return
last = len(self.contents) - 1
parent_of_last = (last - 1) // 2
for i in range(parent_of_last, -1, -1):
self.__sift_down(i)
The __make_heap()
method implements the algorithm described above for starting with the parent of the last leaf node, and repeatedly sifting down each element as needed in order to construct a heap.
In case it's helpful, here's an entire implementation of the Heap
class for your reference:
class Heap:
def __init__(self, contents):
self.contents = contents
self.num_items = len(contents)
self.__make_heap()
def __make_heap(self):
if len(self.contents) == 0:
return
last = len(self.contents) - 1
parent_of_last = (last - 1) // 2
for i in range(parent_of_last, -1, -1):
self.__sift_down(i)
def insert(self, item):
self.contents.append(item)
self.__sift_up(self.num_items)
self.num_items += 1
def remove(self):
if self.num_items == 0:
return None
to_remove = self.contents[0]
self.contents[0] = self.contents[self.num_items - 1]
self.contents.pop()
self.num_items -= 1
if self.num_items > 0:
self.__sift_down(0)
return to_remove
def is_empty(self):
return self.num_items == 0
# to_string - create a string representation of the heap of the form
# { ( root ) ( items in level 1 ) ( items in level 2 ) ... }
def to_string(self):
string = '{ '
start = 0
level_size = 1
while start < self.num_items:
# print all of the items at the current level of the tree
string += '( '
i = start
while i < start + level_size and i < self.num_items:
                string += (str(self.contents[i]) + ' ')
i += 1
string += ') '
# move down to the next level
start += level_size
level_size *= 2
string += '}'
        return string
def __sift_down(self, i):
# Store a reference to the element being sifted.
to_sift = self.contents[i]
# Find where the sifted element belongs.
parent = i
child = 2 * parent + 1
while child < self.num_items:
# If the right child is bigger, compare with it.
if child < self.num_items - 1 and self.contents[child] < self.contents[child + 1]:
child = child + 1
# Check if we're done.
if to_sift >= self.contents[child]:
break
# If not, move child up and move down one level in the tree.
self.contents[parent] = self.contents[child]
parent = child
child = 2 * parent + 1
self.contents[parent] = to_sift
def __sift_up(self, i):
# Store a reference to the element being sifted.
to_sift = self.contents[i]
# Find where the sifted element belongs.
child = i
while child > 0:
parent = (child - 1) // 2
# Check if we're done.
if to_sift <= self.contents[parent]:
break
# If not, move parent down and move up one level in the tree.
self.contents[child] = self.contents[parent]
child = parent
self.contents[child] = to_sift
Heapsort
Back in our previous data structures course, we talked about various sorting algorithms, including selection sort, insertion sort, mergesort, and radix sort. Here's a summary of the efficiencies of those algorithms:
Algorithm | Best case | Worst case | Space |
---|---|---|---|
Selection sort | O(n^2) | O(n^2) | O(1) |
Insertion sort | O(n) | O(n^2) | O(1) |
Mergesort | O(nlogn) | O(nlogn) | O(n) |
Radix sort | O(m*n) | O(m*n) | O(n) |
We were able to define fairly efficient algorithms in mergesort and radix sort, but both of those algorithms require O(n)
extra space to work. Can we do better?
Improving selection sort
Recall selection sort. It repeatedly finds the next smallest element in a list, and swaps it into place. It isn't a particularly efficient algorithm though, since for each of O(n)
elements, it performs an O(n)
pass over the list to find the next smallest element, leading to a running time of O(n2)
.
However, we now have a data structure that allows us to efficiently find the next smallest (or largest) item: a heap! Indeed, we will use a heap to implement heapsort, which will repeatedly find the next largest item in a list and swap it into its final sorted position.
Algorithm
Say that we start with an arbitrary list of numbers. We can use heapsort to sort the list with the following high-level algorithm:
- Turn the list into a heap.
- Remove the maximum item from the heap and place it at the end of the list. This element is now in its final sorted position. Re-heapify the heap.
- Repeatedly remove the maximum item from the heap and place it in its final sorted position until all elements have been removed.
Watch the following video to see how heapsort works:
Practice
Consider the following list:
[20, 18, 32, 3, 11, 7, 17]
Sort the list using heapsort. Show the evolution of the list as it changes in memory.
Pseudocode for heapsort
Here is some pseudocode for the operation of heapsort. It's actually close to working Python code, but it won't quite work with the implementation of the Heap
class that we made, since remove()
actually pops the element from the list (as opposed to just placing it at the end and keeping it there).
def heapsort(lst):
# Create a new heap from the given list.
heap = Heap(lst)
end_unsorted = len(lst) - 1
while end_unsorted > 0:
# Get the largest remaining element and put it
# at the end of the unsorted portion of the array.
largest_remaining = heap.remove()
lst[end_unsorted] = largest_remaining
end_unsorted -= 1
A fun challenge could be to make this version of heapsort()
work with the Heap
class that we have given you, which would involve keeping the list the same length even as elements are removed.
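If you'd like to see what an in-place version can look like, here is a sketch that works directly on the list with its own standalone sift_down() helper -- the names here are illustrative and this is not the course's official solution:
def sift_down(lst, i, size):
    # Sift lst[i] down within lst[0:size], treating that prefix as a max-at-top heap.
    to_sift = lst[i]
    parent = i
    child = 2 * parent + 1
    while child < size:
        # Compare with the larger of the two children.
        if child < size - 1 and lst[child] < lst[child + 1]:
            child += 1
        if to_sift >= lst[child]:
            break
        lst[parent] = lst[child]
        parent = child
        child = 2 * parent + 1
    lst[parent] = to_sift

def heapsort(lst):
    n = len(lst)
    # Step 1: build a max-at-top heap in place (linear time).
    for i in range((n - 2) // 2, -1, -1):
        sift_down(lst, i, n)
    # Step 2: repeatedly move the current maximum to the end of the unsorted portion.
    for end in range(n - 1, 0, -1):
        lst[0], lst[end] = lst[end], lst[0]   # put the max in its final position
        sift_down(lst, 0, end)                # re-heapify the remaining prefix

lst = [20, 18, 32, 3, 11, 7, 17]
heapsort(lst)
print(lst)   # [3, 7, 11, 17, 18, 20, 32]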
Other resources
Here are some other resources for heapsort that you might find helpful:
Heapsort Analysis
Now that we've seen how heapsort operates, let's think about how efficient it is.
Time analysis
We saw in the previous lesson that heapsort consists of two main steps:
- Convert an arbitrary list into a heap.
- Repeatedly remove the maximum element from a heap until you get a sorted list.
As we saw in the heap construction lesson, step (1) above takes time O(n)
. After this step is done, we then repeatedly remove the maximum element. In fact we remove n - 1 = O(n)
such elements, and each call to remove()
takes time O(logn)
in order to put the heap back into order. Therefore, the running time of the second step is O(n) * O(logn) = O(nlogn)
.
Putting the two steps together, we get O(n) + O(nlogn) = O(nlogn)
. This is just as efficient as mergesort!
Space analysis
Although heapsort has the same running time as mergesort, it actually performs better in terms of space. Whereas mergesort requires an auxiliary list to be able to work (which means O(n)
extra space), the heapsort algorithm can be performed in-place, using only a constant amount of extra space for local variables. Notice that in the visualizations of heapsort shown in the previous lesson, all the work is done in the same list that was passed in!
Completing our table of sorting algorithm analysis, we can now add heapsort:
Algorithm | Best case | Worst case | Space |
---|---|---|---|
Selection sort | O(n^2) | O(n^2) | O(1) |
Insertion sort | O(n) | O(n^2) | O(1) |
Mergesort | O(nlogn) | O(nlogn) | O(n) |
Radix sort | O(m*n) | O(m*n) | O(n) |
Heapsort | O(nlogn) | O(nlogn) | O(1) |
heapq Library
So far, we've built our own Heap
class to illustrate the basic algorithms behind heaps. However, there are heap libraries available for you to use in Python as well.
One option is the heapq
module, which provides functionality for a min-at-top heap. In this lesson, we'll cover the three main functions of heapq
: heappush()
, heappop()
, and heapify()
.
heappush()
The heappush()
function adds an element to the heap while maintaining the heap invariant. For example:
import heapq
heap = []
heapq.heappush(heap, 4)
heapq.heappush(heap, 1)
heapq.heappush(heap, 7)
print(heap) # Output: [1, 4, 7]
Notice that the "heap" is just a list, but the heapq
library manages the list to enforce heap properties. After adding the three elements to the list, the list is [1, 4, 7]
, representing a complete tree where 1
is the root node and 4
and 7
are its left and right children, respectively.
Note that you can also use this library to insert tuples, where the first element of the tuple is the priority, and the second element is some associated data:
import heapq
heap = []
heapq.heappush(heap, (4, 'hello'))
heapq.heappush(heap, (1, 'world'))
heapq.heappush(heap, (7, 'foo'))
print(heap) # Output: [(1, 'world'), (4, 'hello'), (7, 'foo')]
heappop()
The heappop()
function removes and returns the smallest element from the heap. After removal, it rearranges the remaining elements to maintain the heap structure.
import heapq
# Construct a heap.
heap = []
heapq.heappush(heap, (4, 'hello'))
heapq.heappush(heap, (1, 'world'))
heapq.heappush(heap, (7, 'foo'))
print(heapq.heappop(heap)) # Output: (1, 'world')
print(heap) # Output: [(4, 'hello'), (7, 'foo')]
When heappop()
is called, the minimum element from the heap is removed, and the rest of the heap is maintained.
heapify()
The heapify()
function transforms an arbitrary list into a heap. For example:
import heapq
heap = [(7, 'foo'), (1, 'world'), (4, 'hello')]
heapq.heapify(heap)
print(heap) # Output: [(1, 'world'), (7, 'foo'), (4, 'hello')]
The heapify()
function uses the same heap construction algorithm that we studied, which builds a heap in linear time.
Example Problem
Now, let's apply what we've learned to solve a problem. Suppose we have a list of integers and we want to find the k
smallest elements. We can use a heap to efficiently find these elements.
import heapq
def k_smallest_elements(nums, k):
smallest = []
heapq.heapify(nums)
for i in range(k):
smallest.append(heapq.heappop(nums))
return smallest
# Example usage:
nums = [5, 8, 2, 10, 3, 6]
k = 3
print(k_smallest_elements(nums, k)) # Output: [2, 3, 5]
The function heapifies the input list in place, and then calls heappop() k times, appending each popped element to the result list of the k smallest elements.
Reading
Read about heaps and priority queues in Chapter 9 of An Open Guide to Data Structures and Algorithms.
Practice
Here is a selection of practice problems from LeetCode involving heaps.
- Kth Largest Element in an Array
- Note that for this problem, you will need to use a max-at-top heap.
- Sort Characters By Frequency
- Merge k Sorted Lists
- Note that for this problem, you will need to use a min-at-top heap.
In these problems, you might find it helpful to use either the heapq
module, or the following Heap
class:
class Heap:
def __init__(self, contents):
self.contents = contents
        self.num_items = len(contents)
self.__make_heap()
def __make_heap(self):
if len(self.contents) == 0:
return
last = len(self.contents) - 1
parent_of_last = (last - 1) // 2
for i in range(parent_of_last, -1, -1):
self.__sift_down(i)
def insert(self, item):
self.contents.append(item)
self.__sift_up(self.num_items)
self.num_items += 1
def remove(self):
if self.num_items == 0:
return None
to_remove = self.contents[0]
self.contents[0] = self.contents[self.num_items - 1]
self.contents.pop()
self.num_items -= 1
if self.num_items > 0:
self.__sift_down(0)
return to_remove
def __sift_down(self, i):
# Store a reference to the element being sifted.
to_sift = self.contents[i]
# Find where the sifted element belongs.
parent = i
child = 2 * parent + 1
while child < self.num_items:
# If the right child is bigger, compare with it.
if child < self.num_items - 1 and self.contents[child] < self.contents[child + 1]:
child = child + 1
# Check if we're done.
if to_sift >= self.contents[child]:
break
# If not, move child up and move down one level in the tree.
self.contents[parent] = self.contents[child]
parent = child
child = 2 * parent + 1
self.contents[parent] = to_sift
def __sift_up(self, i):
# Store a reference to the element being sifted.
to_sift = self.contents[i]
# Find where the sifted element belongs.
child = i
while child > 0:
parent = (child - 1) // 2
# Check if we're done.
if to_sift <= self.contents[parent]:
break
# If not, move parent down and move up one level in the tree.
self.contents[child] = self.contents[parent]
child = parent
self.contents[child] = to_sift
You can copy and paste this Heap
class directly into your code as an inner, nested class. You can then use it like:
h = Solution.Heap([])
h.insert(50)
h.insert(20)
h.insert(75)
print(h.remove())
Week 3 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Dynamic Programming
Welcome to week 4!
Learning objectives
By the end of this week, you will be able to:
- Recognize the characteristics of problems that can be solved using dynamic programming
- Analyze the time and space complexity of dynamic programming solutions, and compare them to alternative approaches like brute force
- Implement dynamic programming solutions for problems exhibiting optimal substructure and overlapping sub-problems
What's due this week
- Week 4 Quiz (on Gradescope), due Thursday before class starts
- Start working on Assignment 3 (due next week)
Rod Cutting Problem
The rod cutting problem is a classic optimization problem in computer science and mathematics. It involves cutting a rod of a certain length into smaller pieces to maximize the total revenue obtained from selling those pieces. This problem has various practical applications, such as in manufacturing, resource allocation, and finance.
Let's take a closer look at this problem to motivate our study of dynamic programming.
Problem Setup
In the rod cutting problem, we are given:
- A rod of length $$n$$ cm
- A table of prices $$p_i$$ for rod pieces of length $$i$$, where $$i$$ ranges from 1 to $$n$$
The goal is to determine the optimal way to cut the rod to maximize revenue. Each cut of the rod is free. Note that if the price $$p_n$$ of a rod of length $$n$$ is large enough, the optimal solution might be to not cut the rod at all.
Example
Let's consider the following table of prices for rod pieces of different lengths:
Length | Price |
---|---|
1 | 1 |
2 | 5 |
3 | 8 |
4 | 9 |
5 | 10 |
6 | 17 |
7 | 17 |
8 | 20 |
Say that we have a rod of length $$n = 4$$. The following image shows all of the possible ways that we could cut a rod of this length:

The optimal solution with this price table is to cut the rod into two rods of length 2 cm, since that leads to a revenue of 10.
Next, let's think about how to write a program that solves this problem.
Solution Approach: Recursive Strategy
One approach to solve the rod cutting problem is through a recursive strategy. We start by considering all possible ways to cut the rod into smaller pieces and calculate the revenue for each combination. We then choose the combination that yields the maximum revenue.
Python Implementation
def max_revenue(length, prices):
if length <= 0:
return 0
# Start max_val at a value that will be overwritten by any number.
max_val = float('-inf')
# Consider every possible combination to cut a rod of length n:
for i in range(1, length + 1):
revenue = prices[i] + max_revenue(length - i, prices)
if revenue > max_val:
max_val = revenue
return max_val
# Example prices table
prices = {
1: 1,
2: 5,
3: 8,
4: 9,
5: 10,
6: 17,
7: 17,
8: 20
}
rod_length = 4
max_revenue = max_revenue(rod_length, prices)
print("Maximum revenue:", max_revenue)
This implementation defines a function max_revenue(length, prices)
that takes the length of the rod and a dictionary of prices as input and returns the maximum revenue that can be obtained by cutting the rod. It recursively considers all possible ways to cut the rod and calculates the revenue for each combination, then returns the maximum revenue among them.
For example, say that we want to know the maximum revenue that we can generate with a rod of length 4
. This is represented by max_revenue(4)
(ignoring the prices
parameter for simplicity). In the stack frame for max_revenue(4)
, we find the maximum among the following:
price(1) + max_revenue(3)
price(2) + max_revenue(2)
price(3) + max_revenue(1)
price(4) + max_revenue(0)
We apply the function recursively, and in the end return the maximum.
Time Complexity Analysis
The time complexity of this recursive approach is exponential, specifically $$O(2^n)$$, where $$n$$ is the length of the rod.
This exponential time complexity arises because the function explores all possible combinations of cuts, leading to a function call tree with $$2^n$$ nodes. Each node in the call tree represents a subproblem, and as the rod length increases, the number of subproblems grows exponentially.
For example, let's consider the call tree for a rod of length 4 cm:

In this call tree:
- At each level, the function makes a decision to cut the rod at a particular length, starting from the maximum length and recursively exploring all possible cuts.
- The tree branches out exponentially as the function explores all possible combinations of cuts.
As we can see, the call tree grows exponentially with the length of the rod.
Rod length | Number of calls |
---|---|
0 | 1 |
1 | 2 |
2 | 4 |
3 | 8 |
4 | 16 |
... | ... |
n | 2^n |
For a rod of length $$n$$, the number of nodes in the call tree is exactly $$2^n$$. This exponential growth in the number of function calls leads to the overall time complexity of $$O(2^n)$$, which is an extremely expensive algorithm!
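If you'd like to verify the table above, write $$T(n)$$ for the number of calls made by max_revenue on a rod of length $$n$$. Each call makes one recursive call for every length from $$0$$ to $$n - 1$$, so:

$$T(0) = 1, \qquad T(n) = 1 + \sum_{i=0}^{n-1} T(i) = 1 + (2^n - 1) = 2^n$$

where the last step follows by induction, since $$\sum_{i=0}^{n-1} 2^i = 2^n - 1$$.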
Next, let's look at a new technique to reduce the complexity of this algorithm: dynamic programming!
Dynamic Programming
In the previous lesson, we explored the rod cutting problem and implemented a solution using a recursive strategy. However, we observed that the recursive approach led to a call tree with $$2^n$$ nodes, which translates to an exponential time complexity, making it impractical for large rod lengths.
Let's take another look at that call tree:

Notice that when computing max_revenue(4)
, we recompute the same values over and over again. We make a call to compute max_revenue(2)
twice, max_revenue(1)
four times, and max_revenue(0)
eight times!
To overcome the inefficiency of the recursive approach, we can use a technique that saves previously computed results and reuses them when needed. This technique, called dynamic programming, allows us to avoid redundant computations and significantly improve the efficiency of our solution.
Version with dynamic programming
Here's how we can modify our previous recursive solution to incorporate dynamic programming. We're going to use a dictionary, saved
, that maps rod lengths to the maximum revenue possible with that length:
def max_revenue(length, prices, saved):
# If we've calculated the max revenue for this length before, use it.
if length in saved:
return saved[length]
if length <= 0:
return 0
# Start max_val at a value that will be overwritten by any number.
max_val = float('-inf')
# Consider every possible combination to cut a rod of length n:
for i in range(1, length + 1):
revenue = prices[i] + max_revenue(length - i, prices, saved)
if revenue > max_val:
max_val = revenue
# Save this computation.
saved[length] = max_val
return max_val
# Example prices table
prices = {
1: 1,
2: 5,
3: 8,
4: 9,
5: 10,
6: 17,
7: 17,
8: 20
}
rod_length = 4
saved = {}
max_revenue = max_revenue(rod_length, prices, saved)
print("Maximum revenue:", max_revenue)
Time Complexity Analysis
With this approach, the time complexity significantly improves to $$O(n^2)$$, where $$n$$ is the length of the rod. This happens because for every cut size, we need to consider the cut sizes of all smaller subproblems. The improvement is achieved by avoiding redundant computations and reusing previously computed results.
However, it's worth noting that the space complexity of the solution increases, due to the storage of intermediate results in the saved
dictionary. The space complexity becomes $$O(n)$$ as well. This is a classic example of a space-time tradeoff -- we're saving time by storing the intermediate results.
Check out the video below to see another explanation of how dynamic programming can be applied to the rod cutting problem:
Dynamic programming properties
Dynamic programming is a powerful technique used to solve problems by breaking them down into smaller subproblems and solving each subproblem only once. There are two key properties that are necessary for dynamic programming to be applicable, which are:
- Optimal Substructure: The solution to a larger problem can be constructed from the solutions to its smaller subproblems.
- Overlapping Subproblems: The same subproblems are encountered multiple times during the computation.
The rod cutting problem exhibits both optimal substructure and overlapping subproblems. The optimal way to cut a rod of length $$n$$ can be obtained by considering the optimal ways to cut its smaller subrods, and we encounter the same subrod lengths multiple times during the computation. Therefore, dynamic programming is well-suited for solving the rod cutting problem efficiently.
Fibonacci Top-Down Dynamic Programming
Let's now take a look at another problem where dynamic programming can be used: the Fibonacci sequence.
We discussed the Fibonacci sequence in the previous data structures course. We defined it to be a series where each number is the sum of the two preceding numbers, starting from $$0$$ and $$1$$. The sequence therefore starts like this:
$$0, 1, 1, 2, 3, 5, 8, 13, 21, ...$$
And continues infinitely. You can define the Fibonacci sequence recursively as:
- Base cases:
- $$\text{fib}(0) = 0$$
- $$\text{fib}(1) = 1$$
- Recursive definition:
- $$\text{fib}(n) = \text{fib}(n-1) + \text{fib}(n-2)$$, for $$n > 1$$
Let's now take a look back at the recursive Python solution that we previously came up with to compute the $$n$$th Fibonacci number.
Naive Fibonacci implementation
We've seen before that we can translate the recursive definition of the Fibonacci sequence above into some Python code as follows:
def fibonacci(n):
if n == 0:
return 0
if n == 1:
return 1
return fibonacci(n - 1) + fibonacci(n - 2)
This implementation is $$O(2^n)$$, since it requires an exponential number of function calls to compute. The analysis for why this is true is actually very similar to the analysis we did for the rod cutting problem. We end up recomputing the same value over and over again!
Suitability for dynamic programming
The calculation of the $$n$$th Fibonacci number exhibits the two characteristics that make it suitable for dynamic programming: optimal substructure and overlapping subproblems.
Remember that optimal substructure means that an optimal solution to a problem can be constructed from optimal solutions to its subproblems. In the case of the Fibonacci sequence, the optimal substructure property holds because the value of the $$n$$th Fibonacci number depends on the values of the $$(n-1)$$th and $$(n-2)$$th Fibonacci numbers.
It also exhibits the characteristic of overlapping subproblems, since there are some subproblems that are reused multiple times. For example, when computing fibonacci(5)
, we need fibonacci(4)
and fibonacci(3)
, both of which require fibonacci(2)
.
Applying dynamic programming
Let's now try applying dynamic programming to the Fibonacci problem.
Start with the naive implementation of fibonacci()
below, and add dynamic programming using a dictionary of pre-computed values. This dictionary, called saved
, has been added as an optional parameter, so its initial value is the empty dictionary.
Add to the dictionary as necessary, and each time the function is called, first consult the dictionary to check whether the solution to that subproblem has already been computed.
When you think you've got it, check out the solution below:
Click to reveal the solution.
def fibonacci(n, saved={}):
if n in saved:
return saved[n]
if n == 0:
return 0
if n == 1:
return 1
saved[n] = fibonacci(n - 1) + fibonacci(n - 2)
return saved[n]
Memoization
This technique of storing the already computed solutions in temporary data structure is known as memoization (note: not the same as memorization!). Memoization is a technique used to improve the performance of recursive algorithms by storing the results of expensive function calls and returning the cached result when the same inputs occur again. Essentially, memoization ensures that each subproblem is solved only once, eliminating redundant computations.
Using memoization is known as a top-down approach to dynamic programming, since we start with the original problem (the "top"), recursively break it down into smaller subproblems, solve these subproblems one by one until we reach the base cases (the "bottom") and memoize the results, and finally combine the solutions of the subproblems to obtain the solution to the original problem.
Check out the video below to learn more about dynamic programming and memoization:
Aside from top-down dynamic programming, there's another approach known as bottom-up. Let's take a look at that next.
Fibonacci Bottom-Up Dynamic Programming
In the previous lesson, we explored memoization, a top-down approach to dynamic programming. We learned how memoization optimizes recursive algorithms by caching results and reusing solutions to overlapping subproblems. Now, let's delve into another approach: bottom-up dynamic programming.
Unlike memoization, which starts with the original problem and recursively breaks it down into smaller subproblems, bottom-up dynamic programming works iteratively from the bottom (i.e., the base cases) up to the original problem.
Bottom-up approach
When devising a bottom-up solution, we can use the following steps:
- Initialization: Solve the base cases of the problem and store their solutions.
- Iterative Solution: Iteratively solve increasingly larger subproblems, building up the solutions based on previously computed ones.
- Optimal Solution: By the end of the iteration, we have computed the solution to the original problem.
Bottom-up dynamic programming is particularly useful when it's easy to determine the order in which subproblems should be solved and when all subproblems need to be solved.
Applying bottom-up to Fibonacci
Let's see how bottom-up dynamic programming can be applied to the problem of computing the $$n$$th Fibonacci number. Instead of recursively solving smaller subproblems, we'll iteratively compute Fibonacci numbers starting from the base cases.
With the steps above in mind, we'll proceed as follows:
- Initialize a list to store the first two Fibonacci numbers: fib[0] = 0 and fib[1] = 1.
- Iterate from 2 to n, computing each Fibonacci number as the sum of the two preceding ones: fib[i] = fib[i - 1] + fib[i - 2].
- The value of fib[n] represents the $$n$$th Fibonacci number.
def fibonacci_bottom_up(n):
fib = [0, 1] # the first two Fibonacci numbers
for i in range(2, n + 1):
next_number = fib[i - 1] + fib[i - 2]
fib.append(next_number)
return fib[n]
Choosing top-down vs. bottom-up
Now that we know there are two different approaches to dynamic programming, you might be wondering which one you should choose when faced with a relevant problem.
In general, the two approaches are interchangeable, but here are some specific advantages of each:
- Top-down (memoization): Intuitive and easy to implement, especially when translating an existing recursive solution to a memoized solution. Efficiently handles problems with a large state space by storing computed results in a cache. Preferred when the problem naturally lends itself to recursion and when only a subset of the state space needs to be explored.
- Bottom-up: Typically more efficient in terms of space complexity, especially for problems with a straightforward order of computation. Avoids the overhead of function calls and recursion. Preferred when all subproblems need to be solved and when the order of computation is well-defined.
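As a small illustration of that space advantage, here is a sketch (not part of the lesson's code) of a bottom-up Fibonacci that keeps only the two most recent values, using $$O(1)$$ extra space instead of an $$O(n)$$ list:
def fibonacci_constant_space(n):
    # prev and curr hold fib(i) and fib(i + 1) as i counts up to n.
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev

print(fibonacci_constant_space(8))   # 21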
Next, let's take a look at one more dynamic programming problem: computing the longest common subsequence.
Longest Common Subsequence
Another classic dynamic programming problem is finding the longest common subsequence (LCS) between two strings. The characters of a subsequence don't have to appear in consecutive positions -- a run of consecutive characters is a substring. For example, given these two strings:
$$S_1 = AABCDD$$
$$S_2 = ABBBD$$
Then the longest common substring between $$S_1$$ and $$S_2$$ is just $$AB$$, but the longest common subsequence between them is $$ABD$$, since that sequence of characters appears in both strings.
Applications
Naive Approach
Like most dynamic programming problems, it's helpful to think of a naive, recursive, and expensive solution first, and then improve on the solution once we have an initial understanding.
Reasoning
Let's take a look at an example instance of the problem: let's try to compute the longest common subsequence between $$AABCDD$$ and $$ABBBD$$.
First, we notice that both strings start with $$A$$. Since the two strings match in this position, that $$A$$ must be a part of the LCS. Therefore, we can say that $$LCS(AABCDD, ABBBD) = A + LCS(ABCDD, BBBD)$$ -- breaking the problem down into a subproblem by removing the initial $$A$$ from both strings. Note that this means the problem exhibits the optimal substructure property!
What about when the first characters of the strings don't match? For example, let's take a look at that next subproblem: $$LCS(ABCDD, BBBD)$$. Since $$A$$ and $$B$$ don't match, we have two options for the next subproblem to consider:
- Take the first character off of $$S_1$$ and see what the LCS is: $$LCS(BCDD, BBBD)$$
- Take the first character off of $$S_2$$ and see what the LCS is: $$LCS(ABCDD, BBD)$$.
Once we've figured out both options, $$LCS(ABCDD, BBBD)$$ is equal to whichever is longer.
Python Implementation
Here's a Python program that implements the naive, recursive approach:
def lcs(s1, s2):
if len(s1) == 0 or len(s2) == 0:
        # At least one of the strings is empty, so there are no
        # characters in the longest common subsequence.
return ''
if s1[0] == s2[0]:
return s1[0] + lcs(s1[1:], s2[1:])
else:
# Pick whichever subproblem ultimately leads to a
# longer LCS.
r1 = lcs(s1, s2[1:])
        r2 = lcs(s1[1:], s2)
if len(r1) > len(r2):
return r1
else:
return r2
Efficiency
This algorithm is quite inefficient, since it has to be called for all possible suffixes of both s1
and s2
. Additionally, many of these function calls are redundant. For example:
lcs('cba', 'dea') calls lcs('cba', 'ea') and lcs('ba', 'dea')
lcs('cba', 'ea') calls lcs('cba', 'a') and lcs('ba', 'ea')
...
lcs('ba', 'dea') calls lcs('ba', 'ea') and lcs('a', 'dea')
...
As you can see, lcs('ba', 'ea')
is called multiple times, in different parts of the call tree. This means that the LCS problem also exhibits overlapping subproblems -- another characteristic of dynamic programming problems.
Since it reconsiders all possible combinations of strings, the efficiency is $$O(2^n * 2^m)$$, where $$n$$ is the length of $$S_1$$ and $$m$$ is the length of $$S_2$$.
Bottom-Up Approach
Above, we noticed that this problem exhibits both required characteristics for a dynamic programming solution. Let's see first how this can be achieved in a bottom-up approach:
To summarize, the video above uses a bottom-up (tabulation) approach, where they solve the smallest subproblems first, and then use the solutions to those subproblems to build up a solution to the overall problem. As is illustrated by the usage of the 2D table, both the time and space complexity for this approach are $$O(mn)$$.
Memoized Approach
We could also solve the LCS problem using top-down dynamic programming with memoization. This would require making $$O(mn)$$ recursive calls as well, since we may need to consider all $$n$$ possible suffixes of $$S_1$$ against all $$m$$ possible suffixes of $$S_2$$. However, we never need to recompute a value after it is found the first time, since it can be stored in a table.
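Here is one sketch of how that memoized version might look -- the function and dictionary names (lcs_memo, saved) are illustrative, and the cache is keyed on the pair of remaining suffixes:
def lcs_memo(s1, s2, saved=None):
    # saved maps a pair of remaining suffixes to their LCS.
    if saved is None:
        saved = {}
    key = (s1, s2)
    if key in saved:
        return saved[key]
    if len(s1) == 0 or len(s2) == 0:
        result = ''
    elif s1[0] == s2[0]:
        result = s1[0] + lcs_memo(s1[1:], s2[1:], saved)
    else:
        # Pick whichever subproblem ultimately leads to a longer LCS.
        r1 = lcs_memo(s1, s2[1:], saved)
        r2 = lcs_memo(s1[1:], s2, saved)
        result = r1 if len(r1) > len(r2) else r2
    saved[key] = result
    return result

print(lcs_memo('AABCDD', 'ABBBD'))   # ABD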
Reading
Read about dynamic programming in Chapter 10 of An Open Guide to Data Structures and Algorithms.
Practice
Here is a selection of practice problems from LeetCode involving dynamic programming:
Week 4 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 3
Due date: May 12
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa2 channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
File Analyzer
In this assignment, you will complete a series of exercises that test your knowledge of how to use heaps and implement dynamic programming algorithms.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Week 5
Welcome to week 5! In this week, we take a break from learning new material to focus on the midterm exam.
What's due this week
- Midterm exam on Thursday
- Assignment 3, due Sunday
Midterm Exam
We will have our midterm on Thursday, May 9 at 6:00 PM GMT. Here are the details:
Link to Exam
You can access the exam on OnlineExamMaker here: Midterm exam.
Date and time
- Held during our normal class time: Thursday, May 9 at 6:00 PM GMT
- You can access the link to the midterm ahead of time, but it won't be enabled until 6:00 PM GMT the day of the exam
Material
- Everything that was covered in weeks 1 - 4 of the class
- This includes lesson material, live classes, practice problems, assignments
Format
- Administered online using a tool called OnlineExamMaker
- You can prepare and use any amount of notes on paper, but no other resources on the computer or Internet are allowed
- 60 minute exam from the time you begin
- Be sure to start the test between 6:00 - 6:15 PM GMT (it will close after that point)
- Exam will be composed of 20 multiple choice questions, shown one at a time
- You can't go back and change your answer after moving on to the next question
- We won't be using Zoom during the exam; you can just access the exam individually when you're ready
- Plan to have scratch paper for tracing/diagramming etc.
Proctoring software
- The exam website is also a proctoring service
- It will ask you to turn your camera on, and will take some pictures at random during the test
- It will flag any activity such as closing the test, changing programs, switching tabs, etc. If you do this too many times, it will close the exam and you will not be able to continue. Once you start the exam, you should expect to stay on the exam page and complete each question without any interruptions or using your computer for anything else.
- It disables copy and paste during the test
- It gives us the option to monitor your activity
Exam environment
- Plan to be in a test-taking environment
- Quiet, mostly private
Practice exam
- A practice exam is available here so that you can get used to the software and format.
Questions
If you have any questions leading up to the exam, please ask in #help-dsa2 on Discord or reach out to the instructor.
Graphs I
Welcome to week 6!
Learning objectives
By the end of this week, you will be able to:
- Interpret concepts such as nodes, edges, degrees, paths, and cycles in the contexts of graph-based problems
- Compare and contrast various graph representations in terms of memory usage and efficiency for different graph operations
- Recognize and implement appropriate algorithms to solve graph-based problems
What's due this week
- Week 6 Quiz (on Gradescope), due Thursday before class starts
Basics
Let's now turn our attention to yet another fundamental data structure: graphs. Just like other data structures, such as arrays, linked lists, and trees, graphs are a building block on which we can build other abstractions to solve complex problems.
We have discussed graphs to some degree in previous courses, so let's begin with a review of graphs and the associated terminology. To begin with, here's an example of a graph:

Each graph, including the one above, is defined in terms of a set of vertices $$V$$ (also known as nodes) and a set of edges $$E$$. Each edge connects a pair of vertices.
Graphs therefore give us the ability to represent entities and the relationships between them. For example, social networks are most naturally represented as graphs.
Here's another example: a graph that shows the connectedness of some cities:

In this case, the vertices represent cities, and the edges represent the lengths of the roads between these cities. This is actually an example of a weighted graph, since there is a cost associated with each edge. Eventually, we'll be able to represent problems as graphs, and then ask questions like: what's the shortest route from Abuja to Port Harcourt?
Watch the following videos to learn about the basics of graphs:
Terminology
As you can see from the videos, there is a lot of terminology associated with graphs. Let's revisit some of the more important concepts below with examples:
Adjacent
Two vertices are adjacent if they are connected by a single edge.
For example, in the graph below, nodes $$d$$ and $$f$$ are adjacent, but $$f$$ and $$j$$ are not adjacent.

Neighbors
The collection of vertices that are adjacent to a vertex $$v$$ are known as the neighbors of $$v$$. For example, the neighbors of $$d$$ are $$c$$, $$f$$, and $$e$$.
Path
A path in a graph is a sequence of edges that connects two vertices. For example, here's a graph with a highlighted path:

Connected
A graph is connected if there is a path between any pair of vertices. The graph above is connected, but the following six vertices (all part of the same graph) are not connected:

Sometimes, we call the different parts of an unconnected graph islands, as in, the above graph has two islands of vertices.
Note: Depending on context, you might consider the above image to consist of two connected graphs or one unconnected graph.
Complete
A graph is complete if there is an edge between every pair of vertices. Here's an example:

Directed
A directed graph has a direction associated with each edge, which is depicted using an arrow:

Edges in a directed graph are often represented as ordered pairs of the form (start vertex, end vertex). For example, $$(a, b)$$ is an edge in the graph above, but $$(b, a)$$ is not.
Graphs which have no direction on edges are known as undirected graphs.
Cycles
A cycle is a path that leaves a given vertex using one edge, and then later returns to that same vertex using a different edge. For example:

Now that we know a bit about graphs and the terminology used to describe them, we can turn our attention to ways to represent graphs in programs.
Representations
Now that we understand the definition of graphs, we can turn our attention to thinking about how to represent them in computers.
Think back to our time designing data structures such as stacks, queues, and heaps. In those cases, we had choices to make about how to represent them in memory. In particular, we were able to choose between array-based data structures and linked data structures. For example, in a previous course we saw how to represent stacks and queues with both arrays and linked lists, and in this course saw that we could greatly simplify the representation of a heap (a complete tree) by using an array.
For graphs, there are typically three ways that they can be represented in a program's memory:
- Edge list
- Adjacency matrix
- Adjacency list
Let's now talk about each of these representations.
Edge list
Watch the following video to see how to represent a graph using a vertex list and edge list:
To summarize:
- Graphs can be represented as separate lists: one for the vertices $$V$$, and one for the edges $$E$$
- Algorithms like finding all vertices that are adjacent to a given vertex, or checking whether two vertices are adjacent, require time $$O(|E|)$$, which in the worst case can be $$O(|V|^2)$$, which is not very time efficient
Adjacency matrix
Another option is to represent the graph using an adjacency matrix:
To summarize:
- Graphs can be represented using a list of vertices and a two-dimensional matrix that represents the edges between those vertices
- When represented this way, finding the adjacent vertices of a given vertex only requires $$O(|V|)$$ time, and checking whether two vertices are adjacent is even better: $$O(1)$$ time!
- The main disadvantage is that we now require $$O(|V|^2)$$ space to store the edge representation
Let's take a closer look at our cities graph from the previous lesson. If we chose to store it as an adjacency matrix, here's what it would look like:

Adjacency list
The final representation of graphs we will consider is that of an adjacency list:
- For every vertex, we can keep track of the vertices with which it is neighbors using an adjacency list
- This makes (1) finding the adjacent vertices of a given vertex and (2) checking whether two vertices are adjacent run in time $$O(|V|)$$, but typically performs much better than the edge list representation, since we don't need to consider unrelated vertices
Bringing back our cities example, here is our final representation using an adjacency list. Note that two of the vertices and the edge connecting them are highlighted in the diagram:

Traversals
In a previous course, we saw that we can traverse trees in a variety of ways, including depth-first traversals (including pre-order, post-order, and in-order traversals), and breadth-first traversals (which is the same as a level-order traversal).
With graphs, we need similar traversal algorithms. Given a starting point, we need to define ways of traversing the graph to visit all of the nodes -- at least, the ones reachable from the starting point.
Let's talk about the two main traversal techniques: depth-first and breadth-first traversals.
Depth-first traversal
Watch the following video to see how a depth-first traversal works on graphs, and to see what kinds of algorithms you can write with a depth-first traversal:
To summarize:
- A depth-first traversal proceeds as far as possible along a given path before backing up
- A DF traversal is most naturally implemented recursively
- Since graphs can have cycles (unlike trees), we need to mark nodes as "visited" in order to know when the recursion down a particular path should stop
For your reference, here's some pseudocode for a depth-first traversal (assuming an adjacency list graph):
def df_trav(v, parent):
# Visit v.
print(v.id)
# Mark this vertex as visited.
v.done = True
# Mark this vertex's parent in the DF traversal
v.parent = parent
# Iterate over the vertex's adjacency list.
e = v.edges
while e is not None:
w = e.end # Consider a neighbor w
if not w.done:
df_trav(w, v)
e = e.next
Breadth-first traversal
Watch the following video to see how a breadth-first traversal works on graphs, and to see what kinds of algorithms you can write with a breadth-first traversal:
To summarize:
- A breadth-first traversal uses the following algorithm:
- visit a vertex
- visit all of its neighbors
- visit all unvisited vertices 2 edges away
- visit all unvisited vertices 3 edges away, etc
- In order to visit one level at a time, the BF traversal algorithm uses an auxiliary queue
- Nodes still need to be marked as visited during the algorithm, since nodes can be seen in multiple "levels" of the traversal (since there can be cycles)
For your reference, here's some pseudocode to implement the BF traversal (assuming an adjacency list implementation):
def bf_trav(origin):
    origin.encountered = True
    origin.parent = None
q = Queue()
q.insert(origin)
while len(q) > 0:
# Get next vertex in the traversal.
v = q.remove()
# Visit v.
print(v.id)
# Add v's unencountered neighbors to the queue.
e = v.edges
while e is not None:
w = e.end
if not w.encountered:
                w.encountered = True
w.parent = v
q.insert(w)
e = e.next
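If you want to experiment with these traversals without the linked Vertex and Edge classes, here is a small runnable sketch of both traversals on a plain dictionary-based adjacency list (a simplification of the pseudocode above, with illustrative names):
from collections import deque

graph = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a', 'd'], 'd': ['b', 'c']}

def df_trav(graph, v, visited=None):
    # Visit v, mark it, then recurse into each unvisited neighbor.
    if visited is None:
        visited = set()
    print(v)
    visited.add(v)
    for w in graph[v]:
        if w not in visited:
            df_trav(graph, w, visited)

def bf_trav(graph, origin):
    # Visit vertices level by level using an auxiliary queue.
    encountered = {origin}
    q = deque([origin])
    while q:
        v = q.popleft()
        print(v)
        for w in graph[v]:
            if w not in encountered:
                encountered.add(w)
                q.append(w)

df_trav(graph, 'a')   # a b d c
bf_trav(graph, 'a')   # a b c d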
Minimum Spanning Trees
In the previous lesson, we performed depth-first and breadth-first traversals of the city graph starting from Lagos. For each traversal, we were looking to find the order in which the cities were visited.
However, instead of the sequence of cities, we could have asked for a different output: the spanning tree produced by each traversal. A spanning tree is a subset of a connected graph that contains all of the vertices and a subset of the edges, which form a tree. For example, here is the spanning tree produced by the depth-first traversal:

Note that the spanning tree is formed by connecting each visited child node with its parent node, which is the node that caused the child to be visited via a recursive call.
And here is the spanning tree produced by the breadth-first traversal:

Note that the spanning tree is formed by connecting each visited child node with its parent node, which is the node that added the child to the queue.
Minimum spanning trees
Although both the DF traversal and BF traversal produce spanning trees, neither of the spanning trees produced by the algorithm in the example above are minimum spanning trees. A minimum spanning tree (MST) has the smallest total cost among all possible spanning trees.
For example, the spanning tree produced by the DF traversal has a total edge weight of 2250, while the one produced by the BF traversal has a total edge weight of 2460. However, the MST has a total edge cost of 2050 (it doesn't use the edge between Ibadan and Abuja):

Note that if all edges have unique costs, there is only one MST. If some edges have the same cost, there may be more than one MST.
Computing an MST could be useful for various applications, including determining the shortest highway system for a set of cities, or calculating the smallest length of cable needed to connect a network of computers.
Note:
The MST does not necessarily include the minimal cost path between a pair of vertices. For example, the shortest path between Ibadan and Abuja is the edge that directly connects them, which is not included in the MST.
So how can we compute the MST of a graph? Let's next look at one option: Prim's algorithm.
Prim's Algorithm
There are multiple different ways of finding the minimum spanning tree of a graph. For example, a naive algorithm would just enumerate every possible spanning tree that can be formed from the graph, and would keep the one with the lowest total weight. However, in the worst case, for a connected graph there are $$O(n^n)$$ possible spanning trees, and so this is not a practical algorithm.
One popular alternative for efficiently computing the MST of a graph is Prim's algorithm. Let's see how it works.
Algorithm
The algorithm is as follows:
- Begin with the following two subsets:
- $$A$$ = any one of the vertices
- $$B$$ = all of the other vertices
- Repeatedly do the following:
- Select the lowest-cost edge $$(v_a, v_b)$$ connecting a vertex in $$A$$ to a vertex in $$B$$
- Add $$(v_a, v_b)$$ to the spanning tree
- Move vertex $$v_b$$ from set $$B$$ to set $$A$$
- Continue until set $$A$$ contains all of the vertices.
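Here is a minimal sketch of the steps above in Python, using a dictionary-based weighted adjacency list and a min-at-top heap (heapq) to pick the lowest-cost crossing edge. The names and the example graph are illustrative, not the course's graph.py implementation, and note that the efficiency discussion below analyzes the simpler scanning approach rather than this heap-based variant:
import heapq

def prim(graph, start):
    in_mst = {start}                 # the set A from the steps above
    mst_edges = []
    # Min-at-top heap of candidate edges (cost, vertex_in_A, vertex_in_B).
    candidates = [(cost, start, v) for v, cost in graph[start]]
    heapq.heapify(candidates)
    while candidates and len(in_mst) < len(graph):
        cost, va, vb = heapq.heappop(candidates)
        if vb in in_mst:
            continue                 # both endpoints already in A; skip this edge
        mst_edges.append((va, vb, cost))
        in_mst.add(vb)
        for w, c in graph[vb]:
            if w not in in_mst:
                heapq.heappush(candidates, (c, vb, w))
    return mst_edges

graph = {
    'a': [('b', 4), ('c', 1)],
    'b': [('a', 4), ('c', 2), ('d', 5)],
    'c': [('a', 1), ('b', 2), ('d', 8)],
    'd': [('b', 5), ('c', 8)],
}
print(prim(graph, 'a'))   # [('a', 'c', 1), ('c', 'b', 2), ('b', 'd', 5)]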
Efficiency
Let's think about the efficiency of Prim's algorithm, both for an adjacency matrix and an adjacency list.
When using an adjacency matrix representation, Prim's algorithm has a time complexity of $$O(|V|^2)$$, where $$V$$ represents the vertices in the graph. This is because in each iteration of the algorithm, it scans through the entire adjacency matrix to find the minimum-weight edge connecting a vertex from the MST to a vertex outside the MST. Since there are $$|V|$$ vertices and for each vertex, we might have to consider all other vertices to find the minimum edge weight, the time complexity becomes $$O(|V|^2)$$.
For an adjacency list, in the worst case, you need to scan through all $$|E|$$ edges for every one of the $$|V|$$ nodes. Therefore, the running time is $$O(|V|*|E|)$$.
Example
Recall our city graph from the previous lessons:

Summary
Watch the video below to see a summary of Prim's algorithm:
Topological Sort
Another interesting algorithm that comes up in many applications is topological sort. Topological sort is an algorithm that works specifically on directed acyclic graphs (DAGs), and orders the vertices such that if there is directed edge from $$a$$ to $$b$$, then $$a$$ comes before $$b$$ in the output of the sort.
DAGs
Directed acyclic graphs (DAGs) are a fundamental data structure, often used to represent relationships between objects. As the name suggests, DAGs consist of nodes connected by directed edges, and since they are acyclic, they must not contain any cycles.
The importance of DAGs lies in their applicability to a wide range of problems in various fields. In computer science, DAGs are commonly used in algorithms for tasks such as scheduling and dependency resolution. For example, in software development, DAGs can represent the dependencies between different modules or components of a program, allowing for efficient compilation or build processes.
Beyond computer science, DAGs find applications in domains including biology, finance, and logistics. In finance, they can represent financial transactions and dependencies between assets in a portfolio. In logistics, DAGs can help optimize supply chain management by representing the flow of goods between different locations. Overall, DAGs serve as powerful tools for representing and analyzing complex relationships and dependencies in various domains.
Algorithm
Watch the video below to see how topological sort works:
Strongly Connected Components
Now that we know a bit about graphs and traversals, we can continue our exploration by looking at other problems related to graphs. Check out the following video to see the idea of strongly connected components and how to find them:
Reading
Read about graphs in Chapter 11 (all sections except for "Single Source Shortest Path") of An Open Guide to Data Structures and Algorithms.
Practice
Here is a selection of practice problems from LeetCode involving graphs:
Week 6 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Graphs II
Welcome to week 7!
Learning objectives
By the end of this week, you will be able to:
- Read and write algorithms using adjacency-list graph implementations
- Trace and apply algorithms related to shortest path problems
- Recognize bipartite graphs and apply them to problems such as the stable marriage problem
What's due this week
- Week 7 Quiz (on Gradescope), due Thursday before class starts
- Assignment 4, due Sunday
Graph Implementation
Let's now look more closely at a Python implementation of a graph that we'll be using throughout this week. The implementation uses an adjacency list approach, where we have:
- A linked list of
Vertex
objects representing the vertices of the graph - Each
Vertex
object contains a linked list ofEdge
objects, representing the edges that are connected to the givenVertex
- Each
Edge
object contains a reference to the twoVertex
objects that it connects, along with a cost (weight)
Here's a visual representation of our forthcoming Graph
class:
Try to match up the vertices and edges in the graph on the right with the in-memory representation on the left.
Here's the core definitions of the Graph
, Vertex
, and Edge
objects:
class Graph:
def __init__(self):
self.vertices = None # linked list of vertices in the graph
class Vertex:
def __init__(self, id):
self.id = id # vertex ID
self.edges = None # linked list of edges for this vertex
self.next = None # next vertex in the linked list of vertices
...
class Edge:
def __init__(self, start, end, cost):
self.start = start # starting vertex of the edge
self.end = end # ending vertex of the edge
self.cost = cost # weight of the edge
self.next = None # next edge in the linked list of edges
...
Here's the full implementation: graph.py
Here's a sample graph: graph.txt
When you run graph.py
, you will be prompted to enter both (1) the name of the input file (graph.txt
in this case) and (2) the starting point for the algorithms:
$ python graph.py
Graph info file: graph.txt
starting point: Lagos
depth-first traversal from Lagos:
...
Note that there are a methods provided in graph.py
:
get_vertex()
,add_vertex()
, andadd_edge()
for fetching and building the graphinit_from_file()
, which reads in a graph configuration from a file and creates aGraph
object with the vertices and edges that are listed in the file (example file)- Various graph algorithms:
breadth_first_trav()
,depth_first_trav()
, andprim()
- Since the various graph algorithms may mark the graph (such as to mark vertices as encountered or done), there is a
reinit_vertices()
method that resets the state of each vertex so that the next algorithm can have a clean slate
Dijkstra's Algorithm
Let's now turn to a new problem that graphs can help us solve: the shortest path problem. It's often useful to know the shortest path from one vertex to another – i.e., the one with the minimal total cost. This can help in applications such as routing traffic in the Internet.
For an unweighted graph, we can simply do the following:
- Start a breadth-first traversal from the origin, $$v$$
- Stop the traversal when you reach the other vertex, $$w$$
- The path from $$v$$ to $$w$$ in the resulting (possibly partial) spanning tree is a shortest path
A breadth-first traversal works for an unweighted graph because the shortest path is simply the one with the fewest edges, and a breadth-first traversal visits cities in order according to the number of edges they are from the origin.
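As a small illustration of that idea, here is a hedged sketch of an unweighted shortest-path search built on a breadth-first traversal, using a plain dictionary adjacency list and parent links to rebuild the path (the names are illustrative, not the course's graph.py code):
from collections import deque

def shortest_path(graph, v, w):
    # Breadth-first traversal from v, remembering each vertex's parent.
    parent = {v: None}
    q = deque([v])
    while q:
        cur = q.popleft()
        if cur == w:
            # Rebuild the path by following parent links back to the origin.
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return list(reversed(path))
        for nbr in graph[cur]:
            if nbr not in parent:
                parent[nbr] = cur
                q.append(nbr)
    return None   # w is not reachable from v

graph = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a', 'd'], 'd': ['b', 'c', 'e'], 'e': ['d']}
print(shortest_path(graph, 'a', 'e'))   # ['a', 'b', 'd', 'e']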
For weighted graphs, there are multiple algorithms that find the shortest paths. One very famous example is Dijkstra's algorithm.
Dijkstra's algorithm
Dijkstra's algorithm allows us to find the shortest path from a vertex $$v$$ (the origin) to all other vertices that can be reached from $$v$$.
The basic idea is to maintain estimates of the shortest paths from the origin to every vertex (along with their costs), and gradually refine these estimates as we traverse the graph.
Watch the video below to see how Dijkstra's algorithm works:
Implementation
We can now add Dijkstra's algorithm to our `graph.py` file:
def dijkstra(self, origin_id):
    self.reinit_vertices()
    origin = self.get_vertex(origin_id)
    if origin is None:
        raise ValueError(f"No such vertex: {origin_id}")
    origin.cost = 0
    while True:
        # Find the unfinalized vertex with the smallest cost estimate.
        w = None
        v = self.vertices
        while v:
            if not v.done and (w is None or v.cost < w.cost):
                w = v
            v = v.next
        # Stop when every reachable vertex has been finalized.
        if w is None or w.cost == float('inf'):
            return
        print(f"\tfinalizing {w.id} (cost = {w.cost}" +
              (f", parent = {w.parent.id})" if w.parent else ")"))
        print(f"\t\tpath = {w.path_string()}")
        w.done = True
        # Relax the edges leaving the newly finalized vertex.
        edge = w.edges
        while edge:
            x = edge.end
            if not x.done:
                cost_via_w = w.cost + edge.cost
                if cost_via_w < x.cost:
                    x.cost = cost_via_w
                    x.parent = w
            edge = edge.next
Try adding this code to your `graph.py` file and running it with an example graph.
Example
Recall the graph we've been using that represents some cities and roads on a map (note: some of the edge weights have been slightly changed from previous versions so that the output of the algorithm is more interesting):
Time analysis
The time complexity of Dijkstra's algorithm depends in part on what technique is used to select the vertex with the next-smallest estimate in order to finalize it. For example, for every iteration of the algorithm, we could scan over the entire list of vertices, but a faster way would be to store the vertices in a min-at-top priority queue, where the priority is the vertex's current shortest path estimate. This way, we could repeatedly remove the next-smallest estimate vertex and finalize it.
When using a priority queue, each operation to extract the minimum takes $$O(log|V|)$$ time, and there are $$|V|$$ such operations. Over the course of the algorithm, vertex estimates are decreased according to the $$|E|$$ edges, and each decrease operation requires $$O(log|V|)$$ time to restore the heap ordering. Therefore, the running time is $$O(|E|*log|V| + |V|*log|V|) = O((|E| + |V|)log|V|)$$. In most cases, $$|E| > |V|$$, so the running time is often written as $$O(|E|log|V|)$$.
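To make the priority-queue idea concrete, here is a minimal sketch of Dijkstra's algorithm using Python's built-in `heapq` module. It is not part of `graph.py`: it assumes the graph is given as a plain dictionary mapping every vertex to a list of `(neighbor, weight)` pairs, and it handles "decrease" operations lazily by pushing duplicate heap entries and skipping stale ones.

```python
import heapq

def dijkstra_pq(adj, origin):
    """Shortest-path costs from origin using a min-heap.
    adj maps every vertex to a list of (neighbor, weight) pairs."""
    costs = {v: float('inf') for v in adj}
    costs[origin] = 0
    heap = [(0, origin)]               # (cost estimate, vertex)
    done = set()
    while heap:
        cost, v = heapq.heappop(heap)
        if v in done:
            continue                   # stale entry; skip it
        done.add(v)
        for w, weight in adj[v]:
            new_cost = cost + weight
            if new_cost < costs[w]:
                costs[w] = new_cost
                heapq.heappush(heap, (new_cost, w))
    return costs

# Example with assumed weights:
# adj = {'Lagos': [('Accra', 5)], 'Accra': [('Lagos', 5)]}
# print(dijkstra_pq(adj, 'Lagos'))
```

With this lazy approach the heap can hold up to $$O(|E|)$$ entries, but the overall running time is still $$O(|E|log|V|)$$.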
Summary
Watch the following video to see a quick summary of Dijkstra's algorithm:
Bellman-Ford Algorithm
In addition to Dijkstra's algorithm, there is another common algorithm for solving the single-source shortest paths problem: the Bellman-Ford algorithm.
The Bellman-Ford algorithm is actually slower than Dijkstra's algorithm, but it has one notable advantage: it is capable of solving the shortest paths problem even for graphs that have some edges with negative weights.
Negative edge weights
Negative edge weights are exactly what they sound like: weights assigned to edges in a graph that have a negative value. While they might seem counterintuitive at first, negative edge weights serve various purposes and can model a wide range of real-world scenarios. These scenarios include:
- Time savings: In certain scenarios, negative edge weights can represent time savings. For instance, if the edge weight between two cities represents travel time, a negative weight might indicate that taking a certain route reduces travel time, perhaps due to a shortcut or a faster mode of transportation.
- Financial transactions: In financial networks or economic models, negative edge weights can represent gains or profits. For instance, in a trade network, negative weights could signify profitable transactions or reduced costs associated with certain trade routes.
- Resource allocation: Negative edge weights can also model situations where resources are saved or conserved. For example, in a network representing resource allocation, a negative weight could denote a more efficient use of resources or a reduction in resource consumption along a particular path.
Bellman-Ford algorithm
The algorithm itself is similar to Dijkstra's algorithm, in that it repeatedly relaxes edge estimates. The main difference, however, is that Bellman-Ford repeatedly relaxes all edges in the graph, whereas Dijkstra's algorithm relaxes the edges connected only to each vertex as each vertex is finalized.
Since the Bellman-Ford algorithm can work on graphs with negative weights, a possible output of the algorithm is that there are no shortest paths due to the existence of cycles in the graph with net negative cost. If such negative cycles exist, then the algorithm returns that fact. Otherwise, the algorithm returns the shortest paths from a given vertex.
In pseudocode, the algorithm works as follows:
1. Initialization: Set the distance from the source vertex to itself as 0, and all other distances as infinity.
2. Relaxation: Repeat the following relaxation step $$|V| - 1$$ times:
   - For each edge $$(u, v)$$ in the graph, relax the edge by updating the distance to vertex $$v$$ if the distance to $$u$$ plus the weight of $$(u, v)$$ is less than the current distance estimate to $$v$$.
3. Detection of negative cycles: After the relaxation step, perform an additional iteration over all edges. If any distance is updated during this iteration, it indicates the presence of a negative cycle in the graph.
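Here is a minimal sketch of the algorithm in Python, assuming the graph is given as a list of `(u, v, weight)` edge tuples with vertices numbered `0` to `num_vertices - 1` (a different representation than our `graph.py` classes):

```python
def bellman_ford(num_vertices, edges, source):
    """edges is a list of (u, v, weight) tuples; vertices are 0..num_vertices-1.
    Returns (distances, has_negative_cycle)."""
    dist = [float('inf')] * num_vertices
    dist[source] = 0

    # Relax every edge |V| - 1 times.
    for _ in range(num_vertices - 1):
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight

    # One more pass: any further improvement means a negative cycle exists.
    for u, v, weight in edges:
        if dist[u] + weight < dist[v]:
            return dist, True

    return dist, False
```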
Watch the following video to learn about the Bellman-Ford algorithm:
Time analysis
We mentioned above that the Bellman-Ford algorithm is slower than Dijkstra's algorithm. That's because whereas Dijkstra's algorithm is $$O(|E|log|V|)$$, Bellman-Ford is $$O(|E||V|)$$. This is easily seen from the pseudocode of the algorithm: all $$|E|$$ edges are relaxed $$O(|V|)$$ times.
Summary
Watch the following short videos to summarize the concept of Bellman-Ford and see another example:
Floyd-Warshall Algorithm
Let's explore another fundamental algorithm in the realm of graph theory: the Floyd-Warshall algorithm. While Dijkstra's algorithm and the Bellman-Ford algorithm focus on finding shortest paths from a single source vertex to all other vertices, the Floyd-Warshall algorithm takes a different approach, aiming to find the shortest paths between all pairs of vertices in a weighted graph.
Algorithm
Watch the following video to learn about the Floyd-Warshall algorithm:
Summary:
- The algorithm uses a dynamic programming approach, keeping track of a table `dp` of shortest path distances between each pair of vertices
- The algorithm performs $$|V|$$ iterations
- At each iteration $$k$$, the algorithm considers all pairs of vertices $$i$$ and $$j$$, and updates `dp[i][j]` if there exists a shorter path from $$i$$ to $$j$$ through vertex $$k$$
- At the end of the algorithm, the `dp` table contains the shortest path distances between every pair of vertices
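Here is a short sketch of the algorithm, assuming the graph is given as an `n x n` matrix of edge weights (with `float('inf')` for missing edges and 0 on the diagonal):

```python
def floyd_warshall(weights):
    """weights[i][j] is the edge weight from i to j, float('inf') if there is
    no edge, and 0 on the diagonal. Returns the all-pairs distance table."""
    n = len(weights)
    dp = [row[:] for row in weights]   # start from the direct edge weights

    for k in range(n):                 # allow paths that pass through vertex k
        for i in range(n):
            for j in range(n):
                if dp[i][k] + dp[k][j] < dp[i][j]:
                    dp[i][j] = dp[i][k] + dp[k][j]
    return dp
```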
Time analysis
To find all possible $$|V|^2$$ shortest paths between all vertices $$i$$ and $$j$$, the algorithm does $$O(|V|)$$ work for each shortest path, giving an overall running time of $$O(|V|^3)$$. It's also worth noting that the algorithm requires $$O(|V|^2)$$ extra space to keep track of all of the distances for every pair of vertices.
Summary
Watch the following video for a summary of the Floyd-Warshall algorithm:
Bipartite Graphs
A bipartite graph is a special type of graph that can be divided into two distinct sets of vertices, such that no two vertices within the same set are adjacent. Here's an example:
Formally, a graph $$G = (V, E)$$ is bipartite if its vertex set $$V$$ can be split into two disjoint sets, $$U$$ and $$W$$, with the property that every edge in $$E$$ connects a vertex in $$U$$ to a vertex in $$W$$. This means that there are no edges connecting vertices within the same set.
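One useful consequence of this definition is that a graph is bipartite exactly when its vertices can be two-colored so that no edge joins two vertices of the same color. Here is a small sketch of that check using breadth-first search, assuming the (undirected) graph is given as an adjacency dictionary:

```python
from collections import deque

def is_bipartite(adj):
    """adj maps each vertex to a list of its neighbors (undirected graph).
    Two-colors the graph with BFS; returns True if no edge joins two
    vertices of the same color."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in color:
                    color[w] = 1 - color[v]   # give neighbors the opposite color
                    queue.append(w)
                elif color[w] == color[v]:
                    return False              # same-colored neighbors: not bipartite
    return True
```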
Applications
Bipartite graphs are particularly important in various fields due to their unique structure. They are commonly used to model relationships between two different types of entities. For example, in a bipartite graph representing a job allocation system, one set of vertices might represent workers, and the other set represents jobs, with edges indicating which workers are eligible for which jobs.
Similarly, in a social network analysis, one set might represent people and the other set could represent activities, with edges denoting participation. This structure is also prevalent in fields such as scheduling, matching problems, and network flow, where bipartite graphs facilitate the representation and solution of complex relational problems.
Summary
Watch the following videos to learn more about bipartite graphs:
Stable Marriage Problem
The stable marriage problem (SMP) is a classic problem in the field of combinatorial optimization and computer science, which seeks to find a stable matching between two sets of elements given a set of preferences.
The problem is widely used in matching markets, such as assigning medical residents to hospitals, students to schools, and even in some organ donation programs where donors and recipients are paired. Its importance and utility in economics, game theory, and real-world applications have made it a fundamental problem in algorithm design and analysis, showcasing the intersection of theoretical computer science and practical optimization problems.
The most common scenario involves two equally sized sets of individuals, typically referred to as men and women, each of whom ranks all members of the opposite set based on their preference for marriage partners. The goal is to pair each man with a woman in such a way that no pair of individuals would both prefer each other over their current partners. This condition ensures that no "blocking pairs" exist, which would make the matching unstable.
Bipartite graphs
In the context of the stable marriage problem, bipartite graphs provide a useful way to visualize and understand the matching process. For the SMP, the two disjoint sets in the bipartite graph represent the two groups to be matched, typically men and women. Each vertex in one set (men) is connected to vertices in the other set (women) based on the potential for marriage. An edge between a man and a woman signifies that they are potential partners.
Gale-Shapley algorithm
The Gale-Shapley algorithm, also known as the deferred acceptance algorithm, is a solution to the stable marriage problem that ensures a stable matching between two equally sized sets of participants.
The algorithm proceeds in rounds, with each man proposing to his most-preferred woman who has not yet rejected him. Each woman then reviews her current set of proposals, including any existing tentative engagements, and tentatively accepts the proposal from the man she prefers the most while rejecting the others. Rejected men then move on to propose to their next choice. This process repeats until no man has a new woman to propose to, at which point all tentative engagements become final.
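Here is a compact sketch of the algorithm, assuming the preferences are given as dictionaries mapping each person to their ranked list of members of the other group:

```python
def gale_shapley(men_prefs, women_prefs):
    """men_prefs[m] is m's list of women in preference order;
    women_prefs[w] is w's list of men in preference order.
    Returns a stable matching as a dict {woman: man}."""
    # Precompute each woman's ranking of the men so comparisons are O(1).
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    free_men = list(men_prefs)                 # men with no tentative partner
    next_proposal = {m: 0 for m in men_prefs}  # index of the next woman to propose to
    engaged_to = {}                            # woman -> man

    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        if w not in engaged_to:
            engaged_to[w] = m                  # w accepts her first proposal
        elif rank[w][m] < rank[w][engaged_to[w]]:
            free_men.append(engaged_to[w])     # w trades up; her old partner is free again
            engaged_to[w] = m
        else:
            free_men.append(m)                 # w rejects m; he stays free
    return engaged_to
```

With complete preference lists and equally sized groups, every man eventually becomes engaged, so the loop always terminates with a stable matching.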
Watch the following videos to learn more about the stable marriage problem and Gale-Shapley algorithm:
Week 7 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 4
Due date: May 26
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa2 channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
Graph Algorithms
In this assignment, you will complete a series of exercises that test your knowledge of graph algorithms.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Advanced Algorithms
Welcome to week 8!
Learning objectives
By the end of this week, you will be able to:
- Trace examples of quicksort, explain its usage as a divide-and-conquer algorithm, and analyze its average case running time
- Analyze and differentiate between naive string matching algorithms and more advanced algorithms such as Rabin Karp
- Formulate and solve linear programming problems
- Describe the concept of duality as it relates to optimization problems
What's due this week
- Week 8 Quiz (on Gradescope), due Thursday before class starts
Quicksort
In this course and the previous DSA course, we have seen many different sorting algorithms:
- Insertion sort
- Selection sort
- Radix sort
- Merge sort
- Heapsort
We have finally reached our last sorting algorithm: quicksort. Similar to merge sort, quicksort is a divide-and-conquer algorithm, since it repeatedly divides a list to break it down into smaller and smaller subproblems, and recursively sorts those subproblems.
Let's take a closer look at how quicksort works.
Algorithm
Watch the following videos to learn about quicksort:
To summarize:
- Quicksort is a recursive, divide-and-conquer algorithm
- It repeatedly rearranges (partitions) the elements so that we end up with two sublists that meet the following criterion: each element in the left sublist is <= each element in the right sublist
- It applies quicksort recursively to the sublists, stopping when a sublist has a single element
Choice of pivot
There are many possible ways of choosing the quicksort pivot, including:
- The first element or the last element
- Risky, because it could lead to worst case behavior if the list is sorted or almost sorted
- The middle element of the list
- A randomly chosen element
- The median of three elements
- For example, the median of the first, last, and middle elements
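Here is a minimal in-place quicksort sketch that uses the median-of-three pivot strategy described above; this is just one of many ways to implement the partition step:

```python
def quicksort(items):
    """Sort a list in place using quicksort with a median-of-three pivot."""
    _quicksort(items, 0, len(items) - 1)

def _quicksort(items, low, high):
    if low >= high:
        return
    pivot = _median_of_three(items, low, high)
    # Partition: move elements <= pivot to the left, >= pivot to the right.
    i, j = low, high
    while i <= j:
        while items[i] < pivot:
            i += 1
        while items[j] > pivot:
            j -= 1
        if i <= j:
            items[i], items[j] = items[j], items[i]
            i += 1
            j -= 1
    _quicksort(items, low, j)      # recursively sort the left sublist
    _quicksort(items, i, high)     # recursively sort the right sublist

def _median_of_three(items, low, high):
    mid = (low + high) // 2
    return sorted([items[low], items[mid], items[high]])[1]

# Example:
# data = [7, 2, 9, 4, 1, 8]
# quicksort(data)   # data is now [1, 2, 4, 7, 8, 9]
```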
Application
Quicksort is widely recognized for its impact and extensive application in various fields due to its efficient average-case performance and in-place sorting capability. It is frequently used in systems where speed is crucial, such as in database management, search engines, and large-scale data processing.
Quicksort's ability to handle large datasets efficiently makes it a preferred choice in many real-time applications and performance-critical environments. The algorithm's in-place sorting feature minimizes memory usage, making it suitable for systems with limited resources. Additionally, quicksort is often the default sorting algorithm in programming language libraries, including C's standard library (`qsort`) and Java's `Arrays.sort` for primitive types, which uses a dual-pivot quicksort. (Python's built-in sort uses Timsort, a hybrid of merge sort and insertion sort, rather than quicksort.) Its combination of simplicity, speed, and low memory overhead ensures that quicksort remains a foundational algorithm in computer science and software engineering.
Summary
Here are a couple more good videos about quicksort, in case you need some more examples:
Quicksort Analysis
We mentioned in the last lesson that quicksort is similar to merge sort, since they are both divide-and-conquer algorithms that improve on the running time of more primitive sorting algorithms (such as insertion sort or selection sort). However, whereas merge sort is $$O(nlogn)$$ in all cases, quicksort has a more nuanced running time that depends on the contents of the list.
Best case and worst case
Watch the following video to see the best case and worst case running time behavior of quicksort:
To summarize:
- In the best case, each partition divides the remaining portion of the list approximately in half. In this case, there are $$O(logn)$$ levels of partitioning -- when visualized as a call tree, the height of the tree is $$O(logn)$$. At each level of that tree, $$O(n)$$ operations are performed to do the partitioning. Therefore, the best case running time is $$O(nlogn)$$.
- In the worst case, each partition reduces the size of the list by one element, i.e., the pivot is chosen to be the smallest or largest element of each sublist. In this case, the call tree has a height of $$O(n)$$, with $$O(n)$$ operations performed at each level to perform the partition, leading to a running time of $$O(n^2)$$.
- Notice that the choice of pivot can have a significant impact on the running time of the algorithm.
Average case
In the average case of quicksort, there is a mixture of good and bad partitions. In practice, the running time in the average case is closer to $$O(nlogn)$$ than it is to $$O(n^2)$$. Watch the following video to see an empirical explanation of the average case:
String Matching
We now leave quicksort to look at a familiar algorithm: string matching. In other words, given a string $$S$$, we want to find whether a string $$W$$ occurs within that string.
For example, if $$S$$ is `ABC ABCDAB ABCDABCDABDE` and $$W$$ is `ABCDABD`, then $$W$$ appears in $$S$$ (at position 15).
Naive algorithm
At this point, you're likely familiar with the "naive" string matching algorithm, which checks every possible position in $$S$$ for $$W$$:
def naive_string_match(s, w):
n = len(s)
m = len(w)
# Loop through the text from the start to the point where the pattern can fit
for i in range(n - m + 1):
# Check for a match at this position
match = True
for j in range(m):
if s[i + j] != w[j]:
match = False
break
# If match is still True, we found a match
if match:
return i
return -1
The worst-case running time for this algorithm is $$O(n * m)$$, where $$n$$ is the length of the text $$S$$ and $$m$$ is the length of the pattern $$W$$. This is because the algorithm compares each character of the pattern with the text for each possible starting position.
However, we can do better than this algorithm. One example is the Knuth-Morris-Pratt algorithm.
Knuth-Morris-Pratt algorithm
The Knuth-Morris-Pratt (KMP) algorithm improves on naive string search using the following algorithm:
- Preprocessing phase: The KMP algorithm preprocesses the pattern to create a partial match (or "failure") table, which takes $$O(m)$$ time.
- Matching phase: The actual string matching is done in $$O(n)$$ time. The KMP algorithm avoids redundant comparisons by using the information from the partial match table, ensuring that each character in the text is compared at most once.
Watch the following video to see how the KMP algorithm works:
As described above, this algorithm is significantly faster, and operates in $$O(n + m)$$ time. However, the space complexity of the algorithm increases to $$O(m)$$, which is more space than the naive algorithm.
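Here is a small sketch of the two phases, for illustration only (the failure-table convention shown here is one of several equivalent formulations):

```python
def kmp_search(s, w):
    """Return the index of the first occurrence of pattern w in text s, or -1."""
    if not w:
        return 0
    # Preprocessing: failure[j] is the length of the longest proper prefix of
    # w[:j+1] that is also a suffix of it.
    failure = [0] * len(w)
    k = 0
    for j in range(1, len(w)):
        while k > 0 and w[j] != w[k]:
            k = failure[k - 1]
        if w[j] == w[k]:
            k += 1
        failure[j] = k

    # Matching: scan the text once, never re-examining matched characters.
    k = 0
    for i, ch in enumerate(s):
        while k > 0 and ch != w[k]:
            k = failure[k - 1]
        if ch == w[k]:
            k += 1
        if k == len(w):
            return i - len(w) + 1   # match ends at position i
    return -1

# print(kmp_search("ABC ABCDAB ABCDABCDABDE", "ABCDABD"))   # 15
```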
Rabin-Karp Algorithm
Believe it or not, we can improve even further on the KMP algorithm for the string matching problem. With the Rabin-Karp algorithm, we can search for a string $$W$$ in $$S$$ again in $$O(n + m)$$ time, but by using only $$O(1)$$ extra space.
Algorithm
Watch the following video to see how the Rabin-Karp algorithm works:
To summarize:
- The algorithm uses hashing to convert the pattern and substrings of the text into numerical values. This allows for quick comparisons between the pattern and text substrings based on their hash values.
- The algorithm employs a rolling hash function to efficiently compute the hash value of the next substring in constant time by updating the hash value of the current substring. This avoids recomputing the hash from scratch for each substring.
- If the hash values of the pattern and a text substring match, a direct comparison is performed to confirm the match. This step ensures that false positives (hash collisions) are correctly identified and handled.
- The main drawback of the Rabin-Karp algorithm is the possibility of hash collisions, where different substrings have the same hash value. This requires additional verification steps, potentially affecting performance. Additionally, the algorithm's average-case time complexity is $$O(n + m)$$, but its worst-case time complexity can degrade to $$O(n * m)$$ if many hash collisions occur.
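Here is a minimal sketch of the idea, using a polynomial rolling hash; the specific `base` and `mod` values are arbitrary choices for illustration:

```python
def rabin_karp(s, w, base=256, mod=10**9 + 7):
    """Return the index of the first occurrence of pattern w in text s, or -1."""
    n, m = len(s), len(w)
    if m == 0:
        return 0
    if m > n:
        return -1

    high = pow(base, m - 1, mod)      # weight of the leading character
    target = 0                        # hash of the pattern
    window = 0                        # hash of the current text window
    for i in range(m):
        target = (target * base + ord(w[i])) % mod
        window = (window * base + ord(s[i])) % mod

    for i in range(n - m + 1):
        # On a hash match, verify directly to rule out collisions.
        if window == target and s[i:i + m] == w:
            return i
        if i < n - m:
            # Roll the hash: drop s[i], append s[i + m].
            window = ((window - ord(s[i]) * high) * base + ord(s[i + m])) % mod
    return -1

# print(rabin_karp("ABC ABCDAB ABCDABCDABDE", "ABCDABD"))   # 15
```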
Summary
Here's another video that summarizes the Rabin-Karp technique:
Linear Programming
Linear programming (LP) is a mathematical method used for optimizing a linear objective function, subject to a set of linear inequality or equality constraints.
The primary goal of linear programming is to find the best possible outcome, such as maximum profit or lowest cost, in a mathematical model whose requirements are represented by linear relationships. Linear programming is widely used in various fields, including economics, business, engineering, and military applications, for solving problems related to resource allocation, production planning, transportation, and scheduling.
Formulating LP problems
To formulate a linear programming problem, you need to define three main components:
- the objective function
- the decision variables
- the constraints
The objective function is a linear equation that represents the goal of the optimization, such as maximizing revenue or minimizing costs. The decision variables are the unknowns that you are solving for, and they represent the quantities you want to determine. The constraints are linear inequalities or equalities that represent the limitations or requirements of the problem.
For example, consider a company that produces two products, $$A$$ and $$B$$. The company wants to maximize its profit, where the profit from product $$A$$ is $5 per unit and from product $$B$$ is $4 per unit.
The production of these products is subject to resource constraints: each unit of product $$A$$ requires 2 hours of labor and 1 unit of material, while each unit of product $$B$$ requires 1 hour of labor and 2 units of material. The company has a total of 100 hours of labor and 80 units of material available. The linear programming model for this problem would be:
Objective function: Maximize profit = $$5A + 4B$$
Constraints:
- $$2A + B \leq 100$$ (labor constraint)
- $$A + 2B \leq 80$$ (material constraint)
- $$A, B \geq 0$$ (non-negativity constraint)
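If you want to experiment with this example, the sketch below solves it with SciPy's `linprog` routine (assuming SciPy is installed); since `linprog` minimizes, the profit coefficients are negated:

```python
from scipy.optimize import linprog

# Maximize 5A + 4B by minimizing its negation.
c = [-5, -4]
A_ub = [[2, 1],    # labor:    2A +  B <= 100
        [1, 2]]    # material:  A + 2B <=  80
b_ub = [100, 80]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
A_units, B_units = result.x
print(f"A = {A_units:.1f}, B = {B_units:.1f}, profit = {-result.fun:.1f}")
```

For this example, the optimal plan should come out to $$A = 40$$ and $$B = 20$$, for a profit of $280.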
Simplex algorithm
Linear programming problems can be solved using various methods, the most common of which is the simplex algorithm.
To see an overview of linear programming and the simplex algorithm, watch the following video:
Duality
Note: this lesson is optional.
An important concept in linear programming is duality. Every linear programming problem (referred to as the primal problem) has a corresponding dual problem. The dual problem provides insights into the original problem and can sometimes be easier to solve.
The relationship between the primal and dual problems is such that the optimal solution to one provides information about the optimal solution to the other. Specifically, the value of the objective function in the primal problem at the optimal solution equals the value of the objective function in the dual problem at the optimal solution. This principle is known as the Strong Duality Theorem.
Primal Problem Example
Suppose a company produces two types of products, $$x_1$$ and $$x_2$$. The company wants to maximize its profit, with profits of $3 per unit of $$x_1$$ and $5 per unit of $$x_2$$. The production is subject to the following constraints:
- The production of $$x_1$$ and $$x_2$$ requires different amounts of two resources, $$R1$$ and $$R2$$.
- Resource $$R1$$ has a maximum availability of 4 units.
- Resource $$R2$$ has a maximum availability of 12 units.
- Each unit of $$x_1$$ requires 1 unit of $$R1$$ and 3 units of $$R2$$.
- Each unit of $$x_2$$ requires 2 units of $$R1$$ and 2 units of $$R2$$.
The primal linear programming problem can be formulated as follows:
Primal Problem:
Maximize $$z = 3x_1 + 5x_2$$
subject to:
$$ \begin{align*} x_1 + 2x_2 & \leq 4 & \text{(Resource R1 constraint)} \\ 3x_1 + 2x_2 & \leq 12 & \text{(Resource R2 constraint)} \\ x_1, x_2 & \geq 0 & \text{(Non-negativity constraint)} \end{align*} $$
Dual Problem Example
To formulate the dual problem, we follow these steps:
- Each constraint in the primal problem corresponds to a variable in the dual problem.
- The objective function coefficients in the primal problem become the right-hand side constants in the dual problem.
- The right-hand side constants in the primal problem become the objective function coefficients in the dual problem.
- The inequalities are reversed, and the primal's "less than or equal to" constraints become the dual's "greater than or equal to" constraints.
Dual Problem:
Minimize $$ w = 4y_1 + 12y_2 $$
subject to:
$$ \begin{align*} y_1 + 3y_2 & \geq 3 & \text{(from coefficient of } x_1 \text{ in primal objective)} \\ 2y_1 + 2y_2 & \geq 5 & \text{(from coefficient of } x_2 \text{ in primal objective)} \\ y_1, y_2 & \geq 0 & \text{(Non-negativity constraint)} \end{align*} $$
Interpretation
- Primal Problem: The company wants to maximize profit by deciding how much of each product ($$x_1$$ and $$x_2$$) to produce, given the resource constraints.
- Dual Problem: The dual problem can be interpreted as a resource allocation problem, where $$y_1$$ and $$y_2$$ represent the shadow prices of resources $$R1$$ and $$R2$$, respectively. The objective is to minimize the total cost of resources, subject to the constraint that the resource prices provide at least as much value as the profit contributions of the products.
Relationship Between Primal and Dual
According to the Strong Duality Theorem, if both the primal and dual problems have feasible solutions, then the optimal value of the objective function in the primal problem (maximum profit) equals the optimal value of the objective function in the dual problem (minimum cost). This relationship provides valuable insights into the economics of resource allocation and pricing in optimization problems.
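As an optional sanity check, here is a sketch that solves both problems numerically with SciPy's `linprog` (assuming SciPy is available); the "greater than or equal to" dual constraints are rewritten as "less than or equal to" constraints on their negations:

```python
from scipy.optimize import linprog

# Primal: maximize 3x1 + 5x2, so minimize -(3x1 + 5x2).
primal = linprog(c=[-3, -5],
                 A_ub=[[1, 2], [3, 2]], b_ub=[4, 12],
                 bounds=[(0, None), (0, None)])

# Dual: minimize 4y1 + 12y2, with ">=" constraints negated into "<=" form.
dual = linprog(c=[4, 12],
               A_ub=[[-1, -3], [-2, -2]], b_ub=[-3, -5],
               bounds=[(0, None), (0, None)])

print("primal optimum:", -primal.fun)   # maximum profit
print("dual optimum:  ", dual.fun)      # minimum resource cost
```

For this example, both optima should come out to 12, illustrating the Strong Duality Theorem.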
Practice
Here is a selection of practice problems from LeetCode involving this week's topics:
Week 8 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Intractable Problems
Welcome to week 9!
Learning objectives
By the end of this week, you will be able to:
- Describe the difference between problems in various complexity classes
- Explain the traveling salesman problem and solve it using naive and approximation algorithms
- Recognize the challenges in solving whether P = NP
What's due this week
- Week 9 Quiz (on Gradescope), due Thursday before class starts
- Assignment 5, due Sunday
Traveling Salesman Problem
To begin this week's material, let's tackle a very famous problem in computer science: the traveling salesman problem (TSP).
Here's one way of thinking about the TSP: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?
Definition
The input to the traveling salesman problem is:
- A set of $$n$$ cities.
- A distance matrix (or a set of distances in some format) where each entry specifies the distance between two cities.
The objective is to find the shortest possible route that visits each city once and returns to the starting city.
Watch the following video to see an overview of the TSP:
Brute force algorithm
One way of solving the TSP is to use a brute force approach -- generating all possible permutations of the cities to find the shortest possible route that visits each city exactly once and returns to the starting city.
Here’s a detailed description of this approach:
1. Generate all possible permutations of the cities. For $$n$$ cities, there are $$n!$$ ($$n$$ factorial) possible permutations. However, since the route is a cycle (i.e., starting and ending at the same city), we can fix one city as the starting point and only consider permutations of the remaining $$n-1$$ cities. This reduces the number of permutations to $$(n-1)!$$.
2. For each permutation (route), calculate the total distance by summing the distances between consecutive cities in the permutation and adding the distance from the last city back to the starting city.
3. Compare the total distances of all permutations and keep track of the permutation with the shortest distance.
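Here is a short brute-force sketch using `itertools.permutations`, assuming the distances are given as an `n x n` matrix and the tour starts and ends at city 0:

```python
from itertools import permutations

def tsp_brute_force(dist):
    """dist is an n x n distance matrix; the tour starts and ends at city 0.
    Returns (best_cost, best_route)."""
    n = len(dist)
    other_cities = range(1, n)       # fix city 0 as the starting point
    best_cost, best_route = float('inf'), None
    for perm in permutations(other_cities):
        route = (0,) + perm + (0,)
        cost = sum(dist[route[i]][route[i + 1]] for i in range(n))
        if cost < best_cost:
            best_cost, best_route = cost, route
    return best_cost, best_route
```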
Complexity
The brute force approach has a time complexity of $$O(n!)$$, where $$n$$ is the number of cities. This factorial growth makes the brute force approach impractical for large $$n$$. The space complexity is $$O(n)$$, for storing the current permutation and other necessary variables.
Due to its factorial time complexity, the brute force approach is only feasible for small instances of the TSP (usually up to about 10-12 cities). For larger instances, the computation time becomes prohibitively large, necessitating the use of more sophisticated algorithms.
Held-Karp algorithm
One such algorithm is the Held-Karp algorithm. Proposed by Michael Held and Richard M. Karp in 1962, this algorithm is a cornerstone method in computer science, and significantly reduces the computational complexity of solving the TSP compared to brute-force methods. It leverages the principle of dynamic programming to break down the problem into smaller subproblems, which are solved once and stored for future use, thus avoiding redundant calculations.
The algorithm works by iteratively building up the solution using subsets of the cities. It starts with the smallest possible subsets and progressively adds more cities, calculating the minimum cost to reach each subset's final city from the starting point. For each subset of cities, it records the shortest path that ends at each specific city within the subset. This process continues until it encompasses all cities, ultimately determining the shortest path for the complete set.
Watch the following video to learn about the Held-Karp algorithm:
Complexity
Although the Held-Karp algorithm reduces the time complexity significantly from factorial to exponential (specifically, $$O(n^2*2^n)$$), it is still computationally intensive for very large numbers of cities. Despite this, it remains a critical theoretical approach and a benchmark for evaluating heuristic and approximation algorithms for the TSP.
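Here is a sketch of the dynamic programming idea, again assuming an `n x n` distance matrix with at least two cities; it returns only the optimal tour cost (recovering the tour itself would additionally require storing parent pointers):

```python
from itertools import combinations

def held_karp(dist):
    """Return the cost of the optimal tour starting and ending at city 0.
    Uses O(n^2 * 2^n) time and O(n * 2^n) space; assumes n >= 2."""
    n = len(dist)
    # best[(subset, j)] = cheapest cost of starting at city 0, visiting every
    # city in subset, and ending at j (subset never contains city 0).
    best = {}
    for j in range(1, n):
        best[(frozenset([j]), j)] = dist[0][j]

    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            subset = frozenset(subset)
            for j in subset:
                prev = subset - {j}
                best[(subset, j)] = min(best[(prev, k)] + dist[k][j]
                                        for k in prev)

    full = frozenset(range(1, n))
    return min(best[(full, j)] + dist[j][0] for j in full)
```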
Hard Problems
For most of the problems that we have seen in our study of algorithms, we have seen polynomial time algorithms. Polynomial time algorithms are generally considered to be fairly efficient. For many of the simple algorithms we've seen (searching, sorting, etc.), we are used to seeing logarithmic or linear algorithms, and consider polynomial time algorithms (e.g., selection sort as an $$O(n^2)$$ algorithm) to be inefficient. However, for more complex, real-world problems (especially with graphs), polynomial time algorithms are considered fairly efficient.
As seen in the previous lesson, the traveling salesman problem is a hard problem to solve. In fact, it is an intractable problem -- a problem for which no known polynomial time algorithm exists.
To learn more about this and other, similar hard problems, watch the following videos:
Complexity Classes
To formalize the concepts of whether algorithms can be solved in polynomial time (or, whether a solution can be verified in polynomial time), we define the concept of complexity classes. Watch the following videos to learn more:
P vs. NP
The P vs. NP problem is one of the most fundamental and profound questions in computer science and mathematics, concerning the relationship between two classes of problems: those that can be solved in polynomial time (P) and those for which a solution can be verified in polynomial time (NP).
Specifically, the problem asks whether every problem whose solution can be verified quickly (in polynomial time) can also be solved quickly (in polynomial time), which would imply P = NP.
Despite significant efforts, no one has yet proven whether P equals NP or not. If P were equal to NP, it would mean that many complex problems in fields such as cryptography, optimization, and algorithm design could be solved efficiently, drastically altering our approach to computation.
Conversely, if P is not equal to NP, it would affirm the inherent difficulty of these problems, implying that some problems can be verified quickly but not solved quickly. This unresolved question continues to be a central focus of theoretical computer science, with far-reaching implications across numerous domains.
Watch the following videos to learn more about the P vs. NP problem:
Reductions
This lesson is optional.
In complexity theory, the concept of reductions is fundamental for comparing the complexity of different problems. A reduction is a way of transforming one problem into another in such a manner that a solution to the second problem can be used to solve the first problem. This concept is used to show that one problem is at least as hard as another.
To see an example of a reduction, watch the following video:
Approximation Algorithms
Approximation algorithms are a vital tool in tackling NP-hard optimization problems, where finding an exact solution is computationally infeasible due to exponential time complexity.
These algorithms seek to find solutions that are close to the optimal within a factor of the true solution. The performance of an approximation algorithm is often measured by the approximation ratio, which compares the quality of the solution provided by the algorithm to the quality of the optimal solution.
For example, a $$ρ$$-approximation algorithm guarantees that the solution will be within a factor of $$ρ$$ of the optimal solution. This approach is especially useful in real-world scenarios where near-optimal solutions are sufficient and can be obtained in a reasonable amount of time.
The design of approximation algorithms involves various techniques, such as greedy methods, local search, and linear programming relaxations. Greedy algorithms build up a solution piece by piece, always choosing the next piece that offers the most immediate benefit. Local search methods iteratively improve an initial solution by making small, local changes. Linear programming relaxations involve solving a relaxed version of the problem, where some constraints are loosened to make it easier to solve, and then converting the relaxed solution into a feasible solution for the original problem.
Approximation algorithms have been successfully applied to numerous NP-hard problems, including the TSP. Watch the following video to learn more about approximation algorithms for the TSP:
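As a simple illustration of the greedy idea, here is a nearest-neighbor heuristic for the TSP; note that this is a heuristic rather than a constant-factor approximation algorithm in general, and it assumes an `n x n` distance matrix:

```python
def tsp_nearest_neighbor(dist, start=0):
    """Greedy heuristic: always travel to the closest unvisited city next.
    Runs in O(n^2) time but offers no general approximation guarantee."""
    n = len(dist)
    unvisited = set(range(n)) - {start}
    route, current, total = [start], start, 0
    while unvisited:
        nearest = min(unvisited, key=lambda city: dist[current][city])
        total += dist[current][nearest]
        unvisited.remove(nearest)
        route.append(nearest)
        current = nearest
    total += dist[current][start]     # return to the starting city
    route.append(start)
    return total, route
```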
Reading
Read about intractable problems in Chapter 12 of An Open Guide to Data Structures and Algorithms.
Week 9 Quiz
- Complete this week's quiz in Gradescope by navigating to the appropriate link.
- Export it as a pdf using the Google Chrome plugin: https://gofullpage.com/. This plugin will do multiple screen captures while scrolling through your document, and then stitch them into a pdf that you can download.
- Upload the generated pdf to Anchor below.
Note: Anchor submissions can occur at any time during the term, but it is critical that you upload all of your work to Anchor before the last day of the term. Gradescope submissions must be submitted before the deadline (or the late deadline, if applicable).
Assignment 5
Due date: June 14
📌 This is an individual assignment. You are expected to work independently.
If you get stuck, confused, or have trouble with the assignment, you should use the #help-dsa2 channel in Discord or message an instructor. Remember not to spoil the assignment for others - do not post code samples from your work to public spaces.
Traveling Salesman
In this assignment, you will complete a couple of algorithms related to the traveling salesman problem.
▶️ Access the assignment on GitHub Classroom
Remember...
- Read the instructions carefully
- Plan before you code
- Debug if you aren't getting the desired output
- Attend office hours if you need additional support
- Ask for help in Discord
Week 10
Welcome to week 10! In this week, we will focus on wrapping up assignments and will have a final exam.
What's due this week
- Final exam on Thursday
- Assignment 5, due Friday
- All other assignment late submissions, due Friday
Final Exam
We will have our final exam on Thursday, June 13 at 6:00 PM GMT. Here are the details:
Link to Exam
You can access the exam on OnlineExamMaker here: https://t.onlineexammaker.com/doexam/pQEOjonqLl1.html.
Date and time
- Held during Week 10 during our normal class time: Thursday, June 13 at 6:00 PM GMT
- You can access the link to the final exam ahead of time, but it won't be enabled until 6:00 PM GMT the day of the exam
Material
- Everything that was covered during the term, with an emphasis on weeks 5 - 9
- Excludes the following lessons:
- Floyd-Warshall algorithm (week 7)
- Stable marriage problem (week 7)
- Linear programming (week 8)
- Duality (week 8)
- Reductions (week 9)
Format
- Administered online using a tool called OnlineExamMaker
- You can prepare and use any amount of notes on paper, but no other resources on the computer or Internet are allowed
- 75 minute exam from the time you begin
- Be sure to start the test between 6:00 - 6:15 PM GMT (it will close after that point)
- Exam will be composed of 20 multiple choice questions, shown one at a time
- You can't go back and change your answer after moving on to the next question
- We won't be using Zoom during the exam; you can just access the exam individually when you're ready
- Plan to have scratch paper for tracing/diagramming etc.
Proctoring software
- The exam website is also a proctoring service
- It will ask you to turn your camera on, and will take some pictures at random during the test
- It will flag any activity such as closing the test, changing programs, switching tabs, etc. If you do this too many times, it will close the exam and you will not be able to continue. Once you start the exam, you should expect to stay on the exam page and complete each question without any interruptions or using your computer for anything else.
- It disables copy and paste during the test
- It gives us the option to monitor your activity
Exam environment
- Plan to be in a test-taking environment
- Quiet, mostly private
Practice exam
There is a practice exam that you can take here: https://t.onlineexammaker.com/doexam/6QZO0aDmLJz.html. It is the same type of questions that will be on the final exam, but is only half the length (10 questions).
Questions
If you have any questions leading up to the exam, please ask in #help-dsa2 on Discord or reach out to the instructor.