Many to many relationships
You've seen the use of foreign keys to represent the relationships between tables. Up until this point, the relationships have been between two tables: one with a primary key, and the other that uses a foreign key to indicate which item in the other table it is associated with. For example, an orders table keeps track of the customer_id of each order.
What about many-to-many relationships?
Think about it: Model these situations
How would you model these scenarios? Think about what tables you'd need, and what foreign keys would go on those tables.
- Actors can appear in many films. Films can have many actors.
- Posts can have many tags. Tags can be applied to many posts.
- Users can follow (and be followed by) many other users.
Where would you put the foreign keys on these tables?
If you only have two tables, there's no good answer.
Take the posts
and tags
example.
- If the foreign key goes on the posts table (
posts.tag_id
), then each post can only have one tag. - If the foreign key goes on the tags table (
tags.post_id
), then each tag can only apply to one post.
There are some bad answers possible here, like adding more than one column (posts.tag1_id
, posts.tag2_id
), or storing more than one id in a column (posts.tag_ids
with a comma-separated list of ids). However, these violate the principles of normalization.
Instead, we'll need to introduce a third table in between posts
and tags
to represent the association between them.
Associative entities
Many to many relationships in SQL databases are surprising because they require an additional table in order to establish the relationship between two entities. This table contains foreign keys to both of the related entities, and is used to represent the many to many relationship.
This third table is called an association table (there are lots of other names for it, like "junction table" and "join table").
Let's see how it would work for the examples above.
- To connect
actors
andfilms
, you could have anappearances
table. Eachappearance
would have anactor_id
andfilm_id
, to capture the fact that the actor appeared in that film. - To connect
posts
andtags
, you might have apost_tags
table, withpost_id
andtag_id
columns. Each row in thepost_tags
table would connect one tag to one post. - To connect users to the users that they follow, you might introduce a
following
table, with afollower_id
andfollows_id
, which both point back to theusers
table!
Terminology: belongs-to, has-many, and has-many-through
When discussing relationships between entities, it's common to use the terms "belongs to" and "has many" for this kind of relationship. An order belongs to a customer. A customer has many orders. An order also belongs to a product, and a product has many orders.
For many to many relationships, it can be helpful to describe the with the phrase "has-many-through". A post has many tags through the post_tags
table. A film has many actors through appearances.
JOIN for many-to-many relationships
To write the JOIN query for a many-to-many relationship, you need to join the main entity tables with the association table. Here's an example, using the tables students
, courses
, and enrollments
:
To fetch all the courses enrolled by a student with id = 1, you would write the following JOIN:
SELECT courses.*
FROM courses
JOIN enrollments ON courses.id = enrollments.course_id
JOIN students ON enrollments.student_id = students.id
WHERE students.id = 10;
In this statement, we first join the "courses" table with the "enrollments" table on the course_id column. Then we join the "enrollments" table with the "students" table on the student_id column. Finally, we filter the results to only show the courses enrolled by the student with id = 10
.
When we're writing JOIN queries, it's important to include the name of the table in the WHERE
clause. Since more than one table has a column called id
, SQL needs the name of the table to determine which id
is meant.
Practice: querying many-to-many
Practice reading and writing queries for multiple tables using a restaurant dataset on Replit.
Open Restaurant Reviews on Replit
Fork the repl to begin.
- Start by reading schema.sql to get a sense for the tables
- Read and run main.sql to see a little bit about what the data is like, and to see some example queries.
- Try writing some queries! Follow the instructions at the bottom of main.sql for some queries to try.