This article guides us on how to do an inner join in LINQ. We are going to discover how to seamlessly extract linked data and carry out additional LINQ tasks in conjunction with the join.
Additionally, we also explore how to perform multi-level joins, that are commonly employed in real-world LINQ queries.
NOTE: Some degree of basic understanding of LINQ is required. We have an excellent article for a quick refresher on LINQ Basic Concepts in C#, do check it out.
Next, let’s grasp the fundamental principles of an inner join.
What Is Inner Join?
SQL offers different types of joins:
- Inner join
- Group join
- Left/Right outer join
- Full outer join
- Cross join
Out of these joins, in this article, we will focus on the inner join since it’s the most used one.
An inner join relates two entities with a standard key to connect them.
To understand this, let’s define [1, 3, 5, 7, 9, 11]
& [2, 4, 6, 8, 10, 12]
as an odd and an even set. Is there any common integer in the two of them? we can say for sure that there is no common integer as an odd integer can never be also an even integer. The intersection between these two sets is empty:
Now, let’s take [1, 3, 5, 7, 9, 11]
& [2, 3, 5, 7, 11]
as an odd and a prime set. Some integers are both odd and prime. We can form a new set [3, 5, 7, 11]
consisting of common integers from the two sets. The intersection between these two sets is not empty:
Inner join is about the unique relationships between the groups defined based on commonalities such as the relationship between the odd and the prime sets.
Understanding the basics is essential, but what are some actual problems that can be addressed using inner joins? Let’s examine that.
The Usefulness of Inner Join
Let’s define a Student
table that has an Id
and a Name
:
Id | Name |
---|---|
1 | Boba Fett |
2 | Obi-Wan Kenobi |
3 | Darth Vader |
4 | Yoda |
5 | Grogu |
Let’s also define a Category
table with an Id
and a Name
of the category:
Id | Name |
---|---|
1 | Programming Language |
2 | DevOps |
3 | Database System |
Then, let’s add a Course
table with an Id
, Name
and the CategoryId
that the course belongs to:
Id | Name | CategoryId |
---|---|---|
1 | PYTHON | 1 |
2 | MSSQL | 3 |
3 | GIT | 2 |
Finally, let’s add an Enrolment
table that keeps a reference to the Student
and the Course
they enrolled in using their respective Ids:
Id | CourseId | StudentId |
---|---|---|
1 | 3 | 3 |
2 | 2 | 2 |
3 | 1 | 4 |
4 | 2 | 1 |
The final schema of our data is straightforward:
The inner join could help us answer questions like – What is the name of the student with the enrolment id 2? What’s the name of the course in the Enrolment
that has a certain Id
?
With our human eye, we can easily perform inner joins and say that “Obi-Wan Kenobi is enrolled for MSSQL” or “Enrolment number 4 is about MSSQL”, but in this article, we are going to learn how to answer these questions programmatically using LINQ.
How to Join Two Entities in LINQ?
Let’s begin by creating the entities we need.
The entity Course
consists of the Id
, Name
, and CategoryId
properties:
public class Course { public int Id { get; set; } public string Name { get; set; } public int CategoryId { get; set; } public static IEnumerable<Course> GetDummyCourses() { ... } }
We have also added GetDumyCourses()
method to populate a list of courses for our examples. You can see its full implementation on GitHub.
Next, let’s create the entity Category
, which consists of the Id
and Name
properties. It also has a helper method, GetDummyCourseCategories()
:
public class Category { public int Id { get; set; } public string Name { get; set; } public static IEnumerable<Category> GetDummyCourseCategories() { ... } }
The full implementation of GetDummyCourseCategories()
is on GitHub as well.
To have a look at our dummy categories and courses, we define two helper methods PrintDummyCourses()
and PrintDummyCategories()
where we simply iterate over the collections and print the result:
public static void PrintDummyCourses() { foreach (var course in Course.GetDummyCourses()) { Console.WriteLine( $"Id: {course.Id} " + $"Course: {course.Name} " + $"CategoryId: {course.CategoryId} " ); } } public static void PrintDummyCategories() { foreach (var category in Category.GetDummyCourseCategories()) { Console.WriteLine( $"Id: {category.Id} " + $"Category: {category.Name} "); } }
When we invoke the two helper methods, we get the list of courses and the list of categories:
-----the courses----- Id: 1 Course: CSHARP CategoryId: 1 Id: 2 Course: PYTHON CategoryId: 1 Id: 3 Course: JAVA CategoryId: 1 Id: 4 Course: RUBY CategoryId: 1 Id: 5 Course: GO CategoryId: 1 Id: 6 Course: MSSQL CategoryId: 2 Id: 7 Course: POSTGRES CategoryId: 2 Id: 8 Course: MONGODB CategoryId: 2 Id: 9 Course: MYSQL CategoryId: 2 Id: 10 Course: DOCKER CategoryId: 3 Id: 11 Course: JENKINS CategoryId: 3 Id: 12 Course: GIT CategoryId: 3 Id: 13 Course: KUBERNETES CategoryId: 3 -----the categories----- Id: 1 Category: Programming Language Id: 2 Category: DevOps Id: 3 Category: Database System
Storing Course
and Category
as separate entities provides us with great flexibility, but it isn’t very intuitive from the end-user perspective.
For example, our users would prefer to see the name of the category that a course belongs to, rather than having to guess based on the category ID.
We are going to tackle this issue by framing the relationship to return each course with its respective category name using an inner join on Course
and Category
with the help of LINQ.
Inner Join Using Query Syntax
LINQ provides us with the query syntax to combine two collections based on a common key:
from <alias_for_sequence_1> in sequence_1 join <alias_for_sequence_2> in sequence_2 on <alias_for_sequence_1>.<key_for_sequence_1> equals <alias_for_sequence_2>.<key_for_sequence_2> select <object> { .... }
Using the query syntax, we can create a new entity combining Course
and Category
with inner join on CategoryId
and Id
:
public static IEnumerable<dynamic> GetCoursesWithCategory( IEnumerable<Course> courses) { var categories = Category.GetDummyCourseCategories(); var courseWithCategory = from course in courses join category in categories on course.CategoryId equals category.Id select new { course.Id, course.Name, CategoryName = category.Name }; return courseWithCategory; }
We first fetch the list of all the categories using the helper method GetDummyCourseCategories()
then we execute the LINQ query that combines the courses
and categories
sequence using the join
clause, specifying the course.CategoryId
and category.Id
as the keys to join on.
Finally, we apply the select
clause to form the query result as an anonymous object containing properties Id
, Name
, and CategoryName
taken from the course
and category
sequences.
Now, if we print the result with the PrintCoursesWithCategory()
we get:
Id: 1 Course: CSHARP Category: Programming Language Id: 2 Course: PYTHON Category: Programming Language Id: 3 Course: JAVA Category: Programming Language Id: 4 Course: RUBY Category: Programming Language Id: 5 Course: GO Category: Programming Language Id: 6 Course: MSSQL Category: DevOps Id: 7 Course: POSTGRES Category: DevOps Id: 8 Course: MONGODB Category: DevOps Id: 9 Course: MYSQL Category: DevOps Id: 10 Course: DOCKER Category: Database System Id: 11 Course: JENKINS Category: Database System Id: 12 Course: GIT Category: Database System Id: 13 Course: KUBERNETES Category: Database System
Isn’t it great? No we have all the courses with their respective categories.
Inner Join Using Method Syntax
Alternatively, LINQ also provides us with the method or fluent syntax:
sequence_1.Join( sequence_2, <alias_for_sequence_1> => <alias_for_sequence_1>.<key_for_sequence_1>, <alias_for_sequence_2> => <alias_for_sequence_2>.<key_for_sequence_2>, (<variable_1>, <variable_2>) => new { ... })
Using the method syntax, let’s create a new method GetCoursesWithCategoryName()
that does exactly the same as GetCoursesWithCategory()
but with LINQ method syntax:
public static IEnumerable<dynamic> GetCoursesWithCategoryName( IEnumerable<Course> courses) { var categories = Category.GetDummyCourseCategories(); var courseWithCategoryName = courses.Join( categories, course => course.CategoryId, category => category.Id, (course, category) => new { course.Id, course.Name, CategoryName = category.Name } ); return courseWithCategoryName; }
Now, if we print the result with the PrintCoursesWithCategory()
we get:
Id: 1 Course: CSHARP Category: Programming Language Id: 2 Course: PYTHON Category: Programming Language Id: 3 Course: JAVA Category: Programming Language Id: 4 Course: RUBY Category: Programming Language Id: 5 Course: GO Category: Programming Language Id: 6 Course: MSSQL Category: DevOps Id: 7 Course: POSTGRES Category: DevOps Id: 8 Course: MONGODB Category: DevOps Id: 9 Course: MYSQL Category: DevOps Id: 10 Course: DOCKER Category: Database System Id: 11 Course: JENKINS Category: Database System Id: 12 Course: GIT Category: Database System Id: 13 Course: KUBERNETES Category: Database System
The GetCoursesWithCategoryName()
, joins the courses
list with the categories
list, using the Id
property of the category
object and the CategoryId
property of the course
object as the join keys. And finally, it uses the lambda passed as the fourth parameter to select the query results.
The choice between the query syntax and method syntax for joining related entities is primarily a matter of personal preference, rather than one being more straightforward, sophisticated, or efficient than the other.
Until now, our understanding is limited to merging two entities. However, it is highly beneficial to investigate whether we can merge multiple entities together.
How Do We Join Multiple Entities in C#?
We’ll explore the practical scenario of enrolling students in various courses to understand how multiple joins can assist us in relating multiple entities and retrieving pertinent data.
First and foremost we need two new entities – Student
& Enrolment
, let’s add that to our project.
Data Preparation
The Student
has Id
and Name
properties along with a helper method GetDummyStudents()
that returns list of students:
public class Student { public int Id { get; set; } public string Name { get; set; } public static IEnumerable<Student> GetDummyStudents() { ... } }
Next, Let’s add another entity named Enrolment
to capture which student gets enrolled in which course:
public class Enrolment { public int Id { get; set; } public int StudentId { get; set; } public int CourseId { get; set; } public static IEnumerable<Enrolment> GetDummyEnrolment() { ... } }
The Enrolment
has an Id
property, it keeps the reference of the student enrolled with StudentId
and the course the student gets enrolled in with CourseId
. You can check the full implementation on GitHub.
Performing the Inner Join
Here is the fun part, our users wish to see all the enrolments with the student name and course name for each enrolment instead of the IDs.
To answer our users’ needs, let’s create GetEnrolmentsRelation()
:
private static IEnumerable<dynamic> GetEnrolmentsRelation( IEnumerable<Enrolment> enrolments, IEnumerable<Student> students, IEnumerable<Course> courses) { return enrolments .Join( students, enrolment => enrolment.StudentId, student => student.Id, (enrolment, student) => new { enrolment, student } ) .Join( courses, enrolmentStudentRelation => enrolmentStudentRelation.enrolment.CourseId, course => course.Id, (enrolmentStudentRelation, course) => new { enrolmentStudentRelation, course } ); }
Firstly, we join the enrolments
data set with the students
data set, using the StudentId
property of the enrolment
object and the Id
property of the student
object as the join keys.
Secondly, we perform a second join that takes the first join result and joins it with the courses
data set, using the CourseId
property of the enrolment
object from enrolmentStudentRelation
and the Id
property of the course
object as the join keys.
As a rule, each subsequent join in a multiple join chain operates on the resultset from the previous join operation, so the second join joins a new collection with the result set obtained from the first join, the third with the second, and so on.
Finally, The result returned is a list of dynamic objects that have the full course
object and enrolmentStudentRelation
property which in turn contains the full enrolment
and student
objects instead of just their ids.
Now, to get each enrollment id, the name of the student, and the name of the course we need to perform a select operation on the join result. Let’s create a new method, GetEnrolments()
:
public static IEnumerable<dynamic> GetEnrolments( IEnumerable<Enrolment> enrolments) { var students = Student.GetDummyStudents(); var courses = Course.GetDummyCourses(); var enrolmentsRelation = GetEnrolmentsRelation( enrolments, students, courses); var result = enrolmentsRelation.Select( e => new { e.enrolmentStudentRelation.enrolment.Id, StudentName = e.enrolmentStudentRelation.student.Name, CourseName = e.course.Name, } ); return result; }
The GetEnrolments()
uses GetEnrolmentsRelation()
and then selects the properties it needs from the result of the join.
It ain’t worth it if we don’t verify the result. Let’s call PrintEnrolments()
, which prints formatted results on the console:
Id: 1, Student Name: Jane, Course: CSHARP Id: 2, Student Name: Jane, Course: MONGODB Id: 3, Student Name: Jane, Course: GIT Id: 4, Student Name: Sabrina, Course: PYTHON Id: 5, Student Name: Sabrina, Course: JAVA Id: 6, Student Name: Pat, Course: POSTGRES Id: 7, Student Name: Pat, Course: CSHARP Id: 8, Student Name: Theresa, Course: MYSQL Id: 9, Student Name: Janice, Course: MYSQL Id: 10, Student Name: Shane, Course: MSSQL Id: 11, Student Name: Shane, Course: GO Id: 12, Student Name: Shane, Course: JENKINS Id: 13, Student Name: Rick, Course: MYSQL Id: 14, Student Name: Julie, Course: GIT Id: 15, Student Name: Julie, Course: POSTGRES Id: 16, Student Name: Bruce, Course: CSHARP Id: 17, Student Name: Bruce, Course: DOCKER Id: 18, Student Name: Miguel, Course: RUBY Id: 19, Student Name: Miguel, Course: DOCKER Id: 20, Student Name: Miguel, Course: MONGODB
Thus far, we have built a solid foundation on how to join entities using LINQ. Can we apply other LINQ operators on the join result?
How Do We Use Join With Common LINQ Operators?
LINQ offers a diverse set of operators, including filtering, grouping, ordering, and more. While it’s not possible to cover all of them, in this section we’ll explore how to filter and group results obtained by joining Enrolment
with Student
and Course
.
For example, let’s suppose our users want to group enrolments based on course Id, with a focus on courses where ID < 5
. We already know how to join tables, so the only remaining tasks are to filter and group the results.
Let’s add a method FilterAndGroupEnrolments()
:
public static IEnumerable<dynamic> FilterAndGroupEnrolments( IEnumerable<Enrolment> enrolments) { var students = Student.GetDummyStudents(); var courses = Course.GetDummyCourses(); var enrolmentsRelation = GetEnrolmentsRelation( enrolments, students, courses); var result = enrolmentsRelation .Select( e => new { e.enrolmentStudentRelation.enrolment.Id, StudentName = e.enrolmentStudentRelation.student.Name, CourseId = e.course.Id, CourseName = e.course.Name, } ) .Where(x => x.CourseId < 5) .GroupBy(x => x.CourseId); return result; }
Firstly, combine Enrolment
, Course
, and Student
data using the already defined GetEnrolmentsRelation()
. After that, we filter the results to only include courses with an id less than 5. Finally, we group the filtered results by course id.
Now, let’s call PrintEnrolmentsGroup()
, which prints the result on the console:
Enrolements for Course Id: 1 Id: 1, Student Name: Geraldine, Course: CSHARP Id: 7, Student Name: Lillian, Course: CSHARP Id: 16, Student Name: Garrett, Course: CSHARP Enrolements for Course Id: 2 Id: 4, Student Name: Betsy, Course: PYTHON Enrolements for Course Id: 3 Id: 5, Student Name: Betsy, Course: JAVA Enrolements for Course Id: 4 Id: 18, Student Name: Melody, Course: RUBY
Similarly, we can apply other LINQ operators to the join result.
Conclusion
To sum up, in this article we have seen an overview of the inner join using LINQ, we covered its what, why, and how. Then, we explored various approaches to constructing LINQ queries and we discovered how effortless it is to connect multiple entities. Additionally, we briefly examined the usage of other LINQ operators, including filter and group, in conjunction with the inner join.