This article guides us on how to do an inner join in LINQ. We are going to discover how to seamlessly extract linked data and carry out additional LINQ tasks in conjunction with the join.

Additionally, we also explore how to perform multi-level joins, that are commonly employed in real-world LINQ queries.

To download the source code for this article, you can visit our GitHub repository.

NOTE: Some degree of basic understanding of LINQ is required. We have an excellent article for a quick refresher on LINQ Basic Concepts in C#, do check it out.

Support Code Maze on Patreon to get rid of ads and get the best discounts on our products!
Become a patron at Patreon!

Next, let’s grasp the fundamental principles of an inner join.

What Is Inner Join?

SQL offers different types of joins:

  • Inner join
  • Group join
  • Left/Right outer join
  • Full outer join
  • Cross join

Out of these joins, in this article, we will focus on the inner join since it’s the most used one.

An inner join relates two entities with a standard key to connect them.

To understand this, let’s define [1, 3, 5, 7, 9, 11] & [2, 4, 6, 8, 10, 12] as an odd and an even set. Is there any common integer in the two of them? we can say for sure that there is no common integer as an odd integer can never be also an even integer. The intersection between these two sets is empty:

odd even set diagram

Now, let’s take [1, 3, 5, 7, 9, 11] & [2, 3, 5, 7, 11] as an odd and a prime set. Some integers are both odd and prime. We can form a new set [3, 5, 7, 11] consisting of common integers from the two sets. The intersection between these two sets is not empty:

odd prime set diagram

Inner join is about the unique relationships between the groups defined based on commonalities such as the relationship between the odd and the prime sets. 

Understanding the basics is essential, but what are some actual problems that can be addressed using inner joins? Let’s examine that.

The Usefulness of Inner Join

Let’s define a Student table that has an Id and a Name:

IdName
1Boba Fett
2Obi-Wan Kenobi
3Darth Vader
4Yoda
5Grogu

Let’s also define a Category table with an Id and a Name of the category:

IdName
1Programming Language
2DevOps
3Database System

Then, let’s add a Course table with an Id, Name and the CategoryId that the course belongs to:

IdNameCategoryId
1PYTHON1
2MSSQL3
3GIT2

Finally, let’s add an Enrolment table that keeps a reference to the Student and the Course they enrolled in using their respective Ids:

IdCourseIdStudentId
133
222
314
421

The final schema of our data is straightforward:

 

Table schema of the data

The inner join could help us answer questions like – What is the name of the student with the enrolment id 2? What’s the name of the course in the Enrolment that has a certain Id?

With our human eye, we can easily perform inner joins and say that “Obi-Wan Kenobi is enrolled for MSSQL” or “Enrolment number 4 is about MSSQL”, but in this article, we are going to learn how to answer these questions programmatically using LINQ.

How to Join Two Entities in LINQ?

Let’s begin by creating the entities we need. 

The entity Course consists of the Id, Name, and CategoryId properties:

public class Course
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int CategoryId { get; set; }

    public static IEnumerable<Course> GetDummyCourses()
    {
      ...
    }
}

We have also added GetDumyCourses() method to populate a list of courses for our examples. You can see its full implementation on GitHub.

Next, let’s create the entity Category, which consists of the Id and Name properties. It also has a helper method, GetDummyCourseCategories():

public class Category
{
    public int Id { get; set; }
    public string Name { get; set; }

    public static IEnumerable<Category> GetDummyCourseCategories()
    {
      ...
    }
}

The full implementation of GetDummyCourseCategories() is on GitHub as well.

To have a look at our dummy categories and courses, we define two helper methods PrintDummyCourses() and PrintDummyCategories() where we simply iterate over the collections and print the result:

public static void PrintDummyCourses()
{
    foreach (var course in Course.GetDummyCourses())
    {
        Console.WriteLine(
            $"Id: {course.Id} " +
            $"Course: {course.Name} " +
            $"CategoryId: {course.CategoryId} "
        );
    }
}

public static void PrintDummyCategories()
{
    foreach (var category in Category.GetDummyCourseCategories())
    {
        Console.WriteLine(
            $"Id: {category.Id} " +
            $"Category: {category.Name} ");
    }
}

When we invoke the two helper methods, we get the list of courses and the list of categories:

-----the courses-----

Id: 1 Course: CSHARP  CategoryId: 1
Id: 2 Course: PYTHON  CategoryId: 1
Id: 3 Course: JAVA  CategoryId: 1
Id: 4 Course: RUBY  CategoryId: 1
Id: 5 Course: GO  CategoryId: 1
Id: 6 Course: MSSQL  CategoryId: 2
Id: 7 Course: POSTGRES  CategoryId: 2
Id: 8 Course: MONGODB  CategoryId: 2
Id: 9 Course: MYSQL  CategoryId: 2
Id: 10 Course: DOCKER  CategoryId: 3
Id: 11 Course: JENKINS  CategoryId: 3
Id: 12 Course: GIT  CategoryId: 3
Id: 13 Course: KUBERNETES  CategoryId: 3

-----the categories-----

Id: 1 Category: Programming Language
Id: 2 Category: DevOps
Id: 3 Category: Database System

Storing Course and Category as separate entities provides us with great flexibility, but it isn’t very intuitive from the end-user perspective.

For example, our users would prefer to see the name of the category that a course belongs to, rather than having to guess based on the category ID.

We are going to tackle this issue by framing the relationship to return each course with its respective category name using an inner join on Course and Category with the help of LINQ.

Inner Join Using Query Syntax

LINQ provides us with the query syntax to combine two collections based on a common key:

from <alias_for_sequence_1> in sequence_1
join <alias_for_sequence_2> in sequence_2 
on <alias_for_sequence_1>.<key_for_sequence_1> 
equals <alias_for_sequence_2>.<key_for_sequence_2>
select <object>
{
  ....
}

Using the query syntax, we can create a new entity combining Course and Category with inner join on CategoryId and Id:

public static IEnumerable<dynamic> GetCoursesWithCategory(
    IEnumerable<Course> courses)
{
    var categories = Category.GetDummyCourseCategories();

    var courseWithCategory =
        from course in courses
        join category in categories on
        course.CategoryId equals category.Id
        select new
        {
            course.Id,
            course.Name,
            CategoryName = category.Name
        };

    return courseWithCategory;
}

We first fetch the list of all the categories using the helper method GetDummyCourseCategories() then we execute the LINQ query that combines the courses and categories sequence using the join clause, specifying the course.CategoryId and category.Id as the keys to join on.

Finally, we apply the select clause to form the query result as an anonymous object containing properties Id, Name, and CategoryName taken from the course and category sequences. 

Now, if we print the result with the PrintCoursesWithCategory() we get:

Id: 1 Course: CSHARP  Category: Programming Language
Id: 2 Course: PYTHON  Category: Programming Language
Id: 3 Course: JAVA  Category: Programming Language
Id: 4 Course: RUBY  Category: Programming Language
Id: 5 Course: GO  Category: Programming Language
Id: 6 Course: MSSQL  Category: DevOps
Id: 7 Course: POSTGRES  Category: DevOps
Id: 8 Course: MONGODB  Category: DevOps
Id: 9 Course: MYSQL  Category: DevOps
Id: 10 Course: DOCKER  Category: Database System
Id: 11 Course: JENKINS  Category: Database System
Id: 12 Course: GIT  Category: Database System
Id: 13 Course: KUBERNETES  Category: Database System

Isn’t it great? No we have all the courses with their respective categories. 

Inner Join Using Method Syntax

Alternatively, LINQ also provides us with the method or fluent syntax:

sequence_1.Join(
sequence_2,
<alias_for_sequence_1> => <alias_for_sequence_1>.<key_for_sequence_1>,
<alias_for_sequence_2> => <alias_for_sequence_2>.<key_for_sequence_2>,
(<variable_1>, <variable_2>) => new
{
    ...
})

Using the method syntax, let’s create a new method GetCoursesWithCategoryName() that does exactly the same as GetCoursesWithCategory() but with LINQ method syntax:

public static IEnumerable<dynamic> GetCoursesWithCategoryName(
    IEnumerable<Course> courses)
{
    var categories = Category.GetDummyCourseCategories();

    var courseWithCategoryName = courses.Join(
        categories,
        course => course.CategoryId,
        category => category.Id,
        (course, category) => new
        {
            course.Id,
            course.Name,
            CategoryName = category.Name
        }
    );

    return courseWithCategoryName;
}

Now, if we print the result with the PrintCoursesWithCategory() we get:

Id: 1 Course: CSHARP  Category: Programming Language
Id: 2 Course: PYTHON  Category: Programming Language
Id: 3 Course: JAVA  Category: Programming Language
Id: 4 Course: RUBY  Category: Programming Language
Id: 5 Course: GO  Category: Programming Language
Id: 6 Course: MSSQL  Category: DevOps
Id: 7 Course: POSTGRES  Category: DevOps
Id: 8 Course: MONGODB  Category: DevOps
Id: 9 Course: MYSQL  Category: DevOps
Id: 10 Course: DOCKER  Category: Database System
Id: 11 Course: JENKINS  Category: Database System
Id: 12 Course: GIT  Category: Database System
Id: 13 Course: KUBERNETES  Category: Database System

The GetCoursesWithCategoryName(), joins the courses list with the categories list, using the Id property of the category object and the CategoryId property of the course object as the join keys. And finally, it uses the lambda passed as the fourth parameter to select the query results.

The choice between the query syntax and method syntax for joining related entities is primarily a matter of personal preference, rather than one being more straightforward, sophisticated, or efficient than the other.

Until now, our understanding is limited to merging two entities. However, it is highly beneficial to investigate whether we can merge multiple entities together.

How Do We Join Multiple Entities in C#?

We’ll explore the practical scenario of enrolling students in various courses to understand how multiple joins can assist us in relating multiple entities and retrieving pertinent data.

First and foremost we need two new entities – Student & Enrolment, let’s add that to our project.

Data Preparation

The Student has Id and Name properties along with a helper method GetDummyStudents() that returns list of students:

public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }

    public static IEnumerable<Student> GetDummyStudents()
    {
      ...
    }
}

Next, Let’s add another entity named Enrolment to capture which student gets enrolled in which course:

public class Enrolment
{
    public int Id { get; set; }
    public int StudentId { get; set; }
    public int CourseId { get; set; }

    public static IEnumerable<Enrolment> GetDummyEnrolment()
    {
      ...
    }
}

The Enrolment has an Id property, it keeps the reference of the student enrolled with StudentId and the course the student gets enrolled in with CourseId. You can check the full implementation on GitHub. 

Performing the Inner Join

Here is the fun part, our users wish to see all the enrolments with the student name and course name for each enrolment instead of the IDs.

To answer our users’ needs, let’s create GetEnrolmentsRelation():

private static IEnumerable<dynamic> GetEnrolmentsRelation(
    IEnumerable<Enrolment> enrolments,
    IEnumerable<Student> students,
    IEnumerable<Course> courses)
{
    return enrolments
        .Join(
            students,
            enrolment => enrolment.StudentId,
            student => student.Id,
            (enrolment, student) => new { enrolment, student }
        )
        .Join(
            courses,
            enrolmentStudentRelation =>
            enrolmentStudentRelation.enrolment.CourseId,
            course => course.Id,
            (enrolmentStudentRelation, course) =>
            new { enrolmentStudentRelation, course }
        );
}

Firstly, we join the enrolments data set with the students data set, using the StudentId property of the enrolment object and the Id property of the student object as the join keys. 

Secondly, we perform a second join that takes the first join result and joins it with the courses data set, using the CourseId property of the enrolment object from enrolmentStudentRelation and the Id property of the course object as the join keys.

As a rule, each subsequent join in a multiple join chain operates on the resultset from the previous join operation, so the second join joins a new collection with the result set obtained from the first join, the third with the second, and so on.

Finally, The result returned is a list of dynamic objects that have the full course object and enrolmentStudentRelation property which in turn contains the full enrolment and student objects instead of just their ids. 

Now, to get each enrollment id, the name of the student, and the name of the course we need to perform a select operation on the join result. Let’s create a new method, GetEnrolments():

public static IEnumerable<dynamic> GetEnrolments(
    IEnumerable<Enrolment> enrolments)
{
    var students = Student.GetDummyStudents();

    var courses = Course.GetDummyCourses();

    var enrolmentsRelation = GetEnrolmentsRelation(
        enrolments,
        students,
        courses);

    var result = enrolmentsRelation.Select(
        e =>
            new
            {
                e.enrolmentStudentRelation.enrolment.Id,
                StudentName = e.enrolmentStudentRelation.student.Name,
                CourseName = e.course.Name,
            }
    );

    return result;
}

The GetEnrolments() uses GetEnrolmentsRelation() and then selects the properties it needs from the result of the join.

It ain’t worth it if we don’t verify the result. Let’s call PrintEnrolments(), which prints formatted results on the console:

Id: 1, Student Name: Jane, Course: CSHARP
Id: 2, Student Name: Jane, Course: MONGODB
Id: 3, Student Name: Jane, Course: GIT
Id: 4, Student Name: Sabrina, Course: PYTHON
Id: 5, Student Name: Sabrina, Course: JAVA
Id: 6, Student Name: Pat, Course: POSTGRES
Id: 7, Student Name: Pat, Course: CSHARP
Id: 8, Student Name: Theresa, Course: MYSQL
Id: 9, Student Name: Janice, Course: MYSQL
Id: 10, Student Name: Shane, Course: MSSQL
Id: 11, Student Name: Shane, Course: GO
Id: 12, Student Name: Shane, Course: JENKINS
Id: 13, Student Name: Rick, Course: MYSQL
Id: 14, Student Name: Julie, Course: GIT
Id: 15, Student Name: Julie, Course: POSTGRES
Id: 16, Student Name: Bruce, Course: CSHARP
Id: 17, Student Name: Bruce, Course: DOCKER
Id: 18, Student Name: Miguel, Course: RUBY
Id: 19, Student Name: Miguel, Course: DOCKER
Id: 20, Student Name: Miguel, Course: MONGODB

Thus far, we have built a solid foundation on how to join entities using LINQ. Can we apply other LINQ operators on the join result? 

How Do We Use Join With Common LINQ Operators?

LINQ offers a diverse set of operators, including filtering, grouping, ordering, and more. While it’s not possible to cover all of them, in this section we’ll explore how to filter and group results obtained by joining Enrolment with Student and Course.

For example, let’s suppose our users want to group enrolments based on course Id, with a focus on courses where ID < 5. We already know how to join tables, so the only remaining tasks are to filter and group the results.

Let’s add a method FilterAndGroupEnrolments():

public static IEnumerable<dynamic> FilterAndGroupEnrolments(
    IEnumerable<Enrolment> enrolments)
{
    var students = Student.GetDummyStudents();

    var courses = Course.GetDummyCourses();

    var enrolmentsRelation = GetEnrolmentsRelation(
        enrolments,
        students,
        courses);

    var result = enrolmentsRelation
        .Select(
            e =>
                new
                {
                    e.enrolmentStudentRelation.enrolment.Id,
                    StudentName = e.enrolmentStudentRelation.student.Name,
                    CourseId = e.course.Id,
                    CourseName = e.course.Name,
                }
        )
        .Where(x => x.CourseId < 5)
        .GroupBy(x => x.CourseId);

    return result;
}

Firstly, combine Enrolment, Course, and Student data using the already defined GetEnrolmentsRelation(). After that, we filter the results to only include courses with an id less than 5. Finally, we group the filtered results by course id.

Now, let’s call PrintEnrolmentsGroup(), which prints the result on the console:

Enrolements for Course Id: 1
Id: 1, Student Name: Geraldine, Course: CSHARP
Id: 7, Student Name: Lillian, Course: CSHARP
Id: 16, Student Name: Garrett, Course: CSHARP
Enrolements for Course Id: 2
Id: 4, Student Name: Betsy, Course: PYTHON
Enrolements for Course Id: 3
Id: 5, Student Name: Betsy, Course: JAVA
Enrolements for Course Id: 4
Id: 18, Student Name: Melody, Course: RUBY

Similarly, we can apply other LINQ operators to the join result.

Conclusion

To sum up, in this article we have seen an overview of the inner join using LINQ, we covered its what, why, and how. Then, we explored various approaches to constructing LINQ queries and we discovered how effortless it is to connect multiple entities. Additionally, we briefly examined the usage of other LINQ operators, including filter and group, in conjunction with the inner join.

Liked it? Take a second to support Code Maze on Patreon and get the ad free reading experience!
Become a patron at Patreon!