In this article, we’ll discuss how to create an outer join in LINQ. Most used are left and right outer join, but we’ll also talk about less common ones.

It’s hard to find an application that doesn’t rely on data. We often use just a single data source. But sometimes we need data from more than one source, be it a database, a file, or a C# collection, to mention just a few. This is where joins come in handy. There are a couple of types of joins and we will take a closer look at them.

To download the source code for this article, you can visit our GitHub repository.

So, let’s dive in. 

Support Code Maze on Patreon to get rid of ads and get the best discounts on our products!
Become a patron at Patreon!

Use of Outer Join in LINQ

In the most common scenario, we need data from one source that is somehow related to data in another source. We can use inner joins when we need to combine matching data coming from different sources.

For more information about inner joins, please check out our great article How to Do an Inner Join in LINQ?

But when do we use outer joins?

A typical scenario is when we have to handle missing data. We could, for example, set a default value for any missing piece of data in either source. 

Another use case is data analysis. We can use outer joins to compare data in two data sources or to find any discrepancies between them.

Finally, we use outer joins to merge data. This is the case when we need to preserve all the records from one or both of the sources while combining the data.

To demonstrate how outer joins work and how to use them in LINQ, let’s start by creating some sample data sources.

Data Setup for Outer Joins

Let’s create a Song class and an Author class. To keep things simple, we’ll only add a few properties to each:

public class Song
{
    public int Id { get; set; }
    public string Title { get; set; } = string.Empty;
    public int AuthorId { get; set; }
}

public class Author
{
    public int Id { get; set; }
    public string Name { get; set; } = string.Empty;
}

Each song has an author, hence the AuthorId as a foreign key in the class Song. Next, let’s create a list of songs and a list of authors:

public List<Song> Songs =
[
    new Song() { Id = 1, Title = "Alabama Sundown", AuthorId = 3},
    new Song() { Id = 2, Title = "New Beginning", AuthorId = 1},
    new Song() { Id = 3, Title = "To the Stars", AuthorId = 5},
    new Song() { Id = 4, Title = "First Time Ever", AuthorId = 2},
    new Song() { Id = 5, Title = "Nobody Listens", AuthorId = 4},
    new Song() { Id = 6, Title = "New Love", AuthorId = 1},
];

public List<Author> Authors =
[
    new Author() { Id = 1, Name = "Sue Sherrington"},
    new Author() { Id = 2, Name = "Luke Palmer" },
    new Author() { Id = 3, Name = "Gwen Felice" },
    new Author() { Id = 6, Name = "Brian Moore" },
    new Author() { Id = 7, Name = "Roy Cobbler" }
];

Our two lists represent the two data sources that we’ll be using in our examples. We can see that there are both matching and non-matching data. Some songs have the AuthorId property set to a value that doesn’t exist in the author list. Also, we have authors with no songs in the first list. 

We can use anonymous types in our LINQ query. We’re going to define a type named SongWithAuthor:

public class SongWithAuthor
{
    public string Title { get; set; } = string.Empty;
    public string AuthorName { get; set; } = string.Empty;
}

This way, our query will return us a list of SongWithAuthor objects.

Now that our data is prepared, we can utilize several types of outer joins on them. 

Left Outer Join and Right Outer Join in LINQ

Outer joins are used to return matching data from two data sources plus all the remaining data from either source. In left outer join (aka left join) all matching data is returned and all the data from the first source, even if it doesn’t match. Similarly, in a right outer join (aka right join), all data from the second source is returned besides the matching data. In both cases, the non-matching data takes the null value or any default value that we provide.

In our case, the left join will contain the matching data from both lists and the non-matching data from the songs list. The right join will contain all the matching data and the remaining data from the authors list. 

LINQ only supports left outer joins. To perform right outer joins, we have to swap the two data sources. Now, let’s see how to implement the joins in code. 

Outer Join in LINQ With Query and Method Syntax

LINQ comes in two flavors. There is query syntax and method syntax. We’re going to use both to explore the outer joins.

Query Syntax

So, let’s start with query syntax. To perform an outer join, we first have to use a group join and then call the DefaultIfEmpty() method on it. The group join produces an inner join. The DefaultIfEmpty() method ensures all elements from the first list are included, even if they’re unmatched:

public static List<SongWithAuthor> PerformLeftJoinWithQuerySyntax(
    List<Song> songs, List<Author> authors)
{
    var results =
        from s in songs
        join a in authors
        on s.AuthorId equals a.Id
        into songsWithAuthors
        from sa in songsWithAuthors.DefaultIfEmpty()
        select new SongWithAuthor { Title = s.Title, AuthorName = sa == null ? "unknown" : sa.Name };

    return results.ToList();
}

Here we have a function that performs a left join using query syntax and returns the results:

Left outer join - query syntax

Title: Alabama Sundown, Author: Gwen Felice
Title: New Beginning, Author: Sue Sherrington
Title: To the Stars, Author: unknown
Title: First Time Ever, Author: Luke Palmer
Title: Nobody Listens, Author: unknown
Title: New Love, Author: Sue Sherrington

As expected, we have all the elements from the first data source songs and only the matching data from the second source. If any data is missing in the second source, the default value of “unknown” is returned.

To perform the right outer join, we have to swap the data sources:

public static List<SongWithAuthor> PerformRightJoinWithQuerySyntax(
    List<Song> songs, List<Author> authors)
{
    var results =
        from a in authors
        join s in songs
        on a.Id equals s.AuthorId
        into songsWithAuthors
        from sa in songsWithAuthors.DefaultIfEmpty()
        select new SongWithAuthor { Title = sa == null ? "-" : sa.Title, AuthorName = a.Name };

    return results.ToList();
}

This time, we get all the data from the second source authors and only the matching data from the first source:

Right outer join - query syntax

Title: New Beginning, Author: Sue Sherrington
Title: New Love, Author: Sue Sherrington
Title: First Time Ever, Author: Luke Palmer
Title: Alabama Sundown, Author: Gwen Felice
Title: -, Author: Brian Moore
Title: -, Author: Roy Cobbler

Any missing data in the first source is replaced by a hyphen. 

Method Syntax

With the method syntax, we have to use the GroupJoin() method. From there, we have to use the SelectMany() and DefaultIfEmpty() methods. Let’s start with the left outer join:

public static List<SongWithAuthor> PerformLeftJoinWithMethodSyntax(
    List<Song> songs, List<Author> authors)
{
    var results = songs
                  .GroupJoin(
                      authors,
                      song => song.AuthorId,
                      author => author.Id,
                      (song, author) => new { song, author }
                  )
                  .SelectMany(
                      left => left.author.DefaultIfEmpty(),
                      (song, author) => new SongWithAuthor
                      {
                          Title = song.song.Title,
                          AuthorName = author == null ? "unknown" : author.Name
                      }
                  );

    return results.ToList();
}

We use the GroupJoin() method on the first data source, which in our case is songs. The second data source (authors) is passed as the first argument to the method. Next, we define the outer key selector (song.AuthorId) and inner key selector (author.Id). These are the properties used for matching. Then we project the result using an anonymous type.

In the SelectMany() method, we call the DefaultIfEmpty() method to perform the left join. The final result consists of objects of the SongWithAuthor type. Alternatively, we could use an anonymous type. If we were using query syntax, we would end up with the same result.

For the sake of completeness, let’s re-create the right outer join using method syntax:

public static List<SongWithAuthor> PerformRightJoinWithMethodSyntax(
    List<Song> songs, List<Author> authors)
{
    var results = authors
                  .GroupJoin(
                      songs,
                      author => author.Id,
                      song => song.AuthorId,                          
                      (author, song) => new { author, song }
                  )
                  .SelectMany(
                      right => right.song.DefaultIfEmpty(),
                      (author, song) => new SongWithAuthor
                      {
                          Title = song == null ? "-" : song.Title,
                          AuthorName = author.author.Name,
                      }
                  );

    return results.ToList();
}

Just like the left outer join, we use the GroupJoin() method in conjunction with the SelectMany() method. The only difference is that we swap the data sources.

Explore Other Outer Joins

The left and right outer joins we just discussed are relatively common. But they are not the only ones. The other types include the full outer join, left excluding join, right excluding join, and full outer excluding join.

Full Outer Join

The full outer join returns all matching and non-matching data from both sources. Again, any missing data is filled with the default value. To perform a full outer join, we first perform a left join, then a right join, and finally we merge the results of the two.

For the Union() method to work correctly, we have to define a class that implements the IEqualityComparer interface:

internal class SongWithAuthorComparer : IEqualityComparer<SongWithAuthor>
{
    public bool Equals(SongWithAuthor? x, SongWithAuthor? y)
    {
        if (x?.AuthorName == y?.AuthorName && x?.Title == y?.Title)
        {
            return true;
        }

        return false;
    }

    public int GetHashCode([DisallowNull] SongWithAuthor obj)
    {
        return obj.AuthorName.GetHashCode();
    }
}

Here, we specify what it means for two SongWithAuthor objects to be equal. We consider them equal if their AuthorName and Title properties are identical.

Now, we can use our SongWithAuthorComparer along with the Union() method:

public static List<SongWithAuthor> PerformFullOuterJoin(
    List<Song> songs, List<Author> authors)
{
    var resultsLeft = PerformLeftJoinWithQuerySyntax(songs, authors);
    var resultsRight = PerformRightJoinWithQuerySyntax(songs, authors);

    var results = resultsLeft.Union(resultsRight, new SongWithAuthorComparer());

    return results.ToList();
}

Here we make use of the PerformLeftJoinWithQuerySyntax() and PerformRightJoinWithQuerySyntax() methods, which we previously defined. Also, we could substitute these methods for the ones that use the LINQ method syntax.

Then, we call the Union() method on our resultsLeft collection, passing in our resultsRight collection and an instance of our SongWithAuthorComparer class, so the Union() method understands what we deem to be equal.

Now the union will return the results from the left and right outer joins, excluding duplicates:

Full outer join

Title: Alabama Sundown, Author: Gwen Felice
Title: New Beginning, Author: Sue Sherrington
Title: To the Stars, Author: unknown
Title: First Time Ever, Author: Luke Palmer
Title: Nobody Listens, Author: unknown
Title: New Love, Author: Sue Sherrington
Title: -, Author: Brian Moore
Title: -, Author: Roy Cobbler

The Excluding Joins

An excluding join returns non-matching data. We can implement it as the difference between two collections. In LINQ we use the Except() method to this end. 

In each of the three excluding joins, we’ll be calculating the difference between an outer join and the inner join. To do that, we need a method that returns the inner join:

public static List<SongWithAuthor> PerformInnerJoin(
    List<Song> songs, List<Author> authors)
{
    var results =
        from s in songs
        join a in authors
        on s.AuthorId equals a.Id
        select new SongWithAuthor { Title = s.Title, AuthorName = a.Name };

    return results.ToList();
}

And now let’s have a look at each excluding join one by one.

Left Excluding Join

The left excluding join returns the non-matching data from the first source. We implement it as the difference between the left join and the inner join:

public static List<SongWithAuthor> PerformLeftExcludingJoin(
    List<Song> songs, List<Author> authors)
{
    var resultsLeft = PerformLeftJoinWithQuerySyntax(songs, authors);
    var resultsInner = PerformInnerJoin(songs, authors);

    var results = resultsLeft.Except(resultsInner, new SongWithAuthorComparer());

    return results.ToList();
}

Here we’re again making use of methods that we defined before. As our result, we get only the data that is in the first source, but not in the other:

Left excluding join

Title: To the Stars, Author: unknown
Title: Nobody Listens, Author: unknown

Right Excluding Join

The right excluding join returns the non-matching data from the second source. This time, what we need is the difference between the right join and the inner join:

public static List<SongWithAuthor> PerformRightExcludingJoin(
    List<Song> songs, List<Author> authors)
{
    var resultsRight = PerformRightJoinWithQuerySyntax(songs, authors);
    var resultsInner = PerformInnerJoin(songs, authors);

    var results = resultsRight.Except(resultsInner, new SongWithAuthorComparer());

    return results.ToList();
}

This method returns the elements that are in the second data source, but not in the first one:

Right excluding join

Title: -, Author: Brian Moore
Title: -, Author: Roy Cobbler

Full Outer Excluding Join With LINQ

The full outer excluding join returns the non-matching data from both sources. We can visualize it as the difference between the full outer join and the inner join:

public static List<SongWithAuthor> PerformFullOuterExcludingJoin(
    List<Song> songs, List<Author> authors)
{
    var resultsFull = PerformFullOuterJoin(songs, authors);
    var resultsInner = PerformInnerJoin(songs, authors);

    var results = resultsFull.Except(resultsInner, new SongWithAuthorComparer());

    return results.ToList();
}

This join returns all the elements that are only in one data source, but not in the other:

Full outer excluding join

Title: To the Stars, Author: unknown
Title: Nobody Listens, Author: unknown
Title: -, Author: Brian Moore
Title: -, Author: Roy Cobbler

Conclusion

In this article, we’ve covered the most common ways to create an outer join in LINQ and some less common ones. We now know how to get the same results using both query and method syntax. It’s worth mentioning that the solutions that we demonstrated are not the only ones. There are also other ways of performing outer joins in LINQ, but the ones we used are pretty simple and clear.

Liked it? Take a second to support Code Maze on Patreon and get the ad free reading experience!
Become a patron at Patreon!