In this article, we’ll discuss how to create an outer join in LINQ. Most used are left and right outer join, but we’ll also talk about less common ones.
It’s hard to find an application that doesn’t rely on data. We often use just a single data source. But sometimes we need data from more than one source, be it a database, a file, or a C# collection, to mention just a few. This is where joins come in handy. There are a couple of types of joins and we will take a closer look at them.
So, let’s dive in.
Use of Outer Join in LINQ
In the most common scenario, we need data from one source that is somehow related to data in another source. We can use inner joins when we need to combine matching data coming from different sources.
But when do we use outer joins?
A typical scenario is when we have to handle missing data. We could, for example, set a default value for any missing piece of data in either source.
Another use case is data analysis. We can use outer joins to compare data in two data sources or to find any discrepancies between them.
Finally, we use outer joins to merge data. This is the case when we need to preserve all the records from one or both of the sources while combining the data.
To demonstrate how outer joins work and how to use them in LINQ, let’s start by creating some sample data sources.
Data Setup for Outer Joins
Let’s create a Song
class and an Author
class. To keep things simple, we’ll only add a few properties to each:
public class Song { public int Id { get; set; } public string Title { get; set; } = string.Empty; public int AuthorId { get; set; } } public class Author { public int Id { get; set; } public string Name { get; set; } = string.Empty; }
Each song has an author, hence the AuthorId
as a foreign key in the class Song
. Next, let’s create a list of songs and a list of authors:
public List<Song> Songs = [ new Song() { Id = 1, Title = "Alabama Sundown", AuthorId = 3}, new Song() { Id = 2, Title = "New Beginning", AuthorId = 1}, new Song() { Id = 3, Title = "To the Stars", AuthorId = 5}, new Song() { Id = 4, Title = "First Time Ever", AuthorId = 2}, new Song() { Id = 5, Title = "Nobody Listens", AuthorId = 4}, new Song() { Id = 6, Title = "New Love", AuthorId = 1}, ]; public List<Author> Authors = [ new Author() { Id = 1, Name = "Sue Sherrington"}, new Author() { Id = 2, Name = "Luke Palmer" }, new Author() { Id = 3, Name = "Gwen Felice" }, new Author() { Id = 6, Name = "Brian Moore" }, new Author() { Id = 7, Name = "Roy Cobbler" } ];
Our two lists represent the two data sources that we’ll be using in our examples. We can see that there are both matching and non-matching data. Some songs have the AuthorId
property set to a value that doesn’t exist in the author list. Also, we have authors with no songs in the first list.
We can use anonymous types in our LINQ query. We’re going to define a type named SongWithAuthor
:
public class SongWithAuthor { public string Title { get; set; } = string.Empty; public string AuthorName { get; set; } = string.Empty; }
This way, our query will return us a list of SongWithAuthor
objects.
Now that our data is prepared, we can utilize several types of outer joins on them.
Left Outer Join and Right Outer Join in LINQ
Outer joins are used to return matching data from two data sources plus all the remaining data from either source. In left outer join (aka left join) all matching data is returned and all the data from the first source, even if it doesn’t match. Similarly, in a right outer join (aka right join), all data from the second source is returned besides the matching data. In both cases, the non-matching data takes the null value or any default value that we provide.
In our case, the left join will contain the matching data from both lists and the non-matching data from the songs list. The right join will contain all the matching data and the remaining data from the authors list.
LINQ only supports left outer joins. To perform right outer joins, we have to swap the two data sources. Now, let’s see how to implement the joins in code.
Outer Join in LINQ With Query and Method Syntax
LINQ comes in two flavors. There is query syntax and method syntax. We’re going to use both to explore the outer joins.
Query Syntax
So, let’s start with query syntax. To perform an outer join, we first have to use a group join and then call the DefaultIfEmpty()
method on it. The group join produces an inner join. The DefaultIfEmpty()
method ensures all elements from the first list are included, even if they’re unmatched:
public static List<SongWithAuthor> PerformLeftJoinWithQuerySyntax( List<Song> songs, List<Author> authors) { var results = from s in songs join a in authors on s.AuthorId equals a.Id into songsWithAuthors from sa in songsWithAuthors.DefaultIfEmpty() select new SongWithAuthor { Title = s.Title, AuthorName = sa == null ? "unknown" : sa.Name }; return results.ToList(); }
Here we have a function that performs a left join using query syntax and returns the results:
Left outer join - query syntax Title: Alabama Sundown, Author: Gwen Felice Title: New Beginning, Author: Sue Sherrington Title: To the Stars, Author: unknown Title: First Time Ever, Author: Luke Palmer Title: Nobody Listens, Author: unknown Title: New Love, Author: Sue Sherrington
As expected, we have all the elements from the first data source songs
and only the matching data from the second source. If any data is missing in the second source, the default value of “unknown” is returned.
To perform the right outer join, we have to swap the data sources:
public static List<SongWithAuthor> PerformRightJoinWithQuerySyntax( List<Song> songs, List<Author> authors) { var results = from a in authors join s in songs on a.Id equals s.AuthorId into songsWithAuthors from sa in songsWithAuthors.DefaultIfEmpty() select new SongWithAuthor { Title = sa == null ? "-" : sa.Title, AuthorName = a.Name }; return results.ToList(); }
This time, we get all the data from the second source authors
and only the matching data from the first source:
Right outer join - query syntax Title: New Beginning, Author: Sue Sherrington Title: New Love, Author: Sue Sherrington Title: First Time Ever, Author: Luke Palmer Title: Alabama Sundown, Author: Gwen Felice Title: -, Author: Brian Moore Title: -, Author: Roy Cobbler
Any missing data in the first source is replaced by a hyphen.
Method Syntax
With the method syntax, we have to use the GroupJoin()
method. From there, we have to use the SelectMany()
and DefaultIfEmpty()
methods. Let’s start with the left outer join:
public static List<SongWithAuthor> PerformLeftJoinWithMethodSyntax( List<Song> songs, List<Author> authors) { var results = songs .GroupJoin( authors, song => song.AuthorId, author => author.Id, (song, author) => new { song, author } ) .SelectMany( left => left.author.DefaultIfEmpty(), (song, author) => new SongWithAuthor { Title = song.song.Title, AuthorName = author == null ? "unknown" : author.Name } ); return results.ToList(); }
We use the GroupJoin()
method on the first data source, which in our case is songs
. The second data source (authors)
is passed as the first argument to the method. Next, we define the outer key selector (song.AuthorId
) and inner key selector (author.Id
). These are the properties used for matching. Then we project the result using an anonymous type.
In the SelectMany()
method, we call the DefaultIfEmpty()
method to perform the left join. The final result consists of objects of the SongWithAuthor
type. Alternatively, we could use an anonymous type. If we were using query syntax, we would end up with the same result.
For the sake of completeness, let’s re-create the right outer join using method syntax:
public static List<SongWithAuthor> PerformRightJoinWithMethodSyntax( List<Song> songs, List<Author> authors) { var results = authors .GroupJoin( songs, author => author.Id, song => song.AuthorId, (author, song) => new { author, song } ) .SelectMany( right => right.song.DefaultIfEmpty(), (author, song) => new SongWithAuthor { Title = song == null ? "-" : song.Title, AuthorName = author.author.Name, } ); return results.ToList(); }
Just like the left outer join, we use the GroupJoin()
method in conjunction with the SelectMany()
method. The only difference is that we swap the data sources.
Explore Other Outer Joins
The left and right outer joins we just discussed are relatively common. But they are not the only ones. The other types include the full outer join, left excluding join, right excluding join, and full outer excluding join.
Full Outer Join
The full outer join returns all matching and non-matching data from both sources. Again, any missing data is filled with the default value. To perform a full outer join, we first perform a left join, then a right join, and finally we merge the results of the two.
For the Union()
method to work correctly, we have to define a class that implements the IEqualityComparer
interface:
internal class SongWithAuthorComparer : IEqualityComparer<SongWithAuthor> { public bool Equals(SongWithAuthor? x, SongWithAuthor? y) { if (x?.AuthorName == y?.AuthorName && x?.Title == y?.Title) { return true; } return false; } public int GetHashCode([DisallowNull] SongWithAuthor obj) { return obj.AuthorName.GetHashCode(); } }
Here, we specify what it means for two SongWithAuthor
objects to be equal. We consider them equal if their AuthorName
and Title
properties are identical.
Now, we can use our SongWithAuthorComparer
along with the Union()
method:
public static List<SongWithAuthor> PerformFullOuterJoin( List<Song> songs, List<Author> authors) { var resultsLeft = PerformLeftJoinWithQuerySyntax(songs, authors); var resultsRight = PerformRightJoinWithQuerySyntax(songs, authors); var results = resultsLeft.Union(resultsRight, new SongWithAuthorComparer()); return results.ToList(); }
Here we make use of the PerformLeftJoinWithQuerySyntax()
and PerformRightJoinWithQuerySyntax()
methods, which we previously defined. Also, we could substitute these methods for the ones that use the LINQ method syntax.
Then, we call the Union()
method on our resultsLeft
collection, passing in our resultsRight
collection and an instance of our SongWithAuthorComparer
class, so the Union()
method understands what we deem to be equal.
Now the union will return the results from the left and right outer joins, excluding duplicates:
Full outer join Title: Alabama Sundown, Author: Gwen Felice Title: New Beginning, Author: Sue Sherrington Title: To the Stars, Author: unknown Title: First Time Ever, Author: Luke Palmer Title: Nobody Listens, Author: unknown Title: New Love, Author: Sue Sherrington Title: -, Author: Brian Moore Title: -, Author: Roy Cobbler
The Excluding Joins
An excluding join returns non-matching data. We can implement it as the difference between two collections. In LINQ we use the Except()
method to this end.
In each of the three excluding joins, we’ll be calculating the difference between an outer join and the inner join. To do that, we need a method that returns the inner join:
public static List<SongWithAuthor> PerformInnerJoin( List<Song> songs, List<Author> authors) { var results = from s in songs join a in authors on s.AuthorId equals a.Id select new SongWithAuthor { Title = s.Title, AuthorName = a.Name }; return results.ToList(); }
And now let’s have a look at each excluding join one by one.
Left Excluding Join
The left excluding join returns the non-matching data from the first source. We implement it as the difference between the left join and the inner join:
public static List<SongWithAuthor> PerformLeftExcludingJoin( List<Song> songs, List<Author> authors) { var resultsLeft = PerformLeftJoinWithQuerySyntax(songs, authors); var resultsInner = PerformInnerJoin(songs, authors); var results = resultsLeft.Except(resultsInner, new SongWithAuthorComparer()); return results.ToList(); }
Here we’re again making use of methods that we defined before. As our result, we get only the data that is in the first source, but not in the other:
Left excluding join Title: To the Stars, Author: unknown Title: Nobody Listens, Author: unknown
Right Excluding Join
The right excluding join returns the non-matching data from the second source. This time, what we need is the difference between the right join and the inner join:
public static List<SongWithAuthor> PerformRightExcludingJoin( List<Song> songs, List<Author> authors) { var resultsRight = PerformRightJoinWithQuerySyntax(songs, authors); var resultsInner = PerformInnerJoin(songs, authors); var results = resultsRight.Except(resultsInner, new SongWithAuthorComparer()); return results.ToList(); }
This method returns the elements that are in the second data source, but not in the first one:
Right excluding join Title: -, Author: Brian Moore Title: -, Author: Roy Cobbler
Full Outer Excluding Join With LINQ
The full outer excluding join returns the non-matching data from both sources. We can visualize it as the difference between the full outer join and the inner join:
public static List<SongWithAuthor> PerformFullOuterExcludingJoin( List<Song> songs, List<Author> authors) { var resultsFull = PerformFullOuterJoin(songs, authors); var resultsInner = PerformInnerJoin(songs, authors); var results = resultsFull.Except(resultsInner, new SongWithAuthorComparer()); return results.ToList(); }
This join returns all the elements that are only in one data source, but not in the other:
Full outer excluding join Title: To the Stars, Author: unknown Title: Nobody Listens, Author: unknown Title: -, Author: Brian Moore Title: -, Author: Roy Cobbler
Conclusion
In this article, we’ve covered the most common ways to create an outer join in LINQ and some less common ones. We now know how to get the same results using both query and method syntax. It’s worth mentioning that the solutions that we demonstrated are not the only ones. There are also other ways of performing outer joins in LINQ, but the ones we used are pretty simple and clear.