LINQ is an extremely useful library with many applications. These applications are not all utilized or understood equally. In this article, we are going to take a look at some advanced LINQ capabilities to perform grouping, joining, partitioning, and even converting object types.
Let’s begin.
Grouping As Part of Advanced LINQ Functionalities
Grouping is when we have a data set and we group elements of that set by defined criteria.
Group
First, let’s begin by defining an Employee class. We can use the class properties to define criteria by which to group the data:
public class Employee
{
    public int EmployeeId { get; set; } = 0;
    public string Name { get; set; } = string.Empty;
    public string Department { get; set; } = string.Empty;
    public double Salary { get; set; } = 0.0;
}
We can use LINQ to perform grouping operations on our data set. The GroupBy() method reorganizes a collection into groups whose elements share a common key, and it returns an IEnumerable<IGrouping<TKey, TElement>>. For example, we can group a collection of Employee objects by the Department they work in:
var employees = new List<Employee>()
{
    new() { EmployeeId = 1, Name = "Alvin Johnston", Department = "Sales", Salary = 55000.00 },
    new() { EmployeeId = 2, Name = "Jessica Cuevas", Department = "Engineering", Salary = 65000.00 },
};

employees.GroupBy(x => x.Department);
You can check our source code to see the full list of employees.
In this case, TKey is a string because that is the type of Department. TElement is of type Employee, as that is the object type that we are grouping.
If we print the resulting grouped data we will see:
Department - Engineering
-------------------------
Jessica Cuevas
Justin Vilches
Ashley Montoya

Department - IT
-------------------------
Joey Delgado
Silvio Mora

Department - Administration
-------------------------
Arjen Robben
Mohammad Salah

Department - Customer Service
-------------------------
Nasir Jones

Department - Sales
-------------------------
Alvin Johnston
Grace Silver
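The printing code itself is not shown here, so the following is only a minimal sketch of how such output could be produced. It assumes value tuples as stand-ins for the Employee class and a shortened employee list, and it relies on each IGrouping exposing its Key and being enumerable:

```csharp
using System;
using System.Linq;

// Value tuples stand in for the article's Employee class (an assumption
// made to keep this sketch self-contained).
var employees = new[]
{
    (Name: "Alvin Johnston", Department: "Sales"),
    (Name: "Jessica Cuevas", Department: "Engineering"),
    (Name: "Grace Silver", Department: "Sales"),
};

// Each IGrouping exposes the shared key and can be enumerated for its members.
var groups = employees.GroupBy(e => e.Department).ToList();

foreach (var departmentGroup in groups)
{
    Console.WriteLine($"Department - {departmentGroup.Key}");
    Console.WriteLine("-------------------------");

    foreach (var employee in departmentGroup)
    {
        Console.WriteLine(employee.Name);
    }
}
```

GroupBy() yields groups in the order in which each key first appears in the source, so in this sketch the Sales group is printed before Engineering.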
Grouping is a useful advanced LINQ process because it allows us to operate over a single data set. Say, for instance, we need to process payroll for all employees. Employees’ salaries are different per department. Instead of writing multiple queries against our database, we can query for all employees and then group them by the department. We can now operate over each group independently without having to query the database any further.
Group With Composite Keys
Next, let’s look at grouping with multiple keys. That is, we can group data by more than one criterion, rather than by a single criterion as we did before with Department:
var employeeDepartmentGroups = employees.GroupBy(x => new { x.Department, x.Salary });
In this example, we are grouping employees by Department and Salary. We do this by creating an anonymous object in the lambda provided to the GroupBy() method.
Printing this data out we will see grouped results:
Department, Salary - { Department = Sales, Salary = 55000 }
-----------------------------------------------
Alvin Johnston

Department, Salary - { Department = Engineering, Salary = 65000 }
------------------------------------------------------------------
Jessica Cuevas

Department, Salary - { Department = Sales, Salary = 75000 }
------------------------------------------------------------------
Grace Silver

Department, Salary - { Department = Engineering, Salary = 85000 }
------------------------------------------------------------------
Justin Vilches
Ashley Montoya

Department, Salary - { Department = IT, Salary = 85000 }
------------------------------------------------------------------
Joey Delgado
Silvio Mora

Department, Salary - { Department = Administration, Salary = 105000 }
------------------------------------------------------------------
Arjen Robben

Department, Salary - { Department = Administration, Salary = 115000 }
------------------------------------------------------------------
Mohammad Salah

Department, Salary - { Department = Customer Service, Salary = 45000 }
------------------------------------------------------------------
Nasir Jones
We can see, for example, that there are two people in the group where the employees are in the engineering department and are making an $85,000 salary.
Object properties are not the only criteria by which we can perform grouping. We can group data by our custom-defined keys:
var employeeDepartmentGroups = employees.GroupBy(employee =>
{
    var salaryLevel = employee.Salary < 50000 ? "Entry-Level"
        : employee.Salary >= 50000 && employee.Salary <= 85000 ? "Mid-Level"
        : "Senior-Level";

    return salaryLevel;
});
In this example, we create three salary bands on which to group employee salaries: Entry-Level, Mid-Level, and Senior-Level. We do this by creating an anonymous function in which we look at employee salaries and then return a string of which salary level the employee belongs to.
Now, let’s inspect the result once we print out these groups:
Salary - Entry-Level
-----------------------------------------------
Nasir Jones

Salary - Mid-Level
-----------------------------------------------
Alvin Johnston
Jessica Cuevas
Grace Silver
Justin Vilches
Joey Delgado
Ashley Montoya
Silvio Mora

Salary - Senior-Level
-----------------------------------------------
Arjen Robben
Mohammad Salah
Lastly, we can group data by criteria and then return a custom final object using LINQ. This final object can represent a desired transformation of the group data. To put this into action, let’s look at an example where we sum the salary of each employee per department:
var employeeDepartmentAggregateReport = employees
    .GroupBy(x => x.Department)
    .Select(group => new
    {
        DepartmentName = group.First().Department,
        TotalDepartmentSalaryCosts = group.Sum(empl => empl.Salary)
    });
In this LINQ expression, we transform our list of all employee data into a custom object meant to serve as a salary aggregate report for each department. GroupBy() groups our employee list by department. Select() returns a new anonymous object with the department name and salary total.
After we print the department salary aggregate data, we can find our results:
Department Salary Report
Sales - Total Salary - 130000
-----------------------------------------------
Engineering - Total Salary - 235000
-----------------------------------------------
IT - Total Salary - 170000
-----------------------------------------------
Administration - Total Salary - 220000
-----------------------------------------------
Customer Service - Total Salary - 45000
There are many reasons to use LINQ to produce robust code. GroupBy() is a powerful tool in our belt to transform data meaningfully.
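As a variation on the salary report, the department name can also be read from the group's Key property instead of from the first element, and GroupBy() has an overload that takes a result selector and so folds the projection into a single call. The sketch below assumes value tuples as stand-ins for the Employee class and a shortened employee list:

```csharp
using System;
using System.Linq;

// Value tuples stand in for the article's Employee class (assumption).
var employees = new[]
{
    (Name: "Alvin Johnston", Department: "Sales", Salary: 55000.0),
    (Name: "Grace Silver", Department: "Sales", Salary: 75000.0),
    (Name: "Nasir Jones", Department: "Customer Service", Salary: 45000.0),
};

// g.Key holds the department, so it does not need to be read from g.First().
var viaSelect = employees
    .GroupBy(e => e.Department)
    .Select(g => new
    {
        DepartmentName = g.Key,
        TotalDepartmentSalaryCosts = g.Sum(e => e.Salary)
    })
    .ToList();

// Equivalent: the GroupBy overload whose last argument receives (key, members).
var viaResultSelector = employees
    .GroupBy(
        e => e.Department,
        (department, members) => new
        {
            DepartmentName = department,
            TotalDepartmentSalaryCosts = members.Sum(e => e.Salary)
        })
    .ToList();

foreach (var row in viaSelect)
{
    Console.WriteLine($"{row.DepartmentName} - Total Salary - {row.TotalDepartmentSalaryCosts}");
}
```

Both forms produce the same rows; which one reads better is largely a matter of taste.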
Use Advanced LINQ Methods to Convert Data Types
Often we want to transform an object from one type to another. We can use a few LINQ methods to do this easily on a single object or collection of objects as well.
For the examples, we will use the Director and Administrator classes. These classes inherit from the Employee class:
public class Director : Employee
{
    public string Permissions { get; set; } = string.Empty;
    public bool AbleToHire { get; set; } = false;
}

public class Administrator : Employee
{
    public bool AbleToFire { get; set; } = false;
}
Also, for this section, we are going to use a list of Employee objects named MixedEmployees for our examples:
var MixedEmployees = new List<Employee>()
{
    new Director() { AbleToHire = true, Permissions = "READ_WRITE_CREATE_DELETE", EmployeeId = 1, Name = "Rodrigo Suarez", Department = "Leadership", Salary = 175000.00 },
    new Employee() { EmployeeId = 2, Name = "Jessica Cuevas", Department = "Engineering", Salary = 65000.00 },
    ...
};
You can check the source code to see the full list of MixedEmployees.
OfType<T>
We can use OfType<T>() to pull out all instances of T from MixedEmployees and return them in an IEnumerable<T>. This is an incredibly useful advanced LINQ function, as we can take advantage of objects that inherit from a base class or implement a common interface:
var directors = MixedEmployees.OfType<Director>();
After executing this code, directors is an IEnumerable<Director> with one element. In this case, it is the director Rodrigo Suarez.
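One detail worth seeing in action is that OfType<T>() matches on assignability, so instances of classes derived from T are returned as well. The sketch below uses built-in exception types as a stand-in hierarchy (an assumption made to keep it self-contained), where ArgumentNullException derives from ArgumentException:

```csharp
using System;
using System.Linq;

// Built-in exception types stand in for the Employee/Director hierarchy
// (assumption): ArgumentNullException derives from ArgumentException.
Exception[] errors =
{
    new ArgumentException("bad argument"),
    new InvalidOperationException("bad state"),
    new ArgumentNullException("someParameter"),
};

// OfType<T>() keeps every element that is a T, including subclasses of T,
// and returns them typed as T.
var argumentErrors = errors.OfType<ArgumentException>().ToList();

foreach (var error in argumentErrors)
{
    Console.WriteLine(error.GetType().Name);
}
```

Applied to the article's classes, this means OfType<Employee>() on MixedEmployees would return every element, directors and administrators included.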
ConvertAll
First, let’s say that this method is not a LINQ method: it is defined on List<T>, which lives in the System.Collections.Generic namespace, but it is used a lot alongside LINQ operations. This method converts all elements of a List<> to another type:
var admins = MixedEmployees.OfType<Administrator>();
admins.ToList().ConvertAll(d => JsonSerializer.Serialize(d));
Here, we use OfType() to get an IEnumerable<Administrator> as we did in the previous example. Next, we transform the enumerable into a List. We do this because ConvertAll() can only be used on a List. Finally, we pass a lambda to transform elements of the collection into another type. In this case, we transform Administrator objects into strings using JsonSerializer.Serialize().
As you can see, we have to call ToList() first because ConvertAll() works only with a List. We can also use the Select() method to project the result, but it returns an IEnumerable, so if we want a List as a result, we still have to call ToList() at the end.
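To make the comparison concrete, here is a minimal sketch that performs the same conversion both ways. It assumes a plain list of salary values rather than Administrator objects, and a simple ToString() conversion in place of JSON serialization:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A plain list of salaries stands in for the Administrator objects (assumption).
var salaries = new List<double> { 55000.0, 65000.0 };

// List<T>.ConvertAll returns a new List<TOutput> directly.
List<string> viaConvertAll = salaries.ConvertAll(s => s.ToString("F0"));

// Select works on any IEnumerable<T>; ToList() materializes the result.
List<string> viaSelect = salaries.Select(s => s.ToString("F0")).ToList();

Console.WriteLine(string.Join(", ", viaConvertAll));
Console.WriteLine(string.Join(", ", viaSelect));
```

Select() is the more general choice because it does not require the source to be a List in the first place.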
AsQueryable
Lastly, let’s discuss IQueryable. IQueryable is an interface that is commonly used to build queries over collections. Moreover, ORMs such as Entity Framework use IQueryable to interface with databases. Given the prevalence of databases and cloud database services, IQueryable is an important topic to understand. With IQueryable, we can build complex queries with multiple projections or filters before a query is executed. We can convert collections to IQueryable using the AsQueryable() method:
var queryableEmployees = employeeList.AsQueryable();
Now, we can use queryableEmployees to build a query over our collection of employees. IQueryable objects can be used in conjunction with an ORM library to interface with our project’s data storage solution. AsQueryable() can be called on arrays, List, Dictionary, Lookup, Stack, Hashtable, and most other collections that implement IEnumerable.
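As a rough illustration of building a query in several steps before it runs, the sketch below composes filters on an in-memory IQueryable. The anonymous objects and the filter values are assumptions chosen for the example, not part of the article's source code:

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the article's Employee class (assumption).
var employees = new[]
{
    new { Name = "Alvin Johnston", Department = "Sales", Salary = 55000.0 },
    new { Name = "Jessica Cuevas", Department = "Engineering", Salary = 65000.0 },
    new { Name = "Justin Vilches", Department = "Engineering", Salary = 85000.0 },
};

// Build the query step by step; nothing is evaluated yet.
var query = employees.AsQueryable()
    .Where(e => e.Department == "Engineering");

// Further conditions can be appended later, for example based on user input.
query = query.Where(e => e.Salary > 70000);

var names = query.OrderBy(e => e.Name).Select(e => e.Name);

// Enumerating the query (here via ToList) executes the composed pipeline.
var result = names.ToList();

Console.WriteLine(string.Join(", ", result));
```

With an ORM, the same composition style lets the provider translate the whole pipeline into a single database query; here it simply runs in memory.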
Joining
Joining is similar to grouping in that both allow us to combine data by defined criteria. The difference is that joining allows us to do this with two different collections, as opposed to a single collection. In this section, we will cover three variations of joining: inner join, group join, and left outer join.
Inner Join
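Inner join is the basic variation of a joining operation. Let’s define a second collection to which we can join the employees collection we previously defined. Note that the Director objects here also carry a DepartmentResponsibleFor property, which does not appear in the Director class shown earlier: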
var directors = new List<Director>()
{
    new Director() { AbleToHire = true, Permissions = "READ_WRITE_CREATE_DELETE", EmployeeId = 100, Name = "Nikola Jokic", Department = "Leadership", Salary = 175000.00, DepartmentResponsibleFor = "Engineering" },
    new Director() { AbleToHire = true, Permissions = "READ_WRITE_CREATE_DELETE", EmployeeId = 101, Name = "Petr Cech", Department = "Leadership", Salary = 175000.00, DepartmentResponsibleFor = "IT" },
    ...
};
Now, let’s join the directors collection to the employees collection:
var join = employees.Join(
    directors,
    em => em.Department,
    dir => dir.DepartmentResponsibleFor,
    (em, dir) => new
    {
        EmployeeName = em.Name,
        DirectorName = dir.Name,
        Department = em.Department
    });
We call Join() on the employees collection. The first parameter is the second collection we want to join with. In this case, we are joining with directors.
The next parameter is a lambda expression where we select a property from the Employee class to use as the join key. Similarly, the next parameter does the same for the Director class.
The last parameter is a lambda that creates a new anonymous object containing the information from an employee and a director whose keys, Department and DepartmentResponsibleFor, match.
Let’s take a look at the resulting data:
Department: Engineering
------------------------
Employee: Jessica Cuevas
Director: Nikola Jokic

Department: Engineering
------------------------
Employee: Justin Vilches
Director: Nikola Jokic

Department: Engineering
------------------------
Employee: Ashley Montoya
Director: Nikola Jokic

Department: IT
------------------------
Employee: Joey Delgado
Director: Petr Cech

Department: IT
------------------------
Employee: Silvio Mora
Director: Petr Cech
The result is a matching of a director and an employee from the department the director is responsible for.
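The same inner join can also be written in C# query syntax, which some readers find closer to SQL. The sketch below assumes value tuples as stand-ins for the Employee and Director classes and trimmed-down lists:

```csharp
using System;
using System.Linq;

// Value tuples stand in for the article's Employee and Director classes (assumption).
var employees = new[]
{
    (Name: "Jessica Cuevas", Department: "Engineering"),
    (Name: "Joey Delgado", Department: "IT"),
    (Name: "Alvin Johnston", Department: "Sales"),
};

var directors = new[]
{
    (Name: "Nikola Jokic", DepartmentResponsibleFor: "Engineering"),
    (Name: "Petr Cech", DepartmentResponsibleFor: "IT"),
};

// Query-syntax form of employees.Join(directors, ...).
var joined =
    (from em in employees
     join dir in directors on em.Department equals dir.DepartmentResponsibleFor
     select new { EmployeeName = em.Name, DirectorName = dir.Name, em.Department })
    .ToList();

foreach (var row in joined)
{
    Console.WriteLine($"{row.Department}: {row.EmployeeName} reports to {row.DirectorName}");
}
```

Because this is an inner join, the Sales employee in this sketch is dropped from the result: no director in the second collection has a matching key.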
Group Join
Group Join is very similar to an inner join, but the difference is that the resulting data is grouped by elements of the collection the method was called on. In the case of our previous examples, the resulting joined data is a single Director element joined with all the employees they are responsible for.
First, let’s examine code that performs a group join on our collections:
var groupJoin = directors.GroupJoin(
    employees,
    dir => dir.DepartmentResponsibleFor,
    em => em.Department,
    (dir, emGroup) => new
    {
        dir.Name,
        EmployeeGroup = emGroup
    });
In this example, we are doing a group join of the directors collection with the employees collection. This results in a collection of objects that are composed of the director and all the employees in the department they are responsible for. The first three parameters are the same as for Join(). The last parameter is a lambda that takes a director and a group of employees.
We organize this data in a new anonymous object. Printing the result gives:
Department: Engineering -- Director: Nikola Jokic
---------------------------------------------
Employee: Jessica Cuevas
Employee: Justin Vilches
Employee: Ashley Montoya

Department: IT -- Director: Petr Cech
---------------------------------------------
Employee: Joey Delgado
Employee: Silvio Mora

Department: R&D -- Director: Carl Friedrich Gauss
---------------------------------------------
The output here shows each director grouped with all the employees in their respective departments.
Left Outer Join
A left outer join is a group join where every element of the outer (left) collection is represented in the final result, even when there are no matching elements in the inner collection. Here, directors is the outer collection and employees is the inner one. If we look at the directors collection, we can see we have a director who is responsible for the R&D department. We have no employees in the R&D department. Nevertheless, with a left outer join, we will still see the R&D director in our final result.
With LINQ, we can perform a left outer join by supplying a default inner object when there are no matching objects in the inner collection. To supply this default object, we will use DefaultIfEmpty():
var groupJoin = directors.GroupJoin(
    employees,
    dir => dir.DepartmentResponsibleFor,
    em => em.Department,
    (dir, emGroup) => new
    {
        dir.Name,
        EmployeeGroup = emGroup.DefaultIfEmpty(new() { Name = "No Name" })
    });
We return a new Employee object with the Name “No Name” when emGroup is empty. emGroup will only be empty if there are no employees that match the department for which the director is responsible.
Let’s examine the output of this method:
Department: Sales -- Director: Rodrigo Suarez
---------------------------------------------
Employee: Alvin Johnston
Employee: Grace Silver

Department: Engineering -- Director: Nikola Jokic
---------------------------------------------
Employee: Jessica Cuevas
Employee: Justin Vilches
Employee: Ashley Montoya

Department: IT -- Director: Petr Cech
---------------------------------------------
Employee: Joey Delgado
Employee: Silvio Mora

Department: R&D -- Director: Carl Friedrich Gauss
---------------------------------------------
Employee: No Name
As we can see, without any employees in the R&D department there is a grouping for the director of the department but with the “No Name” default employee.
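When a flat result with one row per director and employee pair is more convenient than nested groups, a common variation is to follow GroupJoin() with SelectMany() and DefaultIfEmpty(). The sketch below is one way to do that, assuming value tuples as stand-ins for the Director and Employee classes and shortened lists:

```csharp
using System;
using System.Linq;

// Value tuples stand in for the article's Director and Employee classes (assumption).
var directors = new[]
{
    (Name: "Nikola Jokic", DepartmentResponsibleFor: "Engineering"),
    (Name: "Carl Friedrich Gauss", DepartmentResponsibleFor: "R&D"),
};

var employees = new[]
{
    (Name: "Jessica Cuevas", Department: "Engineering"),
    (Name: "Justin Vilches", Department: "Engineering"),
};

var rows = directors
    .GroupJoin(
        employees,
        dir => dir.DepartmentResponsibleFor,
        em => em.Department,
        (dir, emGroup) => new { dir, emGroup })
    // Flatten each group; an empty group contributes one placeholder row.
    .SelectMany(
        x => x.emGroup.Select(em => em.Name).DefaultIfEmpty("No Name"),
        (x, employeeName) => new { Director = x.dir.Name, Employee = employeeName })
    .ToList();

foreach (var row in rows)
{
    Console.WriteLine($"Director: {row.Director} -- Employee: {row.Employee}");
}
```

In this sketch, the R&D director still appears once, paired with the "No Name" placeholder, while the Engineering director appears once per matching employee.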
Generating Sequences
LINQ also allows us to easily generate sequences of objects. We can use the Enumerable.Range() and Enumerable.Repeat() methods to generate sequences as an IEnumerable.
Repeat(T element, int count) can be used to create an IEnumerable<T> that contains element repeated count times.
Range(int start, int count) generates an IEnumerable<int> of count sequential integers, beginning at start.
Repeat() is a flexible method, as it is generic and will accept any type we want to use and generate a sequence from it.
Let’s take a look at an example:
Enumerable.Repeat(new Employee(), 100);
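Here we generate a sequence of 100 elements. Note that new Employee() is evaluated only once, so all 100 elements refer to the same Employee instance. Repeat() can be used creatively to avoid writing a loop to do the same task or to easily generate mock data in unit tests.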
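One point to keep in mind with reference types is that Repeat() repeats the value it is given, so for a class instance every element is the same object. The sketch below contrasts that with Range() plus Select(), which runs the lambda once per element; it uses StringBuilder as a stand-in reference type (an assumption, in place of Employee):

```csharp
using System;
using System.Linq;
using System.Text;

// StringBuilder stands in for a reference type such as Employee (assumption).

// The argument is evaluated once, so every slot holds the same reference.
var repeated = Enumerable.Repeat(new StringBuilder(), 3).ToList();
bool sameInstance = object.ReferenceEquals(repeated[0], repeated[1]);

// Range + Select runs the lambda per element, creating distinct instances.
var distinct = Enumerable.Range(0, 3).Select(_ => new StringBuilder()).ToList();
bool differentInstances = !object.ReferenceEquals(distinct[0], distinct[1]);

Console.WriteLine($"Repeat shares one instance: {sameInstance}");
Console.WriteLine($"Range + Select creates new instances: {differentInstances}");
```

For mock data that will be mutated per element, the Range() plus Select() form is usually the safer choice.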
Similarly, Range() can be used creatively to generate sequences, even in cases where the values in the sequence do not matter. Range() combined with projection and filtering can produce even more complex sequences:
Enumerable.Range(5, 5); // 5, 6, 7, 8, 9
Enumerable.Range(1, 10).Select(n => n * n); // 1, 4, 9, 16, 25 ... 100
Enumerable.Range(1, 100).Where(n => n % 2 == 0); // 2, 4, 6, 8 ... 100
Enumerable.Range(1, 100).Select(_ => PerformSomeAction());
In the first example, we have a simple use of Range() where we generate five integers starting at 5, that is, 5 through 9. The second argument is the number of elements, not the upper bound.
Next, we generate the squares of the integers 1 to 10. This example shows that we can apply an operation to each element of a sequence using Select().
In the third example, we filter the sequence by only elements that are even integers.
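Lastly, we use Select() and a discard _ to call PerformSomeAction() 100 times. Using _ indicates that we do not care about the value of the element in the sequence, just that we want to call a method once for each element in the collection. Because LINQ queries are evaluated lazily, the calls happen only when the resulting sequence is enumerated, for example with ToList() or a foreach loop.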
In essence, this is a great, short way to repeat an action several times.
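One caveat with this pattern is deferred execution: Select() only describes the work, and the method is invoked when the sequence is enumerated. The sketch below makes that visible with a counter; PerformSomeAction() here is a hypothetical stand-in, since its body is not shown in the article:

```csharp
using System;
using System.Linq;

int calls = 0;

// Hypothetical stand-in for PerformSomeAction(): it just counts invocations.
int PerformSomeAction() => ++calls;

// Building the query does not invoke the method yet.
var pending = Enumerable.Range(1, 100).Select(_ => PerformSomeAction());
int callsBeforeEnumeration = calls;

// Enumerating the sequence (here via ToList) triggers the calls.
var results = pending.ToList();
int callsAfterEnumeration = calls;

Console.WriteLine($"Before enumeration: {callsBeforeEnumeration}");
Console.WriteLine($"After enumeration: {callsAfterEnumeration}");
```

Enumerating pending a second time would invoke the method another 100 times, so it is usually best to materialize the result once if the action has side effects.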
Partitioning Operations
There are a few methods that allow us to partition collections sequentially in different ways. We will use a list of numbers for our examples:
List<int> ints = new List<int>() { 2, 7, 2, 4, 5, 8, 9, 6, 1, 8, 9, 7 };
First, let’s discuss Skip() and SkipWhile(). Both of these methods skip over the collection’s elements sequentially. Skip(int x) skips the first x elements and returns the rest of the IEnumerable we are operating over. SkipWhile() accepts a lambda and skips elements until the condition is false, then returns that element and everything after it.
Let’s consider the example of using both methods:
ints.Skip(7); // 6, 1, 8, 9, 7
ints.SkipWhile(i => i < 9); // 9, 6, 1, 8, 9, 7
The first example skips the first 7 elements in ints and then returns the rest of the sequence. The next example skips all elements until an element doesn’t satisfy the condition i < 9. This results in a collection of all elements starting at the first 9 (index 6) and onward.
On the other hand, we have Take() and TakeWhile(). These methods return the collection’s elements sequentially. Take(int count) returns the first count elements. TakeWhile() accepts a lambda and returns elements from the start of the sequence until the condition is false. Here are examples of these methods in action:
ints.Take(5); // 2, 7, 2, 4, 5
ints.TakeWhile(i => i < 9); // 2, 7, 2, 4, 5, 8
First, Take() gets the first 5 elements in ints and returns them as an IEnumerable<int>. Second, TakeWhile() gets all elements until an element doesn’t satisfy the condition i < 9. The result is a collection of elements starting at element 2 (index 0) through element 8 (index 5).
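Skip() and Take() are often combined to page through a collection. The following is a minimal sketch of that idea using the same list of numbers; the pageSize and pageIndex names are illustrative assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ints = new List<int> { 2, 7, 2, 4, 5, 8, 9, 6, 1, 8, 9, 7 };

int pageSize = 5;
int pageIndex = 1; // zero-based, so this selects the second page

// Skip the earlier pages, then take one page worth of elements.
var page = ints.Skip(pageIndex * pageSize).Take(pageSize).ToList();

Console.WriteLine(string.Join(", ", page)); // 8, 9, 6, 1, 8
```

If fewer than pageSize elements remain, Take() simply returns what is left, so the last page needs no special handling.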
Conclusion
Overall, these are a few topics on some advanced uses for LINQ. Learning the fundamentals of these methods and applying them creatively can change the quality and efficiency of our code. They can make interfacing with a database much easier or even replace long sections of code with a few lines of LINQ code.