In this article, we are going to take a look at the HashSet class in C#. We will discuss how to use it and its key features. Also, we are going to see some examples of how to use it in practice. Finally, we’ll compare it to other data structures available in C#.
Without further ado, let’s start!
What Is a HashSet in C#?
A HashSet is a collection of unique elements that uses a hash table for storage. Lookups, insertions, and removals therefore run in constant time on average, which makes membership checks faster than in list-based collections that have to scan their elements. However, a HashSet does not guarantee any particular element order and does not support access by index.
The HashSet<T> class has been available since .NET Framework 3.5 as part of the System.Collections.Generic namespace and implements these interfaces:
public class HashSet<T> :
    System.Collections.Generic.ICollection<T>,
    System.Collections.Generic.IEnumerable<T>,
    System.Collections.Generic.IReadOnlyCollection<T>,
    System.Collections.Generic.ISet<T>,
    System.Runtime.Serialization.IDeserializationCallback,
    System.Runtime.Serialization.ISerializable
How to Create a HashSet in C#
For most of our examples, we are going to use a HashSet that contains the names of some popular programming languages.
Let’s start by creating an empty HashSet:
var languages = new HashSet<string>();
Add Items to a HashSet
Let’s begin by adding a set of programming languages to our HashSet<string> object by taking advantage of the Add() method:
public HashSet<string> ProgrammingLanguages()
{
    var languages = new HashSet<string>();

    languages.Add("C");
    languages.Add("C++");
    languages.Add("C#");
    languages.Add("Java");
    languages.Add("Scala");
    languages.Add("TypeScript");
    languages.Add("Python");
    languages.Add("JavaScript");
    languages.Add("Rust");

    return languages;
}
Here, we insert elements into the languages HashSet by invoking the Add() method.
In some cases, we may want to initialize HashSets directly with values:
var languages = new HashSet<string> { "C", "C++", "C#", "Java" };
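Besides the collection initializer, the HashSet<T> constructors also accept an existing sequence and, optionally, an equality comparer. The following is a minimal sketch rather than part of the article's sample project; the variable names source, fromList, and ignoreCase are assumptions made for illustration:

```csharp
using System;
using System.Collections.Generic;

// Build a set from an existing sequence; duplicates in the source collapse.
var source = new List<string> { "C", "C++", "C#", "C" };
var fromList = new HashSet<string>(source);
Console.WriteLine(fromList.Count); // 3

// Pass a comparer so that strings differing only by case count as equal.
var ignoreCase = new HashSet<string>(source, StringComparer.OrdinalIgnoreCase);
Console.WriteLine(ignoreCase.Contains("c#")); // True
```

Choosing the comparer at construction time matters because it cannot be swapped later without building a new set.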
For all our tests, we are going to reuse the same class object instance to make our examples as easy to follow as possible:
HashSetsInCSharpMethods hashSet = new HashSetsInCSharpMethods();
Let’s also initialize our HashSet in the test constructor to avoid repetitive code:
private readonly HashSet<string> _languages;

public HashSetInCSharpUnitTests()
{
    _languages = hashSet.ProgrammingLanguages();
}
Next, we can verify that our HashSet has nine elements and check whether it contains one of the elements (“C#”):
Assert.IsInstanceOfType(_languages, typeof(HashSet<string>));
Assert.AreEqual(9, _languages.Count());
Assert.IsTrue(_languages.Contains("C#"));
We can successfully prove that _languages is of type HashSet<string>. We use the Count() extension method from LINQ to check the number of elements in our HashSet; the Count property on HashSet<T> returns the same value.
Also, we use the inbuilt Contains() method to check if a HashSet has a specific element. The method takes the element as a parameter and returns a boolean value indicating whether or not the element is present in the set.
This technique can be useful for quickly checking if an element exists in the set without having to iterate through all of the elements.
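To make the two checks concrete, here is a small standalone sketch (the set contents and variable names are illustrative assumptions). It shows the Count property next to the LINQ Count() method, and that Contains() is case-sensitive when we use the default string comparer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var languages = new HashSet<string> { "C", "C++", "C#" };

// Count is a property on HashSet<T>; Count() is the LINQ extension method.
Console.WriteLine(languages.Count);   // 3
Console.WriteLine(languages.Count()); // 3

// With the default comparer, Contains() distinguishes upper and lower case.
bool hasCSharp = languages.Contains("C#");
bool hasLowercase = languages.Contains("c#");
Console.WriteLine(hasCSharp);    // True
Console.WriteLine(hasLowercase); // False
```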
Duplicate Elements in a HashSet
HashSets do not allow duplicate elements. If we try to add an element that is already present, the set remains unchanged and the Add() method returns false. The underlying hash table ensures that each element appears at most once in the set.
Let’s attempt to add duplicate elements to the _languages HashSet:
_languages.Add("C");
_languages.Add("C++");
_languages.Add("C#");

Assert.IsInstanceOfType(_languages, typeof(HashSet<string>));
Assert.AreEqual(9, _languages.Count());
When we try to add duplicate values, our _languages HashSet doesn’t get modified, as the set already contains those values. Therefore, its count remains nine instead of twelve.
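The return value of Add() tells us whether the set actually changed. The snippet below is a minimal sketch with illustrative names (addedRust, addedC) that are not part of the article's test project:

```csharp
using System;
using System.Collections.Generic;

var languages = new HashSet<string> { "C", "C++", "C#" };

// Add() returns true when the element is new and false when it already exists.
bool addedRust = languages.Add("Rust");
bool addedC = languages.Add("C");

Console.WriteLine(addedRust);       // True
Console.WriteLine(addedC);          // False
Console.WriteLine(languages.Count); // 4
```

This makes it possible to detect duplicates while inserting, without a separate Contains() call.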
Retrieve HashSet Elements
Besides using the inbuilt Contains() method to check whether a HashSet contains a specific value, we can use the inbuilt TryGetValue(T equalValue, out T actualValue) method to achieve the same result. The method takes two parameters: the first is the value to search for, and the second is an out parameter that receives the matching element stored in the set, or the default value of T when the search doesn’t yield any results.
Let’s put this theory into practice:
Assert.IsTrue(_languages.TryGetValue("C#", out _));
Assert.IsTrue(_languages.Contains("C#"));
Assert.IsFalse(_languages.TryGetValue("Assembly", out _));
Assert.IsFalse(_languages.Contains("Assembly"));
Here, _languages contains “C#” but does not contain “Assembly”. We use a discard (_) to ignore the out parameter of the TryGetValue() method, since we only care about its boolean return value.
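TryGetValue() becomes more useful than Contains() when the stored element can differ from the value we search with, for instance under a case-insensitive comparer. The following sketch assumes such a comparer; the names stored and missing are illustrative:

```csharp
using System;
using System.Collections.Generic;

var languages = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Java", "Rust" };

// The lookup value differs in case, but the out parameter receives the stored element.
bool foundJava = languages.TryGetValue("JAVA", out var stored);
Console.WriteLine(foundJava); // True
Console.WriteLine(stored);    // Java

// When nothing matches, the method returns false and the out parameter is the default (null here).
bool foundAssembly = languages.TryGetValue("Assembly", out var missing);
Console.WriteLine(foundAssembly);   // False
Console.WriteLine(missing is null); // True
```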
Remove Elements From a HashSet
To remove an item from the HashSet, we can use the Remove() method. Like the Add() method, it takes the element as a parameter. It removes that element from the HashSet and returns true if the element was found:
public HashSet<string> RemoveElement(HashSet<string> hashSet, string value)
{
    hashSet.Remove(value);

    return hashSet;
}
Here, the RemoveElement() method takes a HashSet<string> and a string as parameters and uses the Remove() method to remove an element before returning the updated HashSet.
Next, we can verify that the RemoveElement() method successfully removes elements:
var elementToRemove = "Java";
var updatedLanguages = hashSet.RemoveElement(_languages, elementToRemove);

Assert.IsFalse(updatedLanguages.Contains(elementToRemove));
Assert.AreEqual(8, _languages.Count());
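We invoke the RemoveElement() method and pass our HashSet and a string (“Java”). We can prove that the element is removed, as the updated HashSet does not contain that value and its count decreases by one. Note that updatedLanguages and _languages refer to the same HashSet instance, because Remove() modifies the set in place.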
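Like Add(), the Remove() method reports through its return value whether it changed the set. Here is a short sketch with illustrative variable names:

```csharp
using System;
using System.Collections.Generic;

var languages = new HashSet<string> { "C#", "Java" };

// Remove() returns true only when the element was present.
bool removedJava = languages.Remove("Java");
bool removedAgain = languages.Remove("Java");

Console.WriteLine(removedJava);     // True
Console.WriteLine(removedAgain);    // False
Console.WriteLine(languages.Count); // 1
```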
RemoveWhere() Method
Besides using the inbuilt Remove() method, we can use the RemoveWhere() method, which takes a predicate as a parameter and removes every element for which the predicate returns true. To illustrate this concept, let’s implement a HashSet that stores unique random numbers:
public HashSet<int> RandomInts(int size)
{
    var rand = new Random();
    var numbers = new HashSet<int>();

    for (int i = 0; i < size; i++)
    {
        numbers.Add(rand.Next());
    }

    return numbers;
}
Our RandomInts() method takes the number of integers to generate as its sole input. It uses the inbuilt Random class to generate random numbers and inserts each value into a HashSet before returning it. Because a HashSet discards duplicates, the result could in principle hold fewer than size elements, although a collision is very unlikely across the full range of Next().
Next, let’s implement a method that returns true when an integer is odd. We are going to use this method as our predicate when we eventually implement the RemoveWhereElement() method:
public bool IsOdd(int num)
{
    // Compare against 0 so that negative odd numbers, for which % yields -1, also count as odd.
    return num % 2 != 0;
}
Finally, let’s implement our RemoveWhereElement() method by passing the predicate function IsOdd() as a parameter:
public HashSet<int> RemoveWhereElement(HashSet<int> hashSet)
{
    hashSet.RemoveWhere(IsOdd);

    return hashSet;
}
We can verify that the RemoveWhereElement() method removes all the odd numbers in the HashSet:
var numbers = hashSet.RandomInts(100);
var oddNumbers = new HashSet<int>();

foreach (var item in numbers)
{
    if (hashSet.IsOdd(item))
    {
        oddNumbers.Add(item);
    }
}

hashSet.RemoveWhereElement(numbers);

var testValue = oddNumbers.First();
var checkValue = hashSet.IsOdd(testValue);

Assert.IsTrue(checkValue);
Assert.IsFalse(oddNumbers.IsSubsetOf(numbers));
Assert.AreEqual(100, numbers.Union(oddNumbers).Count());
Here, we create a HashSet to store the odd numbers from the random numbers we generate. Next, we invoke the RemoveWhereElement() method to remove all odd numbers from numbers, which initially contains every generated value. Then, we check that the first element in the oddNumbers HashSet is an odd number. Finally, we assert that oddNumbers is not a subset of numbers, and verify that their union still adds up to 100 elements.
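RemoveWhere() also accepts a lambda expression and returns the number of elements it removed. The sketch below uses a small fixed set, chosen here for illustration, so that the outcome is deterministic. It tests oddness with n % 2 != 0 because the % operator in C# yields -1 for negative odd numbers:

```csharp
using System;
using System.Collections.Generic;

var numbers = new HashSet<int> { -3, -2, 1, 2, 3, 4 };

// The predicate matches -3, 1 and 3; RemoveWhere() returns how many elements it removed.
int removed = numbers.RemoveWhere(n => n % 2 != 0);

Console.WriteLine(removed);                               // 3
Console.WriteLine(numbers.SetEquals(new[] { -2, 2, 4 })); // True
```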
Remove Elements From a HashSet Through the Clear() Method
What if we want to remove all the elements in a HashSet? We can make use of the inbuilt Clear() method.
Let’s verify that the Clear() method removes all elements from the _languages HashSet:
_languages.Clear();

Assert.AreEqual(0, _languages.Count());
Assert.IsNull(_languages.FirstOrDefault());
After invoking the Clear() method, we can prove that we removed all the elements from the _languages HashSet, as it has a count of zero and FirstOrDefault() returns null.
Iterate Through a HashSet in C#
To iterate through a HashSet, we typically use a foreach loop. Because a HashSet has no indexer, an index-based for loop does not apply directly, although a while loop over the set’s enumerator works as well:
public List<int> CreateList(HashSet<int> hashSet)
{
    var list = new List<int>();

    foreach (var item in hashSet)
    {
        list.Add(item);
    }

    return list;
}
Here, the CreateList() method takes a HashSet<int> object as its sole parameter, adds all of its elements to a new list, and returns that list.
Alternatively, we can simply call the ToList() extension method from LINQ to convert the HashSet into a list:
var list = hashSet.ToList();
Let’s verify that CreateList() successfully returns a populated List<int> object:
var numbers = hashSet.RandomInts(100);
var numbersList = hashSet.CreateList(numbers);

CollectionAssert.AllItemsAreInstancesOfType(numbersList, typeof(int));
Assert.AreEqual(numbers.Count(), numbersList.Count());
We must remember that a HashSet does not guarantee any particular order, so we should not rely on the order in which we iterate through its elements.
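When we need a predictable order, one option is to sort the elements at the point of use. The following sketch (names and values are illustrative assumptions) iterates the set directly and then produces a sorted list with LINQ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var languages = new HashSet<string> { "Rust", "C", "Java" };

// Direct iteration visits every element once, in an order we should not depend on.
foreach (var language in languages)
{
    Console.WriteLine(language);
}

// Sorting with an explicit comparer gives a deterministic order.
var sorted = languages.OrderBy(l => l, StringComparer.Ordinal).ToList();
Console.WriteLine(string.Join(", ", sorted)); // C, Java, Rust
```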
HashSet Set Operations Methods in C#
Let’s understand some methods we can use for set operations as we work with HashSets.
IsProperSubsetOf/IsProperSupersetOf
When we want to check whether a HashSet instance is a proper subset of another HashSet instance, that is, a subset that is not equal to it, we use the IsProperSubsetOf() method. Likewise, we can use the IsProperSupersetOf() method to determine if a HashSet is a proper superset of another HashSet:
var moreLanguages = new HashSet<string>
{
    "C", "C++", "C#", "Java", "Scala", "TypeScript",
    "Python", "JavaScript", "Rust", "Assembly", "Pascal"
};

Assert.IsTrue(_languages.IsSubsetOf(moreLanguages));
Assert.IsTrue(_languages.IsProperSubsetOf(moreLanguages));
Assert.IsTrue(moreLanguages.IsSupersetOf(_languages));
Assert.IsTrue(moreLanguages.IsProperSupersetOf(_languages));
We create a larger set, moreLanguages, that contains all the elements in the _languages set plus two more. Therefore, _languages is a proper subset of moreLanguages, and the latter is a proper superset of the former.
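The difference between a subset and a proper subset only shows up when the two sets are equal. This minimal sketch uses small integer sets, chosen for illustration, to show it:

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3 };
var same = new HashSet<int> { 3, 2, 1 };
var bigger = new HashSet<int> { 1, 2, 3, 4 };

// Equal sets are subsets of each other, but never proper subsets.
Console.WriteLine(a.IsSubsetOf(same));           // True
Console.WriteLine(a.IsProperSubsetOf(same));     // False
Console.WriteLine(a.SetEquals(same));            // True

// A proper subset requires the other set to hold at least one extra element.
Console.WriteLine(a.IsProperSubsetOf(bigger));   // True
Console.WriteLine(bigger.IsProperSupersetOf(a)); // True
```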
UnionWith
When we want to join two sets, we perform a union operation. For example, when we perform a union between two sets, A and B, by calling UnionWith() on A, we copy the elements of set B that A does not already contain into set A. Set B remains unchanged.
Let’s perform a UnionWith() operation between the _languages and moreLanguages HashSets to illustrate this concept:
var moreLanguages = new HashSet<string> { "Assembly", "Pascal", "HTML", "CSS", "PHP" };

_languages.UnionWith(moreLanguages);

Assert.AreEqual(14, _languages.Count());
The UnionWith() method copies the elements in the moreLanguages HashSet into the _languages HashSet. Hence, the latter now has fourteen elements instead of nine.
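Since UnionWith() modifies the set we call it on, we may want to work on a copy when the original has to stay intact. Here is a sketch of that approach, using the copy constructor and illustrative names (first, second, combined):

```csharp
using System;
using System.Collections.Generic;

var first = new HashSet<string> { "C", "C#" };
var second = new HashSet<string> { "C#", "Rust" };

// Copy first, then merge second into the copy; the shared element "C#" is stored once.
var combined = new HashSet<string>(first);
combined.UnionWith(second);

Console.WriteLine(combined.Count); // 3
Console.WriteLine(first.Count);    // 2
Console.WriteLine(second.Count);   // 2
```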
IntersectWith
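An intersection between sets A and B entails finding their common elements. To accomplish such an operation in C#, we use the inbuilt IntersectWith() method, which modifies the set we call it on so that it keeps only the elements that also appear in the other collection.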
Let’s understand how to perform an intersection operation with an example:
var moreLanguages = new HashSet<string>
{
    "C", "C++", "C#", "Java", "Scala", "Assembly", "Pascal", "HTML", "CSS", "PHP"
};

_languages.IntersectWith(moreLanguages);

Assert.AreEqual(5, _languages.Count());
Assert.IsTrue(_languages.Contains("C"));
Assert.IsTrue(_languages.Contains("C++"));
Assert.IsTrue(_languages.Contains("C#"));
Assert.IsTrue(_languages.Contains("Java"));
Assert.IsTrue(_languages.Contains("Scala"));
Assert.IsFalse(_languages.Contains("Assembly"));
Here, the IntersectWith() method modifies _languages so that it retains only the elements that are common to both the _languages and moreLanguages sets.
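If we only need to know whether two sets share any element, the Overlaps() method answers that without modifying either set. The sketch below combines it with an intersection on a copy; the set names and contents are illustrative assumptions:

```csharp
using System;
using System.Collections.Generic;

var backend = new HashSet<string> { "C#", "Java", "Python" };
var scripting = new HashSet<string> { "Python", "JavaScript" };

// Overlaps() only reports whether at least one common element exists.
bool overlaps = backend.Overlaps(scripting);
Console.WriteLine(overlaps); // True

// Intersect on a copy so that backend keeps all of its elements.
var common = new HashSet<string>(backend);
common.IntersectWith(scripting);

Console.WriteLine(string.Join(", ", common)); // Python
Console.WriteLine(backend.Count);             // 3
```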
ExceptWith
This method performs a set difference between two sets. If we perform a set difference between sets A and B by calling ExceptWith() on A, the operation removes from A every element that is also present in B, leaving only the elements in A that are not in B.
Let’s understand this concept with another example:
var moreLanguages = new HashSet<string>
{
    "C", "C++", "C#", "Java", "Scala", "Assembly", "Pascal", "HTML", "CSS", "PHP"
};

_languages.ExceptWith(moreLanguages);

Assert.AreEqual(4, _languages.Count());
Assert.IsTrue(_languages.Contains("TypeScript"));
Assert.IsTrue(_languages.Contains("Python"));
Assert.IsTrue(_languages.Contains("JavaScript"));
Assert.IsTrue(_languages.Contains("Rust"));
Assert.IsFalse(_languages.Contains("Assembly"));
The ExceptWith() method modifies _languages in place and leaves only the elements that are in _languages but not in moreLanguages, which are: “TypeScript”, “Python”, “JavaScript” and “Rust”.
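Set difference is not symmetric, so the result depends on which set we call ExceptWith() on. This small sketch, with illustrative integer sets, computes the difference in both directions on copies:

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3, 4 };
var b = new HashSet<int> { 3, 4, 5 };

// Elements of a that are not in b.
var onlyInA = new HashSet<int>(a);
onlyInA.ExceptWith(b);

// Elements of b that are not in a.
var onlyInB = new HashSet<int>(b);
onlyInB.ExceptWith(a);

Console.WriteLine(onlyInA.SetEquals(new[] { 1, 2 })); // True
Console.WriteLine(onlyInB.SetEquals(new[] { 5 }));    // True
```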
SymmetricExceptWith
Sometimes, we may want to modify a HashSet so that it stores only the elements that appear in exactly one of two sets. That’s where the SymmetricExceptWith() method comes into play.
Let’s look at using the SymmetricExceptWith() method:
var moreLanguages = new HashSet<string> { "Assembly", "Pascal", "HTML", "CSS", "PHP" };

_languages.SymmetricExceptWith(moreLanguages);

Assert.AreEqual(14, _languages.Count());
The SymmetricExceptWith() method modifies the _languages HashSet so that it contains the elements that are in either itself or the moreLanguages HashSet, but not in both. Since the two sets have no elements in common here, the modified _languages HashSet now contains all fourteen elements.
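The effect is easier to see when the two sets do share an element. In the sketch below, which uses illustrative integer sets, the shared value 3 disappears from the result:

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3 };
var b = new HashSet<int> { 3, 4 };

// Keeps elements found in exactly one of the two sets: 3 is dropped, 4 is added.
a.SymmetricExceptWith(b);

Console.WriteLine(a.SetEquals(new[] { 1, 2, 4 })); // True
Console.WriteLine(a.Contains(3));                  // False
```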
Benefits of a HashSet in C#
First, since HashSets use hash tables, they facilitate quick insertion, removal, and lookup operations, each of which takes O(1) time on average.
Also, we can use HashSets in applications that do not allow duplicate elements, which helps us eliminate data redundancy.
Finally, HashSets can be useful for quickly checking if an element is present in the set without having to iterate through all of the elements.
Drawbacks of a HashSet in C#
One drawback of using a HashSet is that it does not guarantee the order of its elements, so we should not rely on the order in which we iterate over them.
Also, HashSets are not suitable for situations where we need to store duplicate elements.
Finally, the HashSet data structure lacks some operations that lists and dictionaries offer, such as access by index, in-place sorting, or key-based lookup of associated values.
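To put these trade-offs side by side, the following sketch loads the same input into a List<T>, a HashSet<T>, and a SortedSet<T>. The input values and variable names are assumptions made for illustration:

```csharp
using System;
using System.Collections.Generic;

var input = new[] { "Rust", "C", "Rust", "Java" };

// A list keeps duplicates and insertion order, and supports access by index.
var list = new List<string>(input);
Console.WriteLine(list.Count); // 4
Console.WriteLine(list[0]);    // Rust

// A HashSet drops the duplicate but offers no indexer and no guaranteed order.
var set = new HashSet<string>(input);
Console.WriteLine(set.Count); // 3

// A SortedSet also drops duplicates and keeps its elements sorted.
var sorted = new SortedSet<string>(input, StringComparer.Ordinal);
Console.WriteLine(string.Join(", ", sorted)); // C, Java, Rust
```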
Conclusion
In this article, we learned that a HashSet can be a useful solution in C# when we need to insert and retrieve elements as fast as possible. We can also use HashSets when we want to store unique elements.
However, it is important to consider our specific needs before deciding on a solution as it may not be suitable for all situations. Additionally, we have to keep in mind that a HashSet does not maintain the order of its elements and does not allow for accessing elements by their indices.