In this article, we explore techniques for performing a case-insensitive substring search in C#: the String.Contains() method, the String.IndexOf() method, regular expressions, and LINQ in conjunction with String.Equals(). We then compare these methods using benchmarks to better understand their efficiency.
Let’s take a deeper dive into these concepts now!
Case-Insensitive Substring Search With String.Contains() Method
The String.Contains() method offers several ways to search for a substring. It has two overloads that accept string parameters: Contains(String) and Contains(String, StringComparison). In this section, we discuss how to use these overloads to search for substrings effectively.
Use StringComparison.OrdinalIgnoreCase
The StringComparison parameter accepts one of the six StringComparison enumeration values. In this discussion, we’ll focus on two of them: Ordinal and OrdinalIgnoreCase.
The Ordinal value compares strings using ordinal (binary) sort rules, examining the numeric value of each Unicode character. Because the comparison works on that underlying representation, it treats lowercase and uppercase letters as distinct, which makes it case-sensitive. Keep in mind that when we call Contains() without a StringComparison argument, it uses StringComparison.Ordinal by default.
In contrast, OrdinalIgnoreCase, as its name implies, treats lowercase and uppercase letters as equivalent during comparison, which makes the search case-insensitive. We use this value throughout the article.
Let’s take a look at an example that showcases the concept in action:
var sourceString = "Code Maze";
var substringToSearch = "maze";
sourceString.Contains(substringToSearch, StringComparison.OrdinalIgnoreCase); // true
Because we pass StringComparison.OrdinalIgnoreCase as an argument to the Contains() method, it ignores case. Consequently, the method returns true, as the "maze" substring is present in the sourceString variable.
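To make the contrast between the two comparison values concrete, here is a minimal sketch that runs the same search three ways. The variable names are illustrative only, and the sketch assumes a runtime that provides the Contains(String, StringComparison) overload (.NET Core 2.1 or later).

```csharp
using System;

var sourceString = "Code Maze";
var substringToSearch = "maze";

// Contains(String) performs an ordinal, case-sensitive comparison.
bool caseSensitive = sourceString.Contains(substringToSearch);

// Passing StringComparison.Ordinal explicitly behaves the same way.
bool ordinal = sourceString.Contains(substringToSearch, StringComparison.Ordinal);

// OrdinalIgnoreCase treats 'M' and 'm' as equal.
bool ignoreCase = sourceString.Contains(substringToSearch, StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"{caseSensitive} {ordinal} {ignoreCase}"); // False False True
```

Only the last call finds the substring, because the source string contains "Maze" with an uppercase M.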
Use String.ToUpperInvariant() Method
Another simple way to do a case-insensitive search is to convert both the original string and the substring to uppercase. We use the String.ToUpperInvariant() method because it converts a string to uppercase using the invariant culture, so the result is consistent across cultures and locales.
It is advisable not to use the String.ToLowerInvariant() method for this, because a small group of characters can’t make a round trip when converted to lowercase. Take a look at normalizing strings to uppercase to learn more about round trips.
Let’s look at this in action:
var sourceString = "Code Maze";
var substringToSearch = "maze";
sourceString.ToUpperInvariant().Contains(substringToSearch.ToUpperInvariant()); // true
First, we convert the values of the sourceString and substringToSearch variables to uppercase. Then the String.Contains() method checks whether the uppercase sourceString contains the uppercase substringToSearch. The method returns true when it finds the substring.
Case-Insensitive Substring Search Using String.IndexOf() Method
The String.IndexOf() method returns the zero-based index of the first occurrence of a specified Unicode character or string within the instance. If the character or string is absent, the method returns -1.
The IndexOf() method offers six overloads that take a string as an input parameter. Here, we focus on the IndexOf(String, StringComparison) overload for a case-insensitive substring search.
Let’s proceed with an illustrative example:
var sourceString = "Code Maze";
var substringToSearch = "maze";
sourceString.IndexOf(substringToSearch, StringComparison.OrdinalIgnoreCase); // 5
We pass the StringComparison.OrdinalIgnoreCase value to ignore case. As a result, the IndexOf() method returns 5, which is the starting index of the substring in the source string. Any non-negative return value, including 0, means the substring was found.
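The return-value rules are easy to get wrong when the match sits at the very start of the string. The sketch below, with an illustrative input string of our own, shows a match at index 0, a missing substring, and one possible way to collect every occurrence with the IndexOf(String, Int32, StringComparison) overload.

```csharp
using System;
using System.Collections.Generic;

var sourceString = "Maze code, Code Maze";

// A match at the very start of the string yields index 0,
// so "found" must be tested with >= 0 rather than > 0.
int first = sourceString.IndexOf("maze", StringComparison.OrdinalIgnoreCase);
bool found = first >= 0;

// A missing substring yields -1.
int missing = sourceString.IndexOf("blog", StringComparison.OrdinalIgnoreCase);

// Collect every occurrence by restarting the search just past the previous hit.
var positions = new List<int>();
int index = 0;
while ((index = sourceString.IndexOf("maze", index, StringComparison.OrdinalIgnoreCase)) >= 0)
{
    positions.Add(index);
    index += 1; // step forward by one so overlapping matches are not skipped
}

Console.WriteLine($"{first} {found} {missing} [{string.Join(", ", positions)}]");
// 0 True -1 [0, 16]
```

Checking for `>= 0` instead of `> 0` is what keeps the first match from being reported as "not found".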
Case-Insensitive Substring Search With Regular Expressions
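Using the Regex.IsMatch() method, we determine whether a particular pattern exists within a specified input string. This method provides several overloads, and we use the IsMatch(String, String, RegexOptions) overload to locate the substring here. Because the second argument is interpreted as a regular expression pattern, a substring that contains special characters such as . or + should be escaped with Regex.Escape() first.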
The RegexOptions enumeration defines 11 values that we can pass to this parameter. Some of the ones we frequently come across include IgnoreCase, CultureInvariant, and IgnorePatternWhitespace. It’s worth noting that the overloads without a RegexOptions parameter behave as if we passed None.
Let’s look at this in action:
var sourceString = "Code Maze";
var substringToSearch = "maze";
Regex.IsMatch(sourceString, substringToSearch, RegexOptions.IgnoreCase); // true
Here, we pass the RegexOptions.IgnoreCase argument to ignore case. The Regex.IsMatch() method returns true, as the substring has been found.
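The example works because "maze" contains no regular expression metacharacters. When the search term comes from user input or may contain characters such as + or ., one common precaution is to pass it through Regex.Escape() before using it as a pattern. The following is a minimal sketch of that step; the input strings are our own illustration, not part of the benchmarked code.

```csharp
using System;
using System.Text.RegularExpressions;

var sourceString = "Built with C++ and C#";
var substringToSearch = "c++";

// Regex.Escape turns "c++" into the pattern "c\+\+" so the plus signs
// are matched literally instead of being parsed as quantifiers.
var pattern = Regex.Escape(substringToSearch);

bool found = Regex.IsMatch(sourceString, pattern, RegexOptions.IgnoreCase);

Console.WriteLine(found); // True
```

With the escaped pattern, the regular expression search behaves like a plain case-insensitive substring search.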
Case-Insensitive Substring Search Using LINQ and String.Equals() Method
Next, let’s learn how to search for substrings using LINQ, paired with the String.Equals() method.
When it comes to comparing strings while controlling case sensitivity, we have two overloads of the Equals() method at our disposal: the instance method Equals(String, StringComparison) and the static method Equals(String, String, StringComparison).
Let’s look at an example that utilizes the first overload:
var sourceString = "Code Maze";
var substringToSearch = "maze";
var separator = ' ';
sourceString
    .Split(separator)
    .Any(word => word.Equals(substringToSearch, StringComparison.OrdinalIgnoreCase)); // true
First, we use the Split() method to divide the sourceString into a string array of substrings delimited by the space character that we pass as an argument. Then, we use the Any() method to determine whether any string in the array satisfies the Equals() condition. Because Equals() compares whole strings, this approach finds the search term only when it matches an entire space-delimited word.
We provide the StringComparison.OrdinalIgnoreCase value as an argument to the Equals() method to disregard case. If a string in the array matches the substringToSearch, Equals() returns true and, in turn, the Any() method returns true.
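Since this technique compares whole tokens, punctuation and partial words change the outcome. The sketch below uses an illustrative sentence of our own to show both cases, along with one possible workaround that trims a few punctuation characters from each token; the set of trimmed characters is an assumption and would need adjusting for real text.

```csharp
using System;
using System.Linq;

var sourceString = "The lazy dog barks loudly. The brown fox runs away";

// Splitting on a space leaves punctuation attached: "loudly." is one token.
bool wholeWord = sourceString
    .Split(' ')
    .Any(word => word.Equals("loudly", StringComparison.OrdinalIgnoreCase));

// A partial word is never equal to a whole token either.
bool partialWord = sourceString
    .Split(' ')
    .Any(word => word.Equals("bark", StringComparison.OrdinalIgnoreCase));

// Trimming punctuation from each token recovers the whole-word match.
bool trimmedWord = sourceString
    .Split(' ')
    .Any(word => word.Trim('.', ',', '!', '?')
        .Equals("loudly", StringComparison.OrdinalIgnoreCase));

Console.WriteLine($"{wholeWord} {partialWord} {trimmedWord}"); // False False True
```

For a true substring search, where "bark" should be found inside "barks", the Contains() or IndexOf() approaches are the better fit.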
Performance Analysis of Case-Insensitive Substring Searches in C#
Now that we’ve learned different techniques for performing a case-insensitive substring search, let’s analyze the benchmark results to determine which one performs best.
To test our methods, we’re using a longer string, "A quick brown fox jumps over the lazy dog. The lazy dog barks loudly. The brown fox runs away quickly", and the substring to search for is "loudly".
With this, let’s take a look at the results:
| Method                 |      Mean |    Error |   StdDev |    Median | Allocated |
|----------------------- |----------:|---------:|---------:|----------:|----------:|
| StringIndexOf          |  32.69 ns | 0.664 ns | 1.034 ns |  32.32 ns |         - |
| StringContains         |  33.12 ns | 0.681 ns | 1.020 ns |  33.03 ns |         - |
| StringToUpperInvariant |  96.27 ns | 1.949 ns | 4.996 ns |  94.58 ns |     272 B |
| RegexIsMatch           | 163.24 ns | 1.585 ns | 1.405 ns | 163.02 ns |         - |
| LinqStringEquals       | 452.85 ns | 8.183 ns | 7.254 ns | 452.51 ns |     952 B |
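The String.IndexOf() method is the fastest, running approximately 14 times faster than the slowest technique, LINQ with the String.Equals() method (32.69 ns versus 452.85 ns). The String.Contains() method follows closely at 33.12 ns, and neither of the two allocates memory. Next comes the String.ToUpperInvariant() approach at 96.27 ns with 272 B allocated, followed by the Regex.IsMatch() method at 163.24 ns. The LINQ approach also allocates the most memory, at 952 B. Note that if the benchmark splits on a single space as in the LINQ example, the test string yields the token "loudly." with a trailing period, so that variant compares every word without finding an exact match.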
Conclusion
In this article, we’ve looked at different approaches for performing a case-insensitive substring search in C# and at the performance characteristics of each. In our benchmark, String.IndexOf() and String.Contains() with StringComparison.OrdinalIgnoreCase were the fastest options and allocated no memory.