In this article, we are going to learn how to convert a string from Title Case to camelCase in C#.
Let’s start.
Initial String Conversion from Title Case to camelCase in C#
To cover all the aspects of a camelCase string, we are going to create the ToCamelCase method. This method will transform a Title Case string (“Welcome to the Maze”) into a camelCase string (“welcomeToTheMaze”). For greater flexibility, we are going to build it as an extension method.
First of all, a camelCase string cannot have any space (“ ”) or underscore (“_”) separators.
We have to remove these separators. We can do this by splitting the words and joining them again excluding the separators:
public static string ToCamelCase(this string str)
{
    var words = str.Split(new[] { "_", " " }, StringSplitOptions.RemoveEmptyEntries);

    return string.Join(string.Empty, words);
}
Next, every joined word in a camelCase string must start with an uppercase letter. That means we have to capitalize the first letter of each word, so that we get “WelcomeToTheMaze” instead of “WelcometotheMaze”:
// Requires: using System.Linq;
public static string ToCamelCase(this string str)
{
    var words = str.Split(new[] { "_", " " }, StringSplitOptions.RemoveEmptyEntries);

    words = words
        .Select(word => char.ToUpper(word[0]) + word.Substring(1))
        .ToArray();

    return string.Join(string.Empty, words);
}
Finally, a camelCase string must start with a lowercase letter:
public static string ToCamelCase(this string str)
{
    var words = str.Split(new[] { "_", " " }, StringSplitOptions.RemoveEmptyEntries);

    var leadWord = words[0].ToLower();

    var tailWords = words.Skip(1)
        .Select(word => char.ToUpper(word[0]) + word.Substring(1))
        .ToArray();

    return $"{leadWord}{string.Join(string.Empty, tailWords)}";
}
We just turn the leading word to lowercase and keep the remaining words as uppercase-first.
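One edge case the method above does not cover: when the input is empty or contains only separators, the words array is empty and words[0] throws an IndexOutOfRangeException. The following is a minimal sketch of a guarded variant. It assumes that returning an empty string is acceptable in those cases, and the names CamelCaseSketch and ToCamelCaseGuarded are made up for illustration:

```csharp
using System;
using System.Linq;

public static class CamelCaseSketch
{
    // Hypothetical variant of ToCamelCase that tolerates null, empty,
    // and separator-only input.
    public static string ToCamelCaseGuarded(this string str)
    {
        // Assumption: null or empty input maps to an empty string.
        if (string.IsNullOrEmpty(str))
            return string.Empty;

        var words = str.Split(new[] { "_", " " }, StringSplitOptions.RemoveEmptyEntries);

        // Input such as "   " or "__" leaves no words after splitting.
        if (words.Length == 0)
            return string.Empty;

        var leadWord = words[0].ToLower();

        // RemoveEmptyEntries guarantees every word has at least one character.
        var tailWords = words.Skip(1)
            .Select(word => char.ToUpper(word[0]) + word.Substring(1));

        return leadWord + string.Concat(tailWords);
    }
}
```

With this guard in place, calling ToCamelCaseGuarded on " _ " returns an empty string instead of throwing.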
Handling Acronyms
At this point, we have a routine that converts a string from Title Case to camelCase in C#. But there is still a piece missing: we have not handled acronyms yet. For example, the camelCase form of “ISODate” is “isoDate”, not “iSODate”. Our current method returns neither, because calling ToLower() on the whole leading word produces “isodate”.
This is a tricky part, but we can do this conversion using a regular expression:
// Requires: using System.Linq; using System.Text.RegularExpressions;
public static string ToCamelCase(this string str)
{
    var words = str.Split(new[] { "_", " " }, StringSplitOptions.RemoveEmptyEntries);

    var leadWord = Regex.Replace(words[0], @"([A-Z])([A-Z]+|[a-z0-9]+)($|[A-Z]\w*)",
        m =>
        {
            return m.Groups[1].Value.ToLower() + m.Groups[2].Value.ToLower() + m.Groups[3].Value;
        });

    var tailWords = words.Skip(1)
        .Select(word => char.ToUpper(word[0]) + word.Substring(1))
        .ToArray();

    return $"{leadWord}{string.Join(string.Empty, tailWords)}";
}
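To see how the pattern works, it helps to look at its three groups separately: group 1 is the first uppercase letter, group 2 is either the rest of a leading acronym or the lowercase remainder of the first word, and group 3 is everything from the next uppercase letter onward (or the end of the string). The small helper below is only an illustrative sketch; LeadWordRegexDemo and Describe are hypothetical names, and it simply reports what each group captures:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeadWordRegexDemo
{
    // Same pattern as the one used in ToCamelCase.
    private const string Pattern = @"([A-Z])([A-Z]+|[a-z0-9]+)($|[A-Z]\w*)";

    // Returns the three captured groups joined with '|' for inspection.
    public static string Describe(string leadWord)
    {
        var m = Regex.Match(leadWord, Pattern);
        if (!m.Success)
            return "(no match)";

        return $"{m.Groups[1].Value}|{m.Groups[2].Value}|{m.Groups[3].Value}";
    }
}
```

For “ISODate”, Describe returns “I|SO|Date”: the greedy [A-Z]+ first grabs “SOD”, then backtracks to “SO” so that group 3 can start at the uppercase “D”. For “Welcome” it returns “W|elcome|” with an empty third group. A single-letter leading word such as “A” does not match at all, because group 2 needs at least one character, so ToCamelCase leaves it uppercase.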
Now, let’s modify the Program class to test our method:
var inputs = new[]
{
    "Welcome to the Maze",
    "Welcome To The Maze",
    "WelcomeToTheMaze",
    "Welcome_To_The_Maze",
    "ISODate",
    "IOStream"
};

foreach (var x in inputs)
{
    Console.WriteLine($"{x} => {x.ToCamelCase()}");
}
Once we start the application, we are going to see our camelCase strings in the output window:
Welcome to the Maze => welcomeToTheMaze
Welcome To The Maze => welcomeToTheMaze
WelcomeToTheMaze => welcomeToTheMaze
Welcome_To_The_Maze => welcomeToTheMaze
ISODate => isoDate
IOStream => ioStream
Conclusion
In this article, we have learned how to build a helper routine that can convert a string from Title Case to camelCase in C#. We have also learned the special case of acronyms.
Of course, we have to say that this is not a one-size-fits-all solution but an elegant way to achieve camelCase transformation in most cases. That said, we wanted to focus on the basic logic of the transformation and avoid the additional complications of dealing with character encodings. Because of that, it will not work well for strings with non-ASCII characters.
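One place where the ASCII assumption shows up is the [A-Z] and [a-z0-9] classes in the pattern, which never match a letter such as “É”. A possible variant, offered only as a sketch, swaps them for the Unicode categories \p{Lu}, \p{Ll} and \p{Nd} and uses the invariant-culture casing methods. The names UnicodeCamelCaseSketch and ToCamelCaseUnicode are hypothetical, and the sketch still operates on individual UTF-16 code units, so it does not address every non-ASCII case:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UnicodeCamelCaseSketch
{
    // Assumption: Unicode categories stand in for the ASCII ranges of the
    // original pattern. \p{Lu} = uppercase letter, \p{Ll} = lowercase letter,
    // \p{Nd} = decimal digit.
    private static readonly Regex LeadWordPattern =
        new Regex(@"(\p{Lu})(\p{Lu}+|[\p{Ll}\p{Nd}]+)($|\p{Lu}\w*)");

    public static string ToCamelCaseUnicode(this string str)
    {
        var words = str.Split(new[] { "_", " " }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0)
            return string.Empty;

        // Lowercase the first letter and the rest of a leading acronym,
        // independent of the current culture.
        var leadWord = LeadWordPattern.Replace(words[0], m =>
            m.Groups[1].Value.ToLowerInvariant()
            + m.Groups[2].Value.ToLowerInvariant()
            + m.Groups[3].Value);

        // Capitalize the first code unit of each remaining word.
        var tailWords = words.Skip(1)
            .Select(word => char.ToUpperInvariant(word[0]) + word.Substring(1));

        return leadWord + string.Concat(tailWords);
    }
}
```

The ASCII inputs from the test program produce the same results as before, while a leading word such as “Élan” is now matched by the pattern and passed through ToLowerInvariant.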
I have been using regex since the 70’s. I even used a line-oriented editor, where you had to construct sed commands to make changes to a line. It was for printing terminals; very slow. Regex’s are write once, read never. They are extremely hard to write, rarely have decent unit tests, documentation or use cases. The complex ones can time out. In short, I avoid them, unless I have some time to kill. So should you. For something as simple as this, a little bit of thought shouldn’t make your brain hurt that much. Here is how I solve the problem:
/// <summary>
/// Collapse illegal characters to Pascal/CamelCase.
/// have a nice day => HaveANiceDay
/// </summary>
/// <param name="string">The input string.</param>
/// <param name="collapse">The collapse predicate. True removes/collapses the character.</param>
/// <param name="makeInitialLetterUpperCase">if set to true [make initial letter upper case].</param>
public static StringBuilder CamelCaseCollapse(
    [CanBeNull] this string @string,
    [NotNull] Predicate<char> collapse,
    bool makeInitialLetterUpperCase)
{
    if (collapse is null) throw new ArgumentNullException(nameof(collapse));
    if (@string is null) return null;
    StringBuilder result = new StringBuilder();
    bool makeNextUpperCase = makeInitialLetterUpperCase;
    foreach (char c in @string)
    {
        bool collapseThis = collapse(c);
        if (!collapseThis) result.Append(makeNextUpperCase ? char.ToUpper(c) : c);
        makeNextUpperCase = collapseThis;
    }
    return result;
}
// To call it I use:
StringBuilder stringBuilder = name.CamelCaseCollapse(
    c => !IsLegalInCSharpName(c, allowUnderscore),
    pascalCase);

private static bool IsLegalInCSharpName(char c, bool allowUnderscore)
    => char.IsLetterOrDigit(c) || allowUnderscore && c == Ascii.UnderscoreChar;
And yes, this is not for UTF-16.
Note the code does not change the case of existing uppercase letters, which is how acronyms are preserved.
Anyway, the details are not that important. A couple of booleans manage the state and you are done. Regexes are particularly bad at this sort of thing (or at least I am, when trying to use them). For example, try using a regex to parse a query string with ?key=name&key=name… It is easy to do in code with two for loops, and nasty with regex because of the nested states.
Thank you for the suggestion. We never said this is the only way of solving the problem, and I am pretty sure that there are a lot of different solutions. We’ve shown a basic and elegant way to do the transformation, and our readers can modify it in whatever way they find acceptable. Again, thanks for the suggestion, which can only add quality to the article.
Hi, and thanks for the nice article. Unfortunately, it misses a crucial aspect of string manipulation: UTF-16. All the string manipulations that you show, even if you added the culture parameters (which are missing as well), are prone to errors that corrupt the string. Thus, all manipulations should be performed based on text elements (cf. System.Globalization.StringInfo).
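A minimal sketch of what the text-element approach could look like for the capitalization step, assuming the goal is to uppercase the first text element of a word rather than its first char. TextElementSketch and UpperFirstTextElement are hypothetical names, and the rest of the conversion is left out:

```csharp
using System;
using System.Globalization;

public static class TextElementSketch
{
    // Hypothetical helper: uppercases the first text element of a word
    // instead of its first UTF-16 code unit.
    public static string UpperFirstTextElement(string word)
    {
        if (string.IsNullOrEmpty(word))
            return word;

        // A text element may span several chars, for example a base letter
        // followed by a combining mark, or a surrogate pair.
        string first = StringInfo.GetNextTextElement(word);

        // Invariant casing sidesteps culture-specific casing rules.
        return first.ToUpperInvariant() + word.Substring(first.Length);
    }
}
```

In ToCamelCase, a call to this helper could take the place of char.ToUpper(word[0]) + word.Substring(1), so that the split between the capitalized part and the remainder never falls inside a text element.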