Converting String to Byte Array in C#

In this article, we’ll explore the concept of converting a string to a byte array in C#. We will also talk about why this conversion is necessary and explore the methods we can use for that.

To download the source code for this article, you can visit our GitHub repository.

So let’s begin.

Why Conversion Is Necessary

We’ll most likely encounter the need to convert a string to a byte array to achieve efficient image processing, network communication, or encryption and decryption in our application.

But why is this conversion necessary? In .NET, a string is a sequence of UTF-16 characters meant for text handling, while files, network sockets, and cryptographic APIs operate on raw bytes. To hand text to any of them, we first need to encode it, and a byte array is the standard container for the encoded result.
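As a small illustration of why bytes are needed, here is a sketch that hashes a string. It assumes .NET 5 or later for SHA256.HashData(), and both the UTF-8 choice and the variable names are ours rather than part of the article's sample project:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hashing APIs accept bytes, not strings, so we encode the text first.
// UTF-8 is an assumption here; any encoding works as long as both sides agree on it.
var text = "Welcome to CodeMaze!";
byte[] textBytes = Encoding.UTF8.GetBytes(text);

// SHA256.HashData (available from .NET 5) always returns a 32-byte digest.
byte[] hash = SHA256.HashData(textBytes);

Console.WriteLine($"{textBytes.Length} input bytes, {hash.Length} hash bytes");
```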

Knowing this, let’s see the various methods for converting a string to a byte array.

Before We Begin

For all of the code samples in this article, let’s create a string literal that we’ll use to convert to a byte array:

var message = "Welcome to CodeMaze!";

With this string, we expect the same output for each conversion method:

87,101,108,99,111,109,101,32,116,111,32,67,111,100,101,77,97,122,101,33
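To compare a conversion result with the expected values, one option is to join the bytes into a comma-separated string. This is a minimal sketch; the variable names are ours, and UTF-8 is used only as an example encoding:

```csharp
using System;
using System.Text;

var message = "Welcome to CodeMaze!";
byte[] bytes = Encoding.UTF8.GetBytes(message);

// string.Join calls ToString() on each byte and places a comma between the values.
string printable = string.Join(",", bytes);

Console.WriteLine(printable);
```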

Converting Using Encoding.GetBytes() Method

This is the most common and recommended way to convert a string to a byte array in C#. The Encoding class resides in the System.Text namespace, and its GetBytes() method lets us choose the encoding explicitly, so that the receiving system can interpret the data correctly.

There are several encoding options and we’ll talk about choosing the right one later on, but now let’s see how to do the conversion using the UTF8 encoding option:

static byte[] ConvertStringToUTF8Bytes(string message)
{
    return Encoding.UTF8.GetBytes(message);
}

We define a static method that returns the byte array holding the encoded string. The Encoding.UTF8 property gives us a UTF-8 encoding object, and its GetBytes() method converts the characters of our message variable into their UTF-8 byte sequences. ASCII characters, such as the ones in our sample string, take exactly one byte each.
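To see what happens outside the ASCII range, here is a short sketch that reuses the method above with a non-ASCII character and then decodes the bytes again. The sample value and variable names are ours:

```csharp
using System;
using System.Text;

static byte[] ConvertStringToUTF8Bytes(string message)
{
    return Encoding.UTF8.GetBytes(message);
}

// "\u00E9" is the letter é (U+00E9). It lies outside ASCII,
// so UTF-8 encodes it as two bytes instead of one.
var accented = "\u00E9";
byte[] encoded = ConvertStringToUTF8Bytes(accented);
Console.WriteLine(string.Join(",", encoded)); // 195,169

// Decoding with the same encoding restores the original string.
string decoded = Encoding.UTF8.GetString(encoded);
Console.WriteLine(decoded == accented); // True
```

The round trip only works when encoding and decoding use the same Encoding, which is why both sides of an exchange need to agree on it.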

Choosing the Right Encoding Option

Selecting the “right” encoding for converting strings to byte arrays largely depends on context. Factors like the target platform, audience, storage and transmission efficiency, and interoperability should guide our decision.

The UTF-8 format, widely used on the web, is ideal for internationalization and varied character sets (web APIs, JSON, network protocols). It encodes ASCII characters as a single byte each and everything else as two to four bytes. The ASCII format is best for legacy systems and ASCII-only formats and protocols. It offers no size advantage over UTF-8 for English-only text, and Encoding.ASCII replaces every character outside the 0-127 range with a question mark.

There’s also the UTF-16 format, which .NET uses internally for strings and which many Windows APIs expect. It uses two bytes for most characters and four for the rest. Last but not least, UTF-32 uses a fixed four bytes per code point. It covers the same characters as UTF-8 and UTF-16, so its rare use is limited to cases where fixed-width code points simplify processing.

By understanding these common encodings’ characteristics and usage scenarios, we can choose the most appropriate one for our specific needs.
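The size differences between these encodings are easy to check with GetByteCount(). The following sketch uses a sample word of our own choosing, with one non-ASCII letter, and compares the four encodings discussed above:

```csharp
using System;
using System.Text;

// "H\u00E9llo" is "Héllo": four ASCII letters plus one non-ASCII letter (U+00E9).
var sample = "H\u00E9llo";

int asciiCount = Encoding.ASCII.GetByteCount(sample);   // 5: one byte per char, é is replaced
int utf8Count = Encoding.UTF8.GetByteCount(sample);     // 6: é takes two bytes
int utf16Count = Encoding.Unicode.GetByteCount(sample); // 10: two bytes per UTF-16 code unit
int utf32Count = Encoding.UTF32.GetByteCount(sample);   // 20: four bytes per code point

// ASCII cannot represent é, so the byte at index 1 becomes 63, the code of '?'.
byte[] asciiBytes = Encoding.ASCII.GetBytes(sample);

Console.WriteLine($"{asciiCount} {utf8Count} {utf16Count} {utf32Count}"); // 5 6 10 20
Console.WriteLine(asciiBytes[1]); // 63
```

Note that Encoding.Unicode is the name .NET uses for little-endian UTF-16.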

Convert by Casting Individual Characters to Bytes

Casting characters to bytes offers a direct approach for string-to-byte-array conversion, but it’s crucial to note its limitations.

It correctly translates ASCII characters (0-127) to their byte values, which match their positions in the ASCII table. A char in C#, however, is a 16-bit value, and by default the cast silently keeps only its lowest 8 bits. Characters above U+00FF, such as Cyrillic letters, emojis, or many special symbols, therefore produce incorrect bytes without any warning.

Let’s see how to use this technique for conversion:

static byte[] ConvertStringToByteArrayUsingCasting(string message)
{
    var byteArray = new byte[message.Length];

    for (int i = 0; i < message.Length; i++)
    {
        byteArray[i] = (byte)message[i];
    }

    return byteArray;
}

We initialize a byte array with a size equal to the length of the string. This creates a byte array with enough space to hold each string character as a byte value. Then we iterate through each character in the string, allowing us to process each character individually.

Within the loop, we assign each character to its corresponding index in the byte array. This effectively converts each character to its underlying byte value and stores it in the byte array.

Finally, we return the populated byte array. Each element holds the low 8 bits of the corresponding character’s UTF-16 value, which equals the character’s code point only for characters in the 0-255 range.
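The limitation of casting becomes visible as soon as the input leaves the ASCII range. The sketch below repeats the casting method and feeds it a three-character string of our own. It assumes the default unchecked arithmetic context, meaning the project does not enable overflow checking:

```csharp
using System;

static byte[] ConvertStringToByteArrayUsingCasting(string message)
{
    var byteArray = new byte[message.Length];

    for (int i = 0; i < message.Length; i++)
    {
        byteArray[i] = (byte)message[i];
    }

    return byteArray;
}

// 'A' is U+0041, '\u00E9' (é) is U+00E9, and '\u0416' (Ж) is U+0416.
byte[] castBytes = ConvertStringToByteArrayUsingCasting("A\u00E9\u0416");

// The cast keeps only the low 8 bits: 0x41 -> 65, 0xE9 -> 233, 0x0416 -> 0x16 = 22.
Console.WriteLine(string.Join(",", castBytes)); // 65,233,22
```

The last value, 22, has nothing to do with the Cyrillic letter anymore, and no exception signals the loss.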

Convert To Byte Array Using Convert.ToByte() Method

This method is part of the Convert class, which offers various data type conversion utilities. Similar to the previous approach, it converts a single character to its byte value, so it is not designed for converting entire strings at once.

Conversion using this technique is straightforward:

static byte[] ConvertStringToByteArrayUsingConvertToByte(string message)
{
    var byteArray = new byte[message.Length];

    for (int i = 0; i < message.Length; i++)
    {
        byteArray[i] = Convert.ToByte(message[i]);
    }

    return byteArray;
}

While this code shares its structure with the previous example, it uses the Convert.ToByte() method to perform the character-to-byte conversion. This makes the intent more explicit, and it behaves differently for characters that don’t fit in a byte: instead of truncating them, the method throws an OverflowException for any character whose value is greater than 255.
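A quick way to observe how Convert.ToByte() treats characters of different sizes is to call it on single characters directly. The sample characters and variable names in this sketch are ours:

```csharp
using System;

// Characters up to U+00FF fit into a byte, so this call succeeds.
byte latin = Convert.ToByte('\u00E9'); // é -> 233

// 'Ж' (U+0416) is greater than byte.MaxValue, so Convert.ToByte throws.
bool overflowed = false;
try
{
    Convert.ToByte('\u0416');
}
catch (OverflowException)
{
    overflowed = true;
}

Console.WriteLine($"{latin}, overflowed: {overflowed}"); // 233, overflowed: True
```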

Convert String to Byte Array Using Encoding.GetEncoding() Method

This method obtains an Encoding object for a specific encoding, which can then be used for various encoding-related operations, including converting strings to byte arrays using GetBytes().

Let’s explore using this method:

static byte[] ConvertStringToByteArrayUsingEncoding(string message)
{
    var encoding = Encoding.GetEncoding("ISO-8859-1");
    var byteCount = encoding.GetByteCount(message);
    var byteArray = new byte[byteCount];

    encoding.GetBytes(message, byteArray);

    return byteArray;
}

Our method first obtains an instance of the ISO-8859-1 (Latin-1) encoding using the Encoding.GetEncoding() method. The encoding determines how the characters in the string map to byte values.

Next, it calculates the exact number of bytes required for the encoded string with encoding.GetByteCount() and allocates a byte array of that size.

Finally, encoding.GetBytes() writes the encoded bytes into the provided array, and the method returns the populated array.
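Encoding.GetEncoding() also has an overload that accepts fallback objects, which control what happens to characters the encoding cannot represent. As a sketch under our own assumptions (the strict behavior is our choice, not something the method above uses), we can request exception fallbacks so that unsupported characters raise an error:

```csharp
using System;
using System.Text;

// Ask for ISO-8859-1 with exception fallbacks, so characters the encoding
// cannot represent raise an error instead of being substituted.
var strictLatin1 = Encoding.GetEncoding(
    "ISO-8859-1",
    EncoderFallback.ExceptionFallback,
    DecoderFallback.ExceptionFallback);

// é (U+00E9) exists in ISO-8859-1 and becomes the single byte 233.
byte[] latin1Bytes = strictLatin1.GetBytes("\u00E9");

// Ж (U+0416) does not exist in ISO-8859-1, so the strict encoder throws.
bool rejected = false;
try
{
    strictLatin1.GetBytes("\u0416");
}
catch (EncoderFallbackException)
{
    rejected = true;
}

Console.WriteLine($"{latin1Bytes[0]}, rejected: {rejected}"); // 233, rejected: True
```

Failing loudly like this is often preferable when the bytes are stored or sent to another system, because a substituted character cannot be recovered later.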

Benchmarks

Now that we’ve explored various methods for converting a string into a byte array, let’s put them to the test. We’ll analyze their performance using benchmarks, starting with a short string and then tackling a longer one.

For the short string, we’ll use our initial message value and add some extra text for the long one:

Welcome to CodeMaze, your one-stop destination for mastering all things .NET and C#! Explore a comprehensive learning experience tailored to your programming journey.

Let’s look at the short string benchmarks:

|                    Method |      Mean |     Error |    StdDev |    Median |    Gen0 | Allocated |
|-------------------------- |----------:|----------:|----------:|----------:|--------:|----------:|
|       ConvertUsingCasting |  21.03 ns |  0.396 ns |  0.371 ns |  20.93 ns |  0.0115 |  48   B   |
| ConvertUsingConvertToByte |  26.52 ns |  0.515 ns |  0.787 ns |  26.42 ns |  0.0114 |  48   B   |
|        ConvertToUTF8Bytes |  29.37 ns |  0.779 ns |  2.146 ns |  28.57 ns |  0.0114 |  48   B   |
|   ConvertUsingGetEncoding |  73.79 ns |  1.565 ns |  1.922 ns |  72.96 ns |  0.0114 |  48   B   |

Next, we have the long string benchmarks:

|                     Method |      Mean |     Error |    StdDev |    Median |    Gen0 | Allocated |
|--------------------------- |----------:|----------:|----------:|----------:|--------:|----------:|
|         ConvertToUTF8Bytes |  68.59 ns |  2.295 ns |  6.474 ns |  65.99 ns |  0.0458 |  192  B   |
|    ConvertUsingGetEncoding | 101.85 ns |  2.088 ns |  2.144 ns | 101.54 ns |  0.0459 |  192  B   |
|        ConvertUsingCasting | 108.66 ns |  2.245 ns |  2.672 ns | 108.25 ns |  0.0458 |  192  B   |
|  ConvertUsingConvertToByte | 159.17 ns |  2.415 ns |  2.966 ns | 159.84 ns |  0.0458 |  192  B   |

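From the benchmark results, we can see that for short strings, casting individual characters is the fastest method (21.03 ns), followed by the Convert.ToByte() method (26.52 ns) and the Encoding.GetBytes() method with UTF-8 (29.37 ns). For long strings, the Encoding.GetBytes() method is the fastest (68.59 ns), the Encoding.GetEncoding() approach (101.85 ns) and casting (108.66 ns) trail behind, and the Convert.ToByte() method is the slowest (159.17 ns). All methods allocate the same amount of memory in each test.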

Choosing the Most Suitable Method for Specific Scenarios

For most string-to-byte-array conversions in C#, the Encoding.GetBytes() method is the best choice. It handles every Unicode character correctly, and in our benchmarks it scales best as strings get longer.

Casting individual characters to bytes can suffice only when we know the input contains nothing beyond ASCII (or Latin-1) characters, since anything else is silently truncated.

The Convert.ToByte() method is a readable alternative when working with small strings and single characters. It doesn’t apply any encoding, though, and it throws an OverflowException for characters whose value doesn’t fit in a byte.

The Encoding.GetEncoding() method is useful when we need a specific encoding such as a legacy code page, or when the encoding name is only known at runtime. Looking the encoding up by name adds some overhead, which is visible in the short-string benchmark.

In summary, while the Encoding.GetBytes() method seems like the go-to choice for its efficiency and adaptability across a broad range of scenarios, understanding the strengths of alternative methods allows us to make informed decisions based on the unique characteristics of our data and the context of the application.

Conclusion

In this article, we discussed the importance of converting strings to byte arrays, explored various conversion methods, and emphasized the significance of selecting the appropriate encoding and conversion method based on the context of our application.
