In this article, we are going to learn about URI encoding and decoding and how we can perform both in .NET.
So, let’s start.
What is URI Encoding?
Only certain characters are valid in a URL. Before we transmit a URL over the internet, we therefore sometimes need to replace some of its characters with an encoded form, and convert them back when the URL is received. We call this process URI encoding.
We can classify characters in a URL as either reserved or unreserved. The reserved characters are those characters that have a special meaning and generally mark the various parts of a URL. For example, the '?' character in a URL indicates the start of any query parameters.
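To see how reserved characters mark the parts of a URL, here is a minimal sketch that lets the System.Uri class split a URL for us. The UrlParts class and its Split method are hypothetical names chosen for this illustration.

```csharp
using System;

// Hypothetical helper for illustration; not part of .NET.
public static class UrlParts
{
    // Returns the path, query, and fragment of an absolute URL.
    // The reserved characters '?' and '#' mark where the query and
    // the fragment begin, and System.Uri keeps them in its output.
    public static (string Path, string Query, string Fragment) Split(string url)
    {
        var uri = new Uri(url);
        return (uri.AbsolutePath, uri.Query, uri.Fragment);
    }
}
```

Calling UrlParts.Split("http://example.com/resource?foo=bar#fragment") returns "/resource", "?foo=bar", and "#fragment".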
RFC 3986 defines which characters are reserved and unreserved.
Reserved characters:
! # $ & ' ( ) * + , / : ; = ? @ [ ]
Unreserved characters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 - _ . ~
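We need to encode reserved characters when they appear in a URL as data rather than as delimiters, along with any character that is neither reserved nor unreserved, such as a space. To do this, we write each byte of the character (its ASCII value or, for non-ASCII characters, each of its UTF-8 bytes) as two hexadecimal digits preceded by a '%' character. For example, a space character encodes to '%20'. Another name for URL encoding is percent-encoding, due to the '%' prefix character used.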
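As a rough illustration of the rule, the following sketch percent-encodes everything except the unreserved characters listed earlier. PercentEncoder is a hypothetical name, and the sketch assumes the text should be converted to UTF-8 bytes first. It is meant to show the mechanics, not to replace the built-in methods.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of .NET.
public static class PercentEncoder
{
    // The unreserved set from RFC 3986.
    private const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~";

    public static string Encode(string value)
    {
        var result = new StringBuilder();

        // Assumption: the text is encoded as UTF-8 before escaping.
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            char c = (char)b;
            if (b < 128 && Unreserved.IndexOf(c) >= 0)
            {
                // Unreserved ASCII characters pass through unchanged.
                result.Append(c);
            }
            else
            {
                // Everything else becomes '%' plus two uppercase hex digits.
                result.Append('%').Append(b.ToString("X2"));
            }
        }

        return result.ToString();
    }
}
```

For example, PercentEncoder.Encode("a b?") returns "a%20b%3F".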
This encoding is simple to perform, but we don’t have to write the code ourselves, because .NET provides a few ways of encoding and decoding for us.
How to Encode and Decode URI Using the HttpUtility Class
The HttpUtility class, which is part of the System.Web namespace, includes UrlEncode() and UrlDecode() methods:
    using System.Web;

    var url = @"http://example.com/resource?foo=bar with space#fragment";

    var httpUtilityEncoded = HttpUtility.UrlEncode(url);
    Console.WriteLine(httpUtilityEncoded);
    // http%3a%2f%2fexample.com%2fresource%3ffoo%3dbar+with+space%23fragment

    var httpUtilityDecoded = HttpUtility.UrlDecode(httpUtilityEncoded);
    Console.WriteLine(httpUtilityDecoded);
    // http://example.com/resource?foo=bar with space#fragment
These methods take a single string parameter containing the URL to be either encoded or decoded. By default, they use UTF-8, but there is an overload that accepts a different encoding instead. There are also other overloads that take a byte[] instead of a string.
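To illustrate the encoding overload, here is a small sketch that passes an explicit Encoding to HttpUtility.UrlEncode() and HttpUtility.UrlDecode(). The EncodingOverloads class and the choice of ISO-8859-1 as the alternative encoding are assumptions made for this example.

```csharp
using System.Text;
using System.Web;

// Hypothetical helper for illustration; not part of .NET.
public static class EncodingOverloads
{
    // Encodes the value using the bytes produced by the given encoding.
    public static string Encode(string value, Encoding encoding)
        => HttpUtility.UrlEncode(value, encoding);

    // Decodes the value, interpreting escaped bytes with the given encoding.
    public static string Decode(string value, Encoding encoding)
        => HttpUtility.UrlDecode(value, encoding);

    // Assumption for this example: ISO-8859-1 as the non-default encoding.
    public static Encoding Latin1 => Encoding.GetEncoding("iso-8859-1");
}
```

Encoding "café" with Encoding.UTF8 gives "caf%c3%a9", while EncodingOverloads.Latin1 gives "caf%e9". Decoding has to use the same encoding that was used for encoding, otherwise the non-ASCII characters will not round-trip.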
How to Encode and Decode Using the WebUtility Class
The documentation states that if we are not within a web application, we should use the WebUtility class to perform URL encoding and decoding instead. This class is in the System.Net namespace.
Usage is very similar to the previous examples, although there are no overloads:
    using System.Net;

    // url is the string defined in the first example
    var webUtilityEncoded = WebUtility.UrlEncode(url);
    Console.WriteLine(webUtilityEncoded);
    // http%3A%2F%2Fexample.com%2Fresource%3Ffoo%3Dbar+with+space%23fragment

    var webUtilityDecoded = WebUtility.UrlDecode(webUtilityEncoded);
    Console.WriteLine(webUtilityDecoded);
    // http://example.com/resource?foo=bar with space#fragment
How to Encode and Decode Using the Uri Class
Alternatively, we can use the Uri class to encode and decode URLs. Instead of UrlEncode() and UrlDecode(), the methods are called EscapeDataString() and UnescapeDataString():
    // url is the string defined in the first example
    var uriEncoded = Uri.EscapeDataString(url);
    Console.WriteLine(uriEncoded);
    // http%3A%2F%2Fexample.com%2Fresource%3Ffoo%3Dbar%20with%20space%23fragment

    var uriDecoded = Uri.UnescapeDataString(uriEncoded);
    Console.WriteLine(uriDecoded);
    // http://example.com/resource?foo=bar with space#fragment
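The examples so far encode the whole URL, so the delimiters such as ':' and '/' are escaped too. When the goal is a usable URL, a common approach is to encode only the data we place inside it. The sketch below assumes a single query parameter; the QueryBuilder class and its WithQuery method are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration; not part of .NET.
public static class QueryBuilder
{
    // Appends one query parameter to a base URL that has no query yet.
    // Only the name and the value are escaped, so the '?' and '='
    // delimiters keep their special meaning.
    public static string WithQuery(string baseUrl, string name, string value)
        => $"{baseUrl}?{Uri.EscapeDataString(name)}={Uri.EscapeDataString(value)}";
}
```

For example, QueryBuilder.WithQuery("http://example.com/resource", "foo", "bar with space & more") returns "http://example.com/resource?foo=bar%20with%20space%20%26%20more".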
Differences Between the Different Options
If we look closely at the output from the examples, we’ll notice that HttpUtility.UrlEncode() writes the hexadecimal digits of each encoded character in lowercase, whilst WebUtility.UrlEncode() and Uri.EscapeDataString() both write them in uppercase. So a '?' character encodes to either '%3f' or '%3F'. Both forms decode to the same character.
The way the space character is encoded also differs between these implementations. HttpUtility.UrlEncode() and WebUtility.UrlEncode() both encode a space character to '+', whereas Uri.EscapeDataString() encodes a space as '%20' instead.
If we look at how these methods encode other characters like '!', '(', ')', '*', and '~', we’ll see there are also differences. It is worth checking the output for the characters our data is likely to contain, as this might influence which implementation we choose to use.
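One way to check these differences on our own runtime is to run the same input through all three methods side by side. The EncoderComparison class below is a hypothetical helper for that purpose. Apart from the '?' and space results already shown above, no outputs are listed here, since they can vary between .NET versions.

```csharp
using System;
using System.Net;
using System.Web;

// Hypothetical helper for illustration; not part of .NET.
public static class EncoderComparison
{
    // Encodes the same value with each of the three methods so the
    // results can be compared directly.
    public static (string ViaHttpUtility, string ViaWebUtility, string ViaUri) Encode(string value)
        => (HttpUtility.UrlEncode(value),
            WebUtility.UrlEncode(value),
            Uri.EscapeDataString(value));
}
```

Printing EncoderComparison.Encode("!") and the same call for "(", ")", "*", and "~" shows how each implementation treats these characters on the runtime we are targeting.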
Another consideration is that there is a limit of 32766 characters for Uri.EscapeDataString(). If we try to encode more characters than that limit, it will throw a UriFormatException. So if we do need to encode a particularly long URL, we’ll probably want to use either HttpUtility.UrlEncode() or WebUtility.UrlEncode() instead. This limit may not apply on every runtime version, so it is worth checking the documentation for the version we target.
Conclusion
We’ve learned about URL encoding and discovered there are multiple implementations in .NET to perform both encoding and decoding. These methods differ slightly in how they encode and decode our URLs. The particular implementation we choose will depend upon our specific requirements.
A few other methods could also be useful in some cases, with some examples here:
https://stackoverflow.com/questions/575440/url-encoding-using-c-sharp/11236038#11236038