In the world of web development using .NET, it’s crucial to make sure the web addresses (URLs) our application uses are correct and safe. In this article, we will understand the basics of how to check if a URL is valid in C#. We’ll include some easy-to-understand code examples to help us implement these validations in our projects.

To download the source code for this article, you can visit our GitHub repository.

So let’s dive in.

Basics of URL Structure

URL, or Uniform Resource Locator, serves as an address for web resources. It’s a string whose basic structure consists of several components. They include the scheme (like “http” or “https”), the domain or hostname (such as “www.example.com”), and the path to the specific resource (e.g. “/v1/products”). Optional components may include query parameters and a fragment.

For instance, let’s analyze the structure of the URL https://www.example.com/path/page?query=value#section:

 Component        | Value
------------------|-----------------
 Scheme           | https
 Domain           | www.example.com
 Path             | /path/page
 Query parameters | query=value
 Fragment         | section

Understanding these components is fundamental to effective URL validation and navigation in web development.
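We can also inspect these parts programmatically. As a minimal sketch, the built-in Uri class (which we’ll revisit later for validation) exposes each component as a property:

var uri = new Uri("https://www.example.com/path/page?query=value#section");

Console.WriteLine(uri.Scheme);       // https
Console.WriteLine(uri.Host);         // www.example.com
Console.WriteLine(uri.AbsolutePath); // /path/page
Console.WriteLine(uri.Query);        // ?query=value
Console.WriteLine(uri.Fragment);     // #section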

Check if the URL Is Valid Using Regular Expressions (Regex)

One effective method for URL validation in C# involves leveraging the built-in Regex class (regular expression). Using Regex, we can define patterns that URLs should adhere to. This allows us to perform flexible and customizable validation.

Let’s define a UrlValidator class with a single static method. It will use a Regex which validates web page URLs:

public static class UrlValidator
{
    public static bool ValidateUrlWithRegex(string url)
    {
        var urlRegex = new Regex(
            @"^(https?|ftps?):\/\/(?:[a-zA-Z0-9]" +
            @"(?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}" +
            @"(?::(?:0|[1-9]\d{0,3}|[1-5]\d{4}|6[0-4]\d{3}" +
            @"|65[0-4]\d{2}|655[0-2]\d|6553[0-5]))?" +
            @"(?:\/(?:[-a-zA-Z0-9@:%_\+.~#?&=]+\/?)*)?$",
            RegexOptions.IgnoreCase);

        return urlRegex.IsMatch(url);
    }
}

Here, we validate a URL according to the standard conventions. Let’s break this pattern into smaller parts and take a closer look:

Pattern: (https?|ftps?):\/\/
Explanation: Checks for the scheme (either HTTP(S) or FTP(S)) followed by "://".

Pattern: (?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}
Explanation: Checks for appropriate domain names, including subdomains.

Pattern: (?::(?:0|[1-9]\d{0,3}|[1-5]\d{4}|6[0-4]\d{3}|65[0-4]\d{2}|655[0-2]\d|6553[0-5]))?
Explanation: Matches an optional port number between 0 and 65535.

Pattern: (?:\/(?:[-a-zA-Z0-9@:%_\+.~#?&=]+\/?)*)?
Explanation: Checks for a correct URL path structure (along with optional query parameters and a fragment) with valid characters and optional trailing slashes.

Overall, these parts ensure that the URL adheres to standard formatting rules. Now, let’s see the validation in action:

var url = "https://www.amazon.com";
var success = UrlValidator.ValidateUrlWithRegex(url);

Console.WriteLine($"The URL '{url}' is {(success ? "valid" : "invalid")}.");

var url2 = "ftp:////example.com///one?param=true";
success = UrlValidator.ValidateUrlWithRegex(url2);

Console.WriteLine($"The URL '{url2}' is {(success ? "valid" : "invalid")}.");

Here, we test the ValidateUrlWithRegex() method with 2 input URLs, outputting our result to the console.

Let’s check the console output:

The URL 'https://www.amazon.com' is valid.
The URL 'ftp:////example.com///one?param=true' is invalid.

Here, we see that our Regex validation correctly interprets the first URL as valid and the second one as invalid, since it contains excess slash ('/') characters.
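One optional hardening step worth mentioning (a sketch, not part of the validator above): complex patterns like this one can backtrack heavily on crafted input, so we can pass a match timeout to the Regex constructor and treat a timeout as a failed validation. Here, urlPattern stands for the same pattern string we defined earlier, and the timeout value is arbitrary:

var urlRegex = new Regex(urlPattern, RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250));

try
{
    return urlRegex.IsMatch(url);
}
catch (RegexMatchTimeoutException)
{
    // The engine spent too long matching, so we reject the input instead of hanging
    return false;
}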

Check if the URL Is Valid Using the Built-in URI Class

The built-in Uri class is another option in .NET that provides us with a more straightforward approach to URL validation. We’re going to analyze two ways we can use it to validate URLs.

Using Uri.TryCreate

The first method we are going to look at is Uri.TryCreate(). Its simplicity and ease of use make it a good choice for most use cases.

However, we should note that Uri may accept some URLs as valid even though they are technically incorrect according to the URI specifications, so they may behave unexpectedly in certain scenarios. This method performs more relaxed validation than our regular expression, so we may need additional validation steps for specific use cases, as we’ll sketch a bit later.

Let’s define another validation method ValidateUrlWithUriCreate() in our validator class:

public static bool ValidateUrlWithUriCreate(string url, out Uri? uri)
{
    var success = Uri.TryCreate(url, UriKind.RelativeOrAbsolute, out uri);
    
    return success;
}

Here, we pass the URL we want to validate as the first parameter to the Uri.TryCreate() method. Then we specify that we accept either a relative or an absolute URL. The method returns true if the URL is valid, and the resulting Uri object is stored in the uri out parameter.

Now, let’s see it in action:

url = "https://api.facebook.com:443";
success = UrlValidator.ValidateUrlWithUriCreate(url, out _);

Console.WriteLine($"The URL '{url}' is {(success ? "valid" : "invalid")}.");

url2 = "ftp:///api.site.com?value=word1 word2";
success = UrlValidator.ValidateUrlWithUriCreate(url2, out _);

Console.WriteLine($"The URL '{url2}' is {(success ? "valid" : "invalid")}.");

Similar to our previous method, we test the ValidateUrlWithUriCreate() method with another 2 URLs.

Now, let’s check the console output:

The URL 'https://api.facebook.com:443' is valid.
The URL 'ftp:///api.site.com?value=word1 word2' is invalid.

Again, our method correctly interprets the first URL as valid and the second as an invalid one.
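As mentioned earlier, Uri.TryCreate() with UriKind.RelativeOrAbsolute is quite permissive and accepts almost any string as a relative URI. When we only care about absolute web URLs, a stricter variant is easy to layer on top of it. Here is one possible sketch (the helper name and the http/https restriction are our own choices, not part of the original validator):

public static bool ValidateAbsoluteWebUrl(string url)
{
    // Require an absolute URI and restrict the scheme to http or https
    return Uri.TryCreate(url, UriKind.Absolute, out var uri)
        && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
}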

Using Uri.IsWellFormedUriString

Apart from the TryCreate() method, the Uri class gives us another mechanism for stricter validation – namely the Uri.IsWellFormedUriString() method.

The Uri.IsWellFormedUriString() method makes sure that the string is a well-formed URL following the RFC 3986 and RFC 3987 specifications for URI syntax. It determines whether a string is a valid URL by attempting to construct a Uri from it, and it also ensures that the string does not require any further character escaping.

First, let’s define a ValidateUrlWithUriWellFormedString() method in our UrlValidator class:

public static bool ValidateUrlWithUriWellFormedString(string url)
{
    var success = Uri.IsWellFormedUriString(url, UriKind.RelativeOrAbsolute);
    
    return success;
}

Here, we simply call the method and specify that we accept either a Relative or an Absolute URL.

Next, we can use it to validate 2 URLs:

url = "https://site.company?q=search";
success = UrlValidator.ValidateUrlWithUriWellFormedString(url);

Console.WriteLine($"The URL '{url}' is {(success ? "valid" : "invalid")}.");

url2 = "ftp://api.site.com?value=word1 word2";
success = UrlValidator.ValidateUrlWithUriWellFormedString(url2);

Console.WriteLine($"The URL '{url2}' is {(success ? "valid" : "invalid")}.");

Again, we run validations on 2 input URLs, and can then inspect the console:

The URL 'https://site.company?q=search' is valid. 
The URL 'ftp://api.site.com?value=word1 word2' is invalid.

Here, we see that the first one is valid according to web standards. However, the second one is considered incorrect as its query string is improperly escaped. The white space between the words should be replaced by either “%20” or a “+”.
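If we control how the URL is assembled, we can percent-encode such values before validating. As a quick illustration (the variable names here are our own), the built-in Uri.EscapeDataString() method performs the encoding for us:

var query = Uri.EscapeDataString("word1 word2");      // "word1%20word2"
var escapedUrl = $"ftp://api.site.com?value={query}"; // ftp://api.site.com?value=word1%20word2

With the space encoded, the string no longer needs further escaping, so it should now pass the well-formed check.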

Check if the URL Is Valid With an HTTP Request

Another way we can validate URLs is by sending an HTTP request and checking the server’s response status. This way, we can verify the existence and availability of the specified URL.

Drawbacks and Security Risks

While it provides real-time validation, drawbacks include the dependency on network connectivity and potential performance overhead due to the need for an actual request. Additionally, it may not cover cases where the server allows requests but the resource does not exist.

When making network calls to foreign domains or URLs, it’s important to consider potential security risks. These include cross-origin resource sharing (CORS) issues, the trustworthiness of external domains, and the potential for malicious content or data privacy concerns. To mitigate these risks, we can implement proper security measures such as using HTTPS, content security policies (CSP), and/or input validation.

Implement Sending an HTTP Request to Check if the URL Is Valid

Let’s now observe how we can use this strategy to validate URLs:

public static async Task<bool> ValidateUrlWithHttpClient(string url)
{
    using var client = new HttpClient();
    try
    {
        var response = await client.SendAsync(new HttpRequestMessage(HttpMethod.Head, url));
        
        return response.IsSuccessStatusCode;
    }
    catch (HttpRequestException e)
        when (e.InnerException is SocketException
              { SocketErrorCode: SocketError.HostNotFound })
    {
        return false;
    }
    catch (HttpRequestException e)
        when (e.StatusCode.HasValue && (int)e.StatusCode.Value >= 500)
    {
        return true;
    }
}

Here, we use .NET’s built-in HttpClient to send HTTP requests to the targeted URLs. Note that we specify the HTTP HEAD method, as we’re only interested in the remote server returning a success status code, indicating that the requested resource/URL has been found.

In the case of a failure where DNS cannot resolve the host, we expect the HTTP call to throw an HttpRequestException. This exception wraps an inner SocketException whose SocketErrorCode property is set to HostNotFound, indicating that DNS hasn’t been able to resolve the hostname.

It is also important to note that the requested resource might be temporarily unavailable (e.g. return a 5XX status code), in which case we still consider the URL valid.

Next, let’s see this validation in action:

url = "https://api.facebook.com";
success = await UrlValidator.ValidateUrlWithHttpClient(url);

Console.WriteLine($"The URL '{url}' is {(success ? "valid" : "invalid")}.");

url2 = "https://www.example-nonexistent-url.com";
success = await UrlValidator.ValidateUrlWithHttpClient(url2);

Console.WriteLine($"The URL '{url2}' is {(success ? "valid" : "invalid")}.");

This time, we use our ValidateUrlWithHttpClient() method to validate another two URLs.

We can once again check the console output:

The URL 'https://api.facebook.com' is valid.
The URL 'https://www.example-nonexistent-url.com' is invalid.

Our console results indicate that the first URL was successfully accessed, whilst the second one hasn’t been resolved by DNS, hence we consider it invalid.
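Since this check depends on the network, a slow or unresponsive host can also make it block for a long time. One way to bound that cost, shown here as a sketch on top of the method above (the five-second limit is an arbitrary assumption), is to set a timeout on the HttpClient and treat a timed-out request as unreachable:

using var client = new HttpClient
{
    // Assumed upper bound on how long we're willing to wait for the HEAD request
    Timeout = TimeSpan.FromSeconds(5)
};

try
{
    var response = await client.SendAsync(new HttpRequestMessage(HttpMethod.Head, url));

    return response.IsSuccessStatusCode;
}
catch (TaskCanceledException)
{
    // The request timed out, so we treat the URL as unreachable
    return false;
}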

Benchmark URL Validation Methods

Ultimately, the method we choose depends on our application’s validation and performance requirements. Let’s now compare the methods we discussed by running some performance benchmarks with BenchmarkDotNet. We are going to test the validations against the URL https://site.company?q=search.
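The benchmark class itself isn’t listed in this article, but a minimal BenchmarkDotNet harness along the following lines (the class name is our own) is enough to produce comparable measurements:

using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class UrlValidationBenchmarks
{
    private const string Url = "https://site.company?q=search";

    [Benchmark]
    public bool UriCreateValidationBenchmark()
        => UrlValidator.ValidateUrlWithUriCreate(Url, out _);

    [Benchmark]
    public bool UriWellFormedStringValidationBenchmark()
        => UrlValidator.ValidateUrlWithUriWellFormedString(Url);

    [Benchmark]
    public bool RegexUrlValidationBenchmark()
        => UrlValidator.ValidateUrlWithRegex(Url);

    [Benchmark]
    public async Task<bool> HttpClientValidationBenchmark()
        => await UrlValidator.ValidateUrlWithHttpClient(Url);
}

We can then run it with BenchmarkRunner.Run<UrlValidationBenchmarks>() from a Release build.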

With that, let’s assess the results:

 Method                                 | Mean              | Error             | StdDev            | Allocated |
--------------------------------------- |------------------:|------------------:|------------------:|----------:|
 UriCreateValidationBenchmark           |          85.49 ns |          1.024 ns |          0.908 ns |      56 B |
 UriWellFormedStringValidationBenchmark |         195.51 ns |          3.702 ns |          3.281 ns |     136 B |
 RegexUrlValidationBenchmark            |      20,660.69 ns |        401.489 ns |        763.874 ns |   23256 B |
 HttpClientValidationBenchmark          | 487,029,002.56 ns | 16,963,239.617 ns | 43,787,538.241 ns |   67560 B |

From our results, we learn that the Uri.TryCreate() method happens to be the fastest method for URL validation, making it ideal for quick and efficient validation of basic URLs. Next comes the Uri.IsWellFormedUriString() method, which runs roughly 2.3 times slower. Regex validation comes in third place and, perhaps surprisingly, takes significantly more time. Finally, validation using HTTP calls is shown to be the slowest due to the network communication.

Best Practices for URL Validation and Comparison Between Different Methods

Adhering to best practices for URL validation in .NET involves combining multiple validation methods, such as Regex patterns and the Uri class, to create a robust validation strategy. It’s essential to balance strict validation and practical flexibility based on our application’s needs.

For basic validation scenarios where accuracy and reliability matter, Uri.TryCreate() is a suitable choice, as it provides comprehensive parsing and validation capabilities. When we want quick and lightweight validation, especially in scenarios where performance is crucial, Uri.IsWellFormedUriString() can be sufficient.

If we need detailed control over the validation pattern or have specific requirements not covered by the built-in methods, regular expressions offer us the most flexibility. However, they require careful crafting and testing.

Lastly, making HTTP calls for URL validation is appropriate when the URL’s availability and content are critical. Nonetheless, we should use them carefully due to their resource-intensive nature.
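To tie these recommendations together, here is one possible composite validator (an assumed helper, not a canonical recipe): it runs the cheap syntactic checks first and only performs the expensive HTTP request when reachability actually matters:

public static async Task<bool> ValidateUrlCombined(string url, bool checkReachability = false)
{
    // Cheap syntactic checks first: well-formed, absolute, and http(s) only
    if (!Uri.IsWellFormedUriString(url, UriKind.Absolute) ||
        !Uri.TryCreate(url, UriKind.Absolute, out var uri) ||
        (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
    {
        return false;
    }

    // Expensive network check only when explicitly requested
    return !checkReachability || await ValidateUrlWithHttpClient(url);
}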

Conclusion

In this article, we’ve explored diverse and effective methods for validating URLs in C#. By utilizing Regex patterns, the built-in URI class, and even real-time checks with HTTP requests, we now have a toolbox of techniques to ensure the accuracy and security of URLs in our applications.

Remember, choosing the right method depends on our specific requirements, and combining them allows us to create a robust URL validation strategy tailored to our projects’ needs.
