In this article, we look into various ways to calculate the size of a directory in C#.

Applications dealing with file management or storage inevitably need to know how much space a directory takes up. Calculating the directory size helps with disk space management, as we can use it to strategize data distribution, identify large directories, and generate reports for users.

Let’s start.

## How to Calculate the Size of a Directory

As with most problems, there are multiple approaches to this one. That said, any method for calculating the size of a directory needs to visit each file within the directory. As we visit each file, we retrieve its length in bytes and add it to the running total for the directory. Additionally, we want to check for subdirectories and include their files too.

Note that the size of a directory is not the same as its size on the disk. The size we are about to calculate refers to the actual amount of data the directory contains. The size on the disk, on the other hand, refers to the space the file system allocates for that data. Because file systems typically allocate space in fixed-size units, the size on the disk is usually equal to or larger than the total directory size, although features such as compression or sparse files can make it smaller.
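To make the difference concrete, here is a minimal sketch that rounds a file's length up to a whole number of allocation units. The `EstimateSizeOnDisk()` helper and the 4,096-byte cluster size are assumptions made for illustration only. Real file systems may use other cluster sizes, and the estimate ignores compression and sparse files.

```csharp
using System;

public static class SizeOnDiskEstimator
{
    // Hypothetical helper: rounds a logical file length up to the next
    // multiple of clusterSize. The 4096-byte default is an assumption.
    public static long EstimateSizeOnDisk(long fileLength, long clusterSize = 4096)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException(nameof(fileLength));
        if (clusterSize <= 0) throw new ArgumentOutOfRangeException(nameof(clusterSize));

        // Integer ceiling division: number of clusters needed to hold the data
        long clusters = (fileLength + clusterSize - 1) / clusterSize;

        return clusters * clusterSize;
    }
}
```

Under these assumptions, a 5,000-byte file needs two clusters, so `EstimateSizeOnDisk(5000)` returns 8,192 even though the file contributes only 5,000 bytes to the directory size we calculate in this article.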

Now, we can move on to how to calculate the size.

Let’s create a class named `DirectorySizeCalculator`. In it, we define three different methods to get the size of a directory. All of them should yield the same value for a given directory. The `Directory` and `DirectoryInfo` classes provide us with helpful properties and methods to achieve our goals.

### Determine Directory Size Via Recursion

A recursive method is a method that calls itself. To learn more about recursion, check out our article. Recursion is a natural fit for tree-shaped data such as a directory hierarchy, and it tends to keep the code short and readable.

We call our method `GetSizeWithRecursion()`:

```csharp
// Requires: using System; using System.IO; using System.Linq;
public static long GetSizeWithRecursion(DirectoryInfo directory)
{
    if (directory == null || !directory.Exists)
    {
        throw new DirectoryNotFoundException("Directory does not exist.");
    }

    long size = 0;

    try
    {
        // Files directly inside this directory
        size += directory.GetFiles().Sum(file => file.Length);

        // Each subdirectory is measured by a recursive call
        size += directory.GetDirectories().Sum(GetSizeWithRecursion);
    }
    catch (UnauthorizedAccessException)
    {
        Console.WriteLine($"We do not have access to {directory}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"We encountered an error processing {directory}: ");
        Console.WriteLine($"{ex.Message}");
    }

    return size;
}
```

Our method takes an instance of the `DirectoryInfo` class representing the root directory and throws a `DirectoryNotFoundException` if it is `null` or does not exist. It then initializes `size` to keep track of the total. Using `GetFiles()`, we retrieve all the files in the directory, and by calling `Sum()` we add up their `Length` values and update `size`. Next, we retrieve the subdirectories with `GetDirectories()` and call `Sum()` again, this time passing only our method's name. This works because the method's signature matches the `Func<TSource, long>` selector parameter of the `Sum<TSource>(this IEnumerable<TSource> source, Func<TSource, long> selector)` extension method, so each subdirectory is measured by a recursive call. If a directory cannot be read, we log the problem and return whatever was counted up to that point.
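Passing a method name to `Sum()` is not specific to the file system. As a rough sketch, the example below applies the same pattern to a hypothetical in-memory `Folder` type, which stands in for a directory so the idea can be tried without touching the disk. The `Folder` and `FolderSize` names are assumptions introduced for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a directory: its own file sizes plus child folders.
public sealed class Folder
{
    public List<long> FileSizes { get; } = new List<long>();
    public List<Folder> Children { get; } = new List<Folder>();
}

public static class FolderSize
{
    public static long GetSize(Folder folder)
    {
        // Sizes of the files directly inside this folder
        long size = folder.FileSizes.Sum();

        // Method group: equivalent to Children.Sum(child => GetSize(child))
        size += folder.Children.Sum(GetSize);

        return size;
    }
}
```

The compiler converts the `GetSize` method group to a `Func<Folder, long>`, which is the same conversion that lets us write `Sum(GetSizeWithRecursion)` for a collection of `DirectoryInfo` objects.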

Recursive solutions are concise and readable. Yet, they may negatively affect performance due to the method call overhead for each subdirectory, and every level of nesting adds another frame to the call stack. Let’s move forward and consider other methods to determine the size of a directory.

### An Iterative Method to Compute the Size

C# has four types of loops or iteration methods: `while`, `do-while`, `for`, and `foreach` loops. Let’s define another method, `GetSizeByIteration()`, which utilizes a stack and a `while` loop to calculate the directory size:

```csharp
// Requires: using System; using System.Collections.Generic; using System.IO;
public static long GetSizeByIteration(string directoryPath)
{
    long size = 0;
    var stack = new Stack<string>();

    stack.Push(directoryPath);

    while (stack.Count > 0)
    {
        string directory = stack.Pop();

        try
        {
            var files = Directory.GetFiles(directory);

            foreach (var file in files)
            {
                size += new FileInfo(file).Length;
            }

            var subDirectories = Directory.GetDirectories(directory);

            // Queue the subdirectories so later iterations process them
            foreach (var subDirectory in subDirectories)
            {
                stack.Push(subDirectory);
            }
        }
        catch (UnauthorizedAccessException)
        {
            Console.WriteLine($"We do not have access to {directory}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"We encountered an error processing {directory}: ");
            Console.WriteLine($"{ex.Message}");
        }
    }

    return size;
}
```

In this method, the parameter is the `directoryPath` as opposed to an instance of `DirectoryInfo` in the recursive method. Here, we discuss when to choose between the `Directory` and `DirectoryInfo` classes. Next, we initialize a `Stack<string>` named `stack` and push the path onto it. The `while` loop then runs as long as `stack` is not empty. On each pass, we pop a directory, add the `Length` of each of its files to `size`, and push its subdirectories onto the stack so that later passes process them.

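The iterative method avoids the potential method call overhead of a recursive method, and because the pending directories live in a `Stack<string>` rather than on the call stack, deep nesting does not add call frames. However, the stack grows with the number of directories waiting to be visited, and the traversal still runs on a single thread, so very large directory trees can take a while to process.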

Finally, let’s see how using multiple concurrent threads could help handle these issues and make things faster.

### Calculating Directory Size In Parallel

Parallel processing involves using multiple threads to execute tasks simultaneously. We discuss the details of how this works here.

Now, we design our last method `GetSizeByParallelProcessing()`:

```csharp
// Requires: using System; using System.IO; using System.Threading;
// using System.Threading.Tasks;
public static long GetSizeByParallelProcessing(DirectoryInfo directory,
    SearchOption searchOption = SearchOption.AllDirectories)
{
    if (directory == null || !directory.Exists)
    {
        throw new DirectoryNotFoundException("Directory does not exist.");
    }

    long size = 0;

    try
    {
        Parallel.ForEach(directory.EnumerateFiles("*", searchOption), fileInfo =>
        {
            try
            {
                // Atomic add: several threads update size at the same time
                Interlocked.Add(ref size, fileInfo.Length);
            }
            catch (UnauthorizedAccessException)
            {
                Console.WriteLine($"We do not have access to {fileInfo.FullName}");
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error processing {fileInfo.FullName}: ");
                Console.WriteLine($"{ex.Message}");
            }
        });
    }
    catch (UnauthorizedAccessException)
    {
        Console.WriteLine($"We do not have access to {directory.FullName}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Error processing {directory.FullName}: ");
        Console.WriteLine($"{ex.Message}");
    }

    return size;
}
```

This method includes a `SearchOption` parameter that we pass into `EnumerateFiles()`. It is an enum that lets us indicate whether we want to focus only on the top directory or consider subdirectories in our calculation. In other words, it helps us manage the depth of our traversal. `SearchOption.AllDirectories` includes all subfolders, while `SearchOption.TopDirectoryOnly` focuses on the files in the specified directory.
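To see the two options side by side, the following sketch builds a small throwaway directory tree in the temp folder, counts the files with each `SearchOption`, and then deletes the tree. The `SearchOptionDemo` class, the `CountFiles()` method, and the file names are made up for this illustration.

```csharp
using System.IO;
using System.Linq;

public static class SearchOptionDemo
{
    // Creates root/a.txt and root/nested/b.txt, counts files with each
    // SearchOption, and removes the temporary tree afterwards.
    public static (int TopOnly, int All) CountFiles()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string nested = Path.Combine(root, "nested");
        Directory.CreateDirectory(nested); // also creates root

        try
        {
            File.WriteAllText(Path.Combine(root, "a.txt"), "top");
            File.WriteAllText(Path.Combine(nested, "b.txt"), "nested");

            var directory = new DirectoryInfo(root);

            int topOnly = directory
                .EnumerateFiles("*", SearchOption.TopDirectoryOnly)
                .Count();
            int all = directory
                .EnumerateFiles("*", SearchOption.AllDirectories)
                .Count();

            return (topOnly, all);
        }
        finally
        {
            Directory.Delete(root, recursive: true);
        }
    }
}
```

With one file at the top level and one in a subfolder, `TopDirectoryOnly` sees a single file while `AllDirectories` sees both.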

`Parallel.ForEach()` enables the concurrent processing of files in the directory and, depending on `searchOption`, its subdirectories. Because multiple threads update the same variable, in this case `size`, at the same time, we use `Interlocked.Add()` to perform each update. The `Interlocked` class enables us to perform certain operations atomically, preventing race conditions.
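The effect of `Interlocked.Add()` is easier to see in isolation. As a minimal sketch, the hypothetical `SumInParallel()` method below adds the numbers from 1 to `count` to a shared total from several threads. The class and method names are assumptions for this example.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedDemo
{
    // Adds the numbers 1..count to a shared total from multiple threads.
    public static long SumInParallel(int count)
    {
        long total = 0;

        Parallel.For(1, count + 1, i =>
        {
            // A plain "total += i" is a read-modify-write sequence and can lose
            // updates when threads interleave. Interlocked.Add does it atomically.
            Interlocked.Add(ref total, i);
        });

        return total;
    }
}
```

Since every addition is atomic, `SumInParallel(1000)` returns 500,500 regardless of how the iterations are scheduled across threads.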

`EnumerateFiles()` and `EnumerateDirectories()` are more efficient than the `GetFiles()` and `GetDirectories()` methods because they allow us to start processing entries as they are discovered instead of loading them all into an array upfront. The enumerate methods have overloads that accept `searchPattern` and `searchOption` parameters. The search pattern allows us to restrict which files or directories we enumerate, while `SearchOption`, as we have already seen, is an enum we use to specify the depth of our traversal.
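As an illustration of the `searchPattern` overload, the sketch below sums only the files that match a pattern while streaming the results. `GetSizeOfMatchingFiles()` is a hypothetical helper, not one of the three methods in `DirectorySizeCalculator`, and it omits the error handling shown earlier to stay short.

```csharp
using System.IO;
using System.Linq;

public static class EnumerationDemo
{
    // Sums the sizes of files matching searchPattern, consuming the
    // FileInfo objects one at a time as the enumeration produces them.
    public static long GetSizeOfMatchingFiles(string path, string searchPattern)
    {
        var directory = new DirectoryInfo(path);

        return directory
            .EnumerateFiles(searchPattern, SearchOption.AllDirectories)
            .Sum(file => file.Length);
    }
}
```

For example, calling `GetSizeOfMatchingFiles(path, "*.log")` would report only the space taken up by log files under `path`.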

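Breaking down tasks into smaller pieces using parallel programming can enhance performance and make the system more responsive. Note, however, that it uses more CPU, especially when processing many subdirectories concurrently. Directory scans are also often limited by disk speed rather than by the CPU, so the actual gain depends on the underlying storage and is worth measuring.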

## Conclusion

We wrote three methods that calculate the size of a directory. The advantages of each method come with certain tradeoffs. Therefore, when deciding whether to use recursive, iterative, or parallel processing methods, there are a few key factors to consider. These include the overall size of the directory structure, resource constraints, performance requirements, and any other unique needs of our application. By accounting for all these variables, we can decide on the best-suited method to calculate our directory size.