In this article, we will discuss why having multiple databases behind an application is useful and how dynamically switching an EF DbContext can be achieved without experiencing downtime. Feel free to check out our series of articles on Entity Framework Core.

To download the source code for this article, you can visit our GitHub repository.

Let’s start.

Why Should We Use Multiple Databases?

Let’s discuss the numerous advantages of using multiple databases within an application.

Flexibility in Database Technology

One of the most significant benefits is the ability to employ different database technologies for various types of data. For example, if an area of the application contains a large quantity of unstructured data, it would be challenging to store it in an SQL database. Instead, we might consider storing this data in a NoSQL database. Conversely, if another area of the application uses structured data, we can use an SQL database.

Scalability

Another benefit is the ability to scale the databases separately. Using the previous example, the NoSQL database may receive a lot more queries than the SQL one, causing a slowdown. Since we are using two different databases, we can simply scale the NoSQL one.

Support Code Maze on Patreon to get rid of ads and get the best discounts on our products!
Become a patron at Patreon!

Decoupling

Utilizing multiple databases also encourages the development of a microservices architecture. Building a microservice for each database becomes straightforward when data boundaries are clear enough to define separate databases.

Multitenancy

In a multitenant application, it’s possible to use multiple databases. This is relevant when one of the tenants requires it due to having larger amounts of data and superior throughput needs than other tenants. In such cases, it becomes logical to allocate a distinct database.

Caching

And the last technical advantage of using multiple databases is caching. For example, we can offload read-heavy operations to an in-memory cache database, like Redis. This approach ensures better responsiveness and also frees up resources from our main database.

Business Demand

Sometimes, it’s not our choice to use multiple databases. In cases of compliance, we may be required by law to store sensitive user information in a separate database. Business requirements can also force us to use multiple databases, for example when we need to archive rarely accessed data.

Let’s see what our options are from the application side.

Structuring Our Application for Multi-Database Operation

There are multiple ways to implement our domain and data model in a modern application. When dealing with databases that store different data (like in the case of microservices) or data in a different shape (like NoSQL and SQL databases) it’s logical to use different models that can accommodate the different needs of the data.

However, when we are relying on multiple databases that store the same data, the two main ways to structure our models are by splitting them into query and command models (known as CQRS) or by maintaining a single domain model for both reading and writing data. To learn more about CQRS, check out our article on the topic.

In this article, we will focus on the single-domain model approach.

Now that we’ve discussed the theoretical aspects, let’s dive into a few possible implementations using Entity Framework Core.

Using a Single DbContext

The most straightforward approach is to use a single DbContext and to change the connection string based on a value that is provided by another service.

Setting up the Infrastructure

Let’s start by creating a new console application and installing the following NuGet packages:

Microsoft.EntityFrameworkCore
Microsoft.EntityFrameworkCore.Sqlite
Microsoft.Extensions.Configuration
Microsoft.Extensions.Configuration.Json

These packages will allow us to use Entity Framework Core, read configuration values from the appsettings.json file, and configure our application using Microsoft’s dependency injection framework. For simplicity, we will use two SQLite databases, but any other database server would work just as well. Just remember to install the appropriate Entity Framework database provider NuGet package.

Now, let’s use the sqlite3 command-line tool to create two SQLite databases at the root of the project:

sqlite3 primary.db
sqlite3 secondary.db

When we execute the sqlite3 command it will create the database file and open a console connection to that database. Note that the console prompt will change into sqlite>. We can query the currently open databases name with the .databases command to ensure it has been created. To quit the SQLite console we simply use the .quit command.

When we finished creating the databases, let’s add an appsettings.json file with the following content, at the root of the project, to store our connection strings:

{
    "ConnectionStrings": {
        "Primary": "Data Source=primary.db;",
        "Secondary": "Data Source=secondary.db;"
    }
}

Next, in the Program.cs file, let’s set up our ConfigurationBuilder and ServiceCollection instances, and load the appsettings.json:

var services = new ServiceCollection();
var configurationBuilder = new ConfigurationBuilder();
configurationBuilder.AddJsonFile("appsettings.json");
var configuration = configurationBuilder.Build();
services.AddSingleton<IConfiguration>(configuration);

We’ve finished setting up the infrastructure setup, so now let’s continue with creating the necessary code.

Creating the DataSourceProvider

For dynamic accessing and setting the connection string, let’s create an enum to make the choice clear in the code: 

public enum DataSource
{
    Primary,
    Secondary
}

Now, let’s create the DataSourceProvider service and its interface:

public class DataSourceProvider : IDataSourceProvider
{
    private readonly IConfiguration _configuration;
    public DataSource CurrentDataSource { get; set; }

    public DataSourceProvider(IConfiguration configuration)
    {
        _configuration = configuration;
    }
    
    public string GetConnectionString()
    {
        return CurrentDataSource switch
        {
            DataSource.Primary => _configuration.GetConnectionString("Primary")!,
            DataSource.Secondary => _configuration.GetConnectionString("Secondary")!,
            _ => throw new ArgumentOutOfRangeException()
        };
    }
}

public interface IDataSourceProvider
{
    DataSource CurrentDataSource { set; }
    string GetConnectionString();
}

We begin by declaring a CurrentDataSource property with a public getter and setter to be able to change the connection string. Next, we need to inject an IConfiguration instance via the constructor. This ensures that the DataSourceProvider service can retrieve the connection string from our appsettings.json file. Finally we’ll implement the GetConnectionString() method, which returns either the Primary or the Secondary connection string from the configuration, based on the value of the CurrentDataSource property.

We will use this service to construct our DbContext, so let’s create one.

Creating the DbContext

To implement Entity Framework’s DbContext, we simply need to inherit from it and create an empty constructor that accepts a single DbContextOptions<AppDbContext> parameter:

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options)
    {
    }
}

We don’t need any DbSet for now, since we’re primarily looking to verify the connectivity to our database. Let’s register our AppDbContext with the SQLite provider along with the DataSourceProvider service in the Program.cs file, and then construct the ServiceProvider:

services.AddDbContext<AppDbContext>((sp, o) => 
    o.UseSqlite(sp.GetRequiredService<IDataSourceProvider>().GetConnectionString()));
services.AddSingleton<IDataSourceProvider, DataSourceProvider>();
var serviceProvider = services.BuildServiceProvider();

Now, when we request an AppDbContext from the DI, it first retrieves the connection string from the DataSourceProvider service and constructs a DbContext using that connection string. So it’s crucial to understand that we have to set the CurrentDataSource property’s value before requesting an AppDbContext. If not, the DbContext will be created using the previous CurrentDataSource value.

Switching Connection Strings

Let’s test the connection by implementing a simple user input handling:

string? input;
do
{
    Console.WriteLine("Choose a data source: [1] Primary, [2] Secondary");
    input = Console.ReadLine();
} while (input is not ("1" or "2"));

serviceProvider.GetRequiredService<IDataSourceProvider>().CurrentDataSource = input switch
{
    "1" => DataSource.Primary,
    "2" => DataSource.Secondary,
    _ => throw new ArgumentOutOfRangeException()
};

We prompt the user to type either 1 or 2 into the console and adjust the CurrentDataSource property based on their input. Remember, we should only request an AppDbContext after assigning the DataSource:

var dbContext = serviceProvider.GetRequiredService<AppDbContext>();

Console.WriteLine("Current connection string: " + dbContext.Database.GetConnectionString());
Console.WriteLine($"CanConnect result: {await dbContext.Database.CanConnectAsync()}");

To confirm that the chosen option is relayed to Entity Framework, we can display the current connection string on the console. Then, we test the connection to the database and inform the user of the outcome:

CanConnect result: true

The advantage of this method is its simplicity and flexibility. The DataSourceProvider service can be seamlessly replaced with, for instance, HttpContextAccessor, allowing for the receipt of a connection string from the client or deciding on a database based on an HTTP header value. One limitation, however, is that we must be particularly attentive when assigning the CurrentDataSource property value. It’s easy to unintentionally use an outdated connection string in our AppDbContext if we’re not meticulous. Another limitation is that this method is impractical for multi-user web applications unless the CurrentDataSource property value is stored on a user-by-user basis, given the nature of service lifetimes.

For the time being, we’ll aim to address these issues with a different implementation that uses two DbContext classes.

Using a Separate DbContext

To employ two distinct DbContext classes, an abstraction layer above them is essential. This is where the repository pattern comes into play. For a deeper understanding of repositories, refer to our article on the subject.

Creating Another DbContext

Having established that, let’s rename our previous AppDbContext to PrimaryDbContext and create a copy named SecondaryDbContext:

public sealed class PrimaryDbContext : DbContext
{
    public PrimaryDbContext(DbContextOptions<PrimaryDbContext> options) : base(options)
    {
    }
}

public class SecondaryDbContext : DbContext
{
    public SecondaryDbContext(DbContextOptions<SecondaryDbContext> options) : base(options)
    {
    }
}

Let’s update the dependency injection registrations too:

services.AddDbContext<PrimaryDbContext>(o => o.UseSqlite(configuration.GetConnectionString("Primary")));
services.AddDbContext<SecondaryDbContext>(o => o.UseSqlite(configuration.GetConnectionString("Secondary")));

Each DbContext receives its distinct connection string, eliminating the need for the DataSourceProvider service. It can now be safely removed.

Implementing the Repository Pattern with DbContexts

Now let’s implement the Repository class with its interface:

public class Repository : IRepository
{
    private readonly PrimaryDbContext _primaryDbContext;
    private readonly SecondaryDbContext _secondaryDbContext;

    public Repository(PrimaryDbContext primaryDbContext, SecondaryDbContext secondaryDbContext)
    {
        _primaryDbContext = primaryDbContext;
        _secondaryDbContext = secondaryDbContext;
    }

    public async Task<bool> TestConnection(DataSource dataSource)
    {
        return dataSource switch
        {
            DataSource.Primary => await _primaryDbContext.Database.CanConnectAsync(),
            DataSource.Secondary => await _secondaryDbContext.Database.CanConnectAsync(),
            _ => throw new ArgumentOutOfRangeException(nameof(dataSource), dataSource, null)
        };
    }
}

public interface IRepository
{
    Task<bool> TestConnection(DataSource dataSource);
}

We begin by integrating both DbContext classes, allowing the repository to determine which to use. Every method we introduce should include a DataSource parameter. Using this, the repository can select between the DbContext instances. For demonstration, we introduce a TestConnection() method, which simply invokes the appropriate DbContext‘s CanConnectAsync() method to verify the database connection’s functionality.

Before assessing our solution, remember the dependency injection registration:

services.AddScoped<IRepository, Repository>();

As observed, we opt for the Scoped lifetime here, in contrast to the Singleton choice we made for the DataSourceProvider service. This distinction is due to the need for only one DataSourceProvider instance that aligns with the application’s lifespan, ensuring it retains the selected DataSource‘s value. This particularity makes it unsuitable for multi-user web applications. With a Singleton service, every user shares the same instance, and a connection string alteration by one user affects all. However, registering the DataSourceProvider as a Scoped service means it resets its value post every request. Therefore, persistent storage becomes necessary to retain individual user connection string preferences.

Thus, the Scoped lifetime is preferable here. By default, all DbContext classes enlist as Scoped, and we should consistently align with, or opt for, lifespans shorter than those of the services we use. This precaution avoids inadvertently extending the lifespan of these services beyond their designated scope. For a comprehensive understanding, refer to our article on service lifetimes.

Switching a DbContext

Time to test our solution by calling the TestConnection() method with both DataSource.Primary and DataSource.Secondary arguments:

var repository = serviceProvider.GetRequiredService<IRepository>();

var primaryResult = await repository.TestConnection(DataSource.Primary);
var secondaryResult = await repository.TestConnection(DataSource.Secondary);
Console.WriteLine($"Primary connection result: {primaryResult}");
Console.WriteLine($"Secondary connection result: {secondaryResult}");

We should see the following output:

Primary connection result: True
Secondary connection result: True

We’ve effectively addressed both challenges presented by our initial approach. Now, before making a call, we’re always certain about which DbContext we’re employing, and we can make choices tailored to individual users. That said, consider a scenario where numerous methods in the repository are invoked and they exclusively utilize the primary database. In such a case, indicating the DataSource for every single method could become rather repetitive.

We can streamline things by introducing a RepositoryFactory class, eliminating the need for this property.

Implementing the Factory Pattern

The factory pattern is a design pattern that helps us separate the creation logic of a class or a group of classes from the classes themselves.

Creating a Repository for Each DbContext

To be able to utilize the factory pattern let’s first modify our existing Repository class:

public class PrimaryRepository : IRepository
{
    private readonly PrimaryDbContext _primaryDbContext;

    public PrimaryRepository(PrimaryDbContext primaryDbContext)
    {
        _primaryDbContext = primaryDbContext;
    }

    public async Task<bool> TestConnection()
    {
        return await _primaryDbContext.Database.CanConnectAsync();
    }
}

public interface IRepository
{
    Task<bool> TestConnection();
}

We removed the SecondaryDbContext entirely since this is now a PrimaryRepository. We also removed the DataSource parameter from the IRepository interface’s TestConnection() method, because we won’t need it anymore. And we cleaned up the implementation of said method, as we are only checking the connection to the primary database through our PrimaryRepository.

Now let’s create the SecondaryRepository too based on this:

public class SecondaryRepository : IRepository
{
    private readonly SecondaryDbContext _secondaryDbContext;

    public SecondaryRepository(SecondaryDbContext secondaryDbContext)
    {
        _secondaryDbContext = secondaryDbContext;
    }

    public async Task<bool> TestConnection()
    {
        return await _secondaryDbContext.Database.CanConnectAsync();
    }
}

It’s totally the same for now, but it uses the SecondaryDbContext to connect to our secondary database. Let’s update the dependency injection registration of the IRepository interface:

services.AddTransient<IRepository, PrimaryRepository>();
services.AddTransient<IRepository, SecondaryRepository>();

We registered both repositories to the same interface. But how does the dependency injection framework decide which implementation we request? It doesn’t have to, since we will request all of them. That’s where the factory pattern comes into play.

Implementing the Repository Factory

Let’s create the RepositoryFactory class:

public class RepositoryFactory : IRepositoryFactory
{
    private readonly IEnumerable<IRepository> _repositories;

    public RepositoryFactory(IEnumerable<IRepository> repositories)
    {
        _repositories = repositories;
    }

    public IRepository GetRepository(DataSource dataSource)
    {
        return dataSource switch
        {
            DataSource.Primary => _repositories.Single(r => r is PrimaryRepository),
            DataSource.Secondary => _repositories.Single(r => r is SecondaryRepository),
            _ => throw new ArgumentOutOfRangeException(nameof(dataSource), dataSource, null)
        };
    }
}

public interface IRepositoryFactory
{
    IRepository GetRepository(DataSource dataSource);
}

In the constructor, we requested all the implementations of the IRepository interface. Then we declared the factory method, which will get the DataSource as a parameter and will return the appropriate IRepository for the given DataSource.  Then, inside the method, we select the appropriate repository from the available repositories. Also, remember to register the factory:

services.AddTransient<IRepositoryFactory, RepositoryFactory>();

At this point, using the DataSource enum is a bit redundant, so we should remove it entirely, by making the GetRepository() method generic:

public class RepositoryFactory : IRepositoryFactory
{
    private readonly IEnumerable<IRepository> _repositories;

    public RepositoryFactory(IEnumerable<IRepository> repositories)
    {
        _repositories = repositories;
    }

    public IRepository GetRepository<TRepository>() where TRepository : IRepository
    {
        return _repositories.Single(r => r is TRepository);
    }
}

public interface IRepositoryFactory
{
    IRepository GetRepository<TRepository>() where TRepository : IRepository;
}

Now, we won’t have to add or remove values from an enum when we introduce a new database or remove an existing one. We applied the IRepository constraint to the TRepository type parameter, ensuring that the caller can only provide classes as parameters that implement the IRepository interface.

By implementing the factory pattern, we also made our classes adhere to the single responsibility principle. Our repositories have no information about databases other than their own, or how to choose between them, so they do only one thing. For example, our factory only chooses the appropriate repository.

Testing the Solution

To test our implementation, let’s request an IRepositoryFactory from the dependency injection container:

var repositoryFactory = serviceProvider.GetRequiredService<IRepositoryFactory>();
var primaryRepository = repositoryFactory.GetRepository<PrimaryRepository>();
var secondaryRepository = repositoryFactory.GetRepository<SecondaryRepository>();

var primaryResult = await primaryRepository.TestConnection();
var secondaryResult = await secondaryRepository.TestConnection();
    
Console.WriteLine($"Primary connection result: {primaryResult}");
Console.WriteLine($"Secondary connection result: {secondaryResult}");

Then we get both a PrimaryRepository and a SecondaryRepository from the factory, and print the result of their TestConnection() methods:

Primary connection result: True
Secondary connection result: True

We successfully made our code cleaner and implemented all the best practices to make it easy to test and maintain.

The Lifetime of Repositories

An important aspect that we should discuss is the lifetime of the repositories created by the factory. We’ve registered everything as Transient, which means that every time we request an instance, the dependency injection framework will create a new one for us. However, when we request the factory, it requests all the repositories at once. This means that by calling repositoryFactory.GetRepository<PrimaryRepository>() twice on the same factory we will get the same repository object:

var serviceCollection = new ServiceCollection();
serviceCollection.AddTransient<PrimaryDbContext>(_ => new PrimaryDbContext(new()));
serviceCollection.AddTransient<IRepository, PrimaryRepository>();
serviceCollection.AddTransient<IRepositoryFactory, RepositoryFactory>();

var serviceProvider = serviceCollection.BuildServiceProvider();

var repositoryFactory = serviceProvider.GetRequiredService<IRepositoryFactory>();
var primaryRepository1 = repositoryFactory.GetRepository<PrimaryRepository>();
var primaryRepository2 = repositoryFactory.GetRepository<PrimaryRepository>();

Assert.That(primaryRepository2, Is.SameAs(primaryRepository1));

It is an important thing to keep in mind, that our repository’s lifetime is bound to the factory’s lifetime. So we should not register our factory as singleton, since that will transitively turn all our repositories into singleton, and we should avoid that. Similarly, if we register our repositories as Transient, but the factory as Scoped, then effectively the repositories will be Scoped.

Conclusion

In this article, we overviewed the benefits of using multiple repositories in an application.

We also discussed different approaches to accessing multiple databases using Entity Framework Core. We touched on some design patterns too, like using repositories or factories.

Finally, we tested some important lifetime behaviors of the repositories.

Liked it? Take a second to support Code Maze on Patreon and get the ad free reading experience!
Become a patron at Patreon!