In this article, we will discuss why having multiple databases behind an application is useful and how dynamically switching an EF DbContext can be achieved without experiencing downtime. Feel free to check out our series of articles on Entity Framework Core.
Let’s start.
Why Should We Use Multiple Databases?
Let’s discuss the numerous advantages of using multiple databases within an application.
Flexibility in Database Technology
One of the most significant benefits is the ability to employ different database technologies for various types of data. For example, if an area of the application contains a large quantity of unstructured data, it would be challenging to store it in an SQL database. Instead, we might consider storing this data in a NoSQL database. Conversely, if another area of the application uses structured data, we can use an SQL database.
Scalability
Another benefit is the ability to scale the databases separately. Using the previous example, the NoSQL database may receive a lot more queries than the SQL one, causing a slowdown. Since we are using two different databases, we can simply scale the NoSQL one.
Decoupling
Utilizing multiple databases also encourages the development of a microservices architecture. Building a microservice for each database becomes straightforward when data boundaries are clear enough to define separate databases.
Multitenancy
In a multitenant application, it’s possible to use multiple databases. This is relevant when one of the tenants requires it due to having larger amounts of data and superior throughput needs than other tenants. In such cases, it becomes logical to allocate a distinct database.
Caching
And the last technical advantage of using multiple databases is caching. For example, we can offload read-heavy operations to an in-memory cache database, like Redis. This approach ensures better responsiveness and also frees up resources from our main database.
Business Demand
Sometimes, it’s not our choice to use multiple databases. In cases of compliance, we may be required by law to store sensitive user information in a separate database. Business requirements can also force us to use multiple databases, for example when we need to archive rarely accessed data.
Let’s see what our options are from the application side.
Structuring Our Application for Multi-Database Operation
There are multiple ways to implement our domain and data model in a modern application. When dealing with databases that store different data (like in the case of microservices) or data in a different shape (like NoSQL and SQL databases) it’s logical to use different models that can accommodate the different needs of the data.
However, when we are relying on multiple databases that store the same data, the two main ways to structure our models are by splitting them into query and command models (known as CQRS) or by maintaining a single domain model for both reading and writing data. To learn more about CQRS, check out our article on the topic.
In this article, we will focus on the single-domain model approach.
Now that we’ve discussed the theoretical aspects, let’s dive into a few possible implementations using Entity Framework Core.
Using a Single DbContext
The most straightforward approach is to use a single DbContext
and to change the connection string based on a value that is provided by another service.
Setting up the Infrastructure
Let’s start by creating a new console application and installing the following NuGet packages:
Microsoft.EntityFrameworkCore Microsoft.EntityFrameworkCore.Sqlite Microsoft.Extensions.Configuration Microsoft.Extensions.Configuration.Json
These packages will allow us to use Entity Framework Core, read configuration values from the appsettings.json
file, and configure our application using Microsoft’s dependency injection framework. For simplicity, we will use two SQLite databases, but any other database server would work just as well. Just remember to install the appropriate Entity Framework database provider NuGet package.
Now, let’s use the sqlite3
command-line tool to create two SQLite databases at the root of the project:
sqlite3 primary.db sqlite3 secondary.db
When we execute the sqlite3
command it will create the database file and open a console connection to that database. Note that the console prompt will change into sqlite>
. We can query the currently open databases name with the .databases
command to ensure it has been created. To quit the SQLite console we simply use the .quit
command.
When we finished creating the databases, let’s add an appsettings.json
file with the following content, at the root of the project, to store our connection strings:
{ "ConnectionStrings": { "Primary": "Data Source=primary.db;", "Secondary": "Data Source=secondary.db;" } }
Next, in the Program.cs
file, let’s set up our ConfigurationBuilder
and ServiceCollection
instances, and load the appsettings.json
:
var services = new ServiceCollection(); var configurationBuilder = new ConfigurationBuilder(); configurationBuilder.AddJsonFile("appsettings.json"); var configuration = configurationBuilder.Build(); services.AddSingleton<IConfiguration>(configuration);
We’ve finished setting up the infrastructure setup, so now let’s continue with creating the necessary code.
Creating the DataSourceProvider
For dynamic accessing and setting the connection string, let’s create an enum
to make the choice clear in the code:
public enum DataSource { Primary, Secondary }
Now, let’s create the DataSourceProvider
service and its interface:
public class DataSourceProvider : IDataSourceProvider { private readonly IConfiguration _configuration; public DataSource CurrentDataSource { get; set; } public DataSourceProvider(IConfiguration configuration) { _configuration = configuration; } public string GetConnectionString() { return CurrentDataSource switch { DataSource.Primary => _configuration.GetConnectionString("Primary")!, DataSource.Secondary => _configuration.GetConnectionString("Secondary")!, _ => throw new ArgumentOutOfRangeException() }; } } public interface IDataSourceProvider { DataSource CurrentDataSource { set; } string GetConnectionString(); }
We begin by declaring a CurrentDataSource
property with a public getter and setter to be able to change the connection string. Next, we need to inject an IConfiguration
instance via the constructor. This ensures that the DataSourceProvider
service can retrieve the connection string from our appsettings.json
file. Finally we’ll implement the GetConnectionString()
method, which returns either the Primary
or the Secondary
connection string from the configuration, based on the value of the CurrentDataSource
property.
We will use this service to construct our DbContext
, so let’s create one.
Creating the DbContext
To implement Entity Framework’s DbContext
, we simply need to inherit from it and create an empty constructor that accepts a single DbContextOptions<AppDbContext>
parameter:
public class AppDbContext : DbContext { public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { } }
We don’t need any DbSet
for now, since we’re primarily looking to verify the connectivity to our database. Let’s register our AppDbContext
with the SQLite provider along with the DataSourceProvider
service in the Program.cs
file, and then construct the ServiceProvider
:
services.AddDbContext<AppDbContext>((sp, o) => o.UseSqlite(sp.GetRequiredService<IDataSourceProvider>().GetConnectionString())); services.AddSingleton<IDataSourceProvider, DataSourceProvider>(); var serviceProvider = services.BuildServiceProvider();
Now, when we request an AppDbContext
from the DI, it first retrieves the connection string from the DataSourceProvider
service and constructs a DbContext
using that connection string. So it’s crucial to understand that we have to set the CurrentDataSource
property’s value before requesting an AppDbContext.
If not, the DbContext
will be created using the previous CurrentDataSource
value.
Switching Connection Strings
Let’s test the connection by implementing a simple user input handling:
string? input; do { Console.WriteLine("Choose a data source: [1] Primary, [2] Secondary"); input = Console.ReadLine(); } while (input is not ("1" or "2")); serviceProvider.GetRequiredService<IDataSourceProvider>().CurrentDataSource = input switch { "1" => DataSource.Primary, "2" => DataSource.Secondary, _ => throw new ArgumentOutOfRangeException() };
We prompt the user to type either 1 or 2 into the console and adjust the CurrentDataSource
property based on their input. Remember, we should only request an AppDbContext
after assigning the DataSource
:
var dbContext = serviceProvider.GetRequiredService<AppDbContext>(); Console.WriteLine("Current connection string: " + dbContext.Database.GetConnectionString()); Console.WriteLine($"CanConnect result: {await dbContext.Database.CanConnectAsync()}");
To confirm that the chosen option is relayed to Entity Framework, we can display the current connection string on the console. Then, we test the connection to the database and inform the user of the outcome:
CanConnect result: true
The advantage of this method is its simplicity and flexibility. The DataSourceProvider
service can be seamlessly replaced with, for instance, HttpContextAccessor
, allowing for the receipt of a connection string from the client or deciding on a database based on an HTTP header value. One limitation, however, is that we must be particularly attentive when assigning the CurrentDataSource
property value. It’s easy to unintentionally use an outdated connection string in our AppDbContext
if we’re not meticulous. Another limitation is that this method is impractical for multi-user web applications unless the CurrentDataSource
property value is stored on a user-by-user basis, given the nature of service lifetimes.
For the time being, we’ll aim to address these issues with a different implementation that uses two DbContext
classes.
Using a Separate DbContext
To employ two distinct DbContext
classes, an abstraction layer above them is essential. This is where the repository pattern comes into play. For a deeper understanding of repositories, refer to our article on the subject.
Creating Another DbContext
Having established that, let’s rename our previous AppDbContext
to PrimaryDbContext
and create a copy named SecondaryDbContext
:
public sealed class PrimaryDbContext : DbContext { public PrimaryDbContext(DbContextOptions<PrimaryDbContext> options) : base(options) { } } public class SecondaryDbContext : DbContext { public SecondaryDbContext(DbContextOptions<SecondaryDbContext> options) : base(options) { } }
Let’s update the dependency injection registrations too:
services.AddDbContext<PrimaryDbContext>(o => o.UseSqlite(configuration.GetConnectionString("Primary"))); services.AddDbContext<SecondaryDbContext>(o => o.UseSqlite(configuration.GetConnectionString("Secondary")));
Each DbContext
receives its distinct connection string, eliminating the need for the DataSourceProvider
service. It can now be safely removed.
Implementing the Repository Pattern with DbContexts
Now let’s implement the Repository
class with its interface:
public class Repository : IRepository { private readonly PrimaryDbContext _primaryDbContext; private readonly SecondaryDbContext _secondaryDbContext; public Repository(PrimaryDbContext primaryDbContext, SecondaryDbContext secondaryDbContext) { _primaryDbContext = primaryDbContext; _secondaryDbContext = secondaryDbContext; } public async Task<bool> TestConnection(DataSource dataSource) { return dataSource switch { DataSource.Primary => await _primaryDbContext.Database.CanConnectAsync(), DataSource.Secondary => await _secondaryDbContext.Database.CanConnectAsync(), _ => throw new ArgumentOutOfRangeException(nameof(dataSource), dataSource, null) }; } } public interface IRepository { Task<bool> TestConnection(DataSource dataSource); }
We begin by integrating both DbContext
classes, allowing the repository to determine which to use. Every method we introduce should include a DataSource
parameter. Using this, the repository can select between the DbContext
instances. For demonstration, we introduce a TestConnection()
method, which simply invokes the appropriate DbContext
‘s CanConnectAsync()
method to verify the database connection’s functionality.
Before assessing our solution, remember the dependency injection registration:
services.AddScoped<IRepository, Repository>();
As observed, we opt for the Scoped
lifetime here, in contrast to the Singleton
choice we made for the DataSourceProvider
service. This distinction is due to the need for only one DataSourceProvider
instance that aligns with the application’s lifespan, ensuring it retains the selected DataSource
‘s value. This particularity makes it unsuitable for multi-user web applications. With a Singleton
service, every user shares the same instance, and a connection string alteration by one user affects all. However, registering the DataSourceProvider
as a Scoped
service means it resets its value post every request. Therefore, persistent storage becomes necessary to retain individual user connection string preferences.
Thus, the Scoped
lifetime is preferable here. By default, all DbContext
classes enlist as Scoped
, and we should consistently align with, or opt for, lifespans shorter than those of the services we use. This precaution avoids inadvertently extending the lifespan of these services beyond their designated scope. For a comprehensive understanding, refer to our article on service lifetimes.
Switching a DbContext
Time to test our solution by calling the TestConnection()
method with both DataSource.Primary
and DataSource.Secondary
arguments:
var repository = serviceProvider.GetRequiredService<IRepository>(); var primaryResult = await repository.TestConnection(DataSource.Primary); var secondaryResult = await repository.TestConnection(DataSource.Secondary); Console.WriteLine($"Primary connection result: {primaryResult}"); Console.WriteLine($"Secondary connection result: {secondaryResult}");
We should see the following output:
Primary connection result: True Secondary connection result: True
We’ve effectively addressed both challenges presented by our initial approach. Now, before making a call, we’re always certain about which DbContext
we’re employing, and we can make choices tailored to individual users. That said, consider a scenario where numerous methods in the repository are invoked and they exclusively utilize the primary database. In such a case, indicating the DataSource
for every single method could become rather repetitive.
We can streamline things by introducing a RepositoryFactory
class, eliminating the need for this property.
Implementing the Factory Pattern
The factory pattern is a design pattern that helps us separate the creation logic of a class or a group of classes from the classes themselves.
Creating a Repository for Each DbContext
To be able to utilize the factory pattern let’s first modify our existing Repository
class:
public class PrimaryRepository : IRepository { private readonly PrimaryDbContext _primaryDbContext; public PrimaryRepository(PrimaryDbContext primaryDbContext) { _primaryDbContext = primaryDbContext; } public async Task<bool> TestConnection() { return await _primaryDbContext.Database.CanConnectAsync(); } } public interface IRepository { Task<bool> TestConnection(); }
We removed the SecondaryDbContext
entirely since this is now a PrimaryRepository
. We also removed the DataSource
parameter from the IRepository
interface’s TestConnection()
method, because we won’t need it anymore. And we cleaned up the implementation of said method, as we are only checking the connection to the primary database through our PrimaryRepository
.
Now let’s create the SecondaryRepository
too based on this:
public class SecondaryRepository : IRepository { private readonly SecondaryDbContext _secondaryDbContext; public SecondaryRepository(SecondaryDbContext secondaryDbContext) { _secondaryDbContext = secondaryDbContext; } public async Task<bool> TestConnection() { return await _secondaryDbContext.Database.CanConnectAsync(); } }
It’s totally the same for now, but it uses the SecondaryDbContext
to connect to our secondary database. Let’s update the dependency injection registration of the IRepository
interface:
services.AddTransient<IRepository, PrimaryRepository>(); services.AddTransient<IRepository, SecondaryRepository>();
We registered both repositories to the same interface. But how does the dependency injection framework decide which implementation we request? It doesn’t have to, since we will request all of them. That’s where the factory pattern comes into play.
Implementing the Repository Factory
Let’s create the RepositoryFactory
class:
public class RepositoryFactory : IRepositoryFactory { private readonly IEnumerable<IRepository> _repositories; public RepositoryFactory(IEnumerable<IRepository> repositories) { _repositories = repositories; } public IRepository GetRepository(DataSource dataSource) { return dataSource switch { DataSource.Primary => _repositories.Single(r => r is PrimaryRepository), DataSource.Secondary => _repositories.Single(r => r is SecondaryRepository), _ => throw new ArgumentOutOfRangeException(nameof(dataSource), dataSource, null) }; } } public interface IRepositoryFactory { IRepository GetRepository(DataSource dataSource); }
In the constructor, we requested all the implementations of the IRepository
interface. Then we declared the factory method, which will get the DataSource
as a parameter and will return the appropriate IRepository
for the given DataSource
. Then, inside the method, we select the appropriate repository from the available repositories. Also, remember to register the factory:
services.AddTransient<IRepositoryFactory, RepositoryFactory>();
At this point, using the DataSource
enum is a bit redundant, so we should remove it entirely, by making the GetRepository()
method generic:
public class RepositoryFactory : IRepositoryFactory { private readonly IEnumerable<IRepository> _repositories; public RepositoryFactory(IEnumerable<IRepository> repositories) { _repositories = repositories; } public IRepository GetRepository<TRepository>() where TRepository : IRepository { return _repositories.Single(r => r is TRepository); } } public interface IRepositoryFactory { IRepository GetRepository<TRepository>() where TRepository : IRepository; }
Now, we won’t have to add or remove values from an enum when we introduce a new database or remove an existing one. We applied the IRepository
constraint to the TRepository
type parameter, ensuring that the caller can only provide classes as parameters that implement the IRepository
interface.
By implementing the factory pattern, we also made our classes adhere to the single responsibility principle. Our repositories have no information about databases other than their own, or how to choose between them, so they do only one thing. For example, our factory only chooses the appropriate repository.
Testing the Solution
To test our implementation, let’s request an IRepositoryFactory
from the dependency injection container:
var repositoryFactory = serviceProvider.GetRequiredService<IRepositoryFactory>(); var primaryRepository = repositoryFactory.GetRepository<PrimaryRepository>(); var secondaryRepository = repositoryFactory.GetRepository<SecondaryRepository>(); var primaryResult = await primaryRepository.TestConnection(); var secondaryResult = await secondaryRepository.TestConnection(); Console.WriteLine($"Primary connection result: {primaryResult}"); Console.WriteLine($"Secondary connection result: {secondaryResult}");
Then we get both a PrimaryRepository
and a SecondaryRepository
from the factory, and print the result of their TestConnection()
methods:
Primary connection result: True Secondary connection result: True
We successfully made our code cleaner and implemented all the best practices to make it easy to test and maintain.
The Lifetime of Repositories
An important aspect that we should discuss is the lifetime of the repositories created by the factory. We’ve registered everything as Transient
, which means that every time we request an instance, the dependency injection framework will create a new one for us. However, when we request the factory, it requests all the repositories at once. This means that by calling repositoryFactory.GetRepository<PrimaryRepository>()
twice on the same factory we will get the same repository object:
var serviceCollection = new ServiceCollection(); serviceCollection.AddTransient<PrimaryDbContext>(_ => new PrimaryDbContext(new())); serviceCollection.AddTransient<IRepository, PrimaryRepository>(); serviceCollection.AddTransient<IRepositoryFactory, RepositoryFactory>(); var serviceProvider = serviceCollection.BuildServiceProvider(); var repositoryFactory = serviceProvider.GetRequiredService<IRepositoryFactory>(); var primaryRepository1 = repositoryFactory.GetRepository<PrimaryRepository>(); var primaryRepository2 = repositoryFactory.GetRepository<PrimaryRepository>(); Assert.That(primaryRepository2, Is.SameAs(primaryRepository1));
It is an important thing to keep in mind, that our repository’s lifetime is bound to the factory’s lifetime. So we should not register our factory as singleton, since that will transitively turn all our repositories into singleton, and we should avoid that. Similarly, if we register our repositories as Transient
, but the factory as Scoped
, then effectively the repositories will be Scoped
.
Conclusion
In this article, we overviewed the benefits of using multiple repositories in an application.
We also discussed different approaches to accessing multiple databases using Entity Framework Core. We touched on some design patterns too, like using repositories or factories.
Finally, we tested some important lifetime behaviors of the repositories.