When and How to Configure a Read Replica Database in Django

Pablo Grill
May 7, 2020

One of the most important tasks in software engineering is to choose an architecture that best suits a specific problem. The wrong choice of architecture can result in a nonfunctional site that could affect your business. One of the most common problems related to architectures is performance.

Performance issues usually appear in large systems and are hard to diagnose because they are hard to reproduce (sometimes the issue only occurs in a specific environment). They can be caused by several factors, such as hardware, network issues, or a poorly performing program. In this post we will focus on performance issues related to the database and how we can deal with them using read replicas.

Performance Issues in Databases

The database is one of the main sources of slowness in a system, and that slowness can have several causes. One of the most common is a poor schema design: a database that wasn’t designed with the expected operations in mind will probably perform poorly. However, a well-designed schema doesn’t guarantee good performance; other factors can affect it, such as the number of connections. If a system with a single database is used by thousands of users, the number of connections hitting that database will be considerable. When the number of connections is high, the database may not have enough resources to serve all of them in real time, which causes performance issues. Fortunately, databases offer a solution to that problem: replicas.

Replicas in Databases

The idea of replication in software and hardware refers to the technique of using multiple copies of a resource in order to guarantee performance, availability, and fault tolerance. When we talk about replication in databases, we are talking about having several servers running with the same "data." There are several configurations available to implement database replication, for example:

  • All the instances allow read/write

  • A single master instance (that allows read/write) and several read-only instances. In that schema, the replication and the synchronization of data can be done synchronously or asynchronously.

In this post, we will talk about the second configuration, assuming that the read replica is updated in a synchronous way.

Configure Read Replicas in a Django App

To configure and use read replicas in a Django app we need to take care of these points:

  • Declaring the read replica instances as databases in the Django application

  • Configuring a router to choose the read replicas when required

Declaring the Read Replica Instances as Databases

To allow the usage of the read replicas, we need to declare them as additional databases in the Django settings file. This step is easy and can be done following the official documentation for multiple databases.

The idea is to set the master instance (read/write server) as the default database and declare all the read replicas as additional databases. For example, if we have configured two replicas, the settings file should look like this:

DATABASES = {
    # Master instance: the only one that accepts writes
    'default': {
        'NAME': 'master',
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': 'host_to_the_master_instance',
        'USER': 'master_postgres_user',
        'PASSWORD': 'master_password',
    },
    # Read-only replicas
    'replica_1': {
        'NAME': 'replica_1',
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': 'host_to_the_replica_1_instance',
        'USER': 'replica_1_postgres_user',
        'PASSWORD': 'replica_1_password',
    },
    'replica_2': {
        'NAME': 'replica_2',
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': 'host_to_the_replica_2_instance',
        'USER': 'replica_2_postgres_user',
        'PASSWORD': 'replica_2_password',
    },
}
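
As the number of replicas grows, the entries in the settings file become repetitive. A minimal sketch of one way to generate them is shown here; the helper postgres_config and the list REPLICA_ALIASES are hypothetical names introduced for illustration (they are not part of Django), and the host, user, and password values simply follow the placeholder naming used in this post.

```python
def postgres_config(name, host, user, password):
    """Build one DATABASES entry (hypothetical helper, not part of Django)."""
    return {
        'NAME': name,
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': host,
        'USER': user,
        'PASSWORD': password,
    }


# Assumed list of replica aliases; extend it when a new replica is added.
REPLICA_ALIASES = ['replica_1', 'replica_2']

DATABASES = {
    # The master instance stays as the default database.
    'default': postgres_config(
        name='master',
        host='host_to_the_master_instance',
        user='master_postgres_user',
        password='master_password',
    ),
}

# Declare every replica as an additional database.
for alias in REPLICA_ALIASES:
    DATABASES[alias] = postgres_config(
        name=alias,
        host=f'host_to_the_{alias}_instance',
        user=f'{alias}_postgres_user',
        password=f'{alias}_password',
    )
```

The resulting dictionary has the same shape as a hand-written one, so the rest of the configuration does not change.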

After this configuration step, the two replicas should be available to the Django app. To check that, we can call the using() method on a queryset. For example, we can verify that the User model is accessible from all the databases (the master and the replicas).

# The users are retrieved from the master instance
User.objects.all()

# The users are retrieved from the replica 1 instance
User.objects.using('replica_1').all()

# The users are retrieved from the replica 2 instance
User.objects.using('replica_2').all()

We can also specify which database to use when we save an object. In that case, if we try to save an object in one of the replicas, we will obtain an error because the instance only allows read operations.

# To save the instance in the default database. 
# The operation is OK
u = User(email='user@user.com')
u.save()

# Try to save the instance in the replica instance. 
# The operation will throw an error
u = User(email='user@user.com')
u.save(using='replica_1')

Configuring a Router to Choose the Read Replicas When Required

After the previous step, the replica instances are available to the Django application, and we can access them manually when required. However, this solution doesn’t scale, because it is the developer’s responsibility to detect each read-only operation and choose a replica for it. Moreover, if we create more replicas in the future, we will need to review all the code and manually select which replica to use in each of these operations.

The alternative is to use a Django database router. We can create a custom router that uses the default instance for write operations and chooses a random replica for read operations. An example of a router that implements this behavior is the CustomRouter:

import random

class CustomRouter:
    def db_for_read(self, model, **hints):
        """
        Return one of the replicas
        """
        return random.choice(['replica_1', 'replica_2'])

    def db_for_write(self, model, **hints):
        # Always return the default database
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return True

Then, we need to specify in Django's settings that we want to use this router instead of the default one.

DATABASE_ROUTERS = ['path.to.CustomRouter']

After these changes, all the read operations in the app will hit the replicas, while the write operations will hit the master instance.
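
Because a router is a plain Python class, its behavior can be checked without a running database. The sketch below is a hypothetical variant of CustomRouter, named ReplicaRouter here for illustration, that keeps the replica aliases in a class attribute (assumed to match the keys declared in DATABASES) so that adding a replica means changing a single list. It also restricts migrations to the default database, which is a design choice rather than a requirement: it assumes schema changes reach the replicas through replication.

```python
import random


class ReplicaRouter:
    """Hypothetical variant of CustomRouter with a configurable replica list."""

    # Assumed aliases; they must match the keys declared in DATABASES.
    replica_aliases = ['replica_1', 'replica_2']

    def db_for_read(self, model, **hints):
        # Pick a replica at random; fall back to the master if none is listed.
        if not self.replica_aliases:
            return 'default'
        return random.choice(self.replica_aliases)

    def db_for_write(self, model, **hints):
        # Writes always go to the master instance.
        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        # Every alias holds the same data, so relations are always allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Replicas are read-only, so migrations run on the master only.
        return db == 'default'


# Exercise the router directly, passing None where Django would pass a model.
router = ReplicaRouter()
read_targets = {router.db_for_read(model=None) for _ in range(100)}
write_target = router.db_for_write(model=None)
```

Here read_targets only ever contains replica aliases and write_target is always 'default', which matches the routing described above.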


In summary, enabling read replicas in Django is fairly easy, and it can improve the performance of the application. As we have shown in this post, the only changes we need to make are in configuration and settings; the rest of our code stays the same. We hope this post helped you better understand how to use replicas in a Django application and achieve higher performance.

"When and How to Configure a Read Replica Database in Django" by Pablo Grill is licensed under CC BY SA. Source code examples are licensed under MIT.
