September 17, 2017

Control Django DB Performance using django-test-query-counter

A library is introduced in this post to help maintain the database performance in Django Apps through testing. It works by measuring queries run during unit tests, and also by offering by a command that compares test query counts from two different tests runs and provides detailed query information such as stacktraces. It aims to detect possible Database performance issues as soon as possible.

Introduction

When a problem arises that slows down your application, it’s not uncommon to relegate performance and deal with the problem. The same code can produce twice as many queries as before. Some common reasons behind that are:

  • New requirements which create the need to retrieve information about related records
  • Usage of different instances of QuerySet objects for the same query.
  • Lack of select_related, prefetch_related, or subqueries in the original queryset.
  • Lack of caching capabilities like @lrucache, @cached_property, and Django cache framework.

If a user detects a flaw, such as a page taking too long to load, they can make a complaint. Then, we find ourselves spending a lot of time trying to figure out what is causing that specific slow down. In the project I worked on, we often created entities to make 500 or more queries in one simple request. So it would be much more cost effective if we could detect that in an earlier stage of development. The figure below helps visualize the cost of change as a function of time, which also applies to performance problems.

The cost of change as a function of time
The cost of change as a function of time

The official Django documentation describes some typical problems and how to solve them. There are also great tools to debug Django applications like django-debug-toolbar which show what is happening on a particular request, including a detailed DB count. And while they excel when debugging on a one time basis, they are not suited for integrating in an automated workflow. Like Code Coverage, and regular functional tests, DB performance should always be measured and kept under control to ensure quality and performance. So it’s a good idea to integrate the database performance check into the regular coding workflow. For example it can be included as an additional check before pushing code into the master branch.

The solution

django-query-counter
django-query-counter

Django-test-query-counter (or simply query-counter) works by tracking each query made during the unit tests. It operates under the premise “less queries means more speed”. Django-test-query-counter controls the number of DB queries for each request inside each test case. Technically, it’s a Django app with two parts. One is a middleware for tracking each query done inside each request of each unit test which generates a JSON file with the query count information for a particular test run. The second is a management command for checking two query count JSON files, and displaying violations on the number of DB queries on the second test run. While it can be used locally, it shines when used along with a CI like Jenkins, where you can run unit tests periodically, along with the DB performance checks. While Query-counter doesn’t integrate with any particular CI, it is straightforward enough to include the check_query_count call on the CI tasks.

Take for example a test where a list of books is retrieved and then the information about the first one is received:

class BookTester(TestCase):
    def test_getbook(self):
      # This makes 30 queries
      book_list = self.client.get('/api/books/all')
      self.assertEqual(
        book_list.status_code,
        status.HTTP_200_OK
      )
      first_book = json.decode(book_list.data)

      # This makes 20 queries
      book_info = self.client.get(
        '/api/books/{}'.format(first_book['id'])
      )
      self.assertEqual(
        book_info.status_code,
        status.HTTP_200_OK
      )

In the previous test, query-counter tracked the following information:

  • BookTester.test_getbook (50 queries)
    * /api/books/all (20 queries)
    
    * /api/books/5 (30 queries)
    

The query-counter has a tolerance threshold k expressed in percentage. The performance check fails if a request has more than k percent queries than the last successful run. For example if the threshold is 10%, then the /api/books/all call can tolerate at most 22 queries.

Using the app locally

The query-counter is available at GitHub and PyPI.

  1. First of all, install the app from PIP:

    $ pip install django-request-query-counter

  2. Then add it to your INSTALLED_APPS in your settings file:

    INSTALLED_APPS = ( ... 'test_query_count', ... )

  3. After that, run your tests as usual:

    $ python manage.py test

  4. If the query-counter was installed successfully, you will be able to see a file named reports/query_count.json directory of your app, which counts the number of queries from each test. Copy your file locally:

    $ cp reports/query_count.json last_query_count.json

  5. Modify your application by adding new queries in your code and running the tests again, and then run the check_query_count command

    $ python manage.py check_query_count --last-count-file last_query_count.json

That will check last_query_count.json against reports/query_count.json and will report every test that passed the 10% threshold (the default). You can configure that number if you feel it’s necessary.

CI interaction

This app can easily be integrated into a CI like Jenkins or Travis. The flow used is described in the figure in the previous section.

  1. Run unit tests
  2. Run check_query_count against the last successful build query_count.json build artifact file. The build should be marked unstable if this steps fails because of a violation.
  3. Archive the query_count.json build artifact.

The script below illustrates a possible bash script to run step 2 for Jenkins. It is similar to the local case except the query_count.json is downloaded from the CI instead of having it locally.

curl http://yourci.com/yourjob/lastSuccessfulBuild/artifact/reports/query_count.json -o last_query_count.json
python manage.py check_query_count --last-count-file last_query_count.json

Next steps

Every feedback on this tool will be welcome. This is a new library and we will be adding more features to it, you are free to contribute with bug reports, or pull requests. Some items in the roadmap are:

  • Include detailed stacktraces in queries..
  • Analyze individual queries not using indexes, via SQL explain. For example trigger warnings on queries not using indexes.
  • HTML graphical report of queries executed by each line of code.

"Control Django DB Performance using django-test-query-counter" by Ignacio Avas is licensed under CC BY SA. Source code examples are licensed under MIT. Categorized under django and open source.

We’re Hiring
Apply for one of our open positions and find out what you are capable of.
Join us