November 8, 2016
SQL Window Functions in Django
What are Window Functions?
Window functions can be seen as close relatives to our well-known aggregate functions. They perform a calculation over a set of rows and return a single value. Unlike aggregate functions, they do not group the input rows into a single output row, thus keeping the original information intact.
Window functions are part of the standard since version SQL:2003.
Setting up a test environment
Start your favorite editor with SQL support (make sure your RDBMS supports window functions) and create a table with some random data so we can go through all the subjects with working examples. We'll be using a table that represents connections from users to our system. Create a new database with a table as follows:
CREATE TABLE connections ( id INT, user_id INT, source_ip VARCHAR(15), connection_timestamp TIMESTAMP, bytes_transferred INT )
Now populate it with:
INSERT INTO connections VALUES (1, 1, '192.168.1.100', '2016-10-01T13:30:00', 15720000), (2, 1, '192.168.1.100', '2016-10-01T00:40:16', 19500000), (3, 2, '192.168.1.100', '2016-10-01T19:09:48', 9840000), (4, 3, '192.168.1.200', '2016-10-01T22:40:58', 10000), (5, 3, '192.168.1.200', '2016-10-01T13:15:03', 20310000), (6, 3, '192.168.1.200', '2016-10-01T10:10:48', 2010000), (7, 4, '192.168.1.200', '2016-10-01T23:25:21', 20310000), (8, 4, '192.168.1.200', '2016-10-01T04:06:49', 810000), (9, 1, '192.168.1.200', '2016-10-01T16:07:10', 91280000)
How to use them
Window functions are declared just as an aggregate function followed by an
OVER clause, which indicates exactly how rows are being grouped, plus a few other details that we'll dive into later.
We could query the average of bytes transferred per source IP across the connections from our table like this:
SELECT *, AVG(bytes_transferred) OVER (PARTITION BY source_ip) FROM connections
...which results in
Notice that we still have the original 9 rows with the result added as a new column. Our window function didn't alter the original input at all.
Also notice that we are using the good old
AVG function here. This is because all existing aggregate functions can be used as window functions. There are, though, a few new functions that can only be used in the latter form.
The group of rows onto which the window function is applied is called "partition". In its basic form, a partition is no different than a normal aggregate function group: simply a set of rows considered "equal" by some criteria, and the function will be applied over all these rows to return a single result. However, a window function is not executed once for each partition, but rather for each row, and the result may vary from row to row...
Enter the "window frame"
If my row belongs to a partition that doesn't change and the window function executes across all the rows within the partition, how can the result vary?
It can vary because the first part isn't totally true; the window function actually executes over a subset of the partition called "window frame". In our previous example the window frame was equal to the partition so this behavior was not exhibited, but we can alter the window frame like this:
SELECT *, AVG(bytes_transferred) OVER (PARTITION BY source_ip ORDER BY bytes_transferred) FROM connections
avg column shows different results for rows of the same partition because the window frame is different for each row. When using
ORDER BY, the window frame contains all the rows from the start of the partition to our current row, including all subsequent rows that are equal to the current one according to the
ORDER BY clause. Remember this behavior, else you will probably end up with results that don't seem to make sense. Of course, once you grasp the concept of the window frame you'll realize it's not a "strange behavior" but a powerful feature that will allow you to precisely define what rows a window function can see.
For example, let's look at the first 2 rows: we know the partition is defined by the
source_ip field, so the first 3 rows belong to it. In the first row, the window frame consists of only that row, so the average is equal to that row's
bytes_transferred. In the second row, the window frame goes from the first row to the current row, so the average is calculated with both their values.
Now take a look at row #7: the window frame here goes from row #4 (the first of this partition) to row #8, which is past our current row, because that row's
bytes_transferred is equal to the current row's according to the
ORDER BY criteria. This results in rows #7 and #8 having the same window frame, and thus the same
The window frame can be defined with a "frame clause" as well, but we won't dive into its details here. Postgresql documentation has a great explanation of how it works. If neither this nor the
ORDER BY clause is used, the window frame is the whole partition.
Window functions execution order
Window functions execute after
GROUP BY clauses, meaning that we can't use the results of window functions in any of these (without resorting to sub-queries) and also that these constructs will affect the input rows that end up forming our partitions.
Aggregate functions are also executed before window functions, which means that we can't use the results of a window function inside an aggregate function but we can do it the other way around.
Window functions vs typical SQL queries
Say we want to fetch the last connection from each user including all the columns of the table, we have a few alternatives before resorting to window functions. Let's review them:
Alternative 1: sub-query with aggregate function
Here we use a sub-query which returns the maximum
connection_timestamp from the current user and compares it against the current row's
connection_timestamp to check if they're equal. The drawback is that the sub-query is executed once per row, causing considerable performance issues when dealing with larger numbers of rows (
~20s in my PostgreSQL setup with 13k rows vs
~35ms using the solution at the end with window functions).
SELECT * FROM connections WHERE connection_timestamp = ( SELECT max(connection_timestamp) FROM connections AS sub WHERE sub.user_id = connections.user_id )
Alternative 2: joining the table with itself
Now we join the table against itself on the condition that the
user_id is the same on both rows and that the row "on the left" has a smaller
connection_timestamp than the one "on the right". This query returns all connections for which another more recent connection by the same user exists. Finally, we select all the connections that don't exist in that set, and we have what we want.
The downside with this approach is readability (you really have to look into it to figure what it does) and the inability to fetch the "N" latest connections dynamically, as it would require to join the table against itself as many times as "latest connections" we want.
SELECT * FROM connections WHERE id NOT IN ( SELECT c1.id FROM connections AS c1 JOIN connections AS c2 USING (user_id) WHERE c1.connection_timestamp < c2.connection_timestamp )
With window functions
Finally, with the help of the
row_number function and a sub-query, we can achieve our goal like this:
SELECT * FROM ( SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY connection_timestamp DESC) FROM connections ) AS sub WHERE row_number = 1
row_number window function simply returns the index of the current row within its partition starting from 1, so we are asking for all the connections that are first when ordered by the larger
connection_timestamp, partitioning by
user_id. Since we can't use this
row_number inside the
WHERE clause, we wrap the whole thing in a sub-query and do the filtering in the outer-query.
Django is in the title but you don't talk about it
Right! Well, window functions have the advantage that their syntax is contained in a single statement that goes in the
SELECT list where any other column would go (or in the
ORDER BY clause, where they are also allowed), so we don't need any special support to use them with Django's ORM.
annotate call with a
RawSQL expression is enough to make use of window functions.
from django.db.models.expressions import RawSQL from connections.models import Connection Connection.objects.annotate(row_number=RawSQL( 'row_number() OVER (PARTITION BY user_id ORDER BY connection_timestamp DESC)',  ))