Partial aggregation: The beautiful way

Aggregations are a fundamental feature of SQL. Functions such as min, max, count, sum, avg and so on are used in virtually every application – it is basically impossible to write a reasonably complex applications without them.

Here is a trivial aggregate: All it does is summing up numbers from 1 to 10:

test=# SELECT sum(x) FROM generate_series(1, 10) AS x;

 sum

-----

  55

(1 row)

So far so good. But what if we only want to sum up even numbers? We can rely on a simple trick: NULL values are ignored by aggregation functions. This makes a lot of sense because NULL really means “undefined” and you cannot average or sum up undefined values.  A simple CASE WHEN can do the job:

test=# SELECT sum(CASE WHEN x % 2 = 0 THEN x ELSE NULL END)

            FROM generate_series(1, 10) AS x;

 sum

-----

  30

(1 row)

This works perfectly – however, it is a bit old fashioned and pretty nasty to write (especially in case of more complex operations).

The fancy way …

There is a more fancy way to do that:

test=# SELECT          sum(x) FILTER (WHERE x % 2 = 0) AS even,

                                   sum(x) FILTER (WHERE x % 2 = 1) AS odd

            FROM  generate_series(1, 10) AS x;

 even | odd

------+-----

   30 |  25

(1 row)

The FILTER keywords allow user to tell the systems, which rows should make it into the aggregate function. The new syntax is ways more readable and basically also a lot shorter.

Hans-Juergen Schoenig
Hans-Jürgen Schönig has 15 years of experience with PostgreSQL. He is consultant and CEO of the company „Cybertec Schönig & Schönig GmbH“ (www.cybertec.at, www.postgresql-support.de), which has served countless customers around the globe.