Next stop: Joining 1 million tables

Posted on 2014-09-17 by Hans-Jürgen Schönig

This week I started my preparations for one of my talks in Madrid. The topic is: “Joining 1 million tables”. Actually 1 million tables is quite a lot and I am not sure if there is anybody out there who has already tried to do something similar. Basically the idea is to join 1 million tables containing just a single integer column (and no data). My idea was to come up with a join like this:

[hs@jacqueline 1million]$ ./make_join.pl 4
\timing 
SELECT 1 
FROM   t_tab_1, 
       t_tab_2, 
       t_tab_3, 
       t_tab_4 
WHERE  t_tab_1.id = t_tab_2.id AND 
       t_tab_2.id = t_tab_3.id AND 
       t_tab_3.id = t_tab_4.id AND 
       1 = 1;

To give you an impression of the size of the problem I am trying to solve – here is some data on the SQL:

[hs@jacqueline 1million]$ ./make_join.pl 1000000 | wc
2000007 5000024 54666838

The statement itself is 54 MB in size and is around 2 million lines long.
The first observation is that the standard planner doing exhaustive search clearly cannot do it. The potential number of joins simply sky rockets – even ways below a thousand tables the system won’t respond anymore because planning simply takes too long.

So, I tried to focus my tests on genetic query optimization, which proved to be a lot better for this little competition. Joining a couple hundreds of tables with Geqo is actually possible straight away without having to change any parameters at all. This was pretty impressive actually, as in real life you would never attempt to do this kind of operation. Given some runtime estimations it seems that gradually increasing the number of tables would mean that 1 million tables can actually be joined in 1575 days. The thing is just: I don’t have time to run this for 5 years as I got to produce my slides for Madrid fairly soon ;).

Geqo is highly configurable, so some parameters can be changed to speed up things by tweaking the generic optimization process and making it a bit more lazy. I managed to cut runtime to roughly one year. To speed this up it seems that I got to inspect the backend process and see what is going on … stay tuned 😉

More recent blog posts about joins can be found here in our join-related blog spot.