<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cybertec PostgreSQL Pros&#039; Blog &#38; Archives</title>
	<atom:link href="http://www.cybertec.at/en/feed" rel="self" type="application/rss+xml" />
	<link>http://www.cybertec.at/en</link>
	<description>Professionals of the most advanced open-source database provider PostgreSQL</description>
	<lastBuildDate>Wed, 04 Jan 2012 12:22:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<meta xmlns="http://www.w3.org/1999/xhtml" name="robots" content="noindex,follow" />
		<item>
		<title>Thoughts about autonomous transactions</title>
		<link>http://www.cybertec.at/en/thoughts-about-autonomous-transactions</link>
		<comments>http://www.cybertec.at/en/thoughts-about-autonomous-transactions#comments</comments>
		<pubDate>Mon, 19 Dec 2011 11:32:56 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1580</guid>
		<description><![CDATA[Thoughts about autonomous transactions   When reading Gianni Ciolli’s wiki page about autonomous transactions this week I spent some time to think about things in more detail. The idea of an autonomous transaction is actually quite simple:   START TRANSACTION; some_sql_action; START AUTONOMOUS TRANSACTION; INSERT INTO log_table VALUES (some_data); COMMIT AUTONOMOUS TRANSACTION; some_more_sql_action; ROLLBACK;   [...]]]></description>
			<content:encoded><![CDATA[
<h1 lang="en-US" style="margin-bottom: 0in;"><b>Thoughts about autonomous transactions</b></h1>
<p lang="en-US" style="margin-bottom: 0in"> </p>
<p lang="en-US" style="margin-bottom: 0in">While reading Gianni Ciolli’s wiki page about autonomous transactions this week, I spent some time thinking the idea through in more detail.</p>
<p lang="en-US" style="margin-bottom: 0in">The idea of an autonomous transaction is actually quite simple:</p>
<p lang="en-US" style="margin-bottom: 0in"> </p>
<pre lang="en-US" style="margin-bottom: 0in; text-decoration: none;">START TRANSACTION;

some_sql_action;


START AUTONOMOUS TRANSACTION;

INSERT INTO log_table VALUES (some_data);

COMMIT AUTONOMOUS TRANSACTION;


some_more_sql_action;

ROLLBACK;</pre>
<p lang="en-US" style="margin-bottom: 0in"> </p>
<p lang="en-US" style="margin-bottom: 0in">The main advantage is that you can commit the log entry without having to commit the parent transaction. The log entry therefore stays visible even if the parent transaction fails badly, which makes reporting errors to a table far more robust. Technically, the idea floating around is to keep transactions on a stack, with push / pop operations as autonomous transactions come and go.</p>
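<p lang="en-US" style="margin-bottom: 0in">Until something like this exists, a similar effect can be approximated with the dblink extension, which runs a statement over a second connection that commits on its own. The following is only a sketch under assumptions: the database is called “test”, the connection string works without a password on your setup, and log_table is a made-up single-column table.</p>
<pre lang="en-US" style="margin-bottom: 0in; text-decoration: none;">-- dblink ships with PostgreSQL as a contrib module.
CREATE EXTENSION IF NOT EXISTS dblink;

-- Hypothetical log table for this illustration.
CREATE TABLE log_table (msg text);

BEGIN;

-- The INSERT runs on a separate connection and is committed there,
-- independently of the surrounding transaction.
SELECT dblink_exec('dbname=test',
                   $$INSERT INTO log_table VALUES ('parent is about to fail')$$);

ROLLBACK;

-- The log entry is still there, although the parent rolled back.
SELECT * FROM log_table;</pre>
<p lang="en-US" style="margin-bottom: 0in">The price is a second connection per log write, so this is a workaround rather than a real autonomous transaction.</p>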
<p lang="en-US" style="margin-bottom: 0in"> </p>
<p lang="en-US" style="margin-bottom: 0in">BUT: What should happen in this case?</p>
<pre lang="en-US" style="margin-bottom: 0in; text-decoration: none;">

<i>START TRANSACTION;</i>
<i>SELECT some_data FROM some_table WHERE id = 1 FOR UPDATE;</i>

<i>START AUTONOMOUS TRANSACTION;</i>
<i>SELECT some_data FROM some_table WHERE id = 1 FOR UPDATE; -- > problem</i>
<i>COMMIT AUTONOMOUS TRANSACTION;</i>

<i>some_more_sql_action;</i>

<i>ROLLBACK;</i></pre>
<p lang="en-US" style="margin-bottom: 0in"> </p>
<p lang="en-US" style="margin-bottom: 0in">If the child transaction is really supposed to be fully independent of its parent, this should never be possible. The parent transaction locks a row which the autonomous transaction then tries to lock as well, so the child would wait for a lock held by its own suspended parent. This leaves us with two choices: Either the child transaction is NOT fully independent or you allow users to “kill” themselves by locking up the connection.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/thoughts-about-autonomous-transactions/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>“PostgreSQL vs. Ingres” — or “Ingres RIP”</title>
		<link>http://www.cybertec.at/en/postgresql-vs-ingres-or-ingres-rip</link>
		<comments>http://www.cybertec.at/en/postgresql-vs-ingres-or-ingres-rip#comments</comments>
		<pubDate>Tue, 04 Oct 2011 09:13:56 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1426</guid>
		<description><![CDATA[Given Josh’s postings on Ingres earlier I would like to add some comments: Some time ago one of our folks here at Cybertec played around with Ingres a little. The idea was basically to see how things work and to check which stuff could be useful to PostgreSQL (some useful SQL instruction or so). One [...]]]></description>
			<content:encoded><![CDATA[<div id="_mcePaste" >Given Josh’s postings on Ingres earlier I would like to add some comments: Some time ago one of our folks here at Cybertec played around with Ingres a little. The idea was basically to see how things work and to check which stuff could be useful to PostgreSQL (some useful SQL instruction or so).</div>
<br/>
<div id="_mcePaste">One of the first things was to try to compile it. Compiling PostgreSQL is something people do every day, and building PostgreSQL from scratch has never really been an issue. So the same thing was tried with Ingres. To make it short: “oh dear” … this is nothing anybody would want to compile from scratch.</div>
<div id="_mcePaste">To get an impression I checked out some random header files to see how the code is organized and how things look inside. The first thing I remember seeing was a giant copyright notice stating that everything is owned by Computer Associates (CA) and so on … I could not fight the feeling that this license statement at the top is more important than the code that follows.</div>
<div id="_mcePaste">So, if you are seriously interested in a free database, Open Source and all that, you definitely don’t want to see that. You want to be able to compile things with reasonable effort and you want to avoid the feeling of working for CA.</div>
<br/>
<div id="_mcePaste">I am pretty sure that I am not the only one who made similar observations and who turned down Ingres for similar reasons. If you look at the PostgreSQL source instead, it is a lot more “friendly” and it tends to invite people in, unlike this “stay out of here”-style Ingres code-license-code-base-mixture.</div>]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/postgresql-vs-ingres-or-ingres-rip/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Users and join enlightenments</title>
		<link>http://www.cybertec.at/en/users-and-join-enlightenments</link>
		<comments>http://www.cybertec.at/en/users-and-join-enlightenments#comments</comments>
		<pubDate>Tue, 27 Sep 2011 10:07:42 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1409</guid>
		<description><![CDATA[  When being out on the road doing consulting it is sometimes impressing to figure out how little people actually know about how joins work. Especially outer joins seem to be a mystery for most people. Here is an example which might bring some enlightenment: test=# CREATE TABLE a (id int4); CREATE TABLE test=# CREATE [...]]]></description>
			<content:encoded><![CDATA[<p> </p>
<p>When I am out on the road doing consulting, it is sometimes striking to see how little people actually know about how joins work. Outer joins especially seem to be a mystery to most people.</p>
<p>Here is an example which might bring some enlightenment:</p>
<pre>test=# CREATE TABLE a (id int4);
CREATE TABLE
test=# CREATE TABLE b (id int4);
CREATE TABLE
test=# INSERT INTO a VALUES (1), (2), (3);
INSERT 0 3
test=# INSERT INTO b VALUES (2), (3), (4);
INSERT 0 3</pre>
<p> </p>
<p>We just insert a couple of rows and see how things work out.</p>
<p>Let us start with a simple LEFT JOIN:</p>
<pre>test=# SELECT * FROM a LEFT JOIN b ON (a.id = b.id);
 id | id 
----+----
  1 |   
  2 |  2
  3 |  3
(3 rows)</pre>
<p> </p>
<p>The goal here is to get all rows from the left hand side and see which rows match on the right side. The “missing” value is filled with a NULL token.</p>
<p>One of the key misunderstandings is this magical ON-clause:</p>
<pre>test=# SELECT * FROM a LEFT JOIN b ON (a.id = b.id AND b.id = 2);
 id | id 
----+----
  1 |   
  2 |  2
  3 |   
(3 rows)</pre>
<p> </p>
<p>To most people this result is surprising. It is important to notice that the additional condition only decides which rows on the right side count as a match; all rows from the left side are still returned. So, we get one more NULL value. This is pretty easy to see and comprehend if the use case is that simple, but in case of an outer-join orgy it is not that trivial anymore.</p>
<p>The example I have just shown is not what people usually expect. Here is an alternative:</p>
<pre>test=# SELECT * FROM a LEFT JOIN b ON (a.id = b.id) WHERE b.id = 2;
 id | id 
----+----
  2 |  2
(1 row)</pre>
<p> </p>
<p>The WHERE clause is applied after the join, so it filters the complete joined rows. Rows in which b.id is NULL do not pass the filter, which turns the LEFT JOIN into a useless operation: the result is the same as for a plain inner join. NOTE: This is a mistake which can commonly be observed; most people don’t get ON and WHERE right when it comes to outer joins.</p>
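<p>To make the point concrete, here is a small sketch using the tables a and b from above. The first query is the inner join that the WHERE version silently turns into. The second one is one possible way (not the only one) to filter in WHERE and still keep the rows of a that have no partner in b; with the sample data it returns the rows (1, NULL) and (2, 2).</p>
<pre>-- Same result as: a LEFT JOIN b ON (a.id = b.id) WHERE b.id = 2
SELECT * FROM a JOIN b ON (a.id = b.id) WHERE b.id = 2;

-- Filter on b.id, but let the NULL-extended rows pass as well
SELECT * FROM a LEFT JOIN b ON (a.id = b.id)
WHERE b.id = 2 OR b.id IS NULL;</pre>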
<p><strong>FULL OUTER joins</strong></p>
<p>FULL OUTER joins are the next big mystery:</p>
<pre>test=# SELECT * FROM a FULL JOIN b ON (a.id = b.id);
 id | id 
----+----
  1 |   
  2 |  2
  3 |  3
    |  4
(4 rows)</pre>
<p> </p>
<p>We take all rows from both sides, pair up the matches and add NULL values wherever no match is found. So far this is no surprise …</p>
<p>The problem usually starts when we try to make the ON clause a little bit more fancy:</p>
<pre>test=# SELECT * FROM a FULL JOIN b ON (a.id = b.id AND b.id = 2);
 id | id 
----+----
  1 |   
  2 |  2
  3 |   
    |  4
    |  3
(5 rows)</pre>
<p> </p>
<p>Most people would not expect “3” to exist at all on the right hand side. As you can see it is moved to the NULL-part of the join. This is not too intuitive for most people. However, this is perfectly expected behavior.</p>
<p>If we filter in the WHERE clause we will again turn the outer join into a useless operation:</p>
<pre>test=# SELECT * FROM a FULL JOIN b ON (a.id = b.id) WHERE b.id = 2;
 id | id 
----+----
  2 |  2
(1 row)</pre>
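<p>A WHERE clause after an outer join is not always useless, though. As a sketch with the same tables a and b, filtering on the NULL side gives exactly the rows which have no partner; with the sample data these are (1, NULL) and (NULL, 4):</p>
<pre>-- Rows that exist on only one side of the join
SELECT * FROM a FULL JOIN b ON (a.id = b.id)
WHERE a.id IS NULL OR b.id IS NULL;</pre>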
<p> </p>
<p>Maybe this will bring some enlightenment to some users out there.</p>
<p> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/users-and-join-enlightenments/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fixing stuff at 3 o’clock in the morning</title>
		<link>http://www.cybertec.at/en/fixing-stuff-at-3-oclock-in-the-morning</link>
		<comments>http://www.cybertec.at/en/fixing-stuff-at-3-oclock-in-the-morning#comments</comments>
		<pubDate>Mon, 22 Aug 2011 10:38:22 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1404</guid>
		<description><![CDATA[PostgreSQL is famous for its scalability and especially for its exceptionally good support for transactions. But, not only this makes PostgreSQL fairly unique. We are currently already experimenting with some customer applications to see if they will work nicely with PostgreSQL 9.1. One of the features which will make PostgreSQL really cool is the fact that [...]]]></description>
			<content:encoded><![CDATA[
<p>PostgreSQL is famous for its scalability and especially for its exceptionally good support for transactions. But this is not the only thing that makes PostgreSQL fairly unique.<br /><br />We are already experimenting with some customer applications to see if they will work nicely with PostgreSQL 9.1. One of the features which will make PostgreSQL really cool is that we now have writable CTEs.<br /><br />I think most of us know this scenario: It is 3 o’clock in the morning. You are dead tired and something just does not want to work. You want to delete data from a table … a lot of data. Are you brave enough to hit commit? You want to delete, let’s say, 20 million rows … wanna risk it?<br /><br />Well, with writable CTEs things will be a little easier.<br />Here is a demo:<br /><br /></p>
<pre>test=# CREATE TABLE t_test AS SELECT * FROM generate_series(1, 10000) AS id;
SELECT 10000</pre>
<p><br />First we create a small table with some simple demo data.<br />Now we do our fancy cleanup. To be on the safe side we do it inside a transaction:<br /><br /></p>
<pre>test=# BEGIN;
BEGIN</pre>
<p><br />How about analyzing data while it is being deleted? Here is how it works …<br /><br /></p>
<pre>test=# WITH x AS ( DELETE FROM t_test RETURNING * ) 
    SELECT min(id), max(id), avg(id), count(*) FROM x;
 min |  max  |          avg          | count 
-----+-------+-----------------------+-------
   1 | 10000 | 5000.5000000000000000 | 10000
(1 row)</pre>
<p><br />We put the DELETE inside a so-called WITH clause. WITH gives us a “virtual table” called “x” which contains the data returned by DELETE. Note that we used DELETE in combination with “RETURNING *”, so we essentially tell DELETE to hand back the rows it is about to remove. As we might do this for 20 million rows or so, the raw output is definitely too much for a night-time reading exercise. So we turn the data coming back from the DELETE into something we can nicely analyze in the SELECT part. Given this analysis it is a lot easier to judge whether our DELETE has really touched the data we wanted it to touch. It greatly reduces the odds of breaking something late at night. <br /><br />If we feel safe, we can commit …<br /><br /></p>
<pre>test=# COMMIT;
COMMIT</pre>
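<p><br />If the numbers do not look right, nothing is lost. As a sketch (assuming a freshly created t_test holding the values 1 to 10000 again, and a made-up cleanup condition), a more selective DELETE can be checked and then undone like this:<br /><br /></p>
<pre>BEGIN;

-- Delete only part of the table and summarize what was hit.
WITH x AS ( DELETE FROM t_test WHERE id &gt; 9000 RETURNING * )
    SELECT min(id), max(id), count(*) FROM x;

-- The summary is not what we wanted? Undo the DELETE.
ROLLBACK;</pre>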
<p><br />Sometimes people have to delete a lot of data. Writable common table expressions (CTEs) make this process a lot more reliable and robust.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/fixing-stuff-at-3-oclock-in-the-morning/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2.5 mio tables — “yes, we REALLY can”</title>
		<link>http://www.cybertec.at/en/2-5-mio-tables-yes-we-really-can</link>
		<comments>http://www.cybertec.at/en/2-5-mio-tables-yes-we-really-can#comments</comments>
		<pubDate>Tue, 16 Aug 2011 12:24:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1384</guid>
		<description><![CDATA[There are points in your life when you look into the mirror asking yourself if your model of the world you used to have is really still sustainable … Some 5 years ago we fired up some 8.1 database servers which have successfully served XX bio transactions since then. Running a large PostgreSQL database successfully [...]]]></description>
			<content:encoded><![CDATA[
<p><br />There are points in your life when you look into the mirror asking yourself if your model of the world you used to have is really still sustainable …</p>
<p>Some 5 years ago we fired up some 8.1 database servers which have successfully served XX billion transactions since then. Running a large PostgreSQL database successfully on UNIX over a number of years is actually nothing special. So, why care? Well, this thing is special, and trust me, it was never intended to be …</p>
<p>Imagine you start designing a simple database system for a small company (maybe 10 people). Things should be scalable and simply work for the given purpose. You might want to use some partitioning and maybe do a little trickery with indexing. So far so good … nothing special.</p>
<p>But sometimes things just don’t work out as expected. In this special case it means: End users started to behave totally differently than expected, and there was A LOT more growth than ever anticipated. So, given the totally unexpected way people use the database and the seriously higher growth in both users and amount of data, the number of tables / partitions in the PostgreSQL systems went up and up and up and up ….</p>
<p>This kind of growth is especially scary when you take into account that in case of high growth it is usually not too easy to make a serious change in architecture within a short period of time (there is never enough time). The business side is usually occupied with sales, important extensions required by the customers and so on. Given a (by now) complex application it is not too easy to simply introduce some PL/Proxy layer to scale out more easily without investing a lot of time and (hardly available) manpower. So, you are forced to “tweak” things a little to buy some time for a change which can reflect the new needs …</p>
<p>And months go by and you keep tweaking and months go by and at some point you end up with something like this:</p>
<pre>db=# \x
Expanded display is on.
db=# SELECT count(*) FROM pg_tables;
-[ RECORD 1 ]--
count | 2574107</pre>
<p>I want to point out again that this was never intentional … nobody EVER planned to put over 2 million tables into the very same PostgreSQL database on the very same box. This definitely happened WAY beyond my control. Again: it was never intentional and NEVER planned this way. But still, it is impressive to see how well PostgreSQL behaves under really disgusting workloads.</p>
<p><strong>Files, files, files …</strong></p>
<p>Let us think for a moment how PostgreSQL stores tables inside a database instance — they simply go into the same directory unless you use tablespaces:</p>
<pre>postgres@db1:/var/lib/postgresql/8.1/main/base/16385$ ls | wc -l
8104310</pre>
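<p>Why are there so many more files than tables? A rough way to see it is to count the relations in pg_class by type. This is only a sketch: tables ('r'), indexes ('i') and TOAST tables ('t') are among the relation kinds that own files on disk, and large relations are split into several segment files, so the numbers will not add up exactly to what ls reports.</p>
<pre>-- Count relations per kind: r = table, i = index, t = TOAST table
SELECT relkind, count(*)
FROM pg_class
WHERE relkind IN ('r', 'i', 't')
GROUP BY relkind
ORDER BY count(*) DESC;</pre>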
<p>How can the operating system handle that? Well, first of all you have to adjust some kernel parameters to make this work. Secondly, you have to make sure that your filesystem uses a tree structure down to the file level. Otherwise a “seq scan” on the file list will kill you during lookup. Fortunately this was properly configured when the system was set up.</p>
<p>Again, this is not sane design and it was never intended — nobody knew beforehand for how long this can be sustained. It just ended up a little scary <img src='http://www.cybertec.at/en/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>What really impresses me here is that this thing works like a charm. We did not have a single database crash in 5 years. Interestingly, the filesystem has been playing along well for years too: no filesystem corruption for 5 years. It simply works while serving around 20 million transactions a week. The load on the system is actually fairly mixed: there is a large number of small transactions but also a fair list of real killers (bulk loads, creating / removing hundreds of tables at a time and so on).</p>
<p><strong>Why not consolidate into larger tables? </strong></p>
<p>Well, given the large number of tables, people would think that using larger tables makes more sense. Given the load this is actually not quite true … recall that with a seriously partitioned structure you basically get a “free index pointer” to a chunk of data without having to index the data itself. If you want to read a million rows at once, it is a lot cheaper to look up the right table containing them and read it sequentially than to fetch 1 million rows from an insane amount of data by traversing some insanely large index. A seq scan on the right 1 million rows is a lot faster than going through a (largely) uncached, giant index.</p>
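<p>As an illustration of the “free index pointer” idea, here is a minimal sketch of inheritance-based partitioning as it is available in 8.1. The table and column names are made up for this example and are not taken from the system described here.</p>
<pre>-- Hypothetical parent table and one partition per customer
CREATE TABLE t_data (customer_id int4, payload text);
CREATE TABLE t_data_42 (CHECK (customer_id = 42)) INHERITS (t_data);

-- Let the planner skip partitions whose CHECK constraint rules them out
SET constraint_exclusion = on;
EXPLAIN SELECT * FROM t_data WHERE customer_id = 42;

-- Or address the partition directly: a plain seq scan, no index needed
SELECT * FROM t_data_42;</pre>
<p>With millions of partitions an application would most likely go for the second variant and compute the table name itself, since the planner otherwise has to look at every child table.</p>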
<p><strong>Tribute to PostgreSQL 8.1</strong></p>
<p>It is seriously impressive how far you can push even an old release such as 8.1 when things get really hard to manage. Given the vast number of improvements which went into PostgreSQL since 8.1, it is fun to imagine how much better and easier to handle a new release might be. Still, this kind of application is definitely a killer … and PostgreSQL can really show its potential.</p>
<p> </p>
<p> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/2-5-mio-tables-yes-we-really-can/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nested savepoints can be a trap</title>
		<link>http://www.cybertec.at/en/nested-savepoints-can-be-a-trap</link>
		<comments>http://www.cybertec.at/en/nested-savepoints-can-be-a-trap#comments</comments>
		<pubDate>Mon, 08 Aug 2011 10:55:00 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1370</guid>
		<description><![CDATA[PostgreSQL is famous for its scalability and especially for its exceptionally good transaction manager. Sometimes PostgreSQL is observed as so excellent that people forget about proper coding conventions. Nested savepoints are an area which can cause issues in this area. First of all: For the purpose of this blog entry a nested savepoint is this: test=# BEGIN; [...]]]></description>
			<content:encoded><![CDATA[
<p>PostgreSQL is famous for its scalability and especially for its exceptionally good transaction manager. Sometimes PostgreSQL is considered so excellent that people forget about proper coding conventions.<br />Nested savepoints are one area where this can cause issues.<br /><br />First of all: For the purpose of this blog entry a nested savepoint is this:</p>
<pre>test=# BEGIN;
BEGIN</pre>
<p>… some action …</p>
<pre>test=# SAVEPOINT a1;
SAVEPOINT</pre>
<p>… some action …</p>
<pre>test=# SAVEPOINT a2;
SAVEPOINT</pre>
<p>… some action …</p>
<pre>test=# SAVEPOINT a3;
SAVEPOINT</pre>
<p>If you nest thousands and thousands of savepoints inside one transaction without ever releasing a single one, you will find that the performance of your transaction declines rapidly after some time.<br /><br />The reason is fairly simple. Here is a brief excerpt from a profiling session:<br /><br />Flat profile:<br /><br />Each sample counts as 0.01 seconds.</p>
<pre> %   cumulative   self              self     total
time   seconds   seconds    calls   s/call   s/call  name
35.20      0.69     0.69 11442642     0.00     0.00  LWLockAcquire
28.06      1.24     0.55 11442642     0.00     0.00  LWLockRelease
20.41      1.64     0.40 11433942     0.00     0.00  SimpleLruReadPage_ReadOnly
 5.61      1.75     0.11 11443596     0.00     0.00  TransactionIdPrecedes
 4.08      1.83     0.08 11429226     0.00     0.00  SubTransGetParent
 4.08      1.91     0.08     4716     0.00     0.00  SubTransGetTopmostTransaction
 0.51      1.92     0.01     9728     0.00     0.00  slot_deform_tuple</pre>
<p>Every additional savepoint adds a small amount of extra work inside the database to keep track of the transaction status and so on. So, if your transaction has accumulated thousands of unreleased savepoints, you will see a lot of heavy activity inside the backend which is simply related to locking. Nothing bad is going to happen and PostgreSQL will definitely survive 500 million savepoints inside the same transaction. However, you might not be happy with the performance after all. <br /><br />How to get around this issue:</p>
<pre>test=# RELEASE SAVEPOINT a1;
RELEASE</pre>
<p>Once a savepoint is not needed anymore, it can simply be released. This will resolve the extensive CPU consumption shown in the list above.<br /><br />So, after all … thank you all for having so much faith in us and for thinking we can do millions of savepoints inside a transaction <img src='http://www.cybertec.at/en/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> . Yes, we can, but you should not necessarily do it.</p>
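<p>To make the release pattern a bit more concrete, here is a minimal sketch of a transaction which releases every savepoint as soon as the step it protects is done. The table t_log and its contents are made up for this illustration and are not part of the original example:</p>
<pre>BEGIN;

-- t_log is a hypothetical table, used only for this sketch
CREATE TABLE t_log (id int, msg text);

SAVEPOINT step;
INSERT INTO t_log VALUES (1, 'first batch');
RELEASE SAVEPOINT step;   -- step is done: keep its changes, drop the savepoint

SAVEPOINT step;           -- the name can be reused after the release
INSERT INTO t_log VALUES (2, 'second batch');
RELEASE SAVEPOINT step;

COMMIT;</pre>
<p>RELEASE SAVEPOINT does not undo anything; it only destroys the savepoint (and all savepoints established after it), so the work done since then stays part of the transaction. If a step fails, ROLLBACK TO SAVEPOINT is the command to use instead.</p>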
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/nested-savepoints-can-be-a-trap/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PostgreSQL: Who is in?</title>
		<link>http://www.cybertec.at/en/postgresql-who-is-in</link>
		<comments>http://www.cybertec.at/en/postgresql-who-is-in#comments</comments>
		<pubDate>Thu, 14 Jul 2011 19:15:53 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1347</guid>
		<description><![CDATA[When I am on the road as a PostgreSQL consultant I am sometimes surprised by the amount of work which is wasted simply due to a lack of knowledge. Recently I came across a simple problem: How many people are in the room at a given point in time? It is interesting to see how many [...]]]></description>
			<content:encoded><![CDATA[
<p>When I am on the road as a PostgreSQL consultant I am sometimes surprised by the amount of work which is wasted simply due to a lack of knowledge. Recently I came across a simple problem: How many people are in the room at a given point in time?</p>
<p>It is interesting to see how many lines of code are sometimes needed to solve a problem. Long programs seem to be an especially common fetish in the Java world.</p>
<p><strong>How would you approach a problem like that in PostgreSQL?</strong></p>
<p>Here is a simplified version:</p>
<pre>test=# CREATE TABLE t_room (id serial, name text, action text);
NOTICE:  CREATE TABLE will create implicit sequence "t_room_id_seq" for serial column "t_room.id"
CREATE TABLE</pre>
<p>We create a table with an ID, some name and an action.</p>
<p>Now we feed some data:</p>
<pre>test=# INSERT INTO t_room (name, action) VALUES ('hans', 'IN');
INSERT 0 1
test=# INSERT INTO t_room (name, action) VALUES ('paul', 'IN');
INSERT 0 1
test=# INSERT INTO t_room (name, action) VALUES ('james', 'IN');
INSERT 0 1
test=# INSERT INTO t_room (name, action) VALUES ('paul', 'OUT');
INSERT 0 1
test=# INSERT INTO t_room (name, action) VALUES ('john', 'IN');
INSERT 0 1
test=# INSERT INTO t_room (name, action) VALUES ('hans', 'OUT');
INSERT 0 1</pre>
<p>This is already enough to make the fun start:</p>
<pre>test=# SELECT *,
       sum(CASE WHEN action = 'IN' THEN 1 ELSE -1 END)
               OVER ( ORDER BY id) AS field FROM t_room;

 id | name  | action | field
----+-------+--------+-------
  1 | hans  | IN     |     1
  2 | paul  | IN     |     2
  3 | james | IN     |     3
  4 | paul  | OUT    |     2
  5 | john  | IN     |     3
  6 | hans  | OUT    |     2
(6 rows)</pre>
<p>Basically we have defined 2 actions: somebody can enter the room or leave it. Order is provided by the id we have used in the example. So, how can we translate this into a counter? First of all we have to define that we want to increment the counter by 1 if somebody goes in and decrement it by 1 if somebody leaves. Now we can compute a running total.</p>
<p>This can be achieved with a simple windowing function.</p>
<p>Sometimes problems which make people code for days can be solved in one line.</p>
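<p>Two follow-up questions can be answered in the same spirit: how many people are in the room right now, and who are they? The following is only a sketch; it assumes the six rows inserted above and that a name identifies exactly one person:</p>
<pre>-- current head count: the final value of the running total
SELECT sum(CASE WHEN action = 'IN' THEN 1 ELSE -1 END) AS people_in_room
FROM t_room;

-- who is still inside: names with more IN rows than OUT rows
SELECT name
FROM t_room
GROUP BY name
HAVING sum(CASE WHEN action = 'IN' THEN 1 ELSE -1 END) &gt; 0
ORDER BY name;</pre>
<p>With the sample data, the first query returns 2 and the second one returns james and john.</p>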
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/postgresql-who-is-in/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unexpected missing feature</title>
		<link>http://www.cybertec.at/en/unexpected-missing-feature</link>
		<comments>http://www.cybertec.at/en/unexpected-missing-feature#comments</comments>
		<pubDate>Mon, 20 Jun 2011 11:25:18 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1324</guid>
		<description><![CDATA[Dealing with banking-related matters is always an interesting thing. Last week I came across an interesting issue which seems to take some more thinking. The challenge is actually quite simple: “We have to trace all changes to data and structure”. Well, the first thing does not seem like a huge problem but how about changes to [...]]]></description>
			<content:encoded><![CDATA[
<p>Dealing with banking-related matters is always an interesting thing. Last week I came across an interesting issue which seems to take some more thinking.</p>
<p>The challenge is actually quite simple: “We have to trace all changes to data and structure”. The first part does not seem like a huge problem, but how about changes to data structures? Somebody might argue that …</p>
<pre>SET log_statement TO mod;</pre>
<p>… will do the job. This logs all DDL statements as well as commands that change data. However, consider …</p>
<pre>test=# SET log_statement TO none;
SET</pre>
<p>This is a perfectly valid SQL command, which leaves us with a problem when it comes to tracking and auditing. We cannot tell for certain who has modified what because logging of DDLs can be turned off from within a session.</p>
<p>It seems that some additional, reliable way to prohibit this is needed. Alternatively, some additional tracing / auditing mode might help as well. Having permissions on GUC variables does not seem too attractive because it might add an extra level of complexity nobody wants.</p>
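<p>One way to check who is allowed to change a setting at runtime is the pg_settings view. This is just a small sketch; how to read the result is described in the comments and may differ between PostgreSQL versions:</p>
<pre>-- context tells us when, and by whom, a parameter can be changed
SELECT name, setting, context
FROM pg_settings
WHERE name = 'log_statement';</pre>
<p>If context is reported as superuser, ordinary roles cannot change the value with SET, but a superuser session still can, which is exactly the kind of gap an auditor will ask about.</p>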
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/unexpected-missing-feature/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Linuxwochen in Vienna</title>
		<link>http://www.cybertec.at/en/linuxwochen-in-vienna</link>
		<comments>http://www.cybertec.at/en/linuxwochen-in-vienna#comments</comments>
		<pubDate>Mon, 09 May 2011 12:48:34 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1291</guid>
		<description><![CDATA[This was a fairly interesting weekend. This year’s “Linuxwochen” made a stop in Vienna and people enjoyed 3 days of interesting presentations and talks. What somewhat surprised me was the fact that Microsoft gave some talks on Open Source components provided by them, mostly in the area of open [...]]]></description>
			<content:encoded><![CDATA[
<p>This was a fairly interesting weekend. This year’s “Linuxwochen” made a stop in Vienna and people enjoyed 3 days of interesting presentations and talks.</p>
<p>What somewhat surprised me was the fact that Microsoft gave some talks on Open Source components provided by them, mostly in the area of open government data and cloud-compliance. It is amusing to see how bad the sentiment towards those people actually was … but, after all, my PostgreSQL talk had at least twice as many listeners as the “Open Data” talk given by Microsoft before it <img src='http://www.cybertec.at/en/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>I gave a talk called “PostgreSQL roadshow” (video will be available soon) which showed some new technologies in the PostgreSQL area such as KNN, PgOpenCL, writable CTEs and so on. It seems people are highly interested in stuff which goes seriously beyond “key / value”.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/linuxwochen-in-vienna/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nest it – “yes we can”</title>
		<link>http://www.cybertec.at/en/nest-it-yes-we-can</link>
		<comments>http://www.cybertec.at/en/nest-it-yes-we-can#comments</comments>
		<pubDate>Mon, 18 Apr 2011 12:01:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.cybertec.at/en/?p=1278</guid>
		<description><![CDATA[Sometimes it starts to be a little scary if somebody comes up and says “yes we can”. For some reason there is always a little suspicion around in those cases. Last week I successfully answered the “can you nest subselects” question for the 52342424285675th time so I decided to move “yes we can” a little [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes it starts to be a little scary if somebody comes up and says “yes we can”. For some reason there is always a little suspicion around in those cases.</p>
<p>Last week I successfully answered the “can you nest subselects” question for the 52342424285675<sup>th</sup> time, so I decided to take “yes we can” a little further and turn it into “yes, we REALLY can”. Here is the proof:</p>

<pre> 
WITH x AS ( SELECT row_number() 
    OVER (PARTITION BY (SELECT * 
        FROM ( SELECT 1) AS e) ORDER BY 1) 
            FROM ( SELECT 1 d) AS y)
    SELECT ( SELECT * FROM ( SELECT * FROM x) AS x) AS output
        FROM ( SELECT 1 AS z) AS x
        WHERE z &gt;= ( SELECT 1)
            OR z IN ( (SELECT 1), (SELECT 1))
        ORDER BY ( SELECT 1)
        LIMIT ( SELECT 1 FROM ( SELECT * FROM x) AS a);

[hs@lola ~]$ psql test &lt; /tmp/file.sql 
 output 
--------
      1
(1 row)

</pre>

<p>I think we have now fully proven that PostgreSQL can REALLY nest subselects. If this is not enough evidence, speak up now or stay silent forever <img src='http://www.cybertec.at/en/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.cybertec.at/en/nest-it-yes-we-can/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

