Exposing PostgreSQL server logs to users via SQL

During the last training session a curious participant asked if there’s a way to easily expose the database log to users – and indeed, there’s a pretty neat way for SQL-based access! So this time a quick demo on that. The approach, taking advantage of the File Foreign Data Wrapper extension is actually even brought out in the official docus, but still not too known, although mentioned as an “obvious use case” 😉 I must say though that this approach is best suited for development setups as under normal circumstances you would most probably want to keep the lid on your logs.

Setup steps

1. First you need to change the server configuration (postgresql.conf) and enable CSV logging as described in details here. This of course might result in some overhead on busy systems, as compared to ‘sysout’ as all the “columns” or info that Postgres has on the logged event is logged, especially problematic with log_statement = ‚all‘.

# set in postgresql.conf + restart/reload server (chaning logging_collector needs restart)
log_destination = 'csvlog'
logging_collector = on
log_filename = 'postgresql.log'	# NB! File will grow indefinitely and needs external truncating
					# Also note that file extension needs to stay .log

2. Install the “file_fdw” extension (“contrib” package needed) and create a foreign file server and a foreign table, linking to our above configured log file name.

CREATE EXTENSION file_fdw;

CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE pglog (
  log_time timestamp(3) with time zone,
  user_name text,
  database_name text,
  process_id integer,
  connection_from text,
  session_id text,
  session_line_num bigint,
  command_tag text,
  session_start_time timestamp with time zone,
  virtual_transaction_id text,
  transaction_id bigint,
  error_severity text,
  sql_state_code text,
  message text,
  detail text,
  hint text,
  internal_query text,
  internal_query_pos integer,
  context text,
  query text,
  query_pos integer,
  location text,
  application_name text
) SERVER pglog
OPTIONS ( filename 'pg_log/postgresql.csv', format 'csv' );

3. Grant access as needed, or if you want that every user can see only his/her own entries, bring views into play, with security_barrier set when security matters. For 9.5+ servers one could even use the flashy Row Level Security mechanisms to set up some more obscure row visibility rules. Downside is that you need to set up a parent-child relationship then, as RLS cannot work with the “virtual” table directly.

GRANT SELECT ON pglog TO public;

# or the view based way

CREATE OR REPLACE view v_pglog WITH (security_barrier=true) AS SELECT * FROM pglog WHERE user_name = current_user;
GRANT SELECT ON v_pglog TO public;

4. And another additional idea – a handy way to expose and physically keep around (automatic truncation) only 7 days of logs is to define 7 child tables for a master one. Process would then look something like that:

# set in postgresql.conf + reload conf
log_filename = 'postgresql-%a.csv'	# Keep 7d of logs in files ‘postgresql-Mon.csv’ etc.

CREATE TABLE pglog (
  log_time timestamp(3) with time zone,
  ...
  application_name text
);

CREATE FOREIGN TABLE pglog_mon (
    …
    )  SERVER pglog
    OPTIONS ( filename 'pg_log/postgresql-Mon.csv', format 'csv' );
ALTER TABLE  pglog_mon INHERIT  pglog;

# rinse and repeat for Tue...Sun

Not a „one size fits all“ solution

The only problem with the laid out approach of course is that it might not be a perfect fit if you need relatively frequent monitoring queries on those logs, as queries need to read through all of the logfiles every time as we can see via EXPLAIN:

krl@postgres=# explain select * from pglog where error_severity in ('ERROR', 'FATAL') order by log_time desc limit 1;
                                 QUERY PLAN                                  
─────────────────────────────────────────────────────────────────────────────
 Limit  (cost=13587.31..13587.31 rows=1 width=556)
   ->  Sort  (cost=13587.31..13589.27 rows=782 width=556)
         Sort Key: log_time DESC
         ->  Foreign Scan on pglog  (cost=0.00..13583.40 rows=782 width=556)
               Filter: (error_severity = ANY ('{ERROR,FATAL}'::text[]))
               Foreign File: pg_log/postgresql.csv
               Foreign File Size: 45645532
(7 rows)

In such cases a typical approach would be to write somekind of simple logs importing Python cronjob that scans and parses the CSV logfiles and inserts entries into an actual table (typically on a dedicated logging database), where “log_time” column could be indexed for better performance. Or another direction (if you’re not super worried about privacy) would be to use a 3rd party SaaS provider like Loggly or Scalyr who have according log exporting means available.

Kaarel Moppel
I’ve been interested with databases for the last 10 years, working last 6 years exclusively with PostgreSQL. And still I’m constantly surprised by it’s powerful set of features and the fast pace of development by the globally friendly community. On my spare time I enjoy playing soccer and travelling.