Citus columnar storage hands-on

Preamble

In one of our projects we plan to replicate part of an Oracle database to either MariaDB or PostgreSQL. The goal is to use one of these open source database technologies to create a kind of Business Intelligence (BI) solution.

One cool feature to implement when you design a BI solution is columnar storage, which answers most BI ad-hoc queries well (not all of course; the typical use case is aggregation on a few columns). In Oracle Database the (paid) feature is called In-Memory Column Store and couples figures kept in memory with storage in a columnar fashion.

In MariaDB the feature is called ColumnStore, but my trial of it has not been as smooth as expected… I might come back to it with a more recent release to see if things have improved…

When I was looking for a columnar storage for PostgreSQL I immediately found Swarm64 (a non-free product), but apparently it is more a columnstore indexing product on top of traditional heap tables than a new columnar storage engine.

The second one, whose name was already known to me, was Citus, an open source columnar storage engine. Citus is the scale-out/sharding solution behind the Hyperscale (Citus) PostgreSQL offer of Microsoft Azure. With Citus release 10, published in March 2021, they added (on top of other cool features) a columnar storage. I feel like Microsoft is committing to the project; better that than a repeat of the Elasticsearch story…

One current trade-off of Citus is the lack of UPDATE/DELETE support on objects using this storage engine, so you must keep its append-only way of working in mind. No doubt it will evolve…

My testing server is a bare metal server with 12 cores and 64GB of memory, running PostgreSQL 14.2. The underlying filesystem is Veritas VxVM/VxFS.

Citus installation

I initialized an empty PostgreSQL database:

[postgres@server1 ~]$ pg_ctl initdb -D /postgres/14/data/
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
 
The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
 
Data page checksums are disabled.
 
fixing permissions on existing directory /postgres/14/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Paris
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
 
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
 
Success. You can now start the database server using:
 
    /usr/pgsql-14/bin/pg_ctl -D /postgres/14/data -l logfile start

Change pg_hba.conf to allow connections with the account you wish from the clients you wish. I will personally use local connections from the PostgreSQL cluster only…
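For illustration, a local-only setup could rely on an entry like this (a minimal sketch; adapt the authentication method to your security requirements):

# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             postgres                                trust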

To install Citus, configure their repository with:

[root@server1 ~]# curl https://install.citusdata.com/community/rpm.sh | sudo bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html
 
curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

As I’m behind a corporate proxy I had to configure the .curlrc file of my postgres account, as well as use the insecure option, to bypass the painful proxy certificate configuration:

[root@server1 ~]# cat .curlrc
proxy = "http://proxy_user:proxy_password@proxy_server.domain.com:proxy_port/"
insecure

Finally:

[root@server1 ~]# curl https://install.citusdata.com/community/rpm.sh | sudo bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8697  100  8697    0     0  30766      0 --:--:-- --:--:-- --:--:-- 30731
Detected operating system as rhel/7.
Checking for curl...
Detected curl...
Checking for postgresql14-server...
Detected postgresql14-server...
Checking for EPEL repositories...
Installing epel-release repo...
WARNING:
The EPEL repository could not be installed. This means You may not be able to satisfy package dependencies.
To fix this, manually install EPEL repository and then install packages.
More information: https://fedoraproject.org/wiki/EPEL#How_can_I_use_these_extra_packages.3F
 
Downloading repository file: https://repos.citusdata.com/community/config_file.repo?os=rhel&dist=7&source=script... done.
Installing pygpgme to verify GPG signatures... done.
Installing yum-utils... done.
Generating yum cache for citusdata_community... done.
 
The repository is set up! You can now install packages.

Again because of my proxy server I had to remove the SSL check by setting sslverify=0 (instead of sslverify=1) in /etc/yum.repos.d/citusdata_community.repo, and finally:

[root@server1 ~]# yum install citus102_14.x86_64

I immediately tried to create the extension, but:

postgres=# create extension if not exists citus;
ERROR:  Citus can only be loaded via shared_preload_libraries
HINT:  Add citus to shared_preload_libraries configuration variable in postgresql.conf in master and workers. Note that citus should be at the beginning of shared_preload_libraries.

So I changed postgresql.conf; citus must be the first shared library you load if you have multiple:

shared_preload_libraries = 'citus'
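# or, when loading several libraries, keep citus first: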
shared_preload_libraries = 'citus,pg_stat_statements,pg_stat_monitor'

Extension creation worked this time:

postgres=# create extension if not exists citus;
CREATE EXTENSION
postgres=# select * from pg_extension;
  oid  | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-------+---------+----------+--------------+----------------+------------+-----------+--------------
 14472 | plpgsql |       10 |           11 | f              | 1.0        |           |
 16387 | citus   |       10 |           11 | f              | 10.2-4     |           |
(2 rows)

And voilà, we are ready to use it with a simple USING COLUMNAR at the end of table creation commands…
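For example, a minimal columnar table (a throwaway illustration, not part of the data model used below):

create table t_columnar
  (
     id int
  )
using columnar;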

Citus columnar storage testing preparation

For testing I decided to use the data model that MariaDB provides for free in their GitHub repository and that I had used while testing MariaDB ColumnStore. I also thought it would be a cool exercise to practice a bit and load this data model in my PostgreSQL cluster.

The airlines and airports CSV files are in the repository, but you have to download the flights CSV, which is much bigger (6GB). To keep it simple, as I did with MariaDB, I fetched the file with wget:

[root@server1 tmp]# wget https://sample-columnstore-data.s3.us-west-2.amazonaws.com/flights.csv

I also modified the creation script of the flights table to change a few columns from TINYINT to SMALLINT and a few number columns to character types. I also added a last unused column (last_comma decimal) to absorb the trailing comma at the end of each line of the CSV, so that the PostgreSQL COPY command works. Here is my final creation script; for the traditional (heap) version just remove the USING COLUMNAR keywords:

create table airlines
  (
     iata_code varchar(2),
     airline   varchar(30)
  )
using columnar;
 
drop table airports;
create table airports
  (
     iata_code varchar(3),
     airport   varchar(80),
     city      varchar(30),
     state     varchar(2),
     country   varchar(30),
     latitude  decimal(11, 4),
     longitude decimal(11, 4)
  )
using columnar;
 
drop table flights;
create table flights
  (
     year                smallint,
     month               smallint,
     day                 smallint,
     day_of_week         smallint,
     fl_date             date,
     carrier             varchar(2),
     tail_num            varchar(6),
     fl_num              smallint,
     origin              varchar(5),
     dest                varchar(5),
     crs_dep_time        varchar(4),
     dep_time            varchar(4),
     dep_delay           decimal,
     taxi_out            decimal,
     wheels_off          varchar(4),
     wheels_on           varchar(4),
     taxi_in             decimal,
     crs_arr_time        varchar(4),
     arr_time            varchar(4),
     arr_delay           decimal,
     cancelled           decimal,
     cancellation_code   varchar(4),
     diverted            decimal,
     crs_elapsed_time    decimal,
     actual_elapsed_time decimal,
     air_time            decimal,
     distance            decimal,
     carrier_delay       decimal,
     weather_delay       decimal,
     nas_delay           decimal,
     security_delay      decimal,
     late_aircraft_delay decimal,
     last_comma          decimal
  )
using columnar;

Load the data with the following (ensure you have around 5-10GB of free space for your PostgreSQL cluster):

postgres=# copy airlines from '/tmp/airlines.csv' delimiter ',' csv;
COPY 30
 
postgres=# copy airports from '/tmp/airports.csv' delimiter ',' csv header;
COPY 399
 
postgres=# copy flights from '/tmp/flights.csv' delimiter ',' csv;
COPY 38083735
postgres=# select count(*) from flights;
  count
----------
 38083735
(1 row)
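To see how much space the loaded table actually occupies (columnar compression typically shrinks it considerably compared to heap), the standard size functions can be used, for example:

postgres=# select pg_size_pretty(pg_total_relation_size('flights'));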

As a trick, if you face issues while loading an external file, you can load each file line into a single-column row and then use the string functions to cut/split/translate the raw string. The chosen delimiter must not be present in the external file:

postgres=# copy flights from '/tmp/flights.csv' delimiter ',' csv;
ERROR:  extra data after last expected column
CONTEXT:  COPY flights, line 1: "2015,1,15,4,2015-01-15,"AA","N4YTAA","362","DFW","MKE","2000","2032",32.00,13.00,"2045","2236",5.00,..."
 
postgres=# CREATE TEMP TABLE tmp_data (raw text);
CREATE TABLE
postgres=# copy tmp_data  from '/tmp/flights.csv' delimiter '$';
COPY 38083735
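From there, the raw lines can be split back into columns. A minimal sketch, which ignores quoting subtleties (it assumes no commas inside quoted fields) and fills only the first columns of flights for illustration:

insert into flights (year, month, day, day_of_week, fl_date, carrier)
select parts[1]::smallint,
       parts[2]::smallint,
       parts[3]::smallint,
       parts[4]::smallint,
       parts[5]::date,
       trim(both '"' from parts[6])
from   (select string_to_array(raw, ',') as parts from tmp_data) s;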

To put myself in the best possible situation I VACUUMed all tables and, as advised in one of the official Citus blog posts, issued CHECKPOINT twice:

postgres=# vacuum freeze verbose analyze flights;
INFO:  aggressively vacuuming "public.flights"
INFO:  table "flights": found 0 removable, 38083735 nonremovable row versions in 705284 out of 705284 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 797
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 6.99 s, system: 6.04 s, elapsed: 39.03 s.
INFO:  aggressively vacuuming "pg_toast.pg_toast_16970"
INFO:  table "pg_toast_16970": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 797
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  analyzing "public.flights"
INFO:  "flights": scanned 30000 of 705284 pages, containing 1619734 live rows and 0 dead rows; 30000 rows in sample, 38079082 estimated total rows
VACUUM
postgres=# checkpoint;
CHECKPOINT
postgres=# checkpoint;
CHECKPOINT

You can confirm a table is using the Citus storage engine by looking for "Access method: columnar":

postgres=# \d+ airlines
                                                 Table "public.airlines"
  Column   |         Type          | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
-----------+-----------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 iata_code | character varying(2)  |           |          |         | extended |             |              |
 airline   | character varying(30) |           |          |         | extended |             |              |
Access method: columnar
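Alternatively, the access method of each table can be read straight from the catalog (a small sketch joining pg_class and pg_am):

select c.relname, am.amname
from   pg_class c
       join pg_am am on am.oid = c.relam
where  c.relname in ('airlines', 'airports', 'flights');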

Citus columnar storage testing results

To get the timing of an executed statement use:

postgres=# \timing on
Timing is on.

To remove the automatic pause while displaying results, use the command below. I have not found how to suppress the output entirely, like in Oracle SQL*Plus:

postgres=# \pset pager off
Pager usage is off.

The test query is the same one I used while testing MariaDB ColumnStore: the airline_delay_types_by_year.sql script. The version modified for PostgreSQL:

\timing on
\pset pager off
SELECT q.airline,
       q.year,
       q.delay_type,
       q.delay
FROM (SELECT a.airline,
             f.year,
             'Airline Delay' AS delay_type,
             COUNT(*) AS delay
      FROM flights f
      JOIN airlines a ON f.carrier = a.iata_code
      WHERE f.carrier_delay > 0
      GROUP BY a.airline, f.year
      UNION ALL
      SELECT a.airline,
             f.year,
             'Late Aircraft Delay' AS delay_type,
             COUNT(*) AS delay
      FROM flights f
      JOIN airlines a ON f.carrier = a.iata_code
      WHERE f.late_aircraft_delay > 0
      GROUP BY a.airline, f.year
      UNION ALL
      SELECT a.airline,
             f.year,
             'Air System Delay' AS delay_type,
             COUNT(*) AS delay
      FROM flights f
      JOIN airlines a ON f.carrier = a.iata_code
      WHERE f.nas_delay > 0
      GROUP BY a.airline, f.year
      UNION ALL
      SELECT a.airline,
             f.year,
             'Weather Delay' AS delay_type,
             COUNT(*) AS delay
      FROM flights f
      JOIN airlines a ON f.carrier = a.iata_code
      WHERE f.weather_delay > 0
      GROUP BY a.airline, f.year) AS q
ORDER BY 1, 2, 3;

Then, to get only the execution time and not the result, I used this bash command:

[postgres@server1 ~]$ psql --port=5433 --host=server1.domain.com --username=postgres -f airline_delay_types_by_year.sql | grep 'Time:'
Time: 7679.404 ms (00:07.679)

I ran the columnar and heap versions of the test query multiple times, always discarding the first few runs, which only serve to warm the database cache.
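Collecting several timings is then just a matter of looping over the same command, something like this (a simple sketch; 10 is an arbitrary number of runs):

[postgres@server1 ~]$ for i in $(seq 1 10); do psql --port=5433 --host=server1.domain.com --username=postgres -f airline_delay_types_by_year.sql | grep 'Time:'; done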

With more than five executions I got an average running time of 7.6 seconds for the columnar version, and with more than five executions of the heap version I got an average of 16.5 seconds, i.e. the columnar version is more than twice as fast:

 Columnar    | Heap
-------------+--------------
 7.6 seconds | 16.5 seconds

Of course this is not a benchmark, as I tuned almost nothing on my test PostgreSQL instance and I have NOT created any indexes on the heap version of my table. The only non-default instance parameters I used are:

listen_addresses = 'server1.domain.com'
port = 5433
shared_buffers = 1GB
