Getting started with MariaDB Galera Cluster and Percona XtraDB Cluster

Preamble

I played with MySQL Cluster a long time ago, even before I started this blog, which is why I have never published anything about it. It worked and the implementation went very well, even if the release I used can nowadays be considered old. I imagine it is even smoother now… The only drawback I saw was the use of the NDB storage engine, which is obviously not InnoDB and has its own set of restrictions, although some of them may have been removed in the latest releases.

Then I met a MariaDB representative who mentioned Galera Cluster. I attended a few webinars about it and, honestly, I was impressed by the product. I gave it a try and was even more impressed by how easy it is to set up! The storage engine behind it is InnoDB, and it relies on a shared-nothing architecture (each cluster node has a full copy of the data)!! Did I mention the product is free and open source? Last but not least, they are not new to the business, as their first release was published in 2009…

I recently read an article about NoSQL architectures, and even if those are not relational databases, what is really appreciated in that world is fast and easy provisioning of new nodes in an existing cluster. I have to say that with Galera Cluster you get the best of both worlds: a relational database and super fast provisioning.

Galera Cluster comes as a library from the Codership company, and you need to use a modified edition of MySQL coming from either Percona or MariaDB.

The testing for this blog post was done using Oracle Linux Server release 6.5 (64-bit) with MariaDB Galera 5.5.33a and Percona XtraDB Cluster 5.5.34.

My first server is called server1.domain.com and has the non-routable IP address 192.168.56.101. The second server is called server2.domain.com and has the non-routable IP address 192.168.56.102.

Galera cluster installation

I initially tried to compile Galera library:

[root@server1 galera-24.2.7-src]# yum -y install scons.noarch
[root@server1 galera-24.2.7-src]# yum -y install boost-devel.x86_64
[root@server1 galera-24.2.7-src]# yum -y install openssl-devel.x86_64
[root@server1 galera-24.2.7-src]# yum -y install check-devel.x86_64

Then simply execute the scons command in the directory where you have uncompressed the Galera source package. If you have all the prerequisites installed it should go smoothly; otherwise the error messages are quite self-explanatory, and you just have to install the needed package(s) to resolve them:

[root@server1 galera-25.3.1-src]# scons

Then, as seen around the internet (in this thread for example: Shinguz: Building Galera Replication from Scratch), there is no “scons install” command, so the dirty solution to install the library is:

cp garb/garbd /home/mysql/product/mysql-5.5.15-wsrep-21.2/bin/
cp libgalera_smm.so /home/mysql/product/mysql-5.5.15-wsrep-21.2/lib/plugin/

Or, the lazy approach, if you are lucky enough to work on one of the supported Linux releases: even though I am using Oracle Linux Server release 6.4, I was able to install the RedHat 6 package (Oracle Linux is not a bit-for-bit copy of RedHat, but most packages work fine on it):

[root@server1 tmp]# rpm -Uvh galera-25.3.1-1.rhel6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:galera                 ########################################### [100%]

If you check the rpm contents you will see that a few more files are installed compared with the basic cp commands used after compiling the library from source:

[root@server1 tmp]# rpm -qpl galera-25.3.1-1.rhel6.x86_64.rpm
/etc/init.d/garb
/etc/sysconfig/garb
/usr/bin
/usr/bin/garbd
/usr/lib64/galera
/usr/lib64/galera/libgalera_smm.so
/usr/share/doc/galera
/usr/share/doc/galera/COPYING
/usr/share/doc/galera/README
/usr/share/doc/galera/README-MySQL

Even though I was using MariaDB Galera Cluster 5.5.33a, I tried to use the latest Galera library, but then MariaDB failed to start with:

131113 12:48:28 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
131113 12:48:28 [ERROR] WSREP: provider interface version mismatch: need '23', found '25'
131113 12:48:28 [ERROR] WSREP: wsrep_load(): interface version mismatch: my version 23, provider version 25
131113 12:48:28 [ERROR] WSREP: wsrep_load(/usr/lib64/galera/libgalera_smm.so) failed: Invalid argument (22). Reverting to no provider.
131113 12:48:28 [Note] WSREP: Read nil XID from storage engines, skipping position init
131113 12:48:28 [Note] WSREP: wsrep_load(): loading provider library 'none'
131113 12:48:28 [ERROR] Aborting

So I had to use the exact required release, i.e. Galera 23.2.7…

Even though the minimal configuration is a 3-node Galera cluster, or 2 nodes plus the Galera Arbitrator Daemon, I decided to do my testing with only 2 nodes. This minimum number of nodes is required to handle split-brain situations, as 1 node out of 2 is not considered a majority…
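If you do not want to run with only 2 data nodes, the Galera Arbitrator Daemon (garbd, shipped in the Galera rpm installed above, see /etc/init.d/garb) can provide the third quorum vote from any small machine without storing any data. A minimal invocation sketch, assuming the cluster name and node addresses used in this post:

```sh
# Run on a third machine to act as tie-breaker (no data is stored there)
garbd --address gcomm://192.168.56.101,192.168.56.102 \
      --group galera_cluster01 \
      --daemon
```

The /etc/sysconfig/garb file installed by the rpm can hold the same settings if you prefer to drive it through the init script.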

As we saw in a previous post, I use MySQL Optimal Configuration Architecture (MOCA) as a standard to install MySQL and its forks:

Directory                Used for
/mysql/data01/mysql01    Stores MyISAM and InnoDB files; dataxx directories can also be created to spread I/O
/mysql/dump/mysql01      All log files (slow log, error log, general log, …)
/mysql/logs/mysql01      All binary logs (log-bin, relay_log)
/mysql/software/mysql01  MySQL binaries (the my.cnf file is stored in a conf subdirectory, as well as the socket and pid files)

I also define the aliases below to ease stopping/starting and connecting to the MySQL instance:

[mysql@server1 ~]$ alias
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
alias mysql01='/mysql/software/mysql01/bin/mysql --defaults-file=/mysql/software/mysql01/conf/my.cnf --user=root --password=`cat ~mysql/.root_password`'
alias start_mysql01='cd /mysql/software/mysql01/; ./bin/mysqld_safe --defaults-file=/mysql/software/mysql01/conf/my.cnf &'
alias stop_mysql01='/mysql/software/mysql01/bin/mysqladmin --defaults-file=/mysql/software/mysql01/conf/my.cnf --user=root --password=`cat ~mysql/.root_password` shutdown'
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'

Of course this must be done on all the nodes of your cluster…

Galera cluster configuration

As stated in the Codership documentation, the only mandatory parameters are the following (all the rest is personal tuning):

query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2

For information the my.cnf file I use is:

[mysqld]
user=mysql
basedir=/mysql/software/mysql01
datadir=/mysql/data01/mysql01
port=3316
pid_file=/mysql/software/mysql01/conf/mysql01.pid
log_error=/mysql/dump/mysql01/mysql01.err
default-storage-engine=innodb
tmpdir=/tmp
binlog_format=row
server-id=1
log_bin=/mysql/logs/mysql01/mysql-bin.log
socket=/mysql/software/mysql01/conf/mysql01.sock
sql_mode='TRADITIONAL,PIPES_AS_CONCAT'
slow_query_log_file=/mysql/dump/mysql01/mysql01-slow.log
slow_query_log=1
long_query_time=5
 
innodb_buffer_pool_size=50M
innodb_flush_log_at_trx_commit=2
innodb_log_buffer_size=12M
innodb_log_file_size=50M
innodb_log_files_in_group = 4
innodb_thread_concurrency=16
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=2
 
query_cache_size=0
 
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=100M"
#wsrep_cluster_address='gcomm://192.168.56.101,192.168.56.102'
wsrep_cluster_address='gcomm://'
wsrep_cluster_name='galera_cluster01'
wsrep_node_address='192.168.56.101'    # '192.168.56.102' on the second node
wsrep_node_name='server1.domain.com'   # 'server2.domain.com' on the second node
wsrep_sst_method=rsync                 # or xtrabackup | mysqldump
wsrep_sst_auth=root:root_password
 
[mysql]
default-character-set=utf8
socket=/mysql/software/mysql01/conf/mysql01.sock
 
[client]
default-character-set=utf8
socket=/mysql/software/mysql01/conf/mysql01.sock

Please note that when starting the Galera cluster with only one node you must replace the full node list in wsrep_cluster_address with the empty 'gcomm://' value (the same applies in case of split brain). Once the second node is started, you can set the full list dynamically with the SET GLOBAL command.
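As a small convenience sketch (the node IPs are the ones used in this post), the full cluster address string can be assembled from the node list before being fed to SET GLOBAL or my.cnf:

```shell
# Build the wsrep_cluster_address value from a space-separated node list
nodes="192.168.56.101 192.168.56.102"
cluster_address="gcomm://$(echo $nodes | tr ' ' ',')"
echo "$cluster_address"
```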

The wsrep_sst_method parameter determines the State Snapshot Transfer (SST) method to be used:

method      speed    blocks the donor       can be done on live node?  logical/physical  requires root access to MySQL server?
mysqldump   slow     yes                    yes                        logical           both donor and joiner
rsync       fastest  yes                    no                         physical          none
xtrabackup  fast     for a very short time  no                         physical          donor only

On the first node, run the mysql_install_db script and start your MySQL instance.

I create the test table below:

MariaDB [(none)]> drop database test;
Query OK, 0 rows affected (0.01 sec)
 
MariaDB [(none)]> create database test character set=utf8 collate=utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> use test
Database changed
MariaDB [test]> CREATE TABLE test1(id int, descr varchar(50), primary key (id));
Query OK, 0 rows affected (0.08 sec)

Which I load with the procedure below:

DELIMITER $$
 
DROP PROCEDURE IF EXISTS fill_test1;
 
CREATE PROCEDURE fill_test1()
BEGIN
  DECLARE count INT DEFAULT 1;
 
  WHILE count <= 10000 DO
    INSERT INTO test1 VALUES(count,count);
    SET count=count+1;
  END WHILE;
END;
$$
 
DELIMITER ;

Implementation with rsync

You just need to ensure the rsync tool is installed and that port 4444 is open between the cluster nodes. Also, this is the only method where you do not need to put a powerful account's password in clear text (wsrep_sst_auth).
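A quick pre-flight sketch, run from the donor before starting the joiner (nc is assumed to be installed; note that the joiner's rsync daemon only listens on 4444 during the SST itself, so a failed port check may simply mean no SST is in progress rather than a firewall problem):

```sh
rpm -q rsync                       # rsync must be present on both nodes
nc -z 192.168.56.102 4444 && echo "port 4444 reachable on the joiner"
```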

So on my first node (server1.domain.com) I have the following table, and nothing yet on node 2 (server2.domain.com):

MariaDB [test]> call fill_test1;
Query OK, 1 row affected (2.29 sec)
 
MariaDB [test]> select count(*) from test1;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.06 sec)

Then on server2 just execute the mysql_install_db script:

[mysql@server2 mysql01]$ ./scripts/mysql_install_db --defaults-file=/mysql/software/mysql01/conf/my.cnf

Then simply start MySQL (obviously, ensure wsrep_cluster_address is set to ‘gcomm://192.168.56.101,192.168.56.102’):

[mysql@server2 mysql01]$ start_mysql01
[1] 2878
[mysql@server2 mysql01]$ 131202 14:27:59 mysqld_safe Logging to '/mysql/dump/mysql01/mysql01.err'.
131202 14:27:59 mysqld_safe Starting mysqld daemon with databases from /mysql/data01/mysql01
131202 14:27:59 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.GTO7qaOQaJ                                 --pid-file=/mysql/data01/mysql01/server2.domain.com-recover.pid
 
[mysql@server2 mysql01]$ 131202 14:28:08 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1

While the cluster is synchronizing you cannot connect to it yet:

[mysql@server2 mysql01]$ mysql01
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/mysql/software/mysql01/conf/mysql01.sock' (111)

In fact, not while the rsync processes are running:

[mysql@server2 mysql01]$ ps -ef | grep mysql
root      1512  1493  0 12:34 pts/0    00:00:00 su - mysql
mysql     1513  1512  0 12:34 pts/0    00:00:00 -bash
mysql     2878  1513  0 14:27 pts/0    00:00:00 /bin/sh ./bin/mysqld_safe --defaults-file=/mysql/software/mysql01/conf/my.cnf
mysql     3390  2878  0 14:28 pts/0    00:00:00 /mysql/software/mysql01/bin/mysqld --defaults-file=/mysql/software/mysql01/conf/my.cnf --basedir=/mysql/software/mysql01 --datadir=/mysql/data01/mysql01 --plugin-dir=/mysql/software/mysql01/lib/plugin --log-error=/mysql/dump/mysql01/mysql01.err --pid-file=/mysql/software/mysql01/conf/mysql01.pid --socket=/mysql/software/mysql01/conf/mysql01.sock --port=3316 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1
mysql     3398  3390  0 14:28 pts/0    00:00:00 /bin/bash -ue /mysql/software/mysql01//bin/wsrep_sst_rsync --role joiner --address 192.168.56.102 --auth  --datadir /mysql/data01/mysql01/ --defaults-file /mysql/software/mysql01/conf/my.cnf --parent 3390
mysql     3423     1  0 14:28 ?        00:00:00 rsync --daemon --port 4444 --config /mysql/data01/mysql01//rsync_sst.conf
mysql     3441  3423  0 14:28 ?        00:00:00 rsync --daemon --port 4444 --config /mysql/data01/mysql01//rsync_sst.conf
mysql     3442  3441 52 14:28 ?        00:00:05 rsync --daemon --port 4444 --config /mysql/data01/mysql01//rsync_sst.conf
mysql     3484  3398  0 14:28 pts/0    00:00:00 sleep 1
mysql     3485  1513  0 14:28 pts/0    00:00:00 ps -ef
mysql     3486  1513  0 14:28 pts/0    00:00:00 grep mysql

Once the synchronization is over you can connect and see that your test table has been replicated from server1. Now let's connect to server1, issue the command below, and change my.cnf to reflect the change for the next restart:

MariaDB [test]> set global wsrep_cluster_address='gcomm://192.168.56.101,192.168.56.102';
Query OK, 0 rows affected (4.07 sec)

And that’s it !!

Implementation with mysqldump

Unlike rsync and xtrabackup, mysqldump replicates the data while the node joining the cluster is already up and running (export/import).

I trashed MySQL on my second server (server2.domain.com) with:

[mysql@server2 ~]$ stop_mysql01
131206 11:29:37 mysqld_safe mysqld from pid file /mysql/software/mysql01/conf/mysql01.pid ended
[1]+  Done                    ./bin/mysqld_safe --defaults-file=/mysql/software/mysql01/conf/my.cnf  (wd: /mysql/software/mysql01)
[mysql@server2 mysql]$ cd /mysql
[mysql@server2 mysql]$ rm -rf data01/mysql01/* dump/mysql01/* logs/mysql01/*
[mysql@server2 mysql]$ cd /mysql/software/mysql01/
[mysql@server2 mysql01]$ ./scripts/mysql_install_db --defaults-file=/mysql/software/mysql01/conf/my.cnf

Then recreate and start it, and grant the privileges below to allow mysqldump to work:

MariaDB [(none)]> delete from mysql.user where user='';
Query OK, 2 rows affected (0.04 sec)
 
MariaDB [(none)]> update mysql.user set password=password('root_password');
Query OK, 4 rows affected (0.04 sec)
Rows matched: 4  Changed: 4  Warnings: 0
 
MariaDB [(none)]> grant all on *.* to root@'server1' identified by 'root_password';
Query OK, 0 rows affected (0.00 sec)
 
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.02 sec)

Ensure the below parameters are set on all cluster nodes:

wsrep_sst_method=mysqldump
wsrep_sst_auth=root:root_password

We can see that the password is masked at the MySQL level:

MariaDB [(none)]> show variables like 'wsrep_sst%';
+---------------------------------+-----------+
| Variable_name                   | Value     |
+---------------------------------+-----------+
| wsrep_sst_auth                  | ********  |
| wsrep_sst_donor                 |           |
| wsrep_sst_donor_rejects_queries | OFF       |
| wsrep_sst_method                | mysqldump |
| wsrep_sst_receive_address       | AUTO      |
+---------------------------------+-----------+
5 rows in set (0.00 sec)

I was wondering how the mysqldump connection between cluster nodes works; I got part of the answer when my first test failed with (extract from the mysql01.err log file on the SST donor node):

.
.
ERROR 1130 (HY000): Host 'server1' is not allowed to connect to this MariaDB server
131205 16:24:44 [ERROR] WSREP: Process completed with error: wsrep_sst_mysqldump --user 'root' --password 'root_password' --host '192.168.56.102' --port '3316' --local-port '3316' --socket '/mysql/software/mysql01/conf/mysql01.sock' --datadir '/mysql/data01/mysql01/' --gtid '844371cf-5918-11e3-9422-0ffd82754721:25': 1 (Operation not permitted)
131205 16:24:44 [ERROR] WSREP: Try 1/3: 'wsrep_sst_mysqldump --user 'root' --password 'root_password' --host '192.168.56.102' --port '3316' --local-port '3316' --socket '/mysql/software/mysql01/conf/mysql01.sock' --datadir '/mysql/data01/mysql01/' --gtid '844371cf-5918-11e3-9422-0ffd82754721:25'' failed: 1 (Operation not permitted)
ERROR 1130 (HY000): Host 'server1' is not allowed to connect to this MariaDB server
.
.

After multiple fruitless attempts I came to the conclusion that the account specified in wsrep_sst_auth must be able to connect from any cluster node to any other cluster node (in case of a crash or re-instantiation). So if you use the root account (never do this in production) you should see something like the following, on all cluster nodes!!:

MariaDB [test]> select host,user from mysql.user where user='root';
+-----------+------+
| host      | user |
+-----------+------+
| 127.0.0.1 | root |
| ::1       | root |
| localhost | root |
| server1   | root |
| server2   | root |
+-----------+------+
5 rows in set (0.01 sec)

Tip:
Make sure it works well with an interactive session before relying on it (the flush privileges command may be used to avoid going crazy)…
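In practice, this means the wsrep_sst_auth account must exist for every node's hostname, on every node. A sketch of the grants, using the root account and the root_password placeholder from this post (to be run on each node):

```sql
GRANT ALL ON *.* TO 'root'@'server1' IDENTIFIED BY 'root_password';
GRANT ALL ON *.* TO 'root'@'server2' IDENTIFIED BY 'root_password';
FLUSH PRIVILEGES;
```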

Contrary to rsync, here you need to start your second node (server2.domain.com in my case) with a blank wsrep_cluster_address parameter (wsrep_cluster_address=’gcomm://’) and then start the replication with:

MariaDB [(none)]> set global wsrep_cluster_address='gcomm://192.168.56.101,192.168.56.102';
Query OK, 0 rows affected (3.01 sec)

Then you should be able to see your data replicated…

Implementation with xtrabackup

Here the conceptual issue I had was: how will it find the xtrabackup executable?

I started by installing it on all cluster nodes in the /mysql/software/percona-xtrabackup-2.1.6-Linux-x86_64 directory; as you might guess, it is the 2.1.6 release. I also added it to the profile of my MySQL Linux account on all nodes:
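The “added it to the profile” part is nothing more than a PATH extension in the mysql account's profile, along the lines of:

```sh
# In ~mysql/.bash_profile on every node (directory as installed above)
export PATH=$PATH:/mysql/software/percona-xtrabackup-2.1.6-Linux-x86_64/bin
```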

[mysql@server1 ~]$ xtrabackup -v
xtrabackup version 2.1.6 for Percona Server 5.1.70 unknown-linux-gnu (x86_64) (revision id: 702)

The first test I did just corrupted my second node's MySQL instance, with no particular error message. So I decided to have a look at the wsrep_sst_xtrabackup script, and when executing it at the command prompt I got:

[mysql@server2 ~]$ /mysql/software/mysql01/bin/wsrep_sst_xtrabackup
Can't find nc in the path

Which I solved with:

[root@server2 ~]# yum list nc*
[root@server2 ~]# yum install nc.x86_64

The xtrabackup SST method works the same way as rsync: the joining instance must not be already running when using it. This means xtrabackup can only be used when the instance starts. So once you have configured your second node, stop and start the MySQL instance with the needed parameters below:

wsrep_sst_method=xtrabackup
wsrep_sst_auth=root:root_password

I then got the error message below (in the donor's MySQL error file):

131206 14:49:32 [Warning] WSREP: Received unknown signal: 'Can't find innobackupex in the path'
131206 14:49:32 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'donor' --address '192.168.56.102:4444/xtrabackup_sst' --auth 'root:root_password' --socket '/mysql/software/mysql01/conf/mysql01.sock' --datadir '/mysql/data01/mysql01/' --defaults-file '/mysql/software/mysql01/conf/my.cnf' --gtid '844371cf-5918-11e3-9422-0ffd82754721:26': 22 (Invalid argument)

I did not want to modify the wsrep_sst_xtrabackup script, so I decided to copy all the xtrabackup executables and shell scripts into the MySQL binary directory. This answers my conceptual question of how it knows where the xtrabackup executables are. Before restarting the second node I killed:

[mysql@server2 ~]$ ps -ef | grep wsrep_sst_xtrabackup | grep -v grep
mysql    25342     1  0 14:49 pts/1    00:00:00 /bin/bash -ue /mysql/software/mysql01//bin/wsrep_sst_xtrabackup --role joiner --address 192.168.56.102 --auth root:root_password --datadir /mysql/data01/mysql01/ --defaults-file /mysql/software/mysql01/conf/my.cnf --parent 25334
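The copy mentioned above is a one-liner (paths are the ones used in this setup; innobackupex and the xtrabackup binaries must all end up next to the wsrep_sst_xtrabackup script):

```sh
cp /mysql/software/percona-xtrabackup-2.1.6-Linux-x86_64/bin/* /mysql/software/mysql01/bin/
```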

When starting the second node it failed with (SST donor MySQL error file):

WSREP_SST: [ERROR] innobackupex finished with error: 1.  Check /mysql/data01/mysql01//innobackup.backup.log (20131206 16:53:03.294)
WSREP_SST: [ERROR] Cleanup after exit with status:22 (20131206 16:53:03.299)
131206 16:53:03 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup --role 'donor' --address '192.168.56.102:4444/xtrabackup_sst' --auth 'root:root_password' --socket '/mysql/software/mysql01/conf/mysql01.sock' --datadir '/mysql/data01/mysql01/' --defaults-file '/mysql/software/mysql01/conf/my.cnf' --gtid '844371cf-5918-11e3-9422-0ffd82754721:26'

The /mysql/data01/mysql01//innobackup.backup.log log file contains:

tar: -: Cannot write: Broken pipe
tar: Error is not recoverable: exiting now
innobackupex: 'tar chf -' returned with exit code 2.
innobackupex: Error: Failed to stream '/tmp/backup-my.cnf': 2 at /mysql/software/mysql01//bin/innobackupex line 4722.

As mentioned many times on Google, this error is sporadic, so I restarted it without changing anything on the second node and this time it worked fine…

Other issues encountered

With Percona XtraDB Cluster 5.5.34-23.7.6.565 I had the error below when executing mysql_install_db:

Installing MySQL system tables...
/mysql/software/mysql01/bin/mysqld: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory

So I had to install the following (the first two commands help understand what to install):

[mysql@server1 ~]$ yum whatprovides libssl.so.6
[mysql@server1 ~]$ yum list openssl098e*
[root@server1 ~]# yum install openssl098e.x86_64

The Galera release to install is also the 23 series, so do not try to use the latest one:

131120 17:32:31 [ERROR] WSREP: provider interface version mismatch: need '23', found '25'
131120 17:32:31 [ERROR] WSREP: wsrep_load(): interface version mismatch: my version 23, provider version 25
