IT World: RDBMS, Unix and many more... (https://blog.yannickjaquier.com)

MariaDB ColumnStore installation and testing – part 2
https://blog.yannickjaquier.com/mysql/mariadb-columnstore-installation-and-testing-part-2.html

Preamble

After a first article (link) using the container edition of MariaDB ColumnStore, I wanted to deploy it on an existing custom MariaDB Server installation, because where I work we prefer to put files where we like, following the MOCA layout.

I had to give up on this part as the MariaDB documentation is really too poor; I might come back and update this article if things evolve positively…

MariaDB Community Server installation and configuration

I have updated the MOCA layout for MariaDB that we saw a long time ago; MOCA stands for MariaDB Optimal Configuration Architecture. Below is the MariaDB directory naming convention, where mariadb01 is the name of the instance:

Directory                     Used for
/mariadb/data01/mariadb01     Store MyISAM and InnoDB files; dataxx directories can also be created to spread I/O
/mariadb/dump/mariadb01       All log files (slow log, error log, general log, …)
/mariadb/logs/mariadb01       All binary logs (log-bin, relay_log)
/mariadb/software/mariadb01   MariaDB binaries; you might also want to use /mariadb/software/10.5.4 and share the binaries between multiple MariaDB instances. I personally believe that the extra 1GB for binaries is worth the flexibility it gives: you can upgrade one instance without touching the others.

The my.cnf file is then stored in a conf sub-directory, as well as the socket and pid files.

I have created a mariadb Linux account in the dba group and a /mariadb mount point of 5GB (xfs).
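A minimal sketch of those OS-level steps; the volume group name (vg00, reused later in this post) and the logical volume name are assumptions, adapt them to your environment:

# OS account and group used by the MOCA layout
groupadd dba
useradd -g dba mariadb

# 5GB XFS logical volume mounted on /mariadb (vg00/lvol10 are assumptions)
lvcreate -L 5g -n lvol10 vg00
mkfs -t xfs /dev/vg00/lvol10
mkdir -p /mariadb
echo '/dev/mapper/vg00-lvol10   /mariadb   xfs   defaults   0 0' >> /etc/fstab
mount /mariadb

# MOCA directory tree for the mariadb01 instance
mkdir -p /mariadb/{data01,dump,logs,software}/mariadb01 /mariadb/software/mariadb01/conf
chown -R mariadb:dba /mariadb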

The binary archive I downloaded is mariadb-10.5.4-linux-systemd-x86_64.tar.gz (for systems with systemd) as I have a recent Linux… The choice of the tar.gz release is deliberate as I want to be able to put it in the directory of my choice. With the RPMs you can only have one MariaDB installation per server, which can be really limiting (and really hard to manage with your customers) on modern powerful servers…
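Fetching and unpacking the archive into the MOCA software directory could look like the sketch below; the download URL is an assumption, grab the tar.gz matching your platform from mariadb.org:

cd /tmp
# URL is an assumption; pick the 10.5.4 bintar for systemd systems from https://mariadb.org/download/
curl -L -O https://downloads.mariadb.org/f/mariadb-10.5.4/bintar-linux-systemd-x86_64/mariadb-10.5.4-linux-systemd-x86_64.tar.gz
tar -xzf mariadb-10.5.4-linux-systemd-x86_64.tar.gz
# Move the content under the instance software directory and hand it over to the mariadb account
cp -r mariadb-10.5.4-linux-systemd-x86_64/* /mariadb/software/mariadb01/
chown -R mariadb:dba /mariadb/software/mariadb01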

I created the /mariadb/software/mariadb01/conf/my.cnf file with the content below (this is just a starting point, tuning it for your own workload is mandatory):

[server]
# Primary variables
basedir                         = /mariadb/software/mariadb01
datadir                         = /mariadb/data01/mariadb01
max_allowed_packet              = 256M
max_connect_errors              = 1000000
pid_file                        = /mariadb/software/mariadb01/conf/mariadb01.pid
skip_external_locking
skip_name_resolve

# Logging
log_error                       = /mariadb/dump/mariadb01/mariadb01.err
log_queries_not_using_indexes   = ON
long_query_time                 = 5
slow_query_log                  = ON     # Disabled for production
slow_query_log_file             = /mariadb/dump/mariadb01/mariadb01-slow.log

tmpdir                          = /tmp
user                            = mariadb

# InnoDB Settings
default_storage_engine          = InnoDB
innodb_buffer_pool_size         = 1G    # Use up to 70-80% of RAM
innodb_file_per_table           = ON
innodb_flush_log_at_trx_commit  = 0
innodb_flush_method             = O_DIRECT
innodb_log_buffer_size          = 16M
innodb_log_file_size            = 512M
innodb_stats_on_metadata        = ON
innodb_read_io_threads          = 64
innodb_write_io_threads         = 64

# New plugin directory for Columnstore
plugin_dir                      = /usr/lib64/mysql/plugin
plugin_maturity                 = beta

[client-server]
port                            = 3316
socket                          = /mariadb/software/mariadb01/conf/mariadb01.sock

As the root account I executed:

[root@server4 ~]# /mariadb/software/mariadb01/scripts/mariadb-install-db --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb
Installing MariaDB/MySQL system tables in '/mariadb/data01/mariadb01' ...
OK

To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system


Two all-privilege accounts were created.
One is root@localhost, it has no password, but you need to
be system 'root' user to connect. Use, for example, sudo mysql
The second is mariadb@localhost, it has no password either, but
you need to be the system 'mariadb' user to connect.
After connecting you can set the password, if you would need to be
able to connect as any of these users with a password and without sudo

See the MariaDB Knowledgebase at https://mariadb.com/kb or the
MySQL manual for more instructions.

You can start the MariaDB daemon with:
cd '/mariadb/software/mariadb01' ; /mariadb/software/mariadb01/bin/mysqld_safe --datadir='/mariadb/data01/mariadb01'

You can test the MariaDB daemon with mysql-test-run.pl
cd '/mariadb/software/mariadb01/mysql-test' ; perl mysql-test-run.pl

Please report any problems at https://mariadb.org/jira

The latest information about MariaDB is available at https://mariadb.org/.
You can find additional information about the MySQL part at:
https://dev.mysql.com
Consider joining MariaDB's strong and vibrant community:
Get Involved

It is new (at least to me) that from now on you can connect with the mariadb or root account without any password. In my mariadb Linux account I created the three aliases below:

alias mariadb01='/mariadb/software/mariadb01/bin/mariadb --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb'
alias start_mariadb01='cd /mariadb/software/mariadb01/; ./bin/mariadbd-safe --defaults-file=/mariadb/software/mariadb01/conf/my.cnf &'
alias stop_mariadb01='/mariadb/software/mariadb01/bin/mariadb-admin --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb shutdown' 

The start and stop commands work fine but the client connection (mariadb01 alias) failed with:

/mariadb/software/mariadb01/bin/mariadb: error while loading shared libraries: libncurses.so.5: cannot open shared object file: No such file or directory

I resolved it with:

dnf -y install ncurses-compat-libs-6.1-7.20180224.el8.x86_64

You can also connect with the root Linux account using the command below (thanks to unix_socket authentication you must be the matching OS user, so these MariaDB accounts cannot be faked):

[root@server4 ~]# /mariadb/software/mariadb01/bin/mariadb --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.5.4-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]>
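If you later need these accounts to work with a password (and without sudo), a hedged example, where the password value is of course a placeholder:

/mariadb/software/mariadb01/bin/mariadb --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=root \
  -e "SET PASSWORD FOR 'root'@'localhost' = PASSWORD('secure_password');"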

MariaDB ColumnStore installation and configuration

I expected, as it is written everywhere, to have ColumnStore available as a storage engine, but found nothing installed by default:

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
8 rows in set (0.000 sec)

MariaDB [(none)]> show plugins;
+-------------------------------+----------+--------------------+---------+---------+
| Name                          | Status   | Type               | Library | License |
+-------------------------------+----------+--------------------+---------+---------+
| binlog                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| mysql_native_password         | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| mysql_old_password            | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| MRG_MyISAM                    | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MEMORY                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| CSV                           | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| Aria                          | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| MyISAM                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| SPATIAL_REF_SYS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| GEOMETRY_COLUMNS              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| inet6                         | ACTIVE   | DATA TYPE          | NULL    | GPL     |
| inet_aton                     | ACTIVE   | FUNCTION           | NULL    | GPL     |
| inet_ntoa                     | ACTIVE   | FUNCTION           | NULL    | GPL     |
| inet6_aton                    | ACTIVE   | FUNCTION           | NULL    | GPL     |
| inet6_ntoa                    | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv4                       | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv6                       | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv4_compat                | ACTIVE   | FUNCTION           | NULL    | GPL     |
| is_ipv4_mapped                | ACTIVE   | FUNCTION           | NULL    | GPL     |
| CLIENT_STATISTICS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INDEX_STATISTICS              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| TABLE_STATISTICS              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| USER_STATISTICS               | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| wsrep                         | ACTIVE   | REPLICATION        | NULL    | GPL     |
| SQL_SEQUENCE                  | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| PERFORMANCE_SCHEMA            | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| InnoDB                        | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| INNODB_TRX                    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCKS                  | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_LOCK_WAITS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP                    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_RESET              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMPMEM                 | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMPMEM_RESET           | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_PER_INDEX          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_CMP_PER_INDEX_RESET    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_PAGE            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_PAGE_LRU        | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_BUFFER_POOL_STATS      | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_METRICS                | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_DEFAULT_STOPWORD    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_DELETED             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_BEING_DELETED       | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_CONFIG              | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_INDEX_CACHE         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_FT_INDEX_TABLE         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLES             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLESTATS         | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_INDEXES            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_COLUMNS            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FIELDS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FOREIGN            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_FOREIGN_COLS       | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_TABLESPACES        | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_DATAFILES          | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_VIRTUAL            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_MUTEXES                | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_SYS_SEMAPHORE_WAITS    | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| INNODB_TABLESPACES_ENCRYPTION | ACTIVE   | INFORMATION SCHEMA | NULL    | BSD     |
| SEQUENCE                      | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
| user_variables                | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| unix_socket                   | ACTIVE   | AUTHENTICATION     | NULL    | GPL     |
| FEEDBACK                      | DISABLED | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_GROUPS            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_QUEUES            | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_STATS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| THREAD_POOL_WAITS             | ACTIVE   | INFORMATION SCHEMA | NULL    | GPL     |
| partition                     | ACTIVE   | STORAGE ENGINE     | NULL    | GPL     |
+-------------------------------+----------+--------------------+---------+---------+
68 rows in set (0.002 sec)

There is also this query, which I found on the MariaDB web site:

SELECT plugin_name, plugin_version, plugin_maturity FROM information_schema.plugins ORDER BY plugin_name;

I had to configure the official MariaDB repository as explained in the documentation:

[root@server4 ~]# cat /etc/yum.repos.d/mariadb.repo
# MariaDB 10.5 RedHat repository list - created 2020-07-09 15:06 UTC
# http://downloads.mariadb.org/mariadb/repositories/
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.5/rhel8-amd64
module_hotfixes=1
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

Once the repository is configured you can see what's available with:

dnf list mariadb*

I can see a MariaDB-columnstore-engine.x86_64 package, but installing it will also pull in MariaDB-server.x86_64, which I do not want… So far I have not found a way to get only the .so file and inject the ColumnStore storage engine into my custom MariaDB Server installation… In the end I installed the packages anyway, accepting the extra MariaDB-server.x86_64 dependency, to get the ColumnStore services and mcs* tools used in the rest of this section.
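For reference, if you only want to inspect what the engine package ships (without solving the dependency problem), something like this works, assuming the dnf download plugin from dnf-plugins-core is available:

# Download the RPM without installing it
dnf download MariaDB-columnstore-engine
# List its content, in particular where ha_columnstore.so and the mcs* tools would land
rpm -qlp MariaDB-columnstore-engine-*.rpm
# Optionally extract the files locally without touching the RPM database
rpm2cpio MariaDB-columnstore-engine-*.rpm | cpio -idmv

With the packages installed, the cross engine support (which lets ColumnStore reach back to tables stored in other engines such as InnoDB) and its dedicated MariaDB account are configured as follows: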

[root@server4 ~]# mcsSetConfig CrossEngineSupport Host 127.0.0.1
[root@server4 ~]# mcsSetConfig CrossEngineSupport Port 3316
[root@server4 ~]# mcsSetConfig CrossEngineSupport User cross_engine
[root@server4 ~]# mcsSetConfig CrossEngineSupport Password cross_engine_passwd
MariaDB [(none)]> CREATE USER 'cross_engine'@'127.0.0.1' IDENTIFIED BY "cross_engine_passwd";
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT SELECT ON *.* TO 'cross_engine'@'127.0.0.1';
Query OK, 0 rows affected (0.001 sec)
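To double-check what was stored (mcsGetConfig mirrors mcsSetConfig) and that the dedicated account can log in over TCP against the custom instance, a quick hedged verification:

mcsGetConfig CrossEngineSupport Host
mcsGetConfig CrossEngineSupport Port
mcsGetConfig CrossEngineSupport User
# Force a TCP connection to the custom instance listening on port 3316
/mariadb/software/mariadb01/bin/mariadb -h 127.0.0.1 -P 3316 -u cross_engine -pcross_engine_passwd -e "SELECT 1"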
[root@server4 ~]# systemctl status mariadb-columnstore
● mariadb-columnstore.service - mariadb-columnstore
   Loaded: loaded (/usr/lib/systemd/system/mariadb-columnstore.service; enabled; vendor preset: disabled)
   Active: active (exited) since Mon 2020-07-13 15:42:19 CEST; 3min 15s ago
  Process: 27960 ExecStop=/usr/bin/mariadb-columnstore-stop.sh (code=exited, status=0/SUCCESS)
  Process: 27998 ExecStart=/usr/bin/mariadb-columnstore-start.sh (code=exited, status=0/SUCCESS)
 Main PID: 27998 (code=exited, status=0/SUCCESS)

Jul 13 15:42:11 server4.domain.com systemd[1]: Stopped mariadb-columnstore.
Jul 13 15:42:11 server4.domain.com systemd[1]: Starting mariadb-columnstore...
Jul 13 15:42:12 server4.domain.com mariadb-columnstore-start.sh[27998]: Job for mcs-storagemanager.service failed because the control process exited with error code.
Jul 13 15:42:12 server4.domain.com mariadb-columnstore-start.sh[27998]: See "systemctl status mcs-storagemanager.service" and "journalctl -xe" for details.
Jul 13 15:42:19 server4.domain.com systemd[1]: Started mariadb-columnstore.
[root@server4 ~]# systemctl status mcs-storagemanager.service
● mcs-storagemanager.service - storagemanager
   Loaded: loaded (/usr/lib/systemd/system/mcs-storagemanager.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2020-07-13 15:42:14 CEST; 8min ago
  Process: 28010 ExecStartPre=/usr/bin/mcs-start-storagemanager.py (code=exited, status=1/FAILURE)

Jul 13 15:42:14 server4.domain.com systemd[1]: Starting storagemanager...
Jul 13 15:42:14 server4.domain.com mcs-start-storagemanager.py[28010]: S3 storage has not been set up for MariaDB ColumnStore. StorageManager service fails to start.
Jul 13 15:42:14 server4.domain.com systemd[1]: mcs-storagemanager.service: Control process exited, code=exited status=1
Jul 13 15:42:14 server4.domain.com systemd[1]: mcs-storagemanager.service: Failed with result 'exit-code'.
Jul 13 15:42:14 server4.domain.com systemd[1]: Failed to start storagemanager.
[root@server4 columnstore]# cat /var/log/mariadb/columnstore/debug.log
Jul 13 15:05:42 server4 IDBFile[26302]: 42.238256 |0|0|0| D 35 CAL0002: Failed to open file: /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks, exception: unable to open Buffered file
Jul 13 15:05:42 server4 controllernode[26302]: 42.238358 |0|0|0| D 29 CAL0000: TableLockServer::load(): could not open the save file/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
Jul 13 15:42:17 server4 IDBFile[28020]: 17.117913 |0|0|0| D 35 CAL0002: Failed to open file: /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks, exception: unable to open Buffered file
Jul 13 15:42:17 server4 controllernode[28020]: 17.118009 |0|0|0| D 29 CAL0000: TableLockServer::load(): could not open the save file/var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
[root@server4 columnstore]# grep -v ^# /etc/columnstore/storagemanager.cnf | grep -v -e '^$'
[ObjectStorage]
service = LocalStorage
object_size = 5M
metadata_path = /mariadb/columnstore/storagemanager/metadata
journal_path = /mariadb/columnstore/storagemanager/journal
max_concurrent_downloads = 21

max_concurrent_uploads = 21
common_prefix_depth = 3
[S3]
region = some_region
bucket = some_bucket
[LocalStorage]
path = /mariadb/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000
[Cache]
cache_size = 2g
path = /mariadb/columnstore/storagemanager/cache
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/fake-cloud
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/cache
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/metadata
[mariadb@server4 mariadb]$ mkdir -p /mariadb/columnstore/storagemanager/journal
[root@server4 ~]# mcsGetConfig -a | grep /var/lib
SystemConfig.DBRoot1 = /var/lib/columnstore/data1
SystemConfig.DBRMRoot = /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves
SystemConfig.TableLockSaveFile = /var/lib/columnstore/data1/systemFiles/dbrm/tablelocks
SessionManager.TxnIDFile = /var/lib/columnstore/data1/systemFiles/dbrm/SMTxnID
OIDManager.OIDBitmapFile = /var/lib/columnstore/data1/systemFiles/dbrm/oidbitmap
WriteEngine.BulkRoot = /var/lib/columnstore/data/bulk
WriteEngine.BulkRollbackDir = /var/lib/columnstore/data1/systemFiles/bulkRollback
[root@server4 ~]# mcsSetConfig SystemConfig DBRoot1 /mariadb/columnstore/data1
[root@server4 ~]# mcsGetConfig SystemConfig DBRoot1
/mariadb/columnstore/data1
[root@server4 ~]# mcsSetConfig SystemConfig DBRMRoot /mariadb/columnstore/data1/systemFiles/dbrm/BRM_saves
[root@server4 ~]# mcsSetConfig SystemConfig TableLockSaveFile /mariadb/columnstore/data1/systemFiles/dbrm/tablelocks
[root@server4 ~]# mcsSetConfig SessionManager TxnIDFile /mariadb/columnstore/data1/systemFiles/dbrm/SMTxnID
[root@server4 ~]# mcsSetConfig OIDManager OIDBitmapFile /mariadb/columnstore/data1/systemFiles/dbrm/oidbitmap
[root@server4 ~]# mcsSetConfig WriteEngine BulkRoot /mariadb/columnstore/data/bulk
[root@server4 ~]# mcsSetConfig WriteEngine BulkRollbackDir /mariadb/columnstore/data1/systemFiles/bulkRollback
[root@server4 ~]# mkdir -p /mariadb/columnstore/data1/systemFiles/dbrm/BRM_saves /mariadb/columnstore/data1/systemFiles/dbrm/tablelocks
[root@server4 ~]# mkdir -p /mariadb/columnstore/data1/systemFiles/dbrm/SMTxnID /mariadb/columnstore/data1/systemFiles/dbrm/SMTxnID
[root@server4 ~]# mkdir -p /mariadb/columnstore/data/bulk /mariadb/columnstore/data1/systemFiles/bulkRollback
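Since the ColumnStore services run under their own OS account, it is worth making sure the relocated directories are writable by it; a hedged sketch (check with ps which account actually runs the mcs processes, typically mysql with the RPM packages):

# Adapt the owner to the account actually running the ColumnStore processes
chown -R mysql:mysql /mariadb/columnstore
# Confirm every relocated path now points under /mariadb
mcsGetConfig -a | grep /mariadb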

Taking inspiration from the container version I have changed the plugin_dir variable and the plugin maturity allowance to:

# New plugin directory for Columnstore
plugin_dir                      = /usr/lib64/mysql/plugin
plugin_maturity                 = beta

The plugin_maturity parameter is there to avoid the following error:

MariaDB [(none)]> INSTALL PLUGIN IF NOT EXISTS Columnstore SONAME 'ha_columnstore.so';
ERROR 1126 (HY000): Can't open shared library 'ha_columnstore.so' (errno: 1, Loading of beta plugin Columnstore is prohibited by --plugin-maturity=gamma)

I then tried to load the plugin; below are the engines before the INSTALL PLUGIN command, the command itself, and the engines afterwards:

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
8 rows in set (0.001 sec)

MariaDB [(none)]> INSTALL PLUGIN IF NOT EXISTS Columnstore SONAME 'ha_columnstore.so';
Query OK, 0 rows affected (0.111 sec)

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Columnstore        | YES     | ColumnStore storage engine                                                                      | YES          | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
9 rows in set (0.001 sec)

On paper it works, but the connection is lost as soon as you try to create a table with the ColumnStore storage engine…
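For the record, a trivial table definition like the one below is enough to reproduce the lost connection on this setup (the database and table names are placeholders):

/mariadb/software/mariadb01/bin/mariadb --defaults-file=/mariadb/software/mariadb01/conf/my.cnf --user=mariadb \
  -e "CREATE DATABASE IF NOT EXISTS test_cs; CREATE TABLE test_cs.t1 (id INT, label VARCHAR(30)) ENGINE=Columnstore;"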


MariaDB ColumnStore installation and testing – part 1
https://blog.yannickjaquier.com/mysql/mariadb-columnstore-installation-and-testing-part-1.html


Preamble

After I saw the announcement that MariaDB Community Server now includes ColumnStore as a pluggable storage engine for free, I wanted to test it. I anyway struggled with a few changes in the way to install and configure a MariaDB server from scratch, so I decided to add a small chapter on this part.

This blog post has been written using Oracle Linux 8.2 (yeah I know MariaDB is not supported on this OS but it is really similar to RedHat and free) and MariaDB Community Server 10.5.4.

My first try was using my own personalized installation of MariaDB, to which I tried to add the ColumnStore storage engine. To be honest, at the time of writing this post the official documentation is too poor and I have not been able to conclude on this point. So I decided to fall back to the container implementation, which MariaDB has covered a lot in the blog posts and webinars they created… I am, of course, using the hyped container engine called Podman.

I have also decided to make MariaDB my standard MySQL flavor and so will no longer use the one coming from Oracle. The main reason is the open source strategy and governance of MariaDB versus that of Oracle Corporation with MySQL. By the way, many big players already made this transition a few years back (Wikipedia, Google, …).

MariaDB ColumnStore container installation and configuration with Podman

I first started by creating a dedicated LVM volume to store the containers and images:

[root@server4 ~]# lvcreate -L 10g -n lvol20 vg00
  Logical volume "lvol20" created.
[root@server4 ~]# mkfs -t xfs /dev/vg00/lvol20
meta-data=/dev/vg00/lvol20       isize=512    agcount=4, agsize=655360 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=2621440, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@server4 containers]# grep containers /etc/fstab
/dev/mapper/vg00-lvol20   /var/lib/containers                    xfs    defaults        0 0
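A quick check that the new filesystem is mounted where Podman stores its images and containers:

mount /var/lib/containers
df -h /var/lib/containers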

Then I tried to download the official MariaDB ColumnStore image:

[root@server4 ~]# podman pull mariadb/columnstore
Trying to pull container-registry.oracle.com/mariadb/columnstore...
  Get https://container-registry.oracle.com/v2/: dial tcp: lookup container-registry.oracle.com on 164.129.154.205:53: no such host
Trying to pull docker.io/mariadb/columnstore...
  Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 164.129.154.205:53: no such host
Trying to pull registry.fedoraproject.org/mariadb/columnstore...
  Get https://registry.fedoraproject.org/v2/: dial tcp: lookup registry.fedoraproject.org on 164.129.154.205:53: no such host
Trying to pull quay.io/mariadb/columnstore...
  Get https://quay.io/v2/: dial tcp: lookup quay.io on 164.129.154.205:53: no such host
Trying to pull registry.centos.org/mariadb/columnstore...
  Get https://registry.centos.org/v2/: dial tcp: lookup registry.centos.org on 164.129.154.205:53: no such host
Error: error pulling image "mariadb/columnstore": unable to pull mariadb/columnstore: 5 errors occurred:
        * Error initializing source docker://container-registry.oracle.com/mariadb/columnstore:latest: error pinging docker registry container-registry.oracle.com: Get https://container-registry.oracle.com/v2/: dial tcp: lookup container-registry.oracle.com on 164.129.154.205:53: no such host
        * Error initializing source docker://mariadb/columnstore:latest: error pinging docker registry registry-1.docker.io: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 164.129.154.205:53: no such host
        * Error initializing source docker://registry.fedoraproject.org/mariadb/columnstore:latest: error pinging docker registry registry.fedoraproject.org: Get https://registry.fedoraproject.org/v2/: dial tcp: lookup registry.fedoraproject.org on 164.129.154.205:53: no such host
        * Error initializing source docker://quay.io/mariadb/columnstore:latest: error pinging docker registry quay.io: Get https://quay.io/v2/: dial tcp: lookup quay.io on 164.129.154.205:53: no such host
        * Error initializing source docker://registry.centos.org/mariadb/columnstore:latest: error pinging docker registry registry.centos.org: Get https://registry.centos.org/v2/: dial tcp: lookup registry.centos.org on 164.129.154.205:53: no such host

As suggested I had to configure my corporate proxy:

[root@server4 ~]# cat /etc/profile.d/http_proxy.sh
export HTTP_PROXY=http://proxy_account:proxy_password@proxy_serveur:proxy_port
export HTTPS_PROXY=http://proxy_account:proxy_password@proxy_serveur:proxy_port

It failed with a proxy certificate issue:

[root@server4 ~]# podman pull mariadb/columnstore
Trying to pull container-registry.oracle.com/mariadb/columnstore...
  Get https://container-registry.oracle.com/v2/: x509: certificate signed by unknown authority
Trying to pull docker.io/mariadb/columnstore...
  Get https://registry-1.docker.io/v2/: x509: certificate signed by unknown authority
Trying to pull registry.fedoraproject.org/mariadb/columnstore...
  manifest unknown: manifest unknown
Trying to pull quay.io/mariadb/columnstore...
  Get https://quay.io/v2/: x509: certificate signed by unknown authority
Trying to pull registry.centos.org/mariadb/columnstore...
  Get https://registry.centos.org/v2/: x509: certificate signed by unknown authority
Error: error pulling image "mariadb/columnstore": unable to pull mariadb/columnstore: 5 errors occurred:
        * Error initializing source docker://container-registry.oracle.com/mariadb/columnstore:latest: error pinging docker registry container-registry.oracle.com: Get https://container-registry.oracle.com/v2/: x509: certificate signed by unknown authority
        * Error initializing source docker://mariadb/columnstore:latest: error pinging docker registry registry-1.docker.io: Get https://registry-1.docker.io/v2/: x509: certificate signed by unknown authority
        * Error initializing source docker://registry.fedoraproject.org/mariadb/columnstore:latest: Error reading manifest latest in registry.fedoraproject.org/mariadb/columnstore: manifest unknown: manifest unknown
        * Error initializing source docker://quay.io/mariadb/columnstore:latest: error pinging docker registry quay.io: Get https://quay.io/v2/: x509: certificate signed by unknown authority
        * Error initializing source docker://registry.centos.org/mariadb/columnstore:latest: error pinging docker registry registry.centos.org: Get https://registry.centos.org/v2/: x509: certificate signed by unknown authority

I exported the root CA certificate of my proxy from my Windows/Chrome configuration:

[Screenshot: columnstore01]

And loaded it in my Linux guest (VirtualBox):

[root@server4 ~]# cp /tmp/zarootca.cer /etc/pki/ca-trust/source/anchors/
[root@server4 ~]# update-ca-trust extract

It went well this time:

[root@server4 ~]# podman pull mariadb/columnstore
Trying to pull container-registry.oracle.com/mariadb/columnstore...
  unable to retrieve auth token: invalid username/password: unauthorized: authentication required
Trying to pull docker.io/mariadb/columnstore...
Getting image source signatures
Copying blob 7361994e337a done
Copying blob 6910e5a164f7 done
Copying blob d3a9faedef9c done
Copying blob 09d6834f75a6 done
Copying blob 68e5e07852c8 done
Copying blob df75e1d0f89f done
Copying blob 026abfbced9b done
Copying blob 97d3b9b39f85 done
Copying blob ae7bd0c62cca done
Copying blob 4feabe6971fa done
Copying blob 3833a7277c1f done
Copying blob 97e0996c4e98 done
Copying config 5a61255d05 done
Writing manifest to image destination
Storing signatures
5a61255d059ff8e913b623218d59b45fcda11364676abcde26c188ab5248dec3

Simply create a new container (mcs_container) using this newly downloaded image with:

[root@server4 ~]# podman run -d -p 3306:3306 --name mcs_container mariadb/columnstore
25809ac451884ab4753be0ed512a6fb08bc2d2c2a8f7384659a4281e2a2fa36d
[root@server4 ~]# podman ps -a
CONTAINER ID  IMAGE                                 COMMAND               CREATED        STATUS            PORTS                   NAMES
25809ac45188  docker.io/mariadb/columnstore:latest  /bin/sh -c column...  9 seconds ago  Up 6 seconds ago  0.0.0.0:3306->3306/tcp  mcs_container
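A couple of plain Podman commands to check that the container started cleanly and that the port mapping is in place:

podman logs mcs_container     # container start-up messages
podman port mcs_container     # should show 3306/tcp -> 0.0.0.0:3306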

Connect to it with:

[root@server4 ~]# podman exec -it mcs_container bash
[root@25809ac45188 /]# mariadb
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.5.4-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show engines;
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                         | Transactions | XA   | Savepoints |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
| Columnstore        | YES     | ColumnStore storage engine                                                                      | YES          | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                           | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                       | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage. Used for internal temporary tables and privilege tables | NO           | NO   | NO         |
| MyISAM             | YES     | Non-transactional engine with good performance and small data footprint                         | NO           | NO   | NO         |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                  | YES          | NO   | YES        |
| InnoDB             | DEFAULT | Supports transactions, row-level locking, foreign keys and encryption for tables                | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                              | NO           | NO   | NO         |
| CSV                | YES     | Stores tables as CSV files                                                                      | NO           | NO   | NO         |
+--------------------+---------+-------------------------------------------------------------------------------------------------+--------------+------+------------+
9 rows in set (0.001 sec)

MariaDB ColumnStore default schema configuration

From the official MariaDB GitHub I have downloaded a zip copy of their mariadb-columnstore-samples repository. Push it to your container with:

[root@server4 ~]# podman cp /tmp/mariadb-columnstore-samples-master.zip mcs_container:/tmp

In the flights sub-directory, the schema creation and loading is made of the following scripts, run in that order (a sketch follows the list):

  • create_flights_db.sh
  • get_flight_data.sh
  • load_flight_data.sh
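A hedged sketch of unpacking the samples and running the three scripts inside the container (paths assume the zip was copied to /tmp as above; the availability of unzip in the image is an assumption, the download script relies on it anyway):

podman exec -it mcs_container bash
# then, inside the container:
cd /tmp && unzip mariadb-columnstore-samples-master.zip
cd mariadb-columnstore-samples-master/flights
./create_flights_db.sh
./get_flight_data.sh
./load_flight_data.sh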

The creation went well, but to get the data (from the internet) I had (again) to configure my corporate proxy:

[root@25809ac45188 flights]# cat ~/.curlrc
proxy = proxy_serveur:proxy_port
proxy-user = "proxy_account:proxy_password"

I have also added the -k option to curl in get_flight_data.sh to avoid the certificate issue:

#!/bin/bash
#
# This script will remotely invoke the bureau of transportation statistics web form to retrieve data by month:
# https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time
# for the specific columns listed in the SQL and utilized by the sample schema.

mkdir -p data
for y in {2018..2018}; do
  for m in {1..12}; do
    yyyymm="$y-$(printf %02d $m)"
    echo "$yyyymm"
    curl -k -L -o data.zip -d "sqlstr=+SELECT+YEAR%2CMONTH%2CDAY_OF_MONTH%2CDAY_OF_WEEK%2CFL_DATE%2CCARRIER%2CTAIL_NUM%2CFL_NUM%2CORIGIN%2CDEST%2CCRS_DEP_TIME%2CDEP_TIME%2CDEP_DELAY%2CTAXI_OUT%2CWHEELS_OFF%2CWHEELS_ON%2CTAXI_IN%2CCRS_ARR_TIME%2CARR_TIME%2CARR_DELAY%2CCANCELLED%2CCANCELLATION_CODE%2CDIVERTED%2CCRS_ELAPSED_TIME%2CACTUAL_ELAPSED_TIME%2CAIR_TIME%2CDISTANCE%2CCARRIER_DELAY%2CWEATHER_DELAY%2CNAS_DELAY%2CSECURITY_DELAY%2CLATE_AIRCRAFT_DELAY+FROM++T_ONTIME+WHERE+Month+%3D$m+AND+YEAR%3D$y" https://www.transtats.bts.gov/DownLoad_Table.asp?Table_ID=236
    rm -f *.csv
    unzip data.zip
    rm -f data.zip
    mv *.csv $yyyymm.csv
    tail -n +2 $yyyymm.csv > data/$yyyymm.csv
    rm -f $yyyymm.csv
  done
done

Data download and loading went well and I ended the configuration by creating an account to access the figures remotely:

MariaDB [(none)]> grant all on *.* to 'yjaquier'@'%' identified by 'secure_password';
Query OK, 0 rows affected (0.001 sec)
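Before moving to Power BI, a quick hedged sanity check from a remote machine with a MariaDB (or MySQL) client installed; server4 is the container host used throughout this post:

mariadb -h server4 -P 3306 -u yjaquier -p -e "SELECT COUNT(*) FROM flights.flights"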

Power BI Desktop configuration

I have obviously started by downloading and installing Power BI Desktop. I have also installed MariaDB Connector/ODBC (3.1.9).

Configure a User DSN in ODBC Data Sources (64 bits):

[Screenshot: columnstore02]

Supply the account and password we have just created above:

[Screenshot: columnstore03]

In Power BI choose an ODBC database connection and use the recently created User DSN:

[Screenshot: columnstore04]

Finally, by using the few queries provided in the MariaDB ColumnStore samples GitHub repository, I have been able to make some graphics. Airports map:

[Screenshot: columnstore05]

Delay by airlines and by delay type:

[Screenshot: columnstore06]

Is MariaDB ColumnStore worth the effort ?

Is it really faster to use ColumnStore ? My Linux guest (VirtualBox 6.1.12) has 4 cores and 8GB of RAM, and I am using the VirtualBox Host I/O Cache (SATA 7200 RPM HDD) for the guest disk configuration.

This is in no way a benchmark, but I really wanted to get a feeling of how much performance improvement this new columnar storage delivers. I have not tuned any parameter of the official ColumnStore container from MariaDB (the InnoDB buffer pool is 128MB).

I have just created a standard InnoDB flights2 table with the exact same columns as the flights table and filled it with:

MariaDB [flights]> insert into flights2 select * from flights;
Query OK, 7856869 rows affected (6 min 28.798 sec)

I then used the airline_delay_types_by_year.sql script, created an InnoDB variant of it using my flights2 table, and got the results below (average over five runs):

Engine        Average elapsed time (5 runs)
ColumnStore   1 minute 30 seconds
InnoDB        2 minutes 30 seconds
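The timing is simply the elapsed time reported by the client; a hedged way to reproduce it from inside the container, assuming the samples are still unpacked under /tmp as earlier in this post:

cd /tmp/mariadb-columnstore-samples-master/flights
time mariadb flights < airline_delay_types_by_year.sql > /dev/null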


Hiveserver2 monitoring with Jconsole and Grafana in HDP 3.x
https://blog.yannickjaquier.com/hadoop/hiveserver2-monitoring-with-jconsole-and-grafana-in-hdp-3-x.html


Preamble

Since we migrated to HDP 3 we have had recurring issues with HiveServer2 memory (Ambari memory alerts), or the process simply got stuck. When trying to monitor memory consumption we discovered that with HDP 3.x the Ambari Metrics for HiveServer2 heap usage are not displayed anymore in Grafana (No datapoints):

[Screenshot: hiveserver201]

I have tried to correct the chart in Grafana when logged in (admin account) but even if the metrics are listed there is nothing to display:

[Screenshot: hiveserver202]

We were advised to use Java Management Extensions (JMX) technology to monitor and manage this HiveServer2 Java process using jconsole. Using jconsole would also allow us to trigger a garbage collection on demand.

Last but not least I have finally found an article on how to restore the HiveServer2 metrics in Grafana…

We are running Hortonworks Data Platform (HDP) 3.1.4, so Hive 3.1.0 and Ambari 2.7.4.

Hiveserver2 monitoring with Jconsole

As clearly described in the official Java documentation, to activate JMX for your Java process you simply need to add the option below when executing your process:

-Dcom.sun.management.jmxremote

Adding only this parameter will allow local monitoring, i.e. jconsole must be launched on the server where your Java process is running.

The Cloudera documentation has (as usual) quite a few typo issues; what you need to modify is the HADOOP_OPTS environment variable. This is done in Ambari for HiveServer2 in the Hive service, in the hive-env template parameter. I have chosen to re-export the variable keeping its previous value, so as not to change the default Ambari setting:

[Screenshot: hiveserver203]

So I added:

export HADOOP_OPTS="-Dcom.sun.management.jmxremote=true $HADOOP_OPTS"

You can find your Hiveserver2 process pid using:

[root@hive_server ~]# ps -ef | grep java |grep hiveserver2
hive      9645     1  0 09:55 ?        00:01:14 /usr/jdk64/jdk1.8.0_112/bin/java -Dproc_jar -Dhdp.version=3.1.4.0-315 -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=8004 -Xloggc:/var/log/hive/hiveserver2-gc-%t.log -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCCause -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive/hs2_heapdump.hprof -Dhive.log.dir=/var/log/hive -Dhive.log.file=hiveserver2.log -Dhdp.version=3.1.4.0-315 -Xmx8192m -Dproc_hiveserver2 -Xmx48284m -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/usr/hdp/current/hive-server2/conf//parquet-logging.properties -Dyarn.log.dir=/var/log/hadoop/hive -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/usr/hdp/3.1.4.0-315/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/3.1.4.0-315/hadoop/lib/native/Linux-amd64-64:/usr/hdp/current/hadoop-client/lib/native -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/current/hadoop-client -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/3.1.4.0-315/hive/lib/hive-service-3.1.0.3.1.4.0-315.jar org.apache.hive.service.server.HiveServer2 --hiveconf hive.aux.jars.path=file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar
hive     18276     1  0 10:02 ?        00:00:49 /usr/jdk64/jdk1.8.0_112/bin/java -Dproc_jar -Dhdp.version=3.1.4.0-315 -Djava.net.preferIPv4Stack=true -Xloggc:/var/log/hive/hiveserverinteractive-gc-%t.log -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCCause -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive/hsi_heapdump.hprof -Dhive.log.dir=/var/log/hive -Dhive.log.file=hiveserver2Interactive.log -Dhdp.version=3.1.4.0-315 -Xmx8192m -Dproc_hiveserver2 -Xmx2048m -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/usr/hdp/current/hive-server2/conf_llap//parquet-logging.properties -Dyarn.log.dir=/var/log/hadoop/hive -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/usr/hdp/3.1.4.0-315/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/3.1.4.0-315/hadoop/lib/native/Linux-amd64-64:/usr/hdp/current/hadoop-client/lib/native -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/current/hadoop-client -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/3.1.4.0-315/hive/lib/hive-service-3.1.0.3.1.4.0-315.jar org.apache.hive.service.server.HiveServer2 --hiveconf hive.aux.jars.path=file:///usr/hdp/current/hive-server2/lib/hive-hcatalog-core.jar

Then run the jconsole command from your $JDK_HOME/bin directory; you obviously need to set the DISPLAY environment variable and have an X server running on your desktop (MobaXterm strongly recommended):
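A minimal sketch, assuming X11 forwarding over ssh and the HDP JDK path seen in the process list above; note that a local attach by pid generally requires running jconsole as the same OS user as the HiveServer2 process (hive):

ssh -X hive@hive_server          # or any account allowed to become the JVM owner
export JDK_HOME=/usr/jdk64/jdk1.8.0_112
$JDK_HOME/bin/jconsole 9645      # 9645 is the HiveServer2 pid from the ps output above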

[Screenshot: hiveserver204]

Click on Connect and acknowledge the SSL warning, you can now start monitoring:

[Screenshot: hiveserver205]

Or perform a garbage collection if you think you have an unexpected memory issue with the process:

[Screenshot: hiveserver206]

If you want to be able to monitor remotely you need to add the com.sun.management.jmxremote.port=portnum parameter.

To disable SSL if you do not have a certificate use com.sun.management.jmxremote.ssl=false.

Then comes authentication, you can:

  • De-activate it with com.sun.management.jmxremote.authenticate=false (not recommended but simplest way to remotely connect)
  • Use LDAP authentication with com.sun.management.jmxremote.login.config=ExampleCompanyConfig and java.security.auth.login.config=ldap.config
  • Use file-based authentication using com.sun.management.jmxremote.password.file=pwFilePath

To at least activate file-based authentication, get the password file template from $JDK_HOME/jre/lib/management/jmxremote.password.template and put it in /usr/hdp/current/hive-server2/conf/. Rename it to jmxremote.password and create the users (roles in fact, according to the official documentation):

[root@hive_server ~]# tail /usr/hdp/current/hive-server2/conf/jmxremote.password
# For # security, you should either restrict the access to this file,
# or specify another, less accessible file in the management config file
# as described above.
#
# Following are two commented-out entries.  The "measureRole" role has
# password "QED".  The "controlRole" role has password "R&D".
#
# monitorRole  QED
# controlRole   R&D
yjaquier   secure_password

Also modify $JDK_HOME/jre/lib/management/jmxremote.access to reflect your newly created role/user. Here I have just chosen to copy the highest-privilege role:

[root@hive_server jdk1.8.0_112]# tail $JDK_HOME/jre/lib/management/jmxremote.access
# o The "controlRole" role has readwrite access and can create the standard
#   Timer and Monitor MBeans defined by the JMX API.

monitorRole   readonly
controlRole   readwrite \
              create javax.management.monitor.*,javax.management.timer.* \
              unregister
yjaquier   readwrite \
           create javax.management.monitor.*,javax.management.timer.* \
           unregister

My HADOOP_OPTS variable is now:

export HADOOP_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.password.file=$HIVE_CONF_DIR/jmxremote.password -Dcom.sun.management.jmxremote.port=8004 $HADOOP_OPTS"
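After restarting HiveServer2, a quick hedged check that the JMX port is listening, followed by a remote connection attempt (hive_server is the placeholder host name used in the prompts above):

ss -lntp | grep 8004
# From any machine with a JDK installed; credentials are the ones defined in jmxremote.password
jconsole hive_server:8004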

If HiveServer2 does not restart you will most probably find something like this in the error log:

[root@hive_server hive]# cat hive-server2.err
Error: Password file read access must be restricted: /usr/hdp/current/hive-server2/conf//jmxremote.password
Error: Password file read access must be restricted: /usr/hdp/current/hive-server2/conf//jmxremote.password

Easy to solve with:

[root@hive_server conf]# chmod 600 jmxremote.password
[root@hive_server conf]# ll jmxremote.password
-rw------- 1 hive hadoop 2880 Jun 25 14:32 jmxremote.password

And with the local jconsole program of my desktop JDK installation (C:\Program Files\Java\jdk1.8.0_241\bin for me) I can connect remotely (also notice the better visual quality of the Windows edition):

[Screenshot: hiveserver207]

And get the exact same display as with the Linux release; in a way it is more convenient because from your desktop you can access any Java process of your Hadoop cluster…

[Screenshot: hiveserver208]

Hiveserver2 monitoring with Grafana

Coming back to the Ambari configuration I noticed a strange list of parameters:

[Screenshot: hiveserver209]

While in the official Confluence documentation I can see (https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Metrics.1):

[Screenshot: hiveserver210]

And this parameter is truly set to false in my environment:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.server2.metrics.enabled;
+-------------------------------------+
|                 set                 |
+-------------------------------------+
| hive.server2.metrics.enabled=false  |
+-------------------------------------+
1 row selected (0.251 seconds)

So clearly Ambari has a bug and the parameter in "Advanced hiveserver2-site" is not the right one. I decided to add it in "Custom hiveserver2-site", saved and restarted the required components, and ended up with a strange behavior: the parameter got moved to "Advanced hiveserver2-site" with a checkbox (like it should have been since the beginning):

[Screenshot: hiveserver211]

Back in Grafana (the Ambari Metrics one) I have modified the HiveServer2 chart to use the default.General.memory.heap.max, default.General.memory.heap.used and default.General.memory.heap.committed metrics to finally get:

[Screenshot: hiveserver212]


Spark dynamic allocation how to configure and use it
https://blog.yannickjaquier.com/hadoop/spark-dynamic-allocation-how-to-configure-and-use-it.html


Preamble

Since we started to put Spark jobs in production we have asked ourselves how many executors, how many cores per executor and how much executor memory we should allocate. What if we allocate too much and waste resources, and could we improve the response time if we allocated more ?

In other words, those spark-submit parameters (we have a Hortonworks Hadoop cluster and so are using YARN):

  • --executor-memory MEM – Memory per executor (e.g. 1000M, 2G) (Default: 1G).
  • --executor-cores NUM – Number of cores per executor. (Default: 1 in YARN mode, or all available cores on the worker in standalone mode)
  • --num-executors NUM – Number of executors to launch (Default: 2). If dynamic allocation is enabled, the initial number of executors will be at least NUM.

And in fact, as written in the description of num-executors above, Spark dynamic allocation partially answers the former question.

Spark dynamic allocation is a feature allowing your Spark application to automatically scale the number of executors up and down. Only the number of executors, though: the memory size and the number of cores of each executor must still be set explicitly in your application or when executing the spark-submit command. So the promise is that your application will dynamically be able to request more executors and release them back to the cluster pool based on your application workload. Of course, if using YARN you will be tightly bound to the resources allocated to the queue to which you have submitted your application (--queue parameter of spark-submit).
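For reference, the same settings can also be passed on the spark-submit command line rather than inside the application; a hedged sketch where the queue name, sizes and script name are placeholders:

spark-submit \
  --master yarn \
  --queue my_queue \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --executor-memory 512m \
  --executor-cores 1 \
  my_application.py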

This blog post has been written using Hortonworks Data Platform (HDP) 3.1.4 and so Spark2 2.3.2.

Spark dynamic allocation setup

As written in the official documentation, the shuffle jar must be added to the classpath of all NodeManagers. If, like me, you are running HDP 3, you will discover that everything is already configured. The jar of this external shuffle library is:

[root@server jars]# ll /usr/hdp/current/spark2-client/jars/*shuffle*
-rw-r--r-- 1 root root 67763 Aug 23  2019 /usr/hdp/current/spark2-client/jars/spark-network-shuffle_2.11-2.3.2.3.1.4.0-315.jar

And in Ambari the YARN configuration was also already done:

spark_dynamic_allocation01

Remark:
We still have the old Spark 1 variables around; you should now concentrate only on the spark2_xx variables. Likewise it is spark2_shuffle that must be appended to yarn.nodemanager.aux-services.

Then again, quoting the official documentation, you have two parameters to set inside your application to activate the feature:

There are two requirements for using this feature. First, your application must set spark.dynamicAllocation.enabled to true. Second, you must set up an external shuffle service on each worker node in the same cluster and set spark.shuffle.service.enabled to true in your application.

This part was not obvious to me but, as written, spark.dynamicAllocation.enabled and spark.shuffle.service.enabled must not only be set at cluster level but also in your application or as spark-submit parameters! I would even say that setting those parameters in Ambari makes no difference, but as you can see below everything was already done by default in my HDP 3.1.4 cluster:

spark_dynamic_allocation02
spark_dynamic_allocation03
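
If you prefer to be explicit per job rather than relying on the cluster defaults, the same two properties can also be passed directly on the spark-submit command line; a minimal sketch (the queue and the my_application.py script name are again illustrative):

spark-submit --master yarn --queue llap \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  my_application.py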

For the complete list of parameters refer to the official Spark dynamic allocation parameter list.

Spark dynamic allocation testing

For the testing code I have mixed together in PySpark several test snippets I have seen around the Internet. Using Python spares me a boring sbt compilation phase before each test…

The source code is (spark_dynamic_allocation.py):

# from pyspark.sql import SparkSession
from pyspark import SparkConf
from pyspark import SparkContext
# from pyspark_llap import HiveWarehouseSession
from time import sleep

# Each task sleeps x*10 seconds (10s, 20s, ... 50s)
def wait_x_seconds(x):
  sleep(x*10)

# Start with a single executor and release idle executors after only 5 seconds
# so the scale down is visible during this short test
conf = SparkConf().setAppName("Spark dynamic allocation").\
        set("spark.dynamicAllocation.enabled", "true").\
        set("spark.shuffle.service.enabled", "true").\
        set("spark.dynamicAllocation.initialExecutors", "1").\
        set("spark.dynamicAllocation.executorIdleTimeout", "5s").\
        set("spark.executor.cores", "1").\
        set("spark.executor.memory", "512m")

sc = SparkContext.getOrCreate(conf)

# spark = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()
# spark.stop()

# Five partitions so the five sleeps run as five parallel tasks
sc.parallelize(range(1,6), 5).foreach(wait_x_seconds)

exit()

So in short I run five parallel tasks that will each wait x*10 seconds with x from 1 to 5 (range(1,6)). We start with one executor and expect Spark to scale up and then back down as the shorter timers complete in order (10 seconds, 20 seconds, ...). I have also exaggerated the parameters a bit: spark.dynamicAllocation.executorIdleTimeout is changed to 5s so that I can see the executors being killed during my example (default is 60s).

The command to execute it is below; the Hive Warehouse Connector is not really mandatory here but it has become a habit. Notice that I do not specify anything on the command line as everything is set up in the Python script:

spark-submit --master yarn --queue llap --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.4.0-315.jar \
--py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.4.0-315.zip spark_dynamic_allocation.py

By default our spark-submit is in INFO mode, and the important part of the output is:

.
20/04/09 14:34:14 INFO Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
20/04/09 14:34:16 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.75.37.249:36332) with ID 1
20/04/09 14:34:16 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 1)
.
.
20/04/09 14:34:17 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 2)
20/04/09 14:34:18 INFO ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 4)
20/04/09 14:34:19 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 5)
20/04/09 14:34:20 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.75.37.249:36354) with ID 2
20/04/09 14:34:20 INFO ExecutorAllocationManager: New executor 2 has registered (new total is 2)
20/04/09 14:34:20 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, yarn01.domain.com, executor 2, partition 1, PROCESS_LOCAL, 7869 bytes)
20/04/09 14:34:20 INFO BlockManagerMasterEndpoint: Registering block manager yarn01.domain.com:29181 with 114.6 MB RAM, BlockManagerId(2, yarn01.domain.com, 29181, None)
20/04/09 14:34:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on yarn01.domain.com:29181 (size: 3.7 KB, free: 114.6 MB)
20/04/09 14:34:21 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.75.37.249:36366) with ID 3
20/04/09 14:34:21 INFO ExecutorAllocationManager: New executor 3 has registered (new total is 3)
20/04/09 14:34:21 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, yarn01.domain.com, executor 3, partition 2, PROCESS_LOCAL, 7869 bytes)
20/04/09 14:34:21 INFO BlockManagerMasterEndpoint: Registering block manager yarn01.domain.com:44000 with 114.6 MB RAM, BlockManagerId(3, yarn01.domain.com, 44000, None)
20/04/09 14:34:21 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on yarn01.domain.com:44000 (size: 3.7 KB, free: 114.6 MB)
20/04/09 14:34:22 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.75.37.249:36376) with ID 5
20/04/09 14:34:22 INFO ExecutorAllocationManager: New executor 5 has registered (new total is 4)
20/04/09 14:34:22 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, yarn01.domain.com, executor 5, partition 3, PROCESS_LOCAL, 7869 bytes)
20/04/09 14:34:22 INFO BlockManagerMasterEndpoint: Registering block manager yarn01.domain.com:32822 with 114.6 MB RAM, BlockManagerId(5, yarn01.domain.com, 32822, None)
20/04/09 14:34:22 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on yarn01.domain.com:32822 (size: 3.7 KB, free: 114.6 MB)
20/04/09 14:34:27 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, yarn01.domain.com, executor 1, partition 4, PROCESS_LOCAL, 7869 bytes)
20/04/09 14:34:27 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 10890 ms on yarn01.domain.com (executor 1) (1/5)
20/04/09 14:34:27 INFO PythonAccumulatorV2: Connected to AccumulatorServer at host: 127.0.0.1 port: 31354
20/04/09 14:34:29 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.75.37.248:57764) with ID 4
20/04/09 14:34:29 INFO ExecutorAllocationManager: New executor 4 has registered (new total is 5)
20/04/09 14:34:29 INFO BlockManagerMasterEndpoint: Registering block manager worker01.domain.com:38365 with 114.6 MB RAM, BlockManagerId(4, worker01.domain.com, 38365, None)
20/04/09 14:34:34 INFO ExecutorAllocationManager: Request to remove executorIds: 4
20/04/09 14:34:34 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 4
20/04/09 14:34:34 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 4
20/04/09 14:34:34 INFO ExecutorAllocationManager: Removing executor 4 because it has been idle for 5 seconds (new desired total will be 4)
20/04/09 14:34:38 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 4.
20/04/09 14:34:38 INFO DAGScheduler: Executor lost: 4 (epoch 0)
20/04/09 14:34:38 INFO BlockManagerMasterEndpoint: Trying to remove executor 4 from BlockManagerMaster.
20/04/09 14:34:38 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(4, worker01.domain.com, 38365, None)
20/04/09 14:34:38 INFO BlockManagerMaster: Removed 4 successfully in removeExecutor
20/04/09 14:34:38 INFO YarnScheduler: Executor 4 on worker01.domain.com killed by driver.
20/04/09 14:34:38 INFO ExecutorAllocationManager: Existing executor 4 has been removed (new total is 4)
20/04/09 14:34:41 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 20892 ms on yarn01.domain.com (executor 2) (2/5)
20/04/09 14:34:46 INFO ExecutorAllocationManager: Request to remove executorIds: 2
20/04/09 14:34:46 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 2
20/04/09 14:34:46 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 2
20/04/09 14:34:46 INFO ExecutorAllocationManager: Removing executor 2 because it has been idle for 5 seconds (new desired total will be 3)
20/04/09 14:34:48 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 2.
20/04/09 14:34:48 INFO DAGScheduler: Executor lost: 2 (epoch 0)
20/04/09 14:34:48 INFO BlockManagerMasterEndpoint: Trying to remove executor 2 from BlockManagerMaster.
20/04/09 14:34:48 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(2, yarn01.domain.com, 29181, None)
20/04/09 14:34:48 INFO BlockManagerMaster: Removed 2 successfully in removeExecutor
20/04/09 14:34:48 INFO YarnScheduler: Executor 2 on yarn01.domain.com killed by driver.
20/04/09 14:34:48 INFO ExecutorAllocationManager: Existing executor 2 has been removed (new total is 3)
20/04/09 14:34:52 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 30897 ms on yarn01.domain.com (executor 3) (3/5)
20/04/09 14:34:57 INFO ExecutorAllocationManager: Request to remove executorIds: 3
20/04/09 14:34:57 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 3
20/04/09 14:34:57 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 3
20/04/09 14:34:57 INFO ExecutorAllocationManager: Removing executor 3 because it has been idle for 5 seconds (new desired total will be 2)
20/04/09 14:34:59 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 3.
20/04/09 14:34:59 INFO DAGScheduler: Executor lost: 3 (epoch 0)
20/04/09 14:34:59 INFO BlockManagerMasterEndpoint: Trying to remove executor 3 from BlockManagerMaster.
20/04/09 14:34:59 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(3, yarn01.domain.com, 44000, None)
20/04/09 14:34:59 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
20/04/09 14:34:59 INFO YarnScheduler: Executor 3 on yarn01.domain.com killed by driver.
20/04/09 14:34:59 INFO ExecutorAllocationManager: Existing executor 3 has been removed (new total is 2)
20/04/09 14:35:03 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 40831 ms on yarn01.domain.com (executor 5) (4/5)
20/04/09 14:35:08 INFO ExecutorAllocationManager: Request to remove executorIds: 5
20/04/09 14:35:08 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 5
20/04/09 14:35:08 INFO YarnClientSchedulerBackend: Actual list of executor(s) to be killed is 5
20/04/09 14:35:08 INFO ExecutorAllocationManager: Removing executor 5 because it has been idle for 5 seconds (new desired total will be 1)
20/04/09 14:35:10 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 5.
20/04/09 14:35:10 INFO DAGScheduler: Executor lost: 5 (epoch 0)
20/04/09 14:35:10 INFO BlockManagerMasterEndpoint: Trying to remove executor 5 from BlockManagerMaster.
20/04/09 14:35:10 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(5, yarn01.domain.com, 32822, None)
20/04/09 14:35:10 INFO BlockManagerMaster: Removed 5 successfully in removeExecutor
20/04/09 14:35:10 INFO YarnScheduler: Executor 5 on yarn01.domain.com killed by driver.
20/04/09 14:35:10 INFO ExecutorAllocationManager: Existing executor 5 has been removed (new total is 1)
20/04/09 14:35:17 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 50053 ms on yarn01.domain.com (executor 1) (5/5)
20/04/09 14:35:17 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
.

We clearly see the allocation and removal of executors, but it is even clearer in the Spark UI web interface:

spark_dynamic_allocation04

The executors dynamically added, in blue, contrast nicely with the ones dynamically removed, in red…

One of my colleagues asked me what happens if by mistake he allocates too many initial executors: is his over-allocation wasting resources? I have done this trial by specifying in my code:

set("spark.dynamicAllocation.initialExecutors", "1").\

And Spark dynamic allocation has been really clever, deallocating the unneeded executors almost instantly:

spark_dynamic_allocation05

References

The post Spark dynamic allocation how to configure and use it appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/spark-dynamic-allocation-how-to-configure-and-use-it.html/feed 0
INSERT OVERWRITE does not delete old directories https://blog.yannickjaquier.com/hadoop/insert-overwrite-does-not-delete-old-directories.html https://blog.yannickjaquier.com/hadoop/insert-overwrite-does-not-delete-old-directories.html#respond Tue, 22 Sep 2020 08:07:13 +0000 https://blog.yannickjaquier.com/?p=4990 Preamble In one of our processes we are daily overwriting a table (a partition of this table to be accurate) and, by good luck, we noticed the table size kept increasing till reaching a size that was bigger than her sibling history one !! We did a quick check on HDFS and saw that old […]

The post INSERT OVERWRITE does not delete old directories appeared first on IT World.

]]>

Table of contents

Preamble

In one of our processes we overwrite a table daily (a partition of this table to be accurate) and, by good luck, we noticed the table size kept increasing until it became bigger than its sibling history table!! We did a quick check on HDFS and saw that old files had not been deleted…

I have been able to reproduce the issue in a simple example and I think I have found the open bug for it… It is pretty amazing to find such a bug, as I feel Hadoop has reached a good maturity level…

We are running Hortonworks Data Platform (HDP) 3.1.4. So Hive release is 3.1.0.

INSERT OVERWRITE test case

I have the below creation script (the database name is yannick):

drop table yannick.test01 purge;
create table yannick.test01(val int, descr string) partitioned by (fab string, lot_partition string) stored as orc;
insert into table yannick.test01 partition(fab='GVA', lot_partition='TEST') values(1, 'One');

Initially, from the HDFS standpoint, things are crystal clear:

[hdfs@client ~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/
Found 1 items
drwxrwx---+  - hive hadoop          0 2020-04-06 14:55 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000
[hdfs@client ~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/*
Found 2 items
-rw-rw----+  3 hive hadoop          1 2020-04-06 14:55 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000/_orc_acid_version
-rw-rw----+  3 hive hadoop        696 2020-04-06 14:55 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000/bucket_00000

So one directory with one ORC file.

I have then tried to find out, from the Hive standpoint, which directory(ies) are used by this table. I initially tried querying our Hive metastore (MySQL) directly:

mysql> select TBLS.TBL_NAME, PARTITIONS.PART_NAME, SDS.LOCATION
    -> from SDS, TBLS, PARTITIONS, DBS
    -> where TBLS.TBL_NAME='test01'
    -> and DBS.NAME = 'yannick'
    -> and TBLS.DB_ID = DBS.DB_ID
    -> and PARTITIONS.SD_ID = SDS.SD_ID
    -> and TBLS.TBL_ID = PARTITIONS.TBL_ID
    -> order by 1,2;
+----------+----------------------------+------------------------------------------------------------------------------------------------------------------+
| TBL_NAME | PART_NAME                  | LOCATION                                                                                                         |
+----------+----------------------------+------------------------------------------------------------------------------------------------------------------+
| test01   | fab=GVA/lot_partition=TEST | hdfs://namenode01.domain.com:8020/warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST |
+----------+----------------------------+------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

But only the root folder is given and I have not been able to find a table in the Hive metastore displaying this level of detail. The solution simply comes from Hive virtual columns:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> SELECT input__file__name FROM yannick.test01 WHERE fab="GVA" and lot_partition="TEST";
+----------------------------------------------------+
|                 input__file__name                  |
+----------------------------------------------------+
| hdfs://namenode01.domain.com:8020/warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000/bucket_00000 |
+----------------------------------------------------+
1 row selected (0.431 seconds)

INSERT OVERWRITE does not delete old directories

If I INSERT OVERWRITE into this table, in the same exact partition, I expect Hive to do the HDFS cleaning automatically and I surely do not expect the old folder to be kept forever. Unfortunately this is what happens when I insert overwrite into the same partition:

insert overwrite table yannick.test01 partition(fab='GVA', lot_partition='TEST') values(2,'Two');

If I select the ORC files in use I get, as expected:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> SELECT input__file__name FROM yannick.test01 WHERE fab="GVA" and lot_partition="TEST";
INFO  : Compiling command(queryId=hive_20200406150529_c05aac38-6933-4a8e-b7ec-6ae9016e67f0): SELECT input__file__name FROM yannick.test01 WHERE fab="GVA" and lot_partition="TEST"
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:input__file__name, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20200406150529_c05aac38-6933-4a8e-b7ec-6ae9016e67f0); Time taken: 0.114 seconds
INFO  : Executing command(queryId=hive_20200406150529_c05aac38-6933-4a8e-b7ec-6ae9016e67f0): SELECT input__file__name FROM yannick.test01 WHERE fab="GVA" and lot_partition="TEST"
INFO  : Completed executing command(queryId=hive_20200406150529_c05aac38-6933-4a8e-b7ec-6ae9016e67f0); Time taken: 0.002 seconds
INFO  : OK
+----------------------------------------------------+
|                 input__file__name                  |
+----------------------------------------------------+
| hdfs://namenode01.domain.com:8020/warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/base_0000002/bucket_00000 |
+----------------------------------------------------+
1 row selected (0.211 seconds)

But if you look at HDFS level:

[hdfs@client ~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/
Found 2 items
drwxrwx---+  - hive hadoop          0 2020-04-06 15:04 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/base_0000002
drwxrwx---+  - hive hadoop          0 2020-04-06 14:55 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000
[hdfs@client ~]$ hdfs dfs -ls /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/*
Found 2 items
-rw-rw----+  3 hive hadoop          1 2020-04-06 15:04 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/base_0000002/_orc_acid_version
-rw-rw----+  3 hive hadoop        696 2020-04-06 15:04 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/base_0000002/bucket_00000
Found 2 items
-rw-rw----+  3 hive hadoop          1 2020-04-06 14:55 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000/_orc_acid_version
-rw-rw----+  3 hive hadoop        696 2020-04-06 14:55 /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/delta_0000001_0000001_0000/bucket_00000

This can also be seen with the Files View interface in Ambari:

insert_overwrite01

The old directory has not been deleted, and this happens every time you insert overwrite…
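
A simple way to keep an eye on the damage is to regularly check the size of the partition folder; a sketch against my test table (adapt the path to your own table):

[hdfs@client ~]$ hdfs dfs -du -h /warehouse/tablespace/managed/hive/yannick.db/test01/fab=GVA/lot_partition=TEST/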

We have also tried to play with the auto.purge table property:

As of Hive 2.3.0 (HIVE-15880), if the table has TBLPROPERTIES (“auto.purge”=”true”) the previous data of the table is not moved to Trash when INSERT OVERWRITE query is run against the table. This functionality is applicable only for managed tables (see managed tables) and is turned off when “auto.purge” property is unset or set to false.

For example:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> ALTER TABLE yannick.test01 SET TBLPROPERTIES ("auto.purge" = "true");
No rows affected (0.174 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted yannick.test01;
+-------------------------------+----------------------------------------------------+-----------------------------+
|           col_name            |                     data_type                      |           comment           |
+-------------------------------+----------------------------------------------------+-----------------------------+
| # col_name                    | data_type                                          | comment                     |
| val                           | int                                                |                             |
| descr                         | string                                             |                             |
|                               | NULL                                               | NULL                        |
| # Partition Information       | NULL                                               | NULL                        |
| # col_name                    | data_type                                          | comment                     |
| fab                           | string                                             |                             |
| lot_partition                 | string                                             |                             |
|                               | NULL                                               | NULL                        |
| # Detailed Table Information  | NULL                                               | NULL                        |
| Database:                     | yannick                                            | NULL                        |
| OwnerType:                    | USER                                               | NULL                        |
| Owner:                        | hive                                               | NULL                        |
| CreateTime:                   | Mon Apr 06 14:55:48 CEST 2020                      | NULL                        |
| LastAccessTime:               | UNKNOWN                                            | NULL                        |
| Retention:                    | 0                                                  | NULL                        |
| Location:                     | hdfs://namenode01.domain.com:8020/warehouse/tablespace/managed/hive/yannick.db/test01 | NULL                        |
| Table Type:                   | MANAGED_TABLE                                      | NULL                        |
| Table Parameters:             | NULL                                               | NULL                        |
|                               | COLUMN_STATS_ACCURATE                              | {\"BASIC_STATS\":\"true\"}  |
|                               | auto.purge                                         | true                        |
|                               | bucketing_version                                  | 2                           |
|                               | last_modified_by                                   | hive                        |
|                               | last_modified_time                                 | 1586180747                  |
|                               | numFiles                                           | 0                           |
|                               | numPartitions                                      | 0                           |
|                               | numRows                                            | 0                           |
|                               | rawDataSize                                        | 0                           |
|                               | totalSize                                          | 0                           |
|                               | transactional                                      | true                        |
|                               | transactional_properties                           | default                     |
|                               | transient_lastDdlTime                              | 1586180747                  |
|                               | NULL                                               | NULL                        |
| # Storage Information         | NULL                                               | NULL                        |
| SerDe Library:                | org.apache.hadoop.hive.ql.io.orc.OrcSerde          | NULL                        |
| InputFormat:                  | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat    | NULL                        |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat   | NULL                        |
| Compressed:                   | No                                                 | NULL                        |
| Num Buckets:                  | -1                                                 | NULL                        |
| Bucket Columns:               | []                                                 | NULL                        |
| Sort Columns:                 | []                                                 | NULL                        |
| Storage Desc Params:          | NULL                                               | NULL                        |
|                               | serialization.format                               | 1                           |
+-------------------------------+----------------------------------------------------+-----------------------------+
43 rows selected (0.172 seconds)

But here the problem is not about being able to recover data from the Trash in case of a human error, because the previous data is simply not deleted…

Workarounds we have found, even if we are sad not to be able to rely on such a basic feature (a sketch of the first one follows the list):

  • TRUNCATE TABLE yannick.test01 PARTITION(fab="GVA", lot_partition="TEST");
  • ALTER TABLE yannick.test01 DROP PARTITION(fab="GVA", lot_partition="TEST");
  • Deleting the old folder manually works, but this is quite dangerous and not natural at all…
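
For example the first workaround, applied to my test table before each reload, would look like this (a sketch only, adapt database, table and partition values to your case):

truncate table yannick.test01 partition(fab='GVA', lot_partition='TEST');
insert overwrite table yannick.test01 partition(fab='GVA', lot_partition='TEST') values(3, 'Three');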

References

The post INSERT OVERWRITE does not delete old directories appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/insert-overwrite-does-not-delete-old-directories.html/feed 0
Spark lineage issue and how to handle it with Hive Warehouse Connector https://blog.yannickjaquier.com/hadoop/spark-lineage-issue-and-how-to-handle-it-with-hive-warehouse-connector.html https://blog.yannickjaquier.com/hadoop/spark-lineage-issue-and-how-to-handle-it-with-hive-warehouse-connector.html#respond Sun, 23 Aug 2020 08:03:34 +0000 https://blog.yannickjaquier.com/?p=4983 Preamble One of my teammate has submitted me an interesting issue. In a Spark script he was reading a table partition, doing some operations on the resulting DataFrame and then tried to overwrite the modified DataFrame back in the same partition. Obviously this was hanging so this blog post… Internally this hanging problem is not […]

The post Spark lineage issue and how to handle it with Hive Warehouse Connector appeared first on IT World.

]]>

Table of contents

Preamble

One of my teammates brought me an interesting issue. In a Spark script he was reading a table partition, doing some operations on the resulting DataFrame and then trying to overwrite the same partition with the modified DataFrame.

Obviously this was hanging, hence this blog post… Internally this hanging problem is not a bug but a feature called Spark lineage. It avoids, for example, losing or corrupting your data if your process crashes.

We are using HDP 3.1.4 and so Spark 2.3.2.3.1.4.0-315. The below script will therefore use the Hive Warehouse Connector (HWC).

Test case and problem

The creation script of my small test table is the following:

drop table yannick.test01 purge;
create table yannick.test01(val int, descr string) partitioned by (fab string, lot_partition string) stored as orc;

insert into yannick.test01 partition(fab='GVA', lot_partition='TEST') values(1,'One');
insert into yannick.test01 partition(fab='GVA', lot_partition='TEST') values(2,'Two');
insert into yannick.test01 partition(fab='GVA', lot_partition='TEST') values(3,'Three');

In Beeline it gives:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> select * from yannick.test01;
+-------------+---------------+-------------+-----------------------+
| test01.val  | test01.descr  | test01.fab  | test01.lot_partition  |
+-------------+---------------+-------------+-----------------------+
| 1           | One           | GVA         | TEST                  |
| 2           | Two           | GVA         | TEST                  |
| 3           | Three         | GVA         | TEST                  |
+-------------+---------------+-------------+-----------------------+
3 rows selected (0.317 seconds)

To simulate a modification of the current partition in a DataFrame and the write back, I have written the below PySpark script:

>>> from pyspark_llap import HiveWarehouseSession
>>> from pyspark.sql.functions import *
>>> hive = HiveWarehouseSession.session(spark).build()
>>> df01=hive.executeQuery("select * from yannick.test01");
>>> df02=df01.withColumn('val',col('val')+1)
>>> df02.show()
+---+-----+---+-------------+
|val|descr|fab|lot_partition|
+---+-----+---+-------------+
|  4|Three|GVA|         TEST|
|  3|  Two|GVA|         TEST|
|  2|  One|GVA|         TEST|
+---+-----+---+-------------+

I have added 1 to the val column values. But when I try to write it back, the command that hangs is this one:

df02.write.mode('overwrite').format(HiveWarehouseSession().HIVE_WAREHOUSE_CONNECTOR).option("partition", "fab,lot_partition").option('table','yannick.test01').save()

Whereas if you try to append the data instead, it works well:

df02.write.mode('append').format(HiveWarehouseSession().HIVE_WAREHOUSE_CONNECTOR).option("partition", "fab,lot_partition").option('table','yannick.test01').save()

My teammate implemented a workaround: write to a temporary table (through HWC) and then launch another process (because doing the two operations in a single script hangs) that selects from this temporary table to insert overwrite back into the final table. Working, but not sexy.

Spark lineage solution

In fact, as we will see, there is no magic solution and overall this Spark lineage is a good thing. I have simply found a way to do it all in one single script. And, at least, it simplifies our scheduling.

I write this intermediate DataFrame in a Spark ORC table (versus a Hive table accessible through HWC):

>>> df02.write.format('orc').mode('overwrite').saveAsTable('temporary_table')
>>> df03=sql('select * from temporary_table');
>>> df03.show()
+---+-----+---+-------------+
|val|descr|fab|lot_partition|
+---+-----+---+-------------+
|  4|Three|GVA|         TEST|
|  3|  Two|GVA|         TEST|
|  2|  One|GVA|         TEST|
+---+-----+---+-------------+

Later you can use things like:

sql('select * from temporary_table').show()
sql('show tables').show()
sql('drop table temporary_table purge')

I was also wondering where those tables go, because I did not see them in the default database of the traditional Hive managed table directory:

[hdfs@client_node ~]$ hdfs dfs -ls  /warehouse/tablespace/managed/
Found 1 items
drwxrwxrwx+  - hive hadoop          0 2020-03-09 16:33 /warehouse/tablespace/managed/hive

The destination is set by these Ambari/Spark parameters:

spark_lineage01
[hdfs@server ~]$ hdfs dfs -ls /apps/spark/warehouse
Found 1 items
drwxr-xr-x   - mfgdl_ingestion hdfs          0 2020-03-20 14:21 /apps/spark/warehouse/temporary_table
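
You can also read this location back directly from the Spark session instead of searching for it in Ambari (a quick sketch from the same PySpark shell):

>>> spark.conf.get('spark.sql.warehouse.dir')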
The final insert overwrite back into the Hive table, still within the same script, is then simply:

sql('select * from temporary_table').write.mode('overwrite').format(HiveWarehouseSession().HIVE_WAREHOUSE_CONNECTOR).option("partition", "fab,lot_partition").option('table','yannick.test01').save()

It also clarifies (really?) a bit this story of Spark metastore vs Hive metastore…

Spark lineage second problem and partial solution

The previous solution went well until my teammate told me that it was still hanging, and he shared his code with me. I noticed he was using, for performance reasons, the persist() function when reading the source table into a DataFrame:

scala> val df01=hive.executeQuery("select * from yannick.test01").persist()

I have found a mention of this in Close HiveWarehouseSession operations:

Spark can invoke operations, such as cache(), persist(), and rdd(), on a DataFrame you obtain from running a HiveWarehouseSession executeQuery() or table(). The Spark operations can lock Hive resources. You can release any locks and resources by calling the HiveWarehouseSession close().

So I tried using the below Spark Scala code:

scala> import com.hortonworks.hwc.HiveWarehouseSession
import com.hortonworks.hwc.HiveWarehouseSession

scala> import com.hortonworks.hwc.HiveWarehouseSession._
import com.hortonworks.hwc.HiveWarehouseSession._

scala> val HIVE_WAREHOUSE_CONNECTOR="com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector"
HIVE_WAREHOUSE_CONNECTOR: String = com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector

scala> val hive = HiveWarehouseSession.session(spark).build()
hive: com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl = com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl@25f3207e

scala> val df01=hive.executeQuery("select * from yannick.test01").persist()
df01: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [val: int, descr: string ... 2 more fields]

scala> df01.show()
20/03/24 18:29:51 WARN TaskSetManager: Stage 0 contains a task of very large size (439 KB). The maximum recommended task size is 100 KB.
+---+-----+---+-------------+
|val|descr|fab|lot_partition|
+---+-----+---+-------------+
|  3|Three|GVA|         TEST|
|  1|  One|GVA|         TEST|
|  2|  Two|GVA|         TEST|
+---+-----+---+-------------+


scala> val df02=df01.withColumn("val",$"val" + 1)
df02: org.apache.spark.sql.DataFrame = [val: int, descr: string ... 2 more fields]

scala> df02.show()
20/03/24 18:30:16 WARN TaskSetManager: Stage 1 contains a task of very large size (439 KB). The maximum recommended task size is 100 KB.
+---+-----+---+-------------+
|val|descr|fab|lot_partition|
+---+-----+---+-------------+
|  4|Three|GVA|         TEST|
|  2|  One|GVA|         TEST|
|  3|  Two|GVA|         TEST|
+---+-----+---+-------------+


scala> hive.close()

scala> val hive = HiveWarehouseSession.session(spark).build()
hive: com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl = com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl@1c9f274d

scala> df02.write.mode("overwrite").format(HIVE_WAREHOUSE_CONNECTOR).option("partition", "fab,lot_partition").option("table","yannick.test01").save()
20/03/24 18:33:44 WARN TaskSetManager: Stage 2 contains a task of very large size (439 KB). The maximum recommended task size is 100 KB.

And voilà, my table has been correctly written back without any hanging. It looked like a marvelous solution until I received feedback from my teammate who is using PySpark:

>>> hive.close()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'HiveWarehouseSessionImpl' object has no attribute 'close'

I have tried to have a look at this HWC source code and apparently the close() function has not been exposed for use in PySpark… Definitely, since we moved to HDP 3, this HWC implementation does not look very mature and we have already identified many issues with it…
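
In the meantime a small defensive guard in PySpark at least avoids the AttributeError on HWC builds where close() is not exposed; be aware this is only a sketch and, when the attribute is missing, it obviously does not release the Hive locks the way the Scala close() call does:

>>> if hasattr(hive, 'close'):
...     hive.close()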

If someone has found something interesting please share, and I might come back to this article if we find a sexier solution…

References

The post Spark lineage issue and how to handle it with Hive Warehouse Connector appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/spark-lineage-issue-and-how-to-handle-it-with-hive-warehouse-connector.html/feed 0