IT World, https://blog.yannickjaquier.com (RDBMS, Unix and many more...)

Hive fetch task really improving response time by bypassing MapReduce ?

Published Sun, 24 Nov 2019
https://blog.yannickjaquier.com/hadoop/hive-fetch-task-really-improving-response-time-by-bypassing-mapreduce.html


Preamble

Our internal customers have started to report erratic response time for similar queries in their Spotfire dashboard. They claimed the queries were almost the same but response time was ten times slower for some of them…

Isn’t it my first performance issue with Hadoop? Yessss it is!!!!

We are running HDP 2.6 (2.6.4.0-91), and the Hive release in this distribution is 1.2.1000.

Similar queries, different response times

The first job was to extract the two “similar” queries from the Spotfire dashboard and compare them. I have written similar in quotes because what is similar from a user perspective can be really different in real life. To execute them, and to find them again in the flow of all the queries in the log files, I used the same trick as with Oracle: adding a comment containing my first name.

Query01:

select --Yannick01
lot_id,
wafer_id,
flow_id,
param_id,
start_t,
finish_t,
param_name,
param_unit,
param_low_limit,
param_high_limit,
nb_dies_tested,
nb_dies_failed
from prod_spotfire_refined.tbl_bin_stat_orc
where fab = "C2WF"
and lot_partition in ("Q842")
and lot_id in ("Q842889")
and wafer_id in ('Q842889-01E3','Q842889-02D6','Q842889-03D1','Q842889-04C4','Q842889-05B7','Q842889-06B2','Q842889-07A5','Q842889-08A0','Q842889-09G6',
'Q842889-10A0','Q842889-11G6','Q842889-12G1','Q842889-13F4','Q842889-14E7','Q842889-15E2','Q842889-16D5','Q842889-17D0','Q842889-18C3','Q842889-19B6',
'Q842889-20C3','Q842889-21B6','Q842889-22B1','Q842889-23A4','Q842889-24H2')
and flow_id in ("EWS1")
and start_t in ('2019.01.04-19:50:55','2019.01.05-02:21:26','2019.01.05-08:33:59','2019.01.05-14:06:24','2019.01.05-19:35:23','2019.01.05-22:25:30',
'2019.01.06-03:49:53','2019.01.06-09:19:52','2019.01.06-14:23:47','2019.01.06-19:27:35','2019.01.07-00:47:59','2019.01.07-06:37:21','2019.01.07-11:15:14',
'2019.01.07-15:56:55','2019.01.07-20:05:50','2019.01.07-22:48:09','2019.01.08-04:37:37','2019.01.08-08:48:26','2019.01.08-13:51:34','2019.01.08-18:31:38',
'2019.01.09-00:01:41','2019.01.09-04:11:44','2019.01.09-09:45:08','2019.01.09-13:47:11')
and finish_t in ('2019.01.05-01:52:02','2019.01.05-08:30:52','2019.01.05-14:01:33','2019.01.05-19:32:20','2019.01.05-22:22:15','2019.01.06-03:46:46',
'2019.01.06-09:16:43','2019.01.06-14:20:42','2019.01.06-19:24:22','2019.01.07-00:44:48','2019.01.07-06:34:15','2019.01.07-11:12:02','2019.01.07-15:53:45',
'2019.01.07-20:02:41','2019.01.07-22:26:37','2019.01.08-04:34:30','2019.01.08-08:45:20','2019.01.08-13:48:27','2019.01.08-18:28:33','2019.01.08-23:58:34',
'2019.01.09-04:08:41','2019.01.09-09:41:57','2019.01.09-13:44:03','2019.01.09-18:01:54')
and hbin_number in ("9")
and sbin_number in ("403");

Query02:

select --Yannick02
lot_id,
wafer_id,
flow_id,
param_id,
start_t,
finish_t,
param_name,
param_unit,
param_low_limit,
param_high_limit,
nb_dies_tested,
nb_dies_failed
from prod_spotfire_refined.tbl_bin_stat_orc
where fab = "C2WF"
and lot_partition in ("Q840")
and lot_id in ("Q840401")
and wafer_id in ('Q840401-01E6','Q840401-02E1','Q840401-03D4','Q840401-04C7','Q840401-05C2','Q840401-06B5','Q840401-07B0','Q840401-08A3','Q840401-09H1',
'Q840401-10A3','Q840401-11H1','Q840401-12G4','Q840401-13F7','Q840401-14F2','Q840401-15E5','Q840401-16E0','Q840401-17D3','Q840401-18C6','Q840401-19C1',
'Q840401-20C6','Q840401-21C1','Q840401-22B4','Q840401-23A7','Q840401-24A2','Q840401-25H0')
and flow_id in ("EWS1")
and start_t in ('2018.12.27-10:42:54','2018.12.27-12:01:57','2018.12.27-13:18:47','2018.12.27-14:36:31','2018.12.27-15:55:57','2018.12.27-17:13:42',
'2018.12.27-18:31:34','2018.12.27-19:49:27','2018.12.27-21:05:30','2018.12.27-22:23:36','2018.12.27-23:40:11','2018.12.28-00:56:40','2018.12.28-02:15:51',
'2018.12.28-03:41:23','2018.12.28-04:58:02','2018.12.28-06:16:11','2018.12.28-07:34:40','2018.12.28-08:55:29','2018.12.28-10:13:25','2018.12.28-11:30:34',
'2018.12.28-12:48:08','2018.12.28-14:06:12','2018.12.28-15:23:00','2018.12.28-16:39:50','2018.12.28-17:56:57')
and finish_t in ('2018.12.27-12:00:40','2018.12.27-13:17:30','2018.12.27-14:35:13','2018.12.27-15:54:39','2018.12.27-17:12:26','2018.12.27-18:30:16',
'2018.12.27-19:48:12','2018.12.27-21:04:15','2018.12.27-22:22:19','2018.12.27-23:38:53','2018.12.28-00:55:24','2018.12.28-02:14:34','2018.12.28-03:32:41',
'2018.12.28-04:56:43','2018.12.28-06:14:52','2018.12.28-07:33:21','2018.12.28-08:54:12','2018.12.28-10:12:06','2018.12.28-11:29:17','2018.12.28-12:46:50',
'2018.12.28-14:04:55','2018.12.28-15:21:43','2018.12.28-16:38:30','2018.12.28-17:55:38','2018.12.28-19:13:18')
and hbin_number in ("1")
and sbin_number in ("1");

At this stage I have to say that the queries are indeed pretty similar, and our users might be right: the response times should not be that different between the two queries…

Query02 returns 682 rows in around 330 seconds. Query01 returns 38,664 rows in around 25 seconds (16 seconds to really execute the query, as you can see below; the rest is network transfer).

So Query02 is roughly 13 times slower (330 seconds versus 25) while returning 57 times fewer rows (682 versus 38,664).
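Faced with two queries of this shape, a natural first diagnostic is to compare their execution plans with EXPLAIN. A minimal sketch, using the table from the queries above but with the predicate list shortened for readability; a plan that bypasses MapReduce shows a single Fetch Operator stage, whereas a full plan shows a Map Reduce stage:

```sql
-- Hedged illustration: same table as Query01, simplified WHERE clause.
-- Compare the stages of the two plans: "Fetch Operator" only, or a
-- "Map Reduce" stage with mappers and reducers.
explain
select lot_id, wafer_id, param_id
from prod_spotfire_refined.tbl_bin_stat_orc
where fab = "C2WF"
and lot_partition in ("Q842")
and lot_id in ("Q842889");
```

Running the same EXPLAIN for both parameter sets is a cheap way to confirm whether the two “similar” queries are in fact executed through the same mechanism.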

Partition statistics and concatenation

I started by checking that the two partitions involved are almost the same from a statistics point of view, and that compaction has been done for both of them.

From the global statistics there is a difference, but not a big one: the number of rows and the total partition sizes are really close. So close that it cannot explain the factor of ten in response time:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q842");
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-----------------------------+--+
|             col_name              |                                                          data_type                                                          |           comment           |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-----------------------------+--+
| # col_name                        | data_type                                                                                                                   | comment                     |
|                                   | NULL                                                                                                                        | NULL                        |
| lot_id                            | string                                                                                                                      |                             |
| wafer_id                          | string                                                                                                                      |                             |
| flow_id                           | string                                                                                                                      |                             |
| start_t                           | string                                                                                                                      |                             |
| finish_t                          | string                                                                                                                      |                             |
| hbin_number                       | int                                                                                                                         |                             |
| hbin_name                         | string                                                                                                                      |                             |
| sbin_number                       | int                                                                                                                         |                             |
| sbin_name                         | string                                                                                                                      |                             |
| param_id                          | string                                                                                                                      |                             |
| param_name                        | string                                                                                                                      |                             |
| param_unit                        | string                                                                                                                      |                             |
| param_low_limit                   | float                                                                                                                       |                             |
| param_high_limit                  | float                                                                                                                       |                             |
| nb_dies_tested                    | int                                                                                                                         |                             |
| nb_dies_failed                    | int                                                                                                                         |                             |
| nb_dies_good                      | int                                                                                                                         |                             |
| ingestion_date                    | string                                                                                                                      |                             |
|                                   | NULL                                                                                                                        | NULL                        |
| # Partition Information           | NULL                                                                                                                        | NULL                        |
| # col_name                        | data_type                                                                                                                   | comment                     |
|                                   | NULL                                                                                                                        | NULL                        |
| fab                               | string                                                                                                                      |                             |
| lot_partition                     | string                                                                                                                      |                             |
|                                   | NULL                                                                                                                        | NULL                        |
| # Detailed Partition Information  | NULL                                                                                                                        | NULL                        |
| Partition Value:                  | [C2WF, Q842]                                                                                                                | NULL                        |
| Database:                         | prod_spotfire_refined                                                                                                       | NULL                        |
| Table:                            | tbl_bin_stat_orc                                                                                                            | NULL                        |
| CreateTime:                       | Mon Feb 11 13:03:15 CET 2019                                                                                                | NULL                        |
| LastAccessTime:                   | UNKNOWN                                                                                                                     | NULL                        |
| Protect Mode:                     | None                                                                                                                        | NULL                        |
| Location:                         | hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842  | NULL                        |
| Partition Parameters:             | NULL                                                                                                                        | NULL                        |
|                                   | COLUMN_STATS_ACCURATE                                                                                                       | {\"BASIC_STATS\":\"true\"}  |
|                                   | numFiles                                                                                                                    | 14                          |
|                                   | numRows                                                                                                                     | 143216514                   |
|                                   | rawDataSize                                                                                                                 | 151433505731                |
|                                   | totalSize                                                                                                                   | 1353827219                  |
|                                   | transient_lastDdlTime                                                                                                       | 1554163118                  |
|                                   | NULL                                                                                                                        | NULL                        |
| # Storage Information             | NULL                                                                                                                        | NULL                        |
| SerDe Library:                    | org.apache.hadoop.hive.ql.io.orc.OrcSerde                                                                                   | NULL                        |
| InputFormat:                      | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                                                                             | NULL                        |
| OutputFormat:                     | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                                                                            | NULL                        |
| Compressed:                       | No                                                                                                                          | NULL                        |
| Num Buckets:                      | -1                                                                                                                          | NULL                        |
| Bucket Columns:                   | []                                                                                                                          | NULL                        |
| Sort Columns:                     | []                                                                                                                          | NULL                        |
| Storage Desc Params:              | NULL                                                                                                                        | NULL                        |
|                                   | serialization.format                                                                                                        | 1                           |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-----------------------------+--+
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q840");
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-----------------------------+--+
|             col_name              |                                                          data_type                                                          |           comment           |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-----------------------------+--+
| # col_name                        | data_type                                                                                                                   | comment                     |
|                                   | NULL                                                                                                                        | NULL                        |
| lot_id                            | string                                                                                                                      |                             |
| wafer_id                          | string                                                                                                                      |                             |
| flow_id                           | string                                                                                                                      |                             |
| start_t                           | string                                                                                                                      |                             |
| finish_t                          | string                                                                                                                      |                             |
| hbin_number                       | int                                                                                                                         |                             |
| hbin_name                         | string                                                                                                                      |                             |
| sbin_number                       | int                                                                                                                         |                             |
| sbin_name                         | string                                                                                                                      |                             |
| param_id                          | string                                                                                                                      |                             |
| param_name                        | string                                                                                                                      |                             |
| param_unit                        | string                                                                                                                      |                             |
| param_low_limit                   | float                                                                                                                       |                             |
| param_high_limit                  | float                                                                                                                       |                             |
| nb_dies_tested                    | int                                                                                                                         |                             |
| nb_dies_failed                    | int                                                                                                                         |                             |
| nb_dies_good                      | int                                                                                                                         |                             |
| ingestion_date                    | string                                                                                                                      |                             |
|                                   | NULL                                                                                                                        | NULL                        |
| # Partition Information           | NULL                                                                                                                        | NULL                        |
| # col_name                        | data_type                                                                                                                   | comment                     |
|                                   | NULL                                                                                                                        | NULL                        |
| fab                               | string                                                                                                                      |                             |
| lot_partition                     | string                                                                                                                      |                             |
|                                   | NULL                                                                                                                        | NULL                        |
| # Detailed Partition Information  | NULL                                                                                                                        | NULL                        |
| Partition Value:                  | [C2WF, Q840]                                                                                                                | NULL                        |
| Database:                         | prod_spotfire_refined                                                                                                       | NULL                        |
| Table:                            | tbl_bin_stat_orc                                                                                                            | NULL                        |
| CreateTime:                       | Mon Feb 11 13:02:12 CET 2019                                                                                                | NULL                        |
| LastAccessTime:                   | UNKNOWN                                                                                                                     | NULL                        |
| Protect Mode:                     | None                                                                                                                        | NULL                        |
| Location:                         | hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840  | NULL                        |
| Partition Parameters:             | NULL                                                                                                                        | NULL                        |
|                                   | COLUMN_STATS_ACCURATE                                                                                                       | {\"BASIC_STATS\":\"true\"}  |
|                                   | numFiles                                                                                                                    | 11                          |
|                                   | numRows                                                                                                                     | 109795564                   |
|                                   | rawDataSize                                                                                                                 | 116612625917                |
|                                   | totalSize                                                                                                                   | 989787753                   |
|                                   | transient_lastDdlTime                                                                                                       | 1554163118                  |
|                                   | NULL                                                                                                                        | NULL                        |
| # Storage Information             | NULL                                                                                                                        | NULL                        |
| SerDe Library:                    | org.apache.hadoop.hive.ql.io.orc.OrcSerde                                                                                   | NULL                        |
| InputFormat:                      | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                                                                             | NULL                        |
| OutputFormat:                     | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                                                                            | NULL                        |
| Compressed:                       | No                                                                                                                          | NULL                        |
| Num Buckets:                      | -1                                                                                                                          | NULL                        |
| Bucket Columns:                   | []                                                                                                                          | NULL                        |
| Sort Columns:                     | []                                                                                                                          | NULL                        |
| Storage Desc Params:              | NULL                                                                                                                        | NULL                        |
|                                   | serialization.format                                                                                                        | 1                           |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+-----------------------------+--+
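For reference, had the statistics been stale or the partitions left uncompacted, both could be fixed per partition with standard HiveQL; a sketch for one of the two partitions above (the commands are generic Hive maintenance statements, not specific to this table):

```sql
-- Refresh basic (numRows, totalSize, ...) statistics for one partition
analyze table prod_spotfire_refined.tbl_bin_stat_orc
partition(fab = "C2WF", lot_partition = "Q842")
compute statistics;

-- Refresh per-column statistics (distinct_count, avg_col_len, ...)
analyze table prod_spotfire_refined.tbl_bin_stat_orc
partition(fab = "C2WF", lot_partition = "Q842")
compute statistics for columns;

-- Merge small ORC files of the partition: the "concatenation"
-- (compaction) mentioned above, which explains the low numFiles counts
alter table prod_spotfire_refined.tbl_bin_stat_orc
partition(fab = "C2WF", lot_partition = "Q842")
concatenate;
```

Here both partitions already report COLUMN_STATS_ACCURATE with BASIC_STATS true and only 11 to 14 files each, so neither stale statistics nor file fragmentation looks like the culprit.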

Even if a bit boring and long, I also checked the column statistics; again, not much difference:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc param_id partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| param_id                | string                |                       |                       | 0                     | 92393                 | 5.7145                | 10                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.445 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc param_id partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| param_id                | string                |                       |                       | 0                     | 169450                | 5.6886                | 10                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.457 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc lot_id partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| lot_id                  | string                |                       |                       | 0                     | 290                   | 7.1705                | 10                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.407 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc lot_id partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| lot_id                  | string                |                       |                       | 0                     | 172                   | 7.071                 | 10                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.416 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc wafer_id partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| wafer_id                | string                |                       |                       | 0                     | 3744                  | 11.9485               | 12                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.444 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc wafer_id partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| wafer_id                | string                |                       |                       | 0                     | 6867                  | 11.8972               | 12                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.398 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc flow_id partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| flow_id                 | string                |                       |                       | 0                     | 18                    | 4.1747                | 30                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.423 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc flow_id partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| flow_id                 | string                |                       |                       | 0                     | 37                    | 4.3368                | 31                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.448 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc start_t partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| start_t                 | string                |                       |                       | 0                     | 17811                 | 19.0                  | 19                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.425 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc start_t partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| start_t                 | string                |                       |                       | 0                     | 11549                 | 19.0                  | 19                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.408 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc finish_t partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| finish_t                | string                |                       |                       | 0                     | 11059                 | 19.0                  | 19                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.398 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc finish_t partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| finish_t                | string                |                       |                       | 0                     | 12060                 | 19.0                  | 19                    |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.382 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc hbin_number partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| hbin_number             | int                   | 0                     | 65535                 | 0                     | 61                    |                       |                       |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.356 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc hbin_number partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| hbin_number             | int                   | 0                     | 65535                 | 0                     | 58                    |                       |                       |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.364 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc sbin_number partition(fab = "C2WF", lot_partition="Q840");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| sbin_number             | int                   | 1                     | 65535                 | 0                     | 3288                  |                       |                       |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.417 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc sbin_number partition(fab = "C2WF", lot_partition="Q842");
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
|        col_name         |       data_type       |          min          |          max          |       num_nulls       |    distinct_count     |      avg_col_len      |      max_col_len      |       num_trues       |      num_falses       |        comment        |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
| # col_name              | data_type             | min                   | max                   | num_nulls             | distinct_count        | avg_col_len           | max_col_len           | num_trues             | num_falses            | comment               |
|                         | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  | NULL                  |
| sbin_number             | int                   | 0                     | 65535                 | 0                     | 3148                  |                       |                       |                       |                       | from deserializer     |
+-------------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+--+
3 rows selected (0.344 seconds)
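Typing each per-column describe by hand quickly becomes tedious. The statements above can be generated with a small shell loop and then fed to beeline in one go; this is just a convenience sketch using the table, column and partition names from this investigation (adjust the JDBC URL to your own cluster):

```shell
# Generate the per-column "describe formatted" statements for both
# partitions in one shot, instead of typing them one by one in beeline.
TABLE="prod_spotfire_refined.tbl_bin_stat_orc"
COLUMNS="lot_id wafer_id flow_id start_t finish_t hbin_number sbin_number param_id"
PARTITIONS="Q840 Q842"

STMTS=""
for col in $COLUMNS; do
  for part in $PARTITIONS; do
    # One describe statement per (column, partition) pair
    STMTS="${STMTS}describe formatted $TABLE $col partition(fab = \"C2WF\", lot_partition=\"$part\");
"
  done
done
printf '%s' "$STMTS"
# Save the output to describes.sql and run it with, for example:
#   beeline -u "<your jdbc:hive2 url>" -f describes.sql
```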

The global table and statistics information can be obtained in a more condensed way using:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe extended prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q840");

| Detailed Partition Information  | Partition(values:[C2WF, Q840], dbName:prod_spotfire_refined, tableName:tbl_bin_stat_orc, createTime:1549886532, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:lot_id, type:string, comment:null), FieldSchema(name:wafer_id, type:string, comment:null), FieldSchema(name:flow_id, type:string, comment:null), FieldSchema(name:start_t, type:string, comment:null), FieldSchema(name:finish_t, type:string, comment:null), FieldSchema(name:hbin_number, type:int, comment:null), FieldSchema(name:hbin_name, type:string, comment:null), FieldSchema(name:sbin_number, type:int, comment:null), FieldSchema(name:sbin_name, type:string, comment:null), FieldSchema(name:param_id, type:string, comment:null), FieldSchema(name:param_name, type:string, comment:null), FieldSchema(name:param_unit, type:string, comment:null), FieldSchema(name:param_low_limit, type:float, comment:null), FieldSchema(name:param_high_limit, type:float, comment:null), FieldSchema(name:nb_dies_tested, type:int, comment:null), FieldSchema(name:nb_dies_failed, type:int, comment:null), FieldSchema(name:nb_dies_good, type:int, comment:null), FieldSchema(name:ingestion_date, type:string, comment:null), FieldSchema(name:fab, type:string, comment:null), FieldSchema(name:lot_partition, type:string, comment:null)], location:hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{totalSize=989787753,  numRows=109795564, rawDataSize=116612625917, 
COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=11, transient_lastDdlTime=1554163118})  |                       |

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe extended prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q842");

| Detailed Partition Information  | Partition(values:[C2WF, Q842], dbName:prod_spotfire_refined, tableName:tbl_bin_stat_orc, createTime:1549886595, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:lot_id, type:string, comment:null), FieldSchema(name:wafer_id, type:string, comment:null), FieldSchema(name:flow_id, type:string, comment:null), FieldSchema(name:start_t, type:string, comment:null), FieldSchema(name:finish_t, type:string, comment:null), FieldSchema(name:hbin_number, type:int, comment:null), FieldSchema(name:hbin_name, type:string, comment:null), FieldSchema(name:sbin_number, type:int, comment:null), FieldSchema(name:sbin_name, type:string, comment:null), FieldSchema(name:param_id, type:string, comment:null), FieldSchema(name:param_name, type:string, comment:null), FieldSchema(name:param_unit, type:string, comment:null), FieldSchema(name:param_low_limit, type:float, comment:null), FieldSchema(name:param_high_limit, type:float, comment:null), FieldSchema(name:nb_dies_tested, type:int, comment:null), FieldSchema(name:nb_dies_failed, type:int, comment:null), FieldSchema(name:nb_dies_good, type:int, comment:null), FieldSchema(name:ingestion_date, type:string, comment:null), FieldSchema(name:fab, type:string, comment:null), FieldSchema(name:lot_partition, type:string, comment:null)], location:hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{totalSize=1353827219, numRows=143216514, rawDataSize=151433505731, 
COLUMN_STATS_ACCURATE={"BASIC_STATS":"true"}, numFiles=14, transient_lastDdlTime=1554163118})  |             
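Before comparing the plans, a quick sanity check on the statistics above: the average on-disk bytes per row for each partition, computed from the totalSize and numRows figures reported by DESCRIBE EXTENDED (plain arithmetic, nothing Hive-specific):

```shell
# totalSize / numRows taken from the DESCRIBE EXTENDED output above
awk 'BEGIN {
  printf "Q840: %.2f bytes/row\n", 989787753 / 109795564
  printf "Q842: %.2f bytes/row\n", 1353827219 / 143216514
}'
```

Both partitions land around 9 bytes per row of compressed ORC, so nothing in the physical layout of the data itself explains a tenfold response time gap.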

As I would do on a traditional RDBMS, I have extracted the explain plans of the two queries:

explain --extended
select --Yannick01
lot_id,
wafer_id,
flow_id,
param_id,
start_t,
finish_t,
param_name,
param_unit,
param_low_limit,
param_high_limit,
nb_dies_tested,
nb_dies_failed
from prod_spotfire_refined.tbl_bin_stat_orc
where fab = "C2WF"
and lot_partition in ("Q842")
and lot_id in ("Q842889")
and wafer_id in ('Q842889-01E3','Q842889-02D6','Q842889-03D1','Q842889-04C4','Q842889-05B7','Q842889-06B2','Q842889-07A5','Q842889-08A0','Q842889-09G6','Q842889-10A0','Q842889-11G6','Q842889-12G1','Q842889-13F4','Q842889-14E7','Q842889-15E2','Q842889-16D5','Q842889-17D0','Q842889-18C3','Q842889-19B6','Q842889-20C3','Q842889-21B6','Q842889-22B1','Q842889-23A4','Q842889-24H2')
and flow_id in ("EWS1")
and start_t in ('2019.01.04-19:50:55','2019.01.05-02:21:26','2019.01.05-08:33:59','2019.01.05-14:06:24','2019.01.05-19:35:23','2019.01.05-22:25:30','2019.01.06-03:49:53','2019.01.06-09:19:52','2019.01.06-14:23:47','2019.01.06-19:27:35','2019.01.07-00:47:59','2019.01.07-06:37:21','2019.01.07-11:15:14','2019.01.07-15:56:55','2019.01.07-20:05:50','2019.01.07-22:48:09','2019.01.08-04:37:37','2019.01.08-08:48:26','2019.01.08-13:51:34','2019.01.08-18:31:38','2019.01.09-00:01:41','2019.01.09-04:11:44','2019.01.09-09:45:08','2019.01.09-13:47:11')
and finish_t in ('2019.01.05-01:52:02','2019.01.05-08:30:52','2019.01.05-14:01:33','2019.01.05-19:32:20','2019.01.05-22:22:15','2019.01.06-03:46:46','2019.01.06-09:16:43','2019.01.06-14:20:42','2019.01.06-19:24:22','2019.01.07-00:44:48','2019.01.07-06:34:15','2019.01.07-11:12:02','2019.01.07-15:53:45','2019.01.07-20:02:41','2019.01.07-22:26:37','2019.01.08-04:34:30','2019.01.08-08:45:20','2019.01.08-13:48:27','2019.01.08-18:28:33','2019.01.08-23:58:34','2019.01.09-04:08:41','2019.01.09-09:41:57','2019.01.09-13:44:03','2019.01.09-18:01:54')
and hbin_number in ("9")
and sbin_number in ("403");

Explain                                                                                                                                                                                                                                                         
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
Plan not optimized by CBO.                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                
Stage-0                                                                                                                                                                                                                                                         
   Fetch Operator                                                                                                                                                                                                                                               
      limit:-1                                                                                                                                                                                                                                                  
      Stage-1                                                                                                                                                                                                                                                   
         Map 1                                                                                                                                                                                                                                                  
         File Output Operator [FS_2725118]                                                                                                                                                                                                                      
            compressed:false                                                                                                                                                                                                                                    
            Statistics:Num rows: 1 Data size: 780 Basic stats: COMPLETE Column stats: COMPLETE                                                                                                                                                                  
            table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}                                      
            Select Operator [SEL_2725117]                                                                                                                                                                                                                       
               outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11"]                                                                                                                            
               Statistics:Num rows: 1 Data size: 780 Basic stats: COMPLETE Column stats: COMPLETE                                                                                                                                                               
               Filter Operator [FIL_2725119]                                                                                                                                                                                                                    
                  predicate:((lot_partition) IN ('Q842') and (lot_id) IN ('Q842889') and (wafer_id) IN ('Q842889-01E3', 'Q842889-02D6', 'Q842889-03D1', 'Q842889-04C4', 'Q842889-05B7', 'Q842889-06B2', 'Q842889-07A5', 'Q842889-08A0', 'Q842889-09G6', 'Q842889-10A0', 'Q842889-11G6', 'Q842889-12G1', 'Q842889-13F4', 'Q842889-14E7', 'Q842889-15E2', 'Q842889-16D5', 'Q842889-17D0', 'Q842889-18C3', 'Q842889-19B6', 'Q842889-20C3', 'Q842889-21B6', 'Q842889-22B1', 'Q842889-23A4', 'Q842889-24H2') and (flow_id) IN ('EWS1') and (start_t) IN ('2019.01.04-19:50:55', '2019.01.05-02:21:26', '2019.01.05-08:33:59', '2019.01.05-14:06:24', '2019.01.05-19:35:23', '2019.01.05-22:25:30', '2019.01.06-03:49:53', '2019.01.06-09:19:52', '2019.01.06-14:23:47', '2019.01.06-19:27:35', '2019.01.07-00:47:59', '2019.01.07-06:37:21', '2019.01.07-11:15:14', '2019.01.07-15:56:55', '2019.01.07-20:05:50', '2019.01.07-22:48:09', '2019.01.08-04:37:37', '2019.01.08-08:48:26', '2019.01.08-13:51:34', '2019.01.08-18:31:38', '2019.01.09-00:01:41', '2019.01.09-04:11:44', '2019.01.09-09:45:08', '2019.01.09-13:47:11') and (finish_t) IN ('2019.01.05-01:52:02', '2019.01.05-08:30:52', '2019.01.05-14:01:33', '2019.01.05-19:32:20', '2019.01.05-22:22:15', '2019.01.06-03:46:46', '2019.01.06-09:16:43', '2019.01.06-14:20:42', '2019.01.06-19:24:22', '2019.01.07-00:44:48', '2019.01.07-06:34:15', '2019.01.07-11:12:02', '2019.01.07-15:53:45', '2019.01.07-20:02:41', '2019.01.07-22:26:37', '2019.01.08-04:34:30', '2019.01.08-08:45:20', '2019.01.08-13:48:27', '2019.01.08-18:28:33', '2019.01.08-23:58:34', '2019.01.09-04:08:41', '2019.01.09-09:41:57', '2019.01.09-13:44:03', '2019.01.09-18:01:54') and (hbin_number) IN ('9') and (sbin_number) IN ('403')) (type: boolean) 
                  Statistics:Num rows: 1 Data size: 972 Basic stats: COMPLETE Column stats: COMPLETE                                                                                                                                                            
                  TableScan [TS_2725115]                                                                                                                                                                                                                        
                     alias:tbl_bin_stat_orc                                                                                                                                                                                                                     
                     Statistics:Num rows: 143216514 Data size: 151433505731 Basic stats: COMPLETE Column stats: COMPLETE                                                                                                                                        
                                                                                                                                                                                                                                                                

21 rows selected. 

Remark:
I also tried the EXTENDED mode of the EXPLAIN command (the only option available in my Hive release, though plenty of others exist in the latest Hive releases), which, obviously, produces a more verbose output. Maybe too verbose, but in it I found an interesting additional piece of information confirming that the query is effectively using partition pruning and accessing only the expected partitions:

partition values:                                                                                                                                                                                                                                   
  fab C2WF                                                                                                                                                                                                                                          
  lot_partition Q840     
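As a cross-check of the pruning target, Hive can also list the partitions matching a partial partition specification (standard SHOW PARTITIONS syntax; shown here as a sketch against our table):

```sql
SHOW PARTITIONS prod_spotfire_refined.tbl_bin_stat_orc PARTITION(fab='C2WF');
```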
Query02:

explain --extended
select --Yannick02
lot_id,
wafer_id,
flow_id,
param_id,
start_t,
finish_t,
param_name,
param_unit,
param_low_limit,
param_high_limit,
nb_dies_tested,
nb_dies_failed
from prod_spotfire_refined.tbl_bin_stat_orc
where fab = "C2WF"
and lot_partition in ("Q840")
and lot_id in ("Q840401")
and wafer_id in ('Q840401-01E6','Q840401-02E1','Q840401-03D4','Q840401-04C7','Q840401-05C2','Q840401-06B5','Q840401-07B0','Q840401-08A3','Q840401-09H1','Q840401-10A3','Q840401-11H1','Q840401-12G4','Q840401-13F7','Q840401-14F2','Q840401-15E5','Q840401-16E0','Q840401-17D3','Q840401-18C6','Q840401-19C1','Q840401-20C6','Q840401-21C1','Q840401-22B4','Q840401-23A7','Q840401-24A2','Q840401-25H0')
and flow_id in ("EWS1")
and start_t in ('2018.12.27-10:42:54','2018.12.27-12:01:57','2018.12.27-13:18:47','2018.12.27-14:36:31','2018.12.27-15:55:57','2018.12.27-17:13:42','2018.12.27-18:31:34','2018.12.27-19:49:27','2018.12.27-21:05:30','2018.12.27-22:23:36','2018.12.27-23:40:11','2018.12.28-00:56:40','2018.12.28-02:15:51','2018.12.28-03:41:23','2018.12.28-04:58:02','2018.12.28-06:16:11','2018.12.28-07:34:40','2018.12.28-08:55:29','2018.12.28-10:13:25','2018.12.28-11:30:34','2018.12.28-12:48:08','2018.12.28-14:06:12','2018.12.28-15:23:00','2018.12.28-16:39:50','2018.12.28-17:56:57')
and finish_t in ('2018.12.27-12:00:40','2018.12.27-13:17:30','2018.12.27-14:35:13','2018.12.27-15:54:39','2018.12.27-17:12:26','2018.12.27-18:30:16','2018.12.27-19:48:12','2018.12.27-21:04:15','2018.12.27-22:22:19','2018.12.27-23:38:53','2018.12.28-00:55:24','2018.12.28-02:14:34','2018.12.28-03:32:41','2018.12.28-04:56:43','2018.12.28-06:14:52','2018.12.28-07:33:21','2018.12.28-08:54:12','2018.12.28-10:12:06','2018.12.28-11:29:17','2018.12.28-12:46:50','2018.12.28-14:04:55','2018.12.28-15:21:43','2018.12.28-16:38:30','2018.12.28-17:55:38','2018.12.28-19:13:18')
and hbin_number in ("1")
and sbin_number in ("1");


Explain                                                                                                                                                                                                                                                         
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
Plan not optimized by CBO.                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                
Stage-0                                                                                                                                                                                                                                                         
   Fetch Operator                                                                                                                                                                                                                                               
      limit:-1                                                                                                                                                                                                                                                  
      Select Operator [SEL_2725124]                                                                                                                                                                                                                             
         outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11"]                                                                                                                                  
         Filter Operator [FIL_2725126]                                                                                                                                                                                                                          
            predicate:((lot_partition) IN ('Q840') and (lot_id) IN ('Q840401') and (wafer_id) IN ('Q840401-01E6', 'Q840401-02E1', 'Q840401-03D4', 'Q840401-04C7', 'Q840401-05C2', 'Q840401-06B5', 'Q840401-07B0', 'Q840401-08A3', 'Q840401-09H1', 'Q840401-10A3', 'Q840401-11H1', 'Q840401-12G4', 'Q840401-13F7', 'Q840401-14F2', 'Q840401-15E5', 'Q840401-16E0', 'Q840401-17D3', 'Q840401-18C6', 'Q840401-19C1', 'Q840401-20C6', 'Q840401-21C1', 'Q840401-22B4', 'Q840401-23A7', 'Q840401-24A2', 'Q840401-25H0') and (flow_id) IN ('EWS1') and (start_t) IN ('2018.12.27-10:42:54', '2018.12.27-12:01:57', '2018.12.27-13:18:47', '2018.12.27-14:36:31', '2018.12.27-15:55:57', '2018.12.27-17:13:42', '2018.12.27-18:31:34', '2018.12.27-19:49:27', '2018.12.27-21:05:30', '2018.12.27-22:23:36', '2018.12.27-23:40:11', '2018.12.28-00:56:40', '2018.12.28-02:15:51', '2018.12.28-03:41:23', '2018.12.28-04:58:02', '2018.12.28-06:16:11', '2018.12.28-07:34:40', '2018.12.28-08:55:29', '2018.12.28-10:13:25', '2018.12.28-11:30:34', '2018.12.28-12:48:08', '2018.12.28-14:06:12', '2018.12.28-15:23:00', '2018.12.28-16:39:50', '2018.12.28-17:56:57') and (finish_t) IN ('2018.12.27-12:00:40', '2018.12.27-13:17:30', '2018.12.27-14:35:13', '2018.12.27-15:54:39', '2018.12.27-17:12:26', '2018.12.27-18:30:16', '2018.12.27-19:48:12', '2018.12.27-21:04:15', '2018.12.27-22:22:19', '2018.12.27-23:38:53', '2018.12.28-00:55:24', '2018.12.28-02:14:34', '2018.12.28-03:32:41', '2018.12.28-04:56:43', '2018.12.28-06:14:52', '2018.12.28-07:33:21', '2018.12.28-08:54:12', '2018.12.28-10:12:06', '2018.12.28-11:29:17', '2018.12.28-12:46:50', '2018.12.28-14:04:55', '2018.12.28-15:21:43', '2018.12.28-16:38:30', '2018.12.28-17:55:38', '2018.12.28-19:13:18') and (hbin_number) IN ('1') and (sbin_number) IN ('1')) (type: boolean) 
            TableScan [TS_2725122]                                                                                                                                                                                                                              
               alias:tbl_bin_stat_orc                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                

12 rows selected.

We can anyway notice that the explain plan of Query02 does not display any statistics, even though they do exist in the table. I would also have expected to see something explicit like "fetch task" in clear text; let's see why…

When executing the queries, the first observation I made is that Query02 does not run a MapReduce job (through the Tez engine, as configured by default on our cluster) but performs a direct HDFS access. You can tell that a query is NOT going through MapReduce in Beeline when, just after submitting it, you get the query result directly instead of the usual graphical display of the number of Map and Reduce tasks…

Query01 performs a standard MapReduce job:

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      9          9        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 15.92 s
--------------------------------------------------------------------------------

We have seen above that Query02 is roughly 13 times slower while returning 57 times fewer rows, so the direct HDFS access, called a fetch task, does not look that optimal…
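For the record, whether Hive converts a simple query into a fetch task is governed by two standard Hive parameters. The snippet below is only a session-level sketch showing how to inspect them and force a query back to a regular Tez/MapReduce job; it does not reflect our cluster defaults:

```sql
-- Show the current values (hive.fetch.task.conversion can be none, minimal or more)
SET hive.fetch.task.conversion;
SET hive.fetch.task.conversion.threshold;

-- Force fetch-task conversion off for the current session
SET hive.fetch.task.conversion=none;
```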

Remark:
The "Plan not optimized by CBO" message might look scary, so I decided to dig into it a bit. First, go to your Hive server and check the /var/log/hive/hiveserver2.log file (the Hadoop parameter is hive_log_dir). Now you understand why I added a dummy comment to each query (--Yannick01 and --Yannick02): this is the trick I was using with Oracle to find my own statements in V$SQL. Opening this huge file, you can search for your name, and below the text of your query you should see something like:

2019-04-04 12:06:12,633 INFO  [HiveServer2-Handler-Pool: Thread-15548880]: parse.BaseSemanticAnalyzer (CalcitePlanner.java:canCBOHandleAst(405)) - Not invoking CBO because the statement has too few joins
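Pulling that line out can be scripted; here is a sketch using a fabricated mini log file standing in for the real hiveserver2.log (whose path and exact content will of course differ on your cluster):

```shell
# Build a tiny stand-in for hiveserver2.log (hypothetical content, for illustration only)
log=/tmp/hiveserver2_demo.log
cat > "$log" <<'EOF'
select --Yannick01 lot_id from prod_spotfire_refined.tbl_bin_stat_orc where ...
2019-04-04 12:06:12,633 INFO parse.BaseSemanticAnalyzer - Not invoking CBO because the statement has too few joins
EOF
# Search for the query tag and show the line logged right after it
grep -A1 -- '--Yannick01' "$log"
```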

I have also checked the partition sizes as well as the number of ORC files in each partition (compaction):

hdfs@client_node:~$ hdfs dfs -ls -r -t /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842
Found 14 items
-rwxrwxrwx   3 mfgdl_ingestion hadoop   10185787 2019-03-30 17:16 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000006_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop    4240294 2019-03-30 17:16 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000007_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop   10696310 2019-03-30 17:16 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000005_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  268531575 2019-03-30 17:16 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000001_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  263145620 2019-03-30 17:16 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000004_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  257281211 2019-03-30 17:28 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000003_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  275941797 2019-03-30 17:34 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000000_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  252093563 2019-03-30 17:40 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000002_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop    2127431 2019-03-30 22:38 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000018_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop    2144463 2019-03-31 10:21 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000018_0_copy_1
-rwxrwxrwx   3 mfgdl_ingestion hadoop    2833270 2019-03-31 23:38 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000015_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop    1563968 2019-04-01 10:50 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000021_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop    1418099 2019-04-01 17:31 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000021_0_copy_1
-rwxrwxrwx   3 mfgdl_ingestion hadoop    1623831 2019-04-02 01:58 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842/000017_0

hdfs@client_node:~$ hdfs dfs -du -h -s /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842
1.3 G  /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q842

hdfs@client_node:~$ hdfs dfs -ls -r -t /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840
Found 12 items
-rwxrwxrwx   3 mfgdl_ingestion hadoop  298096825 2019-03-30 17:01 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000000_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop    7651387 2019-03-30 17:01 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000004_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  262040206 2019-03-30 17:02 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000001_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  177619545 2019-03-30 17:02 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000003_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop  241590172 2019-03-30 17:13 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000002_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop     588415 2019-03-30 22:38 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000035_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop     466219 2019-03-31 10:21 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000037_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop     496742 2019-03-31 23:38 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000038_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop     463094 2019-04-01 10:50 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000037_0_copy_1
-rwxrwxrwx   3 mfgdl_ingestion hadoop     362538 2019-04-01 17:31 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000041_0
-rwxrwxrwx   3 mfgdl_ingestion hadoop     412610 2019-04-02 01:58 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000038_0_copy_1
drwxrwxrwx   - mfgaewsp        hadoop          0 2019-04-03 16:12 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/.hive-staging_hive_2019-04-03_16-12-30_095_3617150963696131727-133149

hdfs@client_node:~$ hdfs dfs -du -h -s /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840
943.9 M  /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840

Again, no particularly big difference in size or number of files between the two partitions…

To speed up the long-running query I tried to compact the partition using:

ALTER TABLE prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q840") CONCATENATE;

Reaching this status:

hdfs@client_node:~$ hdfs dfs -ls -r -t /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840
Found 5 items
-rwxrwxrwx   3 mfgaewsp hadoop    8702924 2019-04-03 16:12 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000004_0
-rwxrwxrwx   3 mfgaewsp hadoop  266770170 2019-04-03 16:12 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000001_0
-rwxrwxrwx   3 mfgaewsp hadoop  178527568 2019-04-03 16:12 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000003_0
-rwxrwxrwx   3 mfgaewsp hadoop  242418393 2019-04-03 16:13 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000002_0
-rwxrwxrwx   3 mfgaewsp hadoop  291710785 2019-04-03 16:13 /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840/000000_0

Then we discovered that the CONCATENATE command destroys the existing statistics, so we had to recompute them!

ANALYZE TABLE prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q840") COMPUTE STATISTICS FOR COLUMNS;

Remark:
We have been able to use FOR COLUMNS without specifying the column list because our table does not contain any complex types like arrays.
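Had the table contained complex types, we would have had to enumerate the (primitive) columns explicitly; the column list below is just an illustrative subset, not the full set we would actually need:

```sql
ANALYZE TABLE prod_spotfire_refined.tbl_bin_stat_orc
  PARTITION(fab = "C2WF", lot_partition = "Q840")
  COMPUTE STATISTICS FOR COLUMNS lot_id, wafer_id, nb_dies_tested;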

Statistics went back to initial situation:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> describe formatted prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q840");
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|             col_name              |                                                          data_type                                                          |                                                                                                                                                                                                                                    comment                                                                                                                                                                                                                                     |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| # col_name                        | data_type                                                                                                                   | comment                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                                   | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| lot_id                            | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| wafer_id                          | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| flow_id                           | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| start_t                           | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| finish_t                          | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| hbin_number                       | int                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| hbin_name                         | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| sbin_number                       | int                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| sbin_name                         | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| param_id                          | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| param_name                        | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| param_unit                        | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| param_low_limit                   | float                                                                                                                       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| param_high_limit                  | float                                                                                                                       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| nb_dies_tested                    | int                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| nb_dies_failed                    | int                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| nb_dies_good                      | int                                                                                                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| ingestion_date                    | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|                                   | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| # Partition Information           | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| # col_name                        | data_type                                                                                                                   | comment                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|                                   | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| fab                               | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| lot_partition                     | string                                                                                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|                                   | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| # Detailed Partition Information  | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Partition Value:                  | [C2WF, Q840]                                                                                                                | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Database:                         | prod_spotfire_refined                                                                                                       | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Table:                            | tbl_bin_stat_orc                                                                                                            | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| CreateTime:                       | Mon Feb 11 13:02:12 CET 2019                                                                                                | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| LastAccessTime:                   | UNKNOWN                                                                                                                     | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Protect Mode:                     | None                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Location:                         | hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840  | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Partition Parameters:             | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                                   | COLUMN_STATS_ACCURATE                                                                                                       | {\"COLUMN_STATS\":{\"lot_id\":\"true\",\"wafer_id\":\"true\",\"flow_id\":\"true\",\"start_t\":\"true\",\"finish_t\":\"true\",\"hbin_number\":\"true\",\"hbin_name\":\"true\",\"sbin_number\":\"true\",\"sbin_name\":\"true\",\"param_id\":\"true\",\"param_name\":\"true\",\"param_unit\":\"true\",\"param_low_limit\":\"true\",\"param_high_limit\":\"true\",\"nb_dies_tested\":\"true\",\"nb_dies_failed\":\"true\",\"nb_dies_good\":\"true\",\"ingestion_date\":\"true\"}}  |
|                                   | numFiles                                                                                                                    | 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
|                                   | numRows                                                                                                                     | 109795564                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                                   | rawDataSize                                                                                                                 | 116612625917                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
|                                   | totalSize                                                                                                                   | 988129840                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                                   | transient_lastDdlTime                                                                                                       | 1554300803                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|                                   | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| # Storage Information             | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| SerDe Library:                    | org.apache.hadoop.hive.ql.io.orc.OrcSerde                                                                                   | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| InputFormat:                      | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                                                                             | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| OutputFormat:                     | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                                                                            | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Compressed:                       | No                                                                                                                          | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Num Buckets:                      | -1                                                                                                                          | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Bucket Columns:                   | []                                                                                                                          | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Sort Columns:                     | []                                                                                                                          | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Storage Desc Params:              | NULL                                                                                                                        | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                                   | serialization.format                                                                                                        | 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+-----------------------------------+-----------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
describe extended prod_spotfire_refined.tbl_bin_stat_orc partition(fab = "C2WF", lot_partition="Q840");

| Detailed Partition Information  | Partition(values:[C2WF, Q840], dbName:prod_spotfire_refined, tableName:tbl_bin_stat_orc, createTime:1549886532, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:lot_id, type:string, comment:null), FieldSchema(name:wafer_id, type:string, comment:null), FieldSchema(name:flow_id, type:string, comment:null), FieldSchema(name:start_t, type:string, comment:null), FieldSchema(name:finish_t, type:string, comment:null), FieldSchema(name:hbin_number, type:int, comment:null), FieldSchema(name:hbin_name, type:string, comment:null), FieldSchema(name:sbin_number, type:int, comment:null), FieldSchema(name:sbin_name, type:string, comment:null), FieldSchema(name:param_id, type:string, comment:null), FieldSchema(name:param_name, type:string, comment:null), FieldSchema(name:param_unit, type:string, comment:null), FieldSchema(name:param_low_limit, type:float, comment:null), FieldSchema(name:param_high_limit, type:float, comment:null), FieldSchema(name:nb_dies_tested, type:int, comment:null), FieldSchema(name:nb_dies_failed, type:int, comment:null), FieldSchema(name:nb_dies_good, type:int, comment:null), FieldSchema(name:ingestion_date, type:string, comment:null), FieldSchema(name:fab, type:string, comment:null), FieldSchema(name:lot_partition, type:string, comment:null)], location:hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q840, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{totalSize=988129840, numRows=109795564, rawDataSize=116612625917, 
COLUMN_STATS_ACCURATE={"COLUMN_STATS":{"lot_id":"true","wafer_id":"true","flow_id":"true","start_t":"true","finish_t":"true","hbin_number":"true","hbin_name":"true","sbin_number":"true","sbin_name":"true","param_id":"true","param_name":"true","param_unit":"true","param_low_limit":"true","param_high_limit":"true","nb_dies_tested":"true","nb_dies_failed":"true","nb_dies_good":"true","ingestion_date":"true"}}, numFiles=5, transient_lastDdlTime=1554300803})  |       

Remark:
Note that analyzing ALL COLUMNS changed the COLUMN_STATS_ACCURATE value, which moved from {"BASIC_STATS":"true"} to {"COLUMN_STATS": xxx}.

But it did not change the poor response time of Query02…

Fetch task performing worse than MapReduce ?

From the official documentation and many web sites, simple queries can be executed as a fetch task instead of a traditional MapReduce job to minimize latency. A fetch task is a direct HDFS access, like an "hdfs dfs -get" or "hdfs dfs -cat" command, so on paper the overhead of creating Map and Reduce jobs is gone. "Simple queries" means single-source queries without any subquery and without aggregations, DISTINCTs, lateral views or joins.

You can change parameters only for your session using:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion;
+----------------------------------+--+
|               set                |
+----------------------------------+--+
| hive.fetch.task.conversion=more  |
+----------------------------------+--+
1 row selected (0.007 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion.threshold;
+--------------------------------------------------+--+
|                       set                        |
+--------------------------------------------------+--+
| hive.fetch.task.conversion.threshold=1073741824  |
+--------------------------------------------------+--+
1 row selected (0.007 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion.threshold=524288000;
No rows affected (0.003 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion.threshold;
+-------------------------------------------------+--+
|                       set                       |
+-------------------------------------------------+--+
| hive.fetch.task.conversion.threshold=524288000  |
+-------------------------------------------------+--+
1 row selected (0.006 seconds)


0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion=none;
No rows affected (0.004 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion;
+----------------------------------+--+
|               set                |
+----------------------------------+--+
| hive.fetch.task.conversion=none  |
+----------------------------------+--+

After changing hive.fetch.task.conversion to none, which disables the Hive fetch task, the query moves to a traditional MapReduce job:

INFO  : Status: Running (Executing on YARN cluster with App id application_1551777498072_117912)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED     14         14        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 20.19 s
--------------------------------------------------------------------------------

And more importantly, query execution time moved from around 330 seconds to 28 seconds: 20 seconds of execution, the rest being network transfer.

Fetch tasks are supposed to be much faster than MapReduce jobs but we clearly see that it is not the case at all on our cluster… Bug ?
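The per-session toggling shown above can also be scripted; below is a hedged sketch using PyHive (used later in this post) to time the same query under both settings. The host name, account and query text are placeholders, not the real ones:

```python
import time

def time_query(cursor, sql, conversion):
    # Set hive.fetch.task.conversion for this session only, then time the query.
    cursor.execute("set hive.fetch.task.conversion=%s" % conversion)
    start = time.monotonic()
    cursor.execute(sql)
    rows = cursor.fetchall()
    return len(rows), time.monotonic() - start

# Against a real cluster it would be used like this (placeholder names):
#   from pyhive import hive
#   cursor = hive.connect(host='hiveserver201.domain.com', port=10000,
#                         username='your_account').cursor()
#   for mode in ('more', 'none'):  # 'more' allows fetch task, 'none' forces a job
#       print(mode, time_query(cursor, "select ... from tbl_bin_stat_orc", mode))
```

Running this twice in the same session makes the comparison fair: same connection, same query, only the conversion mode changes.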

To go further

To debug the Hive fetch, with the help of a consultant working for us, we exported an environment variable:

hdfs@client_node:~$ export HADOOP_ROOT_LOGGER=debug,console

Then we simulated a Hive fetch on a datafile of one partition, with a size of less than 1 GB, using:

hdfs@client_node:~$ hdfs dfs -cat /apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q846/000000_0 > /tmp/test_cq
19/04/08 12:17:38 DEBUG util.Shell: setsid exited with exit code 0
19/04/08 12:17:38 DEBUG conf.Configuration: parsing URL jar:file:/usr/hdp/2.6.4.0-91/hadoop/hadoop-common-2.7.3.2.6.4.0-91.jar!/core-default.xml
19/04/08 12:17:38 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@7a36aefa
19/04/08 12:17:38 DEBUG conf.Configuration: parsing URL file:/etc/hadoop/2.6.4.0-91/0/core-site.xml
19/04/08 12:17:38 DEBUG conf.Configuration: parsing input stream java.io.BufferedInputStream@58c1c010
19/04/08 12:17:38 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
19/04/08 12:17:38 DEBUG util.KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
19/04/08 12:17:38 DEBUG security.Groups:  Creating new Groups object
19/04/08 12:17:38 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
19/04/08 12:17:38 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
19/04/08 12:17:38 DEBUG security.JniBasedUnixGroupsMapping: Using JniBasedUnixGroupsMapping for Group resolution
19/04/08 12:17:38 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
19/04/08 12:17:38 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
19/04/08 12:17:38 DEBUG security.UserGroupInformation: hadoop login
19/04/08 12:17:38 DEBUG security.UserGroupInformation: hadoop login commit
19/04/08 12:17:38 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hdfs
19/04/08 12:17:38 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: hdfs" with name hdfs
19/04/08 12:17:38 DEBUG security.UserGroupInformation: User entry: "hdfs"
19/04/08 12:17:38 DEBUG security.UserGroupInformation: Assuming keytab is managed externally since logged in from subject.
19/04/08 12:17:38 DEBUG security.UserGroupInformation: UGI loginUser:hdfs (auth:SIMPLE)
19/04/08 12:17:38 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
19/04/08 12:17:38 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = true
19/04/08 12:17:38 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
19/04/08 12:17:38 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
19/04/08 12:17:39 DEBUG hdfs.HAUtil: No HA service delegation token found for logical URI hdfs://ManufacturingDataLakeHdfs
19/04/08 12:17:39 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
19/04/08 12:17:39 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = true
19/04/08 12:17:39 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
19/04/08 12:17:39 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/lib/hadoop-hdfs/dn_socket
19/04/08 12:17:39 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
19/04/08 12:17:39 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@238d68ff
19/04/08 12:17:39 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@70e38ce1
19/04/08 12:17:39 DEBUG unix.DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@3dfc1841: starting with interruptCheckPeriodMs = 60000
19/04/08 12:17:39 DEBUG shortcircuit.DomainSocketFactory: The short-circuit local reads feature is enabled.
19/04/08 12:17:39 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
19/04/08 12:17:39 DEBUG ipc.Client: The ping interval is 60000 ms.
19/04/08 12:17:39 DEBUG ipc.Client: Connecting to master_namenode.domain.com/10.75.144.1:8020
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs: starting, having connections 1
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs sending #0
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs got value #0
19/04/08 12:17:39 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 53ms
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs sending #1
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs got value #1
19/04/08 12:17:39 DEBUG ipc.ProtobufRpcEngine: Call: getBlockLocations took 1ms
19/04/08 12:17:39 DEBUG azure.NativeAzureFileSystem: finalize() called.
19/04/08 12:17:39 DEBUG azure.NativeAzureFileSystem: finalize() called.
19/04/08 12:17:39 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{
  fileLength=281939725
  underConstruction=false
  blocks=[LocatedBlock{BP-1711156358-10.75.144.1-1519036486930:blk_1212723761_139044745; getBlockSize()=268435456; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.75.144.15:50010,DS-7bf7ad15-da5c-454b-adb5-67f1ea89e0e6,DISK], DatanodeInfoWithStorage[10.75.144.14:50010,DS-ee0aca78-2c5c-48bf-aeb8-3e4ed5823be1,DISK], DatanodeInfoWithStorage[10.75.144.11:50010,DS-92c5ac9c-6a51-464a-9925-8f0cc06d3f3f,DISK]]}, LocatedBlock{BP-1711156358-10.75.144.1-1519036486930:blk_1212733643_139054627; getBlockSize()=13504269; corrupt=false; offset=268435456; locs=[DatanodeInfoWithStorage[10.75.144.15:50010,DS-291244e5-8577-4095-b200-3233661417db,DISK], DatanodeInfoWithStorage[10.75.144.11:50010,DS-bb818fac-2bb1-4ccf-9a17-a9afbecf5f6f,DISK], DatanodeInfoWithStorage[10.75.144.14:50010,DS-96c13ead-02bb-4191-b6c9-02c2f632ebf0,DISK]]}]
  lastLocatedBlock=LocatedBlock{BP-1711156358-10.75.144.1-1519036486930:blk_1212733643_139054627; getBlockSize()=13504269; corrupt=false; offset=268435456; locs=[DatanodeInfoWithStorage[10.75.144.14:50010,DS-96c13ead-02bb-4191-b6c9-02c2f632ebf0,DISK], DatanodeInfoWithStorage[10.75.144.15:50010,DS-291244e5-8577-4095-b200-3233661417db,DISK], DatanodeInfoWithStorage[10.75.144.11:50010,DS-bb818fac-2bb1-4ccf-9a17-a9afbecf5f6f,DISK]]}
  isLastBlockComplete=true}
19/04/08 12:17:39 DEBUG hdfs.DFSClient: Connecting to datanode 10.75.144.15:50010
19/04/08 12:17:39 DEBUG util.PerformanceAdvisory: BlockReaderFactory(fileName=/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q846/000000_0, block=BP-1711156358-10.75.144.1-1519036486930:blk_1212723761_139044745): PathInfo{path=, state=UNUSABLE} is not usable for short circuit; giving up on BlockReaderLocal.
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs sending #2
19/04/08 12:17:39 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs got value #2
19/04/08 12:17:39 DEBUG ipc.ProtobufRpcEngine: Call: getServerDefaults took 1ms
19/04/08 12:17:39 DEBUG sasl.SaslDataTransferClient: SASL client skipping handshake in unsecured configuration for addr = /10.75.144.15, datanodeId = DatanodeInfoWithStorage[10.75.144.15:50010,DS-7bf7ad15-da5c-454b-adb5-67f1ea89e0e6,DISK]
19/04/08 12:17:41 DEBUG hdfs.DFSClient: Connecting to datanode 10.75.144.15:50010
19/04/08 12:17:41 DEBUG util.PerformanceAdvisory: BlockReaderFactory(fileName=/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q846/000000_0, block=BP-1711156358-10.75.144.1-1519036486930:blk_1212733643_139054627): PathInfo{path=, state=UNUSABLE} is not usable for short circuit; giving up on BlockReaderLocal.
19/04/08 12:17:41 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@70e38ce1
19/04/08 12:17:41 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@70e38ce1
19/04/08 12:17:41 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@70e38ce1
19/04/08 12:17:41 DEBUG ipc.Client: Stopping client
19/04/08 12:17:41 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs: closed
19/04/08 12:17:41 DEBUG ipc.Client: IPC Client (1605851606) connection to master_namenode.domain.com/10.75.144.1:8020 from hdfs: stopped, remaining connections 0
19/04/08 12:17:41 DEBUG util.ShutdownHookManager: ShutdownHookManger complete shutdown.

And we have seen that dfs.client.read.shortcircuit was correctly used…

Questioning a few Cloudera engineers, they suggested increasing HiveServer2 memory since, in the case of a Hive fetch, the ORC decompression and filtering is done directly by the Hive server. Using Grafana from Ambari we saw some spikes in HiveServer2 memory and so decided to increase its memory from 12GB to 32GB. This was also possible because our server has plenty of memory. When I talk of ORC decompression it is because we have chosen these parameters:

hive_fetch01

We retried the query with absolutely no improvement… At that time we also investigated the HiveServer2 log file and found:

2019-04-09 15:00:34,183 INFO  [HiveServer2-Handler-Pool: Thread-112]: orc.ReaderImpl (ReaderImpl.java:rowsOptions(478)) - Reading ORC rows from hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_spotfire_refined.db/tbl_bin_stat_orc/fab=C2WF/lot_partition=Q846/000002_0 with {include: [true, true, true, true, true, true, true, false, true, false, true, true, true, true, true, true, true, false, false], offset: 0, length: 257464687, sarg: leaf-0 = (IN lot_partition Q846), leaf-1 = (IN lot_id Q840401), leaf-2 = (IN wafer_id Q840401-01E6 Q840401-02E1 Q840401-03D4 Q840401-04C7 Q840401-05C2 Q840401-06B5 Q840401-07B0 Q840401-08A3 Q840401-09H1 Q840401-10A3 Q840401-11H1 Q840401-12G4 Q840401-13F7 Q840401-14F2 Q840401-15E5 Q840401-16E0 Q840401-17D3 Q840401-18C6 Q840401-19C1 Q840401-20C6 Q840401-21C1 Q840401-22B4 Q840401-23A7 Q840401-24A2 Q840401-25H0), leaf-3 = (IN flow_id EWS1), leaf-4 = (IN start_t 2018.12.27-10:42:54 2018.12.27-12:01:57 2018.12.27-13:18:47 2018.12.27-14:36:31 2018.12.27-15:55:57 2018.12.27-17:13:42 2018.12.27-18:31:34 2018.12.27-19:49:27 2018.12.27-21:05:30 2018.12.27-22:23:36 2018.12.27-23:40:11 2018.12.28-00:56:40 2018.12.28-02:15:51 2018.12.28-03:41:23 2018.12.28-04:58:02 2018.12.28-06:16:11 2018.12.28-07:34:40 2018.12.28-08:55:29 2018.12.28-10:13:25 2018.12.28-11:30:34 2018.12.28-12:48:08 2018.12.28-14:06:12 2018.12.28-15:23:00 2018.12.28-16:39:50 2018.12.28-17:56:57), leaf-5 = (IN finish_t 2018.12.27-12:00:40 2018.12.27-13:17:30 2018.12.27-14:35:13 2018.12.27-15:54:39 2018.12.27-17:12:26 2018.12.27-18:30:16 2018.12.27-19:48:12 2018.12.27-21:04:15 2018.12.27-22:22:19 2018.12.27-23:38:53 2018.12.28-00:55:24 2018.12.28-02:14:34 2018.12.28-03:32:41 2018.12.28-04:56:43 2018.12.28-06:14:52 2018.12.28-07:33:21 2018.12.28-08:54:12 2018.12.28-10:12:06 2018.12.28-11:29:17 2018.12.28-12:46:50 2018.12.28-14:04:55 2018.12.28-15:21:43 2018.12.28-16:38:30 2018.12.28-17:55:38 2018.12.28-19:13:18), leaf-6 = (IN hbin_number 1), leaf-7 = (IN sbin_number 1), expr = 
(and leaf-0 leaf-1 leaf-2 leaf-3 leaf-4 leaf-5 leaf-6 leaf-7), columns: ['null', 'lot_id', 'wafer_id', 'flow_id', 'start_t', 'finish_t', 'hbin_number', 'null', 'sbin_number', 'null', 'param_id', 'param_name', 'param_unit', 'param_low_limit', 'param_high_limit', 'nb_dies_tested', 'nb_dies_failed', 'null', 'null']}
2019-04-09 15:02:24,610 INFO  [HiveServer2-Handler-Pool: Thread-112]: orc.OrcUtils (OrcUtils.java:getDesiredRowTypeDescr(810)) - Using schema evolution configuration variables schema.evolution.columns [lot_id, wafer_id, flow_id, start_t, finish_t, hbin_number, hbin_name, sbin_number, sbin_name, param_id, param_name, param_unit, param_low_limit, param_high_limit, nb_dies_tested, nb_dies_failed, nb_dies_good, ingestion_date] / schema.evolution.columns.types [string, string, string, string, string, int, string, int, string, string, string, string, float, float, int, int, int, string] (isAcid false)

As you can see in the log above, this file is only around 230MB in size, yet it took almost 2 minutes (!!!) to read, decompress and filter this small file. Clearly there is something wrong with the Hive fetch task, so we decided to deactivate it by setting the below parameter cluster wide:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.fetch.task.conversion;
+----------------------------------+--+
|               set                |
+----------------------------------+--+
| hive.fetch.task.conversion=none  |
+----------------------------------+--+
1 row selected (0.09 seconds)
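As a quick sanity check, the effective read rate of the fetch task can be derived from the ORC reader log shown earlier; the byte count and timestamps below are copied from that log line:

```python
# Values copied from the HiveServer2 ORC reader log above
length_bytes = 257464687                  # 'length' field of the ORC split
elapsed_s = (2 * 60 + 24.610) - 34.183    # 15:00:34,183 -> 15:02:24,610
rate_mib_s = length_bytes / elapsed_s / (1024 * 1024)
print(round(elapsed_s, 1), 's,', round(rate_mib_s, 1), 'MiB/s')  # 110.4 s, 2.2 MiB/s
```

A couple of MiB/s for a sequential ORC read is far below what a plain "hdfs dfs -cat" delivers (the direct read earlier took about 2 seconds), which matches the suspicion that the bottleneck is HiveServer2 itself, not HDFS.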

In the end both queries now run in 25-30 seconds and we will monitor for any other negative side effects…


Fetch Zookeeper information from Python with Kazoo to connect Hive https://blog.yannickjaquier.com/hadoop/fetch-zookeeper-information-from-python-with-kazoo-to-connect-hive.html https://blog.yannickjaquier.com/hadoop/fetch-zookeeper-information-from-python-with-kazoo-to-connect-hive.html#respond Fri, 25 Oct 2019 07:53:32 +0000 https://blog.yannickjaquier.com/?p=4802 Preamble Straight from the beginning when I have landed to our Hadoop project (HortonWorks) I have seen all our Python scripts directly connecting to our Hive server, with PyHive, bypassing Zookeeper coordination. I also noticed that every Beeline client connection where well using (obviously) the HiveServer2 JDBC URL. I have left this point open for […]


Table of contents

Preamble

Right from the beginning, when I landed on our Hadoop project (HortonWorks), I saw that all our Python scripts were connecting directly to our Hive server with PyHive, bypassing Zookeeper coordination. I also noticed that every Beeline client connection was (obviously) using the HiveServer2 JDBC URL. I left this point open for later, until we decided to improve our High Availability (HA) by making a few components run on multiple servers. And when it came to our HiveServer2, which is now running on two edge nodes of our Hadoop cluster, I decided to dig out this thread…

The good way of connecting to HiveServer2 is to first get the current status and configuration from Zookeeper and then use this information in PyHive (for example) to make a Hive connection. Zookeeper acts here as a configuration keeper as well as an availability watcher, meaning Zookeeper will not return the information of a dead HiveServer2.

Digging a bit on the Internet, I quickly came to the obvious conclusion that the Kazoo Python package was a must-try !

This blog post has been written using kazoo 2.5.0, Python 3.7.3. My Hadoop cluster is HortonWorks Data Platform (HDP) 2.6.4. All developed scripts are running on a Fedora 30 virtual machine.

Kazoo development environment installation

Anaconda is the preferred Python 3.7 environment manager, so I started by downloading and installing it for Python 3.7 on my Fedora virtual machine. The release I have installed is:

[root@fedora1 ~]# anaconda --version
anaconda 30.25.6

It also gives you access to conda, the command line environment manager:

(base) [root@fedora1 ~]# conda --version
conda 4.6.14
(base) [root@fedora1 ~]# conda info

     active environment : base
    active env location : /opt/anaconda3
            shell level : 1
       user config file : /root/.condarc
 populated config files : /root/.condarc
          conda version : 4.6.14
    conda-build version : 3.17.8
         python version : 3.7.3.final.0
       base environment : /opt/anaconda3  (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/linux-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/anaconda3/pkgs
                          /root/.conda/pkgs
       envs directories : /opt/anaconda3/envs
                          /root/.conda/envs
               platform : linux-64
             user-agent : conda/4.6.14 requests/2.21.0 CPython/3.7.3 Linux/5.0.11-300.fc30.x86_64 fedora/30 glibc/2.29
                UID:GID : 0:0
             netrc file : None
           offline mode : False

As I am behind a corporate proxy I had to customize a little bit my .condarc profile:

(base) [root@fedora1 ~]# cat  .condarc
ssl_verify: False
proxy_servers:
    http: http://proxy_user:proxy_password@proxy_server:proxy_port
    https: https://proxy_user:proxy_password@proxy_server:proxy_port

Remark
I have also been obliged to set ssl_verify to false to avoid certificate issues caused by my proxy server…

I created a kazoo conda environment with:

(base) [root@fedora1 ~]# conda create -n kazoo
Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /opt/anaconda3/envs/kazoo



Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate kazoo
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Activate it with:

[root@fedora1 ~]# conda activate kazoo
(kazoo) [root@fedora1 ~]#

As you have seen, I am working as root, which is a very bad idea, so it is better to use a normal account (and keep root for more important tasks, see below). To do so, initialize Conda with (my shell is obviously bash):

[yjaquier@fedora1 ~]$ /opt/anaconda3/bin/conda init bash
no change     /opt/anaconda3/condabin/conda
no change     /opt/anaconda3/bin/conda
no change     /opt/anaconda3/bin/conda-env
no change     /opt/anaconda3/bin/activate
no change     /opt/anaconda3/bin/deactivate
no change     /opt/anaconda3/etc/profile.d/conda.sh
no change     /opt/anaconda3/etc/fish/conf.d/conda.fish
no change     /opt/anaconda3/shell/condabin/Conda.psm1
no change     /opt/anaconda3/shell/condabin/conda-hook.ps1
no change     /opt/anaconda3/lib/python3.7/site-packages/xonsh/conda.xsh
no change     /opt/anaconda3/etc/profile.d/conda.csh
modified      /home/yjaquier/.bashrc

==> For changes to take effect, close and re-open your current shell. <==

Log off and log on again, then use the newly created environment with:

(base) [yjaquier@fedora1 ~]$ conda activate kazoo
(kazoo) [yjaquier@fedora1 ~]$

You also need to configure your conda environment (.condarc) the same way as above...

For package management and search your reference will be https://anaconda.org. Here is an example of a search for Python (direct link is https://anaconda.org/search?q=python):

kazoo01

If you open the most downloaded Python package (a good practice in my opinion) you will find the command to install it:

conda install -c conda-forge python

But then you cannot modify the environment with your own account, and packages must be added by root, which is, in my opinion, a very good practice:

EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
  environment location: /opt/anaconda3/envs/kazoo
  uid: 1000
  gid: 100

So executing with root account (in kazoo conda environment):

(kazoo) [root@fedora1 ~]# conda install -c conda-forge python
Collecting package metadata: done
Solving environment: done

## Package Plan ##

  environment location: /opt/anaconda3/envs/kazoo

  added / updated specs:
    - python


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    bzip2-1.0.6                |    h14c3975_1002         415 KB  conda-forge
    certifi-2019.3.9           |           py37_0         149 KB  conda-forge
    pip-19.1                   |           py37_0         1.8 MB  conda-forge
    python-3.7.3               |       h5b0a415_0        35.7 MB  conda-forge
    setuptools-41.0.1          |           py37_0         616 KB  conda-forge
    wheel-0.33.1               |           py37_0          34 KB  conda-forge
    ------------------------------------------------------------
                                           Total:        38.7 MB

The following NEW packages will be INSTALLED:

  bzip2              conda-forge/linux-64::bzip2-1.0.6-h14c3975_1002
  ca-certificates    conda-forge/linux-64::ca-certificates-2019.3.9-hecc5488_0
  certifi            conda-forge/linux-64::certifi-2019.3.9-py37_0
  libffi             conda-forge/linux-64::libffi-3.2.1-he1b5a44_1006
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-8.2.0-hdf63c60_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-8.2.0-hdf63c60_1
  ncurses            conda-forge/linux-64::ncurses-6.1-hf484d3e_1002
  openssl            conda-forge/linux-64::openssl-1.1.1b-h14c3975_1
  pip                conda-forge/linux-64::pip-19.1-py37_0
  python             conda-forge/linux-64::python-3.7.3-h5b0a415_0
  readline           conda-forge/linux-64::readline-7.0-hf8c457e_1001
  setuptools         conda-forge/linux-64::setuptools-41.0.1-py37_0
  sqlite             conda-forge/linux-64::sqlite-3.26.0-h67949de_1001
  tk                 conda-forge/linux-64::tk-8.6.9-h84994c4_1001
  wheel              conda-forge/linux-64::wheel-0.33.1-py37_0
  xz                 conda-forge/linux-64::xz-5.2.4-h14c3975_1001
  zlib               conda-forge/linux-64::zlib-1.2.11-h14c3975_1004


Proceed ([y]/n)? y


Downloading and Extracting Packages
python-3.7.3         | 35.7 MB   | #################################################################################################################################################################### | 100%
certifi-2019.3.9     | 149 KB    | #################################################################################################################################################################### | 100%
wheel-0.33.1         | 34 KB     | #################################################################################################################################################################### | 100%
setuptools-41.0.1    | 616 KB    | #################################################################################################################################################################### | 100%
pip-19.1             | 1.8 MB    | #################################################################################################################################################################### | 100%
bzip2-1.0.6          | 415 KB    | #################################################################################################################################################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

I have also installed Kazoo using:

(kazoo) [root@fedora1 ~]# conda install -c conda-forge kazoo

And I have also installed PyHive to connect to Hive, and Pandas to manipulate data structures (in the end I did not use it):

conda install -c anaconda pyhive
conda install -c conda-forge pandas

Python source code

The small piece of source code I have written (file name kazoo_testing.py below) mainly comes, for the Kazoo part, from the official documentation, so do not hesitate to visit it:

from kazoo.client import KazooClient,KazooState

def my_listener(state):
  if state == KazooState.LOST:
    # Register somewhere that the session was lost
    print('Connection lost !!')
  elif state == KazooState.SUSPENDED:
    # Handle being disconnected from Zookeeper
    print('Connection suspended !!')
  else:
    # Handle being connected/reconnected to Zookeeper
    print('Connected !!')

zk = KazooClient(hosts='zookeeper_server01.domain.com:2181,zookeeper_server02.domain.com:2181,zookeeper_server03.domain.com:2181')

zk.add_listener(my_listener)
#zk.start()
zk.start(timeout=5)

# Display Zookeeper information
print(zk.get_children('/'))

#print(zk.get_children('hiveserver2')[0])
print(zk.get_children(path='hiveserver2'))

for hiveserver2 in zk.get_children(path='hiveserver2'):
  # Each child znode is named like 'serverUri=host:port;version=...;sequence=...'
  array01=hiveserver2.split(';')[0].split('=')[1].split(':')
  hive_hostname=array01[0]
  hive_port=array01[1]
  print('Hive hostname: ' + hive_hostname)
  print('Hive port: ' + hive_port)

# Close the Zookeeper session cleanly
zk.stop()

The list of Zookeeper servers can be taken from the Hive Ambari page where you can copy/paste the so-called HIVESERVER2 JDBC URL.
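The split chain in the loop above can be isolated in a small helper function; here is a sketch (the helper name is mine), using the znode format shown in the test output further down:

```python
def parse_hiveserver2_znode(znode):
    """Extract (host, port) from a HiveServer2 znode name such as
    'serverUri=host:10000;version=1.2.1000...;sequence=0000000042'."""
    # Turn 'key=value;key=value;...' into a dict, then split host:port
    fields = dict(part.split('=', 1) for part in znode.split(';'))
    host, port = fields['serverUri'].split(':')
    return host, int(port)

znode = 'serverUri=hiveserver201.domain.com:10000;version=1.2.1000.2.6.4.0-91;sequence=0000000042'
print(parse_hiveserver2_znode(znode))  # ('hiveserver201.domain.com', 10000)
```

Parsing into a dict first also keeps the code working if the order of the version and sequence fields ever changes.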

The above source code does not include the PyHive connection, but once you have the Hive host name and port you can easily connect with something like (the configuration parameter is optional):

from pyhive import hive
import pandas as pd

# Hive connection 
connection=hive.connect(
    host = hive_hostname, 
    port = hive_port,
    configuration={'tez.queue.name': 'your_yarn_queue_name'}, 
    username = "your_account"
    )

pandas01=pd.read_sql("select * from ...", connection)

print(pandas01.sample(10))

Kazoo testing

I am lucky enough to have two HiveServer2 instances configured in my Hortonworks Hadoop cluster, which is, by the way, strongly suggested if you aim to be Highly Available (HA). When the two HiveServer2 processes are up and running I get the below result:

(kazoo) [yjaquier@fedora1 ~]$ python kazoo_testing.py
Connected !!
['registry', 'cluster', 'brokers', 'storm', 'zookeeper', 'infra-solr', 'hbase-unsecure', 'tracers', 'hadoop-ha', 'admin', 'isr_change_notification',
 'accumulo', 'logsearch', 'controller_epoch', 'hiveserver2', 'druid', 'rmstore', 'ambari-metrics-cluster', 'consumers', 'config']
['serverUri=hiveserver201.domain.com:10000;version=1.2.1000.2.6.4.0-91;sequence=0000000042', 'serverUri=hiveserver202.domain.com:10000;version=1.2.1000.2.6.4.0-91;sequence=0000000043']
Hive hostname: hiveserver201.domain.com
Hive port: 10000
Hive hostname: hiveserver202.domain.com
Hive port: 10000

If I stop the first HiveServer2 and wait a while, presumably the time for Zookeeper to detect and propagate the information, I finally get:

(kazoo) [yjaquier@fedora1 ~]$ python kazoo_testing.py
Connected !!
['registry', 'cluster', 'brokers', 'storm', 'zookeeper', 'infra-solr', 'hbase-unsecure', 'tracers', 'hadoop-ha', 'admin', 'isr_change_notification',
 'accumulo', 'logsearch', 'controller_epoch', 'hiveserver2', 'druid', 'rmstore', 'ambari-metrics-cluster', 'consumers', 'config']
['serverUri=hiveserver202.domain.com:10000;version=1.2.1000.2.6.4.0-91;sequence=0000000042']
Hive hostname: hiveserver202.domain.com
Hive port: 10000
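Once the list of registered instances is known, a client can mimic the behaviour of the Hive JDBC driver with serviceDiscoveryMode=zooKeeper, which, as far as I understand, picks a random instance from the list. A minimal sketch reusing the znode format shown above (the helper name is mine):

```python
import random

def pick_hiveserver2(znodes):
    """Pick one HiveServer2 (hostname, port) at random from the znode list,
    approximating the Hive JDBC driver's Zookeeper service discovery."""
    if not znodes:
        raise RuntimeError('No HiveServer2 instance registered in Zookeeper')
    znode = random.choice(znodes)
    host_port = znode.split(';')[0].split('=')[1]  # 'serverUri=host:port;...' -> 'host:port'
    hostname, port = host_port.split(':')
    return hostname, int(port)

# Znode names as returned by zk.get_children(path='hiveserver2')
znodes = [
    'serverUri=hiveserver201.domain.com:10000;version=1.2.1000.2.6.4.0-91;sequence=0000000042',
    'serverUri=hiveserver202.domain.com:10000;version=1.2.1000.2.6.4.0-91;sequence=0000000043',
]
print(pick_hiveserver2(znodes))
```

If the chosen instance refuses the connection, a caller can simply remove it from the list and pick again.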

References

The post Fetch Zookeeper information from Python with Kazoo to connect Hive appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/fetch-zookeeper-information-from-python-with-kazoo-to-connect-hive.html/feed 0
Hadoop backup: what parts to backup and how to do it ? https://blog.yannickjaquier.com/hadoop/hadoop-backup-what-parts-to-backup-and-how-to-do-it.html https://blog.yannickjaquier.com/hadoop/hadoop-backup-what-parts-to-backup-and-how-to-do-it.html#respond Fri, 27 Sep 2019 08:09:45 +0000 https://blog.yannickjaquier.com/?p=4617 Preamble Hadoop backup, wide and highly important subject and most probably like me you have been surprised by poor availability of official documents and this is most probably why you have landed here trying to find a first answer ! Needless to say this blog post is far to be complete so please do not […]

The post Hadoop backup: what parts to backup and how to do it ? appeared first on IT World.

]]>

Table of contents

Preamble

hadoop_backup01

Hadoop backup is a wide and highly important subject, and most probably, like me, you have been surprised by the poor availability of official documents, which is most probably why you have landed here trying to find a first answer! Needless to say this blog post is far from complete, so please do not hesitate to submit a comment and I will enrich this document with great pleasure!

One of the main difficulties with Hadoop is its scale-out nature, which makes it hard to understand what is nice to backup and what is REALLY important to backup.

I have split the article in three parts:

  • First part is what you MUST backup to be able to survive a major issue.
  • Second part is what is not required to be backed up.
  • Third part is what is nice to backup.

Also, I repeat, I'm interested in any comment you might have that would help to enrich this document or correct any mistake…

Mandatory parts to backup

Configuration files

All files under /etc and /usr/hdp on edge nodes (so not on your worker nodes). In principle you could recreate them from scratch, but you surely do not want to lose multiple months or years of fine tuning, do you?

Theoretically all your configuration files will be saved when saving the Ambari server meta info, but if you have a corporate tool to backup your host OS it is worth including the two above directories, as it is sometimes much simpler to restore a single file from those tools…

Those edge nodes are:

  • Master nodes
  • Management nodes
  • Client nodes
  • Utilities node (Hive, …)
  • Analytics Nodes

In other words, all except worker nodes.

Ambari server meta info

[root@mgmtserver ~]# ambari-server backup /tmp/ambari-server-backup.zip
Using python  /usr/bin/python
Backing up Ambari File System state... *this will not backup the server database*
Backup requested.
Backup process initiated.
Creating zip file...
Zip file created at /tmp/ambari-server-backup.zip
Backup complete.
Ambari Server 'backup' completed successfully.
[root@mgmtserver ~]# ll /tmp/ambari-server-backup.zip
-rw-r--r-- 1 root root 2444590592 Dec  3 17:01 /tmp/ambari-server-backup.zip

To restore this backup in case of a big crash the command is:

[root@mgmtserver ~]# ambari-server restore /tmp/ambari-server-backup.zip

NameNode metadata

As they say in the NameNode documentation, this component is key:

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files in the file system, and tracks where across the cluster the file data is kept. It does not store the data of these files itself.

It is not an Oracle database, though, and a continuous backup is not possible:

Regardless of the solution, a full, up-to-date continuous backup of the namespace is not possible. Some of the most recent data is always lost. HDFS is not an Online Transaction Processing (OLTP) system. Most data can be easily recreated if you re-run Extract, Transform, Load (ETL) or processing jobs.

The always-working procedure to backup your NameNode is really simple:

[hdfs@namenode_primary ~]$ hdfs dfsadmin -saveNamespace
saveNamespace: Safe mode should be turned ON in order to create namespace image.
[hdfs@namenode_primary ~]$ hdfs dfsadmin -safemode enter
Safe mode is ON
[hdfs@namenode_primary ~]$ hdfs dfsadmin -safemode get
Safe mode is ON
[hdfs@namenode_primary ~]$ hdfs dfsadmin -saveNamespace
Save namespace successful
[hdfs@namenode_primary ~]$ hdfs dfsadmin -safemode leave
Safe mode is OFF
[hdfs@namenode_primary ~]$ hdfs dfsadmin -safemode get
Safe mode is OFF
[hdfs@namenode_primary ~]$ hdfs dfsadmin -fetchImage /tmp
19/01/07 12:57:10 INFO namenode.TransferFsImage: Opening connection to http://namenode_primary.domain.com:50070/imagetransfer?getimage=1&txid=latest
19/01/07 12:57:10 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
19/01/07 12:57:10 INFO namenode.TransferFsImage: Combined time for fsimage download and fsync to all disks took 0.04s. The fsimage download took 0.04s at 167097.56 KB/s. Synchronous (fsync) write to disk of /tmp/fsimage_0000000000002300573 took 0.00s.

Then you can put the file that has been copied to the /tmp directory in a safe place (tape, SAN, NFS, …). But this procedure has the unfortunate side effect of putting your entire cluster in read-only mode (safemode), so on a 24/7 production cluster this is surely not something you can accept…

All your running processes will end with something like:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot complete file /apps/hive/warehouse/database.db/table_orc/.hive-staging_hive_2019-02-12_06-12-16_976_7305679997226277861-21596/_task_tmp.-ext-10002/_tmp.000199_0. Name node is in safe mode.
It was turned on manually. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.

In the initial releases of Hadoop the NameNode was a Single Point Of Failure (SPOF) as you could only have what is called a secondary NameNode. The secondary NameNode handles an important CPU-intensive task called checkpointing. Checkpointing is the operation of combining the edits log files (edits_xx files) and the latest fsimage file to create an up-to-date HDFS filesystem metadata snapshot (fsimage_xxx file). But the secondary NameNode cannot be used as a failover of the primary NameNode, so in case of failure it can only be used to rebuild the primary NameNode, not to take its role.

In Hadoop 2.0 this limitation is gone: in High Availability (HA) mode you can have a standby NameNode that does the same job as the secondary NameNode and can also take the role of the primary NameNode with a simple switch.

If for any reason this checkpoint operation has not happened for a long time you will receive the scary NameNode Last Checkpoint Ambari alert:

hadoop_backup02

This alert will also trigger the below Ambari warning when you try to stop the NameNode process (when the NameNode restarts it reads the latest fsimage and re-applies to it all the edits log files generated since):

hadoop_backup03

Needless to say that having your NameNode service in High Availability (active/standby) is strongly suggested !

Whether you have the NameNode in HA or not, there is a list of important parameters to consider, with the values we have chosen (maybe I should decrease the checkpoint period value):

  • dfs.namenode.name.dir = /hadoop/hdfs
  • dfs.namenode.checkpoint.period = 21600 (in seconds i.e. 6 hours)
  • dfs.namenode.checkpoint.txns = 1000000
  • dfs.namenode.checkpoint.check.period = 60

But in this case, on your standby or secondary NameNode, every dfs.namenode.checkpoint.period seconds or every dfs.namenode.checkpoint.txns transactions, whichever is reached first, you will get a new checkpoint file, and the cool thing is that this latest checkpoint is copied back to your active NameNode. Below, the checkpoint at 07:08 is the periodic automatic checkpoint while the one at 06:15 is the one we have explicitly done with the hdfs dfsadmin -saveNamespace command.

On standby NameNode:

[root@namenode_standby ~]# ll -rt /hadoop/hdfs/current/fsimage*
-rw-r--r-- 1 hdfs hadoop 650179252 Feb 13 06:15 /hadoop/hdfs/current/fsimage_0000000000520456166
-rw-r--r-- 1 hdfs hadoop        62 Feb 13 06:15 /hadoop/hdfs/current/fsimage_0000000000520456166.md5
-rw-r--r-- 1 hdfs hadoop 650235574 Feb 13 07:08 /hadoop/hdfs/current/fsimage_0000000000520466841
-rw-r--r-- 1 hdfs hadoop        62 Feb 13 07:08 /hadoop/hdfs/current/fsimage_0000000000520466841.md5

On active Namenode:

[root@namenode_primary ~]# ll -rt /hadoop/hdfs/current/fsimage*
-rw-r--r-- 1 hdfs hadoop        62 Feb 13 06:15 /hadoop/hdfs/current/fsimage_0000000000520456198.md5
-rw-r--r-- 1 hdfs hadoop 650179470 Feb 13 06:15 /hadoop/hdfs/current/fsimage_0000000000520456198
-rw-r--r-- 1 hdfs hadoop 650235574 Feb 13 07:08 /hadoop/hdfs/current/fsimage_0000000000520466841
-rw-r--r-- 1 hdfs hadoop        62 Feb 13 07:08 /hadoop/hdfs/current/fsimage_0000000000520466841.md5
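The "whichever is reached first" checkpoint rule can be expressed as a tiny predicate; a sketch using the parameter values listed above (the function name and arguments are mine, for illustration):

```python
def checkpoint_due(seconds_since_last, txns_since_last,
                   period=21600, txns=1000000):
    """True when a new checkpoint should be taken: either
    dfs.namenode.checkpoint.period seconds have elapsed or
    dfs.namenode.checkpoint.txns transactions have accumulated,
    whichever comes first."""
    return seconds_since_last >= period or txns_since_last >= txns

print(checkpoint_due(7200, 1200000))  # transaction threshold reached first
print(checkpoint_due(7200, 5000))     # neither threshold reached yet
```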

So in a NameNode HA cluster you can simply copy the dfs.namenode.name.dir directory regularly to a safe place (tape, NFS, …) and you are not obliged to enter this impacting safemode.

If at some point you don't have Ambari and/or you want to script it, here are the commands to get your active and standby NameNode servers:

[hdfs@namenode_primary ~]$ hdfs getconf -confKey dfs.ha.namenodes.mycluster
nn1,nn2
[hdfs@namenode_primary ~]$ hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn1
namenode_standby.domain.com:8020
[hdfs@namenode_primary ~]$ hdfs getconf -confKey dfs.namenode.rpc-address.mycluster.nn2
namenode_primary.domain.com:8020
[hdfs@namenode_primary ~]$ hdfs haadmin -getServiceState nn1
standby
[hdfs@namenode_primary ~]$ hdfs haadmin -getServiceState nn2
active
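Putting the two commands together, a small helper can select the active NameNode from their parsed output; a sketch where the dictionaries stand in for the `hdfs getconf` / `hdfs haadmin` outputs shown above:

```python
def active_namenode(states, rpc_addresses):
    """Return the RPC address of the active NameNode.

    states maps NameNode ids to the `hdfs haadmin -getServiceState` result,
    rpc_addresses maps the same ids to the `hdfs getconf` host:port values.
    """
    for nn_id, state in states.items():
        if state == 'active':
            return rpc_addresses[nn_id]
    raise RuntimeError('No active NameNode found')

# Values taken from the command outputs above
states = {'nn1': 'standby', 'nn2': 'active'}
rpc = {'nn1': 'namenode_standby.domain.com:8020',
       'nn2': 'namenode_primary.domain.com:8020'}
print(active_namenode(states, rpc))
```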

Ambari repository database

Our Ambari repository database is a PostgreSQL one, if you have chosen MySQL refer to next chapter.

Backup with Point In Time Recovery (PITR) capability

As clearly explained in the documentation, there is a tool to do it called pg_basebackup. To use it you need to put your PostgreSQL instance in write-ahead log (WAL) archiving mode, the equivalent of MySQL binary logging or Oracle archive log mode. This is done by setting three parameters in the postgresql.conf file:

  • wal_level = replica
  • archive_mode = on
  • archive_command = ‘test ! -f /var/lib/pgsql/backups/%f && cp %p /var/lib/pgsql/backups/%f’

Remark:
The archive command that has been chosen is just an example that will copy WAL files to a backup directory that you obviously need to save to a secure place.

If not done you will end up with below error message:

[postgres@fedora1 ~]$ pg_basebackup --pgdata=/tmp/pgbackup01
pg_basebackup: could not get write-ahead log end position from server: ERROR:  could not open file "./.postgresql.conf.swp": Permission denied
pg_basebackup: removing data directory "/tmp/pgbackup01"

Once done and activated (restart required) you can make an online backup that can be used to perform PITR with:

[postgres@fedora1 ~]$ pg_basebackup --pgdata=/tmp/pgbackup01
[postgres@fedora1 ~]$ ll /tmp/pgbackup01
total 52
-rw------- 1 postgres postgres   206 Nov 30 18:02 backup_label
drwx------ 6 postgres postgres   120 Nov 30 18:02 base
-rw------- 1 postgres postgres    30 Nov 30 18:02 current_logfiles
drwx------ 2 postgres postgres  1220 Nov 30 18:02 global
drwx------ 2 postgres postgres    80 Nov 30 18:02 log
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_commit_ts
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_dynshmem
-rw------- 1 postgres postgres  4414 Nov 30 18:02 pg_hba.conf
-rw------- 1 postgres postgres  1636 Nov 30 18:02 pg_ident.conf
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_log
drwx------ 4 postgres postgres   100 Nov 30 18:02 pg_logical
drwx------ 4 postgres postgres    80 Nov 30 18:02 pg_multixact
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_notify
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_replslot
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_serial
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_snapshots
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_stat
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_stat_tmp
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_subtrans
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_tblspc
drwx------ 2 postgres postgres    40 Nov 30 18:02 pg_twophase
-rw------- 1 postgres postgres     3 Nov 30 18:02 PG_VERSION
drwx------ 3 postgres postgres    80 Nov 30 18:02 pg_wal
drwx------ 2 postgres postgres    60 Nov 30 18:02 pg_xact
-rw------- 1 postgres postgres    88 Nov 30 18:02 postgresql.auto.conf
-rw------- 1 postgres postgres 22848 Nov 30 18:02 postgresql.conf
[postgres@fedora1 pg_wal]$ ll /var/lib/pgsql/backups/
total 32772
-rw------- 1 postgres postgres 16777216 Nov 30 18:02 000000010000000000000002
-rw------- 1 postgres postgres 16777216 Nov 30 18:02 000000010000000000000003
-rw------- 1 postgres postgres      302 Nov 30 18:02 000000010000000000000003.00000060.backup
[postgres@fedora1 pg_wal]$ cat /var/lib/pgsql/backups/000000010000000000000003.00000060.backup
START WAL LOCATION: 0/3000060 (file 000000010000000000000003)
STOP WAL LOCATION: 0/3000130 (file 000000010000000000000003)
CHECKPOINT LOCATION: 0/3000098
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2018-11-30 18:02:03 CET
LABEL: pg_basebackup base backup
STOP TIME: 2018-11-30 18:02:03 CET
[postgres@fedora1 pg_wal]$ ll /var/lib/pgsql/data/pg_wal/
total 49156
-rw------- 1 postgres postgres 16777216 Nov 30 18:02 000000010000000000000002
-rw------- 1 postgres postgres 16777216 Nov 30 18:02 000000010000000000000003
-rw------- 1 postgres postgres      302 Nov 30 18:02 000000010000000000000003.00000060.backup
-rw------- 1 postgres postgres 16777216 Nov 30 18:02 000000010000000000000004
drwx------ 2 postgres postgres      133 Nov 30 18:02 archive_status
[postgres@fedora1 pg_wal]$ ll /var/lib/pgsql/data/pg_wal/archive_status/
total 0
-rw------- 1 postgres postgres 0 Nov 30 18:02 000000010000000000000002.done
-rw------- 1 postgres postgres 0 Nov 30 18:02 000000010000000000000003.00000060.backup.done
-rw------- 1 postgres postgres 0 Nov 30 18:02 000000010000000000000003.done

You can also directly generate TAR files with:

[postgres@fedora1 pg_wal]$ pg_basebackup --pgdata=/tmp/pgbackup02 --format=t
[postgres@fedora1 pg_wal]$ ll /tmp/pgbackup02
total 48128
-rw-r--r-- 1 postgres postgres 32500224 Nov 30 18:11 base.tar
-rw------- 1 postgres postgres 16778752 Nov 30 18:11 pg_wal.tar

Backup with no PITR capability

This method is obviously based on the creation of a dump file, using either pg_dump or pg_dumpall.

At this stage, either you do everything with the postgres Linux account, which is able to connect without a password thanks to the default pg_hba.conf file:

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             postgres                                     peer
# IPv4 local connections:
host    all             postgres             127.0.0.1/32            ident
# IPv6 local connections:
host    all             postgres             ::1/128                 ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     postgres                                     peer
host    replication     postgres             127.0.0.1/32            ident
host    replication     postgres             ::1/128                 ident

Or you set it up for another account with fewer privileges, for example the owner of the database you want to backup. I initially tried with PGPASSWORD but this apparently no longer works in later releases of PostgreSQL (10.6 is the release I used to test the feature):

[postgres@fedora1 ~]$ export PGPASSWORD='secure_password'
[postgres@fedora1 ~]$ echo $PGPASSWORD
secure_password
[postgres@fedora1 ~]$ psql --dbname=ambari --username=ambari --password
Password for user ambari:

Our Ambari repository runs an older release (9.2.23), but to prepare for the future it is better to move to a password file. A password file is a file called ~/.pgpass with the below structure:

hostname:port:database:username:password

I have created it like:

[postgres@fedora1 ~]$ ll /var/lib/pgsql/.pgpass
-rw-r--r-- 1 postgres postgres 37 Nov 30 15:12 /var/lib/pgsql/.pgpass
[postgres@fedora1 ~]$ cat /var/lib/pgsql/.pgpass
localhost:5432:ambari:ambari:secure_password

The file must have permissions 0600 or stricter or you will get:

[postgres@fedora1 ~]$ psql --dbname=ambari --username=ambari
WARNING: password file "/var/lib/pgsql/.pgpass" has group or world access; permissions should be u=rw (0600) or less
Password for user ambari:
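The permission requirement can be enforced when creating the file; a minimal Python sketch (the path and credentials are placeholders, not real values):

```python
import os
import stat
import tempfile

def write_pgpass(path, hostname, port, database, username, password):
    """Create a .pgpass style file with the 0600 permissions libpq requires."""
    with open(path, 'w') as f:
        f.write(f'{hostname}:{port}:{database}:{username}:{password}\n')
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # 0600: user read/write only
    return oct(stat.S_IMODE(os.stat(path).st_mode))

# Placeholder location and credentials, for illustration only
pgpass = os.path.join(tempfile.mkdtemp(), '.pgpass')
perms = write_pgpass(pgpass, 'localhost', 5432, 'ambari', 'ambari', 'secure_password')
print(perms)  # 0o600
```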

Then you can connect without specifying a password:

[postgres@fedora1 ~]$ psql --dbname=ambari --username=ambari
psql (10.6)
Type "help" for help.

ambari=>

All this to do a backup of all databases with:

[postgres@fedora1 ~]$ pg_dumpall --file=/tmp/pgbackup.sql
[postgres@fedora1 ~]$ ll /tmp/pgbackup.sql
-rw-r--r-- 1 postgres postgres 3768 Nov 30 16:55 /tmp/pgbackup.sql

Or just the Ambari one with:

[postgres@fedora1 ~]$ pg_dump --file=/tmp/pgbackup_ambari.sql ambari
[postgres@fedora1 ~]$ ll /tmp/pgbackup_ambari.sql
-rw-r--r-- 1 postgres postgres 1117 Nov 30 16:57 /tmp/pgbackup_ambari.sql

Hive repository database

Our Hive repository database is a MySQL one; if you have chosen PostgreSQL refer to the previous chapter.

Backup with Point In Time Recovery (PITR) capability

You must activate binary logging by setting the log-bin parameter in the my.cnf file with something like (see my GTID replication article: https://blog.yannickjaquier.com/mysql/mysql-replication-with-global-transaction-identifiers-gtid-hands-on.html):

log-bin = /mysql/logs/mysql01/mysql-bin

You should end up with below configuration:

+---------------------------------+------------------------------------+
| Variable_name                   | Value                              |
+---------------------------------+------------------------------------+
| log_bin                         | ON                                 |
| log_bin_basename                | /mysql/logs/mysql01/mysql-bin      |
| log_bin_index                   | /mysql/logs/mysql01/mysql-bin.index|
+---------------------------------+------------------------------------+

First you must regularly backup the MySQL binary logs !

Before any online backup (snapshot) do the following to reset binary logs:

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.001087 |       242 |
| mysql-bin.001088 |       242 |
| mysql-bin.001089 |       242 |
| mysql-bin.001090 |      9638 |
| mysql-bin.001091 |      1538 |
| mysql-bin.001092 |       242 |
| mysql-bin.001093 |       242 |
| mysql-bin.001094 |      1402 |
| mysql-bin.001095 |      4314 |
| mysql-bin.001096 |      2304 |
| mysql-bin.001097 |       120 |
+------------------+-----------+
11 rows in set (0.00 sec)

mysql> flush logs;
Query OK, 0 rows affected (0.41 sec)

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.001088 |       242 |
| mysql-bin.001089 |       242 |
| mysql-bin.001090 |      9638 |
| mysql-bin.001091 |      1538 |
| mysql-bin.001092 |       242 |
| mysql-bin.001093 |       242 |
| mysql-bin.001094 |      1402 |
| mysql-bin.001095 |      4314 |
| mysql-bin.001096 |      2304 |
| mysql-bin.001097 |       167 |
| mysql-bin.001098 |       120 |
+------------------+-----------+
11 rows in set (0.00 sec)

mysql> purge binary logs to 'mysql-bin.001098';
Query OK, 0 rows affected (0.00 sec)

mysql> show binary logs;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.001098 |       120 |
+------------------+-----------+
1 row in set (0.00 sec)

Then take the snapshot by keeping tables in read lock with something like:

FLUSH TABLES WITH READ LOCK;
\! lvcreate --snapshot --size 100M --name lvol98_save /dev/vg00/lvol98 or any snapshot command
UNLOCK TABLES;

Backup with no PITR capability

If you don’t want to activate binary logging and manage the logs, or can afford to lose multiple hours of transactions, you can simply perform a MySQL dump, even just once a week when your cluster is stabilized. Use a command like the one below to create a simple dump file:

[mysql@server1 ~] mysqldump --user=root -p --single-transaction --all-databases > /tmp/backup.sql

Not mandatory parts to backup

JournalNodes

From Cloudera official documentation:

High-availability clusters use JournalNodes to synchronize active and standby NameNodes. The active NameNode writes to each JournalNode with changes, or “edits,” to HDFS namespace metadata. During failover, the standby NameNode applies all edits from the JournalNodes before promoting itself to the active state.

Those JournalNodes are installed only if your NameNode is in HA mode. They are the preferred method to handle shared storage between your primary and standby NameNodes; this method is called Quorum Journal Manager (QJM).

Each time a new edits file is created or modified on the primary NameNode it is also written to a majority (quorum) of JournalNodes. The standby NameNode constantly monitors the JournalNodes for any changes and applies them to its own namespace so it is ready to take over from the primary NameNode in case of failure. All JournalNodes store more or less the same files (edits_xx files and an edits_inprogress_xx file) as the NameNodes, except that they do not have the checkpoint fsimage_xx results. You must have three or more (an odd number of) JournalNodes for high availability and to handle split-brain scenarios.
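The quorum is a strict majority of the JournalNodes, which is why an odd number is recommended; a one-line sketch of the computation (helper name is mine):

```python
def journalnode_quorum(n):
    """Smallest strict majority of n JournalNodes that must acknowledge a write."""
    return n // 2 + 1

for n in (3, 5, 7):
    print(n, 'JournalNodes -> quorum of', journalnode_quorum(n))
```

With 3 JournalNodes the cluster tolerates the loss of 1, with 5 the loss of 2, and so on.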

The working directory of JournalNodes is defined by:

  • dfs.journalnode.edits.dir = /var/qjn

On one JournalNode the real directory will be the following (the name of your cluster, chosen at installation, is part of the path):

[root@journalnode01 ~]# ll -rt /var/qjn//current
.
.
.
-rw-r--r-- 1 hdfs hadoop 1006436 Jan 18 12:13 edits_0000000000433896848-0000000000433901168
-rw-r--r-- 1 hdfs hadoop  133375 Jan 18 12:15 edits_0000000000433901169-0000000000433901822
-rw-r--r-- 1 hdfs hadoop  133652 Jan 18 12:17 edits_0000000000433901823-0000000000433902395
-rw-r--r-- 1 hdfs hadoop  918778 Jan 18 12:19 edits_0000000000433902396-0000000000433906383
-rw-r--r-- 1 hdfs hadoop  801672 Jan 18 12:21 edits_0000000000433906384-0000000000433910273
-rw-r--r-- 1 hdfs hadoop   76329 Jan 18 12:23 edits_0000000000433910274-0000000000433910699
-rw-r--r-- 1 hdfs hadoop   90404 Jan 18 12:25 edits_0000000000433910700-0000000000433911201
-rw-r--r-- 1 hdfs hadoop   48435 Jan 18 12:27 edits_0000000000433911202-0000000000433911468
-rw-r--r-- 1 hdfs hadoop  882923 Jan 18 12:29 edits_0000000000433911469-0000000000433915208
-rw-r--r-- 1 hdfs hadoop 1048576 Jan 18 12:31 edits_inprogress_0000000000433915209
-rw-r--r-- 1 hdfs hadoop       8 Jan 18 12:31 committed-txid

So as such, JournalNodes do not contain any required information that cannot be recreated from the NameNode, so there is nothing to backup.

Parts nice to backup

HDFS

In essence your Hadoop cluster has surely been built to handle terabytes, not to say petabytes, of data, so backing up all your HDFS data is technically not possible. First, HDFS replicates each data block (of dfs.blocksize in size, 128 MB by default) multiple times (the parameter is dfs.replication and is set to 3 in my case), and you have surely configured what is called rack awareness, meaning your worker nodes are physically in different racks in your computer room.

In other words, if you lose one or multiple worker nodes, or even a complete rack of your Hadoop cluster, this is going to be completely transparent to your application. At worst you might suffer a performance decrease but no interruption to production (ITP).

But what if you lose the entire data center where your Hadoop cluster is located? We initially had the idea to split our cluster between two data centers geographically separated by 20-30 kilometers (12 to 18 miles), but this would require a (dedicated) low-latency high-speed link (dark fiber or similar) between the two data centers, which is most probably not cost effective…

This is why the most implemented architecture is a second, smaller cluster on a remote site where you will try to keep a copy of your main Hadoop cluster. This copy can be done with the provided Hadoop tool called DistCp or simply by running the exact same ingestion process on this failover cluster…

Running the same ingestion process on two distinct clusters might sound like a bad idea, but if you store your source raw files on a low-cost NFS filer then, first, you can easily back them up to tape. Secondly, you can use the same exact copy from two (or more) Hadoop clusters, and in case of a crash or consistency issue you are able to restart the ingestion from the raw files. The secondary cluster can then, with no issue, be smaller than the primary one as only ingestion will run on it. Interactive queries and users will remain on the primary cluster…

Here I have not mentioned HDFS snapshots at all because for me they are not a backup solution at all! They are no different from an NFS snapshot, and the only scenario they cover is human error. In case of a hardware failure or a data center failure an HDFS snapshot will be of no help, as you will lose it at the same time as the crash…

References

The post Hadoop backup: what parts to backup and how to do it ? appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/hadoop-backup-what-parts-to-backup-and-how-to-do-it.html/feed 0
HDFS capacity planning computation and analysis https://blog.yannickjaquier.com/hadoop/hdfs-capacity-planning-generation-analysis.html https://blog.yannickjaquier.com/hadoop/hdfs-capacity-planning-generation-analysis.html#respond Fri, 30 Aug 2019 08:02:42 +0000 https://blog.yannickjaquier.com/?p=4538 Preamble In our ramp up period we wanted to estimate already consumed HDFS size as well as how is split this used space. This would help us build HDFS capacity planning plan and know which investment would be needed. I have found tons of document on how to do it but the snapshot “issue” I […]

The post HDFS capacity planning computation and analysis appeared first on IT World.

]]>

Table of contents

Preamble

In our ramp-up period we wanted to estimate the already consumed HDFS size as well as how this used space is split. This would help us build an HDFS capacity planning plan and know which investment would be needed. I have found tons of documents on how to do it, but the snapshot “issue” I hit was a nice discovery…

HDFS capacity planning first estimation

The first two commands you would really use are:

[hdfs@clientnode ~]$ hdfs dfs -df -h /
Filesystem                          Size    Used  Available  Use%
hdfs://DataLakeHdfs               89.5 T  22.4 T     62.5 T   25%

And:

[hdfs@clientnode ~]$ hdfs dfs -du -s -h /
5.9 T  /

You can drill down directories size with:

[hdfs@clientnode ~]$ hdfs dfs -du -h /
169.0 G   /app-logs
466.7 M   /apps
12.5 G    /ats
3.1 T     /data
710.4 M   /hdp
0         /livy2-recovery
0         /mapred
16.8 M    /mr-history
1004.4 M  /spark2-history
2.1 T     /tmp
479.7 G   /user

In HDFS you have dfs.datanode.du.reserved, which specifies a reserved space in bytes per volume. This is set to 1243.90869140625 MB in my environment.

I also have the below HDFS parameters that will be part of the formula:

  • dfs.datanode.du.reserved = 1304332800 bytes. Reserved space in bytes per volume. Always leave this much space free for non-DFS use.
  • dfs.blocksize = 128MB. The default block size for new files, in bytes. You can use the following suffixes (case insensitive): k(kilo), m(mega), g(giga), t(tera), p(peta), e(exa) to specify the size (such as 128k, 512m, 1g, etc.), or provide the complete size in bytes (such as 134217728 for 128 MB).
  • dfs.replication = 3. Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.
  • fs.trash.interval = 360 (minutes). Number of minutes after which the checkpoint gets deleted. If zero, the trash feature is disabled. This option may be configured both on the server and the client. If trash is disabled server side then the client side configuration is checked. If trash is enabled on the server side then the value configured on the server is used and the client configuration value is ignored.
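As a sanity check, the dfs.datanode.du.reserved value in bytes converts exactly to the MB figure quoted above (assuming MB here means mebibytes, i.e. powers of 1024):

```python
def bytes_to_mib(n_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / 1024 ** 2

# dfs.datanode.du.reserved from the parameter list above
print(bytes_to_mib(1304332800))  # 1243.90869140625
```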

You can get a complete report, with more precise numbers than the hdfs dfs -df -h / command, for all your worker nodes using the below command:

[hdfs@clientnode ~]$ hdfs dfsadmin -report
Configured Capacity: 98378048588800 (89.47 TB)
Present Capacity: 93368566571440 (84.92 TB)
DFS Remaining: 68685157293611 (62.47 TB)
DFS Used: 24683409277829 (22.45 TB)
DFS Used%: 26.44%
Under replicated blocks: 20
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (5):

Name: 10.75.144.13:50010 (worker3.domain.com)
Hostname: worker3.domain.com
Rack: /AH/26
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 3676038734820 (3.34 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 14998265052417 (13.64 TB)
DFS Used%: 18.68%
DFS Remaining%: 76.23%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16
Last contact: Wed Oct 24 14:57:06 CEST 2018
Last Block Report: Wed Oct 24 11:48:58 CEST 2018


Name: 10.75.144.12:50010 (worker2.domain.com)
Hostname: worker2.domain.com
Rack: /AH/26
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 3884987861604 (3.53 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 14789450082223 (13.45 TB)
DFS Used%: 19.75%
DFS Remaining%: 75.17%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 14
Last contact: Wed Oct 24 14:57:06 CEST 2018
Last Block Report: Wed Oct 24 09:44:51 CEST 2018


Name: 10.75.144.14:50010 (worker4.domain.com)
Hostname: worker4.domain.com
Rack: /AH/27
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 6604991718895 (6.01 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 12068909438191 (10.98 TB)
DFS Used%: 33.57%
DFS Remaining%: 61.34%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 22
Last contact: Wed Oct 24 14:57:06 CEST 2018
Last Block Report: Wed Oct 24 12:36:28 CEST 2018


Name: 10.75.144.11:50010 (worker1.domain.com)
Hostname: worker1.domain.com
Rack: /AH/26
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 3983207846801 (3.62 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 14690022328249 (13.36 TB)
DFS Used%: 20.24%
DFS Remaining%: 74.66%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 32
Last contact: Wed Oct 24 14:57:06 CEST 2018
Last Block Report: Wed Oct 24 13:50:10 CEST 2018


Name: 10.75.144.15:50010 (worker5.domain.com)
Hostname: worker5.domain.com
Rack: /AH/27
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 6534183115709 (5.94 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 12138510392531 (11.04 TB)
DFS Used%: 33.21%
DFS Remaining%: 61.69%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 40
Last contact: Wed Oct 24 14:57:04 CEST 2018
Last Block Report: Wed Oct 24 10:41:56 CEST 2018

So far, doing the computation, I get 5.9 TB * 3 (dfs.replication) = 17.7 TB, which is well below the 22.4 TB reported as used by the hdfs dfs -df -h / command… Where have the missing 4.7 TB gone? Quite a few TB, isn't it?
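The arithmetic can be scripted as a quick sanity check. This is just a sketch using the figures from this post (the variable names are mine); adapt the numbers to your own cluster:

```shell
# Compare logical usage (du) times replication with raw usage (df).
du_tb=5.9          # hdfs dfs -du -s -h /
replication=3      # dfs.replication
df_used_tb=22.4    # "DFS Used" from hdfs dfsadmin -report
expected=$(awk -v d="$du_tb" -v r="$replication" 'BEGIN{printf "%.1f", d*r}')
gap=$(awk -v u="$df_used_tb" -v e="$expected" 'BEGIN{printf "%.1f", u-e}')
echo "expected raw usage: ${expected} TB, unexplained gap: ${gap} TB"
```

A gap of several TB, as here, is a hint that something (snapshots in our case) is holding blocks that du does not show.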

HDFS snapshot situation

Then, after a bit of investigation, I had the idea to check whether HDFS snapshots had been created on my HDFS:

[hdfs@clientnode ~]$ hdfs lsSnapshottableDir -help
Usage:
hdfs lsSnapshottableDir:
        Get the list of snapshottable directories that are owned by the current user.
        Return all the snapshottable directories if the current user is a super user.

[hdfs@clientnode ~]$ hdfs lsSnapshottableDir
drwxr-xr-x 0 hdfs hdfs 0 2018-07-13 18:14 1 65536 /

You can get the snapshot name(s) with:

[hdfs@clientnode ~]$ hdfs dfs -ls /.snapshot
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2018-07-13 18:14 /.snapshot/s20180713-101304.832

Computing the exact snapshot size is not possible because, when a snapshot entry is just a pointer to an original (unmodified) block, the size of that original block is counted as well:

[hdfs@clientnode ~]$ hdfs dfs -du -h /.snapshot
3.3 T  /.snapshot/s20180713-101304.832

You can also get graphical access using the NameNode UI in Ambari:

hdfs_capacity_planning01

Here we are: a snapshot of the HDFS root directory has been created… I rate this as tricky because you do not see it with a hdfs dfs -du command:

[hdfs@clientnode ~]$ hdfs dfs -du -h /
169.0 G   /app-logs
466.7 M   /apps
4.7 G     /ats
3.1 T     /data
710.4 M   /hdp
0         /livy2-recovery
0         /mapred
0         /mr-history
1004.4 M  /spark2-history
2.1 T     /tmp
173.9 G   /user

I have also performed an HDFS filesystem check to be sure everything is fine and no blocks have been marked corrupted:

[hdfs@clientnode ~]$ hdfs fsck /
.
.

........................
/user/training/.staging/job_1519657336782_0105/job.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1073754565_13755. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.
/user/training/.staging/job_1519657336782_0105/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1073754566_13756. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1519657336782_0105/libjars/hive-hcatalog-core.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1073754564_13754. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.
/user/training/.staging/job_1536057043538_0001/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085621525_11894367. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536057043538_0002/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085621527_11894369. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536057043538_0004/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085621593_11894435. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536057043538_0023/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622064_11894906. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536057043538_0025/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622086_11894928. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536057043538_0027/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622115_11894957. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536057043538_0028/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622133_11894975. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_0002/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397707_12670663. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_0003/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397706_12670662. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_0004/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397708_12670664. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_0005/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397718_12670674. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_0006/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397720_12670676. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_0007/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397721_12670677. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
.....
/user/training/.staging/job_1536642465198_2307/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086509846_12782817. Target Replicas is 10 but found 5 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
....

Status: HEALTHY
 Total size:    5981414347660 B (Total open files size: 455501 B)
 Total dirs:    740032
 Total files:   3766023
 Total symlinks:                0 (Files currently being written: 17)
 Total blocks (validated):      3781239 (avg. block size 1581866 B) (Total open file blocks (not validated): 17)
 Minimally replicated blocks:   3781239 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       20 (5.2892714E-4 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0000105
 Corrupt blocks:                0
 Missing replicas:              100 (8.8153436E-4 %)
 Number of data-nodes:          5
 Number of racks:               2
FSCK ended at Wed Oct 17 16:12:48 CEST 2018 in 61172 milliseconds


The filesystem under path '/' is HEALTHY
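As a side note, the fsck summary is internally consistent: 20 under-replicated blocks out of 3781239 give the 5.29E-4 % figure, and since each of those blocks targets 10 replicas on a 5-node cluster, 20 × (10 − 5) = 100 missing replicas. A quick sketch:

```shell
# Cross-checking the fsck summary above.
under_pct=$(awk 'BEGIN{printf "%.7f", 20/3781239*100}')   # 20 of 3781239 blocks
missing=$((20 * (10 - 5)))                                # target 10, only 5 live datanodes
echo "under-replicated: ${under_pct} %"
echo "missing replicas: ${missing}"
```

So the under-replication is harmless here: it comes from job staging files asking for a replication factor of 10 on a 5-datanode cluster, not from data loss.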

After delete of HDFS snapshot

Get the snapshot name(s) and delete them with the commands below. I have also forbidden any further snapshot creation on the root directory (snapshotting / does not make sense in my opinion):

[hdfs@clientnode  ~]$ hdfs dfsadmin -disallowSnapshot /
disallowSnapshot: The directory / has snapshot(s). Please redo the operation after removing all the snapshots.
[hdfs@clientnode ~]$ hdfs dfs -ls /.snapshot
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2018-07-13 18:14 /.snapshot/s20180713-101304.832
[hdfs@clientnode ~]$ hdfs dfs -deleteSnapshot / s20180713-101304.832
[hdfs@clientnode ~]$ hdfs dfsadmin -disallowSnapshot /
Disallowing snaphot on / succeeded
[hdfs@clientnode ~]$ hdfs lsSnapshottableDir

After a cleaning phase I reached the stable situation below:

[hdfs@clientnode ~]$ hdfs dfs -df -h /
Filesystem                          Size    Used  Available  Use%
hdfs://DataLakeHdfs               89.5 T  16.8 T     68.1 T   19%
[hdfs@clientnode ~]$ hdfs dfs -du -s -h /
5.5 T  /

So the computation is now much more accurate, as 5.5 TB * 3 = 16.5 TB, close to the 16.8 TB reported.

As you may have noticed, my /tmp directory is 2.1 TB, which is quite a lot of space for a temporary directory. In our case all the occupied space was in directories under /tmp/hive. It turned out to be leftovers from aborted Hive queries that can be safely deleted (we currently have one directory of 1.7 TB !!!):

Parameter: hive.exec.scratchdir

Description: This directory is used by Hive to store the plans for the different map/reduce stages of the query, as well as the intermediate outputs of these stages.
Hive 0.14.0 and later: HDFS root scratch directory for Hive jobs, which gets created with write-all (733) permission. For each connecting user, an HDFS scratch directory ${hive.exec.scratchdir}/<username> is created with ${hive.scratch.dir.permission}.

Default value:
/tmp/<username>/hive (Hive 0.8.0 and earlier)
/tmp/hive-<username> (as of Hive 0.8.1 to 0.14.0)
/tmp/hive (Hive 0.14.0 and later)
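A hypothetical cleanup sketch: feed it `hdfs dfs -du /tmp/hive` output (size in bytes, then path) and it flags directories above a size threshold as deletion candidates. The sample lines and the 100 GB threshold below are made up for illustration; always double-check that no query is still using a directory before deleting it:

```shell
# Flag scratch directories bigger than the threshold as cleanup candidates.
threshold=$((100 * 1024 * 1024 * 1024))   # 100 GB, arbitrary
candidates=""
while read -r size path; do
  if [ "$size" -gt "$threshold" ]; then
    echo "candidate for deletion: $path ($((size / 1024 / 1024 / 1024)) GB)"
    candidates="$candidates $path"
  fi
done <<'EOF'
1869169767219 /tmp/hive/user1
524288000 /tmp/hive/user2
EOF
```

In real life you would pipe `hdfs dfs -du /tmp/hive` into the loop instead of the here-document.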

References

The post HDFS capacity planning computation and analysis appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/hdfs-capacity-planning-generation-analysis.html/feed 0
ORC versus Parquet compression and response time https://blog.yannickjaquier.com/hadoop/orc-versus-parquet-compression-and-response-time.html https://blog.yannickjaquier.com/hadoop/orc-versus-parquet-compression-and-response-time.html#respond Fri, 02 Aug 2019 07:52:44 +0000 https://blog.yannickjaquier.com/?p=4596 Preamble For their open source position we have chosen to install an Hortonworks HDP 2.6 Hadoop cluster. At the initial phase of our Hadoop project ORC storage has been chosen as the default storage engine for our very first Hive tables. Performance of our queries is, obviously, a key factor we consider. This is why […]

The post ORC versus Parquet compression and response time appeared first on IT World.

]]>

Table of contents

Preamble

For its open-source positioning we have chosen to install a Hortonworks HDP 2.6 Hadoop cluster. At the initial phase of our Hadoop project, ORC was chosen as the default storage engine for our very first Hive tables.

Performance of our queries is, obviously, a key factor we consider. This is why we have started to consider Live Long And Process (LLAP) and realized it was not so easy to handle in our small initial cluster. Then the merge between Hortonworks and Cloudera happened and we decided to move all our tables to Parquet storage engine with the clear objective to use Impala from Cloudera.

But at some point we started to study disk space usage (again, linked to our small initial cluster) and realized that Parquet tables were much bigger than their ORC counterparts. All our Hive tables make heavy use of partitioning, for performance and to ease cleaning by simply dropping old partitions…

There are plenty of articles comparing the Parquet and ORC (and other) storage engines, but if you read them carefully till the end there will most probably be a disclaimer stating that the comparison is tightly linked to the nature of the data. In other words your data model and figures are unique, so you really have no other option than testing by yourself, and this blog post is here to provide a few tricks to achieve this…

Our cluster is running HDP-2.6.4.0 with Ambari version 2.6.1.0.

ORC versus Parquet compression

On one partition of one table we observed:

  • Parquet = 33.9 G
  • ORC = 2.4 G

Digging further we saw that ORC compression can be easily configured in Ambari and we have set it to zlib:

orc_vs_parquet01

The default Parquet compression, on the other hand, is (apparently) UNCOMPRESSED, which is obviously not good from a compression perspective.

Digging through multiple (contradictory) blog posts, the official documentation and personal testing, I have been able to draw the table below:

Hive SQL property    Default        Possible values
orc.compress         ZLIB           NONE, ZLIB or SNAPPY
parquet.compression  UNCOMPRESSED   UNCOMPRESSED, GZIP or SNAPPY

Remark:
I have seen many blog posts suggesting parquet.compress for the Parquet compression algorithm, but in my testing this property does not work…

To change the compression algorithm when creating a table, use the TBLPROPERTIES keyword like:

STORED AS PARQUET TBLPROPERTIES("parquet.compression"="GZIP");
STORED AS ORC TBLPROPERTIES("orc.compress"="SNAPPY")
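You can also change the codec of an existing table; note that, as far as I have seen, only files written after the change pick up the new codec, existing data files keep the old one. The database and table names below are placeholders:

```sql
-- Switch an existing Parquet table to GZIP for future writes
ALTER TABLE database_name.table_name SET TBLPROPERTIES ("parquet.compression"="GZIP");
-- Check what is currently configured
SHOW TBLPROPERTIES database_name.table_name;
```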

So, as an example, the test table I built for my testing looks like the DDL below (note that in Hive the partition columns must not be repeated in the regular column list, and an array column needs an element type):

drop table default.test purge;

CREATE TABLE default.test(
  column04 string,
  column05 string,
  column06 int,
  column07 array<string>)
PARTITIONED BY (column01 string, column02 string, column03 int)
STORED AS PARQUET
TBLPROPERTIES("parquet.compression"="SNAPPY");

To get the size of your test table (replace database_name and table_name by real values) just use something like the command below (check the value of hive.metastore.warehouse.dir, which gives the /apps/hive/warehouse part):

[hdfs@server01 ~]$ hdfs dfs -du -s -h /apps/hive/warehouse/database_name/table_name

Then I copied my source Parquet table to this test table using the six combinations of storage engine and compression algorithm, and the result is:

Storage   Compression property         Value          Size
ORC       orc.compress                 NONE           12.9 G
ORC       orc.compress                 ZLIB           2.4 G
ORC       orc.compress                 SNAPPY         3.2 G
Parquet   parquet.compression          UNCOMPRESSED   33.9 G
Parquet   parquet.compression          GZIP           7.3 G
Parquet   parquet.compression          SNAPPY         11.5 G

Or graphically:

orc_vs_parquet02
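From the sizes above you can derive the compression ratio of each combination relative to the uncompressed Parquet baseline. A small sketch, with the figures taken from this post (your data will of course give different ratios):

```shell
# Ratio of each storage/codec combination versus uncompressed Parquet (33.9 G).
baseline=33.9   # GB
best=""
best_ratio=0
for entry in "ORC-NONE:12.9" "ORC-ZLIB:2.4" "ORC-SNAPPY:3.2" "PARQUET-GZIP:7.3" "PARQUET-SNAPPY:11.5"; do
  name=${entry%%:*}; size=${entry##*:}
  ratio=$(awk -v b="$baseline" -v s="$size" 'BEGIN{printf "%.1f", b/s}')
  echo "$name: ${size} G (${ratio}x smaller than uncompressed Parquet)"
  better=$(awk -v r="$ratio" -v br="$best_ratio" 'BEGIN{if (r>br) print 1; else print 0}')
  if [ "$better" -eq 1 ]; then best=$name; best_ratio=$ratio; fi
done
echo "best codec for this data: $best (${best_ratio}x)"
```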

To do the copy I have written a shell script that dynamically copies each partition of the source table to a destination table that has the same layout:

#!/bin/bash
#
# Y.JAQUIER  06/02/2019  Creation
#
# -----------------------------------------------------------------------------
#
# This job copies one table to another one
# Useful when migrating from Parquet to ORC for example
#
# Destination table must have been created before using this script
#
# -----------------------------------------------------------------------------
#

###############################################################################
# Function to execute a query on hive
# Param 1 : Query string
###############################################################################
function execute_query
{
  #echo "Execute query: $1"
  MYSTART=$(date)
  beeline -u "jdbc:hive2://${HIVE_CONNEXION}?tez.queue.name=${HIVE_QUEUE}" -n ${HIVE_USER} --incremental=true --silent=true --fastConnect=true -e "$1"
  status=$?
  MYEND=$(date)
  MYDURATION=$(expr $(date -d "$MYEND" +%s) - $(date -d "$MYSTART" +%s))
  echo "Executed in $MYDURATION second(s)"
  #echo "MEASURE|$FAB|${INGESTION_DATE}|$1|$(date -d "$MYSTART" +%Y%m%d%H%M%S)|$(date -d "$MYEND" +%Y%m%d%H%M%S)|$MYDURATION"
  if [[ $status != "0" ]]
  then
    echo " !!!!!!!! Error Execution Query $1 !!!!!!!! "
    exit -1
  fi
}

###############################################################################
# Function to execute a query on hive in CSV output file
# Param 1 : Query string
# Param 2 : output file
###############################################################################
function execute_query_to_csv
{
  #echo "Execute query to csv: $1 => $2"
  MYSTART=$(date)
  beeline -u "jdbc:hive2://${HIVE_CONNEXION}?tez.queue.name=${HIVE_QUEUE}" -n ${HIVE_USER} --outputformat=csv2 --silent=true --verbose=false \
  --showHeader=false --fastConnect=true -e "$1" > $2
  status=$?
  MYEND=$(date)
  MYDURATION=$(expr $(date -d "$MYEND" +%s) - $(date -d "$MYSTART" +%s))
  #echo "MEASURE|$FAB|${INGESTION_DATE}|$1|$(date -d "$MYSTART" +%Y%m%d%H%M%S)|$(date -d "$MYEND" +%Y%m%d%H%M%S)|$MYDURATION"
  if [[ $status != "0" ]]
  then
    echo " !!!!!!!! Error Execution Query to csv : $1 => $2 !!!!!!!! "
    exit -1
  fi
}

###############################################################################
# Print the help
###############################################################################
function print_help
{
  echo "syntax:"
  echo "$0 source_database.source_table destination_database.destination_table partition_filter_pattern_and_option (not mandatory)"
  echo "Source and destination table must exists"
  echo "Destination table data will be overwritten !!"
}

###############################################################################
# Main
###############################################################################
HIVE_CONNEXION="..."
HIVE_QUEUE=...
HIVE_USER=...

TABLE_SOURCE=$1
TABLE_DESTINATON=$2
PARTITION_FILTER=$3

if [[ $# -lt 2 ]]
then
  print_help
  exit 0
fi

echo "This will overwrite $TABLE_DESTINATON table by $TABLE_SOURCE table data !"
echo "The destination table MUST be created first !"
read -p "Do you wish to continue [Y | N] ? " answer
case $answer in
  [Yy]* ) ;;
  [Nn]* ) exit 0;;
  * ) echo "Please answer yes or no."; exit 0;;
esac

# Generate partitions list
execute_query_to_csv "show partitions $1;" partition_list.$$.csv

# Filter partition list based on the regular expression given
if [[ $PARTITION_FILTER != "" ]]
then
  grep $PARTITION_FILTER partition_list.$$.csv > partition_list1.$$.csv
  mv partition_list1.$$.csv partition_list.$$.csv
fi

partition_number=$(cat partition_list.$$.csv | wc -l)

# Generate column list (with partition columns which must be removed)
execute_query_to_csv "show columns from $1;" column_list.$$.csv

# First partition column
while read line
do
  first_partition_column=$(echo $line | awk -F "=" '{print $1}')
  break
done < partition_list.$$.csv

# Columns list without partition columns
columns_list_without_partitions=""
while read line
do
  if [[ $line = $first_partition_column ]]
  then
    break
  fi
  columns_list_without_partitions+="$line,"
done < column_list.$$.csv

# Remove trailing comma
columns_length=${#columns_list_without_partitions}
columns_list_without_partitions=${columns_list_without_partitions:0:$(($columns_length-1))}

echo "The source table has $partition_number partition(s)"

# Generate list of all insert partition by partition
i=1
while read line
do
  #echo $line
  echo "Partition ${i}:"
  j=1
  query1="insert overwrite table $TABLE_DESTINATON partition ("
  query2=""
  query3=""
  IFS="/"
  read -r -a partition_list <<< "$line"

  # We fetch all partition columns
  for partition_column_list in "${partition_list[@]}"
  do
    IFS="="
    read -r -a partition_columns <<< "$partition_column_list"
    # First predicate uses WHERE and we must enclose column values in double quotes
    if [[ $j -eq 1 ]]
    then
      query3+="where ${partition_columns[0]}=\"${partition_columns[1]}\" "
    else
      query3+="and ${partition_columns[0]}=\"${partition_columns[1]}\" "
    fi
    query2+="${partition_columns[0]}=\"${partition_columns[1]}\","
    j=$((j+1))
  done
  IFS=""
  i=$((i+1))
  query2_length=${#query2}
  query2_length=$((query2_length-1))
  query2=${query2:0:$query2_length}
  final_query=$query1$query2") select "$columns_list_without_partitions" from $TABLE_SOURCE "$query3
  # Execute the query (comment out the execute_query call below to dry-run the script first)
  echo $final_query
  execute_query $final_query
done < partition_list.$$.csv
rm partition_list.$$.csv
rm column_list.$$.csv

So clearly, for the nature of our data, the ORC storage engine cannot be beaten when it comes to disk usage...

I have also taken additional figures when we migrated the live tables of our Spotfire data model:

[hdfs@server01 ~]$ hdfs dfs -du -s -h /apps/hive/warehouse/database.db/table01*
564.2 G  /apps/hive/warehouse/database.db/table01_orc
3.6 T  /apps/hive/warehouse/database.db/table01_pqt
[hdfs@server01 ~]$ hdfs dfs -du -s -h /apps/hive/warehouse/database.db/table02*
121.3 M  /apps/hive/warehouse/database.db/table02_pqt
5.6 M  /apps/hive/warehouse/database.db/table02_orc

ORC versus Parquet response time

But what about response time ? To check this I have extracted a typical query used by Spotfire and executed it on the Parquet (UNCOMPRESSED) and on the ORC (ZLIB) tables:

Iteration Parquet (s) ORC (s)
Run 1 6.553 0.077
Run 2 5.291 0.066
Run 3 1.915 0.065
Run 4 2.987 0.074
Run 5 1.825 0.070
Run 6 2.720 0.092
Run 7 3.989 0.062
Run 8 4.526 0.079
Run 9 3.385 0.082
Run 10 3.588 0.176
Average 3.6779 0.0843

So on average over my ten runs ORC is a factor of 43-44 times faster...
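That factor is simply the ratio of the two averages from the table above:

```shell
# Average Parquet time divided by average ORC time, figures from the table above.
speedup=$(awk 'BEGIN{printf "%.1f", 3.6779/0.0843}')
echo "ORC was ${speedup} times faster on average"
```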

I would explain this by the fact that there is much less data to read from disk for the ORC tables and, again, this is linked to our data node hardware, where we have many more CPU threads than disk spindles (the ratio of one thread per physical disk is not followed at all). If you are low on CPU and have plenty of disks (which is also not a good practice for a Hadoop cluster) you might experience different results...

References

The post ORC versus Parquet compression and response time appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/orc-versus-parquet-compression-and-response-time.html/feed 0
HDFS balancer options to speed up balance operations https://blog.yannickjaquier.com/hadoop/hdfs-balancer-options-to-speed-up-balance-operations.html https://blog.yannickjaquier.com/hadoop/hdfs-balancer-options-to-speed-up-balance-operations.html#respond Fri, 05 Jul 2019 06:54:28 +0000 https://blog.yannickjaquier.com/?p=4604 Preamble We have started to receive the below Ambari alerts: Percent DataNodes With Available Spaceaffected: [2], total: [5] DataNode StorageRemaining Capacity:[4476139956751], Total Capacity:[77% Used, 19675609717760] In itself the DataNode Storage alert is not super serious because, first, it is sent far in advance (> 75%) but it anyways tells you that you are reaching the […]

The post HDFS balancer options to speed up balance operations appeared first on IT World.

]]>

Table of contents

Preamble

We have started to receive the below Ambari alerts:

  • Percent DataNodes With Available Space
    affected: [2], total: [5]
  • DataNode Storage
    Remaining Capacity:[4476139956751], Total Capacity:[77% Used, 19675609717760]

In itself the DataNode Storage alert is not super serious because it is sent far in advance (at > 75% used), but it anyway tells you that you are reaching the storage limit of your cluster. One drawback we have seen is that the impacted DataNodes lose contact with the Ambari server and we are often obliged to restart the process.

On our small Hadoop cluster two nodes are more filled than the three others…

This should be easy to solve with the command below:

[hdfs@clientnode ~]$ hdfs balancer

HDFS Balancer

We issued the HDFS balancer command with no options, but after a very long run (almost a week) we ended up with a still unbalanced situation. We even tried to rerun the command, but this time it completed very quickly (less than 2 seconds) and left us with two DataNodes still more filled than the three others:

[hdfs@clientnode ~]$ hdfs dfsadmin -report
Configured Capacity: 98378048588800 (89.47 TB)
Present Capacity: 93358971611260 (84.91 TB)
DFS Remaining: 31894899799432 (29.01 TB)
DFS Used: 61464071811828 (55.90 TB)
DFS Used%: 65.84%
Under replicated blocks: 24
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (5):

Name: 192.168.1.3:50010 (datanode03.domain.com)
Hostname: datanode03.domain.com
Rack: /AH/26
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 11130853114413 (10.12 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 7534254091791 (6.85 TB)
DFS Used%: 56.57%
DFS Remaining%: 38.29%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 25
Last contact: Tue Jan 08 12:51:44 CET 2019
Last Block Report: Tue Jan 08 06:52:34 CET 2019


Name: 192.168.1.2:50010 (datanode02.domain.com)
Hostname: datanode02.domain.com
Rack: /AH/26
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 11269739413291 (10.25 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 7403207769673 (6.73 TB)
DFS Used%: 57.28%
DFS Remaining%: 37.63%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 33
Last contact: Tue Jan 08 12:51:44 CET 2019
Last Block Report: Tue Jan 08 11:30:59 CET 2019


Name: 192.168.1.4:50010 (datanode04.domain.com)
Hostname: datanode04.domain.com
Rack: /AH/27
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 14226431394146 (12.94 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 4448006323316 (4.05 TB)
DFS Used%: 72.30%
DFS Remaining%: 22.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 14
Last contact: Tue Jan 08 12:51:43 CET 2019
Last Block Report: Tue Jan 08 12:12:55 CET 2019


Name: 192.168.1.1:50010 (datanode01.domain.com)
Hostname: datanode01.domain.com
Rack: /AH/26
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 10638187881052 (9.68 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 8035048514823 (7.31 TB)
DFS Used%: 54.07%
DFS Remaining%: 40.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 20
Last contact: Tue Jan 08 12:51:43 CET 2019
Last Block Report: Tue Jan 08 09:38:50 CET 2019


Name: 192.168.1.5:50010 (datanode05.domain.com)
Hostname: datanode05.domain.com
Rack: /AH/27
Decommission Status : Normal
Configured Capacity: 19675609717760 (17.89 TB)
DFS Used: 14198860008926 (12.91 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 4474383099829 (4.07 TB)
DFS Used%: 72.16%
DFS Remaining%: 22.74%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 29
Last contact: Tue Jan 08 12:51:45 CET 2019
Last Block Report: Tue Jan 08 11:50:32 CET 2019

The NameNode UI gives a clear graphical picture:

hdfs_balancer01

Two datanodes are still more filled than the three others.

Then, digging inside the HDFS balancer official documentation, we found two interesting parameters: -source and -threshold.

-source is easily understandable with the example below from the official documentation (which I prefer to quote here, given the acquisition of Hortonworks by Cloudera):

The following table shows an example, where the average utilization is 25% so that D2 is within the 10% threshold. It is unnecessary to move any blocks from or to D2. Without specifying the source nodes, HDFS Balancer first moves blocks from D2 to D3, D4 and D5, since they are under the same rack, and then moves blocks from D1 to D2, D3, D4 and D5.
By specifying D1 as the source node, HDFS Balancer directly moves blocks from D1 to D3, D4 and D5.

Datanodes (with the same capacity)   Utilization   Rack
D1                                   95%           A
D2                                   30%           B
D3, D4, and D5                       0%            B
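The 25% average utilization mentioned in that example, and why D2 needs no rebalancing, come from a simple computation:

```shell
# Documentation example in numbers: D1..D5 utilizations, default 10% threshold.
avg=$(awk 'BEGIN{printf "%.0f", (95+30+0+0+0)/5}')
d2_off=$(awk -v a="$avg" 'BEGIN{d=30-a; if (d<0) d=-d; printf "%.0f", d}')
echo "average utilization: ${avg}%"
echo "D2 is ${d2_off} points from the average, within the 10% threshold"
```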

This is also explained in Storage group pairing policy:

The HDFS Balancer selects over-utilized or above-average storage as source storage, and under-utilized or below-average storage as target storage. It pairs a source storage group with a target storage group (source → target) in a priority order depending on whether or not the source and the target storage reside in the same rack.

And this rack-awareness story is exactly what we have, as displayed in the server list of Ambari:

hdfs_balancer02

-threshold is also an interesting parameter, to be stricter with nodes above or below the average utilization…
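With -threshold 1, a datanode is considered balanced only when its utilization is within 1 percentage point of the cluster average (65.84% in the report above). A sketch with our figures shows that all five nodes sit outside that band, which is why we expected the balancer to have plenty of work to do:

```shell
# Which datanodes fall outside a 1-point band around the 65.84% average,
# using the DFS Used% figures from the dfsadmin report above.
avg=65.84; threshold=1
outliers=""
for entry in "datanode01:54.07" "datanode02:57.28" "datanode03:56.57" "datanode04:72.30" "datanode05:72.16"; do
  node=${entry%%:*}; used=${entry##*:}
  off=$(awk -v a="$avg" -v u="$used" 'BEGIN{d=u-a; if (d<0) d=-d; printf "%.2f", d}')
  over=$(awk -v d="$off" -v t="$threshold" 'BEGIN{if (d>t) print 1; else print 0}')
  if [ "$over" -eq 1 ]; then
    echo "$node is ${off} points off the ${avg}% average"
    outliers="$outliers $node"
  fi
done
```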

So we tried, unsuccessfully, the command below:

[hdfs@clientnode ~]$ hdfs balancer -source datanode04.domain.com,datanode05.domain.com -threshold 1

We also found many other "more aggressive" options, listed below:

DataNode Configuration Properties:

Property                                    Default              Background Mode    Fast Mode
dfs.datanode.balance.max.concurrent.moves   5                    4 x (# of disks)   4 x (# of disks)
dfs.datanode.balance.max.bandwidthPerSec    1048576 (1 MB)       use default        10737418240 (10 GB)

Balancer Configuration Properties:

Property                                    Default              Background Mode    Fast Mode
dfs.datanode.balance.max.concurrent.moves   5                    # of disks         4 x (# of disks)
dfs.balancer.moverThreads                   1000                 use default        20,000
dfs.balancer.max-size-to-move               10737418240 (10 GB)  1073741824 (1 GB)  107374182400 (100 GB)
dfs.balancer.getBlocks.min-block-size       10485760 (10 MB)     use default        104857600 (100 MB)

So again we tried:

[hdfs@clientnode ~]$ hdfs balancer -Ddfs.balancer.movedWinWidth=5400000 -Ddfs.balancer.moverThreads=50 -Ddfs.balancer.dispatcherThreads=200 -threshold 1 \
-source datanode04.domain.com,datanode05.domain.com 1>/tmp/balancer-out.log 2>/tmp/balancer-err.log

But again it did not change anything: both runs completed very fast without moving the data we expected…

So clearly in our case the rack-awareness story is the blocking factor. One mistake we made was to have an odd number of DataNodes: this 2-3 split across two racks is clearly not a good idea. Of course we could remove the rack awareness configuration to get a well-balanced cluster, but we do not want to lose the extra high availability it gives us. So the only available plan is to buy new DataNodes, or to add more disks to our existing nodes as we have fewer disks than threads…

References

The post HDFS balancer options to speed up balance operations appeared first on IT World.

]]>
https://blog.yannickjaquier.com/hadoop/hdfs-balancer-options-to-speed-up-balance-operations.html/feed 0