How to use Livy server to submit Spark job through a REST interface
Posted on February 24, 2020, updated on February 19, 2020 by Yannick Jaquier
Table of contents: Preamble, Configuration, Curl testing, HTTP ERROR: 400, Python testing, References

Hive concatenate command issues and workaround
Posted on January 24, 2020, updated on January 20, 2020 by Yannick Jaquier
Table of contents: Preamble, Concatenate command failing for access right, HDFS default access right with Hive and Spark, To go further, Worked well till..., References

How to handle HDFS blocks with corrupted replicas or under replicated
Posted on December 25, 2019, updated on January 30, 2023 by Yannick Jaquier
Table of contents: Preamble, Managing under replicated blocks, Managing blocks with corrupted replicas, References

Fetch Zookeeper information from Python with Kazoo to connect Hive
Posted on October 25, 2019, updated on October 22, 2019 by Yannick Jaquier
Table of contents: Preamble, Kazoo development environment installation, Python source code, Kazoo testing, References

HDFS capacity planning computation and analysis
Posted on August 30, 2019, updated on August 27, 2019 by Yannick Jaquier
Table of contents: Preamble, HDFS capacity planning first estimation, HDFS snapshot situation, After delete of HDFS snapshot, References

HDFS balancer options to speed up balance operations
Posted on July 5, 2019, updated on June 4, 2020 by Yannick Jaquier
Table of contents: Preamble, HDFS Balancer, References