Hive Warehouse Connector integration in Zeppelin Notebook
Posted on May 21, 2020, updated on March 26, 2021 by Yannick Jaquier
Table of contents: Preamble; HWC on the command line; Zeppelin Spark2 interpreter configuration; Zeppelin Livy2 interpreter configuration; References

Setup Spark and Intellij on Windows to access a remote Hadoop cluster
Posted on April 22, 2020, updated on November 29, 2021 by Yannick Jaquier
Table of contents: Preamble; Spark installation and configuration; Sbt installation and configuration; Intellij IDEA from JetBrains; Conclusion; References

On the importance to have good Hive statistics on your tables
Posted on March 23, 2020, updated on March 20, 2020 by Yannick Jaquier
Table of contents: Preamble; The problematic queries; Problem has gone with good Hive statistics; References

How to use Livy server to submit Spark job through a REST interface
Posted on February 24, 2020, updated on February 19, 2020 by Yannick Jaquier
Table of contents: Preamble; Configuration; Curl testing; HTTP ERROR: 400; Python testing; References

Hive concatenate command issues and workaround
Posted on January 24, 2020, updated on January 20, 2020 by Yannick Jaquier
Table of contents: Preamble; Concatenate command failing for access right; HDFS default access right with Hive and Spark; To go further; Worked well till...; References

How to handle HDFS blocks with corrupted replicas or under replicated
Posted on December 25, 2019, updated on January 30, 2023 by Yannick Jaquier
Table of contents: Preamble; Managing under replicated blocks; Managing blocks with corrupted replicas; References