INSERT OVERWRITE does not delete old directories
Posted on September 22, 2020, updated on September 15, 2020 by Yannick Jaquier
Table of contents: Preamble; INSERT OVERWRITE test case; INSERT OVERWRITE does not delete old directories; References

Spark lineage issue and how to handle it with Hive Warehouse Connector
Posted on August 23, 2020, updated on July 30, 2020 by Yannick Jaquier
Table of contents: Preamble; Test case and problem; Spark lineage solution; Spark lineage second problem and partial solution; References

How to install and configure a standalone TEZ UI with HDP 3.x
Posted on July 22, 2020, updated on September 15, 2020 by Yannick Jaquier
Table of contents: Preamble; TEZ UI installation; TEZ UI configuration; TEZ UI bug correction trick; References

PySpark and Spark Scala Jupyter kernels cluster integration
Posted on June 21, 2020, updated on September 20, 2023 by Yannick Jaquier
Table of contents: Preamble; JupyterHub installation; Jupyter kernels manual configuration; Sparkmagic Jupyter kernels configuration; References

Hive Warehouse Connector integration in Zeppelin Notebook
Posted on May 21, 2020, updated on March 26, 2021 by Yannick Jaquier
Table of contents: Preamble; HWC on the command line; Zeppelin Spark2 interpreter configuration; Zeppelin Livy2 interpreter configuration; References

Setup Spark and Intellij on Windows to access a remote Hadoop cluster
Posted on April 22, 2020, updated on October 16, 2023 by Yannick Jaquier
Table of contents: Preamble; Spark installation and configuration; Sbt installation and configuration; Intellij IDEA from JetBrains; Conclusion; References