Set up Spark and IntelliJ on Windows to access a remote Hadoop cluster

Posted on April 22, 2020, updated on October 16, 2023 by Yannick Jaquier

On the importance of having good Hive statistics on your tables

Posted on March 23, 2020, updated on March 20, 2020 by Yannick Jaquier

How to use Livy server to submit Spark jobs through a REST interface

Posted on February 24, 2020, updated on February 19, 2020 by Yannick Jaquier

Hive concatenate command issues and workaround

Posted on January 24, 2020, updated on January 20, 2020 by Yannick Jaquier

How to handle HDFS blocks that have corrupted replicas or are under-replicated

Posted on December 25, 2019, updated on January 30, 2023 by Yannick Jaquier

Is the Hive fetch task really improving response time by bypassing MapReduce?

Posted on November 24, 2019, updated on February 28, 2020 by Yannick Jaquier
