Spark Structured Streaming for incremental ingestion flows
Posted on January 20, 2024, updated on January 19, 2024 by Yannick Jaquier
Table of contents: Preamble; Spark Structured Streaming first test case; Spark Structured Streaming second test case; Spark Structured Streaming third test case to prepare further ingestion; References
Change Data Feed to track changes at row level
Posted on December 20, 2023, updated on June 18, 2024 by Yannick Jaquier
Table of contents: Preamble; Change Data Feed testing; Change Data Feed testing with Spark Structured Streaming; References
Auto loader to process flow of raw files hands-on
Posted on November 30, 2023, updated on January 19, 2024 by Yannick Jaquier
Table of contents: Preamble; Auto Loader test case; Auto Loader testing; References
dbx for local and rapid development lifecycle through Databricks
Posted on July 10, 2023, updated on July 10, 2023 by Yannick Jaquier
Table of contents: Preamble; Not able to access DBFS files under Databricks Connect; Creation of the DBR 12.2 Conda environment; DBX installation and configuration; Databricks CLI installation; Databricks CLI profile configuration; DBX installation; DBX configuration; DBX execution; Resolve unicode error and add icons to Git Bash; DBX execution of a more complex example; Synchronize your local code with Databricks; References
How to online refresh BI tables with minimum impact to end users
Posted on May 10, 2023, updated on May 10, 2023 by Yannick Jaquier
Table of contents: Preamble; Preparation of the test environment; The old legacy technique; The alter table switch technique; The view technique; The synonym technique; The schema transfer technique; Conclusion; References
Citus columnar storage hands-on
Posted on March 28, 2023, updated on March 28, 2023 by Yannick Jaquier
Table of contents: Preamble; Citus installation; Citus columnar storage testing preparation; Citus columnar storage testing results; References