How to Read CSV file in PySpark easily in Azure Databricks

In this article, we will demo how to read a CSV file in PySpark and load it into a DataFrame in several ways using an Azure Databricks notebook. PySpark provides the csv() and load() methods to read and load data from: We will read CSV files using different options such as delimiter/separator, inferSchema, custom…

Delta Lake’s Change Data Feed (CDF) Demo in Azure Databricks

In this article, we will demo how to implement Delta Lake’s Change Data Feed (CDF) using a multi-hop (medallion) architecture in Azure Databricks. Need for Change Data Feed (CDF): Simplify CDC: Change Data Feed (CDF) addresses the challenges we face when using Change Data Capture (CDC), such as improving: Quality control by tracking row-level changes.…

Databricks SQL Analytics introduction

Databricks SQL Analytics – Quick introduction: Databricks SQL is now Generally Available (GA), providing a new generation of analytics and data applications that run directly on the data lake, with data warehousing capabilities and SQL support on the Databricks Lakehouse Platform. Databricks SQL (DB SQL): allows us to operate on a multi-cloud lakehouse architecture; provides thousands of optimizations…