New for 2020: Automating Data Hubs with Databricks Delta (Cleveland, OH)
Excited to share that I'll be presenting a new topic in the first half of 2020 on how to automate a Data Hub leveraging Apache Spark and...
THE MANY SMALL FILE PROBLEM: If you have ever worked with big data tooling, it is very likely that you've encountered The Many Small Files...
Part 2 of 4 in the series of blogs where I walk through metadata-driven ELT using Azure Data Factory. We will review the primary component...
Metadata-driven Azure Data Factory. Automate data ingestion into Azure Data Lake
Problem: Need to profile a certain object to understand certain metrics in preparation for Data Warehousing, Engineering, or Science....
https://delta.io/ I've been working on Databricks Delta for clients since it was in preview, and it changed the game for how we can do...
The GDELT dataset is one of the most interesting datasets in the world and is perfect for Analytics, Big Data, and AI project...
When generating Azure Data Factory (ADF) ARM templates, not all fields are automatically parameterized, or you may not want a huge list of...
Problem: You need to copy multiple tables into Azure Data Lake Store (ADLS) as quickly and efficiently as possible. You don't want...
This diagram can help you understand the logistics behind CI/CD for Azure Data Factory. Below we will walk through some of the high level...