Sean Forgatch
- Jan 21, 2019
- 2 min read

How to Copy Multiple Tables in Azure Data Factory

Problem: You need to copy multiple tables into Azure Data Lake Store (ADLS) as quickly and efficiently as possible. You don't want overhead of having to map the source table to the target directory.

Solution: Use the concept of Schema Loader/ Data Loader in Azure Data Factory (ADF). You will first get a list of tables to ingest, then pass in the list to a ForEach that will copy the tables automatically in parallel.

Create Linked Services

1. Create a connection to the source where we will extract the data from. I'm using Azure SQL Database in this case.

Best Practice: ADF Connections - Store Connection Strings and Secrets in Azure Key Vault

Best Practice: Naming Convention - Always start the name of a Dataset or LinkedService with the type abbreviation(eg. LS_<systemType). If it is a direction connection add on that additional info(eg. LS_<systemType>_myDBname. Don't put environment abbreviations in the name, this will save you some work if you setup CICD.

2. Create a data lake connection to copy the table(s) to. I'm using Azure Data Lake Store Gen1.

Best Practice: Managed Service Identity - When you create a Azure Data Factory service, a service principle with the same name automatically gets created. Use this for connectivity to Azure Data Lake.

Create a Pipeline

Lookup Activity

1. Create a Dataset: Azure SQL Database

2. Select the LS_ASQL Linked Service you created, do not select a table and do not define a schema.

3. Add Lookup Activity named Get-Tables

4. Select Settings > Select the LS_ASQL Linked Service you created > Select Query

5. Copy & Paste the following query to retrieve a list of tables:

SELECT QUOTENAME(table_Schema)+'.'+QUOTENAME(TABLE_NAME) AS Table_Name FROM information_Schema.tables

ForEach Activity

1. Name your activity Copy-Each-Table

2. Go to Settings > Items

3. In the "Add Dynamic Content" blade under Activity Outputs, select the output for our

Lookup Activity

4. Add .value to the end like this:

@activity('Get-Tables').output.value

5. Select Activities

6. Add a Copy Data Activity

7. Click source > Query

8. Input the following dynamic sql code:

SELECT * FROM @{item().Table_Name}

9. Create the Sink Dataset (ADLS) named DS_ADLS

10. For storage path we will use: Datalake>DBName>TableName (as we are just doing full copies) and store files as Text

Pipeline Execution

Next, go ahead and and hit the debug button to run your pipeline. You can see that multiple tables start getting copied in parallel.

Our result in Azure Data Lake is 1 file per table.

MODERN DATA ENGINEERING

Data Strategy and Engineering Insights

How to Copy Multiple Tables in Azure Data Factory

Create Linked Services

Create a Pipeline

Pipeline Execution

Recent Posts