Azure Data Factory and Synapse Analytics now provide the Script activity, which executes DML and DDL scripts against data stores (the SQL family, Snowflake, and Oracle).
To understand this activity, this article walks through a quick demo in Azure Synapse: using three separate Script activities, we create a table if it does not exist (DDL), insert data (DML), and select specific data (DML), then verify the result in the activity output.
Azure Data Factory and Synapse Analytics added a new activity called the Script activity, which can execute the common operations below using a script:
- Data Manipulation Language (DML): statements such as SELECT, INSERT, UPDATE, and DELETE let users retrieve, insert, modify, and delete data in the database.
- Data Definition Language (DDL): CREATE, ALTER, and DROP let users create, modify, and remove database objects such as tables, indexes, and users.
Allowed Data Stores for Script Activity:
- Azure SQL Database
- Azure Synapse Analytics
- SQL Server Database
- Oracle
- Snowflake
The script may contain either a single SQL statement or multiple SQL statements that run sequentially.
Sample use cases for the Script activity:
- Re-create fact and dimension tables before loading data into them.
- Save the dataset returned from a query as activity output for downstream consumption.
- Truncate a table or view in preparation for inserting data.
- Create, alter, and drop database objects such as tables and views.
- Run stored procedures.
Important features:
- Supported data source: Database (SQL family, Snowflake, Oracle)
- Supported Operations: Read / Modify
- Query parameter support: Input / Output
- Integrated CICD (ADF): Yes
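For reference, a Script activity appears in the pipeline JSON roughly as sketched below. This is a hedged sketch: the linked-service name `AzureSqlDatabaseLS` is illustrative, and the exact schema should be checked against the official Script activity documentation.

```json
{
  "name": "DropAndCreateEmpTable",
  "type": "Script",
  "linkedServiceName": {
    "referenceName": "AzureSqlDatabaseLS",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "scripts": [
      {
        "type": "NonQuery",
        "text": "IF OBJECT_ID(N'dbo.Employee', N'U') IS NOT NULL DROP TABLE dbo.Employee;"
      }
    ]
  }
}
```

The `scripts` property is an array, which is how a single activity can hold multiple statements that run sequentially.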
Here we will add separate Script activities to a Synapse pipeline and perform DDL and DML execution. First we will create the table if it does not exist, then insert data, and finally select specific data, using three separate Script activities, and verify the result in the activity output.
In the list and image below, we can see the three Script activities used for DDL/DML:
- DropAndCreateEmpTable (for DDL): drops the existing table and re-creates it. It uses NonQuery to perform the DDL DROP and CREATE for the Employee table. Please refer to the image below, where this Script activity is numbered 1.
- LoadIntoEmpTable (for DML): inserts dummy employee data.
- GetEmployeeDetails (for DML): selects a few employee details.
Stage 1: DropAndCreateEmpTable (for DDL) Script activity: here we drop the table if it exists and then create a new one.
DDL script – drop and create the table in a multi-line statement:
IF OBJECT_ID(N'dbo.Employee', N'U') IS NOT NULL
    DROP TABLE [dbo].[Employee];
CREATE TABLE dbo.Employee(
    EmpID  INT,
    Name   VARCHAR(200),
    Salary DECIMAL
);
Settings tab: Linked service: the target database the script runs on, specified as a reference to a linked service.
Script type: as we see in the image above, there are two types of script:
- Query: for DML – database statements that return one or more result sets.
- NonQuery: for DDL – database statements that perform catalog operations (for example, querying the structure of a database or creating database objects such as tables), or that change the data in a database with UPDATE, INSERT, or DELETE statements.
Currently the script can only be provided as inline text; according to Microsoft, an option to supply the script from a file should be added. As we can see in the Settings tab, the script-file option is still not available at the time of writing this article.
We will discuss logging in a later section of this article.
Stage 2: LoadIntoEmpTable (for DML) Script activity: here we load data, using an insert script, into the table created by the previous Script activity.
DML Script to insert data:
INSERT INTO Employee VALUES(5, 'Sarnendu', 600);
INSERT INTO Employee VALUES(2, 'David', 300);
INSERT INTO Employee VALUES(1, 'John', 900);
Stage 3: GetEmployeeDetails (for DML) Script activity: here we select records matching a specific condition from the table that was loaded by the previous Script activity.
DML script to select records – this should return the two records for 'Sarnendu' and 'John':
SELECT * FROM Employee WHERE Salary > 400;
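As a variation, the Script activity also supports input/output query parameters (noted under "Important features" above). Below is a hedged sketch of a `scripts` entry that parameterizes the salary threshold; the parameter field names follow my understanding of the Script activity JSON schema and should be verified against the official documentation.

```json
{
  "type": "Query",
  "text": "SELECT * FROM Employee WHERE Salary > @MinSalary",
  "parameters": [
    {
      "name": "MinSalary",
      "type": "Int32",
      "value": "400",
      "direction": "Input"
    }
  ]
}
```

Parameterizing the script this way lets the same activity be driven by pipeline parameters or variables instead of hard-coded literals.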
Enable Logging:
As we can see in the image below, the Script activity has a property to enable logging of the activity's output. As part of this demo, we will enable logging in stage 3, GetEmployeeDetails, to log the output as activity output.
To enable logging, please expand 'Advanced' and click on 'Enable logging', as shown in the image below. In this article we will use Activity output as the log output during the demo.
After we enable logging, we will have the option to select the script log output:
Script log output: options for where to log the output of the activity:
- Activity output: the user-defined message is put into the activity output. The maximum size limit of the activity output is 4 MB; any message beyond that is truncated, in which case we recommend 'External storage' instead. In the image above, we selected Activity output to log.
- External storage: the user-defined message is uploaded to external storage (such as blob storage) provided by the linked service. Use this option if messages are truncated with 'Activity output', as external storage has no size limitation.
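Under the hood, these choices map onto the activity's log settings in the pipeline JSON, roughly as sketched below for the external-storage case. This is a hedged sketch: the linked-service name `AzureBlobStorageLS` and the `path` value are illustrative, and the exact property names should be checked against the Script activity documentation.

```json
"logSettings": {
  "logDestination": "ExternalStore",
  "logLocationSettings": {
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLS",
      "type": "LinkedServiceReference"
    },
    "path": "scriptactivity-logs"
  }
}
```

For the activity-output option, `logDestination` would instead point at the activity output and no storage linked service is needed.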
The image below shows the setup of external storage for logging the output. Here we need to link an Azure Blob storage account to store the log.
So far we have created three Script activities in sequence. Now let's publish the pipeline and trigger it. Once the pipeline completes successfully, let's monitor it as shown in the image below:
In the pipeline run below, please have a look at the mapped activity runs, which ran sequentially:
Verifying result in Activity Output:
Let's check the activity output of stage 3, GetEmployeeDetails: we should get only two records, i.e. rowCount: 2, as shown in the image below.
Let's look at the detailed activity output of our pipeline – the detailed output records:
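For a Query-type script, the activity output is returned as JSON shaped roughly like the sketch below. The values come from this demo's data; the exact field names are my understanding of the output format and may vary.

```json
{
  "resultSetCount": 1,
  "resultSets": [
    {
      "rowCount": 2,
      "rows": [
        { "EmpID": 5, "Name": "Sarnendu", "Salary": 600 },
        { "EmpID": 1, "Name": "John", "Salary": 900 }
      ]
    }
  ]
}
```

Downstream activities can reference these values through the activity output, for example to branch on the number of rows returned.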
So we have seen how three Script activities can be used to drop/create a table, insert records, and select specific records, and we have verified the result in the activity output of the pipeline run.
Thanks for reading the article; please feel free to comment below with any queries or suggestions.