A flowlet is a reusable component in mapping data flows. It packages a series of transformation steps so that many data flows can use them as a single transformation, instead of repeating the same steps in each data flow.
To remember the term, think of it as data FLOW + tempLET = FLOWLET.
The flowlet transformation is generally available in both Azure Data Factory and Synapse Analytics pipelines.
In this article we will create a new flowlet, debug it, and use it in a data flow as a transformation.
In a separate article, we will create a flowlet from an existing data flow and use a flowlet as a source for a data flow.
Microsoft defines a flowlet as follows:
“A flowlet is a reusable container of activities that can be created from an existing mapping data flow or started from scratch. By reusing patterns you can prevent logic duplication and apply the same logic across many mapping data flows.”
A flowlet is analogous to a function in programming:
A function is reusable: it takes input of a defined type, processes that input using the logic in its body, and returns output of a defined type.
Similarly, a flowlet is reusable: it takes input parameters (the input setting), performs a series of transformations on the input data, and produces output data (the output setting).
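To make the analogy concrete, here is a minimal Python sketch of the function side of the comparison. It is not Data Factory code; the function name `normalize_name` and the sample values are hypothetical and chosen only for illustration.

```python
def normalize_name(name: str) -> str:
    """Defined input type (str), logic in the body, defined return type (str)."""
    return name.strip().title()


# Two different callers reuse the same logic instead of repeating it,
# much as several data flows can reuse one flowlet.
catalog_names = [normalize_name(n) for n in ["  mountain bike", "ROAD bike "]]
order_names = [normalize_name(n) for n in ["desk LAMP"]]
```

If the normalization rule changes, only the function body changes; the callers stay as they are. The same reasoning applies to a flowlet that is shared by several data flows.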
How is a flowlet similar to, or different from, a data flow?
- User interface: a flowlet has no data preview on its individual transformations; data preview is available only in the output setting.
- Components: the primary differences are the input, output, and debugging experiences.
- Debugging: we can debug a flowlet by passing values to its input parameters.
- Reusability: a flowlet is meant to be reused by many data flows.
There are two ways to use a flowlet:
- As a transformation step (the flowlet acts as a template, whether it was created from a data flow or manually from scratch).
- As a source of a data flow (add the flowlet as a source).
How to create a flowlet?
- Create it manually from scratch. In this article we will only create a flowlet from scratch.
- Create it from an existing data flow.
Create a flowlet from scratch manually: step by step
To create a new flowlet, select the ‘Author’ tab on the left side, click the three dots (…) to the right of ‘Data flows’, and select ‘New flowlet’.
Once we select ‘New flowlet’, the canvas shown in the image below opens, where we can start the flowlet with either the ‘Add Input’ or the ‘Add Source’ option. Please have a look at the flowlet logo.
Create a flowlet using ‘Add Input’ parameters:
Once we click on ‘Add Input’, we get the option to add the input setting, the transformations, and the output setting. Let us name this flowlet “ProductNewFlowlet”, as we are creating it from scratch and it will transform the product dimension source.
Before we create the full flowlet, let’s understand its components.
A flowlet has 3 required components:
- Input setting: a flowlet starts with the input setting. The input of a flowlet defines the input columns expected from a calling mapping data flow, i.e. the input setting contains the input parameters.
- Transformations: the series of transformation steps between the input and output settings.
- Output setting: a flowlet ends with the output setting. The output of a flowlet defines the output columns that the calling mapping data flow can expect to receive, i.e. the output setting contains the output parameters.
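The three components can be modelled in a short Python sketch. This is only a conceptual illustration, not how Data Factory implements flowlets; the class `FlowletSketch` and the column names `ProductName` and `Price` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


@dataclass
class FlowletSketch:
    input_columns: List[str]   # what the calling data flow must supply
    steps: List[Step]          # transformations between input and output
    output_columns: List[str]  # what is returned to the calling data flow

    def run(self, rows: List[Row]) -> List[Row]:
        # Input setting: every row must carry the expected input columns.
        for row in rows:
            missing = [c for c in self.input_columns if c not in row]
            if missing:
                raise ValueError(f"missing input columns: {missing}")
        # Transformations: apply each step in order.
        for step in self.steps:
            rows = step(rows)
        # Output setting: emit only the declared output columns.
        return [{c: row[c] for c in self.output_columns} for row in rows]


sketch = FlowletSketch(
    input_columns=["ProductName", "Price"],
    steps=[lambda rows: [r for r in rows if r["Price"] > 0]],  # hypothetical filter
    output_columns=["ProductName"],
)
result = sketch.run([
    {"ProductName": "bike", "Price": 10},
    {"ProductName": "lamp", "Price": 0},
])
```

The point of the sketch is the contract: the caller only needs to know the input and output columns, while the steps in between stay hidden inside the reusable unit.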
Flowlet Input setting:
Now let’s add the input parameters with their data types, as shown below. These are the input parameters of the flowlet; their values will be provided by the calling data flow.
Flowlet transformation steps:
Next we will add standard data flow transformation steps. In our example flowlet we will add two transformations: a filter and a derived column.
Let’s click on the + sign in the image above to add the filter and the derived column transformations, as shown below respectively.
Please note that neither transformation has a data preview tab; we get a data preview tab only in the output setting.
Flowlet Output setting:
Now let’s add the output of the flowlet, as below. Please note that the data preview tab is available only in this output setting.
Once we click on ‘Output’ as shown in the image above, we need to configure the output setting of the flowlet as shown below. The values of these output fields will be returned to the calling data flow.
Debugging a flowlet:
Before we use this flowlet in a data flow, Azure Data Factory lets us debug it by passing test values to the input parameters, as shown in the image below:
Please ensure that data flow debug is ON. To test the flowlet, go to the ‘Data preview’ tab of the output setting and click on refresh. The flowlet applies the filter and derived column transformations shown earlier and displays the result in this tab:
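The idea behind this debug run, feeding sample values in and inspecting what comes out, can be sketched in plain Python. The filter rule, the derived column, and the column names below are assumptions made for illustration; the actual rules used in “ProductNewFlowlet” are the ones configured in the screenshots.

```python
from typing import Dict, List

Row = Dict[str, object]


def product_flowlet_sketch(rows: List[Row]) -> List[Row]:
    # Filter step (hypothetical rule): keep rows with a positive price.
    kept = [r for r in rows if r["Price"] > 0]
    # Derived column step (hypothetical rule): add an upper-cased name.
    return [{**r, "ProductNameUpper": str(r["ProductName"]).upper()} for r in kept]


# "Debug" run: supply sample input values and inspect the output.
debug_input = [
    {"ProductName": "bike", "Price": 100},
    {"ProductName": "lamp", "Price": 0},
]
debug_output = product_flowlet_sketch(debug_input)
for row in debug_output:
    print(row)
```

Choosing sample values that exercise both outcomes of the filter (one row kept, one row dropped) makes it easier to confirm that each step behaves as intended before the flowlet is used in a real data flow.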
Flowlet transformation in mapping data flow: running a flowlet inside a mapping data flow
Let’s use the flowlet in a data flow and run it inside a mapping data flow, i.e. our calling data flow will invoke the flowlet by providing the input parameters.
We have already created a data flow and added a data source to it, i.e. Azure Data Lake Gen2. Before adding the flowlet transformation, let’s preview the source data on which we will perform the flowlet transformation:
Now let us add the flowlet as a transformation step in this data flow, as shown in the image below:
Once we click to add the flowlet transformation as shown above, we have to select the flowlet “ProductNewFlowlet” that we created earlier. Please refer to the image below:
Mapping between source data column and input parameters of flowlet:
It is important to get this mapping right, because it decides which source columns are passed to the flowlet and therefore what data reaches the sink.
Mapping tab: “If the selected flowlet has input columns, you can map columns from the input stream to the expected input columns in the flowlet. This mapping of your mapping data flows columns to the flowlet is what enables the flowlets to serve as reusable snippets of mapping data flow logic across potentially many mapping data flows.”
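Conceptually, the mapping is a lookup from each expected flowlet input column to a column of the incoming stream. The following Python sketch shows that idea only; the helper `map_to_flowlet_inputs` and all column names are hypothetical, and in Data Factory the mapping is configured in the Mapping tab rather than in code.

```python
from typing import Dict, List

Row = Dict[str, object]


def map_to_flowlet_inputs(rows: List[Row], mapping: Dict[str, str]) -> List[Row]:
    """mapping: flowlet input column name -> source column name."""
    return [{inp: row[src] for inp, src in mapping.items()} for row in rows]


# Source stream with its own column names (hypothetical).
source_rows = [{"prod_nm": "bike", "list_price": 100, "warehouse": "W1"}]

# Which source column feeds which flowlet input column.
mapping = {"ProductName": "prod_nm", "Price": "list_price"}

mapped = map_to_flowlet_inputs(source_rows, mapping)
```

Because each data flow supplies its own mapping, the same flowlet can be applied to sources whose column names differ. A wrong entry in the mapping would silently feed the wrong column into the flowlet, which is why the mapping deserves a careful check.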
Once the mapping is done, let’s preview the data in the flowlet transformation itself:
Now we add a sink after the flowlet transformation; we can review the data in the sink’s ‘Data preview’ tab, as shown below:
Earlier we saw the source data of the data flow. We then added the flowlet as a transformation step and previewed the data in the flowlet, and later we previewed the data in the data flow sink. In both cases the preview data is the same: the flowlet contains 2 transformations, a filter and a derived column, which are applied to the source data, and the preview data is generated accordingly.
In this article we looked at why we need a flowlet, how to create one from scratch, and how to debug it. We also saw how to use it as a transformation inside a data flow.