Tuesday 18 April 2017

SSIS - Searching Records in Flat file when loading

Introduction
In this article we are trying to play with filtration in Flat File. As we all know that to make any filtration with flat file, we have to load the entire flat file into a staging table or just use a Conditional Split Transformation after loading the Flat file by using Flat File Data source and Flat file connection manager.

If the size of our flat file is huge and we need only some records depending on specified criteria, why we are going to load the entire flat file and then filter some records from it?
It is always better, if we are going to filter some records before loading the flat file and then work with those records. Something like Incremental Load with Flat File but No staging table.

Here we are just going to demonstrate the simple filtration in flat file before loading it. We can make it Incremental load by using the same approach.

Look’s interesting? To understand this article properly, we need some prerequisite like the VB scripting language and understanding of parameterized Batch file with SSIS.

Case Scenario
The high-level view of our SSIS package is it just load a flat file form the source and before loading the flat file it filters some records first and then load it.
Please note that we are not going to load the entire flat file. We are just going to load some selected records.

Source and Destination Details
In source we have a flat file with huge volume of data and in destination, we have a table object to store the records.


Sample of Flat File Source:



Filter Criteria:
Before loading the flat file we are just going to filter the records depending on the criteria mentioned below.
We are going to use the last columns of the Flat File and retrieve those records that have greater than or equal (>=) to the specified date in a variable.

Proposed Solution
We are going
·         Call a parameterized batch file from the SSIS Execute Process task
·         Batch file is going to call the VB script file
·         The VB Script file is going to filter the flat file according to criteria and make another flat file with filtered output
·         By using the newly created flat file we are going to store the records into our DB table objects


Pictorial Diagram for Understanding:





[Step - 1 ] Create the VB Script

Here in our example we are creating the VB script in the name of Test.bvs


strFiles = WScript.Arguments.Unnamed(0)
strDate  = WScript.Arguments.Unnamed(1)

Set objFS = CreateObject( "Scripting.FileSystemObject" )
Set objFile = objFS.OpenTextFile(strFiles)

Do Until objFile.AtEndOfStream
   strLine = objFile.ReadLine
   s = Split(strLine,"|")
  
   if (CDate(s(5)) >= CDate(strDate)) then
      WScript.Echo  strLine
   end if
Loop

If we look at the code we can see that it’s open the flat file first. The name of the flat file is supplied with arguments.
strFiles = WScript.Arguments.Unnamed(0)

Then it takes another input. In our case it is a Date.
strDate  = WScript.Arguments.Unnamed(1)

Then it reads the file line by line using a Loop until the end of the file comes. After reading a single line of flat file, it just splits the columns.
s = Split(strLine,"|")
In our case the column separator is (|). In our flat file, there are 5 columns (starting columns are 0 columns). The last columns contain the Date columns that we need to compare.

if (CDate(s(5)) >= CDate(strDate)) then


[ Step – 2 ] Create the Batch file that called the VB Script file
Here we are creating the batch file in the name of MyBatch.bat
The batch file calls the VB script file. To call the VB script file we need two parameters.


@ECHO OFF
SET arg1 = %1

REM cscript //nologo test.vbs stdDtls.txt 05-Jan-2015 > filter.txt
cscript //nologo test.vbs StdDtls.txt "%1%" > filter.txt

@ECHO ON

Here the filtered output is stored into another flat file named filter.txt. The file name filter.txt is automatically created.

[ Step – 3 ] Call the Batch file from SSIS Process Task
First, we have to create a variable in our SSIS package which contains the date and depending on that date we are going to extract the records.



Now we are going to call the batch file from Execute Process Task of SSIS.





After executing the process task, it generates another flat file. This flat file contains the filtered records. In our case the name of the file is filter.txt





[ Step – 4 ] Retrieve records from Newly created flat file
Now we are extracting the records from newly created flat file. For Meta data mapping we just execute the Execute Process time once in first time to create a new flat file and then use this flat file form Flat File Connection Manager.
If it is needed, we can use Expression of Flat File connection manager Property named Connection string to make it dynamic.

Control Flow Diagram:




Data Flow Diagram:





Output in Data Viewer:





Here we are not going to store it in a DB table object. We are just showing it in the Data Viewer.

Hope you like it.




Posted By: MR. JOYDEEP DAS


Sunday 9 April 2017

Learn MDX with Me – Part - 6

Introduction
Now we are continuing our journey of MDX. Now we are trying to drill down more on MDX Query. Hope the session is very interesting.

Understanding Parent function
Parent function represents the Parent of the current member. To understand it, we are creating a new Hierarchy with the name of [CallenderHierarchy






Now try to see a member of Calendar Hierarchy

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        [Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017] on Rows
FROM   [CUBESales]

Output:

Qty Sold
Sales Rate
Calculated Sales Amout
Jan-17
80
900
35000

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        [Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017].parent on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Quarter 1, 2017
100
1200
38000

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        [Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017].parent.parent on Rows
FROM   [CUBESales]


Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Calendar 2017
100
1200
38000

Ancestors Function
It is quite difficult to use parent function because if we have lot of Hierarchy Levels. Suppose we have 9 Hierarchy levels and the members of the last Hierarchy Level we want to move the top level… we just do code like that


[Members of Last Level].Parent.Parent.Parent.Parent.Parent.Parent.Parent.Parent.Parent


So we have another function named Ancestors. It tales the current member and the level number that we need to see.


SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        Ancestors([Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017], 2) on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Calendar 2017
100
1200
38000

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        Ancestors([Dim Time].[CallenderHierarchy].[Year Name].
            &[Calendar 2017].&[Quarter 1, 2017], 1) on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Calendar 2017
100
1200
38000

Instead of Level 1,2,3 we can directly use the Level name also.


SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        Ancestors([Dim Time].[CallenderHierarchy].[Year Name].
            &[Calendar 2017].&[Quarter 1, 2017],
            [Dim Time].[CallenderHierarchy].[Year Name]) on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Calendar 2017
100
1200
38000



Ascendants Function
The Ascendants function returns all of the ancestors of a member from the member itself up to the top of the member’s hierarchy; more specifically, it performs a post-order traversal of the hierarchy for the specified member, and then returns all ascendant members related to the member, including itself, in a set. This is in contrast to the Ancestor function, which returns a specific ascendant member, or ancestor, at a specific level.


Try this

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        ascendants([Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017]) on Rows
FROM   [CUBESales]



Qty Sold
Sales Rate
Calculated Sales Amout
Jan-17
80
900
35000
Quarter 1, 2017
100
1200
38000
Calendar 2017
100
1200
38000
All
100
1200
38000

Finding Brothers and Sister of Specified members


We have to find the current members Parent first and then find the children of the parents.
   

     Finding Parent

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        [Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017].parent on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Quarter 1, 2017
100
1200
38000

  

  Finding Children of the Parent

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        [Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017].parent.children on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Feb-17
20
300
3000
Jan-17
80
900
35000
Mar-17
(null)
(null)
(null)

Siblings Function
It is the same output as Parent.Children function

SELECT {[Measures].[Qty Sold],
        [Measures].[Sales Rate],
        [Measures].[Calculated Sales Amout]} on Columns,
        [Dim Time].[CallenderHierarchy].[Year Name].&[Calendar 2017].
           &[Quarter 1, 2017].&[January 2017].siblings on Rows
FROM   [CUBESales]

Output:


Qty Sold
Sales Rate
Calculated Sales Amout
Feb-17
20
300
3000
Jan-17
80
900
35000
Mar-17
(null)
(null)
(null)


This learning session will be continued. Please make your interest by commenting it.




Posted by: MR. JOYDEEP DAS