Wednesday, 1 February 2017

SSIS Appetizer: XML source is already sorted

I have a large XML file with Orders and Onderlines which I want to (merge) join into a new destination. To join the two outputs I need to order them, but the sort transformation takes too much time. Is there a faster alternative?
XML Source with two joined outputs

The solution is surprisingly quite simple: the outputs are already sorted and you only have to tell SSIS that (Similar to a source with order by in the query).

For XML files with multiple levels (first for orders and second for orderlines) like below, SSIS will create two output ports.
XML Sample

The outputs will have an extra bigint column which allows you to connect the orderlines to the correct order.
Two outputs with additional ID column

Instead of using these ID columns in the SORT transformations, you can also use the advanced editor of the XML source to tell SSIS that these columns are already sorted. Right click the XML source and choose 'Show Advanced Editor...'.
Show Advanced Editor...

Then go to the last page 'Input and Output Property' and select the Orderline output. In the properties of this output you can tell SSIS that the output is sorted.
Set IsSorted to true

Next expand OrderLine and then Output Columns and click on the additional ID column 'Order_Id'. In its properties locate the SortKeyPosition and change it from 0 to 1.
Set SortKeyPosition to 1

Repeat this for the second output called 'Order' and then close the advanced editor. If you still have the SORT transformations, you will notice the yellow triangle with the exclamation mark in it. It tells you that the data is already sorted and that you can remove the SORT transformations.
And if you edit the Data Flow Path and view the metadata you will see that the column is now sorted.
Sorted! Remove the SORT transformations!

The solution is very simple and perhaps this should have been the default sort key position anyway? It smells like a bug to me...
No sorts!

Wednesday, 4 January 2017

SSIS Appetizer: import numerics with other regional settings

I have a CSV file with numeric values that use a dot "." as decimal separator instead of the comma "," we use locally. When I try to import it in SSIS with a Flat File Source it gives me an error. I don't want to/can't change the regional settings on the server. How do I import this flat file without errors?
The value could not be converted because of a potential loss of data
10.5 should be 10,5 (or vice versa)

Error: 0xC02020A1 at DFT - Process Data, FF_SRC - myCsvFile [2]: Data conversion failed. The data conversion for column "myColumn" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
Error: 0xC0209029 at DFT - Process Data, FF_SRC - myCsvFile [2]: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR.  The "FF_SRC - myCsvFile.Outputs[Flat File Source Output].Columns[myColumn]" failed because error code 0xC0209084 occurred, and the error row disposition on "FF_SRC - myCsvFile.Outputs[Flat File Source Output].Columns[myColumn]" specifies failure on error. An error occurred on the specified object of the specified component.  There may be error messages posted before this with more information about the failure.
Error: 0xC0202092 at DFT - Process Data, FF_SRC - myCsvFile [2]: An error occurred while processing file "D:\myFolder\2016-12-27.csv" on data row 2.
Error: 0xC0047038 at DFT - Process Data, SSIS.Pipeline: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.  The PrimeOutput method on FF_SRC - myCsvFile returned error code 0xC0202092.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.

You can change the LocaleID of the connection manager to import this file. Right click the connection managers and choose Properties...
Go to properties of flat file connection manager

Then locate the LocaleID property and change it to English (United States), English (United Kingdom) or an other country that uses a dot "." as decimal separator. Or change it to for example Dutch (Netherlands) if you have the opposite problem.
Change LocaleID

Now run the package again to see the result.

Related Posts Plugin for WordPress, Blogger...