Friday, March 23, 2012

how do i import repeatable but non pre-defined fields?

hello all,

I'm strugeling for quite soem time and hoping to get your assistance:

facts:

We have a flat file source each record has the same set of columns, let's call it "identifying columns" each row can have after the identifying columns (but not all the records have it) a second set of columns, let's call it "detailed info". the "detailed info" set can repeat itself several time for each record We have a sql table destination The row in the table should represent an "indetifying columns" record

sample record 1:

David | 1 | Rodeo Drive | 456 |

sample record 2:

Jeff | 2 |

sample record 3:

John | 3 | Sunny rd | 111 | Marvell str | 6

expected output in the table:

David | 1 | Rodeo Drive | 456

Jeff | 2 | null | null

John | 3 | Sunny rd | 111

John | 3 | Marvell str | 6

question: How do i import the data from the source file into the table, when i dont know in advance how many columns each record has? and how do i divide the sets of "detailed info" into different records in the db?

Thanks,

Eric

From your example, it looks like the table represents a "detailed info" record. Regardless, your problem is that you have a varying number of columns and the SSIS flat file connection manager won't support that. There are several threads here about the problem.

To quickly describe my preferred solution, I think you should read the file as a single column and use a script component to parse it into the columns you need. Others may recommend the Derived Column for the parsing.

http://agilebi.com/cs/blogs/jwelch/archive/2007/05/08/handling-flat-files-with-varying-numbers-of-columns.aspx
|||

re the number of columns - i've used a script that adds empty columns to each row, based on the max number of columns in the file,

so my file will actualy look like this:

John | 123 | B str | 2 | | |

Jeff | 444 | | | | |

Amy | 555 | A str | 1 | C Str | 12

my issue still remains with splitting the columns into rows. meaning - i need my 3rd & 4th columns to represent 1 row in the table and i need my 5th & 6th columns to represent a second row in the destination table.

(the 2 columns are just an example in order to simplify, i actualy have a set of 10 columns that repeat itself and need to be separated into rows).

how would you suggest splitting each set of columns into rows?

thanks in advance !

|||Have you looked at the Unpivot transform? That is designed to move columns to rows.|||

yes, this is what I am trying to do now.

the question i have around this one is - whether we can do some kind of dynamic un-pivot (or via script task?) cause the number of columns that we have is not defined in advance, so i need some kind of looping on all the columns and insert every 4th column in destination column A and every 5th column into destination column B, etc.

is there such a thing, a dynamic unpivot? can you direct me to examples?

thanks!

|||The Unpivot does not support a dynamic number of columns to pivot on, so you'd need to use a script component transform with an asynchronous output. For each input row, you would output one or more rows. Take a look at "Creating an Asynchronous Transformation with the Script Component" in Books Online for some guidance in doing this.|||

I posted an example of this on my blog - hope it's helpful.

http://agilebi.com/cs/blogs/jwelch/archive/2007/05/18/dynamically-pivoting-columns-to-rows.aspx

No comments:

Post a Comment