Tuesday, March 27, 2012

Foreach loop over Excel files seems 'fragile'

All,

I have a package that loops over ~60 Excel files in a directory. Each
file has three named ranges in it, which I import into different
tables. Sometimes the package runs without a hitch, sometimes it
chokes. But it is intermittent.

If I pull the control flow components out of the foreach loop and
point the Excel connection manager to the specific Excel file that has
caused the package to choke, I get a message in the dataflow component
pointing to the named range that "the metadata of the following output
columns does not match the metadata of the external columns......Do
you want to replace the metadata of the output columns with the
metadata of the external columns?" When I choose 'yes', then the
file will be loaded. then I can put the control flow components back
into the foreach loop and the file will run again, successfully, along
with some more, until it chokes again....

So, first of all, does anyone have any insight into this? Sometimes,
somedays, these files will load with no problems. These exact files;
I am having to reload constantly... Other times, like today, it is a
battle.

Otherwise, is there a way to get Integration Svcs to handle the
metadata issue on the fly?

Any ideas, resources, references, war stories, or good clean jokes
would be appreciated,
Kathryn

Metadata cannot change... Do you have changing metadata in your Excel documents, or does SSIS just think it is changing?|||

Phil,

Thanks for the quick reply. It seems that SSIS thinks the metadata is changing..

As far as I can tell, the problem is caused when a field in the file does/does not have a hyphen in it. For example, some files give us EIN with a hyphen and some don't. The package will chug along until it gets an EIN with a hyphen, then it will choke. I will pull the control flow components out of the foreach, point the excel source at the file that's causing it to choke, then i will answer yes to the metadata warning. Then I'll put the control flow components back into the foreach and it will chug along until it gets to a file WITH a hyphen in the EIN, when it will choke again....

All fields are defined to be strings. I even put a Data Conversion component after the Excel Source component to strip out hyphens, but the data flow doesn't get to the Data Conversion; it chokes on the Excel Source.

Kathryn

|||

Hey Kathryn,

Try this... I don't know if it'll work or if you've already tried this, but try to process the erroneous file first (if possible) in the loop. I don't know if you can control that or not. Here's what I'm thinking. I think that SSIS looks at the first file, sees that FieldA1 is a numeric, and sets the metadata to numeric for that field. When you encounter a text value for that same field in a subsequent file, it bombs. So I'm wondering if you can process a file first that contains the text value of that field, for example. Then it'll think that field is a text field and process it the same for the rest? It's just a thought!

Rebecca

|||Maybe setting IMEX=1 in the excel connection string is the answer here as well.|||

Phil,

Thanks for the suggestion. Unfortunately, it didn't work, though it seems that that should be the answer....

Kathryn

|||

I'm very surprised that IMEX=1 did not work, since forcing everything to be loaded as a string should avoid the issue with the mixed data types that you otherwise have in your EIN column (numeric values when there's no dash, string values when there is one).

The only potential issue that comes to mind is the difference between string and memo fields, for which there must be at least 1 row with a memo value in the rows sampled by the driver for the driver to recognize that column as a memo column.

Let's remember that Excel has no column metadata. The driver can only guess.

-Doug

sql

No comments:

Post a Comment