Thursday, March 29, 2012

Foreach NodeList Enumerator

Does anyone have any experience of using the NodeList enumerator in a Foreach loop? BOL is a bit light on this.

I want to enumerate over an XML Document that is passed into my package. The package is executed from a .net application.
Has anyone done anything like this?
Any demo material?
Should I pass the XML Document into an SSIS object variable or a String variable?
Can the NodeList enumerator enumerate an XML document that is stored in a String variable?
etc...
Thanks
JamieHey Jamie,
I've done this -- albeit for a rather simple example. I have an XML file that is just a persisted collection of structs with properties called "BusinessObjectName", so the data itself looks like this:
<DimensionInfos xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<BusinessDimensionEntries>
<anyType xsi:type="BusinessDimensionInfo">
<BusinessObjectName>User</BusinessObjectName>
</anyType>
</BusinessDimensionEntries>
</DimensionInfos>

Essentially I want to iterate over all the nodes and pull out the "BusinessObjectName" from each. I created a foreach NodeList enumerator, document source is the file, enumeration type is NodeText, XPath source is DirectInput (meaning I specify it within the task) and the XPath string is "//BusinessObjectName". This query will recursively match all nodes of that type. Then I map index 0 to some variable and I will get the NodeText "User" (as specified) of the node(s) that match my XPath expression.
Not sure if you came across this article:
http://databasejournal.com/features/mssql/article.php/3528791
Also, a good XPath reference:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk/html/_xpath_reference.asp

|||BTW - DocumentSourceType seems to be what you need, no? It allows you to draw the XML in from a variable or a string (which I suppose in theory you could bind to an expression =))
|||I have to say the use of indexes is a bit of a black art.

What if I want to get the value of multiple elements of a node.

i.e. <root><mynode value1="simon" value2="fred" />
<mynode value1="jamie" value2="smith" />
</root>

Can I iterate through mynode elements and store the value1 and value2 attributes.|||

SimonSa wrote:

I have to say the use of indexes is a bit of a black art.

What if I want to get the value of multiple elements of a node.

i.e. <root><mynode value1="simon" value2="fred" />
<mynode value1="jamie" value2="smith" />
</root>

Can I iterate through mynode elements and store the value1 and value2 attributes.


Simon,

I'm trying to do something almost identical to this but just can't get it working at the moment. I'll post up here if i get something useful working.

-Jamie|||I sent feedback about the help not being very helpful.

The help for the properties of the for each loop doesn't explain what each one is it just says 'set the value'. What else would I do with it. It needs to say what the value should be set to. Unlike other feedback. I didn't get a response on this one|||

SimonSa wrote:

I sent feedback about the help not being very helpful.

Me too. Doug Laudenshlager is good at taking feedback on board so expect something more useful in the future.

-Jamie|||

SimonSa wrote:

I have to say the use of indexes is a bit of a black art.

What if I want to get the value of multiple elements of a node.

i.e. <root><mynode value1="simon" value2="fred" />
<mynode value1="jamie" value2="smith" />
</root>

Can I iterate through mynode elements and store the value1 and value2 attributes.

Hi all,
OK, i'm progressing with this but have now hit EXACTLY the same problem as Simon. Here's my XML document:
<LastLoadDateList>
<Pair>
<StreamDetailID>3</StreamDetailID>
<LastLoadDate>2005-09-19 13:40:00</LastLoadDate>
</Pair>
<Pair>
<StreamDetailID>4</StreamDetailID>
<LastLoadDate>2005-09-19 13:42:15</LastLoadDate>
</Pair>
</LastLoadDateList>

I've managed to enumerate the 2 <Pair> nodes which results in the strings "32005-09-19 13:40:00" & "42005-09-19 13:42:15" getting enumerated. I've got EnumerationType=Nodetext.
Can you see what's happened here? Its concatenated the StreamDetailID & LastLoadDate nodes.

So how can I get those 2 values out into seperate variables? I guess its something to do with InnerXPathString but I can't see how to do it.

The lack of documentation around this is infuriating. To say the least!!!

Any help much appreciated!

-Jamie
|||<root>
<mynode value1="simon" value2="fred" />
<mynode value1="jamie" value2="smith" />
</root>

If your goal is to enumerate the attributes of the preceeding document, you could do so with an EnumerationType of NodeText and an OuterXPathString of /root/mynode/@.*|||<LastLoadDateList>
<Pair>
<StreamDetailID>3</StreamDetailID>
<LastLoadDate>2005-09-19 13:40:00</LastLoadDate>
</Pair>
<Pair>
<StreamDetailID>4</StreamDetailID>
<LastLoadDate>2005-09-19 13:42:15</LastLoadDate>
</Pair>
</LastLoadDateList>

If your goal is to enumerate the children of the Pair nodes in the preceeding document, you could do so with an EnumerationType of NodeText and an OuterXPathString of /LastLoadDateList/Pair/*|||Can the NodeList Enumerator be used to slice out a segment of an XML file and store it in a variable? Here's an example of what I would like to do. Given the following xml file:

<book>
<section>
<name>Chapter1</name>
<content>...</content>
</section>
<section>
<name>Chapter2</name>
<content>...</content>
</section>
</book>

I would like to use the NodeList Enumerator to loop twice over the XML and pull out the section into a variable, such that the variable would contain <section><name>Chapter1</name><content>...</content></section> in the first iteration and <section><name>Chapter2</name><content>...</content></section> in the second. Whatever I do, the enumerator either wants to map the different elements to different variables or store everything as text. Any help on this matter is appreciated.

Regards,
Lars R?nnb?ck|||If you set the index to -1 and the variable type to object and the outerXpath to \\section you should end up with the section in your object|||Should the enumerator type then be "Navigator" or "Node", and how would you transform the object variable back to string, so I could use it in an XSLT task?

Thanks for the tip about the -1 index, that seems to put everything into a single variable though.

Regards,
Lars|||Probably Node

and probably need to use a script component to convert the object to text.|||

Since others might be struggling with figuring out the InnerXPathString I thought I would post my experiences. I have the following recursive schema in the database for my typed XML datatype:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns="http://garbleddomain/schemas/meta/jobb"
targetNamespace="http://garbleddomain/schemas/meta/jobb"
elementFormDefault="qualified">
<xs:element name="job">
<xs:annotation>
<xs:documentation>
<xhtml:p>
Defines a job.
</xhtml:p>
</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element ref="job" minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="name" type="xs:string" />
<xs:attribute name="script" type="xs:string" />
<xs:attribute name="type" default="Group">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Group" />
<xs:enumeration value="SP" />
<xs:enumeration value="SSIS" />
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:complexType>
</xs:element>
</xs:schema>

The schema is used to define jobs I want to run that can be grouped and hierarchical. Here's an example XML document:

<?xml version="1.0"?>
<job xmlns="http://garbleddomain/schemas/meta/jobb">
<job name="Loads">
<job name="Load customer data" script="sp_loadCust" type="SP" />
<job name="Load articles" script="sp_loadArticles" type="SP" />
</job>
<job name="Updates">
<job name="Update transactions" script="sp_updateTrans" type="SP" />
<job name="Update categories" script="UpdateCategories" type="SSIS" />
</job>
</job>

To retrieve the XML using an Execute SQL Task over OLE DB I set the ResultSet type to XML and used the following Direct Input query:

SELECT CAST(definition AS VARCHAR(max)) AS JobDefinition
FROM META_Job_TB
WHERE (JobID = ?)

In the Parameter Mapping section I map one String variable, User::JobID as Input with type VARCHAR and Parameter Name 0. In the Result Set section I map another variable User::JobDefinition with Result Name 0. This will pull the XML document above wrapped in <ROOT> tags.

Since I want to remove the <ROOT> tags and since I couldn't get SSIS to work with typed XML I have a cleanup step using an XML Task (XSLT) where I remove the tags and namespace. The XSLT is a Direct Input which looks as follows:

<?xml version="1.0" ?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:id="http://garbleddomain/schemas/meta/jobb"
exclude-result-prefixes="id">
<xsl:template match="/">
<xsl:apply-templates mode="copy-no-ns" select="/ROOT/id:job"/>
</xsl:template>
<xsl:template mode="copy-no-ns" match="*">
<xsl:element name="{name(.)}">
<xsl:copy-of select="@.*"/>
<xsl:apply-templates mode="copy-no-ns"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>

I use the User::JobDefinition variable both as input source and output operation result destination. Now I am left with the job definition XML document without the namespace. In this case I want to iterate over the actual jobs and store the attributes in variables that can be accessible inside the loop, and do the following:

1. Create three String variables; User::JobName, User::JobScript, and User::JobType.
2. In the ForEach Loop I select the NodeList Enumerator.
3. I use the User::JobDefinition as document source.
4. Set EnumerationType to ElementCollection.
5. Set the OuterXPathString to //job[not(@.type = 'Group')]
6. Set the InnerElementType to NodeText.
7. Set the InnerXPathString to @.*
8. In Variable Mappings add the three String variables with Index 0, 1, and 2.

Voila, the variables will now be set to the values of the attributes for each job.

Hope this helps someone,
Regards,
Lars R?nnb?ck

No comments:

Post a Comment