CSV to XML with PowerShell: Writing Typed Objects to the Pipeline

Posted on July 8, 2015 by Jeff Hicks in PowerShell with 0 Comments

Let’s continue our journey of transforming CSV data into XML data. One of the primary reasons I’m undertaking this process is to maintain type information. With that in mind, let’s take a moment to see what is involved in importing an XML file and writing typed objects to the pipeline. For example, we’ll how to handle and treat a value like 345321 as an integer and not a string.

I’m going to use one of the XML files I created earlier that looks like this:

XML file. (Image Credit: Jeff Hicks)

XML file. (Image Credit: Jeff Hicks)

You’ll notice that I’ve added type information into my XML file. You can read in any XML file to PowerShell as an XML document.

You can also navigate the document like any other PowerShell object.

Navigating the XML document with PowerShell. (Image Credit: Jeff Hicks)

Navigating the XML document with PowerShell. (Image Credit: Jeff Hicks)

Using an XPath filter, it isn’t too difficult to see the data.

Using an XPath filter for better data visibility. (Image Credit: Jeff Hicks)

Using an XPath filter for better data visibility. (Image Credit: Jeff Hicks)

Technically, I’m getting a bit more than I really need but you get the idea. The data is waiting for you. But I would like to have this turned back into objects that I can use in PowerShell. Because my XML file was originally generated with the ConvertTo-XML cmdlet, it only has a few nodes. Other than the XML declaration, there is a top-level collection node that in turn has a collection of nodes. The ConvertTo-XML cmdlet defines these as Objects and Object. As I showed you last time, these nodes could be renamed. So I have to assume I don’t know in advance what they are called, and I’ll let PowerShell figure it out for me.

All that really matters is the nested collection of nodes. I can determine what they are called by getting the last child node of the document root and grabbing the first child of that node.

In this particular file $name will be “Process”. The next step is to use an XPath filter to select all of these nodes.

Now, I can enumerate each node and create a hashtable for each node. After processing all property nodes, I can write an object to the pipeline.

What did I end up with?

Looks pretty good. Until I check the object with Get-Member.

Checking our object with the Get-Member cmdlet in Windows PowerShell. (Image Credit: Jeff Hicks)

Checking our object with the Get-Member cmdlet in Windows PowerShell. (Image Credit: Jeff Hicks)

Sponsored

You can see that everything is a string. One solution is to use another mapping hashtable that defines a type for each property.

The value is a scriptblock that contains code to convert each text entry. You can use it like this:

Now look at what I see with Get-Member.

A mapping hashtable works if you don’t have any other options. Because I went to the effort of including type information in the XML file, I can take advantage of the type attribute on each node.

Instead of hashtable, I’m using a switch construct to convert the value depending on the type attribute. Note that if the value is null, then you might see a warning message, but that shouldn’t be a problem. The end result is a collection of properly-typed objects.

Now that I have the process for converting an XML document into objects, I can use from the PowerShell console, and because I’m going to be creating XML files that store type information, I can create a function for importing my XML files.

With this function, I can easily import my XML file.

Importing an XML file in Windows PowerShell. (Image Credit: Jeff Hicks)

Importing an XML file in Windows PowerShell. (Image Credit: Jeff Hicks)

Everything is properly typed.

This I can do whatever I need with the data.

Sponsored

My end-to-end transformation is just about complete. All the pieces are in place for a final solution, which I’ll go over next time.

Sponsored

Tagged with ,