PowerShell Problem Solver: Text to Objects with Regex

Posted on July 22, 2015 by Jeff Hicks in PowerShell with 0 Comments

In today’s PowerShell Problem Solver, we’re tackling a problem that comes up often, which I’m sure I’ll write about again in the future.

A user at the PowerShell.com forums asked a question about how to parse output from what I’m assuming was a command-line utility. The command output looked like this:

Figure 1: Text-Based Output. (Image Credit: Jeff Hicks)


For the sake of demonstration, I’m going to use a text file that contains this output. If you look at the output, you can imagine two different types of objects being displayed. I’m going to focus on the second part of the output, which is the “Disks in Use” section. Remember this is all text. I’m going to deviate a little from the original need for the sake of education. First, let’s say I only want to get the disks with a failed status. The quickest solution is to use Select-String.
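As a minimal sketch of that first step, assume the "Disks in Use" section looks roughly like the sample lines below. The column names and values are hypothetical stand-ins, not the original output, and in practice the lines would come from Get-Content against your own file.

```powershell
# Hypothetical sample standing in for the "Disks in Use" text; the real layout may differ.
$lines = @(
    'Slot  ID     Size   Status'
    '0     disk0  500GB  OK'
    '1     disk1  500GB  Failed'
    '2     disk2  1TB    OK'
    '3     disk3  1TB    Failed'
)

# Select-String emits one MatchInfo object per line that matches the pattern.
$failed = $lines | Select-String -Pattern 'Failed'

# The Line property holds the original text of each matching line.
$failed.Line
```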

No denying it works, but we lost the heading.

Figure 2: Using Windows PowerShell's Select-String cmdlet to find matching text. (Image Credit: Jeff Hicks)


I could use a slightly more complex pattern to include the header.
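One way to sketch this is with regex alternation, so the pattern matches either the header line or a failed row. This assumes the header begins with the word Slot, which is a hypothetical column name used only for illustration.

```powershell
# Hypothetical sample standing in for the "Disks in Use" text; the real layout may differ.
$lines = @(
    'Slot  ID     Size   Status'
    '0     disk0  500GB  OK'
    '1     disk1  500GB  Failed'
    '2     disk2  1TB    OK'
    '3     disk3  1TB    Failed'
)

# '^Slot' matches the header line; 'Failed' matches the rows of interest.
$hits = $lines | Select-String -Pattern '^Slot|Failed'
$hits.Line
```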

Figure 3: Adding the header. (Image Credit: Jeff Hicks)


But this is all still text, and you know how I feel about PowerShell. I think it would be more helpful to turn this text back into a set of objects. The header text makes a natural list of properties. One solution is to use a regular expression pattern that includes named captures.

With a named capture, you can reference matching groups by a name. So I’ll create a pattern to describe the data I want to process.
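To illustrate the idea, here is a sketch of a named-capture pattern for a row with four whitespace-separated columns. The group names (Slot, ID, Size, Status) and the sample row are assumptions for demonstration; a real pattern has to follow the actual columns in your output.

```powershell
# Each (?<Name>...) group captures one column; \s+ consumes the spacing between columns.
# The names and the row below are hypothetical.
[regex]$rx = '(?<Slot>\d+)\s+(?<ID>\w+)\s+(?<Size>\d+\w+)\s+(?<Status>\w+)'

$result = $rx.Match('1     disk1  500GB  Failed')

# Captured values can be referenced by group name.
$result.Groups['ID'].Value
$result.Groups['Status'].Value
```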


I’m not a big fan of spaces or special characters in names, so I have modified a few items. The text inside the angle brackets will be the name of each matching group. The pattern that follows each name should capture the corresponding data. With this, I can match the text from the text file.
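A rough sketch of the matching step follows, using Select-String with a named-capture pattern. The sample lines and group names are hypothetical, and in-memory strings stand in for the text file.

```powershell
# Hypothetical sample standing in for the "Disks in Use" text; the real layout may differ.
$lines = @(
    'Slot  ID     Size   Status'
    '0     disk0  500GB  OK'
    '1     disk1  500GB  Failed'
    '2     disk2  1TB    OK'
    '3     disk3  1TB    Failed'
)

# Hypothetical named-capture pattern: one group per column.
$pattern = '(?<Slot>\d+)\s+(?<ID>\w+)\s+(?<Size>\d+\w+)\s+(?<Status>\w+)'

# The header has no digits, so only the four data rows match.
$m = $lines | Select-String -Pattern $pattern

# Each MatchInfo exposes its regex match through the Matches property.
$m.Matches
```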

The variable $m is now a collection of MatchInfo objects, and the regular expression match for each one is in its Matches property.

Figure 4: Match objects. (Image Credit: Jeff Hicks)


Knowing the capture names, I can access individual properties.
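For example, assuming the matches came from Select-String and the pattern uses the hypothetical group names Slot, ID, Size, and Status, a single value can be pulled out like this:

```powershell
# Hypothetical sample rows and pattern, for illustration only.
$lines = @(
    '0     disk0  500GB  OK'
    '1     disk1  500GB  Failed'
)
$pattern = '(?<Slot>\d+)\s+(?<ID>\w+)\s+(?<Size>\d+\w+)\s+(?<Status>\w+)'
$m = $lines | Select-String -Pattern $pattern

# Index into the MatchInfo, then its first Match, then the named group.
$id     = $m[1].Matches[0].Groups['ID'].Value
$status = $m[1].Matches[0].Groups['Status'].Value
"$id is $status"
```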

Figure 5: Obtaining the value of a named capture in Windows PowerShell. (Image Credit: Jeff Hicks)


All I need to do is go through each match and enumerate each named capture. Fortunately I don’t have to hard code any names. I can get them from the regex object.
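The .NET Regex class has a GetGroupNames() method for this. Here is a short sketch with a hypothetical pattern. The group names are illustrative only.

```powershell
# Hypothetical named-capture pattern.
[regex]$rx = '(?<Slot>\d+)\s+(?<ID>\w+)\s+(?<Size>\d+\w+)\s+(?<Status>\w+)'

# Returns the group names as strings, including the implicit group 0.
$names = $rx.GetGroupNames()
$names
```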

Figure 6: Listing named capture names. (Image Credit: Jeff Hicks)


However, I don’t want the first name, 0, which is always present because it represents the entire match. My intention is to create a custom object for each match.
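A minimal sketch of that conversion is shown below. It assumes PowerShell 3.0 or later (for [ordered] and [pscustomobject]) and uses hypothetical sample lines and group names.

```powershell
# Hypothetical sample and pattern, for illustration only.
$lines = @(
    'Slot  ID     Size   Status'
    '0     disk0  500GB  OK'
    '1     disk1  500GB  Failed'
    '2     disk2  1TB    OK'
    '3     disk3  1TB    Failed'
)
$pattern = '(?<Slot>\d+)\s+(?<ID>\w+)\s+(?<Size>\d+\w+)\s+(?<Status>\w+)'
[regex]$rx = $pattern
$m = $lines | Select-String -Pattern $pattern

# Skip the implicit group 0, which is the whole match.
$names = $rx.GetGroupNames() | Where-Object { $_ -ne '0' }

$data = foreach ($item in $m) {
    $match = $item.Matches[0]
    # An ordered hashtable keeps the properties in column order.
    $props = [ordered]@{}
    foreach ($name in $names) {
        $props[$name] = $match.Groups[$name].Value
    }
    [pscustomobject]$props
}
$data
```

Every property value here is still a string. If you need to sort numerically on something like Slot, cast it to an integer when you build the hashtable.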

The end result is a variable, $data, with a collection of custom objects.

Figure 7: Displaying converted matches to objects. (Image Credit: Jeff Hicks)


Because I now have objects, I can use PowerShell cmdlets.
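As a quick illustration, here are a few objects of the kind such a conversion might produce, followed by ordinary filtering, sorting, and grouping. The property names and values are hypothetical.

```powershell
# Hypothetical objects standing in for the converted matches.
$data = @(
    [pscustomobject]@{ Slot = '0'; ID = 'disk0'; Size = '500GB'; Status = 'OK' }
    [pscustomobject]@{ Slot = '1'; ID = 'disk1'; Size = '500GB'; Status = 'Failed' }
    [pscustomobject]@{ Slot = '2'; ID = 'disk2'; Size = '1TB';   Status = 'OK' }
    [pscustomobject]@{ Slot = '3'; ID = 'disk3'; Size = '1TB';   Status = 'Failed' }
)

# Filter on a property instead of searching text.
$failed = $data | Where-Object { $_.Status -eq 'Failed' } | Sort-Object -Property ID
$failed | Select-Object -Property ID, Size

# Summarize by status.
$data | Group-Object -Property Status | Select-Object -Property Name, Count
```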

Figure 8: Using custom objects in PowerShell. (Image Credit: Jeff Hicks)


The only way all of this works is if you know what your data will look like. There’s also an assumption that there are no blanks or null values. If there were some gaps in the text output, my solution would most likely fail.


I hope you see the value in working with objects instead of text. Yes, I had to jump through a few hoops to convert the text, and I won’t disagree that regular expressions can be a tough nut to crack. If you’d like a tool to make this process easier, then take a look at a function I published last year on my blog and see if that helps. As with most things PowerShell, there’s usually more than one answer, and next time I’ll show you another approach that you might find a bit easier.

