Splitting a Hex-Encoded String into Pairs of Hex Characters (a.k.a. To Pull a Noah)

Simple enough task: I have a hex-encoded string and need to decode it. Now, we all know that to hex-encode a string you cast each [char] to [int], shove it through the "{0:X}" format specifier, then concatenate the resulting strings.

$string = "The quick brown dog"; 
[string]::Join($null, ([char[]]$string | % { "{0:X}" -f [int]$_; }));

This returns:

54686520717569636B2062726F776E20646F67

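One caveat worth a quick sketch: "{0:X}" does not pad its output, so any character whose code is below 0x10 (a tab or a newline, say) comes out as a single hex digit, and the result can no longer be cut cleanly into pairs. Assuming the input might contain such characters and stays within single-byte character codes, "{0:X2}" pads each value to two digits. The sample string and variable names below are made up for illustration.

```powershell
# Hypothetical sample containing a tab (0x09), which needs padding.
$sample = "A`tB";

# Unpadded: the tab becomes a lone '9', so the output has an odd length.
$unpadded = [string]::Join('', ([char[]]$sample | % { "{0:X}" -f [int]$_ }));

# Padded: every character yields exactly two hex digits.
$padded = [string]::Join('', ([char[]]$sample | % { "{0:X2}" -f [int]$_ }));

$unpadded;   # 41942
$padded;     # 410942
```
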
And we know that to decode a hex-encoded string, we shove each pair of hex characters through the reverse process (more or less):

[char][Convert]::ToInt16('54', 16);

However, how do we split a string into pairs? After all, the above magic only works on one pair of hex characters at a time. We can cast the string to a [char[]] array, then iterate over the array, gluing every two characters back together into a bunch of two-character strings:

$hex = "54686520717569636B2062726F776E20646F67";
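[char[]]$hex | % -begin {
        $i = 0;
        [string]$s = $null;
    } -process {
        if ($i % 2) {
            # Odd index: second character of the pair, so emit it and reset.
            $s + $_;
            [string]$s = $null;
        } else {
            # Even index: first character of the pair, so hold on to it.
            $s += $_;
        }
        $i += 1;
    }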
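For comparison, here is a minimal sketch of the same idea using Substring in steps of two. It assumes the string has an even length (Substring would throw on a trailing odd character), and $pairs is just an illustrative name.

```powershell
$hex = "54686520717569636B2062726F776E20646F67";

# Walk the string two characters at a time and collect each pair.
$pairs = @(for ($i = 0; $i -lt $hex.Length; $i += 2) {
    $hex.Substring($i, 2);
});

$pairs[0];   # 54
```
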

Perfectly valid.  However, PowerShell provides an obscure, yet incredibly succinct way to do this:

$hex = "54686520717569636B2062726F776E20646F67"; $hex -split '(..)' | ? { $_ }

Three notes:

– The "| ? { $_; }" filter (effectively equivalent to IsNotNull()) is required. Otherwise, the split has the nasty habit of interleaving $null elements in the returned list of strings. I have no idea why.

– The parentheses in the "'(..)'" split specifier are also required. "'..'" matches on the literal twin periods. Why the split specifier becomes a RegEx when parenthesized? No idea.

– For those who like Perl’s readability, you can smash the loop into a single line so it’s as inobvious as the "-split '(..)'" split specifier.

$hex = "54686520717569636B2062726F776E20646F67";

[char[]]$hex | % -begin { $i = 0; [string]$s = $null; } -process { if ($i % 2) { $s + $_; [string]$s = $null; } else { $s += $_; } $i +=1; }

Why would you want to do that? You guessed it: no idea.
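For what it's worth, here is a sketch of the whole decode, end to end. As far as I can tell, the elements that -split interleaves are empty strings rather than $null: with a capturing group, the operator keeps the captured delimiters (the pairs) and still emits the text between them, which here is empty. The variable names $raw, $pairs and $decoded are mine, for illustration only.

```powershell
$hex = "54686520717569636B2062726F776E20646F67";

# With a capturing group, -split keeps the captured pairs and also
# returns the (empty) strings found between them.
$raw = $hex -split '(..)';

# Dropping the empty strings leaves just the pairs.
$pairs = @($raw | ? { $_ });

# Decode each pair to a character and join the characters back together.
$decoded = -join ($pairs | % { [char][Convert]::ToInt16($_, 16) });

$decoded;    # The quick brown dog
```
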

Comments (1)

  1. forjo says:


    "…interleaving $null elements in the returned list of strings":

    No idea as well.

    "Why the split specifier becomes a RegEx when parenthesized?":

    It's always a RegEx. The parentheses make the 2 dots a Capturing Group. What you get back are the Values of the Captures.

    Knowing this, you can do the following:

    [regex]::Matches($hex,'(..)') | % {$_.Captures} | % {$_.Value}

    It's even shorter, if you retrieve the Values of the Matches:

    [regex]::Matches($hex,'..') | % {$_.Value}

    Since you need no Captures, you can omit the parentheses for the Capturing Group.

    No idea why the -split operator returns the Captures but not the Matches.

    By the way:

    For RegEx testing, especially with .Net, I'm very happy with the free "Rad Software Regular Expression Designer":