Properties coding expedition #5 – Stripping characters


In Part 4, I discovered that WideCharToMultiByte converts certain invisible Unicode formatting characters (directional marks such as LRM and RLM) to ?.  This makes the output look really silly in a command line application.  I want to keep this as a command line application, so I need to strip these characters away.  A simple helper solves this rather neatly:


void _StripCharacters(__inout PWSTR pszText, __in PCWSTR pszRemove)
{
    PWSTR pszSource = pszText;
    PWSTR pszDest = pszSource;
    while (*pszSource)
    {
        // Skip copying characters found in pszRemove
        if (!StrChr(pszRemove, *pszSource))
        {
            *pszDest = *pszSource;
            pszDest++;
        }
        pszSource++;
    }
    *pszDest = 0; // null-terminate
}

This modifies the input string, omitting any characters found in pszRemove.  Nothing fancy.  Now I call it when I want to send a string to the console:


… from part 3
PWSTR pszValue;
hr = ppropdesc->FormatForDisplay(propvar, PDFF_DEFAULT, &pszValue);
if (SUCCEEDED(hr))
{
    // LRM RLM LRE RLE PDF LRO RLO
    _StripCharacters(pszValue, L"\x200e\x200f\x202a\x202b\x202c\x202d\x202e");
    wprintf(L"%s: %s\n", pszLabel, pszValue);
    CoTaskMemFree(pszValue);
}


Now the output is free of those annoying question marks:

Date last saved: 9/29/2006 10:12 PM
Width: 1139 pixels
Height: 769 pixels
Horizontal resolution: 200 dpi
Vertical resolution: 200 dpi
Bit depth: 24
Dimensions: 1139 x 769
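
For comparison, here is a minimal sketch of the same stripping step in portable C++, assuming the text is held in a std::wstring rather than a CoTaskMemAlloc'd buffer. The names StripCharacters and kBidiControls are hypothetical and not part of the tool built in this series. The sketch also spells the characters with \u escapes, which always take exactly four hex digits, so a digit that happens to follow the escape cannot be absorbed into it.

```cpp
#include <algorithm>
#include <string>

// Hypothetical portable counterpart to _StripCharacters: removes every
// character listed in `remove` from `text`, in place.
void StripCharacters(std::wstring& text, const std::wstring& remove)
{
    // std::remove_if shifts the kept characters toward the front, which is
    // the same source/destination walk the pointer version does by hand.
    auto newEnd = std::remove_if(text.begin(), text.end(),
        [&remove](wchar_t ch) { return remove.find(ch) != std::wstring::npos; });

    // Drop the leftover tail so the string's length matches its contents.
    text.erase(newEnd, text.end());
}

// The same seven characters stripped above: LRM RLM LRE RLE PDF LRO RLO
const std::wstring kBidiControls =
    L"\u200e\u200f\u202a\u202b\u202c\u202d\u202e";
```

Calling StripCharacters(value, kBidiControls) just before printing would play the same role as the _StripCharacters call above. Like the original helper, it makes a single pass over the text and allocates nothing.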
