Export reports to PDF with improved support for international text

Today’s post is from Andre Milbradt, an engineer who’s been working on Reporting Services and other BI products here at Microsoft since 1999.

For SQL Server 2016 Reporting Services, we made a set of improvements to our PDF Renderer targeted at our international customers. Some of them were long-standing feature requests, like copy/paste support for international characters or drawing vertically-stacked East-Asian characters. Others pave the way for a better out-of-the-box authoring experience for our international customers.

Copy/paste support for international characters

International characters (characters outside of the ASCII range) are written out in PDF using their Glyph ID. A Glyph ID is the ID of a character’s vector drawing unique to a particular font. Hence Glyph IDs only have a meaning in the context of a given font.

To copy/paste characters in a PDF (or search for them), document viewers need to know the reverse mapping of those Glyph IDs back to Unicode character codes. To achieve this, the PDF specification has the notion of a ToUnicode mapping embedded in the PDF document.

Until now, the Reporting Services PDF Renderer never wrote out such a mapping, so international customers could not copy/paste content. That has finally changed for SQL Server 2016 Reporting Services: the PDF Renderer now writes out a ToUnicode mapping and consequently fully supports copy/paste for international characters. The mappings themselves are optimized using character ranges to ensure a minimal increase in the size of the resulting PDF document.
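To illustrate how character ranges keep such a mapping small, here is a minimal Python sketch. It is not the renderer's actual code: the function names and the sample glyph IDs are made up for illustration, and it only handles code points in the Basic Multilingual Plane. Consecutive glyph IDs that map to consecutive code points are collapsed into a single bfrange-style entry, and a run is broken when the low byte of the destination would overflow. Other constraints that real CMaps have, such as limits on section size, are ignored here.

```python
def compress_to_ranges(glyph_to_unicode):
    """Group {glyph_id: code_point} into (first_gid, last_gid, first_code_point) runs.

    A run continues while glyph ID and code point both increase by exactly one
    and the low byte of the destination code point does not pass 0xFF.
    """
    ranges = []
    for gid in sorted(glyph_to_unicode):
        cp = glyph_to_unicode[gid]
        if ranges:
            first_gid, last_gid, first_cp = ranges[-1]
            offset = gid - first_gid
            contiguous = gid == last_gid + 1 and cp == first_cp + offset
            if contiguous and (first_cp & 0xFF) + offset <= 0xFF:
                ranges[-1] = (first_gid, gid, first_cp)  # extend the current run
                continue
        ranges.append((gid, gid, cp))  # start a new run
    return ranges


def format_bfrange(ranges):
    """Render runs in the style of a CMap bfrange section (BMP code points only)."""
    lines = [f"{len(ranges)} beginbfrange"]
    for first_gid, last_gid, first_cp in ranges:
        lines.append(f"<{first_gid:04X}> <{last_gid:04X}> <{first_cp:04X}>")
    lines.append("endbfrange")
    return "\n".join(lines)


# Hypothetical glyph IDs: three consecutive hiragana plus one isolated ideograph.
sample = {3: 0x3042, 4: 0x3043, 5: 0x3044, 10: 0x4E00}
print(format_bfrange(compress_to_ranges(sample)))
```

In this sample, four individual mappings shrink to two range entries; for fonts whose glyph order follows Unicode order, the savings grow with the number of characters used.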

There’s an infamous MSDN Forums thread from back in October 2008 about this issue (“a wagging donkey's tail”). While it certainly took us a fair amount of time to get to it, it was always on our minds…

Drawing of vertically-stacked characters

Another feature that was often requested, yet missing for a long time, is support for vertically-stacked East-Asian characters in documents produced by the PDF Renderer. It’s a layout that is used for Japanese, Chinese and Korean scripts. Until now, the PDF Renderer wrote such text out by simply rotating the content rather than stacking the characters.

For SQL Server 2016 Reporting Services, the PDF Renderer automatically detects East-Asian characters and writes them out vertically-stacked. This mode is enabled by setting the WritingMode property of a textbox to Vertical. Note that non-East-Asian characters within a block of vertically-stacked East-Asian characters are not stacked.

New DefaultFontFamily property in RDL

For SQL Server 2016 Reporting Services a new RDL property called DefaultFontFamily was added:

<?xml version="1.0" encoding="utf-8"?>
<Report MustUnderstand="df" xmlns="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition" xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner" xmlns:df="http://schemas.microsoft.com/sqlserver/reporting/2016/01/reportdefinition/defaultfontfamily">
<df:DefaultFontFamily>Segoe UI</df:DefaultFontFamily>

</Report>

Note that micro-versioning was introduced for SQL Server 2016 Reporting Services, which is why the property lives in its own df namespace, listed in the MustUnderstand attribute. This allows us to support new features without having to revise the entire schema.
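For readers who process RDL files with their own tooling, the following Python sketch shows one way to read the property and the MustUnderstand list using only the standard library. It is an illustration rather than anything Reporting Services ships: the function names are hypothetical, the sample document uses placeholder namespace URIs instead of the real ones, and elements are matched by local name so the code does not depend on a particular namespace string.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    # ElementTree writes namespaced tags as "{uri}name"; keep only "name".
    return tag.rsplit("}", 1)[-1]


def read_default_font(rdl_text):
    """Return the report-level DefaultFontFamily value, or None if it is absent."""
    root = ET.fromstring(rdl_text)
    for child in root:
        if local_name(child.tag) == "DefaultFontFamily" and child.text:
            return child.text.strip()
    return None


def required_prefixes(rdl_text):
    """Return the namespace prefixes listed in the MustUnderstand attribute."""
    root = ET.fromstring(rdl_text)
    return root.get("MustUnderstand", "").split()


# Placeholder namespaces, for illustration only.
sample = (
    '<Report MustUnderstand="df" xmlns="urn:example:rdl" xmlns:df="urn:example:rdl:df">'
    "<df:DefaultFontFamily>Segoe UI</df:DefaultFontFamily>"
    "</Report>"
)
print(read_default_font(sample), required_prefixes(sample))
```

A tool that does not recognize one of the prefixes returned by required_prefixes would be expected to reject the report rather than silently ignore the feature.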

While this property is infrastructure that supports SSRS 2016’s updated visual styles, we imagined it might come in handy in the future as we consider other enhancements. For instance, should the default font be culture-sensitive: “Segoe UI” on an English machine, but “Meiryo UI” on a Japanese machine and “JhengHei UI” on a Chinese machine? This might give international customers a better authoring experience right out of the box, with less reliance on font fallback, which can be complicated and ambiguous at times.
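To make the question concrete, a culture-sensitive default could be as simple as a lookup keyed on the machine's culture. The Python sketch below is purely hypothetical: the table holds only the three fonts mentioned above, the function name is invented, and nothing like it exists in the product today.

```python
# Hypothetical table built from the examples in this post.
CULTURE_FONTS = {
    "en": "Segoe UI",
    "ja": "Meiryo UI",
    "zh": "JhengHei UI",
}


def default_font_for(culture, fallback="Segoe UI"):
    """Pick a default font for a culture name such as "ja-JP".

    An exact match on the full culture name is tried first, then the
    language part alone, then the fallback.
    """
    key = culture.lower()
    if key in CULTURE_FONTS:
        return CULTURE_FONTS[key]
    language = key.split("-", 1)[0]
    return CULTURE_FONTS.get(language, fallback)


print(default_font_for("ja-JP"))
```

Trying the full culture name first leaves room for region-specific entries later, since a single font per language may not suit every region.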

Please let us know what you think in the comments on this post.

Try it now and send us your feedback