A custom BizTalk 2004 disassembling pipeline component (1)

Business documents come in many different formats. Recently, a post on microsoft.public.biztalk.nonxml caught my attention. The author receives a flat file with name/value pairs organized in tables. The flat file looks like this:

Table: tablename1
Field1&Value1
Field2@Value2
<more fields here with different separators>  

Table: tablename2
FieldN%ValueN
FieldP$ValueP
<more fields here with different separators>

However, the order of the tables can change (i.e. “tablename2” can come before “tablename1” in an instance of this file), and so can the order of the field/value pairs under each table. The separators themselves are not always the same, which makes the file format unusual. The challenge here is to build an XML document that is independent of the order of the fields.
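
To make the format concrete, here is a minimal sketch of how such a file could be split into tables with regular expressions. It is not the component itself: the class name FlatFileSketch, the three patterns, and the assumption that a field name consists of word characters followed by exactly one separator character (anything that is neither a word character nor whitespace) are illustrative guesses, since the original newsgroup post does not give a formal specification.

```csharp
using System;
using System.Collections.Specialized;
using System.IO;
using System.Text.RegularExpressions;

public class FlatFileSketch
{
    // Hypothetical patterns; a real file may need different ones.
    private static readonly Regex blankLine = new Regex(@"^\s*$");
    private static readonly Regex tableHeader = new Regex(@"^Table:\s*(?<name>\S+)\s*$");
    // Assumes: field name of word characters, then one separator character
    // that is neither a word character nor whitespace, then the value.
    private static readonly Regex nameValue = new Regex(@"^(?<name>\w+)[^\w\s](?<value>.*)$");

    // Returns a dictionary mapping each table name to a ListDictionary
    // holding that table's field/value pairs.
    public static ListDictionary Parse(TextReader reader)
    {
        ListDictionary tables = new ListDictionary();
        ListDictionary current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (blankLine.IsMatch(line))
                continue;

            // Table headers are tested first because "Table: x" would
            // otherwise also look like a name/value pair separated by ':'.
            Match m = tableHeader.Match(line);
            if (m.Success)
            {
                current = new ListDictionary();
                tables[m.Groups["name"].Value] = current;
                continue;
            }

            m = nameValue.Match(line);
            if (m.Success && current != null)
            {
                current[m.Groups["name"].Value] = m.Groups["value"].Value.Trim();
                continue;
            }

            throw new FormatException("Unexpected line: " + line);
        }
        return tables;
    }
}
```

Because every value is stored under its table name and field name, a caller can look up ((ListDictionary)tables["tablename1"])["Field2"] regardless of where that line appeared in the file.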

There are several ways to do this. One is to use a custom disassembler pipeline component, and I have been looking for a real-world scenario to demonstrate how to write one.

Creating a custom pipeline component is easier than writing an adapter. Create a new C# Class Library project in Visual Studio .NET 2003, add a reference to Microsoft.BizTalk.Pipeline.dll (found in the BizTalk 2004 installation directory), and you are pretty much good to go.

Our strategy will be to disassemble the flat file and store all the data in memory in a “Table” data structure shown below. We will then extract the data to create an XML document.

      using System.Collections;              // ICollection
      using System.Collections.Specialized;  // ListDictionary

      /// <summary>
      /// Maintains all the name/value pairs associated with a table.
      /// </summary>
      internal class Table
      {
            private string name;
            private ListDictionary nameValues;

            /// <summary>
            /// New table constructor
            /// </summary>
            /// <param name="name">string: name of table.</param>
            internal Table(string name)
            {
                  this.name = name;
                  nameValues = new ListDictionary();
            }

            /// <summary>
            /// Gets/Sets the name of the table.
            /// </summary>
            internal string Name
            {
                  get { return name; }
                  set { name = value; }
            }

            /// <summary>
            /// Indexer on names stored into this object.
            /// </summary>
            internal object this[string key]
            {
                  get { return nameValues[key]; }
                  set { nameValues[key] = value; }
            }

            /// <summary>
            /// Collection of all keys currently inserted in this object.
            /// </summary>
            internal ICollection Keys
            {
                  get { return nameValues.Keys; }
            }
      }
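
The second half of the strategy, turning the in-memory tables into XML, could look roughly like the sketch below. It is only an illustration under stated assumptions: the TableXmlSketch class, the Tables and Table element names, and the name attribute are hypothetical, and the sketch assumes every field name is a valid XML element name. Sorting the tables and keys before writing is one simple way to make the output independent of the order in the file; it is not necessarily what the finished component does. A condensed copy of Table is included so the block compiles on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Specialized;
using System.IO;
using System.Xml;

// Condensed copy of the Table class above so the sketch compiles on its own.
internal class Table
{
    private string name;
    private ListDictionary nameValues = new ListDictionary();
    internal Table(string name) { this.name = name; }
    internal string Name { get { return name; } }
    internal object this[string key]
    {
        get { return nameValues[key]; }
        set { nameValues[key] = value; }
    }
    internal ICollection Keys { get { return nameValues.Keys; } }
}

internal class TableXmlSketch
{
    // Orders tables by name so the output does not depend on file order.
    private class TableNameComparer : IComparer
    {
        public int Compare(object x, object y)
        {
            return string.CompareOrdinal(((Table)x).Name, ((Table)y).Name);
        }
    }

    // Writes the tables as an XML fragment; element and attribute names
    // are hypothetical and would normally come from the target schema.
    internal static string ToXml(ArrayList tables)
    {
        ArrayList sortedTables = new ArrayList(tables);
        sortedTables.Sort(new TableNameComparer());

        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.WriteStartElement("Tables");
        foreach (Table table in sortedTables)
        {
            writer.WriteStartElement("Table");
            writer.WriteAttributeString("name", table.Name);

            // Copy the keys so they can be sorted without touching the table.
            ArrayList keys = new ArrayList(table.Keys);
            keys.Sort();
            foreach (string key in keys)
            {
                // Assumes the field name is a valid XML element name.
                writer.WriteElementString(key, Convert.ToString(table[key]));
            }
            writer.WriteEndElement();
        }
        writer.WriteEndElement();
        writer.Close();
        return output.ToString();
    }
}
```

With this approach, two files that contain the same tables and fields in a different order produce the same XML string. If the target schema fixes the element order instead, the indexer on Table can be used to pull each expected field by name.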

Since the component is a disassembler, we will implement four interfaces: IBaseComponent (every component has to implement that one), IComponentUI (required by the Pipeline Designer in Visual Studio), IDisassemblerComponent (the interface that performs the actual disassembling) and IPersistPropertyBag (required to persist any custom properties our disassembler might expose).

In addition, we need to specify the category of our custom pipeline component so that the Pipeline Designer recognizes it properly. Also, let's make sure we generate a new GUID for the component:

      using System.Collections;                   // ArrayList
      using System.Resources;                     // ResourceManager
      using System.Text.RegularExpressions;       // Regex
      using Microsoft.BizTalk.Component.Interop;  // pipeline component interfaces
      using Microsoft.BizTalk.Message.Interop;    // IBaseMessageContext

      [ComponentCategory(CategoryTypes.CATID_PipelineComponent)]
      [ComponentCategory(CategoryTypes.CATID_DisassemblingParser)]
      [System.Runtime.InteropServices.Guid("B60F6801-C07E-45f7-A9C7-E8797671D710")]
      public class TableFormatterDisassembler : IBaseComponent,
                                                IComponentUI,
                                                IDisassemblerComponent,
                                                IPersistPropertyBag
      {
            /// <summary>
            /// Constants used to access various strings stored in resources.
            /// </summary>
            private const string descriptionRsrcID = "Description";
            private const string nameRsrcID = "Name";
            private const string wrongFlatFileRsrcID = "WrongFlatFileFormat";
            private const string designerIconRsrcID = "DesignerIcon";

            /// <summary>
            /// Instance of ResourceManager to access our resources.
            /// </summary>
            private ResourceManager resMgr;

            /// <summary>
            /// Maintains all tables inside an ordered list.
            /// </summary>
            private ArrayList tablesArray;

            /// <summary>
            /// Context of the original message.
            /// </summary>
            private IBaseMessageContext originalMsgContext;

            /// <summary>
            /// Regular expression used to determine if a line is blank.
            /// </summary>
            private Regex regExpBlankLine;

            /// <summary>
            /// Regular expression used to determine and parse a table header line.
            /// </summary>
            private Regex regExpTableParser;

            /// <summary>
            /// Regular expression used to determine and parse a name/value pair.
            /// </summary>
            private Regex regExpLineParser;