Kirk Evans Blog

.NET From a Markup Perspective

XSLT as a Bigger Hammer?

Turns out that my post about using XSLT to convert CSV to XML hit a nerve with some folks.  Robert thought it would come in handy, while Dare thinks it is convoluted.

Dare suggests tokenizing the strings into a string array and iterating through the array.  Martin suggested Aaron Skonnard’s custom XPathNavigator and XmlReader implementation as an alternative.  Rob Karatzas and Dare also added Chris Lovett’s XmlCsvReader as another alternative.
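To make the tokenizing approach concrete, here is a minimal sketch, written in Python rather than the C# the discussion assumes. It is not Dare’s code: the function name csv_to_xml, the element names rows and row, and the assumption that the first line holds column names are all hypothetical, and it ignores quoted fields that contain commas.

```python
import xml.etree.ElementTree as ET


def csv_to_xml(text, root_name="rows", row_name="row"):
    """Split each CSV line into a list of fields and emit one element per field.

    Hypothetical helper: assumes the first line holds column names that are
    valid XML element names, and that no field contains a quoted comma.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    headers = [h.strip() for h in lines[0].split(",")]
    root = ET.Element(root_name)
    for line in lines[1:]:
        # Tokenize the line into an array of field values.
        fields = [f.strip() for f in line.split(",")]
        row = ET.SubElement(root, row_name)
        # Iterate the array, pairing each value with its column name.
        for name, value in zip(headers, fields):
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


print(csv_to_xml("id,street\n1,123 Main St.\n"))
```

The output shape (one child element per column) is baked into the loop, so any change to that shape means editing the code.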

The original post asked:

I’m trying to find a way to convert a .csv file to xml using stylesheets and I can’t seem to find anything. I did read about something called fxml or something like that for flat files but I don’t think it’s a standard. Can anyone help me in this regard?

I think XmlCsvReader is the best alternative if the CSV’s format is constant and the output does not vary much.  I have used it as well as Chris Lovett’s other implementations (such as XmlNodeWriter).  But the original post asked for an example using stylesheets.  I agree that the posted solution is convoluted and definitely highlights some of XSLT’s current weaknesses.  But I disagree that it is absolutely the wrong tool for the job.  In fact, I offer that the solution demonstrates recursion using XSLT quite nicely while offering an extensible solution.  Maybe XSLT is not always the best tool for processing CSV, but maybe it was an elegant fit for the original problem.

I guess this is a common issue in technical newsgroups.  If you post anything other than the exact answer the original poster asked for, they are likely to respond with a frustrated “that’s not what I asked.”  If you add more to the solution than they asked for, more often than not you will receive a response similar to “thanks for answering my question, the rest of it doesn’t fit within my solution.”  In other words, while we would often like to ask “why?  why would you not simply use C#, Perl, VB, JavaScript… anything but XSLT for this task?”, we also have to consider the impact of our corrective suggestions on the projects people are asking questions about.

Which way do you lean when answering questions on newsgroups?  Do you offer the 15 or so extra corrections in addition to their requested solution?  Do you strive to be as absolutely thorough as you can?  Or do you answer the question and move on?

Dare’s solution of splitting into string arrays is certainly the most readable and the most maintainable, but is it applicable to the problem domain?  We don’t know given the context of the original post.  Consider the problem if the output format varies greatly and often.  Using Dare’s current implementation, we would have to modify existing code and specialize the solution for each variant. If we want to change the parent element and child element names, we have to modify the compiled code.  If we want to use attributes rather than elements, we have to modify the compiled code.  What if we wanted to take CSV and turn it into something like this?

<Customer id="1">
 <Address street="123 Main St." />
</Customer>

Certainly we can create an intelligent object model that can provide the naming scheme and provide various child concrete classes to specialize handling.  Or we can put the information in a custom configuration section and pull it dynamically.  Maybe we just provide all the needed information as parameters to the function and build up the method’s implementation to accommodate.  We mitigate some of this by using Lovett’s XmlCsvReader, but we still have to specialize forming the new output schema.  But if we use XSLT, we can modify the XSLT without recompiling any code.  And that is important in many situations.
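For a sense of what the “pass everything as parameters” option might look like, here is a rough sketch in Python. The function row_to_element and its parameters are hypothetical, invented for illustration; the element and attribute names are taken from the Customer/Address fragment in this post.

```python
import xml.etree.ElementTree as ET


def row_to_element(row, parent_name, parent_attrs, children):
    """Build one element from a dict of CSV fields (hypothetical helper).

    parent_attrs: field names written as attributes on the parent element.
    children: maps a child element name to the field names it carries
              as attributes.
    """
    parent = ET.Element(parent_name)
    for field in parent_attrs:
        parent.set(field, row[field])
    for child_name, fields in children.items():
        child = ET.SubElement(parent, child_name)
        for field in fields:
            child.set(field, row[field])
    return parent


# One CSV record, already tokenized and keyed by column name.
row = {"id": "1", "street": "123 Main St."}
customer = row_to_element(row, "Customer", ["id"], {"Address": ["street"]})
print(ET.tostring(customer, encoding="unicode"))
```

The function itself stays generic, but the mapping still lives in the calling code unless it is moved into configuration.  That is roughly the job a stylesheet does when the mapping is kept in XSLT.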

What situations could something so fangled up be applicable to?  I don’t know, the original post didn’t say.  Maybe the original solution might have used the text ODBC driver to populate a typed dataset, who knows?  Without knowing the original problem domain, it is difficult to guess if this was an appropriate solution or not.  But it gave them what they asked for.