Most complex data structure ever?


Reader Question: What’s the most complex data structure you’ve ever run into or had to program against?

My personal one was the “Mortgage Industry Standards Maintenance Organization” (MISMO) standard (you can download it if you’re bored), which is a monster DTD that includes just about everything a real estate, mortgage, or financial company could want. In any case, creating “fake” data [for a customer prototype using BizTalk] for this thing was tough, and showing how each partner organization could use part of this data in its own application as part of an orchestrated process was tougher.

Here’s another one (although I haven’t written code against it): Visual Studio 2005 has a way to export your IDE settings into a *.vssettings file. So I did this with C# Express and got a nice 400K text file. Ouch! To be fair, most of the size appears to be Base64-encoded data, which bloats the file quite a bit.
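For a rough sense of how much Base64 bloats things, here is a minimal Python sketch. It has nothing to do with how Visual Studio actually writes the file; the `base64_overhead` helper and the 300,000-byte payload are made up for illustration. Base64 turns every 3 bytes of input into 4 output characters, so the encoded form is about a third larger than the raw data.

```python
import base64
import os


def base64_overhead(raw_size: int) -> float:
    """Return encoded size divided by raw size for a random payload.

    raw_size is a hypothetical payload size in bytes, not a number
    taken from any real .vssettings file.
    """
    raw = os.urandom(raw_size)        # stand-in for arbitrary binary settings data
    encoded = base64.b64encode(raw)   # 4 output characters per 3 input bytes
    return len(encoded) / len(raw)


# An illustrative 300,000-byte payload grows by roughly one third.
print(f"{base64_overhead(300_000):.3f}")
```

So if most of a 400K file is Base64, the underlying binary data is noticeably smaller than the file size suggests.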

Back to the question: What’s the most complex data structure you’ve ever run into or had to program against?

Comments (16)

  1. Anything coming out of the HL7 consortium.

  2. Denny says:

    does this count:

    export all the data from an *UNDOCUMENTED* Oracle Database and then convert it to a format for importing to a new database.

    many, many, many tables and some very strange data…. with no docs!

  3. Frisky says:

    ACORD – Insurance Standard

    Anything you would ever want to know about any type of insurance. (And there is a ton you may want to know.)

    Funny thing is, I used to work in the power industry. They had way more complex data used in way more complex ways. All kinds of unit factors, power loss coefficients, and pollutant credits. Youch! But their structures were small and easy to use. Probably because they were smart enough to know it, and categorize/organize it into consumable units.

    Frisky

  4. The MISMO! Gahhhhhhhh! I’m on a project to upgrade our version to 2.3.1. Good to see a few people will at least see its complexity 😉

  5. James Geurts says:

    A boolean variable 😉

  6. Tyler Young says:

    Has to be either a genealogical database structure or an OMG proposal for the semantics and vocabulary of business rules. The genealogy project had an ORM diagram spanning about 25 pages. It had to keep track of things like where a person was born, how certain you are about when that event took place, when the name of that place changed, who had made changes to the data and when, permissions for collaboration groups…. oy! We had to pare it down fairly significantly before we could implement it. The BSBR (Business Semantics of Business Rules; it’s since been renamed to something else) document was 300+ pages of terminology for defining what a business rule is and how to quantify it. Since it was a draft proposal, we had to go through and check to make sure the document was consistent with itself before we could even start deciding if the things it said made logical sense. Plus, all the semantics of these words were defined by font styles in a Word document. We ended up creating a massive XSL transform that spat it into parsable XML that could then be visualized and verified for consistency.

  7. Alex – Ouch, I feel your pain, good luck!

    Tyler – Ouch, the BSBR sounds bad, does anyone actually use it?

  8. Tyler Young says:

    Dan – Not yet. The OMG is still taking submissions for their Request For Proposals. IBM has a proposal in, and a consortium with Unisys, Northface University and a few others has a proposal in. The IBM proposal is centered more on implementation, while our group is more geared toward conceptual modeling. We’d like to bridge them a bit, but we’ll have to see how it all goes. Still, it made for a nice non-trivial student project!

  10. Shaji says:

    My biggest one was the OTA Standard (http://www.opentravel.org). They had a schema for anything and everything to do with the travel industry and, to make matters worse, would change the namespace every 6 months or so…

  11. Ricky Dhatt says:

    Make that three for MISMO. Ugh.

  12. KP says:

    HIPAA 4010 transaction documents

  13. Darren Oakey says:

    I once worked for a company [who shall remain nameless] who had this concept of the "Main Library" – basically, a single C++ monolith type that encapsulated ALL the data from the database (and we are talking a BIG database – one table had more than 4 million rows), loading it in when the application ran, and doing all the same things that a DB does for you – indexing, sorting, searching – all in painfully convoluted C++.

    Apparently, using the database directly was too slow for their application.

    I was skeptical :)

  14. zerocool says:

    have a look 1st :)

  15. dcv says:

    I once created a batch file that copied a .txt file and pasted itself in another directory.