I think I like parsers too much; I end up writing about them a lot. Maybe that’s because about one program in three is, loosely speaking, a parser of some kind. All the more reason why we should be very good at writing them.
Unlike some of my previous quizzes, I’m not exactly sure where this one is going to go. I have a few ideas, but I’m going to go on the voyage of discovery with you this time and see where it leads us; maybe we’ll take some twists and turns that I didn’t expect, based on what you all suggest. That should make it all the more fun.
To start us out, I wrote a quick and dirty little parser. It’s motivated by a parser in a piece of code I’m working on right now. This parser takes boolean expressions that look something like this: “(fact1 & fact2) | !fact3” and evaluates them to true or false based on whether fact1, fact2, and so on are true. The design assumes that comparatively few facts will be true in any given run (maybe a few dozen) but that there could be thousands of predicates (expressions) to evaluate. I’ve made a little harness that has the parser and some test code wrapped around it and it’s posted here:
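The harness code itself is not reproduced in this text. To make the problem concrete, here is a minimal sketch of the kind of evaluator described above. It is an illustration only, not the quiz's parser: the names `tokenize`, `evaluate`, and `true_facts` are hypothetical, and the precedence (`!` binds tighter than `&`, which binds tighter than `|`) is an assumption, since the description above does not spell it out.

```python
import re

# A token is a fact name, a parenthesis, or one of the three operators.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[()&|!])")


def tokenize(text):
    """Split an expression into a list of token strings."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expression, true_facts):
    """Evaluate a boolean expression; a fact is true iff it is in true_facts."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side to consume its tokens
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        return parse_primary()

    def parse_primary():
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if token is None or token in {")", "&", "|"}:
            raise ValueError("expected a fact name or '('")
        return token in true_facts  # set lookup: only the true facts are stored

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result
```

Note that this sketch tokenizes and parses the text again on every call to `evaluate`, and it stores only the facts that are true, in a set. Whether either choice matters with thousands of predicates and a few dozen true facts is the kind of thing the questions below are asking about.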
This time I’m going to ask some open ended questions:
Q1: What’s wrong with this parser from a performance perspective?
Q2: What should we be doing to improve it?
Q3: How big of a difference is it likely to make?
These questions touch on the cornerstone of good engineering practice: understanding what matters and what does not, and prioritizing the work so that you don’t spend a lot of time on things that ultimately will not matter. But to really cement this, think about what I consider to be the hardest and most important question of all:
Q4: If you hadn’t written all the code yet, what would you do to get a better idea of what was going to matter and what wasn’t?
See the continuation in Performance Quiz #8 — The problems with parsing — Part 2