Friday, August 22, 2003

Parsers: When you need one you need one ...


It's vacation time, so some interesting thinking can be done, more than just figuring out how to meet my software delivery schedule.

Problem: Take a universal language for driving electronic frequency generators (the kind used for killing pathogens in the human body) and use it to drive alternative frequency devices. The best language available is the one for the F100, and I want to use it to drive an FSCAN, for you gadget freaks. The inventors of both devices have provided me with helpful documentation.

History: Most programmers hack up parsers without thinking. I once inherited a COBOL team that had hand-crafted a multi-million-line COBOL program for generating reports on a large banking software product. The average time to fix a bug in the report language parser was infinite (some could never be fixed). I took two C programmers, fed them the Dragon book, and in six weeks they replaced the parser with standard tools, eliminating hundreds of thousands of lines of code and 80% of the bugs out of the box. The average time to fix any remaining bug was less than an hour.

Question: What's the best open source parser for a small job like this?

Answer: Spirit

Caveat: If you have a better one send me a note ...

Why would you want to use Spirit?

Spirit is designed to be a practical parsing tool. At the very least, the ability to generate a fully working parser from a formal EBNF specification inlined in C++ significantly reduces development time. While it may be practical to use a full-blown, stand-alone parser generator such as YACC or ANTLR when we want to develop a computer language such as C or Pascal, it is certainly overkill to bring in the big guns when we wish to write extremely small micro-parsers. At that end of the spectrum, programmers typically approach the job at hand not as a formal parsing task but through ad hoc hacks using primitive tools such as scanf. True, there are regular-expression libraries (such as boost regex) and scanners (such as boost tokenizer), but these tools do not scale well when we need to write more elaborate parsers. Attempting to write even a moderately complex parser using these tools leads to code that is hard to understand and maintain.
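
To make the "formal grammar instead of scanf hacks" point concrete, here is a minimal sketch of what writing to a grammar looks like in plain C++. It is not Spirit and not the F100 language: the input format (a comma-separated list of frequencies in Hz) and every name in it (Cursor, parse_real, parse_char, parse_frequency_list) are my own stand-ins for illustration. Each function corresponds to one piece of the EBNF rule in the comment. Spirit's own introductory example expresses essentially the same rule inline as real_p >> *(',' >> real_p).

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical grammar (EBNF) for a list of frequencies in Hz:
//   list ::= real ( ',' real )*
// This is an illustrative format only, not the actual F100 syntax.

// Read position in a NUL-terminated input string.
struct Cursor {
    const char* p;
    void skip_ws() {
        while (std::isspace(static_cast<unsigned char>(*p))) ++p;
    }
};

// real: the number syntax is delegated to std::strtod, so anything
// strtod accepts (exponents, "inf", hex floats) is accepted here too.
bool parse_real(Cursor& c, double& out) {
    c.skip_ws();
    char* end = nullptr;
    double v = std::strtod(c.p, &end);
    if (end == c.p) return false;  // nothing consumed: not a number
    c.p = end;
    out = v;
    return true;
}

// Match one literal character, allowing whitespace before it.
bool parse_char(Cursor& c, char ch) {
    c.skip_ws();
    if (*c.p != ch) return false;
    ++c.p;
    return true;
}

// list ::= real ( ',' real )*
// Succeeds only if the whole input matches; 'out' is left untouched
// on failure.
bool parse_frequency_list(const std::string& text, std::vector<double>& out) {
    Cursor c{text.c_str()};
    std::vector<double> values;
    double v = 0.0;
    if (!parse_real(c, v)) return false;
    values.push_back(v);
    while (parse_char(c, ',')) {
        if (!parse_real(c, v)) return false;  // e.g. "1,,2" or a trailing comma
        values.push_back(v);
    }
    c.skip_ws();
    if (*c.p != '\0') return false;  // leftover input, e.g. "1 2"
    out.swap(values);
    return true;
}
```

Calling parse_frequency_list("100, 250.5", values) fills values with the two numbers, while malformed input such as "100,,250" is rejected outright instead of being half-read the way a chain of scanf calls tends to do. The appeal of Spirit is that you write the one-line rule and skip the hand-written functions.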

One of the prime objectives is to make the tool easy to use. When one thinks of a parser generator, the usual reaction is "it must be big and complex with a steep learning curve." Not so. Spirit is designed to be fully scalable. The framework is structured in layers. This permits learning on an as-needed basis, after only learning the minimal core and basic concepts.

For development simplicity and ease in deployment, the entire framework consists of only header files, with no libraries to link against or build. Just put the Spirit distribution in your include path, compile and run. Code size? Very tight.
