ANTLR Is Not As Cool As I’d Hoped
By Adrian Sutton
About 5 years ago, while I was doing some part-time work for my university, one of my lecturers walked by, looked at the program I was developing and asked: “You’re using ANTLR or something like it to generate your parser, aren’t you?” I wasn’t. I’d written the parser by hand in an hour or two and it worked exactly as I wanted, so I saw no need to go back and rewrite it.
Since then, every time I’ve wound up writing a parser I’ve thought, “I probably should use ANTLR for this,” but then decided to just take 10 minutes and write it myself instead of introducing another dependency and having to muck around with the build system. I’ve had to support these hand-written parsers over multi-year periods without problems.
Finally, today I hit a case where I needed to write a reasonably complex parser that was likely to see a lot of updates in its lifetime. So I went off and took the time to add ANTLR as a dependency and learn how to use it. It’s worth noting that I’d already written a perfectly working first iteration of this parser by hand in about an hour.
I now finally have an ANTLR parser working, after 1-2 hours of reading up on ANTLR and about 3 hours of tweaking the grammar. It’s very brittle (otherwise known as strict), which in this case is a major problem, so I’ve had to add Java code to perform recovery when unexpected input arises. I’m now much more concerned that I’ll wind up with an unmaintainable mess than I was with my hand-written parser.
The worst part is that more people would be able to understand and work with the hand-written, plain Java parser than could understand the ANTLR grammar files. Not to mention that the generated source code (2 files, a lexer and a parser) generates 27 warnings in Eclipse. So much for a nice, clean, warning-free code base.
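To make the contrast concrete, here is a minimal sketch of what a lenient, hand-written parser in plain Java can look like. It is not the parser from this post: the "key = value" line format, the `LenientParser` class and the `Result` type are all hypothetical, chosen only to show recovery from unexpected input written as ordinary control flow.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: a forgiving hand-written parser for "key = value" lines.
// The input format is an assumption made for illustration only.
public class LenientParser {

    // Holds the pairs that parsed cleanly plus the lines that were skipped.
    public static final class Result {
        public final Map<String, String> values = new LinkedHashMap<>();
        public final List<String> skipped = new ArrayList<>();
    }

    public static Result parse(String input) {
        Result result = new Result();
        for (String rawLine : input.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and comments carry no data
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                // Recovery: no key or no '=' at all, so record the line and carry on
                // instead of aborting the whole parse.
                result.skipped.add(line);
                continue;
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            result.values.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = parse("name = demo\n???\nsize=3\n");
        System.out.println(r.values + " skipped=" + r.skipped);
    }
}
```

The recovery policy here is a single `if` that any Java developer can read and change, which is the maintainability argument in a nutshell. Whether that still holds for a larger grammar is a separate question.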
It’s great that tools like ANTLR are available for when you have a very clearly defined grammar (particularly a large or complex one) and you want the strict checking they provide, but I’m not sure I’ll feel so bad about writing my own parsers in future.