
ANTLR and Ruby—Perfect Together

In Ruby, Tools on January 30, 2012 at 9:33 am

I’ve been playing around with Ruby quite a bit lately. It’s a good language for building tools for my testing course. It’s also a great metaprogramming language. This weekend I found that it’s a pretty good language for building code analyzers and other language tools as well, especially if you have a great language processing tool like ANTLR (ANother Tool for Language Recognition).

I wanted to write a tool that would read in source code for a function (in a simple language that I created) and produce a list of nodes with information about definitions and uses of variables in a textual form that my students would be able to input into their programs for further analysis of control flow and data flow.

I use ANTLR as the tool for building lexers and parsers in my compiler construction course. But I’ve always used it by writing Java and having it generate Java. Certainly my tool could be written in any language since it just had to produce textual output, so writing the ANTLR-based processor in Java was my first thought. Then I decided it would be more fun to learn something new, and I thought about producing the tool in Ruby.

I knew that ANTLR has an option to produce Ruby and thought I’d take a crack at that. With a little bit of Web searching I found the ANTLR for Ruby project. All I had to do was load a Ruby gem and I was off and running. Well, at least I was off and crawling along.

I spent most of Saturday looking at documentation and understanding what an ANTLR grammar with Ruby actions would look like. I also wanted Ruby because I had the feeling that the plumbing around the resulting lexer and parser would be much simpler to write than it is in Java—and I was right!

I set up my project, initialized a Git repository, and started writing tests for the lexer and parser. This was slow going at first. I was learning and trying to write good code. Most of all though, it was fun.

The documentation on ANTLR for Ruby is pretty minimal. It seems that Kyle Yetter, the author, decided to do something else, and a lot of details are missing from the Web page. However, there is more detail and the complete API for antlr3 on the actual project pages. It’s not the best documentation, but it’s enough to get you going. Once I muddled through the basics, I picked up steam and finished the application comfortably on Sunday.

I found that the Ruby grammar file was simple and easy to read. Connecting to the parser and lexer, and setting up my own internal structures for capturing nodes, definitions, and uses of variables, were almost trivial.
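
To give a flavor of what such an internal structure might look like, here is a minimal sketch in plain Ruby. It is not the code from my project: the class name DefUseNode, its methods, and the one-line text format are all invented for illustration, and it assumes each node simply records the set of variables it defines and the set it uses.

```ruby
require 'set'

# Hypothetical node for one statement in a function.
# Names and output format are illustrative assumptions only.
class DefUseNode
  attr_reader :id, :defs, :uses

  def initialize(id)
    @id = id
    @defs = Set.new   # variables assigned at this node
    @uses = Set.new   # variables read at this node
  end

  # Record a definition; returns self so calls can be chained.
  def add_def(name)
    @defs << name
    self
  end

  # Record a use; returns self so calls can be chained.
  def add_use(name)
    @uses << name
    self
  end

  # One line of text per node, with variable names sorted.
  def to_s
    "node #{id} def: #{defs.sort.join(',')} use: #{uses.sort.join(',')}"
  end
end

# A statement such as "x = y + z" defines x and uses y and z.
node = DefUseNode.new(1).add_use('z').add_use('y').add_def('x')
puts node
```

In a setup like this, the grammar actions would only need to call add_def and add_use as the parser recognizes assignments and expressions, which is the kind of plumbing that stays short in Ruby.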

By the end of the weekend I had written the following number of lines (just using wc, so blank lines and comments are included here):

  • The ANTLR grammar file: 239
  • The script to run the processor and produce the output: 36
  • Utility classes for the nodes, def-use entries, and exceptions: 60
  • Unit tests: 285
  • Rake files: 31

So, the total number of lines I actually wrote came to 651. ANTLR generated 3059 lines of code. I think this is pretty good output for a weekend’s hackfest. I’ll post some of the code and the grammar for the language in future posts.

I encourage you to try your hand at something like this. It’s fun and you get to learn a lot. One thing I will say is that if you do this, make sure that you have really good documentation on hand for the tools you’re going to use. During the weekend I had both Programming Ruby 1.9 and The Definitive ANTLR Reference open in my PDF reader. I didn’t need the ANTLR book that much. Once you’ve written an ANTLR grammar, it’s pretty straightforward. But I was constantly searching the Ruby book for the right classes and methods. These two, combined with the online Ruby documentation, made the programming adventure fun and productive.