Thursday, April 12, 2007

Free Java Compilers for GNU/Linux

For some time, free Java compilers for GNU/Linux have been disappointing. (As with anything in the world of free software, I want to moderate my criticism: if you really don't like something, at least you can fix it, and besides, you should be only so rude about stuff you get for free.)


Where We Stand

The state of things as I see it is this:







Compiler  Speed           Warnings/Errors  Language Coverage
javac     Slow            Lousy            Great (1.5)
gcj       Medium          Great            Lousy (1.1/1.2)
jikes     Blazing fast    Outstanding      Good (1.4)
ecj       Painfully slow  Very good        Excellent (1.5)


So each of these has at least one problem that will drive you crazy. Sun's own javac is slow and gives you the same error message ("cannot find symbol") for practically everything. GNU's gcj supports only an older version of the Java language: no generics, no enums, etc. IBM's jikes, too, is falling behind on the language feature front, and is apparently no longer maintained, so that's only going to get worse. Finally, the Eclipse project's ecj is so slow that you can actually brew a cup of java every time you run it.

(Kaffe's compiler, by the way, is just plain hopeless.)

As a result, I wasn't really happy with any of these. For a long time, I stuck with jikes as being the overall best contender, but I was increasingly unhappy about its lack of support for generics, enums, annotations, and other new language constructs.


Finding a Better Solution

When I looked at the alternatives, I was least repelled by ecj. If only there were some way to make it faster ....

Happily, there is. A little investigation showed that the problem was in the JVM startup time. Ecj itself is written in Java, and thus it must start up a JVM every time it runs. So a native-code version of ecj has a chance of being the Holy Grail of free Java compilers: fast, with good feedback and up-to-date language support.

And there's even a way to produce such a thing: gcj. Gcj can't compile ecj from source, because ecj's own code uses language constructs that gcj doesn't support. But one of gcj's many nice features is that it can compile a .class file, or a .jar file full of .class files, to native code. After a bit of experimentation, I was able to compile ecj's .jar files to native code. The resulting compiler is about as fast as jikes, offers feedback almost as good as that of gcj or jikes, and supports the language as well as javac does. It's a winner.
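
For the curious, here is a rough sketch of how such a build might be driven. It is not the exact recipe I used: the jar name (ecj.jar), the output name (ecj), and the -O2 flag are assumptions for illustration, and the NativeEcjBuild class is hypothetical. The command it assembles amounts to gcj -O2 --main=org.eclipse.jdt.internal.compiler.batch.Main -o ecj ecj.jar, where --main tells gcj which class's main() the native executable should start in.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Sketch only: assembles (and can run) a gcj command line that compiles a
// jar of .class files to a native executable. File names are assumptions.
class NativeEcjBuild {
    // Entry point of ecj's batch (command-line) compiler.
    static final String ECJ_MAIN = "org.eclipse.jdt.internal.compiler.batch.Main";

    // Builds the argument list; jarPath and outputPath are caller-supplied.
    static List<String> gcjCommand(String jarPath, String outputPath) {
        return Arrays.asList(
            "gcj",
            "-O2",                 // optimize the generated native code (assumed choice)
            "--main=" + ECJ_MAIN,  // class whose main() the executable starts in
            "-o", outputPath,      // name of the native executable to produce
            jarPath);              // jar of already-compiled .class files
    }

    // Runs the command with inherited stdin/stdout/stderr; returns gcj's exit status.
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

Calling NativeEcjBuild.run(NativeEcjBuild.gcjCommand("ecj.jar", "ecj")) on a machine with gcj installed should leave a native ecj binary in the current directory. Depending on your gcj version, you may need extra flags or extra jars on the classpath.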


Regrets

There are still some things I miss about using jikes:
  • Spell-checking. If you mistyped a symbol (a method name, variable name, class name, etc.), jikes offered reasonable alternatives. If you accidentally typed mystring.substrng(...), jikes would suggest mystring.substring(...). Its suggestions were nearly always correct, and the feature was a huge time saver. Ecj has nothing like it.
  • Jikes also behaved better when compiling multiple files. It correctly detected and compiled dependencies, didn't care if you specified a file more than once on the command line, and would go on to compile file N+1 even if file N failed to compile (more useful than you might think). Ecj does none of this.
Still, a natively compiled ecj is a really nice option, definitely the cream of the current crop.
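
To make the spell-checking idea concrete, here is a minimal sketch of one way a compiler could pick a suggestion for a mistyped symbol: compute the edit distance to each known name and offer the closest one. I don't know how jikes actually does it; the SymbolSuggester class and the distance threshold are hypothetical, and this uses plain Levenshtein distance as a stand-in.

```java
import java.util.List;

// Sketch only: suggests the known symbol closest to a mistyped one.
class SymbolSuggester {
    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                    Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                    prev[j - 1] + cost);                    // match or substitute
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the closest known name within maxDistance edits, or null if none.
    static String suggest(String typo, List<String> known, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String name : known) {
            int d = editDistance(typo, name);
            if (d < bestDistance) {
                bestDistance = d;
                best = name;
            }
        }
        return best;
    }
}
```

Given the known names substring, subSequence, and startsWith, a call to SymbolSuggester.suggest("substrng", names, 2) picks substring, since it is only one insertion away. A real compiler would restrict the candidates to symbols that are actually visible at the point of the error.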

2 comments:

  1. Nice summary.

    A natively compiled version of ecj is in many current distributions, I believe. Try apt-get install ecj-bootstrap-gcj on Debian, for example.

    cheers,
    dalibor topic

  2. Good point; thanks for mentioning that. It is certainly a simpler way to get the same effect for most people.

    As it happens, my target system was an ancient Red Hat 7.3 (!!!) box, for which I could find no prebuilt binaries. Plus, doing it myself was fun. :-)
