Tail-recursive calls are important for Erlang indeed. If you need to know why, you'll have to read that elsewhere. In this blog post I'll explain how Erjang handles tail-recursion.
When functions are self-recursive (the trivial case), a tail-recursion simply turns into a loop. For other cases things get a little more involved. Luckily however, it turns out to be quite efficient. Here is an outline of how Erjang does it:
- Every function is compiled to a class F with one method named body(EProc proc, EObject arg1, ... EObject argN). The function takes the EProc (a reference to the light-weight process) and the N arguments. The class F is marked final. This function contains the result of compiling an Erlang function into Java.
- If the function takes 2 arguments then F is a subclass of class EFun2, a class provided by Erjang. Erjang has a class loader that triggers when such a class is missing and generates EFunN classes on the fly.
- From the generated superclass, class F inherits the virtual methods invoke(EProc proc, EObject arg1, ... EObject argN) and go(EProc proc).
A normal call is encoded by simply calling F_instance.invoke(proc, arg1, ... argN), where F_instance is a statically allocated instance of class F.
A tail-recursive call to function G (assuming two arguments) is encoded thus:
- // save the arguments into fields of the EProc object
proc.arg1= value1; proc.arg2 = value2; - // save the target function singleton object a field of the EProc object
proc.tail = G_instance - // return a special global singleton tail marker.
return TAIL_MARKER;
The method body (which contains the real function body) is never invoked directly, but always indirectly via invoke or go. Those two functions collaborate to ensure that the caller never sees the TAIL_MARKER.
- Method invoke(EProc proc, EObject arg1, ... EObject argN) is used for "normal" invocations, and what it does is to first call res = this.body(proc, arg1, ... argN), and then loop as long as res is the TAIL_MARKER, repeatedly calling res = proc.tail.go() until the result of the tail-recursion is available.
- The method go simply calls this.body(EProc proc, EObject arg1, ... EObject argN) by taking the argument values from the proc object. Notice here that this refers to the function called in tail-position (G_instance) above.
Closures are encoded in a similar way, but the free variables are saved in a fresh instance of the Fun object, so they can be used repeatedly.
The net effect of all this is that
- We preserve proper tail-recursion semantics in the sense that the stack doesn't grow. A tail-recursive call pops a call frame (returning from body) and creates a new one (called from go).
- Other aspects of the code generation ensure that function objects like F and G in many cases statically typed whenever possible, the JVM can inline calls to invoke andbody (remember, they are final classes). This is the case whenever calling non-external functions.
- The final touch is this: saving arguments into the proc object is surprisingly cheap, because it is very likely to be on CPU-cache. Granted, it does require memory access, but on a modern processor it is probably not more expensive than saving them to the stack!
The overhead of all this is naturally a class with one method per function. Similar to how most other functional languages on JVM or CLR do it.
Further to this is that the JVM is likely to support tail calls in the near future, it seems. With native tail-call support we might do even better, but this encoding is actually already surprisingly fast.
As Per Bothner points out here, this is similar to other techniques.