Erik has been hacking on the interpreter for the last few weeks, and now we're finally at a state where it is useful (though there are still some issues).
Interpreter??
Yep, now Erjang has two modes of operation.
- The default is the Eager JIT mode, meaning that modules are compiled from .beam to .jar-with-class-files eagerly as they are encountered as OTP boots up. This results in a relatively slow start-up, but fast execution.
- The new option ej +i lets you run Erjang in the byte-code interpreter execution mode, which starts up faster and uses less memory, but runs slower.
One effect of this can be seen in the launch time of Erjang. Look at this:
prompt$ rm .erj/* ## delete class cache
prompt$ echo 'erlang:halt().' | time ej
Eshell V5.8 (abort with ^G)
1> 13.96 real 14.45 user 1.86 sys
Wow, that took 14 seconds to launch the Erjang REPL. Now, doing the same thing again should speed things up because now the .beam files have been compiled; see here:
prompt$ echo 'erlang:halt().' | time ej
Eshell V5.8 (abort with ^G)
1> 4.67 real 4.61 user 0.29 sys
Wow! The second launch was much faster because the Eager JIT no longer has to pay the price of the up-front compilation.
So, running the interpreted Erjang (notice the +i there) gives you:
prompt$ echo 'erlang:halt().' | time ej +i
Eshell V5.8 (abort with ^G)
1> 2.47 real 2.64 user 0.20 sys
Wow, it's only ~53% of the launch time of the pre-compiled case.
You might be puzzled why this is faster than the pre-compiled case above. The reason is that Java byte code loading is fairly slow, and that case ends up loading a lot of Java code, which the Java JIT has not yet had a chance to optimize.
Unfortunately, running BEAM (the erl command) yields...
prompt$ echo 'erlang:halt().' | time erl
Eshell V5.8 (abort with ^G)
1> 0.24 real 0.11 user 0.02 sys
10 times faster! Urgh. Fortunately, for most Erlang systems launch time is not significant... but still. Over time (when the Java JIT kicks in) the performance of Erjang and BEAM is currently on par.
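To put the launch times side by side, here is a small sketch that computes the ratios from the "real" (wall-clock) figures in the transcripts above. The labels and the helper name `ratio` are made up for illustration; only the four numbers come from the measurements.

```python
# Wall-clock ("real") seconds copied from the `time` output above.
launch_seconds = {
    "ej, empty class cache": 13.96,
    "ej, warm class cache": 4.67,
    "ej +i (interpreter)": 2.47,
    "erl (BEAM)": 0.24,
}


def ratio(a, b):
    """Return how many times longer launch `a` takes than launch `b`."""
    return launch_seconds[a] / launch_seconds[b]


# Print every launch time relative to BEAM, the fastest of the four.
for name, seconds in launch_seconds.items():
    print(f"{name:24s} {seconds:6.2f}s  {ratio(name, 'erl (BEAM)'):5.1f}x BEAM")
```

Run as a script, this prints one line per launch mode, each with its time expressed as a multiple of the BEAM launch time.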
Memory Usage
Having the interpreter provides two new interesting possibilities. First, it brings Erjang within reach of running on an Android machine, because (a) this execution mode does not involve runtime generation of code, and (b) the memory load is much smaller. Let's try to launch mnesia and inspect memory usage.
First, look at interpreted mode...
prompt$ ej +i
1> mnesia:start().
ok
2> 'java.lang.System':gc().
ok
3> erlang:system_info(allocated_areas).
[{'non_heap:CMS Perm Gen',134217728,10628584},
{'heap:CMS Old Gen',50331648,7618216},
{'heap:Par Survivor Space',1638400,0},
{'heap:Par Eden Space',13500416,1790200},
{'non_heap:Code Cache',2560000,2529664}]
Next, compiled mode...
prompt$ ej
Eshell V5.8 (abort with ^G)
1> mnesia:start().
ok
2> 'java.lang.System':gc().
ok
3> erlang:system_info(allocated_areas).
[{'non_heap:CMS Perm Gen',134217728,35051856},
{'heap:CMS Old Gen',50331648,13346696},
{'heap:Par Survivor Space',1638400,0},
{'heap:Par Eden Space',13500416,416592},
{'non_heap:Code Cache',2494464,2483520}]
To make it more readable, here is a table of the bytes in use per memory area:
| Memory area (bytes used) | compiled | interpreted |
|---|---|---|
| Old Gen | 13,346,696 | 7,618,216 |
| Survivor Space | 0 | 0 |
| Eden Space | 416,592 | 1,790,200 |
| Perm Gen (Java byte code) | 35,051,856 | 10,628,584 |
| Code Cache (native code) | 2,483,520 | 2,529,664 |
In compiled mode, memory adds up to ~55MB; in interpreted mode, ~20MB. Over time, when the Java JIT kicks in, this difference is going to be even more evident.
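The totals quoted in the text are rounded. As a quick check, the sketch below adds up the used-bytes figures from the two `erlang:system_info(allocated_areas)` outputs above. The dictionary layout and the `total_mb` helper are illustrative assumptions, and MB here means 1,000,000 bytes. Summed this way, the columns come to about 51 MB and 23 MB, the same ballpark as the rounded figures.

```python
# Bytes in use per JVM memory area, copied from the shell output above.
used_bytes = {
    "compiled": {
        "Old Gen": 13_346_696,
        "Survivor Space": 0,
        "Eden Space": 416_592,
        "Perm Gen": 35_051_856,
        "Code Cache": 2_483_520,
    },
    "interpreted": {
        "Old Gen": 7_618_216,
        "Survivor Space": 0,
        "Eden Space": 1_790_200,
        "Perm Gen": 10_628_584,
        "Code Cache": 2_529_664,
    },
}


def total_mb(mode):
    """Sum all areas for one execution mode, in units of 10**6 bytes."""
    return sum(used_bytes[mode].values()) / 1_000_000


for mode in used_bytes:
    print(f"{mode:12s} {total_mb(mode):5.1f} MB")
```

Most of the gap comes from the Perm Gen row, which is where the generated Java classes live.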
Another interesting measure is the memory usage before and after starting mnesia. With ej +i, starting mnesia adds ~1MB to total memory usage (similar to BEAM); but in compiled mode, starting mnesia adds ~9MB. So, while the baseline for a booted OTP is higher with ej +i than with BEAM (~19MB vs ~3MB), the incremental memory usage when loading code and running is of the same order of magnitude.
HotSpot Erjang
Another interesting possibility would be to combine the Erjang interpreter and the Erjang JIT, so that code starts up interpreted and then, if something is used "enough", it gets JIT'ed to Java byte code. This is similar to how HotSpot Java does it (but is also one of the things that Oracle has patented and was recently suing Google over ...).
In such a scenario, there are some more interesting optimizations that would make sense; because if we have more information then it might well make sense to do more aggressive optimizations and trade off some extra generated byte code for more speed.
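The interpret-first, compile-when-hot idea can be sketched in a few lines. This is only a toy illustration, not how Erjang or HotSpot implement it: the `TieredFunction` class, the `HOT_THRESHOLD` value, and the tiny two-step "program" are all hypothetical, and a Python lambda stands in for generated Java byte code.

```python
# Hypothetical threshold: how many interpreted calls count as "enough".
HOT_THRESHOLD = 3


class TieredFunction:
    """Run the interpreted form until the function is hot, then compile once."""

    def __init__(self, interpret, compile_fn, threshold=HOT_THRESHOLD):
        self._interpret = interpret
        self._compile = compile_fn
        self._threshold = threshold
        self._compiled = None
        self.interpreted_calls = 0

    @property
    def is_compiled(self):
        return self._compiled is not None

    def __call__(self, *args):
        if self._compiled is None:
            if self.interpreted_calls < self._threshold:
                # Still cold: count the call and take the slow path.
                self.interpreted_calls += 1
                return self._interpret(*args)
            # Hot: pay the compilation cost once, then keep the result.
            self._compiled = self._compile()
        return self._compiled(*args)


# Toy "program": steps applied in order to an accumulator.
PROGRAM = [("add", 2), ("mul", 10)]


def interpret(x):
    """Walk the program step by step on every call."""
    for op, n in PROGRAM:
        x = x + n if op == "add" else x * n
    return x


def compile_program():
    """Turn the program into a single Python expression and compile it."""
    source = "x"
    for op, n in PROGRAM:
        symbol = "+" if op == "add" else "*"
        source = f"({source} {symbol} {n})"
    return eval(f"lambda x: {source}")


f = TieredFunction(interpret, compile_program)
results = [f(1) for _ in range(5)]
print(results, f.interpreted_calls, f.is_compiled)
```

The call counter is also an example of the "more information" mentioned above: a real system could record argument types or call targets while interpreting and use them to guide what the compiler generates.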
Getting there...
As I hinted above, there are still issues with the interpreted mode; one is that interpreted mode uses more stack space; another is that there are still some bugs lurking in there. At least when running rabbitmq, something doesn't boot correctly. So, there is still some way to go.