For the last few days I've been working on completing Erjang's implementation of function objects (funs), specifically the ability to pass fun values from one node to another, and I'd like to share some of the details of how that works. Nothing breathtaking here, just some details that Erlangers might find interesting.
As you might know, Erlang distribution works by sending the binary encoding of terms. Sending a fun is therefore essentially done by encoding it with erlang:term_to_binary/1, passing the resulting binary to another node, and then decoding it again with erlang:binary_to_term/1. (The truth is a bit more complex, because the distribution protocol has some extra tags and information stored in the binary format.) This is pretty obvious for most data types, but how does it work for function objects?
When you encode a fun, what is encoded is just a reference to the function, not the function implementation. The information passed on the wire corresponds largely to the information given by erlang:fun_info/1:
Eshell V5.7.3 (abort with ^G)
(erjang@127.0.0.1)1> erlang:fun_info(fun(X)->X*X end).
[{pid,<1.0.668>},
{module,erl_eval},
{new_index,2},
{new_uniq,<<56,152,230,129,197,121,23,226,118,89,11,188,
47,178,198,203>>},
{index,6},
{uniq,13229925},
{name,'-expr/5-fun-2-'},
{arity,1},
{env,[[],
{value,#Fun<shell.7.125426800>},
{eval,#Fun<shell.24.80962049>},
[{clause,1,
[{var,1,'X'}],
[],
[{op,1,'*',{var,1,'X'},{var,1,...}}]}]]},
{type,local}]
Some of these values are a bit strange, so let me explain. The first thing you notice is that a fun object captures a pid: the process that was current when the fun was created. That's quite peculiar, and I have not been able to figure out why this information is part of the fun, nor what it is used for.
The fun's identification consists of the values for module, new_index, new_uniq, index and uniq. The module is just the name of the module, and new_uniq is an MD5 checksum of the module. new_index is an index into the BEAM file identifying the fun. The new_... elements were added at some point (distribution protocol version 5); prior to that, only module, index and uniq were passed (and this still applies when talking to pre-R5 nodes). The old uniq is the hash code of the parse tree of the fun; index was some other index into the BEAM file.
Finally, the values of the function's free variables (env above) are passed in both versions of the distribution protocol.
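If you only want the identifying fields rather than the whole list, erlang:fun_info/2 returns one item at a time. Here is a minimal sketch; the module name fun_identity and the function identity/1 are made up for illustration.

```erlang
-module(fun_identity).
-export([identity/1]).

%% Collect the items that together identify a fun, as {Item, Value} tuples.
%% For an external fun (fun M:F/A) the index/uniq items are not meaningful.
identity(F) when is_function(F) ->
    [erlang:fun_info(F, Item)
     || Item <- [module, new_index, new_uniq, index, uniq]].
```

Calling fun_identity:identity(F) on a fun created in the shell should give the same module, new_index, new_uniq, index and uniq values as the full erlang:fun_info/1 listing.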
The values of module, index and uniq are what is printed in the string representation of a fun:
(erjang@127.0.0.1)2> fun(X)->X*X end.
#Fun<erl_eval.6.13229925>
But the definition of the function is not passed along; just enough information is sent to recreate the fun on another node, provided the module is there.
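You can try the encode/decode round trip on a single node, where the module is trivially present. This is only a sketch of the idea, not the distribution protocol itself; the module name fun_roundtrip and the function demo/1 are hypothetical.

```erlang
-module(fun_roundtrip).
-export([demo/1]).

%% Encode a local fun with one free variable, decode it again,
%% and apply the decoded copy. Factor is captured in the fun's env.
demo(Factor) ->
    F = fun(X) -> X * Factor end,
    Bin = term_to_binary(F),            % what would travel over the wire
    G = binary_to_term(Bin),            % what the receiver reconstructs
    {env, Env} = erlang:fun_info(G, env),
    {G(7), Env}.
```

fun_roundtrip:demo(3) applies the decoded fun to 7, so the first element of the result is 21; the second element is the env that travelled inside the binary.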
So what happens if the function implementation is not there when the fun is decoded? Well, that depends.
If the module containing the fun has not yet been loaded, and the target node is running in interactive mode, then the runtime attempts to load the module using the regular module loading mechanism (contained in the module error_handler), and then checks whether a fun with the given id is available in that module. However, this only happens lazily, when you try to apply the function.
If you never attempt to apply the function, then nothing bad happens. The fun can be passed to another node (which has the module/fun in question) and then everybody is happy.
Maybe the target node has a module of that name loaded, but in a different version, which is then very likely to have a different MD5 checksum. In that case you get the error badfun if you try to apply the fun.
So you can see that funs are not resilient to module updates, and this is one of the reasons why you want to keep the versions of your Erlang nodes in sync. The Erlang runtime system makes sure that the function is only "applicable" if the real function is actually there. This version checking of funs was significantly strengthened with the update in Erlang distribution protocol 5, because the identity of the fun now includes an MD5 checksum of the entire module, not just the body of the fun itself.
But there is another kind of fun, a fun created with a direct reference to an exported function, such as this:
(erjang@127.0.0.1)6> fun erlang:display/1 .
#Fun<erlang.display.1>
Such function objects are transferred differently, carrying only the module, function and arity. That makes these funs much more resilient to module updates, as they are applicable regardless of the version of the module. They behave just like an ordinary external function call. So if you use these "external funs" carefully, you can use them to work with different versions of the same module, such as in scenarios where you do code upgrades.
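A quick way to see the difference is to round-trip an external fun and look at what erlang:fun_info/2 reports for it. This is a small sketch; the module name external_fun and the function demo/0 are made up for illustration.

```erlang
-module(external_fun).
-export([demo/0]).

%% Round-trip an external fun and inspect the three values it carries.
demo() ->
    F = fun lists:reverse/1,
    G = binary_to_term(term_to_binary(F)),
    {erlang:fun_info(G, type),
     erlang:fun_info(G, module),
     erlang:fun_info(G, name),
     erlang:fun_info(G, arity),
     G([1, 2, 3])}.
```

The result is {{type,external},{module,lists},{name,reverse},{arity,1},[3,2,1]}: no checksum or index is involved, only the module, function name and arity.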
In the fun_info example at the top, the env does actually include the function definition, because the fun exposes part of the Erlang interpreter that is part of the shell. So that's interesting too; it is kind of like passing a string to a remote site and doing "eval". You can use this yourself to create funs that can be passed around, like this [example due to Vladimir Sekissov]:
1> FunStr = "fun (A) -> A+B end.".
2> {ok, Tokens, _} = erl_scan:string(FunStr).
3> {ok, [Form]} = erl_parse:parse_exprs(Tokens).
4> Bindings = erl_eval:add_binding('B', 2, erl_eval:new_bindings()).
5> {value, Fun, _} = erl_eval:expr(Form, Bindings).
6> Fun(1).
3
This will work as long as the version of erl_eval is the same on the sender and receiver node.
7> erlang:fun_info(Fun).
[{pid,<1.0.668>},
{module,erl_eval},
{new_index,2},
{new_uniq,<<56,152,230,129,197,121,23,226,118,89,11,188,
47,178,198,203>>},
{index,6},
{uniq,13229925},
{name,'-expr/5-fun-2-'},
{arity,1},
{env,[[{'B',2}],
none,none,
[{clause,1,
[{var,1,'A'}],
[],
[{op,1,'+',{var,1,'A'},{var,1,...}}]}]]},
{type,local}]
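The shell steps above can be wrapped in a small helper and combined with an encode/decode round trip, which is roughly what happens when such a fun is sent to another node. This is a sketch that runs on a single node; the module name string_fun and its functions are hypothetical, and it assumes the string contains exactly one expression.

```erlang
-module(string_fun).
-export([from_string/2, demo/0]).

%% Build an interpreted fun from source text and a list of {Var, Value} bindings.
from_string(Str, BindingList) ->
    {ok, Tokens, _} = erl_scan:string(Str),
    {ok, [Form]} = erl_parse:parse_exprs(Tokens),
    Bindings = lists:foldl(
                 fun({Var, Val}, Acc) -> erl_eval:add_binding(Var, Val, Acc) end,
                 erl_eval:new_bindings(),
                 BindingList),
    {value, Fun, _} = erl_eval:expr(Form, Bindings),
    Fun.

%% The parse tree and the bindings live in the fun's env,
%% so they survive the trip through the binary format.
demo() ->
    F = from_string("fun (A) -> A+B end.", [{'B', 2}]),
    G = binary_to_term(term_to_binary(F)),
    G(1).
```

string_fun:demo() returns 3, the same value as Fun(1) in the shell session.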
This may work for you, but the resulting fun is obviously rather slow, since it is interpreted when applied. A better way might be to put your fun in a separate module which can then be force-loaded on the target node. Here is some code (from TrapExit) showing how to force load code remotely:
%% Find object code for module Mod
{Mod, Bin, File} = code:get_object_code(Mod),
%% and load it on all connected nodes including this one;
%% the next time the code is called it will be using the new version
{Replies, _} = rpc:multicall(code, load_binary, [Mod, File, Bin]),
%% if this node were the "master/admin" node
%% then to push to everything but this node use:
%% {Replies, _} = rpc:multicall(nodes(), code, load_binary, [Mod, File, Bin]),
%% and then maybe check the Replies list.
This force-loads your local version of the code onto all connected nodes by passing the local BEAM code to the remote nodes, which ensures that all the remote nodes have the correct version of your code.
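The "maybe check the Replies list" step could look something like the sketch below. rpc:multicall/3 returns {Replies, BadNodes}, and code:load_binary/3 answers {module, Mod} on success or {error, Reason} on failure. The module name push_code and the functions push/1 and summarize/1 are made up for illustration.

```erlang
-module(push_code).
-export([push/1, summarize/1]).

%% Push the local object code for Mod to this node and all connected nodes.
push(Mod) ->
    case code:get_object_code(Mod) of
        {Mod, Bin, File} ->
            {Replies, BadNodes} =
                rpc:multicall(code, load_binary, [Mod, File, Bin]),
            {ok, summarize(Replies), BadNodes};
        error ->
            %% no object code for Mod was found in the code path
            {error, no_object_code}
    end.

%% Count the successful loads and keep every other reply for inspection.
summarize(Replies) ->
    {Loaded, Failed} =
        lists:partition(fun({module, _}) -> true; (_) -> false end, Replies),
    {length(Loaded), Failed}.
```

A result such as {ok, {3, []}, []} would mean that three nodes loaded the module, none reported an error, and none were unreachable. Anything in the second or third position deserves a closer look before you start sending funs from that module around.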