My Quest for the Holy Grail
For many years, I have been searching for what I call the "holy grail of concurrent programming." I have been looking for a set of abstractions as powerful as classes and objects which describes activities and their coordination. Just as was the case for the medieval quest for the holy grail, I am not alone. Many other researchers are searching for this. I remember that at the time when I started doing my Ph.D. in 1996, I was talking to Doug Lea about this subject; and he urged me to continue this line of thought. My thesis subject ended up going down another path, but the issue has persisted at the back of my mind, ready to jump up whenever I experienced new evidence that might guide me.
For the last couple of years I have been pursuing these issues more persistently. As chief editor of the JAOO and QCon conferences I have invited speakers who have something related to this on their hearts and minds, and over the last six months - since I started working more intensively with the Erlang programming language - these ideas have started to grow into a more consistent picture.
So here is a little essay and some ramblings on some of the things I have found out so far.
The Danger of Understanding How Things Work
When I was 16 years old, I started programming on a Commodore 64; I was learning MOS 6510 machine code, interrupt programming and all kinds of crazy low-level stuff. In the process I learned a lot about how operating systems work - too much for my own good. For 25 years, these lessons have haunted me, because they ruined my ability to think in many ways. Because I actually understand what is going on one level down, it is difficult to abstract from that knowledge. It seems that because I know how the computer is organized, I tend to think that way too. When I think of a problem that I need to solve, I tend to jump straight to what I want the computer to ultimately do to solve it. This kind of thinking can be good from a performance point of view - and I have used this knowledge to that purpose on many occasions - but it ruins the resulting program in terms of understandability and maintainability.
But I'm not the only one. Everywhere around us, we are trying to solve problems with the tools at hand (what else can we do?), and these tools shape the way we think of problems. When it comes to thinking about concurrent activities - a.k.a. processes - this is a big issue.
Imagine implementing Asteroids, the game. What would your model look like? One process for each of those 1000 things on the screen? Or one object/data structure for each thing, and then have some sort of scheduler that runs a task every N milliseconds, asking each object to move itself and check for collisions?
Well, processes are expensive, so it's not good to have a process for each thing on the screen; maybe 1000 processes just for that, are you crazy? Moreover, how are you supposed to coordinate all those processes? Crazy idea; we can't do it with processes using our current tools.
Sidebar: In 1996 a colleague and I won a programming contest at the first JavaOne conference - the "JavaCup International". Our feat? We wrote a framework for writing games, and the example app was ... ta da ... Asteroids. The design was the way you would think, with objects and a scheduler.
Because we think of processes as "somewhat expensive", our intuitive reaction to modeling systems that have processes in them is fundamentally wrong. My thesis is - much inspired by Joe Armstrong - that once you get used to programming in a system in which processes are "very cheap" (by some relative measure), the way you model things changes significantly. Consider this quote from Joe Armstrong: "Imagine an object-oriented language that would only allow 500 objects; that is how constrained we are with processes today". I predict that 10 years from now, we will look back at 2009 and think of it the way we now think of the pre-OO era. We will say things like "Life is much easier when modeling with processes, how could we have missed that?".
This change will be for the better in terms of understandability, readability and maintainability of the resulting system. And if our systems are better in these regards, they are also less likely to have errors; and they are easier to improve - even improve performance if that is the issue.
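To make the "one process per thing" model from the Asteroids question concrete, here is a minimal sketch in Go, where goroutines play the role of cheap processes. It is only an illustration under my own assumptions: the names `Position`, `asteroid` and `simulate` are made up for this example, each asteroid moves with a constant velocity, and collision checks are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// Position is a snapshot value sent out of an asteroid's goroutine.
// (Hypothetical type, introduced only for this sketch.)
type Position struct {
	ID   int
	X, Y float64
}

// asteroid runs as its own goroutine. It owns its coordinates, advances
// one step per tick, and reports a snapshot after each step.
func asteroid(id int, vx, vy float64, ticks <-chan struct{}, out chan<- Position, wg *sync.WaitGroup) {
	defer wg.Done()
	x, y := 0.0, 0.0 // state private to this goroutine
	for range ticks {
		x += vx
		y += vy
		out <- Position{ID: id, X: x, Y: y}
	}
}

// simulate starts n asteroids, drives them for the given number of steps,
// and returns the last reported position of each one.
func simulate(n, steps int) []Position {
	ticks := make([]chan struct{}, n)
	out := make(chan Position, n) // room for one report per asteroid per step
	var wg sync.WaitGroup
	for i := range ticks {
		ticks[i] = make(chan struct{})
		wg.Add(1)
		go asteroid(i, float64(i), 1, ticks[i], out, &wg)
	}
	final := make([]Position, n)
	for s := 0; s < steps; s++ {
		for _, t := range ticks {
			t <- struct{}{} // tell every asteroid to move
		}
		for i := 0; i < n; i++ {
			p := <-out // collect one report per asteroid
			final[p.ID] = p
		}
	}
	for _, t := range ticks {
		close(t) // ends each asteroid's loop
	}
	wg.Wait()
	return final
}

func main() {
	final := simulate(1000, 10)
	fmt.Println(len(final), final[3])
}
```

Nothing in this sketch treats 1000 as a large number, and each asteroid reads like a small sequential program of its own. The coordination is still a central tick loop, so it only covers the "cheap" half of the argument.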
The Power of Object-Oriented Thinking
BUT! ... it is not enough to have "cheap processes" and message passing the way Erlang has them, though I believe that cheap processes are a necessary element of a solution. We also need a way to think of processes that lets us draw on our (vast) previous experience from the physical world to reason about them and to structure our thinking about them. That's the magic that will make processes manageable.
In the late 1980s, Dave Ungar - a researcher at Stanford and later at Sun Labs - came up with a crazy idea. He started a project to create a new object-oriented programming language that would support a certain style of programming, without worrying about performance in the design. He then took it as an engineering and research challenge to make it run fast. The language he was designing was called Self, and even though you might not have heard of it, it changed our world. This fundamental shift in attitude towards language design revolutionized programming language implementations. Not many people know Dave, but he and his team came up with many of the major innovations that make Java and C# run fast today. So even though his original project did not fly, he paved the way for other object-oriented programming languages to enable object-oriented thinking without the programmer having to understand too much about the machine when writing programs in such languages.
I believe that giving programmers good high-level abstractions as part of programming languages has always been the most powerful way to improve programs in general. In my opinion, object-oriented thinking has proven this big time. The concepts of objects and classes have empowered developers to structure many aspects of problems in a more intuitive fashion, releasing their intuition to reason about these structures in terms of a mental model that resembles our "natural intuition" (read: our schooling in Aristotelian logic) rather than how the computer works.
The kind of intuition released by objects and classes is "good" in the sense that it stems from our ability to make deductions from observations and experience outside of the programming domain. All the fundamental concepts in object-orientation are concepts that we learned as children: object identity (two similar looking objects being not-the-same), classification (kinds of things), composition (putting things together or taking them apart) and basic notions of relations between things. Such concepts are natural to us, and thus if we can fit them into our programming language constructs, they release our natural curiosity and ability to reason about things based on outside experiences, without having to divert into (too) artificial concepts that are only applicable in the context of programming.
Some of our core concepts, however, did not make it into the first generation of object-oriented languages: notions of time and space, location, places, activities and their concurrent execution. It is time we start figuring out how to make that happen.
Joe Armstrong, one of the original creators of Erlang, has been advocating against object-oriented programming on the grounds that it does not support concurrency. This is not an unjust argument given the state of object-oriented programming languages, but it is unfair to the idea of object-oriented thinking. Object-oriented thinking is about modeling systems in a fashion that makes the resulting software resemble the "real world", so as to better aid our natural abilities to reason about the resulting system.
Thinking of Activities
The tools we have available today in mainstream programming languages for working with processes are fundamentally variations on the concepts of threads and locks. (Yes, I know that is a gross simplification, but not entirely untrue nonetheless.) Whereas threads and locks are indeed fine tools to create processes, they are lousy when it comes to managing and coordinating the outcome. And this is why we "hate threads" - threads are impossible to manage. Let a thread loose in your code, and it will run all over the place; it's not confined to a part of your program, it has no scope and no concept of time or space. Threads shoot themselves in the foot all the time, running into each other and creating havoc in their path. Unlike objects, which as discussed above have some resemblance to "real things", a thread is more like an electric current that runs through your objects. It's just difficult to cope with.
So let us look around and try to find the things in our real world that resemble processes, activities and such. What are they like? Here is just a sample of things that are related to real-life processes:
- An agenda, a recipe, or a script for a play; these are things that prescribe how an activity should unfold. Many activities have an associated prescriptive artifact that describes what steps are involved in carrying out the activity.
- A cook following a specific recipe, or an actor who is acting a specific role in a play. These are objects that are themselves active or "live." The cook or the actor is not the activity itself; the activity is something they carry out, something separate from the live object and also separate from the recipe or script. However, they need to keep track of how far they have come, so they know what to do next. Keep your index finger right there on line 7 in the recipe; it's the program counter.
- An ensemble playing Hamlet. The "play" is a coordinated effort of multiple actors, an aggregate activity composed of the activities carried out by a multitude of actors, each doing their "part". We could similarly describe what happens in a restaurant as a coordinated aggregate activity played out by chefs, waiters, the hostess, and the guy doing the dishes out back.
- A party entering the restaurant has its own agenda, but its members interact with the people in the restaurant in some coordinated fashion. To make the interaction easier, the restaurant has prepared a menu - which is by and large the only "interface" you have for interacting with the restaurant.
If we look hard enough in the above examples, I think we can find the abstractions we need, without diverting into artificial abstractions that have the distinct smell of hardware, interrupts, threads and locks.
On the other hand, there is a long way from this list of examples to the formalisms required to make a programming language, but I hope that by now you can follow my train of thought on the importance of grounding our abstractions in the real world. I think this kind of discourse matters, because we need to have an opinion on how to think about programs that is disconnected from the code representation.
Process == Object
Now that I have studied Erlang for a while, and from reading some of Alan Kay's old writings [see references below], I have come to the idea that "process" and "object" are quite compatible: object-oriented thinking and Erlang-style process thinking are, with the right interpretation, very compatible. And I think that we need to wrap Erlang-style concurrency in an object-oriented presentation to make the mainstream swallow it.
The kind of encapsulation that Erlang has for processes is the kind of encapsulation objects should have had. An Erlang process encapsulates its own mutable state, and everything a process shares with the outside world is in the form of immutable values. An Erlang process' mutable state is (a) its process dictionary, which is like a set of "thread-local variables" or "process instance variables", and (b) the state of its stack, arguments and program counter. Can we combine those notions with an object-oriented way of thinking? I think so.
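As a rough illustration of what such a combination could look like, here is a minimal sketch in Go rather than Erlang. The `Account` type and its methods are hypothetical names chosen for this example. The mutable balance lives inside one goroutine, the methods form the object-like surface, and only values cross the boundary. Note that Go does not enforce immutability the way Erlang does; the sketch simply restricts itself to messages that are copied when sent.

```go
package main

import "fmt"

// Messages are plain values, so the receiving goroutine gets its own copy.
type deposit struct{ amount int }
type balanceQuery struct{ reply chan<- int }

// Account is the surface of the process: the only way to reach its state.
// (Hypothetical type, introduced only for this sketch.)
type Account struct {
	inbox chan<- interface{}
}

// NewAccount starts the goroutine that owns the balance.
func NewAccount() Account {
	inbox := make(chan interface{})
	go func() {
		balance := 0 // mutable state, visible only inside this goroutine
		for msg := range inbox {
			switch m := msg.(type) {
			case deposit:
				balance += m.amount
			case balanceQuery:
				m.reply <- balance // an int is copied, never shared
			}
		}
	}()
	return Account{inbox: inbox}
}

// Deposit sends a message; the caller never touches the balance itself.
func (a Account) Deposit(n int) { a.inbox <- deposit{amount: n} }

// Balance asks the process for a copy of its current balance.
func (a Account) Balance() int {
	reply := make(chan int)
	a.inbox <- balanceQuery{reply: reply}
	return <-reply
}

// Close stops the process by closing its inbox.
func (a Account) Close() { close(a.inbox) }

func main() {
	acct := NewAccount()
	acct.Deposit(5)
	acct.Deposit(7)
	fmt.Println(acct.Balance()) // prints 12
	acct.Close()
}
```

In this arrangement the methods run on the caller's side and only wrap message sends, while everything that touches `balance` runs inside the process.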
One of my mental troubles with Erlang is that it is not clearly visible what code [in a module] is inside a process and which code is on the "surface" of the process i.e., code that is used to encapsulate interactions with the process. In an object-oriented programming language, this distinction is quite clear because the boundary is the methods. Trouble is that this distinction in most (all?) object-oriented programming languages is wrong, because they permit mutable state to escape. So how do we fix that?
Another thing that has challenged me with process thinking is how to manage "multiple conversations" from a single process. [And this could be the subject of a larger discourse.] Since an Erlang process has just one message queue, you need to be able to look ahead into the queue and select the messages that are relevant for the conversation you want to have. In effect, multiple channels of communication are multiplexed onto a single message queue. That seems to work just fine, but perhaps things are clearer if we simply have the multiple channels explicitly, as in CSP / Go?
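For comparison, this is roughly what the explicit-channel alternative looks like in Go, borrowing the restaurant from the examples earlier. It is a sketch under my own assumptions: the `waiter` function and the two channels `orders` and `ready` are invented for illustration. Each conversation gets its own channel, and `select` picks whichever one has something to say.

```go
package main

import "fmt"

// waiter handles two conversations on two explicit channels: orders coming
// from the guests and finished dishes coming from the kitchen. It returns a
// record of what it handled once both channels have been closed.
// (Hypothetical function, introduced only for this sketch.)
func waiter(orders <-chan string, ready <-chan string) []string {
	var handled []string
	for orders != nil || ready != nil {
		select {
		case o, ok := <-orders:
			if !ok {
				orders = nil // a nil channel is never selected again
				continue
			}
			handled = append(handled, "order: "+o)
		case d, ok := <-ready:
			if !ok {
				ready = nil
				continue
			}
			handled = append(handled, "serve: "+d)
		}
	}
	return handled
}

func main() {
	orders := make(chan string)
	ready := make(chan string)
	go func() {
		orders <- "soup"
		orders <- "steak"
		close(orders)
	}()
	go func() {
		ready <- "soup"
		close(ready)
	}()
	for _, line := range waiter(orders, ready) {
		fmt.Println(line)
	}
}
```

The order in which the two conversations interleave is not fixed, so the output can vary between runs. What the explicit channels buy is that each conversation is visible in the function's signature instead of being picked out of a shared queue by pattern.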
Where to go from Here?
Alistair Cockburn likes to tell this story, ...
A tourist in the highlands was lost. He stops a passing farmer and asks for directions to Glasgow. The farmer says, “Och, if I were goin’ ta Glasgow, I wouldna’ start from here!”
We need to grow a way to think more coherently about processes, a way which is compatible with our experience of the real world. Only then can we make concurrent programs comprehensible. Trying to warp your mind like this while continuing to program in Java is probably not a compatible mix.
Collecting a comprehensive list of the various researchers and practitioners that have influenced my ideas here is close to impossible, but here are some of them...
A heavy influence is naturally my schooling in object-oriented thinking: my thesis supervisor Ole Lehrmann-Madsen, and all the good writings on what it means to think in an object-oriented style [OOPSLA 88-96'ish]. Chapter 18 of the BETA programming language book provides one such overview.
During my Ph.D. I visited Akinori Yonezawa at Todai for a semester. His teams did lots of interesting work on ABCL/x and friends. I've also studied the available material on several kinds of concurrent Smalltalks; most interestingly OTI's Actra Smalltalk.
My recent encounter with Erlang has led me to think of processes in a new way: that lightweight processes are really possible. Joe Armstrong's presentation at the JAOO Conference 2007 opened my eyes to how constrained we are, with his quote "Imagine an object-oriented language that would only allow 500 objects; that is how constrained we are with processes today". You can also find his JAOO presentation at InfoQ.com.