
Dramatis is at a pretty early stage. Consider it alpha in terms of the development process.
The first alpha release, 0.1.1, is now available. The Ruby gem can be pulled from RubyForge and the Python distutils package from PyPI. Of course, the Git repository is always available.
IRC: #dramatis on irc.freenode.net
Mail: dramatis at Google Groups
Currently, I don't have a set of HTML docs for Python. There are doc strings in the code which can be accessed via pydoc and the results are functional if not pretty. I looked at pydoc, happydoc, epydoc, and doxygen and wasn't sure what the best practices are. (The mainstream docs are written separate from the code?) Anyway, suggestions/recommendations wanted!
actor.become and actor.yield/actor.actor_yield
Currently, the implementation is Ruby and Python. Other sufficiently similar languages are possible, for some definition of "sufficiently similar".
I'm not entirely sure. Here's what I would guess:
I don't think it needs open types, but since both Ruby and Python have them, I'm not sure. I'm not sure how much of a metaobject protocol it needs. It probably doesn't need everything it uses right now.
On Linux, 1.8.6 (p114), 1.9 (a recent trunk), and JRuby (a recent trunk). On OS X, I think it's the same. I haven't looked at Windows at this point.
I did one very simple test ( source:examples/pingpong/actor.rb ) on a recent trunk of rubinius and it worked.
I've only tried it on Python 2.5 on Linux.
Yes.
No.
Whatever the implementation provides. Dramatis doesn't care, but some programs will behave somewhat differently on different platforms.
To the latter, nope.
To the former: two things. First, it provides the actor model, which makes writing event- and data-driven code easier since you don't have to write your own scheduler.
Second: even green thread implementations handle overlapping I/O. See the futures example in the actor FAQ. This was one of my first use cases.
Yup. Two common examples of this are, I believe, GUI toolkits and DNS lookup. That doesn't make them useless, though. See the FOX IM example.
I'm not an expert. Ruby 1.9 and Python have GILs, the giant (or is it global?) interpreter lock, and I think that limits the usefulness of multicore. Ruby and Python code isn't going to run concurrently, though extension code might.
As I understand it, Ruby code can run concurrently in multiple threads in JRuby, so actors should run concurrently. This has been demonstrated to a limited extent using the fib example. In fact, the JRuby folks are using the example to look into some possible issues in their thread implementation.
No, for two reasons:
First the actor model is just nice and the I/O overlapping is nice. Even on a single core. My first use case didn't call for concurrent threading.
Second, on the road-map is the ability to run multiple processes with socket-based message passing. Actors in different processes can run concurrently.
Grain size, i.e., the ratio of computation to communication, is always an issue in concurrency if processor utilization is a driving factor. Dramatis is no exception and I don't know what its ratio will be at this point. But I have enough reasonably coarse-grained needs that I'm not worried about it.
My needs are more about overlapping I/O and large loosely coupled, resilient systems and for those, Dramatis with sockets should be fine.
If anyone wants to rewrite this FAQ to not be in the first person, go for it.
Okay. It's a little much to call this a FAQ. (I mean, for one thing, no one asked me any of these questions once, let alone frequently.) It's some cross between a FAQ and a tutorial, with a bit of backstory and editorial thrown in. I fall on the mercy of "progress, not perfection". Needless to say, if you find any inaccuracies, feel free to let me know ... or better yet, fix 'em. It's a wiki ...
The terse answer:
Actors are objects with concurrent semantics. They can only execute one method at a time and they are uninterruptible.
The long answer:
Most programming languages don't have a first-class representation for concurrency or threads. The programming model they provide is a single thread of control that executes statements one at a time. Concurrency is often added to these languages through libraries that create new threads of control and provide tools for managing synchronization. In particular, there is either no or limited interaction between the object-oriented aspects of the language and the concurrency provided by the language libraries.
In actors, concurrency is intimately tied to the object nature of the language. Rather than creating independent threads that run through multiple objects, each actor has, abstractly, its own thread. Since a thread can only do one thing at a time, the implication is that the actor can only do one thing at a time. Multiple actors can be running at the same time, but each actor only has one thread of control running through it.
Actors also can't have their thread of control yanked away from them to do something else. For example, if an actor is executing some bit of code, something else can't interrupt it and cause it to execute another bit of code. The actor gets to run its method to completion.
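To make "one method at a time, run to completion" a bit more concrete, here is a rough sketch in Python 3 using only the standard library. It is not how dramatis is implemented (dramatis does not create a thread per actor), and MiniActor, Counter, hammer, and demo are names made up for this illustration. The point is only that calls coming from many threads are queued and executed one by one.

```python
import queue
import threading


class MiniActor:
    """Hypothetical stand-in for an actor: one mailbox, one worker thread."""

    def __init__(self, behavior):
        self._behavior = behavior
        self._mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Take one queued call at a time and run it to completion
        # before looking at the next one.
        while True:
            method, args, reply = self._mailbox.get()
            try:
                reply.put((True, getattr(self._behavior, method)(*args)))
            except Exception as exc:
                reply.put((False, exc))

    def call(self, method, *args):
        # Queue the call, then block until the worker has executed it.
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((method, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value  # re-raise the callee's exception in the caller
        return value


class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        # A read-modify-write that would race if two threads ran it at once.
        current = self.value
        self.value = current + 1
        return self.value


def hammer(actor, times):
    for _ in range(times):
        actor.call("increment")


def demo():
    counter = Counter()
    actor = MiniActor(counter)
    threads = [threading.Thread(target=hammer, args=(actor, 200)) for _ in range(4)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Because only the worker thread ever touches the Counter, demo() counts every increment even though four threads are calling at once.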
No. Only objects that are identified as actors are actors. All other objects have their native semantics. So standard library objects behave like they always have. If multiple actors access the same object, chaos can reign: Dramatis actors help you manage concurrency, and hopefully they do it in a pretty way. But they don't solve all the world's (or all your) problems.
Note: this discussion tries to avoid bringing up behaviors, which makes it slightly inaccurate, but if you don't care about behaviors, an advanced feature mentioned below, you won't care about the slight inaccuracies.
Yes. Dramatis actors are normal objects and will act like normal objects when accessed through normal references. Dramatis programs access an actor object's "actor-ness" by using actor names. These are reference-like proxy objects that cause what look like normal method calls to have concurrent semantics; they are the primary entree to the Dramatis runtime.
You mix in the dramatis Actor class. In Ruby, this looks like
    class MyClass
      include Dramatis::Actor
      ...
    end
In Python:
    class MyClass ( dramatis.Actor ):
        ...
MyClass.new in Ruby and MyClass() in Python.
Something like this works, too (I think), but (currently) has holes. In Ruby:
    my_concurrent_hash = Dramatis::Actor.new Hash.new
In Python:
    my_concurrent_hash = dramatis.Actor( dict() )
It works (if it does?) if the existing class isn't thread safe or, eventually, if the object lives in a different process.
If the object accepts or passes back mutable data, things can go funny. The library takes care of converting between actor names and references for the object itself but won't, for example, make a copy of the actor an actor (it'll be a plain object.) Of course, you're not supposed to copy actors in the actor model, so ...
If you have an actor name, let's call it my_actor, you call the method like you would normally, e.g.,
    my_actor.my_function( arg1, arg2, arg3 )
Nothing, except define the method in the class of the object as you normally would. There is no difference in the way that the method is defined or what it does internally.
self) can I call methods normally?
Sure. These calls have the native language semantics; that is, they aren't scheduled by the dramatis runtime and run within the current thread of control. If you have shared state, all the normal shared-state bugaboos apply. So if the object will be accessed by multiple threads, for example with dramatis, by multiple actors, the object should itself be an actor and those other actors should access it using its actor name.
In most (but not all) cases, objects that are not shared do not need to be actors.
The approach represented by dramatis is to allow you to use all your normal serial constructs and then give you actor tools to manage concurrency.
Sure. The most common call form, the one shown above, returns the value the normal method call would.
Yes. Different kinds of calls (see below) process exceptions in different ways, but the normal case raises the exception just as it would if the object were called via a normal reference.
The dramatis way of "sending a message to an actor" is to make a method call via an actor name.
Yup. I said that.
Yup.
Nope. (You thought I was gonna say yup. Admit it.)
There are a number of differences. First and foremost, if the callee actor is in the midst of executing another method call, the current call is queued and the caller blocks. (With a normal reference, the call will simply go ahead, with multiple threads running in the object.) When the callee actor finishes its current method, the next method call in its queue is scheduled. When our call finally makes it to the front of the queue, it gets executed and the return value is computed and returned to us. When we get our value, we are unblocked and we can continue.
Yup.
In many cases, the difficulty of concurrency comes from managing, that is, limiting, concurrency. Blocking calls are a nice simple way to manage concurrency when that's appropriate (more on that later).
Blocking is in the eye of the beholder. In actuality, the actor isn't really blocked. If you were to look under the covers, you'd see that the actor was sending your method to the callee and asking the receiver to send a response message back when it had a result value. The response message causes us, the calling actor, to pick up where we left off. (If you know what continuations and continuation passing style (CPS) are, this should sound familiar. If that makes you think about Ruby continuations, don't go there; they're conceptually related, but dramatis doesn't use them.)
And as an aside, let's just say that anyone that tells you that actor programs don't have deadlocks or races is playing a bit fast and loose with their explanations. Data and control dependence issues don't manifest the same way and, perhaps, with the same frequency as they can in thread-model programs, but they can arise in any concurrent program. However, they should be less common and easier to manage with actors.
Another thing that is hopefully true is that we can build better, higher level, tools for development and debugging in actor systems (given that it's a higher level model).
Yes, they can. However, dramatis has gating logic that will, by default, only allow through the unique response continuation it is looking for. All other call requests are queued and will only be delivered once the result value is returned and the blocked method resumed and completed. This is one use of what is often called selective receive.
Dramatis has a gating API that integrates with the continuation gating. Currently, methods can be identified as always allowed or never allowed, overriding the continuation gate. This can be used, for example, to always allow status requests to be responded to.
always gated method, it will get executed? Isn't that "interrupting" my method?
Yes to the first part and, I would say, no to the second. You're making the call to the other actor, so, to some extent, you're voluntarily relinquishing control. Moreover, default actor gating will block all other messages until your method is rescheduled and completed. When you explicitly identify always methods, you're making yourself responsible for the ramifications.
To reiterate: with blocking calls and always gates:
Where a normal blocking call might look like an_actor.method_a( arg1 ), you make a non-blocking call with, in Ruby,
    Dramatis.release( an_actor ).method_a( arg1 )
and in Python,
    dramatis.release( an_actor ).method_a( arg1 )
include Dramatis, you don't need to put the Dramatis. on every call.)
Nope. The method might start immediately, if the actor isn't busy and the scheduler decides to do that, but the calling actor doesn't wait around to find out. Importantly, there's no "blocking" here so the calling actor cannot be interrupted: no other methods can get scheduled on it until it either finishes this method or makes a blocking call.
Well, everything in dynamic languages returns a value. Casts return nil/None immediately.
release do, exactly?
release takes a name and returns a new name that has non-blocking semantics; you may also see this referred to as a null continuation. You can use that name as you would any other actor name or object reference, but it will always have non-blocking semantics.
Sure:
    release_actor = Dramatis.release( an_actor )
    release_actor.method_a( "once" )
    release_actor.method_a( "twice" )
To the big return value bit bucket in the sky. I.e., nowhere.
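If it helps, the fire-and-forget idea can be sketched without dramatis at all. The following Python 3 snippet uses only the standard library; Mailbox, cast, drain, and Logger are hypothetical names for this illustration and are not part of the dramatis API. The caller enqueues the call and gets None back right away, and whatever the method returns is simply dropped.

```python
import queue
import threading


class Logger:
    def __init__(self):
        self.lines = []

    def log(self, text):
        self.lines.append(text)
        return len(self.lines)  # nobody will ever see this value


class Mailbox:
    """Hypothetical one-thread mailbox; not the dramatis implementation."""

    def __init__(self, target):
        self._target = target
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            method, args = self._queue.get()
            try:
                # The return value has nowhere to go, so it is dropped.
                # Exception handling is deliberately left out of this sketch.
                getattr(self._target, method)(*args)
            finally:
                self._queue.task_done()

    def cast(self, method, *args):
        # Enqueue and return immediately; the caller never waits.
        self._queue.put((method, args))
        return None

    def drain(self):
        # Helper for the demo only: wait until every queued call has run.
        self._queue.join()


def demo():
    logger = Logger()
    box = Mailbox(logger)
    results = [box.cast("log", "once"), box.cast("log", "twice")]
    box.drain()
    return results, logger.lines
```

The calls still run in the order they were queued; the caller just doesn't stick around for them.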
I'm glad you asked that. The caller hasn't waited around, so there's no way to raise the exception to it via the call. One might be tempted to say, hey, if you don't care about the return value, you don't care about the exception either, but this is a Bad Idea. Swallowing exceptions is a Very Very Bad Idea. Don't ever work on an actor system that does this. (I've written a few.)
That we all know. Nobody disagrees about that. (Well, probably somebody does, but let's pretend they don't.) Unfortunately, the next step gets a little fuzzy.
Imagine this: you're the callee actor and you're in the middle of some middle layer of code somewhere when some exception gets raised. Right off, it would seem like you could partition most exceptions into one of two sets: (1) those that are the result of something you were asked to do and (2) those that arise because something internal to you is in an unanticipated state. An example of the first case might arise if you were given two numbers to divide and the denominator was zero. If the caller gave you the zero, it's pretty much the caller's fault: it should be informed and the callee actor can go along its merry way.
An example of the second case might be an actor that provides some kind of service to clients. Let's say it keeps a few references to other objects but at some point, because of bugs in the code and/or the phase of the moon, one of those references is null. As part of the method call, it tries to call a method and raises an exception. In this case, it isn't the caller's fault at all. It sent a perfectly valid request. The fault lies with the callee, so at the very least, the callee should probably react to the exception. (The caller should perhaps get an exception, too, though if the callee could somehow restart, maybe it could fulfill the request after the restart.)
So handling exceptions is gonna take some thought. Our Erlang friends (well, we like them; they probably don't know who we are) have some methods for this and we're frantically copying/adapting their methods for dramatis.
In the case of null continuations, dramatis will currently attempt to call a method dramatis_exception on the caller actor at some point in the future. This call is queued for the caller like all methods but in this case, it (currently) can't be gated.
If the caller doesn't have this function, it's passed to the runtime which records it.
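The fallback described here can be sketched in a few lines of Python 3. Only the method name dramatis_exception comes from the text above; deliver_exception, runtime_log, Careful, and Careless are made up for illustration, and the real runtime queues the call rather than making it directly as this sketch does.

```python
def deliver_exception(caller, exc, runtime_log):
    """Hand exc to the caller if it has a handler, else record it."""
    handler = getattr(caller, "dramatis_exception", None)
    if callable(handler):
        # The caller opted in: let it decide what to do.
        handler(exc)
    else:
        # No handler: the (hypothetical) runtime keeps a record instead
        # of silently swallowing the exception.
        runtime_log.append(exc)


class Careful:
    def __init__(self):
        self.seen = []

    def dramatis_exception(self, exc):
        self.seen.append(exc)


class Careless:
    pass


def demo():
    runtime_log = []
    careful = Careful()
    error = ZeroDivisionError("denominator was zero")
    deliver_exception(careful, error, runtime_log)
    deliver_exception(Careless(), error, runtime_log)
    return careful.seen, runtime_log
```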
This is all definitely incomplete but does provide the basis for current debugging.
This is a pretty interesting area for more work in dramatis ... really looking for use cases and experience here. Proper exception handling, both during development and production, will be crucial for dramatis.
Dramatis does not have an explicit receive. Many of our dynamic languages, including Ruby and Python, have "send" without receive: why should actors make things different? They don't (in dramatis). The language runtime already has a mechanism for method dispatch. Dramatis integrates with this.
Mostly the functions they provide aren't necessary with dramatis. For those cases where they are necessary, or just helpful, dramatis provides gates and behaviors.
An actor will not have more than one method scheduled by the dramatis runtime at a time. However, if a method is called via a normal reference, that call will execute in the normal fashion, using the caller's thread of control, wherever that may have come from. So, to manage concurrency, the developer manages the type of reference that is used by client code: if you only give out the actor name, then all calls to that actor will always use actor semantics.
What, I'm gonna say, no? Okay, yes, these topics are a bit more advanced (actually, I think that's just a nice way of saying my explanation of them is weak.) Two advanced features are gates and behaviors.
At the highest level, dramatis gates are similar to the pattern matching and guards in a language like Erlang. They affect when particular methods become schedulable. In Erlang, the list of all possibly receivable patterns is given in the receive statement. If a received message does not match this list, it is deferred. In Erlang, this receive block also specifies the code that processes messages when they are accepted.
In dramatis, the gating function (what can be done) and the response behavior (what to do) are separate. The response behavior to a message is always whatever the callee does when it executes the given method. Gates, however, affect whether a method will be executed or not.
Conceptually, Dramatis needs to provide the ability to
 * refuse methods
 * accept methods
 * always methods
The refuse and accept methods are fairly self-explanatory: a refused method won't be scheduled until it is accepted. An accepted method will be allowed. All methods are accepted by default when the actor is created.
The always method is required to support advanced features like blocking continuations.
By default, when a blocking call is made, only that unique continuation is allowed (this is a bit of a lie; but it's mostly true). This really is necessary: otherwise managing concurrency gets pretty hard, pretty quickly. So, when making a blocking call, the gate state is kind of "pushed" and only that continuation allowed. When the result continuation is received from the callee, the gate state is "popped" and the method resumes, more or less the way it was.
This is all great, except that another common use case is status methods which are designed to be always safe. We don't want the blocking call to block these. So actors can identify always methods which don't interact with continuation gating (and, in fact, override all refuse specifications).
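Here is one way the refuse/accept/always rules and the "pushed" continuation gate could be modeled, as a Python 3 sketch. The method names refuse, accept, and always mirror the vocabulary above, but the Gate class, its signatures, push_blocking, pop_blocking, and next_deliverable are assumptions for illustration and not the dramatis gating API. Messages are modeled as (kind, name) tuples.

```python
class Gate:
    """Hypothetical gate bookkeeping; not the dramatis API."""

    def __init__(self):
        self._refused = set()      # everything is accepted by default
        self._always = set()
        self._waiting_for = None   # continuation id while a blocking call is out

    def refuse(self, method):
        self._refused.add(method)

    def accept(self, method):
        self._refused.discard(method)

    def always(self, method):
        self._always.add(method)

    def push_blocking(self, continuation_id):
        # A blocking call was made: only its result may come through.
        self._waiting_for = continuation_id

    def pop_blocking(self):
        self._waiting_for = None

    def allows(self, message):
        kind, name = message
        if kind == "call" and name in self._always:
            return True  # overrides refuse and continuation gating
        if self._waiting_for is not None:
            return kind == "continuation" and name == self._waiting_for
        return kind == "call" and name not in self._refused


def next_deliverable(gate, pending):
    """Remove and return the first queued message the gate allows, else None."""
    for index, message in enumerate(pending):
        if gate.allows(message):
            return pending.pop(index)
    return None
```

With a status method marked always and a blocking call outstanding, a queued bid stays put while the status request and the awaited continuation get through.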
One observation here is that while the concept of delaying methods is similar to that in Erlang, the way it manifests is very different. This is primarily a result of the functional nature of Erlang and the imperative nature of our non-functional dynamic languages. The functional nature of Erlang means that the entire matching structure must be specified in one declaration in the Erlang receive statement. The gate features in dramatis are not functional: they are mutable and change over time. There are some use cases (like continuations) where this proves very convenient, but it's a new concept and will certainly evolve with experience.
A few notes:
This is an area of active work. The existing code and API supports the existing use cases but quite a lot of fleshing out can be done. Use cases and ideas very welcome.
This is advanced stuff. You don't need to understand this at first. In fact, you may not need to know it at all. It'll be interesting to see how much it's used.
In reality, in dramatis, actors and their behaviors are separate concepts. While we normally think of "an actor" as being an instance of a class with an Actor class mixin, this isn't exactly what happens in dramatis.
In dramatis, an actor is, by definition, an object that has an actor name and queue. An actor doesn't have any intrinsic behavior: it gets that, from, well, the behavior. Your class is the behavior part of the actor. The behavior is the part that actually responds to method calls. It's your instance: its data and code and whatnot.
So in dramatis, an actor is something that has a name, has a queue for holding method calls, has a gate, and has a behavior. Your code is the last part.
Well, right now you don't. (But this could change pretty soon.) When it is allowed, you can become a new instance. See the auction example below.
For a few reasons. One is related to user functionality (see the next question). But another reason is to allow objects to have both actor and normal object semantics. This gives a lot more flexibility in managing concurrency.
Some seriously cool, maybe seriously useful, and perhaps seriously twisted capabilities.
Gul Agha's book talks a lot about behaviors, which are fairly close to dramatis behaviors.
In functional languages, behaviors are the receive statements: they're the code to call and all the state you're going to pass. So in Erlang, you can think of each receive statement as defining a behavior.
We tend to not think of behaviors, or behavior-like objects, in imperative languages because we have mutable state. Every time we change a member variable, we're kind of "becoming" a new behavior, at least from the functional language point of view.
So, mostly we can ignore behaviors in imperative languages and dramatis.
But some use cases are cropping up that seem kind of interesting. The auction example (which originally came from Scala) provides an interesting use case. If you look at the original Scala code, the auction server has two receive loops: one that it uses when the auction is ongoing and another that is used when the auction has closed. When the auction first starts, the first loop receives and responds to all bid requests. Once the auction has closed, its behavior has to change to no longer allow new bids. In fact, most messages should just get an "auction over" response.
There are many ways one could implement this in an imperative language. One could keep a member state variable and have every method check that state variable, responding accordingly. People do this all the time. You're effectively implementing a little ad hoc state machine.
But a nice, clean way to do this is to become an EndedAuction. That object has a different set of methods that respond differently.
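As a plain, serial Python 3 sketch of the idea (no dramatis involved): the holder object forwards calls to whatever behavior is current, and closing the auction swaps the behavior. EndedAuction is the name used above; OpenAuction, Shell, and the method signatures are assumptions for illustration and are not taken from the auction example's actual code.

```python
class OpenAuction:
    def __init__(self):
        self.high_bid = 0

    def bid(self, amount):
        if amount > self.high_bid:
            self.high_bid = amount
            return "accepted"
        return "too low"

    def close(self, shell):
        # Instead of setting a flag, replace ourselves with a new behavior.
        shell.become(EndedAuction(self.high_bid))
        return "closed"


class EndedAuction:
    def __init__(self, final_bid):
        self.final_bid = final_bid

    def bid(self, amount):
        return "auction over"

    def close(self, shell):
        return "auction over"


class Shell:
    """Hypothetical holder that forwards calls to the current behavior."""

    def __init__(self, behavior):
        self._behavior = behavior

    def become(self, behavior):
        self._behavior = behavior

    def bid(self, amount):
        return self._behavior.bid(amount)

    def close(self):
        return self._behavior.close(self)
```

Neither class has to check an "is the auction over?" flag in every method; the swap does that job.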
There's really nothing inherently parallel about this ... at least as far as you've heard so far.
Two answers:
First: the feeling is that because actor programs tend to be event/data driven applications, the need for this kind of state evolution is more prevalent than in other contexts. To the extent that this is true, making it easy is worth the effort.
Second: there is a concurrent issue here that goes beyond the serial case. It revolves around the become method.
When you become a new behavior, that behavior, by definition, isn't executing anything, so the actor it represents is immediately free to schedule the next task in its queue. Your current method, any code after the become, can still execute, but it has no effect on the actor in the future. This gives you concurrency that might be useful. It's sort of like returning a value in the middle of a function and then finishing some other tasks, presumably for their side effects (which should also be possible in dramatis but isn't currently implemented).
Gul talks about this a lot. It's cool but I don't have a compelling use case for it ... yet.
Yeah, it does, but they're pretty new and while I really like the start, there are some things that need to be figured out.
A future is kind of like an in-between of an RPC and a null continuation: it gets a result, like the RPC, but it doesn't wait for it, as a release call doesn't.
You create a future name, much like you do a cast:
    future_name = dramatis.future( an_actor )
Then you call it (you could, of course, do this all in one step):
    my_value = future_name.some_method( some_arg )
Then, sometime later, you use the value:
    my_string = "I got %s" % my_value
The difference is that the object you got back, my_value, is not the actual result, but a proxy for the result. It is returned immediately and subsequent code executed. There is no blocking at this point, as there would be in the RPC case. Only when you try to do something that depends on the value in some way might a block occur.
When the value is required, the future internally examines itself. Hopefully, over on the callee, the method got executed and the result calculated and sent back to us. In this case, the value is just returned where we need it (this is a bit of a fib ... but this is experimental, so ...)
If the callee hasn't sent us the value, trying to access the value of the future blocks, as if it were an RPC continuation.
A very simple but pretty useful example (my first use case) was needing to fetch a bunch of web pages. Say I have an actor that represents a web site and I want to get a few pages:
    root = web_site.fetch( "/" )
    index = web_site.fetch( "/index.html" )
    robots = web_site.fetch( "/robots.txt" )
    result = root + " " + index + " " + robots
If web_site is a normal actor name, these are RPCs, the implication being that I won't even try to fetch the subsequent pages until the earlier ones have been completely returned. Since this is all network I/O, this takes a while and I'm mostly sitting around doing nothing useful. I'm generating latency for no good reason.
I could use release names ... but then how would I get the result? There are answers to this. It's certainly possible to do this without futures, but the futures code is just so very pretty:
    future_web_site = Dramatis.future( web_site )
    root = future_web_site.fetch( "/" )
    index = future_web_site.fetch( "/index.html" )
    robots = future_web_site.fetch( "/robots.txt" )
    result = root + " " + index + " " + robots
In this case, all the fetches are started. When the final string concatenation tries to convert the futures to strings, each future checks whether its value has been returned and waits if it hasn't.
If you're a parallel-type person, that is a simple fork-join without fork or join.
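For comparison, the same start-everything-then-wait pattern can be written with the explicit futures in Python 3's standard library. This is a different mechanism from dramatis futures (which are transparent proxies rather than objects you call result() on), and fetch here is a stand-in that just sleeps instead of touching the network; fetch and fetch_all are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(path):
    # Stand-in for slow network I/O; no real request is made.
    time.sleep(0.5)
    return "<page %s>" % path


def fetch_all(paths):
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        # Start every fetch before asking for any result.
        pending = [pool.submit(fetch, path) for path in paths]
        # result() blocks only if that particular value hasn't arrived yet.
        return " ".join(future.result() for future in pending)
```

Three fetches issued this way should take roughly as long as one, since the waits overlap instead of adding up.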
Yup. I don't know exactly how well this works. It's unclear when the proxy gets evaluated and how good a proxy it can be.
Yes. Right now, the futures have to be evaluated within the same actor that made the original call. This almost certainly isn't a good idea.
I don't know. See "experimental" above.
I don't think so. I think they probably can't be naively used willy-nilly, but I think used with care, they can be very useful. (The only reason I actually make a point of this is that futures are so pretty, it's very tempting to use them willy-nilly.)
Yes, the function/block/proc call. The idea is that you specify a block of code to receive the value returned from an actor call. Since the way blocks and functions are represented is pretty different between Ruby and Python, the implementation is fairly different.
In Ruby, you can ask for a block of code to be called as the result of an actor method call:
    ( interface( actor_name ).continue { |result| do_something( result ) } ).a_method
In Python:
    def a_block( result ):
        do_something( result )
    ( Dramatis.interface( actor_name ).continuation( { result: a_block } ) ).a_method()
Yes. Well, maybe.
No. The block is effectively an unnamed method of the actor and goes through execution scheduling like all other actor methods.
This example creates two actors that send messages back and forth between each other, a bit like a ping pong ball. It's adapted from the Scala Example. The code for the final versions is in source:examples/pingpong. There are three versions there: the serial (source:examples/pingpong/serial.py) and the actor (source:examples/pingpong/actor.py) versions which we develop here and, for reference, a version closer to the original Scala example (source:examples/pingpong/scala.py).
For our example, we'll have two objects, ping and pong that pass a token back and forth a fixed number of times. We'll make the token, our "ball", be the number of volleys left to perform.
We'll start with a simple non-concurrent version. We'll need a class with a pingpong method. This method will take a count representing the number of volleys left to play and a reference to the partner that it is volleying with. With a little extra code so that we can see what is going on, it looks like this:
    def pingpong(self,count,partner):
        if count == 0:
            print "%s: done" % self._name
        else:
            if count % 500 == 0 or count % 500 == 1:
                print "%s: pingpong %d" % ( self._name, count )
            partner.pingpong( count-1, self )
That's really all there is to it. All we need is the class wrapper and a few lines to actually create a couple of objects and start the ball rolling (or volleying, as the case may be):
    import sys  # needed for sys.argv below

    class PingPong ( object ):

        def __init__(self,name):
            self._name = name

        def pingpong(self,count,partner):
            if count == 0:
                print "%s: done" % self._name
            else:
                if count % 500 == 0 or count % 500 == 1:
                    print "%s: pingpong %d" % ( self._name, count )
                partner.pingpong( count-1, self )
                # sleep 0.001

    ping = PingPong( "ping" )
    pong = PingPong( "pong" )

    # The number of volleys comes from the command line.
    ping.pingpong( int(sys.argv[1]), pong )
To see what happens, I can run nine hundred volleys on my machine:
$ python serial.py 900
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ python serial.py 10000 2>&1 |head
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
Traceback (most recent call last):
File "serial.py", line 24, in <module>
ping.pingpong( int(sys.argv[1]), pong )
File "serial.py", line 18, in pingpong
partner.pingpong( count-1, self )
File "serial.py", line 18, in pingpong
partner.pingpong( count-1, self )
File "serial.py", line 18, in pingpong
partner.pingpong( count-1, self )
RuntimeError: maximum recursion depth exceeded
$
pingpong method doesn't return a useful value, we still have to call it recursively: that's the only way we have of calling methods. Most serial languages have no native way of expressing this volleying kind of communication between objects. It is possible to do this with continuations but most languages/implementations, including ones we care about a lot like python and jruby, do not provide them. It's certainly possible to write a message passing layer, but it's a fair amount of code.
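To give a feel for what a bare-bones message passing layer might look like, here is a single-threaded Python 3 sketch. It is not the dramatis version developed below; volley, send, and the extra send parameter on pingpong are assumptions for illustration. Each volley is posted to a queue and handled from one flat loop, so the call stack never gets deeper no matter how many volleys are played.

```python
from collections import deque


class PingPong:
    def __init__(self, name, log):
        self._name = name
        self._log = log

    def pingpong(self, count, partner, send):
        if count == 0:
            self._log.append("%s: done" % self._name)
        else:
            # Instead of calling the partner directly, post a message to it.
            send(partner, count - 1, self)


def volley(count):
    log = []
    mailbox = deque()

    def send(target, remaining, sender):
        mailbox.append((target, remaining, sender))

    ping = PingPong("ping", log)
    pong = PingPong("pong", log)
    send(ping, count, pong)
    # Every message is handled from this loop, so there is no recursion.
    while mailbox:
        target, remaining, partner = mailbox.popleft()
        target.pingpong(remaining, partner, send)
    return log
```

Even this toy version needs a queue, a send function, and a dispatch loop, which is the "fair amount of code" an actor library takes off your hands.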
Let's put that aside for now and look at making pingpong into an actor program using dramatis. To make a normal class into an actor, first we need to mix in, that is, add a base class, from dramatis:
    class PingPong ( dramatis.Actor ):
        ...
and import the module:
    import dramatis
Before we rush ahead, we need to consider what we've done, what actors are.
Actors are concurrent objects: they are part object, part thread, a kind of chimera. They look in many ways like normal objects: they have state (data members), methods (member functions), and they can have methods called on them.
They also are, abstractly, threads. Rather than have whatever thread is running when a method call is made on an actor execute that method recursively and immediately, each actor has its own thread and only that thread is allowed to execute methods for that actor. This implies that an actor can only be executing one method at a time, so there can be no races or conflicts among methods of a single actor. Note that this thread is abstract and that different actor implementations implement it in different ways (dramatis, for example, does not create a thread per actor).
In dramatis, when an actor makes a call on another actor (often phrased in actor parlance as sending a message in the same way that Smalltalk sends messages between objects), rather than executing the method itself, on its thread, it creates a task, a combination of a reference to the called actor (which we call the actor name), the method to be called, any arguments to the method, and a continuation. The continuation is a representation of where the results of the method call should be sent. In the case of a normal call, like we're familiar with in non-concurrent programs, the continuation indicates a message should be sent back to the caller such that the result is delivered as the result of the method call. We call this style of method call a remote procedure call, or rpc, where remote means on another actor.
One other aspect of actors, of their threads and methods, is that once begun, a method cannot be interrupted. If another task is scheduled on an executing actor, it cannot be executed until the current method runs to completion.
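To make the idea of a task concrete, here is a minimal sketch in Python 3. It is not dramatis code: MiniActor, Counter, and send are hypothetical names, and unlike dramatis the sketch really does use one thread per actor, purely for simplicity. Each task bundles a method name, its arguments, and a continuation, and the worker runs each method to completion before taking the next task.

```python
import queue
import threading


class MiniActor:
    """Hypothetical stand-in for an actor: one mailbox, one worker thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def send(self, method_name, args=(), continuation=None):
        # A task: the method to call, its arguments, and where the result goes.
        self._mailbox.put((method_name, args, continuation))

    def _run(self):
        while True:
            method_name, args, continuation = self._mailbox.get()
            # The method runs to completion before the next task is taken.
            result = getattr(self, method_name)(*args)
            if continuation is not None:
                continuation(result)


class Counter(MiniActor):
    def __init__(self):
        self.total = 0
        super().__init__()

    def add(self, n):
        self.total += n
        return self.total


counter = Counter()
results = queue.Queue()
for n in (1, 2, 3):
    # The continuation here simply drops the result into a queue.
    counter.send("add", (n,), continuation=results.put)

seen = [results.get(timeout=5) for _ in range(3)]
print(seen)
```

Because only the worker thread ever touches total, the three add calls cannot race with each other, which is the property described above.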
However, continuation passing provides a nice syntactic shortcut that does look a little bit like a method not running to completion. When an rpc call, that is, a normal method call, is made on another actor, the calling thread is, in a way, waiting for the called thread. Let's look at an example. If actor_1 executes the code
def a_method(self):
    ...
    x = actor_2.another_method()
    print x
    ...
then actor_1 needs the value of x before it can execute the print. We said before that calling a method on another actor runs on that other actor's thread, and that holds in this case: actor_1's thread cannot run another_method; the method must run on actor_2's thread. It may not be possible to run the method immediately: actor_2 may be in the middle of another task and may have other tasks queued to execute.
The semantics of the rpc protocol translate fairly easily into two tasks. actor_1 creates a task for the another_method call and includes, as the continuation of that task, information that the runtime can use to get the returned value to the right place in the call stack. When actor_2 completes another_method, it calls that continuation, which results in a new task, targeted on actor_1. When the runtime executes that task, the call on actor_1 completes and the return value from another_method is assigned to x.
In effect, we've taken our a_method and broken it in two, the part before the call to another_method and the part after. When actor_1 calls another_method, it effectively finishes the first half. The continuation it sends to actor_2 effectively says, "run the second half". Since actor_1 has finished the first half of the method, it is finished with the current task. It can therefore execute another task, including the task that will be created by actor_2 when it finishes another_method.
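The splitting of a_method can be written out by hand in Python 3. This is only an illustration of the idea, not how dramatis does it internally: run_queue and the two half functions are made-up names, and the value 42 stands in for whatever another_method would return.

```python
from collections import deque

run_queue = deque()   # tasks waiting to run, as (function, args) pairs
log = []


def another_method(continuation):
    # Runs as actor_2's task. Instead of returning up the stack, it hands
    # its result to the continuation by scheduling a new task.
    run_queue.append((continuation, (42,)))


def a_method_first_half():
    log.append("first half")
    # "x = actor_2.another_method()" becomes: schedule the call and pass
    # the rest of a_method along as the continuation.
    run_queue.append((another_method, (a_method_second_half,)))


def a_method_second_half(x):
    log.append("second half got %d" % x)


run_queue.append((a_method_first_half, ()))
while run_queue:
    function, args = run_queue.popleft()
    function(*args)

print(log)
```

Note that a_method_first_half returns before another_method has even started, so the stack never grows: each half is an ordinary, short-lived task.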
What if there are other calls that have been made on actor_1? Can they run? Can they run before the result from actor_2 has been returned?
This is an area that different actor systems vary on, the ability to selectively block tasks. Dramatis does provide this ability. As mentioned, actor methods are uninterruptible: that uninterruptibility is key to controlling concurrency conflicts in actor systems. If all other methods could execute when an rpc call was made on another actor, abstractly, the uninterruptible nature of the method execution is lost. Steps that occur before the call and after no longer appear atomic and without this atomicity, rpcs become much less useful.
Many actor systems, including dramatis, provide selective receives. That is, they allow an actor to indicate that certain calls are acceptable or unacceptable at any point in time. Dramatis uses this gating behavior to implement consistent rpcs.
When an actor makes an rpc call on another actor, dramatis automatically restricts the set of tasks that the caller will accept. In general, it will only allow the task that will return the desired value to execute. Any other tasks that were queued at the time of the call, or that are received before the target actor returns a value, are deferred. In this way, dramatis maintains the atomicity of methods even when rpcs involving multiple messages are used.
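The deferral described above can be sketched with a small mailbox in Python 3. GatedMailbox and its methods are hypothetical names invented for this illustration and say nothing about how dramatis implements gating; the tags are just strings identifying what kind of task arrived.

```python
from collections import deque


class GatedMailbox:
    """Hypothetical mailbox that defers everything except one awaited reply."""

    def __init__(self):
        self._queue = deque()
        self._awaiting = None   # tag of the reply being waited for, if any

    def deliver(self, tag, payload):
        self._queue.append((tag, payload))

    def await_reply(self, tag):
        # Called when an rpc is made: from now on only the reply may run.
        self._awaiting = tag

    def next_task(self):
        """Return the next runnable task, or None if all are deferred."""
        for index, (tag, payload) in enumerate(self._queue):
            if self._awaiting is None or tag == self._awaiting:
                del self._queue[index]
                if tag == self._awaiting:
                    self._awaiting = None   # the gate reopens
                return tag, payload
        return None


box = GatedMailbox()
box.deliver("other_call", "a")
box.await_reply("reply-1")     # an rpc is now pending
first = box.next_task()        # "other_call" is deferred, nothing can run
box.deliver("reply-1", 99)
second = box.next_task()       # the awaited reply runs first
third = box.next_task()        # then the deferred call
print(first, second, third)
```

If the awaited reply never arrives, next_task keeps returning None no matter how many other tasks are queued, which is the shape of the deadlock that shows up later in the example.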
Dramatis also provides gating features so that actors can identify other methods that can be executed even while an rpc is pending.
Not all actor systems use implicit continuations as dramatis does. In many of these, the caller of an actor method must explicitly pass its name in the argument list and the target actor must explicitly send the result back. The effect is similar.
dramatis has other continuation types, as will be shown below. dramatis continuations are similar to native language continuations such as those found in Ruby, but have some extensions (they are concurrent) and limitations (often they cannot be called multiple times). dramatis does not use native language continuations.
So, with some background on actors, let's return to our example. When we mixed in dramatis.Actor, what changed in our program? First, let's look at the lines that created our actors:
ping = PingPong( "ping" )
pong = PingPong( "pong" )
Because PingPong is now an actor, calling the constructor no longer returns a reference to the object. Instead, it returns a dramatis.Actor.Name, a proxy for the actor. In most cases this proxy object, which we call the actor name, acts like a native object reference with the addition of actor semantics. So when we call
ping.pingpong( int(sys.argv[1]), pong )
the runtime will, at some point in the future, run the pingpong method on ping, which will result in ping executing
partner.pingpong( count-1, self )
Here partner will be pong, so dramatis will create another task, this time targeted at pong, and, at some point in the future, execute it.
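The role of the actor name as a proxy can be sketched in a few lines of Python 3. NameProxy, Greeter, and pending are hypothetical names for this illustration only; the real dramatis.Actor.Name does considerably more (continuations, gating, and so on).

```python
from collections import deque

pending = deque()   # tasks created by calls made through a proxy


class NameProxy:
    """Hypothetical stand-in for an actor name: calls become queued tasks."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, method_name):
        # Only reached for attributes the proxy itself lacks, i.e. the
        # target's methods. The call is recorded rather than executed.
        def enqueue(*args):
            pending.append((self._target, method_name, args))
        return enqueue


class Greeter:
    def __init__(self):
        self.greeted = []

    def greet(self, who):
        self.greeted.append(who)


greeter = Greeter()
name = NameProxy(greeter)

name.greet("ping")                      # looks like a normal method call
ran_immediately = bool(greeter.greeted) # but nothing has run yet

while pending:                          # the "runtime" runs tasks later
    target, method_name, args = pending.popleft()
    getattr(target, method_name)(*args)

print(ran_immediately, greeter.greeted)
```

The caller's code is unchanged from the serial version; only the object it holds is different, which is why mixing in the actor class is enough to change the program's behavior.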
Another actor issue comes into play at this step. Actor systems are generally pass by value. That is, they send object values or copies, rather than references to objects. Nothing is shared between the caller and the callee. In pure actor systems, there are only values (which include actor names) and actors, so nothing except actor state is mutable and actors are internally serial.
In this sense, dramatis is not a pure actor system. Since it's only a library on top of a non-actor language and virtual machine, this is pretty much guaranteed: making a pure actor system would generally require changing one or both. In addition to immutable values like numbers, dramatis programs have all the mutable objects found in non-concurrent programs. dramatis provides mechanisms for managing concurrency but cannot guarantee that shared objects will not have concurrent conflicts if they are used.
At this time, dramatis does not specify whether actor method call arguments will be copied or not. Thus some care is required when considering objects passed to actor methods.
One philosophy of dramatis is to balance concurrency safety against divergence from serial programming, and at this point it's unclear whether copying method arguments is always a good idea.
Back to our example:
partner.pingpong( count-1, self )
What about self? Generally self in an actor method works as it normally does in a serial program. Only when an actor name is used do actor semantics enter the picture. Thus, an actor class can still call all its internal methods as it normally would without invoking actor semantics.
An exception to this occurs when passing self references to actor objects as arguments to an actor method or as its return value. In these cases, the runtime automatically converts the self reference to an actor name. This is a special case of pass by value, where the normal way to marshal an actor is to convert a reference to the actor into an actor name. This case is handled specially by dramatis because it's a common pattern and simplifies coding.
So, in our example, when ping calls pong with self as a parameter, dramatis substitutes ping's actor name for self. An actor can get its own name by calling self.actor.name.
Finally, pong will execute pingpong and, if the count hasn't reached zero, will call pingpong back on ping.
We can try to run it now but we get an error:
$ python ./actor.py 100
Traceback (most recent call last):
  File "./actor.py", line 34, in <module>
    ping.pingpong( int(sys.argv[1]), pong )
  File "./actor.py", line 28, in pingpong
    partner.pingpong( count-1, self )
  File "./actor.py", line 28, in pingpong
    partner.pingpong( count-1, self )
Deadlock:
$
We send a pingpong to our actor named ping from our main program. ping dutifully sends a pingpong to pong in the next stack frame. This works fine. Now pong tries to volley back to ping and something, perhaps unexpected, happens. dramatis is telling us that a deadlock has occurred while executing this code. (For those who may have noticed, the backtrace returned by dramatis represents the actor calls across threads; a raw, much longer and messier, backtrace is also available.)
The issue here is that we're trying to send a pingpong back from pong to ping, but ping is still waiting to hear back from pong. It isn't busy: it doesn't have any messages to process. But as we mentioned above, it called pong with an rpc call which set itself up only to receive the result from pong, and pong hasn't returned it. Instead pong is trying to make a new pingpong call. This is corecursion, just as we had in our nonconcurrent case, and by default, dramatis does not allow it.
Before we fix this, we'll mention in passing that dramatis can be made to allow recursion and corecursion, what we call call threading, by default. Adding the following call to an actor enables call threading for it:
self.actor.enable_call_threading()
The right way to fix this is to notice that we don't really need to recurse here at all. Our actors don't look at the result of the pingpong method (which is just as well, since the method doesn't return anything useful).
What we need is a way to call a method but not wait around for the results (if you're paying close attention, we're being fast and loose with terminology here: actors don't wait, in most senses.) All actor implementations have this. In Erlang OTP it's called cast.
In dramatis, we make this non-waiting call by writing
dramatis.release( partner ).pingpong( count-1, self )
dramatis.release takes an actor name and returns a new name. This new name acts slightly differently than the original name. It releases, if you will, the task created by the call. That is, it doesn't ask the task to return a value, and the method call returns immediately. Another way of looking at it is that rather than providing the current continuation, it provides a nil continuation.
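To see why a released call avoids both the deadlock and the stack growth, here is a sketch in Python 3 that imitates the idea with a plain task queue. It does not use dramatis: the tasks queue and the driver loop are assumptions standing in for the runtime, and appending to the queue plays the role of a call with a nil continuation.

```python
from collections import deque

tasks = deque()    # stands in for the runtime's scheduler
printed = []       # collected output lines


class PingPong:
    def __init__(self, name):
        self.name = name

    def pingpong(self, count, partner):
        if count == 0:
            printed.append("%s: done" % self.name)
        else:
            if count % 500 in (0, 1):
                printed.append("%s: pingpong %d" % (self.name, count))
            # Fire and forget: queue the partner's call and return at once,
            # without asking for a result.
            tasks.append((partner.pingpong, (count - 1, self)))


ping, pong = PingPong("ping"), PingPong("pong")
tasks.append((ping.pingpong, (10000, pong)))

max_pending = 0
while tasks:
    max_pending = max(max_pending, len(tasks))
    method, args = tasks.popleft()
    method(*args)

print("\n".join(printed))
```

Each pingpong call finishes before the next one starts, so there is never more than one pending task and no call stack to overflow, however many volleys are requested.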
If we make this single change to our program and rerun it, we get:
$ ./actor.py 900
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ ./actor.py 10000
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
pong: pingpong 9001
ping: pingpong 9000
pong: pingpong 8501
ping: pingpong 8500
pong: pingpong 8001
ping: pingpong 8000
pong: pingpong 7501
ping: pingpong 7500
pong: pingpong 7001
ping: pingpong 7000
pong: pingpong 6501
ping: pingpong 6500
pong: pingpong 6001
ping: pingpong 6000
pong: pingpong 5501
ping: pingpong 5500
pong: pingpong 5001
ping: pingpong 5000
pong: pingpong 4501
ping: pingpong 4500
pong: pingpong 4001
ping: pingpong 4000
pong: pingpong 3501
ping: pingpong 3500
pong: pingpong 3001
ping: pingpong 3000
pong: pingpong 2501
ping: pingpong 2500
pong: pingpong 2001
ping: pingpong 2000
pong: pingpong 1501
ping: pingpong 1500
pong: pingpong 1001
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
We're no longer calling pingpong recursively. ping calls pong and then returns, going inactive until it gets a request from pong. Similarly, pong calls ping and then returns. This style of data-flow programming is dead simple in actors and fairly complex in serial languages.
Finally, we could wonder, are we getting any other benefits from using actors here? We have a nice data flow model, but what about concurrency? We know in theory that the actors are running on different threads, but can we demonstrate that in a measurable way?
One useful feature of concurrency in actors is concurrent I/O: for example, fetching a number of web pages concurrently. That's a little complex for our example, but we can simulate it. Let's say that at each volley, our actors want to perform some time-consuming I/O. To simulate this, we'll put a short sleep in our pingpong method, right after we pingpong our partner:
time.sleep( 0.001 )
$ time ./serial.py 900
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
real 0m9.021s
user 0m0.010s
sys 0m0.000s
$ time ./actor.py 900
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
real 0m4.590s
user 0m0.050s
sys 0m0.010s
$
ping and pong get to overlap their sleep in the actor version. This can't be done in the serial version. This is analogous to saying, in a serial program, you can't fetch two web pages at the same time without resorting to some form of manual thread management or asynchronous I/O.
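The same effect can be reproduced without dramatis, using the kind of manual thread management the serial version would need. The Python 3 sketch below uses the standard library's ThreadPoolExecutor; fake_io is a made-up function, and the 0.2 second sleep is an arbitrary stand-in for real I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(name, seconds=0.2):
    # Stand-in for a slow I/O operation such as fetching a web page.
    time.sleep(seconds)
    return name


# Serial: the second sleep cannot start until the first has finished.
start = time.monotonic()
serial = [fake_io("ping"), fake_io("pong")]
serial_elapsed = time.monotonic() - start

# Overlapped: each sleep runs on its own thread, so they proceed together.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=2) as pool:
    overlapped = list(pool.map(fake_io, ["ping", "pong"]))
overlapped_elapsed = time.monotonic() - start

print("serial:     %.2fs" % serial_elapsed)
print("overlapped: %.2fs" % overlapped_elapsed)
```

The results are the same in both cases; only the elapsed time differs. The actor version gets this overlap as a side effect of its structure rather than through explicit thread code.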
That's it: a concurrent actor program. You've now seen the most important dramatis objects. Other, more advanced features to explore are futures, available via dramatis.future, and advanced task gating, available via methods of dramatis.Actor.Interface.
This example creates two actors that send messages back and forth between each other, a bit like a ping pong ball. It's adapted from the Scala Example. The code for the final versions is in source:examples/pingpong. There are three versions there: the serial (source:examples/pingpong/serial.rb) and the actor (source:examples/pingpong/actor.rb) versions, which we develop here, and, for reference, a version closer to the original Scala example (source:examples/pingpong/scala.rb).
For our example, we'll have two objects, ping and pong, that pass a token back and forth a fixed number of times. We'll make the token, our "ball", be the number of volleys left to perform.
We'll start with a simple non-concurrent version. We'll need a class with a pingpong method. This method will take a count representing the number of volleys left to play and a reference to the partner that it is volleying with. With a little extra code so that we can see what is going on, it looks like this:
def pingpong count, partner
  if count == 0
    puts "#{@name}: done"
  else
    if count % 500 == 0 or count % 500 == 1
      puts "#{@name}: pingpong #{count}"
    end
    partner.pingpong count-1, self
  end
end
That's really all there is to it. All we need is the class wrapper and a few lines to actually create a couple of objects and start the ball rolling (or volleying, as the case may be):
class PingPong

  def initialize name
    @name = name
  end

  def pingpong count, partner
    if count == 0
      puts "#{@name}: done"
    else
      if count % 500 == 0 || count % 500 == 1
        puts "#{@name}: pingpong #{count}"
      end
      partner.pingpong count-1, self
    end
  end

end

ping = PingPong.new "ping"
pong = PingPong.new "pong"

ping.pingpong ARGV[0].to_i, pong
To see what happens, I can run a thousand volleys on my machine:
$ ruby serial.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ ruby serial.rb 10000
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
pong: pingpong 9001
ping: pingpong 9000
pong: pingpong 8501
ping: pingpong 8500
pong: pingpong 8001
ping: pingpong 8000
pong: pingpong 7501
ping: pingpong 7500
pong: pingpong 7001
ping: pingpong 7000
pong: pingpong 6501
ping: pingpong 6500
pong: pingpong 6001
ping: pingpong 6000
Exception `SystemStackError' at serial.rb:13 - stack level too deep
serial.rb:13:in `pingpong': stack level too deep (SystemStackError)
from serial.rb:16:in `pingpong'
from serial.rb:25
$
Even though the pingpong method doesn't return a useful value, we still have to call it recursively: that's the only way we have of calling methods. Most serial languages have no native way of expressing this volleying kind of communication between objects. It is possible to do this with continuations, but most languages and implementations, including ones we care about a lot like Python and JRuby, do not provide them. It's certainly possible to write a message passing layer, but it's a fair amount of code.
Let's put that aside for now and look at making pingpong into an actor program using dramatis. To make a normal class into an actor, we first need to mix in a dramatis class:
class PingPong
  include Dramatis::Actor
  ...
This requires loading the library first:
require 'dramatis/actor'
Before we rush ahead, we need to consider what we've done, what actors are.
Actors are concurrent objects: they are part object, part thread, a kind of chimera. They look in many ways like normal objects: they have state (data members), methods (member functions), and they can have methods called on them.
They also are, abstractly, threads. Rather than have whatever thread is running when a method call is made on an actor execute that method recursively and immediately, each actor has its own thread and only that thread is allowed to execute methods for that actor. This implies that an actor can only be executing one method at a time, so there can be no races or conflicts among methods of a single actor. Note that this thread is abstract and that different actor implementations implement it in different ways (dramatis, for example, does not create a thread per actor).
In dramatis, when an actor makes a call on another actor (often phrased in actor parlance as sending a message, in the same way that Ruby and Smalltalk send messages between objects), rather than executing the method itself, on its thread, it creates a task, a combination of a reference to the called actor (which we call the actor name), the method to be called, any arguments to the method, and a continuation. The continuation is a representation of where the results of the method call should be sent. In the case of a normal call, like we're familiar with in non-concurrent programs, the continuation indicates a message should be sent back to the caller such that the result is delivered as the result of the method call. We call this style of method call a remote procedure call, or rpc, where remote means on another actor.
One other aspect of actors, of their threads and methods, is that once begun, a method cannot be interrupted. If another task is scheduled on an executing actor, it cannot be executed until the current method runs to completion.
However, continuation passing provides a nice syntactic shortcut that does look a little bit like a method not running to completion. When an rpc call, that is, a normal method call, is made on another actor, the calling thread is, in a way, waiting for the called thread. Let's look at an example. If actor_1 executes the code
def a_method
  ...
  x = actor_2.another_method
  puts x
  ...
end
then actor_1 needs the value of x before it can execute the puts to print the value. We said before that calling a method on another actor runs on that other actor's thread, and that holds in this case: actor_1's thread cannot run another_method; the method must run on actor_2's thread. It may not be possible to run the method immediately: actor_2 may be in the middle of another task and may have other tasks queued to execute.
The semantics of the rpc protocol translate fairly easily into two tasks. actor_1 creates a task for the another_method call and includes, as the continuation of that task, information that the runtime can use to get the returned value to the right place in the call stack. When actor_2 completes another_method, it calls that continuation, which results in a new task, targeted on actor_1. When the runtime executes that task, the call on actor_1 completes and the return value from another_method is assigned to x.
In effect, we've taken our a_method and broken it in two, the part before the call to another_method and the part after. When actor_1 calls another_method, it effectively finishes the first half. The continuation it sends to actor_2 effectively says, "run the second half". Since actor_1 has finished the first half of the method, it is finished with the current task. It can therefore execute another task, including the task that will be created by actor_2 when it finishes another_method.
What if there are other calls that have been made on actor_1? Can they run? Can they run before the result from actor_2 has been returned?
This is an area that different actor systems vary on, the ability to selectively block tasks. Dramatis does provide this ability. As mentioned, actor methods are uninterruptible: that uninterruptibility is key to controlling concurrency conflicts in actor systems. If all other methods could execute when an rpc call was made on another actor, abstractly, the uninterruptible nature of the method execution is lost. Steps that occur before the call and after no longer appear atomic and without this atomicity, rpcs become much less useful.
Many actor systems, including dramatis, provide selective receives. That is, they allow an actor to indicate that certain calls are acceptable or unacceptable at any point in time. Dramatis uses this gating behavior to implement consistent rpcs.
When an actor makes an rpc call on another actor, dramatis automatically restricts the set of tasks that the caller will accept. In general, it will only allow the task that will return the desired value to execute. Any other tasks that were queued at the time of the call, or that are received before the target actor returns a value, are deferred. In this way, dramatis maintains the atomicity of methods even when rpcs involving multiple messages are used.
Dramatis also provides gating features so that actors can identify other methods that can be executed even while an rpc is pending.
Not all actor systems use implicit continuations as dramatis does. In many of these, the caller of an actor method must explicitly pass its name in the argument list and the target actor must explicitly send the result back. The effect is similar.
dramatis has other continuation types, as will be shown below. dramatis continuations are similar to native language continuations such as those found in Ruby, but have some extensions (they are concurrent) and limitations (often they cannot be called multiple times). dramatis does not use native language continuations.
So, with some background on actors, let's return to our example. When we mixed in Dramatis::Actor, what changed in our program? First, let's look at the lines that created our actors:
ping = PingPong.new "ping"
pong = PingPong.new "pong"
Because PingPong is now an actor, new no longer returns a reference to the object. Instead, it returns a Dramatis::Actor::Name, a proxy for the actor. In most cases this proxy object, which we call the actor name, acts like a native object reference with the addition of actor semantics. So when we call
ping.pingpong ARGV[0].to_i, pong
the runtime will, at some point in the future, run the pingpong method on ping, which will result in ping executing
partner.pingpong count-1, self
Here partner will be pong, so dramatis will create another task, this time targeted at pong, and, at some point in the future, execute it.
Another actor issue comes into play at this step. Actor systems are generally pass by value. That is, they send object values or copies, rather than references to objects. Nothing is shared between the caller and the callee. In pure actor systems, there are only values (which include actor names) and actors, so nothing except actor state is mutable and actors are internally serial.
In this sense, dramatis is not a pure actor system. Since it's only a library on top of a non-actor language and virtual machine, this is pretty much guaranteed: making a pure actor system would generally require changing one or both. In addition to immutable values like numbers, dramatis programs have all the mutable objects found in non-concurrent programs. dramatis provides mechanisms for managing concurrency but cannot guarantee that shared objects will not have concurrent conflicts if they are used.
At this time, dramatis does not specify whether actor method call arguments will be copied or not. Thus some care is required when considering objects passed to actor methods.
One philosophy of dramatis is to balance concurrency safety against divergence from serial programming, and at this point it's unclear whether copying method arguments is always a good idea.
Back to our example:
partner.pingpong count-1, self
What about self? Generally self in an actor method works as it normally does in a serial program. Only when an actor name is used do actor semantics enter the picture. Thus, an actor class can still call all its internal methods as it normally would without invoking actor semantics.
An exception to this occurs when passing self references to actor objects as arguments to an actor method or as its return value. In these cases, the runtime automatically converts the self reference to an actor name. This is a special case of pass by value, where the normal way to marshal an actor is to convert a reference to the actor into an actor name. This case is handled specially by dramatis because it's a common pattern and simplifies coding.
So, in our example, when ping calls pong with self as a parameter, dramatis substitutes ping's actor name for self. An actor can get its own name by calling actor.name.
Finally, pong will execute pingpong and, if the count hasn't reached zero, will call pingpong back on ping.
We can try to run it now but we get an error:
$ ./actor.rb 100
./actor.rb:22:in `pingpong': Dramatis::Deadlock (Dramatis::Deadlock)
from ./actor.rb:22:in `pingpong'
from ./actor.rb:31
$
We send a pingpong to our actor named ping from our main program. ping dutifully sends a pingpong to pong in the next stack frame. This works fine. Now pong tries to volley back to ping and something, perhaps unexpected, happens. dramatis is telling us that a deadlock has occurred while executing this code. (For those who may have noticed, the backtrace returned by dramatis represents the actor calls across threads; a raw, much longer and messier, backtrace is also available.)
The issue here is that we're trying to send a pingpong back from pong to ping, but ping is still waiting to hear back from pong. It isn't busy: it doesn't have any messages to process. But as we mentioned above, it called pong with an rpc call which set itself up only to receive the result from pong, and pong hasn't returned it. Instead pong is trying to make a new pingpong call. This is corecursion, just as we had in our nonconcurrent case, and by default, dramatis does not allow it.
Before we fix this, we'll mention in passing that dramatis can be made to allow recursion and corecursion, what we call call threading, by default. Adding the following call to an actor enables call threading for it:
actor.enable_call_threading
The right way to fix this is to notice that we don't really need to recurse here at all. Our actors don't look at the result of the pingpong method (which is just as well, since the method doesn't return anything useful).
What we need is a way to call a method but not wait around for the results (if you're paying close attention, we're being fast and loose with terminology here: actors don't wait, in most senses.) All actor implementations have this. In Erlang OTP it's called cast.
In dramatis, we make this non-waiting call by writing
release( partner ).pingpong count-1, self
release (or Dramatis.release if you haven't used include Dramatis) takes an actor name and returns a new name. This new name acts slightly differently than the original name. It releases, if you will, the task created by the call. That is, it doesn't ask the task to return a value, and the method call returns immediately. Another way of looking at it is that rather than providing the current continuation, it provides a nil continuation.
If we make this single change to our program and rerun it, we get:
$ ./actor.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ ./actor.rb 10000
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
pong: pingpong 9001
ping: pingpong 9000
pong: pingpong 8501
ping: pingpong 8500
pong: pingpong 8001
ping: pingpong 8000
pong: pingpong 7501
ping: pingpong 7500
pong: pingpong 7001
ping: pingpong 7000
pong: pingpong 6501
ping: pingpong 6500
pong: pingpong 6001
ping: pingpong 6000
pong: pingpong 5501
ping: pingpong 5500
pong: pingpong 5001
ping: pingpong 5000
pong: pingpong 4501
ping: pingpong 4500
pong: pingpong 4001
ping: pingpong 4000
pong: pingpong 3501
ping: pingpong 3500
pong: pingpong 3001
ping: pingpong 3000
pong: pingpong 2501
ping: pingpong 2500
pong: pingpong 2001
ping: pingpong 2000
pong: pingpong 1501
ping: pingpong 1500
pong: pingpong 1001
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
We're no longer calling pingpong recursively. ping calls pong and then returns, going inactive until it gets a request from pong. Similarly, pong calls ping and then returns. This style of data-flow programming is dead simple in actors and fairly complex in serial languages.
Finally, we could wonder, are we getting any other benefits from using actors here? We have a nice data flow model, but what about concurrency? We know in theory that the actors are running on different threads, but can we demonstrate that in a measurable way?
One useful feature of concurrency in actors is concurrent I/O: for example, fetching a number of web pages concurrently. That's a little complex for our example, but we can simulate it. Let's say that at each volley, our actors want to perform some time-consuming I/O. To simulate this, we'll put a short sleep in our pingpong method, right after we pingpong our partner:
sleep 0.001
$ time ./serial.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
real 0m11.291s
user 0m0.000s
sys 0m0.000s
$ time ./actor.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
real 0m5.060s
user 0m0.020s
sys 0m0.000s
$
ping and pong get to overlap their sleep in the actor version. This can't be done in the serial version. This is analogous to saying, in a serial program, you can't fetch two web pages at the same time without resorting to some form of manual thread management or asynchronous I/O.
That's it: a concurrent actor program. You've now seen the most important dramatis objects. Other, more advanced features to explore are futures, available via Dramatis.future, and advanced task gating, available via methods of Dramatis::Actor::Interface.
This example creates two actors that send messages back and forth between each other, a bit like a ping pong ball. It's adapted from the Scala Example. The code for the final versions is in source:examples/pingpong. There are three versions there: the serial (source:examples/pingpong/serial.rb) and the actor (source:examples/pingpong/actor.rb) versions which we develop here and, for reference, a version closer to the orignal Scala example (source:examples/pingpong/scala.rb).
For our example, we'll have two objects, ping and pong that pass a token back and forth a fixed number of times. We'll make the token, our "ball", be the number of volleys left to perform.
We'll start with a simple non-concurrent version. We'll need a class with a pingpong method. This method will take a count representing the number of volleys left to play and a reference to the partner that it is volleying with. With a little extra code so that we can see what is going on, it looks like this:
1 def pingpong count, partner
2 if count == 0
3 puts "#{@name}: done"
4 else
5 if count % 500 == 0 or count % 500 == 1
6 puts "#{@name}: pingpong #{count}"
7 end
8 partner.pingpong count-1, self
9 end
10 end
That's really all there is to it. All we need is the class wrapper and few lines to actually create a couple of objects and start the ball rolling (or volleying, as the case may be):
1 class PingPong
2
3 def initialize name
4 @name = name
5 end
6
7 def pingpong count, partner
8 if count == 0
9 puts "#{@name}: done"
10 else
11 if count % 500 == 0 || count % 500 == 1
12 puts "#{@name}: pingpong #{count}"
13 end
14 partner.pingpong count-1, self
15 end
16 end
17
18 end
19
20 ping = PingPong.new "ping"
21 pong = PingPong.new "pong"
22
23 ping.pingpong ARGV[0].to_i, pong
To see what happens, I can run a thousand volleys on my machine:
$ ruby serial.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ ruby serial.rb 10000
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
pong: pingpong 9001
ping: pingpong 9000
pong: pingpong 8501
ping: pingpong 8500
pong: pingpong 8001
ping: pingpong 8000
pong: pingpong 7501
ping: pingpong 7500
pong: pingpong 7001
ping: pingpong 7000
pong: pingpong 6501
ping: pingpong 6500
pong: pingpong 6001
ping: pingpong 6000
Exception `SystemStackError' at serial.rb:13 - stack level too deep
serial.rb:13:in `pingpong': stack level too deep (SystemStackError)
from serial.rb:16:in `pingpong'
from serial.rb:25
$
The recursion runs out of stack. Even though the pingpong method doesn't return a useful value, we still have to call it recursively: that's the only way we have of calling methods. Most serial languages have no native way of expressing this volleying kind of communication between objects. It is possible to do this with continuations but most languages/implementations, including ones we care about a lot like Python and JRuby, do not provide them. It's certainly possible to write a message passing layer, but it's a fair amount of code.
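To give a feel for what a hand-written message passing layer involves, here is a minimal sketch in plain Ruby, without dramatis. The names QueuedPingPong and run_volleys are made up for this illustration. Instead of calling its partner directly, each object appends a message to a shared queue, and a small loop delivers the messages one at a time, so the stack never grows with the number of volleys.

```ruby
# A hand-rolled message queue: plain Ruby, no dramatis.
# QueuedPingPong and run_volleys are hypothetical names for this sketch.
class QueuedPingPong
  attr_reader :name

  def initialize(name, queue)
    @name = name
    @queue = queue
  end

  # Handle one volley and, if the game is not over, queue the return
  # volley instead of calling the partner directly.
  def pingpong(count, partner)
    return if count == 0
    @queue << [partner, count - 1, self]
  end
end

# Deliver queued messages one at a time until none are left.
# Returns the number of messages delivered.
def run_volleys(count)
  queue = []
  ping = QueuedPingPong.new("ping", queue)
  pong = QueuedPingPong.new("pong", queue)
  queue << [ping, count, pong]
  delivered = 0
  until queue.empty?
    target, remaining, partner = queue.shift
    target.pingpong(remaining, partner)
    delivered += 1
  end
  delivered
end

puts run_volleys(100_000)
```

Even this toy version needs its own queue and delivery loop, and it has no notion of return values or threads, which is the "fair amount of code" an actor library takes off your hands.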
Let's put that aside for now and look at making pingpong into an actor program using dramatis. To make a normal class into an actor, first we need to mix in a dramatis class:
    class PingPong
      include Dramatis::Actor
      ...
The file also needs to load the library:
    require 'dramatis/actor'
Before we rush ahead, we need to consider what we've done, what actors are.
Actors are concurrent objects: they are part object, part thread, a kind of chimera. They look in many ways like normal objects: they have state (data members), methods (member functions), and they can have methods called on them.
They also are, abstractly, threads. Rather than have whatever thread is running when a method call is made on an actor execute that method recursively and immediately, each actor has its own thread and only that thread is allowed to execute methods for that actor. This implies that an actor can only be executing one method at a time, so there can be no races or conflicts among methods of a single actor. Note that this thread is abstract and that different actor implementations implement it in different ways (dramatis, for example, does not create a thread per actor).
In dramatis, when an actor makes a call on another actor (often phrased in actor parlance as sending a message, in the same way that Ruby and Smalltalk send messages between objects), rather than executing the method itself, on its thread, it creates a task, a combination of a reference to the called actor (which we call the actor name), the method to be called, any arguments to the method, and a continuation. The continuation is a representation of where the results of the method call should be sent. In the case of a normal call, like we're familiar with in non-concurrent programs, the continuation indicates a message should be sent back to the caller such that the result is delivered as the result of the method call. We call this style of method call a remote procedure call, or rpc, where remote means on another actor.
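As a rough picture of what a task carries, the sketch below models one as a plain Ruby Struct. This is not the dramatis implementation: Task, its field names, and the deliver method are assumptions made for illustration only.

```ruby
# Hypothetical model of a task; not the dramatis data structure.
Task = Struct.new(:target, :method_name, :arguments, :continuation) do
  # Run the method on the target and hand the result to the continuation,
  # if there is one.
  def deliver
    result = target.public_send(method_name, *arguments)
    continuation.call(result) if continuation
    result
  end
end

received = []
task = Task.new([3, 1, 2], :sort, [], ->(value) { received << value })
task.deliver
p received
```

Here the continuation is just a lambda that records the result. In an actor runtime it would instead describe how to get the value back to the caller.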
One other aspect of actors, of their threads and methods, is that once begun, a method cannot be interrupted. If another task is scheduled on an executing actor, it cannot be executed until the current method runs to completion.
However, continuation passing provides a nice syntactic shortcut that does look a little bit like a method not running to completion. When an rpc call is made, a normal method call, on another actor, the calling thread is, in a way, waiting for the called thread. Let's look at an example. If actor_1 executes the code
    def a_method
      ...
      x = actor_2.another_method
      puts x
      ...
    end
actor_1 needs the value of x before it can execute the puts to print the value. We said before that calling a method on another actor runs on that other thread, which holds in this case: actor_1's thread cannot run another_method: the method must run on actor_2's thread. It may not be possible to run the method immediately: actor_2 may be in the middle of another task and may have other tasks queued to execute.
The semantics of the rpc protocol translate fairly easily into two tasks: actor_1 creates a task for the another_method call and includes as the continuation of that task information that the runtime can use to get the returned value to the right place in the call stack. When actor_2 completes another_method, it calls that continuation, which results in a new task, targeted on actor_1, which, when the runtime executes it, will cause the call on actor_1 to complete and the return value from another_method to be assigned to x.
In effect, we've taken our a_method and broken it in two, the part before the call to another_method and the part after. When actor_1 calls another_method, it effectively finishes the first half. The continuation it sends to actor_2 effectively says, "run the second half". Since actor_1 has finished the first half of the method, it is finished with the current task. It can therefore execute another task, including the task that will be created by actor_2 when it finishes another_method.
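The split into two halves can be made explicit in plain Ruby. The sketch below is not dramatis code and does not use real threads: a single array stands in for the runtime's task queue, and run_split_call and the lambdas are hypothetical names. It only shows the order in which the pieces run.

```ruby
# Plain-Ruby illustration of splitting a method at an rpc; not dramatis code.
# run_split_call and the lambdas below are hypothetical names.
def run_split_call
  log = []
  tasks = []          # stands in for the runtime's queue of pending tasks

  # Second half of a_method: runs on actor_1 once the value is available.
  second_half = ->(x) { log << "actor_1: got #{x}" }

  # another_method, run by actor_2; it passes its result to the continuation
  # by queuing a new task.
  another_method = lambda do |continuation|
    log << "actor_2: another_method"
    tasks << -> { continuation.call(42) }
  end

  # First half of a_method: queue the call and finish the current task.
  first_half = lambda do
    log << "actor_1: calling actor_2"
    tasks << -> { another_method.call(second_half) }
  end

  tasks << first_half
  tasks.shift.call until tasks.empty?
  log
end

puts run_split_call
```

Between the first and second halves, actor_1 is not executing anything, which is why it is free to pick up the task that carries the result.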
What if there are other calls that have been made on actor_1? Can they run? Can they run before the result from actor_2 has been returned?
This is an area that different actor systems vary on, the ability to selectively block tasks. Dramatis does provide this ability. As mentioned, actor methods are uninterruptible: that uninterruptibility is key to controlling concurrency conflicts in actor systems. If all other methods could execute when an rpc call was made on another actor, abstractly, the uninterruptible nature of the method execution is lost. Steps that occur before the call and after no longer appear atomic and without this atomicity, rpcs become much less useful.
Many actor systems, including dramatis, provide selective receives. That is, they allow an actor to indicate that certain calls are acceptable or unacceptable at any point in time. Dramatis uses this gating behavior to implement consistent rpcs.
When an actor makes an rpc call on another actor, dramatis automatically restricts the set of tasks that the caller will accept. In general, it will only allow the task that will return the desired value to execute. Any other tasks that were queued at the time of the call or that are received before the target actor returns a value are deferred. In this way, dramatis maintains the atomicity of methods even when rpcs involving multiple messages are used.
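The deferral described here can be pictured as a mailbox that skips over tasks it is not currently willing to accept. The following is a small sketch under that assumption. GatedMailbox, await_reply, and the task ids are invented for illustration and are not the dramatis scheduler or its API.

```ruby
# Hypothetical mailbox with selective receive; not the dramatis scheduler.
class GatedMailbox
  def initialize
    @pending = []
    @only_accept = nil   # nil means every task is acceptable
  end

  def deliver(task)
    @pending << task
  end

  # While an rpc is outstanding, accept only the task carrying its reply.
  def await_reply(reply_id)
    @only_accept = reply_id
  end

  # Return the next runnable task, or nil if everything queued is deferred.
  def next_task
    index = @pending.index { |task| @only_accept.nil? || task[:id] == @only_accept }
    return nil unless index
    @only_accept = nil   # the gate lifts once the reply has been taken
    @pending.delete_at(index)
  end
end

mailbox = GatedMailbox.new
mailbox.await_reply(:reply_7)
mailbox.deliver({ id: :other_call })
mailbox.deliver({ id: :reply_7 })
p mailbox.next_task   # the reply runs first
p mailbox.next_task   # then the deferred call
```

The deferred call is not lost. It keeps its place in the queue and becomes runnable again once the reply has been processed.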
Dramatis also provides gating features so that actors can identify other methods that can be executed even while an rpc is pending.
Not all actor systems use implicit continuations as dramatis does. In many of these, the caller of an actor method must explicitly pass its name in the argument list and the target actor must explicitly send the result back. The effect is similar.
dramatis has other continuation types, as will be shown below. dramatis continuations are similar to native language continuations such as those found in Ruby, but have some extensions (they are concurrent) and limitations (often they cannot be called multiple times.) dramatis does not use native language continuations.
So, with some background on actors, let's return to our example. When we mixed in Dramatis::Actor, what changed in our program? First, let's look at the lines that created our actors:
    ping = PingPong.new "ping"
    pong = PingPong.new "pong"
For an actor class, new no longer returns a reference to the object. Instead, it returns a Dramatis::Actor::Name, a proxy for the actor. In most cases this proxy object, which we call the actor name, acts like a native object reference with the addition of actor semantics. So when we call
    ping.pingpong ARGV[0].to_i, pong
the runtime will, at some point in the future, run the pingpong method on ping, which will result in ping executing
    partner.pingpong count-1, self
partner will be pong, so dramatis will create another task, this time targeted at pong and, at some point in the future, execute it.
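One way to picture an actor name is as a proxy that turns each method call into a queued task rather than running it immediately. The sketch below uses Ruby's method_missing to do that. It is only an illustration: NameSketch and its run_pending method are invented here and say nothing about how Dramatis::Actor::Name is actually built.

```ruby
# Hypothetical proxy that queues calls instead of running them.
# Not the implementation of Dramatis::Actor::Name.
class NameSketch
  def initialize(actor)
    @actor = actor
    @tasks = []
  end

  # Any call the wrapped object understands becomes a queued task.
  def method_missing(method_name, *args)
    return super unless @actor.respond_to?(method_name)
    @tasks << [method_name, args]
    nil
  end

  def respond_to_missing?(method_name, include_private = false)
    @actor.respond_to?(method_name, include_private) || super
  end

  # Stand-in for the runtime: run the queued tasks in order and
  # collect their results.
  def run_pending
    results = []
    until @tasks.empty?
      method_name, args = @tasks.shift
      results << @actor.public_send(method_name, *args)
    end
    results
  end
end

name = NameSketch.new("ping")
name.upcase          # queued, not executed yet
p name.run_pending
```

The caller holds only the proxy, so every call made through it can be given actor semantics without the caller's code looking any different.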
Another actor issue comes into play at this step. Actor systems are generally pass by value. That is, they send object values or copies, rather than references to objects. Nothing is shared between the caller and the callee. In pure actor systems, there are only values (which include actor names) and actors, so nothing except actor state is mutable and actors are internally serial.
In this sense, dramatis is not a pure actor system. Since it's only a library on top of a non-actor language and virtual machine, this is pretty much guaranteed: to make a pure actor system would generally require changing either one or both. In addition to immutable values like numbers, dramatis programs have all the mutable objects found in non-concurrent programs. dramatis provides mechanisms for managing concurrency but cannot guarantee that shared objects will not have concurrent conflicts if they are used.
At this time, dramatis does not specify whether actor method call arguments will be copied or not. Thus some care is required when considering objects passed to actor methods.
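Because copying is unspecified, one defensive option is for the caller to pass a copy whenever an argument is mutable. The sketch below is plain Ruby with made-up names (Recorder and note stand in for an actor and one of its methods). It only shows the difference between sharing an argument and copying it.

```ruby
# Plain Ruby illustration of shared versus copied arguments.
# Recorder is a hypothetical class standing in for an actor.
class Recorder
  attr_reader :notes

  def initialize
    @notes = []
  end

  def note(text)
    @notes << text   # keeps a reference to whatever was passed in
  end
end

recorder = Recorder.new
shared = +"first"            # unary plus gives a mutable String
recorder.note(shared)        # the recorder and the caller share one String
recorder.note(shared.dup)    # the recorder gets its own copy
shared << " (edited)"        # a later mutation by the caller

p recorder.notes
```

The first note changes along with the caller's string while the second does not. With two actors running on different threads, the shared case is where conflicts could arise.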
One philosophy of dramatis is to balance concurrency issues with divergence from serial programming and at this point, it's unclear whether always copying method arguments is always a good idea.
Back to our example:
    partner.pingpong count-1, self
What about self? Generally self in an actor method works as it normally does in a serial program. Only when an actor name is used do actor semantics enter the picture. Thus, an actor class can still call all its internal methods as it normally would without invoking actor semantics.
An exception to this occurs when passing self references to actor objects as arguments to an actor method or as its return value. In these cases, the runtime automatically converts the self reference to an actor name. This is a special case of pass by value, where the normal way to marshal an actor is to convert a reference to an actor to an actor name. This case is handled specially by dramatis because it's a common pattern and simplifies coding.
So, in our example, when ping calls pong with self as a parameter, dramatis substitutes ping's actor name for self. An actor can get its own name by calling actor.name.
Finally, pong will execute pingpong and, if the count hasn't reached zero, will call pingpong back on ping.
We can try to run it now but we get an error:
$ ./actor.rb 100
./actor.rb:22:in `pingpong': Dramatis::Deadlock (Dramatis::Deadlock)
from ./actor.rb:22:in `pingpong'
from ./actor.rb:31
$
We send a pingpong to our actor named ping from our main program. ping dutifully sends a pingpong to pong in the next stack frame. This works fine. Now pong tries to volley back to ping and something, perhaps unexpected, happens. dramatis is telling us that a deadlock has occurred while executing this code. (For those who may have noticed, the backtrace returned by dramatis represents the actor calls across threads; a raw, much longer and messier, backtrace is also available.)
The issue here is that we're trying to send a pingpong back from pong to ping, but ping is still waiting to hear back from pong. It isn't busy: it doesn't have any messages to process. But as we mentioned above, it called pong with an rpc call which set itself up only to receive the result from pong, and pong hasn't returned it. Instead pong is trying to make a new pingpong call. This is corecursion, just as we had in our nonconcurrent case, and by default, dramatis does not allow it.
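The situation can be modelled in a few lines of plain Ruby. The check below is only a guess at the kind of reasoning involved, not how dramatis detects deadlocks. The hashes, the ids, and the runnable? and deadlocked? helpers are all hypothetical. An actor is treated as runnable if it has a queued task it is currently willing to accept, and the system is treated as deadlocked if work is queued but no actor is runnable.

```ruby
# Hypothetical deadlock check over a snapshot of actor state.
# Not the dramatis implementation.
def runnable?(actor)
  actor[:pending].any? { |id| actor[:awaiting].nil? || id == actor[:awaiting] }
end

def deadlocked?(actors)
  work_left = actors.any? { |actor| !actor[:pending].empty? }
  work_left && actors.none? { |actor| runnable?(actor) }
end

# ping made an rpc to pong and waits for its reply; pong, in turn, made an
# rpc back to ping and waits for that reply.
with_rpc = [
  { name: "ping", awaiting: :reply_from_pong, pending: [:pingpong_from_pong] },
  { name: "pong", awaiting: :reply_from_ping, pending: [] }
]

# With non-waiting calls nobody is gated, so ping can run pong's request.
without_waiting = [
  { name: "ping", awaiting: nil, pending: [:pingpong_from_pong] },
  { name: "pong", awaiting: nil, pending: [] }
]

p deadlocked?(with_rpc)
p deadlocked?(without_waiting)
```

The first snapshot is stuck because the only queued task is one that ping refuses to accept until pong replies, and pong will not reply until ping runs that task.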
Before we fix this, we'll mention in passing that dramatis can be made to allow recursion and corecursion, which we call call threading. The call that turns it on is
    actor.enable_call_threading
The right way to fix this is to notice that we don't really need to recurse here at all. Our actors don't look at the result of the pingpong method (which is just as well, since the method doesn't return anything useful).
What we need is a way to call a method but not wait around for the results (if you're paying close attention, we're being fast and loose with terminology here: actors don't wait, in most senses.) All actor implementations have this. In Erlang OTP it's called cast.
In dramatis, we make this non-waiting call by writing
    class PingPong
      ...
      release( partner ).pingpong count-1, self
release (or Dramatis.release if you haven't used include Dramatis) takes an actor name and returns a new name. This new name acts slightly differently than the original name. It releases, if you will, the task created by the call. That is, it doesn't ask the task to return a value and the method call returns immediately. Another way of looking at it is that rather than providing the current continuation, it provides a nil continuation.
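The "nil continuation" view can be sketched with a toy runtime in plain Ruby. MiniRuntime, call, and release_call are invented for this sketch. In particular, release_call is not the dramatis release, just a stand-in that queues the same kind of task without a continuation.

```ruby
# Hypothetical runtime showing a call with and without a continuation.
# MiniRuntime, call, and release_call are invented for this sketch.
class MiniRuntime
  def initialize
    @tasks = []
  end

  # rpc style: remember where the result should go.
  def call(target, method_name, *args, &continuation)
    @tasks << [target, method_name, args, continuation]
    nil
  end

  # release style: the same task, but with a nil continuation.
  def release_call(target, method_name, *args)
    @tasks << [target, method_name, args, nil]
    nil
  end

  # Run every queued task; results go only to tasks that asked for them.
  def run
    until @tasks.empty?
      target, method_name, args, continuation = @tasks.shift
      result = target.public_send(method_name, *args)
      continuation.call(result) if continuation
    end
  end
end

runtime = MiniRuntime.new
seen = []
runtime.call("ping", :upcase) { |value| seen << value }
runtime.release_call("pong", :upcase)   # the result is discarded
runtime.run
p seen
```

Both calls return immediately in this toy version. The difference that matters is that nothing is waiting on the released task, so the caller never has to be gated on its reply.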
If we make this single change to our program and rerun it, we get:
$ ./actor.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ ./actor.rb 10000
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
pong: pingpong 9001
ping: pingpong 9000
pong: pingpong 8501
ping: pingpong 8500
pong: pingpong 8001
ping: pingpong 8000
pong: pingpong 7501
ping: pingpong 7500
pong: pingpong 7001
ping: pingpong 7000
pong: pingpong 6501
ping: pingpong 6500
pong: pingpong 6001
ping: pingpong 6000
pong: pingpong 5501
ping: pingpong 5500
pong: pingpong 5001
ping: pingpong 5000
pong: pingpong 4501
ping: pingpong 4500
pong: pingpong 4001
ping: pingpong 4000
pong: pingpong 3501
ping: pingpong 3500
pong: pingpong 3001
ping: pingpong 3000
pong: pingpong 2501
ping: pingpong 2500
pong: pingpong 2001
ping: pingpong 2000
pong: pingpong 1501
ping: pingpong 1500
pong: pingpong 1001
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
This works, even for ten thousand volleys, because we are no longer calling pingpong recursively. ping calls pong and then returns, going inactive until it gets a request from pong. Similarly, pong calls ping and then returns. This style of data-flow programming is dead-simple in actors and fairly complex in serial languages.
Finally, we could wonder, are we getting any other benefits from using actors here? We have a nice data flow model, but what about concurrency? We know in theory that the actors are running on different threads, but can we demonstrate that in a measurable way?
One useful feature of concurrency in actors is concurrent I/O: for example, fetching a number of web pages concurrently. That's a little complex for our example, but we can simulate it. Let's say that at each volley, our actors wanted to perform some time-consuming I/O. To simulate this, we'll put a short sleep in our pingpong method, right after we pingpong our partner:
    sleep 0.001
Timing the serial and actor versions gives:
$ time ./serial.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
real 0m11.291s
user 0m0.000s
sys 0m0.000s
$ time ./actor.rb 1000
ping: pingpong 1000
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
real 0m5.060s
user 0m0.020s
sys 0m0.000s
$
The actor version finishes in about half the time (roughly 5 seconds against 11) because ping and pong get to overlap their sleep. This can't be done in the serial version. This is analogous to saying, in a serial program, you can't fetch two web pages at the same time without resorting to some form of manual thread management or asynchronous I/O.
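The effect of overlapping waits can be seen with plain Ruby threads, independent of dramatis. In the sketch below, timed is a hypothetical helper and the 0.2 second sleeps stand in for slow I/O. Exact timings will vary by machine and Ruby implementation.

```ruby
# Plain Ruby threads, not dramatis: two sleeps in sequence versus overlapped.
# timed is a hypothetical helper for this sketch.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# One after the other: the waits add up.
serial = timed do
  2.times { sleep 0.2 }
end

# One thread per wait: the waits overlap.
overlapped = timed do
  2.times.map { Thread.new { sleep 0.2 } }.each(&:join)
end

puts format("serial: %.2fs, overlapped: %.2fs", serial, overlapped)
```

This is the manual thread management the serial program would need. In the actor version the same overlap falls out of each actor running on its own abstract thread.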
That's it. A concurrent actor program and you've seen the most important dramatis objects. Other, more advanced, features to explore are futures, available via Dramatis.future and advanced task gating, available via methods of Dramatis::Actor::Interface.
This example creates two actors that send messages back and forth between each other, a bit like a ping pong ball. It's adapted from the Scala Example. The code for the final versions is in source:examples/pingpong. There are three versions there: the serial (source:examples/pingpong/serial.py) and the actor (source:examples/pingpong/actor.py) versions which we develop here and, for reference, a version closer to the original Scala example (source:examples/pingpong/scala.py).
For our example, we'll have two objects, ping and pong that pass a token back and forth a fixed number of times. We'll make the token, our "ball", be the number of volleys left to perform.
We'll start with a simple non-concurrent version. We'll need a class with a pingpong method. This method will take a count representing the number of volleys left to play and a reference to the partner that it is volleying with. With a little extra code so that we can see what is going on, it looks like this:
    def pingpong(self,count,partner):
        if count == 0:
            print "%s: done" % self._name
        else:
            if count % 500 == 0 or count % 500 == 1:
                print "%s: pingpong %d" % ( self._name, count )
            partner.pingpong( count-1, self )
That's really all there is to it. All we need is the class wrapper and a few lines to actually create a couple of objects and start the ball rolling (or volleying, as the case may be):
    import sys

    class PingPong ( object ):

        def __init__(self,name):
            self._name = name

        def pingpong(self,count,partner):
            if count == 0:
                print "%s: done" % self._name
            else:
                if count % 500 == 0 or count % 500 == 1:
                    print "%s: pingpong %d" % ( self._name, count )
                partner.pingpong( count-1, self )
                # sleep 0.001

    ping = PingPong( "ping" )
    pong = PingPong( "pong" )

    ping.pingpong( int(sys.argv[1]), pong )
To see what happens, I can run nine hundred volleys on my machine, which stays under Python's default recursion limit of 1000:
$ python serial.py 900
pong: pingpong 501
ping: pingpong 500
pong: pingpong 1
ping: done
$
$ python serial.py 10000 2>&1 |head
ping: pingpong 10000
pong: pingpong 9501
ping: pingpong 9500
Traceback (most recent call last):
File "serial.py", line 24, in <module>
ping.pingpong( int(sys.argv[1]), pong )
File "serial.py", line 18, in pingpong
partner.pingpong( count-1, self )
File "serial.py", line 18, in pingpong
partner.pingpong( count-1, self )
File "serial.py", line 18, in pingpong
partner.pingpong( count-1, self )
RuntimeError: maximum recursion depth exceeded
$
The recursion exceeds Python's limit. Even though the pingpong method doesn't return a useful value, we still have to call it recursively: that's the only way we have of calling methods. Most serial languages have no native way of expressing this volleying kind of communication between objects. It is possible to do this with continuations but most languages/implementations, including ones we care about a lot like Python and JRuby, do not provide them. It's certainly possible to write a message passing layer, but it's a fair amount of code.
Let's put that aside for now and look at making pingpong into an actor program using dramatis. To make a normal class into an actor, first we need to mix in, that is, add as a base class, a class from dramatis:
    class PingPong ( dramatis.Actor ):
        ...
The file also needs to import the library:
    import dramatis
Before we rush ahead, we need to consider what we've done, what actors are.
Actors are concurrent objects: they are part object, part thread, a kind of chimera. They look in many ways like normal objects: they have state (data members), methods (member functions), and they can have methods called on them.
They also are, abstractly, threads. Rather than have whatever thread is running when a method call is made on an actor execute that method recursively and immediately, each actor has its own thread and only that thread is allowed to execute methods for that actor. This implies that an actor can only be executing one method at a time, so there can be no races or conflicts among methods of a single actor. Note that this thread is abstract and that different actor implementations implement it in different ways (dramatis, for example, does not create a thread per actor).
In dramatis, when an actor makes a call on another actor (often phrased in actor parlance as sending a message in the same way that Smalltalk sends messages between objects), rather than executing the method itself, on its thread, it creates a task, a combination of a reference to the called actor (which we call the actor name), the method to be called, any arguments to the method, and a continuation. The continuation is a representation of where the results of the method call should be sent. In the case of a normal call, like we're familiar with in non-concurrent programs, the continuation indicates a message should be sent back to the caller such that the result is delivered as the result of the method call. We call this style of method call a remote procedure call, or rpc, where remote means on another actor.
One other aspect of actors, of their threads and methods, is that once begun, a method cannot be interrupted. If another task is scheduled on an executing actor, it cannot be executed until the current method runs to completion.
However, continuation passing provides a nice syntactic shortcut that does look a little bit like a method not running to completion. When an rpc call is made, a normal method call, on another actor, the calling thread is, in a way, waiting for the called thread. Let's look at an example. If actor_1 executes the code
    def a_method(self):
        ...
        x = actor_2.another_method()
        print x
        ...
actor_1 needs the value of x before it can execute the print. We said before that calling a method on another actor runs on that other thread, which holds in this case: actor_1's thread cannot run another_method: the method must run on actor_2's thread. It may not be possible to run the method immediately: actor_2 may be in the middle of another task and may have other tasks queued to execute.
The semantics of the rpc protocol translates fairly easily into two tasks: actor_1 creates a task for the another_method call and includes as the continuation of that task