Tutorial
Updated 2009-02-06 for Actors Guild version 0.61.
1. Overview
New CPUs have mostly stopped getting faster at executing instructions. Instead, they have gained the ability to execute several threads simultaneously, as if you had several CPUs in your computer that all share the same memory. Unfortunately, taking advantage of these systems is difficult. Not only do you have to design your application to distribute the work across several simultaneously running threads, it is also surprisingly complicated to coordinate simultaneous access to memory and other shared resources, even in a language with built-in support for threads such as Java. The framework presented in this tutorial offers a simple way of writing multi-threaded applications.
Actors Guild is a concurrency framework for Java that is loosely based on the Actor model. Actors are a special kind of object that do their work by sending and receiving messages. An actor can process only one message at a time and does not share any mutable data with other actors. Data is always owned by exactly one actor. Other actors can only access it indirectly, by exchanging messages with the owner. Thus, when you write an actor, you don't have to deal with all the problems that come with concurrency, like the need for synchronization. You write only single-threaded Actors which have to follow some relatively simple rules.
What makes the Actor model a concurrency model is that several actors can run, and thus process messages, at the same time. The Actors Guild framework will take care of things such as message dispatching, managing threads and memory synchronization for you.
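To make the "one message at a time" idea concrete, here is a minimal sketch that uses only the JDK. It is not how Actors Guild is implemented and uses none of its classes. The class MailboxCounter and its methods are hypothetical names chosen for this illustration. A single-threaded executor stands in for the actor's message queue, so the count field is only ever touched by one thread, even though several threads send messages:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration using only the JDK; this is not Actors Guild code.
class MailboxCounter {
    // A single worker thread plays the role of the actor's message queue.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private int count; // only ever touched by the mailbox thread

    // "Sending a message" queues the work and returns immediately.
    Future<?> add(int a) {
        Runnable message = () -> count += a;
        return mailbox.submit(message);
    }

    Future<Integer> getCount() {
        Callable<Integer> message = () -> count;
        return mailbox.submit(message);
    }

    void shutdown() {
        mailbox.shutdown();
    }

    // Several sender threads queue add(1) messages, then the total is requested.
    static int demo(int senders, int messagesPerSender) {
        MailboxCounter counter = new MailboxCounter();
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < messagesPerSender; j++) {
                    counter.add(1);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // every message is queued once the senders finish
            }
            return counter.getCount().get(); // runs after all queued adds
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            counter.shutdown();
        }
    }
}
```

The senders run concurrently, but no synchronization is needed inside add, because the queue serializes the messages.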
One major difference between Actors Guild and other Actor-based frameworks is that you implement messages as Java methods. You can send a message just like calling a Java method. Actors Guild does not need any preprocessor, special compiler or other tools for this. You only need to follow a few simple rules for the method's signature. All the other magic is done invisibly at run-time.
Because Java methods usually return a value, Actors Guild always uses a request/response pattern for messages. This is different from 'normal' actor-based systems. When you invoke an actor's message method, you send a message to the actor and then receive a message as the reply. But for you it looks like invoking a method, albeit with a somewhat unusual return type.
To make use of several CPU cores, the trick is to design your program so that several actors can work in parallel. You do so by sending more than one message before you process the results. Or, with the right design, your interfaces may not require you to process the results at all, giving you a completely asynchronous program.
2. How to Write an Actor
Extend Actor
To create a new actor without any messages, just create a class that extends from the abstract class Actor:
class MyFirstActor extends Actor { }
That's it. However, actors are not initialized with a normal Java constructor.
Do Not Write a Constructor! (Use Properties Instead)
Actors should not have a constructor. Any configuration of the actor is done using properties. Actors Guild has some annotations to support property-based initialization, but they will be shown later.
3. How To Write A Message Handler
A message handler is a Java method that fulfills three conditions:
- It has a @Message annotation.
- Its return value is wrapped in an AsyncResult object.
- All arguments and the return values must be serializable (exception: the @Shared annotation).
The @Message Annotation
The following example shows a simple actor with two messages that implements a counter. The messages allow you to add a number to the counter and to retrieve its current value. Both messages will automatically work correctly, even when accessed by several threads simultaneously, because the framework processes only one message at a time.
class Counter extends Actor
{
    private int count;
    @Message
    public AsyncResult<Integer> getCount() {
        return result(count);
    }
    @Message
    public AsyncResult<Void> add(int a) {
        count = count + a;
        return noResult();
    }
}
The AsyncResult Return Type
Every message must return an AsyncResult value. From the perspective of the message implementation, AsyncResult is just a wrapper for the return value. The generic type of the AsyncResult is the actual return type. Methods that do not return a value should use 'AsyncResult<Void>'. The reasons why you need to use AsyncResult will be explained later, when we show how to send a message.
Message implementations will usually just use the convenience methods result() and noResult() to obtain an AsyncResult that wraps the actual result. Both methods create a new instance of ImmediateResult and return it. ImmediateResult is a very simple implementation of the AsyncResult interface which does nothing but store the value.
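If you know java.util.concurrent, a rough analogy for a result wrapper that "does nothing but store the value" is an already-completed future. The following sketch uses only the JDK. The class ImmediateSketch is a hypothetical name, and its two methods merely mirror the names result() and noResult() from the text. It makes no claim about how ImmediateResult is actually written:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical JDK-only analogy; not the Actors Guild ImmediateResult class.
class ImmediateSketch {
    // Wraps a value that has already been computed.
    static <T> CompletableFuture<T> result(T value) {
        return CompletableFuture.completedFuture(value);
    }

    // Wraps "no value" for methods that return nothing.
    static CompletableFuture<Void> noResult() {
        return CompletableFuture.completedFuture(null);
    }
}
```

Such a wrapper is complete from the moment it is created, so reading it never blocks.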
Message Argument And Return Value Types
The declared arguments of a message, as well as the return value wrapped in the AsyncResult, must be serializable types, actors or interfaces. Serializable types include all primitives (int, boolean etc.), Strings, all classes implementing Serializable, and arrays of serializable types.
Because of Java's memory model, sharing mutable types between threads is hardly possible without error-prone synchronization. Thus the values have to be copied using the serialization mechanism, which is relatively slow. You can improve performance by using types that Actors Guild recognizes as immutable. Immutable values can be shared and thus passed by reference. The primitive types, their wrapper classes and the String class are automatically treated as immutable. If you want to pass your own types to an Actor method, consider designing them as immutable types that implement the Immutable interface.
There is one exception to the rule above: you can use the @Shared annotation for a parameter to pass a reference to any object. This object will be shared between the caller and the called method, and must thus be multi-threading safe. Sometimes this is needed to use existing Java APIs, but otherwise sharing is discouraged.
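Copying a value through serialization can be sketched with plain JDK classes. This is only an illustration of the general technique, not the code Actors Guild uses. CopyBySerialization and deepCopy are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical helper; shows the general copy-by-serialization technique.
class CopyBySerialization {
    // Writes the object to a byte array and reads a fresh copy back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    // The copy can be modified without affecting the original.
    static boolean copyIsIndependent() {
        ArrayList<String> original = new ArrayList<>();
        original.add("a");
        ArrayList<String> copy = deepCopy(original);
        copy.add("b");
        return copy != original && original.size() == 1 && copy.size() == 2;
    }
}
```

The round trip through a byte array is the reason this kind of copying is slow compared to passing an immutable value by reference.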
4. How to Create an Actor
Obtain an Agent
In order to create an actor, you need to create an Agent first. The agent is responsible for creating and managing actors. It will also manage common resources for the actors, such as threads.
The default implementation of the Agent interface is the DefaultAgent class. In most cases you can just create it using the default constructor. The other constructor variants allow you to configure options such as the number of threads in the agent's pool.
Creating the agent:
Agent agent = new DefaultAgent();
If you are implementing an actor and need to get a reference to the actor's agent, for example to create a new actor, call the Actor's getAgent method.
Create the Actor
Now that we have an agent instance, we can use it to create the Actor. Our Counter actor can be created like this:
Agent agent = new DefaultAgent();
Counter counter = agent.create(Counter.class);
Please note that the returned class is not the Counter class itself, but a proxy class created by the framework that extends Counter. The proxy class is responsible for most of Actors Guild's functionality.
5. How to Send a Message
Queuing the Message
If you have a reference to an actor, you can send it a message by invoking the message's method. This will cause the message to be queued by the agent. As soon as there is a spare thread and the actor is not busy, the agent will let the actor execute the message.
Agent agent = new DefaultAgent();
Counter counter = agent.create(Counter.class);
counter.add(10); // Queues a message (will be executed later!)
Waiting for Messages
When you only queue a message, you don't know when the message has been processed. In fact, the message may never be processed at all. Actors Guild's only promise is that messages will be processed in the order in which they have been queued. So if you need to be sure that the message has been processed at a certain point, you need to wait for it.
In order to wait for a message you need the AsyncResult interface returned by the message methods. In the previous chapter it was said that a message implementation just uses AsyncResult as a result wrapper, and that's true for the implementation side. The caller, however, receives a different implementation of AsyncResult. This implementation can notify you when a message has completed, lets you wait for a message to complete, and lets you retrieve the result.
To wait for a single message to be completed, just call the method await of the AsyncResult:
Agent agent = new DefaultAgent();
Counter counter = agent.create(Counter.class);
counter.add(10).await(); // Queues a message and waits for its completion
Retrieving Results
Message implementations can provide you with a return value, and this return value can be retrieved from the AsyncResult using the get method. Get will wait for the message to be processed (just like await) and then return the result that has been wrapped by the implementation:
Agent agent = new DefaultAgent();
Counter counter = agent.create(Counter.class);
counter.add(10).await(); // Queues a message and waits for its completion
System.out.println("Counter value = " + counter.getCount().get());
Please note that the await() for the add operation is optional. Messages will be processed in the order in which they have been queued, so add will always be executed before getCount.
Get will not only return results. If the message implementation threw an exception, it will be rethrown by get, wrapped in a WrappedException.
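The JDK's Future behaves in a comparable way, which may help if you are used to it: a failure inside the task is rethrown by get, wrapped in an ExecutionException. The sketch below shows only this JDK behaviour, and WrappedFailure is a hypothetical name. How WrappedException exposes the original exception is not covered here:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical JDK-only analogy for "get rethrows the failure, wrapped".
class WrappedFailure {
    static String causeOfFailure() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> failing = () -> {
                throw new IllegalArgumentException("bad input");
            };
            Future<Integer> pending = worker.submit(failing);
            pending.get(); // the failure surfaces here, not at submit
            return "no failure";
        } catch (ExecutionException wrapped) {
            Throwable original = wrapped.getCause(); // unwrap the real exception
            return original.getClass().getSimpleName() + ": " + original.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }
}
```

The practical consequence is the same in both cases: if you never ask for the result, you never see the exception.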
Don't Wait For Yourself!
There's a common trap when writing an actor: you can deadlock it by sending a message to itself and then waiting for the result:
class DeadlockActor extends Actor
{
    @Message
    public AsyncResult<Void> a() {
        return noResult();
    }
    @Message
    public AsyncResult<Void> b() {
        a().get(); // DEADLOCK
        return noResult();
    }
}
Don't do this: the messages a and b cannot run simultaneously. Thus, if b calls a and then waits until a is finished, it will never return.
There are two ways to prevent this. Either you move the functionality of a into a private, non-message method that can be called by both a and b. Or, if you need to wait for a at the end of b, you can return the AsyncResult of a:
class NoDeadlockActor extends Actor
{
    @Message
    public AsyncResult<Void> a() {
        return noResult();
    }
    @Message
    public AsyncResult<Void> b() {
        return a();
    }
}
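The same trap exists with any single-threaded queue, so it can be reproduced with JDK classes alone. The sketch below is a hypothetical illustration, not Actors Guild code. Task b runs on a single-threaded executor, queues task a on the same executor and waits for it. A timeout is used so that the demonstration ends instead of hanging:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical JDK-only reproduction of "waiting for yourself".
class SelfWait {
    // Returns true if waiting for a self-sent message timed out.
    static boolean selfWaitTimesOut() {
        ExecutorService mailbox = Executors.newSingleThreadExecutor();
        try {
            Callable<Boolean> b = () -> {
                Runnable a = () -> { };
                Future<?> reply = mailbox.submit(a); // queued behind b itself
                try {
                    reply.get(200, TimeUnit.MILLISECONDS);
                    return false; // a ran while b was still active
                } catch (TimeoutException e) {
                    return true; // a cannot start before b returns
                }
            };
            return mailbox.submit(b).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            mailbox.shutdown();
        }
    }
}
```

Without the timeout, b would wait forever, which is exactly the deadlock described above.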
6. How to Make Concurrent Calls
In Actors Guild, concurrency is achieved by processing several messages simultaneously. As an actor can only process one message at a time, this means that you need to have several actor instances that can work in parallel (there is an exception to this rule, but this will be shown later). Much of the art of writing a fast, concurrent application with Actors Guild is designing your application to have several actors work in parallel on the same problem.
The following example shows you how to create two Counter instances and let them process add in parallel (admittedly, add is so fast that the overhead of sending a message is higher by orders of magnitude, but it's just an example):
Agent agent = new DefaultAgent();
Counter counter1 = agent.create(Counter.class);
Counter counter2 = agent.create(Counter.class);
AsyncResult<Void> ar1, ar2;
ar1 = counter1.add(2);
ar2 = counter2.add(5);
agent.awaitAll(ar1, ar2); // wait for both messages to finish
The method awaitAll blocks until the messages of all AsyncResult handles have been processed. Alternatively you could also call await on every AsyncResult separately, but awaitAll allows some optimizations in the framework.
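The pattern of starting several pieces of work first and waiting for all of them afterwards can also be tried out with the JDK alone. The following sketch is an assumption-laden analogy, not the framework's API: CompletableFuture.allOf plays the role that awaitAll plays in the text, and ParallelSum is a hypothetical class name:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical JDK-only analogy for "send several messages, then wait for all".
class ParallelSum {
    // Sums the numbers from 'from' (inclusive) to 'to' (exclusive).
    static long sumRange(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    // Splits the range 0..n-1 into two halves that are summed in parallel.
    static long parallelSum(long n) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Start both halves before waiting for either of them.
            CompletableFuture<Long> lower =
                    CompletableFuture.supplyAsync(() -> sumRange(0, n / 2), pool);
            CompletableFuture<Long> upper =
                    CompletableFuture.supplyAsync(() -> sumRange(n / 2, n), pool);
            CompletableFuture.allOf(lower, upper).join(); // wait for both
            return lower.join() + upper.join();
        } finally {
            pool.shutdown();
        }
    }
}
```

The important detail is the order: if you waited for the first half before starting the second, the two halves would run one after the other.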
7. How to Initialize an Actor with Properties
Actors Guild uses Dependency Injection initialization instead of constructors. You can define standard Java properties and then pass values for them in the Agent.create() invocation. After setting them, create will invoke all methods in the actor that have the @Initializer annotation. You can use the @Initializer to initialize any properties and fields that depend on other properties, as well as for checking the validity of injected properties.
The following example extends the Counter class with a property for the initial count. The initializer method writes the initial value into the count variable:
class InitializedCounter extends Actor
{
    private int initialCount;
    private int count;
    @Initializer
    void init() {
        count = initialCount;
    }
    public void setInitialCount(int initialCount) {
        this.initialCount = initialCount;
    }
    public int getInitialCount() {
        return this.initialCount;
    }
    @Message
    public AsyncResult<Integer> getCount() {
        return result(count);
    }
    @Message
    public AsyncResult<Void> add(int a) {
        count = count + a;
        return noResult();
    }
}
The InitializedCounter actor can be created like this:
Agent agent = new DefaultAgent();
InitializedCounter counter = agent.create(InitializedCounter.class, new Props("initialCount", 5));
The second argument of create specifies one or more properties that will be set during the construction of the class. After the constructor has finished, the @Initializer method init will be called and can do additional initialization.
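If you are curious how such a sequence (construct, set properties, call the initializer) can work in principle, here is a small sketch based on plain Java reflection. Everything in it is hypothetical: the @Init annotation, the create method and the StartCounter class are invented for this illustration and are not the Actors Guild implementation, which may work quite differently:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical reflection-based sketch of property initialization.
class PropertyInjection {
    // Stand-in for an initializer annotation (invented for this sketch).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Init { }

    static class StartCounter {
        private int initialCount;
        private int count;
        public void setInitialCount(int initialCount) { this.initialCount = initialCount; }
        @Init
        void init() { count = initialCount; }
        int getCount() { return count; }
    }

    static <T> T create(Class<T> type, Map<String, Object> props) {
        try {
            // Step 1: no-argument construction.
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T instance = constructor.newInstance();
            // Step 2: call the setter of every property that was passed in.
            for (Method m : type.getDeclaredMethods()) {
                String name = m.getName();
                if (name.startsWith("set") && name.length() > 3 && m.getParameterCount() == 1) {
                    String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                    if (props.containsKey(property)) {
                        m.setAccessible(true);
                        m.invoke(instance, props.get(property));
                    }
                }
            }
            // Step 3: run the initializer methods.
            for (Method m : type.getDeclaredMethods()) {
                if (m.isAnnotationPresent(Init.class)) {
                    m.setAccessible(true);
                    m.invoke(instance);
                }
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static int demo() {
        Map<String, Object> props = new HashMap<>();
        props.put("initialCount", 5);
        return create(StartCounter.class, props).getCount();
    }
}
```

The order of the three steps is the point of the sketch: the initializer can rely on the properties because it runs last.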
8. @Prop, @DefaultValue and @Bean Annotations
Writing Java properties the regular way, like in the previous chapter, does not only create a lot of redundant code, it also has two additional problems in actors:
- You can not define final read-only properties, because 'final' fields can only be written in the Java constructor. Actor constructors can not get any arguments.
- Regular Java properties, as IDEs usually generate them, are not thread-safe. Properties need either a volatile modifier on the field or synchronized accessors; otherwise they cannot be safely called from other actors.
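The two options from the second point look like this when written by hand in plain Java. SafeProperties is a hypothetical class for illustration only. It shows what you would otherwise have to write for every property:

```java
// Hypothetical hand-written thread-safe properties, using only the JDK.
class SafeProperties {
    // Variant 1: a volatile field, plain accessors.
    private volatile int limit;
    // Variant 2: a plain field, synchronized accessors.
    private String name;

    int getLimit() { return limit; }
    void setLimit(int limit) { this.limit = limit; }

    synchronized String getName() { return name; }
    synchronized void setName(String name) { this.name = name; }

    // A writer thread sets both properties, the calling thread reads them.
    static boolean writerIsSeenByReader() {
        SafeProperties p = new SafeProperties();
        Thread writer = new Thread(() -> {
            p.setName("done");
            p.setLimit(7);
        });
        writer.start();
        // Guaranteed to end, because writes to a volatile field become visible.
        while (p.getLimit() != 7) {
            Thread.yield();
        }
        return "done".equals(p.getName());
    }
}
```

Without volatile or synchronized, the waiting loop would not be guaranteed to ever see the new value.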
Actors Guild solves these problems with the @Prop annotation. You can write abstract getters and setters, annotate the getter with @Prop, and Agent.create() will implement the methods for you. The generated code uses synchronization by default, so the properties are thread-safe. If you omit the setter, you create a final read-only property. The following example re-implements the InitializedCounter example with @Prop, making initialCount read-only and implementing a property for the count field as well:
abstract class InitializedCounter2 extends Actor
{
    @Initializer
    void init() {
        setCount(getInitialCount());
    }
    @Prop
    public abstract int getInitialCount();
    @Prop
    public abstract int getCount();
    public abstract void setCount(int count);
    @Message
    public AsyncResult<Void> add(int a) {
        setCount(getCount() + a);
        return noResult();
    }
}
By default, Agent.create() initializes unspecified properties to their default value (0 for numbers, null for references). If you want a different default value for a property, define a static final field of the same type and declare it as default using the @DefaultValue annotation. This is the InitializedCounter2 example with a default value of 100 for the counter:
abstract class InitializedCounter3 extends Actor
{
    @DefaultValue("initialCount")
    final static int DEFAULT_INITIAL_COUNT = 100;
    @Initializer
    void init() {
        setCount(getInitialCount());
    }
    @Prop
    public abstract int getInitialCount();
    @Prop
    public abstract int getCount();
    public abstract void setCount(int count);
    @Message
    public AsyncResult<Void> add(int a) {
        setCount(getCount() + a);
        return noResult();
    }
}
The @Prop, @Initializer and @DefaultValue annotations are not limited to actors. In fact, you can use them in any class as long as it has a constructor without arguments and a @Bean annotation.
Here is a simple bean that uses @Prop for three read-only properties:
@Bean(threadSafe=false)
abstract class Book
{
    @Prop
    public abstract String getTitle();
    @Prop
    public abstract String getAuthor();
    @Prop
    public abstract int getReleaseYear();
}
The mandatory threadSafe parameter in @Bean specifies whether the generated property accessors will be synchronized and thus thread-safe, or not. @Bean classes must be created with Agent.create() as well. This snippet creates a new Book instance:
Agent agent = new DefaultAgent();
Book book = agent.create(Book.class,
    new Props("title", "The C Programming Language")
        .add("author", "Brian Kernighan and Dennis Ritchie")
        .add("releaseYear", 1978));
9. Thread @Usage Annotations
In order to process messages, the Agent uses a thread pool with a limited number of threads. Parameters like the size of the pool can be configured in the Agent's implementation. The disadvantage of this automatic thread management is that it is optimized for the execution of CPU-bound tasks. Usually the agent uses no more than two threads per CPU core. This becomes a problem when a thread is not used for computation and data processing, but rather for I/O-bound tasks or for waiting for external input, such as from the network or the user. In these cases there can and should be more than one thread per CPU core, as the threads are mostly idle. So how can the agent know about it?
In order to tell the agent how a message is using its CPU time, there's an annotation called @Usage. By default, the agent assumes that the message is using the full CPU time. In that case you do not need the annotation. But if the message is mostly I/O-bound, it should be declared with @Usage(ThreadUsage.IO). Similarly, messages that are waiting for some external event (but not another actor!) should be declared as @Usage(ThreadUsage.Waiting). Both annotations will cause the agent to increase the thread pool's maximum size while the message is running. A message implementation should not mix costly CPU operations with I/O or external events. Instead it should be split into several messages.
The following example shows an actor with an I/O-bound and a waiting message:
class SlowMessageActor extends Actor
{
    @Message
    @Usage(ThreadUsage.IO)
    public AsyncResult<Void> writeFile(String name, String content) throws Exception {
        FileOutputStream fos = new FileOutputStream(name);
        try {
            fos.write(content.getBytes());
        } finally {
            fos.close(); // release the file even if the write fails
        }
        return noResult();
    }
    @Message
    @Usage(ThreadUsage.Waiting)
    public AsyncResult<Void> waitForKeyPrompt() throws Exception {
        System.in.read(); // blocks until input is available
        return noResult();
    }
}
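Why blocked messages need extra threads can be demonstrated with a plain JDK thread pool. The sketch below is a hypothetical illustration and says nothing about how the agent resizes its own pool. Each task waits until all tasks have started, which only works if the pool has one thread per waiting task:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical JDK-only illustration of waiting tasks occupying pool threads.
class BlockingTasks {
    // Returns true if all tasks were running at the same time.
    static boolean allTasksMeet(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch arrived = new CountDownLatch(tasks);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Boolean> job = () -> {
                    arrived.countDown();
                    // Blocks a pool thread while waiting for the other tasks.
                    return arrived.await(500, TimeUnit.MILLISECONDS);
                };
                results.add(pool.submit(job));
            }
            boolean all = true;
            for (Future<Boolean> r : results) {
                all &= r.get();
            }
            return all;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With four waiting tasks, a pool of four threads lets them all proceed, while a pool of two threads leaves the first two tasks waiting in vain for tasks that cannot start yet.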
10. Concurrency Model Annotations
So far, this tutorial has always said that an actor can only process one message at a time. This is the classic actor model, and also the default model in Actors Guild. However, in some cases a single actor can become a bottleneck. You should always consider avoiding this by distributing the work across several actor instances, but this is not always possible. For these cases Actors Guild allows you to change the concurrency model of the actor and allow several threads to run at the same time in the same actor. The @Model annotation can be applied to an actor to change the concurrency model.
The safer of the two multi-threaded models is ConcurrencyModel.Stateless. In Stateless actor classes all fields must be 'final', and @Prop annotated properties must be read-only. Their type must be a primitive, String, or a type implementing the Immutable interface, or they must be annotated with the @Shared annotation. The latter should only be used with thread-safe classes such as JDBC's DataSource. As long as you adhere to these rules (and Actors Guild will refuse to create Stateless actors that violate them), it is safe to declare the concurrency model as Stateless, because without mutable fields there are no race conditions. The following actor is Stateless:
@Model(ConcurrencyModel.Stateless)
class MultiplicatorActor extends Actor
{
    @Message
    public AsyncResult<Integer> mul(int a, int b) {
        return result(a * b);
    }
}
Stateless actors can be called like any other actor. From the perspective of the caller, the only difference is that the stateless actor is able to process more than one message at a time.
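The reason why objects without mutable fields are safe to call from many threads can be seen in plain Java as well. The following sketch is a hypothetical example that does not use Actors Guild. StatelessScaler has a single final field and is invoked from a parallel stream, so several threads call scale at the same time:

```java
import java.util.stream.IntStream;

// Hypothetical JDK-only illustration: no mutable fields, no race conditions.
class StatelessScaler {
    private final int factor; // final: never changes after construction

    StatelessScaler(int factor) {
        this.factor = factor;
    }

    // Reads only its argument and a final field, so it needs no locking.
    int scale(int x) {
        return x * factor;
    }

    // Scales the numbers 1..n with factor 3 on several threads and sums them.
    static long scaleAllInParallel(int n) {
        StatelessScaler scaler = new StatelessScaler(3);
        return IntStream.rangeClosed(1, n)
                .parallel()
                .mapToLong(scaler::scale)
                .sum();
    }
}
```

No matter how the calls are interleaved, the result is the same, because no call can observe a change made by another call.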
The second multi-threaded model is the pure MultiThreaded model. When you use this, you are on your own. The framework will process MultiThreaded messages as soon as possible and not coordinate their execution with other messages. The order in which the messages are being processed is not defined and may be different from the order of queuing. The message implementation is also responsible for memory synchronization. One way to achieve this is to synchronize on the actor instance, which is what the framework does for all single-threaded message implementations. The problem with manual synchronization is that this may block the executing thread for a long time, if you have long-running single-threaded messages in the Actor, and the Agent may eventually run out of threads if too many actors do this.
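Synchronizing on the instance, as mentioned above, looks like this in plain Java. The sketch is a hypothetical illustration of manual synchronization in general, not a MultiThreaded actor. SynchronizedCounter and hammer are invented names:

```java
// Hypothetical JDK-only illustration of synchronizing on the instance.
class SynchronizedCounter {
    private int count;

    void add(int a) {
        synchronized (this) { // only one thread at a time may update count
            count += a;
        }
    }

    int getCount() {
        synchronized (this) { // also needed for reading, for visibility
            return count;
        }
    }

    // Several threads add concurrently; returns the final count.
    static int hammer(int threads, int addsPerThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < addsPerThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getCount();
    }
}
```

A thread that cannot enter the synchronized block is blocked until the lock is free, which is the cost described above when other code holds the lock for a long time.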