Asynchronicity is the price to pay, you better know what you're paying for...
Let's share some vocabulary first:
Thread: The primitive responsible for executing code on the processor. You can give an existing (or a new) Thread some code, and it will execute it. Normally you can have a few hundred on a JVM, a number you can tweak your way up to thousands with JVM arguments. Multitasking is achieved by using multiple Threads. Multiple Threads can exist for a single processor, in which case multitasking happens when the processor switches between Threads (context switching), which gives the impression of things happening in parallel. An example of a direct, and probably naive, use of a new Thread in Java:
public class MyRunnable implements Runnable {
    public void run() {
        System.out.println("MyRunnable running");
    }
}
Thread thread = new Thread(new MyRunnable());
thread.start();
Asynchronous: Not synchronous, that is, not guaranteed to happen in the order in which it appears in the code. It might, but since there are no guarantees, you cannot rely on it. Now that we have a definition, we can say what async is not. Asynchronous code is not necessarily parallel (parallel execution being good for exploiting multiple cores) and is not necessarily non-blocking (which can be good for Thread economy, using callbacks on events), but it can be concurrent, and it is trouble-making. Basically, asynchronous code alone is a problem, and you should know why you would introduce a new problem. So next time someone talks about it, be aware that asynchronicity is a price to pay: you had better know what you're paying for, make sure it is cost-effective, and have the tools that make the cost more manageable.
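To make "asynchronous is not necessarily parallel" concrete, here is a minimal sketch. It assumes a pool with a single Thread; the object and method names are made up for illustration. Five tasks are submitted asynchronously, yet no two of them ever run at the same time.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object AsyncNotParallelSketch {
  // Returns the highest number of tasks observed running at the same time.
  def maxTasksRunningAtOnce(): Int = {
    // A pool with exactly one Thread: tasks are asynchronous but can never overlap
    implicit val single: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())
    val lock = new Object
    var running = 0
    var maxRunning = 0

    def task(): Future[Unit] = Future {
      lock.synchronized { running += 1; maxRunning = math.max(maxRunning, running) }
      Thread.sleep(20) // stand-in for some work
      lock.synchronized { running -= 1 }
    }

    try {
      // Blocking here only so the sketch can report a result
      Await.result(Future.sequence(List.fill(5)(task())), 5.seconds)
      lock.synchronized(maxRunning)
    } finally single.shutdown()
  }
}
```

With a larger pool the same code could run tasks in parallel; nothing in the calling code would change, which is exactly why you cannot read parallelism off asynchronous code.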
Non-blocking: A call that doesn't cause the caller (Thread) to block waiting for a result, but rather provides a mechanism (often callbacks) that allows the caller to schedule an action to run whenever the result is ready. Non-blocking is often used with IO operations, since no CPU, and thus no Thread, is needed to accomplish the task. This kind of programming can lead to a spaghetti disaster if not used with appropriate composition constructs.
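A small sketch of the non-blocking style, under the assumption that a Promise stands in for a pending IO result (there is no real IO here, and the names are hypothetical). The caller registers a callback and carries on; the callback only runs once the result is delivered.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NonBlockingSketch {
  // Returns the order in which things happened.
  def demo(): Vector[String] = {
    val lock = new Object
    var log = Vector.empty[String]
    def record(line: String): Unit = lock.synchronized { log = log :+ line }

    // Stand-in for a result that some IO layer will deliver later
    val response = Promise[String]()

    // Schedule an action for when the result is ready; nothing blocks here
    val handled: Future[Unit] =
      response.future.map(body => record(s"callback got: $body"))

    record("caller continues without waiting")
    response.success("hello") // the "IO layer" delivers the result

    // Await is used only so this sketch can return the log
    Await.result(handled, 5.seconds)
    lock.synchronized(log)
  }
}
```

The log comes back as "caller continues without waiting" followed by "callback got: hello", since the callback cannot fire before the Promise is completed.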
Reactive programming: Programming in a non-blocking style.
Futures: A construct that allows highly composable reactive programming. Basically, whenever you decide to use an asynchronous and reactive style of programming (for good reasons such as exploiting multiple cores, non-blocking IO or execution isolation), Futures provide a mechanism for making the problem simpler and more manageable. They feature composition semantics and functions for sequencing two or more calls, handling errors, combining asynchronous results and a lot more, through a high-level API that does the necessary synchronization with a lot of craft. Unintuitively, a Future doesn't necessarily happen in the future. By the time you receive it, it might have already happened. You can even make Futures that already carry a known value using Future.successful.
Examples of using Futures:
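import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// fib(100) does not fit in a Long, so BigInt is used
val f: Future[BigInt] = Future { fib(100) }
val fib100: Future[BigInt] =
  Future.successful(BigInt("354224848179261915075"))
val answerIsOk: Future[Boolean] =
  f.zip(fib100).map { case (r1, r2) => r1 == r2 }
val all: Future[List[BigInt]] =
  Future.sequence(List(f, fib100))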
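The sequencing and error handling mentioned in the definition can be sketched as follows. The helpers parse and twice are hypothetical and exist only for this illustration; the combinators used (flatMap via a for comprehension, recover, zip) are the standard ones on scala.concurrent.Future.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ComposeSketch {
  // Hypothetical asynchronous steps
  def parse(s: String): Future[Int] = Future(s.toInt)
  def twice(n: Int): Future[Int] = Future(n * 2)

  // Sequencing: twice runs only after parse has produced a value
  def parseThenTwice(s: String): Future[Int] =
    for {
      n <- parse(s)
      d <- twice(n)
    } yield d

  // Error handling: a failed parse is turned into a fallback value
  def safe(s: String): Future[Int] =
    parseThenTwice(s).recover { case _: NumberFormatException => -1 }

  // Combining: two independent results paired into one Future
  def both(a: String, b: String): Future[(Int, Int)] = safe(a).zip(safe(b))

  // Await is used only so this sketch can show a plain value
  def demo(): (Int, Int) = Await.result(both("21", "oops"), 5.seconds)
}
```

Here demo() yields (42, -1): the first input goes through both steps, while the second fails to parse and is recovered.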
ExecutionContext: Used by Futures to represent a Thread pool. An ExecutionContext is a good way of talking about a Thread pool responsible for executing some code. ExecutionContexts may share Threads or processors; it all depends on how an ExecutionContext is implemented. An example of using an ExecutionContext:
// fib(100) does not fit in a Long, so BigInt is used
val f: Future[BigInt] =
  Future { fib(100) }(ExecutionContext.Implicits.global)
Looking at how Futures are used, you will notice that each time you compose one with others you get a new Future. Then what? It seems that we cannot get rid of the Future and get just what is inside it. Technically you can: Await will block the caller until a result is available. Is it bad to Await? Often yes. There are exceptions, but you should really know what you are doing.
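For reference, a minimal sketch of what Await looks like (the object and method names are made up). Note the timeout: the calling Thread is parked until the result arrives or the timeout expires, in which case a TimeoutException is thrown.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AwaitSketch {
  // Blocks the calling Thread for up to 2 seconds, then unwraps the value
  def blockingGet(): Int =
    Await.result(Future { 21 * 2 }, 2.seconds)

  // A Future that is never completed: Await gives up after the timeout
  def timesOut(): Boolean =
    try {
      Await.result(Promise[Int]().future, 100.millis)
      false
    } catch {
      case _: TimeoutException => true
    }
}
```

While the caller is parked, its Thread does nothing useful, which is the cost to keep in mind before reaching for Await.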
The good news is that Play Framework nicely accepts a Future, more precisely a Future[Result]. This means that all you need to do is transform your Future into a Future of Result, and Play will take over from there. Does this mean Play is asynchronous? Reactive? Non-blocking? Does it feature parallel execution? Should I always use Futures? Can I not use Futures? Can I block? What about blocking SQL calls? These are exactly the questions I want to answer in this text. First, here is how you hand a Future to Play:
def index = Action {
  Async {
    fib100.map(r =>
      Ok("fib 100 is: " + r))
  }
}
Using Async allows you to hand a Future to Play, period. As we already know, a Future is not necessarily something that happens in the future. Play just gives you the opportunity to hand in a Future; it is up to you to use that opportunity or not. In other words, with Play you can choose to be non-blocking (reactive), but Play doesn't force you into that. If you prefer synchronous code (with a lot of Threads), or the benefits don't justify the price of dealing with asynchronous code, that is perfectly ok.
Remember, though, that any API providing a callback or a Future is an opportunity to be non-blocking (reactive). Examples are web calls, network, file system, some database drivers, schedulers, events, etc.
Play defines an ExecutionContext that is responsible for executing the user's code. That way, Play nicely separates the Thread pools doing internal server tasks (serving files, handling incoming requests, etc.) from the code written by the Play user. This means that even if there are no more Threads available in the "user code ExecutionContext", Play can continue handling its tasks. That is true unless the user's code is using 100% of the available CPU.
The "user code ExecutionContext" is configurable. Here are some ideas for configuring its ThreadPool:
- Number of Threads = Number of cores: this is the default, and it presumes that you are not blocking Threads (unnecessarily). If all your code is doing pure CPU work and not blocking on non-CPU tasks, this is a very good configuration, though you can do a bit better.
- Number of Threads < Number of cores: this leaves, in the worst-case scenario, one core for Play to handle its tasks. With this, even if the Play user's code is using all its Threads for pure CPU computation, there is some CPU power left for Play (unless some other code on the same JVM is using all the cores).
- Number of Threads > Number of cores, typically hundreds: this is when you choose a synchronous, blocking architecture. You will mostly configure it to as many blocking operations as you can do simultaneously (i.e. the same size as the connection pool).
The user code ExecutionContext is accessible at play.api.libs.concurrent.Execution.defaultContext, but generally you are not supposed to use it directly.
Apart from separating Play user code from internal tasks, you can choose to separate different parts of your app and make them run on different Thread pools. As you might have realized, having all your actions sharing the same ExecutionContext means they share the same Thread pool, meaning an Action that blocks Threads might drastically impact other Actions doing pure CPU while having CPU not fully utilized! Another example is isolating one functionality which is CPU intensive from the rest of the App so that you don't have a laggy experience of the whole application because of one, understandably, slow functionality. Imagine how bad your app will look if its Homepage is slow because of face recognition functionality that happens on a completely different page.
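The isolation argument can be tried outside of Play with a rough sketch like the one below. The pool sizes and names are arbitrary assumptions for the demonstration. One pool is deliberately saturated with blocked Threads, and work submitted to the other pool still completes right away.

```scala
import java.util.concurrent.{CountDownLatch, Executors}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object IsolationSketch {
  def demo(): Int = {
    // Two separate pools; the size of 2 is arbitrary
    val blockingEc = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))
    val cpuEc = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))
    val release = new CountDownLatch(1)
    try {
      // Saturate the blocking pool: both of its Threads now wait on the latch
      (1 to 2).foreach(_ => Future { release.await() }(blockingEc))
      // Work on the other pool is unaffected and finishes within the timeout
      Await.result(Future { (1 to 10).sum }(cpuEc), 2.seconds)
    } finally {
      release.countDown() // let the blocked tasks finish
      blockingEc.shutdown()
      cpuEc.shutdown()
    }
  }
}
```

Had the sum been submitted to blockingEc instead, it would have sat in the queue behind the two blocked tasks until the latch was released.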
There are different ways to give different Thread pools to different parts of your application. But the focus should be ExecutionContexts, not Futures, not Async.
What we are going to do is use a separate ExecutionContext for some of our Actions. An ad-hoc way could be using a Future, which takes an ExecutionContext as an extra (implicit) parameter. We can define:
// An ExecutionContext with a ThreadPool
// that is the same size as the ConnectionPool of our DB
val dbExecutionContext: ExecutionContext = ...
val eventuallySql = Future {
  /* some sql here */
}(dbExecutionContext)
// hand the Future to Play
def getUser = Action {
  Async {
    eventuallySql.map(user =>
      Ok("user is: " + user))
  }
}
It should be obvious by now that the use of Async is not what matters here; as its name communicates, it only means that synchronous execution is not guaranteed. It is the use of our special ExecutionContext that keeps this Action from blocking other Actions in our app.
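The definition of dbExecutionContext is left elided above. One possible way to build such a context, as a sketch only, is to wrap a fixed-size Java thread pool. The size of 10 is an assumed connection pool size, not a value from this text.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService}

object DbContextSketch {
  // Assumption: the DB connection pool holds 10 connections
  val connectionPoolSize = 10

  // One Thread per connection, so each blocking query has a Thread to block on
  val dbExecutionContext: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(
      Executors.newFixedThreadPool(connectionPoolSize))
}
```

Since the pool is backed by an ExecutorService, remember to call shutdown() on it when the application stops.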
Since we will be using this ExecutionContext quite often, why not make a simpler API:
def DBAction(r: => Result): EssentialAction = {
  Action {
    Async {
      Future(r)(dbExecutionContext)
    }
  }
}
And using it:
def getUser = DBAction {
  val user = // get the user from the database
  Ok("user is: " + user)
}
Now the SQL code in this example is blocking, but that is okay since we designed our ExecutionContext specifically for that and we separated it from the rest of our app's Actions.
So as a summary, it is important to:
- Have a good understanding of the meaning and differences between Asynchronous, Non-blocking and Parallel.
- Understand what a Future is and what an ExecutionContext is, and how they relate to Threads.
- Be aware of the opportunity Play brings by accepting a Future, and have an idea of the scenarios in which this can be interesting.
A callback is executed on the ExecutionContext passed (implicitly) to the completion scheduling method (onComplete, map, flatMap). You can choose to pass it explicitly too.
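This can be observed with a small sketch that uses two single-Thread pools with recognizable Thread names. The names "compute" and "callback" and the helper namedContext are invented for the illustration. The Future body runs on one pool, while the function given to map runs on the pool passed to map.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object CallbackContextSketch {
  // Single-Thread pool whose one Thread carries the given name
  private def namedContext(name: String): ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(
      Executors.newSingleThreadExecutor(new ThreadFactory {
        def newThread(r: Runnable): Thread = {
          val t = new Thread(r, name)
          t.setDaemon(true)
          t
        }
      }))

  // Returns (Thread that ran the Future body, Thread that ran the map callback)
  def demo(): (String, String) = {
    val computeEc = namedContext("compute")
    val callbackEc = namedContext("callback")
    try {
      val f = Future { Thread.currentThread().getName }(computeEc)
      // The ExecutionContext is passed explicitly to map
      val g = f.map(bodyThread => (bodyThread, Thread.currentThread().getName))(callbackEc)
      Await.result(g, 5.seconds)
    } finally {
      computeEc.shutdown()
      callbackEc.shutdown()
    }
  }
}
```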