On GitHub I provide an example of cooperative multitasking components on a single site served by Nette Framework. These components need to process several network requests to render themselves, which is normally slow.
This example takes advantage of the yield operator (available since PHP 5.5) to switch between tasks, with Flow as the scheduler and React as the parallel HTTP client.
This post introduces the Flow framework and cooperative multitasking in general.
This example application shows the recent activity of some very good Nette developers.
There is a GithubComponent, which takes a username in its constructor. It then sends an HTTP request to the GitHub API asking for the user's most recent event and the repository it was in. Then it looks for composer.json in that repository and displays the Composer name of the project.
Thus, each component needs to send two HTTP requests.
There are 4 such components on the site, i.e. 8 HTTP requests need to be performed.
The component may look something like this:
<?php
class GithubComponent {
    function __construct($name) { ... }

    function render() {
        // http request to github events
        // http request to composer json
        return $composer['name'];
    }
}
Normally in PHP, all code is executed synchronously and one must wait for blocking calls to finish (i.e. when you send an HTTP request, you wait for the response before processing anything else).
The following code shows a simple implementation of the GithubComponent. It is easy to understand, easy to handle errors in, and easy to tell what it will output.
The problem is that all the requests are blocking, so the application will wait 8 times for a request to complete before going further.
<?php
class GithubComponent extends Control
{
    /** @var string */
    private $name;

    public function __construct($name)
    {
        parent::__construct();
        $this->name = $name;
    }

    public function render()
    {
        // first blocking request: the user's recent events
        $data = $this->httpClient->request('GET', "https://github.com/$this->name.json")->getResponseBody();
        $events = Json::decode($data, Json::FORCE_ARRAY);
        $event = $events[0];

        $composerUrl = ...; // prepare url for composer.json file
        // second blocking request: composer.json of the repository
        $composerData = $this->httpClient->request('GET', $composerUrl)->getResponseBody();
        // second argument true: decode JSON objects as associative arrays
        if ($composer = json_decode($composerData, true)) {
            return "Last change to composer project $composer[name]";
        } else {
            return "Last change to github repo {$event['repository']['url']}";
        }
    }
}
How can we speed up the application? We can send the first request of every component in parallel, then wait for all the responses at once, and then send all the second requests.
With this solution, we would wait only twice.
But the code must be completely rewritten, and the components must know about each other and synchronize explicitly. Such code is much more difficult to write and to understand. It also makes the components much more tightly coupled, which complicates maintaining the application.
Not a good solution.
<?php
// Exercise: can you provide async example?
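One possible answer to the exercise, as a rough sketch only: it uses PHP's built-in curl_multi functions rather than any library from the example application. The function name fetchAllInParallel and its "array of URLs in, array of bodies out" shape are assumptions made for illustration. It shows a single batch; the components would have to call it once for all the first requests and once more for all the second requests, which is exactly the explicit synchronization described above.

```php
<?php
/**
 * Hypothetical helper: sends all requests at once and waits for the whole batch.
 *
 * @param  string[] $urls  URLs keyed by an arbitrary name
 * @return string[]        response bodies under the same keys
 */
function fetchAllInParallel(array $urls)
{
    $multi = curl_multi_init();
    $handles = [];

    // Register every request before any of them is sent.
    foreach ($urls as $key => $url) {
        $handle = curl_init($url);
        curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
        curl_multi_add_handle($multi, $handle);
        $handles[$key] = $handle;
    }

    // Drive all transfers until none is still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi) === -1) {
            usleep(1000); // select may fail immediately; avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the bodies once the whole batch is done.
    $bodies = [];
    foreach ($handles as $key => $handle) {
        $bodies[$key] = curl_multi_getcontent($handle);
        curl_multi_remove_handle($multi, $handle);
    }
    curl_multi_close($multi);

    return $bodies;
}
```

Note how the caller, not the component, now decides which requests belong together. That is the coupling problem mentioned above.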
We can use Promises and it would work, but the code would need to change a lot as well and it would bring more complexity. That's not what we want.
<?php
// Exercise: Can you write GithubComponent using *Promises*?
// Send me the code and I'll include it here.
Imagine a component would be able to pause itself when it needs to wait for data, and let other components work for a while. Once the data are available, the component would resume and do more work. When it needs more data, it can pause again.
One way for components to work together like this is called cooperative multitasking, and there is a great blog post about it by @nikic.
Let me step aside for a second and introduce some new concepts (generators and promises), so that you understand it all clearly.
Generators in PHP are pretty complex, but we need just a small part of them.
When you call a function which contains the yield keyword, it won't be executed; instead, the call creates a new instance of Generator.
This generator can be executed, paused and resumed. Once you execute it, it runs like a normal function until it hits a yield command. Then it gets paused. When resumed, it continues right where it paused.
The yield command can send a value out of the generator and receive another value back when the generator is resumed, i.e. it can be used in expressions and works a bit like a function call.
<?php
function foo() {
$data = (yield $client->request('...'));
}
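To make the pause and resume behaviour concrete, here is a small self-contained sketch. The function name conversation and the values passed around are made up for illustration; only Generator::current() and Generator::send() are real PHP API.

```php
<?php
// Hypothetical generator: every yield pauses the function and hands a value out;
// the value passed to send() becomes the result of that yield expression.
function conversation()
{
    $name = (yield 'first question');
    $age = (yield "hello $name");
    yield "$name is $age";
}

$generator = conversation();          // nothing inside the function has run yet
$first = $generator->current();       // runs up to the first yield
$second = $generator->send('Jan');    // resumes with $name = 'Jan', pauses at the second yield
$third = $generator->send(30);        // resumes with $age = 30, pauses at the third yield

echo $first, "\n", $second, "\n", $third, "\n";
```

The caller decides when the generator continues and what each yield evaluates to. A scheduler can use exactly this to feed a response into a paused component.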
That should be enough about generators for now. If you want to know more, look at the article I mentioned before.
A promise is a concept used mostly in asynchronous programming. A promise represents a value which may not be available yet (but will be available in the future).
In asynchronous code, you tell a promise what to do once it gets resolved (once the value becomes available) by calling its then method. You may be familiar with it:
<?php
$client->requestFoo(...)->then(function($result) { ... });
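If you have never seen a promise from the inside, the following is a deliberately tiny sketch. The class SimplePromise is hypothetical and is not the promise implementation used by the example application; among other simplifications, real promise libraries return a new promise from then and also handle failures.

```php
<?php
// Hypothetical, minimal promise: remembers callbacks until a value arrives.
class SimplePromise
{
    private $callbacks = [];
    private $resolved = false;
    private $value;

    /** Register what should happen once the value is available. */
    public function then(callable $callback)
    {
        if ($this->resolved) {
            $callback($this->value);       // value already there: call right away
        } else {
            $this->callbacks[] = $callback; // otherwise keep it for later
        }
        return $this; // simplification: real promises return a new promise here
    }

    /** Called by whoever produces the value, e.g. an HTTP client. */
    public function resolve($value)
    {
        $this->resolved = true;
        $this->value = $value;
        foreach ($this->callbacks as $callback) {
            $callback($value);
        }
        $this->callbacks = [];
    }
}

$seen = [];
$promise = new SimplePromise();
$promise->then(function ($result) use (&$seen) {
    $seen[] = "got $result";
});
$seenBeforeResolve = $seen;          // the callback has not run yet
$promise->resolve('response body');  // now the callback runs
```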
We will use promises, but we won't pass callbacks to it (or to be precise, we won't do it explicitly; it will happen in the background).
Cooperative code uses a special kind of function; let's call them coroutines.
Coroutines themselves are synchronous, but can be paused and resumed when they wish to. While they're paused, other components can start working or be resumed. That's what makes them cooperative.
Thus they are kind
The code will be just slightly different:
- We'll use the renderFlow method (defined by Flow\FlowControl) instead of render. That's because the signature of this method is different: render shall return nothing and echo the result, but renderFlow creates a Generator, which yields the output in several steps.
- getResponseBody() shall not wait for the value, but return a Promise for that value instead. Creating a promise is not blocking (let's change the function so that it works like that now).
- yield will wait for the Promise to finish and take out its value. I'll get to how it's done soon.
- Instead of return, we now have to yield the return value with yield result(...).
Let's look at the code and I'll explain it afterwards.
<?php
class GithubComponent extends Control
{
    public function renderFlow()
    {
        // yield pauses here until the promise for the events is resolved
        $data = (yield $this->httpClient->request('GET', "https://github.com/$this->name.json")->getResponseBody());
        $events = Json::decode($data, Json::FORCE_ARRAY);
        $event = $events[0];

        $composerUrl = ...; // prepare url for composer.json file
        // second pause: wait for composer.json
        $composerData = (yield $this->httpClient->request('GET', $composerUrl)->getResponseBody());
        // second argument true: decode JSON objects as associative arrays
        if ($composer = json_decode($composerData, true)) {
            yield result("Last change to composer project $composer[name]");
        } else {
            yield result("Last change to github repo {$event['repository']['url']}");
        }
    }
}
These components will be rendered cooperatively, which behaves just like the asynchronous solution: it sends all the first network requests and waits, then continues processing and sends the next batch of requests. Thus it waits for the network only twice, making the application much faster.
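Why only twice? The following sketch of a toy scheduler may help. It is not the code of Flow: FakePromise, Result, result(), runCooperatively() and component() are hypothetical stand-ins written for this illustration, and the "network" is simulated by a promise that already carries its value. The counter shows how often the scheduler has to wait for a whole batch.

```php
<?php
// Hypothetical stand-in for a promise whose value arrives after a network wait.
class FakePromise
{
    public $value;
    public function __construct($value) { $this->value = $value; }
}

// Hypothetical marker for the final output of a coroutine.
class Result
{
    public $value;
    public function __construct($value) { $this->value = $value; }
}

function result($value) { return new Result($value); }

/** Toy scheduler: advances all generators in rounds, waiting once per round. */
function runCooperatively(array $generators, &$waits = 0)
{
    $results = [];
    $pending = $generators;
    while ($pending) {
        $waiting = [];
        foreach ($pending as $key => $generator) {
            $yielded = $generator->current();
            if ($yielded instanceof Result) {
                $results[$key] = $yielded->value;  // coroutine is done
            } elseif ($generator->valid()) {
                $waiting[$key] = $yielded;         // coroutine waits for a promise
            }
        }
        if ($waiting) {
            $waits++; // one (simulated) network wait for the whole batch
            foreach ($waiting as $key => $promise) {
                $pending[$key]->send($promise->value); // resume with the data
            }
        }
        $pending = array_intersect_key($pending, $waiting);
    }
    return $results;
}

// Hypothetical component: two dependent "requests", then the output.
function component($name)
{
    $event = (yield new FakePromise("event of $name"));
    $composer = (yield new FakePromise("composer for $event"));
    yield result("$name: $composer");
}

$waits = 0;
$output = runCooperatively([
    'juzna' => component('juzna'),
    'hosiplan' => component('hosiplan'),
], $waits);
```

Both components yield their first promise in the same round and their second promise in the next one, so the counter ends at two regardless of how many components are added.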
Flow is a new framework for cooperative multitasking with integration to Nette Framework.
It means that it provides a scheduler which executes cooperative components, and it also adds some convenient functions.
Flow has 4 basic concepts and provides a simple API.
Flow defines these four general types of commands (concepts):
- component usage (use-component)
- component data provider (get-data)
- fetch functions (request)
- waiting for response (wait)
These types of commands can be implemented by functions or other language constructs.
Each of these commands has a direct equivalent in synchronous code, which you already know but probably never thought of as a separate concept. That makes sense, because there they don't bring you anything special. But they get some superpowers in cooperative code.
I'll try to explain these concepts first on synchronous code and then shift them to cooperative code.
In synchronous code, these concepts don't need to be fully separated. But let's try to separate them, so that we can see how they're mapped to cooperative code.
Try to look back at the synchronous code and find where and how these concepts are applied.
use-component is the place where you are rendering one or more components. It may be in the template, {control githubJuzna}, or in PHP, $this->getComponent('githubJuzna')->render().
get-data is the render method of the component. It fetches the data which the component needs to display something useful. That's the $httpClient->request(...), or it can be a database query such as $db->table('Articles')->fetch($id).
request is the place where you send the request for some data (but you don't wait for the response yet); that's the $httpClient->request(...).
wait is the completion of the request and reading the data from the response; that's the ->getResponseBody().
Note that the request and wait concepts are not well separated in synchronous code, because there you mostly wait for the response within the same function that sends the request. But in cooperative code we need to separate these two. This code illustrates the separation needed for coroutines:
<?php
// Normal sync code
$httpClient->syncRequest(...); // returns response body

// Separating request and wait
$httpClient
    ->request(...)       // send request
    ->getResponseBody(); // wait command; waits and returns response body
You already use these concepts in synchronous code, and they can be directly transformed to their cooperative counterparts.
The same four concepts are present in cooperative code, but they're gaining some superpowers here.
use-component can be used on several components at once, and they're processed cooperatively. You can do it explicitly by calling Flow, or let Latte templates do it, as they support it out of the box. This example runs two components cooperatively.
<?php
Flow::flowAuto([
    'juzna' => new GithubComponent('juzna'),
    'hosiplan' => new GithubComponent('hosiplan'),
]);
get-data can be performed in several succeeding steps, letting the other components work while waiting for the data.
request sends a request and returns Promise. Or it can group multiple requests and send them in one bulk request (but I'll get to that later, because that's another complex topic).
wait gets the data out of a Promise. If the data is not yet available, here comes the part where it cooperatively lets the other components run.
As you can see, these concepts are pretty similar in both synchronous and cooperative code. Transforming (rewriting) a code to work cooperatively shall be easy and straightforward.
Concept | Sync Code | Cooperative Code |
---|---|---|
use-component | $this->getComponent('githubJuzna')->render() | Flow::flowAuto([$this->getComponent('githubJuzna'), $this->getComponent('githubHosiplan')]) |
get-data | render, blocking calls | renderFlow, yield |
request | ->request(...), blocking | ->request(...), returns Promise |
wait | ->wait(), blocking | yield, cooperative |
In synchronous code it is pretty easy, you're passing pure values everywhere. But in cooperative code, there are more types to care about.
To clearly understand how the concepts fit together, let's look at their types. For each of the concepts, there is a function with a specific type.
- Flow::flowAuto(FlowControl[]) → value[] (or Flow::flowAuto(Generator[]) → value[]): receives a list of components and renders them, eventually resulting in the actual values of these components.
- FlowControl::renderFlow() → Generator: because this method contains yield, calling it creates a generator. This generator can then be executed step by step.
- request(values) → Promise<T>: sending a request produces a promise for a result of type T.
- yield Promise<T> → T: yield unboxes the promise and produces the pure value.
With these types in mind, you can now easily compose the application, knowing what you're passing around.
Flow can upgrade your templates to render components cooperatively.
You can have this simple code and it just works.
<h1>GitHub news</h1>
{control ghJuzna}
{control ghHosiplan}
{control ghKaja}
{control ghDg}
For that, your components need to extend Flow\BaseControl and the presenter needs to extend Flow\BasePresenter.
(you can skip this part)
How on earth can this template render components cooperatively?
It first renders a partial template. When it needs to render a component, it will just remember where the component needs to be and starts rendering of the component, adding it to Flow. It cannot be rendered at the moment, because the data are not yet available.
After the template has finished rendering, Flow will run the components cooperatively, waiting for all of them to finish.
Once all the components have finished rendering, Flow will put their output into the partial template.
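The three steps can be sketched in a few lines. This is only an illustration of the placeholder idea and not how Flow or Latte implement it: the functions renderPartial and renderQueued, the [[component:...]] token format and the fake component output are all assumptions.

```php
<?php
// Step 1 (hypothetical): render the template, but for each component only
// remember it in a queue and leave a placeholder token in the HTML.
function renderPartial(array $componentNames, array &$queue)
{
    $html = "<h1>GitHub news</h1>\n";
    foreach ($componentNames as $name) {
        $queue[] = $name;
        $html .= '[[component:' . $name . "]]\n";
    }
    return $html;
}

// Step 2 (hypothetical): stand-in for running all queued components
// cooperatively; returns the output of each one keyed by its placeholder.
function renderQueued(array $queue)
{
    $outputs = [];
    foreach ($queue as $name) {
        $outputs['[[component:' . $name . ']]'] = "<p>news for $name</p>";
    }
    return $outputs;
}

$queue = [];
$partial = renderPartial(['ghJuzna', 'ghDg'], $queue);

// Step 3: put the finished outputs into the partial template.
$html = strtr($partial, renderQueued($queue));
```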
Cooperative multitasking is another way of writing non-blocking code. The components by themselves are synchronous, but when multiple components are rendered, they can cooperate and thus reduce the waiting time.
The focus is on keeping the code simple, i.e. avoiding passing callbacks around or using $promise->then(...).
To write cooperative code, you need:
- your data layer to return Promises,
- your components to use yield within renderFlow,
- to use templates, or to call multiple components via Flow::flowAuto.
The rest is handled by the Flow framework.
Coroutines use the principle of delimited continuations.
The Flow framework is not yet available as a package; it is rather a proof of concept bundled with the example.
Try playing with the example first and send me some feedback ;) But I will create a package soon.