C++ Coroutine - When, How to use? [closed]

As a novice C++ programmer who is very new to the concept of coroutines, I am trying to study and utilize the feature. Although there is an explanation of coroutines here: What is a coroutine?
I am not yet sure when and how to use coroutines. Several example use cases were provided, but those use cases had alternative solutions that could be implemented with pre-C++20 features (e.g. lazy computation of an infinite sequence can be done by a class with a private internal state variable).
Therefore I am seeking use cases in which coroutines are particularly useful.

The word "coroutine" in this context is somewhat overloaded.
The general programming concept called a "coroutine" is what is described in the question you're referring to. C++20 added a language feature called "coroutines". While C++20's coroutines resemble the general concept, in practice the two are quite different.
At the ground level, both concepts are built on the ability of a function (or call stack of functions) to halt its execution and transfer control of execution to someone else. This is done with the expectation that control will eventually be given back to the function which has surrendered execution for the time being.
Where C++ coroutines diverge from the general concept is in their limitations and designed application.
co_await <expr> as a language construct does the following (in very broad strokes). It asks the expression <expr> whether it has a result value to provide at the present time. If it does, the value is extracted and execution of the current function continues as normal.
If the expression cannot be resolved at the present time (perhaps because <expr> is waiting on an external resource or an asynchronous process), then the current function suspends its execution and returns control to the function that called it. The coroutine also attaches itself to the <expr> object so that, once <expr> has the value, it can resume the coroutine's execution with said value. This resumption may or may not happen on the current thread.
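In code, the three-step protocol that co_await drives looks roughly like this (a bare sketch of my own; my_awaiter and its int result are placeholders):
#include <coroutine>

struct my_awaiter
{
    // Asked first: if the result is already available, return true and
    // the coroutine continues without suspending at all.
    bool await_ready() const noexcept { return false; }

    // Called once the coroutine is suspended. Arrange here for h.resume()
    // to be called when the value is ready - possibly from another thread.
    void await_suspend(std::coroutine_handle<> h) {}

    // Called on resumption; its return value becomes the value of the
    // whole co_await expression.
    int await_resume() const noexcept { return 42; }
};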
So we see the pattern of C++20 coroutines. Control on the current thread returns to the caller, but resumption of the coroutine is determined by the nature of the value being co_awaited on. The caller gets an object that represents the future value the coroutine will produce but has not yet. The caller can wait on it to be ready or go do something else. It may also be able to itself co_await on the future value, creating a chain of coroutines to be resumed once a value is computed.
We also see the primary limitation: suspension applies only to the immediate function. You cannot suspend an entire stack of function calls unless each one of them individually performs its own co_await.
C++ coroutines are a complex dance between 3 parties: the expression being awaited on, the code doing the awaiting, and the caller of the coroutine. Using co_yield essentially removes one of these three parties. Namely, the yielded expression is not expected to be involved. It's just a value which is going to be dumped to the caller. So yielding coroutines only involve the coroutine function and the caller. Yielding C++ coroutines are a bit closer to the conceptual idea of "coroutines".
Using a yielding coroutine to serve a number of values to the caller is generally called a "generator". How "simple" this makes your code depends on your generator framework (ie: the coroutine return type and its associated coroutine machinery). But good generator frameworks can expose range interfaces to the generation, allowing you to apply C++20 ranges to them and do all sorts of interesting compositions.
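For instance, with C++23's std::generator (or any comparable generator framework) an infinite sequence composes directly with range adaptors - a small sketch:
#include <generator> // C++23; a hand-rolled Generator type works the same way
#include <iostream>
#include <ranges>

std::generator<int> naturals()
{
    for (int i = 0;; ++i)
        co_yield i; // lazily produces the next value on demand
}

int main()
{
    // Compose the infinite generator with C++20 range adaptors.
    for (int n : naturals()
               | std::views::filter([](int x) { return x % 2 == 0; })
               | std::views::take(5))
        std::cout << n << ' '; // prints: 0 2 4 6 8
}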

Coroutines make asynchronous programming more readable.
Without coroutines, we would use callbacks in asynchronous programming:
#include <functional>

void callback(int data1, int data2)
{
    // do something with data1, data2 after the async op
    // ...
}

void async_op(std::function<void()> callback)
{
    // do some async operation, then invoke callback when it completes
}

int main()
{
    // do something
    int data1 = 0;
    int data2 = 0;
    async_op(std::bind(callback, data1, data2));
    return 0;
}
With many nested callbacks, the code becomes very hard to read.
If we use a coroutine, the code becomes:
#include <coroutine>
#include <functional>

struct promise;

struct coroutine : std::coroutine_handle<promise>
{
    using promise_type = struct promise;
};

struct promise
{
    coroutine get_return_object() { return {coroutine::from_promise(*this)}; }
    // Start eagerly, so that calling callasync() begins execution at once.
    std::suspend_never initial_suspend() noexcept { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() {}
};

struct awaitable
{
    bool await_ready() { return false; }
    // In a real framework, async_op would stash h and resume it from its
    // completion handler. Here func runs synchronously, so returning false
    // simply resumes the coroutine immediately.
    bool await_suspend(std::coroutine_handle<promise> h)
    {
        func();
        return false;
    }
    void await_resume() {}
    std::function<void()> func;
};

void async_op()
{
    // do some async operation
}

coroutine callasync()
{
    // do something
    int data1 = 0;
    int data2 = 0;
    co_await awaitable{async_op};
    // do something with data1, data2 after the async op
    // ...
}

int main()
{
    callasync();
    return 0;
}

it seems to me that those cases can be achieved in a simpler way (e.g. lazy computation of an infinite sequence can be done by a class with a private internal state variable).
Say you're writing a function that should interact with a remote server, creating a TCP connection, logging in with some multi-stage challenge/response protocol, making queries and getting replies (often in dribs and drabs over TCP), eventually disconnecting.... If you were writing a dedicated function to synchronously do that - as you might if you had a dedicated thread for this - then your code could very naturally reflect the stages of connection, request and response processing and disconnecting, just by the order of statements in your function and the use of flow control (for, while, switch, if). The data needed at various points would be localised in a scope reflecting its use, so it's easier for the programmer to know what's relevant at each point. This is easy to write, maintain and understand.
If, however, you wanted the interactions with the remote host to be non-blocking and to do other work in the thread while they were happening, you could make it event driven, using a class with private internal state variable[s] to track the state of your connection, as you suggest. But your class would need not only the same variables the synchronous-function version would need (e.g. a buffer for assembling incoming messages), but also variables to track where in the overall connection/processing steps you left off (e.g. enum state { tcp_connection_pending, awaiting_challenge, awaiting_login_confirmation, awaiting_reply_to_message_x, awaiting_reply_to_message_y }, counters, an output buffer), and you'd need more complex code to jump back in to the right processing step. You no longer have localisation of data with its use in specific statement blocks - instead you have a flat hodge-podge of class data members and additional mental overhead in understanding which parts of the code care about them, when they're valid or not, etc. It's all spaghetti. (The State/Strategy design pattern can help structure this better, but sometimes at a runtime cost for virtual dispatch, dynamic allocation etc.)
Co-routines provide a best-of-both-worlds solution: you can think of them as providing an additional stack for the call to what looks very much like the concise and easy/fast-to-write/maintain/understand synchronous-processing function initially explained above, but with the ability to suspend and resume instead of blocking, so the same thread can progress the connection handling as well as do other work (it could even invoke the coroutine thousands of times to handle thousands of remote connections, switching efficiently between them to keep work happening as network I/O happens).
Harkening back to your "lazy computation of infinite sequence" - in one sense, a coroutine may be overkill for this, as there may not be multiple processing stages/states, or subsets of data members that are relevant therein. There are some benefits to consistency though - if providing e.g. pipelines of coroutines.
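To give a feel for the shape, a rough sketch (the stage functions are stand-ins, and the coroutine/awaitable machinery is the minimal one from the earlier answer):
// Stand-ins for genuinely asynchronous operations.
void tcp_connect() { /* start the connect, completes later */ }
void send_login()  { /* send the challenge response */ }
void run_query()   { /* issue a query and await the reply */ }

coroutine handle_connection()
{
    co_await awaitable{tcp_connect}; // suspend until connected
    co_await awaitable{send_login};  // suspend until the login exchange completes
    co_await awaitable{run_query};   // suspend until the reply arrives
    // ...disconnect; each stage reads top to bottom, and any buffers or
    // counters would be ordinary locals scoped to the stage that uses them
}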

Just as lambdas in C++ save you from defining classes and functions when you want to capture the context, coroutines save you from defining a class and a relatively complex function or set of functions when you want to be able to suspend and resume the execution of a function.
But contrary to lambdas, to use and define coroutines you need a support library, and C++20 is missing that aspect in the standard library. The consequence is that most if not all explanations of C++ coroutines target the low-level interface and explain as much, if not more, how to build the support library as how to use it, giving the impression that the usage is more complex than it is. You get a "how to implement std::vector" kind of description when you want a "how to use std::vector".
To take the example of cppreference.com, coroutines allow you to write
Generator<uint64_t>
fibonacci_sequence(unsigned n)
{
    if (n == 0)
        co_return;
    if (n > 94)
        throw std::runtime_error("Too big Fibonacci sequence. Elements would overflow.");
    co_yield 0;
    if (n == 1)
        co_return;
    co_yield 1;
    if (n == 2)
        co_return;
    uint64_t a = 0;
    uint64_t b = 1;
    for (unsigned i = 2; i < n; i++)
    {
        uint64_t s = a + b;
        co_yield s;
        a = b;
        b = s;
    }
}
instead (I didn't pass that to a compiler, there must be errors in it) of
class FibonacciSequence {
public:
    FibonacciSequence(unsigned n);
    bool done() const;
    void next();
    uint64_t value() const;
private:
    unsigned n;
    unsigned state;
    unsigned i;
    uint64_t mValue;
    uint64_t a;
    uint64_t b;
    uint64_t s;
};

FibonacciSequence::FibonacciSequence(unsigned pN)
    : n(pN), state(1)
{}

bool FibonacciSequence::done() const
{
    return state == 0;
}

uint64_t FibonacciSequence::value() const
{
    return mValue;
}

void FibonacciSequence::next()
{
    for (;;) {
        switch (state) {
        case 0:
            return;
        case 1:
            if (n == 0) {
                state = 0;
                return;
            }
            if (n > 94)
                throw std::runtime_error("Too big Fibonacci sequence. Elements would overflow.");
            mValue = 0;
            state = 2;
            return;
        case 2:
            if (n == 1) {
                state = 0;
                return;
            }
            mValue = 1;
            state = 3;
            return;
        case 3:
            if (n == 2) {
                state = 0;
                return;
            }
            a = 0;
            b = 1;
            i = 2;
            state = 4;
            break;
        case 4:
            if (i < n) {
                s = a + b;
                mValue = s;
                state = 5;
                return;
            } else {
                state = 6;
            }
            break;
        case 5:
            a = b;
            b = s;
            i++;
            state = 4;
            break;
        case 6:
            state = 0;
            return;
        }
    }
}

FibonacciSequence fibonacci_sequence(unsigned n) {
    return FibonacciSequence(n);
}
Obviously something simpler could be used, but I wanted to show how the mapping could be done automatically, without any kind of optimization. And I've sidestepped the additional complexity of allocation and deallocation.
That transformation is useful for generators like this one. It is more generally useful when you want a kind of collaborative concurrency, with or without parallelism. Sadly, for such things you need even more library support (including a scheduler to choose the coroutine that will be executed next in a given context), and I've not seen relatively simple examples that show the underlying concepts without drowning in implementation details.
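For what it's worth, here is roughly the smallest cooperative scheduler I can think of (a sketch only: Scheduler, Task and worker are invented names, and a real scheduler would drive resumption from I/O readiness rather than plain round-robin):
#include <coroutine>
#include <deque>
#include <iostream>

struct Scheduler
{
    std::deque<std::coroutine_handle<>> ready;

    struct YieldAwaiter
    {
        Scheduler& sched;
        bool await_ready() const noexcept { return false; }
        // Park the coroutine at the back of the ready queue.
        void await_suspend(std::coroutine_handle<> h) { sched.ready.push_back(h); }
        void await_resume() const noexcept {}
    };

    YieldAwaiter yield() { return {*this}; }

    void run()
    {
        while (!ready.empty())
        {
            auto h = ready.front();
            ready.pop_front();
            h.resume(); // runs until the next co_await or completion
        }
    }
};

struct Task
{
    struct promise_type
    {
        Task get_return_object() { return {}; }
        std::suspend_never initial_suspend() noexcept { return {}; }
        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() {}
    };
};

Task worker(Scheduler& sched, const char* name)
{
    for (int i = 0; i < 3; ++i)
    {
        std::cout << name << " step " << i << '\n';
        co_await sched.yield(); // give another coroutine a turn
    }
}

int main()
{
    Scheduler sched;
    worker(sched, "A"); // runs eagerly until its first yield
    worker(sched, "B");
    sched.run();        // interleaves A and B cooperatively
}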

Related

How to compose asynchronous operations?

I'm looking for a way to compose asynchronous operations. The ultimate goal is to execute an asynchronous operation, and either have it run to completion, or return after a user-defined timeout.
For exemplary purposes, assume that I'm looking for a way to combine the following coroutines1:
IAsyncOperation<IBuffer> read(IBuffer buffer, uint32_t count)
{
    auto&& result{ co_await socket_.InputStream().ReadAsync(buffer, count, InputStreamOptions::None) };
    co_return result;
}
with socket_ being a StreamSocket instance.
And the timeout coroutine:
IAsyncAction timeout()
{
    co_await 5s;
}
I'm looking for a way to combine these coroutines in a way, that returns as soon as possible, either once the data has been read, or the timeout has expired.
These are the options I have evaluated so far:
C++20 coroutines: As far as I understand P1056R0, there is currently no library or language feature "to enable creation and composition of coroutines".
Windows Runtime supplied asynchronous task types, ultimately derived from IAsyncInfo: Again, I didn't find any facilities that would allow me to combine the tasks the way I need.
Concurrency Runtime: This looks promising, particularly the when_any function template looks to be exactly what I need.
From that it looks like I need to go with the Concurrency Runtime. However, I'm having a hard time bringing all the pieces together. I'm particularly confused about how to handle exceptions, and whether cancellation of the respective other concurrent task is required.
The question is two-fold:
Is the Concurrency Runtime the only option (UWP application)?
What would an implementation look like?
1 The methods are internal to the application. It is not required to have them return Windows Runtime compatible types.
I think the easiest would be to use the concurrency library. You need to modify your timeout to return the same type as the first method, even if it returns null.
(I realize this is only a partial answer...)
My C++ sucks, but I think this is close...
std::array<concurrency::task<IBuffer>, 2> tasks =
{
    concurrency::create_task([&] { return read(buffer, count).get(); }),
    concurrency::create_task([&] { return modifiedTimeout.get(); })
};

concurrency::when_any(std::begin(tasks), std::end(tasks))
    .then([](std::pair<IBuffer, size_t> result)
{
    // do something with result.first (whichever finished first)
});
As suggested by Lee McPherson in another answer, the Concurrency Runtime looks like a viable option. It provides tasks, that can be combined with others, chained up using continuations, as well as seamlessly integrate with the Windows Runtime asynchronous model (see Creating Asynchronous Operations in C++ for UWP Apps). As a bonus, including the <pplawait.h> header provides adapters for concurrency::task class template instantiations to be used as C++20 coroutine awaitables.
I wasn't able to answer all of the questions, but this is what I eventually came up with. For simplicity (and ease of verification) I'm using Sleep in place of the actual read operation, and return an int instead of an IBuffer.
Composition of tasks
The ConcRT provides several ways to combine tasks. Given the requirements concurrency::when_any can be used to create a task that returns, when any of the supplied tasks completes. When only 2 tasks are supplied as input, there's also a convenience operator (operator||) available.
Exception propagation
Exceptions raised from either of the input tasks do not count as a successful completion. When used with the when_any task, throwing an exception will not satisfy the wait condition. As a consequence, exceptions cannot be used to break out of combined tasks. To deal with this I opted to return a std::optional, and raise appropriate exceptions in a then continuation.
Task cancellation
This is still a mystery to me. It appears that once a task satisfies the wait condition of the when_any task, there is no requirement to cancel the respective other outstanding tasks. Once those complete (successfully or otherwise), they are silently dealt with.
Following is the code, using the simplifications mentioned earlier. It creates a task consisting of the actual workload and a timeout task, both returning a std::optional. The then continuation examines the return value, and throws an exception in case there isn't one (i.e. the timeout_task finished first).
#include <Windows.h>
#include <cstdint>
#include <iostream>
#include <optional>
#include <ppltasks.h>
#include <stdexcept>

using namespace concurrency;

task<int> read_with_timeout(uint32_t read_duration, uint32_t timeout)
{
    auto&& read_task
    {
        create_task([read_duration]
        {
            ::Sleep(read_duration);
            return std::optional<int>{42};
        })
    };

    auto&& timeout_task
    {
        create_task([timeout]
        {
            ::Sleep(timeout);
            return std::optional<int>{};
        })
    };

    auto&& task
    {
        (read_task || timeout_task)
        .then([](std::optional<int> result)
        {
            if (!result.has_value())
            {
                throw std::runtime_error("timeout");
            }
            return result.value();
        })
    };

    return task;
}
The following test code
int main()
{
    try
    {
        auto res1{ read_with_timeout(3000, 5000).get() };
        std::cout << "Succeeded. Result = " << res1 << std::endl;

        auto res2{ read_with_timeout(5000, 3000).get() };
        std::cout << "Succeeded. Result = " << res2 << std::endl;
    }
    catch (std::runtime_error const& e)
    {
        std::cout << "Failed. Exception = " << e.what() << std::endl;
    }
}
produces this output:
Succeeded. Result = 42
Failed. Exception = timeout

I don't understand how optimistic concurrency can be implemented in C++11

I'm trying to implement a protected variable that does not use locks in C++11. I have read a little about optimistic concurrency, but I can't understand how it can be implemented, either in C++ or in any other language.
The way I'm trying to implement the optimistic concurrency is by using a 'last modification id'. The process I'm following is:
Take a copy of the last modification id.
Modify the protected value.
Compare the local copy of the modification id with the current one.
If the above comparison is true, commit the changes.
The problem I see is that, after comparing the 'last modification ids' (local copy and current one) and before committing the changes, there is no way to ensure that no other threads have modified the value of the protected variable.
Below there is an example of the code. Let's suppose there are many threads executing this code and sharing the variable var.
/**
 * This struct is intended to implement a protected variable,
 * but using optimistic concurrency instead of locks.
 */
struct ProtectedVariable final {
    ProtectedVariable() : var(0), lastModificationId(0) { }

    int getValue() const {
        return var.load();
    }

    void setValue(int val) {
        // This method is not atomic: another thread could change the value
        // between the store and the increment of the 'last modification id'.
        var.store(val);
        lastModificationId.store(lastModificationId.load() + 1);
    }

    size_t getLastModificationId() const {
        return lastModificationId.load();
    }

private:
    std::atomic<int> var;
    std::atomic<size_t> lastModificationId;
};

ProtectedVariable var;

void writeToDatabase(int val); // assumed to exist elsewhere

/**
 * Suppose this method writes a value in some sort of database.
 */
int commitChanges(int val, size_t currModifId) {
    // Now, if nobody has changed the value of 'var', commit its value,
    // retry the transaction otherwise.
    if (var.getLastModificationId() == currModifId) {
        // Here is one of the problems. After comparing the two ids, another
        // thread could modify the value of 'var', hence I would be
        // performing the commit with a corrupted value.
        var.setValue(val);
        // Again, the same problem as above.
        writeToDatabase(val);
        // Return 'ok' in case everything has gone ok.
        return 0;
    } else {
        // If someone has changed the value of var while calculating and
        // committing it, return an error.
        return -1;
    }
}

/**
 * This method is intended to be atomic, but without using locks.
 */
void modifyVar() {
    // Get the modification id for checking whether or not some
    // thread has modified the value of 'var' before committing it.
    size_t currModifId = var.getLastModificationId();

    // Get a local copy of 'var'.
    int currVal = var.getValue();

    // Perform some operations based on the current value of 'var'.
    int newVal = currVal + 1 * 2 / 3;

    if (commitChanges(newVal, currModifId) != 0) {
        // If someone has changed the value of var while calculating and
        // committing it, retry the transaction.
        modifyVar();
    }
}
I know that the above code is buggy, but I don't understand how to implement something like the above in a correct way, without bugs.
Optimistic concurrency doesn't mean that you don't use the locks; it merely means that you don't keep the locks during most of the operation.
The idea is that you split your modification into three parts:
Initialization, like getting the lastModificationId. This part may need locks, but not necessarily.
Actual computation. All expensive or blocking code goes here (including any disk writes or network code). The results are written in such a way that they do not obscure the previous version. The likely way it works is by storing the new values next to the old ones, indexed by the not-yet-committed version.
Atomic commit. This part is locked, and must be short, simple, and non-blocking. The likely way it works is that it just bumps the version number - after confirming that there was no other version committed in the meantime. No database writes at this stage.
The main assumption here is that the computation part is much more expensive than the commit part. If your modification is trivial and the computation cheap, then you can just use a lock, which is much simpler.
Some example code structured into these 3 parts could look like this:
#include <atomic>
#include <mutex>

struct Data {
    ...
};

...

std::mutex lock;
// std::atomic rather than volatile: volatile provides no thread-safety guarantees.
std::atomic<const Data*> value;            // The protected data
std::atomic<int> current_value_version{0};

...

bool modifyProtectedValue() {
    // Initialize.
    int version_on_entry = current_value_version;

    // Compute the new value, using the current value.
    // We don't have any lock here, so it's fine to make heavy
    // computations or block on I/O.
    Data* new_value = new Data;
    compute_new_value(value.load(), new_value);

    // Commit or fail.
    bool success;
    lock.lock();
    if (current_value_version == version_on_entry) {
        value = new_value;
        current_value_version++;
        success = true;
    } else {
        success = false;
    }
    lock.unlock();

    // Roll back in case of failure.
    if (!success) {
        delete new_value;
    }

    // Inform caller about success or failure.
    return success;
}

// It's cleaner to keep retry logic separately.
bool retryModification(int retries = 5) {
    for (int i = 0; i < retries; ++i) {
        if (modifyProtectedValue()) {
            return true;
        }
    }
    return false;
}
This is a very basic approach, and especially the rollback is trivial. In a real-world example, re-creating the whole Data object (or its counterpart) would likely be infeasible, so the versioning would have to be done somewhere inside, and the rollback could be much more complex. But I hope it shows the general idea.
The key here is acquire-release semantics and test-and-increment. Acquire-release semantics are how you enforce an order of operations. Test-and-increment is how you choose which thread wins in case of a race.
Your problem therefore is the .store(lastModificationId + 1). You need .fetch_add(1). It returns the old value. If that's not the expected value (from before your read), then you lost the race and must retry.
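A sketch of that idea using compare_exchange_strong (the names follow the question, the rest is illustrative); note the value itself is still published after the version bump, so on its own this only claims the commit:
#include <atomic>
#include <cstddef>

std::atomic<int> var{0};
std::atomic<std::size_t> lastModificationId{0};

bool tryModify()
{
    std::size_t expected = lastModificationId.load(std::memory_order_acquire);
    int newVal = (var.load(std::memory_order_relaxed) + 1) * 2 / 3;

    // Succeeds only if no other thread bumped the id since our read;
    // on failure we lost the race and the caller retries.
    if (!lastModificationId.compare_exchange_strong(
            expected, expected + 1, std::memory_order_acq_rel))
        return false;

    var.store(newVal, std::memory_order_release);
    return true;
}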
If I understand your question, you mean to make sure var and lastModificationId are either both changed, or neither is.
Why not use std::atomic<T>, where T is a structure that holds both the int and the size_t?
struct VarWithModificationId {
    int var;
    size_t lastModificationId;
};

class ProtectedVariable {
private:
    std::atomic<VarWithModificationId> protectedVar;
    // Add your public setter/getter methods here.
    // You should be guaranteed that if two threads access protectedVar,
    // they'll each get a 'consistent' view of that variable, but the
    // setter will need to use a lock.
};
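If the whole update is a read-modify-write, the setter can also avoid a lock with a compare-and-swap loop over the combined struct (a sketch, with an increment chosen just for illustration; the struct is repeated so the snippet stands alone):
#include <atomic>
#include <cstddef>

struct VarWithModificationId {
    int var;
    std::size_t lastModificationId;
};

std::atomic<VarWithModificationId> protectedVar{{0, 0}};

void incrementVar()
{
    VarWithModificationId cur = protectedVar.load();
    VarWithModificationId next;
    do {
        next.var = cur.var + 1;
        next.lastModificationId = cur.lastModificationId + 1;
        // On failure 'cur' is reloaded with the current value and we
        // recompute, so both fields always change together.
    } while (!protectedVar.compare_exchange_weak(cur, next));
}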
Optimistic concurrency is used in database engines when it's expected that different users will rarely access the same data. It can go like this:
The first user reads the data and a timestamp. The user works with the data for some time, then checks whether the timestamp in the DB has changed since the read; if it hasn't, the user updates the data and the timestamp.
But internally the DB engine uses locks for the update anyway: during this lock it checks whether the timestamp has changed, and if it hasn't, it updates the data. The time for which the data is locked is just smaller than with pessimistic concurrency. So you still need to use some kind of locking.

Ensuring that only one instance of a function is running?

I'm just getting into concurrent programming. Most probably my issue is very common, but since I can't find a good name for it, I can't google it.
I have a C++ UWP application where I try to apply MVVM pattern, but I guess that the pattern or even being UWP is not relevant.
First, I have a service interface that exposes an operation:
struct IService
{
    virtual task<int> Operation() = 0;
};
Of course, I provide a concrete implementation, but it is not relevant for this discussion. The operation is potentially long-running: it makes an HTTP request.
Then I have a class that uses the service (again, irrelevant details omitted):
class ViewModel
{
    unique_ptr<IService> service;
public:
    task<void> Refresh();
};
I use coroutines:
task<void> ViewModel::Refresh()
{
    auto result = co_await service->Operation();
    // use result to update UI
}
The Refresh function is invoked on timer every minute, or in response to a user request. What I want is: if a Refresh operation is already in progress when a new one is started or requested, then abandon the second one and just wait for the first one to finish (or time out). In other words, I don't want to queue all the calls to Refresh - if a call is already in progress, I prefer to skip a call until the next timer tick.
My attempt (probably very naive) was:
mutex refresh;

task<void> ViewModel::Refresh()
{
    unique_lock<mutex> lock(refresh, try_to_lock);
    if (!lock)
    {
        // lock.release(); commented out as harmless but useless => irrelevant
        co_return;
    }
    auto result = co_await service->Operation();
    // use result to update UI
}
Edit after the original post: I commented out the line in the code snippet above, as it makes no difference. The issue is still the same.
But of course an assertion fails: unlock of unowned mutex. I guess the problem is the unlock of the mutex by the unique_lock destructor, which happens in the continuation of the coroutine, on a different thread than the one that originally locked it.
Using Visual C++ 2017.
Use std::atomic_bool:
std::atomic_bool isRunning{false};

task<void> ViewModel::Refresh()
{
    if (isRunning.exchange(true, std::memory_order_acq_rel) == false)
    {
        try
        {
            auto result = co_await service->Operation();
            isRunning.store(false, std::memory_order_release);
            // use result to update UI
        }
        catch (...)
        {
            isRunning.store(false, std::memory_order_release);
            throw;
        }
    }
    // else: another Refresh is already in progress; skip this call
}
Two possible improvements: wrap the isRunning.store in a RAII class, and use std::shared_ptr<std::atomic_bool> if the lifetime of the atomic_bool is scoped.
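The RAII wrapper could look roughly like this (a sketch against the question's ViewModel, assuming isRunning becomes a member; RunningGuard is a made-up name):
struct RunningGuard
{
    std::atomic_bool& flag;
    // Clears the flag on every exit path, including exceptions.
    ~RunningGuard() { flag.store(false, std::memory_order_release); }
};

task<void> ViewModel::Refresh()
{
    if (isRunning.exchange(true, std::memory_order_acq_rel))
        co_return; // a Refresh is already in flight; skip this one

    RunningGuard guard{isRunning};
    auto result = co_await service->Operation();
    // use result to update UI
}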

How can I do automata/state machine coding in C++?

I have used it in another programming language and it's very useful.
I cannot find anything about this for C++.
Let's for example take the following code:
#include <cstdio>
#include <cstdlib>

void change();

enum
{
    end = 0,
    gmx
};

int gExitType;

int main()
{
    gExitType = end;
    SetTimer(&change, 10000, 0); // SetTimer as in the original environment
    return 0;
}

void ApplicationExit()
{
    switch (gExitType)
    {
    case end:
        printf("This application was ended by the server");
        break;
    case gmx:
        printf("This application was ended by the timer");
        break;
    }
    ::exit(0);
}

void change()
{
    gExitType = gmx;
    ApplicationExit();
}
That's kind of how we would do it in C++, but when using state machine/automata I could do something like this in the other language:
void change();
int main()
{
state exitType:end;
SetTimer(&change, 10000, 0);
return 0;
}
void ApplicationExit() <exitType:end>
{
printf("This application was ended by the server");
}
void ApplicationExit() <exitType:gmx>
{
printf("This application ended by the timer");
}
void change()
{
state exitType:gmx;
ApplicationExit();
}
In my opinion this is a really elegant way to achieve things.
How would I do this in C++? This code doesn't seem to work (obviously, as I cannot find anything automata-related for C++).
To clarify my opinion:
So what are the advantages of using this technique? Well, as you can clearly see, the code is smaller; granted, I added an enum to the first version to make the examples more similar, but the ApplicationExit functions are definitely smaller. It's also a lot more explicit - you don't need large switch statements in functions to determine what's going on, and if you wanted, you could put the different ApplicationExits in different files to handle different sets of code independently. It also uses fewer global variables.
There are C++ libraries, like Boost.Statechart, that specifically try to provide rich support for encoding state machines:
http://www.boost.org/doc/libs/1_54_0/libs/statechart/doc/tutorial.html
Besides this, one very elegant way to encode certain types of state machines is by defining them as a coroutine:
http://c2.com/cgi/wiki?CoRoutine
http://eli.thegreenplace.net/2009/08/29/co-routines-as-an-alternative-to-state-machines/
Coroutines are not directly supported in C++, but there are two possible approaches for implementing them:
1) Using a technique similar to implementing Duff's device, explained in detail here:
http://blog.think-async.com/search/label/coroutines
This is very similar to how C#'s iterators work, for example, and one limitation is that yielding from the coroutine can be done only from the topmost function in the coroutine call-stack. OTOH, the advantage of this method is that very little memory is required for each instance of the coroutine.
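The trick in a nutshell (a minimal hand-written sketch, not the actual macros from that blog): store the resume point in a member and switch back into the middle of the loop:
// Yields 0, 1, 2, then -1 forever. The switch-into-the-loop is the
// Duff's-device part; 'line' records where to resume.
struct Counter
{
    int line = 0;
    int i = 0;

    int next()
    {
        switch (line)
        {
        case 0:
            for (i = 0; i < 3; ++i)
            {
                line = 1;
                return i; // "yield"
        case 1:;          // execution re-enters the loop here
            }
        }
        return -1;
    }
};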
2) Allocating a separate stack and register space for each coroutine.
This essentially makes the coroutine a full-blown thread of execution, with the only difference being that the user has full responsibility for the thread scheduling (also known as cooperative multi-tasking).
A portable implementation is available from boost:
http://www.boost.org/doc/libs/1_54_0/libs/coroutine/doc/html/coroutine/intro.html
For this particular example, you could use objects and polymorphism to represent the different states. For example:
#include <cstdio>
#include <cstdlib>

class StateObject
{
public:
    virtual void action(void) = 0;
};

class EndedBy : public StateObject
{
private:
    const char *const reason;
public:
    EndedBy( const char *const reason_ ) : reason( reason_ ) { }
    virtual void action(void)
    {
        puts(reason);
    }
};

EndedBy EndedByServer("This application was ended by the server");
EndedBy EndedByTimer ("This application ended by the timer");

StateObject *state = &EndedByServer;

void change()
{
    state = &EndedByTimer;
}

void ApplicationExit()
{
    state->action();
    ::exit(0);
}

int main()
{
    SetTimer(&change, 10000, 0); // SetTimer as in the question
    // whatever stuff here...
    // presumably eventually causes ApplicationExit() to get called before return 0;
    return 0;
}
That said, this isn't great design, and it isn't an FSM in the general sense. But, it would implement your immediate need.
You might look up the State Pattern (one reference: http://en.wikipedia.org/wiki/State_pattern ) for a more general treatment of this pattern.
The basic idea, though, is that each state is a subclass of some common "state" class, and you can use polymorphism to determine the different actions and behaviors represented by each state. A pointer to the common "state" base class then keeps track of the state you're currently in.
The state objects may be different types, or as in my example above, different instances of the same object configured differently, or a blend.
You can use Template value specialization over an int to achieve pretty much what you want.
(Sorry I'm at my tablet so I cannot provide an example, I will update on Sunday)
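Until then, here is my guess at what that technique looks like (a sketch; note the state is selected at compile time, which is the main difference from the runtime dispatch in the question):
#include <cstdio>

enum ExitType { end = 0, gmx };

template <ExitType>
void ApplicationExit();

template <>
void ApplicationExit<end>()
{
    std::printf("This application was ended by the server\n");
}

template <>
void ApplicationExit<gmx>()
{
    std::printf("This application was ended by the timer\n");
}

int main()
{
    ApplicationExit<gmx>(); // the state is chosen at compile time
}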

future composability, and boost::wait_for_all

I just read the article 'Futures Done Right', and the main thing C++11 promises seem to be lacking is the ability to create composite futures from existing ones.
I'm looking right now at the documentation of boost::wait_for_any, but consider the following example:
int calculate_the_answer_to_life_the_universe_and_everything()
{
    return 42;
}

int calculate_the_answer_to_death_and_anything_in_between()
{
    return 121;
}

boost::packaged_task<int> pt(calculate_the_answer_to_life_the_universe_and_everything);
boost::future<int> fi = pt.get_future();

boost::packaged_task<int> pt2(calculate_the_answer_to_death_and_anything_in_between);
boost::future<int> fi2 = pt2.get_future();

....

int calculate_the_oscillation_of_barzoom(boost::future<int>& a, boost::future<int>& b)
{
    boost::wait_for_all(a, b);
    return a.get() + b.get();
}

// futures are not copyable, so pass them to bind by reference
boost::packaged_task<int> pt_composite(boost::bind(calculate_the_oscillation_of_barzoom, boost::ref(fi), boost::ref(fi2)));
boost::future<int> fi_composite = pt_composite.get_future();
What is wrong with this approach to composability? Is this a valid way to achieve composability? Do we need some elegant syntactic edulcorant over this pattern?
when_any and when_all are perfectly valid ways to compose futures. They both correspond to parallel composition, where the composite operation waits for either one or all the composed operations.
We also need sequential composition (which is not in Boost.Thread). This could be, for example, a future<T>::then function that allows you to queue up an operation that uses the future's value and runs when the future is ready. It is possible to implement this yourself, but with an efficiency tradeoff. Herb Sutter talks about this in his recent Channel9 video.
N3428 is a draft proposal for adding these features (and more) to the C++ standard library. They are all library features and don't add any new syntax to the language. Additionally, N3328 is a proposal to add syntax for resumable functions (like using async/await in C#) which will use future<T>::then internally.
Points for the use of the word edulcorant. :)
The problem with your sample code is that you package everything up into tasks, but you never schedule those tasks for execution!
int calculate_the_answer_to_life() { ... }
int calculate_the_answer_to_death() { ... }

std::packaged_task<int()> pt(calculate_the_answer_to_life);
std::future<int> fi = pt.get_future();

std::packaged_task<int()> pt2(calculate_the_answer_to_death);
std::future<int> fi2 = pt2.get_future();

int calculate_barzoom(std::future<int>& a, std::future<int>& b)
{
    boost::wait_for_all(a, b);
    return a.get() + b.get();
}

std::packaged_task<int()> pt_composite([]{ return calculate_barzoom(fi, fi2); });
std::future<int> fi_composite = pt_composite.get_future();
If at this point I write
pt_composite();
int result = fi_composite.get();
my program will block forever. It will never complete, because pt_composite is blocked on calculate_barzoom, which is blocked on wait_for_all, which is blocked on both fi and fi2, neither of which will ever complete until somebody executes pt or pt2 respectively. And nobody will ever execute them, because my program is blocked!
You probably meant me to write something like this:
std::async(std::launch::async, std::move(pt));
std::async(std::launch::async, std::move(pt2));
std::async(std::launch::async, std::move(pt_composite));
int result = fi_composite.get();
This will work. But it's extremely inefficient — we spawn three worker threads (via three calls to async), in order to perform two threads' worth of work. That third thread — the one running pt_composite — will be spawned immediately, and then just sit there asleep until pt and pt2 have finished running. That's better than spinning, but it's significantly worse than not existing: it means that our thread pool has one fewer worker than it ought to have. In a plausible thread-pool implementation with only one thread per CPU core, and a lot of tasks coming in all the time, that means that we've got one CPU core just sitting idle, because the worker thread who was meant to be running on that core is currently blocked inside wait_for_all.
What we want to do is state our intentions declaratively:
int calculate_the_answer_to_life() { ... }
int calculate_the_answer_to_death() { ... }

std::future<int> fi = std::async(calculate_the_answer_to_life);
std::future<int> fi2 = std::async(calculate_the_answer_to_death);

// when_all/then as proposed for the standard library (N3428); not yet in C++.
std::future<int> fi_composite = std::when_all(fi, fi2).then([](auto a, auto b) {
    assert(a.is_ready() && b.is_ready());
    return a.get() + b.get();
});

int result = fi_composite.get();
and then have the library and the scheduler work together to Do The Right Thing: don't spawn any worker thread that can't immediately proceed with its task. If the end-user has to write even a single line of code that explicitly sleeps, waits, or blocks, some performance is definitely being lost.
In other words: Spawn no worker thread before its time.
Obviously it's possible to do all this in standard C++, without library support; that's how the library itself is implemented! But it's a huge pain to implement from scratch, with many subtle pitfalls, which is why it's a good thing that library support seems to be coming soon.
The ISO proposal N3428 mentioned in Roshan Shariff's answer has been updated as N3857, and N3865 provides even more convenience functions.