Rationale behind the resumable functions restrictions - c++

In the paper about resumable functions, in the section about restrictions three restrictions are listed:
Resumable functions cannot use a variable number of arguments. For situations where varargs are necessary, the argument unwrapping may be placed in a function that calls a resumable function after doing the unwrapping of arguments.
The return type of a resumable function must be future<T> or shared_future<T>. The restrictions on T are defined by std::future, not this proposal, but T must be a copyable or movable type, or ‘void’. It must also be possible to construct a variable of T without an argument; that is, it has to have an accessible (implicit or explicit) default constructor if it is of a class type.
Await expressions may not appear within the body of an exception handler and should not be executed while a lock of any kind is being held by the executing thread.
There must be a reason behind these restrictions, but due to my lack of knowledge about concurrency I cannot deduce what the reasons are. Could someone enlighten me on this topic?
Variable number of arguments
Does this restriction refer to C-style variadic functions or to C++11 variadic templates?
If it refers to the C-style ones, is the limitation related to the trickery done by the va_* macros?
If it refers to variadic function templates, I assume the limitation must be related to the unpacking of the parameter pack, not to the fact that the function is a template (there's no wording about template resumable functions, so I assume that they're legal).
In both cases, I thought that the compiler could be smart enough to deduce which function to use.
Default constructible and copyable/movable type
I understand the reason behind returning a std::future or std::shared_future, and I'm guessing that the limitation on the usable types is related to the types that futures can hold.
So, the paper proposes extending the language with two new keywords (resumable and await) that provide the behaviour of resumable functions, but in the end it relies on existing constructs to transfer the return values of resumable functions between the function and the caller.
Why not propose some kind of language extension for the return values too? That could (maybe) lift the restriction to default constructible and copyable/movable types and fix the asymmetry between the return type and the returned type:
It should thus be noted that there is an asymmetry between the function’s observed behavior from the outside (caller) and the inside: the outside perspective is that function returns a value of type future<T> at the first suspension point, while the inside perspective is that the function returns a value of type T via a return statement (...)
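The asymmetry in the quote can be seen with today's library facilities. This is only a rough analogy: std::async stands in for the proposed resumable/await machinery, and make_greeting / make_greeting_async are made-up names.

```cpp
#include <future>
#include <string>

// Inside perspective: the body returns a plain std::string via `return`.
std::string make_greeting() {
    return "hello";
}

// Outside perspective: the caller receives std::future<std::string>,
// not std::string, and has to call get() to obtain the value.
std::future<std::string> make_greeting_async() {
    return std::async(std::launch::async, make_greeting);
}
```

The caller writes make_greeting_async().get() to reach the std::string that the body returned directly.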
No awaitable exception handlers nor awaitable locked threads
I guess that awaiting something while catching an exception makes no sense, but I don't have any clue about the limitation on locked threads.

I'm pretty sure all of these restrictions are related to the fact that the context of a resumable function has to be "saved" and "resumed" - I expect the mechanism will create a temporary "copy of the stack" or something similar. So:
Variable number of arguments
This really means things using va_arg functionality - not because of the macros involved, but because it's impossible for an outside agent to know the number of actual arguments - for printf, you'd have to read the format string, for some others the last one is marked with NULL. So how much context needs to be saved away?
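The workaround the paper mentions (unwrap the arguments in an ordinary function, then call the resumable one) might look roughly like this. It is a sketch under assumptions: sum_async is a made-up stand-in for a resumable function, implemented with std::async because the proposed keywords are not available.

```cpp
#include <cassert>
#include <cstdarg>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a resumable function: fixed parameter list,
// std::future return type. Everything it needs is in `values`, so the
// amount of context to save is known.
std::future<int> sum_async(std::vector<int> values) {
    return std::async(std::launch::async, [v = std::move(values)] {
        return std::accumulate(v.begin(), v.end(), 0);
    });
}

// Ordinary C-style variadic wrapper: it unpacks the arguments itself and
// then forwards them, so nothing variadic has to survive a suspension.
std::future<int> sum_varargs(int count, ...) {
    assert(count >= 0);
    std::va_list args;
    va_start(args, count);
    std::vector<int> values;
    for (int i = 0; i < count; ++i) {
        values.push_back(va_arg(args, int));
    }
    va_end(args);
    return sum_async(std::move(values));
}
```

Only the wrapper knows how to interpret the variadic arguments (here, via the leading count), which matches the point above about an outside agent not knowing how many there are.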
Locked threads
So we have just said "we don't want this thread to be interrupted", and then we go and say "Now let's run something else". That's like saying "Please, under no circumstances shall I be interrupted while I take my bath" while stepping into the bath and at the same time saying "Can you phone me in 2 minutes..." - assuming the bath takes more than 2 minutes, one of those will be untrue.
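One way to stay within that restriction is to do the waiting first and take the lock only for the short update afterwards. A minimal sketch, where future::get() stands in for an await expression and counter_mutex, counter and add_when_ready are names invented for the example:

```cpp
#include <future>
#include <mutex>

std::mutex counter_mutex;  // hypothetical shared state for the sketch
int counter = 0;

// Waits for the value first, then locks only around the update, so no
// lock is held while this function is blocked (or would be suspended).
int add_when_ready(std::future<int> pending) {
    const int value = pending.get();  // stand-in for `await pending`
    std::lock_guard<std::mutex> lock(counter_mutex);
    counter += value;
    return counter;
}
```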
Default constructible
I'm pretty sure the logic here is that it would make the whole concept quite complex if the return value to be constructed has to have arguments passed to the constructor. How would you even describe such a thing? And you would also have to "hold on to" those arguments while in suspended state. Again, making the saving of the context more complex.

Related

Why are copy-capturing lambdas not default DefaultConstructible in c++20

C++20 introduces DefaultConstructible lambdas. However, cppreference.com states that this is only for stateless lambdas:
If no captures are specified, the closure type has a defaulted default constructor. Otherwise, it has no default constructor (this includes the case when there is a capture-default, even if it does not actually capture anything).
Why does this not extend to lambdas that capture things that are DefaultConstructible? For instance, why can [p{std::make_unique<int>(0)}](){ return p.get(); } not be DefaultConstructible, where the captured p would be nullptr?
Edit: For those asking why we would want this, the behavior only seems natural because one is forced to write something like this when calling standard algorithms that require functors to be default-constructible:
struct S {
    S() = default;
    int* operator()() const { return p.get(); }
    std::unique_ptr<int> p;
};
So, we can pass in S{std::make_unique<int>(0)}, which does the same thing.
It seems like it would be much better to be able to write [p{std::make_unique<int>(0)}](){ return p.get(); } versus creating a struct that does the same thing.
There are two reasons not to do it: conceptual and safety.
Despite the desires of some C++ programmers, lambdas are not meant to be a short syntax for a struct with an overloaded operator(). That is what C++ lambdas are made of, but that's not what lambdas are.
Conceptually, a C++ lambda is supposed to be a C++ approximation of a lambda function. The capture functionality is not meant to be a way to write members of a struct; it's supposed to mimic the proper lexical scoping capabilities of lambdas. That's why they exist.
Creating such a lambda (initially, not by copy/move of an existing one) outside of the lexical scope that it was defined within is conceptually vacuous. It doesn't make sense to write a thing bound to a lexical scope, then create it outside of the scope it was built for.
That's also why you cannot access those members outside of the lambda. Because, even though they could be public members, they exist to implement proper lexical scoping. They're implementation details.
To construct a "lambda" that "captures variables" without actually capturing anything only makes sense from a meta-programming perspective. That is, it only makes sense when focusing on what lambdas happen to be made of, rather than what they are. A lambda is implemented as a C++ struct with captures as members, and the capture expressions don't even technically have to name local variables, so those members could theoretically be value initialized.
If you are unconvinced by the conceptual argument, let's talk safety. What you want to do is declare that any lambda shall be default constructible if all of its captures are non-reference captures and are of default constructible types. This invites disaster. Why?
Because the writer of many such lambdas didn't ask for that. If a lambda captures a unique_ptr<T> by moving from a variable that points to an object, it is 100% valid (under the current rules) for the code inside that lambda to assume that the captured value points to an object. Default construction, while syntactically valid, is semantically nonsense in this case.
With a proper named type, a user can easily control if it is default constructible or not. And therefore, if it doesn't make sense to default construct a particular type, they can forbid it. With lambdas, there is no such syntax; you have to impose an answer on everyone. And the safest answer for capturing lambdas, the one that is guaranteed to never break code, is "no."
By contrast, default construction of captureless lambdas can never be incorrect. Such functions are "pure" (with respect to the contents of the functor, since the functor has no contents). This also matches with the above conceptual argument: a captureless lambda has no proper lexical scope and therefore spawning it anywhere, even outside of its original scope, is fine.
If you want the behavior of a named struct... just make a named struct. You don't even need to default the default constructor; you'll get one by default (if you declare no other constructors).
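To make the contrast concrete, here is a small sketch of the control a named type gives you. NonNullGetter and MaybeNullGetter are made-up names; the type-trait checks only restate the rules discussed above.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Named functor that opts out of default construction, because its call
// operator assumes the pointer is never null.
struct NonNullGetter {
    NonNullGetter() = delete;
    explicit NonNullGetter(std::unique_ptr<int> q) : p(std::move(q)) {}
    int operator()() const { return *p; }
    std::unique_ptr<int> p;
};

// Named functor that opts in: a null pointer is a valid state here, and
// the implicit default constructor is enough.
struct MaybeNullGetter {
    int* operator()() const { return p.get(); }
    std::unique_ptr<int> p;
};

static_assert(!std::is_default_constructible<NonNullGetter>::value, "");
static_assert(std::is_default_constructible<MaybeNullGetter>::value, "");

// A capturing lambda offers no such choice: its closure type is never
// default constructible, whatever the captured types are.
inline bool capturing_lambda_is_default_constructible() {
    auto capturing = [p = std::make_unique<int>(0)] { return p.get(); };
    return std::is_default_constructible<decltype(capturing)>::value;
}
```

The function returns false in every standard from C++14 on. Only captureless closure types became default constructible in C++20.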

Is it OK for a class constructor to block forever?

Let's say I have an object that provides some sort of functionality in an infinite loop.
Is it acceptable to just put the infinite loop in the constructor?
Example:
class Server {
public:
    Server() {
        for (;;) {
            //...
        }
    }
};
Or is there an inherent initialization problem in C++ if the constructor never completes?
(The idea is that to run a server you just say Server server;, possibly in a thread somewhere...)
It's not wrong per the standard; it's just bad design.
Constructors don't usually block. Their purpose is to take a raw chunk of memory, and transform it into a valid C++ object. Destructors do the opposite: they take valid C++ objects and turn them back into raw chunks of memory.
If your constructor blocks forever (emphasis on forever), it does something different than just turn a chunk of memory into an object.
It's ok to block for a short time (a mutex is a perfect example of it), if this serves the construction of the object.
In your case, it looks like your constructor is accepting and serving clients. This is not turning memory into objects.
I suggest you split the constructor into a "real" constructor that builds a server object and another start method that serves clients (by starting an event loop).
P.S.: In some cases you have to execute the functionality/logic of the object separately from the constructor, for example if your class inherits from std::enable_shared_from_this.
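The enable_shared_from_this case can be sketched as follows. shared_from_this() is only usable once a shared_ptr owns the object, so it cannot be called from the constructor. ManagedServer, create() and start() are names invented for the example.

```cpp
#include <memory>

class ManagedServer : public std::enable_shared_from_this<ManagedServer> {
public:
    // Two-phase creation: build the object first, then start it once a
    // shared_ptr owns it.
    static std::shared_ptr<ManagedServer> create() {
        std::shared_ptr<ManagedServer> server(new ManagedServer());
        server->start();
        return server;
    }

    bool started() const { return !self_.expired(); }

private:
    ManagedServer() = default;  // cheap: only builds the object

    // Needs shared_from_this(), so it cannot be part of the constructor.
    // A real server might hand this pointer to callbacks or worker threads.
    void start() { self_ = shared_from_this(); }

    std::weak_ptr<ManagedServer> self_;
};
```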
It's allowed. But like any other infinite loop, it must have observable side effects, otherwise you get undefined behavior.
Calling the networking functions counts as "observable side effects", so you're safe. This rule only bans loops that either do literally nothing, or just shuffle data around without interacting with the outside world.
It's legal, but it's a good idea to avoid it.
The main issue is that you should avoid surprising users. It's unusual to have a constructor that never returns because it isn't logical. Why would you construct something you can never use? As such, while the pattern may work, it is unlikely to be an expected behavior.
A secondary issue is that it limits how your Server class can be used. The construction and destruction processes of C++ are fundamental to the language, so hijacking them can be tricky. For example, one might want to have a Server that is the member of a class, but now that overarching class' constructor will block... even if that isn't intuitive. It also makes it very difficult to put these objects into containers, as this can involve allocating many objects.
The closest thing I can think of to what you are doing is std::thread. std::thread does not block forever, but it does have a constructor that does a surprisingly large amount of work. But if you look at std::thread, you realize that when it comes to multithreading, being surprised is the norm, so people have less trouble with such choices. (I am not personally aware of the reasons for starting the thread upon construction, but there are so many corner cases in multithreading that I would not be surprised if it resolves some of them.)
A user might expect to set up your Server object in the main thread. Then call the server.endless_loop() function within a worker thread.
In an actual server, the process of acquiring a port requires escalated privileges which can then be dropped. Or perhaps you have an object that needs to load settings. Those sorts of tasks could take place in the main thread before the long-term looping takes place elsewhere.
Personally, I'd prefer your object had a "poll" function that was fast and non blocking. You could then have a loop function that called poll and sleep in an endless loop. You might even have an atomic variable that you can set to exit the loop from a different thread. Another feature would be to launch an internal thread within the Server object.
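A minimal sketch of that layout, assuming a made-up PollingServer class whose poll() merely counts calls instead of doing real work:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

class PollingServer {
public:
    PollingServer() = default;  // returns immediately, no looping here

    // One fast, non-blocking unit of work. Here it only counts calls.
    void poll() { ++polls_; }

    // Blocking loop, meant to run in a worker thread until stop() is called.
    void run() {
        while (!stop_requested_.load()) {
            poll();
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    // Safe to call from another thread.
    void stop() { stop_requested_.store(true); }

    int polls() const { return polls_.load(); }

private:
    std::atomic<bool> stop_requested_{false};
    std::atomic<int> polls_{0};
};
```

A caller would construct the server in the main thread, start run() in a std::thread, and later call stop() followed by join().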
As others have pointed out, there's nothing "wrong" with this as far as C++ semantics are concerned, but it's poor design. The point of a constructor is to construct an object, so if that task never completes then it will be surprising to users.
Others have made suggestions regarding splitting the construction & run steps into constructor and method, which makes sense if you have other things you might want to do with the Server besides run it, or if you actually might want to construct it, do other stuff, and then run it.
But if you expect the caller will always just do Server server; server.run(), then maybe you don't even need a class -- it could just be a stand-alone function run_server(). If you don't have state to encapsulate and pass around in the first place, then you don't necessarily need objects. A stand-alone function can even be marked [[noreturn]] to make it clear both to the user and the compiler that the function never returns.
It's hard to say which makes more sense without knowing more about your use case. But in short: constructors construct objects -- if you're doing something else, don't use them for that.
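The stand-alone function variant could be sketched like this. run_server and handle_next_request are hypothetical names, and the handler is a placeholder for whatever accepts and serves one client.

```cpp
#include <functional>

// Never returns normally. The only ways out are an exception thrown by
// the handler or termination of the process.
[[noreturn]] void run_server(const std::function<void()>& handle_next_request) {
    for (;;) {
        handle_next_request();
    }
}
```

Leaving a [[noreturn]] function through an exception is allowed; only a normal return is not.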
In most cases, your code has no problem. Because of the following rule:
A class is considered a completely-defined object type ([basic.types]) (or complete type) at the closing } of the class-specifier. Within the class member-specification, the class is regarded as complete within function bodies, default arguments, noexcept-specifiers, and default member initializers (including such things in nested classes). Otherwise it is regarded as incomplete within its own class member-specification.
However, one restriction on your code is that you cannot access the object through a glvalue that is not obtained from the this pointer, because the resulting value is unspecified. This is governed by the following rule:
During the construction of an object, if the value of the object or any of its subobjects is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor's this pointer, the value of the object or subobject thus obtained is unspecified.
Moreover, you cannot use the utility shared_ptr to manage objects of such a class. In general, placing an infinite loop in a constructor is not a good idea. Many restrictions will apply to the object when you use it.

Can C++ const method calls be reordered by optimizer?

This isocpp.org FAQ on constness says:
(...) let’s be precise about whether a method changes the object’s logical state. If you are outside the class — you are a normal user, every experiment you could perform (every method or sequence of methods you call) would have the same results (same return values, same exceptions or lack of exceptions) irrespective of whether you first called that lookup method. If the lookup function changed any future behavior of any future method (not just making it faster but changed the outcome, changed the return value, changed the exception), then the lookup method changed the object’s logical state — it is a mutator. But if the lookup method changed nothing other than perhaps making some things faster, then it is an inspector.
This seems to imply that the compiler is free to assume calls to inspectors can be freely reordered if they don't include writes to volatiles/calls to IO library functions (or maybe even cached and omitted). Is that the case?
If the answer is no, is there any situation in which constness will allow for some optimization by allowing the compiler to assume something it cannot prove? (it seems proving an object could be const is trivial, but maybe it gets complicated when passing references?)
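For reference, an inspector of the kind the FAQ describes might look like the sketch below. Directory is a made-up class, and the single-entry cache only illustrates the "makes some things faster" part. Note that the const method still writes to (mutable) members, so const on its own does not mean the call is free of writes.

```cpp
#include <map>
#include <string>

class Directory {
public:
    // Mutator: changes what later lookups return.
    void add(const std::string& name, int number) {
        entries_[name] = number;
        cache_valid_ = false;
    }

    // Inspector: logically const, yet it updates a mutable cache.
    // Returns -1 when the name is unknown.
    int lookup(const std::string& name) const {
        if (!(cache_valid_ && cached_name_ == name)) {
            auto it = entries_.find(name);
            cached_number_ = (it == entries_.end()) ? -1 : it->second;
            cached_name_ = name;
            cache_valid_ = true;
        }
        return cached_number_;
    }

private:
    std::map<std::string, int> entries_;
    mutable bool cache_valid_ = false;
    mutable std::string cached_name_;
    mutable int cached_number_ = -1;
};
```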

Is there a reason some functions don't take a void*?

Many functions accept a function pointer as an argument. atexit and call_once are excellent examples. If these higher level functions accepted a void* argument, such as atexit(&myFunction, &argumentForMyFunction), then I could easily wrap any functor I pleased by passing a function pointer and a block of data to provide statefulness.
As is, there are many cases where I wish I could register a callback with arguments, but the registration function does not allow me to pass any arguments through. atexit only accepts one argument: a function taking 0 arguments. I cannot register a function to clean up after my object, I must register a function which cleans up after all objects of a class, and force my class to maintain a list of all objects needing cleanup.
I always viewed this as an oversight; there seemed to be no valid reason why you wouldn't allow a measly 4 or 8 byte pointer to be passed along, unless you were on an extremely limited microcontroller. I always assumed they simply didn't realize how important that extra argument could be until it was too late to redefine the spec. In the case of call_once, the POSIX version accepts no arguments, but the C++11 version accepts a functor (which is virtually equivalent to passing a function and an argument, only the compiler does some of the work for you).
Is there any reason why one would choose not to allow that extra argument? Is there an advantage to accepting only "void functions with 0 arguments"?
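For comparison, the C++11 std::call_once mentioned above forwards extra arguments to the callable. In this small sketch, init_flag, config_path, load_config and ensure_initialized are names invented for the example:

```cpp
#include <mutex>
#include <string>

std::once_flag init_flag;
std::string config_path;  // hypothetical state set by the one-time call

void load_config(const std::string& path) {
    config_path = path;
}

// The argument is passed through std::call_once to load_config. Only the
// first call has any effect.
void ensure_initialized(const std::string& path) {
    std::call_once(init_flag, load_config, path);
}
```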
I think atexit is just a special case, because whatever function you pass to it is supposed to be called only once. Therefore whatever state it needs to do its job can just be kept in global variables. If atexit were being designed today, it would probably take a void* in order to enable you to avoid using global variables, but that wouldn't actually give it any new functionality; it would just make the code slightly cleaner in some cases.
For many APIs, though, callbacks are allowed to take additional arguments, and not allowing them to do so would be a severe design flaw. For example, pthread_create does let you pass a void*, which makes sense because otherwise you'd need a separate function for each thread, and it would be totally impossible to write a program that spawns a variable number of threads.
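The pass-through pointer pattern that pthread_create uses can be sketched in a few lines. Registration and invoke are a hypothetical API made up for illustration, not part of any real library.

```cpp
// Hypothetical C-style callback API that carries a user-supplied pointer.
using Callback = void (*)(void* context);

struct Registration {
    Callback fn;
    void* context;
};

// The API never looks inside `context`; it just hands it back.
void invoke(const Registration& r) {
    r.fn(r.context);
}

// One plain function can serve any number of objects, because the
// per-object state arrives through the void*.
struct Counter {
    int value = 0;
};

void increment(void* context) {
    static_cast<Counter*>(context)->value++;
}
```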
Quite a number of the interfaces taking function pointers lacking a pass-through argument are simply coming from a different time. However, their signatures can't be changed without breaking existing code. It is sort of a misdesign but that's easy to say in hindsight. The overall programming style has moved on to have limited uses of functional programming within generally non-functional programming languages. Also, at the time many of these interfaces were created, storing any extra data even on "normal" computers implied an observable extra cost: aside from the extra storage used, the extra argument also needs to be passed even when it isn't used. Sure, atexit() is hardly bound to be a performance bottleneck seeing that it is called just once, but if you'd pass an extra pointer everywhere you'd surely also have one in qsort()'s comparison function.
Specifically for something like atexit(), it is reasonably straightforward to use a custom global object with which function objects to be invoked upon exit are registered: just register a function with atexit() that calls all of the functions registered with said global object. Also note that atexit() is only guaranteed to register up to 32 functions, although implementations may support more. It seems ill-advised to use it as a registry for per-object clean-up functions, rather than for a single function that calls the object clean-up functions, as other libraries may need to register functions, too.
That said, I can't imagine why atexit() is particularly useful in C++, where objects are automatically destroyed upon program termination anyway. Of course, this approach assumes that all objects are somehow held, but that's normally necessary anyway in some form or other and is typically done using appropriate RAII objects.
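The global registry described above could be sketched like this. exit_handlers, run_exit_handlers and register_exit_handler are made-up names; only std::atexit is a real library call.

```cpp
#include <cstdlib>
#include <functional>
#include <utility>
#include <vector>

// Global list of stateful clean-up callbacks. It is a function-local
// static so that it exists before the atexit handler is registered.
std::vector<std::function<void()>>& exit_handlers() {
    static std::vector<std::function<void()>> handlers;
    return handlers;
}

// The single plain function handed to std::atexit. It runs the callbacks
// in reverse registration order and empties the list, so calling it a
// second time does nothing.
void run_exit_handlers() {
    auto& handlers = exit_handlers();
    while (!handlers.empty()) {
        auto handler = std::move(handlers.back());
        handlers.pop_back();
        handler();
    }
}

// Registers any callable, including one with captured state. Only one
// atexit slot is used no matter how many handlers are added.
void register_exit_handler(std::function<void()> handler) {
    auto& handlers = exit_handlers();
    static const bool registered = (std::atexit(run_exit_handlers) == 0);
    (void)registered;
    handlers.push_back(std::move(handler));
}
```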

Difference between C++0x lambdas and operator(), closure and functor

I'm confident I get the general gist of the constructs, but I can't see the purpose of them in c++. I have read the previous posts on the topic here on SO and elsewhere, but I fail to see why they should be a new language feature.
The things I would like answered are as follows:
What is the difference between a lambda and a template argument accepting a function/functor.
Is a closure just a functor with some set object state (scope?)?
What is the "killer app" for these constructs? or perhaps the typical use case?
Lambdas are really just syntactic sugar for a functor. You could do it all yourself: defining a new class, making member variables to hold the captured values and references, hooking them up in the constructor, writing operator()(), and finally creating an instance and passing it. Or you could use a lambda that's 1/10 as much code and works the same.
Lambdas which don't capture can be converted to function pointers. All lambdas can be converted to std::function, or get their own unique type which works well in templated algorithms accepting a functor.
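A side-by-side sketch of the points above, with made-up names (AddN, make_add_n, make_add_n_erased, make_doubler):

```cpp
#include <functional>

// Hand-written functor: the captured value is a member set in the
// constructor, and operator() does the work.
class AddN {
public:
    explicit AddN(int n) : n_(n) {}
    int operator()(int x) const { return x + n_; }

private:
    int n_;
};

// The lambda version: the compiler generates a comparable closure type
// with its own unique, unnamed type.
auto make_add_n(int n) {
    return [n](int x) { return x + n; };
}

// Any lambda with a matching signature can be stored in std::function.
std::function<int(int)> make_add_n_erased(int n) {
    return make_add_n(n);
}

// A lambda that captures nothing converts to a plain function pointer.
using IntFn = int (*)(int);
IntFn make_doubler() {
    return [](int x) { return x * 2; };
}
```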
Ok, you're actually asking a bunch of different questions, possibly because you are not fully familiar with terminology. I'll try to answer all.
What's the difference between a lambda and "operator()"? - Let's reword this to, "What's the difference between a lambda and object with operator()?"
Basically, nothing. The main difference is that a lambda expression creates a functional object while an object with an operator() IS a functional object. The end result is similar enough to consider the same though, an entity that can be invoked with the (params) syntax.
What's the difference between a closure and a functor? This is also rather confused. Please review these links:
http://en.wikipedia.org/wiki/Closure_(computer_programming)
http://en.wikipedia.org/wiki/Closure_(computer_programming)#C.2B.2B
So, as you can see, a closure is a kind of "functor" that is defined within a scope such that it absorbs the variables available to it within that scope. In other words, it's a function that is built on the fly, during the operation of the program, and that building process is parameterized by the runtime values of the scope containing it. So, in C++ closures are lambdas that use values within the function building the lambda.
What is the difference between a lambda and a template argument accepting a function/functor? - This is again confused. The difference is that they are nothing alike, really. A template "argument" accepting a function/functor is already confused wording, so I'll assume by "argument" you mean "function", because arguments don't accept anything. In this case, first, although a lambda can accept a functor as an argument, it can't be templated. Second, generally the lambda is the one being passed as an argument to a function accepting a functor argument.
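A small sketch of a closure absorbing variables from its enclosing scope, one by value and one by reference (count_above is a made-up function):

```cpp
#include <vector>

int count_above(const std::vector<int>& values, int threshold) {
    int hits = 0;
    // `threshold` is copied into the closure when it is created;
    // `hits` is captured by reference, so the closure updates the local.
    auto check = [threshold, &hits](int v) {
        if (v > threshold) {
            ++hits;
        }
    };
    for (int v : values) {
        check(v);
    }
    return hits;
}
```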
Is a closure just a functor with some set object state (scope?)?
As you can see from the links above, no. In fact, a closure doesn't even have state, really. A closure is built based UPON the state of some other entity that built it. Within the resulting functor, though, this isn't state; it's the very construction of the object.
What is the "killer app" for these constructs? or perhaps the typical use case?
I'll reword that to, "Why are these things useful?"
Well, in general the ability to treat any object as a function if it has operator() is extremely useful for a whole array of things. For one, it allows us to extend the behavior of any stdlib algorithm by using either objects or free functions. It's impossible to enumerate the vast number of uses this has.
More specifically speaking of lambda expressions, they simply make this process yet easier. The limitations imposed by object definitions made the process of using stdlib algorithms slightly inefficient in some cases (from a development use perspective, not program efficiency). For one thing, at the time at least, any object passed as a parameter to a template had to be externally defined. I believe that's also changing, but still...having to create entire objects just to perform basic things, that only get used in one place, is inconvenient. Lambda expressions allow that definition to be quite easy, within the scope of the place it's being used, etc, etc...
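As a typical use case, here is a comparator written in place for a standard algorithm, with no separate functor class needed (sorted_by_length is a made-up helper):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts words by length. The ordering rule lives right next to the call
// that uses it.
std::vector<std::string> sorted_by_length(std::vector<std::string> words) {
    std::stable_sort(words.begin(), words.end(),
                     [](const std::string& a, const std::string& b) {
                         return a.size() < b.size();
                     });
    return words;
}
```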