c++ constructors and concurrency

I've been thinking about writing a container class to control access to a complex data structure that will have use in a multi-threaded environment.
And then the question occurred to me:
Is there ever a situation where c++ constructors must be thread-safe?

In general, a constructor cannot be called for the same object by two threads simultaneously. However, the same constructor can certainly be called for different objects at the same time.

Certainly you can invoke the same constructor from more than one thread at once. In that sense, constructors must be thread-safe, just as any other function must be. If the constructor is going to modify shared state (for example, your container), then you must use synchronization to ensure that the state is modified in a deterministic way.
You can't construct the same object on more than one thread at once, because each object is only constructed once, so there's no way to invoke the constructor on the same object more than once, much less on two different threads at the same time.
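As a minimal sketch of that point, assume a hypothetical Registry that every Widget constructor writes to (both names are invented for illustration). The member being initialized needs no lock, because only the constructing thread can see it, while the shared registry does its own locking:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state that every Widget constructor modifies.
class Registry {
public:
    void add(int id) {
        std::lock_guard<std::mutex> lock(mutex_);  // serialise concurrent constructors
        ids_.push_back(id);
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ids_.size();
    }
private:
    mutable std::mutex mutex_;
    std::vector<int> ids_;
};

class Widget {
public:
    Widget(Registry& registry, int id) : id_(id) {
        // id_ belongs to the object under construction: no lock needed.
        // The registry is shared between threads: it locks internally.
        registry.add(id_);
    }
private:
    int id_;
};

// Constructs `count` distinct Widgets, each on its own thread, and
// returns how many registrations the shared registry recorded.
std::size_t construct_concurrently(int count) {
    assert(count >= 0);
    Registry registry;
    std::vector<std::thread> threads;
    for (int i = 0; i < count; ++i) {
        threads.emplace_back([&registry, i] { Widget w(registry, i); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return registry.size();
}
```

Calling construct_concurrently(8) runs the same constructor on eight threads at once, each for a different object, which is the situation described above.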

Not in my experience. It is the code that calls the constructor, implicitly or otherwise, which needs to be made thread-safe should the application require it.
The rationale is that only one thread should be initializing an object at a time, so no synchronization is necessary to protect the object being initialized within the constructor itself (if the object hasn't finished initialization, it shouldn't be shared between threads anyway).
Another way to look at it is this: Objects are to be treated as logically nonexistent until their constructors have returned. So, a thread that is in the process of creating an object is the only thread that "knows" about it.
Of course, proper synchronization rules apply to any shared resource the constructor itself accesses, but that applies to any function (I've encountered people that fail to realize this, believing constructors are special and somehow provide exclusive access to all resources).

Related

Is it OK for a class constructor to block forever?

Let's say I have an object that provides some sort of functionality in an infinite loop.
Is it acceptable to just put the infinite loop in the constructor?
Example:
class Server {
public:
    Server() {
        for(;;) {
            //...
        }
    }
};
Or is there an inherent initialization problem in C++ if the constructor never completes?
(The idea is that to run a server you just say Server server;, possibly in a thread somewhere...)

It's not wrong per standard, it's just a bad design.
Constructors don't usually block. Their purpose is to take a raw chunk of memory, and transform it into a valid C++ object. Destructors do the opposite: they take valid C++ objects and turn them back into raw chunks of memory.
If your constructor blocks forever (emphasis on forever), it does something different than just turn a chunk of memory into an object.
It's ok to block for a short time (a mutex is a perfect example of it), if this serves the construction of the object.
In your case, it looks like your constructor is accepting and serving clients. This is not turning memory into objects.
I suggest you split the constructor into a "real" constructor that builds a server object and another start method that serves clients (by starting an event loop).
PS: In some cases you have to execute the functionality/logic of the object separately from the constructor, for example if your class inherits from std::enable_shared_from_this.
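The std::enable_shared_from_this case can be sketched as follows. Session, create() and start() are hypothetical names; the point is only that shared_from_this() is usable once a std::shared_ptr owns the object, which cannot be the case while the constructor is still running:

```cpp
#include <cassert>
#include <memory>

class Session : public std::enable_shared_from_this<Session> {
public:
    // Factory: construction finishes first, then start() runs on an
    // object that is already owned by a shared_ptr.
    static std::shared_ptr<Session> create() {
        std::shared_ptr<Session> session(new Session());
        session->start();
        return session;
    }

    bool started() const { return started_; }

private:
    Session() = default;  // deliberately does not call shared_from_this()

    void start() {
        // Safe here: a shared_ptr already owns *this.
        std::shared_ptr<Session> self = shared_from_this();
        assert(self.get() == this);
        started_ = true;
    }

    bool started_ = false;
};
```

The private constructor forces callers through create(), so the "construct, then start" order cannot be skipped by accident.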

It's allowed. But like any other infinite loop, it must have observable side effects, otherwise you get undefined behavior.
Calling the networking functions counts as "observable side effects", so you're safe. This rule only bans loops that either do literally nothing, or just shuffle data around without interacting with the outside world.

It's legal, but it's a good idea to avoid it.
The main issue is that you should avoid surprising users. It's unusual to have a constructor that never returns because it isn't logical. Why would you construct something you can never use? As such, while the pattern may work, it is unlikely to be an expected behavior.
A secondary issue is that it limits how your Server class can be used. The construction and destruction processes of C++ are fundamental to the language, so hijacking them can be tricky. For example, one might want to have a Server that is the member of a class, but now that overarching class' constructor will block... even if that isn't intuitive. It also makes it very difficult to put these objects into containers, as this can involve allocating many objects.
The closest thing I can think of to what you are doing is std::thread. A thread's constructor does not block forever, but it does a surprisingly large amount of work. But if you look at std::thread, you realize that when it comes to multithreading, being surprised is the norm, so people have less trouble with such choices. (I am not personally aware of the reasons for starting the thread upon construction, but there are so many corner cases in multithreading that I would not be surprised if it resolves some of them.)

A user might expect to set up your Server object in the main thread. Then call the server.endless_loop() function within a worker thread.
In an actual server, the process of acquiring a port requires escalated privileges which can then be dropped. Or perhaps you have an object that needs to load settings. Those sort of tasks could take place in the main thread before the long term looping takes place elsewhere.
Personally, I'd prefer your object had a "poll" function that was fast and non blocking. You could then have a loop function that called poll and sleep in an endless loop. You might even have an atomic variable that you can set to exit the loop from a different thread. Another feature would be to launch an internal thread within the Server object.
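A rough sketch of that layout, with invented member names (poll, run, stop) and a counter standing in for real network work, might look like this:

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class Server {
public:
    Server() = default;  // only builds the object; never blocks

    // One fast, non-blocking unit of work. A real server would service
    // its sockets here; this sketch just counts calls.
    void poll() { polls_.fetch_add(1); }

    // Blocks until stop() is called from another thread.
    void run() {
        while (!stop_requested_.load()) {
            poll();
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    void stop() { stop_requested_.store(true); }
    long polls() const { return polls_.load(); }

private:
    std::atomic<bool> stop_requested_{false};
    std::atomic<long> polls_{0};
};

// Set up on the calling thread, loop on a worker thread, then stop the
// loop from the calling thread. Returns the number of polls performed.
long run_briefly() {
    Server server;
    std::thread worker([&server] { server.run(); });
    while (server.polls() == 0) {
        std::this_thread::yield();  // wait until the loop is really running
    }
    server.stop();
    worker.join();
    assert(server.polls() > 0);
    return server.polls();
}
```

Because construction and looping are separate, privileged setup or settings loading can happen in the main thread before run() is handed to a worker.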

As others have pointed out, there's nothing "wrong" with this as far as C++ semantics are concerned, but it's poor design. The point of a constructor is to construct an object, so if that task never completes then it will be surprising to users.
Others have made suggestions regarding splitting the construction & run steps into constructor and method, which makes sense if you have other things you might want to do with the Server besides run it, or if you actually might want to construct it, do other stuff, and then run it.
But if you expect the caller will always just do Server server; server.run(), then maybe you don't even need a class -- it could just be a stand-alone function run_server(). If you don't have state to encapsulate and pass around in the first place, then you don't necessarily need objects. A stand-alone function can even be marked [[noreturn]] to make it clear both to the user and the compiler that the function never returns.
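As an illustration of the stand-alone variant, here is a sketch in which run_server takes a hypothetical per-request callback. Note that [[noreturn]] only promises that the function never returns normally; leaving through an exception is still permitted, which is how this sketch shuts down:

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical entry point: loops forever, never returns normally.
[[noreturn]] void run_server(const std::function<void()>& handle_one_request) {
    assert(handle_one_request);  // an empty callback would be a caller bug
    for (;;) {
        handle_one_request();
    }
}

// Serves requests until the callback signals shutdown by throwing.
// Returns the number of requests handled before that.
int serve_until_failure(int limit) {
    int served = 0;
    try {
        run_server([&served, limit] {
            if (served == limit) {
                throw std::runtime_error("shutdown");
            }
            ++served;
        });
    } catch (const std::runtime_error&) {
        return served;
    }
}
```

There is no object to construct, store, or destroy here, so none of the constructor-related surprises apply.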
It's hard to say which makes more sense without knowing more about your use case. But in short: constructors construct objects -- if you're doing something else, don't use them for that.

In most cases, your code has no problem. Because of the following rule:
A class is considered a completely-defined object type ([basic.types]) (or complete type) at the closing } of the class-specifier. Within the class member-specification, the class is regarded as complete within function bodies, default arguments, noexcept-specifiers, and default member initializers (including such things in nested classes). Otherwise it is regarded as incomplete within its own class member-specification.
However, one restriction applies to your code: while the object is under construction, you cannot access it through a glvalue that is not obtained from the this pointer, because the resulting value is unspecified. This is governed by the following rule:
During the construction of an object, if the value of the object or any of its subobjects is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor's this pointer, the value of the object or subobject thus obtained is unspecified.
Moreover, you cannot use shared_ptr to manage objects of such a class. In general, placing an infinite loop in a constructor is not a good idea: many restrictions apply to the object when you use it.

When should I mark the copy/move constructor as deleted in C++? What aspects should be considered?

When should I mark the copy/move constructor as deleted in C++? What aspects should be considered? Is there any potential problem I should be aware of when the copy/move constructor has neither been deleted nor replaced by a user-defined one? I would appreciate a few simple examples, including ones from libc++ or libstdc++.
I have thought and thought about it, but still don't fully understand it. I would be very grateful to have some help with this question.

A type should be non-copyable if copying would be incompatible with the designed intent of the type or if copying would represent an expensive operation or otherwise impact the performance of the type. The obvious case of the former is unique_ptr: an object which uniquely owns a specific resource. If you could copy the unique_ptr, then you would have two objects that own that resource. That would violate the intent of uniquely owning the resource, so it makes no sense.
An example of an expensive operation is copying a std::mutex. It isn't fundamentally unreasonable to want to be able to copy a mutex. However, doing so would on many implementations require that the mutex heap-allocate (and use shared references to) whatever internal OS mutex data structure is used to implement the mutex. That's needlessly expensive; a user who doesn't need to copy the mutex is paying for the ability to do so. Therefore, if a user wants to be able to "copy a mutex", they can heap-allocate a std::mutex and stick it in a shared_ptr.
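Both cases can be sketched with two hypothetical types: UniqueBuffer, which deletes copying because it uniquely owns its storage (in the spirit of unique_ptr), and SharedLock, which opts into a shareable mutex by holding it in a shared_ptr:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <type_traits>

// Uniquely owns a heap array: copying would create two owners, so it is
// deleted. Moving transfers ownership and is allowed.
class UniqueBuffer {
public:
    explicit UniqueBuffer(std::size_t size) : data_(new int[size]()), size_(size) {}
    ~UniqueBuffer() { delete[] data_; }

    UniqueBuffer(const UniqueBuffer&) = delete;
    UniqueBuffer& operator=(const UniqueBuffer&) = delete;

    UniqueBuffer(UniqueBuffer&& other) noexcept
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }
    UniqueBuffer& operator=(UniqueBuffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = other.data_;
            size_ = other.size_;
            other.data_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }

    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

static_assert(!std::is_copy_constructible<UniqueBuffer>::value, "copying is deleted");
static_assert(std::is_move_constructible<UniqueBuffer>::value, "moving is allowed");
static_assert(!std::is_copy_constructible<std::mutex>::value, "std::mutex is not copyable");

// Copyable wrapper: every copy refers to the same heap-allocated mutex.
struct SharedLock {
    std::shared_ptr<std::mutex> mutex = std::make_shared<std::mutex>();
};

void locked_increment(SharedLock& lock, int& counter) {
    assert(lock.mutex);  // a moved-from SharedLock would hold no mutex
    std::lock_guard<std::mutex> guard(*lock.mutex);
    ++counter;
}
```

Only users of SharedLock pay for the heap allocation and reference counting; plain std::mutex users do not.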

How bad is not freeing up memory right before the end of program?

As an example, let's talk about a singleton implementation using new (the one where you create the actual instance at the first call to the getInstance() method instead of using a static field). It dawned on me that it never frees that memory. But then again, it would have to do so right before the application closes, so the system will free that memory anyway.
Aside from bad design, what practical downsides does this approach have?
Edit: Ad comments - all valid points, thanks guys. So let me ask this instead - for a single thread app and POD singleton class are there any practical downsides? Just theoretically, I'm not going to actually do that.
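For reference, the pattern being asked about can be sketched like this. Config is a hypothetical POD type, and the code assumes a single-threaded program, as the edit above stipulates:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical POD payload.
struct Config {
    int verbosity;
    int retries;
};
static_assert(std::is_trivially_destructible<Config>::value,
              "nothing observable is lost if the destructor never runs");

// Lazily allocates the instance on first use and never deletes it.
// Not thread-safe: single-threaded use is assumed.
Config& getInstance() {
    static Config* instance = nullptr;
    if (instance == nullptr) {
        instance = new Config{1, 3};
    }
    assert(instance != nullptr);
    return *instance;
}
```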

for a single thread app and POD singleton class are there any practical downsides? Just theoretically, I'm not going to actually do that.
in standardese
[c++14-Object lifetime-4]For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
where the dtor is implicitly called on automatic/static variables and such.
So (assuming a new-expression was used to construct the object), the runtime implementation of the invoked allocation function is free to release the memory and let the object decay into oblivion, as long as no observable effects depend on its destruction (which is trivially true for types with trivial dtors).

Use a Schwarz counter (also known as a nifty counter) for all singleton types. It's how std::cout is implemented.
Benefits:
thread safe
correct initialisation order guaranteed when singletons depend on each other
correct destruction order at program termination
does not use the heap
100% compliant with c++98, c++03, c++11, c++14, c++17...
no need for an ugly getInstance() function. Just use the global object.
https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Nifty_Counter
As for the headline question:
How bad is not freeing up memory right before the end of program?
Not freeing memory is not so bad when the program is running under an OS which manages process memory.
However, other things that singletons do could include flushing IO buffers (e.g. std::cout, std::cerr). That's probably something you want to avoid losing.

Is the default copy constructor thread-safe in c++?

class CSample {
    int a;
    // ..... lots of fields
};
CSample c;
As we know, CSample has a default copy constructor. When I do this:
CSample d = c;
the default copy constructor runs. My question is: is it thread-safe? Someone might modify c in another thread while the copy constructor is running. If it is thread-safe, how does the compiler do it? And if not, I think it's horrible that the compiler can't guarantee that the copy constructor is thread-safe.

Nothing in C++ is thread-safe¹ unless explicitly noted.
If you need to read object c while it may be modified in another thread, you are responsible for locking it. That is a general rule and there is no reason why reading it for purpose of creating a copy should be an exception.
Note, that the copy being created does not need to be locked, because no other thread knows about it yet. Only the source needs to be.
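One way to follow that rule is to give the class its own mutex and lock the source inside a user-written copy constructor. The sketch below uses a hypothetical Sample class with two fields that writers always set together, so a torn copy would be visible as a mismatch:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

class Sample {
public:
    Sample(int a, int b) : a_(a), b_(b) {}

    // Copying reads `other`, so `other` is locked. The object being
    // created is not yet visible to any other thread and needs no lock.
    Sample(const Sample& other) {
        std::lock_guard<std::mutex> lock(other.mutex_);
        a_ = other.a_;
        b_ = other.b_;
    }

    void set(int a, int b) {
        std::lock_guard<std::mutex> lock(mutex_);
        a_ = a;
        b_ = b;
    }

    bool consistent() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return a_ == b_;
    }

private:
    mutable std::mutex mutex_;
    int a_ = 0;
    int b_ = 0;
};

// One thread keeps writing matching pairs while the caller keeps copying.
// Returns true if every copy saw a matching pair.
bool copies_are_consistent(int rounds) {
    assert(rounds > 0);
    Sample shared(0, 0);
    std::thread writer([&shared, rounds] {
        for (int i = 1; i <= rounds; ++i) {
            shared.set(i, i);
        }
    });
    bool ok = true;
    for (int i = 0; i < rounds; ++i) {
        Sample copy = shared;  // locks `shared`, not `copy`
        ok = ok && copy.consistent();
    }
    writer.join();
    return ok;
}
```

Whether the lock belongs inside the class or in the calling code is a design choice; locking in the caller is often preferable when several objects must be read in one consistent snapshot.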
The compiler does not guarantee anything to be thread-safe on its own, because 99.9% of things don't need to be thread-safe. Most things only need to be reentrant³. So in the rare case you actually need to make something thread-safe, you have to use locks (std::mutex) or atomic types (std::atomic<int>).
You can also simply make your objects constant and then you can read them without locking, because nothing is writing them after creation. Code using constant objects is both more easily parallelised and more easily understood in general, because there are fewer things with state you have to track.
Note that on the most common architecture the mov instruction with int operands happens to be thread-safe. On other CPU types even that might not be true. And because the compiler is allowed to preload values, integer assignment in C++ is not anyway.
¹A set of operations is considered thread-safe if calling them concurrently on the same object is well defined². In C++, calling any modifying operation and any other operation concurrently on the same object is a data race, which is UndefinedBehaviour™.
²It is important to note, that if an object is "thread-safe", it does not really help you much most of the time anyway. Because if an object guarantees that when it's concurrently written you'll always read the new or the old value (C++ allows that when an int c is being changed from 0 to 1000 by one thread, another thread may read, say, 232), most of the time that won't help you, because you need to read multiple values in a consistent state, for which you have to lock over them yourself anyway.
³Reentrant means that the same operation may be called on different objects at the same time. There are a few functions in the standard C library that are not reentrant, because they use global (static) buffers or other state. Most have reentrant variants (with an _r suffix, usually) and the standard C++ library uses these, so the C++ part is generally reentrant.

The general rule in the standard is simple: if an object (and
sub-objects are objects) is accessed by more than one thread,
and is modified by any thread, then all accesses must be
synchronized. There are numerous reasons for this, but the most
basic one is that protecting at the lowest level is usually the
wrong level of granularity; adding synchronization primitives
would only make the code run significantly slower, without any
real advantage for the user, even in a multithreaded
environment. Even if the copy constructor were "thread-safe",
unless the object is somehow totally independent of all other
context, you'll probably need some sort of synchronization
primitives at a higher level.
And with regards to "thread-safety": the usual meaning among
experienced practitioners is that the object/class/whatever
specifies exactly how much protection it guarantees, precisely
because low-level definitions such as the one you (and many, many
others) seem to use are useless. Synchronizing each function in
a class is generally useless. (Java made the experiment, and
then backed off, because the guarantees they made in the initial
versions of their containers turned out to be expensive and
worthless.)

Assuming that d or c are accessed concurrently on multiple threads, this is not thread-safe. This would amount to a data-race which is undefined behavior.
CSample d = c;
is just as unsafe as
int d = c;
is.

Is there an issue with this singleton implementation?

I've typically gotten used to implementing a singleton pattern in this manner because it's so easy:
class MyClass
{
public:
    static MyClass* GetInstance()
    {
        static MyClass instance;
        return &instance;
    }
private:
    //Disallow copy construction, copy assignment, and external
    //default construction.
};
This seems significantly easier than creating a static instance pointer, initializing it in the source file, and using dynamic memory allocation in the instance function with guards.
Is there a downside that I'm not seeing? It looks thread-safe to me because I would think the first thread to get to the first line would cause the instantiation - and it seems nice and concise. I figure there has to be a problem I'm not seeing since this is not common though - I'd like to get some feedback before I keep using it

This is not an inherently thread-safe solution: while the instance is being constructed, another thread can preempt and try to get the instance, resulting in either a second instance or the use of an unconstructed instance.
Several compilers handle this by adding a guard (in gcc, I think there is a flag to disable this), because there is no way to guard this with a user-defined mutex.

The downside is that you have no control over exactly when the object is destroyed. This will be a problem if other static objects try to access it from their destructors.
A C++11 compliant compiler must implement this in a thread-safe way; however, older compilers might not. If you're in doubt, and you don't especially want lazy initialisation, you could force the object to be created by calling the accessor before starting any threads.
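The C++11 guarantee can be checked with a small sketch. This version returns a reference rather than a pointer, and the construction counter is added purely for the demonstration:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class MyClass {
public:
    static MyClass& GetInstance() {
        // Since C++11, initialisation of a function-local static happens
        // exactly once, even if several threads arrive here concurrently.
        static MyClass instance;
        return instance;
    }

    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;

    static int constructions() { return constructions_.load(); }

private:
    MyClass() { constructions_.fetch_add(1); }
    static std::atomic<int> constructions_;  // demonstration only
};

std::atomic<int> MyClass::constructions_{0};

// Lets several threads race to the first GetInstance() call and returns
// how many times the constructor has run in total.
int race_to_first_use(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([] { MyClass::GetInstance(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return MyClass::constructions();
}
```

On a pre-C++11 compiler without such a guarantee, calling MyClass::GetInstance() once before any thread is started gives the eager creation described above.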

There are two issues in the interface:
You should return a reference
You should make either the destructor or the delete operator private
Also, there is a slight risk of attempting to use this class after it has been destructed.
Regarding your multi-threading concerns (and initialization, I guess): it's fine in C++11 and has been fine for a long time on good C++ compilers anyway.

Aside from in a multi-threaded scenario - no. Well - let me qualify this, construction is lazy (so the first call may have a hit) and destruction - well, there is no guarantee (aside from it will be - at some point)

Generally speaking, before C++11 the static qualifier on a local variable in a function did not guarantee that the variable is created only once. If the function is called by different threads at the same time, the variable could be constructed more than once. This should not be confused with a static data member of a class, which is created once before the program starts. With older compilers, thread safety of local static variables depends on the particular C++ implementation. Useful link: Are function static variables thread-safe in GCC?
Hope it helps.
