Is std::string thread-safe with gcc 4.3? - c++

I'm developing a multithreaded program running on Linux (compiled with G++ 4.3) and if you search around for a bit you find a lot of scary stories about std::string not being thread-safe with GCC. This is supposedly due to the fact that internally it uses copy-on-write which wreaks havoc with tools like Helgrind.
I've made a small program that copies one string to another string and if you inspect both strings they both share the same internal _M_p pointer. When one string is modified the pointer changes so the copy-on-write stuff is working fine.
What I'm worried about though is what happens if I share a string between two threads (for instance passing it as an object in a thread-safe data queue between two threads). I've already tried compiling with the '-pthread' option but that does not seem to make much difference. So my question:
Is there any way to force std::string to be thread-safe? I would not mind if the copy-on-write behaviour was disabled to achieve this.
How have other people solved this? Or am I being paranoid?
I can't seem to find a definitive answer so I hope you guys can help me.
Edit:
Wow, that's a whole lot of answers in such a short time. Thank you! I will definitely use Jack's solution when I want to disable COW. But now the main question becomes: do I really have to disable COW? Or is the 'bookkeeping' done for COW thread safe? I'm currently browsing the libstdc++ sources but that's going to take quite some time to figure out...
Edit 2
OK, I browsed the libstdc++ source code and I found something like this in libstdc++-v3/include/bits/basic_string.h:
      _CharT*
      _M_refcopy() throw()
      {
#ifndef _GLIBCXX_FULLY_DYNAMIC_STRING
        if (__builtin_expect(this != &_S_empty_rep(), false))
#endif
          __gnu_cxx::__atomic_add_dispatch(&this->_M_refcount, 1);
        return _M_refdata();
      }   // XXX MT
So there is definitely something there about atomic changes to the reference counter...
Conclusion
I'm marking sellibitze's comment as answer here because I think we've reached the conclusion that this area is still unresolved for now. To circumvent the COW behaviour I'd suggest Jack Lloyd's answer. Thank you everybody for an interesting discussion!

Threads are not yet part of the standard. But I don't think that any vendor can get away without making std::string thread-safe, nowadays. Note: There are different definitions of "thread-safe" and mine might differ from yours. Of course, it makes little sense to protect a container like std::vector for concurrent access by default even when you don't need it. That would go against the "don't pay for things you don't use" spirit of C++. The user should always be responsible for synchronization if he/she wants to share objects among different threads. The issue here is whether a library component uses and shares some hidden data structures that might lead to data races even if "functions are applied on different objects" from a user's perspective.
The C++0x draft (N2960) contains the section "data race avoidance" which basically says that library components may access shared data that is hidden from the user if and only if it actively avoids possible data races. It sounds like a copy-on-write implementation of std::basic_string must be as safe w.r.t. multi-threading as another implementation where internal data is never shared among different string instances.
I'm not 100% sure about whether libstdc++ takes care of it already. I think it does. To be sure, check out the documentation.

If you don't mind disabling copy-on-write, this may be the best course of action. std::string's COW only works if it knows that it is copying another std::string, so you can cause it to always allocate a new block of memory and make an actual copy. For instance this code:
#include <string>
#include <cstdio>

int main()
{
    std::string orig = "I'm the original!";
    std::string copy_cow = orig;
    std::string copy_mem = orig.c_str();
    std::printf("%p %p %p\n",
                static_cast<const void*>(orig.data()),
                static_cast<const void*>(copy_cow.data()),
                static_cast<const void*>(copy_mem.data()));
}
will show that the second copy (using c_str) prevents COW. (Because the std::string only sees a bare const char*, and has no idea where it came from or what its lifetime might be, so it has to make a new private copy).

This section of the libstdc++ internals states:
The C++ library string functionality requires a couple of atomic operations to provide thread-safety. If you don't take any special action, the library will use stub versions of these functions that are not thread-safe. They will work fine, unless your applications are multi-threaded.
The reference counting should work in a multi-threaded environment. (unless your system doesn't provide the necessary atomics)

No STL container is thread safe. This way, the library has a general purpose (both to be used in single threading mode, or multi threading mode). In multithreading, you'll need to add the synchronization mechanism.

It seems that this was fixed a while ago: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5444 was closed as a duplicate of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5432, which was fixed in 3.1.
See also http://gcc.gnu.org/bugzilla/show_bug.cgi?id=6227

According to this bug issue, std::basic_string's copy-on-write implementation still isn't fully thread-safe. <ext/vstring.h> is an implementation without COW and seems to do much better in a read-only context.

Related

C++ std features and Binary size

I was told recently in a job interview that their project aims to build the smallest possible binary for their application (it runs embedded), so I would not be able to use things such as templates or smart pointers, as these would increase the binary size. They generally seemed to imply that using things from std would be a no-go (though not in all cases).
After the interview, I tried to research online which features from the standard library cause large binary sizes, and I could find basically nothing on this. Is there a way to quantify the size impact of using certain features (without, say, having to write 100 smart pointers in a code base versus self-managed pointers)?
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I’ve both worked for, and interviewed at companies that enforced these kinds of restrictions, I’ve even voluntarily enforced them myself. It really comes down to this. When you’re running an engineering organization with more than one person with plans to keep hiring, it is wildly impractical to assume everyone on your team is going to fully understand the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing “bad things” without knowing they’re doing “bad things”.
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, then the templated function itself may become bigger in code size simply because function calls take several instructions and remove optimization potential.
Another aspect where they can at least not hurt is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extent.
This bare-bones vector type shows what I mean:
class VectorBase
{
protected:
    void **start, **end, **capacity;

    void push_back(void*);
    void* at(std::size_t i);
    void clear(void (*cleanup_function)(void*));
};

template<class T>
class Vector: public VectorBase
{
public:
    void push_back(T* value)
    { this->VectorBase::push_back(value); }

    T* at(std::size_t i)
    { return static_cast<T*>(this->VectorBase::at(i)); }

    ~Vector()
    { clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type-safety and to provide necessary indirections without emitting any code that wouldn't have been here anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with those is that the programmer is better at reasoning about code and avoiding dead code. For example even after a unique_ptr has been moved away, the destructor of the pointer still has to emit code. A programmer knows that the value is NULL, the compiler often doesn't.
Another issue comes up with calling conventions. Objects with destructors are usually passed on the stack, even if you declare them pass-by-value. Same for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz) simply because pointers have to be put on and off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means if a function accepts a complex object like a smart pointer by value, a call to the destructor for that parameter is replicated at every call site, instead of putting it once inside the function. Especially with unique_ptr this is so stupid because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting, weak pointers, what indirection is used for destruction? Do you really need two raw pointers per shared pointer or can the reference counter be accessed through shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and track the code size as a metric.
For example I once added a small feature, I don't remember which, and in its error handling it used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't the same as for example Rust's where objects can be moved with a simple memcpy and then "forgetting" their original location. In C++ the destructor for a moved object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
Qt for example accounts for such simple cases in its meta type system.

Should a pointer to stack variable be volatile?

I know that I should use the volatile keyword to tell the compiler not to optimize memory read/write to variables. I also know that in most cases it should only be used to talk to non-C++ memory.
However, I would like to know if I have to use volatile when holding a pointer to some local (stack) variable.
For example:
// global or member variable
/* volatile? */ bool* p_stop;

void worker()
{
    /* volatile? */ bool stop = false;
    p_stop = &stop;
    while (!stop)
    {
        // Do some work
        // No usage of "stop" or "p_stop" here
    }
}

void stop_worker()
{
    *p_stop = true;
}
It looks to me like a compiler with some optimization level might see that stop is a local variable that is never changed, and could replace while(!stop) with while(true), and thus changing *p_stop would do nothing.
So, is it required to mark the pointer as volatile in such a case?
P.S: Please do not lecture me on why not to use this, the real code that uses this hack does so for a (complex-to-explain) reason.
EDIT:
I failed to mention that these two functions run on different threads.
The worker() is a function of the first thread, and it should be stopped from another thread using the p_stop pointer.
I am not interested in knowing what better ways there are to solve the real reason that is behind this sort of hack. I simply want to know if this is defined/undefined behavior in C++ (11, for that matter), and also if this is compiler/platform/etc. dependent. So far I see @Puppy saying that everyone is wrong and that this is wrong, but without referencing a specific part of the standard that says so.
I understand that some of you are offended by the "don't lecture me" part, but please stick to the real question - Should I use volatile or not? or is this UB? and if you can please help me (and others) learn something new by providing a complete answer.
I simply want to know if this is defined/undefined behavior in C++ (11, for that matter)
Ta-da (from N3337, "quasi C++11")
Two expression evaluations conflict if one of them modifies a memory location [..] and the other one accesses or modifies the same memory location.
§1.10/4
and:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. [..]
§1.10/21
You're accessing the (memory location of) object stop from different threads, both accesses are not atomic, thus also in no "happens before" relation. Simply put, you have a data race and thus undefined behavior.
I am not interested in knowing what better ways there are to solve the real reason that is behind this sort of hack.
Atomic operations (as defined by the C++ standard) are the only way to (reliably) solve this.
So, is it required to mark the pointer as volatile in such a case?
No. It's not required, principally because volatile doesn't even remotely cover what you need it to do in this case. You must use an actual synchronization primitive, like an atomic operation or mutex. Using volatile here is undefined behaviour and your program will explode.
volatile is NOT useful for concurrency. It may be useful for implementing concurrent primitives but it is far from sufficient.
Frankly, whether or not you want to use actual synchronization primitives is irrelevant. If you want to write correct code, you have no choice.
P.S: Please do not lecture me on why not to use this,
I am not sure what we are supposed to say. The compiler manages the stack, so anything you are doing with it is technically undefined behavior and may not work when you upgrade to the next version of the compiler.
You are also making assumptions that may be different than the compiler's assumptions when it optimizes. This is the real reason to use (or not use) volatile; you give guidance to the compiler that helps it decide whether optimizations are safe. The use of volatile tells the compiler that it should assume that these variables may change due to external influences (other threads or special hardware behavior).
So yes, in this case, it looks like you would need to mark both p_stop and stop with a volatile qualifier.
(Note: this is necessary but not sufficient, as it does not cause the appropriate behaviors to happen in a language implementation with a relaxed memory model that requires barriers to ensure correctness. See https://en.wikipedia.org/wiki/Memory_ordering#Runtime_memory_ordering )
This question simply cannot be answered from the details provided.
As is stated in the question this is an entirely unsupported way of communicating between threads.
So the only answer is:
Specify the compiler versions you're using and hope someone knows its darkest secrets or refer to your documentation. All the C++ standard will tell you is this won't work and all anyone can tell you is "might work but don't".
There isn't a "oh, come on guys everyone knows it pretty much works what do I do as the workaround? wink wink" answer.
Unless your compiler doesn't support atomics or suitably concurrent mechanisms there is no justifiable reason for doing this.
"It's not supported" isn't "complex-to-explain" so I'd be fascinated based on that code fragment to understand what possible reason there is for not doing this properly (other than ancient compiler).

Passing the results of `std::string::c_str()` to `mkdtemp()` using `const_cast<char*>()`

OK, so: we all know that generally the use of const_cast<>() anywhere is so bad it’s practically a programming war crime. So this is a hypothetical question about how bad it might be, exactly, in a specific case.
To wit: I ran across some code that did something like this:
std::string temporary = "/tmp/directory-XXXXXX";
const char* dtemp = ::mkdtemp(const_cast<char*>(temporary.c_str()));
/// `temporary` is unused hereafter
… now, I have run across numerous descriptions about how to get writeable access to the underlying buffer of a std::string instance (q.v. https://stackoverflow.com/a/15863513/298171 for example) – all of them have the caveat that yes, these methods aren’t guaranteed to work by any C++ standard, but in practice they all do.
With this in mind, I am just curious on how using const_cast<char*>(string.c_str()) compares to other known methods (e.g. the aforementioned &string[0], &c)… I ask because the code in which I found this method in use seems to work fine in practice, and I thought I’d see what the experts thought before I attempt the inevitable const_cast<>()-free rewrite.
const cannot be enforced at the hardware level because, in practice, you can set the read-only attribute only on a full 4K memory page (and huge pages, which drastically reduce the CPU's lookup misses in the TLB, are on the way).
const doesn't affect code generation like __restrict from C99 does. In fact, const, roughly speaking, means "poison all write attempts to this data, I'd like to protect my invariants here"
Since std::string is a mutable string, its underlying buffer cannot be allocated in read-only memory. So the const_cast<> shouldn't cause a program crash here, unless you change some bytes outside of the underlying buffer's bounds or try to delete, free() or realloc() something. However, altering chars in the buffer may be classified as an invariant violation. Because you don't use the std::string instance after that and simply throw it away, this shouldn't provoke a crash, unless some particular std::string implementation decides to check its invariants' integrity before destruction and force a crash if some of them are broken. Because such a check couldn't be done in less than O(N) time, and std::string is a performance-critical class, it is unlikely that anyone does this.
Another issue may come from the copy-on-write strategy: by modifying the buffer directly you may break some other std::string instance which shares the buffer with your string. But a few years ago the majority of C++ experts came to the conclusion that COW is too fragile and too slow, especially in multi-threaded environments, so modern C++ libraries shouldn't use it, and should instead rely on move construction where possible and avoid heap traffic for short strings where applicable.

Does the implementation of std::string I use implement ref-counting or not?

I develop for iOS and use XCode 3.2.5, GCC 4.2.
UPD
This code works:
string s = "aaaa";
string s1 = s;
assert(s.data() == s1.data());
Does it mean ref-counting is used? Or '==' is overloaded for const char* somehow to compare contents, not addresses?
UPD
Okay, it does.
There are different ways of finding out, the first of which is plainly looking at the code. std::string is a typedef for an instantiation of the basic_string template, and being a template, all the code is available to you in the headers. Note that reading standard library headers can be both enlightening and hard. And yet, you don't even need to understand the code; you might get some good hints from a cursory look (such as the fact that basic_string contains a member _M_p with an _M_refcount sub-member).
If you don't want to read the code, you can approach the problem from a practical point of view and measure the effects that a copy-on-write implementation would have. You can, for example, create a long string [*], then copy it to a different string and compare the addresses of the data() that stores the actual contents.
[*] The reason for the long string is to avoid getting confused by other implementation techniques, such as a small-string optimization, by which a string contains a small internal buffer to avoid dynamic memory allocations for very short strings.
An easy way to find out would be to copy-construct or assign a string, and compare the results of their data() method - if their data area is at the same location in memory, they must be using some form of reference counting.
One obvious answer is: it's unspecified. As far as I know, it's not only unspecified in the standard, but in every implementation. But for what it's worth, g++ uses a reference-counted implementation, at least through the latest version I've looked at (4.4.2).
Ref counting is really useful: copy on write, etc. A lot of code relies on it to be efficient, so it's probably a bad idea to abandon it. Better to have a function that explicitly obtains a copy of a string, the way MS does it (lock buffer, etc.), if you're going to tinker with internals in an unsafe manner.

Should I stop using auto_ptr?

I've recently started appreciating std::auto_ptr and now I read that it will be deprecated. I started using it for two situations:
Return value of a factory
Communicating ownership transfer
Examples:
// Exception safe and makes it clear that the caller has ownership.
std::auto_ptr<Component> ComponentFactory::Create() { ... }
// The receiving method/function takes ownership of the pointer. Zero ambiguity.
void setValue(std::auto_ptr<Value> inValue);
Despite the problematic copy semantics I find auto_ptr useful. And there doesn't seem to be an alternative for the above examples.
Should I just keep using it and later switch to std::unique_ptr? Or is it to be avoided?
It is so very very useful, despite its flaws, that I'd highly recommend just continuing to use it and switching to unique_ptr when it becomes available.
::std::unique_ptr requires a compiler that supports rvalue references which are part of the C++0x draft standard, and it will take a little while for there to be really wide support for it. And until rvalue references are available, ::std::auto_ptr is the best you can do.
Having both ::std::auto_ptr and ::std::unique_ptr in your code might confuse some people. But you should be able to search for ::std::auto_ptr and replace it with ::std::unique_ptr when you decide to change. You may get compiler errors if you do that, but they should be easily fixable. The top rated answer to this question about replacing ::std::auto_ptr with ::std::unique_ptr has more details.
Deprecated doesn't mean it's going away, just that there will be a better alternative.
I'd suggest you keep using it in current code, but use the new alternative in new code (new programs or modules, not small changes to current code). Consistency is important.
I'd suggest you go use boost smart pointers.