Can I edit a global vector using multiple threads in C++? - c++

I currently have a code which works well, but I am learning C++, and hence would like to rid myself of any newbie mistakes. Basically the code is
vector<vector<float>> gAbs;
void functionThatAddsEntryTogAbs(){
...
gAbs.pushback(value);
}
int main(){
thread thread1 = thread(functionThatAddsEntryTogAbs,args);
thread thread2 = thread(functionThatAddsEntryTogAbs,args);
thread1.join(); thread2.join();
std::sort(gAbs.begin(),gAbs.end());
writeDataToFile(gAbs,"filename.dat");
}
For instance I remember learning that there are only a few instances where global variables are the right choice. My initial thought was just to have the threads write to the file, but then I cannot guarantee that the data is sorted (which I need), which is my I use std::sort. Are there any suggestions of how to improve this, and what are some alternatives that the more experiences programmers would use instead?
The code needs to be as fast as possible.
Thanks in advance

You can access and modify global resources, including containers from different threads, but you have to protect them from doing that at the same time. Some exceptions are: no modifications are possible, the container itself is not changed and the threads are working on separate entries.
In your code, entries are added to the container, so you need mutexes, but by doing that your parallel code probably doesn't gain you much in speed. A better way could be to know how many entries need to be added, add empty entries (just initialize) and then assign ranges to the threads, so they can fill in the entries.

Related

Shared file logging between threads in C++ 11 [duplicate]

This question already has answers here:
Is cout synchronized/thread-safe?
(4 answers)
Closed 5 years ago.
Recently I started learning C++ 11. I only studied C/C++ for a brief period of time when I was in college.I come from another ecosystem (web development) so as you can imagine I'm relatively new into C++.
At the moment I'm studying threads and how could accomplish logging from multiple threads with a single writer (file handle). So I wrote the following code based on tutorials and reading various articles.
My First question and request would be to point out any bad practices / mistakes that I have overlooked (although the code works with VC 2015).
Secondly and this is what is my main concern is that I'm not closing the file handle, and I'm not sure If that causes any issues. If it does when and how would be the most appropriate way to close it?
Lastly and correct me if I'm wrong I don't want to "pause" a thread while another thread is writing. I'm writing line by line each time. Is there any case that the output messes up at some point?
Thank you very much for your time, bellow is the source (currently for learning purposes everything is inside main.cpp).
#include <iostream>
#include <fstream>
#include <thread>
#include <string>
static const int THREADS_NUM = 8;
class Logger
{
public:
Logger(const std::string &path) : filePath(path)
{
this->logFile.open(this->filePath);
}
void write(const std::string &data)
{
this->logFile << data;
}
private:
std::ofstream logFile;
std::string filePath;
};
void spawnThread(int tid, std::shared_ptr<Logger> &logger)
{
std::cout << "Thread " + std::to_string(tid) + " started" << std::endl;
logger->write("Thread " + std::to_string(tid) + " was here!\n");
};
int main()
{
std::cout << "Master started" << std::endl;
std::thread threadPool[THREADS_NUM];
auto logger = std::make_shared<Logger>("test.log");
for (int i = 0; i < THREADS_NUM; ++i)
{
threadPool[i] = std::thread(spawnThread, i, logger);
threadPool[i].join();
}
return 0;
}
PS1: In this scenario there will always be only 1 file handle open for threads to log data.
PS2: The file handle ideally should close right before the program exits... Should it be done in Logger destructor?
UPDATE
The current output with 1000 threads is the following:
Thread 0 was here!
Thread 1 was here!
Thread 2 was here!
Thread 3 was here!
.
.
.
.
Thread 995 was here!
Thread 996 was here!
Thread 997 was here!
Thread 998 was here!
Thread 999 was here!
I don't see any garbage so far...
My First question and request would be to point out any bad practices / mistakes that I have overlooked (although the code works with VC 2015).
Subjective, but the code looks fine to me. Although you are not synchronizing threads (some std::mutex in logger would do the trick).
Also note that this:
std::thread threadPool[THREADS_NUM];
auto logger = std::make_shared<Logger>("test.log");
for (int i = 0; i < THREADS_NUM; ++i)
{
threadPool[i] = std::thread(spawnThread, i, logger);
threadPool[i].join();
}
is pointless. You create a thread, join it and then create a new one. I think this is what you are looking for:
std::vector<std::thread> threadPool;
auto logger = std::make_shared<Logger>("test.log");
// create all threads
for (int i = 0; i < THREADS_NUM; ++i)
threadPool.emplace_back(spawnThread, i, logger);
// after all are created join them
for (auto& th: threadPool)
th.join();
Now you create all threads and then wait for all of them. Not one by one.
Secondly and this is what is my main concern is that I'm not closing the file handle, and I'm not sure If that causes any issues. If it does when and how would be the most appropriate way to close it?
And when do you want to close it? After each write? That would be a redundant OS work with no real benefit. The file is supposed to be open through entire program's lifetime. Therefore there is no reason to close it manually at all. With graceful exit std::ofstream will call its destructor that closes the file. On non-graceful exit the os will close all remaining handles anyway.
Flushing a file's buffer (possibly after each write?) would be helpful though.
Lastly and correct me if I'm wrong I don't want to "pause" a thread while another thread is writing. I'm writing line by line each time. Is there any case that the output messes up at some point?
Yes, of course. You are not synchronizing writes to the file, the output might be garbage. You can actually easily check it yourself: spawn 10000 threads and run the code. It's very likely you will get a corrupted file.
There are many different synchronization mechanisms. But all of them are either lock-free or lock-based (or possibly a mix). Anyway a simple std::mutex (basic lock-based synchronization) in the logger class should be fine.
The first massive mistake is saying "it works with MSVC, I see no garbage", even moreso as it only works because your test code is broken (well it's not broken, but it's not concurrent, so of course it works fine).
But even if the code was concurrent, saying "I don't see anything wrong" is a terrible mistake. Multithreaded code is never correct unless you see something wrong, it is incorrect unless proven correct.
The goal of not blocking ("pausing") one thread while another is writing is unachieveable if you want correctness, at least if they concurrently write to the same descriptor. You must synchronize properly (call it any way you like, and use any method you like), or the behavior will be incorrect. Or worse, it will look correct for as long as you look at it, and it will behave wrong six months later when your most important customer uses it for a multi-million dollar project.
Under some operating systems, you can "cheat" and get away without synchronization as these offer syscalls that have atomicity guarantees (e.g. writev). That is however not what you may think, it is indeed heavyweight synchronization, only just you don't see it.
A better (more efficient) strategy than to use a mutex or use atomic writes might be to have a single consumer thread which writes to disk, and to push log tasks onto a concurrent queue from how many producer threads you like. This has minimum latency for threads that you don't want to block, and blocking where you don't care. Plus, you can coalesce several small writes into one.
Closing or not closing a file seems like a non-issue. After all, when the program exits, files are closed anyway. Well yes, except, there are three layers of caching (four actually if you count the physical disk's caches), two of them within your application and one within the operating system.
When data has made it at least into the OS buffers, all is good unless power fails unexpectedly. Not so for the other two levels of cache!
If your process dies unexpectedly, its memory will be released, which includes anything cached within iostream and anything cached within the CRT. So if you need any amount of reliability, you will either have to flush regularly (which is expensive), or use a different strategy. File mappying may be such a strategy because whatever you copy into the mapping is automatically (by definition) within the operating system's buffers, and unless power fails or the computer explodes, it will be written to disk.
That being said, there exist dozens of free and readily available logging libraries (such as e.g. spdlog) which do the job very well. There's really not much of a reason to reinvent this particular wheel.
Hello and welcome to the community!
A few comments on the code, and a few general tips on top of that.
Don't use native arrays if you do not absolutely have to.
Eliminating the native std::thread[] array and replacing it with an std::array would allow you to do a range based for loop which is the preferred way of iterating over things in C++. An std::vector would also work since you have to generate the thredas (which you can do with std::generate in combination with std::back_inserter)
Don't use smart pointers if you do not have specific memory management requirements, in this case a reference to a stack allocated logger would be fine (the logger would probably live for the duration of the program anyway, hence no need for explicit memory management). In C++ you try to use the stack as much as possible, dynamic memory allocation is slow in many ways and shared pointers introduce overhead (unique pointers are zero cost abstractions).
The join in the for loop is probably not what you want, it will wait for the previously spawned thread and spawn another one after it is finished. If you want parallelism you need another for loop for the joins, but the preferred way would be to use std::for_each(begin(pool), end(pool), [](auto& thread) { thread.join(); }) or something similar.
Use the C++ Core Guidelines and a recent C++ standard (C++17 is the current), C++11 is old and you probably want to learn the modern stuff instead of learning how to write legacy code. http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines
C++ is not java, use the stack as much as possible - this is one of the biggest advantages to using C++. Make sure you understand how the stack, constructors and destructors work by heart.
The first question is subjective so someone else would want to give an advice, but I don't see anything awful.
Nothing in C++ standard library is thread-safe except for some rare cases. A good answer on using ofstream in a multithreaded environment is given here.
Not closing a file is indeed an issue. You have to get familiar with RAII as it is one of the first things to learn. The answer by Detonar is a good piece of advice.

LInux/c++, how to protect two data structures at same time?

I'm developing some programs with c/c++ in Linux. my question is:
I have an upper-level class called Vault, inside of which there are an OrderMap which uses unordered_map as data structure and a OrderBook, which has 2 std::list in side.
Both the OrderMap and OrderBook stores Order* as the element, they share an Order object which is allocated on heap. So either the OrderBook or the OrderMap modifies the order within.
I have two threads that will do read and write operations on them. the 2 threads can insert/modify/retrieve(read)/delete the elements.
My Question is: how can I protect this big structure of "Vault"? I can actually protect the map or list, but I don't know how to protect them both at the same time.
anybody gives me some ideas?
Add a mutex and lock it before accessing any of the two.
Make them private, so you know accesses are made via your member functions (which have proper locks)
Consider using std::shared_ptr instead of Order *
I don't think I've ever seen a multi-thread usage of an order book. I really think you'd be better off using it in one thread only.
But, to get back to your question, I'll assume you're staying with 2 threads for whatever reasons.
This data structure is too complex to make it lock free. So these are the multi-thread options I can see:
1. If you use a single Vault instance for both threads you will have to lock around it. I assume you don't care to burn some cpu time, so I strongly suggest you use a spin lock, not a mutex.
2. If you can allow having 2 Vault instances this can improve things as every thread can keep it's own private instance and exchange modifications with the other thread using some other means: a lock free queue, or something else.
3. If your book is fast enough to copy, you can have a single central Vault pointer, make a copy on every update, or a group of updates, CAS onto that central pointer, and reuse the old one to not have to allocate every time. You'll end up with one spare instance per thread. Like this:
Vault* old_ptr = centralVault;
Vault* new_ptr = cachedVault ? cachedVault : new Vault;
do {
*new_ptr = *old_ptr;
makeChanges(new_ptr);
}
while( !cas(&centralVault, old_ptr, new_ptr) );
cachedVault = old_ptr;

multithreading in c in a mmorpg

I want to use multithreading in my mmorpg in c++, ive got 5 threads at the moment, and i want to split another one in two, but my mmorpg server conists of loads of vectors, and because the vectors arn't threadsafe to write, i can't do it properly.
Is there an alternative to using vectors across threads, or is there a way to make the vector read/write multithread safe.
Heres an example of what i wan't, try to find an alternative to something like this:
Obviously this isn't actual code, im just making an example.
//Thread1
//Load monster and send data to the player
globals::monstername[myid];//Myid = 1 for now -.-
senddata(globals::monstername[myid]);//Not the actual networking code, im just lazy.
//Thread2
//Create a monster and manage it
globals::monstername.push_back("FatBlobMonster");
//More managing code i cant be bothered inserting >.<
Two things.
Don't store shared data in one big data structure that gets locked completely. Lock just parts of it. For example, if you must use vectors, then create a set of locks for regions of the vector. Say I have 1000 entries, I might create 10 locks, that each lock 100 consecutive entries. But you can probably do better. For example, store your monsters in a hash table, where each "bucket" in your hash table has its own lock.
Use a "read/write" lock. It is possible to create a type of lock that allows multiple readers and a single writer. So each hash bucket might have a read write lock. If no monsters are being created in a particular bucket, then multiple threads can be reading monsters out of that bucket. If you need to hash a new monster into the bucket then you lock the bucket for writing. This lock will wait for all current readers to release, and not allow more readers to lock until the write is complete. Once there are no more readers, the w rite operation succeeeds
I am not aware of any thread safe vector class. However, you could create one yourself that uses std::vector and a std::mutex (in C++11):
template <typename T>
class LockedVector {
private:
std::mutex m;
std::vector<T> vec;
};
You lock the mutex with std::lock.

what is the best way to synchronize container access between multiple threads in real-time application

I have std::list<Info> infoList in my application that is shared between two threads. These 2 threads are accessing this list as follows:
Thread 1: uses push_back(), pop_front() or clear() on the list (Depending on the situation)
Thread 2: uses an iterator to iterate through the items in the list and do some actions.
Thread 2 is iterating the list like the following:
for(std::list<Info>::iterator i = infoList.begin(); i != infoList.end(); ++i)
{
DoAction(i);
}
The code is compiled using GCC 4.4.2.
Sometimes ++i causes a segfault and crashes the application. The error is caused in std_list.h line 143 at the following line:
_M_node = _M_node->_M_next;
I guess this is a racing condition. The list might have changed or even cleared by thread 1 while thread 2 was iterating it.
I used Mutex to synchronize access to this list and all went ok during my initial test. But the system just freezes under stress test making this solution totally unacceptable. This application is a real-time application and i need to find a solution so both threads can run as fast as possible without hurting the total applications throughput.
My question is this:
Thread 1 and Thread 2 need to execute as fast as possible since this is a real-time application. what can i do to prevent this problem and still maintain the application performance? Are there any lock-free algorithms available for such a problem?
Its ok if i miss some newly added Info objects in thread 2's iteration but what can i do to prevent the iterator from becoming a dangling pointer?
Thanks
Your for() loop can potentially keep a lock for a relatively long time, depending on how many elements it iterates. You can get in real trouble if it "polls" the queue, constantly checking if a new element became available. That makes the thread own the mutex for an unreasonably long time, giving few opportunities to the producer thread to break in and add an element. And burning lots of unnecessary CPU cycles in the process.
You need a "bounded blocking queue". Don't write it yourself, the lock design is not trivial. Hard to find good examples, most of it is .NET code. This article looks promising.
In general it is not safe to use STL containers this way. You will have to implement specific method to make your code thread safe. The solution you chose depends on your needs. I would probably solve this by maintaining two lists, one in each thread. And communicating the changes through a lock free queue (mentioned in the comments to this question). You could also limit the lifetime of your Info objects by wrapping them in boost::shared_ptr e.g.
typedef boost::shared_ptr<Info> InfoReference;
typedef std::list<InfoReference> InfoList;
enum CommandValue
{
Insert,
Delete
}
struct Command
{
CommandValue operation;
InfoReference reference;
}
typedef LockFreeQueue<Command> CommandQueue;
class Thread1
{
Thread1(CommandQueue queue) : m_commands(queue) {}
void run()
{
while (!finished)
{
//Process Items and use
// deleteInfo() or addInfo()
};
}
void deleteInfo(InfoReference reference)
{
Command command;
command.operation = Delete;
command.reference = reference;
m_commands.produce(command);
}
void addInfo(InfoReference reference)
{
Command command;
command.operation = Insert;
command.reference = reference;
m_commands.produce(command);
}
}
private:
CommandQueue& m_commands;
InfoList m_infoList;
}
class Thread2
{
Thread2(CommandQueue queue) : m_commands(queue) {}
void run()
{
while(!finished)
{
processQueue();
processList();
}
}
void processQueue()
{
Command command;
while (m_commands.consume(command))
{
switch(command.operation)
{
case Insert:
m_infoList.push_back(command.reference);
break;
case Delete:
m_infoList.remove(command.reference);
break;
}
}
}
void processList()
{
// Iterate over m_infoList
}
private:
CommandQueue& m_commands;
InfoList m_infoList;
}
void main()
{
CommandQueue commands;
Thread1 thread1(commands);
Thread2 thread2(commands);
thread1.start();
thread2.start();
waitforTermination();
}
This has not been compiled. You still need to make sure that access to your Info objects is thread safe.
I would like to know what is the purpose of this list, it would be easier to answer the question then.
As Hoare said, it is generally a bad idea to try to share data to communicate between two threads, rather you should communicate to share data: ie messaging.
If this list is modelling a queue, for example, you might simply use one of the various ways to communicate (such as sockets) between the two threads. Consumer / Producer is a standard and well-known problem.
If your items are expensive, then only pass the pointers around during communication, you'll avoid copying the items themselves.
In general, it's exquisitely difficult to share data, although it is unfortunately the only way of programming we hear of in school. Normally only low-level implementation of "channels" of communication should ever worry about synchronization and you should learn to use the channels to communicate instead of trying to emulate them.
To prevent your iterator from being invalidated you have to lock the whole for loop. Now I guess the first thread may have difficulties updating the list. I would try to give it a chance to do its job on each (or every Nth iteration).
In pseudo-code that would look like:
mutex_lock();
for(...){
doAction();
mutex_unlock();
thread_yield(); // give first thread a chance
mutex_lock();
if(iterator_invalidated_flag) // set by first thread
reset_iterator();
}
mutex_unlock();
You have to decide which thread is the more important. If it is the update thread, then it must signal the iterator thread to stop, wait and start again. If it is the iterator thread, it can simply lock the list until iteration is done.
The best way to do this is to use a container that is internally synchronized. TBB and Microsoft's concurrent_queue do this. Anthony Williams also has a good implementation on his blog here
Others have already suggested lock-free alternatives, so I'll answer as if you were stuck using locks...
When you modify a list, existing iterators can become invalidated because they no longer point to valid memory (the list automagically reallocates more memory when it needs to grow). To prevent invalidated iterators, you could make the producer block on a mutex while your consumer traverses the list, but that would be needless waiting for the producer.
Life would be easier if you used a queue instead of a list, and have your consumer use a synchronized queue<Info>::pop_front() call instead of iterators that can be invalidated behind your back. If your consumer really needs to gobble chunks of Info at a time, then use a condition variable that'll make your consumer block until queue.size() >= minimum.
The Boost library has a nice portable implementation of condition variables (that even works with older versions of Windows), as well as the usual threading library stuff.
For a producer-consumer queue that uses (old-fashioned) locking, check out the BlockingQueue template class of the ZThreads library. I have not used ZThreads myself, being worried about lack of recent updates, and because it didn't seem to be widely used. However, I have used it as inspiration for rolling my own thread-safe producer-consumer queue (before I learned about lock-free queues and TBB).
A lock-free queue/stack library seems to be in the Boost review queue. Let's hope we see a new Boost.Lockfree in the near future! :)
If there's interest, I can write up an example of a blocking queue that uses std::queue and Boost thread locking.
EDIT:
The blog referenced by Rick's answer already has a blocking queue example that uses std::queue and Boost condvars. If your consumer needs to gobble chunks, you can extend the example as follows:
void wait_for_data(size_t how_many)
{
boost::mutex::scoped_lock lock(the_mutex);
while(the_queue.size() < how_many)
{
the_condition_variable.wait(lock);
}
}
You may also want to tweak it to allow time-outs and cancellations.
You mentioned that speed was a concern. If your Infos are heavyweight, you should consider passing them around by shared_ptr. You can also try making your Infos fixed size and use a memory pool (which can be much faster than the heap).
As you mentioned that you don't care if your iterating consumer misses some newly-added entries, you could use a copy-on-write list underneath. That allows the iterating consumer to operate on a consistent snapshot of the list as of when it first started, while, in other threads, updates to the list yield fresh but consistent copies, without perturbing any of the extant snapshots.
The trade here is that each update to the list requires locking for exclusive access long enough to copy the entire list. This technique is biased toward having many concurrent readers and less frequent updates.
Trying to add intrinsic locking to the container first requires you to think about which operations need to behave in atomic groups. For instance, checking if the list is empty before trying to pop off the first element requires an atomic pop-if-not-empty operation; otherwise, the answer to the list being empty can change in between when the caller receives the answer and attempts to act upon it.
It's not clear in your example above what guarantees the iteration must obey. Must every element in the list eventually be visited by the iterating thread? Does it make multiple passes? What does it mean for one thread to remove an element from the list while another thread is running DoAction() against it? I suspect that working through these questions will lead to significant design changes.
You're working in C++, and you mentioned needing a queue with a pop-if-not-empty operation. I wrote a two-lock queue many years ago using the ACE Library's concurrency primitives, as the Boost thread library was not yet ready for production use, and the chance for the C++ Standard Library including such facilities was a distant dream. Porting it to something more modern would be easy.
This queue of mine -- called concurrent::two_lock_queue -- allows access to the head of the queue only via RAII. This ensures that acquiring the lock to read the head will always be mated with a release of the lock. A consumer constructs a const_front (const access to head element), a front (non-const access to head element), or a renewable_front (non-const access to head and successor elements) object to represent the exclusive right to access the head element of the queue. Such "front" objects can't be copied.
Class two_lock_queue also offers a pop_front() function that waits until at least one element is available to be removed, but, in keeping with std::queue's and std::stack's style of not mixing container mutation and value copying, pop_front() returns void.
In a companion file, there's a type called concurrent::unconditional_pop, which allows one to ensure through RAII that the head element of the queue will be popped upon exit from the current scope.
The companion file error.hh defines the exceptions that arise from use of the function two_lock_queue::interrupt(), used to unblock threads waiting for access to the head of the queue.
Take a look at the code and let me know if you need more explanation as to how to use it.
If you're using C++0x you could internally synchronize list iteration this way:
Assuming the class has a templated list named objects_, and a boost::mutex named mutex_
The toAll method is a member method of the list wrapper
void toAll(std::function<void (T*)> lambda)
{
boost::mutex::scoped_lock(this->mutex_);
for(auto it = this->objects_.begin(); it != this->objects_.end(); it++)
{
T* object = it->second;
if(object != nullptr)
{
lambda(object);
}
}
}
Calling:
synchronizedList1->toAll(
[&](T* object)->void // Or the class that your list holds
{
for(auto it = this->knownEntities->begin(); it != this->knownEntities->end(); it++)
{
// Do something
}
}
);
You must be using some threading library. If you are using Intel TBB, you can use concurrent_vector or concurrent_queue. See this.
If you want to continue using std::list in a multi-threaded environment, I would recommend wrapping it in a class with a mutex that provides locked access to it. Depending on the exact usage, it might make sense to switch to a event-driven queue model where messages are passed on a queue that multiple worker threads are consuming (hint: producer-consumer).
I would seriously take Matthieu's thought into consideration. Many problems that are being solved using multi-threaded programming are better solved using message-passing between threads or processes. If you need high throughput and do not absolutely require that the processing share the same memory space, consider using something like the Message-Passing Interface (MPI) instead of rolling your own multi-threaded solution. There are a bunch of C++ implementations available - OpenMPI, Boost.MPI, Microsoft MPI, etc. etc.
I don't think you can get away without any synchronisation at all in this case as certain operation will invalidate the iterators you are using. With a list, this is fairly limited (basically, if both threads are trying to manipulate iterators to the same element at the same time) but there is still a danger that you'll be removing an element at the same time you're trying to append one to it.
Are you by any chance holding the lock across DoAction(i)? You obviously only want to hold the lock for the absolute minimum of time that you can get away with in order to maximise the performance. From the code above I think you'll want to decompose the loop somewhat in order to speed up both sides of the operation.
Something along the lines of:
while (processItems) {
Info item;
lock(mutex);
if (!infoList.empty()) {
item = infoList.front();
infoList.pop_front();
}
unlock(mutex);
DoAction(item);
delayALittle();
}
And the insert function would still have to look like this:
lock(mutex);
infoList.push_back(item);
unlock(mutex);
Unless the queue is likely to be massive, I'd be tempted to use something like a std::vector<Info> or even a std::vector<boost::shared_ptr<Info> > to minimize the copying of the Info objects (assuming that these are somewhat more expensive to copy compared to a boost::shared_ptr. Generally, operations on a vector tend to be a little faster than on a list, especially if the objects stored in the vector are small and cheap to copy.

Do I need to protect read access to an STL container in a multithreading environment?

I have one std::list<> container and these threads:
One writer thread which adds elements indefinitely.
One reader/writer thread which reads and removes elements while available.
Several reader threads which access the SIZE of the container (by using the size() method)
There is a normal mutex which protects the access to the list from the first two threads. My question is, do the size reader threads need to acquire this mutex too? should I use a read/write mutex?
I'm in a windows environment using Visual C++ 6.
Update: It looks like the answer is not clear yet. To sum up the main doubt: Do I still need to protect the SIZE reader threads even if they only call size() (which returns a simple variable) taking into account that I don't need the exact value (i.e. I can assume a +/- 1 variation)? How a race condition could make my size() call return an invalid value (i.e. one totally unrelated to the good one)?
Answer: In general, the reader threads must be protected to avoid race conditions. Nevertheless, in my opinion, some of the questions stated above in the update haven't been answered yet.
Thanks in advance!
Thank you all for your answers!
Yes, the read threads will need some sort of mutex control, otherwise the write will change things from under it.
A reader/writer mutex should be enough. But strictly speaking this is an implmentation-specific issue. It's possible that an implementation may have mutable members even in const objects that are read-only in your code.
Checkout the concurrent containers provided by Intel's Open Source Threading Building Blocks library. Look under "Container Snippets" on the Code Samples page for some examples. They have concurrent / thread-safe containers for vectors, hash maps and queues.
I don't believe the STL containers are thread-safe, as there isn't a good way to handle threads cross-platform. The call to size() is simple, but it still needs to be protected.
This sounds like a great place to use read-write locks.
You should consider some SLT implementation might calculate the size when called.
To overcome this, you could define a new variable
volatile unsigned int ContainerSize = 0;
Update the variable only inside already protected update calls, but you can read / test the variable without protection (taking into account you don't need the exact value).
Yes. I would suggest wrapping your STL library with a class that enforces serial access. Or find a similar class that's already been debugged.
I'm going to say no. In this case.
Without a mutex, what you might find is that the size() returns the wrong value occasionally, as items are added or removed. If this is acceptable to you, go for it.
however, if you need the accurate size of the list when the reader needs to know it, you'll have to put a critical section around every size call in addition to the one you have around the add and erase calls.
PS. VC6 size() simply returns the _Size internal member, so not having a mutex won't be a problem with your particular implementation, except that it might return 1 when a second element is being added, or vice-versa.
PPS. someone mentioned a RW lock, this is a good thing, especially if you're tempted to access the list objects later. Change your mutex to a Boost::shared_mutex would be beneficial then. However no mutex whatsoever is needed if all you're calling is size().
The STL in VC++ version 6 is not thread-safe, see this recent question. So it seems your question is a bit irrelevant. Even if you did everything right, you may still have problems.
Whether or not size() is safe (for the definition of "safe" that you provide) is implementation-dependent. Even if you are covered on your platform, for your version of your compiler at your optimization level for your version of thread library and C runtime, please do not code this way. It will come back to byte you, and it will be hell to debug. You're setting yourself up for failure.