STL vector and thread-safety - c++

Let's say I have a vector of N elements, but up to n elements of this vector have meaningful data. One updater thread updates the nth or n+1st element (then sets n = n+1), also checks if n is too close to N and calls vector::resize(N+M) if necessary. After updating, the thread calls multiple child threads to read up to nth data and do some calculations.
It is guaranteed that child threads never change or delete data (in fact, no data is deleted whatsoever), and the updater calls the children just after it finishes updating.
So far no problem has occurred, but I want to ask whether a problem may occur while the vector is being reallocated to a larger memory block, if some child worker threads are still left over from the previous update.
Or is it safe to use vector, as it is not thread-safe, in such a multithreaded case?
EDIT:
Since only insertion takes place when the updater calls vector::resize(N+M,0), are there any possible solutions to my problem? Because of the great performance of the STL vector, I am not willing to replace it with a lockable vector. Alternatively, are there any well-known, performant, lock-free vectors?

I want to ask whether a problem may occur during reallocating of vector to a larger memory block, if there are some child working threads left from the previous update.
Yes, this would be very bad.
If you are using a container from multiple threads and at least one thread may perform some action that may modify the state of the container, access to the container must be synchronized.
In the case of std::vector, anything that changes its size (notably, insertions and erasures) changes its state, even if a reallocation is not required (any insertion or erasure requires std::vector's internal size bookkeeping data to be updated).
One solution to your problem would be to have the producer dynamically allocate the std::vector and use a std::shared_ptr<std::vector<T> > to own it and give this std::shared_ptr to each of the consumers.
When the producer needs to add more data, it can dynamically allocate a new std::vector with a new, larger size and copies of the elements from the old std::vector. Then, when you spin off new consumers or update consumers with the new data, you simply need to give them a std::shared_ptr to the new std::vector.
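A minimal sketch of that idea, assuming int elements and a hypothetical SnapshotHolder class (neither comes from the question). The producer never modifies a vector that consumers may hold. It builds a new one and publishes it, and a small mutex guards only the pointer swap:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper: owns the most recently published snapshot.
class SnapshotHolder {
public:
    SnapshotHolder() : current_(std::make_shared<std::vector<int>>()) {}

    // Producer: copy the old elements into a new vector, append,
    // then publish the new vector. Old snapshots stay untouched.
    void append(int value) {
        std::shared_ptr<const std::vector<int>> old = snapshot();
        auto next = std::make_shared<std::vector<int>>(*old);
        next->push_back(value);
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(next);
    }

    // Consumers: take shared ownership of the current snapshot.
    // The vector stays alive until the last holder releases it.
    std::shared_ptr<const std::vector<int>> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    mutable std::mutex mutex_;  // guards only the pointer, not the data
    std::shared_ptr<const std::vector<int>> current_;
};
```

This sketch assumes a single producer and copies the whole vector on every append, which is only reasonable if updates are rare compared to reads. A consumer that called snapshot() before an append keeps reading the old vector safely.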

Is the way your workers decide which data to work on thread-safe? Is there any signaling between the workers finishing and the producer? If not, then there is definitely an issue: the producer could cause the vector to move while it is still being worked on. Moving to a std::deque would avoid that particular problem (note that std::deque invalidates iterators on push_back, but references to elements are not affected), although concurrent access to the deque itself still has to be synchronized.

I've made my own GrowVector. It works for me and it is really fast.
Link: QList, QVector or std::vector multi-threaded usage

Related

Do we need locking if we are appending and reading data from a vector simultaneously? (no modification)

We know that if more than one thread operates on an object and a modification is involved, we need some kind of locking (atomic/mutex). In my case, only these operations happen simultaneously on a std::vector:
1. Read
2. Append/Push
Will the vector need a lock in this case? If yes, why? My program is written in C++.
I'm new to the lock concept. Any hint in the right direction will work for me.
Yes you need locking, in general, because push_back can cause reallocation.
You can check the reference:
https://en.cppreference.com/w/cpp/container/vector/push_back says
If the new size() is greater than capacity() then all iterators and
references (including the past-the-end iterator) are invalidated.
Otherwise only the past-the-end iterator is invalidated.
https://www.cplusplus.com/reference/vector/vector/push_back/ mentions:
The container is modified. If a reallocation happens, all contained
elements are modified. Otherwise, no existing element is accessed, and
concurrently accessing or modifying them is safe.
So you should lock if you want to be careful, or if you care about clean, maintainable code.
If you need extra performance and know what you are doing, you can get away without locking only when you know that no push_back() will bring size() above capacity(). That is very tricky and error-prone: as soon as you allow one thread to start reading, you have to be sure that no reallocation will occur in another thread, even later.
Edit: re-worded above. tl-dr: use synchronization :-)
You will most likely need resource locking. Take this example: if you insert an element into the vector, it might resize. While the vector is resizing, what if another thread tries to access data from the array? See the clash? That's why you need to lock resources. This applies if you're inserting or removing data (meaning that you're altering the actual allocation of the container). If the size is fixed (meaning you have pre-allocated it), then there won't be an issue.
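As a rough illustration of "use synchronization", here is a sketch of a wrapper in which every read and every append goes through the same mutex. The class name LockedIntVector and the int element type are assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: all access to the vector takes the same mutex.
class LockedIntVector {
public:
    void push_back(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_.push_back(value);  // may reallocate; readers are locked out
    }

    // Copies the element out while the lock is held, so no reference
    // into the vector escapes and dangles after a later reallocation.
    bool try_get(std::size_t index, int& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index >= data_.size()) return false;
        out = data_[index];
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> data_;
};
```

Returning a copy rather than a reference is deliberate: a reference obtained under the lock would no longer be protected once the lock is released.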

What is the fastest way for multiple threads to insert into a vector safely?

I have a program where multiple threads share the same data structure, which is basically a 2D array of vectors, and sometimes two or more threads might have to insert at the same position, i.e. into the same vector, which might result in a crash if no precautions are taken. What is the fastest and most efficient way to implement a safe solution for this issue? Since this does not happen very often (no high contention), I have a 2D array of mutexes where each mutex maps to a vector, and each thread locks the mutex, updates the corresponding vector, and then unlocks it. If this is a good solution, I would like to know if there is something faster than a mutex to use.
Note, I am using OpenMP for the multithreading.
The solution depends greatly on the nature of the problem. For example:
If the vector size may exceed its capacity (i.e. reallocation is required).
Whether the vector is only being read, elements are being inserted or elements can be both inserted and removed.
In the first case, you don't have any other possibility than using locks, since you always need to check whether the vector is being reallocated, and wait for the reallocation to complete if necessary.
On the other hand, if you are completely sure that the vector is only initialized once by a single thread (which is not your case), probably you would not need any synchronization mechanism to perform access to vector elements (inside-element access synchronization may still be required though).
If elements are being inserted and removed from the back of the vector only (queue style), then using atomic compare and swap would be enough (atomically increase the size of the vector, and insert in position size-1 when the swap was successful).
If elements may be removed at any point of the vector, its contents may need to be moved to remove empty holes. This case is similar to a reallocation. You can use a customized heap to manage the empty positions in your vector, although this will increase the complexity.
At the end of the day, probably you will need to either develop your own parallel data structure or rely on a library, such as TBB or Boost.
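The compare-and-swap case mentioned above can be sketched as follows, under the assumption that the storage is allocated once up front and never reallocated. FixedAppendBuffer is a hypothetical name, and the sketch covers insertion only, not removal:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical fixed-capacity buffer: the storage is sized once, so
// appending never reallocates. A thread reserves a slot by bumping
// the size with compare-and-swap, then writes into its own slot.
class FixedAppendBuffer {
public:
    explicit FixedAppendBuffer(std::size_t capacity)
        : slots_(capacity), size_(0) {}

    // Returns false when the buffer is full.
    bool try_append(int value) {
        std::size_t old_size = size_.load();
        do {
            if (old_size >= slots_.size()) return false;
            // On failure, compare_exchange_weak reloads old_size.
        } while (!size_.compare_exchange_weak(old_size, old_size + 1));
        slots_[old_size] = value;  // this slot belongs to this thread only
        return true;
    }

    std::size_t size() const { return size_.load(); }

    // Only safe after all writers have finished (e.g. after join()):
    // a reserved slot may not have been written yet.
    int at(std::size_t index) const { return slots_[index]; }

private:
    std::vector<int> slots_;
    std::atomic<std::size_t> size_;
};
```

Note the limitation stated in the comments: size() can run ahead of the data actually written, so concurrent readers would need an extra per-slot "ready" signal that this sketch does not provide.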

Multithreading on arrays / Do I need locking mechanisms here?

I am writing a multithreaded application. That application contains an array of length, let's say, 1000.
If I have two threads and make sure that thread 1 only accesses elements 0-499 and thread 2 only accesses elements 500-999, would I need a locking mechanism to protect the array, or would that be fine?
Note: Only the content of the array will be changed during calculations! The array won't be moved, memcpy'd, or altered in any way other than by changing the elements inside it.
What you want is perfectly fine! Those kinds of strategies (combined with a handful of low-level atomic primitives) are the basis for what's called lock-free programming.
Actually, there could be possible problems in implementing this solution. You have to strongly guarantee the properties, that you have mentioned.
Make sure that your in-memory data array never moves. You cannot rely on most std containers, since most of them can change significantly during modification: std::map rebalances its internal tree and rewrites internal pointers, and std::vector sometimes reallocates its whole storage when inserting.
Make sure that there is only one consumer and only one producer for any piece of data. Each consumer has to keep its internal iterator in a valid state to avoid reading the same item twice or skipping an item. Each producer must put data in a valid place, with no possibility of overwriting existing data that has not yet been read.
Breaking any of these rules means you need to use mutexes.
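The partitioning described in the question can be sketched like this, assuming a fixed-size std::array that never moves and two hypothetical helper functions. Each thread writes only to its own half, so no lock is involved:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <thread>

// Updates data[begin, end) in place. The caller guarantees that no
// other thread touches this range while the function runs.
void fill_range(std::array<int, 1000>& data,
                std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        data[i] = static_cast<int>(i) * 2;
    }
}

// Runs two workers on disjoint halves of the same array.
void process_in_two_halves(std::array<int, 1000>& data) {
    std::thread first([&data] { fill_range(data, 0, 500); });
    std::thread second([&data] { fill_range(data, 500, 1000); });
    // join() synchronizes with each worker's completion, so their
    // writes are visible to the caller afterwards.
    first.join();
    second.join();
}
```

The array itself is never resized or moved here, which is exactly the property the answer asks you to guarantee.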

Thread safety std::vector push_back and reserve

I have an application that continuously std::vector::push_back elements into a vector. As it is a real-time system I cannot afford it to stall at any time. Unfortunately, when the reserved memory is exhausted the push_back automatic memory allocation does cause stalls (up to 800ms in my measurements).
I have tackled the problem by having a second thread that monitors the available memory and calls std::vector::reserve if necessary.
My question is: is it safe to execute reserve and push_back concurrently?
(clearly under the assumption that the push_back will not reallocate memory)
Thanks!
It is not thread-safe because a vector is contiguous and if it gets larger then you might need to move the contents of a vector to a different location in memory.
As suggested by stefan, you can look at non-blocking queues or have a list (or vector) of vectors such that when you need more space, the other thread can reserve a new vector for you while not blocking the original for lookups. You would just need to remap your indices to look up into the correct vector within the list.
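The "vector of vectors" idea with index remapping might look roughly like this. ChunkedInts is a hypothetical name, and the sketch is single-threaded on purpose: it only shows that stored elements never move when the container grows. The outer list of chunks is still shared state and would need its own synchronization in a multithreaded program:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chunked container: it grows by adding fixed-capacity
// chunks, so elements that are already stored never move in memory.
class ChunkedInts {
public:
    // Assumes chunk_size > 0.
    explicit ChunkedInts(std::size_t chunk_size)
        : chunk_size_(chunk_size), size_(0) {}

    void push_back(int value) {
        if (chunks_.empty() || chunks_.back().size() == chunk_size_) {
            // Start a new chunk with its full capacity reserved, so
            // pushing into it never reallocates its buffer.
            chunks_.emplace_back();
            chunks_.back().reserve(chunk_size_);
        }
        chunks_.back().push_back(value);
        ++size_;
    }

    // Remaps a flat index to (chunk, offset within chunk).
    int& operator[](std::size_t index) {
        return chunks_[index / chunk_size_][index % chunk_size_];
    }

    std::size_t size() const { return size_; }

private:
    std::size_t chunk_size_;
    std::size_t size_;
    std::vector<std::vector<int>> chunks_;
};
```

When the outer vector grows, the inner vectors are moved, but their heap buffers stay where they are, so a pointer to an existing element remains valid.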
No.
The concept of thread safety has existed in the standard only since C++11, and it is defined in relation to const: in the C++11 standard library, const now means thread-safe. Neither of those two methods is const, so no, it is not thread-safe.

should I synchronize the deque or not

I have a deque with pointers inside in a C++ application. I know there are two threads to access it.
Thread1 will add pointers from the back and Thread2 will process and remove pointers from the front.
Thread2 will wait until the deque reaches a certain size, say 10 items, and then start to process it. It will only loop over and process 10 items at a time. In the meantime, Thread1 may keep adding new items to the deque.
I think it will be fine without synchronizing the deque because Thread1 and Thread2 are accessing different parts of the deque. It is a deque, not a vector, so the existing memory of the container will never be reallocated.
Am I right? if not, why (I want to know what I am missing)?
EDIT:
I know it will not hurt to ALWAYS synchronize it, but it may hurt performance or be unnecessary. I just want it to run faster, and correctly, if possible.
The deque has to keep track of how many elements it has and where those elements are. Adding an element changes that stored data, as does removing an element. Changing that data from two threads without synchronization is a data race, and produces undefined behavior.
In short, you must synchronize those operations.
In general, the Standard Library containers cannot be assumed to be thread-safe unless all you do is reading from them.
If you take a look under the covers, at deque implementation, you will uncover something similar to this:
template <typename T>
class deque {
public:
private:
    static size_t const BufferCapacity = /**/;
    size_t _nb_available_buffer;
    size_t _first_occupied_buffer;
    size_t _last_occupied_buffer;
    size_t _size_first_buffer;
    size_t _size_last_buffer;
    T** _buffers; // heap allocated array of
                  // heap allocated arrays of fixed capacity
}; // class deque
Do you see the problem? _buffers, at the very least, may be accessed concurrently by both enqueue and dequeue operations (especially when the array has become too small and needs to be copied into a bigger array).
So, what is the alternative? What you are looking for is a concurrent queue. There are some implementations out there, and you should probably not worry too much about whether or not they are lock-free unless that proves to be a bottleneck. An example would be TBB concurrent_queue.
I would advise against creating your own lock-free queue, even if you heard it's all the fad, because all first implementations I have seen had (sometimes subtle) race-conditions.
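For comparison, a plain lock-based queue is short and hard to get wrong. The following is a minimal sketch, not the TBB API: BlockingIntQueue is a hypothetical class that guards a std::deque with one mutex and uses a condition variable to wake the consumer:

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical minimal blocking queue for one or more producers
// and consumers. Every access to the deque holds the mutex.
class BlockingIntQueue {
public:
    void push(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(value);
        }
        not_empty_.notify_one();  // wake one waiting consumer
    }

    // Blocks until an item is available, then removes and returns it.
    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        int value = items_.front();
        items_.pop_front();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::deque<int> items_;
};
```

With low contention, as in the question, the lock is held only for a push or a pop, so it is worth measuring before assuming it is the bottleneck.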