Pointer to STL Container Thread Safety (Queue/Deque) - c++

I currently have a bit of a multi-threading conundrum. I have two threads: one that reads serial data, and another that attempts to extract packets from the data. The two threads share a queue. The thread that attempts to create packets has a function named Parse with the following declaration:
Parse(std::queue<uint8_t>* data, pthread_mutex_t* lock);
Essentially it takes a pointer to the STL queue and uses pop() as it goes through the queue looking for a packet. The lock is needed because every pop() is guarded by it, and the same lock is shared with the thread that pushes data onto the queue. This way, the queue can be parsed while data is actively being added to it.
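For reference, a minimal sketch of what Parse might look like under that scheme (the packet-handling details are placeholders; only the locking pattern follows the description above):

#include <cstdint>
#include <queue>
#include <pthread.h>

// Sketch only: every access to the shared queue is bracketed by the mutex,
// so the serial-reader thread can keep pushing between iterations.
void Parse(std::queue<uint8_t>* data, pthread_mutex_t* lock)
{
    for (;;)
    {
        pthread_mutex_lock(lock);
        if (data->empty())
        {
            pthread_mutex_unlock(lock);
            break;
        }
        uint8_t byte = data->front();
        data->pop();
        pthread_mutex_unlock(lock);

        // ... feed 'byte' into the packet-detection state machine here ...
        (void)byte;
    }
}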
The code seems to work for the most part, but I'm seeing invalid packets at a somewhat higher rate than I'd expect. My main question is whether the pointer can change while I'm reading data out of the queue. For example, if the first thread pushes a bunch of data, is there a chance that the queue's location in memory can change? Or am I guaranteed that the pointer to the queue will remain constant, even as data is added? My concern is that the memory for the queue can be reallocated during my Parse() function, and therefore, in the middle of my function, the pointer is invalidated.
For example, I understand that certain STL iterators are invalidated for certain operations. However, I am passing a pointer to the container itself. That is, something like this:
// somewhere in my code I create a queue
std::queue<uint8_t> queue;
// elsewhere...
Parse(&queue, &lock_shared_between_the_two_threads);
Does the pointer to the container itself ever get invalidated? And what does it point to? The first element, or ...?
Note that I'm not pointing to any given element, but to the container itself. Also, I never specified which underlying container should be used to implement the queue, so underneath it all, it's just a deque.
Any help will be greatly appreciated.
EDIT 8/1:
I was able to run a few tests on my code. A couple of points:
The pointer for the container itself does not change over the lifecycle of my program. This makes sense since the queue itself is a member variable of a class. That is, while the queue's elements are dynamically allocated, it does not appear to be the case for the queue itself.
The bad packets I was experiencing appear to be a function of the serial data I'm receiving. I dumped all the data to a hex file and was able to find packets that were invalid, and my algorithm was correctly marking them as such.
As a result, I'm thinking that passing a reference or pointer to an STL container into a function is thread safe, but I'd like to hear some more commentary ensuring that this is the case, or whether it is implementation specific (as a lot of the STL is...).

You are worried that modifying a container (adding/deleting nodes) in one thread will somehow invalidate the pointer to the container in another thread. The container is an abstraction and will remain valid unless you delete the container object itself. The memory for the data maintained by the container is typically allocated on the heap by its std::allocator.
This is quite different from the memory allocated for the container object itself, which can be on the stack, the heap, etc., depending on how the container object was created. This separation of the container from its allocator is what prevents modifications to the data from moving the container object itself.
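A small illustration of that distinction: the address of the queue object itself never changes, no matter how much data is pushed, because only the element storage is (re)allocated on the heap.

#include <cstdint>
#include <cstdio>
#include <queue>

int main()
{
    std::queue<uint8_t> q;
    std::printf("queue object lives at %p\n", static_cast<void*>(&q));

    for (int i = 0; i < 100000; ++i)
        q.push(static_cast<uint8_t>(i));

    // The container object itself has not moved; only the deque's
    // internal blocks were allocated/reallocated behind the scenes.
    std::printf("queue object still at %p\n", static_cast<void*>(&q));
}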
To make debugging your problem simpler, as Jonathan Reinhart suggests, make it a single-threaded system that reads the stream AND parses it.
On a side note, have you considered using Boost.Lockfree queues or something similar? They are designed exactly for this type of scenario. If you are receiving and parsing packets frequently, locking the queue for every read and write can become a significant performance overhead.
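For a single producer and a single consumer, something like boost::lockfree::spsc_queue avoids the mutex entirely. A rough sketch (the capacity and the helper function names here are arbitrary):

#include <cstdint>
#include <boost/lockfree/spsc_queue.hpp>

// Single-producer/single-consumer ring buffer; push/pop never block.
boost::lockfree::spsc_queue<uint8_t, boost::lockfree::capacity<4096>> rx_queue;

// Serial-reader thread:
void on_byte_received(uint8_t byte)
{
    // push() returns false if the ring is full; decide how to handle overruns.
    while (!rx_queue.push(byte)) { /* drop, spin, or count overruns */ }
}

// Parser thread:
void drain()
{
    uint8_t byte;
    while (rx_queue.pop(byte))
    {
        // ... feed 'byte' into the packet parser ...
    }
}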

Related

How to safely access and write to a complex container from multiple threads in parallel?

I have a case where there is an unordered_map of structs. The struct contains int(s), bool(s) and a vector. My program will fetch data for each item in the map either through an https call to a server or using a websocket (separate https calls are required for each item in the map). When using the websocket, data for all items in the map is returned together. The fetched data is processed and stored in the respective vectors.
The websocket is running in a separate thread and should run throughout the lifetime of the program.
My program has a delete function which can "empty" the entire map. There is also an addItem() function, which will add a new struct to my map.
Whenever the "updatesOn" member of a struct is false, no data is pushed into its vector.
My current implementation has 3 threads:
The main thread will add new items to the map. Another function of the main thread is to fetch data from the vector in a struct. The main thread has a function to empty the map and start again, and another function which only empties the vector.
The second thread runs the websocket client and fills up the vector in a struct as new data arrives. There is a while loop which checks an exit flag; once the exit flag is set in the main thread, this thread terminates.
The third thread is the manager thread. It looks for new entries in the map, does the http download, and then adds the item to the websocket for subsequent data updates. It also runs http downloads at regular intervals, emptying the vector and refilling it.
Right now I have two mutexes.
One is locked before data is written to or read from a vector.
The second is locked when new data is added to or removed from the map, and also when the map is emptied.
I sense this is the wrong usage of mutexes, since I may empty the map while one of the vector elements of its structs is being read or written. That pushes me toward using one mutex for everything.
The problem is that this is a realtime stock data program, i.e. new data arrives every second, sometimes even faster. I am afraid one mutex lock for everything could slow down my entire app.
As described above, all 3 threads have write access to this map, with the main thread capable of emptying it completely.
Keeping speed and thread safety in mind, what would be a good way to implement this?
My data members:
unordered_map<string, tickerDiary> tDiaries;
struct tickerDiary {
    tickerDiary() : name(""), ohlcPeriodicity("minute"), ohlcStatus(false), updatesOn(true), ohlcDayBarIndex(0), rtStatus(false) {}
    string name;
    string ohlcPeriodicity;
    bool ohlcStatus;
    bool rtStatus;
    bool updatesOn;
    int32 ohlcDayBarIndex;
    vector<Quotation> data;
};
struct Quotation {
    union AmiDate DateTime;
    float Price;
    float Open;
    float High;
    float Low;
    float Volume;
    float OpenInterest;
    float AuxData1;
    float AuxData2;
};
Note: I am using C++11.
If I understand your question correctly, your map itself is primarily written in the main thread, and the other threads are only used to operate on the data contained within entries in the map.
Given that, for the non-main threads there are two concerns:
The item that they work on should not disappear out from under them.
They should be the only ones working on their item.
The first of these can most efficiently be solved by decoupling the storage from the map. So for each item, storage is allocated separately (either through the default allocator, or some pooling scheme if you add/remove items a lot), and the map only stores a shared_ptr. Each thread working on an item then just needs to keep a shared_ptr around to make sure the storage will not disappear out from under it. Acquiring the map's associated mutex/shared_mutex is then only necessary for the duration of the fetch/store/remove of the pointers. This works fine as long as it is acceptable that some threads may waste a little time acting on items already removed from the map. Using shared_ptrs ensures you won't leak memory, via their reference counters, and they also do the locking/unlocking for those refcounts (or rather, use the most efficient platform primitives available for them). If you want to know more about shared_ptr, and smart pointers in general, this is a reasonable introduction to the C++ system of smart pointers.
That leaves the second problem, which is probably most easily resolved by keeping a mutex in the data struct (tickerDiary) itself. Threads acquire it when starting operations that require predictable behavior from the struct, and release it once they are done.
Separating the locking this way should reduce contention on the global lock for the map. However, you should benchmark your code to see whether that reduction is worth the extra cost of the allocations and refcounts for the individual items.
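A minimal sketch of that layout, assuming C++11 (so a plain std::mutex guards the map rather than a shared_mutex; everything except tickerDiary is an illustrative name):

#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

struct tickerDiary
{
    std::mutex m;                  // per-item lock for the fields below
    std::string name;
    bool updatesOn = true;
    std::vector<float> data;       // stand-in for vector<Quotation>
};

std::mutex mapMutex;               // guards only the map structure itself
std::unordered_map<std::string, std::shared_ptr<tickerDiary>> tDiaries;

// Fetch a live reference; the shared_ptr keeps the item alive even if
// another thread erases it from the map a moment later.
std::shared_ptr<tickerDiary> find(const std::string& key)
{
    std::lock_guard<std::mutex> lk(mapMutex);
    auto it = tDiaries.find(key);
    if (it == tDiaries.end())
        return nullptr;
    return it->second;
}

void appendQuote(const std::string& key, float price)
{
    if (auto item = find(key))
    {
        std::lock_guard<std::mutex> lk(item->m);   // per-item lock only
        if (item->updatesOn)
            item->data.push_back(price);
    }
}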
I don't think std::vector is the right collection here, but if you insist on using it, you should just have one mutex for each collection.
I would recommend concurrent_vector from Intel TBB or a synchronized data structure from Boost.
A third option would be implementing your own concurrent vector.

Do I have to lock std::queue when getting size?

I'm using std::queue in a multithreaded environment. Other threads may modify the queue as they wish. At some point I would like to call std::queue::size(). Do I have to lock the queue for that call? Will something bad happen if I don't?
This is undefined behavior and anything can happen. Behavior is undefined when one thread accesses an object while another thread is, or might be, modifying it.
I hesitate to add this because whether or not you can think of a way it can fail is not relevant. It's not defined, period. But just in case someone argues there's no imaginable way it could fail: Consider a queue that dynamically allocates a control structure that contains the size of the queue and information about each object in the queue. When the queue is enlarged, a new control structure might be allocated, the old structure freed, and a pointer updated. A concurrent call to size might grab the old pointer and then access it after it's freed and possibly contains completely different information or even has been removed from the memory map.
Just reading the size isn't going to be an issue in itself: you're reading a single memory location. You might read the wrong size, because a change made by another thread might not yet be visible in this thread, but you're not going to read a corrupted value, such as half the bits from one value and half from another.
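In practice, the safe pattern is simply to take the same mutex that already guards push() and pop(), e.g. (a minimal sketch):

#include <cstddef>
#include <mutex>
#include <queue>

std::queue<int> q;
std::mutex q_mutex;   // the same mutex used around push()/pop()

std::size_t safe_size()
{
    std::lock_guard<std::mutex> lock(q_mutex);
    return q.size();  // of course, the value is only a snapshot
}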

Data sharing via stack between publisher and consumer thread

I have a publisher thread and a consumer thread. They share data via a std::stack<Data *>. The publisher simply push()es a pointer and the consumer simply pop()s a pointer, uses it and calls delete on it. Since there is only a single thread publishing pointers, one at a time, and one thread consuming pointers, is there any need to synchronize the stack? Keep in mind that the stack only stores pointers. The publisher pushes a pointer only once its Data object is fully constructed.
Failure to synchronize on non-const methods of containers in std namespace is undefined behavior.
Neither push nor pop is const on the underlying container of a stack, so both threads are writing to the state of the stack's underlying container.
A way to think about it is that both threads are, at the very least, going to fight over the count of elements in the stack: one is trying to increase it, the other is trying to decrease it. (There are other problems, but that one should convince you that both are writing to the state of the stack.)
The std::stack<Data*> instance will need to have its access synchronized, because more than one thread can modify it (via pop() and push()), but the elements contained in it do not, as only a single thread operates on any given element at a time.
Yes, there is a need to synchronize access to the stack, because std::stack class does not guarantee that any operation is atomic and it is possible, that push(), top() and pop() will interleave.
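A minimal sketch of the kind of synchronization described above (the wrapper class name is illustrative):

#include <mutex>
#include <stack>

struct Data { /* ... */ };

class SyncStack
{
public:
    void push(Data* p)
    {
        std::lock_guard<std::mutex> lock(m_);
        stack_.push(p);
    }

    // Returns nullptr when the stack is empty.
    Data* pop()
    {
        std::lock_guard<std::mutex> lock(m_);
        if (stack_.empty())
            return nullptr;
        Data* p = stack_.top();
        stack_.pop();
        return p;
    }

private:
    std::mutex m_;
    std::stack<Data*> stack_;
};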

Queue in shared memory acting up

Shared memory is giving me a hard time and GDB isn't being much help. I've got 32KB of shared memory allocated, and I cast the pointer returned by shmat() to a pointer to a struct containing A) a bool and B) a queue of objects containing one std::string, three ints, and one bool, plus assorted methods. (I don't know if this matryoshka structure is how you're supposed to do it, but it's the only way I know. Using a message queue isn't an option, and I need to use multiple processes.)
Pushing one object onto the queue works, but when I try to push a second, the program freezes. No error message, no nothing. What's causing this? I doubt it's a lack of memory, but if it is, how much do I need?
EDIT: In case I was unclear -- the objects in the queue are of a class with the five data members described.
EDIT 2: I changed the class of the queue's entries so that it doesn't use std::string. (Embarrassingly enough, I was able to represent the data with a primitive.) The program still freezes on the second push().
EDIT 3: I tried calling front() from the same queue immediately after the first push(), and it froze the program too. Checking the value of the bool outside the queue, however, worked fine, so it's gotta be something wrong with the queue itself.
EDIT 4: As an experiment, I added an std::queue<int> to the struct I was using for the shared memory. It showed the same behavior -- push() worked once, then front() made it freeze. So it's not a problem with the class I'm using for the queue items, either.
This question suggests I'm not likely to solve this with std::queue. Is that so? Should I use boost like it says? (In my case, I'm executing shmget() and shmat() in the parent process and trying to let two child processes communicate, so it's slightly different.)
EDIT 5: The other child process also freezes when it calls front(). A semaphore ensures this happens after the first push() call.
Putting std::string objects into a shared memory segment can't possibly work.
It should work fine for a single process, but as soon as you try to access it from a second process, you'll get garbage: the string will contain a pointer to heap-allocated data, and that pointer is only valid in the process that allocated it.
I don't know why your program freezes, but it is completely pointless to even think about.
As I said in my comment, your problem stems from attempting to use objects that internally require heap allocation inside a structure that should be self-contained (i.e. require no further dynamically allocated memory).
I would tweak your setup and change the std::string to a fixed-size character array, something like
// this structure fits nicely into a typical cache line
struct Message
{
    boost::array<char, 48> some_string;
    int a, b, c;
    bool flag;
};
Now, when you need to post something on the queue, copy the string content into some_string. Of course you should size the string appropriately (and boost::array probably isn't the best choice - ideally you want some length information too), but you get the idea...
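For example, copying the string content into the fixed-size buffer could look something like this (a sketch; set_string is just an illustrative helper for the Message struct above):

#include <cstring>
#include <boost/array.hpp>

// Uses the Message struct defined above.
void set_string(Message& msg, const char* text)
{
    // Truncate to fit and always leave a terminating NUL, so the struct
    // stays self-contained (no heap pointers that other processes can't follow).
    std::strncpy(msg.some_string.data(), text, msg.some_string.size() - 1);
    msg.some_string[msg.some_string.size() - 1] = '\0';
}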

Object delete itself from container

So I have a container (any kind, probably std::map or std::vector) which contains objects of a class with some network thing running in a thread that checks whether it is still connected (the thread is defined inside that class and launched when it is constructed).
Is there any way I can make the object delete itself from the container when it is disconnected, or should I move the thread outside the object and use that class just to store data?
In order for the object to delete itself from the container, it will have to know which container it is in. You will need to maintain a pointer to the container in the object. You will also have to protect the container with a lock to stop multiple threads accessing the container at the same time.
I think I prefer your second solution - some managing object looks after removing dead objects from the collection. If nothing else, this will be quite a bit easier to debug and the locking logic becomes centralised in a single object.
I would have an unload queue.
When a thread notices that the connection is down, it registers the object (and its container) with the unload queue, tidies everything up as much as possible, and then the thread terminates.
A separate thread then sits behind the unload queue; its sole purpose is to monitor the queue. When it sees a new object on the queue, it removes it from the container and then destroys it (syncing with the object's thread as required).
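Very roughly, the unload-queue idea might look like this (all names and the shutdown handling are placeholders):

#include <algorithm>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <vector>

struct Connection { /* owns its worker thread; the destructor joins it */ };

std::vector<std::shared_ptr<Connection>> connections;  // the main container
std::mutex containerMutex;

std::queue<std::shared_ptr<Connection>> unloadQueue;    // dead connections
std::mutex unloadMutex;
std::condition_variable unloadCv;

// Called from a connection's own thread when it sees the link drop.
void reportDisconnected(std::shared_ptr<Connection> dead)
{
    std::lock_guard<std::mutex> lock(unloadMutex);
    unloadQueue.push(std::move(dead));
    unloadCv.notify_one();
}

// Body of the dedicated cleanup thread.
void cleanupLoop()
{
    for (;;)
    {
        std::unique_lock<std::mutex> lock(unloadMutex);
        unloadCv.wait(lock, [] { return !unloadQueue.empty(); });
        std::shared_ptr<Connection> dead = unloadQueue.front();
        unloadQueue.pop();
        lock.unlock();

        std::lock_guard<std::mutex> cl(containerMutex);
        connections.erase(
            std::remove(connections.begin(), connections.end(), dead),
            connections.end());
        // 'dead' is destroyed at the end of this iteration once the
        // last shared_ptr to it goes away.
    }
}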
STL containers tend to assume they're storing values: objects that can be copied, where the copies are identical. Objects which own threads typically fit poorly into that model; they have a much stronger sense of identity. In this case, you definitely have identity - a copy of the object in a container is distinct from a copy outside it.
I had a problem very similar to yours, which I solved by emitting a boost::signal from the "network thing" when it detected the disconnection, caught by the object managing the container. Upon receiving that signal, it would iterate through the container, removing the dead network session from it. It might be worth looking at it here:
How to make a C++ boost::signal be caught from an object which encapsulates the object which emits it?
Cheers,
Claudio