I have a simulation program. In the main class of the simulation I am "creating + adding" and "removing + destroying" Agents.
The problem is that once in a while (once every 3-4 runs) the program crashes because I am apparently calling a function on an invalid agent in the main loop. The program works just fine most of the time. There are normally thousands of agents in the list.
I don't know how it is possible that there are invalid Agents in my loop.
It is very difficult to debug, because I receive the memory exception inside Agent::Step (which is too late: by then I can no longer tell how the invalid Agent got into the list and was called).
When I inspect the Agent inside Agent::Step at the exception point, none of its data makes sense, not even the initialized members. So it is definitely invalid.
void World::step()
{
    AddDemand();

    // run over all the agents and check whether they have remaining actions
    // Call their step function if they have, otherwise remove them from space and memory
    list<Agent*>::iterator it = agents_.begin();
    while (it != agents_.end())
    {
        if (!(*it)->AllIntentionsFinished())
        {
            (*it)->step();
            it++;
        }
        else
        {
            (*it)->removeYourselfFromSpace(); // removes its reference from the space
            delete (*it);
            agents_.erase(it++);
        }
    }
}

void World::AddDemand()
{
    int demand = demandIdentifier_.getDemandLevel(stepCounter_);
    for (int i = 0; i < demand; i++)
    {
        Agent* tmp = new Agent(*this);
        agents_.push_back(tmp);
    }
}
Agent:

bool Agent::AllIntentionsFinished()
{
    return this->allIntentionsFinished_; // bool flag will be true if all work is done
}
1- Is it possible that Visual Studio 2012's loop optimizations (i.e. automatically running loops multi-threaded where possible) create the problem?
2- Any suggestions on debugging the code?
If you're running the code multi-threaded, then you'll need to add code to protect things like adding items to and removing items from the list. You can create a wrapper that adds thread safety for a container fairly easily -- have a mutex that you lock any time you do a potentially modifying operation on the underlying container.
#include <mutex>

template <class Container>
class thread_safe {
    Container c;
    std::mutex m;
public:
    void push_back(typename Container::value_type const &t) {
        std::lock_guard<std::mutex> l(m);
        c.push_back(t);
    }
    // ...
};
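Hypothetical usage, once the remaining forwarding members (size, erase, iteration, and so on) are filled in; the world argument is assumed from the question's Agent constructor:

thread_safe<std::list<Agent*>> agents;
agents.push_back(new Agent(world)); // concurrent pushes are serialized by the mutex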
A few other points:
You can almost certainly clean your code up quite a bit by having the list hold Agents directly, instead of a pointer to an Agent that you have to allocate dynamically.
Your Agent::RemoveYourselfFromSpace looks/sounds a lot like something that should be handled by Agent's destructor.
You can almost certainly do quite a bit more to clean up the code by using some standard algorithms.
For example, it looks to me like your step could be written something like this (assuming the list holds Agent by value and AllIntentionsFinished is const-qualified):

agents.remove_if([](Agent const &a) { return a.AllIntentionsFinished(); });

std::for_each(agents.begin(), agents.end(),
              [](Agent &a) { a.step(); });
...or, you might prefer to continue using an explicit loop, but use something like:
for (Agent &a : agents)
    a.step();
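If the list keeps holding Agent* as in the question, the same remove_if idea still works; a sketch (untested):

agents_.remove_if([](Agent *a) {
    if (!a->AllIntentionsFinished())
        return false;
    a->removeYourselfFromSpace(); // detach before destruction
    delete a;
    return true; // the list then erases the now-dangling pointer
});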
The problem is this:
agents_.erase(it++);
See Add and remove from a list in runtime
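In practice that means letting erase hand back the iterator to the next element, rather than combining erase with a post-increment:

else
{
    (*it)->removeYourselfFromSpace();
    delete *it;
    it = agents_.erase(it); // erase returns the iterator following the erased element
}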
I don't see any thread-safe components in the code you showed, so if you are running multiple threads and sharing data between them, then absolutely you could have a threading issue. For instance, you do this:
(*it)->removeYourselfFromSpace(); //removes its reference from the space
delete (*it);
agents_.erase(it++);
This is the worst possible order for an unlocked list. You should remove the element from the list first, then detach the object from the space, then delete it, in that order.
But if you are not specifically creating threads which share lists/agents, then threading is probably not your problem.
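For illustration, a hedged sketch of that order applied to the loop in the question; the mutex is hypothetical and stands in for whatever synchronization the program would need:

std::unique_lock<std::mutex> lock(agentsMutex_); // hypothetical mutex guarding agents_
Agent *doomed = *it;
it = agents_.erase(it); // 1. unlink first: other threads can no longer reach the agent
lock.unlock();
doomed->removeYourselfFromSpace(); // 2. detach from the space
delete doomed;                     // 3. destroy and free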
What I want to do is basically queue a bunch of task objects in a container, where a task can remove itself from the queue. But I also don't want the object to be destroyed when it removes itself, so that it can finish whatever work it is doing.
So a safe way to do this is either to call RemoveSelf() once the work is done, or to take a keepAlive reference first and then continue the work. I've verified that this does indeed work, while DoWorkUnsafe will always crash after a few iterations.
I'm not particularly happy with the solution, because I have to either remember to call RemoveSelf() at the end of the work, or remember to use a keepAlive; otherwise it will cause undefined behavior.
Another problem is that if someone decides to iterate through the ownerSet and do work, removal would invalidate the iterator as they iterate, which is also unsafe.
Alternatively, I know I can instead put the task onto a separate "cleanup" queue and destroy finished tasks separately. But this method seemed neater to me, but with too many caveats.
Is there a better pattern to handle something like this?
#include <memory>
#include <unordered_set>
#include <vector>

class SelfDestruct : public std::enable_shared_from_this<SelfDestruct> {
public:
    SelfDestruct(std::unordered_set<std::shared_ptr<SelfDestruct>> &ownerSet)
        : _ownerSet(ownerSet) {}

    void DoWorkUnsafe() {
        RemoveSelf();
        DoWork();
    }

    void DoWorkSafe() {
        DoWork();
        RemoveSelf();
    }

    void DoWorkAlsoSafe() {
        auto keepAlive = RemoveSelf();
        DoWork();
    }

    std::shared_ptr<SelfDestruct> RemoveSelf() {
        auto keepAlive = shared_from_this();
        _ownerSet.erase(keepAlive);
        return keepAlive;
    }

private:
    void DoWork() {
        for (auto i = 0; i < 100; ++i)
            _dummy.push_back(i);
    }

    std::unordered_set<std::shared_ptr<SelfDestruct>> &_ownerSet;
    std::vector<int> _dummy;
};

TEST_CASE("Self destruct should not cause undefined behavior") {
    std::unordered_set<std::shared_ptr<SelfDestruct>> ownerSet;
    for (auto i = 0; i < 100; ++i)
        ownerSet.emplace(std::make_shared<SelfDestruct>(ownerSet));
    while (!ownerSet.empty()) {
        (*ownerSet.begin())->DoWorkSafe();
    }
}
There is a good design principle that says each class should have exactly one purpose. A "task object" should exist to perform that task. When you start adding additional responsibilities, you tend to end up with a mess. Messes can include having to remember to call a certain method after completing the primary purpose, or having to remember to use a hacky workaround to keep the object alive. Messes are often a sign of inadequate thought put into the design. Being unhappy with a mess speaks well of your potential for good design.
Let us backtrack and look at the real problem. There are task objects stored in a container. The container decides when to invoke each task. The task must be removed from the container before the next task is invoked (so that it is not invoked again). It looks to me like the responsibility for removing elements from the container should fall to the container.
So we'll re-envision your class without that "SelfDestruct" mess. Your task objects exist to perform a task. They are probably polymorphic, hence the need for a container of pointers to task objects rather than a container of task objects. The task objects don't care how they are managed; that is work for someone else.
class Task {
public:
    Task() {}
    // Other constructors, the destructor, assignment operators, etc. go here

    void DoWork() {
        // Stuff is done here.
        // The work might involve adding tasks to the queue.
    }
};
Now focus on the container. The container (more precisely, the container's owner) is responsible for adding and removing elements. So do that. You seem to prefer removing the element before invoking it. That seems like a good idea to me, but don't try to pawn off the removal on the task. Instead use a helper function, keeping this logic at the abstraction level of the container's owner.
// Extract the first element of `ownerSet`. That is, remove it and return it.
// ASSUMES: `ownerSet` is not empty
std::shared_ptr<Task> extract(std::unordered_set<std::shared_ptr<Task>>& ownerSet)
{
    auto begin = ownerSet.begin();
    std::shared_ptr<Task> first{*begin};
    ownerSet.erase(begin);
    return first;
}

TEST_CASE("Removal from the container should not cause undefined behavior") {
    std::unordered_set<std::shared_ptr<Task>> ownerSet;
    for (int i = 0; i < 100; ++i)
        ownerSet.emplace(std::make_shared<Task>());
    while (!ownerSet.empty()) {
        // The temporary returned by extract() will live until the semicolon,
        // so it will (barely) outlive the call to DoWork().
        extract(ownerSet)->DoWork();
        // This is equivalent to:
        //auto todo{extract(ownerSet)};
        //todo->DoWork();
    }
}
From one perspective, this is an almost trivial change from your approach, as all I did was shift a responsibility from the task object to the owner of the container. Yet with this shift, the mess disappears. The same steps are performed, but they make sense and are almost forced when moved to a more appropriate context. Clean design tends to lead to clean implementation.
Edit:
I have now finished my queue (overcoming the problem described below, and more). For those interested, it can be found here. I'd be happy to hear any remarks :). Please note the queue isn't just a work-item queue, but rather a template container which could of course be instantiated with work items.
Original:
After watching Herb Sutter's talk on concurrency in C++11 and 14, I got all excited about non-blocking concurrency.
However, I've not yet been able to find a solution for what I consider a basic problem. So if this is already on here, please be gentle with me.
My problem is quite simple. I'm creating a very simple thread pool. In order to do this, I have some worker threads running inside the workPool class, and I keep a list of workItems.
How do I add a work item in a lock-free way?
The non-lock-free way of doing this would of course be to create a mutex: lock it when adding an item, and read (and lock, of course) the list once the current work item is done.
I don't know how to do this in a lock-free way, however.
Below is a rough idea of what I'm creating. I wrote this code for this question, and it's neither complete nor error-free :)
#include <thread>
#include <deque>
#include <vector>
#include <functional>
#include <atomic>

class workPool
{
public:
    workPool(int workerCount) :
        running(true)
    {
        for (int i = workerCount; i > 0; --i)
            workers.push_back(std::thread(&workPool::doWork, this));
    }

    ~workPool()
    {
        running = false;
        for (auto &worker : workers)
            worker.join();
    }

private:
    std::atomic<bool> running;
    std::vector<std::thread> workers;
    std::deque<std::function<void()>> workItems;

    void doWork()
    {
        while (running)
        {
            if (workItems.empty())
            {
                // here the thread should be paused till a new item is added
                continue;
            }
            (*workItems.begin())();
            workItems.erase(workItems.begin());
        }
    }

    void addWorkitem()
    {
        // This is my confusion. How should I do this?
    }
};
I have seen Herb's talks recently and I believe his lock-free linked list should do fine. The only problem is that atomic< shared_ptr<T> > is not yet implemented. I've used the atomic_* function calls as also explained by Herb in his talk.
In the example, I've simplified a task to an int, but it could be anything you want.
The function atomic_compare_exchange_weak takes three arguments: the item to compare, the expected value and the desired value. It returns true or false to indicate success or failure. On failure, the expected value will be changed to the value that was found instead.
#include <memory>
#include <atomic>

// Untested code.

struct WorkItem { // Simple linked list implementation.
    int work;
    std::shared_ptr<WorkItem> next; // remember to use as atomic
};

class WorkList {
    std::shared_ptr<WorkItem> head; // remember to use as atomic

public:
    // Used by producers to add work to the list. This implementation adds
    // new items to the front (stack), but it can easily be changed to a queue.
    void push_work(int work) {
        std::shared_ptr<WorkItem> p(new WorkItem()); // The new item we want to add.
        p->work = work;
        p->next = std::atomic_load(&head);

        // Do we get to change head to p?
        while (!std::atomic_compare_exchange_weak(&head, &p->next, p)) {
            // Nope, someone got there first, try again with the new p->next,
            // and remember: p->next is automatically changed to the new value of head.
        }
        // Yup, great! Everything's done then.
    }

    // Used by consumers to claim items to process.
    int pop_work() {
        auto p = std::atomic_load(&head); // The item we want to process.
        int work = (p ? p->work : -1);

        // Do we get to change head to p->next?
        while (p && !std::atomic_compare_exchange_weak(&head, &p, p->next)) {
            // Nope, someone got there first, try again with the new p,
            // and remember: p is automatically changed to the new value of head.
            work = (p ? p->work : -1); // Make sure to update work as well!
        }
        // Yup, great! Everything's done then, return the new task.
        return work; // Returns -1 if list is empty.
    }
};
Edit: The reason for using shared_ptr in combination with atomic_* functions is explained in the talk. In a nutshell: popping an item from the linked list might delete it from underneath someone traversing the list, or a different node might get allocated on the same memory address (The ABA Problem). Using shared_ptr will ensure any old readers will hold a valid reference to the original item.
As Herb explained, this makes the pop-function trivial to implement.
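Hypothetical usage, with push_work called by producers and pop_work by consumers:

WorkList list;
list.push_work(1);
list.push_work(2);

int w;
while ((w = list.pop_work()) != -1) {
    // process w; pop_work() returns -1 once the list is empty
}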
Lock free in this kind of context where you have a shared resource (a work queue) is often going to be replaced by atomics and a CAS loop if you really dig deep.
The basic idea is rather simple to get a lock-free concurrent stack (edit: though perhaps a bit deceptively tricky as I made a goof in my first post -- all the more reason to appreciate a good lib). I chose a stack for simplicity but it doesn't take much more to use a queue instead.
Writing to the stack:

    Create a new work item.
    Loop repeatedly:
        Store the top pointer to the stack.
        Set the work item's next pointer to the top of the stack.
        Atomic: compare-and-swap the top pointer with the pointer to the work item.
        If this succeeds and returns the top pointer we stored, break out of the loop.

Popping from the stack:

    Loop:
        Fetch the top pointer.
        If the top pointer is not null:
            Atomic: CAS the top pointer with the next pointer.
            If successful, break.
        Else:
            (Optional) Sleep/Yield to avoid burning cycles.
    Process the item pointed to by the previously fetched top pointer.
Now if you get really elaborate, you can stick in other work for the thread to do when a push or pop fails, e.g. yielding or picking up some other pending task. A concrete sketch of the two loops above follows.
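Here is a minimal C++11 sketch of those two loops using a raw atomic pointer. Note that it deliberately ignores the ABA problem and safe memory reclamation (which the shared_ptr-based answer above handles), so treat it as an illustration of the loop structure only, not production code:

#include <atomic>

struct Node {
    int work;
    Node *next;
};

std::atomic<Node*> top{nullptr};

void push(int work) {
    Node *n = new Node{work, top.load()};
    // CAS loop: on failure, compare_exchange_weak reloads the
    // current top into n->next, so we simply retry.
    while (!top.compare_exchange_weak(n->next, n)) {
    }
}

Node *pop() {
    Node *n = top.load();
    // CAS loop: on failure, n is reloaded with the current top.
    // Reading n->next here is exactly where ABA/reclamation bites;
    // see the caveat above.
    while (n && !top.compare_exchange_weak(n, n->next)) {
    }
    return n; // nullptr if the stack was empty; the caller owns the node
}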
I do not know how to do this in C++11 (or later); however, here is a solution for how to do it with C++98 and Boost (v1.50):
This is obviously not a very useful example, it's only for demonstrative purposes:
#include <boost/scoped_ptr.hpp>
#include <boost/function.hpp>
#include <boost/bind.hpp>
#include <boost/asio/io_service.hpp>
#include <boost/thread.hpp>

class WorkHandler
{
public:
    WorkHandler();
    ~WorkHandler();

    typedef boost::function<void(void)> Work; // the type of work we can handle

    void AddWork(Work w) { pThreadProcessing->post(w); }

private:
    void ProcessWork();

    boost::scoped_ptr<boost::asio::io_service> pThreadProcessing;
    bool runThread; // Make sure this is atomic
    boost::thread thread; // declared after runThread so the thread starts last
};

WorkHandler::WorkHandler()
    : pThreadProcessing(new boost::asio::io_service), // create our io service
      runThread(true), // run the thread
      thread(&WorkHandler::ProcessWork, this) // create our thread last, so runThread is already set
{
}

WorkHandler::~WorkHandler()
{
    runThread = false; // stop running the thread
    thread.join(); // wait for the thread to finish
}

void WorkHandler::ProcessWork()
{
    while (runThread) // while the thread is running
    {
        pThreadProcessing->run(); // process work
        pThreadProcessing->reset(); // prepare for more work
    }
}

int CalculateSomething(int a, int b)
{
    return a + b;
}

int main()
{
    WorkHandler wh; // create a work handler

    // give it some work to do
    wh.AddWork(boost::bind(&CalculateSomething, 4, 5));
    wh.AddWork(boost::bind(&CalculateSomething, 10, 100));
    wh.AddWork(boost::bind(&CalculateSomething, 35, -1));

    Sleep(2000); // ONLY for demonstration! This just allows the thread a chance to work before we destroy it.
    return 0;
}
boost::asio::io_service is thread-safe, so you can post work to it without needing mutexes.
NB: although I haven't made the bool runThread atomic, for thread safety it should be (I just don't have <atomic> in my C++).
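If a newer Boost were available (Boost.Atomic first shipped in 1.53), the flag could simply be declared atomic instead:

#include <boost/atomic.hpp>

boost::atomic<bool> runThread; // drop-in replacement for the plain bool above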
I currently have a program that has a cache-like mechanism. I have a thread listening for updates from another server to this cache. This thread will update the cache when it receives an update. Here is some pseudocode:
void cache::update_cache()
{
    cache_ = new std::map<std::string, value>();
    while (true)
    {
        if (recv().compare("update") == 0)
        {
            std::map<std::string, value> *new_info = new std::map<std::string, value>();
            std::map<std::string, value> *tmp;
            //Get new info, store in new_info
            tmp = cache_;
            cache_ = new_info;
            delete tmp;
        }
    }
}

std::map<std::string, value> *cache::get_cache()
{
    return cache_;
}
cache_ is read from many different threads concurrently. I believe that, as written, I will run into undefined behavior if one of my threads calls get_cache(), then my cache updates, then the thread tries to access the stored cache.
I am looking for a way to avoid this problem. I know I could use a mutex, but I would rather not block reads from happening as they have to be as low latency as possible, but if need be, I can go that route.
I was wondering if this would be a good use case for a unique_ptr. Is my understanding correct in that if a thread calls get_cache, and that returns a unique_ptr instead of a plain pointer, then once all threads that hold the old version of the cache are finished with it (i.e. leave scope), the object will be deleted?
Is using a unique_ptr the best option for this case, or is there another option that I am not thinking of?
Any input will be greatly appreciated.
Edit:
I believe I made a mistake in my OP. I meant to use and pass a shared_ptr, not a unique_ptr, for cache_. When all threads are finished with the old cache_, the shared_ptr should delete the map it owns.
A little about my program: it is a webserver that will use this information to decide what information to return. It is fairly high throughput (thousands of req/sec). Each request queries the cache once, so telling my other threads when to update is no problem. I can tolerate slightly out-of-date information, and would prefer that over blocking all of my threads from executing if possible. The information in the cache is fairly large, so I would like to limit any copies of it by value.
update_cache is only run once. It is run in a thread that just listens for an update command and runs the code.
I feel there are multiple issues:
1) Do not leak memory: never use a bare "delete" in your code; stick with unique_ptr (or shared_ptr in specific cases).
2) Protect accesses to shared data, either by locking (mutex) or with a lock-free mechanism (std::atomic).
class Cache {
    using Map = std::map<std::string, value>;
    std::unique_ptr<Map> m_cache;
    std::mutex m_cacheLock;
public:
    void update_cache()
    {
        while (true)
        {
            if (recv().compare("update") == 0)
            {
                std::unique_ptr<Map> new_info { new Map };
                //Get new info, store in new_info
                {
                    std::lock_guard<std::mutex> lock{m_cacheLock};
                    using std::swap;
                    swap(m_cache, new_info);
                }
            }
        }
    }
};
Note: I don't like update_cache() being part of a public interface for the cache as it contains an infinite loop. I would probably externalize the loop with the recv and have a:
void update_cache(std::unique_ptr<Map> new_info)
{
    { // This inner brace is not useless: we don't want to hold the lock during deletion
        std::lock_guard<std::mutex> lock{m_cacheLock};
        using std::swap;
        swap(m_cache, new_info);
    }
    // new_info (now holding the old map) is destroyed here, outside the lock
}
Now for reading from the cache, use proper encapsulation and don't let a pointer to the member map escape:
value get(const std::string &key)
{
    // lock, fetch, and return.
    // Depending on the value type, you might want to allocate memory
    // before locking
}
Using this signature you have to throw an exception if the value is not present in the cache, another option is to return something like a boost::optional.
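For instance, the throwing variant might look like this (a sketch, assuming value is cheap enough to copy while holding the lock):

value get(const std::string &key)
{
    std::lock_guard<std::mutex> lock{m_cacheLock};
    auto it = m_cache->find(key);
    if (it == m_cache->end())
        throw std::out_of_range("key not in cache");
    return it->second; // copied while the lock is held
}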
Overall you can keep a low latency (everything is relative, I don't know your use case) if you take care of doing costly operations (memory allocation for instance) outside of the locking section.
shared_ptr is very reasonable for this purpose, C++11 has a family of functions for handling shared_ptr atomically. If the data is immutable after creation, you won't even need any additional synchronization:
class cache {
public:
    using map_t = std::map<std::string, value>;

    void update_cache();
    std::shared_ptr<const map_t> get_cache() const;

private:
    std::shared_ptr<const map_t> cache_;
};

void cache::update_cache()
{
    while (true)
    {
        if (recv() == "update")
        {
            auto new_info = std::make_shared<map_t>();
            // Get new info, store in new_info
            // Make immutable & publish
            std::atomic_store(&cache_,
                              std::shared_ptr<const map_t>{std::move(new_info)});
        }
    }
}

auto cache::get_cache() const -> std::shared_ptr<const map_t> {
    return std::atomic_load(&cache_);
}
My code has one thread continuously handling objects queued by other threads. The queued objects are created with "new" in a function that will have returned by the time the object is handled. Everything works fine except deleting the object.
Should I just not delete the object? Or maybe change the way the objects are passed/created?
Object* myQueue[10];

void function() {
    Object* myobject = new Object();
    queueObject(myobject);
}

void queueObject(Object* object) {
    myQueue[index_x] = object;
    sem_post(&mySemaphore);
}

//// Thread 1
function();
...

//// Thread 2
void handleObjects() {
    while (true) {
        sem_wait(&mySemaphore);
        // handle myQueue[index_x]
        delete myQueue[index_x]; // ---> this produces Segmentation Fault
    }
}
(the handling of index_x is omitted for brevity)
I'm guessing you have a race condition. What is the synchronization mechanism you're using to prevent index_x from being modified by both threads?
Typically a worker thread should call sem_wait, modify the critical data, and then call sem_post. I can't provide 100% accurate example code without seeing how you're using index_x, but it will look something like the following:
void queueObject(Object* object) {
    sem_wait(&mySemaphore);
    myQueue[index_x++] = object;
    sem_post(&mySemaphore);
}

void handleObjects() {
    while (true) {
        sem_wait(&mySemaphore);
        // handle myQueue[index_x]
        delete myQueue[--index_x];
        sem_post(&mySemaphore);
    }
}
Currently it looks like you have nothing to prevent index_x from being modified by both threads, this can cause index_x to do whacky things (fail to increment or decrement being the most common whacky thing). Here is a wikipedia article explaining exactly what can go wrong.
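An alternative sketch (reusing the question's names) keeps the semaphore for counting available items and adds a separate mutex to guard index_x and the array; this way the consumer blocks on an empty queue instead of decrementing index_x below zero:

pthread_mutex_t queueMutex = PTHREAD_MUTEX_INITIALIZER;

void queueObject(Object *object) {
    pthread_mutex_lock(&queueMutex);
    myQueue[index_x++] = object;
    pthread_mutex_unlock(&queueMutex);
    sem_post(&mySemaphore); // signal: one more item is available
}

void handleObjects() {
    while (true) {
        sem_wait(&mySemaphore); // block until at least one item is available
        pthread_mutex_lock(&queueMutex);
        Object *object = myQueue[--index_x];
        pthread_mutex_unlock(&queueMutex);
        // handle object outside the lock
        delete object;
    }
}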
Add some checks around the delete:

if (myQueue[index] != 0) {
    delete myQueue[index];
    myQueue[index] = 0;
} else {
    // for diagnosis, print a large warning here - something is confused
}
This catches double deletion via the same index. However, there are several other ways a crash could occur; catching those would need other actions.
Consider:
Is there any possibility of a race condition? Could two threads attempt to delete at the same index? Do you need to add any synchronization?
Is it possible for the same object to be added to the array twice, with different indexes? In extremis I might add code to verify that the item isn't already in the array before adding it.
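That duplicate check could be as simple as a linear scan before queueing (a sketch, assuming index_x counts the currently queued items):

bool alreadyQueued(Object *object) {
    for (int i = 0; i < index_x; ++i)
        if (myQueue[i] == object)
            return true; // already in the array - refuse to add it twice
    return false;
}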
I am having a problem related to C/C++:
Suppose I have some class
class Demo
{
    int constant;
public:
    void setConstant(int value)
    {
        constant = value;
    }

    void submitTask()
    {
        // need to make a call to C-based runtime system to submit a
        // task which will be executed "asynchronously"
        submitTask((void *)&constant);
    }
};

// runtime system will call this method when task will be executed
void func(void *arg)
{
    int constant = *((int *)arg);
    // Read this constant value but don't modify here....
}
Now in my application, I do something like this:
int main()
{
    ...
    Demo objDemo;
    for (...)
    {
        objDemo.setConstant(<somevalue>);
        objDemo.submitTask();
    }
    ...
}
Now, hopefully you see the problem: each task should read the value that was set immediately before its asynchronous submission. Because the calls are asynchronous, a task can read the wrong value, which sometimes results in unexpected behavior.
I don't want to enforce synchronous task execution just because of this constraint. The number of tasks created is not known in advance. I just need to pass this simple integer constant in an elegant way that works with asynchronous execution. Obviously I cannot change the runtime behavior (meaning the signature of void func(void *arg) is fixed).
Thanks in advance.
If you don't want to wait for the C code to finish before you make the next call then you can't reuse the same memory location over and over. Instead, create an array and then pass those locations. For this code, I'm going to assume that the number of times the for loop will run is n. This doesn't have to be known until it's time for the for loop to run.
int* values = new int[n];
for (int i = 0; i < n; i++) {
    values[i] = <somevalue>;
    submitTask((void*)&values[i]);
}
At some later point when you're sure it's all done, then call
delete[] values;
Or, alternately, instead of an array of ints, create an array of Demo objects.
Demo* demo = new Demo[n];
for (int i = 0; i < n; i++) {
    demo[i].setConstant(<somevalue>);
    demo[i].submitTask();
}
But the first makes more sense to me as the Demo object doesn't really seem to do anything worthwhile. But you may have left out methods and members not relevant to the question, so that could change which option is best. Regardless, the point is that you need separate memory locations for separate values if you don't know when they're going to get used and don't want to wait.