Iterating over a changing container - C++

I am iterating over a set of callback functions. The functions are called during iteration and may cause drastic changes to the actual container of the function set.
What I am doing now is:
make a copy of the original set
iterate over the copy, but for every element check whether it still exists in the original set
Checking every element for existence keeps things fully dynamic, but it also seems quite slow.
Are there other suggestions for tackling this case?
Edit: here is the actual code:
// => i = event id
template <class Param>
void dispatchEvent(int i, Param param) {
    EventReceiverSet processingNow;
    const EventReceiverSet& eventReceiverSet = eventReceiverSets[i];
    std::copy(eventReceiverSet.begin(), eventReceiverSet.end(), std::inserter(processingNow, processingNow.begin()));
    while (!processingNow.empty()) {
        EventReceiverSet::iterator it = processingNow.begin();
        IFunction<>* function = it->getIFunction(); // get the function before erasing the iterator
        processingNow.erase(it);
        // is the EventReceiver still valid? (it may have been removed from the original set)
        if (eventReceiverSet.find(ERWrapper(function)) == eventReceiverSet.end()) continue; // not found
        function->call(param);
    }
}

Two basic approaches come to mind:
use a task-based approach (with the collection locked, push tasks onto a queue for each element, then release all parties to do work and wait till completion). You'll still need a check to see whether the element for the current task is still present/current in the collection when the task actually starts.
this could leverage reader-writer locks for the checks, which is usually faster than full-blown mutual exclusion (especially with more readers than writers)
use a concurrent data structure (I mean, one that is suitable for multithreaded access without explicit locking). The following libraries contain implementations of concurrent data structures:
Intel Thread Building Blocks
MS ConCrt concurrent_vector
libcds Concurrent Data Structures

There is a way to do it in two steps: first, go through the original set, and make a set of action items. Then go through the set of action items, and apply them to the original set.
An action item is a base class with subclasses. Each subclass takes in a set, and performs a specific operation on it, for example:
struct set_action {
    virtual void act(std::set<int>& mySet) const = 0;
    virtual ~set_action() {}
};
class del_action : public set_action {
private:
    int item;
public:
    del_action(int _item) : item(_item) {}
    virtual void act(std::set<int>& mySet) const {
        mySet.erase(item); // delete item from the set
    }
};
class upd_action : public set_action {
private:
    int from, to;
public:
    upd_action(int _from, int _to) : from(_from), to(_to) {}
    virtual void act(std::set<int>& mySet) const {
        mySet.erase(from);  // delete [from]
        mySet.insert(to);   // insert [to]
    }
};
Now you can create a collection of set_action*s in the first pass, and run them in the second pass.

The operations which mutate the set structure are insert() and erase().
While iterating, consider using the iterator returned by the mutating operations.
it = myset.erase(it);
http://www.cplusplus.com/reference/stl/set/erase/


Remove related object from list C++

I have some code:
class LowLevelObject {
public:
    void* variable;
};
// internal; I can't access, erase, or push to it. It just exists somewhere.
std::list<LowLevelObject*> low_level_objects_list;

class HighLevelObject {
public:
    LowLevelObject* low_level_object;
};
// my list of objects
std::list<HighLevelObject*> high_level_objects_list;

// a callback which notifies that a LowLevelObject* was added to low_level_objects_list
void CallbackAttachLowLevelObject(LowLevelObject* low_level_object) {
    HighLevelObject* high_level_object = new HighLevelObject;
    high_level_object->low_level_object = low_level_object;
    low_level_object->variable = high_level_object;
    high_level_objects_list.push_back(high_level_object);
}
void CallbackDetachLowLevelObject(LowLevelObject* low_level_object) {
    // how to delete my HighLevelObject* from high_level_objects_list?
    // the HighLevelObject* address is in the `variable` field of LowLevelObject.
}
I have a low-level object which is defined in a library; it contains a field, variable, for use by the user.
From my code I set this variable to point to my HighLevelObject.
The library lets me set callbacks on adding and removing a LowLevelObject from its list.
But how can I remove my HighLevelObject from my own list of objects?
Of course, I know that I can iterate over the whole list, find the object by pointer, and remove it, but that is the long way around.
The list may contain a lot of objects.
Thanks in advance!
The setup lends itself to finding a solution where converting a pointer to an iterator is a constant-time operation. Boost.Intrusive offers this feature. This will require changes to your code though; if you were not careful about encapsulation, these changes might be significant. A boost::intrusive::list is functionally similar to a std::list, but requires some changes to your data structure. This option might not be for everyone.
Another feature of Boost.Intrusive is that sometimes you do not need to explicitly convert a pointer to an iterator. If you enable auto-unlinking, then the actual deletion from the list happens behind the scenes in a destructor. This is not a good option if you need to get the size of your list in constant time, though. (Nothing in the question indicates that getting the size of the list is needed, so I'll go ahead with this approach.)
If you had a container of objects, I might let you work through the documentation for the intrusive list. However, your use of pointers makes the conversion potentially confusing, so I'll walk through the setup. The setup begins with the following.
#include <boost/intrusive/list.hpp>
// Shorten the needed boost namespace.
namespace bi = boost::intrusive;
Since the list of high-level objects contains pointers, an auxiliary structure is needed. We need what amounts to a pointer that derives from a class provided by Boost. (I will proceed assuming that the objects created in CallbackAttachLowLevelObject() must be destroyed in CallbackDetachLowLevelObject(). Hence, I've changed the raw pointer to a smart pointer.)
#include <memory>
#include <utility>
// The auxiliary structure that will be stored in the high level list:
// The hook supplies the intrusive infrastructure.
// The link_mode enables auto-unlinking.
class ListEntry : public bi::list_base_hook< bi::link_mode<bi::auto_unlink> >
{
public:
    // The expected way to construct this.
    explicit ListEntry(std::unique_ptr<HighLevelObject> && p) : ptr(std::move(p)) {}
    // Another option would be to forward parameters for constructing HighLevelObject,
    // and have the constructor call make_unique. I'll leave that as an exercise.

    // Make this class look like a pointer to HighLevelObject.
    const std::unique_ptr<HighLevelObject> & operator->() const { return ptr; }
    HighLevelObject& operator*() const { return *ptr; }

private:
    std::unique_ptr<HighLevelObject> ptr;
};
The definition of the list becomes the following. We need to specify non-constant time size() to allow auto-unlinking.
bi::list<ListEntry, bi::constant_time_size<false>> high_level_objects_list;
These changes require some changes to the "attach" callback. I'll present them before going on to the "detach" callback.
// Callback that notifies when LowLevelObject* is added to low_level_objects_list.
void CallbackAttachLowLevelObject(LowLevelObject* low_level_object) {
    // Dynamically allocate the entry, in addition to allocating the high level object.
    ListEntry * entry = new ListEntry(std::make_unique<HighLevelObject>());
    (*entry)->low_level_object = low_level_object; // Double indirection needed here.
    low_level_object->variable = entry;
    high_level_objects_list.push_back(*entry); // Intentional indirection here!
}
With this prep work, the cleanup is in your destructors, as is appropriate for RAII. Your "detach" just has to initiate the process. One line suffices.
void CallbackDetachLowLevelObject(LowLevelObject* low_level_object) {
    delete static_cast<ListEntry *>(low_level_object->variable);
}
There (appropriately) is not enough context in the question to explain why the high level list is of pointers instead of being of objects. One potential reason is that the high-level object is polymorphic, and the use of pointers avoids slicing. If this is the case (or if there is not a good reason for using pointers), an intrusive list could be designed with less impact on existing code. The caveat here is that changes to HighLevelObject are required.
The initial setup is the same as before.
#include <boost/intrusive/list.hpp>
// Shorten the needed boost namespace.
namespace bi = boost::intrusive;
Next, have HighLevelObject derive from the hook.
class HighLevelObject : public bi::list_base_hook< bi::link_mode<bi::auto_unlink> > {
public:
    LowLevelObject* low_level_object;
};
In this situation, the list is of HighLevelObjects, not of pointers, nor of pointer stand-ins.
bi::list<HighLevelObject, bi::constant_time_size<false>> high_level_objects_list;
The "attach" callback reverts to almost what is in the question. The one change to this function is that the object itself is pushed into the list, not a pointer. This is why slicing is not a problem; it's not a copy that is added to the list, but the object itself.
high_level_objects_list.push_back(*high_level_object); // Intentional indirection!
The rest of your code might work as-is. We just need the "detach" callback, which again is a one-liner.
void CallbackDetachLowLevelObject(LowLevelObject* low_level_object) {
    delete static_cast<HighLevelObject *>(low_level_object->variable);
}
This answer is for those who do not want to use – or cannot use – Boost.Intrusive.
As long as modifying HighLevelObject is an option, the object could be told how to remove itself from the list. Add a callback to HighLevelObject and invoke it in its destructor.
#include <functional>
#include <utility>
class HighLevelObject {
public:
    LowLevelObject* low_level_object;

    // ****** The above is from the question. The below is new. ******
    // Have the destructor invoke the callback.
    ~HighLevelObject() { if ( on_delete ) on_delete(); }
    // Provide a way to set the callback.
    void set_deleter(std::function<void()> && deleter)
    { on_delete = std::move(deleter); }

private:
    // Storage for the callback:
    std::function<void()> on_delete;
};
Set the callback when an object is added to the high level list.
Caution: This setup supports only one callback. Don't overwrite the callback somewhere else in your code!
Caution: Additional precautions are needed if multiple threads might add elements to high_level_objects_list.
// Callback that notifies when LowLevelObject* is added to low_level_objects_list.
void CallbackAttachLowLevelObject(LowLevelObject* low_level_object) {
    HighLevelObject* high_level_object = new HighLevelObject;
    high_level_object->low_level_object = low_level_object;
    low_level_object->variable = high_level_object;
    high_level_objects_list.push_back(high_level_object);
    // ****** The above is from the question. The below is new. ******
    // Arrange cleanup. Grab an iterator to the just-pushed element.
    auto iter = std::prev(high_level_objects_list.end()); // Not thread-safe
    high_level_object->set_deleter([iter]() { high_level_objects_list.erase(iter); });
}
With this prep work, the cleanup is in your destructor, as is appropriate for RAII. Your "detach" just has to initiate the process. One line suffices.
void CallbackDetachLowLevelObject(LowLevelObject* low_level_object) {
    delete static_cast<HighLevelObject *>(low_level_object->variable);
}
I was thinking of storing an iterator (specifically, iter in the above) in HighLevelObject and having the destructor use that to call erase() instead of going through a lambda. However, I ran into trouble with the declarations, since members of std::list cannot be instantiated with an incomplete element type. It could be done with type erasure, but at that point I preferred using a function object.

Using range-v3 to implement DFS

I'm interested in using range-v3 to build and query linear quadtree data structures. I've been able to successfully use range-v3 to construct a linear quadtree data structure using existing views in the library. I'm excited to be able to express query logic as a view adaptor since you can iterate through nodes in the quadtree via advancing a RandomAccessIterator of the derived range, which conveniently helps separate out query behavior from the quadtree's structure.
My view adaptor has a single argument: a user-defined lambda predicate that evaluates a node and determines whether to step in or step out. Stepping in results in evaluating child nodes, whereas stepping out results in visiting the next sibling (or potentially the node's parent's next sibling) until either a leaf node is successfully evaluated or we "exit" through the root node. (You can think of this as a DFS pattern.)
Thus, we are able to define this range in terms of a RandomAccessIterator (from the derived range) and a Sentinel (as opposed to another Iterator).
Here's some trimmed-down code that shows the overall structure. (My apologies if there is missing member data/structure):
template<typename Rng, typename Fun>
class quadtree_query_view
    : public ranges::view_adaptor<quadtree_query_view<Rng, Fun>, Rng>
{
    friend ranges::range_access;
    using base_iterator_t = ranges::iterator_t<Rng>;
    ranges::semiregular_t<Fun> fun;
    uint tree_depth;

    struct query_termination_adaptor : public ranges::adaptor_base
    {
        query_termination_adaptor() = default;
        query_termination_adaptor(uint tree_depth) : tree_depth(tree_depth) {}
        uint tree_depth;
        uint end(quadtree_query_view const&) {
            return tree_depth;
        }
    };

    struct query_adaptor : public ranges::adaptor_base
    {
        query_adaptor() = default;
        query_adaptor(ranges::semiregular_t<Fun> const& fun) : fun(fun) {}
        ranges::semiregular_t<Fun> fun;
        bool exited = false;
        uint current_node_depth = 0;

        base_iterator_t begin(quadtree_query_view const& rng) {
            return ranges::begin(rng.base());
        }
        // TODO: implement equal?
        // TODO: implement empty?
        auto read(base_iterator_t const& it) const
        {
            return *it; // I'm not concerned about the value returned by this range yet.
        }
        CONCEPT_REQUIRES(ranges::RandomAccessIterator<base_iterator_t>())
        void next(base_iterator_t& it) {
            if (fun(*it)) { // Step in
                // Advance base iterator (step in)
                // Increment current_node_depth
            } else { // Step out
                // Advance base iterator (step out)
                // Set "exited = true" if stepping out past root node.
                // Decrement current_node_depth
            }
        }
    };

public:
    quadtree_query_view() = default;
    quadtree_query_view(Rng&& rng, uint tree_depth, Fun fun)
        : quadtree_query_view::view_adaptor{std::forward<Rng>(rng)}
        , tree_depth(tree_depth)
        , fun(std::move(fun))
    {}
    query_adaptor begin_adaptor() const {
        return {std::move(fun)};
    }
    query_termination_adaptor end_adaptor() const {
        return {tree_depth};
    }
};
I'm trying to figure out the last few steps to complete this implementation:
My range does not satisfy the Range concept due to WeaklyEqualityComparable requirement not being implemented for my iterator/sentinel pair. What's the best way for going upon doing this?
Do I need to implement the equal member method for the query_adaptor? What do the two iterator arguments correspond to?
I'm assuming that I need to implement the empty member method for query_adaptor. Is this where the query exit criteria logic would go? Based on the documentation, the second argument needs to be a type associated with the sentinel. Is this the same type that is returned by query_termination_adaptor::end(), e.g., a uint? Or does this need to be another type?
Thanks for any insights you can share. I'm really stoked to see ranges be incorporated into C++20!
Ah.
I was able to solve my problem by using default_sentinel. Since query_adaptor is meant to start at the root node and iterate in a single direction, I can remove end_adaptor and query_termination_adaptor altogether. I only had to implement a bool equal(default_sentinel) const { ... } method for the adaptor, in which I determine whether the query exit criteria are met.
I'm still not sure why trying to implement a custom sentinel type caused issues for me. However, it didn't provide any additional functionality over default_sentinel, other than owning tree_depth.

C++ design choice

There is a class ActionSelection which has the following method:
ActionBase* SelectAction(Table* table, State* state);
ActionBase is an abstract class. Inside the SelectAction method, some action is fetched from the table, considering the state, if the table is not empty.
If the table is empty, a random action should be created and returned. However, ActionBase is an abstract class, so it cannot be instantiated.
For different experiments/environments the actions are different, but they have some common behavior (that's why there is an ActionBase class).
The problem is that this function (SelectAction) should return an experiment-specific action if the table is empty, yet it does not know anything about the specific experiment. Are there any design workarounds for this?
It depends on whether empty tables...
Are expected to happen under normal circumstances
May happen under abnormal circumstances
Should never happen unless there is a bug in the program
Solution 1:
Include empty-table handling in your control flow. As is, the function does not have enough information to react properly, so either:
Pass in a third parameter, containing a default action to return:
ActionBase *SelectAction(Table *table, State *state, ActionBase *defaultAction);
If you don't want to construct the default action unless it's needed, you can pass its type via a template parameter instead, optionally with additional parameters to construct it with:
template <class DefaultAction, class... DefActArgs>
ActionBase *SelectAction(Table *table, State *state, DefActArgs &&... args);
Let the caller handle it, by returning whether or not the operation was successful :
bool SelectAction(Table *table, State *state, ActionBase *&selectedAction);
Solution 2:
Throw an exception. It will bubble up to whoever can handle it. This is quite rarely used for a parameter check, though, since the exception should arguably have been thrown by the code that failed to produce a non-empty table in the first place.
ActionBase *SelectAction(Table *table, State *state) {
    if (table->empty())
        throw EmptyTableException();
    // ...
}
Solution 3:
Set up an assertion. If your function received an empty table, something is broken; better to halt the program and have a look at it with a debugger.
ActionBase *SelectAction(Table *table, State *state) {
    assert(!table->empty());
    // ...
}
Here is what I had in mind. It is not tested code, but you get the idea.
1.
//header
class RandomActionBase : public ActionBase {
public:
    RandomActionBase();
    static RandomActionBase* selectRandomAction();
protected:
    static RandomActionBase* _first;
    RandomActionBase* _next;
    void append(RandomActionBase* r); // 'register' is a reserved keyword, so 'append' is used here
};

//implementation
RandomActionBase* RandomActionBase::_first = NULL;

RandomActionBase::RandomActionBase() : _next(NULL) {
    if (_first == NULL) _first = this;
    else _first->append(this);
}

void RandomActionBase::append(RandomActionBase* r)
{
    if (_next == NULL) _next = r;
    else _next->append(r);
}

RandomActionBase* RandomActionBase::selectRandomAction()
{
    // count the number of RandomActionBases
    int count = 0;
    RandomActionBase* p = _first;
    while (p) {
        ++count;
        p = p->_next;
    }
    // Now that you know the count, you can create a random number ranging
    // from 0 to count-1. I'll leave that up to you and assume the random
    // number is simply 2.
    unsigned int randomnbr = 2;
    p = _first;
    while (randomnbr > 0) {
        p = p->_next;
        --randomnbr;
    }
    return p;
}

//header
class SomeRandomAction : public RandomActionBase {
public:
    //implement the custom SomeRandomAction
};

//implementation
static SomeRandomAction SomeRandomAction_l;
The idea of course is to create different implementations of SomeRandomAction, or even to pass parameters to them via their constructors to make them all distinct. Each instance you create will appear in the static list.
Extending the list with a new implementation just means deriving from RandomActionBase, implementing it, and making sure to create an instance. The base class is never impacted by this, which even makes it a design according to the OCP:
the open-closed principle. The code is extendable without having to change the code that is already in place. OCP is part of SOLID.
2.
Another viable solution is to return a null object. It is quite similar to the above, but you always return the null object when the list is empty. Mind you, a null object is not simply null. See https://en.wikipedia.org/wiki/Null_Object_pattern
It is simply a dummy implementation of a class, used to avoid having to check for null pointers, making the design more elegant and less susceptible to null-pointer dereferencing errors.

Have an extra data member only when something is active in C++

I have an implementation of a queue, something like template <typename T> queue<T>, with a struct QueueItem { T data; }. I also have a separate library that times the passage of data across different places (including from one producer thread to a consumer thread via this queue). To do this, I inserted code from that timing library into the push and pop functions of the queue, so that when they assign a QueueItem.data they also assign an extra member I added, of type void*, to some timing metadata from that library. I.e. what used to be something like:
void push(T t)
{
    QueueItem i;
    i.data = t;
    // insert i into queue
}
became
void push(T t)
{
    QueueItem i;
    i.data = t;
    void* fox = timinglib.getMetadata();
    i.timingInfo = fox;
    // insert i into queue
}
with QueueItem going from
struct QueueItem
{
    T data;
};
to
struct QueueItem
{
    T data;
    void* timingInfo;
};
What I would like to achieve, however, is the ability to swap out of the latter struct in favor of the lighter weight struct whenever the timing library is not activated. Something like:
if (timingLib.isInactive())
    ; // use the smaller struct QueueItem
else
    ; // use the larger struct QueueItem
as cheaply as possible. What would be a good way to do this?
You can't have a struct that is big and small at the same time, obviously, so you're going to have to look at some form of inheritance or pointer/reference, or a union.
A union would be ideal for you if there's "spare" data in T that could be occupied by your timingInfo. If not, then it's going to be as 'heavy' as the original.
Using inheritance is also likely to be as big as the original, as it'll add a vtable in there which will pad it out too much.
So, the next option is to store a pointer only, and have that point to the data you want to store, either the data or the data+timing. This kind of pattern is known as 'flyweight' - where common data is stored separately to the object that is manipulated. This might be what you're looking for (depending on what the timing info metadata is).
The other, more complex, alternative is to have 2 queues that you keep in sync. You store data in one, and the other stores the associated timing info, if enabled. If not enabled, you ignore the 2nd queue. The trouble with this is ensuring the 2 are kept in sync, but that's an organisational problem rather than a technical challenge. Maybe create a new Queue class that contains the 2 real queues internally.
I'll start by just confirming my assumption that this needs to be a runtime choice and you can't just build two different binaries with timing enabled/disabled. That approach eliminates as much overhead in any approach as possible.
So now let's assume we want different runtime behavior. There will need to be runtime decisions, so there are a couple options. If you can get away with the (relatively small) cost of polymorphism then you could make your queue polymorphic and create the appropriate instance once at startup and then its push for example either will or won't add the extra data.
However if that's not an option I believe you can use templates to help accomplish your end, although there will likely be some up-front work and it will probably increase the size of your binary with the extra code.
You start with a template to add timing to a class:
template <typename Timee>
struct Timed : public Timee
{
    void* timingInfo;
};
Then a timed QueueItem would look like:
Timed<QueueItem> timed_item;
To anything that doesn't care about the timing, this class looks exactly like a QueueItem: It will automatically upcast or slice to the parent as appropriate. And if a method needs to know the timing information you either create an overload that knows what to do for a Timed<T> or do a runtime check (for the "is timing enabled" flag) and downcast to the correct type.
Next, you'll need to change your Queue instantiation to know whether it's using the base QueueItem or the Timed version. For example, a very very rough sketch of a possible mechanism:
template <typename Element>
void run()
{
    Queue<Element> queue;
    queue.setup();
    queue.process();
}

int main()
{
    if (do_timing)
    {
        run<Timed<QueueItem> >();
    }
    else
    {
        run<QueueItem>();
    }
    return 0;
}
You would "likely" need a specialization of Queue when used with Timed items, unless gathering the metadata is stateless, in which case the Timed constructor can gather the info and populate the object itself when created. Then Queue stays the same and relies on which instantiation you're using.

Extending a Thrift-generated object in C++

Using the following .thrift file
struct myElement {
    1: required i32 num,
}
struct stuff {
    1: optional map<i32,myElement> mymap,
}
I get a Thrift-generated class with an STL map. The instance of this class is long-lived
(I append to and remove from it, as well as write it to disk using TSimpleFileTransport).
I would like to extend myElement in C++; the extensions should not affect
the serialized version of this object (and this object is not used in any
other language). What's a clean way to accomplish that?
I contemplated the following, but they didn't seem clean:
Make a second, non-Thrift map that is indexed with the same key
keeping both in sync could prove to be a pain
Modify the generated code, either by post-processing the generated
header (incl. preprocessor hackery).
Similar to #2, but modify the generation side to include the following in the generated struct, and then define NAME_CXX_EXT in a force-included header
#ifdef NAME_CXX_EXT
    NAME_CXX_EXT ...
#endif
All of the above seem rather nasty.
The solution I am going to go with for now:
[This is all pseudo code, didn't check this copy for compilation]
The following is the generated code, which I cannot modify
(though I can change the map to a set):
class GeneratedElement {
public:
    // ...
    int32_t num;
    // ...
};

class GeneratedMap {
public:
    // ...
    std::map<int32_t, GeneratedElement> myGeneratedMap;
    // ...
};
// End of generated code
// End of generated code
Elsewhere in the app:
class Element {
public:
    GeneratedElement* pGenerated; // <<== ptr into an element of another std::map!
    time_t lastAccessTime;
};

class MapWrapper {
private:
    GeneratedMap theGenerated;
public:
    // ...
    std::map<int32_t, Element> myMap;
    // ...
    void doStuffWithBoth(int32_t key)
    {
        // instead of
        // theGenerated.myGeneratedMap[key].num++;   [lookup in map #1]
        // time(&myMap[key].lastAccessTime);         [lookup in map #2]
        Element& el = myMap[key];
        el.pGenerated->num++;
        time(&el.lastAccessTime);
    }
};
I wanted to avoid the double map lookup for every access
(though I know the complexity remains the same, it is still two lookups).
I figured that if I can guarantee that all insertions into and removals from theGenerated
are done in a single spot, and that same spot is where I populate/remove
the corresponding entry in myMap, then I can initialize
Element::pGenerated to point at its corresponding element in theGenerated.myGeneratedMap.
Not only does this let me save half of the lookup time, I may even change
myMap to a container type better suited to my key type (say a hash_map, or even a Boost
multi-index map).
At first this sounded to me like a bad idea. With std::vector and std::deque I can
see how this can be a problem, as the values will be moved around,
invalidating the pointers. Given that std::map is implemented with a tree
structure, is there really a time when a map element will be relocated?
(My above assumptions were confirmed by the discussion in the linked question.)
While I probably won't provide an access method for each member of myElement, or any syntactic sugar (like overloading [], (), etc.), this lets me treat these elements in an almost consistent manner. The only catch is that (aside from insertion) I never look up members of mymap directly.
Have you considered just using simple containership?
You're using C++, so you can just wrap the struct(s) in some class or other struct, and provide wrapper methods to do whatever you want.