Erase an element from a map then put it back - c++

Suppose I have a backtracking algorithm where I need to remove an element from a map, do something, then put it back. I am not sure if there is a good way to do it:
func(std::<K, V> map &dict) {
for (auto i : dict) {
remove i from dict;
func(dict);
put i back to dict;
}
}
I know there are ways to delete an element from map in here but I am not sure if there are ways to achieve what I described above. I am aware that I can create a new dict in for loop and pass it in func but I am wondering if it can be done using the same dict.

What you ask is definitely possible. Here is one way to do it while trying to keep things simple and efficient:
void func(std::map<K, V> &dict) {
for (auto i = dict.cbegin(); i != dict.cend(); ) {
auto old = *i;
i = dict.erase(i); // i now points to the next element (or dict.end())
some_other_func(dict);
dict.insert(i, old); // i is used a hint where to re-insert the old value
}
}
By calling std::map::erase with an iterator argument and std::map::insert with a hint, the complexity of each iteration through this loop is amortized constant. Note that I assumed your calling of func in line 5 was actually supposed to be some_other_func, because calling func recursively would invalidate the iterator you carefully set in line 4.
However, this is not the most efficient way to do this sort of processing (#rakurai has a suggestion that should be considered).

This question has an answer here.
However, your problem may be that you would like the function to ignore the key K while processing dict. How about a second parameter "K ignore" and make the function skip that pair? You could give it a default value in the declaration if that breaks other code, or overload the function.

Related

Constraining remove_if on only part of a C++ list

I have a C++11 list of complex elements that are defined by a structure node_info. A node_info element, in particular, contains a field time and is inserted into the list in an ordered fashion according to its time field value. That is, the list contains various node_info elements that are time ordered. I want to remove from this list all the nodes that verify some specific condition specified by coincidence_detect, which I am currently implementing as a predicate for a remove_if operation.
Since my list can be very large (order of 100k -- 10M elements), and for the way I am building my list this coincidence_detect condition is only verified by few (thousands) elements closer to the "lower" end of the list -- that is the one that contains elements whose time value is less than some t_xv, I thought that to improve speed of my code I don't need to run remove_if through the whole list, but just restrict it to all those elements in the list whose time < t_xv.
remove_if() though does not seem however to allow the user to control up to which point I can iterate through the list.
My current code.
The list elements:
struct node_info {
char *type = "x";
int ID = -1;
double time = 0.0;
bool spk = true;
};
The predicate/condition for remove_if:
// Remove all events occurring at t_event
class coincident_events {
double t_event; // Event time
bool spk; // Spike condition
public:
coincident_events(double time,bool spk_) : t_event(time), spk(spk_){}
bool operator()(node_info node_event){
return ((node_event.time==t_event)&&(node_event.spk==spk)&&(strcmp(node_event.type,"x")!=0));
}
};
The actual removing from the list:
void remove_from_list(double t_event, bool spk_){
// Remove all events occurring at t_event
coincident_events coincidence(t_event,spk_);
event_heap.remove_if(coincidence);
}
Pseudo main:
int main(){
// My list
std::list<node_info> event_heap;
...
// Populate list with elements with random time values, yet ordered in ascending order
...
remove_from_list(0.5, true);
return 1;
}
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for cycle as suggested for example in this post?
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for loop?
Yes and yes. Don't fight to use code that is preventing you from reaching your goals. Keep it simple. Loops are nothing to be ashamed of in C++.
First thing, comparing double exactly is not a good idea as you are subject to floating point errors.
You could always search the point up to where you want to do a search using lower_bound (I assume you list is properly sorted).
The you could use free function algorithm std::remove_if followed by std::erase to remove items between the iterator returned by remove_if and the one returned by lower_bound.
However, doing that you would do multiple passes in the data and you would move nodes so it would affect performance.
See also: https://en.cppreference.com/w/cpp/algorithm/remove
So in the end, it is probably preferable to do you own loop on the whole container and for each each check if it need to be removed. If not, then check if you should break out of the loop.
for (auto it = event_heap.begin(); it != event_heap.end(); )
{
if (coincidence(*it))
{
auto itErase = it;
++it;
event_heap.erase(itErase)
}
else if (it->time < t_xv)
{
++it;
}
else
{
break;
}
}
As you can see, code can easily become quite long for something that should be simple. Thus, if you need to do that kind of algorithm often, consider writing you own generic algorithm.
Also, in practice you might not need to do a complete search for the end using the first solution if you process you data in increasing time order.
Finally, you might consider using an std::set instead. It could lead to simpler and more optimized code.
Thanks. I used your comments and came up with this solution, which seemingly increases speed by a factor of 5-to-10.
void remove_from_list(double t_event,bool spk_){
coincident_events coincidence(t_event,spk_);
for(auto it=event_heap.begin();it!=event_heap.end();){
if(t_event>=it->time){
if(coincidence(*it)) {
it = event_heap.erase(it);
}
else
++it;
}
else
break;
}
}
The idea to make erase return it (as already ++it) was suggested by this other post. Note that in this implementation I am actually erasing all list elements up to t_event value (meaning, I pass whatever I want for t_xv).

insert doesn't permanently change map?

I have a member variable of this class that is set<pair<string,map<string,int> > > setLabelsWords; which is a little convoluted but bear with me. In a member function of the same class, I have the following code:
pair<map<string,int>::iterator,bool> ret;
for (auto j:setLabelsWords) {
if (j.first == label) {
for (auto k:words) {
ret = j.second.insert(make_pair(k,1));
if (ret.second == false) {
j.second[k]++;
}
}
}
}
"words" is a set of strings and "label" is a string. So basically it's supposed to insert "k" (a string) and if k is already in the map, it increments the int by 1. The problem is it works until the outermost for loop is over. If I print j.second's size right before the last bracket, it will give me a size I expect like 13, but right outside the last bracket the size goes back to 7, which is what the size of the map is initially before this code block runs. I am super confused why this is happening and any help would be much appreciated.
for (auto j:setLabelsWords) {
This iterates over the container by value. This is the same thing as if you did:
class Whatever { /* something in here */ };
void changeWhatever(Whatever w);
// ...
{
Whatever w;
changewhatever(w);
}
Whatever changewhatever does to w, whatever modifications are made, are made to a copy of w, because it gets passed by value, to the function.
In order to correctly update your container, you must iterate by reference:
for (auto &j:setLabelsWords) {
for (auto j:setLabelsWords) {
This creates a copy of each element. All the operations you perform on j affect that copy, not the original element in setLabelsWords.
Normally, you would just use a reference:
for (auto&& j:setLabelsWords) {
However, due to the nature of a std::set, this won't get you far, because a std::set's elements cannot be modified freely via iterators or references to elements, because that would allow you to create a set with duplicates in it. The compiler will not allow that.
Here's a pragmatic solution: Use a std::vector instead of a std::set:
std::vector<std::pair<std::string, std::map<std::string,int>>> setLabelsWords;
You will then be able to use the reference approach explained above.
If you need std::set's uniqueness and sorting capabilities later on, you can either apply std::sort and/or std::unique on the std::vector, or create a new std::set from the std::vector's elements.

c++ vector object .erase

I have been struggling to put a vector object into a project im doing
I have read what little i could find about doing this and decided to give it a go.
std::vector<BrickFalling> fell;
BrickFalling *f1;
I created the vector. This next piece works fine until i get to the erase
section.
if(brickFall == true){
f1 = new BrickFalling;
f1->getBrickXY(brickfallx,brickfally);
fell.push_back(*f1);
brickFall = false;
}
// Now setup an iterator loop through the vector
vector<BrickFalling>::iterator it;
for( it = fell.begin(); it != fell.end(); ++it ) {
// For each BrickFalling, print out their info
it->printBrickFallingInfo(brick,window,deadBrick);
//This is the part im doing wrong /////
if(deadBrick == true)// if dead brick erase
{
BrickFalling[it].erase;//not sure what im supposed to be doing here
deadBrick = false;
}
}
You can totally avoid the issue by using std::remove_if along with vector::erase.
auto it =
std::remove_if(fell.begin(), fell.end(), [&](BrickFalling& b)
{ bool deadBrick = false;
b.printBrickFallingInfo(brick,window,deadBrick);
return deadBrick; });
fell.erase(it, fell.end());
This avoids the hand-writing of the loop.
In general, you should strive to write erasure loops for sequence containers in this fashion. The reason is that it is very easy to get into the "invalid iterator" scenario when writing the loop yourself, i.e. not remembering to reseat your looping iterator each time an erase is done.
The only issue with your code which I do not know about is the printBrickFallingInfo function. If it throws an exception, you may introduce a bug during the erasure process. In that case, you may want to protect the call with a try/catch block to ensure you don't leave the function block too early.
Edit:
As the comment stated, your print... function could be doing too much work just to determine if a brick is falling. If you really are attempting to print stuff and do even more things that may cause some sort of side-effect, another approach similar in nature would be to use std::stable_partition.
With std::stable_partition you can "put on hold" the erasure and just move the elements to be erased at one position in the container (either at the beginning or at the end) all without invalidating those items. That's the main difference -- with std::stable_partition, all you would be doing is move the items to be processed, but the items after movement are still valid. Not so with std::remove and std::remove_if -- moved items are just invalid and any attempt to use those items as if they are still valid is undefined behavior.
auto it =
std::stable_partition(fell.begin(), fell.end(), [&](BrickFalling& b)
{ bool deadBrick = false;
b.printBrickFallingInfo(brick,window,deadBrick);
return deadBrick; });
// if you need to do something with the moved items besides
// erasing them, you can do so. The moved items start from
// fell.begin() up to the iterator it.
//...
//...
// Now we erase the items since we're done with them
fell.erase(fell.begin(), it);
The difference here is that the items we will eventually erase will lie to the left of the partitioning iterator it, so our erase() call will remove the items starting from the beginning. In addition to that, the items are still perfectly valid entries, so you can work with them in any way you wish before you finally erase them.
The other answer detailing the use of remove_if should be used whenever possible. If, however, your situations does not allow you to write your code using remove_if, which can happen in more complicated situations, you can use the following:
You can use vector::erase with an iterator to remove the element at that spot. The iterator used is then invalidated. erase returns a new iterator that points to the next element, so you can use that iterator to continue.
What you end up with is a loop like:
for( it = fell.begin(); it != fell.end(); /* iterator updated in loop */ )
{
if (shouldDelete)
it = fell.erase(it);
else
++it;
}

Erase final member of std::set

How can I delete the last member from a set?
For example:
set<int> setInt;
setInt.insert(1);
setInt.insert(4);
setInt.insert(3);
setInt.insert(2);
How can I delete 4 from setInt? I tried something like:
setInt.erase(setInt.rbegin());
but I received an error.
in C++11
setInt.erase(std::prev(setInt.end()));
You can decide how you want to handle cases where the set is empty.
if (!setInt.empty()) {
std::set<int>::iterator it = setInt.end();
--it;
setInt.erase(it);
}
By the way, if you're doing this a lot (adding things to a set in arbitrary order and then removing the top element), you could also take a look at std::priority_queue, see whether that suits your usage.
Edit: You should use std::prev as shown in Benjamin's better answer instead of the older style suggested in this answer.
I'd propose using a different name for rbegin which has a proper type:
setInt.erase(--setInt.end());
Assuming you checked that setInt is not empty!
Btw. this works because you can call the mutating decrement operator on a temporary (of type std::set<int>::iterator). This temporary will then be passed to the erase function.
A bit less performant, but an alternative option:
setInt.erase(*setInt.rbegin());
If you want to delete 4 instead of the last you should use the find method.
Depending on the use case 4 might not be the last.
std::set<int>::iterator it = setInt.find(4);
if(it != setInt.end()) {
setInt.erase(it);
}
If you want to delete the last element use:
if (!setInt.empty()) {
setInt.erase(--setInt.rbegin().base());
// line above is equal to
// setInt.erase(--setInt.end());
}
While I was not sure if --*.end(); is O.K. I did some reading.
So the -- on rbegin().base() leads to the same result as -- on end().
And both should work.
Check if the set is empty or not. If not, then get the last element and set that as iterator and reduce that iterator and erase the last element.
if (!setInt.empty())
{
std::set<int>::iterator it = setInt.end();
--it;
if(it != setInt.end()) {
setInt.erase(it);
}
}

std::map insert or std::map find?

Assuming a map where you want to preserve existing entries. 20% of the time, the entry you are inserting is new data. Is there an advantage to doing std::map::find then std::map::insert using that returned iterator? Or is it quicker to attempt the insert and then act based on whether or not the iterator indicates the record was or was not inserted?
The answer is you do neither. Instead you want to do something suggested by Item 24 of Effective STL by Scott Meyers:
typedef map<int, int> MapType; // Your map type may vary, just change the typedef
MapType mymap;
// Add elements to map here
int k = 4; // assume we're searching for keys equal to 4
int v = 0; // assume we want the value 0 associated with the key of 4
MapType::iterator lb = mymap.lower_bound(k);
if(lb != mymap.end() && !(mymap.key_comp()(k, lb->first)))
{
// key already exists
// update lb->second if you care to
}
else
{
// the key does not exist in the map
// add it to the map
mymap.insert(lb, MapType::value_type(k, v)); // Use lb as a hint to insert,
// so it can avoid another lookup
}
The answer to this question also depends on how expensive it is to create the value type you're storing in the map:
typedef std::map <int, int> MapOfInts;
typedef std::pair <MapOfInts::iterator, bool> IResult;
void foo (MapOfInts & m, int k, int v) {
IResult ir = m.insert (std::make_pair (k, v));
if (ir.second) {
// insertion took place (ie. new entry)
}
else if ( replaceEntry ( ir.first->first ) ) {
ir.first->second = v;
}
}
For a value type such as an int, the above will more efficient than a find followed by an insert (in the absence of compiler optimizations). As stated above, this is because the search through the map only takes place once.
However, the call to insert requires that you already have the new "value" constructed:
class LargeDataType { /* ... */ };
typedef std::map <int, LargeDataType> MapOfLargeDataType;
typedef std::pair <MapOfLargeDataType::iterator, bool> IResult;
void foo (MapOfLargeDataType & m, int k) {
// This call is more expensive than a find through the map:
LargeDataType const & v = VeryExpensiveCall ( /* ... */ );
IResult ir = m.insert (std::make_pair (k, v));
if (ir.second) {
// insertion took place (ie. new entry)
}
else if ( replaceEntry ( ir.first->first ) ) {
ir.first->second = v;
}
}
In order to call 'insert' we are paying for the expensive call to construct our value type - and from what you said in the question you won't use this new value 20% of the time. In the above case, if changing the map value type is not an option then it is more efficient to first perform the 'find' to check if we need to construct the element.
Alternatively, the value type of the map can be changed to store handles to the data using your favourite smart pointer type. The call to insert uses a null pointer (very cheap to construct) and only if necessary is the new data type constructed.
There will be barely any difference in speed between the 2, find will return an iterator, insert does the same and will search the map anyway to determine if the entry already exists.
So.. its down to personal preference. I always try insert and then update if necessary, but some people don't like handling the pair that is returned.
I would think if you do a find then insert, the extra cost would be when you don't find the key and performing the insert after. It's sort of like looking through books in alphabetical order and not finding the book, then looking through the books again to see where to insert it. It boils down to how you will be handling the keys and if they are constantly changing. Now there is some flexibility in that if you don't find it, you can log, exception, do whatever you want...
If you are concerned about efficiency, you may want to check out hash_map<>.
Typically map<> is implemented as a binary tree. Depending on your needs, a hash_map may be more efficient.
I don't seem to have enough points to leave a comment, but the ticked answer seems to be long winded to me - when you consider that insert returns the iterator anyway, why go searching lower_bound, when you can just use the iterator returned. Strange.
Any answers about efficiency will depend on the exact implementation of your STL. The only way to know for sure is to benchmark it both ways. I'd guess that the difference is unlikely to be significant, so decide based on the style you prefer.
map[ key ] - let stl sort it out. That's communicating your intention most effectively.
Yeah, fair enough.
If you do a find and then an insert you're performing 2 x O(log N) when you get a miss as the find only lets you know if you need to insert not where the insert should go (lower_bound might help you there). Just a straight insert and then examining the result is the way that I'd go.