I have two vectors in my code: a vector<string> and a vector<unsigned>, both passed by const reference into a function. I can find the minimum or maximum value in one vector, but what I actually need is the index of that minimum or maximum, so that I can index into the other vector. My code looks something like this:
std::string getTopEmployee( const std::vector<std::string>& names, const std::vector<unsigned>& ratings ) {
    // Find largest value in ratings
    std::size_t largest = *std::max_element( ratings.begin(), ratings.end() );
    // How to get the index? I do not need the largest value itself.
    return names[index];
}

std::string getWorstEmployee( const std::vector<std::string>& names, const std::vector<unsigned>& ratings ) {
    // Find smallest value in ratings
    std::size_t smallest = *std::min_element( ratings.begin(), ratings.end() );
    // How to get the index? I do not need the smallest value itself.
    return names[index];
}
The two vectors passed into this function are the same size, and we can assume that no two values in the ratings vector are equal. Sorting the second vector is not an option.
std::min_element() and std::max_element() work with iterators, not indices.
For an indexable container like std::vector, you can convert an iterator to an index using std::distance(), e.g.:
std::string getTopEmployee( const std::vector<std::string>& names, const std::vector<unsigned>& ratings ) {
    // Find largest value in ratings
    auto largest = std::max_element( ratings.begin(), ratings.end() );
    if (largest == ratings.end()) return "";
    return names[std::distance(ratings.begin(), largest)];
}

std::string getWorstEmployee( const std::vector<std::string>& names, const std::vector<unsigned>& ratings ) {
    // Find smallest value in ratings
    auto smallest = std::min_element( ratings.begin(), ratings.end() );
    if (smallest == ratings.end()) return "";
    return names[std::distance(ratings.begin(), smallest)];
}
For std::vector, or any other container with random-access iterators, you can use iterator arithmetic (let's assume for simplicity that the containers are not empty):
auto maxi = std::max_element(ratings.begin(), ratings.end());
return names[maxi - ratings.begin()];
Complexity: O(1).
For containers with iterators that are at least input iterators, you can use std::distance:
auto maxi = std::max_element(ratings.begin(), ratings.end());
return names[std::distance(ratings.begin(), maxi)];
Complexity: O(1) with random-access iterators, O(n) otherwise.
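For instance, if the ratings lived in a std::list (bidirectional iterators, so no subscripting and no iterator subtraction), the same std::distance approach still works; a minimal sketch, not from the original post:

#include <algorithm>
#include <iterator>
#include <list>
#include <string>
#include <vector>

std::string getTopEmployee(const std::vector<std::string>& names,
                           const std::list<unsigned>& ratings) {
    // O(n) scan for the largest rating...
    auto largest = std::max_element(ratings.begin(), ratings.end());
    if (largest == ratings.end()) return "";
    // ...plus an O(n) std::distance, since list iterators are not random-access
    return names[std::distance(ratings.begin(), largest)];
}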
I am using a multiset in C++, which I believe stores an element together with its count when it is inserted.
When I want to delete an element, I just want to decrease its count in the set by 1, as long as it stays greater than 0.
Example C++ code:
std::multiset<int> mset;
mset.insert(2);
mset.insert(2);
printf("%zu ", mset.count(2)); // this prints 2

// Here I need an O(1) constant-time function (built-in or otherwise)
// to decrease the count of 2 in the set without deleting it entirely.
// Remember: constant time only.
-> Function and its specifications
printf("%zu ", mset.count(2)); // it should print 1 now
Is there any way to achieve that, or should I delete the element and re-insert 2 the required (count-1) times?
... I am using a multi-set in c++, which stores an element and the respective count of it ...
No, you aren't. You're using a multi-set which stores n copies of a value which was inserted n times.
If you want to store something relating a value to a count, use an associative container like std::map<int, int>, and use map[X]++ to increment the number of Xs.
... i need an O(1) constant time function ... to decrease the count ...
Both map and set have O(log N) complexity just to find the element you want to alter, so O(1) is impossible with them. Use std::unordered_map/std::unordered_set to get average O(1) complexity.
... I just want to decrease the count of that element in the set by 1 till it is >0
I'm not sure what that means.
with a set:
to remove all copies of an element from the set, use equal_range to get a range (pair of iterators), and then erase that range
to remove all-but-one copies in a non-empty range, just increment the first iterator in the pair and check it's still not equal to the second iterator before erasing the new range.
these both have an O(log N) lookup (equal_range) step followed by a linear-time erase step (although it's linear in the number of elements with that key, not in N).
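For example, a minimal sketch of those two operations on the question's multiset (the function names are mine):

#include <iterator>
#include <set>

// erase every copy of key: O(log N) lookup, then linear in the number of copies
void eraseAll(std::multiset<int>& mset, int key) {
    auto range = mset.equal_range(key);
    mset.erase(range.first, range.second);
}

// keep exactly one copy of key and erase the rest
void eraseAllButOne(std::multiset<int>& mset, int key) {
    auto range = mset.equal_range(key);
    if (range.first == range.second) return; // key not present
    mset.erase(std::next(range.first), range.second);
}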
with a map:
to remove the count from a map, just erase the key
to set the count to one, just use map[key]=1;
both of these have an O(log N) lookup followed by a constant-time erase
with an unordered map ... for your purposes it's identical to the map above, except with average O(1) complexity.
Here's a quick example using unordered_map:
#include <unordered_map>

template <typename Key>
class Counter {
    std::unordered_map<Key, unsigned> count_;
public:
    // Increase the count for k by delta, creating the entry if absent; returns the new count.
    unsigned inc(Key k, unsigned delta = 1) {
        auto result = count_.emplace(k, delta);
        if (result.second) {
            return delta;
        } else {
            unsigned& current = result.first->second;
            current += delta;
            return current;
        }
    }
    // Decrease the count for k by delta; erases the entry once it reaches zero.
    unsigned dec(Key k, unsigned delta = 1) {
        auto iter = count_.find(k);
        if (iter == count_.end()) return 0;
        unsigned& current = iter->second;
        if (current > delta) {
            current -= delta;
            return current;
        }
        // current <= delta means the count drops to zero
        count_.erase(iter);
        return 0;
    }
    unsigned get(Key k) const {
        auto iter = count_.find(k);
        if (iter == count_.end()) return 0;
        return iter->second;
    }
};
and use it like so:
#include <cassert>

int main() {
    Counter<int> c;
    // test increment
    assert(c.inc(1) == 1);
    assert(c.inc(2) == 1);
    assert(c.inc(2) == 2);
    // test lookup
    assert(c.get(0) == 0);
    assert(c.get(1) == 1);
    // test simple decrement
    assert(c.get(2) == 2);
    assert(c.dec(2) == 1);
    assert(c.get(2) == 1);
    // test erase and underflow
    assert(c.dec(2) == 0);
    assert(c.dec(2) == 0);
    assert(c.dec(1, 42) == 0);
}
What I have is a vector of elements whose order I do not care about.
Then I have N indices (each addressing a unique position in the vector) of elements to be removed from the vector. I want the removal to be as fast as possible.
The best I could come up with was to store the indices in a set (which keeps them ordered):
std::set<unsigned int> idxs;
for (int i=0; i<N; ++i)
idxs.insert(some_index);
And then iterate over the set in reverse order, replacing each element to be removed with the last element of the vector:
std::set<unsigned int>::reverse_iterator rit;
for (rit = idxs.rbegin(); rit != idxs.rend(); ++rit) {
    vec[*rit].swap(vec[vec.size() - 1]);
    vec.resize(vec.size() - 1);
}
However, I was wondering whether there is a more efficient way of doing this, since using a set seems like overkill and I would love to avoid sorting altogether.
EDIT1:
Let us assume I use a vector and sort it afterwards.
std::vector<unsigned int> idxs;
for (int i=0; i<N; ++i)
idxs.push_back(some_index);
std::sort(idxs.begin(), idxs.end());
Can I push it any further?
EDIT2:
I should have mentioned that the vector will have up to 10 elements. However, removal occurs very often in my program (hundreds of thousands of times).
set is a good choice. I'd guess using another allocator (e.g. an arena allocator) would have the biggest impact. Why not use a set instead of a vector of elements to begin with?
I see the following relevant variations:
Instead of removing in place, create a new vector, copy the preserved elements, then swap back.
This keeps your indices stable (unlike removal, which would require sorting or updating the indices).
Instead of a vector of indices, use a vector of bools of the same length as your data.
With the length of "maximum 10" given, a bit mask seems sufficient.
So, roughly:
#include <cstdint>
#include <vector>

struct Index
{
    std::uint32_t removeMask = 0; // or use a bit vector for larger N
    void TagForRemove(int idx) { removeMask |= (1u << idx); }
    bool DoRemove(int idx) const { return (removeMask & (1u << idx)) != 0; }
};

// create a new vector, or remove in place, as you like
template <typename T>
void ApplyRemoveIndex(std::vector<T>& v, Index remove)
{
    std::vector<T> copy;
    copy.reserve(v.size());
    for (std::size_t i = 0; i < v.size(); ++i)
        if (!remove.DoRemove(static_cast<int>(i)))
            copy.push_back(v[i]);
    copy.swap(v);
}
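Hypothetical usage of the sketch above:

std::vector<int> data = {10, 11, 12, 13, 14};
Index removals;
removals.TagForRemove(1);
removals.TagForRemove(3);
ApplyRemoveIndex(data, removals); // data is now {10, 12, 14}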
You can use swap/pop_back to remove an item at a given index, and keep track of which indices you've moved with a hash table. It's linear space & time in the number of removals.
std::vector<T> vec = ...;
std::vector<unsigned int> idxs;
std::unordered_map<unsigned int, unsigned int> map;
for(auto index : idxs) {
unsigned int trueIndex = index;
while (trueIndex >= vec.size()) {
trueIndex = map[trueIndex];
}
// element at index 'vec.size()-1' is being moved to index 'index'
map[vec.size()-1] = index;
std::swap(vec[trueIndex], vec[vec.size()-1]);
vec.pop_back();
}
I came across the following question on Stack Overflow:
std::map insert or std::map find?
Why is using find() considered inferior to lower_bound() + key_comp()?
Assume I have the following map:
map<int, int> myMap;
myMap[1]=1;
myMap[2]=3;
myMap[3]=5;
int key = xxx; //some value of interest.
int value = yyy;
The suggested answer is to use:
map<int, int>::iterator itr = myMap.lower_bound(key);
if (itr != myMap.end() && !(myMap.key_comp()(key, itr->first)))
{
    // key found: do processing for itr->second
}
else
{
    // key not found: insert, using itr as the position hint
    myMap.insert(itr, map<int, int>::value_type(key, value));
}
Why is it better than the following?
map<int, int>::iterator itr = myMap.find(key);
if (itr != myMap.end())
{
    // key found: do processing for itr->second
}
else
{
    // key not found: insert, with itr == myMap.end() as the hint
    myMap.insert(itr, map<int, int>::value_type(key, value));
}
In the second case, notice that if you need to insert the value, the hint iterator is always myMap.end(). That cannot help improve the performance of the insert (except when the new element really does belong at the end, of course), so the container still has to find the correct position for the new node, which is usually O(log N).
With lower_bound(), you have already found the best possible hint for where to insert the new element, and this is the optimization opportunity the first technique offers: insertion with a correct hint is amortized O(1). You pay one additional key comparison, but that is O(1) as well.
Since both the initial find() and lower_bound() are O(log N), you end up with one O(log N) operation plus two O(1) operations in the first technique, versus two O(log N) operations in the second.
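Wrapped up as a reusable helper (the name insert_or_update is mine, not part of the standard library), the lower_bound pattern from the first snippet looks roughly like this:

#include <map>

// Insert key->value, or overwrite the value if key already exists.
// Returns an iterator to the affected element.
template <typename K, typename V, typename C, typename A>
typename std::map<K, V, C, A>::iterator
insert_or_update(std::map<K, V, C, A>& m, const K& key, const V& value) {
    auto itr = m.lower_bound(key);                        // the only O(log N) search
    if (itr != m.end() && !m.key_comp()(key, itr->first)) {
        itr->second = value;                              // key found: update in place
        return itr;
    }
    // key not found: itr is the correct hint, so the insert is amortized O(1)
    return m.insert(itr, typename std::map<K, V, C, A>::value_type(key, value));
}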
Is there a way to find a nonexisting key in a map?
I am using std::map<int,myclass>, and I want to automatically generate a key for new items. Items may be deleted from the map in different order from their insertion.
The myclass items may or may not be identical, so they cannot serve as a key by themselves.
During the run time of the program, there is no limit to the number of items that are generated and deleted, so I can not use a counter as a key.
An alternative data structure that has the same functionality and performance will do.
Edit
I am trying to build a container for my items, such that I can delete/modify items according to their keys and iterate over the items. The key value itself means nothing to me; however, other objects will store those keys for their internal usage.
The reason I cannot use an incrementing counter is that during the life-span of the program there may be more than 2^32 (or theoretically 2^64) items; however, item 0 may still exist even after all other items are deleted.
It would be nice to be able to ask std::map for the lowest unused key, so I could use it for new items, instead of keeping a vector or some other external storage of unused keys.
I'd suggest a combination of a counter and a queue. When you delete an item from the map, add its key to the queue. The queue then keeps track of the keys that have been deleted from the map so that they can be used again. To get a new key, first check whether the queue is empty. If it isn't, pop the front key off and use it; otherwise use the counter to get the next available key.
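A minimal sketch of that counter-plus-queue idea (the class name KeyAllocator is mine):

#include <queue>

class KeyAllocator {
public:
    int allocate() {
        if (!recycled_.empty()) {
            int key = recycled_.front(); // reuse a previously deleted key
            recycled_.pop();
            return key;
        }
        return next_++;                  // otherwise hand out a fresh one
    }
    void release(int key) { recycled_.push(key); } // call this when erasing from the map
private:
    int next_ = 0;
    std::queue<int> recycled_;
};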
Let me see if I understand. What you want to do is
look for a key.
If not present, insert an element.
Items may be deleted.
Keep a counter (wait wait) and a vector. The vector will keep the ids of the deleted items.
When you are about to insert a new element, look for a key in the vector. If the vector is not empty, remove a key from it and use it. If it is empty, take one from the counter (counter++).
However, if you never remove items from the map, you can just stick with a counter.
Alternative:
How about using the memory address of the element as a key?
I would say that for the general case, where the key can be any type allowed by map, this is not possible. Even the ability to say whether some unused key exists requires some knowledge of the type.
If we consider the situation with int, you can store a std::set of contiguous segments of unused keys (since these segments do not overlap, the natural ordering can be used: simply compare their starting points). When a new key is needed, take the first segment, cut off its first index and put the remainder back in the set (if it is not empty). When a key is released, check whether there are neighbouring segments in the set (thanks to the set's ordering this takes O(log n)) and merge with them if so; otherwise simply insert the segment [n,n].
This way you get the same order of time complexity and memory consumption as the map itself, independently of the request history (because the number of segments can never exceed map.size()+1).
something like this:
#include <limits>
#include <set>
#include <stdexcept>
#include <utility>

class TKeyManager
{
public:
    TKeyManager()
    {
        // initially the whole int range is one free segment
        FreeKeys.insert(
            std::make_pair(
                std::numeric_limits<int>::min(),
                std::numeric_limits<int>::max()));
    }
    int AllocateKey()
    {
        if(FreeKeys.empty())
            throw std::runtime_error("no free keys left");
        const std::pair<int,int> freeSegment=*FreeKeys.begin();
        FreeKeys.erase(FreeKeys.begin());
        if(freeSegment.second>freeSegment.first)
            FreeKeys.insert(std::make_pair(freeSegment.first+1,freeSegment.second));
        return freeSegment.first;
    }
    void ReleaseKey(int key)
    {
        std::pair<int,int> segment(key,key);
        std::set<std::pair<int,int>>::iterator position=FreeKeys.insert(segment).first;
        if(position!=FreeKeys.begin())
        {   // try to merge with the left neighbour
            std::set<std::pair<int,int>>::iterator left=position;
            --left;
            if(left->second+1==key)
            {
                segment.first=left->first;
                FreeKeys.erase(left);
            }
        }
        std::set<std::pair<int,int>>::iterator right=position;
        ++right;
        if(right!=FreeKeys.end() && right->first==key+1)
        {   // try to merge with the right neighbour
            segment.second=right->second;
            FreeKeys.erase(right);
        }
        if(segment!=std::make_pair(key,key))
        {   // set elements cannot be modified in place, so re-insert the merged segment
            FreeKeys.erase(position);
            FreeKeys.insert(segment);
        }
    }
private:
    std::set<std::pair<int,int>> FreeKeys;
};
Is there a way to find a nonexisting key in a map?
I'm not sure what you mean here. How can you find something that doesn't exist? Do you mean, is there a way to tell if a map does not contain a key?
If that's what you mean, you simply use the find function, and if the key doesn't exist it will return an iterator pointing to end().
if (my_map.find(555) == my_map.end()) { /* do something */ }
You go on to say...
I am using std::map<int,myclass>, and I want to automatically generate a key for new items. Items may be deleted from the map in different order from their insertion. The myclass items may, or may not be identical, so they can not serve as a key by themself.
It's a bit unclear to me what you're trying to accomplish here. It seems your problem is that you want to store instances of myclass in a map, but since you may have duplicate values of myclass, you need some way to generate a unique key. Rather than doing that, why not just use std::multiset<myclass> and store the duplicates? When you look up a particular value of myclass, the multiset will give you iterators to all the instances of myclass which have that value. You'll just need to implement a comparison functor for myclass.
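A minimal sketch of that, assuming myclass has some member to order by (the member value_ and the functor name are purely illustrative):

#include <set>

struct myclass {
    int value_; // stands in for whatever state myclass really holds
};

struct myclassLess {
    bool operator()(const myclass& a, const myclass& b) const {
        return a.value_ < b.value_;
    }
};

int main() {
    std::multiset<myclass, myclassLess> items;
    items.insert({1});
    items.insert({1}); // duplicates are fine in a multiset
    auto range = items.equal_range({1}); // iterators over every instance equal to {1}
    // iterate [range.first, range.second) as needed
}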
Could you please clarify why you cannot use a simple incrementing counter as the auto-generated key (increment on insert)? It seems there's no problem doing that.
Suppose you have decided how to generate non-counter-based keys and found that generating them in bulk is much more efficient than generating them one by one.
Given such a generator, which is "infinite" and stateful (that is your requirement), you can keep a second, fixed-size container with, say, 1000 unused keys.
Supply new map entries with keys from this container, and return released keys to it for recycling.
Set a low-water threshold so that when the key container runs low, you refill it in bulk from the "infinite" generator.
The originally posted problem still remains: how to build an efficient non-counter generator. You may want to take a second look at the "infinity" requirement and check whether, say, a 64-bit or 128-bit counter would satisfy your algorithms for some bounded period of time, like 1000 years.
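A rough sketch of that buffering scheme (all names and the 1000/100 sizes are placeholders, and refill() stands in for whatever bulk generator you end up with):

#include <cstddef>
#include <vector>

class BufferedKeyPool {
public:
    explicit BufferedKeyPool(std::size_t low = 100, std::size_t bulk = 1000)
        : low_(low), bulk_(bulk) { refill(); }

    long long take() {                      // key for a new map entry
        if (pool_.size() <= low_) refill(); // low-water threshold reached: refill in bulk
        long long key = pool_.back();
        pool_.pop_back();
        return key;
    }
    void give(long long key) { pool_.push_back(key); } // recycle on erase

private:
    void refill() {                         // placeholder for the expensive bulk generator
        for (std::size_t i = 0; i < bulk_; ++i) pool_.push_back(next_++);
    }
    std::size_t low_, bulk_;
    long long next_ = 0;
    std::vector<long long> pool_;
};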
Use uint64_t as the key type of the sequence, or, if you think that will not be enough, something like:
struct sequence_key_t {
    uint64_t upper;
    uint64_t lower;
    sequence_key_t& operator++();
    bool operator<(const sequence_key_t&) const;
};
Like:
sequence_key_t global_counter;
std::map<sequence_key_t,myclass> my_map;
my_map.insert(std::make_pair(++global_counter, myclass()));
and you will not have any problems.
Like others, I am having difficulty figuring out exactly what you want. It sounds like you want to create an item if it is not found. std::map::operator[](const key_type& x) will do this for you.
std::map<int, myclass> Map;
myclass instance1, instance2;
Map[5] = instance1; // operator[] creates the entry if the key is absent
Map[6] = instance2;
Is this what you are thinking of?
Going along with other answers, I'd suggest a simple counter for generating the ids. If you're worried about being perfectly correct, you could use an arbitrary-precision integer for the counter rather than a built-in type. Or something like the following, which will iterate through all possible strings.
// Treats the string as a little-endian base-256 number and adds one to it.
void string_increment(std::string& counter)
{
    bool carry = true;
    for (size_t i = 0; i < counter.size(); ++i)
    {
        unsigned char original = static_cast<unsigned char>(counter[i]);
        if (carry)
        {
            ++counter[i];
        }
        // the "digit" wrapped around, so keep carrying
        if (original > static_cast<unsigned char>(counter[i]))
        {
            carry = true;
        }
        else
        {
            carry = false;
        }
    }
    if (carry)
    {
        counter.push_back(0); // grow by one more "digit"
    }
}
e.g. so that you have:
std::string counter; // empty string
string_increment(counter); // now counter=="\x00"
string_increment(counter); // now counter=="\x01"
...
string_increment(counter); // now counter=="\xFF"
string_increment(counter); // now counter=="\x00\x00"
string_increment(counter); // now counter=="\x01\x00"
...
string_increment(counter); // now counter=="\xFF\x00"
string_increment(counter); // now counter=="\x00\x01"
string_increment(counter); // now counter=="\x01\x01"
...
string_increment(counter); // now counter=="\xFF\xFF"
string_increment(counter); // now counter=="\x00\x00\x00"
string_increment(counter); // now counter=="\x01\x00\x00"
// etc..
Another option, if the working set actually in the map is small enough, would be to use an incrementing key and then regenerate the keys when the counter is about to wrap. This solution only requires temporary extra storage; the hash-table performance is unchanged, and key generation is just an if and an increment.
The number of items in the current working set would really determine if this approach is viable or not.
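A sketch of that rekeying step, assuming the callers holding old keys can be told about the remapping (the return value here); all names are mine:

#include <cstdint>
#include <unordered_map>

// Reassign keys 0..n-1 to the current working set and report how keys moved.
template <typename T>
std::unordered_map<std::uint32_t, std::uint32_t>
rekey(std::unordered_map<std::uint32_t, T>& items) {
    std::unordered_map<std::uint32_t, T> fresh;
    std::unordered_map<std::uint32_t, std::uint32_t> remap;
    std::uint32_t next = 0;
    for (auto& kv : items) {
        remap[kv.first] = next;
        fresh.emplace(next, std::move(kv.second));
        ++next;
    }
    items.swap(fresh);
    return remap;
}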
I liked Jon Benedicto's and Tom's answers very much. To be fair, the other answers that only used counters may have been the starting point.
Problem with only using counters
You always have to increment higher and higher, never filling the empty gaps.
Once you run out of numbers and wrap around, you have to do log(n) iterations to find unused keys.
Problem with the queue for holding used keys
It is easy to imagine lots and lots of used keys being stored in this queue.
My Improvement to queues!
Rather than storing individual freed keys in the queue, we store ranges of unused keys.
Interface
#include <cstddef>
#include <functional>
#include <vector>
#include <boost/container/flat_set.hpp>

using Key = wchar_t; // In my case

struct Range
{
    Key first;
    Key last;
    size_t size() const { return last - first + 1; }
};
bool operator< (const Range&, const Range&);
bool operator< (const Range&, Key);
bool operator< (Key, const Range&);

struct KeyQueue__
{
public:
    virtual ~KeyQueue__() = default; // needed so morph() can delete through the base pointer
    virtual void addKey(Key) = 0;
    virtual Key getUniqueKey() = 0;
    virtual bool shouldMorph() = 0;
protected:
    Key counter = 0;
    friend class Morph;
};

struct KeyQueue : KeyQueue__
{
public:
    void addKey(Key) override;
    Key getUniqueKey() override;
    bool shouldMorph() override;
private:
    std::vector<Key> pool;
    friend class Morph;
};

struct RangeKeyQueue : KeyQueue__
{
public:
    void addKey(Key) override;
    Key getUniqueKey() override;
    bool shouldMorph() override;
private:
    boost::container::flat_set<Range, std::less<>> pool;
    friend class Morph;
};

void morph(KeyQueue__*&);

struct Morph
{
    static void morph(const KeyQueue& from, RangeKeyQueue& to);
    static void morph(const RangeKeyQueue& from, KeyQueue& to);
};
Implementation
Note: keys being added are assumed not to be in the queue already.
// Assumes that Range is valid: first <= last
// Assumes that Ranges do not overlap
bool operator< (const Range &l, const Range &r)
{
    return l.first < r.first;
}

// Assumes that Range is valid: first <= last
bool operator< (const Range &l, Key r)
{
    int diff_1 = l.first - r;
    int diff_2 = l.last - r;
    return diff_1 < -1 && diff_2 < -1;
}

// Assumes that Range is valid: first <= last
bool operator< (Key l, const Range &r)
{
    int diff = l - r.first;
    return diff < -1;
}
void KeyQueue::addKey(Key key)
{
    if(counter - 1 == key) counter = key;
    else pool.push_back(key);
}

Key KeyQueue::getUniqueKey()
{
    if(pool.empty()) return counter++;
    else
    {
        Key key = pool.back();
        pool.pop_back();
        return key;
    }
}

bool KeyQueue::shouldMorph()
{
    return pool.size() > 10;
}
void RangeKeyQueue::addKey(Key key)
{
    if(counter - 1 == key) counter = key;
    else
    {
        auto elem = pool.find(key);
        if(elem == pool.end()) pool.insert({key, key});
        else // Expand existing range
        {
            Range &range = (Range&)*elem;
            // Note: at this point, key is 1 value less or greater than the range
            if(range.first > key) range.first = key;
            else range.last = key;
        }
    }
}
Key RangeKeyQueue::getUniqueKey()
{
    if(pool.empty()) return counter++;
    else
    {
        Range &range = (Range&)*pool.begin(); // cast away constness to shrink the range in place
        Key key = range.first++;
        if(range.first > range.last) // exhausted all keys in range
            pool.erase(pool.begin());
        return key;
    }
}

bool RangeKeyQueue::shouldMorph()
{
    return pool.size() == 0 || (pool.size() == 1 && pool.begin()->size() < 4);
}
void morph(KeyQueue__*& obj)
{
    if(KeyQueue *queue = dynamic_cast<KeyQueue*>(obj))
    {
        RangeKeyQueue *new_queue = new RangeKeyQueue();
        Morph::morph(*queue, *new_queue);
        delete obj;      // the pointer is passed by reference so the caller sees the swap
        obj = new_queue;
    }
    else if(RangeKeyQueue *queue = dynamic_cast<RangeKeyQueue*>(obj))
    {
        KeyQueue *new_queue = new KeyQueue();
        Morph::morph(*queue, *new_queue);
        delete obj;
        obj = new_queue;
    }
}
void Morph::morph(const KeyQueue &from, RangeKeyQueue &to)
{
    to.counter = from.counter;
    for(Key key : from.pool) to.addKey(key);
}

void Morph::morph(const RangeKeyQueue &from, KeyQueue &to)
{
    to.counter = from.counter;
    for(Range range : from.pool)
        while(range.first <= range.last)
            to.addKey(range.first++);
}
Usage:
#include <cstdlib>
#include <ctime>

int main()
{
    std::vector<Key> keys;
    KeyQueue__ *keyQueue = new KeyQueue();
    srand(time(NULL));
    bool insertKey = true;
    for(int i = 0; i < 1000; ++i)
    {
        if(insertKey)
        {
            Key key = keyQueue->getUniqueKey();
            keys.push_back(key);
        }
        else
        {
            int index = rand() % keys.size();
            Key key = keys[index];
            keys.erase(keys.begin() + index);
            keyQueue->addKey(key);
        }
        if(keyQueue->shouldMorph())
        {
            morph(keyQueue);
        }
        // more chances of insert; never try to erase when the key list is empty
        insertKey = keys.empty() || rand() % 3;
    }
    delete keyQueue;
}