Retaining just unique elements in a queue - c++

I have queue of vectors of the following form:
queue<vector<unsigned> > a;
vector<unsigned> b;
b.push_back(10); b.push_back(12); b.push_back(15);
a.push(b);
vector<unsigned> b2;
b2.push_back(15); b2.push_back(19); b2.push_back(18);
vector<unsigned> b1;
b1.push_back(10); b1.push_back(12); b1.push_back(15);
I want the queue to hold only unique vectors. In the example above, I want to retain just the vectors (10,12,15) and (15,19,18), i.e. the duplicate (10,12,15) is dropped and only one copy of it is kept.
One of the ways of checking whether a vector is already present in the queue or not is to iterate over it. Is there some other way by which I can check whether a vector is already present in the queue or not efficiently?
I am using gcc version: gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

If the order of insertion is important, then I would use a second data structure to keep track of the unique inserted elements, such as std::set.
#include <cassert>
#include <iostream>
#include <queue>
#include <set>
#include <vector>
template <typename T>
class unique_queue {
private:
std::queue<T> m_queue;
std::set<T> m_set;
public:
bool push(const T& t) {
if (m_set.insert(t).second) {
m_queue.push(t);
return true;
}
return false;
}
void pop() {
assert(!m_queue.empty());
const T& val = front();
typename std::set<T>::iterator it = m_set.find(val);
assert(it != m_set.end());
m_set.erase(it);
m_queue.pop();
}
const T& front() const {
return m_queue.front();
}
};
int main(int argc, char *argv[]) {
unique_queue<std::vector<unsigned> > q;
std::vector<unsigned> b1;
b1.push_back(10); b1.push_back(12); b1.push_back(15);
std::cout << "pushed: " << q.push(b1) << std::endl;
std::vector<unsigned> b2;
b2.push_back(15); b2.push_back(17); b2.push_back(18);
std::cout << "pushed: " << q.push(b2) << std::endl;
std::vector<unsigned> b3;
b3.push_back(10); b3.push_back(12); b3.push_back(15);
std::cout << "pushed: " << q.push(b3) << std::endl;
q.pop();
q.pop();
std::cout << "pushed: " << q.push(b3) << std::endl;
}
By default, std::set<T> will use std::less<T> to compare its elements. For a std::vector<unsigned>, this boils down to lexicographically comparing the vectors when inserting them into the set.
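To make the comparison concrete, here is a minimal sketch of how std::set sees vectors. The helper names count_unique and orders_before are made up for this illustration and are not part of the answer's class.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: number of distinct vectors a std::set keeps.
inline std::size_t count_unique(const std::vector<std::vector<unsigned> >& items) {
    // std::set uses std::less, i.e. operator< on the vectors.
    std::set<std::vector<unsigned> > seen(items.begin(), items.end());
    return seen.size();
}

// Hypothetical helper: true if a sorts before b.
inline bool orders_before(const std::vector<unsigned>& a,
                          const std::vector<unsigned>& b) {
    return a < b;  // std::vector's operator< compares lexicographically
}
```

Two vectors that are neither less than nor greater than each other are treated as equivalent by the set, which is why the second (10,12,15) is rejected.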

Queues are not data structures that give you efficient search by element value (in that respect they behave like vectors). Sets are, but they do not preserve the insertion order of the elements.
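std::unique is another option, but note that it only removes adjacent duplicates and needs iterators, which std::queue does not expose. It therefore only applies if you keep the elements sorted in an iterable container such as std::vector or std::deque.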

If you have such special requirements, I tend to not use the standard containers directly. Instead, I define an interface first:
class my_queue {
public:
typedef vector<unsigned> element_type;
void push(element_type const&);
bool empty() const;
element_type pop();
Then, since you want elements to be unique, I'd use a regular queue and a set:
private:
queue<element_type> m_queue;
set<element_type> m_set;
};
I think you get the point and I'm too lazy to fire up a compiler to actually test this. ;)
Some further notes:
Even though it's more complex, a vector is just a datatype, so it can be assigned, copied and compared. This comparison is used by e.g. std::set.
This could be optimized, since the storage of the data in both is actually redundant. I'd then store the actual elements in the set and their order in the queue (i.e. store set iterators).
Unless you store many elements in the queue, doing a linear search over the queue (using e.g. std::deque as replacement) might be a better-performing alternative.
It's not clear whether inserting a duplicate should influence the order. Also, what if elements were already removed and then added back? In any case, write tests that make sure the queue has the required behaviour.
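The "store set iterators" optimization from the notes above could look roughly like this. It is an untested-in-production sketch, and the class name iter_unique_queue is invented here. It relies on the fact that std::set iterators stay valid when other elements are inserted or erased.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Sketch: elements live only in the set; the queue stores iterators
// into the set to remember the insertion order.
template <typename T>
class iter_unique_queue {
    typedef typename std::set<T>::iterator set_iter;
    std::set<T> m_set;
    std::queue<set_iter> m_order;
public:
    // Returns false if an equal element is already queued.
    bool push(const T& t) {
        std::pair<set_iter, bool> r = m_set.insert(t);
        if (!r.second)
            return false;
        m_order.push(r.first);
        return true;
    }
    bool empty() const { return m_order.empty(); }
    // Precondition: !empty().
    const T& front() const { return *m_order.front(); }
    // Removes the oldest element. Precondition: !empty().
    void pop() {
        m_set.erase(m_order.front());
        m_order.pop();
    }
};
```

In this variant an element that was popped can be pushed again, and a rejected duplicate does not change the order. Whether that is the behaviour you want is exactly the open question from the last note.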

Related

Why does Apple Clang make a call to compare for a unique hash in an unordered map?

I was trying to improve my understanding of the implementation of unordered_map
and was surprised by this behavior. Consider this minimal example below.
#include <iostream>
#include <unordered_map>
#include <string>
using namespace std;
template<>
struct std::hash<int*>
{
size_t operator()(int* arr) const
{
cout << "custom hash called" << endl;
return arr[0];
}
};
template <>
struct std::equal_to<int*>
{
bool operator()(const int* lhs, const int* rhs) const
{
std::cout << "call to compare" << std::endl;
return lhs == rhs;
}
};
int main(int argc, char *argv[])
{
int arr1[8] {11,12,13,14,15,16,17,18};
int arr2[8] {1,2,3,4,5,6,7,8};
unordered_map<int*, string> myMap;
myMap.insert(make_pair(arr1, "one"));
myMap.insert({arr2, "two"});
}
I would have expected this output:
custom hash called
custom hash called
The hash for both inserts is unique and therefore no comparison of multiple keys should be required as I understand it (since the bucket should only contain exactly one key). And indeed this is the result when I try it with Clang, GCC and MSVC on godbolt.org. However, when I compile and run this example on a local Mac an additional call to the equal_to call operator happens for the second insert:
custom hash called
custom hash called
call to compare
Tested with
Apple clang version 13.1.6 (clang-1316.0.21.2)
Target: arm64-apple-darwin21.4.0
Thread model: posix
and
Apple clang version 13.1.6 (clang-1316.0.21.2.3)
Target: x86_64-apple-darwin21.4.0
Thread model: posix
In all cases only the C++20 flag was used.
There are basically two cases where the comparator does not need to be applied:
The first one is when the target bucket is empty (then, there is nothing to compare with). A simple demo code that works with both libstdc++ and libc++ is as follows:
struct Hash {
size_t operator()(int a) const { return a; }
};
struct Equal { ... /* log operator call */ };
std::unordered_map<int, int, Hash, Equal> m;
m.reserve(2);
std::cout << m.bucket(0) << std::endl; // 0
std::cout << m.bucket(1) << std::endl; // 1
m.insert({0, 0});
m.insert({1, 0});
Here, both keys 0 and 1 target different buckets, so there is no comparison with both implementations.
Live demo: https://godbolt.org/z/5jfYv6sba
The second case is when all the keys in the target bucket have different hashes and those hashes are stored (cached) in the hash table nodes. This caching is supported by libstdc++ and seems to be applied by default. However, it does not seem to be supported by libc++. Exemplary code:
std::unordered_map<int, int, Hash, Equal> m;
m.reserve(2);
std::cout << m.bucket(0) << std::endl; // 0
std::cout << m.bucket(2) << std::endl; // 0
m.insert({0, 0});
m.insert({2, 0});
Here, both keys target the same bucket (with index 0). With libstdc++, since the hashes are cached and are different, they are compared and there is no reason to additionally compare the entire keys. However, with libc++, hashes are not cached and the keys need to be compared.
Live demo: https://godbolt.org/z/vWK4Ko7Yj
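If you want to reproduce this locally rather than on godbolt, a counting comparator is enough. The following is only a sketch: IdentityHash, CountingEqual, g_equal_calls and comparisons_for are names made up here, not taken from the demos linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical global counter, bumped on every key comparison.
static int g_equal_calls = 0;

struct IdentityHash {
    std::size_t operator()(int a) const { return static_cast<std::size_t>(a); }
};

struct CountingEqual {
    bool operator()(int lhs, int rhs) const {
        ++g_equal_calls;
        return lhs == rhs;
    }
};

// Inserts two keys into a fresh map and returns how many key
// comparisons the library performed while doing so.
inline int comparisons_for(int first_key, int second_key) {
    std::unordered_map<int, int, IdentityHash, CountingEqual> m;
    m.reserve(2);
    g_equal_calls = 0;
    m.insert({first_key, 0});
    m.insert({second_key, 0});
    return g_equal_calls;
}
```

Calling comparisons_for(0, 1) and comparisons_for(0, 2) and printing the results shows how your standard library handles the two cases. The numbers are implementation details and may differ between libstdc++ and libc++ versions. The only thing you can rely on is that inserting the same key twice needs at least one comparison.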

Is there any way to hook insertion and deletion operations for the std containers?

Let's say, we are going to subclass the std::map and we need to catch all insertions and deletions to/from the container. For example, in order to save some application-specific information about the keys present in the container.
What's the easiest way to do this, if at all possible?
Probably the most obvious way to do this is to override all methods and operators that perform insertion and deletion. But I suspect it would be easy to miss one of them along the way, wouldn't it?
There is no way to do that in the general case. Inheritance is not a good idea because std::map is not polymorphic and no virtual dispatch will happen when you use a pointer to a map. You might as well use a simple wrapper class at that point and save yourself a lot of hassle:
#include <iostream>
#include <map>
#include <utility>
template <class Key, class Value>
struct Map {
private:
std::map<Key, Value> _data;
public:
template <class Y, class T>
void insert(Y &&key, T &&val) {
std::cout << "[" << key << "] = " << val << "\n";
_data.insert_or_assign(std::forward<Y>(key), std::forward<T>(val));
}
void remove(Key const &key) {
auto const it = _data.find(key);
if (it == _data.end())
return;
std::cout << "[" << key << "] -> removed\n";
_data.erase(it);
}
Value *get(Key const &key) {
auto const it = _data.find(key);
if (it == _data.end())
return nullptr;
return &it->second;
}
};
int main() {
Map<int, char const *> map;
map.insert(10, "hello");
map.insert(1, "world");
map.remove(1);
map.remove(10);
map.remove(999);
}
Short answer: No
C++ standard library data structures were not designed to support this use case. You may subclass and try to override but this will not work as you'd expect. In fact you'll get an error at compile time if you do it properly with the help of the keyword override. The problem is that std::map methods are not virtual so they don't support so called late binding. Functions that work with references and pointers to std::map will keep using std::map methods even in the case of passing instances of your std::map subclass.
Your only option is to create a completely new class your_map with a subset of the required methods of std::map and to delegate the job to an inner instance of std::map, as shown in Ayxan Haqverdili's answer. Unfortunately this solution requires you to change the signature of functions working with your code, replacing std::map & arguments with your_map &, which may not always be possible.
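To see the late-binding problem in action, consider this small sketch. CountingMap, fill and hooked_inserts_after_fill are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical subclass that tries to hook insert by redeclaring it.
struct CountingMap : std::map<int, int> {
    int inserts = 0;
    // This hides std::map::insert; it does not override it,
    // because the base member is not virtual.
    std::pair<iterator, bool> insert(const value_type& v) {
        ++inserts;
        return std::map<int, int>::insert(v);
    }
};

// Code written against std::map& never sees the derived insert.
inline void fill(std::map<int, int>& m) {
    m.insert(std::map<int, int>::value_type(1, 1));
}

// Returns how often the hook fired when inserting through a base reference.
inline int hooked_inserts_after_fill() {
    CountingMap cm;
    fill(cm);  // calls std::map::insert directly
    return cm.inserts;
}
```

The element does end up in the map, but the counter is not incremented, so any bookkeeping done in the hook silently gets out of sync.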

Custom iterator returning std::pair of custom container elements (no boost)

I have a class simplified as much as possible and for demonstration purposes. It looks as simple as:
#include <iostream>
#include <vector>
class Simple
{
private:
std::vector<std::size_t> indices;
std::vector<int> values;
public:
void insert(std::size_t index, int value)
{
indices.push_back(index);
values.push_back(value);
}
int at(std::size_t index)
{
return values[indices[index]];
}
};
int main()
{
Simple s;
s.insert(10, 100);
std::cout << s.at(10) << std::endl;
return 0;
}
What I want to achieve is to iterate over the elements of this container and get a std::pair at each iteration, whose first element would be a value from the indices member and whose second element would be a value from the values member. Something like:
Simple s;
s.insert(10, 100);
for (std::pair<std::size_t, int> node : s)
{
std::cout << node.first << " " << node.second << std::endl; // expect to print 10 100
}
I'm really new to iterators. I know how to iterate through standard containers and get their values. I even know how to iterate through my Simple container and get value from values member at each iteration. I could do it like so:
//new member functions in Simple class
auto begin()
{
return std::begin(values);
}
auto end()
{
return std::end(values);
}
But I do not know how to create some new data type at each iteration and return it to the client code.
Notice that you're trying to iterate through a class that doesn't have any kind of abstraction. What I mean is that you can only iterate through it if you "make it" a vector, or a different data structure that you could iterate through.
class Simple : public std::vector<std::pair<std::size_t, int>> { /*...*/ };
Once you declare a class with such syntax you would be able to handle it as a vector of pairs. You will have the advantage of declaring your own custom methods as well.
You can now:
Simple simple;
simple.push_back({10, 100});
for (auto element : simple) {
std::cout << element.first << " " << element.second << std::endl;
}
I recommend reading up on this kind of implementation. Sometimes it can save a lot of work, and as they say, why reinvent the wheel?
Remember that the method at(10) will return the element on position 10 of your vector. In your example you are trying to read from an out_of_range position.
Maybe you're interested in map, which contains a key and a value. From your example, it doesn't seem that you are trying to keep your data structure sorted. If you are, you can use a map instead of an unordered_map. You can retrieve a value from a key by using the find method:
#include <unordered_map>
/* ... */
std::unordered_map<std::size_t, int> simple;
simple.insert({10, 100});
auto it = simple.find(10);
if (it != simple.end()) {
std::cout << "Found value from key: " << it->second << std::endl;
} else {
std::cout << "Your map doesn't contain such key!" << std::endl;
}
Note that a map does not allow multiple entries with the same key, but a multimap does.
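If the two parallel vectors have to stay, one possible approach is a small iterator class that builds the pair on dereference and returns it by value. This is only a sketch under that assumption: the nested const_iterator is written for this example and implements just the three operations a range-based for loop needs, not the full standard iterator requirements.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class Simple {
    std::vector<std::size_t> indices;
    std::vector<int> values;
public:
    // Minimal iterator: just enough for a range-based for loop.
    class const_iterator {
        const Simple* owner;
        std::size_t pos;
    public:
        const_iterator(const Simple* o, std::size_t p) : owner(o), pos(p) {}
        // Builds a fresh pair on every dereference (returned by value).
        std::pair<std::size_t, int> operator*() const {
            return std::make_pair(owner->indices[pos], owner->values[pos]);
        }
        const_iterator& operator++() { ++pos; return *this; }
        bool operator!=(const const_iterator& other) const {
            return pos != other.pos;
        }
    };

    void insert(std::size_t index, int value) {
        indices.push_back(index);
        values.push_back(value);
    }
    const_iterator begin() const { return const_iterator(this, 0); }
    const_iterator end() const { return const_iterator(this, values.size()); }
};
```

With this in place, the loop from the question, for (std::pair<std::size_t, int> node : s), compiles and yields one (index, value) pair per inserted element. Since the pairs are temporaries, modifying node does not change the container.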

Vector of non-const objects seems to be treated as constant in range-based for loop

I have a std::vector of objects being filled by de-referencing std::unique_ptr's in the push_back calls. However, when I run through a mutable range-based for-loop, my modification to these objects stays local to the loop. In other words, it seems as though those objects are being treated as constant, despite the lack of a const keyword in the loop. Here is minimal code to demonstrate what I'm seeing:
#include <vector>
#include <memory>
#include <iostream>
class Item
{
public:
typedef std::unique_ptr<Item> unique_ptr;
inline static Item::unique_ptr createItem()
{
return std::unique_ptr<Item>(new Item());
}
inline const int getValue() const { return _value; }
inline void setValue(const int val) { _value = val; }
private:
int _value;
};
int main()
{
std::vector<Item> _my_vec;
for (int i = 0; i < 5; i++)
{
Item::unique_ptr item = Item::createItem();
_my_vec.push_back(*item);
}
for (auto item : _my_vec)
{
// modify item (default value was 0)
item.setValue(10);
// Correctly prints 10
std::cout << item.getValue() << std::endl;
}
for (auto item : _my_vec)
{
// Incorrectly prints 0's (default value)
std::cout << item.getValue() << std::endl;
}
}
I suspect this has something to do with the move semantics of std::unique_ptr? But that wouldn't quite make sense because even if push_back is calling the copy constructor or something and copying the added item rather than pointing to it, the iterator is still passing over the same copies, no?
Interestingly enough, in my actual code, the class represented here by Item has a member variable that is a vector of shared pointers to objects of another class, and modifications to the objects being pointed to by those shared pointers persist between loops. This is why I suspect there's something funky with the unique_ptr.
Can anyone explain this behavior and explain how I may fix this issue while still using pointers?
When you write a range-based for loop like that:
std::vector<int> v = ...;
for(auto elt : v) {
...
}
the elements of v are copied into elt.
In your example, in each iteration, you modify the local copy of the Item and not the Item in the vector.
To fix your issue, use a reference:
for (auto& item : _my_vec)
{
item.setValue(10);
std::cout << item.getValue() << std::endl;
}
Vector of non-const objects seems to be treated as constant
If it were treated as constant, the compiler would scream at you, because writing to a constant is ill-formed and the compiler is required to diagnose it. The shown code compiles just fine, with no warnings.
I suspect that you may be referring to the fact that you don't modify the elements within the vector. That is because you modify auto item. That item is not an element of the vector, it is a copy of the item in the vector. You could refer to the item within that vector by using a reference: auto& item. Then modifications to item would be modifications to the referred element of the vector.
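The difference between the two loop forms can be boiled down to a few lines. This is a minimal sketch with plain ints instead of Item, and the function names are invented for the illustration.

```cpp
#include <cassert>
#include <vector>

// Iterates by value: each item is a copy, so the vector is unchanged.
inline void set_all_by_value(std::vector<int>& vec, int v) {
    for (auto item : vec) {
        item = v;  // modifies the local copy only
    }
}

// Iterates by reference: each item refers to the element in the vector.
inline void set_all_by_reference(std::vector<int>& vec, int v) {
    for (auto& item : vec) {
        item = v;  // modifies the element itself
    }
}
```

The same applies to the second loop in the question: for read-only access, const auto& avoids the copy as well.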

Does C++ have ordered hash?

Perl has a structure called "ordered hash" Tie::IxHash. One can use it as a hashtable/map. The entries are in the order of insertion.
I wonder if there is such a thing in C++.
Here is a sample Perl snippet:
use Tie::IxHash;
tie %food_color, "Tie::IxHash";
$food_color{Banana} = "Yellow";
$food_color{Apple} = "Green";
$food_color{Lemon} = "Yellow";
print "In insertion order, the foods are:\n";
foreach $food (keys %food_color) {
print " $food\n"; #will print the entries in order
}
Update 1
As @kerrek-sb pointed out, one can use the Boost Multi-index Containers Library. I just wonder if it is possible to do it with the STL.
Yes and no. No, there's nothing that's specifically intended to provide precisely the same functionality. But yes, you can do the same in a couple of different ways. If you expect to access the data primarily in the order inserted, then the obvious way to go would be a simple vector of pairs:
std::vector<std::pair<std::string, std::string>> food_colors;
food_colors.push_back({"banana", "yellow"});
food_colors.push_back({"apple", "green"});
food_colors.push_back({"lemon", "yellow"});
for (auto const &f : food_colors)
std::cout << f.first << ": " << f.second << "\n";
This preserves order by simply storing the items in order. If you need to access them by key, you can use std::find to do a linear search for a particular item. That minimizes extra memory used, at the expense of slow access by key if you get a lot of items.
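For lookup by key in a vector of pairs, std::find_if with a predicate on the first member does the job. A minimal sketch follows; the FoodColors typedef and the find_color helper are names made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, std::string> > FoodColors;

// Linear search by key. Returns a pointer to the stored colour,
// or nullptr if the key is not present.
inline const std::string* find_color(const FoodColors& items,
                                     const std::string& key) {
    FoodColors::const_iterator it = std::find_if(
        items.begin(), items.end(),
        [&key](const std::pair<std::string, std::string>& p) {
            return p.first == key;
        });
    return it == items.end() ? nullptr : &it->second;
}
```

The returned pointer is only valid until the vector is modified, so use it right away rather than storing it.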
If you want faster access by key with a large number of items, you could use a Boost MultiIndex. If you really want to avoid that, you can create an index of your own pretty easily. To do this, you'd start by inserting your items into a std::map. This gives fast access by key, but no access in insertion order. It does, however, return an iterator to each item as it's inserted into the map, and std::map iterators stay valid as further items are inserted. You can simply store those iterators in a vector to get access in the order of insertion. (A std::unordered_map would also give fast lookup, but a rehash invalidates its iterators, so there you would store pointers to the elements instead.) Although the principle of this is fairly simple, the code is a bit on the clumsy side, to put it nicely:
std::map<std::string, std::string> fruit;
std::vector<std::map<std::string, std::string>::iterator> in_order;
in_order.push_back(fruit.insert(std::make_pair("banana", "yellow")).first);
in_order.push_back(fruit.insert(std::make_pair("apple", "green")).first);
in_order.push_back(fruit.insert(std::make_pair("lemon", "yellow")).first);
This allows access either by key:
// ripen the apple:
fruit["apple"] = "red";
...or in insertion order:
for (auto i : in_order)
std::cout << i->first << ": " << i->second << "\n";
For the moment, I've shown the basic mechanism for doing this. If you wanted to use it much, you'd probably want to wrap that up into a nice class to hide some of the ugliness and keep things pretty and clean in normal use.
An associative container that remembers insertion order does not come with the C++ standard library, but it is straightforward to implement one using existing STL containers.
For example, a combination of std::unordered_map (for fast lookup) and std::list (to maintain key ordering) can be used to emulate an insertion-ordered map. Here is an example that demonstrates the idea:
#include <unordered_map>
#include <list>
#include <stdexcept>
template<typename K, typename V>
class InsOrderMap {
struct value_pos {
V value;
typename std::list<K>::iterator pos_iter;
value_pos(V value, typename std::list<K>::iterator pos_iter):
value(value), pos_iter(pos_iter) {}
};
std::list<K> order;
std::unordered_map<K, value_pos> map;
const value_pos& locate(K key) const {
auto iter = map.find(key);
if (iter == map.end())
throw std::out_of_range("key not found");
return iter->second;
}
public:
void set(K key, V value) {
auto iter = map.find(key);
if (iter != map.end()) {
// no order change, just update value
iter->second.value = value;
return;
}
order.push_back(key);
map.insert(std::make_pair(key, value_pos(value, --order.end())));
}
void erase(K key) {
order.erase(locate(key).pos_iter);
map.erase(key);
}
V operator[](K key) const {
return locate(key).value;
}
// iterate over the mapping with a function object
// (writing a real iterator is too much code for this example)
template<typename F>
void walk(F fn) const {
for (auto key: order)
fn(key, (*this)[key]);
}
};
// TEST
#include <string>
#include <iostream>
#include <cassert>
int main()
{
typedef InsOrderMap<std::string, std::string> IxHash;
IxHash food_color;
food_color.set("Banana", "Yellow");
food_color.set("Apple", "Green");
food_color.set("Lemon", "Yellow");
assert(food_color["Banana"] == std::string("Yellow"));
assert(food_color["Apple"] == std::string("Green"));
assert(food_color["Lemon"] == std::string("Yellow"));
auto print = [](std::string k, std::string v) {
std::cout << k << ' ' << v << std::endl;
};
food_color.walk(print);
food_color.erase("Apple");
std::cout << "-- without apple" << std::endl;
food_color.walk(print);
return 0;
}
Developing this code into a drop-in replacement for a full-fledged container such as std::map requires considerable effort.
C++ has standard hash containers. std::unordered_map gives you the hashtable part, though note that it does not keep its entries in insertion order:
std::unordered_map<std::string, std::string> mymap = {{"Banana", "Yellow"}, {"Orange", "orange"}};