I read the following statement somewhere:
An additional hash table can be used to make deletion fast in min-heap.
Question> How to combine priority_queue and unordered_map so that I can implement the idea above?
#include <queue>
#include <unordered_map>
#include <iostream>
#include <list>
using namespace std;
struct Age
{
    Age(int age) : m_age(age) {}
    // Equality is required for Age to be used as an unordered_map key
    bool operator==(const Age& other) const { return m_age == other.m_age; }
    int m_age;
};
// Hash function for Age
class HashAge {
public:
    size_t operator()(const Age &a) const {
        return hash<int>()(a.m_age);
    }
};
struct AgeGreater
{
    // Greater-than makes priority_queue behave as a min-heap
    bool operator()(const Age& lhs, const Age& rhs) const {
        return lhs.m_age > rhs.m_age;
    }
};
int main()
{
    priority_queue<Age, list<Age>, AgeGreater> min_heap; // doesn't work
    //priority_queue<Age, vector<Age>, AgeGreater> min_heap;
    // Is this the right way to do it?
    unordered_map<Age, list<Age>::iterator, HashAge > hashTable;
}
Question> I am not able to make the following work:
priority_queue<Age, list<Age>, AgeGreater> min_heap; // doesn't work
I have to use list as the container because the iterators of list are not affected by insertion/deletion (Iterator invalidation rules).
You can't do this with the supplied priority_queue data structure:
In a priority queue you don't know where the elements are stored, so it is hard to delete them in constant time, because you can't find the elements. But, if you maintain a hash table with the location of every element in the priority queue stored in the hash table, then you can find and remove an item quickly, although I would expect log(N) time in the worst case, not constant time. (I don't recall offhand if you get amortized constant time.)
To do this you usually need to roll your own data structures, because you have to update the hash table each time an item is moved around in the priority queue.
I have some example code that does this here:
http://code.google.com/p/hog2/source/browse/trunk/algorithms/AStarOpenClosed.h
It's based on older coding styles, but it does the job.
To illustrate:
/**
* Moves a node up the heap. Returns true if the node was moved, false otherwise.
*/
template<typename state, typename CmpKey, class dataStructure>
bool AStarOpenClosed<state, CmpKey, dataStructure>::HeapifyUp(unsigned int index)
{
if (index == 0) return false;
int parent = (index-1)/2;
CmpKey compare;
if (compare(elements[theHeap[parent]], elements[theHeap[index]]))
{
// Perform normal heap operations
unsigned int tmp = theHeap[parent];
theHeap[parent] = theHeap[index];
theHeap[index] = tmp;
// Update the element location in the hash table
elements[theHeap[parent]].openLocation = parent;
elements[theHeap[index]].openLocation = index;
HeapifyUp(parent);
return true;
}
return false;
}
Inside the if statement we do the normal heapify operations on the heap and then update the location in the hash table (openLocation) to point to the current location in the priority queue.
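As a self-contained illustration of the same idea (this is not the hog2 code), here is a minimal sketch of a min-heap of int keys that keeps an unordered_map from each key to its current index in the heap array. The names IndexedMinHeap, swapNodes, siftUp and siftDown are made up for this example, and it assumes that keys are unique.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: min-heap of unique int keys plus a hash table
// mapping each key to its current index in the heap array.
class IndexedMinHeap {
public:
    bool empty() const { return heap_.empty(); }
    bool contains(int key) const { return pos_.count(key) != 0; }
    int top() const { assert(!heap_.empty()); return heap_.front(); }

    void push(int key) {
        assert(!contains(key));          // assumption: keys are unique
        heap_.push_back(key);
        pos_[key] = heap_.size() - 1;
        siftUp(heap_.size() - 1);
    }

    // Remove an arbitrary key: O(1) average lookup, O(log N) repair.
    bool erase(int key) {
        auto it = pos_.find(key);
        if (it == pos_.end()) return false;
        std::size_t i = it->second;
        swapNodes(i, heap_.size() - 1);  // move the victim to the end
        heap_.pop_back();
        pos_.erase(key);
        if (i < heap_.size()) {          // repair the slot that was refilled
            siftUp(i);
            siftDown(i);
        }
        return true;
    }

    void pop() { erase(top()); }

private:
    void swapNodes(std::size_t a, std::size_t b) {
        std::swap(heap_[a], heap_[b]);
        pos_[heap_[a]] = a;              // keep the hash table in sync
        pos_[heap_[b]] = b;
    }
    void siftUp(std::size_t i) {
        while (i > 0 && heap_[i] < heap_[(i - 1) / 2]) {
            swapNodes(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
    void siftDown(std::size_t i) {
        for (;;) {
            std::size_t smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap_.size() && heap_[l] < heap_[smallest]) smallest = l;
            if (r < heap_.size() && heap_[r] < heap_[smallest]) smallest = r;
            if (smallest == i) return;
            swapNodes(i, smallest);
            i = smallest;
        }
    }
    std::vector<int> heap_;
    std::unordered_map<int, std::size_t> pos_;
};
```

Every swap goes through swapNodes, so the table never goes stale. Finding the element is constant time on average, but repairing the heap after the removal is still O(log N), which matches the expectation stated above.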
One of the major drawbacks of linked lists is that the access time to elements is linear. Hash tables, on the other hand, have constant access time. However, linked lists have constant insertion and deletion time given an adjacent node in the list. I am trying to construct a FIFO data structure with constant access time and constant insertion/deletion time. I came up with the following code:
unordered_map<string, T*> hashTable;
list<T> linkedList;
T Foo;
linkedList.push_front(Foo);
hashTable.insert(pair<string, T*>("A", &Foo));
T Bar;
linkedList.push_front(Bar);
hashTable.insert(pair<string, T*>("B", &Bar));
However, this code feels like it is really dangerous. The idea was that I could use the hashtable to access any given element in constant time, and since insertion always occurs at the start of the list, and deletion from the end of the list, I can insert and delete elements in constant time. Is there anything inherently poor about the above code? If I wanted to instead store pointers to the nodes in the linkedList to have constant insertion/deletion time from any node would I just store list< T >::iterator*?
If you build a lookup table you will end up with 2 data structures holding the same data which does not make any sense.
The best thing you can do is to keep your linked list ordered and build a sparse lookup table in some way to select the start node for a search, in order to amortize some of the run time.
It is unclear how you want to access the list elements. From the code you provided, I assume you want the first pushed node as "A", the second pushed node as "B", etc. Then my question is what happens when we delete the first pushed node? Does the second pushed node become "A"?
If the node identifiers don't change when updating the list, your approach seems alright.
If the node identifiers have to change on list update, then here is a lightweight and limited approach:
const int MAX_ELEMENTS = 3;
vector<int> arr(MAX_ELEMENTS + 1);
int head = 0, tail = 0;
int get_size() {
if (head > tail)
return tail + ((int)arr.size() - head);
return tail - head;
}
void push(int value) {
if (get_size() == MAX_ELEMENTS) {
// TODO: handle push to full queue
cout << "FULL QUEUE\n";
return;
}
arr[tail++] = value;
if (tail == (int)arr.size()) {
tail = 0;
}
}
int pop() {
if (get_size() == 0) {
// TODO: handle access to empty queue
cout << "EMPTY QUEUE\n";
return -1;
}
int result = arr[head++];
if (head == (int)arr.size()) {
head = 0;
}
return result;
}
int get_item_at(int id) {
if (id >= get_size()) {
// TODO: index out of range
cout << "INDEX OUT OF RANGE\n";
return -1;
}
int actual_id = head + id;
if (actual_id >= (int)arr.size()) {
actual_id -= (int)arr.size();
}
return arr[actual_id];
}
The above approach will keep indices up to date (e.g., get_item_at(0) will always return the first node in the queue). You can map the indices to any suitable ids you want, like "A" -> 0, "B" -> 1, etc. The limitation of this solution is that you won't be able to store more than MAX_ELEMENTS in the queue.
If I wanted to instead store pointers to the nodes in the linkedList to have constant insertion/deletion time from any node would I just store list< T >::iterator*?
If identifiers must change with insertion/deletion, then it is going to take O(n) time anyway.
This idea makes sense, especially for large lists where order of insertion matters but things might be removed in the middle. What you want to do is create your own data structure:
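#include <iterator>
#include <list>
#include <unordered_map>
template<typename T> class constantAccessLinkedList {
public:
    void insert(const T& val) {
        mData.push_back(val); //insert into rear of list, O(1).
        mLookupTable.insert({val, std::prev(mData.end())}); //insert iterator to the new node into map. O(1) on average.
    }
    typename std::list<T>::iterator find(const T& val) {
        auto iter = mLookupTable.find(val); //O(1) average lookup time for list member.
        return iter == mLookupTable.end() ? mData.end() : iter->second;
    }
    void erase(const T& val) { //named erase because "delete" is a reserved keyword
        auto iter = mLookupTable.find(val); //O(1) get list iterator
        if (iter == mLookupTable.end()) return; //value not present
        mData.erase(iter->second); //O(1) erase from list.
        mLookupTable.erase(iter); //O(1) remove from LUT.
    }
private:
    std::list<T> mData;
    std::unordered_map<T, typename std::list<T>::iterator> mLookupTable;
};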
The reason you can do this with list and unordered_map is because list::iterators are not invalidated on modifying the underlying container.
This will keep your data in order in the list, but will provide constant access time to iterators, and allow you to remove elements in constant time.
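For the FIFO with string identifiers from the question, the same technique might look like the sketch below. KeyedFifo and its member names are invented for this example. It assumes that keys are unique and that each list node stores its key, so that popping the oldest element can also remove its entry from the map.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical sketch: FIFO of values addressed by unique string keys.
template <typename T>
class KeyedFifo {
public:
    // Append at the back; returns false if the key is already present.
    bool push(const std::string& key, const T& value) {
        if (index_.count(key) != 0) return false;
        items_.emplace_back(key, value);
        index_[key] = std::prev(items_.end());
        return true;
    }
    // Average O(1) lookup; returns nullptr when the key is unknown.
    T* find(const std::string& key) {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &it->second->second;
    }
    // Remove the oldest element; does nothing on an empty queue.
    void pop() {
        if (items_.empty()) return;
        index_.erase(items_.front().first);
        items_.pop_front();
    }
    // Remove any element by key without touching the others.
    bool erase(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        items_.erase(it->second);
        index_.erase(it);
        return true;
    }
    std::size_t size() const { return items_.size(); }

private:
    using Node = std::pair<std::string, T>;
    std::list<Node> items_;
    std::unordered_map<std::string, typename std::list<Node>::iterator> index_;
};
```

Storing list iterators (rather than pointers to local copies of the values) is what makes this safe: an iterator stays valid until its own node is erased.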
I am trying to find the best data structure for the following simple task: a class that keeps the last N added item values in a built-in container. When the object receives item N+1, that item should be added at the end of the container and the first item should be removed. It is like a simple queue, but the class should have a GetAverage method, and other methods that need access to every item. Unfortunately, std::queue doesn't have begin and end methods for this purpose.
Here is part of the class interface:
class StatItem final
{
static int ITEMS_LIMIT;
public:
StatItem() = default;
~StatItem() = default;
void Reset();
void Insert(int val);
int GetAverage() const;
private:
std::queue<int> _items;
};
And part of desired implementation:
void StatItem::Reset()
{
std::queue<int> empty;
std::swap(_items, empty);
}
void StatItem::Insert(int val)
{
_items.push(val);
if (_items.size() > static_cast<size_t>(ITEMS_LIMIT)) // keep at most ITEMS_LIMIT items
{
_items.pop();
}
}
int StatItem::GetAverage() const
{
const size_t itemCount{ _items.size() };
if (itemCount == 0) {
return 0;
}
const int sum = std::accumulate(_items.begin(), _items.end(), 0); // Error: std::queue doesn't have these methods
return sum / itemCount;
}
Any ideas?
I'm not sure about std::deque. Is it efficient, and should I use it for this task, or something different?
P.S.: ITEMS_LIMIT is about 100-500 items in my case.
The data structure you're looking for is a circular buffer. There is an implementation in the Boost library; however, since it doesn't seem that you need to remove items in this situation, you can easily implement one using a std::vector or std::array.
You will need to keep track of the number of elements in the vector so far so that you can average correctly until you reach the element limit, and also the current insertion index which should just wrap when you reach that limit.
Using an array or vector will allow you to benefit from having a fixed element limit, as the elements will be stored in a single block of memory (good for fast memory access), and with both data structures you can make space for all elements you need on construction.
If you choose to use a std::vector, make sure to use the 'fill' constructor (http://www.cplusplus.com/reference/vector/vector/vector/), which will allow you to create the right number of elements from the beginning and avoid any extra allocations.
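A minimal sketch of that approach, assuming a std::vector sized once with the fill constructor. The class name RingStat and its members are hypothetical, and the running sum is an addition of this sketch (not something the answer requires) so that the average does not need a pass over the buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: keeps the last `limit` values in a fixed-size vector.
class RingStat {
public:
    explicit RingStat(std::size_t limit) : items_(limit, 0) {}  // "fill" constructor

    void Insert(int val) {
        if (items_.empty()) return;          // limit of 0: nothing to store
        if (count_ == items_.size()) {
            sum_ -= items_[next_];           // the oldest value is overwritten
        } else {
            ++count_;
        }
        items_[next_] = val;
        sum_ += val;
        next_ = (next_ + 1) % items_.size(); // wrap the insertion index
    }

    void Reset() { count_ = 0; next_ = 0; sum_ = 0; }

    int GetAverage() const {
        return count_ == 0 ? 0
                           : static_cast<int>(sum_ / static_cast<long long>(count_));
    }

private:
    std::vector<int> items_;
    std::size_t count_ = 0;  // number of valid elements so far
    std::size_t next_ = 0;   // index of the next slot to write
    long long sum_ = 0;      // running sum of the valid elements
};
```

Once the buffer is full, next_ always points at the oldest element, which is why overwriting that slot is equivalent to popping the front of a queue.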
I am trying to build a priority queue using a vector that stores each element. First, I want to insert an element into the vector along with its priority. I am not sure if this is possible; if not, can someone suggest another solution?
Here is my code:
template <typename E>
class PriorityQueue {
private:
std::vector<E> elements;
E value;
int pr;
public:
PriorityQueue() {}
void insert(int priority, E element) {
}
};
Here is how to create an element with priority for vector:
struct PriElement
{
int data;
int pri;
bool operator < ( const PriElement & other ) const
{
return pri < other.pri;
}
};
vector<PriElement> _vector;
However, the real problem is to keep the vector sorted per priority.
Here is a naive implementation showing the bubble up method:
class PriorityQueue{
public:
void insert( int data, int pri )
{
_vector.push_back(PriElement{data, pri}); // aggregate initialization; PriElement has no constructor
int index = _vector.size() -1;
while ( ( index > 0 )&& (_vector[index] < _vector[index-1] ) )
{
swap(_vector[index],_vector[index-1]);
index--;
}
}
private:
vector<PriElement> _vector;
};
For any real world implementation, as mentioned, use priority_queue.
The standard algorithm (see Introduction To Algorithms chapter 6) for doing this is as follows:
When pushing an item, insert it to the end of the vector, then "bubble" it up to the correct place.
When popping the smallest item, replace the first item (at position 0) with the item at the end, then "bubble" it down to the correct place.
It's possible to show that this can be done with (amortized) logarithmic time (the amortization is due to the vector possibly doubling itself).
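The two steps above could be sketched roughly as follows for a min-heap of ints. VectorMinHeap is a name made up for this illustration; it follows the usual array layout in which the children of index i live at 2i+1 and 2i+2.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of a binary min-heap stored in a std::vector.
class VectorMinHeap {
public:
    bool empty() const { return data_.empty(); }
    int top() const { assert(!data_.empty()); return data_.front(); }

    void push(int value) {
        data_.push_back(value);              // insert at the end
        std::size_t i = data_.size() - 1;
        while (i > 0 && data_[i] < data_[(i - 1) / 2]) {  // bubble up
            std::swap(data_[i], data_[(i - 1) / 2]);
            i = (i - 1) / 2;
        }
    }

    void pop() {
        assert(!data_.empty());
        data_.front() = data_.back();        // last item replaces the root
        data_.pop_back();
        std::size_t i = 0;
        for (;;) {                           // bubble down
            std::size_t smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < data_.size() && data_[l] < data_[smallest]) smallest = l;
            if (r < data_.size() && data_[r] < data_[smallest]) smallest = r;
            if (smallest == i) break;
            std::swap(data_[i], data_[smallest]);
            i = smallest;
        }
    }

private:
    std::vector<int> data_;
};
```

Each loop moves one level up or down the implicit tree per iteration, which is where the logarithmic bound comes from.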
However, there is no need to implement this yourself, as the standard library contains std::priority_queue which is a container adapter using std::vector as its default sequence container. For example, if you define
std::priority_queue<int> q;
then q will be a priority queue adapting a vector.
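Note that std::priority_queue<int> puts the largest element on top by default. If the smallest item should come out first, one option is to pass std::greater as the comparator, as in this small sketch (the function name drainSmallestFirst is made up for the example):

```cpp
#include <functional>
#include <queue>
#include <vector>

// Returns the input values in ascending order by pushing them through
// a std::priority_queue configured as a min-heap.
std::vector<int> drainSmallestFirst(const std::vector<int>& values) {
    // std::greater makes top() the smallest element; the default
    // std::less would make it the largest.
    std::priority_queue<int, std::vector<int>, std::greater<int>> q(
        values.begin(), values.end());
    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.top());
        q.pop();
    }
    return out;
}
```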
I want to list the output of my set in alphabetical order. Below is an attempt at getting to this, but it seems slow / inefficient and I haven't even finished it yet.
void ordered(ostream &os) {
bool inserted = false;
for (objects::iterator i = begin(); i != end(); ) {
for (objects::iterator x = begin(); x != end(); ++x) {
if((**i) < (**x)) { //overloaded and works
os << **i << endl;
inserted = true;
break;
}
}
if(inserted) {
++i;
}
}
}
Clearly this will only output objects that come after the first object alphabetically.
I also considered moving the objects from a set into another container but it still seems inefficient.
The std::set is an ordered container, see reference:
http://en.cppreference.com/w/cpp/container/set
std::set is an associative container that contains a sorted set of
unique objects of type Key. Sorting is done using the key comparison
function Compare. Search, removal, and insertion operations have
logarithmic complexity. Sets are usually implemented as red-black
trees.
std::set is already ordered. It looks like you merely need to use a custom comparer that compares the pointed-to values instead of the pointers themselves (which is the default):
template<typename T> struct pless {
inline bool operator()(const T* a, const T* b) const { return *a < *b; }
};
std::set<Foo*, pless<Foo> > objects;
I need to implement a queue containing unique entries(no duplicates) in C or C++. I am thinking of maintaining a reference of elements already available in queue but that seems very inefficient.
Kindly let me know your suggestions to tackle this.
How about an auxiliary data structure to track uniqueness:
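// helper comparator (name chosen here): compares the referenced values
struct RefLess {
    bool operator()(const Foo& a, const Foo& b) const { return a < b; }
};
std::queue<Foo> q;
std::set<std::reference_wrapper<const Foo>, RefLess> s;
// to add:
void add(Foo const & x)
{
    if (s.find(x) == s.end())
    {
        q.push(x);
        s.insert(std::cref(q.back())); // or "s.emplace(q.back());"
    }
}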
Or, alternatively, reverse the roles of the queue and the set:
std::set<Foo> s;
std::queue<std::reference_wrapper<const Foo>> q;
void add(Foo const & x)
{
    auto p = s.insert(x); // std::pair<std::set<Foo>::iterator, bool>
    if (p.second)
    {
        q.push(std::cref(*p.first)); // or "q.emplace(*p.first);"
    }
}
queuing:
use std::set to maintain your set of unique elements
add any element that you were able to add to the std::set to the std::queue
dequeueing:
remove element from std::queue and std::set
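The steps above can be sketched as a small wrapper class. UniqueQueue and its member names are invented for this example. For simplicity it stores each value twice (once per container), which is fine for small types but may not be what you want for large objects.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>

// Hypothetical sketch: FIFO queue that rejects duplicates.
template <typename T>
class UniqueQueue {
public:
    // Queuing: only values newly inserted into the set reach the queue.
    bool push(const T& value) {
        if (!seen_.insert(value).second) return false;
        order_.push(value);
        return true;
    }
    // Dequeuing: remove the element from both containers.
    void pop() {
        assert(!order_.empty());
        seen_.erase(order_.front());
        order_.pop();
    }
    const T& front() const { return order_.front(); }
    bool empty() const { return order_.empty(); }
    std::size_t size() const { return order_.size(); }

private:
    std::queue<T> order_;  // preserves arrival order
    std::set<T> seen_;     // current contents, for O(log N) duplicate checks
};
```

Once an element has been dequeued it is no longer in the set, so the same value can be queued again later.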
std::queue is a container adaptor and uses relatively few members of the underlying Container. You can easily implement a custom container that contains both an unordered_set of reference_wrapper<T> and a deque<T>. It needs at least the members front and push_back. Check inside that unordered_set when push_back of your container is called and reject accordingly (possibly throw). To give the complete example:
#include <iostream>
#include <set>
#include <deque>
#include <queue>
#include <unordered_set>
#include <functional>
#include <type_traits>
namespace std {
// partial specialization for reference_wrapper
// is this really necessary?
template<typename T>
class hash<std::reference_wrapper<T>> {
public:
std::size_t operator()(std::reference_wrapper<T> x) const
// std::hash is not specialized for const-qualified types, so strip const
{ return std::hash<typename std::remove_const<T>::type>()(x.get()); }
};
}
template <typename T>
class my_container {
// important: this really needs to be a deque, and insertion/deletion is
// only allowed at the ends, so that no references are left dangling
typedef std::deque<T> storage;
typedef std::reference_wrapper<const T> c_ref_w;
typedef std::reference_wrapper<T> ref_w;
public:
typedef typename storage::value_type value_type;
typedef typename storage::reference reference;
typedef typename storage::const_reference const_reference;
typedef typename storage::size_type size_type;
// no move semantics
void push_back(const T& t) {
auto it = lookup_.find(std::cref(t));
if(it != end(lookup_)) {
// is already inserted report error
return;
}
store_.push_back(t);
// this is important to not have dangling references
lookup_.insert(store_.back());
}
// trivial functions
bool empty() const { return store_.empty(); }
const T& front() const { return store_.front(); }
T& front() { return store_.front(); }
void pop_front() { lookup_.erase(store_.front()); store_.pop_front(); }
private:
// look-up mechanism
std::unordered_set<c_ref_w> lookup_;
// underlying storage
storage store_;
};
int main()
{
// reference wrapper for int ends up being silly
// but good for larger objects
std::queue<int, my_container<int>> q;
q.push(2);
q.push(3);
q.push(2);
q.push(4);
while(!q.empty()) {
std::cout << q.front() << std::endl;
q.pop();
}
return 0;
}
EDIT: You will want to make my_container a proper model of container (maybe also allocators), but this is another full question. Thanks to Christian Rau for pointing out bugs.
There is one very important point you've not mentioned in your question: whether your queue of items is sorted or has some kind of ordering (a priority queue), or is unsorted (a plain FIFO). The solution you choose will depend only on the answer to this question.
If your queue is unsorted, then maintaining an extra data structure in addition to your queue will be more efficient. Using a second structure which is ordered in some way to maintain the contents of your queue will allow you to check whether an item already exists in your queue much quicker than scanning the queue itself. Adding to the end of an unsorted queue takes constant time and can be done very efficiently.
If your queue must be sorted, then placing the item into the queue requires you to know the item's position in the queue, which requires the queue to be scanned anyway. Once you know an item's position, you know if the item is a duplicate because if it's a duplicate then an item will already exist at that position in the queue. In this case, all work can be performed optimally on the queue itself and maintaining any secondary data structure is unnecessary.
The choice of data structures is up to you. However, for the unsorted case, the secondary data structure should not be any kind of list or array; otherwise, scanning your secondary index will be no more efficient than scanning the original queue itself.