Own heap implementation in C++ - c++

I have to write my own implementation of heap in C++, which stores objects of type:
std::pair<City, int>
where City is a structure to store two integers, which represent city coords and string - city name.
I do know how to do this with plain integers, but using pair of values is a little problematic to me.
I've already started to write my heap class, but, as I said, I don't know how to do this with those pairs.
I want the heap to be sorted by the int value of the pair.

If you know how to do it for ints, you're almost there. Treat the pair objects just as you would treat ints when assigning, but for comparison purposes, use .second instead of the value directly.
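As a rough illustration of that advice, here is a minimal sketch of a binary min-heap over std::pair<City, int>. The City layout and the class name CityHeap are assumptions made up for the example, not taken from the question; the only difference from an int heap is that every comparison goes through .second.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assumed layout of City, following the description in the question.
struct City {
    int x;
    int y;
    std::string name;
};

typedef std::pair<City, int> Element;

// Hypothetical minimal binary min-heap ordered by Element::second.
class CityHeap {
public:
    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

    void push(const Element& e) {
        items_.push_back(e);
        sift_up(items_.size() - 1);
    }

    // Element with the smallest int key.
    const Element& top() const {
        assert(!items_.empty());
        return items_.front();
    }

    void pop() {
        assert(!items_.empty());
        items_.front() = items_.back();   // move the last leaf to the root
        items_.pop_back();
        if (!items_.empty()) sift_down(0);
    }

private:
    void sift_up(std::size_t i) {
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            // Same test as in an int heap, but on .second.
            if (!(items_[i].second < items_[parent].second)) break;
            std::swap(items_[i], items_[parent]);
            i = parent;
        }
    }

    void sift_down(std::size_t i) {
        const std::size_t n = items_.size();
        for (;;) {
            std::size_t smallest = i;
            std::size_t left = 2 * i + 1;
            std::size_t right = 2 * i + 2;
            if (left < n && items_[left].second < items_[smallest].second) smallest = left;
            if (right < n && items_[right].second < items_[smallest].second) smallest = right;
            if (smallest == i) break;
            std::swap(items_[i], items_[smallest]);
            i = smallest;
        }
    }

    std::vector<Element> items_;
};
```

Assignment and swapping treat the pair as one opaque value, so the City part simply travels along with its key. Flipping the two `<` tests in sift_up and sift_down would turn this into a max-heap.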

You could try std::make_heap, which puts a sequence of your pairs into heap order; see this online example. To order by the int value only, pass a C++11 lambda expression that compares the second element of each pair.
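A small sketch of that approach, assuming a City struct shaped like the one described in the question. The helper names build_heap and pop_largest are invented for the example. With a `<` comparison, the standard heap algorithms keep the element with the largest int at the front.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Assumed layout of City, following the description in the question.
struct City {
    int x;
    int y;
    std::string name;
};

typedef std::pair<City, int> Element;

// Compares only the int part of each pair.
auto const by_int = [](Element const& lhs, Element const& rhs) {
    return lhs.second < rhs.second;
};

// Rearranges v into a max-heap keyed on .second.
void build_heap(std::vector<Element>& v) {
    std::make_heap(v.begin(), v.end(), by_int);
}

// Removes and returns the element with the largest int key.
// The same comparator must be used for every heap operation on v.
Element pop_largest(std::vector<Element>& v) {
    assert(!v.empty());
    std::pop_heap(v.begin(), v.end(), by_int);  // moves the largest element to the back
    Element top = v.back();
    v.pop_back();
    return top;
}
```

To get the smallest int at the front instead, the comparison in by_int can be reversed to `lhs.second > rhs.second`.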
Alternatively, if you cannot use any of the STL heap-related algorithms but have a self-made implementation such as
template<typename RandomIt>
void my_make_heap(RandomIt first, RandomIt last)
{
/* some algorithm using `a < b` to do comparisons */
}
you can rewrite it as (or add an overload)
template<typename RandomIt, typename Compare>
void my_make_heap(RandomIt first, RandomIt last, Compare cmp)
{
/* SAME algorithm, but now using `cmp(a, b)` to do comparisons */
}
and then call it as my_make_heap(first, last, int_cmp) where the lambda expression compares pairs like this:
typedef std::pair<City, int> Element;
auto int_cmp = [](Element const& lhs, Element const& rhs) {
return lhs.second < rhs.second;
};
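To make the comparator-taking overload concrete, here is one possible body for my_make_heap: a plain bottom-up sift-down construction. The helper my_sift_down, the wrapper heapify_by_int and the City layout are assumptions added for the sketch; any other heap-building algorithm would be parameterized the same way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: restores the heap property for the subtree rooted at
// `root`, looking only at the first n elements of the range.
template<typename RandomIt, typename Compare>
void my_sift_down(RandomIt first, std::ptrdiff_t root, std::ptrdiff_t n, Compare cmp)
{
    for (;;) {
        std::ptrdiff_t largest = root;
        std::ptrdiff_t left = 2 * root + 1;
        std::ptrdiff_t right = left + 1;
        // Every comparison goes through cmp(a, b) instead of a < b.
        if (left < n && cmp(first[largest], first[left])) largest = left;
        if (right < n && cmp(first[largest], first[right])) largest = right;
        if (largest == root) return;
        std::iter_swap(first + root, first + largest);
        root = largest;
    }
}

template<typename RandomIt, typename Compare>
void my_make_heap(RandomIt first, RandomIt last, Compare cmp)
{
    const std::ptrdiff_t n = last - first;
    assert(n >= 0);
    // Sift down every internal node, starting from the last parent.
    for (std::ptrdiff_t i = n / 2; i-- > 0; )
        my_sift_down(first, i, n, cmp);
}

// Assumed layout of City, following the description in the question.
struct City {
    int x;
    int y;
    std::string name;
};

typedef std::pair<City, int> Element;

auto const int_cmp = [](Element const& lhs, Element const& rhs) {
    return lhs.second < rhs.second;
};

// Convenience wrapper for a vector of pairs.
void heapify_by_int(std::vector<Element>& v)
{
    my_make_heap(v.begin(), v.end(), int_cmp);
}
```

Because the algorithm never looks inside the elements itself, the same template also works for plain ints with a comparator such as std::less<int>().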

So, from what I understand, your structure is something like this:
struct node
{
int x_coord;
int y_coord;
string name;
};
and you need to build the heap on the int value of the pair; call it x.
So your pair is
pair<node, int>
This is very readable code for a heap, implemented in a class.
It can easily be modified to work with your pair<> values.
Just use the .second member of each element as your key value.

Related

Binary search on vector objects of an element greater than an attribute

I have a vector which contains lot of elements of my class X .
I need to find the first occurrence of an element in this vector, say S, such that S.attribute1 > someVariable. someVariable will not be fixed. How can I do a binary search for this (not C++11/C++14)? I could call std::binary_search with a greater-than comparison function (where it would ideally check equality), but would that be wrong? What is the right strategy for fast searching?
A binary search can only be done if the vector is in sorted order according to the binary search's predicate, by definition.
So, unless all elements in your vector for which "S.attribute1 > someVariable" are located after all elements that are not, this is going to be a non-starter, right out of the gate.
If all elements in your vector are sorted in some other way, that "some other way" is the only binary search that can be implemented.
Assuming that they are, you must be using a comparator, of some sort, that specifies strict weak ordering on the attribute, in order to come up with your sorted vector in the first place:
class comparator {
public:
bool operator()(const your_class &a, const your_class &b) const
{
return a.attribute1 < b.attribute1;
}
};
The trick is that if you want to search using the attribute value alone, you need to use a comparator that can be used with std::binary_search which is defined as follows:
template< class ForwardIt, class T, class Compare >
bool binary_search( ForwardIt first, ForwardIt last,
const T& value, Compare comp );
For std::binary_search to succeed, the range [first, last) must be
at least partially ordered, i.e. it must satisfy all of the following
requirements:
for all elements, if element < value or comp(element, value) is true
then !(value < element) or !comp(value, element) is also true
So, the only requirement is that comp(value, element) and comp(element, value) both work. You can pass the attribute value for T, rather than an entire element of the vector, as long as your comparator can deal with it:
class search_comparator {
public:
bool operator()(const your_class &a, const attribute_type &b) const
{
return a.attribute1 < b;
}
bool operator()(const attribute_type &a, const your_class &b) const
{
return a < b.attribute1;
}
};
Now, you should be able to use search_comparator instead of comparator, and do a binary search by the attribute value.
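For the original question (first element whose attribute is strictly greater than a value), std::upper_bound is the algorithm that returns that position directly. The following is a sketch that avoids C++11 features; the one-member your_class, the int attribute_type and the function name first_greater are stand-ins invented for the example, and the vector is assumed to be sorted by attribute1 already.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the asker's class and its attribute type.
struct your_class {
    int attribute1;
};
typedef int attribute_type;

class search_comparator {
public:
    bool operator()(const your_class &a, const attribute_type &b) const
    {
        return a.attribute1 < b;
    }
    bool operator()(const attribute_type &a, const your_class &b) const
    {
        return a < b.attribute1;
    }
};

// Returns an iterator to the first element with attribute1 > limit,
// or v.end() if there is none. Precondition: v is sorted by attribute1.
std::vector<your_class>::const_iterator
first_greater(const std::vector<your_class> &v, const attribute_type &limit)
{
    std::vector<your_class>::const_iterator it =
        std::upper_bound(v.begin(), v.end(), limit, search_comparator());
    assert(it == v.end() || limit < it->attribute1);  // sanity check on the result
    return it;
}
```

std::binary_search would only report whether an equal element exists, while std::lower_bound would give the first element that is not less than the value.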
And, all bets are off, as I said, if the vector is not sorted by the given attribute. In that case, you'll need to std::sort it explicitly first, or come up with some custom container that keeps track of the vector elements in the right order, separately and in addition to the main vector that holds them. Using pointers, perhaps, in which case you should be able to execute a binary search on the pointers themselves, using a similar search comparator that looks at the pointed-to elements instead.
For std::binary_search to succeed, the range needs to be sorted. Both std::binary_search and std::lower_bound work on sorted ranges, so every time you add a new element to your vector you need to keep it sorted.
For this purpose you can use std::lower_bound in your insertion:
class X;
class XCompare
{
public:
bool operator()(const X& first, const X& second) const
{
// your sorting logic
}
};
X value(...);
auto where = std::lower_bound(std::begin(vector), std::end(vector), value, XCompare());
vector.insert(where, value);
And again you can use std::lower_bound to search in your vector:
auto where = std::lower_bound(std::begin(vector), std::end(vector), searching_value, XCompare());
Don't forget to check whether std::lower_bound actually found the value:
bool found = where != std::end(vector) && !(XCompare()(searching_value, *where));
Or use std::binary_search directly if you only want to know whether the element is in the vector.

How can I implement a custom C++ iterator that efficiently iterates over key-value pairs that are backed by two distinct arrays

I want to make use of libstdc++'s __gnu_parallel::multiway_merge to merge four large sequences of sorted key-value pairs at once (to save memory bandwidth).
Each sequence of key-value pairs is represented by two distinct arrays, such that
values[i] is the value associated with keys[i].
The implementation for multiway-merging a single array of keys (keys-only) or an array of std::pairs would be straightforward. However, I need to implement a custom iterator that I can pass to the multiway_merge, which holds one reference to my keys array and one to the corresponding values array.
So my approach looks something like the following:
template<
typename KeyT,
typename ValueT
>
class ForwardIterator : public std::iterator<std::forward_iterator_tag, KeyT>
{
KeyT* k_itr;
ValueT* v_itr;
size_t offset;
explicit ForwardIterator(KeyT* k_start, ValueT *v_start) : k_itr(k_start), v_itr(v_start), offset(0)
{
}
ForwardIterator& operator++ () // Pre-increment
{
offset++;
return *this;
}
};
However, the problems start as soon as I'm getting to the overloading of the dereferencing operator.
Help is really appreciated! Thanks!

std::sort to sort an array and a list of index?

I have a function that takes two vectors of the same size as parameters :
void mysort(std::vector<double>& data, std::vector<unsigned int>& index)
{
// For example :
// The data vector contains : 9.8 1.2 10.5 -4.3
// The index vector contains : 0 1 2 3
// The goal is to obtain for the data : -4.3 1.2 9.8 10.5
// The goal is to obtain for the index : 3 1 0 2
// Using std::sort and minimizing copies
}
How to solve that problem minimizing the number of required copies ?
An obvious way would be to make a single vector of std::pair<double, unsigned int> and specify the comparator by [](std::pair<double, unsigned int> x, std::pair<double, unsigned int> y){return x.first < y.first;} and then to copy the results in the two original vectors but it would not be efficient.
Note : the signature of the function is fixed, and I cannot pass a single vector of std::pair.
Inside the function, make a vector positions = [0,1,2,3...]
Sort positions with the comparator (int x, int y){return data[x]<data[y];}.
Then iterate over positions , doing result.push_back(index[*it]);
This assumes the values in index can be arbitrary. If it is guaranteed to already be [0,1,2..] as in your example, then you don't need to make the positions array; just use index in its place and skip the last copy.
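The steps above could be sketched as follows. This version makes no assumption about the contents of index and rebuilds both vectors through the sorted positions; the local names positions, sorted_data and sorted_index are just for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

void mysort(std::vector<double>& data, std::vector<unsigned int>& index)
{
    assert(data.size() == index.size());

    // positions = [0, 1, 2, ...]
    std::vector<std::size_t> positions(data.size());
    std::iota(positions.begin(), positions.end(), std::size_t(0));

    // Sort the positions by the data value they refer to.
    std::sort(positions.begin(), positions.end(),
              [&data](std::size_t a, std::size_t b) { return data[a] < data[b]; });

    // Gather both vectors in the sorted order, then swap them in.
    std::vector<double> sorted_data;
    std::vector<unsigned int> sorted_index;
    sorted_data.reserve(data.size());
    sorted_index.reserve(index.size());
    for (std::size_t p : positions) {
        sorted_data.push_back(data[p]);
        sorted_index.push_back(index[p]);
    }
    data.swap(sorted_data);
    index.swap(sorted_index);
}
```

Each element is copied once into a temporary vector and the final swaps are constant time, so the extra cost is one pass over each vector plus the temporary storage.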
http://www.boost.org/doc/libs/1_52_0/libs/iterator/doc/index.html#iterator-facade-and-adaptor
Write an iterator over std::pair<double&, unsigned int&> that actually wraps a pair of iterators, one into each vector. The only tricky part is making sure that std::sort realizes that the result is a random access iterator.
If you can't use boost, just write the equivalent yourself.
Before doing this, determine if it is worth your bother. A zip, sort and unzip is easier to write, and programmer time can be exchanged for performance in lots of spots: until you know where it is optimally spent, maybe you should just do a good-enough job and then benchmark where you need to speed things up.
You can use a custom iterator class, which iterates over both vectors in parallel. Its internal members would consist of
Two references (or pointers), one for each vector
An index indicating the current position
The value type of the iterator should be a pair<double, unsigned>. This is because std::sort will not only swap items, but in some cases also temporarily store single values. I wrote more details about this in section 3 of this question.
The reference type has to be some class which again holds references to both vectors and a current index. So you might make the reference type the same as the iterator type, if you are careful. The operator= of the reference type must allow assignment from the value type. And the swap function should be specialized for this reference, to allow swapping such list items in place, by swapping for both lists separately.
You can use a functor class to hold a reference to the value array and use it as the comparator to sort the index array. Then copy the values to a new value array and swap the contents.
struct Comparator
{
Comparator(const std::vector<double> & data) : m_data(data) {}
bool operator()(unsigned int left, unsigned int right) const { return m_data[left] < m_data[right]; }
const std::vector<double> & m_data;
};
void mysort(std::vector<double>& data, std::vector<unsigned int>& index)
{
std::sort(index.begin(), index.end(), Comparator(data));
std::vector<double> result;
result.reserve(data.size());
for (std::vector<unsigned int>::iterator it = index.begin(), e = index.end(); it != e; ++it)
result.push_back(data[*it]);
data.swap(result);
}
This should do it:
std::sort(index.begin(), index.end(), [&data](unsigned i1, unsigned i2)->bool
{ return data[i1]<data[i2]; });
std::sort(data.begin(), data.end());

predicate for a map from string to int

I have this small program that reads a line of input & prints the words in it, with their respective number of occurrences. I want to sort the elements in the map that stores these values according to their occurrences. I mean, the words that only appear once will be ordered to be at the beginning, then the words that appeared twice & so on. I know that the predicate should return a bool value, but I don't know what the parameters should be. Should it be two iterators to the map? If someone could explain this, it would be greatly appreciated. Thank you in advance.
#include<iostream>
#include<map>
#include<string>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::map;
int main()
{
string s;
map<string,int> counters; //store each word & an associated counter
//read the input, keeping track of each word & how often we see it
while(cin>>s)
{
++counters[s];
}
//write the words & associated counts
for(map<string,int>::const_iterator iter = counters.begin();iter != counters.end();iter++)
{
cout<<iter->first<<"\t"<<iter->second<<endl;
}
return 0;
}
std::map is always sorted according to its key. You cannot sort the elements by their value.
You need to copy the contents to another data structure (for example std::vector<std::pair<string, int> >) which can be sorted.
Here is a predicate that can be used to sort such a vector. Note that sorting algorithms in C++ standard library need a "less than" predicate which basically says "is a smaller than b".
bool cmp(std::pair<string, int> const &a, std::pair<string, int> const &b) {
return a.second < b.second;
}
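Putting the copy and the predicate together might look like the sketch below. The function name sorted_by_count is invented for the example, and std::stable_sort is a choice made here (not something the answer prescribes) so that words with equal counts stay in the map's alphabetical order.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// "Less than" predicate on the count only.
bool cmp(std::pair<std::string, int> const &a, std::pair<std::string, int> const &b) {
    return a.second < b.second;
}

// Copies the map into a vector and orders the copy by occurrence count.
std::vector<std::pair<std::string, int> >
sorted_by_count(const std::map<std::string, int>& counters)
{
    std::vector<std::pair<std::string, int> > v(counters.begin(), counters.end());
    std::stable_sort(v.begin(), v.end(), cmp);
    assert(v.size() == counters.size());
    return v;
}
```

The predicate takes two elements of the sequence being sorted (here two pairs), not iterators.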
You can't re-sort a map; its order is predefined (by default, from std::less on the key type). The easiest solution for your problem would be to create a std::multimap<int, string> and insert your values there, then just loop over the multimap, which will be ordered on the key type (int, the number of occurrences). That gives you the order you want without having to define a predicate.
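A minimal sketch of the multimap idea, with the helper name by_count made up for the example. The count becomes the key, so iterating over the result visits the words from least to most frequent.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Builds a count -> word multimap from the word -> count map.
std::multimap<int, std::string>
by_count(const std::map<std::string, int>& counters)
{
    std::multimap<int, std::string> result;
    for (std::map<std::string, int>::const_iterator it = counters.begin();
         it != counters.end(); ++it)
    {
        // Swap key and value: several words may share the same count,
        // which is why a multimap is needed rather than a map.
        result.insert(std::make_pair(it->second, it->first));
    }
    assert(result.size() == counters.size());
    return result;
}
```

Printing then mirrors the loop in the question, with first and second exchanged.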
You are not going to be able to do this with one pass with an std::map. It can only be sorted on one thing at a time, and you cannot change the key in-place. What I would recommend is to use the code you have now to maintain the counters map, then use std::max_element with a comparison function that compares the second field of each std::pair<string, int> in the map.
A map has its keys sorted, not its values. That's what makes the map efficient. You cannot sort it by occurrences without using another data structure (maybe a reversed index!)
As stated, it simply won't work -- a map always remains sorted by its key value, which would be the strings.
As others have noted, you can copy the data to some other structure, and sort by the value. Another possibility would be to use a Boost bimap instead. I've posted a demo of the basic idea previously.
You probably want to copy the map<string,int> into a vector<pair<string, int> > and then sort the vector on the int member. (The element type cannot be pair<const string, int>, because std::sort needs to assign elements.)
You could do
struct PairLessSecond
{
template< typename P >
bool operator()( const P& pairLeft, const P& pairRight ) const
{
return pairLeft.second < pairRight.second;
}
};
You can probably also construct all this somehow using a lambda with a bind.
Now
std::vector< std::pair<std::string,int> > byCount( counters.begin(), counters.end() );
std::sort( byCount.begin(), byCount.end(), PairLessSecond() );

stl predicate with different types

I have a vector of ordered container classes where I need to know the index of the container that has a given element.
So, I would like to do the following, but this obviously doesn't work. I could create a dummy Container to house the date to find, but I was wondering if there was a nicer way.
struct FooAccDateComp
{
bool operator()(const Container& d1, const MyDate& f1) const
{ return d1->myDate < f1; }
};
class Container
{
MyDate myDate;
...
};
vector<Container> mystuff;
MyDate temp(2008, 3, 15);
//add stuff to variable mystuff
int index = int(upper_bound(events.begin(), events.end(),temp, FooAccDateComp())-events.begin());
EDIT: The container class can contain other dates.
upper_bound needs to be able to evaluate expressions like Comp(date,container), but you've only provided Comp(container,date). You'll need to provide both:
struct FooAccDateComp
{
bool operator()(const Container& c, const MyDate& d) const
{ return c.myDate < d; }
bool operator()(const MyDate& d, const Container& c) const
{ return d < c.myDate; }
};
Remember that the vector must be sorted according to this comparison for upper_bound and friends to work.
You don't necessarily need a special predicate, just enable comparison between Container and MyDate.
#include <vector>
struct MyDate {
MyDate(int, int, int);
};
struct Container {
MyDate myDate;
};
// enable comparison between Container and MyDate
bool operator<(Container const&, MyDate const&);
bool operator==(Container const&, MyDate const&);
std::vector<Container> v;
//add stuff to variable mystuff
MyDate temp(2008, 3, 15);
std::vector<Container>::iterator i = std::lower_bound(v.begin(), v.end(), temp);
ptrdiff_t index = i != v.end() && *i == temp ? i - v.begin() : -1;
You can use find_if if you don't mind degrading performance (you said that you have a vector of sorted Container, so binary search would be faster)
Or you can add
struct Container {
MyDate myDate;
operator MyDate () const { return myDate; }
};
bool operator <(MyDate const&, MyDate const&)
{
return /* your logic here */;
}
Now you can use binary search functions
std::vector<Container>::iterator i = std::upper_bound(v.begin(), v.end(), MyDateObject);
Of course, this works only if your vector is sorted by Container.myDate.
Your example is broken in several trivial ways: the class Container should be defined before FooAccDateComp in order for it to be used there, you should make myDate a public member of Container, access that member in the comparison method using .myDate rather than ->myDate, and finally decide whether to call your vector mystuff or events, but not mix both. I'll suppose that appropriate corrections have been made.
You should have defined your comparison function to take a Date parameter as first argument and a Container parameter as second, the opposite of what you did. Or you could use std::lower_bound instead of std::upper_bound if that would suit your purpose (since you don't say what you are going to do with index it is hard to tell), as the choice made in the question is adapted to that. Contrary to what the currently accepted answer says, you do not need both if you are only using std::upper_bound or only std::lower_bound (though you would need both if using std::equal_range, or when using both std::upper_bound and std::lower_bound).
These specifications in the standard may look a bit strange at first sight, but there is a way to understand why they have to be like this without looking them up. When using lower_bound, you want to find the point that separates the Container entries that are (strictly) less than your given Date from those that are not, and this requires calling the comparison function with that Date argument in second position. If however you ask for an upper_bound (as you are), you want to find the point that separates the entries that are not strictly greater than your given Date from those that are, and this requires calling the comparison function with that Date argument in first position (and negating the boolean result it returns). And for equal_range you of course need both possibilities.
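The argument-order point can be illustrated with two one-sided comparators. This is a sketch under assumptions: the MyDate fields, its operator<, and the names ElementLessDate, DateLessElement, lower_index, upper_index and count_on are all invented for the example. It relies on the library calling the comparator only in the order described above, which is what the standard specifies but which some debug-mode library checks may not honor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical date type with a simple lexicographic ordering.
struct MyDate {
    int year;
    int month;
    int day;
};

bool operator<(const MyDate& a, const MyDate& b)
{
    if (a.year != b.year) return a.year < b.year;
    if (a.month != b.month) return a.month < b.month;
    return a.day < b.day;
}

struct Container {
    MyDate myDate;
};

// Only (element, value): the order std::lower_bound uses.
struct ElementLessDate {
    bool operator()(const Container& c, const MyDate& d) const { return c.myDate < d; }
};

// Only (value, element): the order std::upper_bound uses.
struct DateLessElement {
    bool operator()(const MyDate& d, const Container& c) const { return d < c.myDate; }
};

// Index of the first entry that is not less than d.
std::ptrdiff_t lower_index(const std::vector<Container>& v, const MyDate& d)
{
    return std::lower_bound(v.begin(), v.end(), d, ElementLessDate()) - v.begin();
}

// Index of the first entry that is strictly greater than d.
std::ptrdiff_t upper_index(const std::vector<Container>& v, const MyDate& d)
{
    return std::upper_bound(v.begin(), v.end(), d, DateLessElement()) - v.begin();
}

// Number of entries equal to d; using both bounds needs both argument orders.
std::ptrdiff_t count_on(const std::vector<Container>& v, const MyDate& d)
{
    const std::ptrdiff_t lo = lower_index(v, d);
    const std::ptrdiff_t hi = upper_index(v, d);
    assert(lo <= hi);
    return hi - lo;
}
```

Merging the two structs into one comparator with both overloads gives something usable with std::equal_range as well.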