Using lower_bound() and upper_bound() to select records - c++

I have a map of objects, keyed by a date (stored as a double). I want to filter/extract the objects based on date, so I wrote a function similar to the snippet below.
However, I found that if I provide a date that is either lower than the earliest date or greater than the last date, the code fails. I have modified the code so that any input start date lower than the first date is set to the first (i.e. lowest) date in the map; likewise, an end date greater than the last date is set to the last (greatest) date in the map.
void extractDataRecords(const DatedRecordset& recs, OutStruct& out, const double startdt, const double enddt)
{
double first = recs.begin()->first, last = recs.rbegin()->first;
const double sdate = (startdt < first) ? first : startdt;
const double edate = (enddt > last) ? last : enddt;
DatedRecordsetConstIter start_iter = recs.lower_bound(sdate), end_iter = recs.upper_bound(edate);
if ((start_iter != recs.end()) && (end_iter != recs.end()))
{
// do Something
}
}
Is this the correct way to achieve this behaviour?

std::lower_bound returns: "the first position into which value can be inserted without violating the ordering." std::upper_bound returns: "the furthermost position into which value can be inserted without violating the ordering." In other words, if you insert the new item at either position, you're guaranteed that the overall ordering of the collection remains intact.
If you're going to use both anyway, you should probably use std::equal_range instead -- it returns an std::pair of iterators, one that's the same as lower_bound would have returned, and the other the same as upper_bound would have returned. Although it has the same worst-case complexity as calling the two separately, it's usually faster than two separate calls.
It's worth noting, however, that if what you have is really a map (rather than a multimap) there can only be one entry with a given key, so there's not much reason to deal with both lower_bound and upper_bound for any given key.
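For the date-range case in the question, the two bounds are taken on two different keys, and together they form a half-open range that can be walked directly. The following is a minimal sketch, assuming DatedRecordset is a std::map<double, Record>; the Record alias and the function name collectInRange are made up for illustration and do not come from the question.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's types.
using Record = std::string;
using DatedRecordset = std::map<double, Record>;

// Collects every record whose key lies in [startdt, enddt].
std::vector<Record> collectInRange(const DatedRecordset& recs,
                                   double startdt, double enddt)
{
    std::vector<Record> out;
    if (startdt > enddt) return out;  // inverted range: nothing to select

    // First element with key >= startdt (end() if there is none).
    auto first = recs.lower_bound(startdt);
    // First element with key > enddt (end() if there is none).
    auto last = recs.upper_bound(enddt);

    // [first, last) is a valid, possibly empty range. end() is a legal
    // value for either iterator, so the dates do not need clamping.
    for (auto it = first; it != last; ++it)
        out.push_back(it->second);
    return out;
}
```

With this shape, an end date past the last key simply makes the upper bound equal to end(), and the loop still visits every record from the start date onwards.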

From GNU libstdc++
lower_bound:
This function returns the first element of a subsequence of elements that matches the given key. If unsuccessful it returns an iterator pointing to the first element that has a greater value than given key or end() if no such element exists.
Your original approach on using lower_bound sounds correct to me. However, I think you don't need to use upper_bound, you can do a simple comparison with enddt. I would try
for( DatedRecordsetConstIter cit = recs.lower_bound( startdt );
cit != recs.end(); ++cit ) {
if( cit->first > enddt ) {
break;
}
// do stuff with *cit
}

Related

Retrieving index of vector using std::upper_bound, index out of bounds

I am attempting to retrieve a vector's index based on its value using std::upper_bound. For some reason, though, the following code sets tmpKey equal to 2 rather than my expected value of 1. Does anything stick out as being horribly wrong?
int main()
{
float time = 30.0000000;
std::vector<float> positionKeyTimes = { 0.000000000, 30.0000000 };
auto it = std::upper_bound(positionKeyTimes.begin(), positionKeyTimes.end(), time);
auto tmpKey = (size_t)(it - positionKeyTimes.begin());
std::cout << tmpKey << "\n";
std::cin.get();
}
std::upper_bound
Returns an iterator pointing to the first element in the range [first, last) that is greater than value, or last if no such element is found.
There's no element greater than 30 in your vector, so the end iterator is returned.
To get your expected value you could use std::lower_bound instead, which
Returns an iterator pointing to the first element in the range [first, last) that is not less than (i.e. greater or equal to) value, or last if no such element is found.
Remembering that
The range [first, last) must be partitioned with respect to the expression element < value or comp(element, value), i.e., all elements for which the expression is true must precede all elements for which the expression is false. A fully-sorted range meets this criterion.
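The difference between the two calls can be seen side by side. This is a small sketch; the helper names lowerIndex and upperIndex are invented here for illustration, and both assume the vector is already sorted in ascending order.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Index of the first element that is >= value (size() if there is none).
std::size_t lowerIndex(const std::vector<float>& sorted, float value)
{
    auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
    return static_cast<std::size_t>(it - sorted.begin());
}

// Index of the first element that is > value (size() if there is none).
std::size_t upperIndex(const std::vector<float>& sorted, float value)
{
    auto it = std::upper_bound(sorted.begin(), sorted.end(), value);
    return static_cast<std::size_t>(it - sorted.begin());
}
```

For the vector { 0.0f, 30.0f } and the value 30.0f, lowerIndex gives 1 and upperIndex gives 2, which matches the behaviour described in the question.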

C++ Map get first element when second element is X

I have a C++ map called buttonValues as shown below.
map<int, int> buttonValues;
I put some data into my map as shown below.
buttonValues.insert(std::pair<int, int>(0, 1));
buttonValues.insert(std::pair<int, int>(1, 3));
buttonValues.insert(std::pair<int, int>(2, 0));
What I want to do is search for value 0 in the second column and if 0 is found in the second column, return the value in the first column. In this example, the value I would like to be returned is 2. So far I believe I can search for 0 in the second column with this:
buttonValues.find(0)->second
However, how do I get the value corresponding in the first column?
Thanks
Calum
buttonValues.find(0)->second will give you the value ("2nd column") corresponding to key 0. In your example, it will return 1. You need to iterate over the map and look for values = 0 and then return the key:
for (const auto& keyval : buttonValues) // Look at each key-value pair
{
if (keyval.second == 0) // If the value is 0...
{
return keyval.first; // ...return the first element in the pair
}
}
You can put this in a function. Note that a map has unique keys but not necessarily unique values. So you should probably handle the case where you have multiple keys for which the value is 0.
Something like this:
for ( auto X : map_name )
if ( X.second == 0 )
return X.first;
std::pair<> holds values of first and second columns in your map. You can just iterate through all pairs and check second values for what you want.
From a performance perspective, searching a map for a key by its value is not recommended: it takes linear time, O(N), whereas looking up a value by key is O(log N). Depending on your use case, consider building a reverse map or multimap, or even an unordered_map / unordered_multimap.
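One possible shape for such a reverse index is sketched below. The function names buildReverse and keysForValue are hypothetical, and the sketch assumes the forward map does not change after the index is built (otherwise the two must be kept in sync).

```cpp
#include <map>
#include <vector>

// Builds a value -> key index; a multimap because values may repeat.
std::multimap<int, int> buildReverse(const std::map<int, int>& forward)
{
    std::multimap<int, int> reverse;
    for (const auto& kv : forward)
        reverse.emplace(kv.second, kv.first);
    return reverse;
}

// Returns every key whose value equals `value` (empty if there is none).
std::vector<int> keysForValue(const std::multimap<int, int>& reverse, int value)
{
    std::vector<int> keys;
    auto range = reverse.equal_range(value);
    for (auto it = range.first; it != range.second; ++it)
        keys.push_back(it->second);
    return keys;
}
```

Building the index costs one pass over the map, so it only pays off when several lookups by value are expected.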

Find which element is not sorted in a list

I have a list filled with the numbers 3, 7, 10, 8, 12. I'd like to write a line that will tell me which element in the list is not sorted (in this case it is the 4th element). However, the code I have right now tells me the value of the 4th element (8). Is there a way I can rewrite this to tell me it's the 4th element rather than the number 8?
Here is the code I have now:
list<int>::iterator i;
if (!is_sorted(myList.begin(), myList.end())) {
i = is_sorted_until(myList.begin(), myList.end());
cout << *i << endl;
}
The first thing I should say, is that if you care about numerical position, you should be using a random access container, such as std::vector. Then your job would be simple:
// calling is_sorted is a waste if you're about to call is_sorted_until
auto i = is_sorted_until(my_vector.begin(), my_vector.end());
if (i != my_vector.end())
cout << (i - my_vector.begin());
If you must use a list, and you still need the position, then you should write your own algorithm which provides this information. It really shouldn't be that hard; it's just a for loop comparing each element to the one that precedes it. When you find one which compares less than the one which precedes it, you've found your element. Just keep an integer count alongside it, and you're good.
The obvious way would be to simply search for an element that's less than the element that preceded it.
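int position = 2; // 1-based position of pos, the element being checked
auto prev = myList.begin(), pos=std::next(prev, 1);
while (pos != myList.end() && *prev < *pos) {
++position;
++prev;
++pos;
}
// If pos != myList.end(), position is the 1-based position of the first out-of-order element.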
You could use a standard algorithm instead, but they seem somewhat clumsy for this situation.
Does this help?
std::is_sorted_until()
From http://www.cplusplus.com/reference/algorithm/is_sorted_until/:
Find first unsorted element in range
Returns an iterator to the first element in the range [first,last) which does not follow an ascending order.
The range between first and the iterator returned is sorted.
If the entire range is sorted, the function returns last.
The elements are compared using operator< for the first version, and comp for the second.
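The iterator returned by is_sorted_until can also be turned into a position on a std::list with std::distance, at the cost of a second linear walk. A minimal sketch follows; the function name firstUnsortedPosition and the convention of returning 0 for a fully sorted list are choices made here for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>

// Returns the 1-based position of the first out-of-order element,
// or 0 if the whole list is sorted.
std::size_t firstUnsortedPosition(const std::list<int>& values)
{
    auto it = std::is_sorted_until(values.begin(), values.end());
    if (it == values.end()) return 0;
    // std::distance walks the list from begin() to it (linear for std::list).
    return static_cast<std::size_t>(std::distance(values.begin(), it)) + 1;
}
```

For the list 3, 7, 10, 8, 12 this returns 4, since the iterator points at the 8.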

C++ Multimap manipulation

I have created a multimap as I have repeating keys. But I want to do an efficient manipulation so that I can generate a new multimap with subsequent higher keys aligned. This is what I mean:
This is what I have:
key values
11 qwer
11 mfiri
21 iernr
21 ghfnfjf
43 dnvfrf
This is what I want to achieve:
key values
11 qwer,iernr
11 mfiri,iernr
21 iernr,dnvfrf
21 ghfnfjf,dnvfrf
43 dnvfrf
I have about 10 million entries so I am looking for something efficient.
In above value "qwer,iernr" is one string.
Here's a simple way to do it:
auto cur = map.begin();
auto next = map.upper_bound(cur->first);
for(; next != map.end(); next = map.upper_bound(cur->first))
{
for(; cur != next; ++cur)
{
cur->second += ", ";
cur->second += next->second;
}
}
... given a std::multimap<int, std::string> map;
However, any operation transforming 10m+ elements isn't going to be super fast.
It looks like the straightforward way would work fine. Map elements are laid out in ascending key order (assuming the compare operator suits you). So just going through the equal ranges and modifying them with the value of the element just after the range will do what you want.
Clone the map (if you need the original), take the first element, get equal_range() for its key, and modify the values with the value of the second iterator in the range (unless it is the last one). Get equal_range() for the key of the second iterator. Repeat.
I agree with Eugene. Also see the following reference on equal_range():
stl::multimap - how do i get groups of data?
To do this, you need to simply iterate through the map, while building the new map in order.
You can do this in two levels:
for (auto it=map.cbegin(); it != map.cend(); )
{
// The inner loop is over all entries having the same key
auto next_key_it=find_next_key_after(it);
for (; it != next_key_it; ++it) {
new_map.emplace_hint(new_map.end(), it->first, new_value(it->second, next_key_it));
}
}
The new_value function (or lambda) does the value transformation (or not, if the second parameter is map.end()).
The find_next_key_after(it) function returns the same as map.upper_bound(it->first), but could also be implemented as linear search for the first entry with different key.
It depends on your (expected) key distribution, which to use - if keys repeat a small, limited number of times, linear search is better; if the number of different keys is limited, with large equal key ranges, then upper_bound may be better.
For guaranteed complexity, linear search is better: The whole algorithm then has O(n) complexity. Which is as efficient as you can get.
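The linear-search variant can be sketched as a single function. This is an illustration under assumptions: the alias StringMultimap and the name alignWithNextKey are invented here, the separator is a plain comma as in the question's desired output, and the first value of the next-higher key is the one that gets appended.

```cpp
#include <map>
#include <string>
#include <utility>

using StringMultimap = std::multimap<int, std::string>;

// Builds a new multimap in which every value is suffixed with the first
// value of the next-higher key; entries of the highest key are copied as-is.
StringMultimap alignWithNextKey(const StringMultimap& src)
{
    StringMultimap result;
    for (auto it = src.cbegin(); it != src.cend(); ) {
        // Linear search for the first entry with a different key.
        auto next_key_it = it;
        while (next_key_it != src.cend() && next_key_it->first == it->first)
            ++next_key_it;

        // Inner loop over all entries that share the current key.
        for (; it != next_key_it; ++it) {
            std::string value = it->second;
            if (next_key_it != src.cend())
                value += "," + next_key_it->second;
            // Keys arrive in ascending order, so end() is the right hint.
            result.emplace_hint(result.end(), it->first, std::move(value));
        }
    }
    return result;
}
```

Every entry is visited a constant number of times, so the pass is O(n) apart from the string copies.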

Solving the array sum problem using iterators and testing for equality only

While getting ready for interviews, I decided to code the classic "Find if there are two elements in an array that sum up to a given number" question using iterator logic, so that it can be generalized to other containers than vector.
Here's my function so far
// Search given container for two elements with given sum.
// If two such elements exist, return true and the iterators
// pointing to the elements.
bool hasElementSum( int sum, const vector<int>& v, vector<int>::iterator& el1, vector<int>::iterator& el2 )
{
bool ret = false;
el1 = v.begin();
el2 = v.end()-1;
while ( el1 != el2 ) {
if ( *el1 + *el2 == sum ) return true;
++el1;--el2;
}
return false;
}
This, of course, doesn't work, but I couldn't figure out a way to do it without using the condition while ( el1 >= el2 ). Various sources I looked at advise against using only equality checking for iterators, to be able to generalize to all types of containers that support iterators.
Thanks!
First of all, your algorithm is wrong unless you've somehow determined ahead of time that you only need to look at sums where one item is in the first half of the collection, and the other is in the second half of the collection.
If the input's not sorted, then @sbi's answer is about as good as it gets.
With a sorted, random-access input, you can start with the first element, and do a binary search (or interpolation search, etc.) to see if you can find the value that would have to go with that to produce the desired sum. Then you can try the second element, but when you do the binary search (or whatever) use the result from the previous search as the upper limit. Since your first element is larger than the previous one, the matching value to produce the correct sum must be less than or equal to what you found the last time around.
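The binary search with a shrinking upper limit could look roughly like this. It is a sketch only: the function name hasPairWithSum is made up, the input is assumed to be sorted in ascending order, and integer overflow in sum - *it is ignored.

```cpp
#include <algorithm>
#include <vector>

// Assumes `sorted` is in ascending order.
bool hasPairWithSum(const std::vector<int>& sorted, int sum)
{
    auto hi = sorted.end();  // upper limit for the partner search
    for (auto it = sorted.begin(); it < hi; ++it) {
        const int target = sum - *it;
        // Look for the partner strictly after `it`, below the current limit.
        auto pos = std::lower_bound(it + 1, hi, target);
        if (pos != hi && *pos == target) return true;
        // Later elements are no smaller, so their targets are no larger:
        // a partner can never lie at or beyond pos.
        hi = pos;
    }
    return false;
}
```

Each search runs over a range that only shrinks, which is what the reuse of the previous result as the upper limit buys.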
for (auto i = v.begin(); i != v.end(); ++i)
    for (auto j = std::next(i); j != v.end(); ++j) // only pairs after i
        if (*i + *j == sum)
            return true;
return false;
This is O(N^2), since you have to add each element to each of the other elements.
Isn't this question usually asked with a sorted array?
If not, the brute-force approach has O(n^2) complexity, since it checks all possible pairs; sorting the array first brings this down to O(n log n).
I propose the following method, though I did not analyze its order.
Construct a binary search tree with all the elements of the vector. Then, for each element:
foreach(element = vec.begin to vec.end)
{
if element == node.data, skip
if the element + node.data == sum, return true
if the element + node.data > sum, goto left child
if the element + node.data < sum, goto right child
}
Not a perfect solution/algorithm, but something of this kind.
Sorry, I screwed this one up. What I meant to write was a sort followed by a linear pass, which is the typical answer given to this question, as ltsik pointed out in his comment to Jerry, i.e. something like
bool hasElementSum( int sum, vector<int> v, int* ind1, int* ind2 ) // v is taken by value so it can be sorted
{
std::sort( v.begin(), v.end() );
*ind1 = 0; *ind2 = static_cast<int>( v.size() ) - 1;
while ( *ind1 < *ind2 ) {
int s = v[*ind1] + v[*ind2];
if ( s < sum ) ++(*ind1); // sum too small: move the low index up
else if ( s > sum ) --(*ind2); // sum too large: move the high index down
else return true;
}
return false;
}
My question was how to write this using iterators without saying while (iter1 <= iter2 ) in order to be general, but I now see that doesn't make sense because this algorithm needs random access iterators anyway. Also, returning the indexes is meaningless since they refer to the sorted array and not the original one.
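For what it is worth, the scan itself (as opposed to the sort) can be written with nothing more than ++, -- and != on bidirectional iterators. The sketch below assumes the range is already sorted in ascending order; the wrapper listHasElementSum is a made-up example that relies on std::list having its own sort() member.

```cpp
#include <iterator>
#include <list>

// Two-pointer scan over a sorted range using only ++, -- and !=.
// Assumes [first, last) is sorted in ascending order.
template <typename BidirIt, typename T>
bool hasElementSum(BidirIt first, BidirIt last, const T& sum,
                   BidirIt& el1, BidirIt& el2)
{
    if (first == last) return false;  // empty range
    el1 = first;
    el2 = std::prev(last);
    // Each step moves exactly one iterator by one position, so the two
    // must eventually become equal; no ordering comparison is required.
    while (el1 != el2) {
        const T s = *el1 + *el2;
        if (s == sum) return true;
        if (s < sum) ++el1; else --el2;
    }
    return false;
}

// Example wrapper: std::list only offers bidirectional iterators.
bool listHasElementSum(std::list<int> values, int sum)
{
    values.sort();  // sorts the local copy, not the caller's list
    std::list<int>::iterator a, b;
    return hasElementSum(values.begin(), values.end(), sum, a, b);
}
```

As noted above, the iterators returned through el1 and el2 refer to the sorted sequence, so they are only meaningful if the caller sorted the container itself.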