std::pair as key in map - c++

I have a large dataset of images taken at specific times, where each image's capture start_time and stop_time are known and encoded as doubles.
I want to load each consecutive image into my simulation based on the simulation time, i.e. check when the current simulation time falls within the start/stop interval.
I want to use a map for this, where the key is a std::pair<double, double> of start & stop time and the value is the full path to the image.
std::map<std::pair<double, double>, std::string> _sequence; // t1, t2 and full path
My question:
How can I search such a map to find if _currentTime is within an interval pair?

Firstly, don't use a map keyed on std::pair<double, double> if searching for inclusion is the main thing you want to do. That's just not an operation that makes sense with that data structure.
But if you insist, the code would look something like this (in C++11):
bool isWithinInterval() const {
    for (const auto& pr : _sequence) {
        if (_currentTime >= pr.first.first && _currentTime <= pr.first.second) {
            return true;
        }
    }
    return false;
}
Pre-C++11, same idea, just slightly different loop syntax. Ideally, we'd use std::find_if, but it's a hassle to express the map's value_type. In C++14 though, no such hassle:
auto it = std::find_if(_sequence.begin(),
                       _sequence.end(),
                       // init-capture: a member like _currentTime cannot be captured by name directly
                       [t = _currentTime](const auto& pr) {
                           return t >= pr.first.first && t <= pr.first.second;
                       });
return it != _sequence.end();
Or just:
return std::any_of(_sequence.begin(), _sequence.end(),
                   [t = _currentTime](const auto& pr) {
                       return t >= pr.first.first && t <= pr.first.second;
                   });

One approach could be not to use a std::map<std::pair<double, double>, std::string> but rather a std::map<double, std::pair<double, std::string>> keyed on the stop time, with the start time and the full path as the value: m.lower_bound(current_time) then finds the start of the range of elements into which current_time could still fit (everything before it has already stopped). You'd then walk the iterator until it either reaches the end or falls into the relevant interval:
// key: stop time, value: (start time, full path)
auto it = _sequence.lower_bound(current_time);
for (; it != _sequence.end(); ++it) {
    if (it->second.first <= current_time) {
        // found a matching element at it
        break;
    }
}
Using the layout with a std::pair<double, double> as the key has the awkward problem that a lookup needs a second time to build the search key. You could use std::make_pair(current_time, current_time), though.
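For completeness, here is a minimal sketch of such a lookup on the pair-keyed map, assuming the intervals do not overlap; I use +infinity rather than current_time as the second component of the search key so that an interval starting exactly at _currentTime is not skipped (containsCurrentTime is a made-up helper name):
#include <limits>
#include <map>
#include <string>
#include <utility>

// _sequence is the map from the question: key = (start, stop), value = full path
bool containsCurrentTime(const std::map<std::pair<double, double>, std::string>& _sequence,
                         double _currentTime)
{
    // first key strictly greater than (start == _currentTime, stop == +inf)
    auto it = _sequence.upper_bound(
        std::make_pair(_currentTime, std::numeric_limits<double>::infinity()));
    if (it == _sequence.begin())
        return false;                      // every interval starts after _currentTime
    --it;                                  // last interval whose start is <= _currentTime
    return _currentTime <= it->first.second;
}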

double search = 0.; /* or some other value */
bool found = false;
for (const auto& key_value_pair : _sequence) {
    // key_value_pair.first == map key (the start/stop pair)
    // key_value_pair.second == key's associated value (the path)
    if (key_value_pair.first.first <= search && search <= key_value_pair.first.second) {
        found = true;
        break;
    }
}
if (found) {
    /* it's within an interval pair! */
} else {
    /* it's not within an interval pair! */
}
I would also recommend looking into boost::icl (the Boost Interval Container Library).
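For example (a rough sketch from memory, so please double-check against the Boost.ICL documentation), an interval_map lets you look an element up directly by a point inside an interval:
#include <boost/icl/interval_map.hpp>
#include <string>
#include <utility>

// intervals of time mapped to image paths
boost::icl::interval_map<double, std::string> images;

void addImage(double start, double stop, const std::string& path) {
    // note: operator+= aggregates on overlap (for std::string that means concatenation)
    images += std::make_pair(boost::icl::interval<double>::closed(start, stop), path);
}

const std::string* findImage(double time) {
    auto it = images.find(time);           // lookup by a point inside an interval
    return it != images.end() ? &it->second : nullptr;
}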

If possible, don't use std::pair as the key. It doesn't really make sense as a key, because nothing prevents two overlapping ranges from ending up in the map, and then a given time no longer maps to a unique element.
Anyhow, here is how I would implement the solution to such a problem. lower_bound/upper_bound are your friend here. Also, you can avoid the reverse iterator tricks by keying the values on stop time.
#include <map>
#include <iterator>
#include <stdio.h>

struct ImageStuff
{
    double startTime;
    double stopTime;
    char data[1000];
};

typedef std::map<double, ImageStuff> starttime_map_type;
starttime_map_type starttime_map;

ImageStuff & MakeImage (double start, double stop) {
    ImageStuff newImage;
    newImage.startTime = start;
    newImage.stopTime = stop;
    return starttime_map[start] = newImage;
}

starttime_map_type::iterator FindByTime (double time) {
    // the candidate is the last image whose startTime is <= time
    starttime_map_type::reverse_iterator i =
        starttime_map_type::reverse_iterator(starttime_map.upper_bound(time));
    if (i == starttime_map.rend() || time > i->second.stopTime) {
        printf ("Didn't find an image for time %f\n", time);
        return starttime_map.end();
    }
    else {
        printf ("Found an image for time %f\n", time);
        return std::prev(i.base()); // i.base() points one past the found element
    }
}

int main (void)
{
    MakeImage (4.5, 6.5);
    MakeImage (8.0, 12);
    MakeImage (1, 1.2);

    auto i = FindByTime(3);
    i = FindByTime(4.5);
    i = FindByTime(9);
    i = FindByTime(15);

    return 0;
}
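For reference, a rough sketch (my addition) of the "key on stop time" variant mentioned at the top; it avoids the reverse-iterator dance entirely, again assuming the intervals do not overlap:
// same ImageStuff as above, but keyed on stopTime instead of startTime
typedef std::map<double, ImageStuff> stoptime_map_type;
stoptime_map_type stoptime_map;

stoptime_map_type::iterator FindByTimeStopKeyed (double time) {
    // first image whose stopTime is >= time
    auto i = stoptime_map.lower_bound(time);
    if (i == stoptime_map.end() || time < i->second.startTime) {
        return stoptime_map.end();          // no interval contains 'time'
    }
    return i;                               // [startTime, stopTime] contains 'time'
}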

Related

If find_if() takes too long, are there alternatives that can be used for better program performance?

I'm working on a D* Lite path planner in C++. The program maintains a priority queue of cells (U); each cell has two cost values, and a key can be calculated for a cell which determines its order in the priority queue.
using Cost = float;
using HeapKey = pair<Cost, Cost>;
using KeyCompare = std::greater<std::pair<HeapKey, unsigned int>>;
vector<pair<HeapKey, unsigned int>> U;
When a cell is added it is done so by using:
U.push_back({ k, id });
push_heap(U.begin(), U.end(), KeyCompare());
As part of the path planning algorithm cells sometimes need to be removed, and here lies the current problem as far as I can see. I recently had help on this site to speed my program up quite a bit by using push_heap instead of make_heap, but now it seems that the part of the program that removes cells is the slowest part. Cells are removed from the priority queue by:
void DstarPlanner::updateVertex(unsigned int id) {
    ...
    ...
    auto it = find_if(U.begin(), U.end(), [=](auto p) { return p.second == id; });
    U.erase(it);
    ...
    ...
}
From my tests this seems to take roughly 80% of the time my program uses for path planning. It was my hope coming here that a more time-saving method existed.
Thank you.
EDIT - Extra information.
void DstarPlanner::insertHeap(unsigned int id, HeapKey k) {
    U.push_back({ k, id });
    push_heap(U.begin(), U.end(), KeyCompare());
    in_U[id]++;
}

void DstarPlanner::updateVertex(unsigned int id) {
    Cell* u = graph.getCell(id);
    if (u->id != id_goal) {
        Cost mincost = infinity;
        for (auto s : u->neighbors) {
            mincost = min(mincost, graph.getEdgeCost(u->id, s->id) + s->g);
        }
        u->rhs = mincost;
    }
    if (in_U[id]) {
        auto it = find_if(U.begin(), U.end(), [=](auto p) { return p.second == id; });
        U.erase(it);
        in_U[id]--;
    }
    if (u->g != u->rhs) {
        insertHeap(id, u->calculateKey());
    }
}
vector<int> DstarPlanner::ComputeShortestPath() {
    vector<int> bestPath;
    vector<int> emptyPath;
    Cell* n = graph.getCell(id_start);
    while (U.front().first < n->calculateKey() || n->rhs != n->g) {
        auto uid = U.front().second;
        Cell* u = graph.getCell(uid);
        auto kold = U.front().first;
        pop_heap(U.begin(), U.end(), KeyCompare());
        U.pop_back();
        in_U[u->id]--;
        if (kold < u->calculateKey()) {
            insertHeap(u->id, u->calculateKey());
        } else if (u->g > u->rhs) {
            u->g = u->rhs;
            for (auto s : u->neighbors) {
                if (!occupied(s->id)) {
                    updateVertex(s->id);
                }
            }
        } else {
            u->g = infinity;
            for (auto s : u->neighbors) {
                if (!occupied(s->id)) {
                    updateVertex(s->id);
                }
            }
            updateVertex(u->id);
        }
    }
    bestPath = constructPath();
    return bestPath;
}
find_if does a linear search. It may be faster to use:
std::map/std::set -> Standard binary search tree implementations
std::unordered_map/std::unordered_set -> Standard hash table implementations
These may use a lot of memory if your elements (key-value pairs) are small integers. To avoid that you can use 3rd party alternatives like boost::unordered_flat_map.
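For illustration, a rough sketch (my code, not the asker's) of the std::set route: keep the open list ordered by (key, id) and remember each id's current key, so erasing a given id is an O(log N) set erase instead of a linear find_if:
#include <set>
#include <unordered_map>
#include <utility>

using Cost = float;
using HeapKey = std::pair<Cost, Cost>;

// ordered by (key, id); *begin() is always the entry with the smallest key
std::set<std::pair<HeapKey, unsigned int>> open_list;
// the key each id is currently queued with (needed to erase by value)
std::unordered_map<unsigned int, HeapKey> key_of;

void pq_insert(unsigned int id, HeapKey k) {
    // assumes pq_remove(id) was called first if id was already queued
    open_list.insert(std::make_pair(k, id));
    key_of[id] = k;
}

void pq_remove(unsigned int id) {
    auto it = key_of.find(id);
    if (it != key_of.end()) {
        open_list.erase(std::make_pair(it->second, id));  // O(log N), no linear find_if
        key_of.erase(it);
    }
}

std::pair<HeapKey, unsigned int> pq_pop_min() {
    // precondition: open_list is not empty
    auto top = *open_list.begin();   // plays the role of U.front() on the heap
    open_list.erase(open_list.begin());
    key_of.erase(top.second);
    return top;
}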
How do you re-heapify after U.erase(it)? Do you ever delete multiple nodes at once?
If deletions need to be atomic between searches, then you can
swap it with end() - 1,
erase end() - 1, and
re-heapify.
Erasing end() - 1 is O(1) while erasing it is linear in std::distance(it, end).
void DstarPlanner::updateVertex(unsigned int id) {
    ...
    // take the id by reference since this is synchronous
    auto it = find_if(U.begin(), U.end(), [&](const auto& p) { return p.second == id; });
    *it = std::move(*(U.end() - 1));
    U.erase(U.end() - 1);
    std::make_heap(U.begin(), U.end(), KeyCompare()); // expensive!!! 3*distance(begin, end)
    ...
}
If you can delete multiple nodes between searches, then you can use a combination of erase + remove_if to only perform one mass re-heapify. This is important because heapify is expensive.
it = remove_if(begin, end, pred)
erase(it, end)
re-heapify
void DstarPlanner::updateVertex(const std::vector<unsigned int>& sorted_ids) {
    ...
    auto it = remove_if(U.begin(), U.end(), [&](const auto& p) {
        return std::binary_search(sorted_ids.begin(), sorted_ids.end(), p.second);
    });
    U.erase(it, U.end());
    std::make_heap(U.begin(), U.end(), KeyCompare()); // expensive!!! 3*distance(begin, end)
    ...
}
Doing better
You can possibly improve on this by replacing std::make_heap (which makes no assumptions about the heapiness of [begin(), end())) with a custom method that re-heapifies a former heap around "poison points" -- it only needs to inspect the elements around the ones that were swapped. This sounds like a pain to write and I'd only do it if the resulting program was still too slow; a sketch of the idea follows below.
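For what it's worth, here is a sketch of what such a local repair could look like (untested; heap_erase_at is a made-up helper, and comp must be the same comparator the heap was built with, i.e. KeyCompare() here): after moving the last element into the freed slot, it is enough to sift that single element up or down instead of rebuilding the whole heap.
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Remove the element at index i from a heap kept with comparator comp,
// repairing the heap locally in O(log N) instead of calling std::make_heap.
template <typename T, typename Comp>
void heap_erase_at(std::vector<T>& h, std::size_t i, Comp comp) {
    if (i + 1 == h.size()) {                 // erasing the last element: just drop it
        h.pop_back();
        return;
    }
    h[i] = std::move(h.back());              // move the last element into the freed slot
    h.pop_back();

    // sift up: treat [begin, begin + i + 1) as "heap plus one new element at position i"
    std::push_heap(h.begin(), h.begin() + i + 1, comp);

    // sift down: swap with the larger (per comp) child until the heap property holds
    std::size_t n = h.size();
    while (true) {
        std::size_t best = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && comp(h[best], h[l])) best = l;
        if (r < n && comp(h[best], h[r])) best = r;
        if (best == i) break;
        std::swap(h[i], h[best]);
        i = best;
    }
}
Called as something like heap_erase_at(U, std::distance(U.begin(), it), KeyCompare()); this keeps the re-heapify at O(log N). The linear find_if needed to locate the element is a separate problem (an id-to-index map would address that).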
Have you thought of...
Just not even removing elements from the heap? The fact you're using a heap tells me that the algorithm designers suggested a heap. If they suggested a heap, then they likely didn't envision random removals. This is speculation on my part. I'm otherwise not familiar with D* lite.
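If removals are needed but rare relative to pops, another option (my sketch, not something from the D* Lite paper) is lazy deletion: leave stale entries in U and discard them when they reach the top. Here queued_key would be a new, hypothetical member of DstarPlanner:
// in the class definition (hypothetical new member):
// std::unordered_map<unsigned int, HeapKey> queued_key;  // id -> key it is currently queued under

void DstarPlanner::insertHeapLazy(unsigned int id, HeapKey k) {
    U.push_back({ k, id });
    push_heap(U.begin(), U.end(), KeyCompare());
    queued_key[id] = k;                       // the newest entry for this id wins
}

void DstarPlanner::removeVertexLazy(unsigned int id) {
    queued_key.erase(id);                     // stale copies stay in U until they surface
}

// pops the smallest live entry, silently dropping removed/superseded ones
std::pair<HeapKey, unsigned int> DstarPlanner::popLiveTop() {
    while (!U.empty()) {
        auto top = U.front();
        pop_heap(U.begin(), U.end(), KeyCompare());
        U.pop_back();
        auto it = queued_key.find(top.second);
        if (it != queued_key.end() && it->second == top.first) {
            queued_key.erase(it);
            return top;                       // live entry
        }
        // otherwise: a removed or superseded entry, keep popping
    }
    return {};                                // caller must ensure a live entry exists
}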

How to prove/disprove this algorithm time complexity is O(M+N) amortized?

The following problem on leetcode has 2 described solutions. Let N be the number of input equations and M be the number of queries:
One uses union find and is O((M+N)log*(N))
One uses DFS and is O(M*N)
However it seems to me that answering all queries at the end with DFS will have an O(M+N) runtime. The below code passed all tests and was accepted by the OJ.
General outline
Build a graph. Each equation (a/b) = x creates two weighted edges from a to b with weight x, and from b to a with weight 1/x.
I run DFS over all variables and record the connected components. For each variable, I keep track of which connected component it belongs to via component_map.
Each var in the component has a value V = captain/var, where captain was the first variable inserted.
Then for each query I can answer whenever both variables belong to the same component, without any need for backtracking, since (captain/var1) * (var2/captain) = var2/var1.
The key differences between my DFS solution and theirs are:
I do not need to backtrack due to last bullet above
I answer all queries at once at the end
My reasoning is that every single operation I do is amortized O(1), basically with hash maps and vectors. I run DFS inside a loop with N iterations, but the total cost of all DFS calls together is O(N), as every node and edge is only visited once; answering the M queries then adds O(M).
I hence believe this solution to be O(M+N).
Question: Am I correct? Can you prove the time complexity of this algorithm whatever it is?
class Solution {
public:
    typedef unordered_map<string, double> component;      // var --> captain/var
    typedef unordered_map<string, component> components;  // captain --> component, each component identified by its captain
    typedef unordered_map<string, string> component_map;  // var --> captain, i.e. to which component does this var belong?
    typedef unordered_map<string, vector<pair<string, double>>> adjacency_list;

    void DFS(const string& node, component& compo, adjacency_list& adj, double value,
             unordered_set<string>& visited, component_map& m, const string& captain)
    {
        for (auto p : adj[node])
        {
            if (compo.find(p.first) == compo.end())
            {
                visited.insert(p.first);
                m.insert({p.first, captain});              // this letter belongs to this "captain component"
                compo.insert({p.first, value * p.second}); // insert L,V
                DFS(p.first, compo, adj, value * p.second, visited, m, captain);
            }
        }
    }

    vector<double> calcEquation(vector<vector<string>>& equations, vector<double>& values, vector<vector<string>>& queries)
    {
        adjacency_list adj;
        for (int i = 0; i < equations.size(); ++i)
        {
            string a = equations[i][0];
            string b = equations[i][1];
            double v = values[i];
            auto it = adj.find(a);
            if (it == adj.end())
            {
                adj.insert({a, {}});
                it = adj.find(a);
            }
            it->second.push_back({b, v});
            it = adj.find(b);
            if (it == adj.end())
            {
                adj.insert({b, {}});
                it = adj.find(b);
            }
            it->second.push_back({a, 1 / v});
        }

        components cps;
        unordered_set<string> visited;
        component_map m;
        for (int i = 0; i < equations.size(); ++i)
        {
            string a = equations[i][0];
            if (visited.find(a) == visited.end())
            {
                auto it = cps.insert({a, {}}).first;
                DFS(a, it->second, adj, 1, visited, m, a);
            }
            string b = equations[i][1];
            if (visited.find(b) == visited.end())
            {
                auto it = cps.insert({b, {}}).first;
                DFS(b, it->second, adj, 1, visited, m, b); // b is the captain of this new component
            }
        }

        vector<double> res;
        for (auto& q : queries)
        {
            auto it0 = m.find(q[0]);
            auto it1 = m.find(q[1]);
            if (it0 != m.end() && it1 != m.end() && it0->second == it1->second)
            {
                auto& captain = it0->second;
                auto& cp = cps[captain];
                res.push_back(cp[q[1]] / cp[q[0]]);
            }
            else
            {
                res.push_back(-1.0);
            }
        }
        return res;
    }
};

Finding and erasing a value from a std::vector holding std::map elements

First, I have the following two objects, both filled with data:
std::vector<std::map<std::uint8_t, std::uint8_t>> x1;
std::vector<std::map<std::uint8_t, std::uint8_t>> x2;
My objective is to search inside x2 (by the key), checking if any value from x1 doesn't exist inside x2, and then erase it from x1.
I tried with the following code snippet, but to no avail (it doesn't compile!):
for (auto i = x1.begin(); i != x1.end(); ++i)
{
    auto it = std::find(x2.begin(), x2.end(), i);
    if (it == x2.end())
    {
        x1.erase(i);
    }
}
What am I doing wrong? Could you please share some insights on how to solve this problem?
There are several problems with your code:
std::find() searches for a single matching element, which in this case means you have to give it a std::map to search for. But you are passing in the i iterator itself, not the std::map that it refers to. You need to dereference i, eg:
auto it = std::find(x2.cbegin(), x2.cend(), *i);
When calling x1.erase(i), i becomes invalidated, which means the loop cannot use i anymore - not for ++i, not for i != x1.end(). You need to save the new iterator that erase() returns, which refers to the next element after the one being erased. Which means you also need to update your loop logic to NOT increment i when erase() is called, eg:
for (auto i = x1.cbegin(); i != x1.cend(); )
{
    auto it = std::find(x2.cbegin(), x2.cend(), *i);
    if (it == x2.cend())
        i = x1.erase(i);
    else
        ++i;
}
Lastly, when using std::find(), you are comparing entire std::map objects to each other. If you are interested in comparing only the keys, try something more like this:
for (auto i = x1.cbegin(); i != x1.cend(); )
{
    const auto &m1 = *i;
    auto it = std::find_if(m1.cbegin(), m1.cend(),
        [&](const decltype(x1)::value_type::value_type &m1_pair) { // or (const auto &m1_pair) in C++14...
            return std::find_if(x2.cbegin(), x2.cend(),
                [&](const decltype(x2)::value_type &m2) { // or (const auto &m2) in C++14...
                    return m2.find(m1_pair.first) != m2.cend();
                }
            ) != x2.cend();
        }
    );
    if (it == m1.cend())
        i = x1.erase(i);
    else
        ++i;
}
You can also go a little bit functional:
#include <algorithm>
#include <functional>

// removes maps from x1 that are equal to none of x2's maps
auto remove_start = std::remove_if(x1.begin(), x1.end(), [&](const auto& x1_map) {
    return std::none_of(x2.begin(), x2.end(),
                        std::bind(std::equal_to<>(), x1_map, std::placeholders::_1));
});
x1.erase(remove_start, x1.end());
EDIT: To check keys only, change std::equal_to to a custom lambda
auto keys_equal = [](auto& m1, auto& m2) {
    return m1.size() == m2.size()
        && std::equal(m1.begin(), m1.end(), m2.begin(),
                      [](auto& kv1, auto& kv2) { return kv1.first == kv2.first; });
};

// removes maps from x1 that are equal to none of x2 maps
auto remove_start =
    std::remove_if(x1.begin(), x1.end(), [&](const auto& x1_map) {
        return std::none_of(x2.begin(), x2.end(),
                            std::bind(keys_equal, x1_map, std::placeholders::_1));
    });
x1.erase(remove_start, x1.end());

Get the closest element to a given element in an std::set

I have a (sorted) set of unsigned int's. I need to find the closest element to a given number.
I am looking for a solution using the standard library.
My first idea was to use binary search, but the STL's implementation only tells you whether the element exists.
This post, Find Closest Element in a Set, was helpful and I implemented a solution based on std::lower_bound method,
(*Assuming the set has more than 2 elements, no empty/boundary checks are made):
#include <iostream>
#include <set>
#include <algorithm>
#include <cmath>

int main()
{
    std::set<unsigned int> mySet = {34, 256, 268, 500, 502, 444};
    unsigned int searchedElement = 260;
    unsigned int closestElement;

    auto lower_bound = mySet.lower_bound(searchedElement);
    if (lower_bound == mySet.end()) {
        closestElement = *(--lower_bound);
    }

    std::set<unsigned int>::iterator prevElement = --lower_bound;
    bool isPrevClosest = std::abs(*prevElement - searchedElement) > std::abs(*lower_bound - searchedElement);
    closestElement = isPrevClosest ? *prevElement : *lower_bound;

    std::cout << closestElement << std::endl;
    return 0;
}
Is there a simpler more standard solution?
I don't think there is a better solution than using .lower_bound. You can wrap your algorithm into a function template:
template<typename Set>
auto closest_element(Set& set, const typename Set::value_type& value)
    -> decltype(set.begin())
{
    const auto it = set.lower_bound(value);
    if (it == set.begin())
        return it;

    const auto prev_it = std::prev(it);
    return (it == set.end() || value - *prev_it <= *it - value) ? prev_it : it;
}
This function handles all corner cases (empty set, one element, first element, last element) correctly.
Example:
std::set<unsigned int> my_set{34, 256, 268, 500, 502, 444};
std::cout << *closest_element(my_set, 26); // Output: 34
std::cout << *closest_element(my_set, 260); // Output: 256
std::cout << *closest_element(my_set, 620); // Output: 502
Note that std::abs in your code does (almost) nothing: its argument has unsigned type and is always non-negative. But we know that std::set elements are ordered, hence we know that *prev_it <= value <= *it, and no std::abs() is needed.
You could use std::min_element(): as the comparator, give it a lambda that compares the absolute differences, e.g.
std::min_element(mySet.begin(), mySet.end(),
    [searchedElement](const unsigned int a, const unsigned int b) {
        return std::abs(searchedElement - a) < std::abs(searchedElement - b);
    });
However, do note that this no longer performs a binary search; std::min_element scans the whole set.
EDIT : Also, as stated in comments below, std::abs(x - y) for unsigned int values may return an unexpectedly large integer when x < y.
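A possible workaround (a sketch, still a linear scan like min_element itself) is to compute the distance without relying on a signed abs:
// distance that is safe for unsigned operands
auto dist = [](unsigned int a, unsigned int b) { return a > b ? a - b : b - a; };

auto closest = std::min_element(mySet.begin(), mySet.end(),
    [&](unsigned int a, unsigned int b) {
        return dist(searchedElement, a) < dist(searchedElement, b);
    });
// *closest is the element of mySet nearest to searchedElement (mySet must not be empty)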
The std::set container is suitable for finding adjacent elements, i.e., finding the element that succeeds or precedes a given element. Considering the problem that you are facing:
I am looking for a solution using the standard library, my first solution was to use binary search, but STL's implementation only returns if the element exists.
There is still an approach you can follow without changing your logic: If the element – whose closest element you want to find – does not exist in the set, then you simply insert it in the set (it takes logarithmic time in the size of the set). Next, you find the closest element to this just added element. Finally, remove it from the set when you are done so that the set remains the same as before.
Of course, if the element was already in the set, nothing has to be inserted into or removed from the set. Therefore, you need to keep track of whether or not you added that element.
The following function is an example of the idea elaborated above:
#include <set>
#include <iterator>
#include <limits>

// Note: assumes s is not empty. The set is passed by reference so the temporary
// insert/erase is cheap, and it is restored to its original contents before returning.
unsigned int find_closest_element(std::set<unsigned int>& s, unsigned int val) {
    bool remove_elem = false;
    auto it = s.find(val);
    // does val exist in the set?
    if (s.end() == it) {
        // element does not exist in the set, insert it
        it = s.insert(val).first;
        remove_elem = true;
    }
    // find previous and next element
    auto prev_it = (it == s.begin() ? s.end() : std::prev(it));
    auto next_it = std::next(it);
    // remove inserted element if applicable
    if (remove_elem)
        s.erase(it);
    unsigned int d1, d2;
    d1 = d2 = std::numeric_limits<unsigned int>::max();
    if (prev_it != s.end())
        d1 = val - *prev_it;
    if (next_it != s.end())
        d2 = *next_it - val;
    return d1 <= d2 ? *prev_it : *next_it;
}

Checking if reducing iterator points to a valid element

I need to know whether I can decrement the iterator and still get a valid object. The code below errors out because I decrement the iterator to a position that doesn't exist. How can I detect that so I don't get the error?
ticks.push_front(Tick(Vec3(0, 0, 5), 0));
ticks.push_front(Tick(Vec3(0, 0, 8), 100));
ticks.push_front(Tick(Vec3(0, 0, 10), 200));

bool found = false;
list<Tick, allocator<Tick>>::iterator iter;
for (iter = ticks.begin(); iter != ticks.end(); ++iter)
{
    Tick t = (*iter);
    if (214 >= t.timestamp)
    {
        prior = t;
        if (--iter != ticks.end())
        {
            next = (*--iter);
            found = true;
            break;
        }
    }
}
I'm trying to find the entries directly "above" and directly "below" the value 214 in the list. If only 1 exists then I don't care. I need above and below to exist.
After your edits to the question, I think I can write a better answer than what I had before.
First, write a comparison function for Ticks that uses their timestamps:
bool CompareTicks(const Tick& l, const Tick& r)
{
    return l.timestamp < r.timestamp;
}
Now use the function with std::upper_bound:
// Get an iterator pointing to the first element in ticks that is > 214
// I'm assuming the second parameter to Tick's ctor is the timestamp
auto itAbove = std::upper_bound(ticks.begin(), ticks.end(), Tick(Vec3(0, 0, 0), 214), CompareTicks);
if(itAbove == ticks.end())
    ; // there is nothing in ticks > 214. I don't know what you want to do in this case.
This will give you the first element in ticks that is > 214. Next, you can use lower_bound to find the first element that is >= 214:
// get an iterator pointing to the first element in ticks that is >= 214
// I'm assuming the second parameter to Tick's ctor is the timestamp
auto itBelow = std::lower_bound(ticks.begin(), ticks.end(), Tick(Vec3(0, 0, 0), 214), CompareTicks);
You have to do one extra step with itBelow now to get the element just before 214, taking care not to go past the beginning of the list:
if(itBelow == ticks.begin())
    ; // there is nothing in ticks < 214. I don't know what you want to do in this case.
else
    --itBelow;
Now, assuming you didn't hit any of the error cases, itAbove is pointing to the first element > 214, and itBelow is pointing to the last element < 214.
This assumes your Ticks are in order by timestamp, which seems to be the case. Note also that this technique will work even if there are multiple 214s in the list. Finally, you said the list is short so it's not really worth worrying about time complexity, but this technique could get you logarithmic performance if you also replaced the list with a vector, as opposed to linear for iterative approaches.
The answer to your core question is simple. Don't increment if you are at the end. Don't decrement if you are at the start.
Before incrementing, check.
if ( iter == ticks.end() )
Before decrementing, check.
if ( iter == ticks.begin() )
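In other words (a small sketch using std::prev from <iterator>):
// only look at the previous element when there is one
if (iter != ticks.begin()) {
    auto prevIter = std::prev(iter);   // safe: prevIter refers to a valid Tick
    // ... use *prevIter ...
}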
Your particular example
Looking at what you are trying to accomplish, I suspect you meant to use:
if (iter != ticks.begin())
instead of
if (--iter != ticks.end())
Update
It seems you are relying on the contents of your list being sorted by timestamp.
After your comment, I think what you need is:
if (214 >= t.timestamp)
{
    prior = t;
    if (++iter != ticks.end())
    {
        next = *iter;
        if (214 <= next.timestamp)
        {
            found = true;
            break;
        }
    }
}
Update 2
I agree with the comment made by #crashmstr. Your logic can be:
if (214 <= t.timestamp)
{
    next = t;
    if (iter != ticks.begin())
    {
        prior = *--iter;
        found = true;
        break;
    }
}
I think you can do what you want with std::adjacent_find from the standard library <algorithm>. By default std::adjacent_find looks for two consecutive identical elements but you can provide your own function to define the relationship you are interested in.
Here's a simplified example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>

struct matcher
{
    matcher(int value) : target(value) {}
    bool operator()(int lo, int hi) const {
        return (lo < target) && (target < hi);
    }
    int target;
};

int main()
{
    std::list<int> ticks = { 0, 100, 200, 300 };

    auto it = std::adjacent_find(ticks.begin(), ticks.end(), matcher(214));

    if (it != ticks.end()) {
        std::cout << *it << ' ' << *std::next(it) << '\n';
    } else {
        std::cout << "not found\n";
    }
}
This outputs 200 300, the two "surrounding" values it found.