I have a (sorted) set of unsigned ints. I need to find the closest element to a given number.
I am looking for a solution using the standard library.
My first solution was to use binary search, but the STL implementation only tells you whether the element exists.
This post, Find Closest Element in a Set, was helpful and I implemented a solution based on the std::lower_bound method
(assuming the set has more than 2 elements; no empty/boundary checks are made):
#include <iostream>
#include <set>
#include <algorithm>
#include <cmath>

int main()
{
    std::set<unsigned int> mySet = {34, 256, 268, 500, 502, 444};
    unsigned int searchedElement = 260;
    unsigned int closestElement;

    auto lower_bound = mySet.lower_bound(searchedElement);
    if (lower_bound == mySet.end()) {
        closestElement = *(--lower_bound);
    }
    std::set<unsigned int>::iterator prevElement = --lower_bound;

    bool isPrevClosest = std::abs(*prevElement - searchedElement) > std::abs(*lower_bound - searchedElement);
    closestElement = isPrevClosest ? *prevElement : *lower_bound;
    std::cout << closestElement << std::endl;
    return 0;
}
Is there a simpler more standard solution?
I don't think there is a better solution than using .lower_bound. You can wrap your algorithm into a function template:
template<typename Set>
auto closest_element(Set& set, const typename Set::value_type& value)
-> decltype(set.begin())
{
const auto it = set.lower_bound(value);
if (it == set.begin())
return it;
const auto prev_it = std::prev(it);
return (it == set.end() || value - *prev_it <= *it - value) ? prev_it : it;
}
This function handles all corner cases (empty set, one element, first element, last element) correctly.
Example:
std::set<unsigned int> my_set{34, 256, 268, 500, 502, 444};
std::cout << *closest_element(my_set, 26); // Output: 34
std::cout << *closest_element(my_set, 260); // Output: 256
std::cout << *closest_element(my_set, 620); // Output: 502
Note that std::abs in your code does (almost) nothing: its argument has unsigned type and is always non-negative. But since std::set elements are ordered, we know that *prev_it <= value <= *it, so no std::abs() is needed.
You could use std::min_element(): as a comparator, give it a lambda that compares the absolute diffs, e.g.
std::min_element(mySet.begin(), mySet.end(), [searchedElement](const unsigned int a, const unsigned int b) {
return std::abs(searchedElement - a) < std::abs(searchedElement - b);
});
Note, however, that this performs a linear scan rather than a binary search...
EDIT: Also, as stated in the comments below, std::abs(x - y) for unsigned int values may return an unexpectedly large integer when x < y, because the unsigned difference wraps around.
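To illustrate the wrap-around (a minimal sketch; the printed value assumes a 32-bit unsigned int):
#include <iostream>

int main() {
    unsigned int x = 2, y = 5;
    // x - y cannot be -3; the unsigned result wraps around instead:
    std::cout << x - y << '\n'; // prints 4294967293 with a 32-bit unsigned int
}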
The std::set container is suitable for finding adjacent elements, i.e., finding the element that succeeds or precedes a given element. Considering the problem that you are facing:
I am looking for a solution using the standard library, my first solution was to use binary search, but STL's implementation only returns if the element exists.
There is still an approach you can follow without changing your logic: If the element – whose closest element you want to find – does not exist in the set, then you simply insert it in the set (it takes logarithmic time in the size of the set). Next, you find the closest element to this just added element. Finally, remove it from the set when you are done so that the set remains the same as before.
Of course, if the element was already in the set, nothing has to be inserted into or removed from the set. Therefore, you need to keep track of whether or not you added that element.
The following function is an example of the idea elaborated above:
#include <set>
#include <limits>

// Note: the set is taken by reference; a temporarily inserted element is
// erased again before returning, so the set is left unchanged.
unsigned int find_closest_element(std::set<unsigned int>& s, unsigned int val) {
    bool remove_elem = false;
    auto it = s.find(val);
    // does val exist in the set?
    if (s.end() == it) {
        // element does not exist in the set, insert it
        it = s.insert(val).first;
        remove_elem = true;
    }
    // find previous and next element
    auto prev_it = (it == s.begin() ? s.end() : std::prev(it));
    auto next_it = std::next(it);
    // remove inserted element if applicable
    if (remove_elem)
        s.erase(it);
    unsigned int d1, d2;
    d1 = d2 = std::numeric_limits<unsigned int>::max();
    if (prev_it != s.end())
        d1 = val - *prev_it;
    if (next_it != s.end())
        d2 = *next_it - val;
    return d1 <= d2 ? *prev_it : *next_it;
}
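For example (a quick sketch of calling the function above on a non-empty set):
std::set<unsigned int> s = {34, 256, 268, 444, 500, 502};
std::cout << find_closest_element(s, 260) << std::endl; // prints 256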
For C++, what's the fastest way at run time (on multi-core processors), from an algorithm-design viewpoint, to search for numbers within a range (e.g. between 100 and 1000) in an array (or slice, or whatever faster data structure suits this purpose) and return at most 10 of them? e.g. pseudocode in Go:
var listofnums []uint64
numcounter := 1
// slice of [1,2,3,4,5,31,32 .. 932536543]; this list has 1 billion numeric items.
// listofnums is already sorted each time an item is added, but we do not know
// the lower_bound or upper_bound of the item list.
// I know I can use binary search to find where listofnums[i] is smallest too...
// I'm asking for suggestions.
for i := uint(0); i < len(listofnums); i++ {
    if listofnums[i] > 100 && listofnums[i] < 1000 {
        if numcounter == 10 {
            return
        }
        fmt.Printf("%d\n", listofnums[i])
        numcounter++
    }
}
Is this the fastest way? I saw bitmap structures in C++ but I'm not sure if they can be applied here.
I've come across this question, which is perfectly fine for veteran programmers to ask, but I have no idea why it's downvoted:
What is the fastest search method for array?
Can someone please not remove this question but let me rephrase it? Thanks in advance. I hope to find the most optimal way to return a range of numbers from a large array of numeric items.
If I understand your problem correctly, you need to find two positions in your array: the position of the first number greater than or equal to 100, and the position of the first number greater than 1000. The elements between them form your range.
The functions std::lower_bound and std::upper_bound do binary searches designed to find such a range.
For arrays, in C++ we usually use a std::vector and denote the beginning and end of ranges using a pair of iterators.
So something like this may be what you need:
#include <algorithm>
#include <utility>
#include <vector>

std::pair<std::vector<int>::iterator, std::vector<int>::iterator>
find_range(std::vector<int>& v, int min, int max)
{
    auto begin = std::lower_bound(std::begin(v), std::end(v), min);
    // start searching after the previously found value
    auto end = std::upper_bound(begin, std::end(v), max);
    return {begin, end};
}
You can iterate over that range like this:
auto range = find_range(v, 100, 1000);
for(auto i = range.first; i != range.second; ++i)
std::cout << *i << '\n';
You can create a new vector from the range (slow) like this:
std::vector<int> selection{range.first, range.second};
My first attempt.
Features:
O(log N) time complexity
creates an array slice, no copying of data
the second binary search minimises the search space on the basis of the first
Possible improvement:
if n is small, the second binary search would be a pessimisation; it would be better to simply count forward up to n times (see the sketch after the code below).
#include <vector>
#include <cstdint>
#include <algorithm>
#include <iterator>
#include <iostream>
template <class Iter> struct range
{
range(Iter first, std::size_t size) : begin_(first), end_(first + size) {}
auto begin() const { return begin_; }
auto end() const { return end_; }
Iter begin_, end_;
};
template<class Iter> range(Iter, std::size_t) -> range<Iter>;
auto find_first_n_between(std::vector<std::int64_t>& vec,
std::size_t n,
std::int64_t from, std::int64_t to)
{
auto lower = std::lower_bound(begin(vec), end(vec), from);
auto upper = std::upper_bound(lower, end(vec), to);
auto size = std::min(n, std::size_t(std::distance(lower, upper)));
return range(lower, size);
}
int main()
{
std::vector<std::int64_t> vec { 1,2,3,4,5,6,7,8,15,17,18,19,20 };
auto slice = find_first_n_between(vec, 5, 6, 15);
std::copy(std::begin(slice), std::end(slice), std::ostream_iterator<std::int64_t>(std::cout, ", "));
}
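Here is a rough sketch of the counting variant mentioned above (my own assumption, reusing the range template from the code above): instead of a second binary search, advance at most n steps from the lower bound.
auto find_first_n_between_counting(std::vector<std::int64_t>& vec,
                                   std::size_t n,
                                   std::int64_t from, std::int64_t to)
{
    auto lower = std::lower_bound(begin(vec), end(vec), from);
    // walk forward at most n elements, stopping at the first value > to
    std::size_t size = 0;
    for (auto it = lower; it != end(vec) && *it <= to && size < n; ++it)
        ++size;
    return range(lower, size);
}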
I am using a multi-set in C++, which I believe stores an element and the respective count of it when it is inserted.
Here, when I want to delete an element, I just want to decrease the count of that element in the set by 1 till it is greater than 0.
Example C++ code:
multiset<int> mset;
mset.insert(2);
mset.insert(2);
printf("%d ", (int)mset.count(2)); // this prints 2
// here I need an O(1) constant-time function (built-in or whatever)
// to decrease the count of 2 in the set without deleting it.
// Remember, constant time only:
// -> function and its specification
printf("%d ", (int)mset.count(2)); // it should print 1 now
Is there any way to achieve that, or should I go about it by deleting the element 2 entirely and re-inserting it the required (count - 1) times?
... I am using a multi-set in C++, which stores an element and the respective count of it ...
No you aren't. You're using a multi-set which stores n copies of a value which was inserted n times.
If you want to store something relating a value to a count, use an associative container like std::map<int, int>, and use map[X]++ to increment the number of Xs.
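For example (a tiny sketch of that map-based counting):
std::map<int, int> counts;
counts[2]++;           // count of 2 becomes 1
counts[2]++;           // count of 2 becomes 2
if (--counts[2] == 0)  // decrement, erasing the key when it reaches zero
    counts.erase(2);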
... i need an O(1) constant time function ... to decrease the count ...
Both map and set have O(log N) complexity just to find the element you want to alter, so this is impossible with them. Use std::unordered_map/set to get O(1) complexity.
... I just want to decrease the count of that element in the set by 1 till it is >0
I'm not sure what that means.
with a set:
to remove all copies of an element from the set, use equal_range to get a range (pair of iterators), and then erase that range
to remove all-but-one copies in a non-empty range, just increment the first iterator in the pair and check it's still not equal to the second iterator before erasing the new range.
these both have an O(log N) lookup (equal_range) step followed by a linear-time erase step (although it's linear with the number of elements having the same key, not N).
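A minimal sketch of those two set operations (assuming a std::multiset<int> named mset):
// remove all copies of 2
auto all = mset.equal_range(2);
mset.erase(all.first, all.second);

// remove all-but-one copies of 3 (if any are present)
auto some = mset.equal_range(3);
if (some.first != some.second)
    mset.erase(std::next(some.first), some.second);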
with a map:
to remove the count from a map, just erase the key
to set the count to one, just use map[key]=1;
both of these have an O(log N) lookup followed by a constant-time erase
with an unordered map ... for your purposes it's identical to the map above, except with O(1) complexity.
Here's a quick example using unordered_map:
#include <unordered_map>

template <typename Key>
class Counter {
    std::unordered_map<Key, unsigned> count_;
public:
    unsigned inc(Key k, unsigned delta = 1) {
        auto result = count_.emplace(k, delta);
        if (result.second) {
            return delta;
        } else {
            unsigned& current = result.first->second;
            current += delta;
            return current;
        }
    }
    unsigned dec(Key k, unsigned delta = 1) {
        auto iter = count_.find(k);
        if (iter == count_.end()) return 0;
        unsigned& current = iter->second;
        if (current > delta) {
            current -= delta;
            return current;
        }
        // else current <= delta means zero
        count_.erase(iter);
        return 0;
    }
    unsigned get(Key k) const {
        auto iter = count_.find(k);
        if (iter == count_.end()) return 0;
        return iter->second;
    }
};
and use it like so:
#include <cassert>

int main() {
    Counter<int> c;
    // test increment
    assert(c.inc(1) == 1);
    assert(c.inc(2) == 1);
    assert(c.inc(2) == 2);
    // test lookup
    assert(c.get(0) == 0);
    assert(c.get(1) == 1);
    // test simple decrement
    assert(c.get(2) == 2);
    assert(c.dec(2) == 1);
    assert(c.get(2) == 1);
    // test erase and underflow
    assert(c.dec(2) == 0);
    assert(c.dec(2) == 0);
    assert(c.dec(1, 42) == 0);
}
I'm trying to find a sensible algorithm to combine multiple lists/vectors/arrays as defined below.
Each element contains a float declaring the start of its range of validity and a constant that is used over this range. Where ranges from different lists overlap their constants need to be added to produce one global list.
I've written the lists out below as intervals to try and give a good idea of what I mean:
First list (start of range → constant over that range):
[0.5, 2) → a1, [2, 3.2) → a2, [3.2, 4) → a3
Second list:
[1, 2) → b1, [2, 3) → b2, [3, 4.5) → b3
Desired output:
[0.5, 1) → a1, [1, 2) → a1+b1, [2, 3) → a2+b2, [3, 3.2) → a2+b3, [3.2, 4) → a3+b3, [4, 4.5) → b3
I can't think of a sensible way of going about this in the case of n lists; Just 2 is quite easy to brute force.
Any hints or ideas would be welcome. Each list is represented as a C++ std::vector (so feel free to use standard algorithms) and are sorted by start of range value.
Cheers!
Edit: Thanks for the advice, I've come up with a naive implementation, not sure why I couldn't get here on my own first. To my mind the obvious improvement would be to store an iterator for each vector since they're already sorted and not have to re-traverse each vector for each point. Given that most vectors will contain less than 100 elements, but there may be many vectors this may or may not be worthwhile. I'd have to profile to see.
Any thoughts on this?
#include <vector>
#include <iostream>
#include <algorithm>
struct DataType
{
double intervalStart;
int data;
// More data here, the data is not just a single int, but that
// works for our demonstration
};
int main(void)
{
// The final "data" of each vector is meaningless as it refers to
// the coming range which won't be used as this is only for
// bounded ranges
std::vector<std::vector<DataType> > input = {{{0.5, 1}, {2.0, 3}, {3.2, 3}, {4.0, 4}},
{{1.0, 5}, {2.0, 6}, {3.0, 7}, {4.5, 8}},
{{-34.7895, 15}, {-6.0, -2}, {1.867, 5}, {340, 7}}};
// Setup output vector
std::vector<DataType> output;
std::size_t inputSize = 0;
for (const auto& internalVec : input)
inputSize += internalVec.size();
output.reserve(inputSize);
// Fill output vector
for (const auto& internalVec : input)
std::copy(internalVec.begin(), internalVec.end(), std::back_inserter(output));
// Sort output vector by intervalStartPoints
std::sort(output.begin(), output.end(),
[](const DataType& data1, const DataType& data2)
{
return data1.intervalStart < data2.intervalStart;
});
// Remove DataTypes with same intervalStart - each interval can only start once
output.erase(std::unique(output.begin(), output.end(),
[](const DataType& dt1, const DataType& dt2)
{
return dt1.intervalStart == dt2.intervalStart;
}), output.end());
// Output now contains all the right intersections, just not with the right data
// Lambda to find the associated data value associated with an
// intervsalStart value in a vector
auto FindDataValue = [&](const std::vector<DataType>& v, double startValue)
{
auto iter = std::find_if(v.begin(), v.end(), [startValue](const DataType& data)
{
return data.intervalStart > startValue;
});
if (iter == v.begin() || iter == v.end())
{
return 0;
}
return (iter-1)->data;
};
// For each interval in the output traverse the input and sum the
// data constants
for (auto& val : output)
{
int sectionData = 0;
for (const auto& iv : input)
sectionData += FindDataValue(iv, val.intervalStart);
val.data = sectionData;
}
for (const auto& i : output)
std::cout << "loc: " << i.intervalStart << " data: " << i.data << std::endl;
return 0;
}
Edit2: @Stas's code is a very good way to approach this problem. I've tested it on all the edge cases I could think of.
Here's my merge_intervals implementation in case anyone is interested. The only slight change I've had to make to the snippets Stas provided is:
for (auto& v : input)
v.back().data = 0;
Before combining the vectors as suggested. Thanks!
template<class It1, class It2, class OutputIt>
OutputIt merge_intervals(It1 first1, It1 last1,
It2 first2, It2 last2,
OutputIt destBegin)
{
const auto begin1 = first1;
const auto begin2 = first2;
auto CombineData = [](const DataType& d1, const DataType& d2)
{
return DataType{d1.intervalStart, (d1.data+d2.data)};
};
for (; first1 != last1; ++destBegin)
{
if (first2 == last2)
{
return std::copy(first1, last1, destBegin);
}
if (first1->intervalStart == first2->intervalStart)
{
*destBegin = CombineData(*first1, *first2);
++first1; ++first2;
}
else if (first1->intervalStart < first2->intervalStart)
{
if (first2 > begin2)
*destBegin = CombineData(*first1, *(first2-1));
else
*destBegin = *first1;
++first1;
}
else
{
if (first1 > begin1)
*destBegin = CombineData(*first2, *(first1-1));
else
*destBegin = *first2;
++first2;
}
}
return std::copy(first2, last2, destBegin);
}
Unfortunately, your algorithm is inherently slow. It doesn't make sense to profile it or apply C++-specific tweaks; they won't help. It will effectively never finish even on fairly small inputs, like merging 1000 lists of 10000 elements each.
Let's try to evaluate time complexity of your algo. For the sake of simplicity, let's merge only lists of the same length.
L - length of a list
N - number of lists to be merged
T = L * N - length of a whole concatenated list
Complexity of your algorithm steps:
create output vector - O(T)
sort output vector - O(T*log(T))
filter output vector - O(T)
fix data in output vector - O(T*T)
See, the last step defines the whole algorithm's complexity: O(T*T) = O(L^2*N^2). That is not acceptable for practical applications. To merge 1000 lists of 10000 elements each, the algorithm would have to run about 10^14 operations.
Actually, the task is pretty complex, so do not try to solve it in one step. Divide and conquer!
Write an algorithm that merges two lists into one
Use it to merge a list of lists
Merging two lists into one
This is relatively easy to implement (but be careful with corner cases). The algorithm should have linear time complexity: O(2*L). Take a look at how std::merge is implemented. You just need to write your custom variant of std::merge, let's call it merge_intervals.
Applying a merge algorithm to a list of lists
This is a little bit tricky, but again, divide and conquer! The idea is to do recursive merge: split a list of lists on two halves and merge them.
#include <iterator>
#include <stdexcept>
#include <type_traits>

template<class It, class Combine>
auto merge_n(It first, It last, Combine comb)
-> typename std::remove_reference<decltype(*first)>::type
{
if (first == last)
throw std::invalid_argument("Empty range");
auto count = std::distance(first, last);
if (count == 1)
return *first;
auto it = first;
std::advance(it, count / 2);
auto left = merge_n(first, it, comb);
auto right = merge_n(it, last, comb);
return comb(left, right);
}
Usage:
auto combine = [](const std::vector<DataType>& a, const std::vector<DataType>& b)
{
std::vector<DataType> result;
merge_intervals(a.begin(), a.end(), b.begin(), b.end(),
std::back_inserter(result));
return result;
};
auto output = merge_n(input.begin(), input.end(), combine);
The nice property of such a recursive approach is its time complexity: it is O(L*N*log(N)) for the whole algorithm. So, to merge 1000 lists of 10000 elements each, the algorithm runs about 10000 * 1000 * 9.966 = 99,660,000 operations. That is 1,000,000 times faster than the original algorithm.
Moreover, such an algorithm is inherently parallelizable. It is not a big deal to write a parallel version of merge_n and run it on a thread pool.
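For illustration, here is a minimal sketch of a parallel variant of merge_n (my own assumption, using std::async rather than an explicit thread pool; the combine callable is the same as below):
#include <future>

template<class It, class Combine>
auto parallel_merge_n(It first, It last, Combine comb)
    -> typename std::remove_reference<decltype(*first)>::type
{
    if (first == last)
        throw std::invalid_argument("Empty range");
    auto count = std::distance(first, last);
    if (count == 1)
        return *first;
    auto mid = first;
    std::advance(mid, count / 2);
    // merge the left half on another thread while this thread merges the right
    auto left = std::async(std::launch::async,
                           [=] { return parallel_merge_n(first, mid, comb); });
    auto right = parallel_merge_n(mid, last, comb);
    return comb(left.get(), right);
}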
I know I'm a bit late to the party, but when I started writing this you didn't have a suitable answer yet, and my solution should have a relatively good time complexity, so here you go:
I think the most straightforward way to approach this is to see each of your sorted lists as a stream of events: At a given time, the value (of that stream) changes to a new value:
template<typename T>
struct Point {
using value_type = T;
float time;
T value;
};
You want to superimpose those streams into a single stream (i.e. having their values summed up at any given point). For that you take the earliest event from all streams, and apply its effect on the result stream. Therefore, you need to first "undo" the effect that the previous value from that stream made on the result stream, and then add the new value to the current value of the result stream.
To be able to do that, you need to remember for each stream the last value, the next value (and when the stream is empty):
std::vector<std::tuple<Value, StreamIterator, StreamIterator>> streams;
The first element of the tuple is the last effect of that stream onto the result stream, the second is an iterator pointing to the streams next event, and the last is the end iterator of that stream:
transform(from, to, inserter(streams, begin(streams)),
[] (auto & stream) {
return make_tuple(static_cast<Value>(0), begin(stream), end(stream));
});
To be able to always get the earliest event of all the streams, it helps to keep the (information about the) streams in a (min) heap, where the top element is the stream with the next (earliest) event. That's the purpose of the following comparator:
auto heap_compare = [] (auto const & lhs, auto const & rhs) {
bool less = (*get<1>(lhs)).time < (*get<1>(rhs)).time;
return (not less);
};
Then, as long as there are still some events (i.e. some stream that is not empty), first (re)build the heap, take the top element and apply its next event to the result stream, and then remove that element from the stream. Finally, if the stream is now empty, remove it.
// The current value of the result stream.
Value current = 0;
while (streams.size() > 0) {
// Reorder the stream information to get the one with the earliest next
// value into top ...
make_heap(begin(streams), end(streams), heap_compare);
// .. and select it.
auto & earliest = streams[0];
// New value is the current one, minus the previous effect of the selected
// stream plus the new value from the selected stream
current = current - get<0>(earliest) + (*get<1>(earliest)).value;
// Store the new time point with the new value and the time of the used
// time point from the selected stream
*out++ = Point<Value>{(*get<1>(earliest)).time, current};
// Update the effect of the selected stream
get<0>(earliest) = (*get<1>(earliest)).value;
// Advance selected stream to its next time point
++(get<1>(earliest));
// Remove stream if empty
if (get<1>(earliest) == get<2>(earliest)) {
swap(streams[0], streams[streams.size() - 1u]);
streams.pop_back();
}
}
This will return a stream where there might be multiple points with the same time, but a different value. This occurs when there are multiple "events" at the same time. If you only want the last value, i.e. the value after all these events happened, then one needs to combine them:
merge_point_lists(begin(input), end(input), inserter(merged, begin(merged)));
// returns points with the same time, but with different values. remove these
// duplicates, by first making them REALLY equal, i.e. setting their values
// to the last value ...
for (auto write = begin(merged), read = begin(merged), stop = end(merged);
write != stop;) {
for (++read; (read != stop) and (read->time == write->time); ++read) {
write->value = read->value;
}
for (auto const cached = (write++)->value; write != read; ++write) {
write->value = cached;
}
}
// ... and then removing them.
merged.erase(
unique(begin(merged), end(merged),
[](auto const & lhs, auto const & rhs) {
return (lhs.time == rhs.time);}),
end(merged));
Concerning the time complexity: this iterates over all "events", so it depends on the number of events e. The very first make_heap call has to build a complete new heap; this has worst case complexity of 3 * s, where s is the number of streams the function has to merge. On subsequent calls, make_heap only has to correct the very first element, which has worst case complexity of log(s'). I write s' because the number of streams (that still need to be considered) decreases to zero. This gives
3s + (e - 1) * log(s')
as complexity. Assuming the worst case, where s' decreases slowly (this happens when the events are evenly distributed across the streams, i.e. all streams have the same number of events), the complexity becomes:
3s + (e - 1 - s) * log(s) + sum(log(i)) for i = 1 to s
Do you really need a data structure as the result? I don't think so. Actually, you're defining several functions that can be added. The examples you give are encoded using a 'start, value(, implicit end)' tuple. The basic building block is a function that looks up its value at a certain point:
// assuming something like: struct edge { float x; double value; };
double valueAt(const std::vector<edge> &starts, float point) {
    auto it = std::adjacent_find(begin(starts), end(starts),
        [&](edge e1, edge e2) {
            return e1.x <= point && point < e2.x;
        });
    return it->value; // note: no bounds handling if point lies outside all ranges
}
The function value for a point is then the sum of the function values of all source series.
If you really need a list in the end, you can join and sort all edge.x values for all series, and create the list from that.
Unless performance is an issue :)
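A rough sketch of that joining step (my own assumption, reusing valueAt and the hypothetical edge struct above):
std::vector<edge> combine(const std::vector<std::vector<edge>> &series) {
    // join all start points ...
    std::vector<float> xs;
    for (const auto &s : series)
        for (const auto &e : s)
            xs.push_back(e.x);
    // ... sort and deduplicate them ...
    std::sort(begin(xs), end(xs));
    xs.erase(std::unique(begin(xs), end(xs)), end(xs));
    // ... and evaluate the summed function at each start point
    std::vector<edge> result;
    for (float x : xs) {
        double sum = 0;
        for (const auto &s : series)
            sum += valueAt(s, x);
        result.push_back(edge{x, sum});
    }
    return result;
}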
If you can combine two of these structures, you can combine many.
First, encapsulate your std::vector into a class. Implement what you know as operator+= (and define operator+ in terms of this if you want). With that in place, you can combine as many as you like, just by repeated addition. You could even use std::accumulate to combine a collection of them.
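A minimal sketch of that idea (the StepFunction name is mine; it reuses the DataType struct and merge_intervals function from the question's edit above):
#include <iterator>
#include <numeric>
#include <vector>

struct StepFunction {
    std::vector<DataType> points; // sorted by intervalStart

    StepFunction& operator+=(const StepFunction& other) {
        std::vector<DataType> merged;
        merge_intervals(points.begin(), points.end(),
                        other.points.begin(), other.points.end(),
                        std::back_inserter(merged));
        points = std::move(merged);
        return *this;
    }
};

// combine a whole (non-empty) collection with std::accumulate:
StepFunction combine_all(const std::vector<StepFunction>& fs) {
    return std::accumulate(std::next(fs.begin()), fs.end(), fs.front(),
                           [](StepFunction acc, const StepFunction& f) {
                               acc += f;
                               return acc;
                           });
}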
I want the function to return true when any element matches between the two vectors.
Note: my vectors are not sorted.
Following is my source code,
bool CheckCommon( std::vector< long > &inVectorA, std::vector< long > &inVectorB )
{
std::vector< long > *lower, *higher;
size_t sizeL = 0, sizeH = 0;
if( inVectorA.size() > inVectorB.size() )
{
lower = &inVectorA;
sizeL = inVectorA.size();
higher = &inVectorB;
sizeH = inVectorB.size();
}
else
{
lower = &inVectorB;
sizeL = inVectorB.size();
higher = &inVectorA;
sizeH = inVectorA.size();
}
size_t indexL = 0, indexH = 0;
for( ; indexH < sizeH; indexH++ )
{
bool exists = std::binary_search( lower->begin(), lower->end(), higher->at(indexH) );
if( exists == true )
return true;
else
continue;
}
return false;
}
This works fine when the size of vector B is less than the size of vector A, but it returns false, even when there is a match, when the size of vector B is greater than the size of vector A.
The problem with the posted code is that you should not use std::binary_search when the vector is not sorted: its behaviour is defined only for a sorted range.
If the input vectors are not sorted, you can use std::find_first_of to check whether a first common element exists:
bool CheckCommon(std::vector<long> const& inVectorA, std::vector<long> const& nVectorB)
{
return std::find_first_of (inVectorA.begin(), inVectorA.end(),
nVectorB.begin(), nVectorB.end()) != inVectorA.end();
}
The complexity of find_first_of is up to linear in inVectorA.size() * inVectorB.size(); it compares elements until a match is found.
If you want to fix your original algorithm, then you can make a copy of one of the vectors and std::sort it; std::binary_search then works with it.
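For instance (a sketch under the same signature as the question's function):
bool CheckCommon(const std::vector<long>& inVectorA, const std::vector<long>& inVectorB)
{
    std::vector<long> sorted(inVectorB);      // copy one vector ...
    std::sort(sorted.begin(), sorted.end());  // ... and sort the copy
    for (long v : inVectorA)
        if (std::binary_search(sorted.begin(), sorted.end(), v))
            return true;
    return false;
}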
In real programs that do a lot of such matching between containers, the containers are usually kept sorted. In that case std::set_intersection can be used, and the complexity of the search is up to linear in inVectorA.size() + inVectorB.size().
std::find_first_of is more efficient than sorting both ranges and then searching for matches with std::set_intersection only when both ranges are rather short, or when the second range is shorter than the binary logarithm of the length of the first.
You can use the standard algorithm std::set_intersection to check whether there is any common element between these vectors.
Precondition: both vectors must be sorted.
You could do something like the following. Iterate over the first vector. For each element, use std::find to see if it exists in the other vector. If you find it, they have at least one common element so return true. Otherwise, move to the next element of the first vector and repeat this process. If you make it all the way through the first vector without finding a common element, there is no intersection so return false.
bool CheckCommon(std::vector<long> const& inVectorA, std::vector<long> const& nVectorB)
{
for (auto const& num : inVectorA)
{
auto it = std::find(begin(nVectorB), end(nVectorB), num);
if (it != end(nVectorB))
{
return true;
}
}
return false;
}
Usage of std::set_intersection is one option. If the vectors' elements are sorted, the code can be simplified to this:
#include <algorithm>
#include <iterator>
#include <vector>
bool CheckCommon( const std::vector< long > &inVectorA, const std::vector< long > &inVectorB )
{
std::vector< long > temp;
std::set_intersection(inVectorA.begin(), inVectorA.end(),
inVectorB.begin(), inVectorB.end(),
std::back_inserter(temp));
return !temp.empty();
}
The drawback is that a temporary vector is created while set_intersection executes (but this could be considered a "feature" if you later want to know which elements are common).
Here is an implementation which uses sorted vectors, doesn't construct a new container, and has only linear complexity (more precisely: O(container1.size() + container2.size())):
template< class ForwardIt1, class ForwardIt2 >
bool has_common_elements( ForwardIt1 first, ForwardIt1 last, ForwardIt2 s_first, ForwardIt2 s_last )
{
auto it=first;
auto s_it=s_first;
while(it != last && s_it != s_last)
{
if(*it==*s_it)
{
return true;
}
*it<*s_it ? ++it : ++s_it; //increase the smaller of both
}
return false;
}
Your code uses std::binary_search, whose pre-condition is that (From http://en.cppreference.com/w/cpp/algorithm/binary_search):
For std::binary_search to succeed, the range [first, last) must be at least partially ordered, i.e. it must satisfy all of the following requirements:
partitioned with respect to element < value or comp(element, value)
partitioned with respect to !(value < element) or !comp(value, element)
for all elements, if element < value or comp(element, value) is true then !(value < element) or !comp(value, element) is also true
A fully-sorted range meets these criteria, as does a range resulting from a call to std::partition.
The sample data you used for testing (as posted at http://ideone.com/XCYdM8) do not meet that requirement. Instead of using:
vectorB.push_back(11116);
vectorB.push_back(11118);
vectorB.push_back(11112);
vectorB.push_back(11120);
vectorB.push_back(11190);
vectorB.push_back(11640);
vectorB.push_back(11740);
if you use a sorted vector like below
vectorB.push_back(11112);
vectorB.push_back(11116);
vectorB.push_back(11118);
vectorB.push_back(11120);
vectorB.push_back(11190);
vectorB.push_back(11640);
vectorB.push_back(11740);
your function will work just fine.
P.S. The way you have designed your code, if the longer std::vector is sorted, the function will work fine.
P.S. 2: Another option is to sort the longer std::vector before calling the function:
std::sort(B.begin(), B.end());
I need to know if I can decrement the iterator and still have a valid object. The code below errors out because I decrement the iterator by 1 to a position that doesn't exist. How can I detect that so I don't get the error?
ticks.push_front(Tick(Vec3(0, 0, 5), 0));
ticks.push_front(Tick(Vec3(0, 0, 8), 100));
ticks.push_front(Tick(Vec3(0, 0, 10), 200));
bool found = false;
list<Tick, allocator<Tick>>::iterator iter;
for (iter = ticks.begin(); iter != ticks.end(); ++iter)
{
Tick t = (*iter);
if (214>= t.timestamp)
{
prior = t;
if (--iter != ticks.end())
{
next = (*--iter);
found = true;
break;
}
}
}
I'm trying to find the entries directly "above" and directly "below" the value 214 in the list. If only 1 exists then I don't care. I need above and below to exist.
After your edits to the question, I think I can write a better answer than what I had before.
First, write a comparison function for Ticks that uses their timestamps:
bool CompareTicks(const Tick& l, const Tick& r)
{
return l.timestamp < r.timestamp;
}
Now use the function with std::upper_bound:
// Get an iterator pointing to the first element in ticks that is > 214
// I'm assuming the second parameter to Tick's ctor is the timestamp
auto itAbove = std::upper_bound(ticks.begin(), ticks.end(), Tick(Vec3(0, 0, 0), 214), CompareTicks);
if(itAbove == ticks.end())
; // there is nothing in ticks > 214. I don't know what you want to do in this case.
This will give you the first element in ticks that is > 214. Next, you can use lower_bound to find the first element that is >= 214:
// get an iterator pointing to the first element in ticks that is >= 214
// I'm assuming the second parameter to Tick's ctor is the timestamp
auto itBelow = std::lower_bound(ticks.begin(), ticks.end(), Tick(Vec3(0, 0, 0), 214), CompareTicks);
You have to do one extra step with itBelow now to get the first element before 214, taking care not to go past the beginning of the list:
if(itBelow == ticks.begin())
; // there is nothing in ticks < 214. I don't know what you want to do in this case.
else
--itBelow;
Now, assuming you didn't hit any of the error cases, itAbove is pointing to the first element > 214, and itBelow is pointing to the last element < 214.
This assumes your Ticks are in order by timestamp, which seems to be the case. Note also that this technique will work even if there are multiple 214s in the list. Finally, you said the list is short so it's not really worth worrying about time complexity, but this technique could get you logarithmic performance if you also replaced the list with a vector, as opposed to linear for iterative approaches.
The answer to your core question is simple. Don't increment if you are at the end. Don't decrement if you are at the start.
Before incrementing, check.
if ( iter == ticks.end() )
Before decrementing, check.
if ( iter == ticks.begin() )
Your particular example
Looking at what you are trying to accomplish, I suspect you meant to use:
if (iter != ticks.begin())
instead of
if (--iter != ticks.end())
Update
It seems you are relying on the contents of your list being sorted by timestamp.
After your comment, I think what you need is:
if (214>= t.timestamp)
{
prior = t;
if (++iter != ticks.end())
{
next = *iter;
if ( 214 <= next.timestamp )
{
found = true;
break;
}
}
}
Update 2
I agree with the comment made by @crashmstr. Your logic can be:
if (214 <= t.timestamp)
{
next = t;
if ( iter != ticks.begin())
{
prior = *--(iter);
found = true;
break;
}
}
I think you can do what you want with std::adjacent_find from the standard library <algorithm>. By default std::adjacent_find looks for two consecutive identical elements but you can provide your own function to define the relationship you are interested in.
Here's a simplified example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>
struct matcher
{
matcher(int value) : target(value) {}
bool operator()(int lo, int hi) const {
return (lo < target) && (target < hi);
}
int target;
};
int main()
{
std::list<int> ticks = { 0, 100, 200, 300 };
auto it = std::adjacent_find(ticks.begin(), ticks.end(), matcher(214));
if (it != ticks.end()) {
std::cout << *it << ' ' << *std::next(it) << '\n';
} else {
std::cout << "not found\n";
}
}
This outputs 200 300, the two "surrounding" values it found.