Find nearest points in a vector - c++

Given a sorted vector with a number of values, as in the following example:
std::vector<double> f;
f.push_back(10);
f.push_back(100);
f.push_back(1000);
f.push_back(10000);
I'm looking for the most elegant way to retrieve for any double d the two values that are immediately adjacent to it. For example, given the value "45", I'd like this to return "10" and "100".
I was looking at lower_bound and upper_bound, but they don't do what I want. Can you help?
EDIT: I've decided to post my own answer, as it is something of a composite of all the helpful answers that I got in this thread. I've voted up the answers that I thought were most helpful.
Thanks everyone,
Dave

You can grab both values (if they exist) in one call with equal_range(). It returns a std::pair of iterators, with first being the first location and second being the last location in which you could insert the value passed without violating ordering. To strictly meet your criteria, you'd have to decrement the iterator in first, after verifying that it wasn't equal to the vector's begin().
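A minimal sketch of the equal_range() idea, assuming C++17 for std::optional. The helper name neighbours and the choice to report a missing side as an empty optional are illustrative assumptions, not part of the answer above.

```cpp
#include <algorithm>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper: strict neighbours of d in a sorted vector.
//   first  = largest element  < d (empty if there is none)
//   second = smallest element > d (empty if there is none)
std::pair<std::optional<double>, std::optional<double>>
neighbours(const std::vector<double>& sorted, double d) {
    // range.first  -> first element >= d
    // range.second -> first element >  d
    auto range = std::equal_range(sorted.begin(), sorted.end(), d);
    std::optional<double> below, above;
    if (range.first != sorted.begin())
        below = *(range.first - 1);  // step back, but never before begin()
    if (range.second != sorted.end())
        above = *range.second;
    return {below, above};
}
```

With the vector from the question, neighbours(f, 45) holds 10 and 100, while a value below 10 leaves the first member empty.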

You can use STL's lower_bound to get what you want in a few lines of code. lower_bound uses binary search under the hood, so your runtime is O(log n).
double val = 45;
double lower, upper;
std::vector<double>::iterator it;
it = lower_bound(f.begin(), f.end(), val);
if (it == f.begin()) upper = *it; // no smaller value than val in vector
else if (it == f.end()) lower = *(it-1); // no bigger value than val in vector
else {
lower = *(it-1);
upper = *it;
}

You could simply use a binary search, which will run in O(log(n)).
Here is a Lua snippet (I don't have time to do it in C++, sorry) which does what you want, except for limit conditions (that you did not define anyway) :
function search(value, list, first, last)
if not first then first = 1; last = #list end
if last - first < 2 then
return list[first], list[last]
end
local median = math.ceil(first + (last - first)/2)
if list[median] > value then
return search(value, list, first, median)
else
return search(value, list, median, last)
end
end
local list = {1,10,100,1000}
print(search(arg[1] + 0, list))
It takes the value to search from the command line :
$ lua search.lua 10 # didn't know what to do in this case
10 100
$ lua search.lua 101
100 1000
$ lua search.lua 99
10 100

I'm going to post my own answer, and vote up anyone who helped me reach it, since this is what I'll use in the end and you've all helped me reach this conclusion. Comments are welcome.
std::pair<value_type, value_type> GetDivisions(const value_type& from) const
{
if (m_divisions.empty())
throw 0; // Can't help you if we're empty.
std::vector<value_type>::const_iterator it =
std::lower_bound(m_divisions.begin(), m_divisions.end(), from);
if (it == m_divisions.end())
return std::make_pair(m_divisions.back(), m_divisions.back());
else if (it == m_divisions.begin())
return std::make_pair(m_divisions.front(), m_divisions.front());
else
return std::make_pair(*(it - 1), *(it));
}

What if (in your case) d is less than the first element or more than the last? And how do you deal with negative values? By the way, provided your "d" is guaranteed to lie between the first and the last value of your vector, you can do it like this:
// Your initializations
std::vector<double>::const_iterator sit = f.begin();
double upper, lower;
Here is the rest:
while ( *sit < d ) // if the element is still less than your d
++sit; // increase your iterator
upper = *sit; // here you get the upper value
lower = *(--sit); // and here your lower
Elegant enough? :/

You could do a search in your vector for your value (which would tell you where your value would be if it were in the vector) and then return the value before and after that location. So searching for 45 would tell you it should be at index=1 and then you would return 0 and 1 (depending on your implementation of the search, you'll either get the index of the smaller value or the index of the larger value, but this is easy to check with a couple boundary conditions). This should be able to run in O(log n) where n is the number of elements in your vector.

I would write something like this, didn't test if this compiles, but you get the idea:
template <typename Iterator>
std::pair<Iterator, Iterator> find_best_pair(Iterator first, Iterator last, const typename Iterator::value_type & val)
{
std::pair<Iterator, Iterator> result(last, last);
typename Iterator::difference_type size = std::distance(first, last);
if (size == 2)
{
// if the container is of size 2, the answer is the two elements
result.first = first;
result.second = first;
++result.second;
}
else
{
// must be of at least size 3
if (size > 2)
{
Iterator second = first;
++second;
Iterator prev_last = last;
--prev_last;
Iterator it(std::lower_bound(second, last, val));
if (it != last)
{
result.first = it;
result.second = it;
if (it != prev_last)
{
// if this is not the previous last
// then the answer is (it, it + 1)
++result.second;
}
else
{
// if this the previous last
// then the answer is (it - 1, it)
--result.first;
}
}
}
}
return result;
}

I wrote up this little function, which seems to fit the more general case you wanted. I haven't tested it totally, but I did write a little test code (included).
#include <algorithm>
#include <iostream>
#include <vector>
template <class RandomAccessIt, class Container, class T>
std::pair<RandomAccessIt, RandomAccessIt> bracket_range(RandomAccessIt begin, RandomAccessIt end, Container& c, T val)
{
typename Container::iterator first;
typename Container::iterator second;
first = std::find(begin, end, val);
//Find the first value after this by iteration
second = first;
if (first == begin){ // Found the first element, so set this to end to indicate no lower values
first = end;
}
else if (first != end && first != begin) --first; //Set this to the first value before the found one, if the value was found
while (second != end && *second == val) ++second;
return std::make_pair(first,second);
}
int main()
{
std::vector<int> values;
std::pair<std::vector<int>::iterator, std::vector<int>::iterator> vals;
for (int i = 1; i < 9; ++i) values.push_back(i);
for (int i = 0; i < 10; ++i){
vals = bracket_range(values.begin(), values.end(),values, i);
if (vals.first == values.end() && vals.second == values.end()){ // Not found at all
std::cout << i << " is not in the container." << std::endl;
}
else if (vals.first == values.end()){ // No value lower
std::cout << i << ": " << "None Lower," << *(vals.second) << std::endl;
}
else if (vals.second == values.end()) { // No value higher
std::cout << i << ": " << *(vals.first) << ", None Higher" << std::endl;
}
else{
std::cout << i << ": " << *(vals.first) << "," << *(vals.second) << std::endl;
}
}
return 0;
}

Based on the code that tunnuz posted, here you have some improvements regarding bound checking:
template<typename T>
void find_enclosing_values(const std::vector<T> &vec, const T &value, T &lower, T &upper, const T &invalid_value)
{
typename std::vector<T>::const_iterator it = vec.begin(); // typename is required for the dependent type
while (it != vec.end() && *it < value)
++it;
if(it != vec.end())
upper = *it;
else
upper = invalid_value;
if(it == vec.begin())
lower = invalid_value;
else
lower = *(--it);
}
Example of usage:
std::vector<int> v;
v.push_back(3);
v.push_back(7);
v.push_back(10);
int lower, upper;
find_enclosing_values(v, 4, lower, upper, -1);
std::cout<<"lower "<<lower<<" upper "<<upper<<std::endl;

If you have the ability to use some other data structure (not a vector), I'd suggest a B-tree. If your data is unchanging, I believe you can retrieve the result quickly (logarithmic time at the worst).
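The standard library has no B-tree, so as a stand-in sketch (an assumption, not what the answer literally proposes) an ordered std::set can play that role: its lower_bound member runs in logarithmic time. The helper name set_neighbours is illustrative.

```cpp
#include <iterator>
#include <optional>
#include <set>
#include <utility>

// Hypothetical helper: neighbours of d in an ordered set.
//   first  = largest element   <  d (empty if there is none)
//   second = smallest element  >= d (empty if there is none)
std::pair<std::optional<double>, std::optional<double>>
set_neighbours(const std::set<double>& values, double d) {
    auto it = values.lower_bound(d);  // member lookup, O(log n)
    std::optional<double> below, above;
    if (it != values.begin()) below = *std::prev(it);
    if (it != values.end()) above = *it;
    return {below, above};
}
```

Unlike a sorted vector, the set stays cheap to update if values are later inserted or removed.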

Related

Construct chains of pairs of numbers with one common member

I need to construct a chain of pair of numbers where:
In each pair, the first one is smaller than the second
In order to form a chain between two consecutive nodes, they must have one number in common. In other words, the link (a,b) -- (c,d) can be made if and only if either a==c, b==c, a==d or b==d
A pair cannot be made of the same number. In other words, if (a,b) exists, then a!=b
This may look like a Longest increasing subsequence but I actually want to chain consecutive pairs that have one equal member.
Example:
Initial list (unordered):
(0,1)
(2,3)
(1,6)
(4,6)
(8,9)
(2,8)
Result:
----- chain #1
(0,1)
(1,6)
(4,6)
----- chain #2
(2,3)
(2,8)
(8,9)
I could do an algorithm that will iterate over the entire list for each cell (O(n^2)), but I want to make it faster and I have the flexibility of ordering my initial array in any way I want (std::set, std::map, std::unordered_map, etc.). My list is made of tens of thousands of pairs so I need an efficient solution in terms of processing time.
You can solve it in O(N * log(N)) when you manage two lists, one sorted with respect to first, the other sorted with respect to second.
The code has some duplication that I didn't bother to clean up yet.
#include <iostream>
#include <list>
#include <vector>
#include <iterator>
#include <algorithm>
#include <tuple>
#include <any>
struct pair_and_iter {
int first;
int second;
std::any other_iter;
};
struct compare_first {
bool operator()(int x,pair_and_iter p){ return x < p.first; }
bool operator()(pair_and_iter p, int x){ return p.first < x; }
};
struct compare_second {
bool operator()(int x,pair_and_iter p){ return x < p.second; }
bool operator()(pair_and_iter p, int x){ return p.second < x; }
};
template <typename Iter,typename Comp>
Iter my_find(Iter first,Iter last,int x, Comp comp) {
auto it = std::lower_bound(first,last,x,comp);
if (it != last && (!comp(x,*it) && !comp(*it,x))){
return it;
} else {
return last;
}
}
int main() {
std::list<pair_and_iter> a {{0,1},{2,3},{1,6},{4,6},{8,9},{2,8}};
std::list<pair_and_iter> b;
for (auto it = a.begin(); it != a.end(); ++it){
b.push_back({it->first,it->second,it});
it->other_iter = std::prev(b.end());
}
a.sort([](const auto& x,const auto& y){
return std::tie(x.first,x.second) < std::tie(y.first,y.second); });
b.sort([](const auto& x,const auto& y){
return std::tie(x.second,x.first) < std::tie(y.second,y.first); });
std::vector<std::vector<pair_and_iter>> result;
std::vector<pair_and_iter> current_result;
current_result.push_back(a.front());
auto current = current_result.begin();
b.erase(std::any_cast<std::list<pair_and_iter>::iterator>(current->other_iter));
a.erase(a.begin());
while (a.size() && b.size()) {
// look for an element with same first
auto it = my_find(a.begin(),a.end(),current->first,compare_first{});
if (it == a.end()) {
// look for element where current->second == elem.first
it = my_find(a.begin(),a.end(),current->second,compare_first{});
}
if (it != a.end()){
current_result.push_back(*it);
current = std::prev(current_result.end());
b.erase(std::any_cast<std::list<pair_and_iter>::iterator>(it->other_iter));
a.erase(it);
continue;
}
// look for element with current->first == elem.second
it = my_find(b.begin(),b.end(),current->first,compare_second{});
if (it == b.end()) {
// look for element with same second
it = my_find(b.begin(),b.end(),current->second,compare_second{});
}
if (it != b.end()) {
current_result.push_back(*it);
current = std::prev(current_result.end());
a.erase(std::any_cast<std::list<pair_and_iter>::iterator>(it->other_iter));
b.erase(it);
continue;
}
// no matching element found
result.push_back(current_result);
current_result.clear();
current_result.push_back(a.front());
current = current_result.begin();
b.erase(std::any_cast<std::list<pair_and_iter>::iterator>(current->other_iter));
a.erase(a.begin());
}
result.push_back(current_result);
for (const auto& chain : result){
for (const auto& elem : chain){
std::cout << elem.first << " " << elem.second << "\n";
}
std::cout << "\n";
}
}
Output:
0 1
1 6
4 6
2 3
2 8
8 9
I used std::list because it has stable iterators and constant-time erase, and std::any for type erasure because each list contains iterators into the other list. a is sorted with respect to first and b is sorted with respect to second, hence std::lower_bound can be used to find a match with O(log N) comparisons. (Note that std::list iterators are not random access, so std::lower_bound still advances the iterator a linear number of times; only the number of comparisons is logarithmic.) A single linear search is traded for 2 binary searches to find either current->first or current->second in the first member of a, and 2 binary searches to find either current->first or current->second in the second member of b. In total it is O(N log(N)) for sorting plus O(log(N) + log(N-1) + log(N-2) + ... + log(1)) comparisons, which equals O(log(N!)) if I am not mistaken.
PS: You didn't mention that you are looking for a longest chain, and this algorithm is not finding the longest chain. It just picks the first element of the remaining ones and uses the next element it finds to continue the chain.
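As an alternative to the two sorted lists, here is a sketch of a hash-based variant. It is not the answer's method: it assumes the pairs arrive in a std::vector, indexes every pair under both of its numbers in a std::unordered_map, and then grows each chain greedily. The names Pair, build_chains and take_partner are illustrative. Like the code above, it does not try to find the longest chain.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

std::vector<std::vector<Pair>> build_chains(const std::vector<Pair>& pairs) {
    // index[n] lists the positions of all pairs that contain the number n.
    std::unordered_map<int, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i < pairs.size(); ++i) {
        index[pairs[i].first].push_back(i);
        index[pairs[i].second].push_back(i);
    }
    std::vector<bool> used(pairs.size(), false);

    // Pop candidates for n until an unused pair turns up.
    // Every index entry is popped at most once over the whole run.
    auto take_partner = [&](int n, std::size_t& out) {
        auto& bucket = index[n];
        while (!bucket.empty()) {
            std::size_t i = bucket.back();
            bucket.pop_back();
            if (!used[i]) { out = i; return true; }
        }
        return false;
    };

    std::vector<std::vector<Pair>> chains;
    for (std::size_t start = 0; start < pairs.size(); ++start) {
        if (used[start]) continue;
        used[start] = true;
        std::vector<Pair> chain{pairs[start]};
        std::size_t cur = start, next = 0;
        // Extend the chain with any unused pair sharing a number with the last link.
        while (take_partner(pairs[cur].first, next) ||
               take_partner(pairs[cur].second, next)) {
            used[next] = true;
            chain.push_back(pairs[next]);
            cur = next;
        }
        chains.push_back(std::move(chain));
    }
    return chains;
}
```

For the six pairs in the question this produces the same two chains as listed there. Assuming well-behaved hashing, the work is roughly linear in the number of pairs, because each pair is indexed twice and each index entry is consumed once.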

How do iterators map/know their current position or element

Consider the following code example :
#include <vector>
#include <numeric>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <functional>
int main()
{
std::vector<int> v(10, 2);
std::partial_sum(v.cbegin(), v.cend(), v.begin());
std::cout << "Among the numbers: ";
std::copy(v.cbegin(), v.cend(), std::ostream_iterator<int>(std::cout, " "));
std::cout << '\n';
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; })) {
std::cout << "All numbers are even\n";
}
if (std::none_of(v.cbegin(), v.cend(), std::bind(std::modulus<int>(),
std::placeholders::_1, 2))) {
std::cout << "None of them are odd\n";
}
struct DivisibleBy
{
const int d;
DivisibleBy(int n) : d(n) {}
bool operator()(int n) const { return n % d == 0; }
};
if (std::any_of(v.cbegin(), v.cend(), DivisibleBy(7))) {
std::cout << "At least one number is divisible by 7\n";
}
}
If we look at this part of the code :
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; })) {
std::cout << "All numbers are even\n";
}
This part is fairly easy to understand: it iterates over the vector elements and checks i % 2 == 0 for each one, i.e. whether it is divisible by 2, and hence whether the elements are all even.
Its for-loop counterpart could be something like this:
for(int i = 0; i<v.size();++i){
if(v[i] % 2 == 0) areEven = true; //just for readability
else areEven = false;
}
In this for-loop example, it is quite clear that the current element we're processing is v[i], since we access it by index. But how does the iterator version of the same code map i, or know which element it is currently accessing?
How does [](int i){ return i % 2 == 0; } ensure/know that i is the current element that the iterator is pointing to?
I can't make out how the iteration is done without something like v.currently_i_am_at_this_position(). I know what iterators are, but I'm having a hard time grasping them. Thanks :)
Iterators are modeled after pointers, and that's it really. How they work internally is of no interest, but a possible implementation is to actually have a pointer inside which points to the current element.
Iterating is done by using an iterator object and advancing it through the set of elements. An iterator is described as follows:
"An iterator is any object that, pointing to some element in a range of elements (such as an array or a container), has the ability to iterate through the elements of that range using a set of operators (with at least the increment (++) and dereference (*) operators).
The most obvious form of iterator is a pointer: A pointer can point to elements in an array, and can iterate through them using the increment operator (++)."
The std::all_of function in your code is roughly equivalent to the following code:
template< class InputIt, class UnaryPredicate >
bool c_all_of(InputIt first, InputIt last, UnaryPredicate p)
{
for (; first != last; ++first) {
if (!p(*first)) {
return false; // Found an odd element!
}
}
return true; // All elements are even
}
An iterator, when incremented, keeps track of the currently pointed element, and when dereferenced it returns the value of the currently pointed element.
For teaching's and clarity's sake, you might also think of the operation as follows (don't try this at home)
bool c_all_of(int* firstElement, size_t numberOfElements, std::function<bool(int)> evenTest)
{
for (size_t i = 0; i < numberOfElements; ++i)
if (!evenTest(*(firstElement + i)))
return false;
return true;
}
Notice that iterators are a powerful abstraction, since they allow consistent element access across different containers (e.g. std::map).
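To make the "pointer inside" idea concrete, here is a hand-rolled sketch. IntIter and my_all_of are illustrative names, not standard library types, and real container iterators provide more operations than this. The point is that the iterator's entire notion of "where am I" is the pointer it stores, and the algorithm passes the dereferenced value to the lambda as its argument i.

```cpp
// Hypothetical minimal iterator over an int array.
// Its only state is a pointer to the current element.
struct IntIter {
    const int* p;
    int operator*() const { return *p; }           // read the current element
    IntIter& operator++() { ++p; return *this; }   // move to the next element
    bool operator!=(const IntIter& other) const { return p != other.p; }
};

// Same shape as the c_all_of loop above: dereference the iterator and
// hand the resulting value to the predicate.
template <class It, class Pred>
bool my_all_of(It first, It last, Pred pred) {
    for (; first != last; ++first)
        if (!pred(*first)) return false;
    return true;
}
```

Calling my_all_of(IntIter{data}, IntIter{data + 3}, [](int i) { return i % 2 == 0; }) on an int array data of three elements invokes the lambda once per element, with i bound to *first each time.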

Finding the closest or exact key in a std::map

I need to create a lookup table which links a length to a time interval (both are of data type double). The keys increment linearly as they are inserted, so it will already be sorted (perhaps an unordered_map would be better?).
What I am looking for is a way to find a key that best matches the current length provided to get the time value, or even better find the two keys that surround the length (the given key is between them) so I can find the interpolated value between the two time values.
I also need the best performance possible as it will be called in real time.
EDIT: I would have rather the following was a comment to the first answer below, but the format is hard to read.
I tried to do the following, but it seems to return the same iterator (5.6):
std::map<double, double> map;
map.insert(std::pair<double, double>(0.123, 0.1));
map.insert(std::pair<double, double>(2.5, 0.4));
map.insert(std::pair<double, double>(5.6, 0.8));
std::map<double, double>::iterator low, high;
double pos = 3.0;
low = map.lower_bound(pos);
high = map.upper_bound(pos);
How would I get 'low' to point to the last element whose key is less than the key used to search?
EDIT 2:
Silly me, 'low--' will do it, provided it's not the first element.
Getting there :)
For this, you can use either std::map::lower_bound
Returns an iterator pointing to the first element that is not less than key.
or std::map::equal_range
Returns a range containing all elements with the given key in the container.
In your case, if you want the closest entry, you need to check both the returned entry and the one before and compare the differences. Something like this might work
std::map<double, double>::iterator low, prev;
double pos = 3.0;
low = map.lower_bound(pos);
if (low == map.end()) {
// nothing found, maybe use rbegin()
} else if (low == map.begin()) {
std::cout << "low=" << low->first << '\n';
} else {
prev = std::prev(low);
if ((pos - prev->first) < (low->first - pos))
std::cout << "prev=" << prev->first << '\n';
else
std::cout << "low=" << low->first << '\n';
}
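Since the question ultimately wants a value interpolated between the two surrounding keys, the same lower_bound plus std::prev pattern can be extended as follows. This is a sketch under assumptions: the helper name interpolate, linear interpolation, and clamping to the first or last entry outside the table are choices made here, not requirements stated in the question.

```cpp
#include <cassert>
#include <iterator>
#include <map>

// Hypothetical helper: table maps length -> time.
// Returns the time linearly interpolated between the two keys that
// surround length, clamped to the first/last entry outside the table.
double interpolate(const std::map<double, double>& table, double length) {
    assert(!table.empty());                        // precondition
    auto hi = table.lower_bound(length);           // first key >= length
    if (hi == table.begin()) return hi->second;    // at or before the first key
    auto lo = std::prev(hi);                       // last key < length
    if (hi == table.end()) return lo->second;      // past the last key
    double t = (length - lo->first) / (hi->first - lo->first);
    return lo->second + t * (hi->second - lo->second);
}
```

Because map keys are unique and lo->first < length <= hi->first, the denominator is never zero.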
"best performance possible" - given you insert elements in increasing order, you can push_back/emplace_back them into a std::vector then use std::lower_bound - you'll get better cache utilisation because the data will be packed into contiguous address space.
You could of course use lower_bound and upper_bound, which are logarithmic in runtime. And they should do what you want.
std::map<double,double>::iterator close_low;
//... your_map ...
close_low=your_map.lower_bound (current_length);
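This should give you an iterator to the first map element whose key is >= the current length. The element just before it (if the iterator is not begin()) is the last one whose key is < the current length, and with those two you have your length surrounded. upper_bound differs only in that it skips an exact match: it returns the first element whose key is > the current length.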
The functions std::lower_bound() and std::upper_bound() would be useful here.
lower_bound() gives the first element that is >= to the value you're looking for; upper_bound() gives the first element that is > than the value.
For instance, searching for the value 5 in the following list: {1,3,5,5,6} [1] using lower_bound() returns the third element, while upper_bound() would return the fifth element.
If the two functions return the same thing x, then the value you're looking for is not present in the list.
The value just before it is x-1 and the value just after it is x.
[1] As pointed out by Tony D in a comment, the question asked for maps, which generally do not contain duplicate elements.
I'm keeping this example though to illustrate the two functions.
Complete generic solution (original idea taken from Olaf Dietsche's answer):
#include <map>
#include <iostream>
#include <iterator>
#include <stdexcept>
#include <cstdint>
template <typename T1, typename T2>
T1 findClosestKey(const std::map<T1, T2> & data, T1 key)
{
if (data.size() == 0) {
throw std::out_of_range("Received empty map.");
}
auto lower = data.lower_bound(key);
if (lower == data.end()) // If none found, return the last one.
return std::prev(lower)->first;
if (lower == data.begin())
return lower->first;
// Check which one is closest.
auto previous = std::prev(lower);
if ((key - previous->first) < (lower->first - key))
return previous->first;
return lower->first;
}
int main () {
double key = 3.3;
std::map<double, int> data = {{-10, 1000}, {0, 2000}, {10, 3000}};
std::cout << "Provided key: " << key << ", closest key: " << findClosestKey(data, key) << std::endl;
return 0;
}
#include <map>
template <typename T1, typename T2>
std::map<T1, T2>::iterator nearest_key(const std::map<T1, T2>& map, T1 key) {
auto lower_bound = map.lower_bound(key);
auto upper_bound = lower_bound; upper_bound++;
if (lower_bound == map.end()) return upper_bound;
if (upper_bound == map.end()) return lower_bound;
unsigned int dist_to_lower = std::abs((int)lower_bound->first - (int)key);
unsigned int dist_to_upper = std::abs((int)upper_bound->first - (int)key);
return (dist_to_upper < dist_to_lower) ? upper_bound : lower_bound;
}
The above is wrong. It should be like this:
template <typename T1, typename T2>
typename std::map<T1, T2>::const_iterator nearest_key(const std::map<T1, T2>& map, T1 key)
{
    auto lower_bound = map.lower_bound(key);            // first element whose key is >= key
    if (lower_bound == map.begin()) return lower_bound; // nothing smaller (or the map is empty)
    auto previous = lower_bound; --previous;            // last element whose key is < key
    if (lower_bound == map.end()) return previous;      // nothing larger
    auto dist_to_previous = key - previous->first;
    auto dist_to_lower = lower_bound->first - key;
    return (dist_to_previous < dist_to_lower) ? previous : lower_bound;
}
I had to solve the same problem; however, the provided answers did not give me the correct result. Here is a full example, in case someone wants it:
#include <algorithm>
#include <cmath>
#include <iostream>
#include <map>
template <typename T>
class Key
{
public:
T x;
T y;
explicit Key(T x_, T y_): x(x_), y(y_){}
bool operator<( const Key<T>& right) const{
// std::map requires a strict weak ordering, so compare lexicographically
if(x != right.x){
return x < right.x;
}
return y < right.y;
}
T operator-( const Key<T> right) const{
return std::sqrt(std::pow(x-right.x, 2) + std::pow(y-right.y, 2));
}
};
int main(int argc, char **argv)
{
std::map<Key<double>, double> pixel_mapper;
Key<double> k1(400,5);
Key<double> k2(4,5);
Key<double> k3(4,5);
Key<double> k4(4667,5);
Key<double> k5(1000,5);
pixel_mapper.insert(std::pair<Key<double>, double>(k2, 5));
pixel_mapper.insert(std::pair<Key<double>, double>(k3, 5));
pixel_mapper.insert(std::pair<Key<double>, double>(k4, 5));
pixel_mapper.insert(std::pair<Key<double>, double>(k1, 5));
auto it = std::min_element( pixel_mapper.begin(), pixel_mapper.end(),
[&](const auto &p1, const auto &p2)
{
return std::abs(p1.first - k5) < std::abs(p2.first - k5);
});
std::cout<< it->first.x << "," << it->first.y << std::endl;
return 0;
}
Here, we can use std::min_element to get the closest key in case the exact key is not present (note that this is a linear scan over the map).

Output over unique elements of `std::multiset` and their frequency using std:: algorithm in C++ (no loops)

I have the following multiset in C++:
template<class T>
class CompareWords {
public:
bool operator()(T s1, T s2)
{
if (s1.length() == s2.length())
{
return ( s1 < s2 );
}
else return ( s1.length() < s2.length() );
}
};
typedef multiset<string, CompareWords<string>> mySet;
typedef std::multiset<string,CompareWords<string>>::iterator mySetItr;
mySet mWords;
I want to print each unique element of type std::string in the set once, and next to the element I want to print how many times it appears in the set (its frequency). As you can see, the functor CompareWords keeps the set sorted.
A solution is proposed here, but it's not what I need, because I am looking for a solution that does not use loops (while, for, do while).
I know that I can use this:
//gives the pair of iterators bounding the range of elements equal to "word"
auto p = mWords.equal_range(word);
// compute the distance between the iterators that bound the range AKA frequency
int count = static_cast<int>(std::distance(p.first, p.second));
but I can't quite come up with a solution without loops?
Unlike the other solutions, this iterates over the list exactly once. This is important, as iterating over a structure like std::multimap is reasonably high overhead (the nodes are distinct allocations).
There are no explicit loops, but the tail-end recursion will be optimized down to a loop, and I call an algorithm that will run a loop.
template<class Iterator, class Clumps, class Compare>
void produce_clumps( Iterator begin, Iterator end, Clumps&& clumps, Compare&& compare) {
if (begin==end) return; // do nothing for nothing
typedef decltype(*begin) value_type_ref;
// We know runs are at least 1 long, so don't bother comparing the first time.
// Generally, advancing will have a cost similar to comparing. If comparing is much
// more expensive than advancing, then this is sub optimal:
std::size_t count = 1;
Iterator run_end = std::find_if(
std::next(begin), end,
[&]( value_type_ref v ){
if (!compare(*begin, v)) {
++count;
return false;
}
return true;
}
);
// call our clumps callback:
clumps( begin, run_end, count );
// tail end recurse:
return produce_clumps( std::move(run_end), std::move(end), std::forward<Clumps>(clumps), std::forward<Compare>(compare) );
}
The above is a relatively generic algorithm. Here is its use:
int main() {
typedef std::multiset<std::string> mySet;
typedef std::multiset<std::string>::iterator mySetItr;
mySet mWords { "A", "A", "B" };
produce_clumps( mWords.begin(), mWords.end(),
[]( mySetItr run_start, mySetItr /* run_end -- unused */, std::size_t count )
{
std::cout << "Word [" << *run_start << "] occurs " << count << " times\n";
},
CompareWords<std::string>{}
);
}
The iterators must refer to a sorted sequence (with regards to the Comparator), then the clumps will be passed to the 3rd argument together with their length.
Every element in the multiset will be visited exactly once with the above algorithm (as a right-hand side argument to your comparison function). Every start of a clump will be visited (length of clump) additional times as a left-hand side argument (including clumps of length 1). There will be exactly N iterator increments performed, and no more than N+C+1 iterator comparisons (N=number of elements, C=number of clumps).
#include <iostream>
#include <algorithm>
#include <set>
#include <iterator>
#include <string>
int main()
{
typedef std::multiset<std::string> mySet;
typedef std::multiset<std::string>::iterator mySetItr;
mySet mWords;
mWords.insert("A");
mWords.insert("A");
mWords.insert("B");
mySetItr it = std::begin(mWords), itend = std::end(mWords);
std::for_each<mySetItr&>(it, itend, [&mWords, &it] (const std::string& word)
{
auto p = mWords.equal_range(word);
int count = static_cast<int>(std::distance(p.first, p.second));
std::cout << word << " " << count << std::endl;
std::advance(it, count - 1);
});
}
Outputs:
A 2
B 1
Following does the job without explicit loop using recursion:
void print_rec(const mySet& set, mySetItr it)
{
if (it == set.end()) {
return;
}
const auto& word = *it;
auto next = std::find_if(it, set.end(),
[&word](const std::string& s) {
return s != word;
});
std::cout << word << " appears " << std::distance(it, next) << std::endl;
print_rec(set, next);
}
void print(const mySet& set)
{
print_rec(set, set.begin());
}
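Another loop-free variant, sketched here as an assumption rather than taken from any answer above: let the multiset's own upper_bound jump from one run of equal words to the next, recursing once per distinct word. The helper name count_unique and the choice to collect results into a vector (instead of printing) are illustrative.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends (word, count) for every distinct word.
// upper_bound uses the multiset's own comparator, so a custom ordering
// such as CompareWords is respected.
template <class MultiSet>
void count_unique(const MultiSet& words,
                  typename MultiSet::const_iterator it,
                  std::vector<std::pair<std::string, std::size_t>>& out) {
    if (it == words.end()) return;
    auto next = words.upper_bound(*it);  // first element ordered after *it
    out.emplace_back(*it, static_cast<std::size_t>(std::distance(it, next)));
    count_unique(words, next, out);      // continue with the following run
}
```

It is started with count_unique(mWords, mWords.cbegin(), out). The recursion depth equals the number of distinct words, and std::distance still walks each run internally, so this hides the loops rather than removing the work.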

How do I efficiently remove_if only a single element from a forward_list?

Well I think the question pretty much sums it up. I have a forward_list of unique items, and want to remove a single item from it:
std::forward_list<T> mylist;
// fill with stuff
mylist.remove_if([](T const& value)
{
return value == condition;
});
I mean, this method works fine but it's inefficient because it continues to search once the item is found and deleted. Is there a better way or do I need to do it manually?
If you only want to remove the first match, you can use std::adjacent_find followed by the member erase_after
#include <algorithm>
#include <cassert>
#include <forward_list>
#include <iostream>
#include <ios>
#include <iterator>
// returns an iterator before first element equal to value, or last if no such element is present
// pre-condition: before_first is incrementable and not equal to last
template<class FwdIt, class T>
FwdIt find_before(FwdIt before_first, FwdIt last, T const& value)
{
assert(before_first != last);
auto first = std::next(before_first);
if (first == last) return last;
if (*first == value) return before_first;
return std::adjacent_find(first, last, [&](auto const&, auto const& R) {
return R == value;
});
}
int main()
{
auto e = std::forward_list<int>{};
std::cout << std::boolalpha << (++e.before_begin() == end(e)) << "\n";
std::cout << (find_before(e.before_begin(), end(e), 0) == end(e)) << "\n";
auto s = std::forward_list<int>{ 0 };
std::cout << (find_before(s.before_begin(), end(s), 0) == s.before_begin()) << "\n";
auto d = std::forward_list<int>{ 0, 1 };
std::cout << (find_before(d.before_begin(), end(d), 0) == d.before_begin()) << "\n";
std::cout << (find_before(d.before_begin(), end(d), 1) == begin(d)) << "\n";
std::cout << (find_before(d.before_begin(), end(d), 2) == end(d)) << "\n";
// erase after
auto m = std::forward_list<int>{ 1, 2, 3, 4, 1, 3, 5 };
auto it = find_before(m.before_begin(), end(m), 3);
if (it != end(m))
m.erase_after(it);
std::copy(begin(m), end(m), std::ostream_iterator<int>(std::cout, ","));
}
This will stop as soon as a match is found. Note that the adjacent_find takes a binary predicate, and by comparing only the second argument, we get an iterator before the element we want to remove, so that erase_after can actually remove it. Complexity is O(N) so you won't get it more efficient than this.
FWIW, here's another short version
template< typename T, class Allocator, class Predicate >
bool remove_first_if( std::forward_list< T, Allocator >& list, Predicate pred )
{
auto oit = list.before_begin(), it = std::next( oit );
while( it != list.end() ) {
if( pred( *it ) ) { list.erase_after( oit ); return true; }
oit = it++;
}
return false;
}
Going to have to roll your own...
template <typename Container, typename Predicate>
void remove_first_of(Container& container, Predicate p)
{
auto it = container.before_begin();
for (auto nit = std::next(it); ; it = nit, nit = std::next(it))
{
if (nit == container.end())
return;
if (p(*nit))
{
container.erase_after(it);
return;
}
}
}
There is nothing in the standard library which would be directly applicable. Actually, there is. See @TemplateRex's answer for that.
You can also write this yourself (especially if you want to combine the search with the erasure), something like this:
template <class T, class Allocator, class Predicate>
bool remove_first_if(std::forward_list<T, Allocator> &list, Predicate pred)
{
auto itErase = list.before_begin();
auto itFind = list.begin();
const auto itEnd = list.end();
while (itFind != itEnd) {
if (pred(*itFind)) {
list.erase_after(itErase);
return true;
} else {
++itErase;
++itFind;
}
}
return false;
}
This kind of stuff used to be a standard exercise when I learned programming way back in the early '80s. It might be interesting to recall the solution, and compare that with what one can do in C++. Actually that was in Algol 68, but I won't impose that on you and will give the translation into C instead. Given
typedef ... T;
typedef struct node *link;
struct node { link next; T data; };
one could write the following, realising that one needs to pass the address of the list head pointer if it is to be possible to unlink the first node:
void search_and_destroy(link *p_addr, T y)
{
while (*p_addr!=NULL && (*p_addr)->data!=y)
p_addr = &(*p_addr)->next;
if (*p_addr!=NULL)
{
link old = *p_addr;
*p_addr = old->next; /* unlink node */
free(old); /* and free memory */
}
}
There are a lot of occurrences of *p_addr there; it is the last one, where it is the LHS of an assignment, that is the reason one needs the address of a pointer here in the first place. Note that in spite of the apparent complication, the statement p_addr = &(*p_addr)->next; is just replacing a pointer by the value it points to, and then adding an offset (which is 0 here).
One could introduce an auxiliary pointer value to lighten the code up a bit, as follows
void search_and_destroy(link *p_addr, T y)
{
link p=*p_addr;
while (p!=NULL && p->data!=y)
p=*(p_addr = &p->next);
if (p!=NULL)
{
*p_addr = p->next;
free(p);
}
}
but that is fundamentally the same code: any decent compiler should realise that the pointer value *p_addr is used multiple times in succession in the first example, and keep it in a register.
Now with std::forward_list<T>, we are not allowed access to the pointers that link the nodes, and get those awkward "iterators pointing one node before the real action" instead. Our solution becomes
void search_and_destroy(std::forward_list<T>& list, T y) // by reference, so the caller's list is modified
{
std::forward_list<T>::iterator it = list.before_begin();
const std::forward_list<T>::iterator NIL = list.end();
while (std::next(it)!=NIL && *std::next(it)!=y)
++it;
if (std::next(it)!=NIL)
list.erase_after(it);
}
Again we could keep a second iterator variable to hold std::next(it) without having to spell it out each time (not forgetting to refresh its value when we increment it) and arrive at essentially the answer by Daniel Frey. (We could instead try to make that variable a pointer of type T* equal to &*std::next(it), which suffices for the use we make of it, but it would actually be a bit of a hassle to ensure it becomes the null pointer when std::next(it)==NIL, as the standard will not let us take &*NIL).
I cannot help feeling that since the old days the solution to this problem has not become more elegant.