In this simple pipeline of view adaptors, a gen function is called to generate a sequence of values (using internal state), and a filter is then applied to it.
What is surprising and counterintuitive (at least to me) is that the generator function is called twice at each iteration, so the next check of the same condition fails (the filtered value is not reused in the pipeline).
Do you have an idea if this is the correct expected behavior (and why)?
Tested with libstdc++ in GCC 10.3, 11.1 and trunk, and with range-v3 using GCC and Clang.
#include <cassert>
#include <iostream>
#include <range/v3/all.hpp>
int main() {
int n = 0;
auto gen = [&n]() {
auto result = ++n;
std::cout << "Generate [" << result << "]\n";
return result;
};
auto tmp =
ranges::views::iota(0)
| ranges::views::transform([gen](auto &&) { return gen(); })
| ranges::views::filter([](auto &&i) {
std::cout << "#1 " << i << " " << (i % 2) << "\n";
return (i % 2) == 1;
});
for (auto &&i : tmp | ranges::views::take(1)) {
std::cout << "#2 " << i << " " << ((i % 2) == 1) << "\n";
assert(((i % 2) == 1));
}
}
NB: if gen is written as a mutable lambda with internal state, it does not compile:
auto gen = [n=0]() mutable {
auto result = ++n;
std::cout << "Generate [" << result << "]\n";
return result;
};
(and I know that pure functions are better)
Do you have an idea if this is the correct expected behavior (and why)?
Yes: this is the expected behavior. It is an inherent property of the iteration model where we have operator* and operator++ as separate operations.
filter's operator++ has to look for the next underlying iterator that satisfies the predicate. That involves doing *it on transform's iterator which involves invoking the function. But once we find that next iterator, when we read it again, that will again invoke the transform. In a code snippet:
decltype(auto) transform_view<V, F>::iterator::operator*() const {
return invoke(f_, *it_);
}
decltype(auto) filter_view<V, P>::iterator::operator*() const {
// reading through the filter iterator just reads
// through the underlying iterator, which in this
// case means invoking the function
return *it_;
}
auto filter_view<V, P>::iterator::operator++() -> iterator& {
for (++it_; it_ != ranges::end(parent_->base_); ++it_) {
// when we eventually find an iterator that satisfies this
// predicate, we will have needed to read it (which calls
// the functions) and then the next operator* will do
// that same thing again
if (invoke(parent_->pred_, *it_)) {
break;
}
}
return *this;
}
The result is that we invoke the function twice on every element that satisfies the predicate.
The workaround is either to not care (because the transform is cheap enough that invoking it twice doesn't matter, or matches are rare enough that the duplicate transforms don't matter, or both) or to add a caching layer to your pipeline.
There's no caching view in C++20 Ranges, but there is one in range-v3 named views::cache1:
ranges::views::iota(0)
| ranges::views::transform(f)
| ranges::views::cache1
| ranges::views::filter(g)
This ensures that f only gets invoked at most once per element, at the cost of having to deal with an element cache and downgrading your range to only be an input range (where before it was bidirectional).
I have a running-time issue with my C++ program. The program checks millions of times whether two integer lists have any common elements. I don't need to know which elements are common. I wrote the method below, but it doesn't look efficient and I need to speed the program up. What is the best way to do this, or does C++ have a built-in method that does this comparison efficiently?
bool compareHSAndNewSet(list<int> hs , list<int> newset){
bool isCommon = false;
for(int x : hs){
for(int y : newset){
if(x == y){isCommon = true; break;}
}
if(isCommon == true) {break;}
}
return isCommon;
}
Hint: I don't know whether it matters, but the first input of the function (hs in the code) is ordered.
I was curious about the various strategies, so I made the simple benchmark below.
However, I wouldn't try to sort the second container; comparing all the data inside a container and moving them around seems to be overkill just to find one element in the intersection.
The program gives these results on my computer (Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz):
vectors: 1.41164
vectors (dichotomic): 0.0187354
lists: 12.0402
lists (dichotomic): 13.4844
If we ignore that the first container is sorted and iterate over its elements in order, we can see that a simpler container with contiguous storage (a vector here) is much better than one whose elements are spread across memory (a list here): 1.41164 s versus 12.0402 s (about an 8.5x speedup).
But if we take into account that the first container is sorted (as stated in the question), a dichotomic approach can improve the situation even more.
The best case (dichotomic approach on vectors) is far better than the original case (in-order approach on lists): 0.0187354 s versus 12.0402 s (about a 642x speedup).
Of course, all of this depends on many other factors (sizes of datasets, distributions of the values...); this is just a micro benchmark, and a specific application could behave differently.
Note that in the question, the parameters were passed by value; this will probably cause some unneeded copies (except if a move operation is used at the call site, but I would find that uncommon for such a function). I switched to pass-by-reference-on-const instead.
Note also that a dichotomic approach on a list is a pessimisation (no random access for the iterators, so it's still linear but more complicated than the simplest linear approach).
edit: my original code was wrong; thanks to @bitmask I changed it. It does not change the general idea.
/**
g++ -std=c++17 -o prog_cpp prog_cpp.cpp \
-pedantic -Wall -Wextra -Wconversion -Wno-sign-conversion \
-O3 -DNDEBUG -march=native
**/
#include <list>
#include <vector>
#include <algorithm>
#include <chrono>
#include <random>
#include <tuple>
#include <iostream>
template<typename Container>
bool
compareHSAndNewSet(const Container &hs,
const Container &newset)
{
for(const auto &elem: newset)
{
const auto it=std::find(cbegin(hs), cend(hs), elem);
if(it!=cend(hs))
{
return true; // found common element
}
}
return false; // no common element
}
template<typename Container>
bool
compareHSAndNewSet_dichotomic(const Container &hs,
const Container &newset)
{
for(const auto &elem: newset)
{
if(std::binary_search(cbegin(hs), cend(hs), elem))
{
return true; // found common element
}
}
return false; // no common element
}
std::tuple<std::vector<int>, // hs
std::vector<int>> // newset
prepare_vectors()
{
static auto rnd_gen=std::default_random_engine {std::random_device{}()};
constexpr auto sz=10'000;
auto distr=std::uniform_int_distribution<int>{0, 10*sz};
auto hs=std::vector<int>{};
auto newset=std::vector<int>{};
for(auto i=0; i<sz; ++i)
{
hs.emplace_back(distr(rnd_gen));
newset.emplace_back(distr(rnd_gen));
}
std::sort(begin(hs), end(hs));
return {hs, newset};
}
std::tuple<std::list<int>, // hs
std::list<int>> // newset
prepare_lists(const std::vector<int> &hs,
const std::vector<int> &newset)
{
return {std::list(cbegin(hs), cend(hs)),
std::list(cbegin(newset), cend(newset))};
}
double // seconds (1e-6 precision) since 1970/01/01 00:00:00 UTC
get_time()
{
const auto now=std::chrono::system_clock::now().time_since_epoch();
const auto us=std::chrono::duration_cast<std::chrono::microseconds>(now);
return 1e-6*double(us.count());
}
int
main()
{
constexpr auto generations=100;
constexpr auto iterations=1'000;
auto duration_v=0.0;
auto duration_vd=0.0;
auto duration_l=0.0;
auto duration_ld=0.0;
for(auto g=0; g<generations; ++g)
{
const auto [hs_v, newset_v]=prepare_vectors();
const auto [hs_l, newset_l]=prepare_lists(hs_v, newset_v);
for(auto i=-1; i<iterations; ++i)
{
const auto t0=get_time();
const auto comp_v=compareHSAndNewSet(hs_v, newset_v);
const auto t1=get_time();
const auto comp_vd=compareHSAndNewSet_dichotomic(hs_v, newset_v);
const auto t2=get_time();
const auto comp_l=compareHSAndNewSet(hs_l, newset_l);
const auto t3=get_time();
const auto comp_ld=compareHSAndNewSet_dichotomic(hs_l, newset_l);
const auto t4=get_time();
if((comp_v!=comp_vd)||(comp_v!=comp_l)||(comp_v!=comp_ld))
{
std::cerr << "comparison mismatch\n";
}
if(i>=0) // first iteration is dry-run (warmup)
{
duration_v+=t1-t0;
duration_vd+=t2-t1;
duration_l+=t3-t2;
duration_ld+=t4-t3;
}
}
}
std::cout << "vectors: " << duration_v << '\n';
std::cout << "vectors (dichotomic): " << duration_vd << '\n';
std::cout << "lists: " << duration_l << '\n';
std::cout << "lists (dichotomic): " << duration_ld << '\n';
return 0;
}
You can try sorting the lists and using set_intersection.
bool compareHSAndNewSet(list<int> hs, list<int> newset) {
    hs.sort();
    newset.sort();
    list<int> commonElts;
    // list iterators are not random access, so append through a back_inserter
    std::set_intersection(hs.begin(), hs.end(), newset.begin(), newset.end(),
                          std::back_inserter(commonElts));
    return !commonElts.empty(); // true when at least one element is shared
}
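Since only the existence of a common element matters, the intersection does not have to be materialised. As a sketch under the assumption that both lists are already sorted in ascending order, a two-iterator walk can stop at the first match; sorted_lists_share_element is a hypothetical name, not something from the question.

```cpp
#include <cassert>
#include <list>

// Assumes both lists are sorted in ascending order.
// Advances whichever iterator points at the smaller value and stops
// as soon as the two values are equal.
bool sorted_lists_share_element(const std::list<int>& a, const std::list<int>& b) {
    auto i = a.begin();
    auto j = b.begin();
    while (i != a.end() && j != b.end()) {
        if (*i < *j) {
            ++i;
        } else if (*j < *i) {
            ++j;
        } else {
            return true;  // equal values: common element found
        }
    }
    return false;  // one list was exhausted without a match
}
```

This visits each element at most once, so the walk itself is linear in the combined length of the two lists; the cost of sorting, if it is still needed, comes on top.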
I'd add the first list to a std::unordered_map<>, then check whether each element of the second list exists in the map. This ends up iterating each list once, doing length(first) insertions and length(second) lookups on the map.
std::unordered_map<> has average lookup and insertion complexity of O(1), though the worst case is O(n).
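A minimal sketch of this idea follows. It swaps in std::unordered_set for the map, since only keys are needed; lists_share_element is a hypothetical name chosen for illustration.

```cpp
#include <cassert>
#include <list>
#include <unordered_set>

// Builds a hash set from the first list, then probes it with each
// element of the second list, returning at the first hit.
bool lists_share_element(const std::list<int>& first, const std::list<int>& second) {
    const std::unordered_set<int> seen(first.begin(), first.end());
    for (int value : second) {
        if (seen.count(value) != 0) {
            return true;  // value occurs in both lists
        }
    }
    return false;
}
```

If the same first list is compared against many second lists, building the set once and reusing it would avoid repeating the insertions on every call.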
I have the following broken code. From what I can tell, the problem is that iota(0, n) gives me an iota_view<int, int64_t>, and an int can obviously never reach an int64_t bound that is greater than INT_MAX.
The easy fix is to just use iota(0LL, n), but that seems error prone.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <ranges>
int main() {
const int64_t n = 16LL*1024*1024*1024;
auto ints = std::ranges::views::iota(0, n) |
std::views::transform([](int64_t x) { return x * 10; });
for (int64_t lookup : {49LL, 50LL, 51LL}) {
const auto it = std::ranges::lower_bound(ints, lookup);
if (it != ints.end()) {
std::cout << *it << std::endl;
std::cout << *it.base() << std::endl;
}
}
}
My best guess is that iota_view wants to work with "weird" second type, like some kind of +INF type, so that is why it needs 2 types, and nobody thought of forcing the first argument to match the second one if they are both ints.
Is there any way to write these two class methods without using loops, lambda functions, or other additional functions? I only want to use the functions and functors from the algorithm and functional libraries. I have tried to use a while loop and recursion to figure something out, but still couldn't solve this. I didn't want to post the whole code since it is too long.
(I don't have to use these libraries for << operator, if there is a way to solve this without them, and min and max are lists)
My main goal is not using loops, lambda functions and any additional functions outside the two libraries. EraseNegativeTemperatures is the method which should only erase the registered pairs of temperatures where both of them are negative (I wrote the method in the part which I didn't post, and that method makes pairs where one temperature is from min and the other one from max)
Operator << outputs all the registered temperatures, the min ones in one row and the max ones in the second row; the temperatures in one row are separated by spaces.
void Temperatures::EraseNegativeTemperatures() {
for (auto it = max.begin(); it != max.end(); it++) {
if (*it < 0) {
auto it1 = it;
auto it2 = min.begin();
while (it1 != max.begin()) {
it2++;
it1--;
}
min.erase(it2);
it = max.erase(it);
}
}
// std::remove_if(min.begin(), min.end(), std::bind(std::less<int>(),
//     max.begin() + std::placeholders::_1 - min.begin(), 0));
}
// second method
std::ostream &operator<<(std::ostream &flow, const Temperatures &t) {
std::for_each(t.min.begin(), t.min.end(),
[&flow](int x) { flow << x << " "; });
flow << std::endl;
std::for_each(t.max.begin(), t.max.end(),
[&flow](int x) { flow << x << " "; });
flow << std::endl;
return flow;
}
As far as I can see, you erase the temperature at the same position from both min and max.
I added a test that the min temperature is also negative.
Untested code:
void Temperatures::EraseNegativeTemperatures() {
    for (auto it = max.begin(); it != max.end(); ) {
        // element of min at the same position as *it in max
        auto it2 = std::next(min.begin(), std::distance(max.begin(), it));
        if (*it < 0 && *it2 < 0) {
            min.erase(it2);
            it = max.erase(it); // erase already returns the next iterator
        } else {
            ++it;
        }
    }
}
At first it might be simpler to erase from both lists separately; the following (disallowed) lambda solution will be the starting point:
auto current = max.begin();
min.erase(
remove_if(
min.begin(), min.end(),
[&current] (int /* ignored! */) { return *current++ < 0; }
),
min.end()
);
This will remove all those elements from the min list that have a corresponding negative value in the max list. Afterwards you can remove the negative values from the max list with an expression you have already found:
max.erase(
remove_if(
max.begin(), max.end(),
std::bind(std::less<int>(), std::placeholders::_1, 0)
),
max.end()
);
The tricky part is now to re-build the lambda with standard library means only.
So we will need the operators above available as ordinary functions so that we can bind to them:
auto dereference = &std::list<int>::iterator::operator*;
auto postIncrement = static_cast<
std::list<int>::iterator (std::list<int>::iterator::*)(int)
>(
&std::list<int>::iterator::operator++
);
The static_cast for getting the second operator is necessary to distinguish the post-increment operator from the pre-increment operator.
Now we need to bind everything to the less-than operator:
auto lt0 = std::bind(
std::less<int>(),
std::bind(dereference, std::bind(postIncrement, std::ref(current), 0)),
0
);
Note the creation of a reference to the iterator! The resulting functor we now can use to erase from min list:
min.erase(remove_if(min.begin(), min.end(), lt0), min.end());
In much the same way, we can create the functor for outputting the lists. First we need the operators available; note that one of them is a member function and the other is not:
auto outInt = static_cast<std::ostream& (std::ostream::*)(int)>(
&std::ostream::operator<<
);
auto outChar = static_cast<std::ostream& (*)(std::ostream&, char)>(
&std::operator<<
);
Now we can again bind everything together:
auto out = std::bind(
outChar,
std::bind(outInt, std::ref(flow), std::placeholders::_1),
' '
);
std::for_each(t.min.begin(), t.min.end(), out);
flow << std::endl;
std::for_each(t.max.begin(), t.max.end(), out);
flow << std::endl;
I am trying to iterate through a list and then, if the object's plate number matches the one given through the parameters, and if the toll (calculated in toll()) is less than or equal to the given cents, remove/erase the object from the list. I keep getting the error that the list iterator cannot be incremented and I'm clueless as to how to fix it.
void one_time_payment(string& plate_number, int cents) {
// TODO: REWRITE THIS FUNCTION
std::list<LicenseTrip>:: iterator it;
for (it = listLicense.begin(); it != listLicense.end(); std::advance(it, 1)) {
if (it->plate_number().compare(plate_number) == 0) {
cout << "Matching Plate Found" << endl;
if (it->toll() <= cents) {
cout << "Can be paid" << endl;
it = listLicense.erase(it); //Error: list iterator cannot be incremented
}
}
}
cout << "End of Iterator" << endl;
}
This is, I'm guessing, not a compile error but rather an assertion that triggered. You have a bug!
Let's say you're on the last element, and all your conditions apply. So we do:
it = listLicense.erase(it);
Now, it is end(). But right after that, at the end of the body of the for loop, we advance it! This is undefined behavior! Hence: list iterator cannot be incremented.
To help us write this correctly, there is a list::remove_if:
listLicense.remove_if([&](const LicenseTrip& trip){
return trip.plate_number() == plate_number &&
trip.toll() <= cents;
});
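If an explicit loop is preferred over remove_if, the usual idiom is to advance the iterator only when nothing was erased. The sketch below assumes a hypothetical Trip struct standing in for LicenseTrip, with plain data members instead of the accessor functions from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-in for LicenseTrip.
struct Trip {
    std::string plate;
    int toll;
};

// Erases every trip with a matching plate whose toll can be paid with `cents`.
// Returns how many trips were erased.
std::size_t erase_payable(std::list<Trip>& trips, const std::string& plate, int cents) {
    std::size_t erased = 0;
    for (auto it = trips.begin(); it != trips.end(); /* advanced in the body */) {
        if (it->plate == plate && it->toll <= cents) {
            it = trips.erase(it);  // erase returns the next valid iterator (possibly end())
            ++erased;
        } else {
            ++it;                  // only step forward when nothing was erased
        }
    }
    return erased;
}
```

Because the increment is no longer in the for header, erasing the last element simply leaves the iterator equal to end() and the loop condition ends the loop.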
So, as Barry explained, the problem that was causing the failed assertion was that the iterator would attempt to advance it beyond end() which would give undefined behavior. In my case, the it would only be needed once (only used to locate a LicenseTrip with a matching plate_number), so it sufficed to put a break; after the listLicense.erase(it). The final working code is as follows:
void one_time_payment(string& plate_number, int cents) {
std::list<LicenseTrip>:: iterator it;
for (it = listLicense.begin(); (it != listLicense.end()) ; std::advance(it, 1)) {
if (it->plate_number().compare(plate_number) == 0 && it->toll() <= cents) {
listLicense.erase(it);
break;
}
}
}
Here is a simple C++ question.
Description of the problem:
I have a function that takes as input an integer and returns a vector of zeros with length the input. Assume that I call the function many times with the same argument. What I want to avoid is that my function creates the vector of zeroes each time it is called. I want this to happen only the first time the function is called with the given input.
How I approached it: This brought to mind static variables. I thought of creating a static vector that holds the required zero vectors of each size, but wasn't able to figure out how to implement this. As an example I want something that "looks" like [ [0], [0,0], ...].
If there is a different way to approach such a problem please feel free to share! Also, my example with vectors is a bit specialised but replies that are more generic (concerning static variables that depend on the argument) would be greatly appreciated.
Side question:
To generalise further, is it possible to define a function that is only called once for each choice of arguments?
Thanks a lot.
You can have a map of sizes and vectors, one vector for each size:
#include <vector>
#include <map>
#include <cstddef>
std::vector<int>& get_vector(std::size_t size)
{
static std::map<size_t, std::vector<int> > vectors;
std::map<size_t, std::vector<int> >::iterator iter = vectors.find(size);
if (iter == vectors.end())
{
iter = vectors.insert(std::make_pair(size, std::vector<int>(size, 0))).first;
}
return iter->second;
}
If I understand correctly what you are trying to do, I don't think you will get the benefit you are expecting.
I wrote a quick benchmark to compare the performance of repeatedly creating a vector of zeros. The first benchmark uses the standard vector constructor. The second uses a function that only creates the vector the first time and stores it in a map:
const std::vector<int>& zeros(std::size_t size) {
static std::unordered_map<size_t, std::vector<int>> vectors;
auto find = vectors.find(size);
if (find != vectors.end())
return find->second;
auto insert = vectors.emplace(size, std::vector<int>(size));
return insert.first->second;
}
std::chrono::duration<float> benchmarkUsingMap() {
int sum = 0;
auto start = std::chrono::high_resolution_clock::now();
for (int i = 0; i != 10'000; ++i) {
auto zeros10k = zeros(10'000);
zeros10k[5342] = 1;
sum += zeros10k[5342];
}
auto end = std::chrono::high_resolution_clock::now();
std::cout << "Sum: " << sum << "\n";
return end - start;
}
std::chrono::duration<float> benchmarkWithoutUsingMap() {
int sum = 0;
auto start = std::chrono::high_resolution_clock::now();
for (int i = 0; i != 10'000; ++i) {
auto zeros10k = std::vector<int>(10'000);
zeros10k[5342] = 1;
sum += zeros10k[5342];
}
auto end = std::chrono::high_resolution_clock::now();
std::cout << "Sum: " << sum << "\n";
return end - start;
}
int main() {
std::cout << "Benchmark without map: " << benchmarkWithoutUsingMap().count() << '\n';
std::cout << "Benchmark using map: " << benchmarkUsingMap().count() << '\n';
}
Output:
Benchmark without map: 0.0188374
Benchmark using map: 0.134966
So, in this case, just creating the vector each time was about 7x faster. This is assuming you want to create a mutable copy of the vector of zeros.
If each vector needs to be a separate instance then you will have to have a construction for each instance. Since you will have to construct each instance you can make a simple make_int_vector function like:
std::vector<int> make_int_vector(std::size_t size, int fill = 0)
{
return std::vector(size, fill);
}
The returned vector will either be moved or have its copy elided.
What you are asking for is a cache. The hard part is deciding how long an entry should live in the cache. Your current requirement seems to be an eternal cache, meaning that each entry persists forever. For such a simple use case, a static map is enough:
template<typename T, typename U>
T cached(T (*funct)(U), U arg) {
    // note: this cache is shared by all functions with the same signature
    static std::unordered_map<U, T> c;
    if (c.count(arg) == 0) {
        c[arg] = funct(arg);
    }
    return c[arg];
}
The above returns a value, which requires a copy. If you want to avoid the copy, just return a reference; but then, if you change one of the vectors, the next call will return the modified value.
template<typename T, typename U>
T& cached(T (*funct)(U), U arg) {
    // note: this cache is shared by all functions with the same signature
    static std::unordered_map<U, T> c;
    if (c.count(arg) == 0) {
        c[arg] = funct(arg);
    }
    return c[arg];
}
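A variation on the same idea is to keep the cache inside an object that wraps one particular callable, so that two functions with the same signature do not share entries. The following is only a sketch: Memoized and its misses() counter are hypothetical names introduced here, and it assumes the argument type is hashable with std::hash.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical wrapper: calls the wrapped function at most once per distinct argument.
template <typename Result, typename Arg>
class Memoized {
public:
    explicit Memoized(std::function<Result(Arg)> f) : f_(std::move(f)) {}

    // Returns a reference to the cached result, computing it on first use.
    const Result& operator()(const Arg& arg) {
        auto it = cache_.find(arg);
        if (it == cache_.end()) {
            ++misses_;  // the wrapped function is only invoked on a cache miss
            it = cache_.emplace(arg, f_(arg)).first;
        }
        return it->second;
    }

    std::size_t misses() const { return misses_; }

private:
    std::function<Result(Arg)> f_;
    std::unordered_map<Arg, Result> cache_;
    std::size_t misses_ = 0;
};
```

Returning a const reference avoids the copy while keeping callers from modifying the cached value; references into a std::unordered_map stay valid when later insertions cause a rehash.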