I have a vector<A>, where A has a field A::foo.
I would like to create a vector whose elements are the "foo"s of the previous vector. Sure, I can iterate through the vector's elements, but is there a direct implementation in the STL or another major library?
Probably this should work
vector<A> vA; // this is your vector<A>, assumed to contain some `A`'s
...
vector<Foo> vFoo; // here is where we extract the Foo's
std::transform(std::begin(vA), std::end(vA), std::back_inserter(vFoo),
[](const A& param) { return param.foo; });
You can use std::transform from <algorithm>.
#include<algorithm>
std::vector<A> as;
// fill it
std::vector<decltype(A::foo)> foos; // element type is whatever type A::foo has
foos.resize(as.size());
std::transform(as.begin(), as.end(), foos.begin(), [](const A& a) { return a.foo; });
You can use std::transform, but in this case I think a range-based for loop is more readable and should be at least as efficient as the STL algorithm.
EDIT:
To illustrate my point:
struct Foo {};
struct Bar {
Foo foo;
};
int main() {
vector<Bar> bars(10);
vector<Foo> foos1, foos2, foos3;
foos1.reserve(bars.size());
for (const auto& e : bars) { // a terser 'for (e : bars)' was proposed for a later standard but not adopted
foos1.push_back(e.foo);
}
foos2.resize(bars.size());
transform(bars.begin(), bars.end(), foos2.begin(), [](const Bar& bar) { return bar.foo; });
foos3.reserve(bars.size());
transform(bars.begin(), bars.end(), back_inserter(foos3), [](const Bar& bar){return bar.foo; });
}
In my opinion, the loop version is much easier to read (but I admit that this is probably a matter of taste).
Don't get me wrong, I really do like algorithms, but in very simple cases like this, their syntactic overhead just doesn't pull its weight.
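If the objection is mostly the noise at the call site, the std::transform call can be hidden behind a small helper. The following is a minimal sketch rather than a standard facility: extract_member is a hypothetical name, and Item is a stand-in type with an int field used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: copies the data member `member` out of every element.
template <typename T, typename M>
std::vector<M> extract_member(const std::vector<T>& items, M T::*member)
{
    std::vector<M> out;
    out.reserve(items.size());
    std::transform(items.begin(), items.end(), std::back_inserter(out),
                   [member](const T& item) { return item.*member; });
    return out;
}

// Stand-in element type for illustration only.
struct Item {
    int foo;
};

// Usage: the call site shrinks to a single line.
inline void extract_member_example()
{
    const std::vector<Item> items{{1}, {2}, {3}};
    const std::vector<int> foos = extract_member(items, &Item::foo);
    assert((foos == std::vector<int>{1, 2, 3}));
}
```

Whether this reads better than the plain loop is still a matter of taste; it mainly pays off when the same extraction is needed in several places.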
EDIT2:
Some processors have a gather instruction. It would be interesting to see whether a typical compiler would generate it, and under which conditions.
Related
Let's have
class InputClass;
class OutputClass;
OutputClass const In2Out(InputClass const &in)
{
//conversion implemented
}
and finally
std::vector<OutputClass> Convert(std::vector<InputClass> const &input)
{
std::vector<OutputClass> res;
res.reserve(input.size());
//either
for (auto const &in : input)
res.emplace_back(In2Out(in));
return res;
//or something like
std::transform(input.begin(), input.end(), std::back_inserter(res), [](InputClass const &in){return In2Out(in);});
return res;
}
And now my question:
Can I rewrite the Convert function somehow, avoiding the need to name the new container? I.e., is there a way to construct a vector directly using something roughly like std::transform or std::for_each?
As in (pseudocode, this unsurprisingly does not work or even build)
std::vector<OutputClass> Convert(std::vector<InputClass> const &input)
{
return std::transform(input.begin(), input.end(), std::back_inserter(std::vector<OutputClass>()), [](InputClass const &in){return In2Out(in);});
}
Searched, but did not find any elegant solution. Thanks!
Starting in C++20 you can use the new std::ranges::transform_view to accomplish what you want. It calls your transformation function for each element of the container it adapts, and you can pass the view's iterators to std::vector's iterator-range constructor, which populates the elements directly. Whether that constructor can allocate all the memory up front depends on the iterator category the view reports: when the transformation returns by value, the view's iterators are classified as input iterators for this purpose, so the vector may grow incrementally instead. You still have to create a variable in the function, but it becomes much more streamlined. That would give you something like
std::vector<OutputClass> Convert(std::vector<InputClass> const &input)
{
auto range = std::ranges::transform_view(input, In2Out);
return {range.begin(), range.end()};
}
Do note that this should optimize to the exact same code your function generates.
Yes, it is possible, and quite simple when using Boost:
struct A
{
};
struct B
{
};
std::vector<B> Convert(const std::vector<A> &input)
{
auto trans = [](const A&) { return B{}; };
return { boost::make_transform_iterator(input.begin(), trans), boost::make_transform_iterator(input.end(), trans) };
}
https://wandbox.org/permlink/ZSqt2SbsHeY8V0mt
But as others mentioned, this is weird and doesn't provide any gain (in performance or in readability).
Can I rewrite the Convert function somehow avoiding the need to name the new container?
Not using just std::transform. std::transform itself never creates a container. It only writes elements through an output iterator. And in order to both get an output iterator into a container and return the container later, you pretty much need a name (unless you allocate the container dynamically, which would be silly and inefficient).
You can of course write a function that uses std::transform, creates the (named) vector, and returns it. Then the caller of that function doesn't need to care about that name. In fact, that's pretty much what your function Convert is.
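Such a wrapper can be written once and reused for any element type. This is a minimal sketch under stated assumptions: transform_to_vector is a hypothetical name, it only accepts std::vector as input, and it needs C++14 for the deduced return type.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper: applies `f` to each element of `input` and returns the
// results in a new vector. The vector is named here so callers need no name.
template <typename In, typename F>
auto transform_to_vector(const std::vector<In>& input, F f)
{
    using Out = std::decay_t<decltype(f(input.front()))>;  // unevaluated, safe on empty input
    std::vector<Out> result;
    result.reserve(input.size());
    std::transform(input.begin(), input.end(), std::back_inserter(result), f);
    return result;
}

// Usage: the result type follows from what the callable returns.
inline void transform_to_vector_example()
{
    const std::vector<int> input{1, 2, 3};
    const auto halves = transform_to_vector(input, [](int x) { return x * 0.5; });
    assert((halves == std::vector<double>{0.5, 1.0, 1.5}));
}
```

With this in place, the body of Convert could be reduced to a single return statement such as return transform_to_vector(input, In2Out);.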
If I have the following code that makes use of execution policies, do I need to synchronize all accesses to Foo::value even when I'm just reading the variable?
#include <algorithm>
#include <execution>
#include <vector>
struct Foo { int value; int getValue() const { return value; } };
int main() {
std::vector<Foo> foos;
//fill foos here...
std::sort(std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
return left.getValue() > right.getValue();
});
return 0;
}
My concern is that std::sort() will move (or copy) elements asynchronously which is effectively equivalent to asynchronously writing to Foo::value and, therefore, all read and write operations on that variable need to be synchronized. Is this correct or does the sort function itself take care of this for me?
What if I were to use std::execution::par_unseq?
If you follow the rules, i.e. you don't modify anything or rely on the identity of the objects being sorted inside your callback, then you're safe.
The parallel algorithm is responsible for synchronizing access to the objects it modifies.
See [algorithms.parallel.exec]/2:
If an object is modified by an element access function, the algorithm will perform no other unsynchronized accesses to that object. The modifying element access functions are those which are specified as modifying the object. [ Note: For example, swap(), ++, --, @=, and assignments modify the object. For the assignment and @= operators, only the left argument is modified. — end note ]
In case of std::execution::par_unseq, there's the additional requirement on the user-provided callback that it isn't allowed to call vectorization-unsafe functions, so you can't even lock anything in there.
This is OK. After all, you have told std::sort what you want of it and you would expect it to behave sensibly as a result, given that it is presented with all the relevant information up front. There's not a lot of point to the execution policy parameter at all, otherwise.
Where there might be an issue (although not in your code, as written) is if the comparison function has side effects. Suppose we innocently wrote this:
int numCompares;
std::sort(std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
++numCompares;
return left.getValue() > right.getValue();
});
Now we have introduced a race condition, since two threads of execution might be passing through that code at the same time and access to numCompares is not synchronised (or, as I would put it, serialised).
But, in my slightly contrived example, we don't need to be so naive, because we can simply say:
std::atomic_int numCompares;
and then the problem goes away (and this particular example would also work with what appears to me to be the spectacularly useless std::execution::par_unseq, because std::atomic_int is lock-free on any sensible platform, thank you Rusty).
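To make the fix concrete, here is a minimal sketch of a counting comparator backed by std::atomic<int>. It is an illustration, not the original poster's code: sort_descending_counting is a hypothetical name, it sorts plain ints, and it calls the sequential std::sort overload so that it builds without a parallel backend. Passing std::execution::par as the first argument (with <execution> included) should be the only change the parallel form needs.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <vector>

// Sorts `values` in descending order and returns how many comparisons ran.
// The counter is atomic, so concurrent increments from a parallel sort
// would not race with each other.
inline int sort_descending_counting(std::vector<int>& values)
{
    std::atomic<int> num_compares{0};
    std::sort(values.begin(), values.end(),
              [&num_compares](int left, int right) {
                  // Relaxed ordering is enough: only the final total matters.
                  num_compares.fetch_add(1, std::memory_order_relaxed);
                  return left > right;
              });
    return num_compares.load();
}

inline void sort_descending_counting_example()
{
    std::vector<int> values{3, 1, 4, 1, 5, 9, 2, 6};
    const int compares = sort_descending_counting(values);
    assert((values == std::vector<int>{9, 6, 5, 4, 3, 2, 1, 1}));
    assert(compares > 0);
}
```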
So, in summary, don't be too concerned about what std::sort does (although I would certainly knock up a quick test program and hammer it a bit to see if it does actually work as I am claiming). Instead, be concerned about what you do.
More here.
Edit: And while Rusty was digging that up, I did in fact write that quick test program (had to fix your lambda) and, sure enough, it works fine. I can't find an online compiler that supports execution (MSVC seems to think it is experimental), so I can't offer you a live demo, but when run on the latest version of MSVC, this code:
#define _SILENCE_PARALLEL_ALGORITHMS_EXPERIMENTAL_WARNING
#include <algorithm>
#include <execution>
#include <vector>
#include <cstdlib>
#include <iostream>
constexpr int num_foos = 100000;
struct Foo
{
Foo (int value) : value (value) { }
int value;
int getValue() const { return value; }
};
int main()
{
std::vector<Foo> foos;
foos.reserve (num_foos);
// fill foos
for (int i = 0; i < num_foos; ++i)
foos.emplace_back (rand ());
std::sort (std::execution::par, foos.begin(), foos.end(), [](const Foo & left, const Foo & right)
{
return left.getValue() < right.getValue();
});
int last_foo = 0;
for (auto foo : foos)
{
if (foo.getValue () < last_foo)
{
std::cout << "NOT sorted\n";
break;
}
last_foo = foo.getValue ();
}
return 0;
}
Generates the following output every time I run it:
<nothing>
QED.
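The hand-written verification loop above can also be expressed with std::is_sorted, which makes the check harder to get subtly wrong. This is a small sketch; sorted_by_value is a hypothetical name, and Foo is reduced to an aggregate for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Foo {
    int value;
    int getValue() const { return value; }
};

// Returns true when `foos` is in non-decreasing order of getValue().
inline bool sorted_by_value(const std::vector<Foo>& foos)
{
    return std::is_sorted(foos.begin(), foos.end(),
                          [](const Foo& left, const Foo& right) {
                              return left.getValue() < right.getValue();
                          });
}

inline void sorted_by_value_example()
{
    const std::vector<Foo> ascending{{1}, {2}, {2}, {5}};
    const std::vector<Foo> broken{{3}, {1}};
    assert(sorted_by_value(ascending));
    assert(!sorted_by_value(broken));
}
```

Keep in mind that a passing check only shows the result was sorted on that run; it does not by itself prove the absence of a data race.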
Is there an std function to easily copy a vector of pointers to classes into a vector of classes or do I have to manually iterate over them?
Looking for the solution with the fastest/fewer lines of code :).
A solution that avoids copying without leaking memory is also welcomed!
I doubt there is one; below is a one-liner with for:
std::vector<X*> vec1 = { new X, new X };
std::vector<X> vec2;
vec2.reserve(vec1.size());
for (auto p : vec1) vec2.push_back(*p);
If you want to make sure no copies are made, then you can use std::reference_wrapper<>:
std::vector<std::reference_wrapper<X>> vec2;
for (auto p : vec1) vec2.push_back(*p);
but then you have to make sure no element of vec2 is accessed after vec1 elements were deallocated.
Another approach is to use unique_ptr like this:
std::vector<std::unique_ptr<X>> vec2;
for (auto p : vec1) vec2.emplace_back(p);
Now you can ditch vec1, but then why not make vec1 of type std::vector<std::unique_ptr<X>> in the first place?
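The ownership transfer can be wrapped so that the raw-pointer vector cannot be used by accident afterwards. A minimal sketch, assuming every pointer was created with new and is owned by nobody else; adopt is a hypothetical name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper: moves ownership of every raw pointer in `raw` into a
// vector of unique_ptr, then empties `raw` so nothing gets deleted twice.
template <typename T>
std::vector<std::unique_ptr<T>> adopt(std::vector<T*>& raw)
{
    std::vector<std::unique_ptr<T>> owned;
    owned.reserve(raw.size());  // allocate first, so emplace_back cannot reallocate
    for (T* p : raw)
        owned.emplace_back(p);
    raw.clear();
    return owned;
}

inline void adopt_example()
{
    std::vector<int*> raw{new int(1), new int(2)};
    const auto owned = adopt(raw);
    assert(raw.empty());
    assert(owned.size() == 2);
    assert(*owned[0] == 1 && *owned[1] == 2);
}   // the ints are released here, when `owned` goes out of scope
```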
A one-liner with no manual iteration at all:
std::for_each(vec1.begin(), vec1.end(), [&](auto x) { vec2.push_back(*x); });
(Disclaimer: I'm not 100% sure about the reference-capturing lambda syntax.)
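For what it is worth, the [&] capture in that one-liner is valid, and the auto parameter needs C++14. A variant with std::transform that also works in C++11 might look like the sketch below. It assumes no pointer in the input is null, and copy_pointees is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Copies the object behind every pointer in `ptrs` into a new vector.
template <typename T>
std::vector<T> copy_pointees(const std::vector<T*>& ptrs)
{
    std::vector<T> out;
    out.reserve(ptrs.size());
    std::transform(ptrs.begin(), ptrs.end(), std::back_inserter(out),
                   [](const T* p) {
                       assert(p != nullptr);  // precondition: no null pointers
                       return *p;
                   });
    return out;
}

inline void copy_pointees_example()
{
    int a = 1, b = 2;
    const std::vector<int*> ptrs{&a, &b};
    const std::vector<int> values = copy_pointees(ptrs);
    assert((values == std::vector<int>{1, 2}));
}
```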
You have to do it yourself. std::transform or std::for_each will help you:
#include <algorithm>
#include <vector>
#include <functional>
using namespace std::placeholders;
class Myclass{};
Myclass deref(const Myclass *mc) { return *mc;}
void append(std::vector<Myclass> &v, Myclass *mc) {v.push_back(*mc);}
int main(){
std::vector<Myclass*> foo;
std::vector<Myclass> bar;
/* foo is initialised somehow */
/* bar is initialised somehow to hold the same amount of dummy elements */
//solution with transform
std::transform (foo.begin(), foo.end(), bar.begin(), deref);
bar={};
// solution with for_each
auto bound_append = std::bind(append, std::ref(bar), _1); // std::ref, otherwise bind stores a copy of bar
std::for_each(foo.begin(), foo.end(), bound_append);
}
Compile with -std=c++11 (gcc).
Disclaimer
I don't actually propose to apply this design anywhere, but I've been curious nonetheless how one would implement this in C++, in particular given C++'s lack of reflection. (I'm simultaneously learning and experimenting with C++11 features, so please do use C++11 features where helpful.)
The Effect
What I want to achieve is almost purely cosmetic.
I'd like a class that binds itself to an arbitrary number of, say, vectors, using reference members (which, as I understand, must be initialized during construction), but provides "aliases" for accessing these vectors as members.
To give a minimal example, I want this to work—
std::vector<int> one;
std::vector<int> two;
std::vector<int> three;
Foo foo(std::make_pair('B', one),
std::make_pair('D', two),
std::make_pair('F', three));
foo.DoSomething();
where
class Foo
{
public:
// I'm using variadic templates here for sake of learning,
// but initializer lists would work just as well.
template <typename ...Tail>
Foo(std::pair<char, std::vector<int>&> head, Tail&&... tail) // : ???
{
// ???
}
virtual void DoSomething()
{
D.push_back(42);
std::cout << D[0] << std::endl;
}
private:
std::vector<int> &A, &B, &C, &D, &E, &F, &G; // and so on...
};
and also so that
std::cout << one[0] << std::endl; // outputs 42 from outside the class...
But you refuse to answer unless you know why...
Why would anyone want to do this? Well, I don't really want to do it, but the application I had in mind was something like this. Suppose I'm building some kind of a data analysis tool, and I have clients or operations people who know basic logic and C++ syntax, but don't understand OOP or anything beyond CS 101. Things would go a lot smoother if they could write their own DoSomething()s on the fly, rather than communicate every need to developers. However, it's not realistic to get them to set up UNIX accounts, teach them how to compile, and so on. So suppose instead I'd like to build an intranet web interface that lets them write the body of DoSomething() and configure what datasets they'd like to "alias" by an uppercase char, and upon submission, generates C++ for a child class of Foo that overrides DoSomething(), then builds, runs, and returns the output. (Suspiciously specific for a "hypothetical," eh? :-) Okay, something like this situation does exist in my world—however it only inspired this idea and a desire to explore it—I don't think it'd be worth actually implementing.) Obviously, this whole uppercase char ordeal isn't absolutely necessary, but it'd be a nice touch because datasets are already associated with standard letters, e.g. P for Price, Q for Quantity, etc.
The best I can do...
To be honest, I can't figure out how I'd make this work using references. I prefer using references if possible, for these reasons.
With pointers, I guess I'd do this—
class Foo
{
public:
template <typename ...Tail>
Foo(std::pair<char, std::vector<int>*> head, Tail&&... tail)
{
std::vector<int>** vectors[26] = {&A, &B, &C, &D, &E, &F, &G}; // and so on...
// I haven't learned how to use std::forward yet, but you get the picture...
// And dear lord, forgive me for what I'm about to do...
*vectors[head.first - 'A'] = head.second; // ...and the same for every pair in tail
}
virtual void DoSomething()
{
D->push_back(42);
std::cout << (*D)[0] << std::endl;
}
private:
std::vector<int> *A, *B, *C, *D, *E, *F, *G; // and so on...
};
But even that is not that elegant.
Is there a way to use references and achieve this?
Is there a way to make this more generic, e.g. use pseudo-reflection methods to avoid having to list all the uppercase letters again?
Any suggestions on alternative designs that would preserve the primary goal (the cosmetic aliasing I've described) in a more elegant or compact way?
You may use something like:
class Foo
{
public:
template <typename ...Ts>
Foo(Ts&&... ts) : m({std::forward<Ts>(ts)...}),
A(m.at('A')),
B(m.at('B'))
// and so on...
{
}
virtual void DoSomething()
{
A.push_back(42);
std::cout << A[0] << std::endl;
}
private:
std::map<char, std::vector<int>&> m;
std::vector<int> &A, &B; //, &C, &D, &E, &F, &G; // and so on...
};
but that requires that each vector is given, so
Foo(std::vector<int> (&v)[26]) : A(v[0]), B(v[1]) // and so on...
{
}
or
Foo(std::vector<int> &a, std::vector<int> &b /* And so on */) : A(a), B(b) // and so on...
{
}
seems more appropriate.
Live example.
And it seems even simpler if it is the class Foo which owns the vector, so you will just have
class Foo
{
public:
virtual ~Foo() {}
virtual void DoSomething() { /* Your code */ }
std::vector<int>& getA() { return A; }
private:
std::vector<int> A, B, C, D; // And so on
};
and provide getters to initialize internal vectors.
and then
std::vector<int>& one = foo.getA(); // or foo.A if you make the members public.
Suppose I have the following two data structures:
std::vector<int> all_items;
std::set<int> bad_items;
The all_items vector contains all known items and the bad_items set contains the bad items. These two data structures are populated entirely independently of one another.
What's the proper way to write a method that will return a std::vector<int> containing all elements of all_items not in bad_items?
Currently, I have a clunky solution that I think can be done more concisely. My understanding of STL function adapters is lacking. Hence the question. My current solution is:
struct is_item_bad {
std::set<int> const* bad_items;
bool operator() (int const i) const {
return bad_items->count(i) > 0; // bad_items is a pointer, so use ->
}
};
std::vector<int> items() const {
is_item_bad iib = { &bad_items };
std::vector<int> good_items(all_items.size());
// remove_copy_if returns the end of the copied range; erase the unused tail
good_items.erase(std::remove_copy_if(all_items.begin(), all_items.end(),
good_items.begin(), iib),
good_items.end());
return good_items;
}
Assume all_items, bad_items, is_item_bad and items() are all part of some containing class. Is there a way to write the items() getter such that:
It doesn't need temporary variables in the method?
It doesn't need the custom functor, struct is_item_bad?
I had hoped to just use the count method on std::set as a functor, but I haven't been able to divine the right way to express that w/ the remove_copy_if algorithm.
EDIT: Fixed the logic error in items(). The actual code didn't have the problem, it was a transcription error.
EDIT: I have accepted a solution that doesn't use std::set_difference since it is more general and will work even if the std::vector isn't sorted. I chose to use the C++0x lambda expression syntax in my code. My final items() method looks like this:
std::vector<int> items() const {
std::vector<int> good_items;
good_items.reserve(all_items.size());
std::remove_copy_if(all_items.begin(), all_items.end(),
std::back_inserter(good_items),
[this] (int const i) { // a data member cannot be captured by name, so capture this
return bad_items.count(i) == 1;
});
return good_items;
}
On a vector of about 8 million items the above method runs in 3.1s. I benchmarked the std::set_difference approach and it ran in approximately 2.1s. Thanks to everyone who supplied great answers.
As jeffamaphone suggested, if you can sort any input vectors, you can use std::set_difference which is efficient and less code:
#include <algorithm>
#include <set>
#include <vector>
std::vector<int>
get_good_items( std::vector<int> const & all_items,
std::set<int> const & bad_items )
{
std::vector<int> good_items;
// Assumes all_items is sorted.
std::set_difference( all_items.begin(),
all_items.end(),
bad_items.begin(),
bad_items.end(),
std::back_inserter( good_items ) );
return good_items;
}
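If all_items is not already sorted, one option is to sort a copy first. The sketch below does that; good_items_from_unsorted is a hypothetical name. Two assumptions matter: the result comes back in sorted order rather than the original order, and all_items is assumed to hold no duplicates, because std::set_difference treats its inputs as multisets and would drop only one occurrence of a bad value that appears several times.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Takes all_items by value, so the caller's vector keeps its original order.
inline std::vector<int> good_items_from_unsorted(std::vector<int> all_items,
                                                 const std::set<int>& bad_items)
{
    std::sort(all_items.begin(), all_items.end());
    std::vector<int> good_items;
    good_items.reserve(all_items.size());
    std::set_difference(all_items.begin(), all_items.end(),
                        bad_items.begin(), bad_items.end(),
                        std::back_inserter(good_items));
    return good_items;
}

inline void good_items_from_unsorted_example()
{
    const std::vector<int> all_items{4, 5, 2, 3, 8, 7};
    const std::set<int> bad_items{2, 8, 4};
    const std::vector<int> good = good_items_from_unsorted(all_items, bad_items);
    assert((good == std::vector<int>{3, 5, 7}));
}
```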
Since your function is going to return a vector, you will have to make a new vector (i.e. copy elements) in any case. In which case, std::remove_copy_if is fine, but you should use it correctly:
#include <iostream>
#include <vector>
#include <set>
#include <iterator>
#include <algorithm>
#include <functional>
std::vector<int> filter(const std::vector<int>& all, const std::set<int>& bad)
{
std::vector<int> result;
remove_copy_if(all.begin(), all.end(), back_inserter(result),
[&bad](int i){return bad.count(i)==1;});
return result;
}
int main()
{
std::vector<int> all_items = {4,5,2,3,4,8,7,56,4,2,2,2,3};
std::set<int> bad_items = {2,8,4};
std::vector<int> filtered_items = filter(all_items, bad_items);
copy(filtered_items.begin(), filtered_items.end(), std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
}
To do this in C++98, I guess you could use mem_fun_ref and bind1st to turn set::count into a functor in-line, but there are issues with that (which resulted in deprecation of bind1st in C++0x) which means depending on your compiler, you might end up using std::tr1::bind anyway:
remove_copy_if(all.begin(), all.end(), back_inserter(result),
bind(&std::set<int>::count, bad, std::tr1::placeholders::_1)); // or std::placeholders in C++0x
and in any case, an explicit function object would be more readable, I think:
struct IsMemberOf {
const std::set<int>& bad;
IsMemberOf(const std::set<int>& b) : bad(b) {}
bool operator()(int i) const { return bad.count(i)==1;}
};
std::vector<int> filter(const std::vector<int>& all, const std::set<int>& bad)
{
std::vector<int> result;
remove_copy_if(all.begin(), all.end(), back_inserter(result), IsMemberOf(bad));
return result;
}
At the risk of appearing archaic:
std::set<int> badItems;
std::vector<int> items;
std::vector<int> goodItems;
for ( std::vector<int>::iterator iter = items.begin();
iter != items.end();
++iter)
{
int& item = *iter;
if ( badItems.find(item) == badItems.end() )
{
goodItems.push_back(item);
}
}
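The same loop can be written with std::copy_if (C++11), which states the keep condition directly instead of the inverted one that remove_copy_if needs. A minimal sketch; keep_good is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Keeps every element of `items` that is not in `bad_items`, preserving
// order and duplicates. `items` does not need to be sorted.
inline std::vector<int> keep_good(const std::vector<int>& items,
                                  const std::set<int>& bad_items)
{
    std::vector<int> good_items;
    good_items.reserve(items.size());
    std::copy_if(items.begin(), items.end(), std::back_inserter(good_items),
                 [&bad_items](int item) { return bad_items.count(item) == 0; });
    return good_items;
}

inline void keep_good_example()
{
    const std::vector<int> items{4, 5, 2, 3, 4, 8, 7, 56, 4, 2, 2, 2, 3};
    const std::set<int> bad_items{2, 8, 4};
    assert((keep_good(items, bad_items) == std::vector<int>{5, 3, 7, 56, 3}));
}
```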
std::remove_copy_if returns an iterator to the target collection. In this case, it would return good_items.end() (or something similar). good_items goes out of scope at the end of the method, so this would cause some memory errors. You should return good_items or pass in a new vector<int> by reference and then clear, resize, and populate it. This would get rid of the temporary variable.
I believe you have to define the custom functor, because the predicate depends on the object bad_items, which you couldn't supply otherwise without it getting hacky, AFAIK.