The STL provides std::copy, but it is tricky to use with fixed-size output containers, as there is no bounds checking on the output iterator. So I invented my own, something like the below:
#include <iostream>
#include <iterator>

template<class InputIterator, class OutputIterator>
void safecopy( InputIterator srcStart, InputIterator srcEnd,
               OutputIterator destStart, OutputIterator destEnd )
{
    while ( srcStart != srcEnd && destStart != destEnd )
    {
        *destStart = *srcStart;
        ++srcStart;
        ++destStart;
    }
}

int main()
{
    std::istream_iterator<char> begin(std::cin), end;
    char buffer[3];
    safecopy( begin, end, buffer, buffer + 3 );
    return 0;
}
Questions:
Am I reinventing the wheel here? Is there an STL algorithm that does what I want?
Are there any deficiencies in my safecopy? Does it work for everything std::copy works for?
Let me promote my comment to an answer, so I have a bit more space.
First off, your implementation looks good.
Now, why isn't this in the standard? (The new standard adds std::copy_n, but that does something different, too.*)
Think about it like this: strncpy isn't really a "good" idea; it's just not a terrible idea. Since C doesn't have any dynamic data structures, a length-checked version is the best you can do.
But in C++ this doesn't fit nicely into the general idea of dynamic containers: You would rarely want to overwrite some elements, but rather create all elements, which you do by std::copy plus std::inserter. strncpy is a crutch which requires you to preallocate the destination data structure, but in C++ we can do a lot better than this. With dynamic containers, iterators and inserters, we can copy anything without needing to worry about allocation.
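For illustration, a minimal sketch of that copy-plus-inserter idiom (the source string is just an assumed example):

#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::string src = "hello";
    std::vector<char> dst;  // starts empty; no preallocation needed

    // back_inserter grows dst as elements arrive
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
}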
In other words, any abstract algorithm that you might conceive should have a better, more specific method of obtaining iterators and iterator ranges (think remove/erase); it is rarely the case that the ultimate goal of an algorithm is to only produce an output range that is constrained by some other destination range.
In summary: Yes, you can do that, but you can probably do better.
*) Though copy_n plus min of source and destination size could be used to create a bounded copy.
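For instance, a minimal sketch of that footnote's idea, assuming both sizes are known up front (the container choices here are only for illustration):

#include <algorithm>
#include <array>
#include <vector>

int main()
{
    std::vector<char> src{'a', 'b', 'c', 'd', 'e'};
    std::array<char, 3> dst{};

    // Copy no more elements than the source has or the destination can hold.
    std::copy_n(src.begin(), std::min(src.size(), dst.size()), dst.begin());
}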
I would make one minor adjustment to your implementation. Give it a return value. Either the final output iterator, or an integer indicating the number of elements copied.
The main use case I can see for your function is reading fixed-size chunks from an input stream when you don't know where it will end. If it does end, you need some way of knowing that, and you need to know how many elements were copied before it ended. If the number of elements copied didn't meet or exceed the size of the output range, that's how you know the input ended.
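A sketch of that adjustment, returning the final output iterator (an integer count would work equally well):

template<class InputIterator, class OutputIterator>
OutputIterator safecopy( InputIterator srcStart, InputIterator srcEnd,
                         OutputIterator destStart, OutputIterator destEnd )
{
    while ( srcStart != srcEnd && destStart != destEnd )
    {
        *destStart = *srcStart;
        ++srcStart;
        ++destStart;
    }
    // If the result equals destEnd, the output range was filled completely;
    // otherwise the input ended first, and std::distance from the original
    // destStart gives the number of elements copied.
    return destStart;
}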
Yes. You're reinventing the wheel again!
For example, you could use std::copy as:
std::copy(s.begin(), s.begin() + 3, buffer);
instead of this,
safecopy(s.begin(), s.end(), buffer, buffer + 3);
Using std::copy in this way is no less safe than your safecopy.
Or, even better, use std::copy_n, which comes with C++11:
std::copy_n(s.begin(), 3, buffer);
This works even if the argument is not a random-access iterator.
As for when you use std::vector<char>, you could use its range constructor directly:
std::vector<char> v(s.begin(), s.end());
No need for std::copy at all.
I am asking this as the other relevant questions on SO seem to be either for older versions of the C++ standard, do not mention any form of parallelization, or are focused on keeping the ordering/indexing the same as elements are removed.
I have a vector of potentially hundreds of thousands or millions of elements (which are fairly light structures, around 20 bytes assuming they're compacted down).
Due to other restrictions, it must be a std::vector; other containers (like std::forward_list) either would not work or would be even less optimal for other uses.
I recently swapped from the simple it = myVec.erase(it) approach to swap-and-pop, using something like this:
for (int i = 0; i < myVec.size();) {
    // Do calculations to determine if element must be removed
    // ...
    // Remove if needed
    if (elementMustBeRemoved) {
        myVec[i] = myVec.back();
        myVec.pop_back();
    } else {
        i++;
    }
}
This works, and was a significant improvement. It cut the runtime of the method down to ~61% of what it was previously. But I would like to improve this further.
Does C++ have a method to remove many non-consecutive elements from a std::vector efficiently? Like passing a vector of indices to erase() and have C++ do some magic under the hood to minimize movement of data?
If so, I could have threads individually gather indices that must be removed in parallel, and then combine them and pass them to erase().
Take a look at the std::remove_if algorithm. You could use it like this:
auto firstToErase = std::remove_if(myVec.begin(), myVec.end(),
    [](const T& x) {
        // Do calculations to determine if element must be removed
        // ...
        return elementMustBeRemoved;
    });
myVec.erase(firstToErase, myVec.end());
cppreference says that the following code is a possible implementation of remove_if:
template<class ForwardIt, class UnaryPredicate>
ForwardIt remove_if(ForwardIt first, ForwardIt last, UnaryPredicate p)
{
    first = std::find_if(first, last, p);
    if (first != last)
        for (ForwardIt i = first; ++i != last; )
            if (!p(*i))
                *first++ = std::move(*i);
    return first;
}
Instead of swapping with the last element, it moves continuously through the container, building up a range of elements to be erased until that range sits at the very end of the vector. This looks like a more cache-friendly solution, and you might notice a performance improvement on a very big vector.
If you want to experiment with a parallel version, there is an overload (4) which allows you to specify an execution policy.
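For example, a sketch of that parallel overload (requires C++17 and a standard library with parallel algorithm support; the predicate here is only a placeholder):

#include <algorithm>
#include <execution>
#include <vector>

void compact(std::vector<int>& myVec)
{
    auto firstToErase = std::remove_if(std::execution::par,
                                       myVec.begin(), myVec.end(),
                                       [](int x) { return x % 2 == 0; });
    myVec.erase(firstToErase, myVec.end());
}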
Or, since C++20, you can type slightly less and use std::erase_if. However, in that case you lose the option to choose an execution policy.
Is there an even faster approach than swap-and-pop for erasing from std::vector?
Ever since C++11, the optimal removal of a single element from a vector without preserving order has been move-and-pop rather than swap-and-pop.
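In code, move-and-pop looks like this (using the question's myVec and index i; when i is already the last index this is a self-move, which is harmless here because the element is destroyed immediately afterwards):

myVec[i] = std::move(myVec.back());
myVec.pop_back();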
Does C++ have a method to remove many non-consecutive elements from a std::vector efficiently?
The remove-erase idiom (std::erase_if in C++20) is the most efficient that the standard provides. std::remove_if does preserve order, and if you don't care about that, then a more efficient algorithm may be possible. But the standard library does not come with an unstable remove out of the box. The algorithm goes as follows (a sketch follows the list):
Find first element to be removed (a)
Find last element to not be removed (b)
Move b to a.
Repeat between a and b until iterators meet.
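A minimal sketch of those four steps (the name unstable_remove_if is assumed; it is not a standard algorithm):

#include <utility>

template<class BidirIt, class UnaryPredicate>
BidirIt unstable_remove_if(BidirIt first, BidirIt last, UnaryPredicate p)
{
    for (;;)
    {
        // (a) find the first element to be removed
        while (first != last && !p(*first)) ++first;
        if (first == last) break;
        // (b) find the last element to be kept
        do { --last; } while (first != last && p(*last));
        if (first == last) break;
        // move b into a's slot; relative order is not preserved
        *first = std::move(*last);
        ++first;
    }
    return first;  // new logical end
}

Use it like the remove-erase idiom: myVec.erase(unstable_remove_if(myVec.begin(), myVec.end(), pred), myVec.end());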
There is a proposal, P0048, to add such an algorithm to the standard library, and there is a demo implementation at https://github.com/WG21-SG14/SG14/blob/6c5edd5c34e1adf42e69b25ddc57c17d99224bb4/SG14/algorithm_ext.h#L84
I have a trivial function that copies a byte block to a std::vector:
std::vector<uint8_t> v;

void Write(const uint8_t* buffer, size_t count)
{
    //std::copy(buffer, buffer + count, std::back_inserter(v));
    v.insert(v.end(), buffer, buffer + count);
}

v.reserve(<buffer size>);
v.resize(0);
Write(<some buffer>, <buffer size>);
If I use std::vector<uint8_t>::insert, it works 5 times faster than if I use std::copy.
I tried to compile this code with MSVC 2015 with optimization enabled and disabled, and got the same result.
Looks like something is strange with the std::copy or std::back_inserter implementation.
The standard library implementation is written with performance in mind, but that performance is only achieved when optimization is on.
//This reduces the performance dramatically if the optimization is switched off.
Trying to measure a function's performance with optimization off is as pointless as asking ourselves whether the law of gravitation would still be true if there were no mass left in the Universe.
The call to v.insert is calling a member function of the container. The member function knows how the container is implemented, so it can do things that a more generic algorithm can't do. In particular, when inserting a range of values designated by random-access iterators into a vector, the implementation knows how many elements are being added, so it can resize the internal storage once and then just copy the elements.
The call to std::copy with an insert iterator, on the other hand, has to call push_back for each element (that is what std::back_inserter's iterator does on assignment). It can't preallocate, because std::copy works with sequences, not containers; it doesn't know how to adjust the size of the container. So for large insertions into a vector, the internal storage gets resized each time the vector is full and a new insertion is needed. The overhead of that reallocation is amortized constant time, but the constant is much larger than when only one resizing is done.
With the call to reserve (which I overlooked, thanks @ChrisDrew), the overhead of reallocating is not as significant. But the implementation of insert knows how many values are being copied, it knows that those values are contiguous in memory (because the iterator is a pointer), and it knows that the values are trivially copyable, so it will use std::memcpy to blast the bits in all at once. With std::copy, none of that applies; the back inserter has to check whether a reallocation is necessary, and that code can't be optimized out, so you end up with a loop that copies one element at a time, checking for the end of the allocated space on each element. That's much more expensive than a plain std::memcpy.
In general, the more the algorithm knows about the internals of the data structure that it's accessing, the faster it can be. STL algorithms are generic, and the cost of that genericity can be more overhead than that of a container-specific algorithm.
With a good implementation of std::vector, v.insert(v.end(), buffer, buffer + count); might be implemented as:
size_t count = last - first;
size_t offset = size();
resize(offset + count);                 // at most one reallocation
memcpy(data() + offset, first, count);  // bulk copy of the raw bytes
std::copy(buffer, buffer + count, std::back_inserter(v)) on the other hand will be implemented as:
while ( first != last )
{
    *output++ = *first++;
}

which is equivalent to:

while ( first != last )
{
    v.push_back( *first++ );
}

or (roughly):

while ( first != last )
{
    // push_back should be slightly more efficient than this
    v.resize( v.size() + 1 );
    v.back() = *first++;
}
Whilst in theory the compiler could optimise the above into a memcpy, it's unlikely to. At best you'll probably get the methods inlined so that you don't have a function-call overhead, but it will still be writing one byte at a time, whereas a memcpy will normally use vector instructions to copy multiple bytes at once.
Possible Duplicate:
Why use iterators instead of array indices?
I'm reviewing my knowledge of C++ and I've stumbled upon iterators. One thing I want to know is what makes them so special, and I want to know why this:
using namespace std;

vector<int> myIntVector;
vector<int>::iterator myIntVectorIterator;

// Add some elements to myIntVector
myIntVector.push_back(1);
myIntVector.push_back(4);
myIntVector.push_back(8);

for (myIntVectorIterator = myIntVector.begin();
     myIntVectorIterator != myIntVector.end();
     myIntVectorIterator++)
{
    cout << *myIntVectorIterator << " ";
    // Should output 1 4 8
}
is better than this:
using namespace std;

vector<int> myIntVector;

// Add some elements to myIntVector
myIntVector.push_back(1);
myIntVector.push_back(4);
myIntVector.push_back(8);

for (int y = 0; y < myIntVector.size(); y++)
{
    cout << myIntVector[y] << " ";
    // Should output 1 4 8
}
And yes I know that I shouldn't be using the std namespace. I just took this example off of the cprogramming website. So can you please tell me why the latter is worse? What's the big difference?
The special thing about iterators is that they provide the glue between algorithms and containers. For generic code, the recommendation would be to use a combination of STL algorithms (e.g. find, sort, remove, copy) etc. that carries out the computation that you have in mind on your data structure (vector, list, map etc.), and to supply that algorithm with iterators into your container.
Your particular example could be written as a combination of the for_each algorithm and the vector container (see option 3 below), but it's only one out of four distinct ways to iterate over a std::vector:
1) index-based iteration
for (std::size_t i = 0; i != v.size(); ++i) {
    // access element as v[i]
    // any code including continue, break, return
}
Advantages: familiar to anyone familiar with C-style code, can loop using different strides (e.g. i += 2).
Disadvantages: only for sequential random access containers (vector, array, deque), doesn't work for list, forward_list or the associative containers. Also the loop control is a little verbose (init, check, increment). People need to be aware of the 0-based indexing in C++.
2) iterator-based iteration
for (auto it = v.begin(); it != v.end(); ++it) {
    // if the current index is needed:
    auto i = std::distance(v.begin(), it);
    // access element as *it
    // any code including continue, break, return
}
Advantages: more generic, works for all containers (even the new unordered associative containers), can also use different strides (e.g. std::advance(it, 2));
Disadvantages: need extra work to get the index of the current element (could be O(N) for list or forward_list). Again, the loop control is a little verbose (init, check, increment).
3) STL for_each algorithm + lambda
std::for_each(v.begin(), v.end(), [&](T const& elem) {
    // if the current index is needed:
    auto i = &elem - &v[0];
    // cannot continue, break or return out of the loop
});
Advantages: same as 2), plus a small reduction in loop control (no check and increment); this can greatly reduce your bug rate (wrong init, check or increment, off-by-one errors).
Disadvantages: same as explicit iterator-loop plus restricted possibilities for flow control in the loop (cannot use continue, break or return) and no option for different strides (unless you use an iterator adapter that overloads operator++).
4) range-for loop
for (auto& elem : v) {
    // if the current index is needed:
    auto i = &elem - &v[0];
    // any code including continue, break, return
}
Advantages: very compact loop control, direct access to the current element.
Disadvantages: extra statement to get the index. Cannot use different strides.
What to use?
For your particular example of iterating over std::vector: if you really need the index (e.g. access the previous or next element, printing/logging the index inside the loop etc.) or you need a stride different than 1, then I would go for the explicitly indexed-loop, otherwise I'd go for the range-for loop.
For generic algorithms on generic containers I'd go for the explicit iterator loop unless the code contained no flow control inside the loop and needed stride 1, in which case I'd go for the STL for_each + a lambda.
With a vector, iterators do not offer any real advantage. The syntax is uglier, longer to type and harder to read.
Iterating over a vector using iterators is not faster and is not safer (actually, if the vector is possibly resized during the iteration, using iterators will put you in big trouble).
The idea of having a generic loop that will still work when you later change the container type is also mostly nonsense in real cases. Unfortunately, the dark side of a strictly typed language without serious type inference (a bit better now with C++11, however) is that you need to say what the type of everything is at each step. If you change your mind later, you will still need to go around and change everything. Moreover, different containers have very different trade-offs, and changing container type is not something that happens that often.
The only case in which iteration should, if possible, be kept generic is when writing template code, but that (I hope for you) is not the most frequent case.
The only problem with your explicit index loop is that size returns an unsigned value (a design bug of C++), and comparison between signed and unsigned is dangerous and surprising, so it is better avoided. If you use a decent compiler with warnings enabled, there should be a diagnostic on that.
Note that the solution is not to use an unsigned value as the index, because arithmetic between unsigned values is also apparently illogical (it's modulo arithmetic, and x-1 may be bigger than x). You instead should cast the size to an integer before using it.
It may make some sense to use unsigned sizes and indexes (paying a LOT of attention to every expression you write) only if you're working on a 16-bit C++ implementation (16 bits were the reason for having unsigned values in sizes).
As a typical mistake that unsigned size may introduce consider:
void drawPolyline(const std::vector<P2d>& points)
{
    for (int i = 0; i < points.size() - 1; i++)
        drawLine(points[i], points[i+1]);
}
The bug is present because if you pass an empty points vector, the value points.size()-1 will be a huge positive number, sending the loop into a segfault.
A working solution could be
for (int i = 1; i < points.size(); i++)
    drawLine(points[i - 1], points[i]);
but I personally prefer to always remove the unsigned-ness with int(v.size()).
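For instance, a sketch of the original forward loop fixed by that cast (for an empty vector, int(points.size()) - 1 is -1, so the body never runs):

for (int i = 0; i < int(points.size()) - 1; i++)
    drawLine(points[i], points[i + 1]);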
PS: If you really don't want to think through the implications yourself and simply want an expert to tell you what to do, then consider that quite a few world-recognized C++ experts have expressed the opinion that unsigned values are a bad idea except for bit manipulation.
Discovering the ugliness of using iterators when iterating up to the second-to-last element is left as an exercise for the reader.
Iterators make your code more generic.
Every standard library container provides an iterator, hence if you change your container class in the future, the loop won't be affected.
Iterators are the first choice over operator[]. C++11 provides the std::begin() and std::end() free functions.
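For example, a small sketch of those free functions, which also work on raw arrays:

#include <iterator>
#include <vector>

int arr[] = {1, 4, 8};
std::vector<int> vec = {1, 4, 8};

auto a = std::begin(arr);  // int*, even though arrays have no member begin()
auto v = std::begin(vec);  // same as vec.begin()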
As your code uses just std::vector, I can't say there is much difference between the two versions; however, operator[] may not operate as you intend. For example, with a map, operator[] will insert an element if the key is not found.
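A small example of that pitfall (the map m is just for illustration):

#include <map>
#include <string>

std::map<std::string, int> m;
int x = m["missing"];  // silently inserts {"missing", 0} and returns 0
// m.size() is now 1, even though we only meant to read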
Also, by using iterators your code becomes more portable between containers. You can switch containers from std::vector to std::list or another container freely without changing much if you use iterators; no such rule applies to operator[].
It always depends on what you need.
You should use operator[] when you need direct access to elements in the vector (when you need to index a specific element). There is nothing wrong with using it over iterators. However, you must decide for yourself which (operator[] or iterators) suits your needs best.
Using iterators would enable you to switch to other container types without much change in your code. In other words, using iterators would make your code more generic, and does not depend on a particular type of container.
By writing your client code in terms of iterators you abstract away the container completely.
Consider this code:
class ExpressionParser // some generic arbitrary expression parser
{
public:
    template<typename It>
    void parse(It begin, const It end)
    {
        using namespace std;
        using namespace std::placeholders;
        for_each(begin, end,
                 bind(&ExpressionParser::process_next, this, _1));
    }
    // process next char in a stream (defined elsewhere)
    void process_next(char c);
};
client code:
ExpressionParser p;

std::string expression("SUM(A) FOR A in [1, 2, 3, 4]");
p.parse(expression.begin(), expression.end());

std::ifstream file("expression.txt");
p.parse(std::istream_iterator<char>(file), std::istream_iterator<char>());

char expr[] = "[12a^2 + 13a - 5] with a=108";
p.parse(std::begin(expr), std::end(expr));
Edit: Consider your original code example, implemented with std::copy and an ostream_iterator:
using namespace std;

vector<int> myIntVector;

// Add some elements to myIntVector
myIntVector.push_back(1);
myIntVector.push_back(4);
myIntVector.push_back(8);

copy(myIntVector.begin(), myIntVector.end(),
     std::ostream_iterator<int>(cout, " "));
The nice thing about iterators is that, later on, if you want to switch your vector to another STL container, the for loop will still work.
It's a matter of speed. Using the iterator accesses the elements faster. A similar question was answered here:
What's faster, iterating an STL vector with vector::iterator or with at()?
Edit: speed of access varies with each CPU and compiler.
I am struggling with this piece of code:
std::queue<char> output_queue;
std::string output_string;
// put stuff into output_queue
while (!output_queue.empty())
{
    output_string.insert(0, (output_queue.front()));
    output_queue.pop();
}
I somehow can't do this since std::queue<char>::front() will return a char& and I can't put this into std::string.
You're missing an argument to make insert insert a character. You need to specify how many of that character:
output_string.insert(0, 1, output_queue.front());
If you want to make it easier on yourself, you can also use std::deque instead of std::queue and replace it with this:
std::deque<char> output_queue;
//fill output_queue in same way, but use push/pop_front/back instead of push/pop
std::string output_string(output_queue.begin(), output_queue.end());
output_queue.clear();
It would nearly be the same thing as now because your queue is actually using a std::deque by default under the hood. The deque, however, supports iterators, which makes this possible without ugly code that relies on the underlying storage.
You may use

output_string += output_queue.front();

and then reverse the string after the while loop.
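A minimal sketch of that append-then-reverse approach:

#include <algorithm>  // std::reverse

while (!output_queue.empty())
{
    output_string += output_queue.front();
    output_queue.pop();
}
std::reverse(output_string.begin(), output_string.end());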
What is the fastest way (if there is any) to convert a std::vector from one datatype to another (with the idea of saving space)? For example:
std::vector<unsigned short> ----> std::vector<bool>
We obviously assume that the first vector only contains 0s and 1s. Copying element by element is highly inefficient in the case of a really large vector.
Conditional question:
If you think there is no way to do it faster, is there a complex datatype which actually allows fast conversion from one datatype to another?
std::vector<bool>
Stop.
A std::vector<bool> is... not. std::vector has a specialization for the use of the type bool, which causes certain changes in the vector. Namely, it stops acting like a std::vector.
There are certain things that the standard guarantees you can do with a std::vector. And vector<bool> violates those guarantees. So you should be very careful about using them.
Anyway, I'm going to pretend you said vector<int> instead of vector<bool>, as the latter really complicates things.
Copying element by element is highly inefficient in case of a really large vector.
Only if you do it wrong.
Converting a vector to the type you want needs to be done carefully to be efficient.
If the source T type is convertible to the destination T, then this works just fine:
vector<Tnew> vec_new(vec_old.begin(), vec_old.end());
Decent implementations should recognize when they've been given random-access iterators and optimize the memory allocation and loop appropriately.
For non-convertible types, the biggest trap you'll fall into with simple types is doing this:
std::vector<int> newVec(oldVec.size());
That's bad. That will allocate a buffer of the proper size, but it will also fill it with data. Namely, default-constructed ints (int()).
Instead, you should do this:
std::vector<int> newVec;
newVec.reserve(oldVec.size());
This reserves capacity equal to the original vector, but it also ensures that no default construction takes place. You can now push_back to your heart's content, knowing that you will never cause a reallocation in your new vector.
From there, you can just loop over each entry in the old vector, doing the conversion as needed.
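A sketch of that loop, assuming unsigned short to int as the example conversion:

#include <vector>

std::vector<int> convert(const std::vector<unsigned short>& oldVec)
{
    std::vector<int> newVec;
    newVec.reserve(oldVec.size());  // capacity reserved, no default construction
    for (unsigned short v : oldVec)
        newVec.push_back(static_cast<int>(v));
    return newVec;
}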
There's no way to avoid the copy, since a std::vector<T> is a distinct type from std::vector<U>, and there's no way for them to share the memory. Other than that, it depends on how the data is mapped. If the mapping corresponds to an implicit conversion (e.g. unsigned short to bool), then simply creating a new vector using the begin and end iterators from the old one will do the trick:

std::vector<bool> newV( oldV.begin(), oldV.end() );
If the mapping isn't just an implicit conversion (and this includes cases where you want to verify things, e.g. that the unsigned short does contain only 0 or 1), then it gets more complicated. The obvious solution would be to use std::transform:

std::vector<TargetType> newV;
newV.reserve( oldV.size() );  // avoids unnecessary reallocations
std::transform( oldV.begin(), oldV.end(),
                std::back_inserter( newV ),
                TransformationObject() );
where TransformationObject is a functional object which does the transformation, e.g.:

struct ToBool : public std::unary_function<unsigned short, bool>
{
    bool operator()( unsigned short original ) const
    {
        if ( original != 0 && original != 1 )
            throw Something();
        return original != 0;
    }
};
(Note that I'm just using this transformation function as an example. If the only thing which distinguishes the transformation function from an implicit conversion is the verification, it might be faster to verify all of the values in oldV first, using std::for_each, and then use the two-iterator constructor above.)
Depending on the cost of default constructing the target type, it may be faster to create the new vector with the correct size, then overwrite it:

std::vector<TargetType> newV( oldV.size() );
std::transform( oldV.begin(), oldV.end(),
                newV.begin(),
                TransformationObject() );
Finally, another possibility would be to use a boost::transform_iterator. Something like:

std::vector<TargetType> newV(
    boost::make_transform_iterator( oldV.begin(), TransformationObject() ),
    boost::make_transform_iterator( oldV.end(), TransformationObject() ) );

In many ways, this is the solution I prefer; depending on how boost::transform_iterator has been implemented, it could also be the fastest.
You should be able to use assign like this:
vector<unsigned short> v;
//...
vector<bool> u;
//...
u.assign(v.begin(), v.end());
class A { ... };
class B { ... };

B convert_A_to_B(const A& a) { ... }

void convertVector_A_to_B(const vector<A>& va, vector<B>& vb)
{
    vb.clear();
    vb.reserve(va.size());
    std::transform(va.begin(), va.end(), std::back_inserter(vb), convert_A_to_B);
}
The fastest way to do it is to not do it. For example, if you know in advance that your items only need a byte for storage, use a byte-size vector to begin with. You'll find it difficult to find a faster way than that :-)
If that's not possible, then just absorb the cost of the conversion. Even if it's a little slow (and that's by no means certain, see Nicol's excellent answer for details), it's still necessary. If it wasn't, you would just leave it in the larger-type vector.
First, a warning: Don't do what I'm about to suggest. It's dangerous and must never be done. That said, if you just have to squeeze out a tiny bit more performance No Matter What...
First, there are some caveats. If you don't meet these, you can't do this:
The vector must contain plain-old-data. If your type has pointers, or uses a destructor, or needs an operator= to copy correctly ... do not do this.
The sizeof() of both vectors' contained types must be the same. That is, vector< A > can copy from vector< B > only if sizeof(A) == sizeof(B).
Here is a fairly stable method:
vector< A > a;
vector< B > b;

a.resize( b.size() );
assert( sizeof(vector< A >::value_type) == sizeof(vector< B >::value_type) );
if ( b.size() == 0 )
    a.clear();
else
    memcpy( &(*a.begin()), &(*b.begin()), b.size() * sizeof(B) );
This does a very fast, block copy of the memory contained in vector b, directly smashing whatever data you have in vector a. It doesn't call constructors, it doesn't do any safety checking, and it's much faster than any of the other methods given here. An optimizing compiler should be able to match the speed of this in theory, but unless you're using an unusually good one, it won't (I checked with Visual C++ a few years ago, and it wasn't even close).
Also, given these constraints, you could forcibly (via void *) cast one vector type to the other and swap them -- I had a code sample for that, but it started oozing ectoplasm on my screen, so I deleted it.
Copying element by element is not highly inefficient. std::vector provides constant access time to any of its elements, hence the operation will be O(n) overall. You will not notice it.
#ifdef VECTOR_H_TYPE1
#ifdef VECTOR_H_TYPE2
#ifdef VECTOR_H_CLASS

/* Other methods can be added as needed, provided they likewise carry out the same operations on both */
#include <vector>
using namespace std;

class VECTOR_H_CLASS {
public:
    vector<VECTOR_H_TYPE1> *firstVec;
    vector<VECTOR_H_TYPE2> *secondVec;

    VECTOR_H_CLASS(vector<VECTOR_H_TYPE1> &v1, vector<VECTOR_H_TYPE2> &v2) { firstVec = &v1; secondVec = &v2; }
    ~VECTOR_H_CLASS() {}

    void init() { // Use this to copy a full vector into an empty (or garbage) vector to equalize them
        secondVec->clear();
        for (vector<VECTOR_H_TYPE1>::iterator it = firstVec->begin(); it != firstVec->end(); it++)
            secondVec->push_back((VECTOR_H_TYPE2)*it);
    }

    void push_back(void *value) {
        firstVec->push_back((VECTOR_H_TYPE1)value);
        secondVec->push_back((VECTOR_H_TYPE2)value);
    }

    void pop_back() {
        firstVec->pop_back();
        secondVec->pop_back();
    }

    void clear() {
        firstVec->clear();
        secondVec->clear();
    }
};

#undef VECTOR_H_CLASS
#endif
#undef VECTOR_H_TYPE2
#endif
#undef VECTOR_H_TYPE1
#endif