Histogram of integer without looping - c++

I was wondering if there is any STL algorithm that produces the same result as the following code:
std::vector<int> data;
std::vector<int> counter(N); //I know in advance that all values in data
//are between 0 and N-1
for(int i=0; i<data.size(); ++i)
counter[data[i]]++;
This code simply outputs the histogram of my integer data, with pre-defined bin size equal to one.
I know that I should avoid loops as much as I can, since the equivalent STL algorithms are much better optimized than what the majority of C++ programmers may come up with.
Any suggestions?
Thank you in advance, Giuseppe

Well, you can certainly at least clean up the loop a bit:
for (auto i : data)
    ++counter[i];
You could (for example) use std::for_each instead:
std::for_each(data.begin(), data.end(), [&counter](int i) { ++counter[i]; });
...but that doesn't really look like much (if any) of an improvement to me.

I don't think there's a more efficient way of doing this. You're right about avoiding loops and preferring the STL in most cases, but that advice mainly applies to bigger, overly complicated loops, which are harder to write and maintain and are therefore less likely to be optimal.
Looking at the problem at the assembly level, the only way to solve it is exactly the way you have it in your example. Since C/C++ loops translate to assembly very efficiently, with no unnecessary overhead, I believe no STL function could perform this faster than your algorithm.
There is an STL function called count, but its complexity is linear (O(n)), and so is your solution's.
If you really want to squeeze out the maximum of every CPU-cycle, then consider using C-style arrays, and a separate counter variable. The overhead introduced by vectors is barely even measurable, but if any, that's the only opportunity I see for optimization here. Not that I would suggest it, but I'm afraid that's the only way you can get a hair more speed out of this.

If you think about it, in order to count the occurrences of elements in a vector, each element has to be "visited" at least once; there's no avoiding it.
A simple loop like this is already the most efficient. You can try to unroll it, but that's probably the best you can do. STL or not, I doubt if there's a better algorithm.
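To make the "each element is visited once" point concrete, here is a minimal sketch of the same loop wrapped in a function. The name histogram and the bins parameter are illustrative choices, not anything from the question, and the assert only spells out the question's stated assumption that every value lies between 0 and N-1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often each value in [0, bins) occurs in data.
// Assumption (from the question): every value lies in [0, bins).
std::vector<int> histogram(const std::vector<int>& data, std::size_t bins) {
    std::vector<int> counter(bins, 0);
    for (int value : data) {
        // Checked only in debug builds; guards against out-of-range input.
        assert(value >= 0 && static_cast<std::size_t>(value) < bins);
        ++counter[static_cast<std::size_t>(value)];  // one visit per element
    }
    return counter;
}
```

Calling histogram(data, N) does one pass over data, so the cost is linear in data.size() plus the cost of zero-initializing the N bins.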

You can use for_each with a lambda function. Check this example:
#include <algorithm>
#include <vector>
#include <cstdlib>
#include <ctime>
#include <iostream>
const int N = 10;
using namespace std;
int main()
{
srand(time(0));
std::vector<int> counter(N);
std::vector<int> data(N);
generate(data.begin(),data.end(),[]{return rand()%N;});
for (int i = 0;i<N;i++)
cout<<data[i]<<endl;
cout<<endl;
for_each(data.begin(),data.end(),[&counter](int i){++counter[i];});
for (int i = 0;i<N;i++)
cout<<counter[i]<<endl;
}

Related

How to parallelize a plain for loop using the C++ standard library

I feel kind of dumb having to ask this, but I just can't find a non-convoluted way to do this.
I have the following loop:
for (int i = 0; i < count; ++i) {
if (myFunc(i))
continue;
myOtherFunc(i);
}
Parallelizing this with OpenMP is trivial: Just add #pragma omp parallel for before the loop.
I wanted to compare the performance of OMP (and its different schedules) to MSVC's parallel <algorithms> implementation (i.e. using C++17 execution policies). The straightforward idea would be to use std::for_each, but I can't figure out a good way to transform this super plain for loop into any appropriate <algorithm> thing that I can throw an execution policy at.
Notably, you can't just do
std::for_each(std::execution::par, 0, count, [](int i){ /*...*/ });
because you must supply iterators (i.e. something that yields the i argument when dereferenced).
I could std::iota into a std::vector of int just so I have a range of indices to iterate over. That would be absurd though.
I could use std::generate_n with some dummy output iterator that discards whatever it is assigned. Since I don't think that's available in std I would have to write the full dummy iterator myself. And this would of course be a stupid hack regardless. And getting a hold of the correct index would probably require manual tracking with a std::atomic<int> because you don't get to know your current index.
I really don't have a container to loop over. I mean, somewhere deep inside those functions there are containers, but restructuring everything just so I can use iterators over some container in this loop is out of the question.
15 minutes of googling different descriptions of this didn't get me anywhere.
Is there some way to match the most plain and basic for loop with <algorithm> tools that doesn't involve silly nonsense?
If you are using Boost, you can use boost::irange (from Boost.Range) to produce the counting loop like this:
auto ints = boost::irange(0, count);
std::for_each_n(POLICY, ints.begin(), boost::size(ints), [](int i)
{
if (!myFunc(i)) {
myOtherFunc(i);
}
}
);

Swap columns in c++

I have an std matrix defined as:
std::vector<std::vector<double> > Qe(6,std::vector<double>(6));
and a vector v that is:
v{0, 1, 3, 2, 4, 5};
I would like to swap the columns 3 and 2 of matrix Qe like indicated in vector v.
In Matlab this is as easy as writing Qe=Qe(:,v);
I wonder if there is an easy way other than a for loop to do this in c++.
Thanks in advance.
Given that you've implemented this as a vector of vectors, you can use a simple swap:
std::swap(Qe[2], Qe[3]);
This should have constant complexity. Of course, this will depend on whether you're treating your data as column-major or row-major. If you're going to be swapping columns often, however, you'll want to arrange the data to suit that (i.e., to allow the code above to work).
As far as doing the job without a for loop when you're using row-major ordering (the usual for C++), you can technically eliminate the for loop (at least from your source code) by using a standard algorithm instead:
std::for_each(Qe.begin(), Qe.end(), [](std::vector<double> &v) {std::swap(v[2], v[3]); });
This doesn't really change what's actually happening though--it just hides the for loop itself inside a standard algorithm. In this case, I'd probably prefer a range-based for loop:
for (auto &v : Qe)
std::swap(v[2], v[3]);
...but I've never been particularly fond of std::for_each, and when C++11 added range-based for loops, I think that was a superior alternative to the vast majority of cases where std::for_each might previously have been a reasonable possibility (IOW, I've never seen much use for std::for_each, and see almost none now).
Depends on how you implement your matrix.
If you have a vector of columns, you can swap the column references. O(1)
If you have a vector of rows, you need to swap the elements inside each row using a for loop. O(n)
std::vector<std::vector<double>> can be used as a matrix but you also need to define for yourself whether it is a vector of columns or vector of rows.
You can create a function for this so you don't write a for loop each time. For example, you can write a function which receives a matrix which is a vector of columns and a reordering vector (like v) and based on the reordering vector you create a new matrix.
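// Untested and inefficient (copies every column), just an example:
vector<vector<double>> ReorderColumns(const vector<vector<double>>& A, const vector<int>& order)
{
    vector<vector<double>> B;
    B.reserve(order.size());
    for (size_t i = 0; i < order.size(); i++)
    {
        B.push_back(A[order[i]]); // B starts empty, so append instead of indexing
    }
    return B;
}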
Edit: If you want to do linear algebra there are libraries that can help you, you don't need to write everything yourself. There are math libraries for other purposes too.
If you are in a row scenario, the following would probably work:
// To be tested
std::vector<std::vector<double> >::iterator it;
for (it = Qe.begin(); it != Qe.end(); ++it)
{
    std::swap((*it)[2], (*it)[3]); // *it is one row; a vector iterator has no ->second
}
In this scenario I don't see any other solution that would avoid an O(n) loop.
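For the row scenario, the full reordering described by v (the Matlab Qe(:,v) from the question) can be sketched as below. This is an illustration under the assumption that the matrix is stored as a vector of rows; the names Matrix and permute_columns are made up for the example. It still loops over every row, so it is O(rows × columns).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns a copy of m (a vector of rows) whose column j is column order[j]
// of the input, i.e. Matlab's m(:, order) with 0-based indices.
Matrix permute_columns(const Matrix& m, const std::vector<std::size_t>& order) {
    Matrix result;
    result.reserve(m.size());
    for (const auto& row : m) {
        std::vector<double> new_row;
        new_row.reserve(order.size());
        for (std::size_t source : order) {
            assert(source < row.size());  // order must only name existing columns
            new_row.push_back(row[source]);
        }
        result.push_back(std::move(new_row));
    }
    return result;
}
```

With the question's data, Qe = permute_columns(Qe, {0, 1, 3, 2, 4, 5}); would exchange columns 2 and 3 and leave the others in place.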

Efficient way to rotate array clockwise and counterclockwise

I am rotating an array or vector clockwise and counterclockwise in C++. What is the most efficient way, in terms of time complexity, to do that?
I used the rotate() function, but I want to know whether there are any faster methods.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{
vector<int> v;
for(int i=0;i<5;i++)
v.push_back(i);
int d=2;
rotate(v.begin(),v.begin()+d,v.end());
return 0;
}
rotate() is a linear time function and that is the best you can do.
However, if you need to do multiple rotations, you can accumulate them.
For example, a rotation by 4 followed by a rotation by 5 is the same as a single rotation by 9.
In fact, in some applications you may not even need to actually rotate.
If you want to rotate by d, you can just write a function that returns v[(i+d)%v.size()] when asked for v[i]. This is a constant-time solution. But like I said, this is application specific.
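A minimal sketch of both ideas (accumulating rotations and mapping indices instead of moving data) might look like this. RotatedView is a hypothetical helper written for this example, not a standard type, and it assumes the underlying vector outlives the view and is not resized while the view is in use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only view of a vector that behaves as if it had been rotated left.
class RotatedView {
public:
    explicit RotatedView(const std::vector<int>& data) : data_(data) {}

    // Accumulates a further left rotation by d positions. O(1); no element moves.
    void rotate_left(std::size_t d) {
        if (!data_.empty()) {
            offset_ = (offset_ + d % data_.size()) % data_.size();
        }
    }

    // Element that would sit at index i after all rotations applied so far.
    int operator[](std::size_t i) const {
        assert(i < data_.size());
        return data_[(i + offset_) % data_.size()];
    }

private:
    const std::vector<int>& data_;  // not owned; must outlive the view
    std::size_t offset_ = 0;        // total left rotation, reduced modulo size
};
```

Each access pays for one addition and one modulo, so whether this beats a real std::rotate depends on how often the data is read after rotating.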
General answer for the "can I make XY faster?" kind of question:
Maybe you can. But probably you shouldn't.
std::rotate is designed to be efficient for the average case. That means, if you have a very specific case, it might be possible to have a more performant implementation for that case.
BUT:
Don't bother to search for a more performant implementation for your specific case, because finding that specific implementation will require you to know about the detailed performance of the steps you take and about the optimizations your compiler will perform.
Don't bother to implement it, because you will have to test it and still your coverage of corner cases won't be as good as the tests already performed with the standard library implementation.
Don't use it, because someone will be irritated, asking themselves why you rolled your own implementation instead of just using the standard library. And someone else will use it for a case where it is not as performant as the standard implementation.
Don't invest time to improve the performance of a clear piece of code, unless you are 100% sure that it is a performance bottleneck. 100% sure means, you have used a profiler and pinpointed the exact location of the bottleneck.
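In that spirit, both directions can be covered with std::rotate alone. The sketch below assumes that "counterclockwise" means a left rotation and "clockwise" a right rotation, which the question does not spell out; the wrapper names are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Left rotation by d: the element at index d becomes the first element.
void rotate_left(std::vector<int>& v, std::size_t d) {
    if (v.empty()) return;
    d %= v.size();  // rotating by a multiple of the size is a no-op
    std::rotate(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(d), v.end());
}

// Right rotation by d: the last d elements move to the front.
// Same algorithm, applied through reverse iterators.
void rotate_right(std::vector<int>& v, std::size_t d) {
    if (v.empty()) return;
    d %= v.size();
    std::rotate(v.rbegin(), v.rbegin() + static_cast<std::ptrdiff_t>(d), v.rend());
}
```

Both wrappers stay linear in the size of the vector, matching the complexity of std::rotate itself.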
#include <iostream>
using namespace std;
// Rotates arr (of length n) left by one position.
void rotatebyone(int arr[], int n) {
    int temp = arr[0], i;
    for(i=0;i<n-1;i++) // stop at n-1 so arr[i+1] stays in bounds
    {
        arr[i]=arr[i+1];
    }
    arr[n-1]=temp; // the old first element becomes the last one
}
int main()
{
    int arr[]={2,3,4,5,6,7,8};
    int m= sizeof(arr)/sizeof(arr[0]);
    int i;
    int d=1;
    for(i=0;i<d;i++) { // call the function d times to rotate by d
        rotatebyone(arr,m);
    }
    for(i=0;i<m;i++) {
        cout<<arr[i]<<" ";
    }
    return 0;
}

How do I create a vector of object values from a vector of object pointers?

The "naive" solution is:
std::vector<T> vector_of_objects;
vector_of_objects.reserve(vector_of_pointers.size());
for (T const * p : vector_of_pointers)
vector_of_objects.push_back(*p);
The above seems cumbersome and perhaps not immediately obvious.
Is there a solution that is at least not significantly less efficient and perhaps a little quicker and more intuitive? I'm thinking C++11 might have a solution that I am not aware of...
Does writing everything in one line mean shorter code? No.
In my opinion these two lines are shorter and more readable:
for (auto p : vector_of_pointers)
vector_of_objects.emplace_back(*p);
std::for_each is not shorter than a range-based for loop; sometimes it is longer because of the lambda expression you have to pass.
std::transform is even longer than std::for_each, but the word transform has the advantage of telling the reader what is happening.
You are doing it the correct way. Another way is to use the standard algorithms library, like this:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
int main() {
// Create a vector of pointers
std::vector<int*> vptr;
vptr.push_back(new int(1));
vptr.push_back(new int(2));
// Copy to vector of objects
std::vector<int> vobj;
std::for_each(vptr.begin(), vptr.end(), [&](int *n) { vobj.emplace_back(*n); });
// Free the pointers
std::for_each(vptr.begin(), vptr.end(), [&](int *n) { delete n; });
// Print out the vector of objects
std::copy(vobj.begin(), vobj.end(), std::ostream_iterator<int>(std::cout, " "));
return 0;
}
The idiomatic way would be to use std::transform:
std::transform( vector_of_pointers.begin(),
vector_of_pointers.end(),
std::back_inserter( vector_of_objects ),
[]( T* p ) { return *p; } );
Whether this is "better" than what you've written is another
question: it has the advantage of being idiomatic, and of
actually naming what is going on (which makes the code slightly
clearer). On the other hand, the "transformation" is very, very
simple, so it would be easily recognized in the loop, and the
new form for writing such loops makes things fairly clear as
well.
No, you have to call the copy-ctor as in your solution, there's no way around that.
std::vector<T*> vecPointers;
std::vector<T> vecValues;
for(size_t x=0;x<vecPointers.size();x++)
{
vecValues.push_back(*vecPointers[x]);
}
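I believe that if type T is a custom class, it needs to be copyable. The compiler generates a copy constructor automatically in most cases, so you only need to write one yourself, as below, when the implicit one is not suitable.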
class T
{
private:
int someValue;
public:
T()
{
}
T(const T &o)// copy constructor
{
someValue = o.someValue;
}
virtual ~T()
{
}
};
It seems to me that the real question is whether you're doing this often enough for it to be worth writing extra code in one place to clean up the code in the other places.
I can imagine writing a deref_iterator that would allow you to do something like this:
std::vector<T> vector_of_objects{
deref_iterator(std::begin(vector_of_pointers)),
deref_iterator(std::end(vector_of_pointers))};
Now, we're left with the question of whether this is really shorter than the original loop or not. In terms of simple number of key strokes, it's probably going to depend on the names you give things. If you didn't care about readable names, it could be:
vector<T> v_o{d(begin(v_p)), d(end(v_p))};
The short names obviously make it short, but I certainly wouldn't advise them -- if I hadn't just typed this in, I'd have no clue in the world what it meant. A longer name (that needs to be repeated a couple of times) obviously adds more key-strokes, but I can't imagine anybody thinking the readability wasn't worth it.
In any case, the deref_iterator itself would clearly take up some code. An iterator has enough boiler-plate that it typically takes around 100 lines of code or so. Let's (somewhat arbitrarily) decide that this saves one line of code every time you use it. On that basis, you'd have to use it 100 times to break even.
I'm not sure that's accurate in characterizing the code overall -- the code for an iterator is mostly boiler-plate, and other than a typo, there's not much that could really go wrong with it. For the most part, it would be a matter of including the right header, and using it, not of virtually ever having to look at the code for the iterator itself.
That being the case, I might accept it as an improvement even if the total number of lines of code increased. Writing it to use only once would clearly be a loss, but I don't think it'd need to be a full 100 times to qualify as breaking even either.
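For a sense of scale, here is a stripped-down sketch of such an adaptor. It is deliberately minimal (forward iteration only, read-only access), which is why it comes in well under the 100 lines estimated above; a full random-access version would need more boilerplate. The helpers make_deref_iterator and to_values are hypothetical names added for the example, and the code assumes no pointer in the input is null.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Minimal forward-iterator adaptor over an iterator whose elements are
// pointers; dereferencing yields the pointed-to object.
template <class PtrIt>
class deref_iterator {
    using pointee =
        std::remove_pointer_t<typename std::iterator_traits<PtrIt>::value_type>;

public:
    using iterator_category = std::forward_iterator_tag;
    using value_type        = std::remove_cv_t<pointee>;
    using difference_type   = typename std::iterator_traits<PtrIt>::difference_type;
    using pointer           = const value_type*;
    using reference         = const value_type&;

    deref_iterator() = default;
    explicit deref_iterator(PtrIt it) : it_(it) {}

    reference operator*() const {
        assert(*it_ != nullptr);  // assumption: no null pointers in the range
        return **it_;
    }
    pointer operator->() const { return *it_; }
    deref_iterator& operator++() { ++it_; return *this; }
    deref_iterator operator++(int) { deref_iterator old = *this; ++it_; return old; }

    friend bool operator==(const deref_iterator& a, const deref_iterator& b) {
        return a.it_ == b.it_;
    }
    friend bool operator!=(const deref_iterator& a, const deref_iterator& b) {
        return !(a == b);
    }

private:
    PtrIt it_{};
};

// Deduces the wrapped iterator type so callers need not spell it out.
template <class PtrIt>
deref_iterator<PtrIt> make_deref_iterator(PtrIt it) {
    return deref_iterator<PtrIt>(it);
}

// Copies the pointed-to objects via the vector range constructor.
template <class T>
std::vector<T> to_values(const std::vector<const T*>& pointers) {
    return std::vector<T>(make_deref_iterator(pointers.begin()),
                          make_deref_iterator(pointers.end()));
}
```

Because the adaptor is a forward iterator, the range constructor can measure the distance up front and allocate once, which mirrors the reserve call in the original loop.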

Hamming weight for vector<int> C++

I'm working on the Hamming weight of a vector. What I do is count, in a linear way, all the 1s in the vector. Is there a more efficient way?
int HammingWeight(vector<int> a){
int HG=0;
for(int i=0; i<a.size(); i++){
if(a[i] == 1)
HG++;
}
return HG;
}
To calculate the hamming weight, you need to visit each element, giving you O(n) best case, making your loop as efficient as it gets when discounting micro optimizations.
However, your function call itself is extremely inefficient: you pass the vector by value, resulting in a copy of all its contents. This copy can easily be more expensive than the rest of the function combined. Furthermore, nothing in your function actually needs that copy. So changing the signature to int HammingWeight(const std::vector<int>& a) should make your function much more efficient.
Another (possible) optimization comes to mind, assuming your vector only contains ones and zeros (otherwise I don't see how your code makes sense). In that case you could just add the corresponding vector element to HG, getting rid of the if (addition is typically much faster than branching):
for(size_t i=0; i<a.size(); ++i)
HG+=a[i];
I would assume this is likely to be faster; however, whether or not it actually is depends on how the compiler optimizes.
If you'd actually need to you could of course apply common microoptimizations (loop unrolling, vectorization, ...), but that would be premature unless you have good reason to. Besides in that case the first thing to do (again assuming the vector only contains zeros and ones) would be to use a more compact (read efficient) representation of the data.
Also note that both approaches (the direct summation and the if version) could also be expressed using the standard library:
int HG=std::count(a.begin(), a.end(), 1); (from <algorithm>) does basically the same thing as your code
int HG=std::accumulate(a.begin(), a.end(), 0); (from <numeric>) would be equivalent to the loop I mentioned above
Now this is unlikely to help performance, but using less code to achieve the same effect is typically considered a good thing.
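As for the more compact representation mentioned above, one possible sketch is to pack the zeros and ones into 64-bit words and count set bits per word. The function names are made up for this example, std::bitset::count is used as a portable bit counter, and whether this is actually faster for a given workload would need measuring.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs a vector of zeros and ones into 64-bit words, 64 entries per word.
std::vector<std::uint64_t> pack_bits(const std::vector<int>& bits) {
    std::vector<std::uint64_t> words((bits.size() + 63) / 64, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        assert(bits[i] == 0 || bits[i] == 1);  // assumption: input is binary
        if (bits[i] == 1) {
            words[i / 64] |= std::uint64_t{1} << (i % 64);
        }
    }
    return words;
}

// Hamming weight of the packed data: number of set bits over all words.
std::size_t hamming_weight(const std::vector<std::uint64_t>& words) {
    std::size_t weight = 0;
    for (std::uint64_t word : words) {
        weight += std::bitset<64>(word).count();
    }
    return weight;
}
```

The packing step is itself a linear pass, so this only pays off if the data is kept in packed form and the weight is computed many times.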