OpenMP plus reduction on unordered_map<string,double> - c++

I would like to parallelize a for loop in which values of an unordered_map are updated:
unordered_map<string,double> umap {{"foo", 0}, {"bar", 0}};
#pragma omp parallel for reduction(my_reduction:umap)
for (int i = 0; i < 100; ++i)
{
// some_string(i) would return either "foo" or "bar"
umap[some_string(i)] += some_double(i);
}
So, no new entries in the unordered_map would be created, only the existing entries would be updated with a sum.
In this answer a user declared reduction is defined for the case of a vector. Could a user declared reduction be similarly defined in the case of the unordered_map?

It can be done using a similar approach to the one taken in the answer you linked. The one issue we face is that std::transform uses an unfortunate line when it comes to maps.
//GCC version, but the documentation suggests the same thing.
*__result = __binary_op(*__first1, *__first2);
Since maps store types std::pair<const T1, T2> (i.e. the first must always be const, you can't modify the key), this would cause an error, because the operator= is deleted for this case.
For that reason we end up having to write this whole thing ourselves (the answer that follows could be made cleaner, I just hard-coded your type...).
We could just start with the examples of std::transform (look at example implementation 2) and modify the problematic part, but #Zulan raises a good point in the comments that simultaneously traversing unordered maps might not be a good idea (as they are, by definition, not ordered). While it might make some sense that the copy constructor retain the order, it seems that this is not guaranteed by the standard (at least I couldn't find it anywhere), as such the approach that std::transform takes becomes pretty useless.
We can resolve this issue with a slightly different reduction.
#include <unordered_map>
#include <string>
#include <iostream>
#include <utility>
void reduce_umaps(\
std::unordered_map<std::string, double>& output, \
std::unordered_map<std::string, double>& input)
{
for (auto& X : input) {
output.at(X.first) += X.second; //Will throw if X.first doesn't exist in output.
}
}
#pragma omp declare reduction(umap_reduction : \
std::unordered_map<std::string, double> : \
reduce_umaps(omp_out, omp_in)) \
initializer(omp_priv(omp_orig))
using namespace std;
unordered_map<string, double> umap {{"foo", 0}, {"bar", 0}};
string some_string(int in) {
if (in % 2 == 0) return "foo";
else return "bar";
}
inline double some_double(int in) {
return static_cast<double>(in);
}
int main(void) {
#pragma omp parallel for reduction(umap_reduction:umap)
for (int i = 0; i < 100; ++i) {
umap.at(some_string(i)) += some_double(i);
}
std::cerr << umap["foo"] << " " << umap["bar"] << "\n";
return 0;
}
You can also generalise this to allow the addition of keys within the parallel loop, but that won't parallelise well unless the number of added keys remains much smaller than the number of times you increase the values.
As a final side note, I replaced umap[some_string(i)] with umap.at(some_string(i)), to avoid accidentally adding elements much like was suggested in the comments, but find isn't the most practical function for that purpose.

Related

C++ vector with std::map?

my function looks like this:
bool getPair(std::vector<std::vector<unsigned short>>Cards) {
std::sort(Cards.begin(), Cards.end(), Cardsort);
std::map<unsigned short, int>Counter;
for (int i = 0; i < 6; i++)
Counter[Cards[i][0]];
for (const auto& val : Counter) {
if (val.second == 2)
return true;
}
return false;
}
I'm pretty sure I'm using std::map incorrectly, I basically have the vector setup like so:
{{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}}
where the first number represents value, the second represents card suit. I realize now I should have used an object which may have made this problem less complicated but now I'm trying to use std::map to count how many times the value shows up and if it shows up two times, it show return true (in the vector it would return true at the 3) but I don't think I'm using std::map properly
I want to see if Cards has more than one of the same variable in Cards[i][0], I do not care about duplicates in Cards[i][1]
Tested this and works. Highlighted the fix
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
using namespace std;
bool getPair(std::vector<std::vector<unsigned short>>Cards) {
std::sort(Cards.begin(), Cards.end());
std::map<unsigned short, int>Counter;
for (int i = 0; i < 6; i++)
Counter[Cards[i][0]]++; // ++++++++++++++++++ need to alter the value!
for (const auto& val : Counter) {
if (val.second == 2)
return true;
}
return false;
}
int main() {
// your code goes here
// {{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}}
std::vector<std::vector<unsigned short>> c = {{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}};
std::cout << getPair(c);
return 0;
}
Here´s my suggestion.
Some remarks:
why use two loops? You already have the map entry to check, since you want to increase it, so you can check for doubles aka pairs in the counting loop. No need for a second run. This way it´s much less expensive.
I changed the vector parameter to const&. It´s a very bad idea to pass such a thing by value, at least I can´t see why that could be appropriate in that case
I left out the sorting thingy, can´t see for what end it´s needed, just reinsert it, if necessary. Sorting is very expensive.
you are right in the fact that std:: containers do not need initialization, they are proper initialized, the allocator calls the constructor of new elements, event for e.g. int thats one reason why e.g. int got a default constructor syntax and you can write funny thingies like auto a = int();.
accessing nonexistent keys of a map simply creates them
using a set and counting will definitely not yield better performance
I think the code is pretty easy to read, here you are:
#include <iostream>
#include <vector>
#include <map>
bool getPair(const std::vector<std::vector<unsigned short>>& cards) {
std::map<unsigned short, int> counts;
for(const auto& n : cards) {
if(++counts[n[0]] == 2)
return true;
}
return false;
}
int main()
{
std::vector<std::vector<unsigned short>> cards1 = {{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}};
std::vector<std::vector<unsigned short>> cards2 = {{1,0},{2,0},{4,1},{3,0},{5,0},{7,0},{6,0}};
std::cout << getPair(cards1) << "\n";
std::cout << getPair(cards2) << "\n";
return 0;
}
Edit:
Quote of the C++14 Standard regarding access to not existing members of std::map, just for the sake of completeness:
23.4.4.3 map element access [map.access]
T& operator[](const key_type& x);
Effects: If there is no key equivalent to x in the map, inserts value_type(x, T()) into the map.
Requires: key_type shall be CopyInsertable and mapped_type shall be DefaultInsertable into
*this.
Returns: A reference to the mapped_type corresponding to x in *this.
Complexity: Logarithmic.23.4.4.3 map element access
First, you address uninitialized variables in Counter, and then you don't really do anything with it (and why do you run till 6 instead of Cards.size()? Your array has size 7 BTW. Also why there is some kind of sort there? You don't need it.):
std::map<unsigned short, int>Counter;
for (int i = 0; i < 6; i++)
Counter[Cards[i][0]];
They might set the uninitialized variable automatically at 0 or they might not - it depends on the implementation as it is not specified as far as I am aware (in Debug they do set it to 0 but I doubt about the Release version). You'll need to rewrite the code as follows to make it work 100% in all circumstances:
std::map<unsigned short, int> Counter;
for (int i = 0; i < (int)Cards.size(); i++)
{
unsigned short card = Cards[i][0];
auto itr = Counter.find(card);
if(itr == Counter.end())
Counter[card] = 1;
else
itr->second++;
}
I would recommend to use std::set for this task:
std::set<unsigned short> Counter;
for (int i = 0; i < (int)Cards.size(); i++)
{
unsigned short card = Cards[i][0];
if(Counter.count(card)>0)
{
return true;
}
Counter.insert(card);
}
return false;

Sort an array of std::pair vs. struct: which one is faster?

I was wondering whether sorting an array of std::pair is faster, or an array of struct?
Here are my code segments:
Code #1: sorting std::pair array (by first element):
#include <algorithm>
pair <int,int> client[100000];
sort(client,client+100000);
Code #2: sort struct (by A):
#include <algorithm>
struct cl{
int A,B;
}
bool cmp(cl x,cl y){
return x.A < y.A;
}
cl clients[100000];
sort(clients,clients+100000,cmp);
code #3: sort struct (by A and internal operator <):
#include <algorithm>
struct cl{
int A,B;
bool operator<(cl x){
return A < x.A;
}
}
cl clients[100000];
sort(clients,clients+100000);
Update: I used these codes to solve a problem in an online Judge. I got time limit of 2 seconds for code #1, and accept for code #2 and #3 (ran in 62 milliseconds). Why code #1 takes so much time in comparison to other codes? Where is the difference?
You know what std::pair is? It's a struct (or class, which is the same thing in C++ for our purposes). So if you want to know what's faster, the usual advice applies: you have to test it and find out for yourself on your platform. But the best bet is that if you implement the equivalent sorting logic to std::pair, you will have equivalent performance, because the compiler does not care whether your data type's name is std::pair or something else.
But note that the code you posted is not equivalent in functionality to the operator < provided for std::pair. Specifically, you only compare the first member, not both. Obviously this may result in some speed gain (but probably not enough to notice in any real program).
I would estimate that there isn't much difference at all between these two solutions.
But like ALL performance related queries, rather than rely on someone on the internet telling they are the same, or one is better than the other, make your own measurements. Sometimes, subtle differences in implementation will make a lot of difference to the actual results.
Having said that, the implementation of std::pair is a struct (or class) with two members, first and second, so I have a hard time imagining that there is any real difference here - you are just implementing your own pair with your own compare function that does exactly the same things that the already existing pair does... Whether it's in an internal function in the class or as an standalone function is unlikely to make much of a difference.
Edit: I made the following "mash the code together":
#include <algorithm>
#include <iostream>
#include <iomanip>
#include <cstdlib>
using namespace std;
const int size=100000000;
pair <int,int> clients1[size];
struct cl1{
int first,second;
};
cl1 clients2[size];
struct cl2{
int first,second;
bool operator<(const cl2 x) const {
return first < x.first;
}
};
cl2 clients3[size];
template<typename T>
void fill(T& t)
{
srand(471117); // Use same random number each time/
for(size_t i = 0; i < sizeof(t) / sizeof(t[0]); i++)
{
t[i].first = rand();
t[i].second = -t[i].first;
}
}
void func1()
{
sort(clients1,clients1+size);
}
bool cmp(cl1 x, cl1 y){
return x.first < y.first;
}
void func2()
{
sort(clients2,clients2+size,cmp);
}
void func3()
{
sort(clients3,clients3+size);
}
void benchmark(void (*f)(), const char *name)
{
cout << "running " << name << endl;
clock_t time = clock();
f();
time = clock() - time;
cout << "Time taken = " << (double)time / CLOCKS_PER_SEC << endl;
}
#define bm(x) benchmark(x, #x)
int main()
{
fill(clients1);
fill(clients2);
fill(clients3);
bm(func1);
bm(func2);
bm(func3);
}
The results are as follows:
running func1
Time taken = 10.39
running func2
Time taken = 14.09
running func3
Time taken = 10.06
I ran the benchmark three times, and they are all within ~0.1s of the above results.
Edit2:
And looking at the code generated, it's quite clear that the "middle" function takes quite a bit longer, since the comparison is made inline for pair and struct cl2, but can't be made inline for struct cl1 - so every compare literally makes a function call, rather than a few instructions inside the functions. This is a large overhead.

Memoizing a function with two inputs in C++

I have a function, f(a,b), that accepts two inputs. I do not know ahead of time which values of a and b will be used. I'm okay with being a little wasteful on memory (I care about speed). I want to be able to check if the output of f(a,b) has already been delivered, and if so, deliver that output again without re-running through the f(a,b) process.
Trivially easy to do in Python with decorators, but C++ is way over my head here.
I would use a std::map (or maybe an std::unordered_map) whose key is a std::pair, or perhaps use a map of maps.
C++11 improvements are probably helpful in that case. Or maybe some Boost thing.
The poster asks:
I want to be able to check if the output of f(a,b) has already been delivered, and if so, deliver that output again without re-running through the f(a,b) process.
It's pretty easy in C++ using a std::map. The fact that the function has exactly two parameters means that we can use std::pair to describe them.
#include <map>
#include <iostream>
uint64_t real_f(int a, int b) {
std::cout << "*";
// Do something tough:
return (uint64_t)a*b;
}
uint64_t memo_f(int a, int b) {
typedef std::pair<int, int> key;
typedef std::map<key, uint64_t> map;
static map m;
key k(a,b);
map::iterator it = m.find(k);
if(it == m.end()) {
return m[k] = real_f(a, b);
}
return it->second;
}
int main () {
std::cout << memo_f(1, 2) << "\n";
std::cout << memo_f(3, 4) << "\n";
std::cout << memo_f(1, 2) << "\n";
std::cout << memo_f(3, 4) << "\n";
std::cout << memo_f(5, 6) << "\n";
}
The output of the above program is:
*2
*12
2
12
*30
The lines without asterisks represent cached results.
With C++11, you could use tasks and futures. Let f be your function:
int f(int a, int b)
{
// Do hard work.
}
Then you would schedule the function execution, which returns you a handle to the return value. This handle is called a future:
template <typename F>
std::future<typename std::result_of<F()>::type>
schedule(F f)
{
typedef typename std::result_of<F()>::type result_type;
std::packaged_task<result_type> task(f);
auto future = task.get_future();
tasks_.push_back(std::move(task)); // Queue the task, execute later.
return std::move(future);
}
Then, you could use this mechanism as follows:
auto future = schedule(std::bind(&f, 42, 43)); // Via std::bind.
auto future = schedule([&] { f(42, 43); }); // Lambda alternative.
if (future.has_value())
{
auto x = future.get(); // Blocks if the result of f(a,b) is not yet availble.
g(x);
}
Disclaimer: my compiler does not support tasks/futures, so the code may have some rough edges.
The main point about this question are the relative expenses in CPU and RAM between calculating f(a,b) and keeping some sort of lookup table to cache results.
Since an exhaustive table of 128 bits index length is not (yet) feasable, we need to reduce the lookup space into a manageable size - this can't be done without some considerations inside your app:
How big is the really used space of function inputs? Is there a pattern in it?
What about the temporal component? Do you expect repeated calculations to be close to one another or ditributed along the timeline?
What about the distribution? Do you assume a tiny part of the index space to consume the majority of function calls?
I would simply start with a fixed-size array of (a,b, f(a,b)) tuples and a linear search. Depending on your pattern as asked above, you might want to
window-slide it (drop oldest on a cache miss): This is good for localized reocurrences
have (a,b,f(a,b),count) tuples with the tuple with the smallest count being expelled - this is good for non-localized occurrences
have some key-function determine a position in the cache (this is good for tiny index space usage)
whatever else Knuth or Google might have thought of
You might also want to benchmark repeated calculation against the lookup mechanism, if the latter becomes more and more complex: std::map and freinds don't come for free, even if they are high-quality implementations.
The only easy way is to use std::map. std::unordered_map does not work. We cannot use std::pair as the key in unordered map. You can do the following,
std::map<pair<int, int>, int> mp;
int func(int a, int b)
{
if (mp.find({a, b}) != mp.end()) return mp[{a, b}];
// compute f(a, b)...
mp[{a, b}] = // computed value;
return mp[{a, b}];
}

C++ range/xrange equivalent in STL or boost?

Is there C++ equivalent for python Xrange generator in either STL or boost?
xrange basically generates incremented number with each call to ++ operator.
the constructor is like this:
xrange(first, last, increment)
was hoping to do something like this using boost for each:
foreach(int i, xrange(N))
I. am aware of the for loop. in my opinion they are too much boilerplate.
Thanks
my reasons:
my main reason for wanting to do so is because i use speech to text software, and programming loop usual way is difficult, even if using code completion. It is much more efficient to have pronounceable constructs.
many loops start with zero and increment by one, which is default for range. I find python construct more intuitive
for(int i = 0; i < N; ++i)
foreach(int i, range(N))
functions which need to take range as argument:
Function(int start, int and, int inc);
function(xrange r);
I understand differences between languages, however if a particular construct in python is very useful for me and can be implemented efficiently in C++, I do not see a reason not to use it. For each construct is foreign to C++ as well however people use it.
I put my implementation at the bottom of the page as well the example usage.
in my domain i work with multidimensional arrays, often rank 4 tensor. so I would often end up with 4 nested loops with different ranges/increments to compute normalization, indexes, etc. those are not necessarily performance loops, and I am more concerned with correctness readability and ability to modify.
for example
int function(int ifirst, int ilast, int jfirst, int jlast, ...);
versus
int function(range irange, range jrange, ...);
In the above, if different strids are needed, you have to pass more variables, modify loops, etc. eventually you end up with a mass of integers/nearly identical loops.
foreach and range solve my problem exactly. familiarity to average C++ programmer is not high on my list of concerns - problem domain is a rather obscure, there is a lot of meta-programming, SSE intrinsic, generated code.
Boost irange should really be the answer (ThxPaul Brannan)
I'm adding my answer to provide a compelling example of very valid use-cases that are not served well by manual looping:
#include <boost/range/adaptors.hpp>
#include <boost/range/algorithm.hpp>
#include <boost/range/irange.hpp>
using namespace boost::adaptors;
static int mod7(int v)
{ return v % 7; }
int main()
{
std::vector<int> v;
boost::copy(
boost::irange(1,100) | transformed(mod7),
std::back_inserter(v));
boost::sort(v);
boost::copy(
v | reversed | uniqued,
std::ostream_iterator<int>(std::cout, ", "));
}
Output: 6, 5, 4, 3, 2, 1, 0,
Note how this resembles generators/comprehensions (functional languages) and enumerables (C#)
Update I just thought I'd mention the following (highly inflexible) idiom that C++11 allows:
for (int x : {1,2,3,4,5,6,7})
std::cout << x << std::endl;
of course you could marry it with irange:
for (int x : boost::irange(1,8))
std::cout << x << std::endl;
Boost has counting_iterator as far as I know, which seems to allow only incrementing in steps of 1. For full xrange functionality you might need to implement a similar iterator yourself.
All in all it could look like this (edit: added an iterator for the third overload of xrange, to play around with boost's iterator facade):
#include <iostream>
#include <boost/iterator/counting_iterator.hpp>
#include <boost/range/iterator_range.hpp>
#include <boost/foreach.hpp>
#include <boost/iterator/iterator_facade.hpp>
#include <cassert>
template <class T>
boost::iterator_range<boost::counting_iterator<T> > xrange(T to)
{
//these assertions are somewhat problematic:
//might produce warnings, if T is unsigned
assert(T() <= to);
return boost::make_iterator_range(boost::counting_iterator<T>(0), boost::counting_iterator<T>(to));
}
template <class T>
boost::iterator_range<boost::counting_iterator<T> > xrange(T from, T to)
{
assert(from <= to);
return boost::make_iterator_range(boost::counting_iterator<T>(from), boost::counting_iterator<T>(to));
}
//iterator that can do increments in steps (positive and negative)
template <class T>
class xrange_iterator:
public boost::iterator_facade<xrange_iterator<T>, const T, std::forward_iterator_tag>
{
T value, incr;
public:
xrange_iterator(T value, T incr = T()): value(value), incr(incr) {}
private:
friend class boost::iterator_core_access;
void increment() { value += incr; }
bool equal(const xrange_iterator& other) const
{
//this is probably somewhat problematic, assuming that the "end iterator"
//is always the right-hand value?
return (incr >= 0 && value >= other.value) || (incr < 0 && value <= other.value);
}
const T& dereference() const { return value; }
};
template <class T>
boost::iterator_range<xrange_iterator<T> > xrange(T from, T to, T increment)
{
assert((increment >= T() && from <= to) || (increment < T() && from >= to));
return boost::make_iterator_range(xrange_iterator<T>(from, increment), xrange_iterator<T>(to));
}
int main()
{
BOOST_FOREACH(int i, xrange(10)) {
std::cout << i << ' ';
}
BOOST_FOREACH(int i, xrange(10, 20)) {
std::cout << i << ' ';
}
std::cout << '\n';
BOOST_FOREACH(int i, xrange(0, 46, 5)) {
std::cout << i << ' ';
}
BOOST_FOREACH(int i, xrange(10, 0, -1)) {
std::cout << i << ' ';
}
}
As others are saying, I don't see this buying you much over a normal for loop.
std::iota (not yet standardized) is kinda like range. Doesn't make things any shorter or clearer than an explicit for loop, though.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>
int main() {
std::vector<int> nums(5);
std::iota(nums.begin(), nums.end(), 1);
std::copy(nums.begin(), nums.end(),
std::ostream_iterator<int>(std::cout, " "));
std::cout << std::endl;
return 0;
}
Compile with g++ -std=c++0x; this prints "1 2 3 4 5 \n".
well, here is what i wrote, since there does not seem to be one.
the generator does not use any internal storage besides single integer.
range object can be passed around and used in nested loops.
there is a small test case.
#include "iostream"
#include "foreach.hpp"
#include "boost/iterator/iterator_categories.hpp"
struct range {
struct iterator_type {
typedef int value_type;
typedef int difference_type;
typedef boost::single_pass_traversal_tag iterator_category;
typedef const value_type* pointer;
typedef const value_type & reference;
mutable value_type value;
const difference_type increment;
iterator_type(value_type value, difference_type increment = 0)
: value(value), increment(increment) {}
bool operator==(const iterator_type &rhs) const {
return value >= rhs.value;
}
value_type operator++() const { return value += increment; }
operator pointer() const { return &value; }
};
typedef iterator_type iterator;
typedef const iterator_type const_iterator;
int first_, last_, increment_;
range(int last) : first_(0), last_(last), increment_(1) {}
range(int first, int last, int increment = 1)
: first_(first), last_(last), increment_(increment) {}
iterator begin() const {return iterator(first_, increment_);}
iterator end() const {return iterator(last_);}
};
int test(const range & range0, const range & range1){
foreach(int i, range0) {
foreach(int j, range1) {
std::cout << i << " " << j << "\n";
}
}
}
int main() {
test(range(6), range(3, 10, 3));
}
my main reason for wanting to do so is because i use speech to text software, and programming loop usual way is difficult, even if using code completion. It is much more efficient to have pronounceable constructs.
That makes sense. But couldn't a simple macro solve this problem? #define for_i_to(N, body) for (int i = 0; i < N; ++i) { body }
or something similar. Or avoid the loop entirely and use the standard library algorithms. (std::for_each(range.begin(), rang.end(), myfunctor()) seems easier to pronounce)
many loops start with zero and increment by one, which is default for range. I find python construct more intuitive
You're wrong. The Python version is more intuitive to a Python programmer. And it may be more intuitive to a non-programmer. But you're writing C++ code. Your goal should be to make it intuitive to a C++ programmer. And C++ programmer know for-loops and they know the standard library algorithms. Stick to using those. (Or stick to writing Python)
functions which need to take range as argument:
Function(int start, int and, int inc);
function(xrange r);
Or the idiomatic C++ version:
template <typename iter_type>
void function(iter_type first, iter_type last);
In C++, ranges are represented by iterator pairs. Not integers.
If you're going to write code in a new language, respect the conventions of that language. Even if it means you have to adapt and change some habits.
If you're not willing to do that, stick with the language you know.
Trying to turn language X into language Y is always the wrong thing to do. It own't work, and it'll confuse the language X programmers who are going to maintain (or just read) your code.
Since I've started to use BOOST_FOREACH for all my iteration (probably a misguided idea, but that's another story), here's another use for aaa's range class:
std::vector<int> vec;
// ... fill the vector ...
BOOST_FOREACH(size_t idx, make_range(0, vec.size()))
{
// ... do some stuff ...
}
(yes, range should be templatized so I can use user-defined integral types with it)
And here's make_range():
template<typename T>
range<T> make_range(T const & start, T const & end)
{
return range<T>(start, end);
}
See also:
http://groups.google.com/group/boost-list/browse_thread/thread/3e11117be9639bd
and:
https://svn.boost.org/trac/boost/ticket/3469
which propose similar solutions.
And I've just found boost::integer_range; with the above example, the code would look like:
using namespace boost;
std::vector<int> vec;
// ... fill the vector ...
BOOST_FOREACH(size_t idx, make_integer_range(0, vec.size()))
{
// ... do some stuff ...
}
C++ 20's ranges header has iota_view which does this:
#include <ranges>
#include <vector>
#include <iostream>
int main()
{
for (int i : std::views::iota{1, 10})
std::cout << i << ' ';
std::cout << '\n';
for (int i : std::views::iota(1) | std::views::take(9))
std::cout << i << ' ';
}
Output:
1 2 3 4 5 6 7 8 9
1 2 3 4 5 6 7 8 9
Since we don't really know what you actually want to use this for, I'm assuming your test case is representative. And then plain simple for loops are a whole lot simpler and more readable:
int main() {
for (int i = 0; i <= 6; ++i){
for (int j = 3; j <= 10; j += 3){
std::cout << i << " " << j << "\n";
}
}
}
A C++ programmer can walk in from the street and understand this function without having to look up complex classes elsewhere. And it's 5 lines instead of your 60. Of course if you have 400 loops exactly like these, then yes, you'd save some effort by using your range object. Or you could just wrap these two loops inside a helper function, and call that whenever you needed.
We don't really have enough information to say what's wrong with simple for loops, or what would be a suitable replacement. The loops here solve your problem with far less complexity and far fewer lines of code than your sample implementation. If this is a bad solution, tell us your requirements (as in what problem you need to solve, rather than "I want python-style loops in C++")
Keep it simple, make a stupid macro;
#define for_range(VARNAME, START, STOP, INCREMENT) \
for(int VARNAME = START, int STOP_ = STOP, INCREMENT_ = INCREMENT; VARNAME != STOP_; VARNAME += INCREMENT_)
and use as;
for_range(i, 10, 5, -1)
cout << i << endl;
You're trying to bring a python idiom into C++. That's unncessary. Use
for(int i=initVal;i<range;i+=increment)
{
/*loop body*/
}
to achieve this. In Python, the for(i in xrange(init, rng, increment)) form is necessary because Python doesn't provide a simple for loop, only a for-each type construct. So you can iterate only over a sequence or a generator. This is simply unnecessary and almost certainly bad practice in a language that provides a for(;;) syntax.
EDIT: As a completely non-recommended aside, the closest I can get to the for i xrange(first, last, inc) syntax in C++ is:
#include <cstdio>
using namespace std;
int xrange(unsigned int last, unsigned int first=0, unsigned int inc=1)
{
static int i = first;
return (i<last)?i+=inc:i=0;
}
int main()
{
while(int i=xrange(10, 0, 1))
printf("in loop at i=%d\n",i);
}
Not that while this loops the correct number of times, i varies from first+inc to last and NOT first to last-inc as in Python. Also, the function can only work reliably with unsigned values, as when i==0, the while loop will exit. Do not use this function. I only added this code here to demonstrate that something of the sort is indeed possible. There are also several other caveats and gotchas (the code won't really work for first!=0 on subsequent function calls, for example)

C++ STL - iterate through everything in a sequence

I have a sequence, e.g
std::vector< Foo > someVariable;
and I want a loop which iterates through everything in it.
I could do this:
for (int i=0;i<someVariable.size();i++) {
blah(someVariable[i].x,someVariable[i].y);
woop(someVariable[i].z);
}
or I could do this:
for (std::vector< Foo >::iterator i=someVariable.begin(); i!=someVariable.end(); i++) {
blah(i->x,i->y);
woop(i->z);
}
Both these seem to involve quite a bit of repetition / excessive typing. In an ideal language I'd like to be able to do something like this:
for (i in someVariable) {
blah(i->x,i->y);
woop(i->z);
}
It seems like iterating through everything in a sequence would be an incredibly common operation. Is there a way to do it in which the code isn't twice as long as it should have to be?
You could use for_each from the standard library. You could pass a functor or a function to it. The solution I like is BOOST_FOREACH, which is just like foreach in other languages. C+0x is gonna have one btw.
For example:
#include <iostream>
#include <vector>
#include <algorithm>
#include <boost/foreach.hpp>
#define foreach BOOST_FOREACH
void print(int v)
{
std::cout << v << std::endl;
}
int main()
{
std::vector<int> array;
for(int i = 0; i < 100; ++i)
{
array.push_back(i);
}
std::for_each(array.begin(), array.end(), print); // using STL
foreach(int v, array) // using Boost
{
std::cout << v << std::endl;
}
}
Not counting BOOST_FOREACH which AraK already suggested, you have the following two options in C++ today:
void function(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
std::for_each(someVariable.begin(), someVariable.end(), function);
struct functor {
void operator()(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
};
std::for_each(someVariable.begin(), someVariable.end(), functor());
Both require you to specify the "body" of the loop elsewhere, either as a function or as a functor (a class which overloads operator()). That might be a good thing (if you need to do the same thing in multiple loops, you only have to define the function once), but it can be a bit tedious too. The function version may be a bit less efficient, because the compiler is generally unable to inline the function call. (A function pointer is passed as the third argument, and the compiler has to do some more detailed analysis to determine which function it points to)
The functor version is basically zero overhead. Because an object of type functor is passed to for_each, the compiler knows exactly which function to call: functor::operator(), and so it can be trivially inlined and will be just as efficient as your original loop.
C++0x will introduce lambda expressions which make a third form possible.
std::for_each(someVariable.begin(), someVariable.end(), [](Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
});
Finally, it will also introduce a range-based for loop:
for(Foo& arg : my_someVariable)
{
blah(arg.x, arg.y);
woop(arg.z);
}
So if you've got access to a compiler which supports subsets of C++0x, you might be able to use one or both of the last forms. Otherwise, the idiomatic solution (without using Boost) is to use for_eachlike in one of the two first examples.
By the way, MSVS 2008 has a "for each" C++ keyword. Look at How to: Iterate Over STL Collection with for each.
int main() {
int retval = 0;
vector<int> col(3);
col[0] = 10;
col[1] = 20;
col[2] = 30;
for each( const int& c in col )
retval += c;
cout << "retval: " << retval << endl;
}
Prefer algorithm calls to hand-written loops
There are three reasons:
1) Efficiency: Algorithms are often more efficient than the loops programmers produce
2) Correctness: Writing loops is more subject to errors than is calling algorithms.
3) Maintainability: Algorithm calls often yield code that is clearer and more
straightforward than the corresponding explicit loops.
Prefer almost every other algorithm to for_each()
There are two reasons:
for_each is extremely general, telling you nothing about what's really being done, just that you're doing something to all the items in a sequence.
A more specialized algorithm will often be simpler and more direct
Consider, an example from an earlier reply:
void print(int v)
{
std::cout << v << std::endl;
}
// ...
std::for_each(array.begin(), array.end(), print); // using STL
Using std::copy instead, that whole thing turns into:
std::copy(array.begin(), array.end(), std::ostream_iterator(std::cout, "\n"));
"struct functor {
void operator()(Foo& arg){
blah(arg.x, arg.y);
woop(arg.z);
}
};
std::for_each(someVariable.begin(), someVariable.end(), functor());"
I think approaches like these are often needlessly baroque for a simple problem.
do i=1,N
call blah( X(i),Y(i) )
call woop( Z(i) )
end do
is perfectly clear, even if it's 40 years old (and not C++, obviously).
If the container is always a vector (STL name), I see nothing wrong with an index and nothing wrong with calling that index an integer.
In practice, often one needs to iterate over multiple containers of the same size simultaneously and peel off a datum from each, and do something with the lot of them. In that situation, especially, why not use the index?
As far as SSS's points #2 and #3 above, I'd say it could be so for complex cases, but often iterating 1...N is often as simple and clear as anything else.
If you had to explain the algorithm on the whiteboard, could you do it faster with, or without, using 'i'? I think if your meatspace explanation is clearer with the index, use it in codespace.
Save the heavy C++ firepower for the hard targets.