C++ Iterators for multi-dimensional C arrays

I have a large number of 3- to 6-dimensional C arrays that I need to iterate through. A more C++-style representation such as boost::multi_array isn't an option, because these arrays come from the C framework PETSc (which uses Fortran ordering, hence the backward indexing). Straightforward loops end up looking something like this:
for (int i=range.ibeg; i<=range.iend; ++i){
for (int j=range.jbeg; j<=range.jend; ++j){
for (int k=range.kbeg; k<=range.kend; ++k){
(...)
or even worse:
for (int i=range.ibeg-1; i<=range.iend+1; ++i){
for (int j=range.jbeg-1; j<=range.jend+1; ++j){
for (int k=range.kbeg-1; k<=range.kend+1; ++k){
for (int ii=0; ii<Np1d; ++ii){
for (int jj=0; jj<Np1d; ++jj){
for (int kk=0; kk<Np1d; ++kk){
data[k][j][i].member[kk][jj][ii] =
func(otherdata[k][j][i].member[kk][jj][ii],
otherdata[k][j][i].member[kk][jj][ii+1]);
There are many instances like this, with varying ranges on the loop indexes, and it all gets very ugly and potentially error prone. How should one construct iterators for multi-dimensional arrays like this?

A fully templated version was not so hard after all, so here it is in a separate answer, again with live example. If I'm not mistaken, this should have zero overhead on top of custom nested loops. You could measure and let me know. I intend to implement this for my own purposes anyway, that's why I put this effort here.
template<size_t N>
using size = std::integral_constant<size_t, N>;
template<typename T, size_t N>
class counter : std::array<T, N>
{
using A = std::array<T, N>;
A b, e;
template<size_t I = 0>
void inc(size<I> = size<I>())
{
if (++_<I>() != std::get<I>(e))
return;
_<I>() = std::get<I>(b);
inc(size<I+1>());
}
void inc(size<N-1>) { ++_<N-1>(); }
public:
counter(const A& b, const A& e) : A(b), b(b), e(e) { }
counter& operator++() { return inc(), *this; }
operator bool() const { return _<N-1>() != std::get<N-1>(e); }
template<size_t I>
T& _() { return std::get <I>(*this); }
template<size_t I>
constexpr const T& _() const { return std::get <I>(*this); }
};
Instead of operator[] I now have method _ (feel free to rename), which is just a shortcut for std::get, so usage is not so much more verbose than with operator[]:
for (counter<int, N> c(begin, end); c; ++c)
cout << c._<0>() << " " << c._<1>() << " " << c._<2>() << endl;
In fact, you may try the previous version
for (counter<int, N> c(begin, end); c; ++c)
cout << c[0] << " " << c[1] << " " << c[2] << endl;
and measure, because it may be equivalent. For this to work, switch std::array inheritance to public or declare using A::operator[]; in counter's public section.
What is definitely different is operator++, which is now based on recursive template function inc() and the problematic condition if (n < N - 1) is replaced by a specialization (actually, overload) that has no overhead.
If it turns out that there is overhead eventually, an ultimate attempt would be to replace std::array by std::tuple. In this case, std::get is the only way; there is no operator[] alternative. It will also be weird that type T is repeated N times. But I hope this won't be needed.
Further generalizations are possible, e.g. specifying a (compile-time) increment step per dimension or even specifying arbitrary indirect arrays per dimension, e.g. to simulate
a([3 5 0 -2 7], -4:2:20)
in Matlab-like syntax.
But this needs even more work, and I think you can take it on from here if you like the approach.

A full-blown n-dimensional iterator is not necessary in your simple case of nested for loops. Since only a single traversal is needed, a simple counter is enough, and it is easily custom-made like this:
template<typename T, size_t N>
class counter
{
using A = std::array<T, N>;
A b, i, e;
public:
counter(const A& b, const A& e) : b(b), i(b), e(e) { }
counter& operator++()
{
for (size_t n = 0; n < N; ++n)
{
if (++i[n] == e[n])
{
if (n < N - 1)
i[n] = b[n];
}
else
break;
}
return *this;
}
operator bool() { return i[N - 1] != e[N - 1]; }
T& operator[](size_t n) { return i[n]; }
const T& operator[](size_t n) const { return i[n]; }
};
It is then very easy to use this counter like this:
int main()
{
constexpr size_t N = 3;
using A = std::array<int, N>;
A begin = {{0, -1, 0}};
A end = {{3, 1, 4}};
for (counter<int, N> c(begin, end); c; ++c)
cout << c << endl;
// or, cout << c[0] << " " << c[1] << " " << c[2] << endl;
}
assuming there's an operator << for counter. See live example for full code.
The innermost condition if (n < N - 1) keeps the last "digit" from being reset, so that operator bool can detect termination. Checking it every time is slightly wasteful, and it was not apparent to me how to factor it out, but it is only evaluated when we advance to the next "digit" of the counter, not at every increment operation.
Instead of using c[0], c[1], c[2] etc., it is more efficient to use std::get if counter derives from std::array instead of having member i (while b and e remain members). This idea can be extended towards a compile-time recursive implementation of operator++ (operator bool as well) that would eliminate the for loop inside it, along with the problematic check discussed above. operator[] would be discarded in this case. But all this would make the counter code more obscure and I just wanted to highlight the idea. It would also make usage of counter a bit more verbose, but that's a price you'd need to pay for efficiency.
Of course, a full-blown n-dimensional iterator can be built by extending counter with more methods and traits. But to make it generic enough may be a huge undertaking.
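As a rough illustration of how the simple counter could replace the triple loop from the question, here is a sketch. The array data, the extents NK, NJ, NI and the function fill_with_index_sum are made up for this example. The first counter index varies fastest, which matches the data[k][j][i] ordering of the question.

```cpp
#include <array>
#include <cstddef>

// Same idea as the counter above: an N-digit odometer over [b, e).
template<typename T, std::size_t N>
class counter
{
    using A = std::array<T, N>;
    A b, i, e;
public:
    counter(const A& b, const A& e) : b(b), i(b), e(e) { }
    counter& operator++()
    {
        for (std::size_t n = 0; n < N; ++n)
        {
            if (++i[n] != e[n])
                break;          // no carry, done
            if (n < N - 1)
                i[n] = b[n];    // carry into the next digit
        }
        return *this;
    }
    explicit operator bool() const { return i[N - 1] != e[N - 1]; }
    const T& operator[](std::size_t n) const { return i[n]; }
};

// Hypothetical extents, chosen only for this example.
constexpr int NK = 2, NJ = 3, NI = 4;

// Visits every element of data[k][j][i] once and returns how many were visited.
int fill_with_index_sum(int (&data)[NK][NJ][NI])
{
    int visited = 0;
    // c[0] is i (fastest), c[1] is j, c[2] is k.
    for (counter<int, 3> c({{0, 0, 0}}, {{NI, NJ, NK}}); c; ++c)
    {
        data[c[2]][c[1]][c[0]] = c[0] + c[1] + c[2];
        ++visited;
    }
    return visited;
}
```

Like the original, this sketch assumes every range is non-empty; an empty range in any dimension would need an extra check before the loop.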

Related

Can I write template <typename T> only once?

If it is possible, how can I avoid writing template <typename T> twice above my two functions while still having them work for every argument type they have to handle?
And, if that is possible, how could I write only the reverse_vector function with the swap function contained "inside" it, so that I write template only once?
(I'm still a beginner learning basics and I absolutely don't know if there are better ways to do what I think you have understood I'd like. If yes, please say those. Thank you so much, good evening.)
#include <iostream>
#include <vector>
template <typename T>//HERE...
void swap(T& a, T& b) {
T tmp = a;
a = b;
b = tmp;
}
template <typename T>//...AND HERE
void reverse_vector(std::vector<T>& v) {
if (v.size() % 2 == 0) {
for (T i = 0; i < v.size() / 2; ++i) {
swap(v[i], v[v.size() - 1 - i]);
}
}
}
void print_vector(std::vector<short int>& v) {
for (unsigned short int i = 0; i < v.size(); ++i) {
std::cout << v[i] << '\n';
}
}
int main() {
unsigned short int g;
std::cout << "How large do you want the vector? _";
std::cin >> g;
std::vector <short int> v(g);
for (unsigned short int i = 0; i < g; ++i) {
v[i] = i + 1;
}
print_vector(v);
std::cout << "\n\n\n";
reverse_vector(v);
print_vector(v);
return 0;
}
You cannot write template <typename T> only once. But with C++20 you can avoid writing it at all:
#include <concepts>
void swap(auto a, auto b) requires std::same_as<decltype(a), decltype(b)> {
auto tmp = a;
a = b;
b = tmp;
};
The requires clause is there to make sure you don't accidentally swap objects of different types. Note that, as written, the parameters are taken by value, so this swap only exchanges local copies and leaves the caller's variables unchanged. Decide whether you need auto, auto& or auto&& to get the right semantics for swap before you use it.
PS: use std::ranges::swap
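If the goal is mainly to keep the swap helper "inside" reverse_vector, one option is a local lambda. This is a sketch rather than the only way to do it. The lambda can use T from the enclosing function template, so template <typename T> appears only once. The name swap_elems is made up for this example. Unlike the code in the question, the loop uses std::size_t for the index and also handles odd sizes.

```cpp
#include <cstddef>
#include <vector>

template <typename T>   // written only once
void reverse_vector(std::vector<T>& v) {
    // Local helper; it sees T from the enclosing function template.
    auto swap_elems = [](T& a, T& b) {
        T tmp = a;
        a = b;
        b = tmp;
    };
    // The middle element of an odd-sized vector stays in place.
    for (std::size_t i = 0; i < v.size() / 2; ++i) {
        swap_elems(v[i], v[v.size() - 1 - i]);
    }
}
```

In real code, std::swap and std::reverse already cover both jobs.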

Fast element access for multi-dimensional representation of contiguous array

I have a multi-dimensional array represented contiguously in memory. I want to keep this representation hidden and just let the user access the array elements as if it were a multi-dimensional one: e.g. my_array[0][3][5] or my_array(0,3,5) or something similar. The dimensions of the object are not determined until runtime, but the object is created with a type that specifies how many dimensions it has. This element look-up will need to be called billions of times, and so should hopefully involve minimal overhead for each call.
I have looked at similar questions but not really found a good solution. Using the [] operator requires the creation of N-1 dimensional objects, which is fine for multi-dimensional structures like vectors-of-vectors because the object already exists, but for a contiguous array it seems like it would get convoluted very quickly and require some kind of slicing through the original array.
I have also looked at overloading (), which seems more promising, but requires specifying the number of arguments, which will vary depending upon the number of dimensions of the array. I have thought about using list initialization or vectors, but wanted to avoid instantiating objects.
I am only a little familiar with templates and figure that there should be some way with C++'s majestic template powers to specify a unique overloading of () for arrays with different types (e.g. different numbers of dimensions). But I have only used templates in really basic generic cases like making a function use both float and double.
I am imagining something like this:
template<typename TDim>
class MultiArray {
public:
MultiArray() {} //build some things
~MultiArray() {} //destroy some things
// The number of arguments would be == to TDim for the instantiated class
float& operator() (int dim1, int dim2, ...) {
//convert to contiguous index and return ref to element
// I believe the conversion equation is something like:
// dim1 + Maxdim1 * ( dim2 + MaxDim2 * ( dim3 + MaxDim3 * (...)))
}
private:
vector<float> internal_array;
vector<int> MaxDimX; // Each element says how large each corresponding dim is.
};
So if I initialize this class and attempted to access an element, it would look something like this:
my_array = MultiArray<4>();
element = my_array(2,5,4,1);
How might I go about doing this using templates? Is this even possible?
template<class T>
struct slice {
T* data = 0;
std::size_t const* stride = 0;
slice operator[](std::size_t I)const {
return{ data + I* *stride, stride + 1 };
}
operator T&()const {
return *data;
}
T& operator=(typename std::remove_const<T>::type in)const {
*data = std::move(in); return *data;
}
};
Store a std::vector<T> of data and a std::vector<std::size_t> of strides, where strides[0] is the element stride that the first index uses.
template<class T>
struct buffer {
std::vector<T> data;
std::vector<std::size_t> strides;
buffer( std::vector<std::size_t> sizes, std::vector<T> d ):
data(std::move(d)),
strides(sizes)
{
std::size_t scale = 1;
for (std::size_t i = 0; i<sizes.size(); ++i){
auto next = scale*strides[sizes.size()-1-i];
strides[sizes.size()-1-i] = scale;
scale=next;
}
}
slice<T> get(){ return {data.data(), strides.data()}; }
slice<T const> get()const{ return {data.data(), strides.data()}; }
};
This requires C++14.
If you use too few []s, the result refers to the first element of the subarray in question. If you use too many, the behavior is undefined. No checking is done on either the number of dimensions or the index values.
Both checks can be added, but they would cost performance.
The number of dimensions is dynamic. You can split buffer into two types, one that owns the buffer and the other that provides the dimensioned view of it.
It seems to me that you could use Boost.MultiArray, boost::multi_array_ref to be more specific. boost::multi_array_ref does exactly what you want: it wraps a contiguous data array in an object that can be treated as a multidimensional array. You may also use boost::multi_array_ref::array_view for slicing purposes.
I cannot provide you with any benchmark results, but from my experience, I can say that boost::multi_array_ref works pretty fast.
If you can use C++17 (and so fold expressions) and row-major order, I suppose you can write something like this (caution: not tested)
template <typename ... Args>
float & operator() (Args ... dims)
{
static_assert( sizeof...(Args) == TDim , "wrong number of indexes" );
// or SFINAE enable instead of static_assert()
std::size_t pos { 0U };
std::size_t i { 0U };
( (pos *= MaxDimX[i++], pos += dims), ... ); // the operand of a fold needs its own parentheses
return internal_array[pos];
}
OTPS (Off Topic Post Scriptum): your MaxDimX, if I understand correctly, is a vector of dimension sizes, so its elements should be unsigned integers, not signed ints; std::size_t is the usual choice for indexes [see Note 1].
OTPS 2: if you know the number of dimensions at compile time (TDim, right?), I suggest using a std::array instead of a std::vector; I mean
std::array<std::size_t, TDim> MaxDimX;
-- EDIT --
If you can't use C++17, you can use the trick of the unused array initialization to obtain something similar.
I mean
template <typename ... Args>
float & operator() (Args ... dims)
{
using unused = int[];
static_assert( sizeof...(Args) == TDim , "wrong number of indexes" );
// or SFINAE enable instead of static_assert()
std::size_t pos { 0U };
std::size_t i { 0U };
(void)unused { (pos *= MaxDimX[i++], pos += dims, 0) ... };
return internal_array[pos];
}
Note 1: as pointed out by Julius, whether to use a signed or an unsigned integer for indexes is controversial.
So I will try to explain better why I suggest using an unsigned type (std::size_t, for example) for them.
The point is that (as far as I know) the whole standard library is designed to use unsigned integers for index values. You can see it in the value returned by the size() method and in the fact that access methods that take an index, such as at() or operator[], take an unsigned value.
Right or wrong, the language itself is designed to return a std::size_t from the old sizeof and from the more recent variadic sizeof...(). Likewise, std::index_sequence is an alias for std::integer_sequence with a fixed unsigned type, again std::size_t.
In a world designed to use unsigned integers for indexes, using a signed integer for them is possible but, IMHO, dangerous (because it is error prone).
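To show how the variadic operator() from this answer might sit inside a complete class, here is a minimal sketch. It is not the answer's exact code: instead of a fold expression or the unused-array trick, it expands the indices into a small local array and runs the same row-major Horner computation in a loop, which also compiles before C++17. The constructor taking a std::array of extents, the member names extents_ and data_, and size() are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Sketch: TDim is the number of dimensions, fixed at compile time;
// the extents themselves are only known at run time.
template <std::size_t TDim>
class MultiArray {
public:
    explicit MultiArray(const std::array<std::size_t, TDim>& extents)
        : extents_(extents)
    {
        std::size_t total = 1;
        for (std::size_t e : extents_) total *= e;
        data_.assign(total, 0.0f);
    }

    template <typename... Args>
    float& operator()(Args... dims)
    {
        static_assert(sizeof...(Args) == TDim, "wrong number of indexes");
        // Expand the pack into a fixed-size local array (no heap allocation).
        const std::size_t idx[] = { static_cast<std::size_t>(dims)... };
        // Row-major Horner scheme: pos = (..(i0 * n1 + i1) * n2 + i2 ..)
        std::size_t pos = 0;
        for (std::size_t d = 0; d < TDim; ++d)
            pos = pos * extents_[d] + idx[d];
        return data_[pos];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::array<std::size_t, TDim> extents_;
    std::vector<float> data_;
};
```

No bounds checking is done, in keeping with the question's request for minimal overhead per call.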
I've used this pattern several times when writing a class template for a matrix with a variable number of dimensions.
Matrix.h
#ifndef MATRIX_H
#define MATRIX_H
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>
template<typename Type, size_t... Dims>
class Matrix {
public:
static const size_t numDims_ = sizeof...(Dims);
private:
size_t numElements_;
std::vector<Type> elements_;
std::vector<size_t> strides_; // Technically this vector contains the size of each dimension... (its shape)
// actual strides would be the width in memory of each element to that dimension of the container.
// A better name for this container would be dimensionSizes_ or shape_
public:
Matrix() noexcept;
template<typename... Arg>
Matrix( Arg&&... as ) noexcept;
const Type& operator[]( size_t idx ) const;
size_t numElements() const {
return elements_.size();
}
const std::vector<size_t>& strides() const {
return strides_;
}
const std::vector<Type>& elements() const {
return elements_;
}
}; // matrix
#include "Matrix.inl"
#endif // MATRIX_H
Matrix.inl
template<typename Type, size_t... Dims>
Matrix<Type, Dims...>::Matrix() noexcept :
strides_( { Dims... } ) {
using std::begin;
using std::end;
auto mult = std::accumulate( begin( strides_ ), end( strides_ ), size_t{ 1 }, std::multiplies<>() ); // accumulate in size_t, not int
numElements_ = mult;
elements_.resize( numElements_ );
} // Matrix
template<typename Type, size_t... Dims>
template<typename... Arg>
Matrix<Type, Dims...>::Matrix( Arg&&... as ) noexcept :
elements_( { as... } ),
strides_( { Dims... } ){
numElements_ = elements_.size();
} // Matrix
template<typename T, size_t... d>
const T& Matrix<T,d...>::operator[]( size_t idx ) const {
return elements_[idx];
} // Operator[]
Matrix.cpp
#include "Matrix.h"
#include <vector>
#include <numeric>
#include <functional>
#include <algorithm>
main.cpp
#include <vector>
#include <iostream>
#include "Matrix.h"
int main() {
Matrix<int, 3, 3> mat3x3( 1, 2, 3, 4, 5, 6, 7, 8, 9 );
for ( size_t idx = 0; idx < mat3x3.numElements(); idx++ ) {
std::cout << mat3x3.elements()[idx] << " ";
}
std::cout << "\n\nnow using array index operator\n\n";
for ( size_t idx = 0; idx < mat3x3.numElements(); idx++ ) {
std::cout << mat3x3[idx] << " ";
}
std::cout << "\n\ncheck the strides\n\n";
for ( size_t idx = 0; idx < mat3x3.numDims_; idx++ ) {
std::cout << mat3x3.strides()[idx] << " ";
}
std::cout << "\n\n";
std::cout << "=================================\n\n";
Matrix<float, 5, 2, 9, 7> mf5x2x9x7;
// Check Strides
// Num Elements
// Total Size
std::cout << "The total number of dimensions are: " << mf5x2x9x7.numDims_ << "\n";
std::cout << "The total number of elements are: " << mf5x2x9x7.numElements() << "\n";
std::cout << "These are the strides: \n";
for ( size_t n = 0; n < mf5x2x9x7.numDims_; n++ ) {
std::cout << mf5x2x9x7.strides()[n] << " ";
}
std::cout << "\n";
std::cout << "The elements are: ";
for ( size_t n = 0; n < mf5x2x9x7.numElements(); n++ ) {
std::cout << mf5x2x9x7[n] << " ";
}
std::cout << "\n";
std::cout << "\nPress any key and enter to quit." << std::endl;
char c;
std::cin >> c;
return 0;
} // main
This is a simple multidimensional matrix class template whose elements all share one type T.
You can create a matrix of floats, ints, chars etc. of varying sizes such as a 2x2, 2x3, 5x3x7, 4x9x8x12x2x19. This is a very simple but versatile class.
It stores its elements in a std::vector<>, so access by index is constant time, while searching for a value is linear. The internal container grows with the product of the dimension sizes, which can "explode" fairly quickly when individual dimensions are large. For example, a 9x9x9 matrix has only 3 dimensions but 729 elements, whereas a 2x2x2x2x2 matrix has 5 dimensions but only 32 elements.
I did not include a default constructor, copy constructor, move constructor, nor any overloaded constructors that would accept either a std::container<T> or another Matrix<T,...>. This can be done as an exercise for the OP.
I also did not include any simple functions that would give the size of total elements from the main container, nor the number of total dimensions which would be the size of the strides container size. The OP should be able to implement these very simply.
As for the strides and for indexing with multi-dimensional coordinates, the OP would need to use the stride values to compute the appropriate indexes; again, I leave this as the main exercise.
EDIT - I went ahead and added a default constructor, moved some members to private section of the class, and added a few access functions. I did this because I just wanted to demonstrate in the main function the power of this class even when creating an empty container of its type.
What's more, you can take user Yakk's answer with its "stride & slice" approach and plug it into this class, which should give you the full functionality you are looking for.
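For the exercise of turning multi-dimensional coordinates into a flat index, a sketch along these lines may help. It works on the shape (the per-dimension sizes that the class above keeps in strides_) and derives the actual row-major strides from it. The function names row_major_strides and flat_index are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Row-major strides for a given shape, e.g. {5, 2, 9, 7} -> {126, 63, 7, 1}.
std::vector<std::size_t> row_major_strides(const std::vector<std::size_t>& shape)
{
    std::vector<std::size_t> strides(shape.size());
    std::size_t scale = 1;
    for (std::size_t d = shape.size(); d-- > 0; ) {
        strides[d] = scale;     // elements skipped per step along dimension d
        scale *= shape[d];
    }
    return strides;
}

// Flat offset of a multi-dimensional coordinate: sum of index * stride.
// Assumes coord has at least as many entries as strides; no bounds checking.
std::size_t flat_index(const std::vector<std::size_t>& strides,
                       const std::vector<std::size_t>& coord)
{
    std::size_t offset = 0;
    for (std::size_t d = 0; d < strides.size(); ++d)
        offset += coord[d] * strides[d];
    return offset;
}
```

The strides can be computed once at construction, so each element access only pays for the multiply-add loop.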

How can I avoid "for" loops with an "if" condition inside them with C++?

With almost all code I write, I am often dealing with set reduction problems on collections that ultimately end up with naive "if" conditions inside of them. Here's a simple example:
for(int i=0; i<myCollection.size(); i++)
{
if (myCollection[i] == SOMETHING)
{
DoStuff();
}
}
With functional languages, I can solve the problem by reducing the collection to another collection (easily) and then perform all operations on my reduced set. In pseudocode:
newCollection <- myCollection where <x=true
map DoStuff newCollection
And in other C variants, like C#, I could reduce with a where clause like
foreach (var x in myCollection.Where(c=> c == SOMETHING))
{
DoStuff();
}
Or better (at least to my eyes)
myCollection.Where(c=>c == Something).ToList().ForEach(d=> DoStuff(d));
Admittedly, I am doing a lot of paradigm mixing and subjective/opinion based style, but I can't help but feel that I am missing something really fundamental that could allow me to use this preferred technique with C++. Could someone enlighten me?
IMHO it's more straightforward and more readable to use a for loop with an if inside it. However, if this is annoying for you, you could use a for_each_if like the one below:
template<typename Iter, typename Pred, typename Op>
void for_each_if(Iter first, Iter last, Pred p, Op op) {
while(first != last) {
if (p(*first)) op(*first);
++first;
}
}
Usecase:
std::vector<int> v {10, 2, 10, 3};
for_each_if(v.begin(), v.end(), [](int i){ return i > 5; }, [](int &i){ ++i; });
Boost provides ranges that can be used with range-based for. Ranges have the advantage that they don't copy the underlying data structure; they merely provide a 'view' (that is, begin() and end() for the range, and operator++() and operator==() for the iterator). This might be of interest: http://www.boost.org/libs/range/doc/html/range/reference/adaptors/reference/filtered.html
#include <boost/range/adaptor/filtered.hpp>
#include <iostream>
#include <vector>
struct is_even
{
bool operator()( int x ) const { return x % 2 == 0; }
};
int main(int argc, const char* argv[])
{
using namespace boost::adaptors;
std::vector<int> myCollection{1,2,3,4,5,6,7,8,9};
for( int i: myCollection | filtered( is_even() ) )
{
std::cout << i;
}
}
Instead of creating a new algorithm, as the accepted answer does, you can use an existing one with a function that applies the condition:
std::for_each(first, last, [](auto&& x){ if (cond(x)) { ... } });
Or if you really want a new algorithm, at least reuse for_each there instead of duplicating the iteration logic:
template<typename Iter, typename Pred, typename Op>
void
for_each_if(Iter first, Iter last, Pred p, Op op) {
std::for_each(first, last, [&](auto& x) { if (p(x)) op(x); });
}
The idea of avoiding
for(...)
if(...)
constructs as an antipattern is too broad.
It is completely fine to process multiple items that match a certain expression from inside a loop, and the code cannot get much clearer than that. If the processing grows too large to fit on screen, that is a good reason to use a subroutine, but still the conditional is best placed inside the loop, i.e.
for(...)
if(...)
do_process(...);
is vastly preferable to
for(...)
maybe_process(...);
It becomes an antipattern when only one element will match, because then it would be clearer to first search for the element, and perform the processing outside of the loop.
for(int i = 0; i < size; ++i)
if(i == 5)
is an extreme and obvious example of this. More subtle, and thus more common, is a factory pattern like
for(creator &c : creators)
if(c.name == requested_name)
{
unique_ptr<object> obj = c.create_object();
obj->owner = this;
return std::move(obj);
}
This is hard to read, because it isn't obvious that the body code will be executed once only. In this case, it would be better to separate the lookup:
creator &lookup(string const &requested_name)
{
for(creator &c : creators)
if(c.name == requested_name)
return c;
}
creator &c = lookup(requested_name);
unique_ptr<object> obj = c.create_object();
There is still an if within a for, but from the context it becomes clear what it does, there is no need to change this code unless the lookup changes (e.g. to a map), and it is immediately clear that create_object() is called only once, because it is not inside a loop.
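The separated lookup can also be written with std::find_if, which makes the "not found" case explicit. The following is only a sketch: the creator and object types are hypothetical stand-ins for the ones in this answer, and lookup returns a pointer (nullptr when nothing matches) instead of a reference.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the types used in the answer.
struct object { std::string made_by; };
struct creator {
    std::string name;
    std::unique_ptr<object> create_object() const {
        return std::unique_ptr<object>(new object{name});
    }
};

// Returns a pointer to the matching creator, or nullptr if there is none.
const creator* lookup(const std::vector<creator>& creators,
                      const std::string& requested_name)
{
    auto it = std::find_if(creators.begin(), creators.end(),
        [&](const creator& c) { return c.name == requested_name; });
    return it != creators.end() ? &*it : nullptr;
}

// The single call to create_object() sits outside any loop.
std::unique_ptr<object> make(const std::vector<creator>& creators,
                             const std::string& requested_name)
{
    const creator* c = lookup(creators, requested_name);
    if (!c)
        return nullptr;
    return c->create_object();
}
```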
Here is a quick relatively minimal filter function.
It takes a predicate. It returns a function object that takes an iterable.
It returns an iterable that can be used in a for(:) loop.
template<class It>
struct range_t {
It b, e;
It begin() const { return b; }
It end() const { return e; }
bool empty() const { return begin()==end(); }
};
template<class It>
range_t<It> range( It b, It e ) { return {std::move(b), std::move(e)}; }
template<class It, class F>
struct filter_helper:range_t<It> {
F f;
void advance() {
while(true) {
(range_t<It>&)*this = range( std::next(this->begin()), this->end() );
if (this->empty())
return;
if (f(*this->begin()))
return;
}
}
filter_helper(range_t<It> r, F fin):
range_t<It>(r), f(std::move(fin))
{
while(true)
{
if (this->empty()) return;
if (f(*this->begin())) return;
(range_t<It>&)*this = range( std::next(this->begin()), this->end() );
}
}
};
template<class It, class F>
struct filter_psuedo_iterator {
using iterator_category=std::input_iterator_tag;
filter_helper<It, F>* helper = nullptr;
bool m_is_end = true;
bool is_end() const {
return m_is_end || !helper || helper->empty();
}
void operator++() {
helper->advance();
}
typename std::iterator_traits<It>::reference
operator*() const {
return *(helper->begin());
}
It base() const {
if (!helper) return {};
if (is_end()) return helper->end();
return helper->begin();
}
friend bool operator==(filter_psuedo_iterator const& lhs, filter_psuedo_iterator const& rhs) {
if (lhs.is_end() && rhs.is_end()) return true;
if (lhs.is_end() || rhs.is_end()) return false;
return lhs.helper->begin() == rhs.helper->begin();
}
friend bool operator!=(filter_psuedo_iterator const& lhs, filter_psuedo_iterator const& rhs) {
return !(lhs==rhs);
}
};
template<class It, class F>
struct filter_range:
private filter_helper<It, F>,
range_t<filter_psuedo_iterator<It, F>>
{
using helper=filter_helper<It, F>;
using range=range_t<filter_psuedo_iterator<It, F>>;
using range::begin; using range::end; using range::empty;
filter_range( range_t<It> r, F f ):
helper{{r}, std::forward<F>(f)},
range{ {this, false}, {this, true} }
{}
};
template<class F>
auto filter( F&& f ) {
return [f=std::forward<F>(f)](auto&& r)
{
using std::begin; using std::end;
using iterator = decltype(begin(r));
return filter_range<iterator, std::decay_t<decltype(f)>>{
range(begin(r), end(r)), f
};
};
};
I took shortcuts. A real library should make real iterators, not the for(:)-qualifying pseudo-facades I did.
At point of use, it looks like this:
int main()
{
std::vector<int> test = {1,2,3,4,5};
for( auto i: filter([](auto x){return x%2;})( test ) )
std::cout << i << '\n';
}
which is pretty nice, and prints
1
3
5
There is a proposed addition to C++, prototyped in the range-v3 library, which does this kind of thing and more. Boost also has filter ranges/iterators available, as well as helpers that make writing the above much shorter.
One style that gets used enough to mention, but hasn't been mentioned yet, is:
for(int i=0; i<myCollection.size(); i++) {
if (myCollection[i] != SOMETHING)
continue;
DoStuff();
}
Advantages:
Doesn't change the indentation level of DoStuff(); when condition complexity increases. Logically, DoStuff(); should be at the top-level of the for loop, and it is.
Immediately makes it clear that the loop iterates over the SOMETHINGs of the collection, without requiring the reader to verify that there is nothing after the closing } of the if block.
Doesn't require any libraries or helper macros or functions.
Disadvantages:
continue, like other flow control statements, gets misused in ways that lead to hard-to-follow code so much that some people are opposed to any use of them: there is a valid style of coding that some follow that avoids continue, that avoids break other than in a switch, that avoids return other than at the end of a function.
for(auto const &x: myCollection) if(x == something) doStuff();
Looks pretty much like a C++-specific for comprehension to me. To you?
If DoStuff() were to depend on i somehow in the future, I'd propose this bit-masking variant, which is intended to be branch-free (whether it really is depends on the code the compiler generates).
unsigned int times = 0;
const int kSize = sizeof(unsigned int)*8;
for(int i = 0; i < myCollection.size()/kSize; i++){
unsigned int mask = 0;
for (int j = 0; j<kSize; j++){
mask |= (myCollection[i*kSize+j]==SOMETHING) << j;
}
times+=popcount(mask);
}
for(int i=0;i<times;i++)
DoStuff();
Where popcount is any function doing a population count (counting the bits set to 1). Note that the outer loop only covers full blocks of kSize elements, so any remaining tail elements need separate handling. There is some freedom to put more advanced constraints on i and its neighbors. If that is not needed, we can strip the inner loop and rewrite the outer loop as
for(int i = 0; i < myCollection.size(); i++)
times += (myCollection[i]==SOMETHING);
followed by a
for(int i=0;i<times;i++)
DoStuff();
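The simplified counting loop is what std::count already does, so the two-pass idea can be sketched as below. The function name call_per_match and the callback parameter do_stuff are made up for this example, and whether the counting pass compiles to branch-free code is up to the compiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts the matches in one pass; the callback do_stuff is then invoked that
// many times, with no per-element condition in the second loop.
template <typename T, typename F>
std::size_t call_per_match(const std::vector<T>& collection, const T& something, F do_stuff)
{
    const auto times = static_cast<std::size_t>(
        std::count(collection.begin(), collection.end(), something));
    for (std::size_t i = 0; i < times; ++i)
        do_stuff();
    return times;
}
```

Unlike the blocked bit-mask loop, this covers every element, including a tail shorter than one block.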
Also, if you don't mind reordering the collection, std::partition is cheap.
#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>
void DoStuff(int i)
{
std::cout << i << '\n';
}
int main()
{
using namespace std::placeholders;
std::vector<int> v {1, 2, 5, 0, 9, 5, 5};
const int SOMETHING = 5;
std::for_each(v.begin(),
std::partition(v.begin(), v.end(),
std::bind(std::equal_to<int> {}, _1, SOMETHING)), // some condition
DoStuff); // action
}
I am in awe of the complexity of the above solutions. I was going to suggest a simple #define foreach(a,b,c,d) for(a; b; c)if(d) but it has a few obvious deficits, for example, you have to remember to use commas instead of semicolons in your loop, and you can't use the comma operator in a or c.
#include <list>
#include <iostream>
using namespace std;
#define foreach(a,b,c,d) for(a; b; c)if(d)
int main(){
list<int> a;
for(int i=0; i<10; i++)
a.push_back(i);
for(auto i=a.begin(); i!=a.end(); i++)
if((*i)&1)
cout << *i << ' ';
cout << endl;
foreach(auto i=a.begin(), i!=a.end(), i++, (*i)&1)
cout << *i << ' ';
cout << endl;
return 0;
}
Another solution in case the indices i are important. This one builds a list of the indices for which doStuff() should be called. Once again the main point is to avoid the branching and trade it for pipelineable arithmetic costs.
int buffer[someSafeSize];
int cnt = 0; // counter to keep track where we are in list.
for( int i = 0; i < container.size(); i++ ){
int lDecision = (container[i] == SOMETHING);
buffer[cnt] = lDecision*i + (1-lDecision)*buffer[cnt];
cnt += lDecision;
}
for( int i=0; i<cnt; i++ )
doStuff(buffer[i]); // now we could pass the index or a pointer as an argument.
The "magical" line is the buffer-loading line, which arithmetically calculates whether to keep the old value and stay in position, or to store the new value and advance the position. So we trade away a potential branch for some logic and arithmetic and maybe some cache hits. A typical scenario where this would be useful is when doStuff() does a small amount of pipelineable calculation and any branch in between calls could interrupt those pipelines.
Then just loop over the buffer and run doStuff() until we reach cnt. This time we will have the current i stored in the buffer so we can use it in the call to doStuff() if we would need to.
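A slightly different way to get the same effect is to store the index unconditionally and let only the write position depend on the comparison. This is a variant of the code above, not the same arithmetic: it avoids reading a buffer slot before anything was written to it. The function name matching_indices is made up for this sketch, and whether the compiler emits branch-free code for it is not guaranteed.

```cpp
#include <cstddef>
#include <vector>

// Collects the indices i with container[i] == something.
// The store is unconditional; only the write position depends on the comparison.
template <typename T>
std::vector<std::size_t> matching_indices(const std::vector<T>& container, const T& something)
{
    // cnt never exceeds i at the time of the store, so size() slots are enough.
    std::vector<std::size_t> buffer(container.size());
    std::size_t cnt = 0;
    for (std::size_t i = 0; i < container.size(); ++i) {
        buffer[cnt] = i;                                            // always written
        cnt += static_cast<std::size_t>(container[i] == something); // adds 0 or 1
    }
    buffer.resize(cnt);  // keep only the slots that were committed
    return buffer;
}
```

The caller can then loop over the returned indices and call doStuff() on each one.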
One can describe your code pattern as applying some function to a subset of a range, or in other words: applying it to the result of applying a filter to the whole range.
This is achievable in the most straightforward manner with Eric Niebler's range-v3 library, although it's a bit of an eyesore, because you want to work with indices:
using namespace ranges;
auto mycollection_has_something =
[&](std::size_t i) { return myCollection[i] == SOMETHING; };
auto filtered_view =
views::iota(std::size_t{0}, myCollection.size()) |
views::filter(mycollection_has_something);
for (auto i : filtered_view) { DoStuff(); }
But if you're willing to forgo indices, you'd get:
auto is_something = [&SOMETHING](const decltype(SOMETHING)& x) { return x == SOMETHING; };
auto filtered_collection = myCollection | views::filter(is_something);
for (const auto& x : filtered_collection) { DoStuff(); }
which is nicer IMHO.
PS - The ranges library is mostly going into the C++ standard in C++20.
I'll just mention Mike Acton, he would definitely say:
If you have to do that, you have a problem with your data. Sort your data!
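One possible reading of that advice, sketched here under the assumption that the collection can be kept sorted: the matching elements then form one contiguous run, which std::equal_range finds without testing a condition per element. The function name for_each_equal and the callback parameter do_stuff are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumes `sorted` is already sorted in ascending order. Applies do_stuff to
// each element equal to `something` and returns how many there were.
template <typename T, typename F>
std::size_t for_each_equal(const std::vector<T>& sorted, const T& something, F do_stuff)
{
    auto range = std::equal_range(sorted.begin(), sorted.end(), something);
    std::for_each(range.first, range.second, do_stuff);
    return static_cast<std::size_t>(range.second - range.first);
}
```

Sorting has its own cost, so this only pays off when the data is queried many times between changes.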

int[n][m], where n and m are known at runtime

I often need to create a 2D array whose width and height (call them n and m) are unknown at compile time. Usually I write:
vector<int> arr(n * m);
And I access elements manually with :
arr[j * m + i]
I recently got told that I could instead write :
int arr[n][m] // n and m still only known at runtime.
So here are 2 questions:
Is this behaviour allowed by the C++ Standard?
How should I pass such an array to a function? g++ reports that arr has type int (*)[n], but again, n is dynamic and not known outside the function where it is declared (main).
The feature you are asking about (where the dimensions are only known at runtime) is a non-standard extension in C++, but a standard feature of C99 (made optional in C11). The feature is called a variable-length array (VLA), and it is described in the GCC documentation.
If you are using GCC, then you are to pass the length of the array as a parameter to the function.
void foo (int m, int arr[][m]) {
//...
}
However, there seems to be a bug in either the compiler or the documentation, as the above function prototype syntax only works when compiling C code, not C++ (as of gcc version 4.8.2). The only workaround I found was to use a void * parameter and cast it in the function body:
int foo_workaround (int m, void *x)
{
int (*arr)[m] = static_cast<int (*)[m]>(x);
//...
}
There are other solutions if you do not want to rely on a compiler extension. If you don't mind a separate allocation for each row, you can use a vector of vectors, for example:
std::vector<std::vector<int> > arr(n, std::vector<int>(m));
However, if you want a single allocation block like you demonstrated in your own example, then it is better to create a wrapper class around vector to give you 2-d like syntax.
template <typename T>
class vector2d {
int n_;
int m_;
std::vector<T> vec_;
template <typename I>
class vector2d_ref {
typedef std::iterator_traits<I> TRAITS;
typedef typename TRAITS::value_type R_TYPE;
template <typename> friend class vector2d;
I p_;
vector2d_ref (I p) : p_(p) {}
public:
R_TYPE & operator [] (int j) { return *(p_+j); }
};
typedef std::vector<T> VEC;
typedef vector2d_ref<typename VEC::iterator> REF;
typedef vector2d_ref<typename VEC::const_iterator> CREF;
template <typename I>
vector2d_ref<I> ref (I p, int i) const { return p + (i * m_); }
public:
vector2d (int n, int m) : n_(n), m_(m), vec_(n*m) {}
REF operator [] (int i) { return ref(vec_.begin(), i); }
CREF operator [] (int i) const { return ref(vec_.begin(), i); }
};
The wrapper's operator[] returns an intermediate object that also overloads operator[] to allow 2-dimensional array syntax when using the wrapper.
vector2d<int> v(n, m);
v[i][j] = 7;
std::cout << v[i][j] << '\n';
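On the second question (how to pass such an array to a function): once the dimensions live inside a wrapper object, the function can take it by reference and needs no separate size parameters. Here is a minimal sketch using a simpler, hypothetical Grid2D class with operator()(i, j) rather than the proxy-based vector2d above. The names Grid2D and sum_all are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal wrapper: one contiguous allocation, row-major layout.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Element (i, j) lives at offset i * cols + j.
    int& operator()(std::size_t i, std::size_t j)
    {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }
    const int& operator()(std::size_t i, std::size_t j) const
    {
        assert(i < rows_ && j < cols_);
        return data_[i * cols_ + j];
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;
};

// The dimensions travel with the object, so no size arguments are needed.
long sum_all(const Grid2D& g)
{
    long total = 0;
    for (std::size_t i = 0; i < g.rows(); ++i) {
        for (std::size_t j = 0; j < g.cols(); ++j) {
            total += g(i, j);
        }
    }
    return total;
}
```

The operator() form trades the familiar v[i][j] syntax for a much shorter class, which is a matter of taste.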
Why not have an std::vector of std::vector's?
std::vector<std::vector<int> > arr(n, std::vector<int>(m));
Accessing an item then becomes:
std::cout << "(2,1) = " << arr[2][1] << std::endl;
A std::vector of std::vector's (from #include <vector>) would do the same thing as a 2-Dimensional array:
int n = 10, m = 10; //vector dimensions
std::vector<std::vector<int>> arr(n, std::vector<int>(m)); //Create an n-by-m 2D vector (each row is a separately allocated std::vector<int>)
//you can get values from 2D vector the same way that you can for arrays
int a = 5, b = 5, value = 12345;
arr[a][b] = value;
std::cout << "The element at position (" << a << ", " << b << ") is " << arr[a][b] << "." << std::endl;
outputs:
The element at position (5, 5) is 12345.

Loop from one integer to another regardless of direction, with minimal overhead

Assume I'm given two unsigned integers:
size_t A, B;
They're loaded out with some random numbers, and A may be larger, equal, or smaller than B. I want to loop from A to B. However, the comparison and increment both depend on which is larger.
for (size_t i = A; i <= B; ++i) //A <= B
for (size_t i = A; i >= B; --i) //A >= B
The obvious brute force solution is to embed these in if statements:
if (A <= B)
{
for (size_t i = A; i <= B; ++i) ...
}
else
{
for (size_t i = A; i >= B; --i) ...
}
Note that I must loop from A to B, so I can't have two intermediate integers and toss A and B into the right slots then have the same comparison and increment. In the "A is larger" case I must decrement, and the opposite must increment.
I'm going to have potentially many nested loops that require this same setup, which means every if/else will have a function call, which I have to pass lots of variables through, or another if/else with another if/else etc.
Is there any tricky shortcut to avoid this without sacrificing much speed? Function pointers and stuff in a tight, often repeated loop sound extremely painful to me. Is there some crazy templates solution?
My mistake, originally misinterpreting the question.
To make an inclusive loop from A to B, you have a tricky situation. You need to loop one past B. So you work out that value prior to your loop. I've used the comma operator inside the for loop, but you can always put it outside for clarity.
int direction = (A < B) ? 1 : -1;
for( size_t i = A, iEnd = B+direction; i != iEnd; i += direction ) {
...
}
If you don't mind modifying A and B, you can do this instead (using A as the loop variable):
for( B += direction; A != B; A += direction ) {
}
And I had a play around... Don't know what the inlining rules are when it comes to function pointers, or whether this is any faster, but it's an exercise in any case. =)
inline const size_t up( size_t& val ) { return val++; }
inline const size_t down( size_t& val ) { return val--; }
typedef const size_t (*FnIncDec)( size_t& );
inline FnIncDec up_or_down( size_t A, size_t B )
{
return (A <= B) ? up : down;
}
int main( void )
{
size_t A = 4, B = 1;
FnIncDec next = up_or_down( A, B );
for( next(B); A != B; next(A) ) {
std::cout << A << std::endl;
}
return 0;
}
In response to this:
This won't work for case A = 0, B = UINT_MAX (and vice versa)
That is correct. The problem is that the initial values of i and iEnd become the same because of wraparound. To handle that, you would instead use a do-while loop. That removes the initial test, which is redundant because you will always execute the loop body at least once. By removing that first test, you step past the terminating condition the first time around.
size_t i = A;
size_t iEnd = B+direction;
do {
// ...
i += direction;
} while( i != iEnd );
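To make the wraparound case easy to check by hand, here is a sketch of the same do-while loop using std::uint8_t in place of size_t, so the full range is only 256 values. The function name visit_inclusive is hypothetical, and the explicit casts are there only because the narrow type is promoted to int in arithmetic.

```cpp
#include <cstdint>
#include <vector>

// Visit every value from a to b inclusive, in either direction, using the
// do-while form. std::uint8_t stands in for size_t (an assumption made so
// that the wraparound case is small enough to verify).
std::vector<int> visit_inclusive(std::uint8_t a, std::uint8_t b)
{
    const int direction = (a < b) ? 1 : -1;
    // One past b in the direction of travel; wraps modulo 256.
    const std::uint8_t end = static_cast<std::uint8_t>(b + direction);

    std::vector<int> visited;
    std::uint8_t i = a;
    do {
        visited.push_back(i);
        i = static_cast<std::uint8_t>(i + direction);
    } while (i != end);
    return visited;
}
```

For a = 0 and b = 255, end wraps to 0, which equals the starting value. A for loop would therefore exit before the first iteration, while the do-while visits all 256 values before i returns to 0.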
size_t const delta = size_t(A < B? 1 : -1);
size_t i = A;
for( ;; )
{
// blah
if( i == B ) { break; }
i += delta;
}
What are you going to do with the iterated value?
If this is going to be some index in an array, you should use the relevant iterator or reverse_iterator class, and implement your algorithms around these. Your code will be more robust, and easier to maintain or evolve. Besides, a lot of tools in the standard library are built using these interfaces.
Actually, even if you don't, you may implement an iterator class which returns its own index.
You can also use a little bit of metaprogramming magic to define how your iterator will behave according to the order of A and B.
Before going further, please consider that this would only work on constant values of A and B.
template <int A,int B>
struct ordered {
static const bool value = A > B ? false: true;
};
template <bool B>
int pre_incr(int &v){
return ++v;
}
template <>
int pre_incr<false>(int &v){
return --v;
}
template <int A, int B>
class const_int_iterator : public iterator<input_iterator_tag, const int>
{
int p;
public:
typedef const_int_iterator<A,B> self_type;
const_int_iterator() : p(A) {}
const_int_iterator(int s) : p(s) {}
const_int_iterator(const self_type& mit) : p(mit.p) {}
self_type& operator++() {pre_incr< ordered<A,B>::value >(p);return *this;}
self_type operator++(int) {self_type tmp(*this); operator++(); return tmp;}
bool operator==(const self_type& rhs) {return p==rhs.p;}
bool operator!=(const self_type& rhs) {return p!=rhs.p;}
const int& operator*() {return p;}
};
template <int A, int B>
class iterator_factory {
public:
typedef const_int_iterator<A,B> iterator_type;
static iterator_type begin(){
return iterator_type();
}
static iterator_type end(){
return iterator_type(B);
}
};
In the code above, I defined a barebones iterator class going across the values from A to B. There's a simple metaprogramming test to determine whether A and B are in ascending order, which picks the correct operator (++ or --) to go through the values.
Finally, I also defined a simple factory class to hold the begin and end iterator methods. Using this class gives you a single point of declaration for the dependent values A and B: you only need to name A and B once for this container, and the iterators generated from it depend on those same values, which simplifies the code somewhat.
Here I provide a simple test program, outputting values from 20 to 11.
#define A 20
#define B 10
typedef iterator_factory<A,B> factory;
int main(){
auto it = factory::begin();
for (;it != factory::end();it++)
cout << "iterator is : " << *it << endl;
}
There might be better ways of doing this with the standard library though.
The issue of using 0 and UINT_MAX for A and B was brought up. I think it should be possible to handle these cases by overloading the templates using these particular values (left as an exercise for the reader).
size_t A, B;
if (A > B) swap(A,B); // Ensure A <= B by swapping if necessary
for (size_t i = A; i <= B; ++i) ...