How to elegantly extract a 2D rectangular region from a C++ vector - c++

The problem is pretty basic.
(I am puzzled why the search didn't find anything.)
I have a rectangular "picture" that stores its pixel colors line
after line in a std::vector.
I want to copy a rectangular region out of that picture.
How would I elegantly code this in C++?
My first try:
template <class T>
std::vector<T> copyRectFromVector(const std::vector<T>& vec, std::size_t startx, std::size_t starty,
                                  std::size_t endx, std::size_t endy, std::size_t fieldWidth, std::size_t fieldHeight)
{
    using namespace std;
    vector<T> ret((endx-startx)*(endy-starty)+10); // 10: chickenfactor
    // checks if the given parameters make sense:
    if (vec.size() < fieldWidth*endy)
    {
        cerr << "Error: CopyRectFromVector: vector too small to contain rectangular region!" << std::endl;
        return ret;
    }
    // do the copying line by line:
    vector<T>::const_iterator vecIt = vec.begin();
    vector<T>::forward_iterator retIt = ret.end();
    vecIt += startx + (starty*fieldWidth);
    for(int i=starty; i < endy; ++i)
    {
        std::copy(vecIt, vecIt + endx - startx, retIt);
    }
    return ret;
}
does not even compile.....
Addit: Clarification:
I know how to do this "by hand". It is not a problem as such. But I would love some c++ stl iterator magic that does the same, but faster and... more c++ stylish.
Addition: I give the algorithm the pictureDataVector, the width and height of the picture and a rectangle denoting the region that I want to copy out of the picture.
The return value should be a new vector with the contents of the rectangle.
Think of it as opening your favorite image editor, and copy a rectangular region out of that.
The picture is stored as one long 1D array (vector) of pixel colors.

for (int r = startRow; r < endRow; r++)
    for (int c = startCol; c < endCol; c++)
        rect[r-startRow][c-startCol] = source[r*rowWidth+c];

Basically the same idea, except that it compiles and is a bit more iteratory:
#include <vector>
#include <algorithm>
#include <iostream>
#include <iterator>
template <typename I, typename O>
void copyRectFromBiggerRect(
    I input,
    O output,
    std::size_t startx,
    std::size_t cols,
    std::size_t starty,
    std::size_t rows,
    std::size_t stride
) {
    std::advance(input, starty*stride + startx);
    while(rows--) {
        std::copy(input, input+cols, output);
        std::advance(input, stride);
    }
}

template<typename T>
std::vector<T> copyRectFromVector (
    const std::vector<T> &vec,
    std::size_t startx,
    std::size_t starty,
    std::size_t endx,
    std::size_t endy,
    std::size_t stride
) {
    // parameter-checking omitted: you could also check endx > startx etc.
    const std::size_t cols = endx - startx;
    const std::size_t rows = endy - starty;
    std::vector<T> ret;
    ret.reserve(rows*cols);
    std::back_insert_iterator<std::vector<T> > output(ret);
    typename std::vector<T>::const_iterator input = vec.begin();
    copyRectFromBiggerRect(input, output, startx, cols, starty, rows, stride);
    return ret;
}

int main() {
    std::vector<int> v(20);
    for (int i = 0; i < 20; ++i) v[i] = i;
    std::vector<int> v2 = copyRectFromVector(v, 0, 0, 1, 2, 4);
    std::copy(v2.begin(), v2.end(), std::ostream_iterator<int>(std::cout, "\n"));
}
I wouldn't expect this to be any faster than two loops copying by index. Probably slower, even, although it's basically a race between the overhead of vector::push_back and the gain from std::copy over a loop.
However, it might be more flexible if your other template code is designed to work with iterators in general, rather than vector as a specific container. copyRectFromBiggerRect can use an array, a deque, or even a list as input just as easily as a vector, although it's currently not optimal for iterators which aren't random-access, since it currently advances through each copied row twice.
For other ways of making this more like idiomatic C++ code, consider boost::multi_array for multi-dimensional arrays (in which case the implementation would be completely different from this anyway), and avoid returning collections such as vector by value: first, it can be inefficient if you don't get the return-value optimisation, and second, it keeps control over what resources are allocated at the highest level possible.
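For reference, a rough sketch of what the boost::multi_array route might look like (this assumes Boost is available; the image dimensions and the rectangle are made up for the example). A view selects the rectangle without copying anything, and a second multi_array can be assigned from the view when an owning copy is actually needed:

#include <boost/multi_array.hpp>
#include <iostream>

int main() {
    typedef boost::multi_array<int, 2> image_t;
    typedef image_t::array_view<2>::type view_t;
    typedef boost::multi_array_types::index_range range;

    image_t img(boost::extents[480][640]);   // height x width
    img[10][20] = 42;

    // A view of rows [10,50) and columns [20,60): no pixels are copied.
    view_t region = img[boost::indices[range(10, 50)][range(20, 60)]];

    // Assigning the view to a multi_array of the same shape makes a real copy.
    image_t copy(boost::extents[40][40]);
    copy = region;

    std::cout << copy[0][0] << "\n";   // prints 42
}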

Your question asks for a C++ way of copying a rectangular field of elements in some container. You have a fairly close example of doing so and will get more in the answers. Let's generalize, though:
You want an iterator that travels a rectangular range of elements over some range of elements. So, how about writing a sort of adapter that sits on top of any container and provides this special iterator?
Gonna go broad strokes with the code here:
vector<pixels> my_picture;
point selTopLeft(10,10), selBotRight(40, 50);
int picWidth(640), picHeight(480);

rectangular_selection<vector<pixels> > selection1(my_picture.begin(),
    my_picture.end(), picWidth, picHeight, selTopLeft, selBotRight);

// Now you can use stl algorithms on your rectangular range
vector<pixels> rect_copy(selection1.begin(), selection1.end());

// or maybe you don't want to copy, you want
// to modify the selection in place
std::for_each (selection1.begin(), selection1.end(), invert_color);
I'm sure this is totally do-able, but I'm not comfortable coding stl-style template stuff off-the-cuff. If I have some time and you're interested, I may re-edit a rough-draft later, since this is an interesting concept.
See this SO question's answer for inspiration.
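In that spirit, here is a rough sketch of what the adapter's iterator could look like (rect_iterator and the numbers in main are illustrative, not from the original answer): a forward iterator that walks a rectangle of a row-major buffer, so range constructors and standard algorithms work directly on the selection.

#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <cstddef>

template <typename It>
class rect_iterator {
    It base_;                       // start of the whole image
    std::size_t stride_;            // image width
    std::size_t startx_, cols_;     // left edge and width of the rectangle
    std::size_t row_, col_;         // current position inside the rectangle
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef typename std::iterator_traits<It>::value_type value_type;
    typedef typename std::iterator_traits<It>::reference reference;
    typedef typename std::iterator_traits<It>::pointer pointer;
    typedef std::ptrdiff_t difference_type;

    rect_iterator(It base, std::size_t stride, std::size_t startx,
                  std::size_t cols, std::size_t row, std::size_t col)
        : base_(base), stride_(stride), startx_(startx),
          cols_(cols), row_(row), col_(col) {}

    reference operator*() const { return *(base_ + row_ * stride_ + startx_ + col_); }

    rect_iterator& operator++() {
        if (++col_ == cols_) { col_ = 0; ++row_; }   // wrap to the next row
        return *this;
    }
    rect_iterator operator++(int) { rect_iterator t(*this); ++*this; return t; }

    bool operator==(const rect_iterator& o) const { return row_ == o.row_ && col_ == o.col_; }
    bool operator!=(const rect_iterator& o) const { return !(*this == o); }
};

int main() {
    std::vector<int> img(6 * 4);                       // 6 wide, 4 tall
    for (std::size_t i = 0; i < img.size(); ++i) img[i] = int(i);

    // Rectangle: columns [1,3) of rows [1,3).
    rect_iterator<std::vector<int>::iterator> first(img.begin(), 6, 1, 2, 1, 0);
    rect_iterator<std::vector<int>::iterator> last (img.begin(), 6, 1, 2, 3, 0);

    std::vector<int> selection(first, last);           // copy the rectangle
    std::copy(selection.begin(), selection.end(),
              std::ostream_iterator<int>(std::cout, " "));   // prints: 7 8 13 14
}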

Good C++ code must first be easy to read and understand (just like any code), object oriented (like any code in an object oriented language) and then should use the language facilities to simplify the implementation.
I would not worry about using STL algorithms to make it look more C++-ish; it would be much better to start by simplifying the usability (interface) in an object oriented way. Do not use plain vectors externally to represent your images. Provide a level of abstraction: make a class that represents the image and provide the functionality you need there. That will improve usability by encapsulating details away from regular use (the 2D area object can know its dimensions, so the user does not need to pass them as arguments). And this will make the code more robust, as the user can make fewer mistakes.
Even if you use STL containers, always consider readability first. If it is simpler to implement in terms of a regular for loop and it will be harder to read with STL algorithms, forget them: make your code simple and maintainable.
That should be your focus: making better, simpler, more readable code. Use language features to improve your code, not your code to exercise or show off the features in the language. It will pay off if you need to maintain that code two months from now.
Note: Using more STL will not make your code more idiomatic in C++, and I believe this is one of those cases. Abusing STL can make the code actually worse.

Related

C++ vectorizing a vector of vectors

I have some code that uses a vector<vector<>> to store calculation results.
Through benchmarking, I have found that this is preventing my code from vectorizing, even though I am accessing the elements with the appropriate C-stride.
I am trying to come up with a data structure that will vectorize and improve my code's performance.
I read a few posts on here, and several of them mentioned creating a class that has 2 separate vectors inside: 1 for storing the data contiguously, and another for storing indices marking the beginning of a new column/row from the original 2D vector<vector>. Essentially, it would decompose the 2D array into a 1D array, and use the "helper" vector to allow for proper indexing.
My concern is that I have also read that vectorization doesn't usually happen with indirect indexing like this, such as in common compressed row storage scheme for sparse matrices.
Before I go through all the work of implementing this, has anybody run into this problem before and solved it? Any other suggestions or resources that could help?
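For reference, a minimal sketch of the two-vector layout described above (FlatRows and its members are illustrative names): the elements live in one contiguous vector, and a second vector of offsets marks where each row starts, which also handles rows of different lengths.

#include <vector>
#include <cstddef>

template <typename T>
class FlatRows {
    std::vector<T> values;                 // all elements, stored contiguously
    std::vector<std::size_t> offsets;      // offsets[r] = start of row r; one extra entry at the end
public:
    FlatRows() : offsets(1, 0) {}

    void add_row(const std::vector<T>& row) {
        values.insert(values.end(), row.begin(), row.end());
        offsets.push_back(values.size());
    }

    T& at(std::size_t r, std::size_t c)       { return values[offsets[r] + c]; }
    std::size_t row_size(std::size_t r) const { return offsets[r + 1] - offsets[r]; }
};

The inner loops then run over values with unit stride, which is the access pattern auto-vectorizers generally like; the offsets vector is only touched once per row, not once per element.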
I wrote a small matrix class based on std::vector:
#include <vector>
template <typename T>
class MyMatrix {
public:
    typedef T value_type;

    struct RowPointer {
        int row_index;
        MyMatrix* parent;
        RowPointer(int r, MyMatrix* p) : row_index(r), parent(p) {}
        T& operator[](int col_index) {
            return parent->values[row_index*parent->cols + col_index];
        }
    };

    MyMatrix() : rows(0), cols(0), size(0), values() {}
    MyMatrix(int r, int c) : rows(r), cols(c), size(r*c), values(std::vector<T>(size)) {}

    RowPointer operator[](int row_index) { return RowPointer(row_index, this); }

private:
    size_t rows;
    size_t cols;
    size_t size;
    std::vector<T> values;
};
It can be used like this:
MyMatrix<double> mat = MyMatrix<double>(4,6);
mat[1][2] = 3;
std::cout << mat[0][0] << " " << mat[1][2] << std::endl;
It still misses lots of stuff, but I think it is enough to illustrate the idea of flattening the matrix. From your question it was not 100% clear; if your rows have different sizes, the access pattern is a little bit more complicated.
PS: I don't want to change the answer anymore, but I would never use a std::vector to construct a matrix again. Vectors offer flexibility that is not required for a matrix, which usually has the same, fixed number of entries in each row.

Slow performance of sparse matrix using std::vector

I'm trying to implement the functionality of MATLAB function sparse.
Insert a value into the sparse matrix at a specific index such that:
If a value with the same index is already present in the matrix, the new and old values are added.
Else the new value is appended to the matrix.
The function addNode performs correctly but the problem is that it is extremely slow. I call this function in a loop about 100000 times and the program takes more than 3 minutes to run. While MATLAB accomplishes this task in a matter of seconds. Is there any way to optimize the code or use stl algorithms instead of my own function to achieve what I want?
Code:
#include <vector>

struct SparseMatNode
{
    int x;
    int y;
    float value;
};

std::vector<SparseMatNode> SparseMatrix;

void addNode(int x, int y, float val)
{
    SparseMatNode n;
    n.x = x;
    n.y = y;
    n.value = val;

    bool alreadyPresent = false;
    int i = 0;
    for(i=0; i<SparseMatrix.size(); i++)
    {
        if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y))
        {
            alreadyPresent = true;
            break;
        }
    }

    if(alreadyPresent)
    {
        SparseMatrix[i].value += val;
        if(SparseMatrix[i].value == 0.0f)
            SparseMatrix.erase(SparseMatrix.begin() + i);
    }
    else
        SparseMatrix.push_back(n);
}
Sparse matrices aren't typically stored as a vector of triplets as you are attempting.
MATLAB (as well as many other libraries) uses a Compressed Sparse Column (CSC) data structure, which is very efficient for static matrices. The MATLAB function sparse also does not build the matrix one entry at a time (as you are attempting) - it takes an array of triplet entries and packs the whole sequence into a CSC matrix. If you are attempting to build a static sparse matrix this is the way to go.
If you want a dynamic sparse matrix object, that supports efficient insertion and deletion of entries, you could look at different structures - possibly a std::map of triplets, or an array of column lists - see here for more information on data formats.
Also, there are many good libraries. If you want to do sparse matrix operations/factorisations etc., SuiteSparse is a good option; otherwise, Eigen also has good sparse support.
Sparse matrices are usually stored in compressed sparse row (CSR) or compressed sparse column (CSC, also called Harwell-Boeing) format. MATLAB by default uses CSC, IIRC, while most sparse matrix packages tend to use CSR.
Anyway, if this is for production usage rather than a learning exercise, I'd recommend using a matrix package with support for sparse matrices. In the C++ world, my favourite is Eigen.
The first thing that stands out is that you are implementing your own functionality for finding an element: that's what std::find_if is for. So, instead of:
bool alreadyPresent = false;
int i = 0;
for(i=0; i<SparseMatrix.size(); i++)
{
    if((SparseMatrix[i].x == x) && (SparseMatrix[i].y == y))
    {
        alreadyPresent = true;
        break;
    }
}
You should write:
auto it = std::find_if(SparseMatrix.begin(), SparseMatrix.end(),
    [x, y](const SparseMatNode& n) { return n.x == x && n.y == y; });
where the predicate (here a lambda capturing x and y) checks whether a SparseMatNode holds the coordinates you are looking for.
But the main improvement will come from using the appropriate container. Instead of std::vector, you will be much better off using an associative container. This way, finding an element will cost O(log N) (or expected constant time with a hash map) instead of O(N). You may slightly modify your SparseMatNode class as follows:
typedef std::pair<int, int> Coords;
typedef std::pair<const Coords, float> SparseMatNode;
You may wrap these typedefs inside a class to provide a better interface, of course.
And then:
std::unordered_map<Coords, float> SparseMatrix;
This way you can use:
auto it = SparseMatrix.find(std::make_pair(x, y));
to find elements much more efficiently.
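For illustration, a fuller sketch of that approach (pair_hash is an illustrative name). One detail worth noting: the standard library provides no std::hash specialization for std::pair, so a small hasher has to be supplied for the unordered_map to compile:

#include <unordered_map>
#include <utility>
#include <functional>
#include <cstddef>

typedef std::pair<int, int> Coords;

struct pair_hash {
    std::size_t operator()(const Coords& c) const {
        // Simple combination of the two coordinate hashes.
        return std::hash<int>()(c.first) * 31u + std::hash<int>()(c.second);
    }
};

std::unordered_map<Coords, float, pair_hash> SparseMatrix;

void addNode(int x, int y, float val) {
    Coords key(x, y);
    float& v = SparseMatrix[key];   // inserts 0.0f if the entry is absent
    v += val;
    if (v == 0.0f)
        SparseMatrix.erase(key);    // keep the matrix truly sparse
}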
Have you tried sorting your vector of sparse nodes? Performing a linear search becomes costly every time you add a node. You could insert in place and always perform a binary search; a sketch follows.
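A minimal sketch of that sorted-vector idea, reusing the SparseMatNode struct from the question: std::lower_bound does the binary search, and a miss inserts at the position it returns, so the vector stays sorted.

#include <vector>
#include <algorithm>

struct SparseMatNode { int x; int y; float value; };

static bool nodeLess(const SparseMatNode& a, const SparseMatNode& b) {
    return a.x < b.x || (a.x == b.x && a.y < b.y);
}

std::vector<SparseMatNode> SparseMatrix;

void addNode(int x, int y, float val) {
    SparseMatNode n = { x, y, val };
    std::vector<SparseMatNode>::iterator it =
        std::lower_bound(SparseMatrix.begin(), SparseMatrix.end(), n, nodeLess);
    if (it != SparseMatrix.end() && it->x == x && it->y == y) {
        it->value += val;
        if (it->value == 0.0f)
            SparseMatrix.erase(it);          // drop entries that cancel out
    } else {
        SparseMatrix.insert(it, n);          // insert in place, keeping the order
    }
}

Lookups drop to O(log N), though each insertion still shifts the tail of the vector, so building a very large matrix this way can remain expensive; the CSC-style packing described above avoids that.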
Because a sparse matrix may be huge and needs to be compressed, you may use std::unordered_map. I assume matrix indices (x and y) are always positive.
#include <unordered_map>

const size_t MAX_X = 1000*1000*1000;
std::unordered_map<size_t, float> matrix;

void addNode(size_t x, size_t y, float val)
{
    size_t index = x + y*MAX_X;
    matrix[index] += val;        // this function can be still faster
    if (matrix[index] == 0)      // using find() / insert() methods
        matrix.erase(index);
}
If std::unordered_map is not available on your system, you may try std::tr1::unordered_map or stdext::hash_map...
If you can use more memory, then use double instead of float; this will improve your processing speed a bit.

Sort objects of dynamic size

Problem
Suppose I have a large array of bytes (think up to 4GB) containing some data. These bytes correspond to distinct objects in such a way that every s bytes (think s up to 32) will constitute a single object. One important fact is that this size s is the same for all objects, not stored within the objects themselves, and not known at compile time.
At the moment, these objects are logical entities only, not objects in the programming language. I have a comparison on these objects which consists of a lexicographical comparison of most of the object data, with a bit of different functionality to break ties using the remaining data. Now I want to sort these objects efficiently (this is really going to be a bottleneck of the application).
Ideas so far
I've thought of several possible ways to achieve this, but each of them appears to have some rather unfortunate consequences. You don't necessarily have to read all of these. I tried to print the central question of each approach in bold. If you are going to suggest one of these approaches, then your answer should respond to the related questions as well.
1. C quicksort
Of course the C quicksort algorithm is available in C++ applications as well. Its signature matches my requirements almost perfectly. But the fact that using that function will prohibit inlining of the comparison function will mean that every comparison carries a function invocation overhead. I had hoped for a way to avoid that. Any experience about how C qsort_r compares to STL in terms of performance would be very welcome.
2. Indirection using Objects pointing at data
It would be easy to write a bunch of objects holding pointers to their respective data. Then one could sort those. There are two aspects to consider here. On the one hand, just moving around pointers instead of all the data would mean less memory operations. On the other hand, not moving the objects would probably break memory locality and thus cache performance. Chances that the deeper levels of quicksort recursion could actually access all their data from a few cache pages would vanish almost completely. Instead, each cached memory page would yield only very few usable data items before being replaced. If anyone could provide some experience about the tradeoff between copying and memory locality I'd be very glad.
3. Custom iterator, reference and value objects
I wrote a class which serves as an iterator over the memory range. Dereferencing this iterator yields not a reference but a newly constructed object to hold the pointer to the data and the size s which is given at construction of the iterator. So these objects can be compared, and I even have an implementation of std::swap for these. Unfortunately, it appears that std::swap isn't enough for std::sort. In some parts of the process, my gcc implementation uses insertion sort (as implemented in __insertion_sort in file stl_algo.h) which moves a value out of the sequence, moves a number of items by one step, and then moves the first value back into the sequence at the appropriate position:
typename iterator_traits<_RandomAccessIterator>::value_type
__val = _GLIBCXX_MOVE(*__i);
_GLIBCXX_MOVE_BACKWARD3(__first, __i, __i + 1);
*__first = _GLIBCXX_MOVE(__val);
Do you know of a standard sorting implementation which doesn't require a value type but can operate with swaps alone?
So I'd not only need my class which serves as a reference, but I would also need a class to hold a temporary value. And as the size of my objects is dynamic, I'd have to allocate that on the heap, which means memory allocations at the very leaves of the recursion tree. Perhaps one alternative would be a value type with a static size that should be large enough to hold objects of the sizes I currently intend to support. But that would mean that there would be even more hackery in the relation between the reference_type and the value_type of the iterator class. And it would mean I would have to update that size for my application to one day support larger objects. Ugly.
If you can think of a clean way to get the above code to manipulate my data without having to allocate memory dynamically, that would be a great solution. I'm using C++11 features already, so using move semantics or similar won't be a problem.
4. Custom sorting
I even considered reimplementing all of quicksort. Perhaps I could make use of the fact that my comparison is mostly a lexicographical compare, i.e. I could sort sequences by first byte and only switch to the next byte when the first byte is the same for all elements. I haven't worked out the details on this yet, but if anyone can suggest a reference, an implementation or even a canonical name to be used as a keyword for such a byte-wise lexicographical sorting, I'd be very happy. I'm still not convinced that with reasonable effort on my part I could beat the performance of the STL template implementation.
5. Completely different algorithm
I know there are many many kinds of sorting algorithms out there. Some of them might be better suited to my problem. Radix sort comes to my mind first, but I haven't really thought this through yet. If you can suggest a sorting algorithm more suited to my problem, please do so. Preferably with implementation, but even without.
Question
So basically my question is this:
“How would you efficiently sort objects of dynamic size in heap memory?”
Any answer to this question which is applicable to my situation is good, no matter whether it is related to my own ideas or not. Answers to the individual questions marked in bold, or any other insight which might help me decide between my alternatives, would be useful as well, particularly if no definite answer to a single approach turns up.
The most practical solution is to use the C style qsort that you mentioned.
#include <cstdlib>

template <unsigned S>
struct my_obj {
    enum { SIZE = S };
    const void *p_;
    my_obj (const void *p) : p_(p) {}
    //...accessors to get data from pointer
    static int c_style_compare (const void *a, const void *b) {
        my_obj aa(a);
        my_obj bb(b);
        return (aa < bb) ? -1 : (bb < aa);
    }
};

template <unsigned N, typename OBJ>
void my_sort (char (&large_array)[N], const OBJ &) {
    qsort(large_array, N/OBJ::SIZE, OBJ::SIZE, OBJ::c_style_compare);
}
(Or, you can call qsort_r if you prefer.) Since STL sort inlines the comparison calls, you may not get the fastest possible sorting. If all your system does is sorting, it may be worth it to add the code to get custom iterators to work. But, if most of the time your system is doing something other than sorting, the extra gain you get may just be noise to your overall system.
Since there are only 32 different object variations (1 to 32 bytes), you could easily create an object type for each and select a call to std::sort based on a switch statement. Each call will get inlined and highly optimized.
Some object sizes might require a custom iterator, as the compiler will insist on padding native objects to align to address boundaries. Pointers can be used as iterators in the other cases since a pointer has all the properties of an iterator.
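As a rough sketch of that dispatch (fixed_obj and sort_buffer are illustrative names, not from the answer): one std::sort instantiation per supported size, selected at runtime. Using an unsigned char array keeps the struct at alignment 1, so no padding is introduced and pointers into the raw buffer can serve as iterators.

#include <algorithm>
#include <cstddef>

template <std::size_t S>
struct fixed_obj {
    unsigned char bytes[S];
    bool operator<(const fixed_obj& other) const {
        // Plain lexicographical comparison; tie-breaking logic would go here.
        return std::lexicographical_compare(bytes, bytes + S,
                                            other.bytes, other.bytes + S);
    }
};

template <std::size_t S>
void sort_as(void* data, std::size_t count) {
    fixed_obj<S>* p = static_cast<fixed_obj<S>*>(data);
    std::sort(p, p + count);                 // comparisons get fully inlined
}

void sort_buffer(void* data, std::size_t count, std::size_t size) {
    switch (size) {
        case 8:  sort_as<8>(data, count);  break;
        case 16: sort_as<16>(data, count); break;
        case 32: sort_as<32>(data, count); break;
        // ...one case per supported object size
        default: break;                      // fall back to qsort or a generic path
    }
}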
I'd agree with std::sort using a custom iterator, reference and value type; it's best to use the standard machinery where possible.
You worry about memory allocations, but modern memory allocators are very efficient at handing out small chunks of memory, particularly when being repeatedly reused. You could also consider using your own (stateful) allocator, handing out length s chunks from a small pool.
If you can overlay an object onto your buffer, then you can use std::sort, as long as your overlay type is copyable. (In this example, 4 64bit integers). With 4GB of data, you're going to need a lot of memory though.
As discussed in the comments, you can have a selection of possible sizes based on some number of fixed size templates. You would have to pick from these types at runtime (using a switch statement, for example). Here's an example of the template type with various sizes and an example of sorting the 64bit size.
Here's a simple example:
#include <vector>
#include <algorithm>
#include <iostream>
#include <cstdlib>
#include <cstdint>
#include <ctime>

template <int WIDTH>
struct variable_width
{
    unsigned char w_[WIDTH];
};

typedef variable_width<8> vw8;
typedef variable_width<16> vw16;
typedef variable_width<32> vw32;
typedef variable_width<64> vw64;
typedef variable_width<128> vw128;
typedef variable_width<256> vw256;
typedef variable_width<512> vw512;
typedef variable_width<1024> vw1024;

bool operator<(const vw64& l, const vw64& r)
{
    const std::int64_t* l64 = reinterpret_cast<const std::int64_t*>(l.w_);
    const std::int64_t* r64 = reinterpret_cast<const std::int64_t*>(r.w_);
    return *l64 < *r64;
}

std::ostream& operator<<(std::ostream& out, const vw64& w)
{
    const std::int64_t* w64 = reinterpret_cast<const std::int64_t*>(w.w_);
    out << *w64;
    return out;
}

int main()
{
    srand(time(NULL));
    std::vector<unsigned char> buffer(10 * sizeof(vw64));
    vw64* w64_arr = reinterpret_cast<vw64*>(&buffer[0]);

    for(int x = 0; x < 10; ++x)
    {
        (*(std::int64_t*)w64_arr[x].w_) = rand();
    }

    std::sort(w64_arr, w64_arr + 10);

    for(int x = 0; x < 10; ++x)
    {
        std::cout << w64_arr[x] << '\n';
    }
    std::cout << std::endl;
    return 0;
}
Given the enormous size (4GB), I would seriously consider dynamic code generation. Compile a custom sort into a shared library, and dynamically load it. The only non-inlined call should be the call into the library.
With precompiled headers, the compilation times may actually be not that bad. The whole <algorithm> header doesn't change, nor does your wrapper logic. You just need to recompile a single predicate each time. And since it's a single function you get, linking is trivial.
#define OBJECT_SIZE 32

#include <vector>
#include <algorithm>
#include <cstdlib>

struct structObject
{
    unsigned char* pObject;

    bool operator < (const structObject &n) const
    {
        for(int i=0; i<OBJECT_SIZE; i++)
        {
            if(*(pObject + i) != *(n.pObject + i))
                return (*(pObject + i) < *(n.pObject + i));
        }
        return false;
    }
};

int main()
{
    std::vector<structObject> vObjects;
    unsigned char* pObjects = (unsigned char*)malloc(10 * OBJECT_SIZE); // 10 Objects

    for(int i=0; i<10; i++)
    {
        structObject stObject;
        stObject.pObject = pObjects + (i*OBJECT_SIZE);
        *stObject.pObject = 'A' + 9 - i; // Add a value to the start to check the sort
        vObjects.push_back(stObject);
    }

    std::sort(vObjects.begin(), vObjects.end());

    free(pObjects);
    return 0;
}
To skip the #define
#include <vector>
#include <algorithm>
#include <cstdlib>

struct structObject
{
    unsigned char* pObject;
};

struct structObjectComparerAscending
{
    int iSize;

    structObjectComparerAscending(int _iSize)
    {
        iSize = _iSize;
    }

    bool operator ()(const structObject &stLeft, const structObject &stRight) const
    {
        for(int i=0; i<iSize; i++)
        {
            if(*(stLeft.pObject + i) != *(stRight.pObject + i))
                return (*(stLeft.pObject + i) < *(stRight.pObject + i));
        }
        return false;
    }
};

int main()
{
    int iObjectSize = 32; // Read it from somewhere
    std::vector<structObject> vObjects;
    unsigned char* pObjects = (unsigned char*)malloc(10 * iObjectSize);

    for(int i=0; i<10; i++)
    {
        structObject stObject;
        stObject.pObject = pObjects + (i*iObjectSize);
        *stObject.pObject = 'A' + 9 - i; // Add a value to the start to work with something...
        vObjects.push_back(stObject);
    }

    std::sort(vObjects.begin(), vObjects.end(), structObjectComparerAscending(iObjectSize));

    free(pObjects);
    return 0;
}

Returning Iterator vs Loose Coupling

I've got a question about the architecture of a data structure I'm writing. I'm writing an image class, and I'm going to use it in a specific algorithm. In this algorithm, I need to touch every pixel in the image that's within a certain border. The classic way I know to do this is with two nested for loops:
for(int i = ROW_BORDER; i < img->height - ROW_BORDER; i++)
    for(int j = COL_BORDER; j < img->width - COL_BORDER; j++)
        WHATEVER
However, I've been told that in the style of the STL, it is in general better to return an iterator rather than use loops as above. It would be very easy to get an iterator to look at every pixel in the image, and it would even be easy to incorporate the border constraints, but I feel like including the border is blowing loose coupling out of the water.
So, the question is, should I return a special "border-excluding iterator", use the for loops, or is there a better way I haven't thought of?
Just to avoid things like "well, just use OpenCV, or VXL!", I'm not actually writing an image class, I'm writing a difference-of-gaussian pyramid for use in a feature detector. That said, the same issues apply, and it was simpler to write two for loops than three or four.
To have something reusable, I'd go with a map function.
namespace your_imaging_lib {
    template <typename Fun>
    void transform (Image &img, Fun fun) {
        const size_t width = img.width(),
                     size  = img.height() * img.width();
        Pixel *p = img.data();
        for (size_t s=0; s!=size; s+=width)
            for (size_t x=0; x!=width; ++x)
                p[x + s] = fun (p[x + s]);
    }

    template <typename Fun>
    void generate (Image &img, Fun fun) {
        const size_t width = img.width(),
                     size  = img.height() * img.width();
        Pixel *p = img.data();
        for (size_t s=0, y=0; s!=size; s+=width, ++y)
            for (size_t x=0; x!=width; ++x)
                p[x + s] = fun (x, y);
    }
}
Some refinement is needed; e.g., some systems like x, y to be in [0..1).
You can then use this like:
using namespace your_imaging_lib;
Image i = Image::FromFile ("foobar.png");
transform (i, [](Pixel const &p) { return Pixel::Monochrome(p.r()); });
or
generate (i, [](int x, int y) { return (x^y) & 0xFF; });
Iff you need knowledge of both coordinates (x and y), I guarantee this will give better performance compared to iterators which need an additional check for each iteration.
Iterators, on the other hand, will make your stuff usable with standard algorithms, like std::transform, and you could make them almost as fast if pixel positions are not needed and you do not have a big pitch in your data (pitch is for alignment, usually found on graphics hardware surfaces).
I suspect you should be using the visitor pattern instead-- instead of returning an iterator or some sort of collection of your items, you should pass in the operation to be done on each pixel/item to your data structure that holds the items, and the data structure should be able to apply that operation to each item. Whether your data structure uses for loops or iterators to traverse the pixel/whatever collection is hidden, and the operation is decoupled from the data structure.
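A minimal sketch of that visitor-style interface (Image, Pixel and for_each_interior are illustrative names): the data structure owns the traversal, including the border handling, and the caller only supplies the per-pixel operation.

#include <vector>
#include <cstddef>

struct Pixel { unsigned char r, g, b; };

class Image {
    std::size_t width_, height_;
    std::vector<Pixel> data_;
public:
    Image(std::size_t w, std::size_t h) : width_(w), height_(h), data_(w * h) {}

    // Apply 'visit' to every pixel lying at least 'border' pixels from the edge.
    template <typename Visitor>
    void for_each_interior(std::size_t border, Visitor visit) {
        for (std::size_t y = border; y < height_ - border; ++y)
            for (std::size_t x = border; x < width_ - border; ++x)
                visit(data_[y * width_ + x], x, y);
    }
};

A call such as img.for_each_interior(1, some_functor) then replaces the nested loops at the call site, and whether the class traverses with loops or iterators stays hidden.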
IMHO it sounds like a good idea to have an iterator that touches every pixel. However, it doesn't sound as appealing to me to include the border constraints inside of it. Maybe try to achieve something like:
IConstraint *bc = new BorderConstraint("blue-border");
for(pixel_iterator itr = img.begin(); itr != img.end(); itr++) {
    if(!bc->check(itr))
        continue;
    // do whatever
}
Where IConstraint is a base class that can be derived to make many different BorderConstraints.
My rationale is that iterators iterate in different ways, but I don't think they need to know about your business logic. That could be abstracted away into another design construct, as depicted via Constraints above.
In the case of bitmap data it is noteworthy that there are no iterator-based algorithms or datasets commonly used in the popular image manipulation APIs. This should be a clue that it is hard to implement as efficiently as a regular 2D array. (thanks phresnel)
If you really require/prefer an iterator for your image sans border, you should invent a new concept to iterate. My suggestion would be something like an ImageArea.
class ImageArea : Image
{
    int clipXLeft, clipXRight;
    int clipYTop, clipYBottom;
public:
    ImageArea(Image i, clipXTop ... )
And construct your iterator from there. The iterators can then work transparently with whole images or with regions within an image.
On the other hand, a regular x/y index based approach is not a bad idea. Iterators are very useful for abstracting data sets, but comes with a cost when you implement them on your own.

C++/STL: std::transform with given stride?

I have a 1D array containing N-dimensional data; I would like to traverse it efficiently with std::transform or std::for_each.
unsigned int nelems;
unsigned int stride = 3; // we are going to have 3D points
float *pP; // this will keep xyzxyzxyz...
Load(pP);
std::transform(pP, pP+nelems, strMover<float>(pP, stride)); // How to define the strMover??
The answer is not to change strMover, but to change your iterator. Define a new iterator class which wraps a float * but moves forward 3 places when operator++ is called.
You can use boost's Permutation Iterator and use a nonstrict permutation which only includes the range you are interested in.
If you try to roll your own iterator, there are some gotchas: to remain strict to the standard, you need to think carefully about what the correct "end" iterator for such a stride iterator is, since the naive implementation will merrily stride over and beyond the allowed "one-past-the-end" to the murky area far past the end of the array which pointers should never enter, for fear of nasal demons.
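For illustration, a minimal sketch of such a stride iterator (stride_iterator and the names in the usage comment are illustrative). With packed xyzxyzxyz... data the total length is an exact multiple of the stride, so the end iterator can be built directly from the one-past-the-end pointer without ever striding beyond it:

#include <iterator>
#include <cstddef>

template <typename T>
class stride_iterator {
    T* p_;
    std::size_t stride_;
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef T* pointer;
    typedef T& reference;

    stride_iterator(T* p, std::size_t stride) : p_(p), stride_(stride) {}

    T& operator*() const { return *p_; }
    stride_iterator& operator++() { p_ += stride_; return *this; }
    stride_iterator operator++(int) { stride_iterator t(*this); ++*this; return t; }
    bool operator==(const stride_iterator& o) const { return p_ == o.p_; }
    bool operator!=(const stride_iterator& o) const { return p_ != o.p_; }
};

// Usage sketch: visit every x coordinate of npoints packed xyz points.
//   stride_iterator<float> first(pP, 3), last(pP + 3 * npoints, 3);
//   std::transform(first, last, first, add_one);   // add_one is any unary functor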
But I have to ask: why are you storing an array of 3d points as an array of floats in the first place? Just define a Point3D datatype and create an array of that instead. Much simpler.
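And a small sketch of that Point3D suggestion, assuming the data really is packed xyz triples: once the points are stored as points, the stride problem disappears and plain std::transform applies.

#include <vector>
#include <algorithm>

struct Point3D { float x, y, z; };

static Point3D shiftX(Point3D p) { p.x += 1.0f; return p; }   // example per-point operation

int main() {
    std::vector<Point3D> points(100);
    std::transform(points.begin(), points.end(), points.begin(), shiftX);
}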
This is terrible; people told you to use stride iterators instead. Apart from not being able to use function objects from the standard library with this approach, you make it very, very complicated for the compiler to produce multicore or SSE optimizations by using crutches like this.
Look for "stride iterator" for a proper solution, for example in the C++ Cookbook.
And back to original question... use valarray and stride to simulate multidimensional arrays.
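For example, std::valarray with std::slice gives strided access to the packed coordinates without any custom iterator (a small sketch; the element count is made up):

#include <valarray>
#include <cstddef>

int main() {
    const std::size_t npoints = 4;
    std::valarray<float> data(3 * npoints);            // xyzxyzxyz...

    std::slice xs(0, npoints, 3);                      // start 0, npoints values, stride 3
    std::slice ys(1, npoints, 3);                      // the y coordinates

    data[xs] += std::valarray<float>(1.0f, npoints);   // shift every x in place
    std::valarray<float> ycopy = data[ys];             // pull out all y coordinates
    (void)ycopy;
}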
Well, I have decided to use for_each instead of transform; any other suggestions are welcome:
generator<unsigned int> gen(0, 1);
vector<unsigned int> idx(m_nelem);//make an index
std::generate(idx.begin(), idx.end(),gen);
std::for_each(idx.begin(), idx.end(), strMover<float>(&pPOS[0],&m_COM[0],stride));
where
template<class T> T op_sum (T i, T j) { return i+j; }

template<class T>
class strMover
{
    T *pP_;
    T *pMove_;
    unsigned int stride_;
public:
    strMover(T *pP, T *pMove, unsigned int stride) : pP_(pP), pMove_(pMove), stride_(stride)
    {}
    void operator() ( const unsigned int ip )
    {
        std::transform(&pP_[ip*stride_], &pP_[ip*stride_]+stride_,
                       pMove_, &pP_[ip*stride_], op_sum<T>);
    }
};
At first glance, this is a thread-safe solution.
Use Boost adaptors; you can get iterators out of them. The only disadvantage is compilation time.
vector<float> pp = vector_load(pP);
boost::for_each(pp | strided(3) | transformed(dosmtn()));