Using pointer arithmetic to iterate over class data members in C++ - c++

I have a C++ class containing a bunch of data members of the same type and I want to iterate over them:
// C.h
class C {
// other members
double foo;
double bar;
...
double barf; // 57th double declared since foo, nothing else in between
// other members
};
Pointer arithmetic seems to work, e.g. here using the constructor to initialize those 58 member doubles:
// C.cpp
C::C() {
for (int i = 0; i < 58; i++) {
*(&this->foo + i) = 0;
}
}
I found related questions here How to iterate through variable members of a class C++, here C++: Iterating through all of an object's members?, here Are class members garaunteed to be contiguous in memory? and here Class contiguous data, with some people suggesting this kind of thing is ok and others a no-no. The latter say there's no guarantee it won't fail, but don't cite any instances of it actually failing. So my question is, does anyone else use this, or has tried and got into trouble?
Or maybe there's a better way? Originally in my application I did actually use an array instead to represent my object, with indices like so:
int i_foo = 0, i_bar = 1, ..., i_barf = 57;
However once I introduced different objects (and arrays thereof) the index naming started to get out of hand. Plus I wanted to learn about classes and I'm hoping some of the other functionality will prove useful down the line ;-)
I use the iteration pretty heavily, e.g. to calculate statistics for collections of objects. Of course I could create a function to map the class members to an array one-by-one, but performance is a priority. I'm developing this application for myself to use on Windows with VS. I would like to keep other platform options open, but it's not something I intend to distribute widely. Thanks

George:
I think there is a better solution, such as a method that returns the i-th attribute:
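double get(size_t idx)
{
switch (idx)
{
case 0: return foo;
case 1: return bar;
case 2: return foo_bar;
....
// without a default, an out-of-range idx would fall off the end of the function
default: throw std::out_of_range("get: bad index"); // needs <stdexcept>
}
}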
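Another way to get an index-to-member mapping without pointer arithmetic on the object is a table of pointers to members. The following is only a sketch: the struct is a cut-down stand-in for the question's class, and the names kMembers, member, fill and sum are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the question's class, cut down to three members.
struct C {
    double foo;
    double bar;
    double barf;
};

// Table of pointers to members: the one place that lists the fields.
static double C::* const kMembers[] = { &C::foo, &C::bar, &C::barf };
static const std::size_t kMemberCount = sizeof(kMembers) / sizeof(kMembers[0]);

// Access the i-th listed member; no arithmetic on addresses inside the object.
inline double& member(C& c, std::size_t i) {
    assert(i < kMemberCount);
    return c.*kMembers[i];
}

// Set every listed member to the same value.
inline void fill(C& c, double value) {
    for (std::size_t i = 0; i < kMemberCount; ++i) {
        member(c, i) = value;
    }
}

// Sum every listed member.
inline double sum(const C& c) {
    double total = 0.0;
    for (std::size_t i = 0; i < kMemberCount; ++i) {
        total += c.*kMembers[i];
    }
    return total;
}
```

The table has to be kept in sync with the class by hand, but the members no longer need to be adjacent or declared in any particular order.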

Using pointer arithmetic to iterate over class data members can cause problems during code optimization. Example:
struct Vec3
{
double x, y, z;
inline Vec3& operator =(const Vec3& that)
{
x = that.x;
y = that.y;
z = that.z;
return *this;
}
inline double& operator [](int index)
{
return (&x)[index];
}
};
...
Vec3 foo = bar; // operator =
double result = foo[2]; // operator []
...
Both operators are inlined, so the value of result depends on how the compiler finally orders the instructions. Possible cases:
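// Case 1: the read comes after all three writes
foo.x = bar.x;
foo.y = bar.y;
foo.z = bar.z;
result = (&foo.x)[2]; // correct -- result contains new value
// Case 2: the read is moved ahead of the write to foo.z
foo.x = bar.x;
foo.y = bar.y;
result = (&foo.x)[2]; // incorrect -- result contains old value
foo.z = bar.z;
// Case 3: the read is moved ahead of the writes to foo.y and foo.z
foo.x = bar.x;
result = (&foo.x)[2]; // incorrect -- result contains old value
foo.y = bar.y;
foo.z = bar.z;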
Some compilers just do not realise that (&foo.x)[2] is the same data as foo.z and they reorder instructions incorrectly. It is very hard to find bugs like this.
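One way to sidestep the problem is to store the components in a real array and expose the names through accessors, so that indexing is ordinary array access. This is a minimal sketch of that idea, not the answer's code; the accessor names x(), y() and z() are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Sketch: the components live in an array, so v[index] never steps
// from one named member into another.
struct Vec3 {
    double v[3];

    // Named access to the same storage.
    double& x() { return v[0]; }
    double& y() { return v[1]; }
    double& z() { return v[2]; }

    // Indexed access with a debug-time bounds check.
    double& operator[](std::size_t index) {
        assert(index < 3);
        return v[index];
    }
};
```

The implicitly generated copy assignment copies the whole array, and foo.z() and foo[2] refer to the same element by construction.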

Related

C++ understanding RVO (as compared to returning local variable reference)

It's my first year of using C++ and learning on the way. I'm currently reading up on Return Value Optimizations (I use C++11 btw). E.g. here https://en.wikipedia.org/wiki/Return_value_optimization, and immediately these beginner examples with primitive types spring to mind:
int& func1()
{
int i = 1;
return i;
}
//error, 'i' was declared with automatic storage (in practice on the stack(?))
//and is undefined by the time function returns
...and this one:
int func1()
{
int i = 1;
return i;
}
//perfectly fine, 'i' is copied... (to previous stack frame... right?)
Now, I get to this and try to understand it in the light of the other two:
Simpleclass func1()
{
return Simpleclass();
}
What actually happens here? I know most compilers will optimise this, what I am asking is not 'if' but:
how the optimisation works (the accepted response)
does it interfere with storage duration: stack/heap (Old: Is it basically random whether I've copied from stack or created on heap and moved (passed the reference)? Does it depend on created object size?)
is it not better to use, say, explicit std::move?
You won't see any effect of RVO when returning ints.
However, when returning large objects like this:
struct Huge { ... };
Huge makeHuge() {
Huge h { x, y, z };
h.doSomething();
return h;
}
The following code...
auto h = makeHuge();
... after RVO would be implemented something like this (pseudo code) ...
h_storage = allocate_from_stack(sizeof(Huge));
makeHuge(addressof(h_storage));
auto& h = *properly_aligned(h_storage);
... and makeHuge would compile to something like this...
void makeHuge(Huge* h_storage) // in fact this address can be
// inferred from the stack pointer
// (or just 'known' when inlining).
{
Huge* phuge = new (h_storage) Huge(x, y, z); // placement new: construct directly in the caller's storage
phuge->doSomething();
}
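To see the effect in real code rather than pseudo code, a type that counts its own copies and moves can help. The sketch below assumes a hypothetical Tracer class and two factory functions invented for this example. The only thing it relies on is that no copy constructor runs on either return path: the object is either constructed in place or, if the compiler does not elide, moved.

```cpp
#include <cassert>

// Hypothetical tracer type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    int payload;

    explicit Tracer(int p) : payload(p) {}
    Tracer(const Tracer& other) : payload(other.payload) { ++copies; }
    Tracer(Tracer&& other) noexcept : payload(other.payload) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns a temporary (the case usually called RVO).
Tracer makeTemporary() {
    return Tracer(1);
}

// Returns a named local (the case usually called NRVO).
Tracer makeNamed() {
    Tracer t(2);
    t.payload += 40;
    return t;  // plain "return t;" keeps elision possible
}

// Small demonstration: neither call copies the object.
inline void elisionDemo() {
    Tracer a = makeTemporary();
    Tracer b = makeNamed();
    assert(a.payload == 1 && b.payload == 42);
    assert(Tracer::copies == 0);
}
```

Whether Tracer::moves stays at zero depends on the compiler and language version, so the sketch does not check it. Writing return std::move(t); in makeNamed would rule out elision and force a move, which is why an explicit std::move on a returned local is generally not an improvement.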

c++ class members functions: how to write these functions?

In my Object Oriented c++ course, we have to write this class that I have put below.
Point
class Point{
public:
Point( double x = 0, double y = 0 );
double getX() const;
double getY() const;
Point & setX( double x ); // mutator, returning reference to self
Point & setY( double y );
const Point & output() const;
double distance( const Point & other ) const;
private:
double xPoint, yPoint;
}; // class Point
My question is: I can't find any information on how the functions setX, setY, and output should work. They are the same type as the class itself and I have written what I would expect them to look like below. Can anyone tell me what I am doing wrong, and maybe give some more specifics on how these functions work?
The setX function should change xPoint in the object, the setY should do the same for the yPoint and output should simply output them.
Point & Point::setX( double x )
{
xPoint = x;
}
Point & Point::setY( double y )
{
Ypoint = y;
}
const Point & Point::output() const
{
cout << xPoint << yPoint;
}
Just add return *this; at the end of setX and setY. Each returns a reference to the object it was called on, so that you can write, for example, p0.setX(1.23).setY(3.45), where p0 is an instance of Point. The output function needs a return *this; as well, and it should print a separator, such as a space, between xPoint and yPoint. You say "They are the same type as the class itself": don't confuse the type of a variable with the return type of a function. The methods setX, setY and output each return a reference to an instance of the class they belong to. Note that the reference returned by output is const, so you can do:
p0.setX(1.23).setY(3.45).output();
But not:
p0.output().setX(1.23);
That fails because setX is not a const method (it does not promise to leave the object unmodified), so it cannot be called through the const reference that output returns.
You can call instead:
double x = p0.output().getX();
because getX is a const method.
Note: I am not saying you should use the methods in this way, but the point is to show what potentially you can do.
Setters are public methods that allow you to change private members of the class. They normally return nothing, so setX and setY should be void, not Point &:
void setX(double x); // Declaration
void Point::setX( double x ) // Definition outside Point.h
{
xPoint = x;
}
The same goes for output: it should be void. The rest is fine. You can format the output however you like, for example:
void Point::output() const
{
cout << "(" << xPoint << ", " << yPoint << ")";
}
setX() will probably change the value of the xPoint member, and return a reference to the object being acted on.
So an implementation might be something like
Point &Point::setX(double xval)
{
if (IsValid(xval)) xPoint = xval; // ignore invalid changes
return *this;
}
This can (assuming other member functions and operators are being used correctly) be used in things like this
#include <iostream>
// declaration of Point here
int main()
{
Point p;
std::cout << p.setX(25).setY(30).getX() << '\n';
}
While this example isn't particularly useful (it only shows what is possible), chaining member function calls is useful in various circumstances. For example, this technique is the basis on which the iostream insertion and extraction operators work: it allows multiple things to be inserted into or extracted from a stream in a single statement.
The documentation of the setX and setY functions says
// mutator, returning reference to self
Your implementation does the mutation, but you've failed to complete the contract that this function is supposed to satisfy: it's supposed to return a reference to itself.
this is a pointer to the object you're invoking the method on, and so adding the line
return *this;
would complete the contract.
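Putting the pieces together, a complete version of the class might look like the sketch below. The member function bodies are one plausible reading of the assignment, not the course's reference solution: the output format, the Euclidean distance formula and the chainingDemo helper are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <iostream>

class Point {
public:
    Point(double x = 0, double y = 0) : xPoint(x), yPoint(y) {}

    double getX() const { return xPoint; }
    double getY() const { return yPoint; }

    // Mutators return *this so that calls can be chained.
    Point& setX(double x) { xPoint = x; return *this; }
    Point& setY(double y) { yPoint = y; return *this; }

    // Prints "(x, y)" and returns a const reference to the same object.
    const Point& output() const {
        std::cout << "(" << xPoint << ", " << yPoint << ")\n";
        return *this;
    }

    // Euclidean distance to another point.
    double distance(const Point& other) const {
        const double dx = xPoint - other.xPoint;
        const double dy = yPoint - other.yPoint;
        return std::sqrt(dx * dx + dy * dy);
    }

private:
    double xPoint, yPoint;
};

// Small demonstration of chaining setters and then a const method.
inline void chainingDemo() {
    Point p;
    p.setX(1.23).setY(3.45).output();
    assert(p.getX() == 1.23);
    assert(p.getY() == 3.45);
}
```

Because output returns a const reference, only const members such as getX, getY and distance can follow it in a chain.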
This is an aside, but it may help you understand why anyone would want to use such a 'strange' design.
You may be familiar with ordinary assignment being used in ways such as
a = b = 0;
if((result = some_function()) == 0) {
// Do something in the `result == 0` case
} else {
// Do something in the `result != 0` case
}
and other similar things. The first example sets both a and b to be 0. The second example stores the return value of the function call into the variable result, and then branches based on whether that value is 0 or not.
The way this works is that x = y is a binary operator which has the side effect of copying the value of y into x, and then returns that value (technically a reference to x) so that it may be used in the surrounding expression.
So when you write a = b = 0, this is parsed as a = (b = 0), and has the effect of making b zero, and then evaluates to a = 0 which is then evaluated and makes a zero. Similarly for the branching example.
This is something people like to do when writing code (it's a completely separate topic whether or not this is good style), so when people design new types with operator= methods, they design them to support this usage, by making them return a reference to the object assigned to. e.g.
MyClass& MyClass::operator=(arg a)
{
// Copy the value of `a` into this object
return *this;
}
The other assignment operators, like operator+= also work this way.
Now, when you're used to this usage, it is a small step to extend it to other functions that sort of act like assignment, like setX and setY. This has the additional convenience of making it easy to chain modifications, as in point.setX(3).setY(7).

Save and load function pointers to file

Consider the following code:
typedef float (*MathsOperation)(float _1, float _2);
struct Data
{
float x, y;
MathsOperation op;
};
Data data[100];
float Add(float _1, float _2){ return _1 + _2; }
float Sub(float _1, float _2){ return _1 - _2; }
float Mul(float _1, float _2){ return _1 * _2; }
// other maths operations
for (int i = 0; i < 100; ++i)
{
// assign one of the maths operators above to data struct member op
// according to some condition (maybe some user input):
if(condition1) data[i].op = &Add;
if(condition2) data[i].op = &Sub;
if(condition3) data[i].op = &Mul;
// etc.
}
Now I'd like to somehow save the data array to a file and load it later (maybe in another program which doesn't know about the conditions that were used to assign the operators to each array element). Obviously, the pointers would be different every time I ran the application. So, my question is: what is the best way to do this?
You can't store "functions" as data anyway, and as you say, storing pointers in external media doesn't work. So, what you have to do in this case is store an operator value, e.g.
enum Operator
{
Op_Add,
Op_Sub,
Op_Mul,
Op_Largest // For array size below.
};
And instead of:
if(condition1) data[i].op = &Add;
if(condition2) data[i].op = &Sub;
if(condition3) data[i].op = &Mul;
have:
if(condition1) data[i].op = Op_Add;
if(condition2) data[i].op = Op_Sub;
if(condition3) data[i].op = Op_Mul;
Since that is an integer type value, it can be stored in a file, and you can then do:
// Or: fin.read(reinterpret_cast<char*>(data), sizeof(data));
int op; // read as int: operator>> has no overload for a plain enum
fin >> op >> x >> y;
if (op == Op_Add) ...
else if (op == Op_Sub) ...
Or have a function pointer array that you index with op... In other words:
typedef float (*MathsOperation)(float _1, float _2);
...
MathsOperation mathsOps[Op_Largest] = { &Add, &Sub, &Mul };
...
mathsOps[op](x, y);
...
If I were you I would build an index where you can register your operators:
static std::vector<MathsOperation> MathsOperations; // std::vector, because std::array has a fixed size and no push_back
MathsOperations.push_back(Add);
MathsOperations.push_back(Sub);
MathsOperations.push_back(Mul);
int getIdx(MathsOperation op) {
return std::find(MathsOperations.begin(), MathsOperations.end(), op) - MathsOperations.begin();
}
and put it in a .h file just after the MathsOperation definitions
Then, rather than saving the function pointer, you could just save the relevant index and look the operator up afterwards:
int opidx = getIdx(Add);
MathsOperation op = MathsOperations[opidx];
Non-portable, but almost certain to work if all your functions are in the same module:
template<typename FuncT>
intptr_t FunctionPointerToId( FuncT* fptr )
{
return reinterpret_cast<intptr_t>(fptr) - reinterpret_cast<intptr_t>(&Add);
}
template<typename FuncT>
FuncT* FunctionPointerFromId( intptr_t id )
{
return reinterpret_cast<FuncT*>(id + reinterpret_cast<intptr_t>(&Add));
}
This assumes that your implementation preserves relative addresses of functions within the same module (most platforms do guarantee this as implementation-specific behavior, since dynamic loaders rely on this). Using relative addresses (aka "based pointers") allows it to still work even if the module is a shared library that gets loaded at a different base address each time (e.g. ASLR).
Don't try this if your functions come from multiple modules, though.
If you have the ability to build and maintain a list of the functions, storing an index into that list is definitely a better approach (those indexes can remain good even after relinking, while relative code addresses get changed).
You need some permanent identifier for each function. You save this identifier instead of function address and restore address after reading.
The simplest is an integer identifier that is an index into an array:
const MathsOperation Operations[] = { &Add, &Sub };
In this case you must never change the order of Operations items.
If it is impossible, use strings:
const std::map<std::string, MathsOperation> OpNames
{
{ "Add", &Add },
{ "Sub", &Sub },
};
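A round trip through such a name table could look like the following sketch. The record layout ("name x y" as text), and the helper names opNames, nameOf, save and load, are assumptions made for this example, not part of the question's code.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

typedef float (*MathsOperation)(float, float);

float Add(float a, float b) { return a + b; }
float Sub(float a, float b) { return a - b; }

struct Data {
    float x, y;
    MathsOperation op;
};

// Name table: the strings, not the addresses, are what gets written out.
const std::map<std::string, MathsOperation>& opNames() {
    static const std::map<std::string, MathsOperation> names = {
        { "Add", &Add },
        { "Sub", &Sub },
    };
    return names;
}

// Reverse lookup; returns an empty string for an unregistered function.
std::string nameOf(MathsOperation op) {
    for (const auto& entry : opNames()) {
        if (entry.second == op) return entry.first;
    }
    return std::string();
}

// Writes one record as "name x y".
void save(std::ostream& out, const Data& d) {
    const std::string name = nameOf(d.op);
    assert(!name.empty() && "operation must be registered in opNames()");
    out << name << ' ' << d.x << ' ' << d.y << '\n';
}

// Reads one record; returns false on malformed input or an unknown name.
bool load(std::istream& in, Data& d) {
    std::string name;
    if (!(in >> name >> d.x >> d.y)) return false;
    const auto it = opNames().find(name);
    if (it == opNames().end()) return false;
    d.op = it->second;
    return true;
}
```

The same functions work with std::ofstream and std::ifstream. Since only names are stored, the file stays valid when the table is reordered or the program is relinked.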

A variable that is read-only after assignment at run-time?

Fairly new programmer here, and an advance apology for silly questions.
I have an int variable in a program that I use to determine what the lengths of my arrays should be in some of my structures. I used to put it in my header as a const int. Now, I want to fork my program to give the variable different values depending on the arguments given in, but keep it read-only after I assign it at run-time.
A few ideas I've had to do this. Is there a preferred way?
Declare a const int * in my header and assign it to a const int in my main function, but that seems clunky.
Make it a plain int in my main function.
Pass the variable as an argument when the function is called.
Something else I haven't thought of yet.
I'd use a function-static variable and a simple function. Observe:
int GetConstValue(int initialValue = 0)
{
static int theValue = initialValue;
return theValue;
}
Since this is a function-level static variable, it is initialized only the first time through. So the initialValue parameter is useless after the first run of the function. Therefore, all you need to do is ensure that the first call of the function is the one that initializes it.
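As a usage sketch, the first call can be made once at start-up with the run-time value, and every later call can omit the argument. The configure helper below is hypothetical and only illustrates that calling pattern.

```cpp
#include <cassert>

// Same function as above: the static local is initialized on the first call only.
int GetConstValue(int initialValue = 0) {
    static int theValue = initialValue;
    return theValue;
}

// Hypothetical start-up hook: call once, early, with the run-time value.
inline void configure(int runtimeValue) {
    const int stored = GetConstValue(runtimeValue);
    // If this fires, some earlier call already fixed the value.
    assert(stored == runtimeValue && "GetConstValue was already initialized");
}
```

After configure(7), both GetConstValue() and GetConstValue(99) return 7, because the argument is ignored once the static has been initialized.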
C++ doesn't have a built-in solution for this, but if you really want to make sure that your int is only assigned once, you can build your own special int class:
class MyConstInt
{
public:
MyConstInt(): assigned(false) {}
MyConstInt& operator=(int v)
{
assert(!assigned);
value = v;
assigned = true;
return *this;
}
operator int() const
{
assert(assigned);
return value;
}
private:
int value;
bool assigned;
};
MyConstInt mi;
// int i = mi; // assertion failure; mi has no value yet
mi = 42;
// mi = 43; // assertion failure; mi already has a value
int* array = new int[mi];
When exactly do you know the correct value? If you read it from a file or whatever, you can just say:
const int n = determine_correct_value();
I'm tempted to say that what you want doesn't make sense. A constant is something that doesn't change its value, not something that maybe changes its value once or twice. If you want a global variable, just make it non-constant.
On the other hand, if you have scope-constant values, you would just declare and initialize them at the same time, following the general C++ guideline to declare as close to the usage site as possible. For example, mark the use of constants in the following local scope:
for (auto it = v.begin(), end = v.end(); it != end; ++it)
{
const Foo & x = *it;
const std::size_t n = x.get_number_of_bars();
// use x and n ...
const bool res = gobble(x, zip(n));
if (res && shmargle(x)) { return 8; }
}
Here the compiler may even choose not to generate any special code for the variables at all if their value is already known through other means.

Is there any well-known paradigm for iterating enum values?

I have some C++ code, in which the following enum is declared:
enum Some
{
Some_Alpha = 0,
Some_Beta,
Some_Gamma,
Some_Total
};
int array[Some_Total];
The values of Alpha, Beta and Gamma are sequential, and I gladly use the following cycle to iterate through them:
for ( int someNo = (int)Some_Alpha; someNo < (int)Some_Total; ++someNo ) {}
This cycle is ok, until I decide to change the order of the declarations in the enum, say, making Beta the first value and Alpha - the second one. That invalidates the cycle header, because now I have to iterate from Beta to Total.
So, what are the best practices of iterating through enum? I want to iterate through all the values without changing the cycle headers every time. I can think of one solution:
enum Some
{
Some_Start = -1,
Some_Alpha,
...
Some_Total
};
int array[Some_Total];
and iterate from (Start + 1) to Total, but it seems ugly and I have never seen someone doing it in the code. Is there any well-known paradigm for iterating through the enum, or I just have to fix the order of the enum values? (let's pretend, I really have some awesome reasons for changing the order of the enum values)...
You can define an operator++() for your enum. This has the advantage that it uses the well-known paradigm of the standard incrementation operators. :)
Depending on whether your enums are contiguous, you can treat them as int or use a switch:
Some& operator++(Some& obj)
{
# if YOUR_ENUMS_ARE_CONTIGUOUS
int i = obj;
if( ++i > Some_Total ) i = Some_Alpha;
return obj = static_cast<Some>(i);
# else
switch(obj)
{
case Some_Alpha : obj = Some_Beta; break;
case Some_Beta : obj = Some_Gamma; break;
case Some_Gamma : obj = Some_Total; break;
case Some_Total : obj = Some_Alpha; break;
default: assert(false); // changed enum but forgot to change operator
}
return obj;
# endif
}
Note that, if operator++() is defined, users will probably expect an operator--(), too.
No, there is no way of doing this because there is no guarantee that someone hasn't written code like:
enum Some
{
Some_Alpha = 0,
Some_Beta,
Some_Gamma = 42,
Some_Delta,
Some_Total
};
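When the values may have gaps like this, one workaround is to list the enumerators explicitly in a table and iterate over the table instead of over the integer range. This is only a sketch: the names kAllSome, kSomeCount, valueAt and sumOfValues are invented for the example, and the table must be updated by hand whenever the enum changes.

```cpp
#include <cassert>
#include <cstddef>

enum Some {
    Some_Alpha = 0,
    Some_Beta,
    Some_Gamma = 42,
    Some_Delta
};

// Explicit table of the values to visit, in the order to visit them.
static const Some kAllSome[] = { Some_Alpha, Some_Beta, Some_Gamma, Some_Delta };
static const std::size_t kSomeCount = sizeof(kAllSome) / sizeof(kAllSome[0]);

// Bounds-checked access to the i-th listed value.
inline Some valueAt(std::size_t i) {
    assert(i < kSomeCount);
    return kAllSome[i];
}

// Example traversal: adds up the underlying integer values.
inline int sumOfValues() {
    int total = 0;
    for (std::size_t i = 0; i < kSomeCount; ++i) {
        total += static_cast<int>(valueAt(i));
    }
    return total;
}
```

The loop header never changes when enumerators are reordered or given explicit values. The trade-off is that kSomeCount, not a Some_Total sentinel, is the number of values.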
You can check out this article with its source code on how you can implement this with static class members.
In C++11 and later (range-based for and std::underlying_type both arrived in C++11), you could use the following hack to make Some iterable:
// requires <type_traits>
Some operator++(Some& s) {
return s = (Some)(std::underlying_type<Some>::type(s) + 1);
}
Some operator*(Some s) {
return s;
}
Some begin(Some s) {
return Some_Alpha;
}
Some end(Some s) {
return Some_Total; // one past the last real value, so Some_Gamma is visited too
}
int main() {
// the parentheses here create a temporary Some to iterate over
for(const auto& s : Some()) {
// etc. etc.
}
return 0;
}
(This answer was shamelessly adapted from here.)
enum Some
{
Some_first_ = 0,
Some_Alpha = Some_first_,
....
Some_last_
};
Doing it this way guarantees that the first and last markers stay in place no matter how the values in between are reordered.
If you do not use any assignments, the enums are guaranteed to be sequential starting with 0 as the first.
The best thing you can do is keep them in the order you want in your enum definition, and cycle through them with the for loop.
I place all Enums in their own namespace. Example:
namespace Foo {
enum Enum {
First=0, // must be sequential
Second,
Third,
End // must be last
};
}
In code:
for (int i=Foo::First; i!=Foo::End; i++) {
// do stuff
}
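This is because unscoped enumerators are all visible in the enclosing scope, which makes mix-ups like the following easy to write (C accepts the equivalent assignment silently; a conforming C++ compiler rejects the implicit conversion from Bar to Foo):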
enum Foo {
Alpha = 1
};
enum Bar {
Beta = 2
};
Foo foo = Beta;
Which is clearly wrong.