C++ non-virtual class member variables memory layout? - c++

I have a non-virtual class template A as below and I do the following
#include <iostream>
// my class template
template<typename T>
class A
{
public:
T x;
T y;
T z;
// bunch of other non-virtual member functions including constructors, etc
// and obviously no user-defined destructor
// ...
};
int main()
{
//now I do the following
A<double> a;
a.x = 1.0; // not important this
a.y = 2.0;
a.z = 3.0;
// now the concerned thing
double* ap = (double*)&a;
double* xp = &(a.x);
// can I correctly and meaningfully do the following?
double new_az = ap[2]; // guaranteed to be same as a.z (for any z) ? ** look here **
double new_z = xp[2]; // guaranteed to be same as a.z (for any z) ? ** look here **
std::cout<<new_az<<std::endl;
std::cout<<new_z<<std::endl;
return 0;
}
So, is it guaranteed that if I use a raw pointer to object A or to the member variable a.x, I will correctly get the other variables?

As many users pointed out, there is no guarantee that the memory layout of your structure will be identical to that of the corresponding array. The "ideologically correct" way to access members by index would be to write a somewhat ugly operator[] with a switch inside it.
However, speaking practically, there is usually no problem with your approach, and the suggested solutions are inferior in terms of code generated and run-time performance.
I can suggest 2 other solutions.
1. Keep your solution, but verify at compile time that your structure layout corresponds to an array. In your specific case, that means adding STATIC_ASSERT(sizeof(a) == sizeof(double)*3); (or the built-in static_assert in C++11 and later).
2. Change your class template to actually hold an array, and turn the x, y, z variables into access functions that return the elements of the array.
I mean:
#include <iostream>
// my class template
template<typename T>
class A
{
public:
T m_Array[3];
T& x() { return m_Array[0]; }
const T& x() const { return m_Array[0]; }
// repeat for y,z
// ...
};
If you make the length of the array (i.e. dimension of the represented vector) a template parameter as well, you may put a 'STATIC_ASSERT' in each access function to ensure the actual existence of the member.
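As a rough sketch of both suggestions, assuming C++11's built-in static_assert in place of a STATIC_ASSERT macro: the first part checks the layout of the x, y, z version at compile time, and the second makes the dimension a template parameter so that each accessor can assert that its element exists. The names Vec3, VecN and data() are made up for illustration. Note that the size check only confirms that there is no padding; it does not by itself turn indexing across separate members into something the standard promises.

```cpp
#include <cstddef>      // std::size_t, offsetof
#include <type_traits>  // std::is_standard_layout

// Suggestion 1: keep the named members and verify the layout at compile time.
template <typename T>
struct Vec3 {
    T x;
    T y;
    T z;
};

static_assert(std::is_standard_layout<Vec3<double>>::value,
              "Vec3<double> must be standard-layout");
static_assert(sizeof(Vec3<double>) == 3 * sizeof(double),
              "unexpected padding in Vec3<double>");
static_assert(offsetof(Vec3<double>, z) == 2 * sizeof(double),
              "z is not where the third array element would be");

// Suggestion 2: store a real array and expose named accessors.
// The dimension N is a template parameter (an assumption of this sketch).
template <typename T, std::size_t N>
class VecN {
public:
    T m_Array[N];

    T& x() {
        static_assert(N >= 1, "x() needs at least 1 dimension");
        return m_Array[0];
    }
    T& y() {
        static_assert(N >= 2, "y() needs at least 2 dimensions");
        return m_Array[1];
    }
    T& z() {
        static_assert(N >= 3, "z() needs at least 3 dimensions");
        return m_Array[2];
    }

    // Indexing through this pointer is ordinary array access.
    T* data() { return m_Array; }
};
```

With this layout, VecN<double, 2> still compiles as long as nobody calls z() on it, because the assertion inside a member function is only checked when that function is actually used.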

No, there is no guarantee, not the way you do it. If T is an int8_t, for example, it would work only if you specified 1-byte packing.
The easiest correct way to do this would be to add an operator[] to your class template, something like:
T& operator[](size_t i)
{
switch(i)
{
case 0: return x;
case 1: return y;
case 2: return z;
}
throw std::out_of_range(__FUNCTION__);
}
const T& operator[](size_t i) const
{
return (*const_cast<A*>(this))[i]; // not everyone likes to do this.
}
But this is not really efficient. A more efficient way is to keep your vector (or point) coordinates in an array and provide x(), y(), z() member functions to access them. Then your example would work in all cases, provided you implement a conversion operator to T* in your class.
operator T*() { return &values[0]; }
operator const T*()const { return &values[0]; }
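If the named members have to stay, one further option (a different technique from the two shown in this answer) is to index through a small table of pointers to members instead of a switch. This is only a sketch: the helper members() and the alias Member are hypothetical names, and the bounds check mirrors the out_of_range idea above.

```cpp
#include <cstddef>
#include <stdexcept>

template <typename T>
class A {
public:
    T x;
    T y;
    T z;

    // Index into the named members through a pointer-to-member table.
    T& operator[](std::size_t i) {
        if (i >= 3) throw std::out_of_range("A::operator[]");
        return this->*members()[i];
    }
    const T& operator[](std::size_t i) const {
        if (i >= 3) throw std::out_of_range("A::operator[]");
        return this->*members()[i];
    }

private:
    using Member = T A::*;

    // Table mapping index 0, 1, 2 to x, y, z.
    static const Member* members() {
        static const Member table[3] = {&A::x, &A::y, &A::z};
        return table;
    }
};
```

Unlike the cast in the question, this never relies on the members being laid out like an array, so it stays valid even if padding appears between them.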

If you really want to do such things:
#include <array>
#include <iostream>
template <typename T>
class FieldIteratable
{
using Data = std::array<T, 5/*magic number*/>;
Data data_;
public:
const Data & data() { return data_; }
T& a1 = data_[0]; // or some macro
char padding1[3]; // you can choose what field is iteratable
T& a2 = data_[1];
char padding2[3]; // class can contain other fields can be
T& a3 = data_[2];
char padding3[3];
T& a4 = data_[3];
char padding4[3];
T& a5 = data_[4];
};
int main() {
FieldIteratable<int> fi;
int* a = &fi.a1;
*a++ = 0;
*a++ = 1;
*a++ = 2;
*a++ = 3;
*a++ = 4;
std::cout << fi.a1 << std::endl;
std::cout << fi.a2 << std::endl;
std::cout << fi.a3 << std::endl;
std::cout << fi.a4 << std::endl;
std::cout << fi.a5 << std::endl;
for(auto i :fi.data())
std::cout << i << std::endl;
return 0;
}

Related

How to make a data member const after but not during construction?

Without relying on const_cast, how can one make a C++ data member const after but not during construction when there is an expensive-to-compute intermediate value that is needed to calculate multiple data members?
The following minimal, complete, verifiable example further explains the question and its reason. To avoid wasting your time, I recommend that you begin by reading the example's two comments.
#include <iostream>
namespace {
constexpr int initializer {3};
constexpr int ka {10};
constexpr int kb {25};
class T {
private:
int value;
const int a_;
const int b_;
public:
T(int n);
inline int operator()() const { return value; }
inline int a() const { return a_; }
inline int b() const { return b_; }
int &operator--();
};
T::T(const int n): value {n - 1}, a_ {0}, b_ {0}
{
// The integer expensive
// + is to be computed only once and,
// + after the T object has been constructed,
// is not to be stored.
// These requirements must be met without reliance
// on the compiler's optimizer.
const int expensive {n*n*n - 1};
const_cast<int &>(a_) = ka*expensive;
const_cast<int &>(b_) = kb*expensive;
}
int &T::operator--()
{
--value;
// To alter a_ or b_ is forbidden. Therefore, the compiler
// must abort compilation if the next line is uncommented.
//--a_; --b_;
return value;
}
}
int main()
{
T t(initializer);
std::cout << "before decrement, t() == " << t() << "\n";
--t;
std::cout << "after decrement, t() == " << t() << "\n";
std::cout << "t.a() == " << t.a() << "\n";
std::cout << "t.b() == " << t.b() << "\n";
return 0;
}
Output:
before decrement, t() == 2
after decrement, t() == 1
t.a() == 260
t.b() == 650
(I am aware of this previous, beginner's question, but it treats an elementary case. Please see my comments in the code above. My trouble is that I have an expensive initialization I do not wish to perform twice, whose intermediate result I do not wish to store; whereas I still wish the compiler to protect my constant data members once construction is complete. I realize that some C++ programmers avoid constant data members on principle but this is a matter of style. I am not asking how to avoid constant data members; I am asking how to implement them in such a case as mine without resort to const_cast and without wasting memory, execution time, or runtime battery charge.)
FOLLOW-UP
After reading the several answers and experimenting on my PC, I believe that I have taken the wrong approach and, therefore, asked the wrong question. Though C++ does afford const data members, their use tends to run contrary to normal data paradigms. What is a const data member of a variable object, after all? It isn't really constant in the usual sense, is it, for one can overwrite it by using the = operator on its parent object. It is awkward. It does not suit its intended purpose.
@Homer512's comment illustrates the trouble with my approach:
Don't overstress yourself into making members const when it is inconvenient. If anything, it can lead to inefficient code generation, e.g. by making move-construction fall back to copy constructions.
The right way to prevent inadvertent modification of data members that should not change is, apparently, simply to provide no interface to change them. And if it is necessary to protect the data members from the class's own member functions, why, @Some programmer dude's answer shows how to do this.
I now doubt that it is possible to handle const data members smoothly in C++. The const is protecting the wrong thing in this case.
Something along these lines perhaps:
class T {
private:
T(int n, int expensive)
: value{n-1}, a_{ka*expensive}, b_{kb*expensive} {}
public:
T(int n) : T(n, n*n*n - 1) {}
};
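Filled in with the rest of the question's class, the delegating-constructor idea might look roughly like the sketch below. The constants ka and kb and the expression n*n*n - 1 are taken from the question; marking the public constructor explicit is a choice made here, not something the answer requires.

```cpp
constexpr int ka = 10;
constexpr int kb = 25;

class T {
public:
    // Public constructor: computes the expensive value once, as a temporary,
    // and hands it to the private constructor.
    explicit T(int n) : T(n, n * n * n - 1) {}

    int operator()() const { return value; }
    int a() const { return a_; }
    int b() const { return b_; }

    int& operator--() {
        --value;
        // --a_; --b_;  // would not compile: a_ and b_ are const
        return value;
    }

private:
    // Private constructor: receives the already computed value and
    // initializes both const members from it. Nothing is stored beyond that.
    T(int n, int expensive)
        : value{n - 1}, a_{ka * expensive}, b_{kb * expensive} {}

    int value;
    const int a_;
    const int b_;
};
```

No const_cast is involved, and the intermediate value lives only for the duration of the constructor call.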
One possible way could be to put a and b in a second structure, which does the expensive calculation, and then have a constant member of this structure.
Perhaps something like this:
class T {
struct constants {
int a;
int b;
constants(int n) {
const int expensive = ... something involving n...;
a = ka * expensive;
b = kb * expensive;
}
};
constants const c_;
public:
T(int n)
: c_{ n }
{
}
};
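A completed version of that idea could look like the following sketch. It assumes that the question's n*n*n - 1 stands in for the elided expensive computation, and it adds the accessors from the question so the result can be observed; the nested type is spelled Constants here purely as a naming choice.

```cpp
constexpr int ka = 10;
constexpr int kb = 25;

class T {
    // Helper structure that performs the expensive calculation once.
    struct Constants {
        int a;
        int b;
        explicit Constants(int n) {
            const int expensive = n * n * n - 1;  // stand-in from the question
            a = ka * expensive;
            b = kb * expensive;
        }
    };

    int value;
    const Constants c_;  // const as a whole, so a and b cannot change later

public:
    explicit T(int n) : value{n - 1}, c_{n} {}

    int operator()() const { return value; }
    int a() const { return c_.a; }
    int b() const { return c_.b; }
    int& operator--() { --value; return value; }
};
```

The members of Constants are assignable inside its own constructor, but once c_ is constructed the whole sub-object is const from T's point of view.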
With that said, why make a_ and b_ constant in the first place, if you control the class T and its implementation?
If you want to inhibit possible modifications from other developers that might work on the T class, then add plenty of documentation and comments about the values not being allowed to be modified. Then if someone modifies the values of a_ or b_ anyway, then it's their fault for making possibly breaking changes. Good code-review practices and proper version control handling should then be used to point out and possibly blame wrongdoers.
Before describing the answer, I'd first suggest that you rethink your interface. If there's an expensive operation, why not let the caller be aware of it and allow them to cache the result? Usually the design forms around the calculations and abstractions that are worth keeping as state; if it's expensive and reusable, it's definitely worth keeping.
Therefore, I'd suggest putting this in the public interface:
struct ExpensiveResult
{
int expensive;
ExpensiveResult(int n)
: expensive(n*n*n - 1)
{}
};
class T
{
private:
const int a;
const int b;
public: // the constructor must be accessible to callers
T(const ExpensiveResult& e)
: a(ka * e.expensive)
, b(kb * e.expensive)
{}
};
Note that ExpensiveResult can be directly constructed from int n (ctor is not explicit), therefore call syntax is similar when you don't cache it; but, caller might, at any time, start storing the result of the expensive calculation.
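To illustrate both call styles, here is a small sketch. The accessors a() and b() and the trailing underscores on the member names are additions made for this example; ka and kb come from the question.

```cpp
constexpr int ka = 10;
constexpr int kb = 25;

// Holds the result of the expensive calculation so that callers may cache it.
struct ExpensiveResult {
    int expensive;
    ExpensiveResult(int n)  // intentionally not explicit
        : expensive(n * n * n - 1) {}
};

class T {
public:
    T(const ExpensiveResult& e)
        : a_(ka * e.expensive), b_(kb * e.expensive) {}

    int a() const { return a_; }
    int b() const { return b_; }

private:
    const int a_;
    const int b_;
};
```

A caller who does not care can write T t(3); and the int converts implicitly. A caller who builds several objects from the same input can write ExpensiveResult cached(3); once and pass cached to each constructor, so the calculation runs a single time.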
It's pretty easy to modify the const ints in your object as a result of a significant change in C++20. The library functions std::construct_at and std::destroy_at (declared in <memory>) have been provided to simplify this. For your class, destroy_at is superfluous since the class contains no members that use dynamic memory, such as a vector. I've made a small modification by adding a three-argument constructor, T(int n, int a, int b). I've also defined an operator= that allows the objects to be manipulated in containers. You can also use construct_at to decrement a_ and b_ in your operator-- method. Here's the code:
#include <iostream>
#include <memory>
namespace {
constexpr int initializer{ 3 };
constexpr int ka{ 10 };
constexpr int kb{ 25 };
class T {
private:
int value;
const int a_{};
const int b_{};
public:
T(int n);
T(int n, int a, int b);
T(const T&) = default;
inline int operator()() const { return value; }
inline int a() const { return a_; }
inline int b() const { return b_; }
int& operator--();
T& operator=(const T& arg) { std::construct_at(this, arg); return *this; };
};
T::T(const int n, const int a, const int b) : value{ n - 1 }, a_{ a }, b_{ b } {}
T::T(const int n) : value{ n - 1 }
{
// The integer expensive
// + is to be computed only once and,
// + after the T object has been constructed,
// is not to be stored.
// These requirements must be met without reliance
// on the compiler's optimizer.
const int expensive{ n * n * n - 1 };
std::construct_at(this, n, ka*expensive, kb*expensive);
}
int& T::operator--()
{
// implement decrements
//--a_; --b_;
const int a_1 = a_ - 1;
const int b_1 = b_ - 1;
std::construct_at(this, value, a_1, b_1);
return value;
}
}
int main()
{
T t(initializer);
std::cout << "before decrement, t() == " << t() << "\n";
--t;
std::cout << "after decrement, t() == " << t() << "\n";
std::cout << "t.a() == " << t.a() << "\n";
std::cout << "t.b() == " << t.b() << "\n";
return 0;
}
Output:
before decrement, t() == 2
after decrement, t() == 1
t.a() == 259
t.b() == 649

How to efficiently alias a float as both a named member and an element of an array?

I am implementing a particle-based fluid simulation. To represent vectors such as velocity, acceleration etc I have defined a class that looks like this
class Vec3f {
public:
float x, y, z;
// ... bunch of constructors, operators and utility functions
};
I'm using the library nanoflann for kd-tree searches. To accommodate for arbitrary class designs, nanoflann requires a user-defined adaptor class that the kd-tree class then queries to get info about the particle dataset. One function that the adaptor has to offer, as described in the nanoflann documentation is the following.
// Must return the dim'th component of the idx'th point in the class:
inline T kdtree_get_pt(const size_t idx, int dim) const { ... }
The problem is that this interface does not work seamlessly with the x, y, z representation. Naively, it would need to do something like this
inline float kdtree_get_pt(const size_t idx, int dim) const {
switch(dim) {
case 0:
return particlearray[idx].x;
case 1:
return particlearray[idx].y;
case 2:
return particlearray[idx].z;
}
}
Building and querying the kd-tree consumes a significant portion of my app's runtime, and kdtree_get_pt gets queried multiple times in the process, so I need it to be optimized. The following solution should be faster.
class Vec3f {
public:
float values[3];
// ...
};
// Then inside the adaptor class
inline float kdtree_get_pt(const size_t idx, int dim) const {
return particlearray[idx].values[dim];
}
However, I much prefer the x, y, z interface for my equations. Now that the problem is clear, my question is how can I keep the x, y, z notation for my equations without making kdtree_get_pt suboptimal.
Solutions I have considered:
Vec3f has member float values[3] and getters in the form of float& x(). The function call should be optimized away completely so this almost works but I do not want to have to add the parentheses in my equations if I can avoid it. eg I want to be able to write vec1.x - vec2.x instead of vec1.x() - vec2.x(). As far as I know, C++ does not offer a way to "disguise" a function call as a member variable, excluding preprocessor macros which I do not consider a safe solution.
Vec3f has members float values[3] and float& x, y, z where the latter are initialized to point to the corresponding floats in the array. I thought they would be optimized away as they are known at compile time and obviously cannot change value after initialization, but even with optimizations on, MSVC++ seems to actually store the float&s as can be seen by sizeof(Vec3f) growing by 12 bytes after their addition. This doubles the storage size of my dataset which raises a concern for cache misses when working with arbitrarily large datasets.
kdtree_get_pt uses float& values[3] to point to x, y, z. This might eliminate the branching cost, but I don't believe the extra level of indirection, nor the need to initialize all 3 references, can be optimized away, so it is presumably slower than the return particlearray[idx].values[dim] version.
kdtree_get_pt uses reinterpret_cast or pointer magic to directly point into Vec3f's members. Given a Vec3f object's address, I believe x, y, z are guaranteed to be stored in that order with the first one stored at the same address as the one given by the & operator on the Vec3f object, but even so I'm confused as to whether there exists a well-defined way to observe them.
From a software engineering standpoint, it's best to expose the data through accessor and modifier functions only.
I would suggest:
class Vec3f
{
public:
float& operator[](size_t index) { return values[index]; }
float operator[](size_t index) const { return values[index]; }
float& x() { return values[0]; }
float x() const { return values[0]; }
float& y() { return values[1]; }
float y() const { return values[1]; }
float& z() { return values[2]; }
float z() const { return values[2]; }
private:
float values[3];
};
Re: kdtree_get_pt uses reinterpret_cast or pointer magic to directly point into Vec3f's members.
That's a bad idea in general. However, I don't see that being a problem with my suggestion.
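With the array-backed class, the adaptor function from the question reduces to a plain index. The sketch below shows only kdtree_get_pt; the struct name ParticleAdaptor, the std::vector storage and the three-float constructor are assumptions made for the example, and the rest of the nanoflann adaptor interface is left out.

```cpp
#include <cstddef>
#include <vector>

class Vec3f {
public:
    Vec3f(float x, float y, float z) : values{x, y, z} {}

    float& operator[](std::size_t index) { return values[index]; }
    float operator[](std::size_t index) const { return values[index]; }

    float& x() { return values[0]; }
    float x() const { return values[0]; }
    float& y() { return values[1]; }
    float y() const { return values[1]; }
    float& z() { return values[2]; }
    float z() const { return values[2]; }

private:
    float values[3];
};

// Hypothetical adaptor holding the particle dataset.
struct ParticleAdaptor {
    std::vector<Vec3f> particlearray;

    // Returns the dim'th component of the idx'th point: no switch needed.
    inline float kdtree_get_pt(const std::size_t idx, int dim) const {
        return particlearray[idx][static_cast<std::size_t>(dim)];
    }
};
```

Equations can still be written with x(), y(), z(), at the cost of the parentheses the question wanted to avoid.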
You should always check whether the switch statements will really introduce branches in the final compiled output. A tool that might help you there is godbolt.
For both of those code snippets (random and cout have been added to prevent complete removal of the code):
#include<cstddef>
#include<cstdlib>
#include<array>
#include<iostream>
#include <ctime>
class Vec3f {
public:
float values[3];
};
struct Test {
std::array<Vec3f,100> particlearray;
float kdtree_get_pt(const size_t idx, int dim) const {
return particlearray[idx].values[dim];
}
};
int main() {
Test t;
std::srand(std::time(0));
int random_variable = std::rand();
std::cout << t.kdtree_get_pt(random_variable,0);
std::cout << t.kdtree_get_pt(random_variable,1);
std::cout << t.kdtree_get_pt(random_variable,2) << std::endl;
return 0;
}
and
#include<iostream>
#include<array>
#include<ctime>
#include<cstdlib>
#include<cstddef>
class Vec3f {
public:
float x, y, z;
};
struct Test {
std::array<Vec3f,100> particlearray;
float kdtree_get_pt(const size_t idx, int dim) const {
switch(dim) {
case 0:
return particlearray[idx].x;
case 1:
return particlearray[idx].y;
case 2:
return particlearray[idx].z;
}
}
};
int main() {
Test t;
std::srand(std::time(0));
int random_variable = std::rand();
std::cout << t.kdtree_get_pt(random_variable,0);
std::cout << t.kdtree_get_pt(random_variable,1);
std::cout << t.kdtree_get_pt(random_variable,2) << std::endl;
return 0;
}
The access to x, y and z or values[dim] will be compiled (by gcc 7) to:
cvtss2sd xmm0, DWORD PTR [rsp+rbx]
cvtss2sd xmm0, DWORD PTR [rsp+4+rbx]
cvtss2sd xmm0, DWORD PTR [rsp+8+rbx]
Without any branching.
There is a known technique for mixing access via x, y, z with access via array indices, using a union of identical data types. It resolves the problem with UB, sizeof() is 12 bytes, access is as fast as it can be, and a SIMD vector could be used in a very similar fashion. The code below was tested with VS2017.
#include <iostream>
#include <type_traits>
template <int Size> struct VectorBase {
float _data[Size];
float operator[](int Index) {
return _data[Index];
}
};
template <typename VectorType, int Index> struct ScalarAccessor {
VectorType _v;
operator float() const {
return _v._data[Index];
}
float operator = (float x) {
_v._data[Index] = x;
return *this;
}
};
union uuu {
VectorBase<3> xyz;
ScalarAccessor<VectorBase<3>, 0> x;
ScalarAccessor<VectorBase<3>, 1> y;
ScalarAccessor<VectorBase<3>, 2> z;
};
template <int Size> struct Vector {
union
{
VectorBase<3> xyz;
ScalarAccessor<VectorBase<3>, 0> x;
ScalarAccessor<VectorBase<3>, 1> y;
ScalarAccessor<VectorBase<3>, 2> z;
};
float operator[](int Index) {
return xyz[Index];
}
};
using Vec3f = Vector<3>;
int main() {
Vec3f a;
a.x = 1.0f;
a.y = a.x + 3.0f;
a.z = a.x * 3.0f;
std::cout << sizeof(a) << "\n";
std::cout << a.x << " " << a.y << " " << a.z << "\n";
std::cout << a[0] << " " << a[1] << " " << a[2] << "\n";
std::cout << std::is_standard_layout<VectorBase<3>>::value << "\n";
std::cout << std::is_standard_layout<ScalarAccessor<VectorBase<3>, 0>>::value << "\n";
std::cout << std::is_standard_layout<ScalarAccessor<VectorBase<3>, 1>>::value << "\n";
std::cout << std::is_standard_layout<ScalarAccessor<VectorBase<3>, 2>>::value << "\n";
std::cout << std::is_standard_layout<Vec3f>::value << "\n";
std::cout << std::is_standard_layout<uuu>::value << "\n";
return 0;
}
UPDATE
Here is some C++ standard reading.
I'm relying on the definition of a standard-layout type in 12.7 Classes:
A class S is a standard-layout class if it:
(7.1) — has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(7.2) — has no virtual functions (13.3) and no virtual base classes (13.1),
(7.3) — has the same access control (Clause 14) for all non-static data members,
(7.4) — has no non-standard-layout base classes,
(7.5) — has at most one base class subobject of any given type
...
It is easy to check if all proposed classes are standard-layout - I've changed the code to check for that.
They are all layout-compatible, I believe
Also, if a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member.
The union is a standard-layout class as well, so we have classes overlaid in a union whose only data member is an array of the same type and size, and it looks like the standard requires them to be byte-for-byte compatible.
However, I much prefer the x, y, z interface for my equations. Now that the problem is clear, my question is how can I keep the x, y, z
Declare x,y,z as local references before calculation:
auto& [x1, y1, z1] = v1.values;
auto& [x2, y2, z2] = v2.values;
return x1*x2 + y1*y2 + z1*z2;
For pre-C++17, you need the more verbose form:
auto& x = values[0];
auto& y = values[1];
auto& z = values[2];
The compiler will not need to use any storage for these references.
This of course introduces some repetition; One line (in C++17) per vector per function.
Extra parentheses introduced by your first suggestion is another good way to go. Whether the introduction of parentheses is better or worse than local reference declaration boiler plate depends on the use case and personal preference.
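Put together as a complete function, the structured-binding version might look like this sketch. It needs C++17; the function name dot is made up for the example, and the bindings are const references because the vectors are passed as const.

```cpp
struct Vec3f {
    float values[3];
};

// Dot product written with x, y, z names bound to the array elements.
inline float dot(const Vec3f& v1, const Vec3f& v2) {
    const auto& [x1, y1, z1] = v1.values;
    const auto& [x2, y2, z2] = v2.values;
    return x1 * x2 + y1 * y2 + z1 * z2;
}
```

The storage stays a plain float[3], so kdtree_get_pt can keep using values[dim] directly.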
Edit: Another alternative: Define operator[] and use named constants for indices.
namespace axes {
enum axes {
x, y, z
};
}
struct Vec3f {
float values[3];
float& operator[](size_t index) { return values[index]; }
float operator[](size_t index) const { return values[index]; }
};
// usage
using namespace axes;
return v1[x]*v2[x] + v1[y]*v2[y] + v1[z]*v2[z];

C++ type casting pointers to constant variables

I'm hoping there's a way to write a single get function for a class with a large number of accessible (but non-editable) data members, of mixed type. Use of a map holding void*-cast copies of the members' addresses will work, as seen in the following code, but as soon as a 'const' is thrown in to the mix to enforce read-only, unsurprisingly C++ barks saying that 'const void*' type cannot be recast in order to appropriately access the data member. The following code works for writing a single get function for a class of mixed data types, but it effectively makes all data members accessed by the get function public (see specifically the get function in the memlist class).
Bottom line:
Is there a way to make a pointer type-castable while retaining read-only at the actual memory location? Or more fundamentally, can one define a type cast-able pointer to a constant variable? E.g., it seems to me that const type *var defines a read-only/non-castable address to a read-only variable, whereas I am trying to find something (that hasn't worked for me as of yet) more like type * const var, though I haven't been able to find any documentation on this.
#include <iostream>
#include <string>
#include <map>
class A{
public:
A(int a, double b): a(a), b(b) {};
private:
int a;
double b;
friend std::ostream& operator<<(std::ostream& os, A& rhs);
};
class memlist{
public:
memlist(int param1, double param2)
{
myint = new int(param1);
mydouble = new double(param2);
myclass = new A(param1,param2);
getMap["myint"] = myint;
getMap["mydouble"] = mydouble;
getMap["myclass"] = myclass;
}
~memlist()
{
delete myint;
delete mydouble;
delete myclass;
}
void* get(std::string param) {return getMap[param];};
private:
int *myint;
double *mydouble;
A *myclass;
std::map<std::string,void*> getMap;
};
std::ostream& operator<<(std::ostream& os, A& rhs){
os << rhs.a << std::endl << rhs.b;
return os;
};
int main(){
int myint = 5;
double mydbl = 3.14159263;
memlist mymem(myint,mydbl);
std::cout << *(int*)mymem.get("myint") << std::endl;
std::cout << *(double*)mymem.get("mydouble") << std::endl;
std::cout << *(A*)mymem.get("myclass") << std::endl;
*(int*)mymem.get("myint") = 10;
std::cout << *(int*)mymem.get("myint") << std::endl;
return 0;
}
Output:
5
3.14159
5
3.14159
10
The code shown is very, shall we say, ill-designed.
void* is as close to destroying the type system as it gets in C++. As mentioned in the comments, std::any is a better solution to this.
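For reference, the std::any suggestion could be sketched as follows. This assumes C++17, stores copies of the values rather than pointers, and uses a hypothetical class name MemList; get returns a const reference, so callers can read but not modify the stored value.

```cpp
#include <any>
#include <map>
#include <string>

class MemList {
public:
    MemList(int param1, double param2) {
        members_["myint"] = param1;
        members_["mydouble"] = param2;
    }

    // Read-only access by name. Throws std::out_of_range for an unknown
    // name and std::bad_any_cast if T does not match the stored type.
    template <typename T>
    const T& get(const std::string& name) const {
        return std::any_cast<const T&>(members_.at(name));
    }

private:
    std::map<std::string, std::any> members_;
};
```

A call then reads mymem.get<int>("myint"), and asking for the wrong type is reported at run time instead of silently reinterpreting memory.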
That said, I took it as a challenge to implement what you have illustrated in the question in a type-safe manner. It was overkill, to say the least.
#include <iostream>
#include <type_traits>
using namespace std;
template<typename>
struct is_str_literal : false_type {};
template<size_t N>
struct is_str_literal<const char[N]> : true_type {};
template<typename T>
struct is_str_literal<T&> : is_str_literal<T> {};
template<typename T>
constexpr bool is_str_literal_v = is_str_literal<T>::value;
constexpr bool samestr(const char* arr1, const char* arr2, size_t n)
{
return n == 0 ? arr1[0] == arr2[0] :
(arr1[n] == arr2[n]) && samestr(arr1, arr2, n - 1);
}
template<size_t N1, size_t N2>
constexpr bool samestr(const char (&arr1)[N1], const char (&arr2)[N2])
{
return N1 == N2 ? samestr(arr1, arr2, N1 - 1) : false;
}
constexpr char myint[] = "myint";
constexpr char mydouble[] = "mydouble";
constexpr char myclass[] = "myclass";
struct S
{
template<const auto& name>
const auto& get()
{
static_assert(is_str_literal_v<decltype(name)>, "usage: get<var name>()");
if constexpr(samestr(name, ::myint))
return myint;
if constexpr(samestr(name, ::mydouble))
return mydouble;
if constexpr(samestr(name, ::myclass))
return myclass;
}
int myint;
double mydouble;
char myclass;
};
int main()
{
S s;
s.myint = 42;
s.mydouble = 10.0;
s.myclass = 'c';
cout << s.get<myint>() << endl;
cout << s.get<mydouble>() << endl;
cout << s.get<myclass>() << endl;
}
This uses C++17.
After some further poking around, I have to respectfully disagree with the previous assessments in the comments and answers... I have, since posting this question, come across many functions in the standard C library where void * types are readily used (http://www.cplusplus.com/reference/cstdlib/qsort/), not to mention it being the return type of malloc (probably the most widely-used function in C/C++?) which relies on programmer type-casting. Also, to the best of my knowledge, std::any is a new c++17 class, so how might you have answered this question 6 months ago?
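On the original const question: a const void* can be converted back to a pointer to const of the real type with static_cast, so a void*-style map can stay read-only. The sketch below assumes the members are stored directly in the object rather than allocated with new, and the class name MemList is hypothetical; copying is disabled because the map holds addresses of the object's own members.

```cpp
#include <map>
#include <string>

class MemList {
public:
    MemList(int param1, double param2) : myint_(param1), mydouble_(param2) {
        getMap_["myint"] = &myint_;
        getMap_["mydouble"] = &mydouble_;
    }

    // The stored addresses point into this object, so copies are disabled.
    MemList(const MemList&) = delete;
    MemList& operator=(const MemList&) = delete;

    // Pointer to const: the caller can cast the type back, but writing
    // through the result does not compile without an explicit const_cast.
    const void* get(const std::string& name) const { return getMap_.at(name); }

private:
    int myint_;
    double mydouble_;
    std::map<std::string, const void*> getMap_;
};
```

Reading works as *static_cast<const int*>(mymem.get("myint")), while *static_cast<const int*>(mymem.get("myint")) = 10; is rejected by the compiler. The caller still has to know the right type for each name, which is the weakness the answers point at.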

Changing VTBL of existing object "on the fly", dynamic subclassing

Consider the following setup.
Base class:
class Thing {
protected:
int f1;
int f2;
public:
Thing(NO_INIT) {}
Thing(int n1 = 0, int n2 = 0): f1(n1),f2(n2) {}
virtual ~Thing() {}
virtual void doAction1() {}
virtual const char* type_name() { return "Thing"; }
};
And derived classes that are different only by implementation of methods above:
class Summator : public Thing {
public:
Summator(NO_INIT):Thing(NO_INIT) {}
virtual void doAction1() override { f1 += f2; }
virtual const char* type_name() override { return "Summator"; }
};
class Substractor : public Thing {
public:
Substractor(NO_INIT):Thing(NO_INIT) {}
virtual void doAction1() override { f1 -= f2; }
virtual const char* type_name() override { return "Substractor"; }
};
The task I have requires ability to change class (VTBL in this case) of existing objects on the fly. This is known as dynamic subclassing if I am not mistaken.
So I came up with the following function:
// marker used in inplace CTORs
struct NO_INIT {};
template <typename TO_T>
inline TO_T* turn_thing_to(Thing* p)
{
return ::new(p) TO_T(NO_INIT());
}
that does just that - it uses inplace new to construct one object in place of another. Effectively this just changes vtbl pointer in objects. So this code works as expected:
Thing* thing = new Thing();
cout << thing->type_name() << endl; // "Thing"
turn_thing_to<Summator>(thing);
cout << thing->type_name() << endl; // "Summator"
turn_thing_to<Substractor>(thing);
cout << thing->type_name() << endl; // "Substractor"
The only major problems I have with this approach are that
a) each derived class must have a special constructor like Thing(NO_INIT) {} that does precisely nothing, and b) if I want to add members like std::string to Thing, they will not work: only types that themselves have NO_INIT constructors are allowed as members of Thing.
Question: is there a better solution for such dynamic subclassing that solves 'a' and 'b' problems ? I have a feeling that std::move semantic may help to solve 'b' somehow but not sure.
Here is the ideone of the code.
(Already answered at RSDN http://rsdn.ru/forum/cpp/5437990.1)
There is a tricky way:
#include <assert.h>
#include <stdio.h>
#include <new>
#include <utility>
struct Base
{
int x, y, z;
Base(int i) : x(i), y(i+i), z(i*i) {}
virtual void whoami() { printf("%p base %d %d %d\n", this, x, y, z); }
};
struct Derived : Base
{
Derived(Base&& b) : Base(b) {}
virtual void whoami() { printf("%p derived %d %d %d\n", this, x, y, z); }
};
int main()
{
Base b(3);
Base* p = &b;
b.whoami();
p->whoami();
assert(sizeof(Base)==sizeof(Derived));
Base t(std::move(b));
Derived* d = new(&b)Derived(std::move(t));
printf("-----\n");
b.whoami(); // the compiler still believes it is Base, and calls Base::whoami
p->whoami(); // here it calls virtual function, that is, Derived::whoami
d->whoami();
};
Of course, it's UB.
For your code, I'm not 100% sure it's valid according to the standard.
I think the usage of the placement new which doesn't initialize any member variables, so to preserve previous class state, is undefined behavior in C++. Imagine there is a debug placement new which will initialize all uninitialized member variable into 0xCC.
union is a better solution in this case. However, it does seem that you are implementing the strategy pattern. If so, please use the strategy pattern, which will make code a lot easier to understand & maintain.
Note: the virtual should be removed when using union.
Adding it is ill-formed as mentioned by Mehrdad, because introducing virtual function doesn't meet standard layout.
example
#include <iostream>
#include <string>
using namespace std;
class Thing {
int a;
public:
Thing(int v = 0): a (v) {}
const char * type_name(){ return "Thing"; }
int value() { return a; }
};
class OtherThing : public Thing {
public:
OtherThing(int v): Thing(v) {}
const char * type_name() { return "Other Thing"; }
};
union Something {
Something(int v) : t(v) {}
Thing t;
OtherThing ot;
};
int main() {
Something sth{42};
std::cout << sth.t.type_name() << "\n";
std::cout << sth.t.value() << "\n";
std::cout << sth.ot.type_name() << "\n";
std::cout << sth.ot.value() << "\n";
return 0;
}
As mentioned in the standard:
In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [ Note: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (9.2), and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members; see 9.2. — end note ]
Question: is there a better solution for such dynamic subclassing that solves 'a' and 'b' problems ?
If you have a fixed set of sub-classes, then you may consider using an algebraic data type like boost::variant. Store the shared data separately and place all varying parts into the variant.
Properties of this approach:
naturally works with fixed set of "sub-classes". (though, some kind of type-erased class can be placed into variant and set would become open)
dispatch is done via switch on small integral tag. Sizeof tag can be minimized to one char. If your "sub-classes" are empty - then there will be small additional overhead (depends on alignment), because boost::variant does not perform empty-base-optimization.
"Sub-classes" can have arbitrary internal data. Such data from different "sub-classes" will be placed in one aligned_storage.
You can make bunch of operations with "sub-class" using only one dispatch per batch, while in general case with virtual or indirect calls dispatch will be per-call. Also, calling method from inside "sub-class" will not have indirection, while with virtual calls you should play with final keyword to try to achieve this.
self to base shared data should be passed explicitly.
Ok, here is proof-of-concept:
struct ThingData
{
int f1;
int f2;
};
struct Summator
{
void doAction1(ThingData &self) { self.f1 += self.f2; }
const char* type_name() { return "Summator"; }
};
struct Substractor
{
void doAction1(ThingData &self) { self.f1 -= self.f2; }
const char* type_name() { return "Substractor"; }
};
using Thing = SubVariant<ThingData, Summator, Substractor>;
int main()
{
auto test = [](auto &self, auto &sub)
{
sub.doAction1(self);
cout << sub.type_name() << " " << self.f1 << " " << self.f2 << endl;
};
Thing x = {{5, 7}, Summator{}};
apply(test, x);
x.sub = Substractor{};
apply(test, x);
cout << "size: " << sizeof(x.sub) << endl;
}
Output is:
Summator 12 7
Substractor 5 7
size: 2
Full Code (it uses some C++14 features, but can be mechanically converted into C++11):
#define BOOST_VARIANT_MINIMIZE_SIZE
#include <boost/variant.hpp>
#include <type_traits>
#include <functional>
#include <iostream>
#include <utility>
using namespace std;
/****************************************************************/
// Boost.Variant requires result_type:
template<typename T, typename F>
struct ResultType
{
mutable F f;
using result_type = T;
template<typename ...Args> T operator()(Args&& ...args) const
{
return f(forward<Args>(args)...);
}
};
template<typename T, typename F>
auto make_result_type(F &&f)
{
return ResultType<T, typename decay<F>::type>{forward<F>(f)};
}
/****************************************************************/
// Proof-of-Concept
template<typename Base, typename ...Ts>
struct SubVariant
{
Base shared_data;
boost::variant<Ts...> sub;
template<typename Visitor>
friend auto apply(Visitor visitor, SubVariant &operand)
{
using result_type = typename common_type
<
decltype( visitor(shared_data, declval<Ts&>()) )...
>::type;
return boost::apply_visitor(make_result_type<result_type>([&](auto &x)
{
return visitor(operand.shared_data, x);
}), operand.sub);
}
};
/****************************************************************/
// Demo:
struct ThingData
{
int f1;
int f2;
};
struct Summator
{
void doAction1(ThingData &self) { self.f1 += self.f2; }
const char* type_name() { return "Summator"; }
};
struct Substractor
{
void doAction1(ThingData &self) { self.f1 -= self.f2; }
const char* type_name() { return "Substractor"; }
};
using Thing = SubVariant<ThingData, Summator, Substractor>;
int main()
{
auto test = [](auto &self, auto &sub)
{
sub.doAction1(self);
cout << sub.type_name() << " " << self.f1 << " " << self.f2 << endl;
};
Thing x = {{5, 7}, Summator{}};
apply(test, x);
x.sub = Substractor{};
apply(test, x);
cout << "size: " << sizeof(x.sub) << endl;
}
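For comparison, the same idea can probably be written more compactly if C++17 is available, since std::variant and std::visit remove the need for the ResultType helper. This is only a sketch under that assumption; SubVariant17, apply_sub and demo_variant are hypothetical names, not part of the answer above.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct ThingData { int f1; int f2; };

struct Summator {
    void doAction1(ThingData& self) { self.f1 += self.f2; }
    const char* type_name() { return "Summator"; }
};
struct Substractor {
    void doAction1(ThingData& self) { self.f1 -= self.f2; }
    const char* type_name() { return "Substractor"; }
};

// Shared data plus a variant holding the current "sub-class".
template <typename Base, typename... Ts>
struct SubVariant17 {
    Base shared_data;
    std::variant<Ts...> sub;
};

// One dispatch on the variant's tag, then direct calls inside the visitor.
template <typename Visitor, typename Base, typename... Ts>
decltype(auto) apply_sub(Visitor&& visitor, SubVariant17<Base, Ts...>& operand) {
    return std::visit(
        [&](auto& current) -> decltype(auto) {
            return visitor(operand.shared_data, current);
        },
        operand.sub);
}

std::string demo_variant() {
    SubVariant17<ThingData, Summator, Substractor> x{{5, 7}, Summator{}};
    auto act = [](ThingData& self, auto& sub) {
        sub.doAction1(self);
        return std::string(sub.type_name()) + " " + std::to_string(self.f1);
    };
    std::string out = apply_sub(act, x);   // Summator: 5 + 7
    assert(x.shared_data.f1 == 12);
    x.sub = Substractor{};                  // switch "sub-class"; shared data is kept
    out += ", " + apply_sub(act, x);        // Substractor: 12 - 7
    return out;
}
```

Note that std::visit requires the visitor to return the same type for every alternative, which is why the lambda here always returns std::string.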
use return new(p) static_cast<TO_T&&>(*p);
Here is a good resource regarding move semantics: What are move semantics?
You simply can't legally "change" the class of an object in C++.
However if you mention why you need this, we might be able to suggest alternatives. I can think of these:
Do v-tables "manually". In other words, each object of a given class should have a pointer to a table of function pointers that describes the behavior of the class. To modify the behavior of this class of objects, you modify the function pointers. Pretty painful, but that's the whole point of v-tables: to abstract this away from you.
Use discriminated unions (variant, etc.) to nest objects of potentially different types inside the same kind of object. I'm not sure if this is the right approach for you though.
Do something implementation-specific. You can probably find the v-table formats online for whatever implementation you're using, but you're stepping into the realm of undefined behavior here so you're playing with fire. And it most likely won't work on another compiler.
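As a rough illustration of the first alternative (manual v-tables), a sketch could look like the following. All names here (ThingVTable, plain_vtable, other_vtable, demo_manual_vtable) are invented for the example, and the doubled value in other_value is just a placeholder behaviour.

```cpp
#include <cassert>
#include <cstring>

struct Thing;

// Hand-written "v-table": one function pointer per virtual-like operation.
struct ThingVTable {
    const char* (*type_name)(const Thing&);
    int (*value)(const Thing&);
};

struct Thing {
    const ThingVTable* vtable;  // swapped to change behaviour
    int a;                      // data survives the swap untouched

    const char* type_name() const { return vtable->type_name(*this); }
    int value() const { return vtable->value(*this); }
};

// Behaviour of the two "classes".
inline const char* plain_name(const Thing&) { return "Thing"; }
inline int plain_value(const Thing& t) { return t.a; }
inline const char* other_name(const Thing&) { return "OtherThing"; }
inline int other_value(const Thing& t) { return t.a * 2; }

const ThingVTable plain_vtable = {plain_name, plain_value};
const ThingVTable other_vtable = {other_name, other_value};

int demo_manual_vtable() {
    Thing t = {&plain_vtable, 42};
    assert(std::strcmp(t.type_name(), "Thing") == 0);
    t.vtable = &other_vtable;   // "change the class" without touching t.a
    assert(std::strcmp(t.type_name(), "OtherThing") == 0);
    return t.value();           // 42 * 2
}
```

Because the object's real C++ type never changes, this stays within defined behaviour, at the cost of writing the dispatch by hand.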
You should be able to reuse data by separating it from your Thing class. Something like this:
template <class TData, class TBehaviourBase>
class StateStorageable {
struct StateStorage {
typedef typename std::aligned_storage<sizeof(TData), alignof(TData)>::type DataStorage;
DataStorage data_storage;
typedef typename std::aligned_storage<sizeof(TBehaviourBase), alignof(TBehaviourBase)>::type BehaviourStorage;
BehaviourStorage behaviour_storage;
static constexpr TData *data(TBehaviourBase * behaviour) {
return reinterpret_cast<TData *>(
reinterpret_cast<char *>(behaviour) -
(offsetof(StateStorage, behaviour_storage) -
offsetof(StateStorage, data_storage)));
}
};
public:
template <class ...Args>
static TBehaviourBase * create(Args&&... args) {
auto storage = ::new StateStorage;
::new(&storage->data_storage) TData(std::forward<Args>(args)...);
return ::new(&storage->behaviour_storage) TBehaviourBase;
}
static void destroy(TBehaviourBase * behaviour) {
auto storage = reinterpret_cast<StateStorage *>(
reinterpret_cast<char *>(behaviour) -
offsetof(StateStorage, behaviour_storage));
::delete storage;
}
protected:
StateStorageable() = default;
inline TData *data() {
return StateStorage::data(static_cast<TBehaviourBase *>(this));
}
};
struct Data {
int a;
};
class Thing : public StateStorageable<Data, Thing> {
public:
virtual const char * type_name(){ return "Thing"; }
virtual int value() { return data()->a; }
};
Data is guaranteed to be left intact when you change Thing to another type, and the offsets should be calculated at compile time, so performance shouldn't be affected.
With a proper set of static_asserts you should be able to ensure that all offsets are correct and that there is enough storage for holding your types. Now you only need to change the way you create and destroy your Things.
int main() {
Thing * thing = Thing::create(Data{42});
std::cout << thing->type_name() << "\n";
std::cout << thing->value() << "\n";
turn_thing_to<OtherThing>(thing);
std::cout << thing->type_name() << "\n";
std::cout << thing->value() << "\n";
Thing::destroy(thing);
return 0;
}
There is still UB because thing is not reassigned, which can be fixed by using the result of turn_thing_to:
int main() {
...
thing = turn_thing_to<OtherThing>(thing);
...
}
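The static_assert checks mentioned earlier in this answer might look roughly like the sketch below. can_replace_in_place, Base and Same are hypothetical names, and the checks shown (derivation, size, alignment) are an assumption about what would need verifying, not an exhaustive list.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical compile-time check: may a To be constructed in storage sized for a From?
template <class From, class To>
constexpr bool can_replace_in_place() {
    static_assert(std::is_base_of<From, To>::value, "To must derive from From");
    static_assert(sizeof(To) <= sizeof(From), "To must fit in From's storage");
    static_assert(alignof(To) <= alignof(From), "To must not need stricter alignment");
    return true;
}

// Example types: the derived class adds behaviour but no data members.
struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};
struct Same : Base {
    int id() const override { return 2; }
};

// Instantiating the check is enough to trigger the static_asserts.
static_assert(can_replace_in_place<Base, Same>(), "Same can replace Base in place");
```

A derived type that adds data members would fail the size check at compile time instead of corrupting memory at run time.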
Here is one more solution
While it is slightly less optimal (it uses intermediate storage and CPU cycles to invoke the move constructors), it does not change the semantics of the original task.
#include <iostream>
#include <string>
#include <memory>
using namespace std;
struct A
{
int x;
std::string y;
A(int x, std::string y) : x(x), y(y) {}
A(A&& a) : x(std::move(a.x)), y(std::move(a.y)) {}
virtual const char* who() const { return "A"; }
void show() const { std::cout << (void const*)this << " " << who() << " " << x << " [" << y << "]" << std::endl; }
};
struct B : A
{
virtual const char* who() const { return "B"; }
B(A&& a) : A(std::move(a)) {}
};
template<class TO_T>
inline TO_T* turn_A_to(A* a) {
A temp(std::move(*a));
a->~A();
return new(a) TO_T(std::move(temp)); // construct the requested type, not a hard-coded B
}
int main()
{
A* pa = new A(123, "text");
pa->show(); // 0xbfbefa58 A 123 [text]
turn_A_to<B>(pa);
pa->show(); // 0xbfbefa58 B 123 [text]
}
The solution is derived from an idea expressed by Nickolay Merkin below.
But he suspects UB somewhere in turn_A_to<>().
I have the same problem, and while I'm not using it, one solution I thought of is to have a single class whose methods switch on an "item type" number stored in the class. Changing the type is then as easy as changing the type number.
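enum ItemType { ClarkKent, Superman }; // the possible "item types"
class OneClass {
public:
int iType;
const char* Wears() const {
switch ( iType ) {
case ClarkKent:
return "glasses";
case Superman:
return "cape";
}
return "unknown"; // fallback for an unexpected iType value
}
};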
:
:
OneClass person;
person.iType = ClarkKent;
printf( "now wearing %s\n", person.Wears() );
person.iType = Superman;
printf( "now wearing %s\n", person.Wears() );

non-resizeable vector/array of non-reassignable but mutable members?

Is there a way to make a non-resizeable vector/array of non-reassignable but mutable members? The closest thing I can imagine is using a vector<T *> const copy constructed from a temporary, but since I know at initialization how many of and exactly what I want, I'd much rather have a block of objects than pointers. Is anything like what is shown below possible with std::vector or some more obscure boost, etc., template?
// Struct making vec<A> that cannot be resized or have contents reassigned.
struct B {
vector<A> va_; // <-- unknown modifiers or different template needed here
vector<A> va2_;
// All vector contents initialized on construction.
B(size_t n_foo) : va_(n_foo), va2_(5) { }
// Things I'd like allowed: altering contents, const_iterator and read access.
void good_actions(size_t idx, int val) {
va_[idx].set(val);
cout << "vector<A> info - " << " size: " << va_.size() << ", max: "
<< va_.max_size() << ", capacity: " << va_.capacity() << ", empty?: "
<< va_.empty() << endl;
if (!va_.empty()) {
cout << "First (old): " << va_[0].get() << ", resetting ..." << endl;
va_[0].set(0);
}
int max = 0;
for (vector<A>::const_iterator i = va_.begin(); i != va_.end(); ++i) {
int n = i->get();
if (n > max) { max = n; }
if (n < 0) { i->set(0); }
}
cout << "Max : " << max << "." << endl;
}
// Everything here should fail at compile.
void bad_actions(size_t idx, int val) {
va_[0] = va2_[0];
va_.at(1) = va2_.at(3);
va_.swap(va2_);
va_.erase(va_.begin());
va_.insert(va_.end(), va2_[0]);
va_.resize(1);
va_.clear();
// also: assign, reserve, push, pop, ..
}
};
There is an issue with your requirements. But first let's tackle the fixed-size part: that is what std::tr1::array<class T, size_t N> is for (std::array since C++11), if you know the size at compile time.
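A minimal sketch of the fixed-size case, assuming a C++11 compiler so that std::array can be used in place of std::tr1::array. Item is a hypothetical stand-in for the question's element type A.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for the question's element type A (hypothetical).
class Item {
public:
    Item() : n_(0) {}
    void set(int n) { n_ = n; }
    int get() const { return n_; }
private:
    int n_;
};

int demo_fixed_array() {
    std::array<Item, 3> items;        // the size is part of the type
    items[1].set(7);                   // elements stay mutable
    assert(items.size() == 3);
    // items.resize(1);  // does not compile: std::array has no resize/insert/erase
    int sum = 0;
    for (std::size_t i = 0; i < items.size(); ++i) sum += items[i].get();
    return sum;
}
```

This only addresses resizing. Element assignment such as items[0] = items[1] still compiles unless Item itself disables operator=.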
If you don't know it at compile time, you can still use some proxy class over a vector.
template <class T>
class MyVector
{
public:
explicit MyVector(size_t const n, T const& t = T()): mVector(n,t) {}
// Declare the methods you want here
// and just forward to mVector most of the time ;)
private:
std::vector<T> mVector;
};
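Filled in a little further, such a forwarding wrapper might look like the sketch below. FixedVector and the particular set of forwarded members are assumptions made for illustration; the point is only that the resizing operations are simply never exposed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the proxy idea: only non-resizing operations are forwarded.
template <class T>
class FixedVector {
public:
    typedef typename std::vector<T>::iterator iterator;
    typedef typename std::vector<T>::const_iterator const_iterator;

    explicit FixedVector(std::size_t n, T const& t = T()) : mVector(n, t) {}

    T& operator[](std::size_t i) { return mVector[i]; }
    T const& operator[](std::size_t i) const { return mVector[i]; }
    T& at(std::size_t i) { return mVector.at(i); }
    std::size_t size() const { return mVector.size(); }
    bool empty() const { return mVector.empty(); }
    iterator begin() { return mVector.begin(); }
    iterator end() { return mVector.end(); }
    const_iterator begin() const { return mVector.begin(); }
    const_iterator end() const { return mVector.end(); }
    // Deliberately absent: resize, clear, insert, erase, push_back, swap.

private:
    std::vector<T> mVector;
};

int demo_fixed_vector() {
    FixedVector<int> v(4, 1);
    v[2] = 10;                 // mutation through the forwarded operator[]
    int sum = 0;
    for (FixedVector<int>::const_iterator i = v.begin(); i != v.end(); ++i) sum += *i;
    assert(v.size() == 4);
    return sum;                // 1 + 1 + 10 + 1
}
```

The wrapper as written is still copy-assignable as a whole, so a complete version would probably also disable its own operator=.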
However, what is the point of not being assignable if the object is mutable? Nothing prevents the user from doing the heavy work by hand:
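class Type
{
public:
int a() const { return mA; }
void a(int i) { mA = i; }
int b() const { return mB; }
void b(int i) { mB = i; }
private:
Type& operator=(Type const&); // declared but not defined: assignment is disabled
int mA, mB; // named so they do not clash with the accessors a() and b()
};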
Nothing prevents me from doing:
void assign(Type& lhs, Type const& rhs)
{
lhs.a(rhs.a());
lhs.b(rhs.b());
}
I just want to hit you on the head for complicating my life...
Perhaps you could describe more precisely what you want to do. Do you wish to restrict the subset of possible operations on your class (some variables should not be modifiable, but others could be)?
In this case, you could once again use a Proxy class:
class Proxy
{
public:
// WARN: the semantics are screwed, but `vector` requires a model
// of the Assignable concept, so this operation needs to be defined...
Proxy& operator=(Proxy const& rhs)
{
mType.a(rhs.mType.a()); // go through the accessors; Type's data members are private
// mType.b is unchanged
return *this;
}
int a() const { return mType.a(); }
void a(int i) { mType.a(i); }
int b() const { return mType.b(); }
private:
Type mType;
};
There is not much you cannot do with suitable proxies. That's perhaps the most useful pattern I have ever seen.
What you're asking is not really possible.
The only way to prevent something from being assigned is to declare operator = for that type as private. (As an extension of this, since const operator = methods don't make much sense, and are thus uncommon, you can come close by only allowing access to const references from your container. But the user can still define a const operator =, and you want mutable objects anyway.)
If you think about it, std::vector::operator [] returns a reference to the value it contains. Using the assignment operator on that reference calls operator = for the value. std::vector is completely bypassed here (except for the operator [] call used to get the reference in the first place), so there is no way for std::vector to intercept the call to operator =.
Anything you do to directly access the members of an object in the container has to return a reference to the object, which can then be used to call the object's operator =. So a container cannot prevent the objects inside it from being assigned, unless it hands out a proxy for each object. Such a proxy has a private assignment operator that does nothing, forwards other calls to the "real" object, and does not allow direct access to the real object (though if it made sense to do so, you could return copies of the real object).
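A sketch of such a proxy-returning container follows. It assumes C++11 (for = delete, which plays the role of the private assignment operator), and Item, ItemRef and ItemBlock are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the element type (hypothetical).
class Item {
public:
    Item() : n_(0) {}
    void set(int n) { n_ = n; }
    int get() const { return n_; }
private:
    int n_;
};

// Handle that forwards calls but never exposes an Item&.
class ItemRef {
public:
    explicit ItemRef(Item& item) : item_(&item) {}
    ItemRef(ItemRef const&) = default;             // handles may be copied around
    ItemRef& operator=(ItemRef const&) = delete;   // no assignment through the handle
    void set(int n) const { item_->set(n); }
    int get() const { return item_->get(); }
private:
    Item* item_;
};

// Fixed-size block of Items that only hands out ItemRef handles.
class ItemBlock {
public:
    explicit ItemBlock(std::size_t n) : items_(n) {}
    ItemRef operator[](std::size_t i) { return ItemRef(items_[i]); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<Item> items_;
};

int demo_item_block() {
    ItemBlock a(3), b(3);
    b[0].set(5);
    a[1].set(b[0].get());   // mutation via forwarded calls works
    // a[0] = b[0];         // does not compile: ItemRef::operator= is deleted
    assert(a.size() == 3);
    return a[1].get();
}
```

As the other answer points out, a user can still copy the state by hand through set and get, so this blocks only the assignment syntax.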
Could you create a class which holds a reference to your object, but whose constructors are accessible only to std::vector, declared as its friend?
e.g.:
template<typename T>
class MyRef {
friend class std::vector< MyRef<T> >;
public:
T* operator->();
[...etc...]
You can achieve what you want by making the std::vector const, and the vector's struct or class data mutable. Your set method would have to be const. Here's an example that works as expected with g++:
#include <vector>
class foo
{
public:
foo () : n_ () {}
void set(int n) const { n_ = n; }
private:
mutable int n_;
};
int main()
{
std::vector<foo> const a(3); // Notice the "const".
std::vector<foo> b(1);
// Executes!
a[0].set(1);
// Fails to compile!
a.swap(b);
}
That way you can't alter the vector in any way but you can modify the mutable data members of the objects held by the vector. Here's how this example compiles:
g++ foo.cpp
foo.cpp: In function 'int main()':
foo.cpp:24: error: passing 'const std::vector<foo, std::allocator<foo> >' as 'this' argument of 'void std::vector<_Tp, _Alloc>::swap(std::vector<_Tp, _Alloc>&) [with _Tp = foo, _Alloc = std::allocator<foo>]' discards qualifiers
The one disadvantage I can think of is that you'll have to be more aware of the const-correctness of your code, but that's not necessarily a disadvantage either.
HTH!
EDIT / Clarification: The goal of this approach is not to defeat const completely. Rather, the goal is to demonstrate a means of achieving the requirements set forth in the OP's question using standard C++ and the STL. It is not the ideal solution, since it exposes a const method that alters internal state visible to the user. That is certainly a problem with this approach.