Copying addresses of objects containing an atomic member? - c++

Let's say I have a struct AtomicElement:
struct AtomicElement{
std::atomic<int32_t> a;
};
and I have another class, Object, which contains a reference to one of the above AtomicElement objects:
struct Object{
AtomicElement& ae;
};
Now elsewhere I have a vector of these AtomicElement objects and I would like to update Object::ae to point to different vector elements:
std::vector<AtomicElement> aeVector(AtomicElement());
Object obj;
.
.
//Obtain the address to the new element
AtomicElement& raw_ae = aeVector[i];
.
.
//Change to point to the new element
obj.ae = raw_ae; //Cannot get this part to compile.
What am I doing wrong?
My AtomicElement is only 32 bits, should I just be using by value?
I want to do this in the most efficient way possible, with as little copying as possible.
EDIT:
The 32-bit int is actually representing two 16-bit numbers and the atomicity is so the value is updated.... atomically. I'm wondering if I'd be better off defining a copy constructor as copying 32-bit int would be quicker than pointer dereferencing?

You can't reassign (reseat) a reference, but you can use a pointer instead.
However, a std::atomic is useful (most likely, though it depends on your architecture) because of the way it funnels all access through atomic member functions, not because of any extra member data that makes it atomic. Copying its value around is therefore probably equivalent to copying an int, and it may indeed be faster if you don't need reference semantics, because an extra indirection through memory is relatively slow. As mentioned in the comments, std::atomic itself is not copyable, so you'll have to define what it means to copy it to satisfy your needs.
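One possible way to "define what it means to copy" is sketched below. It assumes that a copy should simply be a snapshot of the other element's current value; that meaning is an assumption, not something the question states. Note that the copy as a whole is not one atomic operation: the load and the store are each atomic, but another thread may write in between.

```cpp
#include <atomic>
#include <cstdint>

struct AtomicElement {
    std::atomic<std::int32_t> a{0};

    AtomicElement() = default;

    // Assumed meaning of "copy": take a snapshot of the other value.
    AtomicElement(const AtomicElement& other) : a(other.a.load()) {}

    AtomicElement& operator=(const AtomicElement& other) {
        a.store(other.a.load());  // atomic load, then atomic store
        return *this;
    }
};
```

With these two members defined, AtomicElement can be held by value, at the cost of losing any link to the original element after the copy.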

Type & value = reference; is more or less similar to Type * const value = &reference;. To assign a new value to the referenced object you then write value = Type() and *value = Type(), respectively. In both cases you cannot change which object is referred to once it is defined, and a reference must be initialized at its declaration.
In your case, the issue is that obj.ae = raw_ae; does not attempt to rebind the reference; it assigns a new value to the object that obj.ae currently refers to. Because std::atomic is non-copyable, the implicitly generated assignment operator of AtomicElement is deleted, and you get the appropriate compiler error.
If you need to be able to switch to another referent, you must use a pointer instead.
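A minimal sketch of the pointer approach follows. The helper retarget is hypothetical and only exists to show the reseating; the struct names come from the question.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

struct AtomicElement {
    std::atomic<std::int32_t> a{0};
};

struct Object {
    // A pointer, unlike a reference, can be reseated after construction.
    AtomicElement* ae = nullptr;
};

// Hypothetical helper: points obj at element i and returns the value stored there.
std::int32_t retarget(Object& obj, std::vector<AtomicElement>& elements, std::size_t i) {
    obj.ae = &elements[i];    // copies an address, not the atomic itself
    return obj.ae->a.load();  // access still goes through the atomic
}
```

Keep in mind that pointers into a std::vector are invalidated when the vector reallocates, so the vector should not grow while an Object still points into it.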

Related

c++ return structures and vectors optimally

I am reading a lot of different things on C++ optimization and I am getting quite mixed up. I would appreciate some help. Basically, I want to clear up what needs to be a pointer or not when I am passing vectors and structures as parameters or returning vectors and structures.
Say I have a Structure that contains 2 elements: an int and then a vector of integers. I will be creating this structure locally in a function, and then returning it. This function will be called multiple times and generate a new structure every time. I would like to keep the last structure created in a class member (lastStruct_ for example). So before returning the struct I could update lastStruct_ in some way.
Now, what would be the best way to do this, knowing that the vector in the structure can be quite large (would need to avoid copies). Does the vector in the struct need to be a pointer ? If I want to share lastStruct_ to other classes by creating a get_lastStruct() method, should I return a reference to lastStruct_, a pointer, or not care about that ? Should lastStruct_ be a shared pointer ?
This is quite confusing to me because apparently C++ knows how to avoid copying, but I also see a lot of people recommending the use of pointers while others say a pointer to a vector makes no sense at all.
struct MyStruct {
std::vector<int> pixels;
int foo;
};
class MyClass {
MyStruct lastStruct_;
public:
MyStruct create_struct();
MyStruct getLastStruct();
};
MyStruct MyClass::create_struct()
{
MyStruct s = {std::vector<int>(100, 1), 1234};
lastStruct_ = s;
return s;
}
MyStruct MyClass::getLastStruct()
{
return lastStruct_;
}
If the only copy you're trying to remove is the one that happens when you return the struct from your factory function, I'd say containing the vector directly will be faster all the time.
Why? Two things. Return value optimisation (RVO/NRVO) removes the need for temporaries when returning. This is enough for almost all cases.
When return value optimisation doesn't apply, move semantics will. Returning a named local variable (e.g. return my_struct;) performs an implicit move when NRVO doesn't apply.
So why is it always faster than a shared pointer? Because when copying a shared pointer, you must dereference the control block to increase the owner count, and since that is an atomic operation, the increment is not free.
Also, using a shared pointer brings shared ownership and non-locality. If you do use a shared pointer, use a pointer to const data to bring back value semantics.
Now that you added the code, it's much clearer what you're trying to do.
There's no way around the copy here. If you measure a performance degradation, then holding a std::shared_ptr<const std::vector<int>> in the struct might be the solution, since you'll keep value semantics but avoid copying the vector.
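As a rough sketch of that last suggestion, the struct below holds the pixels through a std::shared_ptr<const std::vector<int>>. The member and function names follow the question; returning getLastStruct() by const reference is an extra assumption made here, not something the answer prescribes.

```cpp
#include <memory>
#include <utility>
#include <vector>

struct MyStruct {
    std::shared_ptr<const std::vector<int>> pixels;  // shared, read-only payload
    int foo;
};

class MyClass {
    MyStruct lastStruct_;
public:
    MyStruct create_struct() {
        // The vector is built once; copying MyStruct afterwards copies only
        // the shared pointer (one atomic reference-count increment).
        auto pixels = std::make_shared<const std::vector<int>>(100, 1);
        lastStruct_ = MyStruct{std::move(pixels), 1234};
        return lastStruct_;
    }

    // Assumption: callers only need to read the last struct.
    const MyStruct& getLastStruct() const { return lastStruct_; }
};
```

Because the vector is const, every holder sees the same immutable data, which is what keeps the value-like behaviour.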

Why can you convert all types to other pointer types?

Code:
class Dummy {
double i,j;
};
class Addition {
int x,y;
public:
Addition (int a, int b) { x=a; y=b; }
int result() { return x+y;}
};
int main () {
Dummy d;
Addition * padd;
padd = (Addition*) &d;
//cout << padd->result();
return 0;
}
Questions:
Why can you do this padd = (Addition*) &d; but not this padd = (Addition) &d;?
When declaring an object of Addition like Addition padd instead of Addition* padd, why does it ask for a constructor, which it doesn't when declaring a pointer?
Taking the address of a variable returns a pointer, which is a very different thing from the original variable. A pointer is just a memory address (its size depends on the platform, commonly 4 or 8 bytes), while the object pointed to can be anything from a char to a huge table or a stream. So you definitely cannot cast a pointer to an unrelated class type (or vice versa), and (Addition) &d cannot work.
When you type Addition padd;, you build an object of type Addition with the default constructor, and Addition has none because it declares a constructor taking two ints. When you type Addition* padd;, you just declare a pointer, that is, a memory address, and no object is created.
Finally, be careful with the C-style cast (the type in parentheses in front of the variable). This cast tries several C++ casts in turn until one succeeds (from static_cast to reinterpret_cast), and you can easily end up with undefined behavior.
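The difference between declaring a pointer and declaring an object can be sketched as follows. Addition is the class from the question (with result() made const); sumThroughPointer is a hypothetical helper added only for illustration.

```cpp
class Addition {
    int x, y;
public:
    Addition(int a, int b) : x(a), y(b) {}
    int result() const { return x + y; }
};

// Hypothetical helper showing both kinds of declaration side by side.
int sumThroughPointer(int a, int b) {
    Addition* padd = nullptr;  // declares only an address; no constructor runs
    Addition value(a, b);      // creates an object, so constructor arguments are required
    padd = &value;             // &value already has type Addition*, so no cast is needed
    return padd->result();
}
```

When the pointer really does point at an Addition, no cast is involved at all, which is usually a sign that the types are being used as intended.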
A simple simile that indirectly answers your questions:
Pointers are like real-life signs.
A sign pointing to a lake is similar to a sign pointing to a building (but you can also use a different definition of "similar" which will say these things aren't similar).
A sign pointing to a lake is not similar to an actual lake or building.
Now when you assume a sign is pointing to a lake when it's in fact pointing to a building, and you try to treat that building like a lake (by e.g. swimming in it), you're going to have problems.
That is to say: C++ will compile and run code that casts between incompatible pointer types, but accessing the object through the incorrectly cast pointer probably won't work so well for you.
Making a new sign doesn't require that you also make the thing you want it to point to.
You can have a sign to a lake that doesn't actually point to a lake (yet?), or you can point it to an existing lake.
&lake gives you a sign to lake.
Disclaimer: I might've borrowed the core idea from someone else.
Basically, you are asking what the difference is between an object and a pointer to an object.
An object is a region of storage. A pointer is, in practice, just an integer-like address designating a location in that storage. Physically (once the code has been compiled to a binary), a pointer to an object of one type does not differ from a pointer to an object of another type: it is always an address. An object, on the other hand, is a region that has both an offset and a size. Those are the two basic things that C++ distinguishes.
Why can you do this padd = (Addition*) &d; but not this padd = (Addition) &d;?
As said above, all pointers are physically the same (i.e. addresses). That is why the former is always possible. Mind that dereferencing the resulting pointer is undefined behavior. However, the compiler cannot convert between two different kinds of things (a pointer and a region of storage), so the latter is rejected.
When declaring an object of Addition like Addition padd instead of Addition* padd, why does it ask for a constructor, which it doesn't when declaring a pointer?
Again, a pointer is just an address; you do not need anything else to create it. However, Addition is a type whose objects occupy storage, and it has no default constructor, which is why you have to provide arguments to initialize it.
By the way, pointers are also objects, which is why you can have a pointer to a pointer, and so on.
In addition:
Pointers to void, objects, and functions are very different from pointers to members;
Pointers to void, objects, and functions must not be assumed to have one fixed size; the size may vary between platforms and compilers, although any object pointer can be converted to void* and back.

How are pointers to data members allocated/stored in memory?

This is one topic that is not making sense to me. Pointers to data members of a class can be declared and used. However,
What is the logic that supports the idea ? [I am not talking about the syntax, but the logic of this feature]
Also, if I understand this correctly, this would imply an indefinite/variable amount of memory being allocated at the pointer initialization, as any number of objects may exist at that time. Also, new objects may be created and destroyed during runtime. Hence, in effect, a single statement will cause a large number of allocations/deallocations. This seems rather counter-intuitive compared to the rest of the language. Or is my understanding of this incorrect? I don't think there is any other single initialization statement that will implicitly affect program execution as widely as this.
Lastly, how is memory allocated to these pointers ? Where are they placed with respect to objects ? Is it possible to see physical memory addresses of these pointers ?
A single declaration of a pointer to a data member, creates pointers for every object of that class.
No, it does not. A pointer to a member is a special object that is very different from a pointer; it is a lot more similar to an offset. Given a pointer to an object of the class and a member pointer, you'd be able to get the value of a member; without the pointer to an object of a class a pointer to a member is useless.
Questions 2 and 3 stem from the same basic misunderstanding.
A single declaration of a pointer to a data member, creates pointers for every object of that class.
No. It creates a pointer to a member (which can be thought of as an offset from the base of the object).
You can then use it with a pointer to an object to get that member.
struct S
{
int x;
int y;
};
int S::* ptrToMember = &S::x; // Pointer to a member.
S obj;
int* ptrToData = &obj.x; // Pointer to object
// that happens to be a member
Notice in creating the pointer to a member we don't use an object (we just use the type information). So this pointer is an offset into the class to get a specific member.
You can access the data member via a pointer or object.
(obj.*ptrToMember) = 5; // Assign via pointer to member (requires an object)
*ptrToData = 6; // Assign via pointer already points at object.
Why does this happen as opposed to a single pointer being created to point to only one specific instance of the class ?
That is called a pointer.
A similar but parallel concept (see above).
What is the logic that supports the idea ?
Silly example:
void addOneToMember(S& obj, int S::* member) { (obj.*member) += 1; }
void addOneToX(S& obj) { addOneToMember(obj, &S::x);}
void addOneToY(S& obj) { addOneToMember(obj, &S::y);}
Also, if I understand this correctly, this would imply an indefinite/variable amount of memory being allocated at the pointer initialization, as any number of objects may exist at that time.
No. Because a pointer to a member is just an offset into an object. You still need the actual object to get the value.
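To illustrate that nothing is allocated per object, here is a small sketch in which one pointer to member is reused with several objects. S is the struct from the answer; readMember is a hypothetical helper.

```cpp
struct S {
    int x;
    int y;
};

// Hypothetical helper: reads whichever member `member` designates from `obj`.
// The pointer to member carries no object; the object is supplied at the point of use.
int readMember(const S& obj, int S::* member) {
    return obj.*member;
}
```

A single int S::* value can be passed together with any S, existing or yet to be created, and it can later be reassigned to designate a different member of the same type.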
Lastly, how is memory allocated to these pointers ?
Same way as other objects. There is nothing special about them in terms of layout.
But the actual layout is implementation defined. So there is no way of answering this question without referring to the compiler. But it is really of no use to you.
Is it possible to see physical memory addresses of these pointers ?
Sure. They are just like other objects.
// Not that this will provide anything meaningful.
std::cout.write(reinterpret_cast<char*>(&ptrToMember), sizeof(ptrToMember));
// 1) take the address of the pointer to member.
// 2) cast to char* as required by write.
// 3) pass the size of the pointer to member
// and you should see the raw bytes of the pointer to member written out.
// Note the values may be non printable but I am sure you can work with that
// Also note the meaning is not useful to you as it is compiler dependent.
Internally, for a class that does not have virtual bases, a pointer-to-member-data just has to hold the offset of the data member from the start of an object of that type. With virtual bases it's a bit more complicated, because the location of the virtual base can change, depending on the type of the most-derived object. Regardless, there's a small amount of data involved, and when you dereference the pointer-to-data-member the compiler generates appropriate code to access it.

Trying to store an object in an array but then how to call that object's methods?

I'm not a very experienced c++ coder and this has me stumped. I am passing an object (created elsewhere) to a function, I want to be able to store that object in some array and then run through the array to call a function on that object. Here is some pseudo code:
void AddObject(T& object) {
object.action(); // this works
T* objectList = NULL;
// T gets allocated (not shown here) ...
T[0] = object;
T[0].action(); // this doesn't work
}
I know the object is passing correctly, because the first call to object.action() does what it should. But when I store object in the array, then try to invoke action() it causes a big crash.
Likely my problem is that I simply tinkered with the .'s and *'s until it compiled; T[0].action() compiles but crashes at runtime.
The simplest answer to your question is that you must declare your container correctly and you must define an appropriate assignment operator for your class. Working as closely as possible from your example:
typedef class MyActionableClass T;
T* getGlobalPointer();
void AddInstance(T const& objInstance)
{
T* arrayFromElsewhere = getGlobalPointer();
//ok, now at this point we have a reference to an object instance
//and a pointer which we assume is at the base of an array of T **objects**
//whose first element we don't mind losing
//**copy** the instance we've received
arrayFromElsewhere[0] = objInstance;
//now invoke the action() method on our **copy**
arrayFromElsewhere[0].action();
}
Note the signature change to const reference which emphasizes that we are going to copy the original object and not change it in any way.
Also note carefully that arrayFromElsewhere[0].action() is NOT the same as objInstance.action() because you have made a copy — action() is being invoked in a different context, no matter how similar.
While it is obvious you have condensed, the condensation makes the reason for doing this much less obvious — specifying, for instance, that you want to maintain an array of callback objects would make a better case for “needing” this capability. It is also a poor choice to use “T” like you did because this tends to imply template usage to most experienced C++ programmers.
The thing that is most likely causing your “unexplained” crash is that assignment operator; if you don't define one, the compiler automatically generates one that performs a memberwise (shallow) copy, which is almost certainly not what you want if your class is anything other than a collection of simple data types (POD).
For this to work properly on a class that owns resources you will likely need to define a deep copy or use reference counting; for such a class it is almost always a poor choice to let the compiler create any of ctor, dtor, or assignment for you.
And, of course, it would be a good idea to use standard containers rather than the simple array mechanism you implied by your example. In that case you should probably also define a default ctor, a virtual dtor, and a copy ctor because of the assumptions made by containers and algorithms.
If, in fact, you do not want to create a copy of your object but want, instead, to invoke action() on the original object but from within an array, then you will need an array of pointers instead. Again working closely to your original example:
typedef class MyActionableClass T;
T** getGlobalPointer();
void AddInstance(T& objInstance)
{
T** arrayFromElsewhere = getGlobalPointer();
//ok, now at this point we have a reference to an object instance
//and a pointer which we assume is at the base of an array of T **pointers**
//whose first element we don't mind losing
//**reference** the instance we've received by saving its address
arrayFromElsewhere[0] = &objInstance;
//now invoke the action() method on **the original instance**
arrayFromElsewhere[0]->action();
}
Note closely that arrayFromElsewhere is now an array of pointers to objects instead of an array of actual objects.
Note that I dropped the const modifier in this case because I don’t know if action() is a const method — with a name like that I am assuming not…
Note carefully the ampersand (address-of) operator being used in the assignment.
Note also the new syntax for invoking the action() method by using the pointer-to operator.
Finally be advised that using standard containers of pointers is fraught with memory-leak peril, but typically not nearly as dangerous as using naked arrays :-/
I'm surprised it compiles. You declare an array, objectList of 8 pointers to T. Then you assign T[0] = object;. That's not what you want, what you want is one of
T objectList[8];
objectList[0] = object;
objectList[0].action();
or
T *objectList[8];
objectList[0] = &object;
objectList[0]->action();
Now I'm waiting for a C++ expert to explain why your code compiled, I'm really curious.
You can put the object either into a dynamic or a static array:
#include <vector> // dynamic
#include <array> // static
void AddObject(T const & t)
{
std::array<T, 12> arr;
std::vector<T> v;
arr[0] = t;
v.push_back(t);
arr[0].action();
v[0].action();
}
This doesn't really make a lot of sense, though; you would usually have defined your array somewhere else, outside the function.
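A sketch of what "defined somewhere else" could look like follows. Counter is a hypothetical stand-in for the question's T, and registry and RunAll are names invented for this example. The container stores pointers, so action() runs on the original objects rather than on copies.

```cpp
#include <vector>

// Hypothetical stand-in for the question's T.
struct Counter {
    int count = 0;
    void action() { ++count; }
};

// The container lives outside the function, so its contents outlive each call.
// It holds pointers only; the caller must keep the objects alive for as long
// as the registry refers to them.
std::vector<Counter*> registry;

void AddObject(Counter& object) {
    registry.push_back(&object);
}

void RunAll() {
    for (Counter* c : registry) {
        c->action();  // invoked on the original object, not a copy
    }
}
```

Storing std::vector<Counter> instead would give each entry its own copy, which is the other behaviour described in the answers above.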

Managing C++ objects in a buffer, considering the alignment and memory layout assumptions

I am storing objects in a buffer. Now I know that I cannot make assumptions about the memory layout of the object.
If I know the overall size of the object, is it acceptable to create a pointer to this memory and call functions on it?
e.g. say I have the following class:
[int,int,int,int,char,padding*3bytes,unsigned short int*]
1)
if I know this class to be of size 24 and I know the address of where it starts in memory,
whilst it is not safe to assume the memory layout, is it acceptable to cast this address to a pointer and call functions on this object which access these members?
(Does c++ know by some magic the correct position of a member?)
2)
If this is not safe/ok, is there any other way other than using a constructor which takes all of the arguments and pulling each argument out of the buffer one at a time?
Edit: Changed title to make it more appropriate to what I am asking.
You can create a constructor that takes all the members and assigns them, then use placement new.
class Foo
{
int a;int b;int c;int d;char e;unsigned short int*f;
public:
Foo(int A,int B,int C,int D,char E,unsigned short int*F) : a(A), b(B), c(C), d(D), e(E), f(F) {}
};
...
// a, b, c, d, e and f are assumed to be values of the matching types defined earlier
char *buf = new char[sizeof(Foo)]; //pre-allocated buffer
Foo *foo = new (buf) Foo(a,b,c,d,e,f); //placement new (declared in <new>) constructs the Foo inside buf
This has the advantage that even the vtable pointer will be set up correctly. Note, however, if you are using this for serialization, the unsigned short int pointer is not going to point at anything useful when you deserialize it, unless you are very careful to use some sort of method to convert pointers into offsets and then back again.
Individual methods on a this pointer are statically linked and are simply a direct call to the function with this being the first parameter before the explicit parameters.
Member variables are referenced using an offset from the this pointer. If an object is laid out like this:
0: vtable
4: a
8: b
12: c
etc...
a will be accessed by dereferencing this + 4 bytes.
Basically what you are proposing doing is reading in a bunch of (hopefully not random) bytes, casting them to a known object, and then calling a class method on that object. It might actually work, because those bytes are going to end up in the "this" pointer in that class method. But you're taking a real chance on things not being where the compiled code expects them to be. And unlike Java or C#, there is no real "runtime" to catch these sorts of problems, so at best you'll get a core dump, and at worst you'll get corrupted memory.
It sounds like you want a C++ version of Java's serialization/deserialization. There is probably a library out there to do that.
Non-virtual function calls are linked directly just like a C function. The object (this) pointer is passed as the first argument. No knowledge of the object layout is required to call the function.
It sounds like you're not storing the objects themselves in a buffer, but rather the data from which they're comprised.
If this data is in memory in the order the fields are defined within your class (with proper padding for the platform) and your type is a POD, then you can memcpy the data from the buffer to a pointer to your type (or possibly cast it, but beware, there are some platform-specific gotchas with casts to pointers of different types).
If your class is not a POD, then the in-memory layout of fields is not guaranteed, and you shouldn't rely on any observed ordering, as it is allowed to change on each recompile.
You can, however, initialize a non-POD with data from a POD.
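The memcpy route and the "initialize a non-POD from a POD" idea can be sketched together. RawRecord, Record, and readRaw are hypothetical names; the sketch assumes the buffer was written on the same platform with the same struct layout.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical POD record matching the bytes stored in the buffer.
struct RawRecord {
    int id;
    int length;
};
static_assert(std::is_trivially_copyable<RawRecord>::value,
              "memcpy requires a trivially copyable type");

// Hypothetical non-POD type that is initialized from the POD.
class Record {
    int id_;
    std::string label_;
public:
    // Assumes raw.length is non-negative.
    explicit Record(const RawRecord& raw)
        : id_(raw.id), label_(static_cast<std::size_t>(raw.length), '*') {}
    int id() const { return id_; }
    const std::string& label() const { return label_; }
};

// Copies sizeof(RawRecord) bytes out of the buffer. The caller must guarantee
// that the buffer holds at least that many bytes in this platform's layout.
RawRecord readRaw(const unsigned char* buffer) {
    RawRecord raw;
    std::memcpy(&raw, buffer, sizeof raw);
    return raw;
}
```

Copying into a properly declared RawRecord sidesteps the alignment and aliasing concerns that come with casting the buffer pointer directly.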
As far as the addresses where non-virtual functions are located: they are statically linked at compile time to some location within your code segment that is the same for every instance of your type. Note that there is no "runtime" involved. When you write code like this:
class Foo{
int a;
int b;
public:
void DoSomething(int x);
};
void Foo::DoSomething(int x){a = x * 2; b = x + a;}
int main(){
Foo f;
f.DoSomething(42);
return 0;
}
the compiler generates code that does something like this:
function main:
allocate 8 bytes on stack for object "f"
call default initializer for class "Foo" (does nothing in this case)
push argument value 42 onto stack
push pointer to object "f" onto stack
make call to function Foo_i_DoSomething#4 (actual name is usually more complex)
load return value 0 into accumulator register
return to caller
function Foo_i_DoSomething#4 (located elsewhere in the code segment)
1) load "x" value from stack (pushed on by caller)
2) multiply by 2
3) load "this" pointer from stack (pushed on by caller)
4) calculate offset of field "a" within a Foo object
5) add calculated offset to this pointer, loaded in step 3
6) store product, calculated in step 2, to offset calculated in step 5
7) load "x" value from stack, again
8) load "this" pointer from stack, again
9) calculate offset of field "a" within a Foo object, again
10) add calculated offset to this pointer, loaded in step 8
11) load "a" value stored at offset
12) add "a" value, loaded in step 11, to "x" value loaded in step 7
13) load "this" pointer from stack, again
14) calculate offset of field "b" within a Foo object
15) add calculated offset to this pointer, loaded in step 13
16) store sum, calculated in step 12, to offset calculated in step 15
17) return to caller
In other words, it would be more or less the same code as if you had written this (specifics, such as name of DoSomething function and method of passing this pointer are up to the compiler):
class Foo{
int a;
int b;
friend void Foo_DoSomething(Foo *f, int x);
};
void Foo_DoSomething(Foo *f, int x){
f->a = x * 2;
f->b = x + f->a;
}
int main(){
Foo f;
Foo_DoSomething(&f, 42);
return 0;
}
An object of POD type is, in this case, already created (whether or not you call new; allocating the required storage already suffices), and you can access its members, including calling a function on that object. But that will only work if you know precisely the required alignment of T, the size of T (the buffer may not be smaller than it), and the alignment of all the members of T. Even for a POD type, the compiler is allowed to put padding bytes between members if it wants. For non-POD types, you may have the same luck if your type has no virtual functions or base classes and no user-defined constructor (of course), and that applies to the base and all its non-static members too.
For all other types, all bets are off. You have to read values out first with a POD, and then initialize a non-POD type with that data.
I am storing objects in a buffer. ... If I know the overall size of the object, is it acceptable to create a pointer to this memory and call functions on it?
This is acceptable to the extent that using casts is acceptable:
#include <iostream>
namespace {
class A {
int i;
int j;
public:
int value()
{
return i + j;
}
};
}
int main()
{
int buffer[] = { 1, 2 }; // ints, so the array matches A's size and alignment
std::cout << reinterpret_cast<A*>(buffer)->value() << '\n';
}
Casting an object to something like raw memory and back again is actually pretty common, especially in the C world. If you're using a class hierarchy, though, it would make more sense to use pointer to member functions.
say I have the following class: ...
if I know this class to be of size 24 and I know the address of where it starts in memory ...
This is where things get difficult. The size of an object includes the size of its data members (and any data members from any base classes) plus any padding plus any function pointers or implementation-dependent information, minus anything saved from certain size optimizations (empty base class optimization). If the resulting number is 0 bytes, then the object is required to take at least one byte in memory. These things are a combination of language issues and common requirements that most CPUs have regarding memory accesses. Trying to get things to work properly can be a real pain.
If you just allocate an object and cast to and from raw memory you can ignore these issues. But if you copy an object's internals to a buffer of some sort, then they rear their head pretty quickly. The code above relies on a few general rules about alignment (i.e., I happen to know that class A will have the same alignment restrictions as ints, and thus the array can be safely cast to an A; but I couldn't necessarily guarantee the same if I were casting parts of the array to A's and parts to other classes with other data members).
Oh, and when copying objects you need to make sure you're properly handling pointers.
You may also be interested in things like Google's Protocol Buffers or Facebook's Thrift.
Yes these issues are difficult. And, yes, some programming languages sweep them under the rug. But there's an awful lot of stuff getting swept under the rug:
In Sun's HotSpot JVM, object storage is aligned to the nearest 64-bit boundary. On top of this, every object has a 2-word header in memory. The JVM's word size is usually the platform's native pointer size. An object consisting of only a 32-bit int and a 64-bit double (96 bits of data) will require two words for the object header, one word for the int, and two words for the double. With 32-bit words, that's 5 words: 160 bits. Because of the alignment, this object will occupy 192 bits of memory.
This is because Sun is relying on a relatively simple tactic for memory alignment issues (on an imaginary processor, a char may be allowed to exist at any memory location, an int at any location that is divisible by 4, and a double may need to be allocated only on memory locations that are divisible by 32 -- but the most restrictive alignment requirement also satisfies every other alignment requirement, so Sun is aligning everything according to the most restrictive location).
Another tactic for memory alignment can reclaim some of that space.
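The arithmetic in the JVM example can be checked with a small round-up helper. Both function names are made up for this sketch, and the 32-bit word size is an assumption implied by the 160-bit total quoted above.

```cpp
#include <cstddef>

// Rounds size up to the next multiple of alignment (alignment must be non-zero).
constexpr std::size_t alignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: total bits for an object with the given header and
// field sizes in 32-bit words, padded to a 64-bit boundary.
constexpr std::size_t objectBits(std::size_t headerWords, std::size_t fieldWords) {
    return alignUp((headerWords + fieldWords) * 32, 64);
}
```

For the object described above, objectBits(2, 3) rounds 160 bits up to 192 bits.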
If the class contains no virtual functions (and therefore class instances have no vptr), and if you make correct assumptions about the way in which the class' member data is laid out in memory, then doing what you're suggesting might work (but might not be portable).
Yes, another way (more idiomatic but not much safer ... you still need to know how the class lays out its data) would be to use the so-called "placement operator new" and a default constructor.
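A minimal sketch of placement new into a suitably aligned buffer is shown below. The class A mirrors the one used in an earlier answer, with a default constructor added as an assumption; valueFromBuffer is a hypothetical helper.

```cpp
#include <new>

class A {
    int i;
    int j;
public:
    A() : i(1), j(2) {}
    int value() const { return i + j; }
};

// Hypothetical helper: constructs an A inside raw storage, uses it,
// and ends its lifetime again.
int valueFromBuffer() {
    // alignas makes the raw storage suitably aligned for A.
    alignas(A) unsigned char buffer[sizeof(A)];
    A* a = new (buffer) A();  // placement new: construct in existing storage
    int result = a->value();
    a->~A();                  // ends the object's lifetime; the storage itself is not freed
    return result;
}
```

Here the constructor, not the raw bytes, decides the member values, so no assumption about the in-memory layout is needed.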
That depends upon what you mean by "safe". Any time you cast a memory address into a pointer in this way you are bypassing the type safety features provided by the compiler, and taking the responsibility on yourself. If, as Chris implies, you make an incorrect assumption about the memory layout, or compiler implementation details, then you will get unexpected results and lose portability.
Since you are concerned about the "safety" of this programming style it is likely worth your while to investigate portable and type-safe methods such as pre-existing libraries, or writing a constructor or assignment operator for the purpose.