I've never seen this in any language, but I was wondering if this is possible using some trick that I don't know.
Let's say that I have a function like
struct A {
// some members and methods ...
some_t t;
// more members ...
};
void test(some_t& x) { // a reference to avoid copying a new some_t
// obtain the A instance if x is the t member of an A
// or throw an error if x is not the t member of an A
...
// do something
}
Would it be possible to obtain the instance of A whose member t is x?
No, unfortunately it's not possible.
If you know that you have a reference to the t member of some A instance, you can get the instance using container_of, e.g. A* pa = container_of(&x, A, t);.
Verifying that the resulting pointer actually is an A is technically possible if and only if A has virtual members; unfortunately, there's no portable method to check.
You can achieve something similar, however, using multiple inheritance and dynamic_cast, which allows cross-casting between subobjects.
You can add a pointer to A inside some_t (provided, of course, that some_t is a struct or class),
like this:
struct some_t
{
A *a;
...
};
void test(some_t& x)
{
if( x.a )
{
// do some
}
else
throw ...
}
If you can modify struct A and its constructor and if you can ensure the structure packing, you can add a value directly after t which holds some magic key.
struct A {
...
some_t t;
struct magic_t
{
uint32_t code; // from <cstdint>
some_t* pt;
} magic;
};
#define MAGICCODE 0xC0DEC0DE //or something else unique
In A's constructor, do:
this->magic.code = MAGICCODE; this->magic.pt = &(this->t);
Then you can write
bool test(some_t *t) //note `*` not `&`
{
A::magic_t* pm = (A::magic_t*)(t+1); // magic_t is nested inside A
return (pm->pt == t && pm->code == MAGICCODE);
}
This answer does not meet all the requirements of the original question. I had deleted it, but the OP requested that I post it. It shows how, under very specific conditions, you can calculate the instance pointer from a pointer to a member variable.
You shouldn't, but you can:
#include <iostream>
#include <cstddef>
using namespace std;
struct A
{
int x;
int y;
};
struct A* find_A_ptr_from_y(int* y)
{
size_t o = offsetof(struct A, y); // offsetof yields a size_t
return (struct A*)((char *)y - o);
}
int main(int argc, const char* argv[])
{
struct A a1;
struct A* a2 = new struct A;
cout << "Address of a1 is " << &a1 << endl;
cout << "Address of a2 is " << a2 << endl;
struct A *pa1 = find_A_ptr_from_y(&a1.y);
struct A *pa2 = find_A_ptr_from_y(&(a2->y));
cout << "Address of a1 (recovered) is " << pa1 << endl;
cout << "Address of a2 (recovered) is " << pa2 << endl;
}
Output
Address of a1 is 0x7fff5fbff9d0
Address of a2 is 0x100100080
Address of a1 (recovered) is 0x7fff5fbff9d0
Address of a2 (recovered) is 0x100100080
Caveats: if what you pass to find_A_ptr_from_y is not a pointer to (struct A).y, you will get total rubbish.
You should (almost) never do this. See comment by DasBoot below.
It's not quite clear to me what you are trying to do, but if you want to find the pointer to an instance of struct A when you know the pointer to a member of A, you can do that.
See for example the container_of macro in the linux kernel.
The parameter x of function test() need not be a member of any class as far as test() is concerned.
If, semantically, x must always be a member of a class in a particular application, then that information could be provided, either by passing an additional parameter or by having some_t itself contain such information. However, doing that would be entirely unnecessary: if test() truly needed access to the object containing x, then why not simply pass the parent object itself? Or just make test() a member function of the same class and pass no parameters whatsoever? If the reason is that x may belong to different classes, then polymorphism can be employed to resolve that issue.
Basically I suggest that there is no situation where you would need such a capability that cannot be solved in a simpler, safer and more object oriented manner.
I have nested structs, where the base has a pure virtual function.
(The following examples are a bit pseudo-ish, but describe the purpose)
struct Base {
int id=0;
virtual std::wstring toString() = 0;
}
struct Top1 : public Base {
id=1;
int val = 5;
std::wstring toString() { return L"need to use string stream. id="+id+" val="+val; }
}
struct Top2 : public Base {
id=2;
std::string val = "Hello!";
std::wstring toString() { return L"need to use string stream. id="+id+" val="+val; }
}
I wish to have a single table for all the different types, so I created this:
struct BaseFootprint{
union{
Top1 top1;
Top2 top2;
};
};
std::vector<BaseFootprint> data;
Calling the function in the following way does not work:
for(int i=0;i<data.size();i++){
std::wcout <<data[i].toString()<< std::endl;
}
I have tried:
std::wcout << ((Base)data[i]).toString() << std::endl;
And:
std::wcout << (Top1)data[i].toString() << std::endl;
But it always says data[i]-> empty.
So, to my disappointment, and not unexpected, the pure virtual function does not point to the correct top function depending on how the struct data is viewed via the union.
As my end product will hold 100s of different top types, I am hoping for a dynamic solution as opposed to making a hard-written selection. A dynamic solution will allow me to add new types without altering the base code, and this is what I hope for.
It would be awesome if there is a way to achieve this as described.
Union is not the right tool.
Ignoring the other compiler errors, you need to access a particular member of the union (e.g. data[i].top1), and you cannot read any member except the one that was last written to (which means you would need to somehow remember which one is active for each element of the vector). std::variant is a type-safe union, but you would still need a lot of boilerplate code to access the correct member.
The normal way to use polymorphism in C++ is through pointers:
int main()
{
std::vector<std::unique_ptr<Base>> data;
data.push_back(std::make_unique<Top1>());
data.push_back(std::make_unique<Top2>());
for (auto& ptr : data)
{
std::wcout << ptr->toString();
}
}
The problem I was having is that I was not calling the constructor for the union objects.
For example...
If the union object needs to be Top1 then its constructor should be called...
new (&data[i].top1) Top1();
At the other end, the polymorphic methods worked for me with the following changes...
Remove the pure from the base method, like so...
virtual std::wstring toString() { return L"Base"; };
Add Base to the union, like so...
union{
Base base;
Top1 top1;
Top2 top2;
};
The continuous chunk of memory of objects can now be processed, by calling the polymorphic method...
for (std::vector<BaseFootprint>::iterator bfi = data.begin(); bfi != data.end(); bfi++) {
std::wcout << (*bfi).base.toString() << std::endl;
}
If you have never pushed a continuous chunk of memory of objects to the L1 cache before, you're welcome!
I have a class for a sparse matrix. Say it has a pointer a of int data type as a private data member. My question then is, if I create two objects B and C of that class, would both B and C have a pointer a pointing to the same location or they would do something else?
I am confused here.
The actual pointer in my class is defined as a private member thus:
element* ele;
and it's assigned in the constructor with:
ele = new element[this->num_non_zero];
Now that we can see the code you're discussing, the pointer you have is declared and initialised (in the constructor) in the following manner:
element *ele;
ele = new element[this->num_non_zero];
That use of new gives each instance its own block of memory, to which its own ele variable points. There is no possibility of different instances interfering with each other given this method.
Below is the original answer, written before you added the detail that allowed us to answer your question succinctly. Since it provides interesting background information, I've left it in.
Unless the member variable is a class-level static (shared amongst all instances of the class), it belongs to the instance itself, and will point to wherever it's set to (possibly, but not necessarily, in a constructor).
See, for example, the following code, which has both a static and non-static member variable:
#include <iostream>
#include <string>
class demo {
public:
demo(int newnonstat = 7, int newstat= 42): nonstat(newnonstat) {
std::cout << "create\n";
stat = newstat;
}
void dump(std::string desc) {
std::cout << desc << ": " << nonstat << ' ' << stat << '\n';
}
private:
int nonstat;
static int stat;
};
int demo::stat;
int main() {
demo d1; d1.dump("d1");
demo d2(1, 2); d2.dump("d2"); d1.dump("d1");
}
The output shows how the two kinds of member variable are set (with my added comment):
create
d1: 7 42
create
d2: 1 2
d1: 7 2 <-- "corrupted" static
So, unless they're static, the variables will be distinct. However, as pointers, there's nothing that stops distinct pointers pointing to the same thing, it all comes down to what the various bits of code set it to.
Another example, with distinct pointers that point to the same thing:
#include <iostream>
#include <string>
class demo {
public:
demo(char *pStr) : m_pStr(pStr) {}
void dump(std::string desc) {
std::cout << desc << ": " << &m_pStr << ' ' << (void*)m_pStr << " '" << m_pStr << "'\n";
}
private:
char *m_pStr;
};
int main() {
char buff[] = "same string";
demo d1(buff);
demo d2(buff);
d1.dump("d1");
d2.dump("d2");
}
The output shows the pointers, although distinct (second column is address of pointer variable), pointing to the same thing (third and fourth column):
d1: 0x7ffea260c150 0x7ffea260c18c 'same string'
d2: 0x7ffea260c158 0x7ffea260c18c 'same string'
I am new to C++ and get confused about what goes on under the hood when a class method returns a reference to a member variable that is raw data (rather than a pointer or a reference). Here's an example:
#include <iostream>
using namespace std;
struct Dog {
int age;
};
class Wrapper {
public:
Dog myDog;
Dog& operator*() { return myDog; }
Dog* operator->() { return &myDog; }
};
int main() {
auto w = Wrapper();
// Method 1
w.myDog.age = 1;
cout << w.myDog.age << "\n";
// Method 2
(*w).age = 2;
cout << w.myDog.age << "\n";
// Method 3
w->age = 3;
cout << w.myDog.age << "\n";
}
My question is: what happens at runtime when the code reads (*w) or w-> (as in the main function)? Does it compute the address of the myDog field every time it sees (*w) or w->? Is there overhead to either of these two access methods compared to accessing myDog directly?
Thanks!
Technically, what you are asking is entirely system/compiler-specific. As a practical matter, a pointer and a reference are identical in implementation.
No rational compiler is going to treat
(*x).y
and
x->y
differently. Under the covers both usually appear in assembly language as something like
y(Rn)
Where Rn is a register holding the address of x and y is the offset of y into the structure.
The problem is that C++ is built upon C which in turn is the most f*&*) *p programming language ever devised. The reference construct is a work around to C's inept method of passing parameters.
So I read about Plain Old Data classes (POD) , and decided to make my structs POD to hold data. For example, I have
struct MyClass {
int ID;
int age;
double height;
char Name[8];
};
Obviously, to assign values to the struct, I can do this:
MyClass.ID = 1;
MyClass.age = 20;
...
But is there any way to assign raw data WITHOUT knowing the name of each field?
For example, my program retrieves the field value for each column, and I want to assign that value to the struct, given that I don't know the names of the fields.
MyClass c;
while (MoreColumns()) {
doSomething( c , GetNextColumn() ); //GetNextColumn() returns some value of POD types
}
I'm assuming there's a way to do this using memcpy or something like std::copy, but I'm not sure how to start.
Sorry if the question is a bit unclear.
You can use aggregate initialization:
MyClass c1 = { 1, 20, 6.0, "Bob" };
MyClass c2;
c2 = MyClass{ 2, 22, 5.5, "Alice" };
There is no general way to loop over the members of a struct or class. There are some tricks to add data and functions to emulate that sort of thing, but they all require additional setup work beyond just declaring the type.
Since MyClass is an aggregate, you can use a brace-initializer to initialize all fields in one call, without naming any of them:
MyClass m {
1,
2,
42.0,
{ "Joseph" }
};
However, given your description, maybe a POD is not a good idea, and you might want to design a class with accessors to set internal fields based on (for example) index columns.
Maybe boost::fusion can help you with what you want to achieve.
You can use the adapt macro to iterate over a struct.
From the example of boost:
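#include <iostream>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/for_each.hpp>
struct MyClass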
{
int ID;
int age;
double height;
};
BOOST_FUSION_ADAPT_STRUCT(
MyClass,
(int, ID)
(int, age)
(double, height)
)
void fillData(int& i)
{
i = 0;
}
void fillData(double& d)
{
d = 99;
}
struct MoreColumns
{
template<typename T>
void operator()(T& t) const
{
fillData(t);
}
};
int main()
{
struct MyClass m = { 33, 5, 2.0 };
std::cout << m.ID << std::endl;
std::cout << m.age << std::endl;
std::cout << m.height << std::endl;
MoreColumns c;
boost::fusion::for_each(m, c);
std::cout << m.ID << std::endl;
std::cout << m.age << std::endl;
std::cout << m.height << std::endl;
}
What you are trying to achieve usually leads to hard-to-read or even unreadable code. However, assuming that you have a genuinely good reason to try to assign (as opposed to initialize) raw data to a field without knowing its name, you could use reinterpret_cast as below. I don't recommend it, but just want to point out that you have the option.
#include <cstdio>
#include <cstring>
struct Target { // This is your "target"
char foo[8];
};
struct Trap {
// The "trap" which lets you manipulate your target
// without addressing its internals directly.
// Assuming here that an unsigned occupies 4 bytes (not always holds)
unsigned i1, i2;
};
int main() {
Target t;
strcpy(t.foo, "AAAAAAA");
// Ask the compiler to "reinterpret" Target* as Trap*
Trap* tr = reinterpret_cast<Trap*>(&t);
fprintf(stdout, "Before: %s\n", t.foo);
printf("%x %x\n", tr->i1, tr->i2);
// Now manipulate as you please
// Note the byte ordering issue in i2.
// on another architecture, you might have to use 0x42424200
tr->i1 = 0x42424242;
tr->i2 = 0x00424242;
printf("After: %s\n", t.foo);
return 0;
}
This is just a quick example I came up with, you can figure out how to make it "neater". Note that in the above, you could also access target iteratively, by using an array in "Trap" instead of i1, i2 as I have done above.
Let me reiterate, I don't recommend this style, but if you absolutely must do it, this is an option you could explore.
I was wondering whether assert( this != nullptr ); was a good idea in member functions, and someone pointed out that it wouldn't work if an offset had been added to the value of this. In that case, instead of being 0, it would be something like 40, making the assert useless.
When does this happen though?
Multiple inheritance can cause an offset, skipping the extra v-table pointers in the object. The generic name is "this pointer adjustor thunking".
But you are helping too much. Null references are very common bugs, the operating system already has an assert built-in for you. Your program will stop with a segfault or access violation. The diagnostic you'll get from the debugger is always good enough to tell you that the object pointer is null, you'll see a very low address. Not just null, it works for MI cases as well.
this adjustment can happen only in classes that use multiple-inheritance. Here's a program that illustrates this:
#include <iostream>
using namespace std;
struct A {
int n;
void af() { cout << "this=" << this << endl; }
};
struct B {
int m;
void bf() { cout << "this=" << this << endl; }
};
struct C : A,B {
};
int main(int argc, char** argv) {
C* c = NULL;
c->af();
c->bf();
return 0;
}
When I run this program I get this output:
this=0
this=0x4
That is: your assert this != nullptr will not catch the invocation of c->bf() where c is nullptr because the this of the B sub-object inside the C object is shifted by four bytes (due to the A sub-object).
Let's try to illustrate the layout of a C object:
0: | n |
4: | m |
The numbers on the left-hand side are offsets from the object's beginning. So, at offset 0 we have the A sub-object (with its data member n), and at offset 4 we have the B sub-object (with its data member m).
The this of the entire object, as well as the this of the A sub-object, both point at offset 0. However, when we want to refer to the B sub-object (when invoking a method defined by B), the this value needs to be adjusted so that it points at the beginning of the B sub-object. Hence the +4.
Note this is UB anyway.
Multiple inheritance can introduce an offset, depending on the implementation:
#include <iostream>
struct wup
{
int i;
void foo()
{
std::cout << (void*)this << std::endl;
}
};
struct dup
{
int j;
void bar()
{
std::cout << (void*)this << std::endl;
}
};
struct s : wup, dup
{
void foobar()
{
foo();
bar();
}
};
int main()
{
s* p = nullptr;
p->foobar();
}
Output on some version of clang++:
0
0x4
Also note, as I pointed out in the comments to the OP, that this assert might not work for virtual function calls, as the vtable isn't initialized (if the compiler does a dynamic dispatch, i.e. doesn't optimize the call away when it knows the dynamic type of *p).
Here is a situation where it might happen:
#include <cassert>
struct A {
void f()
{
// this assert will probably not fail
assert(this!=nullptr);
}
};
struct B {
A a1;
A a2;
};
static void g(B *bp)
{
bp->a2.f(); // undefined behavior at this point, but many compilers will
// treat bp as a pointer to address zero and add sizeof(A) to
// the address and pass it as the this pointer to A::f().
}
int main(int,char**)
{
g(nullptr); // oops passed null!
}
This is undefined behavior for C++ in general, but with some compilers, it might have the
consistent behavior of the this pointer having some small non-zero address inside A::f().
Compilers typically implement multiple inheritance by storing the base objects sequentially in memory. If you had, e.g.:
struct bar {
int x;
int something();
};
struct baz {
int y;
int some_other_thing();
};
struct foo : public bar, public baz {};
The compiler will allocate foo and bar at the same address, and baz will be offset by sizeof(bar). So, under some implementation, it's possible that nullptr -> some_other_thing() results in a non-null this.
This example at Coliru demonstrates the situation (assuming the result you get from the undefined behavior is the same one I got), and shows an assert(this != nullptr) failing to detect the case. (Credit to @DyP, from whom I basically stole the example code.)
I think it's not that bad an idea to add the assert. At the very least it can catch a case like the example below:
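#include <iostream>
class Test{
public:
void DoSomething() {
std::cout << "Hello";
}
};
int main(int argc , char* argv[]) {
Test* ptr = 0; // renamed: nullptr is a keyword since C++11
ptr->DoSomething(); // undefined behavior, yet it often appears to work
}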
The above example will typically run without error. In more complex code, this becomes difficult to debug if the assert is absent.
The point I am trying to make is that a null this pointer can go unnoticed and, in complex situations, becomes difficult to debug. I have faced this situation.