When is the value of "this" shifted by an offset? - c++

I was wondering whether assert( this != nullptr ); was a good idea in member functions, and someone pointed out that it wouldn’t work if an offset had been added to the value of this. In that case, instead of being 0, it would be something like 40, making the assert useless.
When does this happen though?

Multiple inheritance can cause an offset, skipping the extra v-table pointers in the object. The generic name is "this pointer adjustor thunking".
But you are helping too much. Null pointer dereferences are very common bugs, and the operating system already has an assert built in for you: your program will stop with a segfault or access violation. The diagnostic you get from the debugger is usually good enough to tell you that the object pointer is null, because you'll see a very low address. That holds for the MI cases as well, not just for an exact null.

this adjustment can happen only in classes that use multiple-inheritance. Here's a program that illustrates this:
#include <iostream>
using namespace std;
struct A {
int n;
void af() { cout << "this=" << this << endl; }
};
struct B {
int m;
void bf() { cout << "this=" << this << endl; }
};
struct C : A,B {
};
int main(int argc, char** argv) {
C* c = NULL;
c->af();
c->bf();
return 0;
}
When I run this program I get this output:
this=0
this=0x4
That is: your assert this != nullptr will not catch the invocation of c->bf() where c is nullptr because the this of the B sub-object inside the C object is shifted by four bytes (due to the A sub-object).
Let's try to illustrate the layout of a C object:
0: | n |
4: | m |
The numbers on the left-hand side are offsets from the object's beginning. So, at offset 0 we have the A sub-object (with its data member n), and at offset 4 we have the B sub-object (with its data member m).
The this of the entire object and the this of the A sub-object both point at offset 0. However, when we want to refer to the B sub-object (when invoking a method defined by B), the this value needs to be adjusted so that it points at the beginning of the B sub-object. Hence the +4.
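The same adjustment can be observed without invoking undefined behavior by measuring it on a real object. The following is a minimal sketch; the helper names offset_of_A_in and offset_of_B_in are made up for illustration, and the actual offsets depend on the implementation.

```cpp
#include <cassert>
#include <cstddef>

struct A { int n; };
struct B { int m; };
struct C : A, B {};

// Byte distance from the start of a C object to its A base sub-object.
std::ptrdiff_t offset_of_A_in(C& c) {
    return reinterpret_cast<char*>(static_cast<A*>(&c)) -
           reinterpret_cast<char*>(&c);
}

// Byte distance from the start of a C object to its B base sub-object.
// The static_cast performs the same adjustment the compiler applies to
// "this" when a member function of B is called on a C object.
std::ptrdiff_t offset_of_B_in(C& c) {
    return reinterpret_cast<char*>(static_cast<B*>(&c)) -
           reinterpret_cast<char*>(&c);
}
```

With the layout drawn above one would expect these to return 0 and 4, but the standard does not promise those particular values.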

Note this is UB anyway.
Multiple inheritance can introduce an offset, depending on the implementation:
#include <iostream>
struct wup
{
int i;
void foo()
{
std::cout << (void*)this << std::endl;
}
};
struct dup
{
int j;
void bar()
{
std::cout << (void*)this << std::endl;
}
};
struct s : wup, dup
{
void foobar()
{
foo();
bar();
}
};
int main()
{
s* p = nullptr;
p->foobar();
}
Output on some version of clang++:
0
0x4
Live example.
Also note, as I pointed out in the comments to the OP, that this assert might not work for virtual function calls, as the vtable isn't initialized (if the compiler does a dynamic dispatch, i.e. doesn't optimize it away when it knows the dynamic type of *p).

Here is a situation where it might happen:
#include <cassert>
struct A {
void f()
{
// this assert will probably not fail
assert(this!=nullptr);
}
};
struct B {
A a1;
A a2;
};
static void g(B *bp)
{
bp->a2.f(); // undefined behavior at this point, but many compilers will
// treat bp as a pointer to address zero and add sizeof(A) to
// the address and pass it as the this pointer to A::f().
}
int main(int,char**)
{
g(nullptr); // oops passed null!
}
This is undefined behavior for C++ in general, but with some compilers, it might have the
consistent behavior of the this pointer having some small non-zero address inside A::f().
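The offset that would be added in this case is just the position of a2 inside B, which can be inspected in a well-defined way. This is a minimal sketch; the helper names are hypothetical and the concrete value is up to the implementation.

```cpp
#include <cassert>
#include <cstddef>

struct A { void f() {} };
struct B { A a1; A a2; };

// Offset of the a2 member inside B, as reported by the implementation.
std::size_t offset_of_a2() {
    return offsetof(B, a2);
}

// The same offset measured on a real object.
std::size_t measured_offset_of_a2(B& b) {
    return static_cast<std::size_t>(
        reinterpret_cast<char*>(&b.a2) - reinterpret_cast<char*>(&b));
}
```

Because a2 follows a1, the offset is at least sizeof(A), so a null B pointer would typically turn into a small non-zero this inside A::f().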

Compilers typically implement multiple inheritance by storing the base objects sequentially in memory. If you had, e.g.:
struct bar {
int x;
int something();
};
struct baz {
int y;
int some_other_thing();
};
struct foo : public bar, public baz {};
The compiler will allocate foo and bar at the same address, and baz will be offset by sizeof(bar). So, under some implementation, it's possible that nullptr -> some_other_thing() results in a non-null this.
This example at Coliru demonstrates (assuming the result you get from the undefined behavior is the same one I did) the situation, and shows an assert(this != nullptr) failing to detect the case. (Credit to @DyP who I basically stole the example code from).

I think it's not that bad an idea to add the assert; at the very least it can catch cases like the example below:
#include <iostream>
class Test{
public:
void DoSomething() {
std::cout << "Hello";
}
};
int main(int argc , char* argv[]) {
Test* p = 0; // named p because nullptr is a keyword since C++11
p->DoSomething(); // undefined behavior, but often appears to work
}
The above example will typically run without a visible error, because DoSomething never touches this. In more complex code, such a bug becomes difficult to debug if the assert is absent.
My point is that a null this pointer can go unnoticed and, in complex situations, becomes difficult to debug; I have faced this situation.
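Since calling through a null pointer is already undefined before this can be inspected, one alternative is to check the pointer at the call site. The sketch below assumes a hypothetical helper try_do_something and a DoSomething that returns its text instead of printing it.

```cpp
#include <cassert>
#include <string>

class Test {
public:
    std::string DoSomething() { return "Hello"; }
};

// Hypothetical helper: validates the pointer before the member call,
// so the null case is handled while behavior is still well defined.
bool try_do_something(Test* p, std::string& out) {
    if (p == nullptr) {
        return false;  // nothing was called
    }
    out = p->DoSomething();
    return true;
}
```

The check happens on the pointer the caller actually holds, so it is not affected by any this adjustment.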

Related

Life time of a c++ static_cast result

I wonder: what exactly is the result of a cast in C++? And specifically, what is its lifetime?
Consider this example:
#include <iostream>
#include <stdint.h>
using namespace std;
class Foo
{
public:
Foo(const int8_t & ref)
: _ptr(&ref)
{}
const int8_t & getRef() { return *_ptr; }
private:
const int8_t * _ptr;
};
enum Bar
{
SOME_BAR = 100
};
int main()
{
{
int32_t value = 50;
Foo x(static_cast<int16_t>(value));
std::cout << "casted from int32_t " << x.getRef() << std::endl;
}
{
Bar value = SOME_BAR;
Foo x(static_cast<int16_t>(value));
std::cout << "casted from enum " << x.getRef() << std::endl;
}
return 0;
}
Output:
casted from int32_t 50
casted from enum 100
It works, but is it safe? With integers I can imagine that the compiler somehow casts a "pointer" to the needed part of the target variable's bytes. But what happens when you cast int to float?
static_cast creates an rvalue that exists until the end of the full-expression it appears in, which here means up until the semicolon. See Value Categories. If you need to pass a reference to the value, the compiler will put the value on the stack and pass that address. Otherwise, it will probably stay in a register, especially with optimizations turned on.
The way you are using it, at the place you're using it, static_cast is completely safe. In the Foo class however, you are saving a pointer to the rvalue. It is only luck that the program executes correctly. A more complex example will probably reuse those stack locations for other uses.
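One way to avoid the dangling pointer is to copy the cast result into the object instead of storing its address. This is a minimal sketch; FooByValue, make_from_int32 and to_float are hypothetical names, not part of the question's code.

```cpp
#include <cassert>
#include <cstdint>

// Variant of Foo that copies the value instead of keeping its address.
class FooByValue {
public:
    explicit FooByValue(std::int16_t v) : value_(v) {}
    std::int16_t get() const { return value_; }
private:
    std::int16_t value_;
};

// The cast result is a temporary, but it is copied into the object
// before the full-expression ends, so nothing dangles afterwards.
FooByValue make_from_int32(std::int32_t value) {
    return FooByValue(static_cast<std::int16_t>(value));
}

// A cast from int to float likewise produces a new value; it does not
// reinterpret the bytes of the original variable.
float to_float(std::int32_t value) {
    return static_cast<float>(value);
}
```

Storing by value costs a copy of two bytes here and removes the lifetime question entirely.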

Changing pointer to class from one class to another

I'm pretty new to C++ and am having trouble making a pointer point from one class to another. This is what I have, it compiles without error, but doesn't work the way I want it to.
JungleMap *Map;
class JungleMap
{
public:
void goNorth()
{
cout << "You are going north towards the river.\n";
delete[] Map;
RiverMap *Map;
}
}
class RiverMap
{
public:
void goNorth()
{
cout << "You are going north away from the river.\n";
delete[] Map;
JungleMap *Map;
}
}
int main()
{
Map->goNorth();
Map->goNorth();
}
This is what the output is:
You are going north towards the river.
You are going north towards the river.
And this is what I would like the output to be:
You are going north towards the river.
You are going north away from the river.
How do I achieve this? It's really bugging me, especially since it compiles without problems.
Just creating a JungleMap* doesn't create a JungleMap. You formed a pointer, but didn't point it anywhere!
This is particularly dangerous since you then dereference it, and later attempt to delete through it. Yes, this compiles, because a compiler cannot diagnose this in the general case (and is never required to try), but you'll get everything at runtime from silent nothingness, to a crash, to a nuclear explosion.
You are also trying to invoke different functions in two different classes, through changing the type of a pointer (without any inheritance, at that), which is simply not possible and will prevent your code from compiling, even though you've tried to get around it by redeclaring variables locally. I could list a ream of misunderstandings but suffice it to say it's time to read a good introductory C++ book.
I would suggest a combination of inheritance and dynamic allocation, if I knew what you were trying to achieve. A common mistake on SO is to provide nonsense code, then expect us to know what your goal is from that nonsense code; unfortunately we have about as much idea what you really meant to do as the C++ compiler does!
You could make this work (to at least a minimal degree) by creating a base class from which both JungleMap and RiverMap derive. You'd then have a pointer to the base class, which you'd point at an instance of one of the derived classes. You'll also need to rearrange the code somewhat to get it to compile.
class Map {
public:
virtual void goNorth() { cout<<"Sorry, you can't go that way"; }
virtual void goSouth() { cout<<"Sorry, you can't go that way"; }
};
Map *map;
class RiverMap;
class JungleMap : public Map {
public:
void goNorth();
};
class RiverMap : public Map {
public:
void goSouth();
};
void JungleMap::goNorth() {
cout<<"You are going north towards the river.\n";
delete map;
map=new RiverMap;
}
void RiverMap::goSouth() {
cout<<"You are going south towards the jungle.\n";
delete map;
map=new JungleMap;
}
Note: here I'm just trying to stay as close to your original design as possible and still have some code that might at least sort of work. I'm certainly not holding it up as an exemplary design, or even close to it (because, frankly, it's not).
What you should do is to sit down and think about the problem you are trying to solve, and make a proper design. In your case you have two "locations", and the "player" should be able to move between these locations. Starting from that we have identified two possible classes (Location and Player) and one behavior (the player can move from location to location).
With the above information, you could do something like this:
class Location
{
public:
void setNorth(Location* loc)
{
north_ = loc;
}
Location* getNorth() const
{
return north_;
}
void setSouth(Location* loc)
{
south_ = loc;
}
Location* getSouth() const
{
return south_;
}
void setDescription(const std::string& descr)
{
description_ = descr;
}
const std::string& getDescription() const
{
return description_;
}
protected:
Location() : north_(nullptr), south_(nullptr) {} // Made protected to prevent direct creation of Location instances; neighbours start out as null
private:
Location* north_;
Location* south_;
std::string description_;
};
class Jungle : public Location
{
public:
Jungle() : Location()
{
setDescription("You are in a jungle.");
}
};
class River : public Location
{
public:
River() : Location()
{
setDescription("You are close to a river.");
}
};
// The actual "map"
std::vector<Location*> map;
void createMap()
{
map.push_back(new Jungle);
map.push_back(new River);
map[0]->setNorth(map[1]);
map[1]->setSouth(map[0]);
}
class Player
{
public:
Player(Location* initialLocation)
: currentLocation_(initialLocation)
{
std::cout << currentLocation_->getDescription() << '\n';
}
...
// Other methods and members needed for a "player"
void goNorth()
{
if (currentLocation_ && currentLocation_->getNorth())
{
currentLocation_ = currentLocation_->getNorth();
std::cout << currentLocation_->getDescription() << '\n';
}
}
void goSouth()
{
if (currentLocation_ && currentLocation_->getSouth())
{
currentLocation_ = currentLocation_->getSouth();
std::cout << currentLocation_->getDescription() << '\n';
}
}
private:
Location* currentLocation_; // The players current location
};
int main()
{
createMap(); // Create the "map"
Player player(map[0]); // Create a player and place "him" in the jungle
// Move the player around a little
player.goNorth();
player.goSouth();
}
In the code above, you have a single player object, which has a "current location". When you move the player around, you simply change the current location for that player. The current location of the player plays the role of the global Map variable you have.
Note: I'm not saying that this is a good design or code, just that it's simple.
However, if you're truly new to C++, you should probably start with some simpler problems, including tutorials on pointers and inheritance.
You appear to be confusing declaration with assignment.
The following line of code is called a declaration, it tells the compiler the properties and attributes of a thing.
JungleMap *Map;
After this line of code, the compiler knows that "Map" is a symbol (a name) referring to a pointer to a JungleMap.
The compiler doesn't have to do anything with a declaration, unless it would have a side effect, at which point it becomes a definition, which means that the declaration invokes a non-trivial constructor or provides an assignment:
struct Foo {};
struct Baz { Baz() { std::cout << "Baz is here\n"; } };
These are declarations - they don't create instances of objects, they describe the layout and functions for instances. At some point you have to create a concrete instance of them with a definition or a call to new.
struct Foo {};
struct Bar { Bar() { std::cout << "Bar is here\n"; } };
struct Baz {};
int main() {
int i; // no side effects, i is trivial.
char* p; // no side effects, p is a pointer (trivial) type
std::string* sp; // trivial, pointer
Foo f; // trivial
Bar b; // non-trivial, Bar has a user-defined ctor that has side-effects.
Bar* bar; // trivial, unassigned pointer type.
Bar* bar2 = new Bar(); // side effects.
Bar bar(); // not an object: declares a function named bar (the "vexing parse"), and clashes with the pointer bar above
}
In the above code, we never use "Baz" and we never declare an object of type Baz so the compiler essentially throws it away. Because so many of the variables are trivial and have no side effect, the result of compiling the above will be functionally equivalent to if we had written:
struct Foo {};
struct Bar { Bar() { std::cout << "Bar is here\n"; } };
int main() {
Bar* bar2 = new Bar(); // side effects.
Bar bar(); // not an object: declares a function named bar (the "vexing parse")
}
All of the rest does nothing.
C++ also allows you to re-use names as long as they are in different scopes, but this creates a new, hidden ("shadow") thing:
#include <iostream>
int main() {
int i = 1;
if (i == 1) {
float i = 3.141;
std::cout << "inner i = " << i << '\n';
}
std::cout << "outer i = " << i << '\n';
return 0;
}
The code you wrote will therefore compile, because it is declaring a new and private "Map" inside each of the go functions and then simply never using them.
Note that above I was able to declare i differently inside the inner scope than the outer.
C++ does not allow you to change the type of a variable. In the above code there are two variables called i: when we created the second i, we created a second variable, and the original variable didn't change.
In order to do what you are trying to do, you're going to need to learn about "polymorphism" and "inheritance", C++ concepts that will allow you to describe a "Room" or "Location" and then base JungleMap and RiverMap on that base definition such that you can take a pointer to the core concept, the Room, and write generic code that deals with rooms while moving the specifics of Jungle, River or BridgeMap into specialized functions. But I think that's beyond the scope of a reply here.

dereferencing typecasted void *object pointers

With regards to this piece of code:
#include <iostream>
class CClass1
{
public:
void print() {
std::cout << "This should print first" << std::endl;
}
};
class CClass2
{
public:
void print() {
std::cout << "This should print second" << std::endl;
}
};
So someone asked an interesting question about having a "free pointer" (so to speak) which can point to multiple instances of different objects without having to create a new type of that object. The person had the idea that this pointer can be of type void * and since it is void, it can be made to point to any instance of an object and access the object's public properties.
The following solution was submitted:
int main() {
void *pClass(NULL);
((CClass1 *)(pClass))->print();
((CClass2 *)(pClass))->print();
std::cin.ignore();
return 0;
}
My question is why does the above work, but this doesn't:
int main() {
(CClass1 *FG)->print();
(CClass2 *FG)->print();
std::cin.ignore();
return 0;
}
Your first example exhibits undefined behavior, by calling a non-static member function via a pointer that doesn't point to a valid object. It only appears to work by accident, because the function in question just happens not to use this in any way.
Your second example is, quite simply, syntactically incorrect. I'm not even sure what you are trying to do there; the code makes no sense.
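For contrast, a void * can be used safely when it points at a real object and is cast back to that object's actual type. The sketch below assumes print returns its text rather than writing to std::cout, and the helper names print_as_class1 and print_as_class2 are made up for illustration.

```cpp
#include <cassert>
#include <string>

class CClass1 {
public:
    std::string print() { return "This should print first"; }
};

class CClass2 {
public:
    std::string print() { return "This should print second"; }
};

// p must really point at a CClass1; the caller is responsible for that.
std::string print_as_class1(void* p) {
    return static_cast<CClass1*>(p)->print();
}

// p must really point at a CClass2.
std::string print_as_class2(void* p) {
    return static_cast<CClass2*>(p)->print();
}
```

The void * carries no type information, so casting it to anything other than the original type is undefined behavior.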

Determine class instance of a member in C++

I've never seen this in any language, but I was wondering if this is possible using some trick that I don't know.
Let's say that I have a function like
struct A {
// some members and methods ...
some_t t;
// more members ...
};
void test(some_t& x) { // a reference to avoid copying a new some_t
// obtain the A instance if x is the t member of an A
// or throw an error if x is not the t member of an A
...
// do something
}
Would it be possible to obtain the instance of A whose member t is x ?
No, unfortunately it's not possible.
If you know that you have a reference to the t member of some A instance, you can get the instance using container_of, e.g. A* pa = container_of(&x, A, t);.
Verifying that the resulting pointer actually is an A is technically possible if and only if A has virtual members, unfortunately there's no portable method to check.
You can achieve something similar, however, using multiple inheritance and dynamic_cast, which allows cross-casting between subobjects.
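A minimal sketch of the dynamic_cast approach follows. It assumes A can inherit from some_t instead of holding it as a member, and that some_t can be made polymorphic; extra_t, owner_of and sibling_of are hypothetical names.

```cpp
#include <cassert>

// Stand-in for some_t; it needs a virtual function for dynamic_cast.
struct some_t {
    virtual ~some_t() {}
};

// A second polymorphic base, only here to show a cross-cast.
struct extra_t {
    virtual ~extra_t() {}
};

// Here A inherits from some_t rather than containing it.
struct A : some_t, extra_t {};

// Returns the enclosing A, or nullptr if x is not part of an A.
A* owner_of(some_t& x) {
    return dynamic_cast<A*>(&x);
}

// Cross-cast: from one base sub-object to a sibling base sub-object.
extra_t* sibling_of(some_t& x) {
    return dynamic_cast<extra_t*>(&x);
}
```

The caller can throw when owner_of returns nullptr, which matches the "or throw an error" requirement in the question.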
You can add a pointer to A inside some_t (provided some_t is a struct or class), like this:
struct some_t
{
A *a;
...
};
void test(some_t& x)
{
if( x.a )
{
// do some
}
else
throw ...
}
If you can modify struct A and its constructor and if you can ensure the structure packing, you can add a value directly after t which holds some magic key.
struct A {
...
some_t t;
struct magic_t
{
uint32 code;
some_t* pt;
} magic;
};
#define MAGICCODE 0xC0DEC0DE //or something else unique
In A's constructor, do:
this->magic.code = MAGICCODE; this->magic.pt = &(this->t);
Then you can write
bool test(some_t *t) //note `*` not `&`
{
A::magic_t* pm = (A::magic_t*)(t+1); // magic_t is nested in A; assumes magic sits directly after t
return (pm->pt == t && pm->code == MAGICCODE);
}
This answer does not meet all the requirements of the original question, I had deleted it, but the OP requested I post it. It shows how under very specific conditions you can calculate the instance pointer from a pointer to a member variable.
You shouldn't, but you can:
#include <iostream>
#include <cstddef>
using namespace std;
struct A
{
int x;
int y;
};
struct A* find_A_ptr_from_y(int* y)
{
int o = offsetof(struct A, y);
return (struct A*)((char *)y - o);
}
int main(int argc, const char* argv[])
{
struct A a1;
struct A* a2 = new struct A;
cout << "Address of a1 is " << &a1 << endl;
cout << "Address of a2 is " << a2 << endl;
struct A *pa1 = find_A_ptr_from_y(&a1.y);
struct A *pa2 = find_A_ptr_from_y(&(a2->y));
cout << "Address of a1 (recovered) is " << pa1 << endl;
cout << "Address of a2 (recovered) is " << pa2 << endl;
}
Output
Address of a1 is 0x7fff5fbff9d0
Address of a2 is 0x100100080
Address of a1 (recovered) is 0x7fff5fbff9d0
Address of a2 (recovered) is 0x100100080
Caveats: if what you pass to find_A_ptr_from_y is not a pointer to (struct A).y you will get total rubbish.
You should (almost) never do this. See comment by DasBoot below.
It's not quite clear to me what you are trying to do, but if you want to find the pointer to an instance of struct A when you know the pointer to a member of A, you can do that.
See for example the container_of macro in the linux kernel.
The parameter x of function test() need not be a member of any class as far as test() is concerned.
If semantically in a particular application x must always be a member of a class, then that information could be provided, either by passing an additional parameter or by having some_t itself contain such information. However, doing that would be entirely unnecessary: if test() truly needed access to the object containing x, then why not simply pass the parent object itself? Or just make test() a member function of the same class and pass no parameters whatsoever? If the reason is that x may belong to different classes, then polymorphism can be employed to resolve that issue.
Basically I suggest that there is no situation where you would need such a capability that cannot be solved in a simpler, safer and more object oriented manner.

Virtual member functions and std::tr1::function: How does this work?

Here is a sample piece of code. Note that B is a subclass of A and both provide a unique print routine. Also notice in main that both bind calls are to &A::print, though in the latter case a reference to B is passed.
#include <iostream>
#include <tr1/functional>
struct A
{
virtual void print()
{
std::cerr << "A" << std::endl;
}
};
struct B : public A
{
virtual void print()
{
std::cerr << "B" << std::endl;
}
};
int main (int argc, char * const argv[])
{
typedef std::tr1::function<void ()> proc_t;
A a;
B b;
proc_t a_print = std::tr1::bind(&A::print, std::tr1::ref(a));
proc_t b_print = std::tr1::bind(&A::print, std::tr1::ref(b));
a_print();
b_print();
return 0;
}
Here is the output I see compiling with GCC 4.2:
A
B
I would consider this correct behavior, but I am at a loss to explain how it is working properly given that the std::tr1::functions were bound to &A::print in both cases. Can someone please enlighten me?
EDIT: Thanks for the answers. I am familiar with inheritance and polymorphic types. What I am interested in is what does &A::print mean? Is it an offset into a vtable, and that vtable changes based on the referred object (in this case, a or b?) From a more nuts-and-bolts perspective, how does this code behave correctly?
This works in the same manner as it would have worked with plain member function pointers. The following produces the same output:
int main ()
{
A a;
B b;
typedef void (A::*fp)();
fp p = &A::print;
(a.*p)(); // prints A
(b.*p)(); // prints B
}
It would have been surprising if boost/tr1/std::function did anything different since they presumably store these pointers to member functions under the hood. Oh, and of course no mention of these pointers is complete without a link to the Fast Delegates article.
Because print() is declared virtual, A is a polymorphic class. By binding to the print function pointer, you will be calling through an A pointer, much in the same way as:
A* ab = &b;
ab->print();
In the ->print call above, you would expect polymorphic behavior. The same is true in your code as well. And this is a Good Thing, if you ask me. At least, most of the time. :)
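The same behavior can be reproduced with the C++11 std::function and std::bind, which replaced the tr1 versions. This is a minimal sketch; name, call_via_member_pointer and bind_name are hypothetical, and the functions return strings instead of printing.

```cpp
#include <cassert>
#include <functional>
#include <string>

struct A {
    virtual ~A() {}
    virtual std::string name() const { return "A"; }
};

struct B : A {
    std::string name() const override { return "B"; }
};

// Calls through a pointer to the virtual member A::name. The pointer
// names the virtual function, so the call dispatches on the dynamic
// type of obj.
std::string call_via_member_pointer(const A& obj) {
    std::string (A::*p)() const = &A::name;
    return (obj.*p)();
}

// The same call wrapped with std::bind and std::function.
// The returned callable refers to obj, which must outlive it.
std::function<std::string()> bind_name(A& obj) {
    return std::bind(&A::name, std::ref(obj));
}
```

In both helpers only &A::name is ever mentioned, yet passing a B selects B's override.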