Method running on an object BEFORE the object has been initialised? - C++

#include <iostream>
using namespace std;

class Foo
{
public:
    Foo() : initialised(0)
    {
        cout << "Foo() gets called AFTER test() ?!" << endl;
    }
    Foo test()
    {
        cout << "initialised= " << initialised << " ?! - ";
        cout << "but I expect it to be 0 from the 'initialised(0)' initialiser on Foo()" << endl;
        cout << "this method test() is clearly working on an uninitialised object ?!" << endl;
        return Foo();
    }
    ~Foo()
    {
    }
private:
    int initialised;
};

int main()
{
    // SURE this is bad coding but it compiles and runs.
    // I want my class to DETECT and THROW an error to prevent this type of coding;
    // in other words, how to catch it at run time and throw "not initialised" or something.
    Foo foo = foo.test();
}

Yes, it is calling the function on a not-yet-constructed object, which is undefined behavior. You can't detect it reliably. I would also argue that you should not try to detect it. It's not something that is likely to happen by accident, compared to, for example, calling a function on an already deleted object. Trying to catch each and every possible mistake is just about impossible. The declared name is already visible in its own initializer, for other useful purposes. Consider this:
Type *t = (Type*)malloc(sizeof(*t));
Which is a common idiom in C programming, and which still works in C++.
Personally, I like this story by Herb Sutter about null references (which are likewise invalid). The gist is: don't try to protect against cases that the language clearly forbids, in particular those that are in the general case impossible to diagnose reliably. You will acquire a false sense of security over time, which becomes quite dangerous. Instead, train your understanding of the language and design interfaces in a way (avoid raw pointers, ...) that reduces the chance of making mistakes.
In C++, and likewise in C, many cases are not explicitly forbidden but rather are left undefined. Partially this is because some things are rather difficult to diagnose efficiently, and partially because undefined behavior lets the implementation design alternative behavior for it instead of completely ignoring it - which existing compilers often make use of.
In the above case for example, any implementation is free to throw an exception. There are other situations that are likewise undefined behavior which are much harder to diagnose efficiently for the implementation: Having an object in a different translation unit accessed before it was constructed is such an example - which is known as the static initialization order fiasco.

The constructor is the method you want (it runs not before initialization but on initialization, which should be OK). The reason it doesn't work in your case is that you have undefined behavior here.
In particular, you use the not-yet-existent foo object to initialize itself (i.e. the foo in foo.test() doesn't exist yet). You can solve it by creating an object explicitly:
Foo foo = Foo().test();
You cannot check for it within the program, but Valgrind may well find this type of bug (like any other uninitialized memory access).

You can't really prevent people from coding poorly. It works just like it "should":
Allocate memory for Foo (that address becomes the value of the "this" pointer)
Call Foo::test as Foo::test(this), which:
Reads this->initialised, which is random junk, then
Calls Foo's default constructor (because of return Foo();), then
Calls Foo's copy constructor to copy the right-hand Foo().
Just like it should. You can't prevent people from not knowing the right way to use C++.
The best you could do is use a magic number:
#include <cassert>

class A
{
public:
    A() :
        _magicFlag(1337)
    {
    }
    void some_method()
    {
        assert(_magicFlag == 1337); /* make sure the constructor has been called */
    }
private:
    unsigned _magicFlag;
};
This "works" because the chances that _magicFlag happens to sit in memory that already holds the value 1337 are low.
But really, don't do this.

You're getting quite a few responses that basically say, "you shouldn't expect the compiler to help you with this". However, I'd agree with you that the compiler should help with this problem by emitting some sort of diagnostic. Unfortunately (as the other answers point out), the language spec doesn't help here - once you get to the initializer part of the declaration, the newly declared identifier is already in scope.
A while back, DDJ had an article about a simple debugging class called "DogTag" that could be used as a debugging aid to help with:
using an object after deletion
overwriting an object's memory with garbage
using an object before initializing it
I haven't used it much - but it did come in handy on an embedded project that was running into some memory overwrite bugs.
It's basically an elaboration of the "MagicFlag" technique that GMan described.

Related

I accidentally called a member function without an object of its class. But how does this work?

Here is my code.
#include <iostream>

class IService {
};

class X_Service {
public:
    void service1() {
        std::cout << "Service1 Running..." << std::endl;
    }
};

int main() {
    IService service;
    auto func = reinterpret_cast<void (IService::*)()>(&X_Service::service1);
    (service.*(func))();
    return 0;
}
I don't understand how this works. I didn't make X_Service inherit from IService, and I didn't create an X_Service object, but it works.
Can someone explain this?
Your confusion probably comes from the misunderstanding that because something compiles and runs without crashing, it "works". Which is not really true.
There are many ways you can break the rules of the language and still write code that compiles and runs. By using reinterpret_cast here and making an invalid cast you have broken the rules of the language, and your program has Undefined Behaviour.
That means it can seem to work, it can crash or it can just do something completely different from what you intended.
In your case it seems to work, but it's still UB and the code is not valid.
Under the hood your compiler will turn all these functions into machine code that is basically just jumps to certain addresses in memory and then executing commands stored there.
A member function is just a function that has in addition to local variables and parameters, a piece of memory that stores the address of the class object. That piece of memory holds the address you are accessing when you use the this keyword.
If you call a member function on a wrong object or nullptr, then you basically just make the this pointer point to something invalid.
Your function doesn't access this, which is the reason your program doesn't blow up.
That said, this is still undefined behavior, and anything could happen.
So, I had some fun and manipulated the code a bit. This is also an empirical answer. There are a lot of pitfalls that risk stack corruption with this way of doing things, so I changed the code a bit so that stack corruption does not occur but it still shows what is happening.
#include <iostream>

class IService {
public:
    int x;
};

class X_Service {
public:
    int x;
    void service1() {
        this->x = 65;
        std::cout << this->x << std::endl;
    }
};

int main() {
    IService service;
    auto func = reinterpret_cast<void (IService::*)()>(&X_Service::service1);
    (service.*(func))();
    std::cout << service.x << std::endl;
    std::cin.get();
    X_Service derp;
    (derp.service1)();
    std::cout << derp.x << std::endl;
    return 0;
}
So from the outset, auto gave you the power to make a non-type-safe pointer void (IService::*)(), and the instance of the object itself is this-> regardless of which member function of whatever class you are stealth-inheriting from. The only issue is that the first variable of the instance is interpreted based on the first variable of the class you are stealth-inheriting from, which can lead to stack corruption if the type differs.
To get cool output but inevitably cause stack corruption, you can do the following fun things:
class IService {
public:
    char x;
};
Your IDE will detect stack corruption of your IService object, but getting that output of
65
A
is kind of worth it, but you will see that issues will arise doing this stealth inheritance.
I'm also on an x86 compiler, so basically my variables are all lined up. Say, for instance, I add an int y above int x in IService: this program would output nonsense. Basically it only works because my classes are binary-compatible.
When you reinterpret_cast a function or member function pointer to a different type, you are never allowed to call the resulting pointer except if you cast it back to its original type first and call through that.
Violating this rule causes undefined behavior. This means that you lose any language guarantee that the program will behave in any specific way, absent additional guarantees from your specific compiler.
reinterpret_cast is generally dangerous because it completely circumvents the type system. If you use it, you always need to verify yourself, by looking at the language rules, whether the cast and the way the result will be used are well-defined. reinterpret_cast tells the compiler that you know what you are doing and that you don't want any warning or error, even if the result will be nonsense.

Method call after invalid C-style cast works

So I'm trying to learn a bit more about the differences between C-style casts, static_cast and dynamic_cast, and I decided to try this example, which should reflect the differences between C-style casts and static_cast pretty well.
#include <iostream>
using namespace std;

class B
{
public:
    void hi() { cout << "hello" << endl; }
};

class D : public B {};
class A {};

int main()
{
    A* a = new A();
    B* b = (B*)a;
    b->hi();
}
Well, this code snippet should show that the C-style cast goes very wrong and the bad cast is not detected at all. Partially, it happens that way: the bad cast is not detected. But I was surprised when the program, instead of crashing at b->hi();, printed the word "hello" on the screen.
Now, why is this happening? What object was used to call such a method, when there's no B object instantiated? I'm using g++ to compile.
As others said it is undefined behaviour.
Why does it work, though? It is probably because the function call is linked statically, at compile time (it's not a virtual function). The function B::hi() exists, so it is called. Try adding a variable to class B and using it in the function hi(). Then you will see the problem (a trash value) on the screen:
class B
{
public:
    void hi() { cout << "hello, my value is " << x << endl; }
private:
    int x = 5;
};
Otherwise, you could make the function hi() virtual. Then the function is linked dynamically, at runtime, and the program crashes immediately:
class B
{
public:
    virtual void hi() { cout << "hello" << endl; }
};
This only works because of the implementation of the hi() method itself, and the peculiar part of the C++ spec called undefined behaviour.
Casting using a C-style cast to an incompatible pointer type is undefined behaviour - literally anything at all could happen.
In this case, the compiler has obviously decided to just trust you, and has decided to believe that b is indeed a valid pointer to an instance of B - this is in fact all a C-style cast will ever do, as they involve no runtime behaviour. When you call hi() on it, the method works because:
It doesn't access any instance variables belonging to B but not A (indeed, it doesn't access any instance variables at all)
It's not virtual, so it doesn't need to be looked up in b's vtable to be called
Therefore it works, but in almost all non-trivial cases such a malformed cast followed by a method call would result in a crash or memory corruption. And you can't rely on this kind of behaviour - undefined doesn't mean it has to be the same every time you run it, either. The compiler is perfectly within its rights with this code to insert a random number generator and, upon generating a 1, start up a complete copy of the original Doom. Keep that firmly in mind whenever anything involving undefined behaviour appears to work, because it might not work tomorrow and you need to treat it like that.
Now, why is this happening ?
Because it can happen. Anything can happen. The behaviour is undefined.
The fact that something unexpected happened demonstrates well why UB is so dangerous. If it always caused a crash, then it would be far easier to deal with.
What object was used to call such a method
Most likely, the compiler blindly trusts you, and assumes that b does point to an object of type B (which it doesn't). It probably would use the pointed memory as if the assumption was true. The member function didn't access any of the memory that belongs to the object, and the behaviour happened to be the same as if there had been an object of correct type.
Being undefined, the behaviour could be completely different. If you try to run the program again, the standard doesn't guarantee that demons won't fly out of your nose.

Is it better to use an init() for memory allocation than the constructor?

Suppose I have things like:
class obj001
{
public:
    obj001() {
        std::cout << "ctor == obj001" << std::endl;
    }
    ~obj001() {
        std::cout << "dtor == obj001" << std::endl;
    }
};

class obj002
{
public:
    obj002() {
        std::cout << "ctor == obj002" << std::endl;
    }
    ~obj002() {
        std::cout << "dtor == obj002" << std::endl;
    }
};

class packet001
{
public:
    packet001() : p01(NULL), p02(NULL) {
        /*p01 = new obj001;
        p02 = new obj002;
        throw "hahaha";*/
        std::cout << "CTOR == PACKET01" << std::endl;
    }
    ~packet001() {
        delete p01;
        delete p02;
        std::cout << "DTOR == PACKET01" << std::endl;
    }
    void init() {
        p01 = new obj001;
        p02 = new obj002;
        throw "hahaha";
    }
    obj001* p01;
    obj002* p02;
};
And if I do:
try
{
    packet001 superpack;
    superpack.init();
}
catch (const char* type) // a thrown string literal is const char*, so catch(char*) would not match
{
}
Then init() failed, and the dtor of superpack will be called.
But if I put the memory allocation inside the ctor of superpack (and do not execute init(), of course), then after the ctor fails, the dtor will not be called, so p01 and p02 are leaked.
So, is it better to use things like init()?
Thanks!
Using two-phase construction - ordinary construction followed by an outside call to an init function - means that after construction you don't yet know whether you have a valid object at hand. And that means that in any function that takes such an object as an argument, you don't know whether the object is valid. This means a lot of extra checking and uncertainty, which in turn means bugs and added work; so a constructor should instead establish a fully functional, valid object.
The set of assumptions that go into the notion of "functional, valid" is called the class invariant.
So in other words, more academic phrasing, the job of a constructor is to establish the class invariant, so that it’s known to hold after construction.
Then, keeping the object valid in every externally available operation means that it is guaranteed to remain valid. Thus no further validity checking is required. This scheme isn’t entirely applicable to all objects (a counter-example is an object representing a file, where any operation might cause the object to become effectively invalid), but mostly it’s a good idea and works well, and where it doesn't work directly, it works for the parts.
So in your constructor you should ensure cleanup by one of the following means:
Use standard library containers (or 3rd party ones) instead of dealing directly with raw arrays and dynamic allocation.
Or use sub-objects that each manage just one resource. A sub-object can be a data member or a base class. If a data member, it can be smart pointer.
Or in the worst case, use try-catch for direct cleanup.
It’s also technically possible to use the C idea of checking return values to invoke direct cleanup as necessary. But the list above is in order of decreasing ease and safety. The C style coding is somewhere beyond the bottom of that list.
The C++ language creator, Bjarne Stroustrup, has written a little about this very subject in Appendix E: Standard-Library Exception Safety of the 3rd edition of The C++ Programming Language. Just download the PDF and, in your PDF reader, search for “init(”. You should land a bit into section §E.3.5, about constructors and invariants; do read on through at least section §E.3.5.1, about using init() functions.
As Bjarne lists there, …
[…] having a separate init() function is an opportunity to
[1] forget to call init() (§10.2.3),
[2] forget to test on the success of init(),
[3] call init() more than once,
[4] forget that init() might throw an exception, and
[5] use the object before calling init().
Bjarne’s discussion is, I think, great for a beginner, as is the whole book.
However, be aware that a common reason for two-phase construction, namely supporting derived-class-specific initialization, is simply not mentioned at all; it is not part of Bjarne’s picture here. It is the reason for two-phase initialization in many GUI frameworks. Some C++ GUI frameworks with OK single-phase initialization do exist, however, proving that mostly it was an educational issue – those early C++ programmers simply did not know about, or could not assume that their library users would understand, C++ RAII.
The best thing is to avoid these kinds of allocations altogether. You can put instances directly within a class for many things. If you really need a pointer, you can use unique_ptr and shared_ptr for automatic memory management.
In your example, this would be fine:
struct packet001
{
    obj001 p01;
    obj002 p02;
};
If you need them to be pointers:
#include <memory>

struct packet001
{
    packet001()
        : p01(new obj001),
          p02(new obj002)
    {
    }

    std::unique_ptr<obj001> p01;
    std::unique_ptr<obj002> p02;
};
The memory will automatically be freed in the destructor, and deallocations will happen properly if an exception occurs during construction.
Shouldn't you catch all exceptions in Ctor and clean up properly if exceptions arrive inside Ctor?
I have used two-phase construction, or the various other means pointed out for in-constructor cleanup, in cases where the constructor is likely to fail many times AND where I want the program to continue to run even after that failure; e.g. constructing an object that reads a file whose name was provided by the user. There the constructor is likely to fail often, e.g. on bad user input.
But bad_alloc - that should be rare in a well-designed program. And what exactly are you going to do if the memory allocation fails? Your C++ program is most likely doomed at that point. Why worry about a memory leak then? Now, you can point out counter-examples of programs that can continue to run even after encountering a bad_alloc, or programs that employ fancy techniques to avoid bad_allocs - but is your program one of those?

What happens in C++ when I pass an object by reference and it goes out of scope?

I think this question is best asked with a small code snippet I just wrote:
#include <iostream>
using namespace std;

class BasicClass
{
public:
    BasicClass()
    {
    }

    void print()
    {
        cout << "I'm printing" << endl;
    }
};

class FriendlyClass
{
public:
    FriendlyClass(BasicClass& myFriend) :
        _myFriend(myFriend)
    {
    }

    void printFriend()
    {
        cout << "Printing my friend: ";
        _myFriend.print();
    }

private:
    BasicClass& _myFriend;
};

int main(int argc, char** argv)
{
    FriendlyClass* fc;
    {
        BasicClass bc;
        fc = new FriendlyClass(bc);
        fc->printFriend();
    }
    fc->printFriend();
    delete fc;
    return 0;
}
The code compiles and runs fine using g++:
$ g++ test.cc -o test
$ ./test
Printing my friend: I'm printing
Printing my friend: I'm printing
However, this is not the behavior I was expecting. I was expecting some sort of failure on the second call to fc->printFriend(). Is my understanding of how passing/storing by reference works incorrect, or is this something that just happens to work on a small scale and would likely blow up in a more sophisticated application?
It works exactly as for pointers: using something (pointer/reference) that refers to an object that no longer exists is undefined behavior. It may appear to work but it can break at any time.
Warning: what follows is a quick explanation of why such method calls can seem to work on several occasions, just for informative purposes; when writing actual code you should rely only on what the standard says.
As for the behavior you are observing: on most (all?) compilers method calls are implemented as function calls with a hidden this parameter that refers to the instance of the class on which the method is going to operate. But in your case, the this pointer isn't being used at all (the code in the function is not referring to any field, and there's no virtual dispatch), so the (now invalid) this pointer is not used and the call succeeds.
In other instances it may appear to work even if it's referring to an out-of-scope object because its memory hasn't been reused yet (although the destructor has already run, so the method will probably find the object in an inconsistent state).
Again, you shouldn't rely on this information, it's just to let you know why that call still works.
When you store a reference to an object whose lifetime has ended, accessing it is undefined behavior. So anything can happen: it can work, it can fail, it can crash, and, as I like to say, it can order a pizza.
Undefined behavior. By definition you cannot make assumptions about what will happen when that code runs. The compiler may not be clearing out the memory where bc resides yet, but you can't count on it.
I actually fixed the same bug in a program at work once. When using Intel's compiler the variable which had gone out of scope had not been "cleaned up" yet, so the memory was still "valid" (but the behavior was undefined). Microsoft's compiler however cleaned it up more aggressively and the bug was obvious.
You have a dangling reference, which results in undefined behavior.
look here: Can a local variable's memory be accessed outside its scope?
It will almost always work, since you have no virtual functions and you don't access fields of BasicClass: all the methods you call have static binding, and 'this' is never used, so you never actually touch the dead object's memory.

static_cast doubt

#include <iostream>
#include <cstdio>

class A
{
public:
    A()
    {
        std::cout << "\n A_Constructor \t" << this << std::endl;
    }

    void A_Method()
    {
        std::cout << "\n A_Method \t" << this << std::endl;
    }
};

class B : public A
{
public:
    B()
    {
        std::cout << "\n B_Constructor \n";
    }

    void B_Method()
    {
        std::cout << "\n B_Method \t" << this << std::endl;
    }
};

int main()
{
    A* a_obj = new A;
    B* b_obj = static_cast<B*>(a_obj); // This isn't safe.
    b_obj->B_Method();
    getchar();
    return 0;
}
OUTPUT :
A_Constructor 001C4890
B_Method 001C4890
As no run-time check is involved in the type conversion, static_cast isn't safe. But in this example I got what I didn't even expect: since there is no call to B::B(), none of its members should be callable through b_obj. Despite that, I got the output.
In this simple case I might have succeeded, though it is known to be unsafe. My doubts are:
Though there is no call to B::B(), how was I able to access class B's member functions?
Can someone please provide an example where this is unsafe and might go wrong (though what I gave before might serve as a bad example, but even better)?
I did it on Visual Studio 2010 with the /Wall option set.
This is Undefined Behavior. Sometimes UB causes crashes. Sometimes it seems to "work". You are right that you shouldn't do this, even though in this case less bad things happened.
What you are trying is undefined behaviour, so ANYTHING could happen. This seems to work (and probably works) fine because you don't try to access any data members (you don't have any). Try adding some data members to both classes and accessing them from methods; you will see that the behaviour becomes completely unpredictable.
I doubt that this particular case is UB. First, it just casts a pointer from one type to another. Since there is no virtual/multiple inheritance involved, no pointer adjustment is performed, so basically the pointer value stays the same. Surely it points to an object of a wrong type, but who cares, as long as we don't access any B members, even if there were some? And even if there was pointer adjustment involved, it would still be okay if we didn't access any memory pointed by it, which we don't.
Then, the example calls a method of B. Since it is not virtual, it is just a normal function call with the hidden argument this (think B::B_Method(this)). Now, this points to an object of the wrong type, but again, who cares? The only thing it does is print it, which is always a safe thing to do.
In fact, you can even call methods using a NULL pointer. It works as long as the method isn't virtual and doesn't try to access anything pointed by this. I once had a library used by many programs. This library had a singleton-like class which had to be constructed explicitly, but none of the programs actually did it. The instance pointer was initialized to NULL by default, as it was global. It worked perfectly since the class had no data members at all. Then I added some and all the programs suddenly started to crash. When I found out the reason, I had a really good laugh. I've been using an object that didn't even exist.