This is not a question about why you would write code like this, but more as a question about how a method is executed in relation to the object it is tied to.
If I have a struct like:
struct F
{
// some member variables
void doSomething(std::vector<F>& vec)
{
// do some stuff
vec.push_back(F());
// do some more stuff
}
}
And I use it like this:
std::vector<F>(10) vec;
vec[0].doSomething(vec);
What happens if the push_back(...) in doSomething(...) causes the vector to expand? This means that vec[0] would be copied then deleted in the middle of executing its method. This would be no good.
Could someone explain what exactly happens here?
Does the program instantly crash? Does the method just try to operate on data that doesn't exist?
Does the method operate "orphaned" of its object until it runs into a problem like changing the object's state?
I'm interested in how a method call is related to the associated object.
Yes, it's bad. It's possible for your object to be copied (or moved in C++11 if the distinction is relevant to your code) while your are inside doSomething(). So after the push_back() returns, the this pointer may no longer point to the location of your object. For the specific case of vector::push_back(), it's possible that the memory pointed to by this has been freed and the data copied to a new array somewhere else. For other containers (list, for example) that leave their elements in place, this is (probably) not going to cause problems at all.
In practice, it's unlikely that your code is going to crash immediately. The most likely circumstance is a write to free memory and a silent corruption of the state of your F object. You can use tools like valgrind to detect this kind of behavior.
But basically you have the right idea: don't do this, it's not safe.
Could someone explain what exactly happens here?
Yes. If you access the object, after a push_back, resize or insert has reallocated the vector's contents, it's undefined behavior, meaning what actually happens is up to your compiler, your OS, what do some more stuff is and maybe a number of other factors like maybe phase of the moon, air humidity in some distant location,... you name it ;-)
In short, this is (indirectly via the std::vector implemenation) calling the destructor of the object itself, so the lifetime of the object has ended. Further, the memory previously occupied by the object has been released by the vector's allocator. Therefore the use the object's nonstatic members results in undefined behavior, because the this pointer passed to the function does not point to an object any more. You can however access/call static members of the class:
struct F
{
static int i;
static int foo();
double d;
void bar();
// some member variables
void doSomething(std::vector<F>& vec)
{
vec.push_back(F());
int n = foo(); //OK
i += n; //OK
std::cout << d << '\n'; //UB - will most likely crash with access violation
bar(); //UB - what actually happens depends on the
// implementation of bar
}
}
Related
I'm trying to implement the Observer pattern, but i don't want observers being responsible for my program safety by maintaining the reference list in ObservableSubject.
Meaning that when the Observer object lifetime ends, I dont want to explicitly call ObservervableSubject::removeObserver(&object).
I have come up with an idea to use use pointer references in ObservableSubject.
My question are: Is implementation described above and attempted below possible?
And what is happening in my program, how do i prevent dereferencing trash?
Apriori excuse: This is an attempt at understanding C++, not something that should have actual use or be prefered over other implementations.
My solution attempt:
// Example program
#include <iostream>
#include <string>
#include <vector>
class ObserverInterface {
public:
virtual ~ObserverInterface() {};
virtual void handleMessage() = 0;
};
class ObservableSubject
{
std::vector<std::reference_wrapper<ObserverInterface*>> listeners;
public:
void addObserver(ObserverInterface* obs)
{
if (&obs)
{
// is this a reference to the copied ptr?
// still, why doesnt my guard in notify protect me
this->listeners.push_back(obs);
}
}
void removeObserver(ObserverInterface* obs)
{
// todo
}
void notify()
{
for (ObserverInterface* listener : this->listeners)
{
if (listener)
{
listener->handleMessage();
}
}
}
};
class ConcreteObserver : public ObserverInterface {
void handleMessage()
{
std::cout << "ConcreteObserver: I'm doing work..." << std::endl;
}
};
int main()
{
ObservableSubject o;
{
ConcreteObserver c;
o.addListener(&c);
}
o.notify();
std::cin.get();
}
Line in ObservableSubject::notify() : Listener->handleMessage() throws the following exception:
Exception thrown: read access violation.
listener->**** was 0xD8BF48B. occurred
Your program has undefined behavior.
ObservableSubject o;
{
ConcreteObserver c;
o.addListener(&c); // Problem
}
c gets destructed when the scope ends. You end up storing a stale pointer in the list of listeners of o.
You can resolve the problem by defining c in the same scope as o or by using dynamically allocated memory.
ObservableSubject o;
ConcreteObserver c;
o.addListener(&c);
or
ObservableSubject o;
{
ConcreteObserver* c = new ConcreteObserver;
o.addListener(c);
}
When you use dynamically allocated memory, the additional scope is not useful. You might as well not use it.
ObservableSubject o;
ConcreteObserver* c = new ConcreteObserver;
o.addListener(c);
If you choose to use the second approach, make sure to deallocate the memory. You need to add
delete c;
before the end of the function.
Update, in response to OP's comment
You said:
Maybe i wasn't clear. Solving the lifetime/stale pointer problem was the intention of my solution. I know i have no problems if i have properly managed lifetime, or if i add detachObserver option on Observer destruction. I want to somehow be able to tell from the ObservableSubject if his list of Observers was corrupted, without the Observer explicitly telling that.
Since dereferencing an invalid pointer is cause for undefined behavior, it is essential that you track the lifetime of observers and make sure to update the list of observers when necessary. Without that, you are courting undefined behavior.
Note, I don't recommend the following approach, but I think it meets your requirements. You have a duplicated observer list. One is under control of the Observers, and the other, using weak pointers, is handled by the Observable object.
Make the Observer constructors private and use an ObserverFactory (which is their friend) to obtain a std::shared_ptr<Observer>. The factory has a map from raw pointers to reference wrappers to the associated shared pointer.
The listeners list becomes std::vector<std::weak_ptr<Observer>>. On list traversal, you try to lock the weak_ptr; if it succeeds, handle the message; if it fails, that is, you get nullptr, remove the weak pointer from the list.
When the listener no longer wants to listen, it tells the Factory to do a reset on its shared pointer and remove from the map. This step is rather ugly, as it is just a fancy delete this, normally a code smell.
I believe you can also do this with std::shared_from_this.
The plan is you move the maintenance away from the ObservableSubject back into the Observers.
// is this a reference to the copied ptr?
Yes, it is. It invokes undefined behaviour because the obs pointer variable goes out of scope at the end of the function, resulting in a dangling reference.
The whole idea doesn’t gain you anything. Even if you make the ref-to-pointer approach work correctly, you are depending on one thing: That that exact pointer variable is set to nullptr once the object dies. Essentially that’s the same problem as ensuring that no dangling pointers are held in listeners.
For a heap object: How do you make sure that nobody deletes the object through a different pointer? Or forgets to null the registered pointer? It’s even worse for stack objects like in your example. The object goes out of scope and dies automatically. There is no opportunity to null anything unless you introduce an additional pointer variable that you’d have to manage manually.
You could consider two general alternatives to your approach:
Make the relation bidirectional. Then whoever dies first (observable or observer) can notify the other party abouts its death in the destructor.
If you don’t like the bidirectionality a central, all-knowing orchestrator that decouples oberservers and observables works, too. Of course that introduces some kind of global state.
Real-life implementations usually go in the general direction of leveraging C++ destructors for deregistration. Have a look at Qt’s signal/slot mechanism, for example.
I've been playing around with a way to handle function-objects and I'm curious about if I'm misunderstanding something, possibly creating a problem further down the line. I ask, because I want to see this approach work, and understand better what I'm doing.
The idea is quite simple: I create a virtual access pointer like this:
struct access { virtual void tick() = 0; };
The obj (out of many different) the access pointer will work on:
struct obj1 :public access
{ void tick() { cout << "I'm being called " << endl; } };
And I have the "controller" class that will (eventually) hold and control the flow of the actions:
class container
{
vector<access*>acc_objects; //do these exists by themselves?
public:
void obj_add(access &x) { acc_objects.push_back(&x); }
void tick() { for (auto x : acc_objects)x->tick(); } //active functors
}controller; //there will be only 1
The goal (eventually) will be to send a pointer to various objects, to a vector of access pointers, thereby being able to call them as if they were the same. Either way, whats confusing me at this point is when I do this:
{ //in scope
obj1 anobj;
conts.obj_add(anobj);
int x=1; //x in scope
}
conts.tick(); //referrign to the object still works
cout<<x<<endl; //but this does not
Why can I still refer to obj1? Is this just undefined behaviour, or does the object exist by reference only somehow. Im also asking, because finding a way to have functor-objects exist and vanish reliably is something I'm trying to accomplish.
Edit: cleaned up the question, took out some irrelevant code from the example
Your bug is not object slicing per se.
{
obj1 anobj;
// ...
}
Here, anobj goes out of scope, and gets destroyed at the end of the scope.
Meanwhile, inside the scope, a pointer to anobj gets saved in your container. Referencing the object via the pointer, after the scope has ended results in a reference to a destroyed object.
Nothing was sliced anywhere. The entire object was destroyed, and the subsequent reference to the object comprises undefined behavior.
This question already has answers here:
How can I determine if a C++ object has been deallocated?
(6 answers)
Closed 4 years ago.
Consider this program:
int main()
{
struct test
{
test() { cout << "Hello\n"; }
~test() { cout << "Goodbye\n"; }
void Speak() { cout << "I say!\n"; }
};
test* MyTest = new test;
delete MyTest;
MyTest->Speak();
system("pause");
}
I was expecting a crash, but instead this happened:
Hello
Goodbye
I say!
I'm guessing this is because when memory is marked as deallocated it isn't physically wiped, and since the code references it straight away the object is still to be found there, wholly intact. The more allocations made before calling Speak() the more likely a crash.
Whatever the reason, this is a problem for my actual, threaded code. Given the above, how can I reliably tell if another thread has deleted an object that the current one wants to access?
There is no platform-independent way of detecting this, without having the other thread(s) set the pointer to NULL after they've deleted the object, preferably inside a critical section, or equivalent.
The simple solution is: design your code so that this can't occur. Don't delete objects that might be needed by other threads. Clear up shared resource only once it's safe.
I was expecting a crash, but instead
this happened:
That is because Speak() is not accessing any members of the class. The compiler does not validate pointers for you, so it calls Speak() like any other function call, passing the (deleted) pointer as the hidden 'this' parameter. Since Speak() does not access that parameter for anything, there is no reason for it to crash.
I was expecting a crash, but instead this happened:
Undefined Behaviour means anything can happen.
Given the above, how can I reliably tell if another thread has deleted an object that the current one wants to access?
How about you set the MyTest pointer to zero (or NULL). That will make it clear to other threads that it's no longer valid. (of course if your other threads have their own pointers pointing to the same memory, well, you've designed things wrong. Don't go deleting memory that other threads may use.)
Also, you absolutely can't count on it working the way it has. That was lucky. Some systems will corrupt memory immediately upon deletion.
Despite it's best to improve the design to avoid access to a deleted object, you can add a debug feature to find the location where you access deleted objects.
Make all methods and the destructor virtual.
Check that your compiler creates an object layout where the pointer to
the vtable is in front of the object
Make the pointer to the vtable invalid in the destructor
This dirty trick causes that all functions calls reads the address where the pointer points to and cause a NULL pointer exception on most systems. Catch the exception in the debugger.
If you hesitate to make all methods virtual, you can also create an abstract base class and inherit from this class. This allows you to remove the virtual function with little effort. Only the destructor needs to be virtual inside the class.
example
struct Itest
{
virtual void Speak() = 0;
virtual void Listen() = 0;
};
struct test : public Itest
{
test() { cout << "Hello\n"; }
virtual ~test() {
cout << "Goodbye\n";
// as the last statement!
*(DWORD*)this = 0; // invalidate vtbl pointer
}
void Speak() { cout << "I say!\n"; }
void Listen() { cout << "I heard\n"; }
};
You might use reference counting in this situation. Any code that dereferences the pointer to the allocated object will increment the counter. When it's done, it decrements. At that time, iff the count hits zero, deletion occurs. As long as all users of the object follow the rules, nobody access the deallocated object.
For multithreading purposes I agree with other answer that it's best to follow design principles that don't lead to code 'hoping' for a condition to be true. From your original example, were you going to catch an exception as a way to tell if the object was deallocated? That is kind of relying on a side effect, even if it was a reliable side effect which it's not, which I only like to use as a last resort.
This is not a reliable way to "test" if something has been deleted elsewhere because you are invoking undefined behavior - that is, it may not throw an exception for you to catch.
Instead, use std::shared_ptr or boost::shared_ptr and count references. You can force a shared_ptr to delete it's contents using shared_ptr::reset(). Then you can check if it was deleted later using shared_ptr::use_count() == 0.
You could use some static and runtime analyzer like valgrind to help you see these things, but it has more to do with the structure of your code and how you use the language.
// Lock on MyTest Here.
test* tmp = MyTest;
MyTest = NULL;
delete tmp;
// Unlock MyTest Here.
if (MyTest != NULL)
MyTest->Speak();
One solution, not the most elegant...
Place mutexes around your list of objects; when you delete an object, mark it as null. When you use an object, check for null. Since access is serialized, you'll have a consistent operation.
Consider the following situation:
SomeType *sptr = someFunction();
// do sth with sptr
I am unaware of the internals of someFunction(). Its pretty obvious that the pointer to the object which someFunction() is returning must be either malloc'ed or be a static variable.
Now, I do something with sptr, and quit. clearly the object be still on the heap which is possibly a source of leak.
How do I avoid this?
EDIT:
Are references more safer than pointers.
Do the destructor for SomeType would be called if I do :
{
SomeType &sref = *sptr;
}
Any insights.
You need to read the documentation on someFunction. someFunction needs to clearly define the ownership of the returned pointer (does the caller own it and need to call delete or does someFunction own it and will make sure the the object is destructed sometime in the future).
If the code does not document it's behavior, there is no safe way to use it.
What do you mean by quit? End the process? Your heap is usually destroyed when the process is destroyed. You would only get a leak potential after quitting the process if your asked the operating system to do something for you (like get a file or window handle) and didn't release it.
Also, functions that return pointers need to document very well whose responsibility it is to deallocate the pointer target (if at all), otherwise, you can't know whether you need to delete it yourself or you could delete it by accident (a disaster if you were not meant to do so).
If the documentation of the function doesn't tell you what to do, check the library documentation - sometimes a whole library takes the same policy rather than documenting it in each and every function. If you can't find the answer anywhere, contact the author or give up on the library, since the potential for errors is not worth it, IMHO.
In my experience most functions that return a pointer either allocate it dynamically or return a pointer that is based on the input parameter. In this case, since there are no arguments, I would bet that it is allocated dynamically and you should delete it when you're done. But programming shouldn't be a guessing game.
It's always a good habit to clean up after yourself, don't presume the OS will do it;
There's a good change your IDE or debugger will report memory leak when you quit your application.
What do you have to do ? Well, it depends, but normally you have to release the memory allocated by someFunction(), and the documentation will probably help you with that, either there's an API to release the memory or you have to do it manually with free or delete.
Max
The library should document this.
Either you delete it explicitly after use or you call some release method which makes sure that object (and any other resources it points to*) doesn't leak.
(given a choice) Unless its a huge (in terms of memory) object, I would rather prefer a return by value. Or pass a reference to the function.
If someFunction returns an object for me it should be normal to have a pair function like someFunctionFree which you'll call to release the resources of the SomeType object. All the things needed should be found in the documentation of someFunction (mainly how the object can be freed or if the object is automatically freed). I personally prefer the corresponding deallocation function (a CreateObject/DestroyObject pair).
As others note it's up to the function to enforce it's ownership assumptions in code. Here's one way to do that using what's known as smart pointers:
#include <iostream>
#include <boost/shared_ptr.hpp>
struct Foo
{
Foo( int _x ) : x(_x) {}
~Foo() { std::cout << "Destructing a Foo with x=" << std::hex << x << "\n"; }
int x;
};
typedef boost::shared_ptr<Foo> FooHandle;
FooHandle makeFoo(int _x = 0xDEADBEEF) {
return FooHandle(new Foo(_x));
}
int main()
{
{
FooHandle fh = makeFoo();
}
std::cout<<"No memory leaks here!\n";
return 0;
}
Yesterday I read some code of a colleague and came across this:
class a_class
{
public:
a_class() {...}
int some_method(int some_param) {...}
int value_1;
int value_2;
float value_3;
std::vector<some_other_class*> even_more_values;
/* and so on */
}
a_class a_instances[10];
void some_function()
{
do_stuff();
do_more_stuff();
memset(a_instances, 0, 10 * sizeof(a_class)); // <===== WTF?
}
Is that legal (the WTF line, not the public attributes)? To me it smells really, really bad...
The code ran fine when compiled with VC8, but it throws an "unexpected exception" when compiled with VC9 when calling a_instances[0].event_more_values.push_back(whatever), but when accessing any of the other members. Any insights?
EDIT: Changed the memset from memset(&a_instances... to memset(a_instances.... Thanks for pointing it out Eduard.
EDIT2: Removed the ctor's return type. Thanks litb.
Conclusion: Thanks folks, you confirmed my suspicion.
This is a widely accepted method for initialization for C structs.
In C++ it doesn't work ofcourse because you can't assume anything about vectors internal structure. Zeroing it out is very likely to leave it in an illegal state which is why your program crashes.
He uses memset on a non-POD class type. It's invalid, because C++ only allows it for the simplest cases: Where a class doesn't have a user declared constructor, destructor, no virtual functions and several more restrictions. An array of objects of it won't change that fact.
If he removes the vector he is fine with using memset on it though. One note though. Even if it isn't C++, it might still be valid for his compiler - because if the Standard says something has undefined behavior, implementations can do everything they want - including blessing such behavior and saying what happens. In his case, what happens is probably that you apply memset on it, and it would silently clear out any members of the vector. Possible pointers in it, that would point to the allocated memory, will now just contain zero, without it knowing that.
You can recommend him to clear it out using something like this:
...
for(size_t i=0; i < 10; i++)
objects[i].clear();
And write clear using something like:
void clear() {
a_object o;
o.swap(*this);
}
Swapping would just swap the vector of o with the one of *this, and clear out the other variables. Swapping a vector is especially cheap. He of course needs to write a swap function then, that swaps the vector (even_more_values.swap(that.even_more_values)) and the other variables.
I am not sure, but I think the memset would erase internal data of the vector.
When zeroing out a_instances, you also zero out the std_vector within. Which probably allocates a buffer when constructed. Now, when you try to push_back, it sees the pointer to the buffer being NULL (or some other internal member) so it throws an exception.
It's not legitimate if you ask. That's because you can't overload writing via pointers as you can overload assignment operators.
The worst part of it is that if the vector had anything in it, that memory is now lost because the constructor wasn't called.
NEVER over-write a C++ object. EVER. If it was a derived object (and I don't know the specifics of std::vector), this code also over-writes the object's vtable making it crashy as well as corrupted.
Whoever wrote this doesn't understand what objects are and needs you to explain what they are and how they work so that they don't make this kind of mistake in the future.
You shouldn't do memset on C++ objects, because it doesn't call the proper constructor or destructor.
Specifically in this case, the destructor of even_more_values member of all a_instances's elements is not called.
Actually, at least with the members that you listed (before /* and so on */), you don't need to call memset or create any special destructor or clear() function. All these members are deleted automatically by the default destructor.
You should implement a method 'clear' in your class
void clear()
{
value1=0;
value2=0;
value_3=0f;
even_more_values.clear();
}
What you have here might not crash, but it probably won't do what you want either! Zeroing out the vector won't call the destructor for each a_class instance. It will also overwrite the internal data for a_class.even_more_values (so if your push_back() is after the memset() you are likely to get an access violation).
I would do two things differently:
Use std::vector for your storage both in a_class and in some_function().
Write a destructor for a_class that cleans up properly
If you do this, the storage will be managed for you by the compiler automatically.
For instance:
class a_class
{
public:
a_class() {...}
~a_class() { /* make sure that even_more_values gets cleaned up properly */ }
int some_method(int some_param) {...}
int value_1;
int value_2;
float value_3;
std::vector<some_other_class*> even_more_values;
/* and so on */
}
void some_function()
{
std::vector<a_class> a_instances( 10 );
// Pass a_instances into these functions by reference rather than by using
// a global. This is re-entrant and more likely to be thread-safe.
do_stuff( a_instances );
do_more_stuff( a_instances );
// a_instances will be cleaned up automatically here. This also allows you some
// weak exception safety.
}
Remember that if even_more_values contains pointers to other objects, you will need to delete those objects in the destructor of a_class. If possible, even_more_values should contain the objects themselves rather than pointers to those objects (that way you may not have to write a destructor for a_class, the one the compiler provides for you may be sufficient).