Code:
#include <cstdio>
#include <new>
struct Foo {
char ch;
~Foo() { ++ch; }
};
int main() {
static_assert(sizeof(Foo) == 1);
char buffer;
auto const* pc = new (&buffer) Foo{42};
// Change value using only the const pointer
std::printf("%d\n", +buffer);
pc->~Foo();
std::printf("%d\n", +buffer);
}
I am not causing any UB as far as I can tell, but GCC and Clang disagree on the result. I think the output should obviously be "42 43". That is the case for Clang, but GCC thinks the output is "42 0". How is that possible? Who zeros out the buffer? Am I missing something?
Your code has undefined behavior. The storage for buffer has been reused for the Foo object you created, so its lifetime has ended and you can no longer use it. The relevant section of the standard is [basic.life]/1, with 1.5 being the relevant subsection.
The lifetime of an object o of type T ends when: [...]
the storage which the object occupies is released, or is reused by an object that is not nested within o ([intro.object]).
In your final line, the lvalue buffer doesn't access any object.
The char object that was there initially had its lifetime ended by reusing its storage for a Foo. The Foo had its lifetime ended by invoking the destructor. And no one created any object in the storage after that.
lvalue-to-rvalue conversion (which is what +buffer does, but passing buffer as an argument to a variadic function would too) is not permitted where no object exists.
§6.7.3.5
A program may end the lifetime of any object by reusing the
storage which the object occupies ...[cut]
You are accessing buffer after its lifetime has ended.
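For comparison, here is a minimal sketch of how the snippet could stay within the lifetime rules quoted above. It assumes the goal is only to observe a value before and after the destructor runs. The names run and Observed and the value 7 are illustrative and do not come from the question.

```cpp
#include <new>

struct Foo {
    char ch;
    ~Foo() { ++ch; }
};

// Hypothetical helper type: the two values observed by run().
struct Observed { int during; int after; };

Observed run() {
    alignas(Foo) char buffer;
    // Placement new ends the lifetime of the char object and starts a Foo.
    Foo* p = new (&buffer) Foo{42};
    int during = p->ch;   // OK: reads the live Foo through its own pointer
    p->~Foo();            // Foo's lifetime ends; no object lives in the storage now
    // Start the lifetime of a new char before reading the storage as a char.
    char* c = new (&buffer) char{7};
    int after = *c;       // OK: reads the newly created char
    return {during, after};
}
```

The point of the sketch is that every read goes through a pointer to an object that is alive at that moment, rather than through the name of an object whose storage has been reused.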
Related
I'm wondering if the following is undefined?
int main()
{
struct Doggy { int a; ~Doggy() {} };
Doggy* p = new Doggy[100];
p[50].~Doggy();
p[50].a = 3; // Is this not allowed? The destructor was called on an
// object occupying that area of memory.
// Can I access it safely?
if (p[50].a == 3);
}
I guess this is generally good to know, but the reason I specifically want to know is that I have a data structure consisting of an array whose buckets can be made null by setting a value, much like the buckets of a hash table. When a bucket is emptied, the destructor is called; I then check and set the null state after the destructor call, and I'm wondering whether that is illegal.
To elaborate a little, say I have an array of objects and each object can be made to represent null in each bucket, such as:
struct Handle
{
int value = 0; // Zero is null value
~Handle(){}
};
int main()
{
Handle* p = new Handle[100];
// Remove object 50
p[50].~Handle();
p[50].value = 0; // Set to null
if (p[50].value == 0) ; // Then it's null, can I count on this?
// Is this defined? I'm accessing memory that was occupied by
// object that was destroyed.
}
Yes it'll be UB:
[class.dtor/19]
Once a destructor is invoked for an object, the object's lifetime ends; the behavior is undefined if the destructor is invoked for an object whose lifetime has ended ([basic.life]).
[Example 2: If the destructor for an object with automatic storage duration is explicitly invoked, and the block is subsequently left in a manner that would ordinarily invoke implicit destruction of the object, the behavior is undefined. — end example]
Calling p[50].~Handle(); and later delete[] p; invokes the destructor on an object whose lifetime has already ended.
For p[50].value = 0; after the lifetime of the object has ended, this applies:
[basic.life/6]
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways. For an object under construction or destruction, see [class.cdtor]. Otherwise, such a pointer refers to allocated storage ([basic.stc.dynamic.allocation]), and using the pointer as if the pointer were of type void* is well-defined. Indirection through such a pointer is permitted but the resulting lvalue may only be used in limited ways, as described below. The program has undefined behavior if:
6.2 - the pointer is used to access a non-static data member or call a non-static member function of the object
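One way to stay within these rules, sketched here under the assumption that an emptied bucket should simply hold a null Handle, is to construct a new object in the slot right after destroying the old one. The helper names clear_bucket and demo are hypothetical.

```cpp
#include <new>

struct Handle {
    int value = 0;   // zero is the null value
    ~Handle() {}
};

// Hypothetical helper: empties a bucket and leaves a live, null Handle behind.
void clear_bucket(Handle* bucket) {
    bucket->~Handle();                            // end the old object's lifetime
    ::new (static_cast<void*>(bucket)) Handle{};  // start a new one in the same storage
}

bool demo() {
    Handle* p = new Handle[100];
    p[50].value = 7;
    clear_bucket(&p[50]);
    bool is_null = (p[50].value == 0);  // accesses the newly constructed Handle
    delete[] p;   // every element is a live Handle, so nothing is destroyed twice
    return is_null;
}
```

If the destructor has no work to do, assigning a null value with p[50] = Handle{}; and skipping the explicit destructor call would likely be simpler still.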
In practice, mostly yes. Handle::value is just an offset from a pointer of type Handle, so the access is going to work wherever you point it, even if the containing object isn't currently constructed. If you were to use anything with the virtual keyword, though, this would end up broken.
p[50].~Handle(); this however is a different beast. You should never invoke destructors manually unless you have also explicitly invoked the constructor with placement new. Still not illegal, but dangerous.
delete[] p; (omitted in your example!) is where you end up with double-destruction, at which point you are well beyond UB, straight up in the "it's broken" domain.
It is possible to return a pointer from a function, and this is useful for lots of reasons, but is it advisable to take a reference from that return value?
#include <iostream>
#include <memory>
using namespace std;
unique_ptr<int> give_a_number(){
return std::make_unique<int>(6);
}
int main(){
int& ret = *give_a_number();
return ret;
}
According to the sanitizer, no leaks occur :
user#comp: Scrapbook$ g++ main.cpp -fsanitize=address -g
user#comp: Scrapbook$
So the reference is cleaned up somehow, but I fail to understand why.
How does the clean-up happen, and how is the unique_ptr even able to keep track of the data?
Should this be considered safe behavior?
Is this an accepted usage in the community?
There is no leak here. In general an object owned by a std::unique_ptr is never leaked, as long as .release() is not called on the owning std::unique_ptr.
Your program does, however, have undefined behavior because the lifetimes are not correct. If you store a reference or pointer to the owned object, you take over some responsibility to ensure correct object lifetimes.
give_a_number() returns, by-value, a std::unique_ptr owning the int object. This std::unique_ptr is therefore materialized as a temporary object in the statement
int& ret = *give_a_number();
You don't move this temporary into any persistent std::unique_ptr instance. So when the temporary object is destroyed at the end of the statement, the int it owns is destroyed as well. Now your reference is dangling.
You then use the dangling reference in return ret;, causing undefined behavior.
You can store a reference to the owned object, but if you do so, you must ensure that the std::unique_ptr instance owning the object outlives the int reference to avoid lifetime issues. E.g. the following are fine:
int main(){
return *give_a_number(); // `std::unique_ptr` temporary outlives return value initialization
}
int main(){
auto ptr = give_a_number(); // `ptr` lives until `main` returns
int& ret = *ptr;
return ret;
}
Usually it is safer to just store the smart pointer in an automatic variable as above and obtain the object by dereferencing with *ptr, where needed. The std::unique_ptr and the object it owns then live until the end of the block, if it is not moved from.
But it is perfectly correct to e.g. pass *ptr to a function by-reference. Since ptr outlives such a function call, there will not be any problem.
This also works if you pass *give_a_number() directly as argument to a function, because the std::unique_ptr temporary will outlive such a call as well. But in that case the std::unique_ptr and the object it owns will live only until the end of the statement (full-expression), not the end of the block.
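The two by-reference cases just described can be sketched as follows. The consumer function twice and the wrappers from_named_owner and from_temporary are hypothetical names used only for illustration.

```cpp
#include <memory>

std::unique_ptr<int> give_a_number() {
    return std::make_unique<int>(6);
}

// Hypothetical consumer that takes the owned int by reference.
int twice(const int& n) { return 2 * n; }

int from_named_owner() {
    auto ptr = give_a_number();  // the owner lives until this function returns
    int& ref = *ptr;             // valid for as long as ptr owns the int
    return twice(ref);
}

int from_temporary() {
    // The temporary unique_ptr lives until the end of the full-expression,
    // so it outlives the call to twice.
    return twice(*give_a_number());
}
```

In both functions the reference is only used while some std::unique_ptr still owns the int, which is what the original snippet in the question fails to guarantee.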
The unique_ptr lives only within the expression that calls give_a_number. The int it points to is destroyed before the return statement.
One may see how it behaves by using non-POD A with logging destructor and constructor calls.
struct A
{
A() { std::cout << "A created\n"; }
~A() { std::cout << "A destroyed\n"; }
};
unique_ptr<A> give_a_number(){
return std::unique_ptr<A>(new A());
}
int main(){
std::cout << "start\n";
A& ret = *give_a_number();
std::cout << "reference to removed object here\n";
return 0;
}
Output is:
start
A created
A destroyed
reference to removed object here
I don't know why the sanitizer does not print an alert in your case, because mine does when run on your code. I use g++ 5.3.1:
==21820==ERROR: AddressSanitizer: heap-use-after-free on address 0x60200000eff0 at pc 0x000000400bd5 bp 0x7ffee5f3cac0 sp 0x7ffee5f3cab8
READ of size 4 at 0x60200000eff0 thread T0
I have two code segments which I expected the same outcome:
First one:
SomeClass somefunc(...){
SomeClass newObj;
//some codes modify this object
return newObj;
}
int main(){
SomeClass *p;
p = &(somefunc(...));
}
Second one:
SomeClass *somefunc(...){
SomeClass newObj;
//some codes modify this object
return &newObj;
}
int main(){
SomeClass *p;
p = somefunc(...);
}
Why do I get a "taking the address of a temporary object" error when I try to build the first code segment, while the second code segment doesn't produce an error?
Before you even think about this, you need to learn the rules of temporary lifetime.
The broad case is that a temporary object is destroyed at the end of the full-expression creating it. The implication is that if
SomeClass *p;
p = &(somefunc(...));
were allowed to work, p would be a dangling pointer, targeting an object that no longer exists.
The big exception to the above rule is that when a reference with automatic lifetime is directly bound to the temporary object, the lifetime of the temporary is extended to be equal to the lifetime of the reference. Note that this does not cover const T& make_lvalue(const T& t) { return t; } because the reference isn't binding directly, nor class member references.
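A small sketch can make the two rules observable by counting live objects. The Tracker type and the helper functions are hypothetical and exist only to show when the temporary is destroyed.

```cpp
struct Tracker {
    static int alive;   // number of live Tracker objects
    Tracker() { ++alive; }
    Tracker(const Tracker&) { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

Tracker make_tracker() { return Tracker(); }

// Counts live Trackers while a reference is bound directly to the temporary.
int alive_while_bound() {
    const Tracker& ref = make_tracker();  // the temporary's lifetime is extended
    (void)ref;
    return Tracker::alive;                // the temporary still exists here
}

// Counts live Trackers after a full-expression that merely created a temporary.
int alive_after_expression() {
    make_tracker();                       // destroyed at the end of this statement
    return Tracker::alive;
}
```

Here alive_while_bound() sees one live object and alive_after_expression() sees none, which matches the general rule and its exception.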
There are a few cases which are completely safe, in which the address of the temporary is only used immediately and not stored for later. e.g.
memcpy(buffer, &f(), sizeof(decltype(f())));
Of course, this results in the "address of a temporary" error you've encountered, but you can work around it via
memcpy(buffer, std::addressof(f()), sizeof(decltype(f())));
But do NOT store the resulting pointer.
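Since C++17 the overload of std::addressof for rvalues is deleted, so a more portable way to get the same effect may be to bind the result to a const reference first and take the address of that. The sketch below assumes a trivially copyable return type; Pod, f and copy_result are hypothetical names.

```cpp
#include <cstring>

struct Pod { int a; int b; };   // hypothetical trivially copyable type

Pod f() { return Pod{1, 2}; }   // hypothetical function returning by value

void copy_result(unsigned char* buffer) {
    // Binding to a const reference extends the temporary's lifetime to the
    // end of this scope, so taking its address is allowed.
    const Pod& tmp = f();
    std::memcpy(buffer, &tmp, sizeof tmp);
}
```

The same caveat applies: the address is only meaningful inside copy_result and should not be stored beyond it.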
The first snippet does rightfully not compile because, as the compiler said, you cannot take the address of a temporary object because it would be destroyed at the end of the expression (here: the assignment). Thus, saving its address would be meaningless.
The second snippet does compile, but is still incorrect although it might seem to work for the reasons stated here (at least if you try to access the object through the pointer).
The first example does not compile because somefunc returns a value, and you attempt to take the address of this temporary thing it returns. This would work:
SomeClass* p;
SomeClass val = somefunc (...);
p = &val;
The second example does not compile -- or shouldn't -- because somefunc is supposed to return a SomeClass, but instead it returns a pointer to a SomeClass. Make it return SomeClass* and then it should compile -- but now you're returning a pointer to a local variable, which no longer exists after you leave the function. The best solution is the first example, as patched here.
I recently saw a piece of code which used storage buffers to create objects and then simply swapped the buffers in order to avoid the copying overhead. Here is a simple example using integers:
std::aligned_storage_t<sizeof(int), alignof(int)> storage1;
std::aligned_storage_t<sizeof(int), alignof(int)> storage2;
new (&storage1) int(1);
new (&storage2) int(2);
std::swap(storage1, storage2);
int i1 = reinterpret_cast<int&>(storage1);
int i2 = reinterpret_cast<int&>(storage2);
//this prints 2 1
std::cout << i1 << " " << i2 << std::endl;
This feels like undefined behaviour in the general case (specifically swapping the buffers and then accessing the objects as if they were still there) but I am not sure what the standard says about such usage of storage and placement new. Any feedback is much appreciated!
I suspect there are a few factors rendering this undefined, but we only need one:
[C++11: 3.8/1]: [..] The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
All subsequent use is use after end-of-life, which is bad and wrong.
The key is that each buffer is being reused.
So, although I would expect this to work in practice at least for trivial types (and for some classes), it's undefined.
The following may have been able to save you:
[C++11: 3.8/7]: If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object [..]
…except that you are not creating a new object.
It may or may not be worth noting here that, surprisingly, the ensuing implicit destructor calls are both well-defined:
[C++11: 3.8/8]: If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2), or automatic (3.7.3) storage duration and if T has a non-trivial destructor, the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined.
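For comparison, here is a sketch of a variant that avoids the lifetime problem by keeping the pointers that placement new returns and swapping the objects instead of the raw storage. It uses plain aligned byte arrays rather than std::aligned_storage_t, and the names TwoInts and swap_in_buffers are hypothetical.

```cpp
#include <new>
#include <utility>

struct TwoInts { int first; int second; };

TwoInts swap_in_buffers() {
    alignas(int) unsigned char storage1[sizeof(int)];
    alignas(int) unsigned char storage2[sizeof(int)];

    // Keep the pointers that placement new returns; they point to the int objects.
    int* p1 = new (storage1) int(1);
    int* p2 = new (storage2) int(2);

    // Swap the objects themselves, not the bytes of the buffers.
    std::swap(*p1, *p2);

    return {*p1, *p2};
}
```

For types that are expensive to copy or move, swapping the two pointers p1 and p2 would avoid touching the objects at all, which seems closer to what the original code was trying to achieve.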
I started to implement an ID based memory pool, where every element has an id, which is basically an index in a vector. In this special case I know the index before I construct the object itself so I thought I set the ID before I call the constructor.
Some details
Allocating an object from an ID based pool is the following:
allocate a free id from the pool
get a memory address based on the id value
construct the object on the memory address
set the ID member of the object
and the deallocation is based on that id
here is the code (thanks jrok):
#include <new>
#include <iostream>
struct X
{
X()
{
// id come from "nothing"
std::cout << "X constructed with id: " << id << std::endl;
}
int id;
};
int main()
{
void* buf = operator new(sizeof(X));
// can I set the ID before the constructor call
((X*)buf)->id = 42;
new (buf) X;
std::cout << ((X*)buf)->id;
}
EDIT
I found a stock solution for this in boost sandbox:
sandbox Boost.Tokenmap
Can I set a member variable before constructor call?
No, but you can make a base class with an ID that sets the ID within its constructor (and throws an exception if an ID can't be allocated, for example). Derive from that class, and by the time the derived class's constructor body is entered, the ID will already be set. You could also manage ID generation within another class, either within some kind of global singleton, or by passing an ID manager as the first parameter to the constructor.
typedef int Id;
class IdObject{
public:
Id getId() const{
return id;
}
protected:
IdManager* getIdManager() ...
IdObject()
:id(0){
IdManager* manager = getIdManager();
id = manager->generateId();
if (!id)
throw IdException(); // IdException is assumed to be a user-defined exception type
manager->registerId(id, this);
}
~IdObject(){
if (id)
getIdManager()->unregisterId(id, this);
}
private:
Id id;
// Copying is disabled: both operations are private and never called.
IdObject& operator=(IdObject &other){
return *this;
}
IdObject(IdObject &other)
:id(0){
}
};
class DerivedObject: public IdObject{
public:
DerivedObject(){
//at this point, id is set.
}
};
This kind of thing.
Yes, you can do what you're doing, but it's really not a good idea. According to the standard, your code invokes Undefined Behaviour:
3.8 Object lifetime [basic.life]
The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization
if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial
default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. —
end note ] The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
Before the lifetime of an object has started but after the storage which the object will occupy has been
allocated or, after the lifetime of an object has ended and before the storage which the object occupied is
reused or released, any pointer that refers to the storage location where the object will be or was located
may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise,
such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*,
is well-defined. Such a pointer may be dereferenced but the resulting lvalue may only be used in limited
ways, as described below. The program has undefined behavior if:
— the pointer is used to access a non-static data member or call a non-static member function of the
object
When your code invokes Undefined Behaviour, the implementation is allowed to do anything it wants to. In most cases nothing will happen - and if you're lucky your compiler will warn you - but occasionally the result will be unexpectedly catastrophic.
You describe a pool of N objects of the same type, using a contiguous array as the underlying storage. Note that in this scenario you do not need to store an integer ID for each allocated object - if you have a pointer to the allocated object, you can derive the ID from the offset of the object within the array like so:
struct Object
{
};
const int COUNT = 5; // allow enough storage for COUNT objects
alignas(Object) char storage[sizeof(Object) * COUNT]; // suitably aligned raw storage
// interpret the storage as an array of Object
Object* pool = static_cast<Object*>(static_cast<void*>(storage));
Object* p = pool + 3; // get a pointer to the slot at index 3 (the fourth slot)
int id = static_cast<int>(p - pool); // recover the ID 3 from the pointer offset
No, you cannot set anything in an object before its constructor is called. However, you have a couple of choices:
Pass the ID to the constructor itself, so it can store the ID in the object.
Allocate extra memory in front of the object being constructed, store the ID in that extra memory, then have the object access that memory when needed.
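The first choice can be sketched as follows, reusing the X type from the question but giving it a constructor parameter. The helper construct_with_id is a hypothetical name, and the raw allocation merely stands in for the pool's storage.

```cpp
#include <new>

struct X {
    // The ID is known before construction, so it is passed in.
    explicit X(int id_) : id(id_) {}
    int id;
};

int construct_with_id(int id) {
    void* buf = operator new(sizeof(X));  // stands in for the pool slot
    X* x = new (buf) X(id);   // the object starts its lifetime with the ID already set
    int seen = x->id;         // safe: x points to a fully constructed X
    x->~X();
    operator delete(buf);
    return seen;
}
```

With this shape the constructor body can already read id, so nothing has to be written into the storage before the object exists.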
If you know the object's to-be address, which is the case for your scenario, then yes you can do that kind of thing. However, it is not well-defined behaviour, so it's most probably not a good idea (and in every case not good design). Although it will probably "work fine".
Using a std::map as suggested in a comment above is cleaner and has no "ifs" and "whens" of UB attached.
Although writing to a known memory address will probably "work fine", an object doesn't exist before its constructor has run, so using any of its members is bad mojo.
Anything is possible. It is unlikely that any compiler actually does this, but the compiler might, for example, memset the object's storage to zero before running the constructor, so even if you don't set your ID field, it's still overwritten. You have no way of knowing, since what you're doing is undefined.
Is there a reason you want to do this before the constructor call?
Allocating an object from an ID based pool is the following:
1) allocate a free id from the pool
2) get a memory address based on the id value
3) construct the object on the memory address
4) set the ID member of the object and the deallocation is based on that id
According to your steps, you are setting the ID after the constructor.
so I thought I set the ID before I call the constructor.
I hate to be blunt, but you need to have a better reason than that to wade into the undefined behaviour territory. Remember, as programmers, there is a lot we're learning all the time and unless there is absolutely no way around it, we need to stay away from minefields, undefined behavior being one of them.
As other people have pointed out, yes you can do it, but that's like saying you can do rm -rf / as root. Doesn't mean you should :)
C makes it easy to shoot yourself in the foot. C++ makes it harder, but when you do, you blow away your whole leg! — Bjarne Stroustrup