Is it really possible to separate storage allocation from object initialization? - c++

From [basic.life/1]:
The lifetime of an object or reference is a runtime property of the object or reference. A variable is said to have vacuous initialization if it is default-initialized and, if it is of class type or a (possibly multi-dimensional) array thereof, that class type has a trivial default constructor. The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
its initialization (if any) is complete (including vacuous initialization) ([dcl.init]),
except that if the object is a union member or subobject thereof, its lifetime only begins if that union member is the initialized member in the union ([dcl.init.aggr], [class.base.init]), or as described in [class.union] and [class.copy.ctor], and except as described in [allocator.members].
From [dcl.init.general/1]:
If no initializer is specified for an object, the object is default-initialized.
From [basic.indet/1]:
When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ([expr.ass]).
[Note 1: Objects with static or thread storage duration are zero-initialized, see [basic.start.static]. — end note]
Consider this C++ program:
int main() {
int i;
i = 3;
return 0;
}
Is initialization performed in the first statement int i; or second statement i = 3; of the function main according to the C++ standard?
I think it is the former, which performs vacuous initialization to an indeterminate value and therefore begins the lifetime of the object (the latter does not perform initialization, it performs assignment to the value 3). If that is so, is it really possible to separate storage allocation from object initialization?

If that is so, is it really possible to separate storage allocation from object initialization?
Yes:
void *ptr = malloc(sizeof(int));
ptr points to allocated storage, but no objects live in that storage (C++20 says that some objects may be there, but nevermind that now). Objects won't exist unless we create some there:
new(ptr) int;

You are getting confused between allocating storage for an object and initializing an object, and they are definitely not the same thing.
In your example, the object i is never initialized. As a local value, it has space reserved for its storage, but it is not initialized with any value.
The second line’s statement assigns a value of 3 to it. This again is not initialization.
Objects with global storage are required by the standard to be both allocated and initialized (to zero or whatever the default initializer does). All other objects are only initialized if the written language construct can support it.
C++ allocators work on this same distinction. The new operator, behind the scenes, both allocates and initializes objects, after which you may assign the object a new value. If you need to, though, you can use the underlying language constructs to manage the two things separately.
For most purposes you do not need to care about the difference between object initialization and assignment in your code. If you get to the point where it matters, you either already know how the concepts differ or need to learn really quickly.

Is initialization performed in the first statement int i; or second statement i = 3; of the function main according to the C++ standard?
The first. The second statement is assignment, not initialization. The second statement marks the point "until that value is replaced ([expr.ass])" from your quotes of the standard.
If [initialization is the first statement] is so, is it really possible to separate storage allocation from object initialization?
Yes, but not in a such a simple example as yours. A common example that comes to mind is a std::vector. Reserving capacity allocates storage space, but that storage is not initialized until an object is added to the vector.
std::vector<int> v; // Allocates and initializes the vector object.
v.reserve(1); // Ensures space has been allocated for an int object.
/*
At this point, the first contained element has space allocated, but has
not yet been initialized. If you want to do nutty things between allocation
and object initialization, this is the place to do it. Note that you are
not allowed to access the allocated space since it belongs to the vector.
You'd have to replicate the inner workings of a vector to do that...
*/
v.emplace_back(3); // Initializes the first contained object.
Quoting the standard
Short version:
There is nothing to quote because the standard does not explicitly prohibit all spurious actions. Compilers avoid spurious actions by their own volition.
Long version:
Strictly speaking, the standard does not guarantee that reserve() does not initialize anything. The requirements imposed on reserve() in [vector.capacity] are more focused on what must be done than on prohibiting spurious activity. The closest it comes to this guarantee is the requirement that the time complexity of reserve() be linear in the size of the container (not in the capacity, but in the current size). This would make it impossible to always initialize everything that was reserved. However, a compiler could still choose to initialize a fixed number of reserved elements, say up to 10 million of them. As long as this limit is fixed, it counts as constant-time complexity, so is allowed by [vector.capacity].
Now let's get real. Compilers are designed to produce fast code, without introducing unnecessary, useless busywork. Compilers do not seek out the possibility of doing additional work simply because the standard does not prohibit it. Except for debug builds, no compiler is going to introduce an initialization when it it not required. The people who view the possibility as something worth considering are language lawyers who lose sight of the big picture. You don't pay for what you don't need. The question to ask here is not "Could you quote the standard supporting that no initialization happens?" but "Could you quote the standard supporting that no initialization is required?" Since the additional work of initialization is not required, it will not happen in practice.
Still, reality means little to some language lawyers, and this question does have that tag. To be thorough, I will demonstrate that it is "possible to separate storage allocation from object initialization" even if you happen to use a pathological, yet standards-compliant, compiler that was over-engineered by masochists. I need only one case to demonstrate "possible", so let's abandon int for a more bizarre, yet fully legal, type.
The sole precondition for reserve() is that the contained type can be move-inserted into the container. This precondition is satisfied by the following class.
class C {
// Default construction is not supported.
C() = delete;
public:
// Move construction is allowed, even outside this class.
C(C &&) = default;
};
I have designed this class to be rather hard to initialize. The only allowed construction is move-construction; in order to initialize an object of this type, you need to already have an object of this type. Who creates the first object? No one. No objects of this type can exist. However, one can still create a vector of these objects (an empty vector, but still a vector).
It is legal to define std::vector<C> v;, and to follow that by a call to v.reserve(1);. This allocates space (1 byte is needed on my system) for an object of type C, and yet there is no possible initialization of this object. QED.

Related

Why static instance constructor not called at second time? [duplicate]

The book Object oriented programming in c++ by Robert Lafore says,
A static local variable has the visibility of an automatic local
variable (that is, inside the function containing it). However, its
lifetime is the same as that of a global variable, except that it
doesn’t come into existence until the first call to the function
containing it. Thereafter it remains in existence for the life of the
program
What does coming into existence after first call of function mean? The storage for static local is allocated at the time program is loaded in the memory.
The storage is allocated before main is entered, but (for example) if the static object has a ctor with side effects, those side effects might be delayed until just before the first time the function is called.
Note, however, that this is not necessarily the case. Constant initialization is only required to happen before that block is entered (not necessarily just as execution "crosses" that definition). Likewise, implementations are allowed to initialize other block-scope static variables earlier than required under some circumstances (if you want to get into the gory details of the circumstances, you can look at [basic.start.init] and [stmt.dcl], but it basically comes down to: as long as it doesn't affect the value with which it's initialized. For example, if you had something like:
int i;
std::cin >> i;
{
static int x = i;
...the implementation wouldn't be able to initialize x until the block was entered, because the value with which it was being initialized wouldn't be known until them. On the other hand, if you had:
{
static int i = 0;
...the implementation could carry out the initialization as early as it wished (and most would/will basically carry out such an initialization at compile time, so it won't involve executing any instructions at run-time at all). Even for less trivial cases, however, earlier initialization is allowed when logically possible (e.g., the value isn't coming from previous execution).
In C++ storage duration of an object (when raw memory gets allocated for it) and lifetime of an object are two separate concepts. The author was apparently referring to the latter one when he was talking about object's "coming into existence".
In general case it is not enough to allocate storage for an object to make it "come into existence". Lifetime of an object with non-trivial initialization begins once its initialization is complete. For example, an object of a class with a non-trivial constructor does not officially "live" until its constructor has completed execution.
Initialization of a static local object is performed when the control passes over the declaration for the very first time. Before that the object does not officially exist, even if the memory for it is already allocated.
Note that the author is not painstakingly precise in his description. It is not sufficient to just call the function containing the declaration. The control has to pass through the declaration of the object for it to begin its lifetime. If the function contains branching, this does not necessarily happen during the very first call to the function.
For object with trivial initialization (like int objects), there's no difference between storage duration and lifetime. For such objects allocating memory is all that needs to be done. But in general case allocating memory alone is not sufficient.
It means that the static variable inside a function doesn't get initialized (by the constructor or the assignment operator) until the first call for that function.
As soon as the function, which contains a static local variable, is called the static local variable is initialized.

Is it legal to construct data members of a struct separately?

class A;
class B;
//we have void *p pointing to enough free memory initially
std::pair<A,B> *pp=static_cast<std::pair<A,B> *>(p);
new (&pp->first) A(/*...*/);
new (&pp->second) B(/*...*/);
After the code above get executed, is *pp guaranteed to be in a valid state? I know the answer is true for every compiler I have tested, but the question is whether this is legal according to the standard and hence. In addition, is there any other way to obtain such a pair if A or B is not movable in C++98/03? (thanks to #StoryTeller-UnslanderMonica , there is a piecewise constructor for std::pair since C++11)
“Accessing” the members of the non-existent pair object is undefined behavior per [basic.life]/5; pair is never a POD-class (having user-declared constructors), so a pointer to its storage out of lifetime may not be used for its members. It’s not clear whether forming a pointer to the member is already undefined, or if the new is.
Neither is there a way to construct a pair of non-copyable (not movable, of course) types in C++98—that’s why the piecewise constructor was added along with move semantics.
A more simple question: is using a literal string well defined?
Not even that is, as its lifetime is not defined. You can't use a string literal in conforming code.
So the committee that never took the time to make string literals well defined obviously did not bother with specifying which objects of class type can be made to exist by placement new of its subobjects - polymorphic objects obviously cannot be created that way!
That standard did not even bother describing the semantic of union.
About lifetime the standard is all over the place, and that isn't just editorial: it reflects a deep disagreement between serious people about what begins a lifetime, what an object is, what an lvalue is, etc.
Notably people have all sorts of false or contradicting intuitions:
an infinite number of objects cannot be created by one call to malloc
an lvalue refers to an object
overlapping objects are against the object model
a unnamed object can only be created by new or by the compiler (temporaries)
...

What is the C++ equivalent of an 'allocated object having no declared type' in C?

I'm writing a memory manager for my VM in C++. Well, more exactly the VM instructions will be compiled into C++ with an embedded memory manager. I'm much more comfortable in handling C, but now I do need native support of exception handling, pretty much the only reason I'm using C++.
Both C and C++ have the strict aliasing rule that two objects of incompatible types shall not overlap, with a small exception in C for unions. But to define the behaviour of memory allocation functions such as malloc, calloc, alloca etc., the C standard has the following paragraph.
6.5-6 The effective type of an object for an access to its stored value is the declared type of the object, if any. Allocated objects
have no declared type. If a value is stored into an object having no
declared type through an lvalue having a type that is not a character
type, then the type of the lvalue becomes the effective type of the
object for that access and for subsequent accesses that do not modify
the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
This effectively makes using raw allocated memory for any type a well defined behaviour in C. I tried to find a similar paragraph in the C++ standard document but could not find one. I think C++ has a different approach in this regard. What is the C++ equivalent of an 'allocated object having no declared type' in C, and how does the C++ standard define it?
For C++, this is is described in Object lifetime [object.life]; in particular:
The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-trivial initialization, its initialization is complete.
Lifetime continues until the storage is reused or the object is destructed:
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly
calling the destructor for an object of a class type with a non-trivial destructor.
This has the rather odd implication that unused allocated storage (returned from operator new) contains an object of every trivial type that fits in that block of storage, at least until the block of storage is used. But then, the standard cares more about getting the treatment of non-trivial types right than this minor wart.
I'm not sure such an analog exists, or is needed in C++. To allocate memory you can do one of three things
Foo f;
This will allocate sizeof(Foo) amount of memory on the stack. The size is known at compile-time, which is how the compiler knows how much space to allocate. The same is true of arrays, etc.
The other option is
Foo* f = new Foo; // or the smart pointer alternatives
This will allocate from the heap, but again sizeof(Foo) must be known, but this allows allocation at run-time.
The third option, as #BasileStarynkevitch mentions, is placement new
You can see that all of these allocation mechanisms in C++ require knowledge of the type that you are allocating space for.
While it is possible to use malloc, etc in C++, it is frowned upon as it goes against the grain for typical C++ semantics. You can see a discussion of a similar question. The other mechanisms of allocating "raw" memory are hacky. For example
void* p = operator new(size);
This will allocate size number of bytes.
I think "an allocated object with no declared type" is effectively allocated storage that is not yet initialised and no object has yet been created in that memory space. From the global allocation function ::operator new(std::size_t) and family (§3.7.4/2);
§3.7.4.1/2
The allocation function attempts to allocate the requested amount of storage. If it is successful, it shall return the address of the start of a block of storage whose length in bytes shall be at least as large as the requested size. There are no constraints on the contents of the allocated storage on return from the allocation function.
The creation of an object, whether it is automatic or dynamic, is governed by two stages, the allocation itself and then the construction of the object in that space.
§3.8/1
The lifetime of an object is a runtime property of the object. An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [Note: initialization by a trivial copy/move constructor is non- vacuous initialization. — end note] The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-trivial initialization, its initialization is complete.
Correspondingly;
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
C++ WD n4527.
I think, the approach of C++ in this regard can be summarized as: "malloc() is evil. Use new instead. There is no such thing as an object without a declared type." Of course, C++ implementations need to define the global operator new(), which is basically the C++ version of malloc() (and which can be provided by the user). The very existence of this operator proves that there is something like objects without a declared type in C++, but the standard won't admit it.
If I were you, I'd take the pragmatic approach. Both the global operator new() and malloc() are available in C++, so any implementation must be able to use their return values sensibly. Especially malloc() will behave identical in C and C++. So, just handle these untyped objects the same way as you would handle them in C, and you should be fine.

Is it safe to create and use vectors during static initialization?

I have C++ code which declares static-lifetime variables which are initialized by function calls. The called function constructs a vector instance and calls its push_back method. Is the code risking doom via the C++ static initialization order fiasco? If not, why not?
Supplementary information:
What's the "static initialization order fiasco"?
It's explained in C++ FAQ 10.14
Why would I think use of vector could trigger the fiasco?
It's possible that the vector constructor makes use of the value of another static-lifetime variable initialized dynamically. If so, then there is nothing to ensure that vector's variable is initialized before I use vector in my code. Initializing result (see code below) could end up calling the vector constructor before vector's dependencies are fully initialized, leading to access to uninitialized memory.
What does this code look like anyway?
struct QueryEngine {
QueryEngine(const char *query, string *result_ptr)
: query(query), result_ptr(result_ptr) { }
static void AddQuery(const char *query, string *result_ptr) {
if (pending == NULL)
pending = new vector<QueryEngine>;
pending->push_back(QueryEngine(query, result_ptr));
}
const char *query;
string *result_ptr;
static vector<QueryEngine> *pending;
};
vector<QueryEngine> *QueryEngine::pending = NULL;
void Register(const char *query, string *result_ptr) {
QueryEngine::AddQuery(query, result_ptr);
}
string result = Register("query", &result);
Fortunately, static objects are zero-initialised even before any other initialisation is performed (even before the "true" initialisation of the same objects), so you know that the NULL will be set on that pointer long before Register is first invoked.1
Now, in terms of operating on your vector, it appears that (technically) you could run into such a problem:
[C++11: 17.6.5.9/3]: A C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.
[C++11: 17.6.5.9/4]: [Note: This means, for example, that implementations can’t use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. —end note]
Notice that, although synchronisation is being required in this note, that's been mentioned within a passage that ultimately acknowledges that static implementation details are otherwise allowed.
That being said, it seems like the standard should further state that user code should avoid operating on standard containers during static initialisation, if the intent were that the semantics of such code could not be guaranteed; I'd consider this a defect in the standard, either way. It should be clearer.
1 And it is a NULL pointer, whatever the bit-wise representation of that may be, rather than a blot to all-zero-bits.
vector doesn't depend on anything preventing its use in dynamic initialisation of statics. The only issue with your code is a lack of thread safety - no particular reason to think you should care about that, unless you have statics whose construction spawns threads....
Initializing result (see code below) could end up calling the vector constructor before that class is fully initialized, leading to access to uninitialized memory.
No... initialising result calls AddQuery which checks if (pending == NULL) - the initialisation to NULL will certainly have been done before any dynamic initialisation, per 3.6.2/2:
Constant initialization is performed:
...
— if an object with static or thread storage duration is not initialized by a constructor call and if either the object is value-initialized or every full-expression that appears in its initializer is a constant expression
So even if the result assignment is in a different translation unit it's safe. See 3.6.2/2:
Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. Static initialization shall be performed before any dynamic initialization takes place.

Array of pointers member, is it initialized?

If I have a
class A
{
private:
Widget* widgets[5];
};
Is it guaranteed that all pointers are NULL, or do I need to initialize them in the constructor? Is it true for all compilers?
Thanks.
The array is not initialized unless you do it. The standard does not require the array to be initialized.
It is not initialized if it is on the stack or using the default heap allocator (although you can write your own to do so).
If it is a global variable it is zero filled.
This is true for all conformant compilers.
It depends on the platform and how you allocate or declare instances of A. If it's on the stack or heap, you need to explicitly initialize it. If it's with placement new and a custom allocator that initializes memory to zero or you declare an instance at file scope AND the platform has the null pointer constant be bitwise zero, you don't. Otherwise, you should.
EDIT: I suppose I should have stated the obvious which was "don't assume that this happens".
Although in reality the answer is "it depends on the platform". The standard only tells you what happens when you initialize explicitly or at file scope. Otherwise, it is easiest to assume that you are in an environment that will do the exact opposite of what you want it to do.
And if you really need to know (for educational or optimizational purposes), consult the documentation and figure out what you can rely on for that platform.
In general case the array will not be initialized. However, keep in mind that the initial value of the object of class type depends not only on how the class itself is defined (constructor, etc), but might also depend on the initializer used when creating the object.
So, in some particular cases the answer to your question might depend on the initializer you supply when creating the object and on the version of C++ language your compiler implements.
If you supply no initializer, the array will contain garbage.
A* pa = new A;
// Garbage in the array
A a;
// Garbage in the array
If supply the () initializer, in C++98 the array will still contain garbage. In C++03 however the object will be value-initialized and the array will contain null-pointers
A* pa = new A();
// Null-pointers in the array in C++03, still garbage in C++98
A a = A();
// Null-pointers in the array in C++03, still garbage in C++98
Also, objects with static storage duration are always zero-initialized before any other initialization takes place. So, if you define an object of A class with static storage duration, the array will initially contain null-pointers.