Is constructing objects in a char array well-formed - c++

This is almost a standard textbook use of placement new:
template<size_t Len, size_t Align>
class aligned_memory
{
public:
aligned_memory() : data((char*)(((std::uintptr_t)mem + Align - 1) & -Align)) {}
char* get() const {return data;}
private:
char mem[Len + Align - 1];
char* data;
};
template<typename T, size_t N>
class Array
{
public:
Array() : sz(0) {}
void push_back(const T& t)
{
new (data.get() + sz++ * sizeof(T)) T(t);
}
void pop_back()
{
((T*)data.get() + --sz)->~T();
}
private:
aligned_memory<N * sizeof(T), alignof(T)> data;
size_t sz;
};
This seems fine until we look into strict aliasing, where there seems to be some disagreement about whether it is well-formed.
Camp ill-formed
C++'s Strict Aliasing Rule - Is the 'char' aliasing exemption a 2-way street?
Strict aliasing rule and 'char *' pointers
Camp well-formed
Does encapsulated char array used as object breaks strict aliasing rule
How to avoid strict aliasing errors when using aligned_storage
They all agree that a char* may always reference another object, but some point out that it is ill-formed to do so the other way round.
Clearly our char[] converts to char*, which is then cast to T*, through which the destructor is called.
So, does the above program break the strict-aliasing rule? Specifically, where in the standard does it say it is well-formed or ill-formed?
EDIT: as background info, this is written for C++0x, before the advent of alignas and std::launder. Not asking specifically for a C++0x solution, but it is preferred.
alignof is cheating, but it's here for example purposes.

Gathering from the hints throughout the countless helpful comments, here is my interpretation of what's happening.
TLDR: it's well-formed ‡ (see edit)
Quoting in the order I find more logical from [basic.life]†
The properties ascribed to objects and references throughout this International Standard apply for a given object or reference only during its lifetime.
An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its subobjects is initialized by a constructor other than a trivial default constructor. [...] The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-vacuous initialization, its initialization is complete.
The lifetime of an object o of type T ends when:
if T is a class type with a non-trivial destructor , the destructor call starts, or
the storage which the object occupies is released, or is reused by an object that is not nested within o
From [basic.lval]†
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char, unsigned char, or std​::​byte type.
We deduce that
The lifetime of the chars in the char[] ends when another object reuses that space.
The lifetime of an object of type T started when push_back is called.
Since the address ((T*)data.get() + --sz) is always that of an object with type T whose lifetime has started and not yet ended, it is valid to call ~T() with it.
During this process, the char[] and char* in aligned_memory alias objects of type T, but it is legal to do so. Also, no glvalue is obtained from them, so they could have been pointers of any type.
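To make the last bullet of the [basic.lval] list concrete, here is a minimal sketch of reading an object's bytes through unsigned char glvalues. The helper name same_object_representation is made up for illustration and does not appear in the question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the question): compares two objects byte by
// byte. Every read goes through an unsigned char glvalue, which the
// [basic.lval] list quoted above allows for an object of any type.
template <typename T>
bool same_object_representation(const T& a, const T& b) {
    const unsigned char* pa = reinterpret_cast<const unsigned char*>(&a);
    const unsigned char* pb = reinterpret_cast<const unsigned char*>(&b);
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (pa[i] != pb[i]) return false;
    }
    return true;
}
```

The reverse direction (reading a char buffer through a T glvalue when no T object lives there) is what the rule forbids, which is why the lifetime argument above matters.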
To answer my own question in the comments whether using any memory as storage is also well-formed
U u;
u.~U();
new (&u) T;
((T*)&u)->~T();
new (&u) U;
Following the 4 points above, the answer is yes‡see edit, as long as the alignment of U is not weaker than that of T and U is at least as large as T.
‡ EDIT: I've neglected another paragraph of [basic.life]
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Which means that even though using the object is well-formed, the means by which the pointer to the object is obtained is not. Specifically, from C++17 onwards, std::launder has to be called
(std::launder((T*)data.get()) + --sz)->~T();
Prior to C++17, a workaround would be to use the pointer acquired from the placement new instead
T* p = new (data.get() + sz++ * sizeof(T)) T(t); // store p somewhere
† Quoted from n4659, as far as I can see, same holds for n1905
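Putting the C++17 variant together, a sketch of the container might look like the following. It assumes C++17 (for std::launder) and, unlike the original, uses alignas, which was not available in the question's C++0x setting. The names InplaceArray, slot and element are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical fixed-capacity array; a sketch under the assumptions above.
template <typename T, std::size_t N>
class InplaceArray {
public:
    InplaceArray() = default;
    InplaceArray(const InplaceArray&) = delete;
    InplaceArray& operator=(const InplaceArray&) = delete;
    ~InplaceArray() { while (sz_ > 0) pop_back(); }

    void push_back(const T& t) {
        assert(sz_ < N);
        // Placement new starts the lifetime of a T inside the byte buffer.
        ::new (static_cast<void*>(slot(sz_))) T(t);
        ++sz_;
    }
    void pop_back() {
        assert(sz_ > 0);
        --sz_;
        element(sz_)->~T();
    }
    T& operator[](std::size_t i) {
        assert(i < sz_);
        return *element(i);
    }
    std::size_t size() const { return sz_; }

private:
    // Pointer arithmetic is done on the unsigned char array itself.
    unsigned char* slot(std::size_t i) { return buf_ + i * sizeof(T); }
    // std::launder yields a pointer to the T object living at that address.
    T* element(std::size_t i) {
        return std::launder(reinterpret_cast<T*>(slot(i)));
    }

    alignas(T) unsigned char buf_[N * sizeof(T)];
    std::size_t sz_ = 0;
};
```

Laundering each slot address separately, rather than laundering the first element and adding an offset, avoids relying on pointer arithmetic across T objects that are not elements of a T array.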

Placement-new creates an object at the specified location (C++14 expr.new/1), and ends the lifetime of any other object that was occupying the location (basic.life/1.4).
The code ((T*)data.get() + --sz)->~T(); accesses an object of type T at the location where there is an object of type T. This is fine. It is irrelevant if there used to be a char array at the location.

Related

Does reinterpret_casting std::aligned_storage* to T* without std::launder violate strict-aliasing rules? [duplicate]

This question already has answers here:
Does this really break strict-aliasing rules?
The following example comes from std::aligned_storage page of cppreference.com:
#include <iostream>
#include <type_traits>
#include <string>
template<class T, std::size_t N>
class static_vector
{
// properly aligned uninitialized storage for N T's
typename std::aligned_storage<sizeof(T), alignof(T)>::type data[N];
std::size_t m_size = 0;
public:
// Create an object in aligned storage
template<typename ...Args> void emplace_back(Args&&... args)
{
if( m_size >= N ) // possible error handling
throw std::bad_alloc{};
new(data+m_size) T(std::forward<Args>(args)...);
++m_size;
}
// Access an object in aligned storage
const T& operator[](std::size_t pos) const
{
return *reinterpret_cast<const T*>(data+pos);
}
// Delete objects from aligned storage
~static_vector()
{
for(std::size_t pos = 0; pos < m_size; ++pos) {
reinterpret_cast<T*>(data+pos)->~T();
}
}
};
int main()
{
static_vector<std::string, 10> v1;
v1.emplace_back(5, '*');
v1.emplace_back(10, '*');
std::cout << v1[0] << '\n' << v1[1] << '\n';
}
In the example, operator[] just reinterpret_casts std::aligned_storage* to T* without std::launder and performs an indirection directly. However, according to this question, this seems to be undefined, even if an object of type T has already been created there.
So my question is: does the example program really violate strict-aliasing rules? If it does not, what's wrong with my comprehension?
I asked a related question in the ISO C++ Standard - Discussion forum. I learned the answer from those discussions and am writing it here in the hope of helping anyone else who is confused about this question. I will keep updating this answer according to those discussions.
Before P0137, refer to [basic.compound] paragraph 3:
If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.
and [expr.static.cast] paragraph 13:
If the original pointer value represents the address A of a byte in memory and A satisfies the alignment requirement of T, then the resulting pointer value represents the same address as the original pointer value, that is, A.
The expression reinterpret_cast<const T*>(data+pos) represents the address of the previously created object of type T, and thus points to that object. Indirection through this pointer does yield that object, which is well-defined.
However, after P0137 the definition of a pointer value changed and the first block quote above was deleted. Now refer to [basic.compound] paragraph 3:
Every value of pointer type is one of the following:
a pointer to an object or function (the pointer is said to point to the object or function), or
...
and [expr.static.cast] paragraph 13:
If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
The expression reinterpret_cast<const T*>(data+pos) still points to the object of type std::aligned_storage<...>::type, and indirection yields an lvalue referring to that object, though the type of the lvalue is const T. Evaluation of the expression v1[0] in the example tries to access the value of the std::aligned_storage<...>::type object through the lvalue, which is undefined behavior according to [basic.lval] paragraph 11 (i.e. the strict-aliasing rules):
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in [conv.qual]) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char, unsigned char, or std​::​byte type.
The code doesn't violate the strict aliasing rule in any way. An lvalue of type const T is used to access an object of type T, which is permitted.
The rule in question, as covered by the linked question, is a lifetime rule; C++14 (N4140) [basic.life]/7. The problem is that, according to this rule, the pointer data+pos may not be used to manipulate the object created by placement-new. You're supposed to use the value "returned" by placement-new.
The question naturally follows: what about the pointer reinterpret_cast<T *>(data+pos) ? It is unclear whether accessing the new object via this new pointer violates [basic.life]/7.
The author of the answer you link to assumes (with no justification offered) that this new pointer is still "a pointer that pointed to the original object". However, it seems to me that one can also argue that, being a T *, it cannot point to the original object, which is a std::aligned_storage and not a T.
This shows that the object model is underspecified. The proposal P0137, which was incorporated into C++17, was addressing a problem in a different part of the object model. But it introduced std::launder which is a sort of mjolnir to squash a wide range of aliasing, lifetime and provenance issues.
Undoubtedly the version with std::launder is correct in C++17. However, as far as I can see, P0137 and C++17 don't have any more to say about whether or not the version without launder is correct.
IMHO it is impractical to call the code UB in C++14, which did not have std::launder, because there is no way around the problem other than to waste memory storing all the result pointers of placement-new. If this is UB then it's impossible to implement std::vector in C++14, which is far from ideal.
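For comparison, the pre-C++17 route mentioned above (keeping the value "returned" by placement-new) can be sketched for a single slot as follows. The class name Slot is hypothetical, and the extra T* member is exactly the memory overhead being complained about.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical single-object holder: it never casts the buffer address back
// to T*, it only uses the pointer that placement new returned.
template <typename T>
class Slot {
public:
    Slot() = default;
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;
    ~Slot() { reset(); }

    template <typename... Args>
    T& emplace(Args&&... args) {
        reset();  // destroy any previous occupant first
        ptr_ = ::new (static_cast<void*>(storage_)) T(std::forward<Args>(args)...);
        return *ptr_;
    }
    void reset() {
        if (ptr_ != nullptr) {
            ptr_->~T();
            ptr_ = nullptr;
        }
    }
    T* get() { return ptr_; }

private:
    alignas(T) unsigned char storage_[sizeof(T)];
    T* ptr_ = nullptr;  // one extra pointer per stored object
};
```

Scaling this to N elements means N stored pointers, which is why it is not a realistic way to implement something like std::vector.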

memmove in-place change of effective type (type-punning)

In the following question:
What's a proper way of type-punning a float to an int and vice-versa?, the conclusion is that the way to construct doubles from integer bits and vice versa is via memcpy.
That's fine, and the pseudo_cast conversion method found there is:
template <typename T, typename U>
inline T pseudo_cast(const U &x)
{
static_assert(sizeof(T) == sizeof(U));
T to;
std::memcpy(&to, &x, sizeof(T));
return to;
}
and I would use it like this:
int main(){
static_assert(std::numeric_limits<double>::is_iec559);
static_assert(sizeof(double)==sizeof(std::uint64_t));
std::uint64_t someMem = 4614253070214989087ULL;
std::cout << pseudo_cast<double>(someMem) << std::endl; // 3.14
}
My interpretation from just reading the standard and cppreference is/was that it should also be possible to use memmove to change the effective type in-place, like this:
template <typename T, typename U>
inline T& pseudo_cast_inplace(U& x)
{
static_assert(sizeof(T) == sizeof(U));
T* toP = reinterpret_cast<T*>(&x);
std::memmove(toP, &x, sizeof(T));
return *toP;
}
template <typename T, typename U>
inline T pseudo_cast2(U& x)
{
return pseudo_cast_inplace<T>(x); // return by value
}
The reinterpret cast in itself is legal for any pointer (as long as cv is not violated, item 5 at cppreference/reinterpret_cast). Dereferencing however requires memcpy or memmove (§6.9.2), and T and U must be trivially copyable.
Is this legal? It compiles and does the right thing with gcc and clang.
The source and destination of memmove are explicitly allowed to overlap, according to cppreference on std::memmove and memmove:
The objects may overlap: copying takes place as if the characters were copied to a temporary character array and then the characters were copied from the array to dest.
Edit: originally the question had a trivial error (causing a segfault) spotted by @hvd. Thank you! The question remains the same: is this legal?
C++ does not allow a double to be constructed merely by copying the bytes. An object first needs to be constructed (which may leave its value uninitialised), and only after that can you fill in its bytes to produce a value. This was underspecified up to C++14, but the current draft of C++17 includes in [intro.object]:
An object is created by a definition (6.1), by a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).
Although constructing a double with default-initialisation does not perform any initialisation, the construction does still need to happen. Your first version includes this construction by declaring the local variable T to;. Your second version does not.
You could modify your second version to use placement new to construct a T in the same location that previously held a U object, but in that case, when you pass &x to memmove, it is no longer required to read the bytes that had made up x's value, because the object x has already been destroyed by the earlier placement new.
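One way around that ordering problem, offered as a sketch of my reading rather than a verdict from the standard, is to save the bytes before the placement new and copy them back afterwards. The name repurpose_storage and the particular static_asserts are assumptions for illustration; the original name must not be used once the new object exists.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical helper: reuses the storage of x for a T that carries the same
// object representation. After the call, x must no longer be used by name.
template <typename T, typename U>
T& repurpose_storage(U& x) {
    static_assert(sizeof(T) == sizeof(U), "sizes must match");
    static_assert(alignof(T) <= alignof(U), "storage must be aligned for T");
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_copyable<U>::value,
                  "byte-wise copies require trivially copyable types");
    static_assert(std::is_trivially_destructible<U>::value,
                  "the implicit destructor call for x must be a no-op");

    unsigned char saved[sizeof(U)];
    std::memcpy(saved, &x, sizeof(U));        // read the bytes while x is alive
    T* t = ::new (static_cast<void*>(&x)) T;  // ends x's lifetime, creates a T
    std::memcpy(t, saved, sizeof(T));         // give the T its value
    return *t;
}
```

This costs a temporary buffer in the source, although an optimizer may remove both copies, as with the memcpy version.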
My reading of the standard suggests that both these functions will result in UB.
Consider:
int main()
{
long x = 10;
something_with_x(x*10);
double& y = pseudo_cast_inplace<double>(x);
y = 20;
something_with_y(y*10);
}
Because of the strict alias rule, it seems to me that there's nothing to stop the compiler from reordering instructions to produce code as-if:
int main()
{
long x = 10;
double& y = pseudo_cast_inplace<double>(x);
y = 20;
something_with_x(x*10); // uh-oh!
something_with_y(y*10);
}
I think the only legal way to write this is:
template <typename T, typename U>
inline T pseudo_cast(U&& x)
{
static_assert(sizeof(T) == sizeof(U));
T result;
std::memcpy(std::addressof(result), std::addressof(x), sizeof(T));
return result;
}
This in reality results in the exact same assembler output (i.e. none whatsoever: the entire function is elided, as are the variables themselves), at least on gcc with -O2.
This should be legal in C++20. Example in godbolt.
template <typename T, typename U>
requires (
sizeof(U) >= sizeof(T) and
std::alignment_of_v<T> <= std::alignment_of_v<U> and
std::is_trivially_copyable_v<T> and
std::is_trivially_destructible_v<U>
)
[[nodiscard]] T& reinterpret_object(U& obj)
{
// Get access to object representation
std::byte* bytes = reinterpret_cast<std::byte*>(&obj);
// Copy object representation to temporary buffer.
// Implicitly create a T object in the destination storage. The lifetime of U object ends.
// Copy temporary buffer back.
void* storage = std::memmove(bytes, bytes, sizeof(T));
// Storage pointer value is 'pointer to T object', so we are allowed to cast it to the proper pointer type.
return *static_cast<T*>(storage);
}
reinterpret_cast to a different pointer type is allowed (7.6.1.10)
An object pointer can be explicitly converted to an object pointer of a different type.
Accessing the object representation through an std::byte* pointer is allowed (7.2.1)
If a program attempts to access the stored value of an object through a glvalue whose type is not similar to one of the following types the behavior is undefined
a char, unsigned char, or std​::​byte type.
std::memmove behaves as-if copying to a temporary buffer and can implicitly create objects (21.5.3)
The functions memcpy and memmove are signal-safe.
Both functions implicitly create objects ([intro.object]) in the destination region of storage immediately prior to copying the sequence of characters to the destination.
Implicit object creation is described in (6.7.2)
Some operations are described as implicitly creating objects within a specified region of storage.
For each operation that is specified as implicitly creating objects, that operation implicitly creates and starts the lifetime of zero or more objects of implicit-lifetime types ([basic.types]) in its specified region of storage if doing so would result in the program having defined behavior.
If no such set of objects would give the program defined behavior, the behavior of the program is undefined.
If multiple such sets of objects would give the program defined behavior, it is unspecified which such set of objects is created.
[Note 4: Such operations do not start the lifetimes of subobjects of such objects that are not themselves of implicit-lifetime types.
— end note]
Further, after implicitly creating objects within a specified region of storage, some operations are described as producing a pointer to a suitable created object.
These operations select one of the implicitly-created objects whose address is the address of the start of the region of storage, and produce a pointer value that points to that object, if that value would result in the program having defined behavior.
If no such pointer value would give the program defined behavior, the behavior of the program is undefined.
If multiple such pointer values would give the program defined behavior, it is unspecified which such pointer value is produced.
It is not specified that std::memmove is such a function, or that its returned pointer value would be a pointer to the implicitly created object.
But it makes sense that it is so.
Returning a pointer to the new object is allowed by (7.6.1.9)
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1.
If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified.
Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b.
Otherwise, the pointer value is unchanged by the conversion.
If std::memmove does not return a usable pointer value, std::launder<T>(reinterpret_cast<T*>(bytes)) (17.6.5) should be able to produce such a pointer value.
Additional notes:
I'm not 100% sure whether all the requires clauses are correct or whether some condition is missing.
To get zero overhead, the compiler must optimize the std::memmove away (gcc and clang seem to do it).
The lifetime of the original object ends (6.7.3)
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling a destructor or pseudo-destructor ([expr.prim.id.dtor]) for the object.
This means that using the original name or pointers or references to it will result in undefined behaviour.
The object can be "revived" by reinterpreting it back reinterpret_object<U>(reinterpret_object<T>(obj)) and that should allow using the old references (6.7.3)
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if the original object is transparently replaceable (see below) by the new object.
An object o1 is transparently replaceable by an object o2 if:
the storage that o2 occupies exactly overlays the storage that o1 occupied, and
o1 and o2 are of the same type (ignoring the top-level cv-qualifiers), and
o1 is not a complete const object, and
neither o1 nor o2 is a potentially-overlapping subobject ([intro.object]), and
either o1 and o2 are both complete objects, or o1 and o2 are direct subobjects of objects p1 and p2, respectively, and p1 is transparently replaceable by p2.
The object representations should be "compatible": interpreting the bytes of the original object as bytes of the new one can produce "garbage" or even trap representations.
Accessing a double while the actual type is uint64_t is undefined behavior because the compiler will never consider that an object of type double can share the address of an object of type uint64_t; see [intro.object]:
Unless an object is a bit-field or a base class subobject of zero size, the address of that object is the address of the first byte it occupies.
Two objects a and b with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they have distinct addresses.

What is the dynamic type of the object allocated by malloc?

The C++ standard refers to the term "dynamic type" (and the C standard refers to "effective type" in the similar context), for example
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
But how is the dynamic type of the object allocated with malloc determined?
For example:
void *p = malloc(sizeof(int));
int *pi = (int*)p;
Will the dynamic type of the object pointed to by pi be int?
According to the C++ specification:
Dynamic type:
<glvalue> type of the most derived object (1.8) to which the glvalue denoted by a glvalue expression refers
The return value of malloc is a block of uninitialized storage. No object has been constructed within that storage. And therefore it has no dynamic type.
The void* does not point to an object, and only objects have a dynamic type.
You can create an object within that storage by beginning its lifetime. But until you do so, it's just storage.
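As a sketch of "beginning its lifetime" in malloc'ed storage and ending it again, the following pair of helpers may be useful. The function names are hypothetical, and the static_assert reflects the assumption that malloc only guarantees fundamental alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: obtains raw storage from malloc and creates a T in it.
template <typename T, typename... Args>
T* construct_in_malloc(Args&&... args) {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "malloc does not promise extended alignment");
    void* raw = std::malloc(sizeof(T));  // just storage, no object yet
    if (raw == nullptr) throw std::bad_alloc();
    try {
        // The T object (and its dynamic type) exists from here on.
        return ::new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        std::free(raw);
        throw;
    }
}

// Hypothetical counterpart: ends the lifetime, then releases the storage.
template <typename T>
void destroy_and_free(T* p) {
    if (p != nullptr) {
        p->~T();
        std::free(p);
    }
}
```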
In C, the effective type is only relevant when you access an object. It is then determined by
the declaration type, if it has one
the type of another object of which it is a copy (eg. memcpy)
the type of the lvalue through which it is accessed, e.g if a void* is converted to another pointer type (e.g int*), which then is dereferenced.
The latter is usually what happens with malloced objects, if you assign the return value of malloc to a pointer type.
Dynamic type is a formal term to describe essentially polymorphic objects i.e. ones with at least one virtual function. It is thus a C++ term as C has no concept of virtual, for example.
But how is the dynamic type of the object allocated with malloc determined?
It isn't. malloc allocates N raw bytes of memory and returns it through a void* - it's your job to infer the right type. Moreover, this memory just represents the area where the object is placed, but this object will not be alive till you explicitly call its constructor. (again, from a C++ perspective)
Will the dynamic type of the object pointed to by pi be int?
No, because the term dynamic type is meaningful only when describing objects of class type. int is not one, nor can it be.
class Foo
{
//virtual ~Foo() = default;
virtual void f() {}
};
class Bar : public Foo
{
virtual void f() {}
};
// ...
Foo *ptr = new Bar();
Here Foo is the static type of *ptr while Bar is its dynamic type.
The status quo is that malloc does not create objects. The only constructs that do are new expressions, definitions, casts and assignments to variant members. See P0137R0 for proper wording on this.
If you wanted to use the storage yielded by malloc, assuming that it is properly aligned (which is the case unless you use extended alignments), employ a call to placement new:
auto p = malloc(sizeof(int));
int* i = new (p) int{0};
// i points to an object of type int, initialized to zero
Hence using malloc in C++ is quite useless, as bog-standard new effectively combines the above steps into one.
See also @T.C.'s answer in the related question of the asker.
As per 1.3.7 in C++11 standard,
dynamic type
glvalue type of the most derived object (1.8) to which the glvalue denoted by a glvalue expression refers
[Example: if a pointer (8.3.1) p whose static type is “pointer to class B” is pointing to an object of class
D, derived from B (Clause 10), the dynamic type of the expression *p is “D.” References (8.3.2) are treated
similarly. — end example ]
For example, given
class A {};
class B : public A {};
A *a = new B;
the static type of *a is A while its dynamic type is B.
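The difference can be observed in code. The sketch below assumes polymorphic classes (hence the virtual destructor), because typeid only reports the dynamic type for glvalues of polymorphic class type. The names Base, Derived and refers_to_derived are illustrative.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>
#include <utility>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// The static type of *p is fixed at compile time by the declaration of p.
static_assert(std::is_same<decltype(*std::declval<Base*>()), Base&>::value,
              "dereferencing a Base* always has static type Base");

// The dynamic type is that of the most derived object actually referred to;
// for a polymorphic glvalue, typeid inspects it at run time.
bool refers_to_derived(const Base& b) {
    return typeid(b) == typeid(Derived);
}
```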
The rule about accessing an object only through a compatible type guards against something like
class A {};
class B : public A { public: int x; };
class C : public A { public: int y; };
A *a = new B;
reinterpret_cast<C *>(a)->y; // a actually points to a B, not a C
which may lead to undefined behavior.
void * does not point to an object, but the distinction between dynamic and declaration type makes sense only for objects.

Is it legal to modify an object created with new through a const pointer?

So this answer made me think about the scenario where you assign the result of new to a pointer to a const. AFAIK, there's no reason you can't legally const_cast the constness away and actually modify the object in this situation:
struct X{int x;};
//....
const X* x = new X;
const_cast<X*>(x)->x = 0; // okay
But then I thought - what if you actually want new to create a const object. So I tried
struct X{};
//....
const X* x = new const X;
and it compiled!!!
Is this a GCC extension or is it standard behavior? I have never seen this in practice. If it's standard, I'll start using it whenever possible.
new obviously doesn't create a const object (I hope).
If you ask new to create a const object, you get a const object.
there's no reason you can't legally const_cast the constness away and actually modify the object.
There is. The reason is that the language specification calls that out explicitly as undefined behaviour. So, in a way, you can, but that means pretty much nothing.
I don't know what you expected from this, but if you thought the issue was one of allocating in readonly memory or not, that's far from the point. That doesn't matter. A compiler can assume such an object can't change and optimise accordingly and you end up with unexpected results.
const is part of the type. It doesn't matter whether you allocate your object with dynamic, static or automatic storage duration. It's still const. Casting away that constness and mutating the object would still be an undefined operation.
constness is an abstraction that the type system gives us to implement safety around non-mutable objects; it does so in large part to aid us in interaction with read-only memory, but that does not mean that its semantics are restricted to such memory. Indeed, C++ doesn't even know what is and isn't read-only memory.
As well as this being derivable from all the usual rules, with no exception [lol] made for dynamically-allocated objects, the standards mention this explicitly (albeit in a note):
[C++03: 5.3.4/1]: The new-expression attempts to create an object of the type-id (8.1) or new-type-id to which it is applied. The type of that object is the allocated type. This type shall be a complete object type, but not an abstract class type or array thereof (1.8, 3.9, 10.4). [Note: because references are not objects, references cannot be created by new-expressions. ] [Note: the type-id may be a cv-qualified type, in which case the object created by the new-expression has a cv-qualified type. ] [..]
[C++11: 5.3.4/1]: The new-expression attempts to create an object of the type-id (8.1) or new-type-id to which it is applied. The type of that object is the allocated type. This type shall be a complete object type, but not an abstract class type or array thereof (1.8, 3.9, 10.4). It is implementation-defined whether over-aligned types are supported (3.11). [ Note: because references are not objects, references cannot be created by new-expressions. —end note ] [ Note: the type-id may be a cv-qualified type, in which case the object created by the new-expression has a cv-qualified type. —end note ] [..]
There's also a usage example given in [C++11: 7.1.6.1/4].
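The note quoted above can be checked at compile time. This is a small sketch; read_const_x is a made-up function name, and X is given a member so that there is something to read.

```cpp
#include <cassert>
#include <type_traits>

struct X { int x; };

// The type of the new-expression reflects the cv-qualification of the
// allocated type (decltype does not evaluate its operand).
static_assert(std::is_same<decltype(new const X{1}), const X*>::value,
              "new const X yields const X*");
static_assert(std::is_same<decltype(new X{1}), X*>::value,
              "new X yields X*");

// Reads from a genuinely const dynamically allocated object.
int read_const_x() {
    const X* p = new const X{42};
    int value = p->x;  // reading is fine; writing through a const_cast would be UB
    delete p;          // deleting through a pointer to const is allowed
    return value;
}
```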
Not sure what else you expected. I can't say I've ever done this myself, but I don't see any particular reason not to. There's probably some tech sociologist who can tell you statistics on how rarely we dynamically allocate something only to treat it as non-mutable.
My way of looking at this is:
X and const X and pointers to them are distinct types
there is an implicit conversion from X* to const X*, but not the other way around
therefore the following are legal and the x in each case has identical type and behaviour
const X* x = new X;
const X* x = new const X;
The only remaining question is whether a different allocator might be called in the second case (perhaps in read only memory). The answer is no, there is no such provision in the standard.

To what extent is C++ a statically-typed language?

I used to think that the answer to this question was "100%", but I've recently been pointed to an example that makes it worth thinking twice. Consider a C array declared as an object with automatic storage duration:
int main()
{
int foo[42] = { 0 };
}
Here, the type of foo is clearly int[42]. Consider, instead, this case:
int main()
{
int* foo = new int[rand() % 42];
delete[] foo;
}
Here, the type of foo is int*, but how can one tell the type of the object created by the new expression at compile-time? (Emphasis is meant to stress the fact that I am not talking about the pointer returned by the new expression, but rather about the array object created by the new expression).
This is what Paragraph 5.3.4/1 of the C++11 Standard specifies about the result of a new expression:
[...] Entities created by a new-expression have dynamic storage duration (3.7.4). [ Note: the lifetime of such
an entity is not necessarily restricted to the scope in which it is created. —end note ] If the entity is a non-array
object, the new-expression returns a pointer to the object created. If it is an array, the new-expression
returns a pointer to the initial element of the array.
I used to think that in C++ the type of all objects is determined at compile-time, but the above example seems to disprove that belief. Also, per Paragraph 1.8/1:
[...] The properties of an object are determined when the object
is created. An object can have a name (Clause 3). An object has a storage duration (3.7) which influences
its lifetime (3.8). An object has a type (3.9). [...]
So my questions are:
What is meant by "properties" in the last quoted paragraph? Clearly, the name of an object cannot count as something which is determined "when the object is created", unless "created" here means something different from what I think;
Are there other examples of objects whose type is determined only at run-time?
To what extent is it correct to say that C++ is a statically-typed language? Or rather, what is the most proper way of classifying C++ in this respect?
It would be great if anybody could elaborate at least on one of the above points.
EDIT:
The Standard seems to make it clear that the new expression does indeed create an array object, and not just several objects laid out as an array as pointed out by some. Per Paragraph 5.3.4/5 (courtesy of Xeo):
When the allocated object is an array (that is, the noptr-new-declarator syntax is used or the new-type-id or
type-id denotes an array type), the new-expression yields a pointer to the initial element (if any) of the array.
[ Note: both new int and new int[10] have type int* and the type of new int[i][10] is int (*)[10]
—end note ] The attribute-specifier-seq in a noptr-new-declarator appertains to the associated array type.
The new-expression doesn't create an object with runtime-varying array type. It creates many objects, each of static type int. The number of these objects is not known statically.
C++ provides two cases (section 5.2.8) for dynamic type:
Same as the static type of the expression
When the static type is polymorphic, the runtime type of the most-derived object
Neither of these gives any object created by new int[N] a dynamic array type.
Pedantically, evaluation of the new-expression creates an infinite number of overlapping array objects. From 3.8p2:
[ Note: The lifetime of an array object starts as soon as storage with proper size and alignment is obtained, and its lifetime ends when the storage which the array occupies is reused or released. 12.6.2 describes the lifetime of base and member subobjects. — end note ]
So if you want to talk about the "array object" created by new int[5], you have to give it not only type int[5] but also int[4], int[1], char[5*sizeof(int)], and struct s { int x; }[5].
I submit that this is equivalent to saying that array types do not exist at runtime. The type of an object is supposed to be restrictive and informative, and to tell you something about its properties. Allowing a memory area to be treated as an infinite number of overlapping array objects with different types in effect means that the array object is completely typeless. The notion of runtime type only makes sense for the element objects stored within the array.
The terms 'static type' and 'dynamic type' apply to expressions.
static type
type of an expression (3.9) resulting from analysis of the program without considering execution semantics
dynamic type
<glvalue> type of the most derived object (1.8) to which the glvalue denoted by a glvalue expression refers
Additionally, you can see that a dynamic type only differs from a static type when the static type can be derived from, which means a dynamic array type is always the same as the expression's static type.
So your question:
but how can one tell the type of the object created by the new expression at compile-time?
Objects have types, but they're not 'static' or 'dynamic' types absent an expression that refers to the object. Given an expression, the static type is always known at compile time. In the absence of derivation the dynamic type is the same as the static type.
But you're asking about objects' types independent of expressions. In the example you give you've asked for an object to be created but you don't specify the type of object you want to have created at compile time. You can look at it like this:
template<typename T>
T *create_array(size_t s) {
switch(s) {
case 1: return &(*new std::array<T, 1>)[0];
case 2: return &(*new std::array<T, 2>)[0];
// ...
}
}
There's little special or unique about this. Another possibility is:
struct B { virtual ~B() {}};
struct D : B {};
struct E : B {};
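B *create() {
// The distribution takes the engine by non-const reference, so it must be an lvalue.
static std::default_random_engine engine;
if (std::bernoulli_distribution(0.5)(engine)) {
return new D;
}
return new E;
}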
Or:
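void *create() {
// The distribution takes the engine by non-const reference, so it must be an lvalue.
static std::default_random_engine engine;
if (std::bernoulli_distribution(0.5)(engine)) {
return reinterpret_cast<void*>(new int);
}
return reinterpret_cast<void*>(new float);
}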
The only difference with new int[] is that you can't see into its implementation to see it selecting between different types of objects to create.
I used to think that in C++ the type of all objects is determined at compile-time, but the above example seems to disprove that belief.
The example you cite is talking about the storage duration of the object. C++11 recognizes four storage durations:
Static storage duration is the duration of global and local static variables.
Thread storage duration is the duration of variables declared thread_local.
Automatic storage duration is the duration for "stack allocated" function-local variables.
Dynamic storage duration is the duration for dynamically allocated memory such as that with new or malloc.
The use of the word "dynamic" here has nothing to do with the object's type. It refers to how an implementation must store the data that makes up an object.
I used to think that in C++ the type of all objects is determined at compile-time, but the above example seems to disprove that belief.
In your example, there is one variable, which has type int*. There is not an actual array type for the underlying array which can be recovered in any meaningful way to the program. There is no dynamic typing going on.