is it possible to share two arrays in a union like this:
struct
{
union
{
float m_V[Height * Length];
float m_M[Height] [Length];
} m_U;
};
Do these two arrays share the same memory size or is one of them longer?
Both arrays are required to have the same size and layout. Of course,
if you initialize anything using m_V, then all accesses to m_M are
undefined behavior; a compiler might, for example, note that nothing in
m_V has changed, and return an earlier value, even though you've
modifed the element through m_M. I've actually used a compiler which
did so, in the distant past. I would avoid accesses where the union
isn't visible, say by passing a reference to m_V and a reference to
m_M to the same function.
It is implicitly guaranteed that these will be the same size in memory. The compiler is not allowed to insert padding anywhere in either the 2D array or the 1D array, because everything must be compatible with sizeof.
[Of course, if you wrote to m_V and read from m_M (or vice versa), you'd still be type-punning, which technically invokes undefined behaviour. But that's a different matter.]
Related
The answers I got for this question until now has two exactly the opposite kinds of answers: "it's safe" and "it's undefined behaviour". I decided to rewrite the question in whole to get some better clarifying answers, for me and for anyone who might arrive here via Google.
Also, I removed the C tag and now this question is C++ specific
I am making an 8-byte-aligned memory heap that will be used in my virtual machine. The most obvious approach that I can think of is by allocating an array of std::uint64_t.
std::unique_ptr<std::uint64_t[]> block(new std::uint64_t[100]);
Let's assume sizeof(float) == 4 and sizeof(double) == 8. I want to store a float and a double in block and print the value.
float* pf = reinterpret_cast<float*>(&block[0]);
double* pd = reinterpret_cast<double*>(&block[1]);
*pf = 1.1;
*pd = 2.2;
std::cout << *pf << std::endl;
std::cout << *pd << std::endl;
I'd also like to store a C-string saying "hello".
char* pc = reinterpret_cast<char*>(&block[2]);
std::strcpy(pc, "hello\n");
std::cout << pc;
Now I want to store "Hello, world!" which goes over 8 bytes, but I still can use 2 consecutive cells.
char* pc2 = reinterpret_cast<char*>(&block[3]);
std::strcpy(pc2, "Hello, world\n");
std::cout << pc2;
For integers, I don't need a reinterpret_cast.
block[5] = 1;
std::cout << block[5] << std::endl;
I'm allocating block as an array of std::uint64_t for the sole purpose of memory alignment. I also do not expect anything larger than 8 bytes by its own to be stored in there. The type of the block can be anything if the starting address is guaranteed to be 8-byte-aligned.
Some people already answered that what I'm doing is totally safe, but some others said that I'm definitely invoking undefined behaviour.
Am I writing correct code to do what I intend? If not, what is the appropriate way?
The global allocation functions
To allocate an arbitrary (untyped) block of memory, the global allocation functions (§3.7.4/2);
void* operator new(std::size_t);
void* operator new[](std::size_t);
Can be used to do this (§3.7.4.1/2).
§3.7.4.1/2
The allocation function attempts to allocate the requested amount of storage. If it is successful, it shall return the address of the start of a block of storage whose length in bytes shall be at least as large as the requested size. There are no constraints on the contents of the allocated storage on return from the allocation function. The order, contiguity, and initial value of storage allocated by successive calls to an allocation function are unspecified. The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type with a fundamental alignment requirement (3.11) and then used to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function).
And 3.11 has this to say about a fundamental alignment requirement;
§3.11/2
A fundamental alignment is represented by an alignment less than or equal to the greatest alignment supported by the implementation in all contexts, which is equal to alignof(std::max_align_t).
Just to be sure on the requirement that the allocation functions must behave like this;
§3.7.4/3
Any allocation and/or deallocation functions defined in a C++ program, including the default versions in the library, shall conform to the semantics specified in 3.7.4.1 and 3.7.4.2.
Quotes from C++ WD n4527.
Assuming the 8-byte alignment is less than the fundamental alignment of the platform (and it looks like it is, but this can be verified on the target platform with static_assert(alignof(std::max_align_t) >= 8)) - you can use the global ::operator new to allocate the memory required. Once allocated, the memory can be segmented and used given the size and alignment requirements you have.
An alternative here is the std::aligned_storage and it would be able to give you memory aligned at whatever the requirement is.
typename std::aligned_storage<sizeof(T), alignof(T)>::type buffer[100];
From the question, I assume here that the both the size and alignment of T would be 8.
A sample of what the final memory block could look like is (basic RAII included);
struct DataBlock {
const std::size_t element_count;
static constexpr std::size_t element_size = 8;
void * data = nullptr;
explicit DataBlock(size_t elements) : element_count(elements)
{
data = ::operator new(elements * element_size);
}
~DataBlock()
{
::operator delete(data);
}
DataBlock(DataBlock&) = delete; // no copy
DataBlock& operator=(DataBlock&) = delete; // no assign
// probably shouldn't move either
DataBlock(DataBlock&&) = delete;
DataBlock& operator=(DataBlock&&) = delete;
template <class T>
T* get_location(std::size_t index)
{
// https://stackoverflow.com/a/6449951/3747990
// C++ WD n4527 3.9.2/4
void* t = reinterpret_cast<void*>(reinterpret_cast<unsigned char*>(data) + index*element_size);
// 5.2.9/13
return static_cast<T*>(t);
// C++ WD n4527 5.2.10/7 would allow this to be condensed
//T* t = reinterpret_cast<T*>(reinterpret_cast<unsigned char*>(data) + index*element_size);
//return t;
}
};
// ....
DataBlock block(100);
I've constructed more detailed examples of the DataBlock with suitable template construct and get functions etc., live demo here and here with further error checking etc..
A note on the aliasing
It does look like there are some aliasing issues in the original code (strictly speaking); you allocate memory of one type and cast it to another type.
It may probably work as you expect on your target platform, but you cannot rely on it. The most practical comment I've seen on this is;
"Undefined behaviour has the nasty result of usually doing what you think it should do, until it doesn’t” - hvd.
The code you have probably will work. I think it is better to use the appropriate global allocation functions and be sure that there is no undefined behaviour when allocating and using the memory you require.
Aliasing will still be applicable; once the memory is allocated - aliasing is applicable in how it is used. Once you have an arbitrary block of memory allocated (as above with the global allocation functions) and the lifetime of an object begins (§3.8/1) - aliasing rules apply.
What about std::allocator?
Whilst the std::allocator is for homogenous data containers and what your are looking for is akin to heterogeneous allocations, the implementation in your standard library (given the Allocator concept) offers some guidance on raw memory allocations and corresponding construction of the objects required.
Update for the new question:
The great news is there's a simple and easy solution to your real problem: Allocate the memory with new (unsigned char[size]). Memory allocated with new is guaranteed in the standard to be aligned in a way suitable for use as any type, and you can safely alias any type with char*.
The standard reference, 3.7.3.1/2, allocation functions:
The pointer returned shall be suitably aligned so that it can be
converted to a pointer of any complete object type and then used to
access the object or array in the storage allocated
Original answer for the original question:
At least in C++98/03 in 3.10/15 we have the following which pretty clearly makes it still undefined behavior (since you're accessing the value through a type that's not enumerated in the list of exceptions):
If a program attempts to access the stored value of an object through
an lvalue of other than one of the following types the behavior is
undefined):
— the dynamic type of the object,
— a cvqualified version of the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a cvqualified version of the dynamic type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
— a type that is a (possibly cvqualified) base class type of the dynamic type of the object,
— a char or unsigned char type.
pc pf and pd are all different types that access memory specified in block as uint64_t, so for say 'pf the shared types are float and uint64_t.
One would violate the strict aliasing rule were once to write using one type and read using another since the compile could we reorder the operations thinking there is no shared access. This is not your case however, since the uint64_t array is only used for assignment, it is exactly the same as using alloca to allocate the memory.
Incidentally there is no issue with the strict aliasing rule when casting from any type to a char type and visa versa. This is a common pattern used for data serialization and deserialization.
A lot of discussion here and given some answers that are slightly wrong, but making up good points, I just try to summarize:
exactly following the text of the standard (no matter what version) ... yes, this is undefined behaviour. Note the standard doesn't even have the term strict aliasing -- just a set of rules to enforce it no matter what implementations could define.
understanding the reason behind the "strict aliasing" rule, it should work nicely on any implementation as long as neither float or double take more than 64 bits.
the standard won't guarantee you anything about the size of float or double (intentionally) and that's the reason why it is that restrictive in the first place.
you can get around all this by ensuring your "heap" is an allocated object (e.g. get it with malloc()) and access the aligned slots through char * and shifting your offset by 3 bits.
you still have to make sure that anything you store in such a slot won't take more than 64 bits. (that's the hard part when it comes to portability)
In a nutshell: your code should be safe on any "sane" implementation as long as size constraints aren't a problem (means: the answer to the question in your title is most likely no), BUT it's still undefined behaviour (means: the answer to your last paragraph is yes)
I'll make it short: All your code works with defined semantics if you allocate the block using
std::unique_ptr<char[], std::free>
mem(static_cast<char*>(std::malloc(800)));
Because
every type is allowed to alias with a char[] and
malloc() is guaranteed to return a block of memory sufficiently aligned for all types (except maybe SIMD ones).
We pass std::free as a custom deleter, because we used malloc(), not new[], so calling delete[], the default, would be undefined behaviour.
If you're a purist, you can also use operator new:
std::unique_ptr<char[]>
mem(static_cast<char*>(operator new[](800)));
Then we don't need a custom deleter. Or
std::unique_ptr<char[]> mem(new char[800]);
to avoid the static_cast from void* to char*. But operator new can be replaced by the user, so I'm always a bit wary of using it. OTOH; malloc cannot be replaced (only in platform-specific ways, such as LD_PRELOAD).
Yes, because the memory locations pointed to by pf could overlap depending on the size of float and double. If they didn't, then the results of reading *pd and *pf would be well defined but not the results of reading from block or pc.
The behavior of C++ and the CPU are distinct. Although the standard provides memory suitable for any object, the rules and optimizations imposed by the CPU make the alignment for any given object "undefined" - an array of short would reasonably be 2 byte aligned, but an array of a 3 byte structure may be 8 byte aligned. A union of all possible types can be created and used between your storage and the usage to ensure no alignment rules are broken.
union copyOut {
char Buffer[200]; // max string length
int16 shortVal;
int32 intVal;
int64 longIntVal;
float fltVal;
double doubleVal;
} copyTarget;
memcpy( copyTarget.Buffer, Block[n], sizeof( data ) ); // move from unaligned space into union
// use copyTarget member here.
If you tag this as C++ question,
(1) why use uint64_t[] but not std::vector?
(2) in term of memory management, your code lack of management logic, which should keep track of which blocks are in use and which are free and the tracking of contiguoous blocks, and of course the allocate and release block methods.
(3) the code shows an unsafe way of using memory. For example, the char* is not const and therefore the block can be potentially be written to and overwrite the next block(s). The reinterpret_cast is consider danger and should be abstract from the memory user logic.
(4) the code doesn't show the allocator logic. In C world, the malloc function is untyped and in C++ world, the operator new is typed. You should consider something like the new operator.
According to the standard, in [expr.sizeof] (5.3.3.2) we get:
When applied to a reference or a reference type, the result is the size of the referenced type.
This seems to go along with the fact that references are unspecified [dcl.ref] (8.3.2.4):
It is unspecified whether or not a reference requires storage
But it seems pretty strange to me to have this kind of inconsistency within the language. Regardless of whether or not the reference requires storage, wouldn't it be important to be able to determine how much size the reference uses? Seeing these results just seems wrong:
sizeof(vector<int>) == 24
sizeof(vector<int>*) == 8
sizeof(vector<int>&) == 24
sizeof(reference_wrapper<vector<int>>) == 8
What is the reasoning behind wanting sizeof(T&) == sizeof(T) by definition?
The choice is somewhat arbitrary, and trying to fully justify either option will lead to circular metaphysical arguments.
The intent of a reference is to be (an alias for) the object itself; under that reasoning it makes sense for them both to have the same size (and address), and that is what the language specifies.
The abstraction is leaky - sometimes a reference has its own storage, separate from the object - leading to anomolies like those you point out. But we have pointers for when we need to deal with a "reference" as a separate entity to the object.
Argument 1: A reference should be a synonym of your object hence the interface of the reference should be exactly the same as the interface of the object, also all operators should work in the same way on object and on reference (except type operators).
It will make sense in the following code:
MyClass a;
MyClass& b = a;
char a_buf[sizeof(a)];
char b_buf[sizeof(b)]; // you want b_buf be the same size as a_buf
memcpy(&a, a_buf, sizeof(a));
memcpy(&b, b_buf, sizeof(b)); // you want this line to work like the above line
Argument 2: From C++ standard's point of view references are something under the hood and it even doesn't say if they occupy memory or not, so it cannot say how to get their size.
How to get reference size: Since by all compilers references are implemented by help of constant pointers and they occupy memory, there is a way to know their size.
class A_ref
{A& ref;}
sizeof(A_ref);
It's not particularly important to know how much storage a reference requires, only the change in storage requirements caused by adding a reference. And that you can determine:
struct with
{
char c;
T& ref;
};
struct without
{
char c;
};
return sizeof (with) - sizeof (without);
I want to access continuously declared member arrays of the same type with a single pointer.
So for example say I have :
struct S
{
double m1[2];
double m2[2];
}
int main()
{
S obj;
double *sp = obj.m1;
// Code that maybe unsafe !!
for (int i(0); i < 4; ++i)
*(sp++) = i; // *
return 0;
}
Under what circumstances is line (*) problematic ?
I know there's for sure a problem when virtual functions are present but I need a more structured answer than my assumptions
You can be sure that the members of the struct are stored in a contiguos block of bytes, in the order they appear. Besides, the elements of the arrays are contiguous. So, it seems that everything is fine.
The problem here is that there is no standard way of knowing if there is padding bytes between consecutive members in the struct.
So, it is unsafe to assume that there is not padding bytes at all.
If you can be plenty sure, for some particular reason, that there are not padding bytes, then the 4 double elements will be contiguous, as you want.
The C++ standard makes certain guarantees about the layout of "plain old data" (or in C++11, standard layout) types. For the most part, these inherit from how C treated such data.
What follows only applies to "plain old data"/"standard layout" structures and data.
If you have two structs with the same initial order and type of arguments, casting a pointer to one to a pointer to the other and accessing their common initial prefix is valid, and will access the corresponding field. This is known as "layout compatible". This also applies if you have a structure X and a structure Y, and X is the first element of the structure Y -- a pointer to Y can be cast to a pointer to X, and it will access the fields of the X substructure in Y.
Now, while it is a common assumption, I am unaware of a requirement of either C or C++ that an array and a structure starting with fields of the same type and count are layout compatible with an array.
Your case is somewhat similar, in that we have two arrays adjacent to each other in a structure, and you are treating it as one large array of size equal to the sum of those two arrays size. It is a relatively common and safe assumption that it works, but I am unaware of a guarantee in the standard that it actually works.
In this kind of undefined behavior, you have to examine your particular compilers guarantees (de facto or explicit) about layout of plain old data/standard layout data, as the C++ standard does not guarantee your code does what you want it to do.
Is there a difference between pointer to integer-pointer (int**) and pointer to character-pointer (char**), and any other case of pointer to pointer?
Isn't the memory block size for any pointer is the same, so the sub-datatype doesn't play a role in here?
Is it just a semantic distinction with no other significance?
Why not to use just void**?
Why should we use void** when you want a pointer to a char *? Why should we not use char **?
With char **, you have type safety. If the pointer is correctly initialized and not null, you know that by dereferencing it once you get a valid char * - and by dereferencing that pointer, in turn, you get a char.
Why should you ignore this advantage in type safety, and instead play pointer Russian roulette with void**?
The difference is in type-safety. T** implicitly interprets the data as T. void**, however, needs to be manually casted first. And no, pointers are not all 4 / 8 bytes on 32 / 64bit architectures respectively. Member function pointers, for instance, contain offset information too, which needs to be stored in the pointer itself (in the most common implementation).
Most C implementations use the same size and format for all pointers, but this is not required by the C standard.
Some machines do not have byte addressing, so the C implementation implements it by using shifts and other operations. In these implementations, pointers to larger types, such as int, may be normal addresses, but pointers to char would have to have both a machine address and a byte-within-word offset.
Additionally, C makes use of the type information for a variety of purposes, including reducing mistakes made by programmers (possibly giving warnings or errors when you attempt to use a pointer to int where a pointer to float is needed) and optimization. Regarding optimization, consider this example:
void foo(float *array, int *limit)
{
for (int i = 0; i < *limit; ++i)
array[i] = <some calculation>;
}
The C standard says a compiler may use the fact that array and limit are pointers to different types to conclude that they do not overlap. Given this rule, the C implementation may evaluate *limit once when the loop starts, because it knows it will not change during the loop. Without this rule, the compiler would have to assume that one of the assignments to array[i] might change *limit, and it would have to load *limit from memory in each iteration.
I want to parse UTF-8 in C++. When parsing a new character, I don't know in advance if it is an ASCII byte or the leader of a multibyte character, and also I don't know if my input string is sufficiently long to contain the remaining characters.
For simplicity, I'd like to name the four next bytes a, b, c and d, and because I am in C++, I want to do it using references.
Is it valid to define those references at the beginning of a function as long as I don't access them before I know that access is safe? Example:
void parse_utf8_character(const string s) {
for (size_t i = 0; i < s.size();) {
const char &a = s[i];
const char &b = s[i + 1];
const char &c = s[i + 2];
const char &d = s[i + 3];
if (is_ascii(a)) {
i += 1;
do_something_only_with(a);
} else if (is_twobyte_leader(a)) {
i += 2;
if (is_safe_to_access_b()) {
do_something_only_with(a, b);
}
}
...
}
}
The above example shows what I want to do semantically. It doesn't illustrate why I want to do this, but obviously real code will be more involved, so defining b,c,d only when I know that access is safe and I need them would be too verbose.
There are three takes on this:
Formally
well, who knows. I could find out for you by using quite some time on it, but then, so could you. Or any reader. And it's not like that's very practically useful.
EDIT: OK, looking it up, since you don't seem happy about me mentioning the formal without looking it up for you. Formally you're out of luck:
N3280 (C++11) §5.7/5 “If both the pointer operand and the result point to elements of the same array object, or one past
the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.”
Two situations where this can produce undesired behavior: (1) computing an address beyond the end of a segment, and (2) computing an address beyond an array that the compiler knows the size of, with debug checks enabled.
Technically
you're probably OK as long as you avoid any lvalue-to-rvalue conversion, because if the references are implemented as pointers, then it's as safe as pointers, and if the compiler chooses to implement them as aliases, well, that's also ok.
Economically
relying needlessly on a subtlety wastes your time, and then also the time of others dealing with the code. So, not a good idea. Instead, declare the names when it's guaranteed that what they refer to, exists.
Before going into the legality of references to unaccessible memory, you have another problem in your code. Your call to s[i+x] might call string::operator[] with a parameter bigger then s.size(). The C++11 standard says about string::operator[] ([string.access], §21.4.5):
Requires: pos <= size().
Returns: *(begin()+pos) if pos < size(), otherwise a reference to an object of type T with value charT(); the referenced value shall not be modified.
This means that calling s[x] for x > s.size() is undefined behaviour, so the implementation could very well terminate your program, e.g. by means of an assertion, for that.
Since string is now guaranteed to be continous, you could go around that problem using &s[i]+x to get an address. In praxis this will probably work.
However, strictly speaking doing this is still illegal unfortunately. The reason for this is that the standard allows pointer arithmetic only as long as the pointer stays inside the same array, or one past the end of the array. The relevant part of the (C++11) standard is in [expr.add], §5.7.5:
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
Therefore generating references or pointers to invalid memory locations might work on most implementations, but it is technically undefined behaviour, even if you never dereference the pointer/use the reference. Relying on UB is almost never a good idea , because even if it works for all targeted systems, there are no guarantees about it continuing to work in the future.
In principle, the idea of taking a reference for a possibly illegal memory address is itself perfectly legal. The reference is only a pointer under the hood, and pointer arithmetic is legal until dereferencing occurs.
EDIT: This claim is a practical one, not one covered by the published standard. There are many corners of the published standard which are formally undefined behaviour, but don't produce any kind of unexpected behaviour in practice.
Take for example to possibility of computing a pointer to the second item after the end of an array (as #DanielTrebbien suggests). The standard says overflow may result in undefined behaviour. In practice, the overflow would only occur if the upper end of the array is just short of the space addressable by a pointer. Not a likely scenario. Even when if it does happen, nothing bad would happen on most architectures. What is violated are certain guarantees about pointer differences, which don't apply here.
#JoSo If you were working with a character array, you can avoid some of the uncertainty about reference semantics by replacing the const-references with const-pointers in your code. That way you can be certain no compiler will alias the values.