Is this use of std::array undefined behavior? [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Aliasing `T*` with `char*` is allowed. Is it also allowed the other way around?
I'm using a std::array of chars to hold a value of unknown primitive type, which is no more than 10 bytes long, like so:
std::array<char, 10> val;
*reinterpret_cast<double*>(val.data()) = 6.3;
//blah blah blah...
double stuff = *reinterpret_cast<double*>(val.data());
I have read that casting back and forth through char * is not undefined, because the compiler assumes a char * may alias a value of any type. Does this still work when the value is placed in (what I assume is) an array of chars inside the object?
Note: I am aware that I could be using a union here, but that would result in a large amount of boilerplate code for what I am doing, and I would like to avoid it if possible, hence the question.

Yes. std::array<char, 10> is not guaranteed to satisfy the alignment requirements of double, so dereferencing the result of that reinterpret_cast provokes UB.
Try std::aligned_storage instead.

It doesn't matter what the array is contained in.
The standard does not even consider what surrounds something (it's that basic), but does support conversion to/from char sequences.
To do this directly via reinterpret_cast and assignment, you need to have the buffer correctly aligned.
An alternative is to use memcpy, which doesn't care about alignment.
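A minimal sketch of the memcpy approach, assuming the same 10-byte buffer as in the question. The helper names store_double and load_double are hypothetical and only serve the illustration:

```cpp
#include <array>
#include <cstring>

// Hypothetical helpers: copy a double into and out of a raw char buffer.
// std::memcpy works byte by byte, so the buffer needs no particular alignment.
inline void store_double(std::array<char, 10>& buf, double value) {
    static_assert(sizeof(double) <= 10, "buffer too small for double");
    std::memcpy(buf.data(), &value, sizeof value);
}

inline double load_double(const std::array<char, 10>& buf) {
    double value;
    std::memcpy(&value, buf.data(), sizeof value);
    return value;
}
```

Compilers commonly turn such a fixed-size memcpy into a plain load or store, so this usually costs nothing compared with the cast.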
On a related issue, it's generally not a good idea to go down to the binary level. For example, a simple version change of the compiler might make a file of binary-serialized data inaccessible. A main driver for doing this anyway is raw performance considerations.


Use cases of std::byte [duplicate]

This question already has an answer here:
What is the purpose of std::byte?
(1 answer)
Closed 5 years ago.
The recent addition of std::byte to C++17 got me wondering why this type was even added to the standard at all. Even after reading the cppreference page, its use cases don't seem clear to me.
The only use case I can come up with is that it more clearly expresses intent, as std::byte should only be treated as a collection of bits instead of a character type such as char which we used for both purposes before.
Meaning that:
this:
std::vector<std::byte> memory;
Is more clear than this:
std::vector<char> memory;
Is this the only use case and reason it was added to the standard or am I missing a big point here?
The only use case I can come up with is that it more clearly expresses intent
I think it was one of the reasons. This paper explains the motivation behind std::byte and compares its usage with the usage of char:
Motivation and Scope
Many programs require byte-oriented access to
memory. Today, such programs must use either the char, signed char, or
unsigned char types for this purpose. However, these types perform a
“triple duty”. Not only are they used for byte addressing, but also as
arithmetic types, and as character types. This multiplicity of roles
opens the door for programmer error – such as accidentally performing
arithmetic on memory that should be treated as a byte value – and
confusion for both programmers and tools. Having a distinct byte type
improves type-safety, by distinguishing byte-oriented access to memory
from accessing memory as a character or integral value. It improves
readability.
Having the type would also make the intent of code
clearer to readers (as well as tooling for understanding and
transforming programs). It increases type-safety by removing
ambiguities in expression of programmer’s intent, thereby increasing
the accuracy of analysis tools.
Another reason is that std::byte is restricted in terms of operations which can be performed on this type:
Like char and unsigned char, it can be used to access raw memory
occupied by other objects (object representation), but unlike those
types, it is not a character type and is not an arithmetic type. A
byte is only a collection of bits, and only bitwise logic operators
are defined for it.
which provides additional type safety, as mentioned in the paper above.
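A short sketch of what that restriction looks like in practice, assuming a C++17 compiler. The helper names clear_low_bits and as_unsigned are hypothetical:

```cpp
#include <cstddef>

// Hypothetical helper: zero the low `n` bits of a byte.
// Shifts and bitwise operators are defined for std::byte.
inline std::byte clear_low_bits(std::byte b, int n) {
    return (b >> n) << n;
}

// Arithmetic requires an explicit conversion to an integer type first.
inline unsigned as_unsigned(std::byte b) {
    return std::to_integer<unsigned>(b);
}

// std::byte c = std::byte{1} + std::byte{1};  // would not compile: no arithmetic on std::byte
```

With char, the commented-out addition would compile silently, which is the kind of accident the paper describes.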

How does type conversion affect the memory accessed by the program and its variables? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
If I have the following code:
int variable = 65;
double variable2 = 54.34;
double sum = (double)variable + variable2;
So, in this case, 2 bytes will be allocated to variable. Then, when variable is cast to double, it will be assigned an additional 8 bytes for its double representation.
Pertaining to this, my question is: will "variable" be assigned 10 bytes of memory in total, or am I getting something wrong?
(double)variable is an rvalue, so it doesn't formally have any storage.
At a level much lower than C++, the hardware platform's implementation of "addition of two doubles" might mean that it has to put a copy of variable in the bit-format of a double somewhere in memory so it can be added.
More likely, the double bit-format of variable will end up in a register of an FPU (Floating Point Unit) instead, and so will never occupy "memory".
Either way, you can't get at that memory with C++. You would need to tell the compiler to put it somewhere, i.e. make it an lvalue, or use it in a way where the C++ standard requires that an lvalue be created, e.g. passing it to a function.
That's probably also true of the int bit-format of variable and of variable2: even though they are lvalues, the compiler will hopefully keep them in (CPU or FPU) registers if you never use their lvalueness, so they won't appear in memory either.
The size of int is implementation-defined, so you can't assume anything beyond the standard's minimum of 16 bits unless you only work with one compiler and platform. I would strongly recommend #including <cstdint> and using the intN_t/uintN_t types (e.g. int16_t, int32_t, int64_t) so you know the size of the type you're dealing with. They're aliases for whatever built-in types have exactly that width, where the platform provides such a type; the int_leastN_t aliases guarantee only a minimum width.
Cast behavior can vary depending on the type. In general, all casts perform a conversion of some kind. It helps to think of casting as "converting", where the conversion rules depend on the types involved. Unless you're dealing with a custom conversion operator, the conversion is going to promote, demote, transform, or reinterpret the bytes of the thing being cast. Promotions (e.g. short => int) extend the bits, sign-extending or zero-extending depending on whether the source type is signed or unsigned. Demotions (e.g. int => short) trim off the higher bits, leaving you with the low part. Transformations (e.g. int => double) convert from one format to another. Reinterpretations (e.g. int* => double*) treat the bits as if they were of the cast-to type.
Your code converts an int to a double and adds it to another double. This has the same effect as if you had left out the cast, because an int is implicitly converted to double when used in an arithmetic expression like that.
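To make that concrete, here is a small sketch. The function names add_explicit and add_implicit are hypothetical; the static_assert only checks the type the expression has, and neither function changes the size or storage of its int argument:

```cpp
#include <type_traits>

// The cast produces a temporary double; `variable` itself stays an int.
inline double add_explicit(int variable, double variable2) {
    return static_cast<double>(variable) + variable2;
}

// Without the cast, the usual arithmetic conversions convert the int to double.
inline double add_implicit(int variable, double variable2) {
    static_assert(std::is_same<decltype(variable + variable2), double>::value,
                  "int + double has type double");
    return variable + variable2;
}
```

Both functions perform the same conversion followed by the same addition, so they return the same value for the same inputs.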

Possible to force memory alignment on pointer param in C?

I have a function in C which takes a uint8_t * param, which must point to 32-bit aligned memory. Is it possible in C or C++, or with any particular platform's macros, to add some decoration to the parameter, such that the compiler or linker will throw an error at build time if it is not aligned as required?
The idea here is that I want to protect the function against improper use by other users (or me in 6 months). I know how to align the stuff I want to pass to it. I would like to ensure that no one can pass misaligned stuff to it.
Based on this answer, I think the answer to my question is "no", it's not possible to enforce this at build time, but it seems like a useful feature, so I thought I'd check. My work-around is to put assert((((size_t)ptr) % 4) == 0); in the function, so at least I could trap it at runtime when debugging.
In my experience, results are undefined if you cast a misaligned uint8_t* to uint32_t* on many embedded platforms, so I don't want to count on the "correct" result coming out in the end. Plus this is being used on a realtime system, so a slowdown may not be acceptable.
Citations welcome, if there are any.
No, there's nothing in the C or C++ standards that I know of that can force a pointer parameter to hold an appropriate value.
To get the memory, use posix_memalign:
#include <stdlib.h>
int posix_memalign(void **memptr, size_t alignment, size_t size);
DESCRIPTION
The posix_memalign() function shall allocate size bytes aligned on a
boundary specified by alignment, and shall return a pointer to the
allocated memory in memptr. The value of alignment shall be a power of
two multiple of sizeof(void *).
Upon successful completion, the value pointed to by memptr shall be a
multiple of alignment.
For dynamic allocation, have a look at the standard (since C11) aligned_alloc.
For static allocation, I don't know of a standard method, so it'll be compiler dependent. For gcc, for example, check the aligned attribute.
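A sketch of the runtime check from the question, written in C++ and assuming a platform where converting a pointer to std::uintptr_t yields its numeric address (that mapping is implementation-defined, though it holds on mainstream platforms). The names is_aligned and aligned_buffer are hypothetical. The buffer uses alignas, which is one standard option since C++11 (C11 spells it _Alignas):

```cpp
#include <cstdint>

// Hypothetical helper: true if the address in `ptr` is a multiple of `alignment`.
inline bool is_aligned(const void* ptr, std::uintptr_t alignment) {
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// A statically allocated byte buffer that is guaranteed to be 4-byte aligned.
alignas(4) std::uint8_t aligned_buffer[16] = {};
```

A function that requires aligned input could call assert(is_aligned(ptr, 4)) on entry. This still only traps misuse at run time, not at build time.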

C++ Convert Address of Memory To Value? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
In C++, using the iostream, you can print a variable's memory address. For example:
std::cout << &variable << std::endl;
// Example Result: 002AFD84
However, what if I wanted to store this memory address into a variable? Such as converting the memory address to a string or double(or int, etc.)? Or even convert that string or double(or int, etc.) back again, to a memory address?
I'd like to do this for various reasons, one being: This would allow me to return the memory address for data within a DLL to a program calling the DLL and its functions. On top of this, I won't have to keep track of the data itself within the DLL, since the data could then be referenced by its memory address.
I cannot use pointers in this particular situation due to constraints. The constraints being: The interpreted programming language I am using does not have access to pointers. Due to this, pointers cannot be used to reference the data outside of the DLL.
As a side question, what number format do memory addresses use? They always seem to be 8 characters in length, but I can't figure out what format this is.
To convert a pointer into a string representation, you can use a string stream. These are similar to the standard I/O streams std::cin and std::cout, but write to or read from a string rather than performing I/O.
std::ostringstream oss;
oss << &variable;
std::string address = oss.str();
To convert a pointer into an integer that represents the same address, use reinterpret_cast. The type uintptr_t (from <cstdint>), if it exists, is guaranteed to be large enough to hold any object pointer value. Using unsigned long often works too, but it is too small on platforms where long is 32 bits and pointers are 64 bits (such as 64-bit Windows), so prefer uintptr_t.
uintptr_t address = reinterpret_cast<uintptr_t>(&variable);
Converting a pointer into a floating-point type seems fairly useless. You would have to convert into an integral type first, then convert to a floating-point type from there.
If you absolutely need to do this I suggest being selective about what types you convert to. Arbitrarily converting pointers to non-pointer types can be problematic and introduce problems that are difficult to detect. This is especially true if you are using reinterpret_cast to perform the conversions. One of the more common issues is the size of the destination type between various platforms. When you use something like reinterpret_cast you typically don't get warnings about loss of precision during the conversion.
For situations that require you to convert a pointer to an integral type, I suggest wrapping these conversions in a function template. This gives you a bit of flexibility in performing the conversion, and it can perform compile-time size checks to ensure the destination type is large enough to hold the pointer.
Something like the code below might be helpful.
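#include <cstdint>

template<class DestType, class SourceType>
DestType bubblicious_value_cast(const SourceType& src)
{
    // Reject destination types that cannot hold every bit of the source.
    static_assert(sizeof(DestType) >= sizeof(SourceType), "Destination size is too small");
    return reinterpret_cast<DestType>(src);
}

int main()
{
    void* ptr = nullptr;
    // An int destination fails the static_assert where pointers are wider than int.
    std::uintptr_t val = bubblicious_value_cast<std::uintptr_t>(ptr);
    (void)val;
}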
You can use reinterpret_cast like this:
uintptr_t address = reinterpret_cast<uintptr_t>(&variable);
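In a 64-bit (or 32-bit) environment, a memory address is 64 bits (or 32 bits) long, respectively. Addresses are conventionally printed in hexadecimal, so a 32-bit address appears as 8 hex digits.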
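For the DLL scenario in the question, a round trip through an integer or a hex string could be sketched as follows. All four helper names are hypothetical, and the sketch assumes std::uintptr_t is available (it is optional in the standard but present on mainstream platforms):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helpers for handing an address to a caller that has no pointer type.
inline std::uintptr_t to_handle(int* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

inline int* from_handle(std::uintptr_t h) {
    return reinterpret_cast<int*>(h);
}

// Format the integer as hexadecimal text, without a prefix.
inline std::string handle_to_hex(std::uintptr_t h) {
    std::ostringstream oss;
    oss << std::hex << h;
    return oss.str();
}

// Parse hexadecimal text back into the integer handle.
inline std::uintptr_t hex_to_handle(const std::string& s) {
    return static_cast<std::uintptr_t>(std::stoull(s, nullptr, 16));
}
```

The handle is only meaningful inside the process that produced it and only while the object it refers to is still alive, so the DLL must keep the data around for as long as the caller holds the handle.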

Are boolean variables typically implemented as single bits? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
One-byte bool. Why?
I want to add a boolean variable to a class. However, this class is pretty size-sensitive, and as a result I'm loath to add another field. However, it is composed of a pile of members that are at least a char wide, and a single other bool.
If I were hand-writing this code, I would implement those boolean fields as bits in the last byte or so of the object. Since accesses have to be byte-aligned, this would cause no spatial overhead.
Now, do compilers typically do this trick? The only reason I can think of for them not to is that it would involve an additional mask to get that bit out of there.
No, compilers can't do this trick because the address of each member has to be distinct. If you want to pack a fixed number of bits, use std::bitset. If you need a variable number of bits use boost::dynamic_bitset.
No, I don't know of any compilers which optimize a bool down to a bit.
You can force this behavior via:
unsigned int m_firstBit : 1;
unsigned int m_secondBit : 1;
unsigned int m_thirdBit : 1;
As for reasons why not, it would likely violate some language guarantees. For instance, you couldn't pass &myBool to a function which takes a bool* if it doesn't have its own reserved byte.
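Compilers typically do not do that, but you could use std::bitset<2> to pack two bools into one object. Note that sizeof(std::bitset<2>) is implementation-defined and is typically a whole machine word rather than a single byte.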
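A sketch comparing plain bool members with bit-fields, using two hypothetical structs. Bit-field layout is implementation-defined, so the exact sizes are not guaranteed, but on common compilers the two flags in Packed share a single byte:

```cpp
// Each bool gets its own byte so that its address can be taken.
struct Unpacked {
    unsigned char a;
    unsigned char b;
    bool first;
    bool second;
};

// Bit-fields: the two flags can share one byte, at the cost of mask operations on access.
struct Packed {
    unsigned char a;
    unsigned char b;
    bool first : 1;
    bool second : 1;
};

// bool* p = &Packed{}.first;  // would not compile: a bit-field has no address
```

Comparing sizeof(Packed) with sizeof(Unpacked) on the target compiler shows whether the packing actually saves space for a given class.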