Casting a reference to a pointer to a reference to a void* - c++

Is the following well defined behavior?
#include <cstdlib>
#include <cstring>
#include <iostream>
void reallocate_something(void *&source_and_result, size_t size) {
void *dest = malloc(size);
memcpy(dest, source_and_result, size);
free(source_and_result);
source_and_result = dest;
}
void reallocate_something(int *&source_and_result, size_t size) {
// Is the cast safe in this use case?
reallocate_something(reinterpret_cast<void*&>(source_and_result), size);
}
int main() {
const size_t size = 4 * sizeof(int);
int *start = static_cast<int*>(malloc(size));
*start = 0;
std::cout << start << ' ' << *start << '\n';
reallocate_something(start, size);
std::cout << start << ' ' << *start << '\n';
return 0;
}
The code uses a reinterpret_cast to pass a reference to a pointer and re-allocate it, free it, and set it to the new area allocated. Is this well defined?
In particular, a static_cast would work if I did not need to pass a reference, and that would be well defined.
The tag is C++, and I'm asking about this code as-is within the C++ standard.

Is the following well defined behavior?
No, it's not. You can't access an int * object through a void * reference: int * and void * are not similar types. You can convert an int * value to void * and back. Because your function takes a reference, the conversion needs a temporary variable of type void * to hold the converted value, which you then have to assign back, as in the other answer https://stackoverflow.com/a/69641609/9072753 .
Anyway, just make it a template, and write nice C++ code with placement new. Something along these lines:
template<typename T>
void reallocate_something(T *&pnt, size_t cnt) {
T *dest = static_cast<T *>(malloc(cnt * sizeof(T)));
if (dest == NULL) throw ...;
for (size_t i = 0; i < cnt; ++i) {
new (&dest[i]) T(pnt[i]); // placement new takes the address of the slot
}
free(static_cast<void*>(pnt));
pnt = dest;
}
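For reference, here is a compilable sketch of the same idea. The name reallocate_copy, the std::bad_alloc choice and the static_assert are my own assumptions, not part of the answer above; the sketch never runs destructors on the old block, so it restricts itself to trivially destructible types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical helper: copies cnt elements into a fresh malloc'ed block,
// frees the old block and updates the caller's pointer.
template <typename T>
void reallocate_copy(T *&pnt, std::size_t cnt) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "this sketch never runs destructors on the old block");
    void *raw = std::malloc(cnt * sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();  // assumed error policy
    T *dest = static_cast<T *>(raw);
    for (std::size_t i = 0; i < cnt; ++i) {
        // Placement new needs the address of the destination slot.
        ::new (static_cast<void *>(dest + i)) T(pnt[i]);
    }
    std::free(pnt);
    pnt = dest;
}
```

No reinterpret_cast is involved: the only pointer conversions are the well-defined ones between T * and void *.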

I'm not sure, but I believe this is the correct way to do it.
#include <cstdlib>
#include <cstring>
#include <iostream>
void reallocate_something(void *&source_and_result, size_t size) {
void *dest = malloc(size);
memcpy(dest, source_and_result, size);
free(source_and_result);
source_and_result = dest;
}
void reallocate_something(int *&source_and_result, size_t size) {
// Is the cast safe in this use case?
void *temp = static_cast<void*>(source_and_result);
reallocate_something(temp, size);
source_and_result = static_cast<int*>(temp);
}
int main() {
const size_t size = 4 * sizeof(int);
int *start = static_cast<int*>(malloc(size));
*start = 0; // initialize before reading *start below
std::cout << start << ' ' << *start << '\n';
reallocate_something(start, size);
std::cout << start << ' ' << *start << '\n';
return 0;
}

This is not well defined, because void* and int* are not similar types. Refer to the Type aliasing section here.
Note that pointer round trip via void* like below is well defined. Particularly, there is no type aliasing here.
T* pt = ...;
void* p = pt;
auto pt2 = static_cast<T*>(p);
assert(pt2 == pt);
This is different from the following code, which uses type aliasing and is not well defined.
T* pt = ...;
void* p = nullptr;
reinterpret_cast<T*&>(p) = pt; // or *reinterpret_cast<T**>(&p) = pt;
auto pt2 = static_cast<T*>(p);
assert(pt2 == pt);
It follows that your sample code can be revised as below.
void reallocate_something(int *&source_and_result, size_t size) {
void* p = source_and_result;
reallocate_something(p, size);
source_and_result = static_cast<int*>(p);
}
Or better yet
void* reallocate_something(void *source_and_result, size_t size) {
void *dest = malloc(size);
memcpy(dest, source_and_result, size);
free(source_and_result);
return dest;
}
void reallocate_something(int *&source_and_result, size_t size) {
source_and_result = static_cast<int*>(reallocate_something(source_and_result, size));
}

There exist platforms where the bitwise representations of int* and void* are incompatible. On such platforms, it would often be impossible for a compiler to allow a reference of one type to meaningfully act upon an object of the other, and the Standard thus refrains from requiring that implementations do so.
Of course, the vast majority of platforms use the same representation for all PODS pointer types, and when the Standard was written it was obvious to pretty much everyone that (1) it allowed compilers for such platforms to process reference type casts usefully, and (2) compilers should process such casts usefully except when there was a compelling reason to do otherwise. It was expected that the only compiler writers who would care about whether such conversions had defined behavior would be those targeting platforms where such support would be expensive (e.g. requiring that any int* whose address is taken be stored using the same bit pattern as void*, even if that would require shuffling its bits around when using it to fetch an int), and compiler writers were expected to know more about the costs and benefits of such support than the Committee ever could.
Most implementations can be configured to process such casts in the manner that would have been expected when the Standard was written, but the Standard does not mandate such support; when using such configurations, the behavior of the construct should be regarded as defined by a popular language extension.
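To see which kind of platform you are on, you can compare the stored bytes of an int* with those of the void* converted from it. This is only a rough probe: the function name is made up for illustration, and a true result says nothing about whether the reinterpret_cast in the question becomes well defined.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical probe: true when an int* and the void* converted from it
// have the same size and identical bytes on this implementation.
bool same_pointer_representation(int *p) {
    void *v = p;  // implicit conversion, always well defined
    if (sizeof p != sizeof v) return false;
    return std::memcmp(&p, &v, sizeof p) == 0;
}
```

On mainstream flat-address platforms this is expected to return true; on the unusual platforms described above it may not.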

Related

Why does the following result in segmentation fault?

const int* additional(int* s, int* f){
const int* ts = reinterpret_cast<const int*>(*s + *f);
return ts;
}
int main() {
int a = 10, b = 20;
const int* oc = additional(&a, &b);
std::cout << *oc;
return 0;
}
I've tried using static, although it produces the same error.
There are many things wrong with your code.
*s + *f is an int, not a pointer (you add the dereferenced values).
You are doing a reinterpret_cast that isn't needed at all. Just pass the ints directly, without pointers, and you are good to go.
const int additional(int s, int f){
return s + f;
}
int main() {
int a = 10, b = 20;
const int oc = additional(a, b);
std::cout << oc;
return 0;
}
You reinterpret the number 30 as a pointer to const int and attempt to read through the reinterpreted pointer. The operating system noticed that the process was attempting to access an address that wasn't mapped for the process, and sent the segfault signal to terminate the process in order to protect the badly behaving process from itself.
Reinterpret casting is unsafe. Don't use it unless you know what you're doing. And when you know what you're doing, you'll know that it's quite rare to need to use it.
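To make that concrete, here is a small sketch that performs the same cast as the question but only reports the resulting address instead of dereferencing it. The helper name is hypothetical, and the integer-to-pointer mapping is implementation-defined, so the value shown is what mainstream flat-address platforms produce.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds the same pointer as the question's cast and
// returns its address as an integer, without ever reading through it.
std::uintptr_t address_made_from_sum(const int *s, const int *f) {
    const int *ts = reinterpret_cast<const int *>(
        static_cast<std::uintptr_t>(*s + *f));
    return reinterpret_cast<std::uintptr_t>(ts);
}
```

With a = 10 and b = 20 the pointer holds address 30, which is not the address of any int in the program. Reading *oc there is what triggers the fault.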
I was aiming to shorten the int t = *f + *s;
That is already extremely short. The function that you defined is much longer, and so is even a call to the function. Note that the initialiser expression that you quote has type int while your function returns const int*. That, along with the broken reinterpret cast, is the problem.
If you wanted to make the indirection shorter, then how about using references instead of pointers:
const int& f = a;
const int& s = b;
int t = f + s; // shorter

The safe and standard-compliant way of accessing array of integral type as an array of another unrelated integral type?

Here's what I need to do. I'm sure it's a routine and recognizable coding task for many C++ developers out there:
void processAsUint16(const char* memory, size_t size) {
auto uint16_ptr = (const uint16_t*)memory;
for (size_t i = 0, n = size/sizeof(uint16_t); i < n; ++i) {
std::cout << uint16_ptr[i]; // Some processing of the other unrelated type
}
}
Problem: I'm developing with an IDE that integrates clang static code analysis, and every way of casting I tried, short of memcpy (which I don't want to resort to) is either discouraged or strongly discouraged. For example, reinterpret_cast is simply banned by the CPP Core Guidelines. C-style cast is discouraged. static_cast cannot be used here.
What's the right way of doing this that avoids type aliasing problems and other kinds of undefined behavior?
What's the right way of doing this that avoids type aliasing problems and other kinds of undefined behavior?
You use memcpy:
void processAsUint16(const char* memory, size_t size) {
for (size_t i = 0; i + sizeof(uint16_t) <= size; i += sizeof(uint16_t)) { // never read past the end when size is odd
uint16_t x;
memcpy(&x, memory + i, sizeof(x));
// do something with x
}
}
uint16_t is trivially copyable, so this is fine.
Or, in C++20, with std::bit_cast (which awkwardly has to go through an array first):
void processAsUint16(const char* memory, size_t size) {
for (size_t i = 0; i + sizeof(uint16_t) <= size; i += sizeof(uint16_t)) { // never read past the end when size is odd
alignas(uint16_t) char buf[sizeof(uint16_t)];
memcpy(buf, memory + i, sizeof(buf));
auto x = std::bit_cast<uint16_t>(buf);
// do something with x
}
}
Practically speaking, compilers will just "do the right thing" if you just reinterpret_cast, even if it's undefined behavior. Perhaps something like std::bless will give us a more direct, non-copying, mechanism of doing this, but until then...
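If you want the converted values rather than in-loop processing, the memcpy approach can be wrapped up roughly like this. readAsUint16 is a name I made up; the values come out in the machine's native byte order, and a trailing odd byte is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copies each complete 2-byte group out of the buffer.
std::vector<std::uint16_t> readAsUint16(const char *memory, std::size_t size) {
    std::vector<std::uint16_t> out;
    out.reserve(size / sizeof(std::uint16_t));
    for (std::size_t i = 0; i + sizeof(std::uint16_t) <= size;
         i += sizeof(std::uint16_t)) {
        std::uint16_t x;
        std::memcpy(&x, memory + i, sizeof x);  // no aliasing or alignment issue
        out.push_back(x);
    }
    return out;
}
```

Compilers typically turn the fixed-size memcpy into a plain load, so the copy is usually not a real cost.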
My preference would be to treat the array of char as a sequence of octets in a defined order. This obviously doesn't work if it actually can be either order depending on target architecture, but in practice, a memory buffer like this usually comes from a file or a network connection.
void processAsUint16(const char* memory, size_t size) {
for (size_t i = 0; i + 1 < size; i += 2) { // stop before a trailing odd byte
const unsigned char lo = memory[i];
const unsigned char hi = memory[i+1];
const uint16_t x = lo + hi*256; // or "lo | hi << 8"
// do something with x
}
}
Note that we do not use sizeof(uint16_t) here. memory is a sequence of octets, so even if CHAR_BIT is 16, there will be two chars needed to hold a uint16_t.
This can be a little bit cleaner if memory can be declared as unsigned char - no need for the definition of lo and hi.
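A sketch of that unsigned char variant, assuming the buffer is little-endian (low octet first). The function name and the choice to return a vector are mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes pairs of octets as little-endian 16-bit values.
std::vector<std::uint16_t> decodeLittleEndian16(const unsigned char *memory,
                                                std::size_t size) {
    std::vector<std::uint16_t> out;
    for (std::size_t i = 0; i + 1 < size; i += 2) {
        // Mask to 8 bits so each char is treated as an octet even if
        // CHAR_BIT is larger than 8.
        const unsigned lo = memory[i] & 0xFFu;
        const unsigned hi = memory[i + 1] & 0xFFu;
        out.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return out;
}
```

The result is the same on big-endian and little-endian hosts, because the byte order is fixed by the code rather than by the machine.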

C++11: reinterpreting array of structs as array of struct's member

Consider the following type:
struct S
{
char v;
};
Given an array of const S, is it possible to, in a standard conformant way, reinterpret it as an array of const char whose elements correspond to the value of the member v for each of the original array's elements, and vice-versa? For example:
const S a1[] = { {'a'}, {'4'}, {'2'}, {'\0'} };
const char* a2 = reinterpret_cast< const char* >(a1);
for (int i = 0; i < 4; ++i)
std::cout << std::boolalpha << (a1[i].v == a2[i]) << ' ';
Is the code above portable and would it print true true true true? If not, is there any other way of achieving this?
Obviously, it is possible to create a new array and initialize it with the member v of each element of the original array, but the whole idea is to avoid creating a new array.
Trivially, no - the struct may have padding. And that flat out breaks any reinterpretation as an array.
Formally the struct may have padding so that its size is greater than 1.
I.e., formally you can't reinterpret_cast and have fully portable code, except for ¹an array of only one item.
But as for practice: some years ago someone asked if there was now any compiler that by default would give sizeof(T) > 1 for struct T{ char x; };. I have yet to see any example. So in practice one can just static_assert that the size is 1, and not worry at all that this static_assert will fail on some system.
I.e.,
S const a1[] = { {'a'}, {'4'}, {'2'}, {'\0'} };
static_assert( sizeof( S ) == 1, "!" );
char const* const a2 = reinterpret_cast<char const*>( a1 );
for( int i = 0; i < 4; ++i )
{
assert( a1[i].v == a2[i] );
}
Since it's possible to interpret the C++14 and later standards in a way where the indexing has Undefined Behavior, based on a peculiar interpretation of "array" as referring to some original array, one might instead write this code in a more awkward and verbose but guaranteed valid way:
// I do not recommend this, but it's one way to avoid problems with some compiler that's
// based on an unreasonable, impractical interpretation of the C++14 standard.
#include <assert.h>
#include <new>
auto main() -> int
{
struct S
{
char v;
};
int const compiler_specific_overhead = 0; // Redefine per compiler.
// With value 0 for the overhead the internal workings here, what happens
// in the machine code, is the same as /without/ this verbose work-around
// for one impractical interpretation of the standard.
int const n = 4;
static_assert( sizeof( S ) == 1, "!" );
char storage[n + compiler_specific_overhead];
S* const a1 = ::new( storage ) S[n];
assert( (void*)a1 == storage + compiler_specific_overhead );
for( int i = 0; i < n; ++i ) { a1[i].v = "a42"[i]; } // Whatever
// Here a2 points to items of the original `char` array, hence no indexing
// UB even with impractical interpretation of the C++14 standard.
// Note that the indexing-UB-free code from this point, is exactly the same
// source code as the first code example that some claim has indexing UB.
char const* const a2 = reinterpret_cast<char const*>( a1 );
for( int i = 0; i < n; ++i )
{
assert( a1[i].v == a2[i] );
}
}
Notes:
¹ The standard guarantees that there's no padding at the start of the struct.
The pointer arithmetic in a2[i] is undefined, see C++14 5.7 [expr.add] p7:
For addition or subtraction, if the expressions P or Q have type "pointer to cv T", where T and the array element type are not similar (4.5), the behavior is undefined. [ Note: In particular, a pointer to a base class cannot be used for pointer arithmetic when the array contains objects of a derived class type. — end note ]
Because of this rule, even if there is no padding and the sizes match, type-based alias analysis allows the compiler to assume that a1[i] and a2[i] do not overlap (because the pointer arithmetic is only valid if a2 really is an array of char not just something with the same size and alignment, and if it's really an array of char it must be a separate object from an array of S).
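If the goal is only to read the v members without creating a second array, one way to sidestep the pointer arithmetic question entirely is a small non-owning view that indexes the S array and returns the member. CharView is a hypothetical name, and this gives you indexed access rather than a real const char*.

```cpp
#include <cassert>
#include <cstddef>

struct S {
    char v;
};

// Hypothetical non-owning view: all arithmetic is done on const S*, which
// matches the real element type of the array.
class CharView {
public:
    CharView(const S *data, std::size_t size) : data_(data), size_(size) {}
    char operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i].v;
    }
    std::size_t size() const { return size_; }

private:
    const S *data_;
    std::size_t size_;
};
```

This cannot be handed to an API that expects a const char*, so it only helps when you control the consuming code.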
I think I'd be inclined to use a compile-time transformation if the source data is constant:
#include <iostream>
#include <array>
struct S
{
char v;
};
namespace detail {
template<std::size_t...Is>
constexpr auto to_cstring(const S* p, std::index_sequence<Is...>)
{
return std::array<char, sizeof...(Is)> {
p[Is].v...
};
}
}
template<std::size_t N>
constexpr auto to_cstring(const S (&arr)[N])
{
return detail::to_cstring(arr, std::make_index_sequence<N>());
}
int main()
{
const /*expr if you wish*/ S a1[] = { {'a'}, {'4'}, {'2'}, {'\0'} };
const /*expr if you wish*/ auto a2 = to_cstring(a1);
for (int i = 0; i < 4; ++i)
std::cout << std::boolalpha << (a1[i].v == a2[i]) << ' ';
}
output:
true true true true
Even when the data is not constexpr, gcc and clang are pretty good at constant folding complex sequences like this.

Alternative notation to *(ptr)[j++]?

When passing a double pointer to a function, I used the notation *ptr[j++] in my function, which led the program to crash. I guessed it happened due to operator precedence, so I rectified it by writing (*ptr)[j++], but I didn't like this notation. It feels long and confusing.
I also know of the notation ptr[0][j++], but I don't like it either. Is there any better notation or approach around all of this?
My code:
#include <iostream>
using namespace std;
void mset(int **ptr, size_t size);
void main(void)
{
const size_t size = 10;
int *ptr = new int[size];
mset(&ptr, size);
for (size_t n = 0; n < size; n++) {
std::cout << ptr[n] << std::endl;
}
}
void mset(int **ptr, size_t size)
{
size_t j = 0;
while(j < size)
(*ptr)[j++] = 3;
}
P.S I know that I can write void mset(int *ptr, size_t size) and invoke mset(ptr, size), but I am asking about that particular case.
Simply use an extra local variable that holds the dereferenced pointer, something like:
auto p = *ptr;
for (size_t j = 0; j < size; ++j)
p[j] = 3;
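As for why the original *ptr[j++] crashed: subscripting binds tighter than unary *, so it parses as *(ptr[j++]). In the question's mset, ptr points at a single int*, so ptr[1] is already out of bounds. A small sketch of the difference, with made-up function names, using a real array of pointers so that both forms are valid:

```cpp
#include <cassert>

// *ptr[j] is parsed as *(ptr[j]): pick the j-th pointer, then dereference it.
int star_of_subscript(int **ptr, int j) { return *ptr[j]; }

// (*ptr)[j]: dereference ptr first, then pick the j-th int it points at.
int subscript_of_star(int **ptr, int j) { return (*ptr)[j]; }
```

With rows = {row0, row1}, row0 = {10, 11} and row1 = {20, 21}, the first form with j = 1 reads row1[0], while the second reads row0[1].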
For this particular case, inside mset(int **ptr, size_t size) you never use ptr as it is, only dereferenced (i.e. *ptr). Hence, I would recommend passing the pointer by reference.
mset(ptr, size);
// ^^^ pass simply
void mset(int* const &ptr, const size_t size)
{ // ^^^ pointer reference which cannot change (prevents memory leak)
size_t j = 0;
while(j < size)
ptr[j++] = 3; // No dereferencing required
}
You may remove the reference as well, because you never intend to change the value of ptr in mset(). But passing as above is also fine, and lets you use the same ptr from main().
BTW, a standard compliant main() always returns int.
Declare the level one pointer (int**, and not int*) as const, and the compiler will raise an error:
void mset(int const** ptr, size_t size)
{
size_t j = 0;
while (j < size)
*ptr[j++] = 3; // ERROR
}
You are sure that the base pointer isn't going to change, and you don't want the function to change its contents, so make it const. However, multiple levels of pointers do cause problems; it is better to use references, vectors, templates and other better/newer alternatives.
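As a sketch of the vector alternative (the signature with a value parameter is my own choice, not something from the question): the container carries its size, so there is no double pointer and no precedence trap.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sets every element of the vector to the given value.
void mset(std::vector<int> &values, int value) {
    std::fill(values.begin(), values.end(), value);
}
```

Usage would be std::vector<int> v(10); mset(v, 3); which also removes the need for a matching delete[].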

Casting void pointers, depending on data (C++)

Basically what I want to do is, depending on the some variable, to cast a void pointer into a different datatype. For example (the 'cast' variable is just something in order to get my point across):
void* ptr = some data;
int temp = some data;
int i = 0;
...
if(temp == 32) cast = (uint32*)
else if(temp == 16) cast = (uint16*)
else cast = (uint8*)
i = someArray[*((cast)ptr)];
Is there anything in C++ that can do something like this (since you can't actually assign a variable to be just (uint32*) or something similar)? I apologize if this isn't clear, any help would be greatly appreciated.
The "correct" way:
union MyUnion
{
uint32 asUint32;
uint16 asUint16;
uint8 asUint8;
};
uint32 to_index(int size, MyUnion* ptr)
{
if (size == 32) return ptr->asUint32;
if (size == 16) return ptr->asUint16;
return ptr->asUint8; // remaining case, matching the question's final else
}
i = someArray[to_index(temp, static_cast<MyUnion*>(ptr))];
Clearly, boost::variant is the way to go. It already stores a type-tag that makes it impossible for you to cast to the wrong type, ensuring this using the help of the compiler. Here is how it works
typedef boost::variant<uint32_t*, uint16_t*, uint8_t*> v_type;
// this will get a 32bit value, regardless of what is contained. Never overflows
struct PromotingVisitor : boost::static_visitor<uint32_t> {
template<typename T> uint32_t operator()(T* t) const { return *t; }
};
v_type v(some_ptr); // may be either of the three pointers
// automatically figures out what pointer is stored, calls operator() with
// the correct type, and returns the result as an uint32_t.
int i = someArray[boost::apply_visitor(PromotingVisitor(), v)];
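If Boost is not available, roughly the same thing can be written with std::variant and std::visit from C++17. This is a sketch under that assumption; PtrVariant and promote are names I picked.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// The variant remembers which pointer type it holds.
using PtrVariant = std::variant<const std::uint32_t *,
                                const std::uint16_t *,
                                const std::uint8_t *>;

// Dereferences whichever pointer is stored and widens the value to 32 bits.
std::uint32_t promote(const PtrVariant &v) {
    return std::visit(
        [](const auto *p) -> std::uint32_t { return *p; }, v);
}
```

As with the Boost version, the stored alternative decides which dereference happens, so there is no separate size tag to keep in sync.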
A cleaner solution:
uint32 to_index(int temp, void* ptr) {
if (temp == 32) return *((uint32*)ptr);
if (temp == 16) return *((uint16*)ptr);
if (temp == 8) return *((uint8*)ptr);
assert(0);
}
i = someArray[to_index(temp,ptr)];
It sounds like maybe you're after a union, or if you're using Visual Studio a _variant_t. Or maybe typeinfo() would be helpful? (To be honest, I'm not quite sure exactly what you're trying to do).
As far as the casts, you can cast just about anything to anything -- that's what makes C++ dangerous (and powerful if you're really careful).
Also, note that pointer values are 32-bit or 64-bit in most platforms, so you couldn't store a uint64 in a void* on a 32-bit platform.
Finally, maybe this is what you want:
void* p = whatever;
uint32 x = (uint32)p;
or maybe
uint32 source = 6;
void* p = &source;
uint32 dest = *((uint32*)p);
If you were locked into using a void ptr, and absolutely needed to call [] with different types:
template <typename cast_to>
inline
int get_int_helper(someArray_t someArray, void* ptr) {
return someArray[*static_cast<cast_to*>(ptr)];
}
int get_int(someArray_t someArray, void* ptr, int temp) {
switch ( temp ) {
case 32: return get_int_helper<uint32>(someArray,ptr);
case 16: return get_int_helper<uint16>(someArray,ptr);
default: return get_int_helper<uint8>(someArray,ptr);
}
}
However, as others have pointed out, there are probably better ways to do it. Most likely, whatever array you have doesn't have multiple operator[] overloads, so it doesn't need the different types. In addition, you could use boost::variant to hold a discriminated union of the types, so you wouldn't have to pass temp around.
It seems you want to store the "cast" function that takes a void* and produces an unsigned integer. So, make it a function:
std::map<int, boost::function<unsigned(void*)> > casts; // boost::function takes a function signature, not a function pointer type
template <typename T> unsigned cast(void* v) { return *(T*)v; }
casts[32] = cast<uint32>;
casts[16] = cast<uint16>;
casts[8] = cast<uint8>;
casts[128] = MySpecialCastFromDouble;
void* foo = getFoo();
unsigned bar = casts[16](foo);
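A self-contained sketch of the same lookup-table idea using std::function from the standard library instead of Boost. load_as and make_casts are hypothetical names, and I read the value with memcpy so the pointer does not have to be suitably aligned for the target type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <map>

// Reads a T from the pointed-to bytes and widens it to unsigned.
template <typename T>
unsigned load_as(const void *v) {
    T value;
    std::memcpy(&value, v, sizeof value);
    return static_cast<unsigned>(value);
}

using Loader = std::function<unsigned(const void *)>;

// Builds the table that maps a bit width to the matching reader.
std::map<int, Loader> make_casts() {
    std::map<int, Loader> casts;
    casts[32] = load_as<std::uint32_t>;
    casts[16] = load_as<std::uint16_t>;
    casts[8] = load_as<std::uint8_t>;
    return casts;
}
```

Using casts.at(width) rather than casts[width] at the call site makes an unknown width throw instead of silently inserting an empty function.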