Is there a way to avoid the strict aliasing warning?

Is there a way to avoid the strict aliasing warning? - c++

I'm using a library (ENet), which uses callbacks. In those callback functions, it passes a struct which contains a void* for user data, for your own use. I'd like to use that variable, but not as a pointer. I don't want to allocate memory just so I can point to it, I'd rather use the space of the void* directly to store a size_t.
But, as expected, when I cast the void* variable to a size_t variable, I get a strict alias warning. And the callback's struct doesn't provide a union to access it as something other than a void*.
I know I can disable that warning completely, but I'd rather just silence it for this particular case. Is there a way to write a cast of this sort that lets the compiler know it is intentional, to avoid the warning?
Edit:
Here is an example of what I'm trying to do. Since I need to be able to edit the user value, I'm casting it to size_t while also trying to grab a reference to it:
size_t& offset = reinterpret_cast<size_t&>(packet->userData);
This works, but gives the warning.

But, as expected, when I dereference the void* variable to a size_t variable, I get a strict alias warning.
If you want to use the void * itself just to transport a plain integer, you don't want to dereference it, you want to cast it to an appropriate integral type (intptr_t is your best bet, as the standard guarantees that it can survive to a roundtrip through void *).
intptr_t magic=42;
register_callback(&myfunc, (void *)magic);
// ...
void myfunc(void *context)
{
intptr_t magic=(intptr_t)context;
// ...
}
(if you like C++-style casts, those would all be reinterpret_cast)
Besides, you are probably doing something weird in your code, because void * (like char * and unsigned char *) is not subjected to the strict aliasing rule (they can alias any other pointer).
Update
Here is an example of what I'm trying to do. Since I need to be able to edit the user value, I'm casting it to size_t while also trying to grab a reference to it:
size_t& offset = reinterpret_cast<size_t&>(packet->userData);
This works, but gives the warning.
Nope, even assuming that size_t and void * had the same size and alignment requirements (which is not guaranteed), this cannot be done portably; aliasing a void *& with a size_t & is not allowed (also, this is particularly devious because it's hidden in a reference).
If you really need to do this, you have to come to terms with your compiler; in gcc, for example, you could compile just the file where you have this thing with -fno-strict-aliasing, which, instead of simply disabling the warning (=hiding the potential problem under the carpet), disables the compiler assumptions about strict aliasing, so that the generated code works correctly even if pointers/references to unrelated types point to the same stuff.

Related

Why pass a pointer as a (char ) and cast to a (long )

I know legacy is always a justification, but I wanted to check out this example from MariaDB and see if I understand it enough to critique what's going on,
static int show_open_tables(THD *, SHOW_VAR *var, char *buff) {
var->type = SHOW_LONG;
var->value = buff;
*((long *)buff) = (long)table_cache_manager.cached_tables();
return 0;
}
Here they're taking in char* and they're writing it to var->value which is also a char*. Then they force a pointer to a long in the buff and set the type to a SHOW_LONG to indicate it as such.
I'm wondering why they would use a char* for this though and not a uintptr_t -- especially being when they're forcing pointers to longs and other types in it.
Wasn't the norm pre-uintptr_t to use void* for polymorphism in C++?

There seems to be two questions here. So I've split my answer up.
Using char*
Using a char* is fine. Character types (char, signed char, and unsigned char) are specially treated by the C and C++ standards. The C standard defines the following rules for accessing an object:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
This effectively means character types are the closest the standards come to defining a 'byte' type (std::byte in C++17 is just defined as enum class byte : unsigned char {})
However, as per the above rules casting a char* to a long* and then assigning to it is incorrect (although generally works in practice). memcpy should be used instead. For example:
long cached_tables = table_cache_manager.cached_tables();
memcpy(buf, &cached_tables, sizeof(cached_tables));
void* would also be a legitimate choice. Whether it is better is a mater of opinion. I would say the clearest option would be to add a type alias for char to convey the intent to use it as a byte type (e.g. typedef char byte_t). Of the top of my head though I can think of several examples of prominent libraries which use char as is, as a byte type. For example, the Boost memory mapped file code gives a char* and leveldb uses std::string as a byte buffer type (presumably to taking advantage of SSO).
Regarding uinptr_t:
uintptr_t is an optional type defined as an unsigned integer capable of holding a pointer. If you want to store the address of a pointed-to object in an integer, then it is a suitable type to use. It is not a suitable type to use here.

they're taking in char* and they're writing it to var->value which is also a char*. Then they force a pointer to a long in the buff and set the type to a SHOW_LONG to indicate it as such.
Or something. That code is hideous.
I'm wondering why they would use a char* for this though and not a uintptr_t -- especially being when they're forcing pointers to longs and other types in it.
Who knows? Who knows what the guy was on when he wrote it? Who cares? That code is hideous, we certainly shouldn't be trying to learn from it.
Wasn't the norm pre-uintptr_t to use void* for polymorphism in C++?
Yes, and it still is. The purpose of uintptr_t is to define an integer type that is big enough to hold a pointer.
I wanted to check out this example from MariaDB and see if I understand it enough to critique what's going on
You might have reservations about doing so but I certainly don't, that API is just a blatant lie. The way to do it (if you absolutely have to) would (obviously) be:
static int show_open_tables(THD *, SHOW_VAR *var, long *buff) {
var->type = SHOW_LONG;
var->value = (char *) buff;
*buff = (long)table_cache_manager.cached_tables();
return 0;
}
Then at least it is no longer a ticking time bomb.
Hmmm, OK, maybe (just maybe) that function is used in a dispatch table somewhere and therefore needs (unless you cast it) to have a specific signature. If so, I'm certainly not going to dig through 10,000 lines of code to find out (and anyway, I can't, it's so long it crashes my tablet).
But if anything, that would just make it worse. Now that timebomb has become a stealth bomber. And anyway, I don't believe it's that for a moment. It's just a piece of dangerous nonsense.

About keyword “const” in c++

According to the c++ grammar, const int* const p means that what p points to and it' value can't be rewritten.But today I found that if I code like this:
void f(const int* const p)
{
char* ch = (char*) p;
int* q = (int*) ch;
(*q) = 3; //I can modify the integer that p points to
}
In this condition,the keyword "const" will lose it's effect.Is there any significance to use "const"?

You are casting away the constness here:
char* ch = (char*) p;
Effectively, you are saying "I know what I am doing, forget you are const, I accept the consequences." C++ allows you to do stuff like this because sometimes it can be useful/necessary. But it is fraught with danger.
Note that if the argument passed to the function were really const, then your code would result in undefined behaviour (UB). And you have no way of knowing from inside the function.
Note also that in C++ it is preferable to make your intent clear,
int* pi = const_cast<int*>(p);
This makes it clear that your intention is to cast away the const. It is also easier to search for. The same caveats about danger and UB apply.

Your example will crash the app if const int* const p points to a compile time constant, when casting away constancy you need to be sure what you are doing.
C++ is a language for adults, extra power will never be sacrificed for ephemeral safety, it is the code author's choice whether to stay in a safe zone or to use cast operators and move one step closer to C/asm.

C/C++ will let you do many things that allow you to 'hurt' yourself. Here, casting away the const of p is "legal" because the compiler assumes you know what you are doing. Whether this is good coding style or not is another matter.
When you do something like this, you assume responsibility for any side effects and issues it could create. Here, if the memory pointed to in the parameter is static memory, the program will likely crash.
In short, just because you can do something doesn't mean it is a good idea.

The const keyword is a way to use the compiler to catch programming mistakes, nothing more. It makes no claims about the mutability of memory at runtime, only that the compiler should shout at you if you directly modify the value (without using C-style casts).
A C-style cast is pretty much a marker saying 'I know what I'm doing here'. Which in most instances is false.

Here you change the type of the pointer. Using such a cast (C-type) cast you can change it to any possible pointer with no problem.
The same way you can use c++ cast: const_cast<...>(...):
int* p_non_const = const_cast<int*>(p);
In this case (I hope) you see immediately what is happening - you simply rid of the constness.
Note that in your program you also don't need temprorary variable ch. You can do it directly:
int* q = (int*) p;
In principle you should not use such a cast, because correctly designed and written program doesn't need it. Usually the const_cast is used for quick and temporary changes in the program (to check something) or to use a function from some unproperly designed library.

What is the meaning of this?

Code:
void *buff;
char *r_buff = (char *)buff;
I can't understand the type casting of buff. Please help.
Thank you.

buff is a pointer to some memory, where the type of its content is unspecified (hence the void).
The second line tells that r_buff shall point to the same memory location, and the contents shall be interpreted as char(s).

buff is typed as a void pointer, which means it points to memory without declaring anything about the contents.
When you cast to char *, you declare that you're interpreting the pointer as being a char pointer.

In well written C++, you should not use C-style casts. So your cast should look like this:
void *buff;
char *r_buff = static_cast<char *>(buff);
See here for an explanation of what the C++ casting operators do.

By its name, buff is likely to be a memory buffer in which to write data, possibly binary data.
There are reasons why one might want to cast it to char *, potentially to use pointer arithmetic on it as one is writing because you cannot do that with a void*.
For example if you are supplied also a size (likely) and your API requires not pointer and size but 2 pointers (begin and end) you will need pointer arithmetic to determine where the end is.
The code could well be C in which case the cast is correct. If the code is C++ though a static_cast is preferable although the C cast is not incorrect in this instance. The reason a static_cast is generally preferred is that the compiler will catch more occasions when you cast incorrectly that way, and it is also more easily greppable. However casting in general breaks type-safety rules and therefore is preferably avoided much of the time. (Not that it is never correct, as it may be here).

"dereferencing type-punned pointer will break strict-aliasing rules" warning

I use a code where I cast an enum* to int*. Something like this:
enum foo { ... }
...
foo foobar;
int *pi = reinterpret_cast<int*>(&foobar);
When compiling the code (g++ 4.1.2), I get the following warning message:
dereferencing type-punned pointer will break strict-aliasing rules
I googled this message, and found that it happens only when strict aliasing optimization is on. I have the following questions:
If I leave the code with this warning, will it generate potentially wrong code?
Is there any way to work around this problem?
If there isn't, is it possible to turn off strict aliasing from inside the source file (because I don't want to turn it off for all source files and I don't want to make a separate Makefile rule for this source file)?
And yes, I actually need this kind of aliasing.

In order:
Yes. GCC will assume that the pointers cannot alias. For instance, if you assign through one then read from the other, GCC may, as an optimisation, reorder the read and write - I have seen this happen in production code, and it is not pleasant to debug.
Several. You could use a union to represent the memory you need to reinterpret. You could use a reinterpret_cast. You could cast via char * at the point where you reinterpret the memory - char * are defined as being able to alias anything. You could use a type which has __attribute__((__may_alias__)). You could turn off the aliasing assumptions globally using -fno-strict-aliasing.
__attribute__((__may_alias__)) on the types used is probably the closest you can get to disabling the assumption for a particular section of code.
For your particular example, note that the size of an enum is ill defined; GCC generally uses the smallest integer size that can be used to represent it, so reinterpreting a pointer to an enum as an integer could leave you with uninitialised data bytes in the resulting integer. Don't do that. Why not just cast to a suitably large integer type?

You could use the following code to cast your data:
template<typename T, typename F>
struct alias_cast_t
{
union
{
F raw;
T data;
};
};
template<typename T, typename F>
T alias_cast(F raw_data)
{
alias_cast_t<T, F> ac;
ac.raw = raw_data;
return ac.data;
}
Example usage:
unsigned int data = alias_cast<unsigned int>(raw_ptr);

But why are you doing this? It will break if sizeof(foo) != sizeof(int). Just because an enum is like an integer does not mean it is stored as one.
So yes, it could generate "potentially" wrong code.

Have you looked into this answer ?
The strict aliasing rule makes this
setup illegal, two unrelated types
can't point to the same memory. Only
char* has this privilege.
Unfortunately you can still code this
way, maybe get some warnings, but have
it compile fine.

Strict aliasing is a compiler option, so you need to turn it off from the makefile.
And yes, it can generate incorrect code. The compiler will effectively assume that foobar and pi aren't bound together, and will assume that *pi won't change if foobar changed.
As already mentioned, use static_cast instead (and no pointers).

Needless pointer-casts in C

I got a comment to my answer on this thread:
Malloc inside a function call appears to be getting freed on return?
In short I had code like this:
int * somefunc (void)
{
int * temp = (int*) malloc (sizeof (int));
temp[0] = 0;
return temp;
}
I got this comment:
Can I just say, please don't cast the
return value of malloc? It is not
required and can hide errors.
I agree that the cast is not required in C. It is mandatory in C++, so I usually add them just in case I have to port the code in C++ one day.
However, I wonder how casts like this can hide errors. Any ideas?
Edit:
Seems like there are very good and valid arguments on both sides. Thanks for posting, folks.

It seems fitting I post an answer, since I left the comment :P
Basically, if you forget to include stdlib.h the compiler will assume malloc returns an int. Without casting, you will get a warning. With casting you won't.
So by casting you get nothing, and run the risk of suppressing legitimate warnings.
Much is written about this, a quick google search will turn up more detailed explanations.
edit
It has been argued that
TYPE * p;
p = (TYPE *)malloc(n*sizeof(TYPE));
makes it obvious when you accidentally don't allocate enough memory because say, you thought p was TYPe not TYPE, and thus we should cast malloc because the advantage of this method overrides the smaller cost of accidentally suppressing compiler warnings.
I would like to point out 2 things:
you should write p = malloc(sizeof(*p)*n); to always ensure you malloc the right amount of space
with the above approach, you need to make changes in 3 places if you ever change the type of p: once in the declaration, once in the malloc, and once in the cast.
In short, I still personally believe there is no need for casting the return value of malloc and it is certainly not best practice.

This question is tagged both for C and C++, so it has at least two answers, IMHO:
C
Ahem... Do whatever you want.
I believe the reason given above "If you don't include "stdlib" then you won't get a warning" is not a valid one because one should not rely on this kind of hacks to not forget to include an header.
The real reason that could make you not write the cast is that the C compiler already silently cast a void * into whatever pointer type you want, and so, doing it yourself is overkill and useless.
If you want to have type safety, you can either switch to C++ or write your own wrapper function, like:
int * malloc_Int(size_t p_iSize) /* number of ints wanted */
{
return malloc(sizeof(int) * p_iSize) ;
}
C++
Sometimes, even in C++, you have to make profit of the malloc/realloc/free utils. Then you'll have to cast. But you already knew that. Using static_cast<>() will be better, as always, than C-style cast.
And in C, you could override malloc (and realloc, etc.) through templates to achieve type-safety:
template <typename T>
T * myMalloc(const size_t p_iSize)
{
return static_cast<T *>(malloc(sizeof(T) * p_iSize)) ;
}
Which would be used like:
int * p = myMalloc<int>(25) ;
free(p) ;
MyStruct * p2 = myMalloc<MyStruct>(12) ;
free(p2) ;
and the following code:
// error: cannot convert ‘int*’ to ‘short int*’ in initialization
short * p = myMalloc<int>(25) ;
free(p) ;
won't compile, so, no problemo.
All in all, in pure C++, you now have no excuse if someone finds more than one C malloc inside your code...
:-)
C + C++ crossover
Sometimes, you want to produce code that will compile both in C and in C++ (for whatever reasons... Isn't it the point of the C++ extern "C" {} block?). In this case, C++ demands the cast, but C won't understand the static_cast keyword, so the solution is the C-style cast (which is still legal in C++ for exactly this kind of reasons).
Note that even with writing pure C code, compiling it with a C++ compiler will get you a lot more warnings and errors (for example attempting to use a function without declaring it first won't compile, unlike the error mentioned above).
So, to be on the safe side, write code that will compile cleanly in C++, study and correct the warnings, and then use the C compiler to produce the final binary. This means, again, write the cast, in a C-style cast.

One possible error it can introduce is if you are compiling on a 64-bit system using C (not C++).
Basically, if you forget to include stdlib.h, the default int rule will apply. Thus the compiler will happily assume that malloc has the prototype of int malloc(); On Many 64-bit systems an int is 32-bits and a pointer is 64-bits.
Uh oh, the value gets truncated and you only get the lower 32-bits of the pointer! Now if you cast the return value of malloc, this error is hidden by the cast. But if you don't you will get an error (something to the nature of "cannot convert int to T *").
This does not apply to C++ of course for 2 reasons. Firstly, it has no default int rule, secondly it requires the cast.
All in all though, you should just new in c++ code anyway :-P.

Well, I think it's the exact opposite - always directly cast it to the needed type. Read on here!

The "forgot stdlib.h" argument is a straw man. Modern compilers will detect and warn of the problem (gcc -Wall).
You should always cast the result of malloc immediately. Not doing so should be considered an error, and not just because it will fail as C++. If you're targeting a machine architecture with different kinds of pointers, for example, you could wind up with a very tricky bug if you don't put in the cast.
Edit: The commentor Evan Teran is correct. My mistake was thinking that the compiler didn't have to do any work on a void pointer in any context. I freak when I think of FAR pointer bugs, so my intuition is to cast everything. Thanks Evan!

Actually, the only way a cast could hide an error is if you were converting from one datatype to an smaller datatype and lost data, or if you were converting pears to apples. Take the following example:
int int_array[10];
/* initialize array */
int *p = &(int_array[3]);
short *sp = (short *)p;
short my_val = *sp;
in this case the conversion to short would be dropping some data from the int. And then this case:
struct {
/* something */
} my_struct[100];
int my_int_array[100];
/* initialize array */
struct my_struct *p = &(my_int_array[99]);
in which you'd end up pointing to the wrong kind of data, or even to invalid memory.
But in general, and if you know what you are doing, it's OK to do the casting. Even more so when you are getting memory from malloc, which happens to return a void pointer which you can't use at all unless you cast it, and most compilers will warn you if you are casting to something the lvalue (the value to the left side of the assignment) can't take anyway.

#if CPLUSPLUS
#define MALLOC_CAST(T) (T)
#else
#define MALLOC_CAST(T)
#endif
...
int * p;
p = MALLOC_CAST(int *) malloc(sizeof(int) * n);
or, alternately
#if CPLUSPLUS
#define MYMALLOC(T, N) static_cast<T*>(malloc(sizeof(T) * N))
#else
#define MYMALLOC(T, N) malloc(sizeof(T) * N)
#endif
...
int * p;
p = MYMALLOC(int, n);

People have already cited the reasons I usually trot out: the old (no longer applicable to most compilers) argument about not including stdlib.h and using sizeof *p to make sure the types and sizes always match regardless of later updating. I do want to point out one other argument against casting. It's a small one, but I think it applies.
C is fairly weakly typed. Most safe type conversions happen automatically, and most unsafe ones require a cast. Consider:
int from_f(float f)
{
return *(int *)&f;
}
That's dangerous code. It's technically undefined behavior, though in practice it's going to do the same thing on nearly every platform you run it on. And the cast helps tell you "This code is a terrible hack."
Consider:
int *p = (int *)malloc(sizeof(int) * 10);
I see a cast, and I wonder, "Why is this necessary? Where is the hack?" It raises hairs on my neck that there's something evil going on, when in fact the code is completely harmless.
As long as we're using C, casts (especially pointer casts) are a way of saying "There's something evil and easily breakable going on here." They may accomplish what you need accomplished, but they indicate to you and future maintainers that the kids aren't alright.
Using casts on every malloc diminishes the "hack" indication of pointer casting. It makes it less jarring to see things like *(int *)&f;.
Note: C and C++ are different languages. C is weakly typed, C++ is more strongly typed. The casts are necessary in C++, even though they don't indicate a hack at all, because of (in my humble opinion) the unnecessarily strong C++ type system. (Really, this particular case is the only place I think the C++ type system is "too strong," but I can't think of any place where it's "too weak," which makes it overall too strong for my tastes.)
If you're worried about C++ compatibility, don't. If you're writing C, use a C compiler. There are plenty really good ones avaliable for every platform. If, for some inane reason, you have to write C code that compiles cleanly as C++, you're not really writing C. If you need to port C to C++, you should be making lots of changes to make your C code more idiomatic C++.
If you can't do any of that, your code won't be pretty no matter what you do, so it doesn't really matter how you decide to cast at that point. I do like the idea of using templates to make a new allocator that returns the correct type, although that's basically just reinventing the new keyword.

Casting a function which returns (void *) to instead be an (int *) is harmless: you're casting one type of pointer to another.
Casting a function which returns an integer to instead be a pointer is most likely incorrect. The compiler would have flagged it had you not explicitly cast it.

One possible error could (depending on this is whether what you really want or not) be mallocing with one size scale, and assigning to a pointer of a different type. E.g.,
int *temp = (int *)malloc(sizeof(double));
There may be cases where you want to do this, but I suspect that they are rare.

I think you should put the cast in. Consider that there are three locations for types:
T1 *p;
p = (T2*) malloc(sizeof(T3));
The two lines of code might be widely separated. Therefore it's good that the compiler will enforce that T1 == T2. It is easier to visually verify that T2 == T3.
If you miss out the T2 cast, then you have to hope that T1 == T3.
On the other hand you have the missing stdlib.h argument - but I think it's less likely to be a problem.

On the other hand, if you ever need to port the code to C++, it is much better to use the 'new' operator.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js