Is this hack to remove aliasing warning UB? - c++

We just upgraded our compiler to gcc 4.6 and now we get some of these warnings. At the moment our codebase is not in a state to be compiled with c++0x and anyway, we don't want to run this in prod (at least not yet) - so I needed a fix to remove this warning.
The warnings occur typically because of something like this:
struct SomeDataPage
{
// members
char vData[SOME_SIZE];
};
later, this is used in the following way
SomeDataPage page;
new(page.vData) SomeType(); // non-trivial constructor
To read, update and return for example, the following cast used to happen
reinterpret_cast<SomeType*>(page.vData)->some_member();
This was okay with 4.4; in 4.6 the above generates:
warning: type punned pointer will break strict-aliasing rules
Now a clean way to remove this error is to use a union, however like I said, we can't use c++0x (and hence unrestricted unions), so I've employed the horrible hack below - now the warning has gone away, but am I likely to invoke nasal daemons?
static_cast<SomeType*>(reinterpret_cast<void*>(page.vData))->some_member();
This appears to work okay (see simple example here: http://www.ideone.com/9p3MS) and generates no warnings, is this okay(not in the stylistic sense) to use this till c++0x?
NOTE: I don't want to use -fno-strict-aliasing generally...
EDIT: It seems I was mistaken, the same warning is there on 4.4, I guess we only picked this up recently with the change (it was always unlikely to be a compiler issue), the question still stands though.
EDIT: further investigation yielded some interesting information, it seems that doing the cast and calling the member function in one line is what is causing the warning, if the code is split into two lines as follows
SomeType* ptr = reinterpret_cast<SomeType*>(page.vData);
ptr->some_method();
this actually does not generate a warning. As a result, my simple example on ideone is flawed and more importantly my hack above does not fix the warning, the only way to fix it is to split the function call from the cast - then the cast can be left as a reinterpret_cast.

SomeDataPage page;
new(page.vData) SomeType(); // non-trivial constructor
reinterpret_cast<SomeType*>(page.vData)->some_member();
This was okay with 4.4; in 4.6 the above generates:
warning: type punned pointer will break strict-aliasing rules
You can try:
SomeDataPage page;
SomeType *data = new(page.vData) SomeType(); // non-trivial constructor
data->some_member();

Why not use:
SomeType *item = new (page.vData) SomeType();
and then:
item->some_member ();
I don't think a union is the best way, it may also be problematic. From the gcc docs:
`-fstrict-aliasing'
Allows the compiler to assume the strictest aliasing rules
applicable to the language being compiled. For C (and C++), this
activates optimizations based on the type of expressions. In
particular, an object of one type is assumed never to reside at
the same address as an object of a different type, unless the
types are almost the same. For example, an `unsigned int' can
alias an `int', but not a `void*' or a `double'. A character type
may alias any other type.
Pay special attention to code like this:
union a_union {
int i;
double d;
};
int f() {
a_union t;
t.d = 3.0;
return t.i;
}
The practice of reading from a different union member than the one
most recently written to (called "type-punning") is common. Even
with `-fstrict-aliasing', type-punning is allowed, provided the
memory is accessed through the union type. So, the code above
will work as expected. However, this code might not:
int f() {
a_union t;
int* ip;
t.d = 3.0;
ip = &t.i;
return *ip;
}
How this relates to your problem is tricky to determine. I guess the compiler is not seeing the data in SomeType as the same as the data in vData.

I'd be more concerned about SOME_SIZE not being big enough, frankly. However, it is legal to alias any type with a char*. So simply doing reinterpret_cast<T*>(&page.vData[0]) should be just fine.
Also, I'd question this kind of design. Unless you're implementing boost::variant or something similar, there's not much reason to use it.

struct SomeDataPage
{
// members
char vData[SOME_SIZE];
};
This is problematic for aliasing/alignment reasons. For one, the alignment of this struct isn't necessarily the same as the type you're trying to pun inside it. You could try using GCC attributes to enforce a certain alignment:
struct SomeDataPage { char vData[SOME_SIZE] __attribute__((aligned(16))); };
Where alignment of 16 should be adequate for anything I've come across. Then again, the compiler still won't like your code, but it won't break if the alignment is good. Alternatively, you could use the new C++0x alignof/alignas.
template <class T>
struct DataPage {
alignof(T) char vData[sizeof(T)];
};

Related

Violating strict aliasing does not always produce a compiler warning

Strict-aliasing has kinda thrown me into a loop. Here's the code.
I have a class
#include <arpa/inet.h>
#include <net/route.h>
class Alias
{
public:
struct rtentry rt;
struct sockaddr_in *address;
void try_aliasing()
{
address = (struct sockaddr_in *)&rt.rt_dst;
address->sin_family = AF_INET;
}
};
If I use it like:
int main()
{
Alias a ;
a.try_aliasing();
return 0;
}
It shows:
warning:dereferencing pointer '<anonymous>' does break strict-aliasing rules
However if I use the class as:
int main()
{
Alias *a = new Alias();
a->try_aliasing();
return 0;
}
It compiles just fine.
Compiled both times using:
g++ a.cpp -Wall -O2
Have looked through some threads on strict-aliasing but they've failed to clear the reason for this behavior for me.
In most situations where compilers would be capable of generating "strict-aliasing" warnings, they could just as easily recognize the relationship between the actions on storage using different lvalues. The reason that the Standard doesn't require that compilers recognize access to storage using lvalues of different types is to avoid requiring them to pessimistically assume aliasing in cases where they would otherwise have no reason to expect it [and would thus have no reason to issue any warnings about it].
As written, the Standard does not recognize any circumstances in which an lvalue of non-character type can be derived from another lvalue and used to access storage as its own type. Even something like:
struct foo {int x;} s = {0};
s.x = 1;
invokes UB because it uses an lvalue of type int to access the storage of an object of type struct foo. The authors of the Standard rely upon compiler writers to recognize situations where common sense would imply that they should behave predictably without regard for whether the Standard actually requires it. I don't think any compiler is so stupid as to not recognize that an operation on s.x is actually an operation on s, and there are many other situations where the authors of the Standard thought compilers would have the sense to recognize such things without being ordered to do so.
Unfortunately, some compiler writers have come to view the rules as justification for assuming that code which looks like it would access storage of one type using lvalues of another, doesn't do so, rather than merely making such assumptions about code that doesn't look like it does so. Given something like:
void doSomehing(structs s2 *p);
void test(struct s1 *p)
{
doSomething((struct s2*)p};
}
there are at a few ways that something like test might be invoked:
1. It might receive a pointer to a `struct s1`, which `doSomething` will need to operate upon [perhaps using the Common Initial Sequence guarantee] as though it is a `struct s2`. Further, either:
1a. During the execution of `doSomething` storage accessed exclusively via pointers derived from a struct s2 like the one that was passed in will be accessed exclusively via such means, or...
1b. During the execution of `doSomething` storage accessed via things of that type will also be accessed via other unrelated means.
2. It might receive a pointer to a `struct s2`, which has been cast to a `struct s1*` for some reason in the calling code.
3. It might receive a pointer to a `struct s1`, which `doSomething` will process as though it's a `struct s1`, despite the fact that it accepts a parameter of type `struct s2`.
A compiler might observe that none of the situations whose behavior is defined by the Standard are very likely, and thus decide to issue a warning on that basis. On the other hand, the most common situation by far would be #1a, which a compiler really should be able to handle in predictable fashion, without difficulty, whether the Standard requires it or not, by ensuring that any operations on things of type struct s2 which are performed within the function get sequenced between operations on type struct s1 which precede the call, and those on type struct s1 which follow the function call. Unfortunately, gcc and clang don't do that.

Weird enum in destructor

Currently, I am reading the source code of Protocol Buffer, and I found one weird enum codes defined here
~scoped_ptr() {
enum { type_must_be_complete = sizeof(C) };
delete ptr_;
}
void reset(C* p = NULL) {
if (p != ptr_) {
enum { type_must_be_complete = sizeof(C) };
delete ptr_;
ptr_ = p;
}
}
Why the enum { type_must_be_complete = sizeof(C) }; is defined here? what is it used for?
This trick avoids UB by ensuring that definition of C is available when this destructor is compiled. Otherwise the compilation would fail as the sizeof incomplete type (forward declared types) can not be determined but the pointers can be used.
In compiled binary, this code would be optimized out and would have no effect.
Note that: Deleting incomplete type may be undefined behavior from 5.3.5/5:.
if the object being deleted has incomplete class type at the point of
deletion and the complete class has a non-trivial destructor or a
deallocation function, the behavior is undefined.
g++ even issues the following warning:
warning: possible problem detected in invocation of delete
operator: warning: 'p' has incomplete type warning: forward
declaration of 'struct C'
sizeof(C) will fail at compile time if C is not a complete type. Setting a local scope enum to it makes the statement benign at runtime.
It's a way of the programmer protecting themselves from themselves: the behaviour of a subsequent delete ptr_ on an incomplete type is undefined if it has a non-trivial destructor.
To understand the enum, start with considering the destructor without it:
~scoped_ptr() {
delete ptr_;
}
where ptr_ is a C*. If type C is incomplete at this point, i.e. all that the compiler knows is struct C;, then (1)a default-generated do-nothing destructor is used for the C instance pointed to. That's unlikely to be the right thing to do for an object managed by a smart pointer.
If deleting via a pointer to incomplete type had always had Undefined Behavior, then the standard could just require the compiler to diagnose it and fail. But it's well-defined when the real destructor is trivial: knowledge that the programmer can have, but the compiler doesn't have. Why the language defines and allows this is beyond me, but C++ supports many practices that today are not regarded as best practices.
A complete type has a known size, and hence, sizeof(C) will compile if and only if C is a complete type -- with known destructor. So it can be used as a guard. One way would be simply
(void) sizeof(C); // Type must be complete
I would guess that with some compiler and options the compiler optimizes it away before it could notice that it shouldn't compile, and that the enum is a way to avoid such non-conforming compiler behavior:
enum { type_must_be_complete = sizeof(C) };
An alternative explanation for the choice of enum rather than just a discarded expression, is simply personal preference.
Or as James T. Hugget suggests in a comment to this answer, “The enum may be a way of creating a pseudo-portable error message at compile time”.
(1) The default-generated do-nothing destructor for an incomplete type was a problem with old std::auto_ptr. It was so insidious that it made its way into a GOTW item about the PIMPL idiom, written by the chair of the international C++ standardization committee Herb Sutter. Of course, nowadays that std::auto_ptr is deprecated, one will instead use some other mechanism.
Maybe a trick to be sure C is defined.

When CAN i break aliasing rules?

I get this warning. I would like defined behavior but i would like to keep this code as it is. When may i break aliasing rules?
warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
String is my own string which is a POD. This code is called from C. S may be an int. String is pretty much struct String { RealString*s; } but templated and helper functions. I do a static assert to make sure String is a pod, is 4bytes and int is 4bytes. I also wrote an assert which checks if all pointers are >= NotAPtr. Its in my new/malloc overload. I may put that assert in String as well if you suggest
Considering the rules i am following (mainly that string is a pod and always the same size as int) would it be fine if i break aliasing rules? Is this one of the few times one is breaking it right?
void func(String s) {
auto v=*(unsigned int*)&s;
myassert(v);
if(v < NotAPtr) {
//v is an int
}
else{
//v is a ptr
}
}
memcpy is fully supported. So is punning to a char* (you can then use std::copy, for instance).
If you cannot change code to 2 functions as proposed why not (requires C99 compiler as uses uintptr_t - for older MSVC you need to define it yourself, 2008/2010 should be ok):
void f(RealString *s) {
uintptr_t int = reinterpret_cast<uintptr_t>(s);
assert(int);
}
The Standard specifies a minimal set of actions that all conforming implementations must process in predictable fashion unless they encounter translation limits (whereupon all bets are off). It does not attempt to define all of the actions that an implementation must support to be suitable for any particular purpose. Instead, support for actions beyond those mandated is treated as a Quality of Implementation issue. The authors acknowledge that an implementation could be conforming and yet be of such poor quality as to be useless.
Code such as yours should be usable for quality implementations that are intended for low-level programming, and which represent things in memory in the expected fashion. It should not be expected to be usable on other kinds of implementations, including those which interpret "quality of implementation" issues as an invitation to try to behave in poor-quality-but-conforming fashion.
The safe way of treating a variable as two different types is to turn it into a union. One part of the union can be your pointer, the other part an integer.
struct String
{
union
{
RealString*s;
int i;
};
};

"dereferencing type-punned pointer will break strict-aliasing rules" warning

I use a code where I cast an enum* to int*. Something like this:
enum foo { ... }
...
foo foobar;
int *pi = reinterpret_cast<int*>(&foobar);
When compiling the code (g++ 4.1.2), I get the following warning message:
dereferencing type-punned pointer will break strict-aliasing rules
I googled this message, and found that it happens only when strict aliasing optimization is on. I have the following questions:
If I leave the code with this warning, will it generate potentially wrong code?
Is there any way to work around this problem?
If there isn't, is it possible to turn off strict aliasing from inside the source file (because I don't want to turn it off for all source files and I don't want to make a separate Makefile rule for this source file)?
And yes, I actually need this kind of aliasing.
In order:
Yes. GCC will assume that the pointers cannot alias. For instance, if you assign through one then read from the other, GCC may, as an optimisation, reorder the read and write - I have seen this happen in production code, and it is not pleasant to debug.
Several. You could use a union to represent the memory you need to reinterpret. You could use a reinterpret_cast. You could cast via char * at the point where you reinterpret the memory - char * are defined as being able to alias anything. You could use a type which has __attribute__((__may_alias__)). You could turn off the aliasing assumptions globally using -fno-strict-aliasing.
__attribute__((__may_alias__)) on the types used is probably the closest you can get to disabling the assumption for a particular section of code.
For your particular example, note that the size of an enum is ill defined; GCC generally uses the smallest integer size that can be used to represent it, so reinterpreting a pointer to an enum as an integer could leave you with uninitialised data bytes in the resulting integer. Don't do that. Why not just cast to a suitably large integer type?
You could use the following code to cast your data:
template<typename T, typename F>
struct alias_cast_t
{
union
{
F raw;
T data;
};
};
template<typename T, typename F>
T alias_cast(F raw_data)
{
alias_cast_t<T, F> ac;
ac.raw = raw_data;
return ac.data;
}
Example usage:
unsigned int data = alias_cast<unsigned int>(raw_ptr);
But why are you doing this? It will break if sizeof(foo) != sizeof(int). Just because an enum is like an integer does not mean it is stored as one.
So yes, it could generate "potentially" wrong code.
Have you looked into this answer ?
The strict aliasing rule makes this
setup illegal, two unrelated types
can't point to the same memory. Only
char* has this privilege.
Unfortunately you can still code this
way, maybe get some warnings, but have
it compile fine.
Strict aliasing is a compiler option, so you need to turn it off from the makefile.
And yes, it can generate incorrect code. The compiler will effectively assume that foobar and pi aren't bound together, and will assume that *pi won't change if foobar changed.
As already mentioned, use static_cast instead (and no pointers).

Needless pointer-casts in C

I got a comment to my answer on this thread:
Malloc inside a function call appears to be getting freed on return?
In short I had code like this:
int * somefunc (void)
{
int * temp = (int*) malloc (sizeof (int));
temp[0] = 0;
return temp;
}
I got this comment:
Can I just say, please don't cast the
return value of malloc? It is not
required and can hide errors.
I agree that the cast is not required in C. It is mandatory in C++, so I usually add them just in case I have to port the code in C++ one day.
However, I wonder how casts like this can hide errors. Any ideas?
Edit:
Seems like there are very good and valid arguments on both sides. Thanks for posting, folks.
It seems fitting I post an answer, since I left the comment :P
Basically, if you forget to include stdlib.h the compiler will assume malloc returns an int. Without casting, you will get a warning. With casting you won't.
So by casting you get nothing, and run the risk of suppressing legitimate warnings.
Much is written about this, a quick google search will turn up more detailed explanations.
edit
It has been argued that
TYPE * p;
p = (TYPE *)malloc(n*sizeof(TYPE));
makes it obvious when you accidentally don't allocate enough memory because say, you thought p was TYPe not TYPE, and thus we should cast malloc because the advantage of this method overrides the smaller cost of accidentally suppressing compiler warnings.
I would like to point out 2 things:
you should write p = malloc(sizeof(*p)*n); to always ensure you malloc the right amount of space
with the above approach, you need to make changes in 3 places if you ever change the type of p: once in the declaration, once in the malloc, and once in the cast.
In short, I still personally believe there is no need for casting the return value of malloc and it is certainly not best practice.
This question is tagged both for C and C++, so it has at least two answers, IMHO:
C
Ahem... Do whatever you want.
I believe the reason given above "If you don't include "stdlib" then you won't get a warning" is not a valid one because one should not rely on this kind of hacks to not forget to include an header.
The real reason that could make you not write the cast is that the C compiler already silently cast a void * into whatever pointer type you want, and so, doing it yourself is overkill and useless.
If you want to have type safety, you can either switch to C++ or write your own wrapper function, like:
int * malloc_Int(size_t p_iSize) /* number of ints wanted */
{
return malloc(sizeof(int) * p_iSize) ;
}
C++
Sometimes, even in C++, you have to make profit of the malloc/realloc/free utils. Then you'll have to cast. But you already knew that. Using static_cast<>() will be better, as always, than C-style cast.
And in C, you could override malloc (and realloc, etc.) through templates to achieve type-safety:
template <typename T>
T * myMalloc(const size_t p_iSize)
{
return static_cast<T *>(malloc(sizeof(T) * p_iSize)) ;
}
Which would be used like:
int * p = myMalloc<int>(25) ;
free(p) ;
MyStruct * p2 = myMalloc<MyStruct>(12) ;
free(p2) ;
and the following code:
// error: cannot convert ‘int*’ to ‘short int*’ in initialization
short * p = myMalloc<int>(25) ;
free(p) ;
won't compile, so, no problemo.
All in all, in pure C++, you now have no excuse if someone finds more than one C malloc inside your code...
:-)
C + C++ crossover
Sometimes, you want to produce code that will compile both in C and in C++ (for whatever reasons... Isn't it the point of the C++ extern "C" {} block?). In this case, C++ demands the cast, but C won't understand the static_cast keyword, so the solution is the C-style cast (which is still legal in C++ for exactly this kind of reasons).
Note that even with writing pure C code, compiling it with a C++ compiler will get you a lot more warnings and errors (for example attempting to use a function without declaring it first won't compile, unlike the error mentioned above).
So, to be on the safe side, write code that will compile cleanly in C++, study and correct the warnings, and then use the C compiler to produce the final binary. This means, again, write the cast, in a C-style cast.
One possible error it can introduce is if you are compiling on a 64-bit system using C (not C++).
Basically, if you forget to include stdlib.h, the default int rule will apply. Thus the compiler will happily assume that malloc has the prototype of int malloc(); On Many 64-bit systems an int is 32-bits and a pointer is 64-bits.
Uh oh, the value gets truncated and you only get the lower 32-bits of the pointer! Now if you cast the return value of malloc, this error is hidden by the cast. But if you don't you will get an error (something to the nature of "cannot convert int to T *").
This does not apply to C++ of course for 2 reasons. Firstly, it has no default int rule, secondly it requires the cast.
All in all though, you should just new in c++ code anyway :-P.
Well, I think it's the exact opposite - always directly cast it to the needed type. Read on here!
The "forgot stdlib.h" argument is a straw man. Modern compilers will detect and warn of the problem (gcc -Wall).
You should always cast the result of malloc immediately. Not doing so should be considered an error, and not just because it will fail as C++. If you're targeting a machine architecture with different kinds of pointers, for example, you could wind up with a very tricky bug if you don't put in the cast.
Edit: The commentor Evan Teran is correct. My mistake was thinking that the compiler didn't have to do any work on a void pointer in any context. I freak when I think of FAR pointer bugs, so my intuition is to cast everything. Thanks Evan!
Actually, the only way a cast could hide an error is if you were converting from one datatype to an smaller datatype and lost data, or if you were converting pears to apples. Take the following example:
int int_array[10];
/* initialize array */
int *p = &(int_array[3]);
short *sp = (short *)p;
short my_val = *sp;
in this case the conversion to short would be dropping some data from the int. And then this case:
struct {
/* something */
} my_struct[100];
int my_int_array[100];
/* initialize array */
struct my_struct *p = &(my_int_array[99]);
in which you'd end up pointing to the wrong kind of data, or even to invalid memory.
But in general, and if you know what you are doing, it's OK to do the casting. Even more so when you are getting memory from malloc, which happens to return a void pointer which you can't use at all unless you cast it, and most compilers will warn you if you are casting to something the lvalue (the value to the left side of the assignment) can't take anyway.
#if CPLUSPLUS
#define MALLOC_CAST(T) (T)
#else
#define MALLOC_CAST(T)
#endif
...
int * p;
p = MALLOC_CAST(int *) malloc(sizeof(int) * n);
or, alternately
#if CPLUSPLUS
#define MYMALLOC(T, N) static_cast<T*>(malloc(sizeof(T) * N))
#else
#define MYMALLOC(T, N) malloc(sizeof(T) * N)
#endif
...
int * p;
p = MYMALLOC(int, n);
People have already cited the reasons I usually trot out: the old (no longer applicable to most compilers) argument about not including stdlib.h and using sizeof *p to make sure the types and sizes always match regardless of later updating. I do want to point out one other argument against casting. It's a small one, but I think it applies.
C is fairly weakly typed. Most safe type conversions happen automatically, and most unsafe ones require a cast. Consider:
int from_f(float f)
{
return *(int *)&f;
}
That's dangerous code. It's technically undefined behavior, though in practice it's going to do the same thing on nearly every platform you run it on. And the cast helps tell you "This code is a terrible hack."
Consider:
int *p = (int *)malloc(sizeof(int) * 10);
I see a cast, and I wonder, "Why is this necessary? Where is the hack?" It raises hairs on my neck that there's something evil going on, when in fact the code is completely harmless.
As long as we're using C, casts (especially pointer casts) are a way of saying "There's something evil and easily breakable going on here." They may accomplish what you need accomplished, but they indicate to you and future maintainers that the kids aren't alright.
Using casts on every malloc diminishes the "hack" indication of pointer casting. It makes it less jarring to see things like *(int *)&f;.
Note: C and C++ are different languages. C is weakly typed, C++ is more strongly typed. The casts are necessary in C++, even though they don't indicate a hack at all, because of (in my humble opinion) the unnecessarily strong C++ type system. (Really, this particular case is the only place I think the C++ type system is "too strong," but I can't think of any place where it's "too weak," which makes it overall too strong for my tastes.)
If you're worried about C++ compatibility, don't. If you're writing C, use a C compiler. There are plenty really good ones avaliable for every platform. If, for some inane reason, you have to write C code that compiles cleanly as C++, you're not really writing C. If you need to port C to C++, you should be making lots of changes to make your C code more idiomatic C++.
If you can't do any of that, your code won't be pretty no matter what you do, so it doesn't really matter how you decide to cast at that point. I do like the idea of using templates to make a new allocator that returns the correct type, although that's basically just reinventing the new keyword.
Casting a function which returns (void *) to instead be an (int *) is harmless: you're casting one type of pointer to another.
Casting a function which returns an integer to instead be a pointer is most likely incorrect. The compiler would have flagged it had you not explicitly cast it.
One possible error could (depending on this is whether what you really want or not) be mallocing with one size scale, and assigning to a pointer of a different type. E.g.,
int *temp = (int *)malloc(sizeof(double));
There may be cases where you want to do this, but I suspect that they are rare.
I think you should put the cast in. Consider that there are three locations for types:
T1 *p;
p = (T2*) malloc(sizeof(T3));
The two lines of code might be widely separated. Therefore it's good that the compiler will enforce that T1 == T2. It is easier to visually verify that T2 == T3.
If you miss out the T2 cast, then you have to hope that T1 == T3.
On the other hand you have the missing stdlib.h argument - but I think it's less likely to be a problem.
On the other hand, if you ever need to port the code to C++, it is much better to use the 'new' operator.