reinterpret_cast rvalue and optimization - c++

I am converting a bunch of code over to use C++-style casts (with the help of -Wold-style-cast). I'm not entirely sold on its use for primitive variables, but I'm new to C++-style casts in general.
One issue occurs in some endian converting code. The current code looks like this:
#define REINTERPRET_VARIABLE(VAR,TYPE) (*((TYPE*)(&VAR)))
//...
uint16_t reverse(uint16_t val) { /*stuff to reverse uint16_t*/ }
int16_t reverse( int16_t val) {
    uint16_t temp = reverse(REINTERPRET_VARIABLE(val,uint16_t));
    return REINTERPRET_VARIABLE(temp,int16_t);
}
Now, endianness doesn't care about signedness. Therefore, to reverse an int16_t, we can treat it exactly like a uint16_t for the purposes of the reversal. This suggests code like this:
int16_t reverse( int16_t val) {
    return reinterpret_cast<int16_t>(reverse(reinterpret_cast<uint16_t>(val)));
}
However, as described in this and in particular this question, reinterpret_cast requires a reference or a pointer (unless it's casting to itself). This suggests:
int16_t reverse( int16_t val) {
    return reinterpret_cast<int16_t&>(reverse(reinterpret_cast<uint16_t&>(val)));
}
This doesn't work because, as my compiler tells me, the outside cast wants an lvalue. To fix this, you'd need to do something like:
int16_t reverse( int16_t val) {
    uint16_t temp = reverse(reinterpret_cast<uint16_t&>(val));
    return reinterpret_cast<int16_t&>(temp);
}
This is not much different from the original code, and indeed the temporary variable exists for the same reason, but four questions were raised for me:
Why is a temporary even necessary for a reinterpret_cast? I can understand a dumb compiler's needing to have a temporary to support the pointer nastiness of REINTERPRET_VARIABLE, but reinterpret_cast is supposed to just reinterpret bits. Is this clashing with RVO or something?
Will requiring that temporary incur a performance penalty, or is it likely that the compiler can figure out that the temporary really should just be the return value?
The second reinterpret_cast looks like it's returning a reference. Since the function return value isn't a reference, I'm pretty sure this is okay; the return value will be a copy, not a reference. However, I would still like to know what casting to a reference really even means? It is appropriate in this case, right?
Are there any other performance implications I should be aware of? I'd guess that reinterpret_cast would be, if anything, faster since the compiler doesn't need to figure out that the bits should be reinterpreted--I just tell it that they should?

temp is required because the & (address-of) operator is applied to it on the next line. This operator requires an lvalue (the object to take the address of).
I'd expect the compiler to optimize it out.
reinterpret_cast<T&>(x) is the same as * reinterpret_cast<T *>(&x), it is an lvalue designating the same memory location as x occupies. Note that the type of an expression is never a reference; but the result of casting to T&, or of using the * operator is an lvalue.
I wouldn't expect any performance issues.
There are no strict aliasing problems with this particular piece of code, because it is allowed to alias an integer type as the signed or unsigned variation of the same type. But you suggest the codebase is full of reinterpret casts, so you should keep your eye out for strict aliasing violations elsewhere, perhaps compile with -fno-strict-aliasing until it is sorted out.

Since no one has answered this with language-lawyery facts in two years, I'll answer it instead with my educated guesses.
Who knows. But it's apparently necessary, as you've surmised. To avoid issues with strict aliasing, it would be safest to use memcpy, which will be optimized correctly by any compiler.
The answer to any such question is always to profile it and to check the disassembly. In the example you gave, e.g. GCC will optimize it to:
reverse(short):
mov eax, edi
rol ax, 8
ret
Which looks pretty optimal (the mov is for copying from the input register; if you inline your function and use it, you'll see it is absent entirely).
This is a language lawyer question. Probably has some useful semantic meaning. Don't worry about it; you shouldn't be writing code like this anyway.
Again, profile. Maybe reinterpret casting gets in the way of certain optimizations. You should follow the same guidelines as you would for strict aliasing, mentioned above.

Related

C++ strict-aliasing agnostic cast

I've read lots of Q&As about strict aliasing here on Stack Overflow, but they are all pretty general, and the discussion always tends to refer to deep details of the C++ standard, which are almost always difficult to understand properly. Especially when the standard doesn't say things directly but describes something in a muddy, unclear way.
So, my question is probably a possible duplicate of tons of Q&As here, but, please, just answer a specific question:
Is it a correct way to do a "nonalias_cast"?:
template<class OUT, class IN>
inline auto nonalias_cast(IN *data) {
    char *tmp = reinterpret_cast<char *>(data);
    return reinterpret_cast<OUT>(tmp);
}
float f = 3.14;
unsigned *u = nonalias_cast<unsigned *>(&f);
*u = 0x3f800000;
// now f should be equal 1.0
I guess the answer is no. But is there any nice workaround? Except disabling the strict-aliasing flag, of course. A union is not a handy option either, unless there is a way to fit a union hack inside the nonalias_cast function body. memcpy is not an option here either - the data change should be synchronised.
An impossible dream or an elusive reality?
UPD:
Okay, since we've got a negative answer to the "is it possible?" question, I'd like to ask you an extra question which bothers me:
How would you resolve this task? I mean, there are plenty of practical tasks which more or less demand a "play with the bits" approach. For instance, assume you have to write an IEEE-754 Floating Point Converter like this. I'm more concerned with the practical side of the question: how to work around this and reach the goal in the least "pain in ##$" way?
As the other answers have correctly pointed out: This is not possible as you are not allowed to access the float object through an unsigned pointer and there is no cast that will remove that rule.
So how do you work around this issue? Don't access the object through an unsigned pointer! Use a float* or char* for passing the object around, as those are the only pointer types that are allowed under strict aliasing. Then when you actually need to access the object under unsigned semantics, you do a memcpy from the float* to a local unsigned (and memcpy back once you are done). Your compiler will be smart enough to generate efficient code for this.
Note that this means that you will have float* everywhere on your interfaces instead of unsigned*. And that is exactly what makes this work: The type system is aware of the correct data types at all times. Things only start to crumble if you try to smuggle a float through the type system as an unsigned*, which you'll hopefully agree is kind of a fishy idea in the first place.
Is it a correct way to do a "nonalias_cast"?
No.
But is there any nice workaround?
Again, no.
Reason for both is simply that &f is not the address of some object of type unsigned int, and no amount of casting on the pointer is going to change that.
No, your nonalias_cast does not work, and cannot work.
Type aliasing rules are not (directly) about converting pointers. In fact, none of your conversions have undefined behaviour. The rules are about accessing an object of certain type, through a pointer of another type.
No matter how you convert the pointer, the pointed object is still a float object, and accessing it through an unsigned pointer violates type aliasing rules.
An impossible dream or an elusive reality?
In standard C++, it is impossible.

Is const a lie? (since const can be cast away) [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Sell me on const correctness
What is the usefulness of the keyword const in C or C++, since such a thing is allowed?
void const_is_a_lie(const int* n)
{
    *((int*) n) = 0;
}
int main()
{
    int n = 1;
    const_is_a_lie(&n);
    printf("%d", n);
    return 0;
}
Output: 0
It is clear that const cannot guarantee the non-modifiability of the argument.
const is a promise you make to the compiler, not something it guarantees you.
For example,
void const_is_a_lie(const int* n)
{
    *((int*) n) = 0;
}
#include <stdio.h>
int main()
{
    const int n = 1;
    const_is_a_lie(&n);
    printf("%d", n);
    return 0;
}
Output shown at http://ideone.com/Ejogb is
1
Because of the const, the compiler is allowed to assume that the value won't change, and therefore it can skip rereading it, if that would make the program faster.
In this case, since const_is_a_lie() violates its contract, weird things happen. Don't violate the contract. And be glad that the compiler gives you help keeping the contract. Casts are evil.
In this case, n is a pointer to a constant int. When you cast it to int* you remove the const qualifier, and so the operation is allowed.
If you tell the compiler to remove the const qualifier, it will happily do so. The compiler will help ensure that your code is correct, if you let it do its job. By casting the const-ness away, you are telling the compiler that you know that the target of n is non-constant and you really do want to change it.
If the thing that your pointer points to was in fact declared const in the first place, then you are invoking undefined behavior by attempting to change it, and anything could happen. It might work. The write operation might not be visible. The program could crash. Your monitor could punch you. (Ok, probably not that last one.)
void const_is_a_lie(const char * c) {
    *((char *)c) = '5';
}
int main() {
    const char * text = "12345";
    const_is_a_lie(text);
    printf("%s\n", text);
    return 0;
}
Depending on your specific environment, there may be a segfault (aka access violation) in const_is_a_lie since the compiler/runtime may store string literal values in memory pages that are not writable.
The Standard has this to say about modifying const objects.
7.1.6.1/4 The cv-qualifiers [dcl.type.cv]
Except that any class member declared mutable (7.1.1) can be modified, any attempt to modify a const object during its lifetime (3.8) results in undefined behavior
"Doctor, it hurts when I do this!" "So don't do that."
Your...
int n = 1;
...ensures n exists in read/write memory; it's a non-const variable, so a later attempt to modify it will have defined behaviour. Given such a variable, you can have a mix of const and/or non-const pointers and references to it - the constness of each is simply a way for the programmer to guard against accidental change in that "branch" of code. I say "branch" because you can visualise the access given to n as being a tree in which - once a branch is marked const, all the sub-branches (further pointers/references to n whether additional local variables, function parameters etc. initialised therefrom) will need to remain const, unless of course you explicitly cast that notion of constness away. Casting away const is safe (if potentially confusing) for variables that are mutable like your n, because they're ultimately still writing back into a memory address that is modifiable/mutable/non-const. All the bizarre optimisations and caching you could imagine causing trouble in these scenarios aren't allowed as the Standard requires and guarantees sane behaviour in the case I've just described.
Sadly it's also possible to cast away constness of genuinely inherently const variables like say const int o = 1;, and any attempt to modify them will have undefined behaviour. There are many practical reasons for this, including the compiler's right to place them in memory it then marks read only (e.g. see UNIX mprotect(2)) such that an attempted write will cause a CPU trap/interrupt, or read from the variable whenever the originally-set value is needed (even if the variable's identifier was never mentioned in the code using the value), or use an inlined-at-compile-time copy of the original value - ignoring any runtime change to the variable itself. So, the Standard leaves the behaviour undefined. Even if they happen to be modified as you might have intended, the rest of the program will have undefined behaviour thereafter.
But, that shouldn't be surprising. It's the same situation with types - if you have...
double d = 1;
*(int*)&d = my_int;
d += 1;
...haven't you lied to the compiler about the type of d? Ultimately d occupies memory that's probably untyped at a hardware level, so all the compiler ever has is a perspective on it, shuffling bit patterns in and out. But, depending on the value of my_int and the double representation on your hardware, you may have created an invalid combination of bits in d that doesn't represent any valid double value, such that subsequent attempts to read the memory back into a CPU register and/or do something with d such as += 1 have undefined behaviour and might, for example, generate a CPU trap / interrupt.
This is not a bug in C or C++... they're designed to let you make dubious requests of your hardware so that if you know what you're doing you can do some weird but useful things and rarely need to fall back on assembly language to write low level code, even for device drivers and Operating Systems.
Still, it's precisely because casts can be unsafe that a more explicit and targeted casting notation has been introduced in C++. There's no denying the risk - you just need to understand what you're asking for, why it's ok sometimes and not others, and live with it.
The type system is there to help, not to babysit you. You can circumvent the type system in many ways, not only regarding const, and each time you do so you are taking one safety net out of your program. You can ignore const-correctness or even the basic type system by passing void* around and casting as needed. That does not mean that const or types are a lie, only that you can force your way past the compiler.
const is there as a way of making the compiler aware of the contract of your function, and letting it help you not violate it. In the same way, a variable being typed is there so that you don't need to guess how to interpret the data, as the compiler will help you. But it won't babysit, and if you force your way and tell it to remove const-ness, or how the data is to be retrieved, the compiler will just let you; after all, you did design the application, who is it to second-guess your judgement...
Additionally, in some cases you might actually cause undefined behavior, and your application might even crash. For example, if you cast away const from an object that is really const and you modify the object, you might find that the side effects are not seen in some places (the compiler assumed the value would not change and thus performed constant folding), or your application might crash if the constant was loaded into a read-only memory page.
Never did const guarantee immutability: the standard defines a const_cast that allows modifying const data.
const is useful for you to declare more intent and avoid changing data that you meant to be read-only. You'll get a compilation error asking you to think twice if you do otherwise. You can change your mind, but that's not recommended.
As mentioned in other answers, the compiler may optimize a bit more if you use const-ness, but the benefits are not always significant.

Strict pointer aliasing: is access through a 'volatile' pointer/reference a solution?

On the heels of a specific problem, a self-answer and comments to it, I'd like to understand if it is a proper solution, workaround/hack or just plain wrong.
Specifically, I rewrote code:
T x = ...;
if (*reinterpret_cast <int*> (&x) == 0)
...
As:
T x = ...;
if (*reinterpret_cast <volatile int*> (&x) == 0)
...
with a volatile qualifier to the pointer.
Let's just assume that treating T as int in my situation makes sense. Does this access through a volatile pointer solve the pointer-aliasing problem?
For a reference, from specification:
[ Note: volatile is a hint to the implementation to avoid aggressive
optimization involving the object because the value of the object might
be changed by means undetectable by an implementation. See 1.9 for
detailed semantics. In general, the semantics of volatile are intended
to be the same in C++ as they are in C. — end note ]
EDIT:
The above code did solve my problem at least on GCC 4.5.
Volatile can't help you avoid undefined behaviour here. So, if it works for you with GCC it's luck.
Let's assume T is a POD. Then, the proper way to do this is
T x = …;
int i;
memcpy(&i,&x,sizeof i);
if (i==0)
…
There! No strict aliasing problem and no memory alignment problem. GCC even handles memcpy as an intrinsic function (no function call is inserted in this case).
Volatile can't help you avoid undefined behaviour here.
Well, anything regarding volatile is somewhat unclear in the standard. I mostly agreed with your answer, but now I would like to slightly disagree.
The standard is not clear about what volatile means, at least not to most people, including some compiler writers. It is better to think:
when using volatile (and only when), C/C++ is pretty much high level assembly.
When writing to a volatile lvalue, the compiler will issue a STORE, or multiple STORE if one is not enough (volatile does not imply atomic).
When reading from a volatile lvalue, the compiler will issue a LOAD, or multiple LOADs if one is not enough.
Of course, where there is no explicit LOAD or STORE, the compiler will just issue instructions which imply a LOAD or STORE.
sellibitze gave the best solution: use memcpy for bit reinterpretations.
But if all accesses to a memory region are done with volatile lvalues, it is perfectly clear that the strict aliasing rules do not apply. This is the answer to your question.

C++ optimization of reference-to-pointer argument

I'm wondering with functions like the following, whether to use a temporary variable (p):
void parse_foo(const char*& p_in_out,
               foo& out) {
    const char* p = p_in_out;
    /* Parse, p gets incremented etc. */
    p_in_out = p;
}
or can I just use the original argument and expect it to be optimized similarly to the above anyway? It seems like there should be such an optimization, but I've seen the above done in a few places such as Mozilla's code, with vague comments about "avoiding aliasing".
All good answers, but if you're worried about performance optimization, the actual parsing is going to take nearly all of the time, so pointer aliasing will probably be "in the noise".
The variant with a temporary variable could be faster since it doesn't imply that every change to the pointer is reflected back to the argument and the compiler has better chances on generating faster code. However the right way to test this is to compile and look at the disassembly.
Meanwhile, this has nothing to do with avoiding aliasing. In fact, the variant with a temporary variable does employ aliasing - now you have two pointers into the same array, and that's exactly what aliasing is.
I would use a temporary if there is a possibility that the function is transactional.
i.e. the function succeeds or fails completely (no middle ground).
In this case I would use a temp to maintain state while the function executes and only assign back to the in_out parameter when the function completes successfully.
If the function exits prematurely (i.e. via an exception), there are two situations:
- With a temporary, the external pointer is unchanged.
- Using the parameter directly, the external state is modified to reflect the position reached.
I don't see any optimization advantages to either method.
Yes, you should assign it to a local that you mark restrict (__restrict in MSVC).
The reason for this is that if the compiler cannot be absolutely sure that nothing else in the scope points at p_in_out, it cannot store the contents under the pointer in a local register. It must read the data back every time you write to any other char * in the same scope. This is not an issue of whether it is a "smart" compiler or not; it is a consequence of correctness requirements.
By writing char* __restrict p you promise the compiler that no other pointer in the same scope points to the same address as p. Without this guarantee, the value of *p can change any time any other pointer is written to, or it may change the contents of some other pointer every time *p is written to. Thus, the compiler has to write out every assignment to *p back to memory immediately, and it has to read them back after every time another pointer is written through.
So, guaranteeing the compiler that this cannot happen — that it can load *p exactly once and assume no other pointer affects it — can be an improvement in performance. Exactly how much depends on the particular compiler and situation: on processors subject to a load-hit-store penalty, it's massive; on most x86 CPUs, it's modest.
The reason to prefer a pointer to a reference here is simply that a pointer can be marked restrict and a reference cannot. That's just the way C++ is.
You can try it both ways and measure the results to see which is really faster. And if you're curious, I've written in depth on restrict and the load-hit-store elsewhere.
addendum: after writing the above I realize that the people at Moz were more worried about the reference itself being aliased -- that is, that something else might point at the same address where const char *p is stored, rather than the char to which p points. But my answer is the same: under the hood, const char *&p means const char **p, and that's subject to the same aliasing issues as any other pointer.
How does the compiler know that p_in_out isn't aliased somehow? It really can't optimize away writing the data back through the reference.
struct foo {
    void setX(int);
    void setY(int);
    const char* current_pos;
} x;
parse_foo(x.current_pos, x);
I look at this and ask why you didn't just return the pointer. Then you don't have a reference to a pointer, and you don't have to worry about modifying the original.
const char* parse_foo(const char* p, foo& out) {
    //use p;
    return p;
}
It also means you can call the function with an rvalue:
p = parse_foo(p+2, out);
One thought that comes immediately in mind: exception safety. If you throw an exception during parsing, the use of a temporary variable is what you should do to provide strong exception safety: Either the function call succeeded completely or it didn't do anything (from a user's perspective).

Why do some c++ compilers let you take the address of a literal?

A C++ compiler that I will not name lets you take the address of a literal: int *p = &42;
Clearly 42 is an r-value and most compilers refuse to do so.
Why would a compiler allow this? What could you do with this other than shoot yourself in the foot?
What if you needed a pointer to an integer with the value of 42? :)
C++ references are much like automatically dereferenced pointers. One can create a constant reference to a literal, like this:
const int &x = 42;
It effectively requires the compiler to initialize a pointer with the address of an integer with the value 42, as you might subsequently do this:
const int *y = &x;
Combine that with the fact that compilers need to have logic to distinguish between a value which has not had its address taken, and one which has, so it knows to store it in memory. The first need not have a memory location, as it can be entirely temporary and stored in a register, or it may be eliminated by optimization. Taking the address of the value potentially introduces an alias the compiler can't track and inhibits optimization. So, applying the & operator may force the value, whatever it is, into memory.
So, it's possible you found a bug that combined these two effects.
Because 42 is the answer to life, the universe and everything. When asked for its address it is the answer itself.
Tongue slightly (but by no means totally) in cheek:
I'd say that in C++ application code, taking the address of an integer, whether lvalue or rvalue, is almost always a mistake. Even using integers for much more than controlling loops or counting is probably a design error, and if you need to pass an integer to a function which might change it, use a reference.
Found something related to rvalue references in C++0x -- move semantics
http://www.artima.com/cppsource/rvalue.html
It effectively requires the compiler to initialize a pointer with the address of an integer with the value 42
Then why, in some compilers, can't we take the address of a literal directly?
int* ptr = &10;
The reference:
const int& ref = 10;
is almost the same thing as a pointer, though...