conversion to static_cast<unsigned char> - c++

Does a conversion like:
int a[3];
char i=1;
a[ static_cast<unsigned char>(i) ];
introduce any overhead like conversions or can the compiler optimize everything away?
I am interested because I want to get rid of -Wchar-subscripts warnings, but want to use a char as index (other reasons)

I did one test on Clang 3.4.1 for this code :
int ival(signed char c) {
int a[] = {0,1,2,3,4,5,6,7,8,9};
unsigned char u = static_cast<unsigned char>(c);
return a[u];
}
Here is the relevant part of the assembly file generated with c++ -S -O3:
_Z4ivala: # #_Z4ivala
# BB#0:
pushl %ebp
movl %esp, %ebp
movzbl 8(%ebp), %eax
movl .L_ZZ4ivalaE1a(,%eax,4), %eax
popl %ebp
ret
There is no trace of the conversion.

char and unsigned char always have the same size and alignment, and unsigned char can represent every non-negative value of char. On the usual two's-complement architectures, casting one to the other therefore does not require any CPU instructions.
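As a minimal sketch of the pattern asked about, the cast can be wrapped in a small helper so that every char-indexed lookup goes through it. The names char_index and lookup are hypothetical and not from the question; the table size simply covers every unsigned char value.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: turn a char into a non-negative array index.
// Going through unsigned char avoids sign extension when plain char is signed.
inline std::size_t char_index(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical lookup into a table with one slot per unsigned char value.
inline int lookup(const int (&table)[UCHAR_MAX + 1], char c) {
    return table[char_index(c)];
}
```

Because the helper is a single cast, an optimizing compiler would be expected to inline it into the same zero-extending load shown in the assembly above, though that is worth checking on the target compiler.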

Does the compiler optimize references to constant variables?

When it comes to the C and C++ languages, does the compiler optimize references to constant variables so that the program automatically knows what values are being referred to, instead of having to peek at the memory locations of the constant variables? When it comes to arrays, does it depend on whether the index value to point at in the array is a constant at compile time?
For instance, take a look at this code:
int main(void) {
1: char tesst[3] = {'1', '3', '7'};
2: char erm = tesst[1];
}
Does the compiler "change" line 2 to "char erm = '3'" at compile time?
I personally would expect the posted code to turn into "nothing", since neither variable is actually used, and thus can be removed.
But yes, modern compilers (gcc, clang, msvc, etc.) should be able to replace that array reference with its constant value, as long as the compiler can be reasonably sure that the content of tesst isn't being changed. If you pass tesst into a function, even as a pointer or reference to const, and the compiler can't see that the function does NOT change it, it will assume that it does and load the value.
Compiling this using clang -O1 opts.c -S:
#include <stdio.h>
int main()
{
char tesst[3] = {'1', '3', '7'};
char erm = tesst[1];
printf("%d\n", erm);
}
produces:
...
main:
pushq %rax
.Ltmp0:
movl $.L.str, %edi
movl $51, %esi
xorl %eax, %eax
callq printf
xorl %eax, %eax
popq %rcx
retq
...
So, the same as printf("%d\n", '3');.
[I'm using C rather than C++ because it would be about 50 lines of assembler if I used cout, as everything gets inlined]
I expect gcc and msvc to make a similar optimisation (I tested gcc -O1 -S and it gives exactly the same code, apart from some subtly different symbol names).
And to illustrate that "it may not do it if you call a function":
#include <stdio.h>
extern void blah(const char* x);
int main()
{
char tesst[3] = {'1', '3', '7'};
blah(tesst);
char erm = tesst[1];
printf("%d\n", erm);
}
main: # #main
pushq %rax
movb $55, 6(%rsp)
movw $13105, 4(%rsp) # imm = 0x3331
leaq 4(%rsp), %rdi
callq blah
movsbl 5(%rsp), %esi
movl $.L.str, %edi
xorl %eax, %eax
callq printf
xorl %eax, %eax
popq %rcx
retq
Now, it fetches the value from inside tesst.
It mostly depends on the level of optimization and which compiler you are using.
With maximum optimizations, the compiler will indeed probably just replace your whole code with char erm = '3';. GCC -O3 does this anyway.
But then of course it depends on what you do with that variable. The compiler might not even allocate the variable, but just use the raw number in the operation where the variable occurs.
Depends on the compiler version, optimization options used and many other things. If you want to make sure that the const variables are optimized and if they are compile time constants you can use something like constexpr in c++. It is guaranteed to be evaluated at compile time unlike normal const variables.
Edit: a constexpr function may be evaluated at compile time or at run time. To guarantee compile-time evaluation, we must either use the result where a constant expression is required (e.g., as an array bound or as a case label) or use it to initialize a constexpr variable. So in this case
constexpr char tesst[3] = {'1','3','7'};
constexpr char erm = tesst[1];
would lead to compile time evaluation. Nice read at https://isocpp.org/blog/2013/01/when-does-a-constexpr-function-get-evaluated-at-compile-time-stackoverflow
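A short sketch of how this can be checked, reusing the tesst and erm names from the question: static_assert only accepts a constant expression, so it compiles only if the value really is available at compile time.

```cpp
// Both objects are constant expressions, so the array read happens at compile time.
constexpr char tesst[3] = {'1', '3', '7'};
constexpr char erm = tesst[1];

// static_assert requires a constant expression; this line would not compile
// if erm had to be computed at run time.
static_assert(erm == '3', "erm must be known at compile time");
```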

Can C++ differentiate between a compile time constant and a variable passed to a function? [duplicate]

Here's my problem. I have a BINARY_FLAG macro:
#define BINARY_FLAG( n ) ( static_cast<DWORD>( 1 << ( n ) ) )
Which can be used either like this ("constant" scenario):
static const DWORD SomeConstant = BINARY_FLAG( 5 );
or like this ("variable" scenario):
for( int i = 0; i < 10; i++ ) {
DWORD flag = BINARY_FLAG( i );
// do something with the value
}
This macro is not foolproof at all - one can pass -1 or 34 there and there will at most be a warning yet behavior will be undefined. I'd like to make it more foolproof.
For the constant scenario I could use a template:
template<int Shift> class CBinaryFlag {
staticAssert( 0 <= Shift && Shift < sizeof( DWORD) * CHAR_BIT );
public:
static const DWORD FlagValue = static_cast<DWORD>( 1 << Shift );
};
#define BINARY_FLAG( n ) CBinaryFlag<n>::FlagValue
but this will not go for the "variable" scenario - I'd need a runtime assertion there:
inline DWORD ProduceBinaryFlag( int shift )
{
assert( 0 <= shift && shift < sizeof( DWORD) * CHAR_BIT );
return static_cast<DWORD>( 1 << shift );
}
#define BINARY_FLAG( n ) ProduceBinaryFlag(n)
The latter is good, but has no compile-time checks. Of course, I'd like a compile-time check where possible and a runtime check otherwise. At all times I want as little runtime overhead as possible so I don't want a function call (that maybe won't be inlined) when a compile-time check is possible.
I saw this question, but it doesn't look like it is about the same problem.
Is there some construct that would allow to alternate between the two depending on whether the expression passed as a flag number is a compile-time constant or a variable?
This is simpler than you think :)
Let's have a look:
#include <cassert>
static inline int FLAG(int n) {
assert(n>=0 && n<32);
return 1<<n;
}
int test1(int n) {
return FLAG(n);
}
int test2() {
return FLAG(5);
}
I don't use MSVC, but I compiled with Mingw GCC 4.5:
g++ -c -S -O3 08042.cpp
The resulting code for the first function looks like:
__Z5test1i:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl 8(%ebp), %ecx
cmpl $31, %ecx
ja L4
movl $1, %eax
sall %cl, %eax
leave
ret
L4:
movl $4, 8(%esp)
movl $LC0, 4(%esp)
movl $LC1, (%esp)
call __assert
.p2align 2,,3
And the second:
__Z5test2v:
pushl %ebp
movl %esp, %ebp
movl $32, %eax
leave
ret
See? The compiler is smart enough to do it for you. No need for macros, no need for metaprogramming, no need for C++0x. As simple as that.
Check if MSVC does the same... But look - it's really easy for the compiler to evaluate a constant expression and drop the unused conditional branch. Check it if you want to be sure... But generally - trust your tools.
It's not possible to pass an argument to a macro or function and determine if it's compile time constant or a variable.
The best way is that you #define BINARY_FLAG(n) with compile time code and place that macro everywhere and then compile it. You will receive compiler-errors at the places where n is going to be runtime. Now, you can replace those macros with your runtime macro BINARY_FLAG_RUNTIME(n). This is the only feasible way.
I suggest you use two macros.
BINARY_FLAG
CONST_BINARY_FLAG
That will make your code easier to grasp for others. You do know, at the time of writing, if it is a const or not.
And I would in no case worry about runtime overhead. Your optimizer, at least in VS, will sort that out for you.
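Where C++11 is available, one further option (not proposed in the answers above) is a single constexpr function: used to initialize a constexpr variable, an out-of-range shift becomes a compile error; called with a runtime value, the same check runs at run time. This is only a sketch under assumptions: Flag is a stand-in for DWORD, binary_flag and kFlagFive are hypothetical names, and an exception replaces the assert from the question.

```cpp
#include <climits>
#include <cstdint>
#include <stdexcept>

using Flag = std::uint32_t;  // assumption: stand-in for the Windows DWORD type

// Hypothetical replacement for BINARY_FLAG. The single return statement keeps
// it valid as a C++11 constexpr function.
constexpr Flag binary_flag(int shift) {
    return (0 <= shift && shift < static_cast<int>(sizeof(Flag) * CHAR_BIT))
        ? static_cast<Flag>(Flag{1} << shift)
        : throw std::out_of_range("binary_flag: shift out of range");
}

// "Constant" scenario: the range check is evaluated by the compiler.
constexpr Flag kFlagFive = binary_flag(5);
// constexpr Flag kBad = binary_flag(34);  // would be rejected at compile time
```

In the "variable" scenario, binary_flag(i) is an ordinary function call that throws on a bad shift; whether it gets inlined is still up to the optimizer, as discussed above.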

Boolean multiplication in c++?

Consider the following:
inline unsigned int f1(const unsigned int i, const bool b) {return b ? i : 0;}
inline unsigned int f2(const unsigned int i, const bool b) {return b*i;}
The syntax of f2 is more compact, but does the standard guarantee that f1 and f2 are strictly equivalent?
Furthermore, if I want the compiler to optimize this expression if b and i are known at compile-time, which version should I prefer ?
Well, yes, both are equivalent. bool is an integral type and true is guaranteed to convert to 1 in integer context, while false is guaranteed to convert to 0.
(The reverse is also true, i.e. non-zero integer values are guaranteed to convert to true in boolean context, while zero integer values are guaranteed to convert to false in boolean context.)
Since you are working with unsigned types, one can easily come up with other, possibly bit-hack-based yet perfectly portable implementations of the same thing, like
i & -(unsigned) b
although a decent compiler should be able to choose the best implementation by itself for any of your versions.
P.S. Although to my great surprise, GCC 4.1.2 compiled all three variants virtually literally, i.e. it used machine multiplication instruction in multiplication-based variant. It was smart enough to use cmovne instruction on the ?: variant to make it branchless, which quite possibly made it the most efficient implementation.
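For comparison, the three variants discussed here can be written side by side. This is a small sketch with hypothetical function names; the mask version spells the negation as 0u - x, which is the same unsigned wrap-around as -(unsigned) b.

```cpp
// Three spellings of "i if b, else 0"; the names are hypothetical.
inline unsigned int select_ternary(unsigned int i, bool b) { return b ? i : 0u; }

// bool converts to 0 or 1 before the multiplication.
inline unsigned int select_multiply(unsigned int i, bool b) { return b * i; }

inline unsigned int select_mask(unsigned int i, bool b) {
    // 0u - 1u wraps to all-ones, 0u - 0u stays zero, so the AND keeps i or clears it.
    return i & (0u - static_cast<unsigned int>(b));
}
```

All three return the same value for every input; which one compiles to the best code depends on the compiler and target, as the GCC 4.1.2 observation shows.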
Yes. It's safe to assume true is 1 and false is 0 when used in expressions as you do and is guaranteed:
C++11, Integral Promotions, 4.5:
An rvalue of type bool can be converted to an rvalue of type int, with
false becoming zero and true becoming one.
The compiler will use implicit conversion to make an unsigned int from b, so, yes, this should work. You're skipping the condition checking by simple multiplication. Which one is more effective/faster? Don't know. A good compiler would most likely optimize both versions I'd assume.
FWIW, the following code
inline unsigned int f1(const unsigned int i, const bool b) {return b ? i : 0;}
inline unsigned int f2(const unsigned int i, const bool b) {return b*i;}
int main()
{
volatile unsigned int i = f1(42, true);
volatile unsigned int j = f2(42, true);
}
compiled with gcc -O2 produces this assembly:
.file "test.cpp"
.def ___main; .scl 2; .type 32; .endef
.section .text.startup,"x"
.p2align 2,,3
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB2:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
andl $-16, %esp
subl $16, %esp
call ___main
movl $42, 8(%esp) // i
movl $42, 12(%esp) // j
xorl %eax, %eax
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
LFE2:
There's not much left of either f1 or f2, as you can see.
As far as the C++ standard is concerned, the compiler is allowed to do anything with regard to optimization, as long as it doesn't change the observable behaviour (the "as-if" rule).

How do I declare an array created using malloc to be volatile in c++

I presume that the following will give me 10 volatile ints
volatile int foo[10];
However, I don't think the following will do the same thing.
volatile int* foo;
foo = malloc(sizeof(int)*10);
Please correct me if I am wrong about this and how I can have a volatile array of items using malloc.
Thanks.
int volatile * foo;
read from right to left "foo is a pointer to a volatile int"
so whatever int you access through foo, the int will be volatile.
P.S.
int * volatile foo; // "foo is a volatile pointer to an int"
!=
volatile int * foo; // "foo is a pointer to an int that is volatile"
In the first declaration foo itself is volatile. In the second, the leading volatile qualifies the int; that spelling is really just a leftover exception to the general right-to-left rule.
The lesson to be learned is to get in the habit of using
char const * foo;
instead of the more common
const char * foo;
if you want more complicated things like "pointer to function returning pointer to int" to make any sense.
P.S., and this is a biggy (and the main reason I'm adding an answer):
I note that you included "multithreading" as a tag. Do you realize that volatile does little/nothing of good with respect to multithreading?
volatile int* foo;
is the way to go. The volatile type qualifier works just like the const type qualifier. If you wanted a pointer to a constant array of integers you would write:
const int* foo;
whereas
int* const foo;
is a constant pointer to an integer: the integer can be changed, but the pointer itself cannot. volatile works the same way.
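The placement rules can be checked mechanically with the standard type traits. The sketch below is only an illustration; the alias names PtrToVolatile and VolatilePtr are hypothetical.

```cpp
#include <type_traits>

using PtrToVolatile = volatile int*;  // pointer to volatile int
using VolatilePtr   = int* volatile;  // volatile pointer to plain int

// "volatile int*" and "int volatile*" are two spellings of the same type.
static_assert(std::is_same<volatile int*, int volatile*>::value,
              "both spellings name the same type");

// For PtrToVolatile the pointee is volatile, the pointer is not.
static_assert(!std::is_volatile<PtrToVolatile>::value, "pointer is not volatile");
static_assert(std::is_volatile<std::remove_pointer<PtrToVolatile>::type>::value,
              "pointee is volatile");

// For VolatilePtr it is the other way round.
static_assert(std::is_volatile<VolatilePtr>::value, "pointer is volatile");
static_assert(!std::is_volatile<std::remove_pointer<VolatilePtr>::type>::value,
              "pointee is not volatile");
```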
Yes, that will work. There is nothing different about the actual memory that is volatile. It is just a way to tell the compiler how to interact with that memory.
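A minimal sketch of the malloc case in C++, where the void* result needs an explicit cast. The helper names alloc_volatile_ints and free_volatile_ints are hypothetical; the const_cast is there because std::free takes a plain void* and a pointer to volatile does not convert to it implicitly.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: allocate count ints that are accessed as volatile.
// Returns a null pointer if the allocation fails.
inline volatile int* alloc_volatile_ints(std::size_t count) {
    return static_cast<volatile int*>(std::malloc(count * sizeof(int)));
}

// Hypothetical helper: drop the volatile qualifier only to hand the block
// back to std::free.
inline void free_volatile_ints(volatile int* p) {
    std::free(const_cast<int*>(p));
}
```

Every read or write through the returned pointer, such as foo[3] = 7, is then a volatile access, just as with volatile int foo[10].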
I think the second declares the pointer to be volatile, not what it points to. To get that, I think it should be
int * volatile foo;
This syntax is acceptable to gcc, but I'm having trouble convincing myself that it does anything different.
I found a difference with gcc -O3 (full optimization). For this (silly) test code:
volatile int v [10];
int * volatile p;
int main (void)
{
v [3] = p [2];
p [3] = v [2];
return 0;
}
With volatile, and omitting (x86) instructions which don't change:
movl p, %eax
movl 8(%eax), %eax
movl %eax, v+12
movl p, %edx
movl v+8, %eax
movl %eax, 12(%edx)
Without volatile, it skips reloading p:
movl p, %eax
movl 8(%eax), %edx ; different since p being preserved
movl %edx, v+12
; 'p' not reloaded here
movl v+8, %edx
movl %edx, 12(%eax) ; p reused
After many more science experiments trying to find a difference, I conclude there is no difference. volatile turns off all optimizations related to the variable which would reuse a subsequently set value. At least with x86 gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33). :-)
Thanks very much to wallyk, I was able to devise some code using his method and generate some assembly to prove to myself the difference between the different pointer declarations.
Using this code, compiled with -O3:
int main (void)
{
while(p[2]);
return 0;
}
When p is simply declared as a pointer, we get stuck in a loop that is impossible to get out of. Note that even if this were a multithreaded program and a different thread wrote p[2] = 0, the program would not break out of the while loop, because the generated code never reads p[2] again.
int * p;
============
LCFI1:
movq _p(%rip), %rax
movl 8(%rax), %eax
testl %eax, %eax
jne L6
xorl %eax, %eax
leave
ret
L6:
jmp L6
notice that the only instruction for L6 is to goto L6.
==
when p is volatile pointer
int * volatile p;
==============
L3:
movq _p(%rip), %rax
movl 8(%rax), %eax
testl %eax, %eax
jne L3
xorl %eax, %eax
leave
ret
here, the pointer p gets reloaded each loop iteration and as a consequence the array item also gets reloaded. However, this would not be correct if we wanted an array of volatile integers as this would be possible:
int* volatile p;
..
..
int* j;
j = &p[2];
while(*j);
and would result in a loop that would be impossible to terminate in a multithreaded program.
==
finally, this is the correct solution as tony nicely explained.
int volatile * p;
LCFI1:
movq _p(%rip), %rdx
addq $8, %rdx
.align 4,0x90
L3:
movl (%rdx), %eax
testl %eax, %eax
jne L3
leave
ret
In this case the address of p[2] is kept in a register and not reloaded from memory, but the value of p[2] is reloaded from memory on every loop iteration.
also note that
int volatile * p;
..
..
int* j;
j = &p[2];
while(*j);
will generate a compile error.