Any difference in compiler behavior for each of these snippets? - c++

Please consider the following code:
1.
uint16 a = 0x0001;
if(a < 0x0002)
{
// do something
}
2.
uint16 a = 0x0001;
if(a < uint16(0x0002))
{
// do something
}
3.
uint16 a = 0x0001;
if(a < static_cast<uint16>(0x0002))
{
// do something
}
4.
uint16 a = 0x0001;
uint16 b = 0x0002;
if(a < b)
{
// do something
}
What does the compiler do in the background, and what is the best (and correct) way to do the above testing?
p.s. Sorry, but I couldn't find a better title :)
EDIT:
values 0x0001 and 0x0002 are only an example. There could be any 2-byte value instead.
Thank you in advance!

The last example is the best code-wise, as you shouldn't use "magic constants" in your code.
In fact, the best way would be to make b const and use meaningful names:
uint16 currentSpeed = 0x0001;
const uint16 cMaxSpeed = 0x0002;
if (currentSpeed < cMaxSpeed)
{
// do something
}
Other than that, there is very little difference "under the bonnet" between your examples.

It is usually best to avoid unnamed "magic" numbers in code, since it is hard for a maintainer to understand what a number is supposed to mean; naming your constants is good practice. For this reason, don't do number 1.
In C++, it is best to use static_cast, rather than C style casts. I'm sure there are probably other questions about why this is the case, but the best reference here is Meyers (Effective C++). For this reason, prefer 3 over 2, but 3 still suffers from the magic number problem.
Four is the best, except the variable names are meaningless, and it might make sense for one or both variables to be const.
I'm not sure if there is any difference between any of them in terms of compiled code, but there might be, because the literal is interpreted as something other than uint16 (it is an int, for instance), although you should still get the expected result.
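For the curious, here is a small sketch (assuming uint16 is a typedef for std::uint16_t) that makes the promotion visible: the hex literal has type int, and a is promoted to int before the comparison, so all four variants end up comparing int against int.
#include <cstdint>
#include <type_traits>
using uint16 = std::uint16_t; // assumption: uint16 is a 16-bit unsigned typedef
int main()
{
uint16 a = 0x0001;
// The literal 0x0002 fits in an int, so its type is int.
static_assert(std::is_same<decltype(0x0002), int>::value, "literal is an int");
// Unary + applies the same integral promotion the comparison does: uint16 becomes int.
static_assert(std::is_same<decltype(+a), int>::value, "uint16 promotes to int");
return a < 0x0002 ? 0 : 1;
}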

If that is all, there's no difference (GCC, -O2). In all cases //do something is simply executed unconditionally.
So it's just a style issue.

Since the numbers you're working with are both single digits, there's no difference between decimal and hexadecimal. Unless you've defined uint16 in an unexpected way, the cast and/or static_cast should make no difference at all. There should be no real difference between using the constant directly and initializing a variable, then using that.
What you should be concerned with is making sure that a reader can understand what's going on -- give meaningful names, so it's apparent why you're comparing those items. Since the casts aren't really accomplishing anything, you'd be better off without them.


Authoritative "correct" way to avoid signed-unsigned warnings when testing a loop variable against size_t

The code below generates a compiler warning:
void test()
{
byte buffer[100];
for (int i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
warning: comparison between signed and unsigned integer expressions
[-Wsign-compare]
This is because sizeof() returns a size_t, which is unsigned.
I have seen a number of suggestions for how to deal with this, but none with a preponderance of support and none with any convincing logic nor any references to support one approach as clearly "better." The most common suggestions seem to be:
ignore the warnings
turn off the warnings
use a loop variable of type size_t
use a loop variable of type size_t with tricks to avoid decrementing past zero
cast sizeof(buffer) to an int
some extremely convoluted suggestions that I did not have the patience to follow because they involved unreadable code, generally involving vectors and/or iterators
libraries that I cannot load in the AVR / ARM embedded environments I often use.
free functions returning a valid int or long representing the byte count of T
Don't use loops (gotta love that advice)
Is there a "correct" way to approach this?
-- Begin Edit --
The example I gave is, of course, trivial, and meant only to demonstrate the type mismatch warning that can occur in an indexing situation.
#3 is not necessarily the obviously correct answer because size_t carries special risks in a decrementing loop such as
for (size_t i = myArray.size(); i > 0; --i)
(the array may someday have a size of zero).
#4 is a suggestion to deal with decrementing size_t indexes by including appropriate and necessary checks to avoid ever decrementing past zero. Since that makes the code harder to read, there are some cute shortcuts that are not particularly readable, hence my referring to them as "tricks."
#7 is a suggestion to use libraries that are not generalizable in the sense that they may not be available or appropriate in every setting.
#8 is a suggestion to keep the checks readable, but to hide them in a non-member method, sometimes referred to as a "free function" (see the sketch after this edit block).
#9 is a suggestion to use algorithms rather than loops. This was offered many times as a solution to the size_t indexing problem, and there were a lot of upvotes. I include it even though I can't use the stl library in most of my environments and would have to write the code myself.
-- End Edit--
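For concreteness, suggestion #8 might look something like the sketch below (the helper name is mine, purely illustrative; C++20 later standardised essentially the same idea as std::ssize):
#include <cstddef>
// Hypothetical helper: report the element count of a built-in array as a
// signed value, so that comparisons against an int loop index don't warn.
template <typename T, std::size_t N>
constexpr long long signed_size(const T (&)[N]) noexcept
{
return static_cast<long long>(N);
}
// Usage sketch:
// byte buffer[100];
// for (int i = 0; i < signed_size(buffer); ++i) { buffer[i] = 0; }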
I am hoping for evidence-based guidance or references as to best practices for handling something like this. Is there a "standard text" or a style guide somewhere that addresses the question? A defined approach that has been adopted/endorsed internally by a major tech company? An emulatable solution forthcoming in a new language release? If necessary, I would be satisfied with an unsupported public recommendation from a single widely recognized expert.
None of the options on offer seem very appealing. The warnings drown out other things I want to see, but I don't want to miss signed/unsigned comparisons in places where they might matter. Decrementing a loop variable of type size_t with a comparison >= 0 results in an infinite loop from unsigned integer wraparound, and even if we protect against that with something like for (size_t i = sizeof(buffer); i-- > 0 ;), there are other issues with incrementing/decrementing/comparing size_t variables. Subtracting 1 from a size_t will yield a huge positive 'oops' number when the value is unexpectedly zero (e.g. strlen(myEmptyString)). Casting a size_t to an int is not guaranteed to preserve the value (a container's size may exceed INT_MAX), and of course size_t could be wider than an int.
Given that my arrays are of known sizes well below INT_MAX, it seems to me that casting size_t to a signed integer is the best of the bunch, but it makes me cringe a little bit, especially if it has to be static_cast<int>. It is easier to take if it's hidden in a function call with some size testing, but still...
Or perhaps there's a way to turn off the warnings, but just for loop comparisons?
I find any of the three following approaches equally good.
Use a variable of type int to store the size and compare the loop variable to it.
byte buffer[100];
int size = sizeof(buffer);
for (int i = 0; i < size; ++i)
{
buffer[i] = 0;
}
Use size_t as the type of the loop variable.
byte buffer[100];
for (size_t i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
Use a pointer.
byte buffer[100];
byte* end = buffer + sizeof(buffer);
for (byte* p = buffer; p < end; ++p)
{
*p = 0;
}
If you are able to use a C++11 compiler, you can also use a range for loop.
byte buffer[100];
for (byte& b : buffer)
{
b = 0;
}
The most appropriate solution will depend entirely on context. In the context of the code fragment in your question the most appropriate action is perhaps to have type-agreement - the third option in your bullet list. This is appropriate in this case because the usage of i throughout the code is only to index the array - in this case the use of int is inappropriate - or at least unnecessary.
On the other hand if i were an arithmetic object involved in some arithmetic expression that was itself signed, the int might be appropriate and a cast would be in order.
I would suggest that as a guideline, a solution that involves the fewest number of necessary type casts (explicit or implicit) is appropriate, or to look at it another way, the maximum possible type agreement. There is no one "authoritative" rule, because the purpose and usage of the variables involved is semantically rather than syntactically dependent. In this case also, as has been pointed out in other answers, newer language features supporting iteration may avoid this specific issue altogether.
To discuss the advice you say you have been given specifically:
ignore the warnings
Never a good idea - some will be genuine semantic errors or maintenance issues, and by the time you have several hundred warnings you are ignoring, how will you spot the one warning that is an issue?
turn off the warnings
An even worse idea; the compiler is helping you to improve your code quality and reliability. Why would you disable that?
use a loop variable of type size_t
In this precise example, that is exactly what you should do; exact type agreement should always be the aim.
use a loop variable of type size_t with tricks to avoid decrementing past zero
This advice is irrelevant for the trivial example given. Moreover, I presume that by "tricks" the adviser in fact means checks, or simply correct code. There is no need for "tricks", and the term is entirely ambiguous - who knows what the adviser means? It suggests something unconventional and a bit "dirty", when there is no need for any solution with such attributes.
cast sizeof(buffer) to an int
This may be necessary if the usage of i warrants the use of int for correct semantics elsewhere in the code. The example in the question does not, so this would not be an appropriate solution in this case. Essentially, if making i a size_t here causes type agreement warnings elsewhere that cannot themselves be resolved by universal type agreement for all operands in an expression, then a cast may be appropriate (see the sketch after this list). The aim should be to achieve zero warnings and a minimum of type casts.
some extremely convoluted suggestions that I did not have the patience to follow, generally involving vectors and/or iterators
If you are not prepared to elaborate or even consider such advice, you'd have been better off omitting that "advice" from your question. In any case, the use of STL containers is not appropriate for a large segment of embedded targets; excessive code size and non-deterministic heap management are reasons to avoid them on many platforms and in many applications.
libraries that I cannot load in an embedded environment.
Not all embedded environments have equal constraints. The restriction is on your embedded environment, not by any means all embedded environments. However the "loading of libraries" to resolve or avoid type agreement issues seems like a sledgehammer to crack a nut.
free functions returning a valid int or long representing the byte count of T
It is not clear what that means. What is a "free function"? Is that just a non-member function? Such a function would necessarily contain a type cast internally, so what have you achieved other than hiding the cast?
Don't use loops (gotta love that advice).
I doubt you needed to include that advice in your list. The problem is not in any case limited to loops; it is not because you are using a loop that you have the warning, it is because you have used < with mismatched types.
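To make the "cast at the boundary" point concrete, here is a sketch (the function and names are mine, purely illustrative; it assumes the buffer size fits in an int):
#include <cstddef>
void fill_with_offsets(unsigned char *buffer, std::size_t raw_size)
{
// i also feeds signed arithmetic below, so an int index is genuinely useful here.
// One visible cast at the boundary keeps the loop itself free of mixed-sign
// comparisons. Assumes raw_size <= INT_MAX.
const int size = static_cast<int>(raw_size);
for (int i = 0; i < size; ++i)
{
buffer[i] = static_cast<unsigned char>(size - i); // signed arithmetic on i
}
}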
My favorite solution is to use C++11 or newer and skip the whole manual size bounding entirely like so:
// assuming byte is defined by something like using byte = std::uint8_t;
void test()
{
byte buffer[100];
for (auto&& b: buffer)
{
b = 0;
}
}
Alternatively, if I can't use the ranged-based for loop (but still can use C++11 or newer), my favorite syntax becomes:
void test()
{
byte buffer[100];
for (auto i = decltype(sizeof(buffer)){0}; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}
Or for iterating backwards:
void test()
{
byte buffer[100];
// relies on the well-defined modular wraparound behaviour of unsigned integers
for (auto i = sizeof(buffer) - 1; i < sizeof(buffer); --i)
{
buffer[i] = 0;
}
}
The correct generic way is to use a loop iterator of type size_t, simply because it is the most correct type for describing an array size.
There is not much need for "tricks to avoid decrementing past zero", because the size of an object can never be negative.
If you find yourself needing negative numbers to describe a variable size, it is probably because you have some special case where you are iterating across an array backwards. If so, the "trick" to deal with it is this:
for(size_t i=0; i<sizeof(array); i++)
{
size_t index = sizeof(array)-1 - i;
array[index] = something;
}
However, size_t is often an inconvenient type to use in embedded systems, because it may end up as a larger type than what your MCU can handle with one instruction, resulting in needlessly inefficient code. It may then be better to use a fixed width integer such as uint16_t, if you know the maximum size of the array in advance.
Using plain int in an embedded system is almost certainly incorrect practice. Your variables must be of deterministic size and signedness - most variables in an embedded system are unsigned. Signed variables also lead to major problems whenever you need to use bitwise operators.
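A minimal sketch of that idea (purely illustrative; it assumes the array is known to be smaller than 65536 elements):
#include <cstdint>
void clear_buffer()
{
static unsigned char buffer[100];
// Keeping the index and the bound in the same 16-bit unsigned type avoids any
// mixed-sign comparison, and may be cheaper than size_t on a small MCU.
const std::uint16_t size = sizeof(buffer); // fits: 100 < 65536
for (std::uint16_t i = 0; i < size; ++i)
{
buffer[i] = 0;
}
}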
If you are able to use C++ 11, you could use decltype to obtain the actual type of what sizeof returns, for instance:
void test()
{
byte buffer[100];
// On macOS, decltype(sizeof(buffer)) is unsigned long; this passes
// the compiler without warnings.
for (decltype(sizeof(buffer)) i = 0; i < sizeof(buffer); ++i)
{
buffer[i] = 0;
}
}

Cast ssize_t or size_t

In source files which I am using in my project, there is a comparison between ssize_t and size_t variables:
ssize_t sst;
size_t st;
if(sst == st){...}
I would like to get rid of the warning:
warning: comparison between signed and unsigned integer expressions
But I am not sure, which variable should I cast to the other?
if((size_t)sst == st){...}
or
if(sst == (ssize_t)st){...}
What is safer, better, cleaner? Thanks
There is no one right answer to this question. There are several possible answers, depending on what you know a priori about the values that those variables may take on.
If you know that sst is non-negative, then you can safely cast sst to size_t, as this will not change the value (incidentally, this is what happens if you have no cast at all).
If sst might be negative but you know that st will never be larger than SSIZE_MAX, then you can safely cast st to ssize_t, as this will not change the value.
If sst might be negative, and st might be larger than SSIZE_MAX, then neither cast is correct; either one could change the value, resulting in an incorrect comparison. Instead, you would do the following: if (sst >= 0 && (size_t)sst == st).
If you’re not absolutely certain that one of the first two situations applies, choose the third option as it is correct in all cases.
Either will work fine as long as both values fit in the positive representable range of ssize_t.
If either value doesn't, you could end up in trouble - check those cases before testing for equality:
if ((sst >= 0) && (st <= SSIZE_MAX) && (sst == (ssize_t)st))
{
...
}
(I'm sure the C++ people will recommend you avoid the C-style cast entirely - I have no doubt someone will comment or answer and let you know the right way to do that in C++.)
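For what it's worth, C++20 added a standard helper for exactly this: std::cmp_equal in <utility> compares integers of different signedness without either pitfall, for example:
#include <cstddef> // size_t
#include <sys/types.h> // ssize_t (POSIX)
#include <utility> // std::cmp_equal, C++20
bool same_value(ssize_t sst, size_t st)
{
// Correct even when sst is negative or st exceeds SSIZE_MAX.
return std::cmp_equal(sst, st);
}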

int to bool[] in c++

What I want to do is, given an argument const int &i, return the bits of the binary representation of i in the form of an array of bool (and back would also be great)... Does anyone know how?
Unless you really need it to be specifically an array of bool, I'd use an std::bitset:
std::bitset<32> bits(i);
You can normally treat that pretty much like an array of bool, testing, setting and flipping individual bits, etc. Of course, if you want portability to something that has a different size of int, you may want to modify it to something like:
#include <climits> // for CHAR_BIT
#define size (sizeof(int) * CHAR_BIT)
std::bitset<size> bits(i);
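For the "and back" part of the question, std::bitset already has the reverse conversion built in via to_ulong() (or to_ullong() in C++11); a quick sketch:
#include <bitset>
#include <climits>
#include <cstddef>
int main()
{
const std::size_t bits_in_int = sizeof(int) * CHAR_BIT;
int i = 42;
std::bitset<bits_in_int> bits(static_cast<unsigned long>(i)); // assumes i is non-negative
// ... test/flip individual bits ...
int back = static_cast<int>(bits.to_ulong()); // reverse conversion
return back == i ? 0 : 1;
}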
Edit: As people much more experienced than me point out, doing this can lead to problems if the number is negative (what happens exactly depends on your compiler). In any case, it would be meaningless to process negative numbers this way unless you also stipulated what kind of arithmetic representation the return value would use (ones' complement? two's complement? sign-magnitude?), so this kind of approach turns out to be practically useless for negative numbers as far as I can tell.
Sorry for diverting attention from more worthy answers.
Original
Well, this comes to mind:
int i = 42; // or whatever
std::vector<bool> vec;
while(i) {
vec.push_back(i & 1);
i >>= 1;
}
std::reverse(vec.begin(), vec.end()); // needs <algorithm>
Of course this is not an array, but it's trivial to copy the contents of the vector to an array instead if that's what you want, for example:
bool* boolArray = new bool[vec.size()];
std::copy(vec.begin(), vec.end(), boolArray); // vec is already most-significant-bit first after the reverse
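Going the other way (the "and back" from the question) is the mirror image; a minimal sketch, assuming the most significant bit comes first as in the reversed vector above:
#include <cstddef>
// Rebuild the integer from an MSB-first sequence of bools (illustrative helper).
int from_bits(const bool* bits, std::size_t count)
{
int value = 0;
for (std::size_t k = 0; k < count; ++k)
{
value = (value << 1) | (bits[k] ? 1 : 0);
}
return value;
}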

Why or why not should I use 'UL' to specify unsigned long?

ulong foo = 0;
ulong bar = 0UL;//this seems redundant and unnecessary. but I see it a lot.
I also see this in referencing the first element of arrays a good amount
blah = arr[0UL];//this seems silly since I don't expect the compiler to magically
//turn '0' into a signed value
Can someone provide some insight to why I need 'UL' throughout to specify specifically that this is an unsigned long?
void f(unsigned int x)
{
//
}
void f(int x)
{
//
}
...
f(3); // f(int x)
f(3u); // f(unsigned int x)
It is just another tool in C++; if you don't need it don't use it!
In the examples you provide it isn't needed. But suffixes are often used in expressions to prevent loss of precision. For example:
unsigned long x = 5UL * ...
You may get a different answer if you left off the UL suffix, say if your system had 16-bit ints and 32-bit longs.
Here is another example inspired by Richard Corden's comments:
unsigned long x = 1UL << 17;
Again, you'd get a different answer on a system with 16-bit integers if you left the suffix off.
The same type of problem will apply with 32 vs 64-bit ints and mixing long and long long in expressions.
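For example (assuming a platform where long is 32 bits and long long is 64 bits):
unsigned long long a = 1UL << 40; // shift exceeds the width of unsigned long: undefined behaviour
unsigned long long b = 1ULL << 40; // well-defined: the shift is performed in 64-bit unsigned long long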
Some compilers may emit a warning, I suppose. The author could be doing this to make sure the code compiles without warnings.
Sorry, I realize this is a rather old question, but I use this a lot in c++11 code...
ul, f and plain floating-point literals are all useful for initialising auto variables to your intended type, e.g.
auto my_u_long = 0ul;   // unsigned long
auto my_float = 0.0f;   // float (note: 0f on its own is not a valid literal)
auto my_double = 0.0;   // double (there is no 'd' suffix in C++)
Check out the cplusplus.com reference on numeric literals: http://www.cplusplus.com/doc/tutorial/constants/
You don't normally need it, and any tolerable editor will have enough assistance to keep things straight. However, the places I use it in C# are (and you'll see these in C++):
Calling a generic method (template in C++), where the parameter types are implied and you want to make sure and call the one with an unsigned long type. This happens reasonably often, including this one recently:
Tuple<ulong, ulong> t = Tuple.Create(someUlongVariable, 0UL);
where without the UL it returns Tuple<ulong, int> and won't compile.
Implicit variable declarations using the var keyword in C# or the auto keyword coming to C++. This is less common for me because I only use var to shorten very long declarations, and ulong is the opposite.
When you feel obligated to write down the type of a constant (even when not absolutely necessary) you make sure:
That you always consider how the compiler will translate this constant into bits
Whoever reads your code will know how you intended the constant to be interpreted, and that you took it into consideration (even you, when you rescan the code)
You don't spend time wondering whether you need to write the 'U'/'UL' or not
Also, several software development standards, such as MISRA, require you to state the type of the constant no matter what (at least write 'U' if unsigned)
In other words, it is considered by some to be good practice to write the type of the constant, because at worst you just ignore it, and at best you avoid bugs, avoid the chance that different compilers will treat your code differently, and improve code readability.

Which is more readable? (C++)

int valueToWrite = 0xFFFFFFFF;
static char buffer2[256];
int* writePosition = (int* ) &buffer2[5];
*writePosition = valueToWrite;
//OR
* ((int*) &buffer2[10] ) = valueToWrite;
Now, I ask you guys: which one do you find more readable, the two-step technique involving a temporary variable or the one-step technique?
Do not worry about optimization, they both optimize to the same thing, as you can see here.
Just tell me which one is more readable for you.
or DWORD PTR ?buffer2@?1??main@@9@4PADA+5, -1
or DWORD PTR ?buffer2@?1??main@@9@4PADA+10, -1
int* writePosition = (int* ) &buffer2[5]
Or
*((int*) &buffer2[10] ) = valueToWrite;
Are both incorrect because on some platforms access to unaligned values (+5 +10) may cost hundreds of CPU cycles and on some (like older ARM) it would cause an illegal operation.
The correct way is:
memcpy(buffer2 + 5, &valueToWrite, sizeof(valueToWrite));
And it is more readable.
Once you encapsulate it inside a class, it does not really matter which technique you use. The method name will provide the description as to what the code is doing. Thus, in most cases you will not have to delve into the actual impl. to see what is going on.
class Buffer
{
char buffer2[256];
public:
void write(int pos, int value) {
int* writePosition = (int*) &buffer2[pos];
*writePosition = value;
}
};
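Usage at the call site then reads something like this (a sketch, reusing the question's valueToWrite):
Buffer buf;
buf.write(5, valueToWrite); // the method name says what is going on; the pointer juggling is hidden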
If I was forced to choose, I'd say 1. However, I'll note the code as presented is very C-like either way; I'd shy away from either and re-examine the problem. Here's a simple one that is more C++-y:
const char * begin = static_cast<char*>(static_cast<void*>(&valueToWrite));
std::copy(begin, begin+sizeof(int), &buffer2[5]);
The first example is more readable purely on the basis that your brain doesn't have to decipher the pointer operations crammed together.
This will reduce the time a developer looking at the code for the first time needs to understand what's actually going on. In my experience this loosely correlates with reducing the probability of introducing new bugs.
I find the second, shorter one easier to read.
I suspect, however, that this rather depends on whether you are the type of person that can easily 'get' pointers.
The type casting from char* to int* is a little awkward, though. I presume there is a good reason this needs to be done.
Watch out -- this code probably won't work due to alignment issues! Why not just use memset?
#include <string.h>
memset(buffer2+10, 0xFF, 4);
If you can afford to tie yourself to a single compiler (or do preprocessor hacks around compatibility issues), you can use a packed-structs option to get symbolic names for the values you're writing. For example, on GCC:
struct __attribute__ ((__packed__)) packed_struct
{
char stuff_before[5];
int some_value;
};
/* .... */
static char buffer2[256];
struct packed_struct *ps = (struct packed_struct *) buffer2;
ps->some_value = valueToWrite;
This has a number of advantages:
Your code more clearly reflects what you're doing, if you name your fields well.
Since the compiler knows whether the platform you're on supports efficient unaligned access, it can automatically choose between a native unaligned access and an appropriate workaround on platforms that don't support it.
But again, has the major disadvantage of not having any standardized syntax.
Most readable would be either variant with a comment added on what you're doing there.
That being said, I despise variables introduced simply for a one-time use a couple of lines later. Doing most of my work in maintenance, getting dozens of variable names pushed in my face that are poor substitutes for an explanatory comment sets me on edge.
Definitely:
* ((int*) &buffer2[10] ) = valueToWrite;
I parse it not in one but in a few steps, and that is why it is more readable: I have all the steps on one line.
From the readability perspective, the behaviour of your code should be clear, but "clear" is not how I would describe either of these alternatives. In fact, they are the opposite of "clear", as they are non-portable.
On top of alignment issues, there's integer representation (the size varies from system to system, as does sign representation, endianness and padding to throw into the soup). Thus, the behaviour of your code from system to system is erratic.
If you want to be clear about what your algorithm is supposed to do, you should explicitly put each byte into its correct place. For example:
#include <cassert>

// Example for a two-byte value; extend the pattern for wider types.
void serialise_uint_lsb(unsigned char *destination, unsigned source) {
destination[0] = source & 0xff; source >>= 8;
destination[1] = source & 0xff; source >>= 8;
assert(source == 0); /* the value must fit in two bytes */
}
void deserialise_uint_lsb(unsigned *destination, unsigned char *source) {
*destination = 0;
*destination <<= 8; *destination += source[1];
*destination <<= 8; *destination += source[0];
}
Serialisation and deserialisation are idiomatic concepts for programmers... *printf and *scanf are forms of serialisation/deserialisation, for example, except it's idiomatically instilled into your head that the most significant (decimal) digit goes first... which is the problem with your code; your code doesn't tell your system the direction of the integer, how many bytes there are, etc... bad news.
Use a serialisation/deserialisation function. Programmers will understand that best.