Does "default" switch case disturb jump table optimization? - c++

In my code I'm used to write fall-back default cases containing asserts like the following, to guard me against forgetting to update the switch in case semantics change
switch(mode) {
case ModeA: ... ;
case ModeB: ... ;
case .. /* many of them ... */
default: {
assert(0 && "Unknown mode!");
return ADummyValue();
}
};
Now I wonder whether the artificial fall-back check default case will interfere with jump table generations? Imagine "ModeA" an "ModeB" etc are consecutive so the compiler could optimize into a table. Since the "default" case contains an actual "return" statement (since the assert will disappear in release mode and the compiler will moan about a missing return statement), it seems unlikely the compiler optimizes the default branch away.
What's the best way to handle this? Some friend recommended me to replace "ADummyValue" with a null pointer dereference, so that the compiler, in presence of undefined behavior, could omit to warn about a missing return statement. Are there better ways to solve this?

If your compiler is MSVC, you can use __assume intrinsic : http://msdn.microsoft.com/en-us/library/1b3fsfxw(v=VS.80).aspx

At least with the compilers I've looked at, the answer is generally no. Most of them will compile a switch statement like this to code roughly equivalent to:
if (mode < modeA || mode > modeLast) {
assert(0 && "Unknown mode!");
return ADummyValue();
}
switch(mode) {
case modeA: ...;
case modeB: ...;
case modeC: ...;
// ...
case modeLast: ...;
}

if you're using "default" (ha!) <assert.h>, the definition's tied to the NDEBUG macro anyway, so maybe just
case nevermind:
#if !defined(NDEBUG)
default:
assert("can" && !"happen");
#endif
}

I only see 1 solution in case the optimization actually is disturbed: the infamous "#ifndef NDEBUG" round the default case. Not the nicest trick, but clear in this situation.
BTW: did you already have a look what your compiler does with and without the default case?

If you have a state that should never be reached, then you should kill the program, because it just reached an unexpected state, even in the release mode (you might just be more diplomatic and actually save users data and do all that other nice stuff before going down).
And please don't obsess over micro optimizations unless you actually have measured (using a profiler) that you need them.

The best way to handle this is not to disable the assert. That way you can also keep an eye on possible bugs. Sometimes it is better for the application to crash with a good message explaining what exactly happened, then to continue working.

Use compiler extensions:
// assume.hpp
#pragma once
#if defined _MSC_VER
#define MY_ASSUME(e) (__assume(e), (e) ? void() : void())
#elif defined __GNUC__
#define MY_ASSUME(e) ((e) ? void() : __builtin_unreachable())
#else // defined __GNUC__
#error unknown compiler
#endif // defined __GNUC__
-
// assert.hpp
#include <cassert>
#include "assume.hpp"
#undef MY_ASSERT
#ifdef NDEBUG
#define MY_ASSERT MY_ASSUME
#else // NDEBUG
#define MY_ASSERT assert
#endif // NDEBUG

Related

Way to toggle debugging code on and off

I was programming a manchester decoding algorithm for arduino, and I often had to print debug stuff when trying to get things working, but printing to serial and string constants add a lot of overhead. I can't just leave it there in the final binary.
I usually just go through the code removing anything debug related lines.
I'm looking for a way to easily turn it on and off.
The only way I know is this
#if VERBOSE==1
Serial.println();
Serial.print(s);
Serial.print(" ");
Serial.print(t);
Serial.print(" preamble");
#endif
...
#if VERBOSE==1
Serial.println(" SYNC!\n");
#endif
and on top of the file I can just have
#define VERBOSE 0 // 1 to debug
I don't like how much clutter it adds to single liners. I was very tempted to do something very nasty like this. But yeah, evil.
Change every debug output to
verbose("debug message");
then use
#define verbose(x) Serial.print(x) //debug on
or
#define verbose(x) //debug off
There's a C++ feature that allows me to just do this instead of preprocessor?
At the risk of sounding silly: Yes, there is a C++ feature for this, it looks like this:
if (DEBUG)
{
// Your debugging stuff here…
}
If DEBUG is a compile-time constant (I think using a macro is reasonable but not required in this case), the compiler will almost certainly generate no code (not even a branch) for the debugging stuff if debug is false at compile-time.
In my code, I like having several debugging levels. Then I can write things like this:
if (DEBUG_LEVEL >= DEBUG_LEVEL_FINE)
{
// Your debugging stuff here…
}
Again, the compiler will optimize away the entire construct if the condition is false at compile-time.
You can even get more fancy by allowing a two-fold debugging level. A maximum level enabled at compile-time and the actual level used at run-time.
if (MAX_DEBUG >= DEBUG_LEVEL_FINE && Config.getDebugLevel() >= DEBUG_LEVEL_FINE)
{
// Your debugging stuff here…
}
You can #define MAX_DEBUG to the highest level you want to be able to select at run-time. In an all-performance build, you can #define MAX_DEBUG 0 which will make the condition always false and not generate any code at all. (Of course, you cannot select debugging at run-time in this case.)
However, if squeezing out the last instruction is not the most important issue and all your debugging code does is some logging, then the usual pattern lokks like this:
class Logger
{
public:
enum class LoggingLevel { ERROR, WARNING, INFO, … };
void logError(const std::string&) const;
void logWarning(const std::string&) const;
void logInfo(const std::string&) const;
// …
private:
LoggingLevel level_;
};
The various functions then compare the current logging level to the level indicated by the function name and if it is less, immediately return. Except in tight loops, this will probably be the most convenient solution.
And finally, we can combine both worlds by providing inline wrappers for the Logger class.
class Logger
{
public:
enum class LoggingLevel { ERROR, WARNING, INFO, … };
void
logError(const char *const msg) const
{
if (COMPILE_TIME_LOGGING_LEVEL >= LoggingLevel::ERROR)
this->log_(LoggingLevel::ERROR, msg);
}
void
logError(const std::string& msg) const
{
if (COMPILE_TIME_LOGGING_LEVEL >= LoggingLevel::ERROR)
this->log_(LoggingLevel::ERROR, msg.c_str());
}
// …
private:
LoggingLevel level_;
void
log_(LoggingLevel, const char *) const;
};
As long as evaluating the function arguments for your Logger::logError etc calls does not have visible side-effects, chances are good that the compiler will eliminate the call if the conditional in the inline function is false. This is why I have added the overloads that take a raw C-string to optimize the frequent case where the function is called with a string literal. Look at the assembly to be sure.
Personally I wouldn't have a a lot of #ifdef DEBUG scattered around my code:
#ifdef DEBUG
printf("something");
#endif
// some code ...
#ifdef DEBUG
printf("something else");
#endif
rather, I would wrap it in a function:
void DebugPrint(const char const *debugText) // ToDo: make it variadic [1]
{
#ifdef DEBUG
printf(debugText);
#endif
}
DebugPrint("something");
// some code ...
DebugPrint("something else");
If you don't define DEBUG then the macro preprocessor (not the compiler) won't expand that code.
The slight downside of my approach is that, although it makes your cod cleaner, it imposes an extra function call, even if DEBUG is not defined. It is possible that a smart linker will realize that the called function is empty and will remove the function calls, but I wouldn't bank on it.
References:
“Variadic function” in: Wikipedia, The Free Encyclopedia.
I also would suggest to use inline functions which become empty if a flag is set. Why when it is set? Because you usually want to debug always unless you compile a release build.
Because NDEBUG is already used you could use it too to avoid using multiple different flags. The definition of a debug level is also very useful.
One more thing to say: Be careful using functions which are altered by using macros! You could easily violate the One Definition Rule by translating some parts of your code with and some other without debugging disabled.
You might follow the convention of assert(3) and wrap debugging code with
#ifndef NDEBUG
DebugPrint("something");
#endif
See here (on StackOverflow, which would be a better place to ask) for a practical example.
In a more C++ like style, you could consider
ifdef NDEBUG
#define debugout(Out) do{} while(0)
#else
extern bool dodebug;
#define debugout(Out) do {if (dodebug) { \
std::cout << __FILE__ << ":" << __LINE__ \
<< " " << Out << std::endl; \
}} while(0)
#endif
then use debugout("here x=" << x) in your program. YMMV. (you'll set your dodebug flag either thru a gdb command or thru some program argument, perhaps parsed using getopt_long(3), at least on Linux).
PS. Remind that the do{...}while(0) is an old trick to make a robust statement like macro (suitable in every position where a plain statement is, e.g. as the then or else part of an if etc...).
You could also use templates utilizing the constexpr if feature in C++17. you don't have to worry about the preprocessor at all but your declaration and definition have to be in the same place when using templates.

Does an empty function get optimized out with impure expressions?

// Example assert function
inline void assertImpl(bool mExpr, const std::string& mMsg) {
if(!mExpr) { printMsg(mMsg); abortExecution(); }
}
// Wrapper macro
#ifdef NDEBUG
#define MY_ASSERT(...) do{ ; }while(false)
#else
#define MY_ASSERT(...) assertImpl(__VA_ARGS__)
#endif
Consider the case where mExpr or mMsg are not-pure expressions - is there a way to force the compiler to optimize them out anyway?
bool globalState{false};
bool pureExpression() { return false; }
bool impureExpression() { globalState = !globalState; return false; }
// ...
// This will very likely be optimized out with (NDEBUG == 0)
MY_ASSERT(pureExpression());
// Will this be optimized out with (NDEBUG == 0)
MY_ASSERT(impureExpression());
What do compilers usually do in a situation where an impure expression is "discarded"?
Is there a way to make 100% sure that pure expressions get optimized out?
Is there a way to make 100% sure that impure expressions get optimized out or never get optimized out?
After macro expansion, your call to impureExpression() no longer exists: it's not part of the macro expansion result. If the call to your function isn't there, the function won't be called, on all conforming implementations, at any optimisation level, as long as NDEBUG is defined.
Note: you talk about NDEBUG == 0, but if that is what you want the condition to be, your #ifdef NDEBUG condition is incorrect. #ifdef tests whether the macro is defined, and pays no attention to what the definition is.
The optimizer is not involved here. In the macro that is enabled with NDEBUG, the arguments are discarded regardless.

Emulating GCC's __builtin_unreachable?

I get a whole lot of warnings about switches that only partially covers the range of an enumeration switched over. Therefor, I would like to have a "default" for all those switches and put __builtin_unreachable (GCC builtin) in that case, so that the compiler know that case is not reachable.
However, I came to know that GCC4.3 does not support that builtin yet. Is there any good way to emulate that functionality? I thought about dereferencing a null pointer instead, but that may have other undesirable effects/warnings and such. Do you have any better idea?
The upcoming 2023 revision of the C standard (C23, ISO/IEC 9899:2023) is going to have a new macro unreachable
#include <stddef.h>
void unreachable(void);
with the effect of gcc's __builtin_unreachable.
On older C standards, you may be able to call an inline function declared _Noreturn to mark anything after that call as unreachable. The compiler is allowed to throw out any code after such a function. If the function itself is static (and does return), the compiler will usually also inline the function. Here is an example:
static _Noreturn void unreachable() {
return; /* intentional */
}
/* ... */
foo();
bar(); /* should better not return */
unreachable();
baz(); /* compiler will know this is not reachable */
Notice that you invoke undefined behavior if a function marked _Noreturn indeed returns. Be sure that said function will never be called.
Hmm, something like (since __builtin_unreachable() appeared in 4.5):
#define GCC_VERSION (__GNUC__ * 10000 \
+ __GNUC_MINOR__ * 100 \
+ __GNUC_PATCHLEVEL__)
#if GCC_VERSION >= 40500
#define my_unreachable() __builtin_unreachable()
#else
#define my_unreachable() do { printf("Oh noes!!!111\n"); abort(); } while(0)
#endif
Would abort (leaving a core dump) or throw (allowing for alternate data capture) accommodate your needs?
Do you really want to have switch statements that don't cover the full enumeration? I nearly always try to list all the possible cases (to no-op) with no default case so that gcc will warn me if new enumerations are added, as it may be required to handle them rather than letting it silently (during compile) fall into the default.
keep it simple:
assert(false);
or, better yet:
#define UNREACHABLE (!"Unreachable code executed!")
assert(UNREACHABLE);
template<unsigned int LINE> class Unreachable_At_Line {};
#define __builtin_unreachable() throw Unreachable_At_Line<__LINE__>()
Edit:
Since you want to have unreachable code to be omitted by compiler, below is the simplest way.
#define __builtin_unreachable() { struct X {X& operator=(const X&); } x; x=x; }
Compiler optimizes away x = x; instruction especially when it's unreachable. Here is the usage:
int foo (int i)
{
switch(i)
{
case 0: return 0;
case 1: return 1;
default: return -1;
}
__builtin_unreachable(); // never executed; so compiler optimizes away
}
If you put __builtin_unreachable() in the beginning of foo() then compiler generates a linker error for unimplemented operator =. I ran these tests in gcc 3.4.6 (64-bit).

How to silence 'The last statement should return a value' warning?

Sun Studio 12.1 prints the warning
Warning: The last statement should return a value.
frequently for functions like that:
int f()
{
/* some code that may return */
// if we end up here, something is broken
throw std::runtime_error("Error ...");
}
It is perfectly clear that we do not need a return value at the end of the function. I hesitate to insert something like
// Silence a compiler warning
return 42;
at the end of such a function, since it is dead code anyway. For more complicated return types, it might actually be difficult to construct a 'sensible' bogus value.
What is the recommended way to silence such a warning?
Can you reorganize the code in the function in such a way (hopefully more logical as well) that the normal path happens at the end of the function so that a return can be used, and the exceptional path happens earlier, NOT as the last statement?
EDIT: If reorganizing the function really doesn't make sense, you can always just put a dummy return 0; with a comment. It's better to squelch the warning that way than more globally.
If you really want to quiet the warning permanently, you can use #pragma error_messages (off, wnoretvalue) but note that the warning really is useful most of the time so I absolutely don't suggest turning it off. You can use the on version of the pragma to re-enable the warning after the function, but the compiler will still emit the warning if your function is ever inlined. If you put the function in its own source file and use the pragma that should shush the warning relatively safely though, since it can't affect other translation units.
Another really wacky possibility is to switch to g++. Unless you're compiling for SPARC g++ may actually generate better code than Sun studio.
I find it a perfect spot for abort(). You should never end there, according to you, so something like:
UNREACHABLE("message")
which expands into:
#ifdef NDEBUG
#define UNREACHABLE(Message_) abort();
#else
#define UNREACHABLE(Message_) assert(0 && Message_);
#endif
Looks appropriate
Since you know the exception will be systematically called, why don't you simply return a 0?
Perhaps encapsulate the contents in a do { } while (false); construct:
int my_function()
{
int result = DEFAULT_VALUE;
do
{
result = /*...*/
// Whatever
if (error)
{
throw std::runtime_error("Error ...");
}
} while (false);
return result;
}
The idea is for normal operation to set the result value then let the execution flow to the end or use a break to jump to the return statement.
I don't know of a "recommended" way to deal with it, but to answer your question about coping with more complex types, what about:
ComplexType foo()
{
...
throw std::runtime( "Error..." );
return *(ComplexType*)(0);
}
This would then work with any return type. I realise it looks evil, but its there just to silence the warning. As you say, this code will never be executed, and it may even be optimised out.

Advantage of macro over in-line in C++

We know that in-line are favorable as they are checked by the compiler and same operation ( like ++x ) does not evaluate more than once when passed as an argument as compared to macros.
But in an interview I was asked the specific advantages or the circumstances when a macro is more favorable to inline in C++.
Does anyone know the answer or can give a thought on this question ?
The only thing I can think of is there are some tricks that you can do with a macro that can't be done with an inline function. Pasting tokens together at compile-time, and that sort of hackery.
Here is a specific situation where macros are not only preferred, they are actually the only way to accomplish something.
If you want to write a logging function which logs not only some message, but the file & line number of where the instance occured, you can either call your function directly, typing in the file & line values (or macros) directly:
LogError("Something Bad!", __FILE__, __LINE__);
...or, if you want it to work automatically, you must rely on a macro (warning: not compiled):
#define LogErrorEx(ERR) (LogError(ERR, __FILE__, __LINE__))
// ...
LogErrorEx("Something Else Bad!");
This cannot be achieved using templates, default parameters, default construction, or any other device in C++.
Sometimes you want to extend the language in ways that aren't possible with any other method.
#include <iostream>
#define CHECK(x) if (x); else std::cerr << "CHECK(" #x ") failed!" << std::endl
int main() {
int x = 053;
CHECK(x == 42);
return 0;
}
This prints CHECK(x == 42) failed!.
In C++ specifically, one usage of MACROs that seem pop up very often (except for the debug print with file and line) is the use of MACROs to fill in a set of standard methods in a class that cannot be inherited from a base class. In some libraries that create custom mechanisms of RTTI, serialization, expression templates, etc., they often rely on a set of static const variables and static methods (and possibly special semantics for some overloaded operators that cannot be inherited) which are almost always the same but need to be added to any class that the user defines within this framework. In these cases, MACROs are often provided such that the user doesn't have to worry about putting all the necessary code (he only has to invoke the MACRO with the require info). For example, if I make a simple RTTI (Run-Time Type Identification) system, I might require that all classes have a TypeID and be dynamically castable:
class Foo : public Bar {
MY_RTTI_REGISTER_CLASS(Foo, Bar, 0xBAADF00D)
};
#define MY_RTTI_REGISTER_CLASS(CLASSNAME,BASECLASS,UNIQUEID) \
public:\
static const int TypeID = UNIQUEID;\
virtual void* CastTo(int aTypeID) {\
if(aTypeID == TypeID)\
return this;\
else\
return BASECLASS::CastTo(aTypeID);\
};
The above could not be done with templates or inheritance, and it makes the user's life easier and avoids code repetition.
I would say that this kind of use of MACROs is by far the most common in C++.
As already said, macros can use preprocessor directives: __FILE__, __LINE__ for instance, but of course #include and #define can also be useful to parameter behaviour:
#ifdef __DEBUG__
# define LOG(s) std:cout << s << std:endl
#else
# define LOG(s)
#endif
Depending wether __DEBUG__ is defined or not (via #define or via compiler options), the LOG macro will be active or not. This is an easy way to have debug info everywhere in your code that can be easily de-activated.
You can also think of changing the way memory is allocated (malloc will be redefined to target a memory pool instead of the standard heap for instance, etc...).
Inline functions are, as the name indicates, restricted to functional tasks, execution of some code.
Macros have a much broader application they may expand e.g to declarations or replace entire language constructs. Some examples (written for C and C++) that can't be done with functions:
typedef struct POD { double a; unsigned b } POD;
#declare POD_INITIALIZER { 1.0, 37u }
POD myPOD = POD_INITIALIZER;
#define DIFFICULT_CASE(X) case (X)+2 :; case (X)+3
#define EASY_CASE(X) case (X)+4 :; case (X)+5
switch (a) {
default: ++a; break;
EASY_CASE('0'): --a; break;
DIFFICULT_CASE('A'): a = helperfunction(a); break;
}
#define PRINT_VALUE(X) \
do { \
char const* _form = #X " has value 0x%lx\n"; \
fprintf(stderr, _form, (unsigned long)(X)); \
} while (false)
In the context of C++, Boost has a lot of more examples that are more involved and more useful.
But because with such macros you are in some sort extending the language (not strictly, the preprocessor is part of it) many people dislike macros, particularly in the C++ community, a bit less in the C community.
In any case, if you use such constructs you should always be very clear in what the should achieve, document well, and fight against the temptation to obfuscate your code.
A macro is just like a text replacement definition.
These essential differences that come into my mind are:
It must not be function-like. I mean it must not necessarily contain some consistent set of brackets for example.
It can be used elsewhere. Like in a class declaration scope or even in the global scope. So it must not be in the scope of another function.
You must use them if you want to perform actions that are impossible to be performed using functions:
initializing complicated tables (makes core more readable)
ease declaration of some special members like event IDs or tag classes (used a lot in MFC IMPLEMENT_DYNAMIC)
squeeze repetitive declarations at the beginning of functions
the already mentioned use of __LINE__, __FILE__, ... for logging
#include <stdio.h>
#define sq(x) x*x
int main()
{
printf("%d", sq(2+1));
printf("%d", sq(2+5));
return 0;
}
The output for this code are 5 and 17. Macros expand textually. Its not like functions.
Explanation for this example:
sq(2+1) = 2+1*2+1 = 2+2+1 = 5
sq(2+5) = 2+5*2+5 = 2+10+5 = 17
I would add two uses:
MIN and MAX, until C++0x, because the return type had to be declared by hand, mixed min and max as inlined functions would have been nightmarish, while a simple macro did it in the blink of an eye.
privacy: you can always undef the macro before exiting your header, you cannot "undeclare" an inline function (or another symbol). This is due to the absence of proper modularity in C and C++ languages.