Why use 'function address == NULL' instead of 'false'? - c++

Browsing among some legacy code I've found such function:
static inline bool EmptyFunc()
{
return (void*) EmptyFunc == NULL;
}
What are the differences from this one:
static inline bool EmptyFunc()
{
return false;
}
This code was created to compile under several different platforms, like PS2, Wii, PC... Are there any reason to use the first function? Like better optimization or avoiding some strange compiler misbehavior?

Semantically both functions are the same: they always return false*. Folding the first expression to a constant value "false" is completely allowed by the standard since it would not change any observable side-effects (of which there are none). Since the compiler sees the entire function it also free to optimize away any calls to it and replace it with a constant "false" value.
That is, there is no "general" value in the first form and is likely a mistake on the part of the programmer. The only possibility is that it exploits some special behaviour (or defect) in a specific compiler/version. To what end I don't know however. If you wish to prevent inlining using a compiler-specific attribute would be the correct approach -- anything else is prone to breaking should the compiler change.
(*This assumes that NULL is never defined to be EmptyFunc, which would result in true being returned.).

Strictly speaking, a function pointer may not be cast to a void pointer, what happens then is outside the scope of the standard. The C11 standard lists it as a "common extension" in J.5.7 (I suspect that the same applies in C++). So the only difference between the two cases in that the former is non-portable.
It would seem that the most likely cause of the former version is either a confused programmer or a confused compiler. We can tell for certain that the programmer was confused/sloppy by the lack of an explaining comment.
It doesn't really make much sense to declare a function as inline and then try to trick the compiler into not inlining the code by including the function address in the code. So I think we can rule out that theory, unless of course the programmer was confused and thought it made sense.

Related

How to force a compile error in C++(17) if a function return value isn't checked? Ideally through the type system

We are writing safety-critical code and I'd like a stronger way than [[nodiscard]] to ensure that checking of function return values is caught by the compiler.
[Update]
Thanks for all the discussion in the comments. Let me clarify that this question may seem contrived, or not "typical use case", or not how someone else would do it. Please take this as an academic exercise if that makes it easier to ignore "well why don't you just do it this way?". The question is exactly whether it's possible to create a type(s) that fails compiling if it is not assigned to an l-value as the return result of a function call .
I know about [[nodiscard]], warnings-as-errors, and exceptions, and this question asks if it's possible to achieve something similar, that is a compile time error, not something caught at run-time. I'm beginning to suspect it's not possible, and so any explanation why is very much appreciated.
Constraints:
MSVC++ 2019
Something that doesn't rely on warnings
Warnings-as-Errors also doesn't work
It's not feasible to constantly run static analysis
Macros are OK
Not a runtime check, but caught by the compiler
Not exception-based
I've been trying to think how to create a type(s) that, if it's not assigned to a variable from a function return, the compiler flags an error.
Example:
struct MustCheck
{
bool success;
...???...
};
MustCheck DoSomething( args )
{
...
return MustCheck{true};
}
int main(void) {
MustCheck res = DoSomething(blah);
if( !res.success ) { exit(-1); }
DoSomething( bloop ); // <------- compiler error
}
If such a thing is provably impossible through the type system, I'll also accept that answer ;)
(EDIT) Note 1: I have been thinking about your problem and reached the conclusion that the question is ill posed. It is not clear what you are looking for because of a small detail: what counts as checking? How the checkings compose and how far from the point of calling?
For example, does this count as checking? note that composition of boolean values (results) and/or other runtime variable matters.
bool b = true; // for example
auto res1 = DoSomething1(blah);
auto res2 = DoSomething2(blah);
if((res1 and res2) or b){...handle error...};
The composition with other runtime variables makes it impossible to make any guarantee at compile-time and for composition with other "results" you will have to exclude certain logical operators, like OR or XOR.
(EDIT) Note 2: I should have asked before but 1) if the handling is supposed to always abort: why not abort from the DoSomething function directly? 2) if handling does a specific action on failure, then pass it as a lambda to DoSomething (after all you are controlling what it returns, and what it takese). 3) composition of failures or propagation is the only not trivial case, and it is not well defined in your question.
Below is the original answer.
This doesn't fulfill all the (edited) requirements you have (I think they are excessive) but I think this is the only path forward really.
Below my comments.
As you hinted, for doing this at runtime there are recipes online about "exploding" types (they assert/abort on destruction if they where not checked, tracked by an internal flag).
Note that this doesn't use exceptions (but it is runtime and it is not that bad if you test the code often, it is after all a logical error).
For compile-time, it is more tricky, returning (for example a bool) with [[nodiscard]] is not enough because there are ways of no discarding without checking for example assigning to a (bool) variable.
I think the next layer is to active -Wunused-variable -Wunused-expression -Wunused-parameter (and treat it like an error -Werror=...).
Then it is much harder to not check the bool because comparison is pretty much to only operation you can really do with a bool.
(You can assign to another bool but then you will have to use that variable).
I guess that's quite enough.
There are still Machiavelian ways to mark a variable as used.
For that you can invent a bool-like type (class) that is 1) [[nodiscard]] itself (classes can be marked nodiscard), 2) the only supported operation is ==(bool) or !=(bool) (maybe not even copyable) and return that from your function. (as a bonus you don't need to mark your function as [[nodiscard]] because it is automatic.)
I guess it is impossible to avoid something like (void)b; but that in itself becomes a flag.
Even if you cannot avoid the absence of checking, you can force patterns that will immediately raise eyebrows at least.
You can even combine the runtime and compile time strategy.
(Make CheckedBool exploding.)
This will cover so many cases that you have to be happy at this point.
If compiler flags don’t protect you, you will have still a backup that can be detected in unit tests (regardless of taking the error path!).
(And don’t tell me now that you don’t unit test critical code.)
What you want is a special case of substructural types. Rust is famous for implementing a special case called "affine" types, where you can "use" something "at most once". Here, you instead want "relevant" types, where you have to use something at least once.
C++ has no official built-in support for such things. Maybe we can fake it? I thought not. In the "appendix" to this answer I include my original logic for why I thought so. Meanwhile, here's how to do it.
(Note: I have not tested any of this; I have not written any C++ in years; use at your own risk.)
First, we create a protected destructor in MustCheck. Thus, if we simply ignore the return value, we will get an error. But how do we avoid getting an error if we don't ignore the return value? Something like this.
(This looks scary: don't worry, we wrap most of it in a macro.)
int main(){
struct Temp123 : MustCheck {
void f() {
MustCheck* mc = this;
*mc = DoSomething();
}
} res;
res.f();
if(!res.success) print "oops";
}
Okay, that looks horrible, but after defining a suitable macro, we get:
int main(){
CAPTURE_RESULT(res, DoSomething());
if(!res.success) print "oops";
}
I leave the macro as an exercise to the reader, but it should be doable. You should probably use __LINE__ or something to generate the name Temp123, but it shouldn't be too hard.
Disclaimer
Note that this is all sorts of hacky and terrible, and you likely don't want to actually use this. Using [[nodiscard]] has the advantage of allowing you to use natural return types, instead of this MustCheck thing. That means that you can create a function, and then one year later add nodiscard, and you only have to fix the callers that did the wrong thing. If you migrate to MustCheck, you have to migrate all the callers, even those that did the right thing.
Another problem with this approach is that it is unreadable without macros, but IDEs can't follow macros very well. If you really care about avoiding bugs then it really helps if your IDE and other static analyzers understand your code as well as possible.
As mentioned in the comments you can use [[nodiscard]] as per:
https://learn.microsoft.com/en-us/cpp/cpp/attributes?view=msvc-160
And modify to use this warning as compile error:
https://learn.microsoft.com/en-us/cpp/preprocessor/warning?view=msvc-160
That should cover your use case.

What would break if control-reaches-end-of-body were to return nullopt?

C++14 gave us automatic return type deduction, and C++17 has an optional<T> type template (or type constructor, if you will). Now, true, optional lives within the standard library, not the language itself, but - why not use it as return value from a non-void function when control reaches the end of the body? I would think that:
optional<int> foo(int x)
{
if (x > 0) return 2 * x;
}
should be perfectly valid and compilable syntax for a partial function, returning optional<int>.
Now, I know this is a bit of a crazy idea. My question is not whether you like it or not, but rather - suppose everyone on the committee really liked it for some strange reason. What would it break / conflict with?
Note: Of course if you specify a non-optional return value this can't work, but that doesn't count as breakage.
Think of functions that end with abort(); or a custom function that has the same effect. If the compiler cannot statically prove functions never reach the closing }, this would force the compiler to generate dead code, and is therefore in conflict with one of the principles of C++, namely the zero overhead principle: what you don't use, you don't pay for.
Special casing std::optional is ridiculous here. Users should be able to write their own first-class equivalent to std::optional.
Which means falling off the end of a function needs to involve using some kind of magic to figure out what the implicit return value should be.
The easiest magic is that falling-off-the-end is equivalent to return {}; In the case of optional, this is nullopt. If I read my standardese correctly, for int this is 0, and this matches the behavior of falling-off-the-end-of-main.
There are downsides. First, suppose you have a function:
int foo(bool condition) {
if (condition) return 7;
custom_abort(); // does not return, but not marked up with `[[noreturn]]`
}
This would cause the compiler to write a return {}; after custom_abort(); if the compiler cannot prove that abort doesn't return. This has a cost (in binary size at the least). Currently, the compiler is free to exclude any work required to return from foo after abort() and assume abort() will not return.
It is true that no valid programs will behave differently with this change, but what was previously undefined behavior becomes defined, and that can have costs.
We could approach this in a slightly different way:
int foo(bool condition) {
if (condition) return 7;
custom_abort();
this cannot be reached;
}
where we add in an explicit "this location cannot be reached" to C++.
Once added, we could then issue warnings for code paths that do not return, and in a later standard enforce the rule that all code paths must either assert they cannot be reached, or must return.
After such a transformation of the language was in place for a standard cycle or two, then implicit return {}; would be harmless, except for people who skipped over the return cannot happen phase of standardization.
Up to now, it is full Undefined Behavior. That means no valid existing code contains this construct. Adding well-defined behavior will therefore break no valid code. As for code that was broken, that may or may not be broken if your proposal would be accepted, but that's almost never a concern for WG21.
The main concern would be how it would interact with other language features. I don't see a conflict with constexpr; falling off the end of a constexpr function would give an empty constexpr optional<T>. The [[noreturn]] attribute obviously makes no sense. The [[nodiscard]] attribute affects the caller, not the implementation. Exceptions are not affected either. So on the whole, the proposal seems to stand on its own.
In a proposal to WG21, it might be worth suggesting a less radical alternative: make plain return; a valid alternative for return optional<T>{};

Need help regarding macro definition

Im reading c++ code, i have found such definition
#define USE_VAL(X) if (&X-1) {}
has anybody idea, what does it mean?
Based on the name, it looks like a way of getting rid of an "unused variable" warning. The intended use is probably something like this:
int function(int i)
{
USE_VAL(i)
return 42;
}
Without this, you could get a compiler warning that the parameter i is unused inside the function.
However, it's a rather dangerous way of going about this, because it introduces Undefined Behaviour into the code (pointer arithmetic beyond bounds of an actual array is Undefined by the standard). It is possible to add 1 to an address of an object, but not subtract 1. Of course, with + 1 instead of - 1, the compiler could then warn about "condition always true." It's possible that the optimiser will remove the entire if and the code will remain valid, but optimisers are getting better at exploiting "undefined behaviour cannot happen," which could actually mess up the code quite unexpectedly.
Not to mention that fact that operator& could be overloaded for the type involved, potentially leading to undesired side effects.
There are better ways of implementing such functionality, such as casting to void:
#define USE_VAL(X) static_cast<void>(X)
However, my personal preference is to comment out the name of the parameter in the function definition, like this:
int function(int /*i*/)
{
return 42;
}
The advantage of this is that it actually prevents you from accidentally using the parameter after passing it to the macro.
Typically it's to avoid an "unused return value" warning. Even if the usual "cast to void" idiom normally works for unused function parameters, gcc with -pedantic is particularly strict when ignoring the return values of functions such as fread (in general, functions marked with __attribute__((warn_unused_result))), so a "fake if" is often used to trick the compiler in thinking you are doing something with the return value.
A macro is a pre-processor directive, meaning that wherever it's used, it will be replaced by the relevant piece of code.
and here after USE_VAL(X) the space it is explain what will USE_VAL(X) do.
first it take the address of x and then subtract 1 from it. if it is 0 then do nothing.
where USE_VAL(X) will used it will replaced by the if (&X-1) {}

Why is returning a reference to a function local value not a compile error?

The following code invokes undefined behaviour.
int& foo()
{
int bar = 1234;
return bar;
}
g++ issues a warning:
warning: reference to local variable ‘bar’ returned [-Wreturn-local-addr]
clang++ too:
warning: reference to stack memory associated with local variable 'bar' returned [-Wreturn-stack-address]
Why is this not a compile error (ignoring -Werror)?
Is there a case where returning a ref to a local var is valid?
EDIT As pointed out, the spec mandates this be compilable. So, why does the spec not prohibit such code?
I would say that requiring this to make the program ill-formed (that is, make this a compilation error) would complicate the standard considerably for little benefit. You'd have to exactly spell out in the standard when such cases shall be diagnosed, and all compilers would have to implement them.
If you specify too little, it will not be too useful. And compilers probably already check for this to emit warnings, and real programmers compile with -Wall_you_can_give_me -Werror anyway.
If you specify too much, it will be difficult (or impossible) for compilers to implement the standard.
Consider this class (for which you only have the header and a library):
class Foo
{
int x;
public:
int& getInteger();
};
And this code:
int& bar()
{
Foo f;
return f.getInteger();
}
Now, should the standard be written to make this ill-formed or not? Probably not, what if Foo is implemented like this:
#include "Foo.h"
int global;
int& Foo::getInteger()
{
return global;
}
At the same time, it could be implemented like this:
#include "Foo.h"
int& Foo::getInteger()
{
return x;
}
Which of course would give you a dangling reference.
My point is that the compiler cannot really know whether returning a reference is OK or not, except for a few trivial cases (returning a reference to a function-scope automatic variable or parameter of non-reference type). I don't think it's worth it to complicate the standard for that. Especially as most compilers already warn about this as a quality-of-implementation matter.
Also, because you may want to get the current stack pointer (whatever that means on your particular implementation).
This function:
void* get_stack_pointer (void) { int x; return &x; };
AFAIK, it is not undefined behavior if you don't dereference the resulting pointer.
is much more portable than this one:
void* get_stack_pointer (void) {
register void* sp asm ("%esp"); return sp; }
As to why you may want to get the stack pointer: well, there are cases where you have a valid reason to get it: for instance the conservative Boehm garbage collector needs to scan the stack (so wants the stack pointer and the stack bottom).
And if you returned a C++ reference on which you would only take its address using the & unary operator, getting such an address is IIUC legal (it is IMHO the only licit operation you can do on it).
Another reason to get the stack pointer would be to get a non-NULL pointer address (which you could e.g. hash) different of any heap, local or static data. However, you could use (void*)1 or (void*)-1 for that purpose.
So the compiler is right in only warning against this.
I guess that a C++ compiler should accept
int& get_sp_ref(void) { int x; return x; }
void show_sp(void) {
std::cout << (&(get_sp_ref())) << std::endl; }
For the same reason C allows you to return a pointer to a memory block that's been freed.
It's valid according to the language specification. It's a horribly bad idea (and is nowhere close to being guaranteed to work) but it's still valid inasmuch as it's not forbidden.
If you're asking why the standard allows this, it's probably because, when references were introduced, that's the way they worked. Each iteration of the standard has certain guidelines to follow (such as minimising the possibility of "breaking changes", those that render existing well-formed programs invalid) and the standard is an agreement between user and implementer, with undoubtedly more implementers than users sitting on the committees :-)
It may be worth pushing that idea through as a potential change and seeing what ISO say but I suspect it would be considered one of those "breaking changes" and therefore very suspect.
To expand on the earlier answers, the ISO C++ standard does not capture the distinction between warnings and errors to begin with; it simply uses the term 'diagnostic' when referring to what a compiler must emit upon seeing an ill-formed program. Quoting N3337, 1.4, paragraphs 1 and 2:
The set of diagnosable rules consists of all syntactic and semantic rules in this
International Standard except for those rules containing an explicit notation that “no
diagnostic is required” or which are described as resulting in “undefined behavior.”
Although this International Standard states only requirements on C++ implementations,
those requirements are often easier to understand if they are phrased as requirements on
programs, parts of programs, or execution of programs. Such requirements have the
following meaning:
If a program contains no violations of the rules in this International Standard, a
conforming implementation shall, within its resource limits, accept and correctly execute
that program.
If a program contains a violation of any diagnosable rule or an occurrence of a
construct described in this Standard as “conditionally-supported” when the
implementation does not support that construct, a conforming implementation shall issue
at least one diagnostic message.
If a program contains a violation of a rule for which no diagnostic is required, this
International Standard places no requirement on implementations with respect to that
program.
Something not mentioned by other answers yet is that this code is OK if the function is never called.
The compiler isn't required to diagnose whether a function might ever be called or not. For example you might set up a program which looks for counterexamples to Fermat's Last Theorem, and calls this function if it finds one. It would be a mistake for the compiler to reject such a program.
Returning reference into local variable is bad idea, however some people may create code which requires that, so compiler should only warn about that and don't determine valid (valid structure) code as erroneous.
Angew already posted sample with local variable that is actually global. However there is some other (IMHO better) sample.
Object& GetSmth()
{
Object* obj = new Object();
return *obj;
}
In this case reference to local object is valid and caller after usage should dealocate memory.
IMPORTANT NOTE I don't encourage and don't recommend to use such coding style, because it is bad, usually it is hard to understand what is going on and it leads in some kind of problems like memory leaks or crashes. It is just a sample which shows why this particular situation cannot be treated as error.

Expressions with no side effects in C++

See, what I don't get is, why should programs like the following be legal?
int main()
{
static const int i = 0;
i < i > i;
}
I mean, surely, nobody actually has any current programs that have expressions with no side effects in them, since that would be very pointless, and it would make parsing & compiling the language much easier. So why not just disallow them? What benefit does the language actually gain from allowing this kind of syntax?
Another example being like this:
int main() {
static const int i = 0;
int x = (i);
}
What is the actual benefit of such statements?
And things like the most vexing parse. Does anybody, ever, declare functions in the middle of other functions? I mean, we got rid of things like implicit function declaration, and things like that. Why not just get rid of them for C++0x?
Probably because banning then would make the specification more complex, which would make compilers more complex.
it would make parsing & compiling the
language much easier
I don't see how. Why is it easier to parse and compile i < i > i if you're required to issue a diagnostic, than it is to parse it if you're allowed to do anything you damn well please provided that the emitted code has no side-effects?
The Java compiler forbids unreachable code (as opposed to code with no effect), which is a mixed blessing for the programmer, and requires a little bit of extra work from the compiler than what a C++ compiler is actually required to do (basic block dependency analysis). Should C++ forbid unreachable code? Probably not. Even though C++ compilers certainly do enough optimization to identify unreachable basic blocks, in some cases they may do too much. Should if (foo) { ...} be an illegal unreachable block if foo is a false compile-time constant? What if it's not a compile-time constant, but the optimizer has figured out how to calculate the value, should it be legal and the compiler has to realise that the reason it's removing it is implementation-specific, so as not to give an error? More special cases.
nobody actually has any current
programs that have expressions with no
side effects in them
Loads. For example, if NDEBUG is true, then assert expands to a void expression with no effect. So that's yet more special cases needed in the compiler to permit some useless expressions, but not others.
The rationale, I believe, is that if it expanded to nothing then (a) compilers would end up throwing warnings for things like if (foo) assert(bar);, and (b) code like this would be legal in release but not in debug, which is just confusing:
assert(foo) // oops, forgot the semi-colon
foo.bar();
things like the most vexing parse
That's why it's called "vexing". It's a backward-compatibility issue really. If C++ now changed the meaning of those vexing parses, the meaning of existing code would change. Not much existing code, as you point out, but the C++ committee takes a fairly strong line on backward compatibility. If you want a language that changes every five minutes, use Perl ;-)
Anyway, it's too late now. Even if we had some great insight that the C++0x committee had missed, why some feature should be removed or incompatibly changed, they aren't going to break anything in the FCD unless the FCD is definitively in error.
Note that for all of your suggestions, any compiler could issue a warning for them (actually, I don't understand what your problem is with the second example, but certainly for useless expressions and for vexing parses in function bodies). If you're right that nobody does it deliberately, the warnings would cause no harm. If you're wrong that nobody does it deliberately, your stated case for removing them is incorrect. Warnings in popular compilers could pave the way for removing a feature, especially since the standard is authored largely by compiler-writers. The fact that we don't always get warnings for these things suggests to me that there's more to it than you think.
It's convenient sometimes to put useless statements into a program and compile it just to make sure they're legal - e.g. that the types involve can be resolved/matched etc.
Especially in generated code (macros as well as more elaborate external mechanisms, templates where Policies or types may introduce meaningless expansions in some no-op cases), having less special uncompilable cases to avoid keeps things simpler
There may be some temporarily commented code that removes the meaningful usage of a variable, but it could be a pain to have to similarly identify and comment all the variables that aren't used elsewhere.
While in your examples you show the variables being "int" immediately above the pointless usage, in practice the types may be much more complicated (e.g. operator<()) and whether the operations have side effects may even be unknown to the compiler (e.g. out-of-line functions), so any benefit's limited to simpler cases.
C++ needs a good reason to break backwards (and retained C) compatibility.
Why should doing nothing be treated as a special case? Furthermore, whilst the above cases are easy to spot, one could imagine far more complicated programs where it's not so easy to identify that there are no side effects.
As an iteration of the C++ standard, C++0x have to be backward compatible. Nobody can assert that the statements you wrote does not exist in some piece of critical software written/owned by, say, NASA or DoD.
Anyway regarding your very first example, the parser cannot assert that i is a static constant expression, and that i < i > i is a useless expression -- e.g. if i is a templated type, i < i > i is an "invalid variable declaration", not a "useless computation", and still not a parse error.
Maybe the operator was overloaded to have side effects like cout<<i; This is the reason why they cannot be removed now. On the other hand C# forbids non-assignment or method calls expresions to be used as statements and I believe this is a good thing as it makes the code more clear and semantically correct. However C# had the opportunity to forbid this from the very beginning which C++ does not.
Expressions with no side effects can turn up more often than you think in templated and macro code. If you've ever declared std::vector<int>, you've instantiated template code with no side effects. std::vector must destruct all its elements when releasing itself, in case you stored a class for type T. This requires, at some point, a statement similar to ptr->~T(); to invoke the destructor. int has no destructor though, so the call has no side effects and will be removed entirely by the optimizer. It's also likely it will be inside a loop, then the entire loop has no side effects, so the entire loop is removed by the optimizer.
So if you disallowed expressions with no side effects, std::vector<int> wouldn't work, for one.
Another common case is assert(a == b). In release builds you want these asserts to disappear - but you can't re-define them as an empty macro, otherwise statements like if (x) assert(a == b); suddenly put the next statement in to the if statement - a disaster! In this case assert(x) can be redefined as ((void)0), which is a statement that has no side effects. Now the if statement works correctly in release builds too - it just does nothing.
These are just two common cases. There are many more you probably don't know about. So, while expressions with no side effects seem redundant, they're actually functionally important. An optimizer will remove them entirely so there's no performance impact, too.