The MSDN documentation for the Microsoft-specific __if_exists statement says the following (emphasis added):
Apply the __if_exists statement to identifiers both inside or outside a class. Do not apply the __if_exists statement to local variables.
Unfortunately there is no explanation for why you should not apply this to local variables. It compiles fine and has the expected effect, so I'm wondering if anyone knows why they say not to do this. Is it a correctness issue, or a maintainability issue, or something else?
I realize that this is a Microsoft-specific feature and not portable, but let's assume for argument's sake that there's a good reason to use it.
EDIT: Some folks are curious why I'm doing this, so here's an explanation. I realize this is a dirty hack, so unless you have a good suggestion for a better way to do it, please don't bother pointing out that it's gross. It's the least-gross alternative we were able to find given the large size of the code base.
We have a large body of legacy code (millions of lines) that uses the Microsoft-specific __FUNCTION__ macro as part of an error logging package. A significant fraction of that code is now wrapped inside lambda functions so that we can catch structured exceptions (with __try/__except) and still use unwindable objects. Inside those lambda functions, __FUNCTION__ evaluates to something useless like `anonymous-namespace'::<lambda23>::operator(), which is not useful for anything. Our workaround for this is to define new __FUNCTION__-like macro which checks for the existence of an alternate local variable with the enclosing function name, using __if_exists. Due to how the macros work, we can easily switch to the new __FUNCTION__ substitute and easily define the alternate name variable without changing tons of code, so it's a reasonably clean solution given the limitations. That is, of course, assuming that it's valid to use __if_exists this way.
As I said above, I know it's a dirty hack, so please don't tell me how ugly it is unless you have good ideas on how to do it better.
I don't know for sure, but one guess is a local variable might be optimized away by compiler, and maybe not of course, which renders __if_exists test unrelieable.
And I also don't see the reason to do this for a local variable, you are in that specific scope, you know everything, why you want to test if a local variable exist?
__if_exists is a dirty old hack inside Visual C++, with severe implementation limitations as it was only intended for ATL.
Local variables are special because you can have two local variables with the same name:
void foo()
{
int i = 1;
{
int i = 2;
}
}
This means there's a more complicated datastructure inside the compiler to track them. __if_exists has to do a name lookup, which may not be correct for some types of nested scopes like this.
Another historical case is that in Visual C++, for wasn't correctly scoped:
void foo()
{
for (int i = 1; false; ) { }
__if_exists(i) // What do you expect? VC++ let i escape.
}
Related
Below is the C++ function in a project I took over lately. Each of the last two statements is just a variable, containing no assignment. What will such kind of statement do? Lately, I saw such kinds of statements usually.
__fastcall TCardActionArea::TCardActionArea(TComponent* Owner)
:TArea(Owner,"CardActionArea")
{
// Get the thread id
ThreadId = std::__threadid();
this->Visible= false;
m_pBackGroundPicture = NULL;
m_pActionButtonMap.clear();
m_ActionsButtonDisplayed.clear();
m_changecnt = 0;
m_isNextbtn = true;
m_PictureParamPath1;
m_PictureParamPath2;
}
Normally, these statements do not do anything, and it is definitely not a common practice to write them.
Maybe the author just wanted to explicitly note that they do not need to assign any values to these members (although a comment would do better).
Maybe this is some hack for a particular compiler to prevent some optimization (e.g. to prevent the member from being optimized away), but it would be a very slippery hack that may not work on a next compiler version.
Maybe the author intended to assign something to these variables and just forgot to do this, so this may be a bug.
Or maybe the author just had some kind of template, e.g. listing all the members to make sure they did not forgot anything, and just kept the parts of template they did not need to change.
The only time I've seen statements like this used was to silence compiler warnings about unreferenced variables (usually function arguments). I haven't checked whether MSVC (which features of this code lead me to believe was used, at least originally) issues such warnings about unused members, although that does seem a stretch as it would only work in some whole-code analysis mode.
We are writing safety-critical code and I'd like a stronger way than [[nodiscard]] to ensure that checking of function return values is caught by the compiler.
[Update]
Thanks for all the discussion in the comments. Let me clarify that this question may seem contrived, or not "typical use case", or not how someone else would do it. Please take this as an academic exercise if that makes it easier to ignore "well why don't you just do it this way?". The question is exactly whether it's possible to create a type(s) that fails compiling if it is not assigned to an l-value as the return result of a function call .
I know about [[nodiscard]], warnings-as-errors, and exceptions, and this question asks if it's possible to achieve something similar, that is a compile time error, not something caught at run-time. I'm beginning to suspect it's not possible, and so any explanation why is very much appreciated.
Constraints:
MSVC++ 2019
Something that doesn't rely on warnings
Warnings-as-Errors also doesn't work
It's not feasible to constantly run static analysis
Macros are OK
Not a runtime check, but caught by the compiler
Not exception-based
I've been trying to think how to create a type(s) that, if it's not assigned to a variable from a function return, the compiler flags an error.
Example:
struct MustCheck
{
bool success;
...???...
};
MustCheck DoSomething( args )
{
...
return MustCheck{true};
}
int main(void) {
MustCheck res = DoSomething(blah);
if( !res.success ) { exit(-1); }
DoSomething( bloop ); // <------- compiler error
}
If such a thing is provably impossible through the type system, I'll also accept that answer ;)
(EDIT) Note 1: I have been thinking about your problem and reached the conclusion that the question is ill posed. It is not clear what you are looking for because of a small detail: what counts as checking? How the checkings compose and how far from the point of calling?
For example, does this count as checking? note that composition of boolean values (results) and/or other runtime variable matters.
bool b = true; // for example
auto res1 = DoSomething1(blah);
auto res2 = DoSomething2(blah);
if((res1 and res2) or b){...handle error...};
The composition with other runtime variables makes it impossible to make any guarantee at compile-time and for composition with other "results" you will have to exclude certain logical operators, like OR or XOR.
(EDIT) Note 2: I should have asked before but 1) if the handling is supposed to always abort: why not abort from the DoSomething function directly? 2) if handling does a specific action on failure, then pass it as a lambda to DoSomething (after all you are controlling what it returns, and what it takese). 3) composition of failures or propagation is the only not trivial case, and it is not well defined in your question.
Below is the original answer.
This doesn't fulfill all the (edited) requirements you have (I think they are excessive) but I think this is the only path forward really.
Below my comments.
As you hinted, for doing this at runtime there are recipes online about "exploding" types (they assert/abort on destruction if they where not checked, tracked by an internal flag).
Note that this doesn't use exceptions (but it is runtime and it is not that bad if you test the code often, it is after all a logical error).
For compile-time, it is more tricky, returning (for example a bool) with [[nodiscard]] is not enough because there are ways of no discarding without checking for example assigning to a (bool) variable.
I think the next layer is to active -Wunused-variable -Wunused-expression -Wunused-parameter (and treat it like an error -Werror=...).
Then it is much harder to not check the bool because comparison is pretty much to only operation you can really do with a bool.
(You can assign to another bool but then you will have to use that variable).
I guess that's quite enough.
There are still Machiavelian ways to mark a variable as used.
For that you can invent a bool-like type (class) that is 1) [[nodiscard]] itself (classes can be marked nodiscard), 2) the only supported operation is ==(bool) or !=(bool) (maybe not even copyable) and return that from your function. (as a bonus you don't need to mark your function as [[nodiscard]] because it is automatic.)
I guess it is impossible to avoid something like (void)b; but that in itself becomes a flag.
Even if you cannot avoid the absence of checking, you can force patterns that will immediately raise eyebrows at least.
You can even combine the runtime and compile time strategy.
(Make CheckedBool exploding.)
This will cover so many cases that you have to be happy at this point.
If compiler flags don’t protect you, you will have still a backup that can be detected in unit tests (regardless of taking the error path!).
(And don’t tell me now that you don’t unit test critical code.)
What you want is a special case of substructural types. Rust is famous for implementing a special case called "affine" types, where you can "use" something "at most once". Here, you instead want "relevant" types, where you have to use something at least once.
C++ has no official built-in support for such things. Maybe we can fake it? I thought not. In the "appendix" to this answer I include my original logic for why I thought so. Meanwhile, here's how to do it.
(Note: I have not tested any of this; I have not written any C++ in years; use at your own risk.)
First, we create a protected destructor in MustCheck. Thus, if we simply ignore the return value, we will get an error. But how do we avoid getting an error if we don't ignore the return value? Something like this.
(This looks scary: don't worry, we wrap most of it in a macro.)
int main(){
struct Temp123 : MustCheck {
void f() {
MustCheck* mc = this;
*mc = DoSomething();
}
} res;
res.f();
if(!res.success) print "oops";
}
Okay, that looks horrible, but after defining a suitable macro, we get:
int main(){
CAPTURE_RESULT(res, DoSomething());
if(!res.success) print "oops";
}
I leave the macro as an exercise to the reader, but it should be doable. You should probably use __LINE__ or something to generate the name Temp123, but it shouldn't be too hard.
Disclaimer
Note that this is all sorts of hacky and terrible, and you likely don't want to actually use this. Using [[nodiscard]] has the advantage of allowing you to use natural return types, instead of this MustCheck thing. That means that you can create a function, and then one year later add nodiscard, and you only have to fix the callers that did the wrong thing. If you migrate to MustCheck, you have to migrate all the callers, even those that did the right thing.
Another problem with this approach is that it is unreadable without macros, but IDEs can't follow macros very well. If you really care about avoiding bugs then it really helps if your IDE and other static analyzers understand your code as well as possible.
As mentioned in the comments you can use [[nodiscard]] as per:
https://learn.microsoft.com/en-us/cpp/cpp/attributes?view=msvc-160
And modify to use this warning as compile error:
https://learn.microsoft.com/en-us/cpp/preprocessor/warning?view=msvc-160
That should cover your use case.
You can't have code outside of functions except for declarations, definitions and preprocessor directives.
Is that statement accurate, or is there something I'm missing? I'm teaching my nephew to program, and he was trying to put a while loop before main. He's pretty young, I want to give him a hard simple rule that he can understand.
Not quite -- you can also put expressions in global variable declarations:
int myGlobalVar = 3 + SomeFunction(4) - anotherGlobalVar;
But you can only put expressions here, which have to evaluate to the value you're initializing the global with. You cannot put full statements (no blocks of code, no if statements, no loops, etc.). This code will get executed before main() gets a chance to run, so be careful with what you do here. I'd recommend against calling functions in global initializers unless you can't avoid it.
For your nephew:
no, you can't do it.
For yourself:
The compiler's input is technically what you get after the preprocessor is run. So, let's leave preprocessor out. After it has worked, you get a C++ program which is a sequence of declarations. Some delcarations may also be definitions, and some definitions (like function definitions) may have statements inside them.
HTH
Yes- you can't stick random executable code outside functions.
Yes, every kind of statement that does something must reside inside a context that can use it (this doesn't apply to variable initialization).
This because C++ is a structured programming language that encloses its behaviour inside procedures, as opposed to unstructured ones in which you have just one level of code and no scopes.
Well, there's namespaces...and the stuff Adam Rosenfield mentioned...and there's also exception try/catch that can be sort of external to functions. Unfortunately, I can't remember the syntax and can't find it with google.
Just having an conversation with collegue at work how to declare a variables.
For me I already decided which style I prefer, but maybe I wrong.
"C" style - all variable at the begining of function.
If you want to know data type of variable, just look at the begining of function.
bool Foo()
{
PARAM* pParam = NULL;
bool rc;
while (true)
{
rc = GetParam(pParam);
... do something with pParam
}
}
"C++" style - declare variables as local as possible.
bool Foo()
{
while (true)
{
PARAM* pParam = NULL;
bool rc = GetParam(pParam);
... do something with pParam
}
}
What do you prefer?
Update
The question is regarding POD variables.
The second one. (C++ style)
There are at least two good reasons for this:
This allow you to apply the YAGNI principle in the code, as you only declare variable when you need them, as close as possible to their use. That make the code easier to understand quickly as you don't have to get back and forth in the function to understand it all. The type of each variable is the main information about the variable and is not always obvious in the varaible name. In short : the code is easier to read.
This allow better compiler optimizations (when possible). Read : http://www.tantalon.com/pete/cppopt/asyougo.htm#PostponeVariableDeclaration
If due to the language you are using you are required to declare variables at the top of the function then clearly you must do this.
If you have a choice then it makes more sense to declare variables where they are used. The rule of thumb I use is: Declare variables with the smallest scope that is required.
Reducing the scope of a variable prevents some types errors, for example where you accidentally use a variable outside of a loop that was intended only to be used inside the loop. Reducing the scope of the variable will allow the compiler to spot the error instead of having code that compiles but fails at runtime.
I prefer the "C++ style". Mainly because it allows RAII, which you do in both your examples for the bool variable.
Furthermore, having a tight scope for the variable provides the compile better oppertunities for optimizations.
This is probably a bit subjective.
I prefer as locally as possible because it makes it completely clear what scope is intended for the variable, and the compiler generates an error if you access it outside the intended useful scope.
This isn't a style issue. In C++, non-POD types will have their constructors called at the point of declaration and destructors called at the end of the scope. You have to be wise about selecting where to declare variables or you will cause unnecessary performance issues. For example, declaring a class variable inside a loop may not be the wisest idea since constructor/destructor will be called every iteration of the loop. But sometimes, declaring class variables at the top of the function may not be the best if there is a chance that variable doesn't get used at all (like a variable is only used inside some 'if' statement).
I prefer C style because the C++ style has one major flaw to me: in a dense function it is very hard on eyes to find the declaration/initialization of the variable. (No syntax highlighting was able yet to cope reliably and predictably with my C++ coding hazards habits.)
Though I do adhere to no style strictly: only key variables are put there and most smallish minor variables live within the block where they are needed (like bool rc in your example).
But all important key variables in my code inevitably end up being declared on the top. And if in a nested block I have too much local variables, that is the sign that I have to start thinking about splitting the code into smaller functions.
This is a C++ disaster, check out this code sample:
#include <iostream>
void func(const int* shouldnotChange)
{
int* canChange = (int*) shouldnotChange;
*canChange += 2;
return;
}
int main() {
int i = 5;
func(&i);
std::cout << i;
return 0;
}
The output was 7!
So, how can we make sure of the behavior of C++ functions, if it was able to change a supposed-to-be-constant parameter!?
EDIT: I am not asking how can I make sure that my code is working as expected, rather I am wondering how to believe that someone else's function (for instance some function in some dll library) isn't going to change a parameter or posses some behavior...
Based on your edit, your question is "how can I trust 3rd party code not to be stupid?"
The short answer is "you can't." If you don't have access to the source, or don't have time to inspect it, you can only trust the author to have written sane code. In your example, the author of the function declaration specifically claims that the code will not change the contents of the pointer by using the const keyword. You can either trust that claim, or not. There are ways of testing this, as suggested by others, but if you need to test large amounts of code, it will be very labour intensive. Perhaps moreso than reading the code.
If you are working on a team and you have a team member writing stuff like this, then you can talk to them about it and explain why it is bad.
By writing sane code.
If you write code you can't trust, then obviously your code won't be trustworthy.
Similar stupid tricks are possible in pretty much any language. In C#, you can modify the code at runtime through reflection. You can inspect and change private class members. How do you protect against that? You don't, you just have to write code that behaves as you expect.
Apart from that, write a unittest testing that the function does not change its parameter.
The general rule in C++ is that the language is designed to protect you from Murphy, not Machiavelli. In other words, its meant to keep a maintainance programmer from accidentally changing a variable marked as const, not to keep someone from deliberatly changing it, which can be done in many ways.
A C-style cast means all bets are off. It's sort of like telling the compiler "Trust me, I know this looks bad, but I need to do this, so don't tell me I'm wrong." Also, what you've done is actually undefined. Casting off const-ness and then modifying the value means the compiler/runtime can do anything, including e.g. crash your program.
The only thing I can suggest is to allocate the variable shouldNotChange from a memory page that is marked as read-only. This will force the OS/CPU to raise an error if the application attempts to write to that memory. I don't really recommend this as a general method of validating functions just as an idea you may find useful.
The simplest way to enforce this would be to just not pass a pointer:
void func(int shouldnotChange);
Now a copy will be made of the argument. The function can change the value all it likes, but the original value will not be modified.
If you can't change the function's interface then you could make a copy of the value before calling the function:
int i = 5;
int copy = i
func(©);
Don't use C style casts in C++.
We have 4 cast operators in C++ (listed here in order of danger)
static_cast<> Safe (When used to 'convert numeric data types').
dynamic_cast<> Safe (but throws exceptions/returns NULL)
const_cast<> Dangerous (when removing const).
static_cast<> Very Dangerous (When used to cast pointer types. Not a very good idea!!!!!)
reinterpret_cast<> Very Dangerous. Use this only if you understand the consequences.
You can always tell the compiler that you know better than it does and the compiler will accept you at face value (the reason being that you don't want the compiler getting in the way when you actually do know better).
Power over the compiler is a two edged sword. If you know what you are doing it is a powerful tool the will help, but if you get things wrong it will blow up in your face.
Unfortunately, the compiler has reasons for most things so if you over-ride its default behavior then you better know what you are doing. Cast is one the things. A lot of the time it is fine. But if you start casting away const(ness) then you better know what you are doing.
(int*) is the casting syntax from C. C++ supports it fully, but it is not recommended.
In C++ the equivalent cast should've been written like this:
int* canChange = static_cast<int*>(shouldnotChange);
And indeed, if you wrote that, the compiler would NOT have allowed such a cast.
What you're doing is writing C code and expecting the C++ compiler to catch your mistake, which is sort of unfair if you think about it.