You can't have code outside of functions except for declarations, definitions and preprocessor directives.
Is that statement accurate, or is there something I'm missing? I'm teaching my nephew to program, and he was trying to put a while loop before main. He's pretty young, so I want to give him a simple, hard-and-fast rule that he can understand.
Not quite -- you can also put expressions in global variable declarations:
int myGlobalVar = 3 + SomeFunction(4) - anotherGlobalVar;
But you can only put expressions here, which have to evaluate to the value you're initializing the global with. You cannot put full statements (no blocks of code, no if statements, no loops, etc.). This code will get executed before main() gets a chance to run, so be careful with what you do here. I'd recommend against calling functions in global initializers unless you can't avoid it.
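To make the distinction concrete, here's a minimal sketch reusing the names from the snippet above:

int SomeFunction(int x) { return x * 2; }   // function definition: fine at global scope
int anotherGlobalVar = 10;                  // variable definition with initializer: fine
int myGlobalVar = 3 + SomeFunction(4) - anotherGlobalVar;   // initializer expression: fine

// while (myGlobalVar < 100) { ++myGlobalVar; }   // error: a statement can't appear here

int main()
{
    while (myGlobalVar < 100) { ++myGlobalVar; }  // statements belong inside a function
}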
For your nephew:
no, you can't do it.
For yourself:
The compiler's input is technically what you get after the preprocessor is run. So, let's leave the preprocessor out. After it has worked, you get a C++ program which is a sequence of declarations. Some declarations may also be definitions, and some definitions (like function definitions) may have statements inside them.
HTH
Yes -- you can't stick random executable code outside functions.
Yes, every statement that does something must live inside a function body (variable initialization being the exception).
This is because C++ is a structured programming language that encloses its behaviour inside procedures, as opposed to unstructured languages in which you have just one level of code and no scopes.
Well, there are namespaces... and the stuff Adam Rosenfield mentioned... and there's also an exception try/catch that can sit sort of outside the function body (a function-try-block). Unfortunately, I can't remember the syntax and can't find it with Google.
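If the syntax in question is the function-try-block, a minimal sketch looks like this (parse is just an illustrative function):

#include <stdexcept>
#include <string>

int parse(const std::string& s)
try                                   // the try wraps the entire function body
{
    return std::stoi(s);
}
catch (const std::exception&)         // this catch sits outside the body's braces
{
    return 0;
}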
Coming from the Ruby world, I instantly understood why Crystal chose not to implement a for loop. But then I was surprised to see that Crystal does implement for in its macro language. I was even more surprised to find that macros don't allow an enumerable (.each, etc.) syntax (i.e. {% ["one", "two", "three"].each do |value| %} isn't valid macro syntax).
Is there a logical reason for this syntax difference? It's possible that the answer is simply ~"because the devs decided that macro syntax looks like x, and non-macro syntax looks like y", but I'm guessing that there is more to it than that (an arbitrary syntax inconsistency seems like a flaw).
Thanks!
The main reason is that when the parser parses foo.bar do |arg| ... end, it expects an expression after |arg|, not %}, which is a parse error. So to allow that we'd need to enhance the parser (which is already quite complex) to take that into account. for was chosen because of this, but also to make it clear that macro code isn't regular Crystal but a different thing (an interpreted subset of Crystal and the standard library).
Another reason is that if each and other iteration methods are allowed, why not while and until? That could allow endless loops in macros, which with just for aren't possible, so you can guarantee a macro finishes executing. Which... is actually not true given that we have run inside macros.
So I think I'm not opposed to changing the language to allow each, each_with_index, etc., inside macros, and allow that syntax, and eventually remove for from the macro language. Opening an issue requesting this is a good step in that direction.
I know that I am trying to shoot myself in the foot ;) However, it will allow me to make the rest of the code (a large amount) smaller and more readable.
Is there any tricky way to create preprocessor macro inside of another preprocessor macro?
Here is an example of what I am looking for. My real scenario is more complex.
// That's what I want to do and surely C++ doesn't like it.
#define MACROCREATOR(B) #define MACRO##B B+B

void foo()
{
    MACROCREATOR(5)   // This should create a new macro (#define MACRO5 5+5)
    int a = MACRO5;   // this will use the new macro
}
The C++ Standard says (16.3.4.3):
The resulting completely macro-replaced preprocessing token sequence [... of the macro expansion ...] is not processed as a preprocessing directive even if it resembles one...
So no, there is no 'official' way of achieving what you want with macros.
No. Even if a macro expands into something that looks like a preprocessing directive, the expansion is not evaluated as a preprocessing directive.
As a supplement to the answers above, if you really wanted to pre-process a source file twice—which is almost definitely not what you actually want to do—you could always invoke your compiler like this:
g++ -E input.cpp | g++ -c -x c++ - -o output.o
That is, run the file through the preprocessor, then run the preprocessed output via pipe through a full compilation routine, including a second preprocessing step. In order for this to have a reasonably good chance of working, I'd imagine you'd have to be rather careful in how you defined and used your macros, and all in all it would most likely not be worth the trouble and increased build time.
If you really want macros, use standard macro-based solutions. If you really want compile-time metaprogramming, use templates.
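As a minimal sketch of the template route (this assumes C++14 variable templates), the value the intended MACRO5 would expand to can instead be computed at compile time:

template <int B>
constexpr int macro_value = B + B;   // evaluated at compile time

int a = macro_value<5>;              // same value the intended MACRO5 (5+5) would give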
On a slightly related note, this reminds me of the fact that the POV-Ray ray tracer's scene description language made heavy use of a fairly complex preprocessing language, with flow-control directives such as #while that allowed conditional repetition, compile-time calculations, and other such goodies. Would that it were so in C++, but it simply isn't, so we just do it another way.
No. The pre-processor is single-pass. It doesn't re-evaluate the macro expansions.
As noted, one can #include a particular file more than once with different macro definitions active. This can make it practical to achieve some effects that could not be practically achieved via any other means.
As a simple example, on many embedded systems pointer indirection is very expensive compared to direct variable access. Code which uses a lot of pointer indirection may very well be twice as large and slow as code which simply uses variables. Consequently, if a particular routine is used with two sets of variables, in a scenario where one would usually pass in a pointer to a structure and then use the arrow operator, it may be far more efficient to simply put the routine in its own file (I normally use extension .i) which is #included once without the macro _PASS2 defined, and a second time with it. That file can then use #ifdef _PASS2 / #else to define macros for all the variables that should differ between the two passes. Even though the code gets generated twice, on some micros that will take less space than using the arrow operator with passed-in pointers.
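A minimal sketch of that pattern might look like this (the file and variable names are invented for illustration):

// filter.i -- included twice, once per variable set
#ifdef _PASS2
  #define FILTER_STATE  filter_state_b
  #define FILTER_STEP   filter_step_b
#else
  #define FILTER_STATE  filter_state_a
  #define FILTER_STEP   filter_step_a
#endif

static void FILTER_STEP(int sample)
{
    FILTER_STATE += sample;   // direct variable access, no pointer indirection
}

#undef FILTER_STATE
#undef FILTER_STEP

// main file -- declares the variables, then includes the routine twice
static int filter_state_a, filter_state_b;

#include "filter.i"           // generates filter_step_a()
#define _PASS2
#include "filter.i"           // generates filter_step_b()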
Take a look at m4. It is similar to cpp, but recursive and much more powerful. I've used m4 to create a structured language for assemblers, e.g.
    cmp r0, #0
    if(eq)
        mov r1, #0
    else
        add r1, #1
    end
The "if", "else", and "end" are calls to m4 macros I wrote that generate jumps and labels, the rest is native assembly. In order to nest these if/else/end constructs, you need to do defines within a macro.
The MSDN documentation for the Microsoft-specific __if_exists statement says the following (emphasis added):
Apply the __if_exists statement to identifiers both inside or outside a class. Do not apply the __if_exists statement to local variables.
Unfortunately there is no explanation for why you should not apply this to local variables. It compiles fine and has the expected effect, so I'm wondering if anyone knows why they say not to do this. Is it a correctness issue, or a maintainability issue, or something else?
I realize that this is a Microsoft-specific feature and not portable, but let's assume for argument's sake that there's a good reason to use it.
EDIT: Some folks are curious why I'm doing this, so here's an explanation. I realize this is a dirty hack, so unless you have a good suggestion for a better way to do it, please don't bother pointing out that it's gross. It's the least-gross alternative we were able to find given the large size of the code base.
We have a large body of legacy code (millions of lines) that uses the Microsoft-specific __FUNCTION__ macro as part of an error logging package. A significant fraction of that code is now wrapped inside lambda functions so that we can catch structured exceptions (with __try/__except) and still use unwindable objects. Inside those lambda functions, __FUNCTION__ evaluates to something like `anonymous-namespace'::<lambda23>::operator(), which is not useful for anything. Our workaround for this is to define a new __FUNCTION__-like macro which checks for the existence of an alternate local variable holding the enclosing function's name, using __if_exists. Due to how the macros work, we can easily switch to the new __FUNCTION__ substitute and easily define the alternate name variable without changing tons of code, so it's a reasonably clean solution given the limitations. That is, of course, assuming that it's valid to use __if_exists this way.
As I said above, I know it's a dirty hack, so please don't tell me how ugly it is unless you have good ideas on how to do it better.
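For what it's worth, here is a simplified, hypothetical sketch of what the macros might look like (enclosing_function_name__ and Log are invented names; the real code differs):

void Log(const char* name);   // stand-in for the real logging function

#define CAPTURE_FUNCTION_NAME() \
    const char* const enclosing_function_name__ = __FUNCTION__

#define LOG_CURRENT_FUNCTION()                      \
    do {                                            \
        const char* name__ = __FUNCTION__;          \
        __if_exists(enclosing_function_name__) {    \
            name__ = enclosing_function_name__;     \
        }                                           \
        Log(name__);                                \
    } while (0)

void Outer()
{
    CAPTURE_FUNCTION_NAME();        // records "Outer"
    auto body = [&]() {
        LOG_CURRENT_FUNCTION();     // logs "Outer" instead of the lambda's operator() name
    };
    body();
}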
I don't know for sure, but one guess is that a local variable might be optimized away by the compiler (or might not, of course), which would render the __if_exists test unreliable.
I also don't see the reason to do this for a local variable: you are in that specific scope, you know everything that's there, so why would you want to test whether a local variable exists?
__if_exists is a dirty old hack inside Visual C++, with severe implementation limitations as it was only intended for ATL.
Local variables are special because you can have two local variables with the same name:
void foo()
{
    int i = 1;
    {
        int i = 2;
    }
}
This means there's a more complicated data structure inside the compiler to track them. __if_exists has to do a name lookup, which may not be correct for some types of nested scopes like this.
Another historical case is that in older versions of Visual C++, for wasn't correctly scoped:
void foo()
{
    for (int i = 1; false; ) { }
    __if_exists(i) { }   // What do you expect? VC++ let i escape the loop.
}
Just had a conversation with a colleague at work about how to declare variables.
I've already decided which style I prefer, but maybe I'm wrong.
"C" style - declare all variables at the beginning of the function.
If you want to know the data type of a variable, just look at the beginning of the function.
bool Foo()
{
    PARAM* pParam = NULL;
    bool rc;
    while (true)
    {
        rc = GetParam(pParam);
        // ... do something with pParam
    }
}
"C++" style - declare variables as local as possible.
bool Foo()
{
    while (true)
    {
        PARAM* pParam = NULL;
        bool rc = GetParam(pParam);
        // ... do something with pParam
    }
}
What do you prefer?
Update
The question is regarding POD variables.
The second one. (C++ style)
There are at least two good reasons for this:
This allows you to apply the YAGNI principle in the code, as you only declare variables when you need them, as close as possible to their use. That makes the code easier to understand quickly, as you don't have to jump back and forth in the function to follow it. The type of each variable is key information about it and is not always obvious from the variable name. In short: the code is easier to read.
This allows better compiler optimizations (when possible). Read: http://www.tantalon.com/pete/cppopt/asyougo.htm#PostponeVariableDeclaration
If due to the language you are using you are required to declare variables at the top of the function then clearly you must do this.
If you have a choice then it makes more sense to declare variables where they are used. The rule of thumb I use is: Declare variables with the smallest scope that is required.
Reducing the scope of a variable prevents some types of errors, for example where you accidentally use a variable outside of a loop that was intended only to be used inside the loop. Reducing the scope of the variable allows the compiler to spot the mistake instead of producing code that compiles but fails at runtime.
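For example, a minimal sketch (process is just an illustrative helper):

void process(int) { }   // hypothetical helper, just for illustration

void Foo()
{
    for (int i = 0; i < 10; ++i)
    {
        process(i);
    }
    // process(i);   // error: 'i' is no longer in scope, so the compiler catches the mistake
}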
I prefer the "C++ style". Mainly because it allows RAII, which you do in both your examples for the bool variable.
Furthermore, having a tight scope for the variable provides the compile better oppertunities for optimizations.
This is probably a bit subjective.
I prefer as locally as possible because it makes it completely clear what scope is intended for the variable, and the compiler generates an error if you access it outside the intended useful scope.
This isn't a style issue. In C++, non-POD types have their constructors called at the point of declaration and their destructors called at the end of the scope. You have to be wise about selecting where to declare variables or you will cause unnecessary performance issues. For example, declaring a class variable inside a loop may not be the wisest idea, since the constructor/destructor will be called on every iteration of the loop. But sometimes declaring class variables at the top of the function may not be the best either, if there is a chance that the variable doesn't get used at all (say, a variable that is only used inside some 'if' statement).
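A minimal sketch of that trade-off (Widget here is a made-up class with a costly constructor/destructor):

struct Widget
{
    Widget()  { /* expensive setup */ }
    ~Widget() { /* expensive teardown */ }
    void reset() { }
    void use(int) { }
};

void Loop()
{
    Widget w;                      // constructed once, destroyed once
    for (int i = 0; i < 1000; ++i)
    {
        // Widget w;               // declared here, it would be constructed and destroyed every iteration
        w.reset();
        w.use(i);
    }
}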
I prefer C style because the C++ style has one major flaw for me: in a dense function it is very hard on the eyes to find the declaration/initialization of a variable. (No syntax highlighting has yet been able to cope reliably and predictably with my hazardous C++ coding habits.)
Though I don't adhere to either style strictly: only key variables go at the top, and most smallish minor variables live within the block where they are needed (like bool rc in your example).
But all the important key variables in my code inevitably end up being declared at the top. And if I have too many local variables in a nested block, that is a sign that I should start thinking about splitting the code into smaller functions.