At: http://www.learncpp.com/cpp-tutorial/110-a-first-look-at-the-preprocessor/
It mentions a directive called "Macro defines". What do we mean when we say "Macro"?
Thanks.
A macro is a preprocessor directive that defines a name that is to be replaced (or removed) by the preprocessor right before compilation.
Example:
#define MY_MACRO1 somevalue
#define MY_MACRO2
#define SUM(a, b) (a + b)
then if anywhere in the code (except in the string literals) there is a mention of MY_MACRO1 or MY_MACRO2 the preprocessor replaces this with whatever comes after the name in the #define line.
There can also be macros with parameters (like the SUM). In that case the preprocessor recognizes the arguments, example:
int x = 1, y = 2;
int z = SUM(x, y);
preprocessor replaces like this:
int x = 1, y = 2;
int z = (x + y);
only after this replacement the compiler gets to compile the resulting code.
A macro is a code fragment that gets substituted into your program by the preprocessor (before compilation proper begins). This may be a function-like block, or it may be a constant value.
A warning when using a function-like macro. Consider the following code:
#define foo(x) x*x
If you call foo(3), it will become (and be compiled as) 3*3 (=9). If, instead, you call foo(2+3), it will become 2+3*2+3, (=2+6+3=11), which is not what you want. Also, since the code is substituted in place, foo(bar++) becomes bar++ * bar++, incrementing bar twice.
Macros are powerful tools, but it can be easy to shoot yourself in the foot while trying to do something fancy with them.
"Macro defines" merely indicate how they are specified (with #define directives), while "Macro" is the function or expression that is defined.
There is little difference between them aside from semantics, however.
Related
This question already has answers here:
C macros and use of arguments in parentheses
(2 answers)
Closed 5 years ago.
I tried to play with the definition of the macro SQR in the following code:
#define SQR(x) (x*x)
int main()
{
int a, b=3;
a = SQR(b+5); // Ideally should be replaced with (3+5*5+3), though not sure.
printf("%d\n",a);
return 0;
}
It prints 23. If I change the macro definition to SQR(x) ((x)*(x)) then the output is as expected, 64. I know that a call to a macro in C replaces the call with the definition of the macro, but I still can’t understand, how it calculated 23.
Pre-processor macros perform text-replacement before the code is compiled so
SQR(b+5) translates to
(b+5*b+5) = (6b+5) = 6*3+5 = 23
Regular function calls would calculate the value of the parameter (b+3) before passing it to the function, but since a macro is pre-compiled replacement, the algebraic order of operations becomes very important.
Consider the macro replacement using this macro:
#define SQR(x) (x*x)
Using b+5 as the argument. Do the replacement yourself. In your code, SQR(b+5) will become: (b+5*b+5), or (3+5*3+5). Now remember your operator precedence rules: * before +. So this is evaluated as: (3+15+5), or 23.
The second version of the macro:
#define SQR(x) ((x) * (x))
Is correct, because you're using the parens to sheild your macro arguments from the effects of operator precedence.
This page explaining operator preference for C has a nice chart. Here's the relevant section of the C11 reference document.
The thing to remember here is that you should get in the habit of always shielding any arguments in your macros, using parens.
Because (3+5*3+5 == 23).
Whereas ((3+5)*(3+5)) == 64.
The best way to do this is not to use a macro:
inline int SQR(int x) { return x*x; }
Or simply write x*x.
The macro expands to
a = b+5*b+5;
i.e.
a = b + (5*b) + 5;
So 23.
After preprocessing, SQR(b+5) will be expanded to (b+5*b+5). This is obviously not correct.
There are two common errors in the definition of SQR:
do not enclose arguments of macro in parentheses in the macro body, so if those arguments are expressions, operators with different precedences in those expressions may cause problem. Here is a version that fixed this problem
#define SQR(x) ((x)*(x))
evaluate arguments of macro more than once, so if those arguments are expressions that have side effect, those side effect could be taken more than once. For example, consider the result of SQR(++x).
By using GCC typeof extension, this problem can be fixed like this
#define SQR(x) ({ typeof (x) _x = (x); _x * _x; })
Both of these problems could be fixed by replacing that macro with an inline function
inline int SQR(x) { return x * x; }
This requires GCC inline extension or C99, See 6.40 An Inline Function is As Fast As a Macro.
A macro is just a straight text substitution. After preprocessing, your code looks like:
int main()
{
int a, b=3;
a = b+5*b+5;
printf("%d\n",a);
return 0;
}
Multiplication has a higher operator precedence than addition, so it's done before the two additions when calculating the value for a. Adding parentheses to your macro definition fixes the problem by making it:
int main()
{
int a, b=3;
a = (b+5)*(b+5);
printf("%d\n",a);
return 0;
}
The parenthesized operations are evaluated before the multiplication, so the additions happen first now, and you get the a = 64 result that you expect.
Because Macros are just string replacement and it is happens before the completion process. The compiler will not have the chance to see the Macro variable and its value. For example: If a macro is defined as
#define BAD_SQUARE(x) x * x
and called like this
BAD_SQUARE(2+1)
the compiler will see this
2 + 1 * 2 + 1
which will result in, maybe, unexpected result of
5
To correct this behavior, you should always surround the macro-variables with parenthesis, such as
#define GOOD_SQUARE(x) (x) * (x)
when this macro is called, for example ,like this
GOOD_SQUARE(2+1)
the compiler will see this
(2 + 1) * (2 + 1)
which will result in
9
Additionally, Here is a full example to further illustrate the point
#include <stdio.h>
#define BAD_SQUARE(x) x * x
// In macros alsways srround the variables with parenthesis
#define GOOD_SQUARE(x) (x) * (x)
int main(int argc, char const *argv[])
{
printf("BAD_SQUARE(2) = : %d \n", BAD_SQUARE(2) );
printf("GOOD_SQUARE(2) = : %d \n", GOOD_SQUARE(2) );
printf("BAD_SQUARE(2+1) = : %d ; because the macro will be \
subsituted as 2 + 1 * 2 + 1 \n", BAD_SQUARE(2+1) );
printf("GOOD_SQUARE(2+1) = : %d ; because the macro will be \
subsituted as (2 + 1) * (2 + 1) \n", GOOD_SQUARE(2+1) );
return 0;
}
Just enclose each and every argument in the macro expansion into parentheses.
#define SQR(x) ((x)*(x))
This will work for whatever argument or value you pass.
I have the following code:
#define FOO_BAR x
#define FOO(x) FOO_BAR
I do want FOO(2) to expand to 2, but I'm getting x instead. I tried to use EXPAND macro to force extra scan:
#define FOO_BAR x
#define EXPAND(x) x
#define FOO(x) EXPAND(FOO_BAR)
Note, this is intentional, that FOO_BAR doesn't accept x as an argument. Basically, I cannot pass x to FOO_BAR.
But it doesn't work as well. Any ideas?
I want this to work on any compiler (MSVC, gcc, clang).
What exactly I am trying to accomplish
My end goal is to create type safe enums for OpenGL. So, I need to do mapping from my safe enum to unsafe ones. So I have something like:
enum class my_enum {
foo,
bar
}
GLenum my_enum2gl(my_enum e) {
switch (e) {
case my_enum::foo: return GL_FOO;
case my_enum::bar: return GL_BAR;
}
return GL_NONE;
}
Since I'm lazy, I did some preprocessor magic. And implemented this as:
#define PP_IMPL_ENUM_VALUE(enum_pair) __PP_EVAL(__PP_IMPL_ENUM_VALUE enum_pair)
#define __PP_IMPL_ENUM_VALUE(cpp_enum, gl_enum) cpp_enum,
#define PP_IMPL_CONVERT(enum_pair) __PP_EVAL(__PP_IMPL_CONVERT enum_pair)
#define __PP_IMPL_CONVERT(cpp_enum, gl_enum) case name::cpp_enum: return gl_enum;
#define DEF_STATE_ENUM(name, ...) \
enum name { \
PP_FOR_EACH(PP_IMPL_ENUM_VALUE, ##__VA_ARGS__) \
}; \
namespace detail { \
GLenum name ## 2gl(name e) { \
switch(e) { \
__PP_EVAL(PP_FOR_EACH(PP_IMPL_CONVERT, ##__VA_ARGS__)) \
default: \
assert(!"Unknown value"); \
return GL_NONE; \
} \
} \
}
DEF_STATE_ENUM(my_enum,
(foo, GL_FOO),
(bar, GL_BAR)
)
The problem is that __PP_IMPL_CONVERT uses name which is not expanded. Passing x to FOO_BAR would mean that I'm passing some extra parameter to a functor for PP_FOR_EACH.
You need to understand
The preprocessor fully expands the arguments to each function-like macro before substituting them into the macro's expansion, except where they are operands of the # or ## preprocessing operators (in which case they are not expanded at all).
After modifying the input preprocessing token sequence by performing a macro expansion, the preprocessor automatically rescans the result for further macro expansions to perform.
In your lead example, then, given
#define FOO_BAR x
#define FOO(x) FOO_BAR
and a macro invocation
FOO(2)
, the preprocessor first macro expands the argument 2, leaving it unchanged, then replaces the macro call with its expansion. Since the expansion does not, in fact, use the argument in the first place, the initial result is
FOO_BAR
The preprocessor then rescans that, recognizes FOO_BAR as the identifier of an object-like macro, and replaces it with its expansion, yielding
x
, as you observed. This is the normal and expected behavior of a conforming C preprocessor, and to the best of my knowledge, C++ has equivalent specifications for its preprocessor.
Inserting an EXPAND() macro call does not help, because the problem is not failure to expand macros, but rather the time and context of macro expansion. Ultimately, it should not be all that surprising that when the replacement text of macro FOO(x) does not use the macro parameter x, the actual argument associated with that parameter has no effect on the result of the expansion.
I cannot fully address your real-world code on account of the fact that it depends centrally on a macro PP_FOR_EACH() whose definition you do not provide. Presumably that macro's name conveys the gist, but as you can see, the details matter. However, if you in fact understand how your PP_FOR_EACH macro actually works, then I bet you could come up with a variant that accepts an additional leading argument, by which you could convey (the same) name to each expansion.
Alternatively, this is the kind of problem for which X Macros were invented. I see that that alternative has already been raised in comments. You might even be able -- with some care -- to build a solution that uses X macros inside, so as to preserve the top-level interface you now have.
Why this macro gives output 144, instead of 121?
#include<iostream>
#define SQR(x) x*x
int main()
{
int p=10;
std::cout<<SQR(++p);
}
This is a pitfall of preprocessor macros. The problem is that the expression ++p is used twice, because the preprocessor simply replaces the macro "call" with the body pretty much verbatim.
So what the compiler sees after the macro expansion is
std::cout<<++p*++p;
Depending on the macro, you can also get problems with operator precedence if you are not careful with placing parentheses where needed.
Take for example a macro such as
// Macro to shift `a` by `b` bits
#define SHIFT(a, b) a << b
...
std::cout << SHIFT(1, 4);
This would result in the code
std::cout << 1 << 4;
which may not be what was wanted or expected.
If you want to avoid this, then use inline functions instead:
inline int sqr(const int x)
{
return x * x;
}
This has two things going for it: The first is that the expression ++p will only be evaluated once. The other thing is that now you can't pass anything other than int values to the function. With the preprocessor macro you could "call" it like SQR("A") and the preprocessor would not care while you instead get (sometimes) cryptical errors from the compiler.
Also, being marked inline, the compiler may skip the actual function call completely and put the (correct) x*x directly in the place of the call, thereby making it as "optimized" as the macro expansion.
The approach of squaring with this macro has two problems:
First, for the argument ++p, the increment operation is performed twice. That's certainly not intended. (As a general rule of thumb, just don't do several things in "one line". Separate them into more statements.). It doesn't even stop at incrementing twice: The order of these increments isn't defined, so there is no guaranteed outcome of this operation!
Second, even if you don't have ++p as the argument, there is still a bug in your macro! Consider the input 1 + 1. Expected output is 4. 1+1 has no side-effect, so it should be fine, shouldn't it? No, because SQR(1 + 1) translates to 1 + 1 * 1 + 1 which evaluates to 3.
To at least partially fix this macro, use parentheses:
#define SQR(x) (x) * (x)
Altogether, you should simply replace it by a function (to add type-safety!)
int sqr(int x)
{
return x * x;
}
You can think of making it a template
template <typename Type>
Type sqr(Type x)
{
return x * x; // will only work on types for which there is the * operator.
}
and you may add a constexpr (C++11), which is useful if you ever plan on using a square in a template:
constexpr int sqr(int x)
{
return x * x;
}
Because you are using undefined behaviour. When the macro is expanded, your code turns into this:
std::cout<<++p*++p;
using increment/decrement operators on the same variable multiple times in the same statement is undefined behaviour.
Because SQR(++p) expands to ++p*++p, which has undefined behavior in C/C++. In this case it incremented p twice before evaluating the multiplication. But you can't rely on that. It might be 121 (or even 42) with a different C/C++ compiler.
If have encountered this claim multiple times and can't figure out what it is supposed to mean. Since the resulting code is compiled using a regular C compiler it will end up being type checked just as much (or little) as any other code.
So why are macros not type safe? It seems to be one of the major reasons why they should be considered evil.
Consider the typical "max" macro, versus function:
#define MAX(a,b) a < b ? a : b
int max(int a, int b) {return a < b ? a : b;}
Here's what people mean when they say the macro is not type-safe in the way the function is:
If a caller of the function writes
char *foo = max("abc","def");
the compiler will warn.
Whereas, if a caller of the macro writes:
char *foo = MAX("abc", "def");
the preprocessor will replace that with:
char *foo = "abc" < "def" ? "abc" : "def";
which will compile with no problems, but almost certainly not give the result you wanted.
Additionally of course the side effects are different, consider the function case:
int x = 1, y = 2;
int a = max(x++,y++);
the max() function will operate on the original values of x and y and the post-increments will take effect after the function returns.
In the macro case:
int x = 1, y = 2;
int b = MAX(x++,y++);
that second line is preprocessed to give:
int b = x++ < y++ ? x++ : y++;
Again, no compiler warnings or errors but will not be the behaviour you expected.
Macros aren't type safe because they don't understand types.
You can't tell a macro to only take integers. The preprocessor recognises a macro usage and it replaces one sequence of tokens (the macro with its arguments) with another set of tokens. This is a powerful facility if used correctly, but it's easy to use incorrectly.
With a function you can define a function void f(int, int) and the compiler will flag if you try to use the return value of f or pass it strings.
With a macro - no chance. The only checks that get made are it is given the correct number of arguments. then it replaces the tokens appropriately and passes onto the compiler.
#define F(A, B)
will allow you to call F(1, 2), or F("A", 2) or F(1, (2, 3, 4)) or ...
You might get an error from the compiler, or you might not, if something within the macro requires some sort of type safety. But that's not down to the preprocessor.
You can get some very odd results when passing strings to macros that expect numbers, as the chances are you'll end up using string addresses as numbers without a squeak from the compiler.
Well they're not directly type-safe... I suppose in certain scenarios/usages you could argue they can be indirectly (i.e. resulting code) type-safe. But you could certainly create a macro intended for integers and pass it strings... the pre-processor handling the macros certainly doesn't care. The compiler may choke on it, depending on usage...
Since macros are handled by the preprocessor, and the preprocessor doesn't understand types, it will happily accept variables that are of the wrong type.
This is usually only a concern for function-like macros, and any type errors will often be caught by the compiler even if the preprocessor doesn't, but this isn't guaranteed.
An example
In the Windows API, if you wanted to show a balloon tip on an edit control, you'd use Edit_ShowBalloonTip. Edit_ShowBalloonTip is defined as taking two parameters: the handle to the edit control and a pointer to an EDITBALLOONTIP structure. However, Edit_ShowBalloonTip(hwnd, peditballoontip); is actually a macro that evaluates to
SendMessage(hwnd, EM_SHOWBALLOONTIP, 0, (LPARAM)(peditballoontip));
Since configuring controls is generally done by sending messages to them, Edit_ShowBalloonTip has to do a typecast in its implementation, but since it's a macro rather than an inline function, it can't do any type checking in its peditballoontip parameter.
A digression
Interestingly enough, sometimes C++ inline functions are a bit too type-safe. Consider the standard C MAX macro
#define MAX(a, b) ((a) > (b) ? (a) : (b))
and its C++ inline version
template<typename T>
inline T max(T a, T b) { return a > b ? a : b; }
MAX(1, 2u) will work as expected, but max(1, 2u) will not. (Since 1 and 2u are different types, max can't be instantiated on both of them.)
This isn't really an argument for using macros in most cases (they're still evil), but it's an interesting result of C and C++'s type safety.
There are situations where macros are even less type-safe than functions. E.g.
void printlog(int iter, double obj)
{
printf("%.3f at iteration %d\n", obj, iteration);
}
Calling this with the arguments reversed will cause truncation and erroneous results, but nothing dangerous. By contrast,
#define PRINTLOG(iter, obj) printf("%.3f at iteration %d\n", obj, iter)
causes undefined behavior. To be fair, GCC warns about the latter, but not about the former, but that's because it knows printf -- for other varargs functions, the results are potentially disastrous.
When the macro runs, it just does a text match through your source files. This is before any compilation, so it is not aware of the datatypes of anything it changes.
Macros aren't type safe, because they were never meant to be type safe.
The compiler does the type checking after macros had been expanded.
Macros and there expansion are meant as a helper to the ("lazy") author (in the sense of writer/reader) of C source code. That's all.
When I define this macro:
#define SQR(x) x*x
Let's say this expression:
SQR(a+b)
This expression will be replaced by the macro and looks like:
a+b*a+b
But, if I put a ++ operator before the expression:
++SQR(a+b)
What the expression looks like now? Is this ++ placed befor every part of SQR paramete? Like this:
++a+b*++a+b
Here I give a simple program:
#define SQR(x) x*x
int a, k = 3;
a = SQR(k+1) // 7
a = ++SQR(k+1) //9
When defining macros, you basically always want to put the macro parameters in parens to prevent the kind of weird behaviour in your first example, and put the result in parens so it can be safely used without side-effects. Using
#define SQR(x) ((x)*(x))
makes SQR(a+b) expand to ((a+b)*(a+b)) which would be mathematically correct (unlike a+b*a+b, which is equal to ab+a+b).
Putting things before or after a macro won't enter the macro. So ++SQR(x) becomes ++x*x in your example.
Note the following:
int a=3, b=1;
SQR(a+b) // ==> a+b*a+b = 3+1*3+1 = 7
++SQR(a+b) // ==> ++a+b*a+b ==> 4 + 1*4 + 1 = 9
// since preincrement will affect the value of a before it is read.
You're seeing the ++SQR(a+b) appear to increment by 2 since the preincrement kicks in before a i read either time, i.e. a increments, then is used twice and so the result is 2 higher than expected.
NOTE As #JonathanLeffler points out, the latter call invokes undefined behaviour; the evaluation is not guaranteed to happen left-to-right. It might produce different results on different compilers/OSes, and thus should never be relied on.
For C++ the right way to define this macro is to not use a macro, but instead use:
template<typename T> static T SQR( T a ) { return a*a; }
This will get right some horrible cases that the macro gets wrong:
For example:
SQR(++a);
with the function form ++a will be evaluated once. In the macro form you get undefined behaviour as you modify and read a value multiple times between sequence points (at least for C++)
A macro definition just replaces the code,hence it is generally preferable to put into parenthesis otherwise the code may replaced in a way you don't want.
Hence if you define it as :
#define SQR(x) ((x)*(x))
then
++SQR(a+b) = ++((a+b)*(a+b))
In your example, ++SQR(a+b) should be expanded as ++a+b*a+b.
So, if a == 3 and b == 1 you will get the answer 9 if the compiler evaluates it from left to right.
But your statement ++SQR(3+1) is not correct because it will be expanded as ++3+1*3+1 where ++3 is invalid.
In your preprocessor it evaluates to ++a+b*a+b. The right way is put brackets around each term and around the whole thing, like:
#define SQR(x) ((x)*(x))