C++ vs C Macro string concatenation difference - c++

I am trying to write a macro like the following (taken from link), and I applied the same rule in my software without success.
I noticed some differences between C and C++, but I don't understand why; macros are the preprocessor's job!
I also noticed a difference when passing values that come from an enumeration to the macro.
#include <stdio.h>
#define CONCAT(string) "start"string"end"
int main(void)
{
printf(CONCAT("-hello-"));
return 0;
}
The linked demo on ideone (link) allows selecting different languages: C is OK, but after changing to C++ it doesn't work.
It also doesn't work in my IDE, Visual Studio Code (MinGW C++).
My final goal is to write a macro that parametrizes printf() for a virtual console application using escape codes. I tried adding # to the macro concatenation and it seems to work, but when I pass an enumerator to the macro I get an unexpected result. The code is:
#include <stdio.h>
#define def_BLACK_TXT 30
#define def_Light_green_bck 102
#define CSI "\e["
#define concat_csi(a, b) CSI #a ";" #b "m"
#define setTextAndBackgColor(tc, bc) printf(concat_csi(bc, tc))
enum VtColors { RESET_COLOR = 0, BLACK_TXT = 30, Light_green_bck = 102 };
int main(void){
setTextAndBackgColor(30, 102);
printf("\r\n");
setTextAndBackgColor(def_BLACK_TXT , def_Light_green_bck );
printf("\r\n");
setTextAndBackgColor(VtColors::BLACK_TXT , VtColors::Light_green_bck );
printf("\r\n");
printf("\e[102;30m");// <<--- this is the expected result of macro expansion
}
and the output is (in line 3 the preprocessor does not seem to resolve the enum; the other lines are OK):
[102;30m
[102;30m
[VtColors::Light_green_bck;VtColors::BLACK_TXTm
[102;30m
Obviously I want to use enumerators as parameters (or I will change to #define).
But I'm curious to understand why this happens, and why the preprocessor behaves differently between C and C++.
If anyone knows the solution, many thanks.

There appears to be some compiler disagreement here.
MSVC compiles it as C++ without any issues.
gcc produces a compilation error.
The compilation error references a C++ feature called "user-defined literals", where the syntax "something"suffix gets parsed as a user-defined literal (presuming that this user-defined literal gets properly declared).
Since the preprocessor phase should be happening before the compilation phase, I conclude that the compilation error is a compiler bug.
Note that adding some whitespace produces the same result whether it gets compiled as C or C++ (and makes gcc happy):
#define CONCAT(string) "start" string "end"
EDIT: as of C++11, user-defined literals are considered to be distinct tokens:
Phase 3
The source file is decomposed into comments, sequences of whitespace characters (space, horizontal tab, new-line, vertical tab, and form-feed), and preprocessing tokens, which are the following:
a) header names such as <iostream> or "myfile.h"
b) identifiers
c) preprocessing numbers
d) character and string literals, including user-defined (since C++11)
emphasis mine.
This occurs before phase 4: preprocessor execution, so a compilation error here is the correct result. "start"string, with no intervening whitespace, gets parsed as a user-defined literal, before the preprocessor phase.

To summarize, the behavior is the following (see the comments in the code):
#include <stdio.h>
#define CONCAT_1(string) "start"#string"end"
#define CONCAT_2(string) "start"string"end"
#define CONCAT_3(string) "start" string "end"
int main(void)
{
printf(CONCAT_1("-hello-")); // wrong: # stringizes the double quotes too
printf("\r\n");
printf(CONCAT_1(-hello-)); // OK, but the argument must be passed without quotes
printf("\r\n");
#if false
printf(CONCAT_2("-hello-")); // compiler error
printf("\r\n");
#endif
printf(CONCAT_3("-hello-")); // OK
printf("\r\n");
printf("start" "-hello-" "end"); // OK
printf("\r\n");
printf("start""-hello-""end"); // OK
printf("\r\n");
return 0;
}
output:
start"-hello-"end <<<--- wrong: the double quotes are part of the output
start-hello-end
start-hello-end
start-hello-end
start-hello-end

Related

Cascaded macros in gcc C++14 vs msvc++ 2015

I have the following code working under msvc 2015:
#define CLASS_JS_PSG_PROPERTY_EX(PROPERTY, VALUE) \
static bool Get##PROPERTY(/*irrelevant params here...*/) \
{ \
...
some particular code
...
return true; \
}
#define CLASS_JS_PSG_PROPERTY(VALUE) \
CLASS_JS_PSG_PROPERTY_EX(##VALUE, VALUE)
...
#define kProp 1
CLASS_JS_PSG_PROPERTY_EX(Version, kProp)
CLASS_JS_PSG_PROPERTY(kProp)
This should define methods named GetVersion and GetkProp.
Now, this gives the following error under gcc C++14 (actually TDM-GCC-64):
pasting "(" and "kProp" does not give a valid preprocessing token
How should this be written so that it compiles under both gcc C++14 and msvc 2015?
The trick is: if you don't want a name to be expanded as a macro, you must pass it to the ## operator right away, and the result of the concatenation must be a valid token. Something like this:
#include <iostream>
#define CLASS_JS_PSG_PROPERTY_EX_HELPER(GetName) \
static bool GetName() { return true; }
#define CLASS_JS_PSG_PROPERTY_EX(PROPERTY, VALUE) \
CLASS_JS_PSG_PROPERTY_EX_HELPER(Get##PROPERTY)
#define CLASS_JS_PSG_PROPERTY(VALUE) \
CLASS_JS_PSG_PROPERTY_EX_HELPER(Get##VALUE)
#define kProp 1
CLASS_JS_PSG_PROPERTY_EX(Version, kProp)
CLASS_JS_PSG_PROPERTY(kProp)
int main() {
std::cout << GetVersion() + GetkProp();
}
Works with gcc and MSVC
The reason your original code appears to work with MSVC is because MSVC preprocessor is famously non-conforming - it operates on a stream of characters (wrong), rather than a stream of tokens (right). In CLASS_JS_PSG_PROPERTY_EX(##VALUE, VALUE), ## is not a unary operator as you suggest - it's a binary operator that glues ( and VALUE into a single token (VALUE. This is not a valid preprocessing token, so the program is ill-formed, which is what GCC complains about. But MSVC preprocessor later breaks this nonsensical token back up into pieces (which a conforming preprocessor would never do).

trying to understand syntax [duplicate]

What does this line mean? Especially, what does ## mean?
#define ANALYZE(variable, flag) ((Something.##variable) & (flag))
Edit:
A little bit confused still. What will the result be without ##?
Usually you won't notice any difference. But there is a difference. Suppose that Something is of type:
struct X { int x; };
X Something;
And look at:
int X::*p = &X::x;
ANALYZE(x, flag)
ANALYZE(*p, flag)
Without token concatenation operator ##, it expands to:
#define ANALYZE(variable, flag) ((Something.variable) & (flag))
((Something. x) & (flag))
((Something. *p) & (flag)) // . and * are not concatenated to one token. syntax error!
With token concatenation it expands to:
#define ANALYZE(variable, flag) ((Something.##variable) & (flag))
((Something.x) & (flag)) // only on a lenient preprocessor: . and x do not form one valid token, so GCC rejects this paste
((Something.*p) & (flag)) // .* is a newly generated token, now it works!
It's important to remember that the preprocessor operates on preprocessor tokens, not on text. So if you want to concatenate two tokens, you must explicitly say it.
## is the token concatenation (token pasting) operator; it joins two tokens in a macro's replacement list.
See this:
Macro Concatenation with the ## Operator
One very important part is that this token concatenation follows some very special rules:
e.g. IBM doc:
Concatenation takes place before any macros in arguments are expanded.
If the result of a concatenation is a valid macro name, it is available for further replacement even if it appears in a context in which it would not normally be available.
If more than one ## operator and/or # operator appears in the replacement list of a macro definition, the order of evaluation of the operators is not defined.
The examples are also self-explanatory:
#define ArgArg(x, y) x##y
#define ArgText(x) x##TEXT
#define TextArg(x) TEXT##x
#define TextText TEXT##text
#define Jitter 1
#define bug 2
#define Jitterbug 3
With output:
ArgArg(lady, bug) "ladybug"
ArgText(con) "conTEXT"
TextArg(book) "TEXTbook"
TextText "TEXTtext"
ArgArg(Jitter, bug) 3
Source is the IBM documentation. May vary with other compilers.
To your line:
It pastes the member name onto "Something." to access that member, which is then bitwise ANDed with flag; the result tells you whether Something.variable has the flag set.
So here is an example for my last comment and your question (compilable with g++):
// this one fails with a compiler error
// #define ANALYZE1(variable, flag) ((Something.##variable) & (flag))
// this one will address Something.a (struct)
#define ANALYZE2(variable, flag) ((Something.variable) & (flag))
// this one will be Somethinga (global)
#define ANALYZE3(variable, flag) ((Something##variable) & (flag))
#include <iostream>
using namespace std;
struct something{
int a;
};
int Somethinga = 0;
int main()
{
something Something;
Something.a = 1;
if (ANALYZE2(a,1))
cout << "Something.a is 1" << endl;
if (!ANALYZE3(a,1))
cout << "Somethinga is 0" << endl;
return 0;
}
This is not an answer to your question, just a CW post with some tips to help you explore the preprocessor yourself.
The preprocessing step is actually performed prior to any actual code being compiled. In other words, when the compiler starts building your code, no #define statements or anything like that is left.
A good way to understand what the preprocessor does to your code is to get hold of the preprocessed output and look at it.
This is how to do it for Windows:
Create a simple file called test.cpp and put it in a folder, say c:\temp.
Mine looks like this:
#define dog_suffix( variable_name ) variable_name##dog
int main()
{
int dog_suffix( my_int ) = 0;
char dog_suffix( my_char ) = 'a';
return 0;
}
Not very useful, but simple. Open the Visual Studio command prompt, navigate to the folder, and run the following command line:
c:\temp>cl test.cpp /P
So, it's the compiler you're running (cl.exe) on your file, and the /P option tells the compiler to store the preprocessed output in a file.
Now in the folder next to test.cpp you'll find test.i, which for me looks like this:
#line 1 "test.cpp"
int main()
{
int my_intdog = 0;
char my_chardog = 'a';
return 0;
}
As you can see, no #define left, only the code it expanded into.
According to Wikipedia
Token concatenation, also called token pasting, is one of the most subtle — and easy to abuse — features of the C macro preprocessor. Two arguments can be 'glued' together using ## preprocessor operator; this allows two tokens to be concatenated in the preprocessed code. This can be used to construct elaborate macros which act like a crude version of C++ templates.
Check Token Concatenation
Let's consider a different example:
#define MYMACRO(x,y) x##y
Without the ##, the preprocessor would see xy as one identifier rather than as the parameters x and y, so nothing would be substituted.
In your example,
#define ANALYZE(variable, flag) ((Something.##variable) & (flag))
## is simply not needed here, as you are not building a new identifier. In fact, GCC reports "error: pasting "." and "variable" does not give a valid preprocessing token".

C++ help concatenating TCHAR

I recently learned of the ## functionality that I can define at the beginning of my code. I'm trying to compile the following code:
#include <windows.h>
#include <tchar.h>
#include <iostream>
#include <stdio.h>
#include <string>
#define paste(x,y) *x##*y
int main()
{
TCHAR *pcCommPort = "COM";
TCHAR *num = "5";
cout << paste(pcCommPort,num);
return 0;
}
and I keep getting the following error:
expression must have arithmetic or unscoped enum type
It doesn't like the fact that I'm using pointers in my "define paste" line. Without any pointers, it just returns the variable "pcCommPort5"; what I want is "COM5".
I've tried _tcscat, strcat, and strcat_s; Visual Studio didn't like any of these.
## doesn't concatenate arbitrary things (especially not strings). What it does is merge two symbols into a single symbol in the preprocessor.
Let's remove one of those * to see what's going on:
#include <iostream>
#define TO_STRING_HELPER(x) #x
#define TO_STRING(x) TO_STRING_HELPER(x)
#define CONCAT(x, y) *x##y
int main() {
const char *pcCommPort = "COM"; // string literals are const in C++
const char *num = "5";
std::cout << TO_STRING(CONCAT(pcCommPort, num)) << std::endl;
}
Output:
*pcCommPortnum
What CONCAT does in this code is:
Expand x into pcCommPort and y into num. This gives the expression *pcCommPort##num.
Concatenate the two symbols pcCommPort and num into one new symbol: pcCommPortnum. Now the expression is *pcCommPortnum (remember, that last part (pcCommPortnum) is all one symbol).
Finish evaluating the full macro as a * followed by the symbol pcCommPortnum. This becomes the expression *pcCommPortnum. Remember, those are two different symbols: * and pcCommPortnum. The two symbols just follow one after the other.
If we were to try to use *x##*y, what the preprocessor does is this:
Expand x into pcCommPort and y into num. This gives us the expression *pcCommPort##*num.
Concatenate the two symbols pcCommPort and * into one new symbol: pcCommPort*.
Here, the preprocessor hits an error: the single symbol pcCommPort* is not a valid preprocessing token. Remember, it's not two separate symbols at this point (it is not two symbols pcCommPort followed by *). It is one single symbol (which we call a token).
If you want to concatenate two strings, you're way better off using std::string. You can't* do what you're trying to do with the preprocessor.
*Note, though, that consecutive string literals will be merged together by the compiler (i.e. "COM" "5" will be merged into a single string "COM5" by the compiler). But this only works with string literals, so you'd have to #define pcCommPort "COM" and #define num "5", at which point you could do pcCommPort num (without any further macros) and the compiler would evaluate it to the string "COM5". But unless you really know what you're doing, you really should just use std::string.

Multi-line raw string literals as preprocessor macros arguments

Can a multi-line raw string literal be an argument of a preprocessor macro?
#define IDENTITY(x) x
int main()
{
IDENTITY(R"(
)");
}
This code compiles in neither g++ 4.7.2 nor VC++11 (Nov. CTP).
Is it a compiler (lexer) bug?
Macro invocations that span multiple lines are legal, and since you are using a raw string literal, it should have compiled.
There is a known GCC bug for this:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52852
If you had been using regular (nonraw) strings it would have been illegal.
This should have compiled:
printf(R"(HELLO
WORLD
)");
But not this:
printf("HELLO
WORLD\n");
This should be coded as
printf("HELLO\nWORLD\n");
if a new line is intended between HELLO and WORLD or as
printf("HELLO "
"WORLD\n");
if no intervening new line was intended.
Do you want a new line in your literal? If so then couldn't you use
IDENTITY("(\n)");
The GCC preprocessor documentation at
http://gcc.gnu.org/onlinedocs/cpp.pdf
states in section 3.3 (Macro Arguments) that
"The invocation of the macro need not be
restricted to a single logical line—it can cross
as many lines in the source file as you wish."

vs2010 C4353 why isn't this an error

I ran into this today in an if statement, and after looking into it I found that all of these are valid statements that generate warning C4353. My only guess is that this is the old way of writing a no-op in C. Why is this not an error? When would you use this to do anything useful?
int main()
{
nullptr();
0();
(1 == 2)();
return 0;
}
Using constant 0 as a function expression is an extension that is specific to Microsoft. They implemented this specifically because they saw a reason for it, which explains why it wouldn't make sense to treat it as an error. But since it's non-standard, the compiler emits a warning.
You are correct that it is an alternative to using __noop().
All of these:
nullptr();
0();
(1 == 2)();
are no-op statements (meaning they don't do anything).
btw I hope you are not ignoring warnings. Most of the time it is a good practice to fix all warnings.
As explained on the C4353 warning page and in the __noop intrinsic documentation, using 0 as a function expression instructs the Microsoft C++ compiler to ignore the call: the argument list is parsed, but no code is generated for the arguments.
The example given is a trace macro that gets #defined either to __noop or to a print function, depending on the value of the DEBUG preprocessor symbol:
#if DEBUG
#define PRINT printf_s
#else
#define PRINT __noop
#endif
int main() {
PRINT("\nhello\n");
}
The MSDN page for that warning has ample explanation and a motivating example:
// C4353.cpp
// compile with: /W1
void MyPrintf(void){};
#define X 0
#if X
#define DBPRINT MyPrintf
#else
#define DBPRINT 0 // C4353 expected
#endif
int main(){
DBPRINT();
}
As you can see it is to support archaic macro usage.