Using only the preprocessor for string concatenation? - c++

Is it possible to concatenate quoted string literals outside of the language (C++, in this case)?
That is, can I define MY_MACRO(a,b,c) and use it thus:
MY_MACRO("one", "two", "three")
and have it expand to: "onetwothree"?
The use case is to apply an attribute and its message to, say, a function signature, like so:
MY_ATTRIBUTE_MACRO("this", "is", "the reason") int foo() { return 99; }
and it would result in:
[[nodiscard("thisisthe reason")]] int foo() { return 99; }

The language already does string concatenation!
This:
"hi" "James"
becomes just one string literal.
That means you do not need any preprocessor tricks for this at all.
You need only employ this in the output of your macro:
#define MY_ATTRIBUTE_MACRO(x,y,z) [[nodiscard(x y z)]]
Now this:
MY_ATTRIBUTE_MACRO("this", "is", "the reason") int foo() { return 99; }
is this:
[[nodiscard("this" "is" "the reason")]] int foo() { return 99; }
which is actually already what you wanted, because of the implicit string concatenation (which happens after macro expansion):
[[nodiscard("thisisthe reason")]] int foo() { return 99; }
Translation phase 4:
[lex.phases]/4: Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal-character-name is produced by token concatenation, the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.
Translation phase 6:
[lex.phases]/6: Adjacent string literal tokens are concatenated.
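To make the phase ordering concrete, here is a minimal sketch. JOIN3, kReason and reason() are hypothetical names made up for this illustration; the macro has the same "just put the arguments side by side" shape as the attribute macro above, but it avoids the attribute so that the result can be checked at compile time.

```cpp
#include <string>

// Hypothetical helper: no pasting, no stringizing. It only leaves the three
// string literals next to each other in the expansion.
#define JOIN3(x, y, z) x y z

// After phase 4 (macro expansion) the initializer reads:
//     "this" "is" "the reason"
// After phase 6 (literal concatenation) it is the single literal:
//     "thisisthe reason"
constexpr char kReason[] = JOIN3("this", "is", "the reason");

// 16 characters plus the terminating '\0': the array holds one merged string.
static_assert(sizeof(kReason) == 17, "adjacent literals were merged");

// Runtime view of the same merged literal.
std::string reason() { return kReason; }
```

Calling reason() should give back "thisisthe reason", the same text the attribute would receive.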

I'm not sure what you mean by "outside the language" but, in C++, any string literals separated just by whitespace are implicitly concatenated into one. Thus, your MY_MACRO definition is actually very simple:
#include <iostream>
#define MY_MACRO(a, b, c) a b c
int main()
{
std::cout << MY_MACRO("one", "two", "three") << std::endl;
return 0;
}
The output from this short program is what you asked for: onetwothree.
Note: As a matter of curiosity/interest, it is normally recommended to enclose macro arguments in parentheses, in the definition part, so as to avoid unwanted side effects of the evaluation. However, in this case, using such parentheses won't work, and breaks the implicit concatenation:
#define MY_MACRO(a, b, c) (a) (b) (c) // Broken!
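As a rough sketch of why the parenthesised version fails (joined() is a hypothetical helper name, and the broken macro is kept in a comment so that the snippet still compiles):

```cpp
#include <string>

// Working version from the answer: the literals end up adjacent.
#define MY_MACRO(a, b, c) a b c

// The parenthesised version would expand to
//     ("one") ("two") ("three")
// The string literal tokens are no longer adjacent, so phase 6 has nothing
// to merge, and the compiler parses the result as an attempt to call a
// string literal like a function, which is an error:
// #define MY_MACRO_PAREN(a, b, c) (a) (b) (c)

// Hypothetical helper that returns the concatenated literal.
std::string joined() { return MY_MACRO("one", "two", "three"); }
```

So the usual "parenthesise every macro argument" advice is aimed at arguments that are expressions; here the arguments are meant to stay bare literal tokens.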

Related

Stringification of int in C/C++

As far as I understand stringification, the code below should output 100: vstr(s) should be expanded with the value 100, then str(s) gets 100 and returns the string "100". But it outputs "a" instead. What is the reason? And if I call it with the macro-defined constant foo, then it outputs "100". Why?
#include<stdio.h>
#define vstr(s) str(s)
#define str(s) #s
#define foo 100
int main()
{
int a = 100;
puts(vstr(a));
puts(vstr(foo));
return 0;
}
The reason is that preprocessors operate on tokens passed into them, not on values associated with those tokens.
#include <stdio.h>
#define vstr(s) str(s)
#define str(s) #s
int main()
{
puts(vstr(10+10));
return 0;
}
Outputs:
10+10
The # stringizing operator is part of the preprocessor. It's evaluated at compile time. It can't get the value of a variable at execution time, then somehow magically convert that to something it could have known at compile time.
If you want to convert an execution-time variable into a string at execution time, you need to use a function like std::to_string.
Since vstr is preprocessed, the line
puts(vstr(a));
is translated as:
puts("a");
The value of the variable a plays no role in that line. You can remove the line
int a = 100;
and the program will behave identically.
Stringification is the process of transforming something into a string. What does your macro stringify?
The name of the variable itself, and this is done at compile time.
If you want to stringify and then print the value of the variable at execution time, then you must use something like printf("%d\n", v); in C or cout << v << endl; in C++.
A preprocessor macro is not the same thing as a function. It does not evaluate its arguments at runtime and see their values; it processes them at the preprocessing stage (which is before compilation, so it knows nothing about variables or their values).
In this case, you've passed the macro a to stringify, which it did. The preprocessor doesn't care a is also the name of a variable.
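The cases discussed in these answers can be put side by side in one sketch. STRINGIFY, EXPAND_THEN_STRINGIFY and FOO are stand-ins for the question's str, vstr and foo (renamed in upper case only so the macros cannot clash with library identifiers), and the function names are made up for the illustration.

```cpp
#include <string>

#define STRINGIFY(s) #s                           // plays the role of str(s)
#define EXPAND_THEN_STRINGIFY(s) STRINGIFY(s)     // plays the role of vstr(s)
#define FOO 100                                   // plays the role of foo

// 'a' is a variable, not a macro, so there is nothing to expand:
// the preprocessor stringizes the token a and yields "a".
std::string from_variable_name() {
    int a = 100;
    (void)a;  // the value is never consulted
    return EXPAND_THEN_STRINGIFY(a);
}

// FOO is a macro, so it is expanded to 100 before being stringized: "100".
std::string from_macro() { return EXPAND_THEN_STRINGIFY(FOO); }

// Without the extra level of indirection the macro name itself is used: "FOO".
std::string without_expansion() { return STRINGIFY(FOO); }

// Tokens are stringized as written, not evaluated: "10+10".
std::string from_expression() { return EXPAND_THEN_STRINGIFY(10+10); }

// The runtime value needs a runtime conversion: "100".
std::string at_runtime() {
    int a = 100;
    return std::to_string(a);
}
```

Only the last function looks at the value stored in a; the others are decided entirely by the spelling of the tokens.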

Macro string concatenation

I use macros to concatenate strings, such as:
#define STR1 "first"
#define STR2 "second"
#define STRCAT(A, B) A B
so that STRCAT(STR1, STR2) produces "firstsecond".
Somewhere else I have strings associated to enums in this way:
enum class MyEnum
{
Value1,
Value2
};
const char* MyEnumString[] =
{
"Value1String",
"Value2String"
};
Now the following does not work:
STRCAT(STR1, MyEnumString[(int)MyEnum::Value1])
I was just wondering whether it is possible to build a macro that concatenates a #defined string literal with a const char*? Otherwise, I guess I'll do without a macro, e.g. in this way (but maybe you have a better way):
std::string s = std::string(STR1) + MyEnumString[(int)MyEnum::Value1];
The macro works only on string literals, i.e. sequence of characters enclosed in double quotes. The reason the macro works is that C++ standard treats adjacent string literals like a single string literal. In other words, there is no difference to the compiler if you write
"Quick" "Brown" "Fox"
or
"QuickBrownFox"
The concatenation is performed at compile time, before your program starts running.
Concatenation of const char* variables needs to happen at runtime, because the values of character pointers (or any other pointers, for that matter) are not known until runtime. That is why you cannot do it with your STRCAT macro. You can use std::string for concatenation, though - it is one of the easiest solutions to this problem.
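A small sketch of that runtime approach, reusing STR1, MyEnum and MyEnumString from the question. The label() function is a hypothetical helper, not something from the original post.

```cpp
#include <string>

#define STR1 "first"

enum class MyEnum { Value1, Value2 };

// Lookup table from the question; the entries are pointers, not literal
// tokens, once they are read through an index expression.
const char* const MyEnumString[] = {
    "Value1String",
    "Value2String"
};

// Hypothetical helper: builds the combined text at runtime.
// STR1 is still a literal, so it can seed the std::string directly.
std::string label(MyEnum e) {
    return std::string(STR1) + MyEnumString[static_cast<int>(e)];
}
```

For example, label(MyEnum::Value1) should produce "firstValue1String". The macro is no longer involved in the concatenation at all.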
Only string literals can be concatenated in this way:
"A" "B"
This will not work for a pointer expression like the one in your sample, which expands to something like
"first" MyEnumString[(int)MyEnum::Value1];
As for your edit:
Yes I would definitely go for your proposal
std::string s = std::string(STR1) + MyEnumString[(int)MyEnum::Value1];
Your macro is pretty unnecessary, as it can only work with string literals of the same type. Functionally it does nothing at all.
std::string s = STRCAT("a", "b");
Is exactly the same as:
std::string s = "a" "b";
So I feel that it's best to just not use the macro at all. If you want a runtime string concatenating function, a more C++-canonical version is:
inline std::string string_concat(const std::string& a, const std::string& b)
{
return a + b;
}
But again, it seems almost pointless to have this function when you can just do:
std::string a = "a string";
std::string ab = a + "b string";
I can see limited use for a function like string_concat. Maybe you want to work on arbitrary string types or automatic conversion between UTF-8 and UTF-16...

How can the C++ Preprocessor be used on strings?

The preprocessor can be used to replace certain keywords with other words using #define. For example I could do #define name "George" and every time the preprocessor finds 'name' in the program it will replace it with "George".
However, this only seems to work with code. How could I do this with strings and text? For example if I print "Hello I am name" to the screen, I want 'name' to be replaced with "George" even though it is in a string and not code.
I do not want to manually search the string for keywords and then replace them, but instead want to use the preprocessor to just switch the words.
Is this possible? If so how?
I am using C++ but C solutions are also acceptable.
#define name "George"
printf("Hello I am " name "\n");
Adjacent string literals are concatenated in C and C++.
Quotes from C and C++ Standard:
For C (quoting C99, but C11 has something similar in 6.4.5p5):
(C99, 6.4.5p5) "In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and identically-prefixed string literal tokens are concatenated into a single multibyte character sequence."
For C++:
(C++11, 2.14.5p13) "In translation phase 6 (2.2), adjacent string literals are concatenated."
EDIT: as requested, add quotes from C and C++ Standard. Thanks to @MatteoItalia for the C++11 quote.
#define name "George"
printf("Hello I am %s\n", name);
Here name will be replaced by "George"
Your issue is that the preprocessor will (wisely) not replace tokens that are inside string literals.
So you must either use a function like printf or a variable rather than the preprocessor, or pull the token out of the string like so:
#include <iostream>
#define name "George"
int main(int argc, char** argv) {
std::cout << "Hello I am " << name << std::endl;
}
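To see the difference between a macro name inside a literal and one placed next to a literal, here is a minimal sketch. NAME is an upper-case stand-in for the question's name macro, and the two function names are made up for the illustration.

```cpp
#include <string>

#define NAME "George"

// The preprocessor never looks inside a string literal, so NAME stays
// as plain text here.
std::string inside_literal() { return "Hello I am NAME"; }

// Outside the quotes NAME is a token and gets replaced; the two adjacent
// literals are then merged by the compiler.
std::string adjacent_literal() { return "Hello I am " NAME; }
```

The first function should return the text unchanged, the second "Hello I am George".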

C++ Macro riddle: Printing the name of the TYPE

In a macro I can use xxxx_##TYPE and ##TYPE##_xxxxx to have the TYPE name filled in correctly, but I can't use ##TYPE## in the middle of a string e.g. (print "##TYPE## is the name of the type";)
Is there a way around it?
You can do this by combining two features. One is ''stringification'', whereby a macro argument is converted to a string by prefixing it with #. (This is related to, but different from, the ''token-pasting'' operator ## that you're obviously already familiar with.) The other is the fact that C++, when given multiple string literals in a row, will combine them into a single string. For example, "a" "b" "c" is equivalent to "abc". I'm not clear on exactly how your macro is to be defined, so I can't show you exactly what to type, but a full explanation and some good working examples are at http://gcc.gnu.org/onlinedocs/cpp/Stringification.html.
Edited to add a simple example, at Kleist's request. This program:
#include <stdio.h>
#define PRINT_WHAT_THE_NAME_OF_THE_TYPE_IS(TYPE) \
printf("%s\n", "'" #TYPE "' is the name of the type.")
int main()
{
PRINT_WHAT_THE_NAME_OF_THE_TYPE_IS(Mr. John Q. Type);
return 0;
}
will print this:
'Mr. John Q. Type' is the name of the type.
(This will run in either C or C++. The reason I wrote it C-ishly is that in my experience these sorts of preprocessor tricks are more common in C code than in real C++ code; but if you wanted to use std::cout << instead of printf, you absolutely could.)
## is the token pasting operator and it takes two different tokens and pastes them together to make a single token. The entire string literal is considered a single token, thus the pasting operator doesn't work within it. See http://gcc.gnu.org/onlinedocs/gcc-4.0.4/cpp/Tokenization.html
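A quick sketch of that point, with two hypothetical macros (INSIDE_LITERAL and WITH_STRINGIFY are names invented here): writing ## inside the quotes leaves it as ordinary characters, while # outside the quotes produces a separate literal that sits next to the rest of the text.

```cpp
#include <string>

// The whole quoted text is one token, so neither ## nor TYPE is touched.
#define INSIDE_LITERAL(TYPE) "##TYPE## is the name of the type"

// #TYPE becomes its own string literal, adjacent to the second one.
#define WITH_STRINGIFY(TYPE) #TYPE " is the name of the type"

std::string untouched() { return INSIDE_LITERAL(Foo); }
std::string stringified() { return WITH_STRINGIFY(Foo); }
```

Here untouched() should still contain the literal text ##TYPE##, whereas stringified() should read "Foo is the name of the type".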
String literals will be concatenated when they are next to each other.
#define QUOTE(X) #X
#define DOSTUFF(TYPE) \
void print_ ## TYPE () { \
static const char *str = QUOTE(TYPE) " is the name of the type\n"; \
printf(str); \
}
DOSTUFF(Foo);
int main(int argc, char ** argv) {
print_Foo();
return 0;
}
Output of g++ -E -c main.cpp will show you what this gets preprocessed into.
# 16 "main.cpp"
void print_Foo () {
static const char *str = "Foo" " is the name of the type\n";
printf(str);
};
int main(int argc, char ** argv) {
print_Foo();
return 0;
}

String concatenation using preprocessor

is it possible to concatenate strings during preprocessing?
I found this example
#define H "Hello "
#define W "World!"
#define HW H W
printf(HW); // Prints "Hello World!"
However it does not work for me - prints out "Hello" when I use gcc -std=c99
UPD: This example seems to work now. However, is it a normal feature of the C preprocessor?
Concatenation of adjacent string literals isn't a feature of the preprocessor, it is a feature of the core languages (both C and C++). You could write:
printf("Hello "
" world\n");
You can indeed concatenate tokens in the preprocessor, but be careful because it's tricky. The key is the ## operator. If you were to throw this at the top of your code:
#define myexample(x,y,z) int example_##x##_##y##_##z = x##y##z
then basically, what this does, is that during preprocessing, it will take any call to that macro, such as the following:
myexample(1,2,3);
and it will literally turn into
int example_1_2_3 = 123;
This allows you a ton of flexibility while coding if you use it correctly, but it doesn't exactly apply how you are trying to use it. With a little massaging, you could get it to work though.
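For reference, a compilable sketch of this kind of token pasting. MYEXAMPLE is an upper-case variant of the macro above, written with no ## after the last parameter, since a trailing ## would try to paste the argument onto the = sign.

```cpp
// Pastes the arguments into one identifier and into one integer literal.
#define MYEXAMPLE(x, y, z) int example_##x##_##y##_##z = x##y##z

// Expands to: int example_1_2_3 = 123;
MYEXAMPLE(1, 2, 3);
```

Note that this builds identifiers and numbers, not strings; joining string literals still relies on plain adjacency.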
One possible solution for your example might be:
#define H "Hello "
#define W "World!"
#define concat_and_print(a, b) std::cout << a << b << std::endl
and then do something like
concat_and_print(H,W);
From gcc online docs:
The '##' preprocessing operator performs token pasting. When a macro is expanded, the two tokens on either side of each '##' operator are combined into a single token, which then replaces the '##' and the two original tokens in the macro expansion.
Consider a C program that interprets named commands. There probably needs to be a table of commands, perhaps an array of structures declared as follows:
struct command
{
char *name;
void (*function) (void);
};
struct command commands[] =
{
{ "quit", quit_command },
{ "help", help_command },
...
};
It would be cleaner not to have to give each command name twice, once in the string constant and once in the function name. A macro which takes the name of a command as an argument can make this unnecessary. The string constant can be created with stringification, and the function name by concatenating the argument with _command. Here is how it is done:
#define COMMAND(NAME) { #NAME, NAME ## _command }
struct command commands[] =
{
COMMAND (quit),
COMMAND (help),
...
};
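The quoted table can be turned into a small compilable sketch. The command bodies, the last_command variable and run_command() are assumptions added only so there is something to call and observe; the COMMAND macro itself is the one from the quoted documentation.

```cpp
#include <cstring>

// Hypothetical state so the effect of a command can be observed.
static int last_command = 0;

// Hypothetical command implementations.
void quit_command() { last_command = 1; }
void help_command() { last_command = 2; }

struct command {
    const char* name;
    void (*function)();
};

// #NAME gives the string "quit"; NAME ## _command gives quit_command.
#define COMMAND(NAME) { #NAME, NAME ## _command }

static const command commands[] = {
    COMMAND(quit),
    COMMAND(help),
};

// Hypothetical dispatcher: looks up a command by name and runs it.
bool run_command(const char* name) {
    for (const command& c : commands) {
        if (std::strcmp(c.name, name) == 0) {
            c.function();
            return true;
        }
    }
    return false;
}
```

With this in place, run_command("help") should call help_command(), and an unknown name should return false without calling anything.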
I just thought I would add an answer that cites the source as to why this works.
The C99 standard §5.1.1.2 defines translation phases for C code. Subsection 6 states:
Adjacent string literal tokens are concatenated.
Similarly, the C++ standard (ISO 14882) §2.1 defines the phases of translation. Here Subsection 6 states:
6 Adjacent ordinary string literal tokens are concatenated. Adjacent wide string literal tokens are concatenated.
This is why you can concatenate strings simply by placing them adjacent to one another:
printf("string"" one\n");
>> ./a.out
>> string one
The preprocessing part of the question is simply the usage of the #define preprocessing directive which does the substitution from identifier (H) to string ("Hello ").