The following preprocessor-based identifier-to-string lookup table:
#include <iostream>
// included generated file
#define KEY_a valueA
#define KEY_b valueB
///////
#define LOOKUP_(_key_) KEY_ ## _key_
#define QUOTE_(_str_) #_str_
#define EXPAND_AND_QUOTE_(_str_) QUOTE_(_str_)
#define LOOKUP(_key_) EXPAND_AND_QUOTE_(LOOKUP_(_key_))
int main() {
std::cout << LOOKUP(a) << std::endl;
std::cout << LOOKUP(b) << std::endl;
std::cout << LOOKUP(c) << std::endl;
}
Output:
valueA
valueB
KEY_c
The first #defines come from an #included header generated by an external script before the compilation.
The LOOKUP macro correctly handles keys that exist in the table and substitutes the corresponding value as a string literal.
For keys that do not exist, however, it substitutes the pasted macro name (e.g. KEY_c) as a string literal.
Is there a way to instead make it substitute a given constant for non-existing keys, without causing a compile-time error, and all within the preprocessing stage?
So, for example, LOOKUP(c) and LOOKUP(whatever) should both be replaced by "undefined", without c or whatever occurring in the included generated file.
The names of the keys should not end up in the compiled binary, so ideally they should never be seen by the compiler.
Here's a simple, if hacky, solution. By making the definition of KEY_x a list of two elements (the first of which will be ignored), it permits adding a default value:
#include <iostream>
// included generated file
#define KEY_a _,valueA
#define KEY_b _,valueB
///////
#define LOOKUP_(key) KEY_ ## key
#define QUOTE_(_,str,...) #str
#define EXPAND_AND_QUOTE_(...) QUOTE_(__VA_ARGS__)
#define LOOKUP(key) EXPAND_AND_QUOTE_(LOOKUP_(key),undefined)
int main() {
std::cout << LOOKUP(a) << std::endl;
std::cout << LOOKUP(b) << std::endl;
std::cout << LOOKUP(c) << std::endl;
}
Suppose I have a header that is meant to be included several times, generating code from a template parameterised over a macro DATA. I use it in this way:
#define DATA this
#include <header.hpp>
#undef DATA
#define DATA that
#include <header.hpp>
#undef DATA
#define DATA the_other
#include <header.hpp>
#undef DATA
Is there a way to automate this repeated inclusion given a list of the values of DATA? Something like:
#define DATAS (this, that, the_other)
#include <header.hpp>
#undef DATAS
I tried with some __VA_OPT__ magic, and inside of header.hpp I can isolate the first element of the list and the tail of the list, but the problem is that I cannot redefine DATAS in terms of itself for the next inclusion.
Is this possible at all?
Yes, it is possible.
You can use Boost Preprocessor (which is independent of all other Boost Packages and only has to be downloaded, no library needs to be built or installed) to get the needed ready-to-use macros. You can also try to understand Boost Preprocessor and recreate the needed features.
The example is taken from Ari's answer. It could be expanded to provide several data elements to each iteration, e.g. for initializing the ints and floats with specific values.
// header.hpp - sample header, which uses DATA to create variables
// uses Boost preprocessor only for simple concatenation
// you can use your custom header here
#include <boost/preprocessor/cat.hpp>
int BOOST_PP_CAT(int_, DATA) = 1;
float BOOST_PP_CAT(float_, DATA) = 2.2f;
// main.cpp - wants to define lots of variables
// provides header name, list of symbol suffixes
// repeated.hpp will include header.hpp 3 times with DATA set to this, that and the_other
// (Space after REP_PARAMS is important)
#define REP_PARAMS ("header.hpp")(this, that, the_other)
#include "repeated.hpp"
#undef REP_PARAMS
#include <iostream>
using namespace std;
int main()
{
cout << "int_this: " << int_this << endl;
cout << "int_that: " << int_that << endl;
cout << "int_the_other: " << int_the_other << endl;
cout << "----------------------------------------------------------"
<< endl;
cout << "float_this: " << float_this << endl;
cout << "float_that: " << float_that << endl;
cout << "float_the_other: " << float_the_other << endl;
return 0;
}
// repeated.hpp - helper header
// all the magic
// it mostly extracts the REP_PARAMS sequence
// TODO error-checking, e.g. that REP_PARAMS exists and is a sequence of length two, that second element of REP_PARAMS is a tuple
#if !BOOST_PP_IS_ITERATING
// iteration has not started yet, include used boost headers
// initialize iteration with 3 parameters: lower bound 0, upper bound
// (size of tuple - 1; both bounds are inclusive), and the file to
// include, which is this file itself (repeated.hpp)
#include <boost/preprocessor/arithmetic/dec.hpp>
#include <boost/preprocessor/iteration/iterate.hpp>
#include <boost/preprocessor/tuple/elem.hpp>
#include <boost/preprocessor/tuple/size.hpp>
#include <boost/preprocessor/seq/seq.hpp>
#define BOOST_PP_ITERATION_PARAMS_1 (3, (0, BOOST_PP_DEC(BOOST_PP_TUPLE_SIZE(BOOST_PP_SEQ_TAIL(REP_PARAMS))), "repeated.hpp"))
#include BOOST_PP_ITERATE()
#else
// set DATA to i-th element in tuple, include specified header (header.hpp)
#define DATA BOOST_PP_TUPLE_ELEM(BOOST_PP_ITERATION(), BOOST_PP_SEQ_TAIL(REP_PARAMS))
#include BOOST_PP_SEQ_HEAD(REP_PARAMS)
#undef DATA
#endif
The maximum list size is 256. By default it is limited to 64, but can be increased with the BOOST_PP_LIMIT_TUPLE macro.
I have to admit I wouldn't even consider using any preprocessing tricks for that. This is a classical scripting problem.
Instead you could write a small script that creates that header for you and inserts that at the beginning of the file. You could then add that as a step in your build system to run it. This technique gives you a LOT of power going forward:
You can add the same header to many scripts rather easily
You can see all the custom headers in a clean json format
You could easily get the script to add multiple #define <key> <value>-s before the include
You could change formatting easily and quickly
Here is an example script that does that:
import json

def prepend_headers(fout, headers):
    for header in headers:
        include = header['include']
        define = header['define']
        for k, v in define.items():
            fout.write(f'#define {k} {v}\n')
        fout.write(f'#include {include}\n')
        for k, _ in define.items():
            fout.write(f'#undef {k}\n')
        fout.write('\n')

def main(configfile):
    with open(configfile) as fin:
        config = json.load(fin)
    input_file = config['input']
    with open(input_file) as fin:
        input_content = fin.read()
    output_file = config['output']
    with open(output_file, 'w') as fout:
        headers = config['headers']
        prepend_headers(fout, headers)
        fout.write(input_content)

if __name__ == '__main__':
    import sys
    configfile = sys.argv[1]
    sys.exit(main(configfile))
If you use the following configuration:
{
"input": "class.cpp.template",
"output": "class.cpp",
"headers": [
{
"include": "<header.hpp>",
"define": {
"DATA": "this",
"OBJ": "him"
}
},
{
"include": "<header.hpp>",
"define": {
"DATA": "that"
}
},
{
"include": "<header.hpp>",
"define": {
"DATA": "the_other"
}
}
]
}
And the following template file:
#include <iostream>
class Obj {
};
int main() {
Obj o;
std::cout << "Hi!" << std::endl;
return 0;
}
The result you get is this:
#define DATA this
#define OBJ him
#include <header.hpp>
#undef DATA
#undef OBJ
#define DATA that
#include <header.hpp>
#undef DATA
#define DATA the_other
#include <header.hpp>
#undef DATA
#include <iostream>
class Obj {
};
int main() {
Obj o;
std::cout << "Hi!" << std::endl;
return 0;
}
Keeping a separate template file might be annoying, so you might instead add marker hints to the output file itself and have the script replace the text between them on every build you run.
This is not doable using the preprocessor alone. It is worth mentioning, however, that there is a technique called the X-macro that could have been used for something close to what you are asking, if each case did not itself require preprocessor directives.
The reason it cannot be used here is that a macro definition cannot contain #define or #include directives.
For example, it works for defining this, that and the_other as variables from a file called data.def that lists them:
// data.def
ELEMENT(this)
ELEMENT(that)
ELEMENT(the_other)
Then in main.cc:
// main.cc
#include <iostream>
#define ELEMENT(d) int int_##d = 1;
#include "data.def"
#undef ELEMENT
#define ELEMENT(d) float float_##d = 2.2f;
#include "data.def"
#undef ELEMENT
int main() {
    std::cout << "int_this: " << int_this << std::endl;
    std::cout << "int_that: " << int_that << std::endl;
    std::cout << "int_the_other: " << int_the_other << std::endl;
    std::cout << "----------------------------------------------------------"
              << std::endl;
    std::cout << "float_this: " << float_this << std::endl;
    std::cout << "float_that: " << float_that << std::endl;
    std::cout << "float_the_other: " << float_the_other << std::endl;
}
Output:
int_this: 1
int_that: 1
int_the_other: 1
----------------------------------------------------------
float_this: 2.2
float_that: 2.2
float_the_other: 2.2
But something like this is not going to work because you would be defining a macro in another macro:
#define ELEMENT(d) #define DATA d; \
#include "data.def" \
#undef DATA
#undef ELEMENT
The goal here is to simply get a, b, c out instead of their actual values. The setup is "simple enough":
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/seq/for_each.hpp>
#include <boost/preprocessor/stringize.hpp>
#include <iostream>
// Define "invalid" sequence first
#define SEQ (a)(b)(c)
// Try to create "final" value with `std::string("elem")`
// Brought in for explicit `std::string`, but no dice
#define MAKE_XSTRING(x) MAKE_STRING(x)
#define MAKE_STRING(x) std::string(#x)
// oh, the humanity! vvvvvvvvvvvv or BOOST_PP_STRINGIZE
#define HUMANIZE(r, data, elem) (MAKE_XSTRING(elem))
#define SEQ_HUMAN BOOST_PP_SEQ_FOR_EACH(HUMANIZE,,SEQ)
So what I'm expecting at this point is what I have: a new sequence with (std::string("a")) etc:
// confirmation: vvvvvvvvvvvvvvvv
// warning: Humans: (std::string("a")) (std::string("b")) (std::string("c"))
#pragma message "Humans: " BOOST_PP_STRINGIZE(SEQ_HUMAN)
Thinking I'm so very clever and have gotten my values sorted out in some explicit strings, now I define the actual values for what the "real" code needs.
// Now that we have the "final" values, actually define the real values
// in real code, it's some lengthy nested namespaces (inconvenient to type)
#define a 123
#define b 456
#define c 789
And at long last, let's print them to make sure they aren't expanded:
// Let there be printing!
#define GOTTA_PRINT_EM_ALL(r,data,i,elem) << ((i)+1) << ". " << elem << std::endl
int main(int argc, const char **argv) {
std::cout << "Humans: " << std::endl
BOOST_PP_SEQ_FOR_EACH_I(GOTTA_PRINT_EM_ALL,,SEQ_HUMAN);
}
But it seems the aliens did indeed take over:
Humans:
1. 123
2. 456
3. 789
Given that they're supposed to be std::string("a")...how the heck are the real values getting back in there?! I thought maybe the ("a") from the std::string constructor was creating issues, but it doesn't seem so (BOOST_PP_STRINGIZE results in same behavior). Any suggestions?
The macro indeed expands into code tokens:
test.cpp|24 col 1| note: #pragma message: Humans: (std::string("123")) (std::string("456")) (std::string("789"))
Now when you insert the code tokens into your GOTTA_PRINT_EM_ALL macro, you get
<< ((0)+1) << ". " << std::string("123") << std::endl << ((1)+1) << ". " << std::string("456") << std::endl << ((2)+1) << ". " << std::string("789") << std::endl
Completely expectedly printing
Humans:
1. 123
2. 456
3. 789
To get the "code tokens" you need to stringize them as well:
// Let there be printing!
#define GOTTA_PRINT_EM_ALL(r,data,i,elem) << ((i)+1) << ". " << BOOST_PP_STRINGIZE(elem) << std::endl
Printing
Humans:
1. std::string("123")
2. std::string("456")
3. std::string("789")
The complete program:
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/seq/for_each.hpp>
#include <boost/preprocessor/stringize.hpp>
#include <iostream>
#include <string>
#define a 123
#define b 456
#define c 789
#define SEQ (a)(b)(c)
// Try to create "final" value with `std::string("elem")`
// Brought in for explicit `std::string`, but no dice
#define MAKE_STRING(x) std::string(#x)
#define MAKE_XSTRING(x) MAKE_STRING(x)
#define HUMANIZE(r, data, elem) (MAKE_XSTRING(elem))
#define SEQ_HUMAN BOOST_PP_SEQ_FOR_EACH(HUMANIZE,,SEQ)
// Let there be printing!
#define GOTTA_PRINT_EM_ALL(r,data,i,elem) << ((i)+1) << ". " << BOOST_PP_STRINGIZE(elem) << std::endl
int main() {
std::cout << "Humans: " << std::endl
BOOST_PP_SEQ_FOR_EACH_I(GOTTA_PRINT_EM_ALL,,SEQ_HUMAN);
}
I have a requirement to build an automated system to parse a C++ .h file with a lot of #define statements in it and do something with the value that each #define works out to. The .h file has a lot of other junk in it besides the #define statements.
The objective is to create a key-value list, where the keys are all the keywords defined by the #define statements and the values are the evaluations of the macros which correspond to the definitions. The #defines define the keywords with a series of nested macros that ultimately resolve to compile-time integer constants. There are some that do not resolve to compile-time integer constants, and these must be skipped.
The .h file will evolve over time, so the tool cannot be a long hardcoded program which instantiates a variable to be equal to each keyword. I have no control over the contents of the .h file. The only guarantees are that it can be built with a standard C++ compiler, and that more #defines will be added but never removed. The macro formulas may change at any time.
The options I see for this are:
Implement a partial (or hook into an existing) C++ compiler and intercept the value of the macros during the preprocessor step.
Use regexes to dynamically build a source file which will consume all the macros currently defined, then compile and execute the source file to get the evaluated form of all the macros. Somehow (?) skip the macros which do not evaluate to compile-time integer constants. (Also, not sure if regex is expressive enough to capture all possible multi-line macro definitions)
Both of these approaches would add substantial complexity and fragility to the build process for this project which I would like to avoid. Is there a better way to evaluate all the #define macros in a C++ .h file?
Below is an example of what I am looking to parse:
#ifndef Constants_h
#define Constants_h
namespace Foo
{
#define MAKE_CONSTANT(A, B) (A | (B << 4))
#define MAGIC_NUMBER_BASE 40
#define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
#define MORE_MAGIC_1 345
#define MORE_MAGIC_2 65
// Other stuff...
#define CONSTANT_1 MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)
#define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
// etc...
#define SKIP_CONSTANT "What?"
// More CONSTANT_N mixed with more other stuff and constants which do
// not resolve to compile-time integers and must be skipped
}
#endif // Constants_h
What I need to get out of this is the names and evaluations of all the defines which resolve to compile-time integer constants. In this case, for the defines shown it would be
MAGIC_NUMBER_BASE 40
MAGIC_NUMBER 42
MORE_MAGIC_1 345
MORE_MAGIC_2 65
CONSTANT_1 1887
CONSTANT_2 -9
It doesn't really matter what format this output is in as long as I can work with it as a list of key-value pairs further down the pipe.
An approach could be to write a "program generator" that generates a program (the printDefines program) comprising statements like std::cout << "MAGIC_NUMBER" << " " << (MAGIC_NUMBER_BASE + 0x2) << std::endl;. Obviously, executing such statements will resolve the respective macros and print out their values.
The list of macros defined by a header file can be obtained from g++ with the -dM -E options. Feeding this list of #defines to the "program generator" produces a "printDefines.cpp" containing all the required cout statements. Compiling and executing the generated printDefines program then yields the final output. All macros are resolved, including those that themselves use other macros.
See the following shell script and the following program generator code that together implement this approach:
Script printing the values of #define-statements in "someHeaderfile.h":
# printDefines.sh
g++ -std=c++11 -dM -E someHeaderfile.h > defines.txt
./generateDefinesCpp someHeaderfile.h defines.txt > defines.cpp
g++ -std=c++11 -o defines.o defines.cpp
./defines.o
Code of program generator "generateDefinesCpp":
#include <stdio.h>
#include <string>
#include <iostream>
#include <fstream>
#include <cstring>
using std::cout;
using std::endl;
/*
 * Argument 1: name of the header file to include in the generated program
 * Argument 2: name of the file listing the #defines (output of g++ -dM -E)
 * The generated C++ source is written to standard output.
 */
int main(int argc, char* argv[])
{
    if (argc < 3) {
        std::cerr << "usage: " << argv[0] << " <header> <defines-list>" << endl;
        return 1;
    }
    cout << "#include<iostream>" << endl;
    cout << "#include<stdio.h>" << endl;
    cout << "#include \"" << argv[1] << "\"" << endl;
    cout << "int main() {" << endl;
    std::ifstream headerFile(argv[2], std::ios::in);
    std::string buffer;
    char macroName[1000];
    int macroValuePos = 0;
    while (getline(headerFile, buffer)) {
        const char *bufferCStr = buffer.c_str();
        // %n stores the offset at which the macro's replacement text begins
        if (sscanf(bufferCStr, "#define %999s %n", macroName, &macroValuePos) == 1) {
            const char* macroValue = bufferCStr + macroValuePos;
            // skip reserved names, function-like macros and macros without a value
            if (macroName[0] != '_' && strchr(macroName, '(') == NULL && *macroValue) {
                cout << "std::cout << \"" << macroName << "\" << \" \" << (" << macroValue << ") << std::endl;" << std::endl;
            }
        }
    }
    cout << "return 0; }" << endl;
    return 0;
}
The approach could be optimised so that the intermediate files defines.txt and defines.cpp are not necessary; for demonstration purposes, however, they are helpful. When applied to your header file, the contents of defines.txt and defines.cpp will be as follows:
defines.txt:
#define CONSTANT_1 MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)
#define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
#define Constants_h
#define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
#define MAGIC_NUMBER_BASE 40
#define MAKE_CONSTANT(A,B) (A | (B << 4))
#define MORE_MAGIC_1 345
#define MORE_MAGIC_2 65
#define OBJC_NEW_PROPERTIES 1
#define SKIP_CONSTANT "What?"
#define _LP64 1
#define __APPLE_CC__ 6000
#define __APPLE__ 1
#define __ATOMIC_ACQUIRE 2
#define __ATOMIC_ACQ_REL 4
...
defines.cpp:
#include<iostream>
#include<stdio.h>
#include "someHeaderfile.h"
int main() {
std::cout << "CONSTANT_1" << " " << (MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)) << std::endl;
std::cout << "CONSTANT_2" << " " << (MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)) << std::endl;
std::cout << "MAGIC_NUMBER" << " " << (MAGIC_NUMBER_BASE + 0x2) << std::endl;
std::cout << "MAGIC_NUMBER_BASE" << " " << (40) << std::endl;
std::cout << "MORE_MAGIC_1" << " " << (345) << std::endl;
std::cout << "MORE_MAGIC_2" << " " << (65) << std::endl;
std::cout << "OBJC_NEW_PROPERTIES" << " " << (1) << std::endl;
std::cout << "SKIP_CONSTANT" << " " << ("What?") << std::endl;
return 0; }
And the output of executing defines.o is then:
CONSTANT_1 1887
CONSTANT_2 -9
MAGIC_NUMBER 42
MAGIC_NUMBER_BASE 40
MORE_MAGIC_1 345
MORE_MAGIC_2 65
OBJC_NEW_PROPERTIES 1
SKIP_CONSTANT What?
Here is a concept, based on assumptions from a clarification comment.
only one header
no includes
no dependency on the including code file
no dependency on previously included headers
no dependency on include order
Main requirements otherwise:
do not risk influence on binary build process
(being the part which makes the actual software product)
do not try to emulate the binary build compiler/parser
How to:
make a copy
include it from a dedicated code file,
which only contains #include "copy.h";
or directly preprocess the header
(this just feels weirdly against my habits)
delete everything except preprocessor and pragmas,
paying attention to line-continuation
replace all "#define"s by "HaPoDefine",
except one (e.g. the first)
repeat
preprocess the including code file
(most compilers have a switch to do this)
save the output
turn another "HaPoDefine" back into "#define"
until no "HaPoDefine" is left
harvest all macro expansions from the deltas of intermediate saves
discard everything which is not of relevance
since the final numerical value is most likely computed by the compiler (not the preprocessor), use a tool such as the shell utility expr to calculate values for human-eye readability,
be careful not to risk differences to binary build process
use some regex magic to achieve any desired format
Can you use g++ or gcc with the -E option, and work with that output?
-E Stop after the preprocessing stage; do not run the compiler proper.
The output is in the form of preprocessed source code, which
is sent to the standard output. Input files which don't require
preprocessing are ignored.
With this, I imagine:
Create the list of all #define keys from the source
Run the appropriate command below against the source file(s), and let the GNU preprocessor do its thing
Grab the preprocessed result from stdout, filter to take only those in integer form, and output it to however you want to represent key/value pairs
One of these two commands:
gcc -E myFile.c
g++ -E myFile.cpp
https://gcc.gnu.org/onlinedocs/gcc-2.95.2/gcc_2.html
https://gcc.gnu.org/onlinedocs/cpp/Preprocessor-Output.html
How to get the macro name inside a macro?
Say we have:
#include <iostream>
using std::cout;
using std::endl;
#define MACRO() \
cout << __MACRO_NAME__ << endl
int main () {
MACRO();
return 0;
}
Expected output:
MACRO
I did a little bit of research and I don't think that is doable in C++.
But you could use this:
#define MACRO2(x) cout << #x << endl
#define MACRO() MACRO2(MACRO)
Here MACRO2 does the work of MACRO, and the name of MACRO is available inside it as the argument x. Because x is only used with the # operator, it is stringized without being expanded.
There is one function defined in two different ways, one using #define and other using a function. But for the output I am getting different values.
The output is coming out to be 3 -1.
I want to know why using F(x,y) results in different values.
#include<iostream>
#define F(x,y) y-x
using namespace std;
int F2(int x,int y)
{
return y-x;
}
int main()
{
int x=1,y=2, h=2;
cout << F(x+h,y) << " " << F2(x+h,y) << endl;
return 0;
}
First off, you didn't #define a function but a macro. Macros do straight text replacement so the output line is equivalent to:
cout << y-x+h << " " << F2(x+h,y) << endl;
Can you spot the error now?
Classic problem using #define, and one of the main reasons why macros are discouraged. Keep in mind that a macro is little more than a literal substitution, and consider what it expands to:
cout << y-x+h << " " << F2(x+h,y) << endl;
And y-x+h is something very different from y-(x+h).
Always parenthesize uses of macro arguments:
#define F(x,y) ((y)-(x))
#define is a macro directive, not a function. It just replaces each occurrence of the macro with the macro body. If you do that by hand, you will see that F(x+h,y) is replaced by y-x+h, which is obviously not what you want. The rule for macros is to wrap every parameter and the whole expression in parentheses, like this:
#define F(x,y) ((y)-(x))
in order to get the correct results.
This way F(x+h,y) will be replaced by ((y)-(x+h)), which is correct.