C++ preprocessor token pasting for namespace qualification - c++

I am having trouble with the preprocessor token pasting operator in gcc 4.7.1 (std=c++11). Namely, consider the following code:
// Create a name for a global map (this works)
#define GLOBAL_MAP(name) g_map_ ## name // This works fine
// Now, namespace qualify this map (this fails to compile when used)
#define NS_QUAL_GLOBAL_MAP(name) SomeNamespace:: ## GLOBAL_MAP(name)
Usage scenarios - first the map definitions:
std::map<std::string,std::string> GLOBAL_MAP(my_map);
namespace SomeNamespace
{
std::map<std::string,std::string> GLOBAL_MAP(my_map);
}
Now the usage:
void foo()
{
bar(GLOBAL_MAP(my_map)); // This compiles fine
baz(NS_QUAL_GLOBAL_MAP(my_map)); // This fails to compile with:
// error: pasting "::" and "GLOBAL_MAP" does not give a
// valid preprocessing token
}
What I believe might be happening is that it is interpreting GLOBAL_MAP after ## as a token for pasting rather than a macro to be further expanded. How do I get around this?

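Token pasting must produce a single valid preprocessing token. That isn't what you want here: the operands of ## are not macro-expanded before pasting, so the preprocessor tries to glue :: directly onto GLOBAL_MAP. :: is a valid C++ token on its own, but ::GLOBAL_MAP (or ::g_map_my_map) is two tokens, not one.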
Hence, remove the token pasting operator:
#define NS_QUAL_GLOBAL_MAP(type) SomeNamespace::GLOBAL_MAP(type)

You don't need the ## operator after ::. The ## operator is used to form a single token, but SomeNamespace::g_map_my_map is several tokens anyway. Just do
#define NS_QUAL_GLOBAL_MAP(type) SomeNamespace::GLOBAL_MAP(type)

You just want SomeNamespace:: GLOBAL_MAP(name).
You can't join a name like g_map_my_map to the :: token, because ::g_map_my_map is not a valid token, it's two tokens. So just put them next to each other, don't try to join them.
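A minimal sketch of the fix described in the answers above, assuming the maps and macros from the question. The helper read_back is hypothetical and only exists to show both macros in use.

```cpp
#include <map>
#include <string>

// ## is correct here: g_map_ and the argument form ONE identifier.
#define GLOBAL_MAP(name) g_map_ ## name

// No ## here: SomeNamespace, :: and the identifier are separate tokens.
// GLOBAL_MAP(name) is expanded normally when the replacement is rescanned.
#define NS_QUAL_GLOBAL_MAP(name) SomeNamespace::GLOBAL_MAP(name)

std::map<std::string, std::string> GLOBAL_MAP(my_map);

namespace SomeNamespace
{
std::map<std::string, std::string> GLOBAL_MAP(my_map);
}

// Hypothetical helper: writes through both macros and reads the values back.
std::string read_back()
{
    GLOBAL_MAP(my_map)["key"] = "global";
    NS_QUAL_GLOBAL_MAP(my_map)["key"] = "namespaced";
    return GLOBAL_MAP(my_map)["key"] + "/" + NS_QUAL_GLOBAL_MAP(my_map)["key"];
}
```

NS_QUAL_GLOBAL_MAP(my_map) expands to SomeNamespace::g_map_my_map, so the two macros refer to two different map objects.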

Related

flex/bison returning a character token from the scanner in C++

I'm using the calc++ example found in the bison documentation as a starting point to a more complex grammar. One thing I haven't been able to figure out is how to return a character (literal) token from flex to bison.
In pure C examples, I've seen flex simply returning the token as:
"+" { count(); return('+'); }
The calc++ example simply uses token symbols:
"+" return yy::parser::make_PLUS (loc);
But this forces me to use PLUS instead of '+' in the grammar file.
How can I get flex to return a literal value as in the C example when generating C++ code?
Do not define it at all. It will return as the literal and you will be able to use it in parser as '+'
If you use "complete symbols" (that is, %define api.token.constructor), you should be able to use the appropriate parser::symbol_type constructor, as shown in the bison manual section on "complete symbols":
":" return yy::parser::symbol_type (':', loc);

Make Bison accept an alternative EOF token

I'm writing an ansi-C parser in C++ with flex and bison; it's pretty complex.
The issue I'm having is a compilation error (shown below). It occurs because yyterminate returns YY_NULL, which is defined as (an int) 0, while yylex has the return type yy::AnsiCParser::symbol_type. yyterminate(); is the automatic action for the <<EOF>> token in scanners generated by flex. Obviously this causes a type issue.
My scanner doesn't produce any special token for the EOF, because EOF has no purpose in a C grammar. I could create a token-rule for the <<EOF>> but if I ignore it then the scanner hangs in an infinite loop in yylex on the YY_STATE_EOF(INITIAL) case.
The compilation error,
ansi-c.yy.cc: In function ‘yy::AnsiCParser::symbol_type yylex(AnsiCDriver&)’:
ansi-c.yy.cc:145:17: error: could not convert ‘0’ from ‘int’ to ‘yy::AnsiCParser::symbol_type {aka yy::AnsiCParser::basic_symbol<yy::AnsiCParser::by_type>}’
ansi-c.yy.cc:938:30: note: in expansion of macro ‘YY_NULL’
ansi-c.yy.cc:1583:2: note: in expansion of macro ‘yyterminate’
Also, Bison generates this rule for my start-rule (translation_unit) and the EOF ($end).
$accept: translation_unit $end
So yylex has to return something for the EOF or the parser will never stop waiting for input, but my grammar cannot support an EOF token. Is there a way to make Bison recognize something other than 0 for the $end condition without modifying my grammar?
Alternatively, is there simply something I can return from the <<EOF>> token in the scanner to satisfy the Bison $end condition?
Normally, you would not include an explicit EOF rule in a lexical analyzer, not because it serves no purpose, but rather because the default is precisely what you want to do. (The purpose it serves is to indicate that the input is complete; otherwise, the parser would accept the valid prefix of certain invalid programs.)
Unfortunately, the C++ interfaces can defeat the simple convenience of the default EOF action, which is to return 0 (or NULL). I assume from your problem description that you have asked bison to generate a parser using complete symbols. In that case, you cannot simply return a 0 from yylex, since the parser is expecting a complete symbol, which is a more complex type than int. (Although the token which reports EOF does not normally have a semantic value, it does have a location, if you are using locations.) For other token types, bison will have automatically generated a function which makes a token, named something like make_FOO_TOKEN, which you will call in your scanner action for a FOO_TOKEN.
While the C bison parser does automatically define the end of file token (called END), it appears that the C++ interface does not. So you need to manually define it in your %token declaration in your bison input file:
%token END 0 "end of file"
(That defines the token type END with an integer value of 0 and the human readable label "end of file". The value 0 is obligatory.)
Once you've done that, you can add an explicit EOF rule in your flex input file:
<<EOF>> return make_END();
If you are using locations, you'll have to give make_END a location argument as well.
Here's another way to prevent the compiler error "could not convert 0 from int to ...symbol_type": place this redefinition of the yyterminate macro just below where you redefine YY_DECL.
// change curLocation to the name of the location object used in yylex
// qualify symbol_type with the bison namespace used
#define yyterminate() return symbol_type(YY_NULL, curLocation)
The compiler error shows up when bison locations are enabled, e.g. with %define locations. This makes bison add a location parameter to its symbol_type constructors, so the constructor without locations
symbol_type(int tok)
turns into this with locations
symbol_type(int tok, location_type l)
This makes it impossible to convert an int to a symbol_type implicitly, which is what the default definition of yyterminate in flex relies on when bison locations are not enabled:
#define yyterminate() return YY_NULL
With this workaround there's no need to handle EOF in flex, and no need for a superfluous END token in bison, unless you actually want them.
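The conversion problem described above can be illustrated without flex or bison at all. The following is only a sketch: location, symbol_without_location and symbol_with_location are hypothetical stand-ins that mirror the two constructor shapes quoted in this answer; they are not Bison's generated classes.

```cpp
#include <type_traits>

// Hypothetical stand-in for a location type.
struct location
{
    int line;
};

// Mirrors symbol_type(int tok): a one-argument, non-explicit constructor,
// so a plain int such as YY_NULL (0) converts implicitly.
struct symbol_without_location
{
    symbol_without_location(int tok) : token(tok) {}
    int token;
};

// Mirrors symbol_type(int tok, location_type l): two arguments are required,
// so "return 0;" no longer compiles for this return type.
struct symbol_with_location
{
    symbol_with_location(int tok, location l) : token(tok), loc(l) {}
    int token;
    location loc;
};

static_assert(std::is_convertible<int, symbol_without_location>::value,
              "an int converts when no location is needed");
static_assert(!std::is_convertible<int, symbol_with_location>::value,
              "an int does not convert once a location is required");

// What the redefined yyterminate() effectively does: build the symbol explicitly.
symbol_with_location end_of_input(location current)
{
    return symbol_with_location(0, current);
}
```

This is why the workaround passes both YY_NULL and the current location to the constructor instead of returning a bare 0.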

A previously defined constant, given as macro argument, is considered as string literal

Let's say I have defined a macro which does this
#define MY_MACRO(NAME) \
std::string get##NAME() const \
{ \
return doSomething(#NAME); \
}
Where doSomething method signature will be something like this
std::string doSomething(const std::string& parameter);
This works pretty well when the NAME macro parameter has no dashes in it.
For example :
MY_MACRO(thisIsA_test) // Works
But when I have a dash in my string (this can happen), it won't work, because dashes are not allowed in method names:
MY_MACRO(thisIsA-test) // does NOT work
I have tried to work around it this way:
#define thisIsAtest "thisIsA-test"
MY_MACRO(thisIsAtest)
Everything compiles just fine and I have the getthisIsAtest method generated but unfortunately the macro is not resolved and "thisIsAtest" is kept as string literal.
In other words the doSomething parameter string value will be "thisIsAtest" whereas I was expecting "thisIsA-test".
To expand the macro argument, just use an indirection macro.
#define stringize_literal( x ) # x
#define stringize_expanded( x ) stringize_literal( x )
Your use-case:
return doSomething( stringize_expanded( NAME ) );
Now the method will be named with name of the macro, and the function will be called with the contents of the macro. Somewhat questionable in terms of organization, but there you have it.
Why it works:
By default, macro arguments are expanded before being substituted. So if you pass thisIsAtest to parameter NAME, the macro expansion will replace NAME with "thisIsA-test". The pre-expansion step does not apply when you use a preprocessor operator # or ## though.
In your original code, one use of NAME is subject to ## and the other is subject to # so the macro definition of thisIsAtest never gets used. I just introduced a macro stringize_expanded which introduces an artificial use of NAME (via x) which is not subject to an operator.
This is the idiomatic way to use # and ##, since the expansion is desired more often than the literal macro name. You do happen to want the default behavior for ## in this case, but it could be considered a case of poor encapsulation (as the name of an interface is used to produce output), if you wanted to apply real programming principles to the problem.
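A small sketch of the indirection in use. One assumption differs from the question: thisIsAtest is defined here without quotes, because stringizing a macro whose body is already a string literal would keep the quote characters inside the resulting string. Widget and the body of doSomething are hypothetical stand-ins.

```cpp
#include <string>

#define stringize_literal(x) #x
#define stringize_expanded(x) stringize_literal(x)

// Assumption: the alias is defined WITHOUT quotes, so that # adds them once.
#define thisIsAtest thisIsA-test

// Hypothetical stand-in for the real doSomething: it just echoes its argument.
inline std::string doSomething(const std::string& parameter)
{
    return parameter;
}

// get##NAME uses the unexpanded name (## suppresses expansion of NAME);
// stringize_expanded(NAME) sees NAME after it has been macro-expanded.
#define MY_MACRO(NAME) \
    std::string get##NAME() const \
    { \
        return doSomething(stringize_expanded(NAME)); \
    }

// Hypothetical class hosting the generated method.
struct Widget
{
    MY_MACRO(thisIsAtest)
};
```

With these definitions, Widget gains a method named getthisIsAtest, and the string passed to doSomething is built from the expanded tokens thisIsA-test rather than from the macro's name.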
There's nothing to work around.
As you have said yourself, dashes are not valid in function names.
So, do not use them.

C++ Preprocessor metaprogramming: obtaining a unique value?

I'm exploiting the behavior of the constructors of C++ global variables to run code at startup in a simple manner. It's a very easy concept but a little difficult to explain so let me just paste the code:
struct _LuaVariableRegistration
{
template<class T>
_LuaVariableRegistration(const char* lua_name, const T& c_name) {
/* ... This code will be run at startup; it temporarily saves lua_name and c_name in a std::map, and when Lua is loaded it registers all of the temporarily saved global variables in Lua. */
}
};
However manually instantiating that super ugly class every time one wants to register a Lua global variable is cumbersome; that's why I created the following macro:
#define LUA_GLOBAL(lua_name, c_name) static Snow::_LuaVariableRegistration _____LuaGlobal ## c_name (lua_name, c_name);
So all you have to do is put that in the global scope of a cpp file and everything works perfectly:
LUA_GLOBAL("LuaIsCool", true);
There you go! Now in Lua LuaIsCool will be a variable initialized to true!
But, here is the problem:
LUA_GLOBAL("ACCESS_NONE", Access::None);
Which becomes:
static Snow::_LuaVariableRegistration _____LuaGlobalAccess::None ("ACCESS_NONE", Access::None);
:((
I need to concatenate c_name in the macro or it will complain about two variables with the same name; I tried replacing it with __LINE__ but it actually becomes _____LuaGlobalAccess__LINE__ (ie it doesn't get replaced).
So, is there a way to somehow obtain a unique string, or any other workaround?
PS: Yes I know names that begin with _ are reserved; I use them anyway for purposes like this being careful to pick names that the standard library is extremely unlikely to ever use. Additionally they are in a namespace.
You need to add an extra layer of macros to make the preprocessor do the right thing:
#define TOKENPASTE(x, y) x ## y
#define TOKENPASTE2(x, y) TOKENPASTE(x, y)
#define LUA_GLOBAL(lua_name, c_name) ... TOKENPASTE2(_luaGlobal, __LINE__) ...
Some compilers also support the __COUNTER__ macro, which expands to a new, unique integer every time it is evaluated, so you can use that in place of __LINE__ to generate unique identifiers. I'm not sure if it's valid ISO C, although gcc accepts its use with the -ansi -pedantic options.
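A sketch of how the two-level paste could be wired into a registration macro. Registration, registration_count and the Access enum are hypothetical stand-ins for the question's types, and the prefix luaGlobal_ is chosen here to avoid a leading underscore. Note that this only yields distinct names when each use of the macro sits on its own line.

```cpp
#include <cassert>

#define TOKENPASTE(x, y) x ## y
// The extra level lets __LINE__ expand to a number before ## is applied.
#define TOKENPASTE2(x, y) TOKENPASTE(x, y)

// Hypothetical counter; a function-local static avoids initialization-order issues.
inline int& registration_count()
{
    static int count = 0;
    return count;
}

// Hypothetical stand-in for the registration class in the question.
struct Registration
{
    Registration(const char* lua_name, int /*value*/)
    {
        assert(lua_name != nullptr);
        ++registration_count();
    }
};

// No trailing semicolon in the macro; the call site supplies it.
#define LUA_GLOBAL(lua_name, c_name) \
    static Registration TOKENPASTE2(luaGlobal_, __LINE__)(lua_name, c_name)

// Hypothetical enum with qualified names, the case that broke the original macro.
namespace Access
{
enum Kind { None = 0, Read = 1 };
}

LUA_GLOBAL("ACCESS_NONE", Access::None);
LUA_GLOBAL("ACCESS_READ", Access::Read);
```

Because the variable name now depends on the line number rather than on c_name, arguments such as Access::None no longer end up inside an identifier.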

Concat Macro argument with namespace

I have a macro where one of the arguments is an enum value, given without its namespace scope. Somewhere inside the macro I need to access it (so I must supply the namespace there), but I can't seem to concatenate the namespace name with the macro argument. Given the following sample code, the compiler complains that pasting :: and Val doesn't give a valid preprocessing token (concatenating get and a to getVal works fine, though).
namespace TN
{
enum Info
{
Val = 0
};
}
#define TEST(a) TN::Info get ## a(){return TN::##a;}
TEST(Val)
So is there any way to make this work without using another argument and basically specifying the value to be used twice (e.g. #define TEST(a,b) TN::Info get ## a(){return b;})?
## is the token pasting operator, i.e. it makes a single token out of two tokens, and as the compiler says, ::Val isn't a single token, it's two tokens.
Why do you think you need the second ## at all? What's wrong with this?
#define TEST(a) TN::Info get ## a () { return TN::a; }
Only use ## when you want to concatenate two items and have the compiler treat the result as a single token (e.g. an identifier).
In your macro, the first use of ## is correct, as you are trying to construct an identifier by pasting together get and the contents of a, but the second use of ## is spurious, as you just want to make an identifier out of the contents of a, and the :: operator is a separate entity to that. GCC will complain about this (though MSVC++ copes).
#define TEST(a) TN::Info get ## a(){return TN::a;}
should work.
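A compilable sketch of the corrected macro. Two things here are additions for illustration and not part of the question: the second enumerator Other, and the constexpr specifier, which is only there so the result can be checked with static_assert.

```cpp
namespace TN
{
enum Info
{
    Val = 0,
    Other = 1  // hypothetical second enumerator, added for the example
};
}

// get ## a builds ONE identifier (getVal); TN::a is just adjacent tokens.
// constexpr is an addition so the generated functions can be checked at compile time.
#define TEST(a) constexpr TN::Info get ## a() { return TN::a; }

TEST(Val)
TEST(Other)

static_assert(getVal() == TN::Val, "TEST(Val) generates getVal returning TN::Val");
static_assert(getOther() == TN::Other, "TEST(Other) generates getOther returning TN::Other");
```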