How do I get an enum element's numerical value using libclang?

How do I get an enum element's numerical value using libclang? - c++

Suppose I have an enum definition, e.g.:
// myenum.h
enum MyEnum {
First = 1,
Second,
Third,
TwoAgain = Second
};
I would like to programmatically generate a map from any given enum definition, where the key is the enum element's name, and the value is the enum element's numerical value (e.g. myMap["TwoAgain"] == 2)
So far, I know how to traverse the source file using clang_visitChildren(), and extract individual tokens using clang_tokenize(). Recursing through the AST, I get cursors/tokens in this order:
"MyEnum" (CXType_Enum)
"First" (CXToken_Identifier)
"=" (CXToken_Punctuation)
"1" (CXToken_Literal)
"unsigned int" (CXType_UInt)
"1" (CXToken_Literal)
"MyEnum" (CXType_Enum)
"Second" (CXToken_Identifier)
"MyEnum" (CXType_Enum)
"Third" (CXToken_Identifier)
"MyEnum" (CXType_Enum)
"TwoAgain" (CXToken_Identifier)
"=" (CXToken_Punctuation)
"Second" (CXToken_Identifier)
"unsigned int" (CXType_UInt)
"Second" (CXToken_Identifier)
I guess I could write an algorithm that uses this information to calculate every value. However, I was wondering if there's a simpler way? Can I get the numerical values directly from the libclang API?

libclang exposes this information through clang_getEnumConstantDeclValue and clang_getEnumConstantDeclUnsignedValue. A map like you describe can be built by visiting the children of a CXCursor_EnumDecl:
static enum CXChildVisitResult VisitCursor(CXCursor cursor, CXCursor parent, CXClientData client_data) {
if (cursor.kind == CXCursor_EnumConstantDecl) {
CXString spelling = clang_getCursorSpelling(cursor);
myMap[clang_getCString(spelling)] = clang_getEnumConstantDeclValue(cursor);
clang_disposeString(spelling);
}
return CXChildVisit_Continue;
}

As id256 said, I don't think you can do this with libclang. However, Clang's libtooling and plugin interface allow you to access the AST and operate on that directly. For enums, you'll want to look at the EnumDecl class, which allows you to iterate over the inner decls. Then it's just a case of building up a map like:
for (auto declIterator = myEnumDecl.decls_begin();
declIterator != myEnumDecl.decls_end();
++declIterator)
{
myMap[declIterator->getNameAsString()] = declIterator->getInitVal;
}

You can use following way in get name and value for your map. i am using clang 8.
bool VisitEnumDecl(EnumDecl *ED)
{
for (auto it = ED->enumerator_begin(); it != ED->enumerator_end(); it++)
{
std::cout <<it->getNameAsString()<<" "<<it->getInitVal().getSExtValue()<<std::endl;
}
return true;
}

Related

I'm trying to create an enum but I get "identifier expected error"

I have this code:
enum EXECUTION_COMMANDS {
"buy" = OP_BUY,
"sell" = OP_SELL,
"buyLimit" = OP_BUYLIMIT,
"sellLimit" = OP_SELLLIMIT,
"buyStop" = OP_BUYSTOP,
"sellStop" = OP_SELLSTOP
};
So basically what I want to create is when program sees the string "buy", it has to read it as OP_BUY. Since OP_BUY is an internal command on MQL4, how do I do this? Is there another way to do this?

enum EXECUTION_COMMANDS
{
buy = OP_BUY,
...
};
Now the identifier buy is basically a named integer constant with the same value as OP_BUY. You can use buy and OP_BUY as aliases for each other.
If you really want to use strings then you need to create a map, that maps the strings into their integer values:
std::unordered_map<std::string, int> command_map = {
{ "buy", OP_BUY },
...
};
Then to use it use command_map["buy"] which will return the int value of OP_BUY.

Just take out the quotes.
The names of the enumerators should be identifiers, not strings.
enum EXECUTION_COMMANDS
{
buy = OP_BUY,
sell = OP_SELL,
buyLimit = OP_BUYLIMIT,
sellLimit = OP_SELLLIMIT,
buyStop = OP_BUYSTOP,
sellStop = OP_SELLSTOP
};
But if you wanted this to replace actual string literals in your source code, you're going to be disappointed. Either do not use string literals or, if you can't change the input being strings, introduce some mapping using a std::map<std::string, int>.

Iterate over all structs in a module

I'm writing a ModulePass and I need to analyze every struct defined in the given module.
I understand that identified structs with a name are inserted in the ValueSymbolTable, but how can I iterate over all the other structs (identified with no name and literal structs)?

The correct way of doing this is:
#include "llvm/IR/TypeFinder.h"
llvm::TypeFinder StructTypes;
StructTypes.run(M, true);
for (auto *STy : StructTypes)
STy->dump();
You should not use any private/opaque types (like LLVMContextImpl) whose headers are not published.

The LLVMContextImpl instance associated with your current context should have two data structures, one containing all the identified structs in the current context (whether or not they have an explicit name), and the other containing all the literal structs.
To get the LLVMContextImpl instance:
Module& M = ...
LLVMContextImpl* C = M.getContext().pImpl;
For the identified structs:
C->NamedStructTypes
For the literal structs:
C->AnonStructTypes
Both return iterable types (StringMap for the first, DenseMap for the second), allowing you to iterate over them and get all the types out.

bool runOnModule(Module &M) override
{
for(auto *S : M.getIdentifiedStructTypes())
{
S->dump();
}
return false;
}

Complementing Oak's answer, here's a more complete code example:
Module& M = ...
LLVMContextImpl* C = M.getContext().pImpl;
for (StringMap<StructType *>::iterator i = C->NamedStructTypes.begin(); i != C->NamedStructTypes.end(); ++i)
{
StructType *t = i->getValue();
t->dump(); fprintf(stderr, "\n");
}
LLVMContextImpl.h, being the header for a private implementation, isn't one of LLVM's public headers. You can get it from the LLVM source code — either copy it from there into your header search path or, for quick & dirty testing, do:
#include "/path/to/llvm/src/lib/VMCore/LLVMContextImpl.h"

luabind: cannot retrieve values from table indexed by non-built-in classes‏

I'm using luabind 0.9.1 from Ryan Pavlik's master distribution with Lua 5.1, cygwin on Win XP SP3 + latest patches x86, boost 1.48, gcc 4.3.4. Lua and boost are cygwin pre-compiled versions.
I've successfully built luabind in both static and shared versions.
Both versions pass all the tests EXCEPT for the test_object_identity.cpp test which fails in both versions.
I've tracked down the problem to the following issue:
If an entry in a table is created for NON built-in class (i.e., not int, string, etc), the value CANNOT be retrieved.
Here's a code piece that demonstrates this:
#include "test.hpp"
#include <luabind/luabind.hpp>
#include <luabind/detail/debug.hpp>
using namespace luabind;
struct test_param
{
int obj;
};
void test_main(lua_State* L)
{
using namespace luabind;
module(L)
[
class_<test_param>("test_param")
.def_readwrite("obj", &test_param::obj)
];
test_param temp_object;
object tabc = newtable(L);
tabc[1] = 10;
tabc[temp_object] = 30;
TEST_CHECK( tabc[1] == 10 ); // passes
TEST_CHECK( tabc[temp_object] == 30 ); // FAILS!!!
}
tabc[1] is indeed 10 while tabc[temp_object] is NOT 30! (actually, it seems to be nil)
However, if I use iterate to go over tabc entries, there're the two entries with the CORRECT key/value pairs.
Any ideas?
BTW, overloading the == operator like this:
#include <luabind/operator.hpp>
struct test_param
{
int obj;
bool operator==(test_param const& rhs) const
{
return obj == rhs.obj;
}
};
and
module(L)
[
class_<test_param>("test_param")
.def_readwrite("obj", &test_param::obj)
.def(const_self == const_self)
];
Doesn't change the result.
I also tried switching to settable() and gettable() from the [] operator. The result is the same. I can see with the debugger that default conversion of the key is invoked, so I guess the error arises from somewhere therein, but it's beyond me to figure out what exactly the problem is.
As the following simple test case show, there're definitely a bug in Luabind's conversion for complex types:
struct test_param : wrap_base
{
int obj;
bool operator==(test_param const& rhs) const
{ return obj == rhs.obj ; }
};
void test_main(lua_State* L)
{
using namespace luabind;
module(L)
[
class_<test_param>("test_param")
.def(constructor<>())
.def_readwrite("obj", &test_param::obj)
.def(const_self == const_self)
];
object tabc, zzk, zzv;
test_param tp, tp1;
tp.obj = 123456;
// create new table
tabc = newtable(L);
// set tabc[tp] = 5;
// o k v
settable( tabc, tp, 5);
// get access to entry through iterator() API
iterator zzi(tabc);
// get the key object
zzk = zzi.key();
// read back the value through gettable() API
// o k
zzv = gettable(tabc, zzk);
// check the entry has the same value
// irrespective of access method
TEST_CHECK ( *zzi == 5 &&
object_cast<int>(zzv) == 5 );
// convert key to its REAL type (test_param)
tp1 = object_cast<test_param>(zzk);
// check two keys are the same
TEST_CHECK( tp == tp1 );
// read the value back from table using REAL key type
zzv = gettable(tabc, tp1);
// check the value
TEST_CHECK( object_cast<int>(zzv) == 5 );
// the previous call FAILS with
// Terminated with exception: "unable to make cast"
// this is because gettable() doesn't return
// a TRUE value, but nil instead
}
Hopefully, someone smarter than me can figure this out,
Thx
I've traced the problem to the fact that Luabind creates a NEW DISTINCT object EVERY time you use a complex value as key (but it does NOT if you use a primitive one or an object).
Here's a small test case that demonstrates this:
struct test_param : wrap_base
{
int obj;
bool operator==(test_param const& rhs) const
{ return obj == rhs.obj ; }
};
void test_main(lua_State* L)
{
using namespace luabind;
module(L)
[
class_<test_param>("test_param")
.def(constructor<>())
.def_readwrite("obj", &test_param::obj)
.def(const_self == const_self)
];
object tabc, zzk, zzv;
test_param tp;
tp.obj = 123456;
tabc = newtable(L);
// o k v
settable( tabc, tp, 5);
iterator zzi(tabc), end;
std::cerr << "value = " << *zzi << "\n";
zzk = zzi.key();
// o k v
settable( tabc, tp, 6);
settable( tabc, zzk, 7);
for (zzi = iterator(tabc); zzi != end; ++zzi)
{
std::cerr << "value = " << *zzi << "\n";
}
}
Notice how tabc[tp] first has the value 5 and then is overwritten with 7 when accessed through the key object. However, when accessed AGAIN through tp, a new entry gets created. This is why gettable() fails subsequently.
Thx,
David

Disclaimer: I'm not an expert on luabind. It's entirely possible I've missed something about luabind's capabilities.
First of all, what is luabind doing when converting test_param to a Lua key? The default policy is copy. To quote the luabind documentation:
This will make a copy of the parameter. This is the default behavior when passing parameters by-value. Note that this can only be used when passing from C++ to Lua. This policy requires that the parameter type has an accessible copy constructor.
In pratice, what this means is that luabind will create a new object (called "full userdata") which is owned by the Lua garbage collector and will copy your struct into it. This is a very safe thing to do because it no longer matters what you do with the c++ object; the Lua object will stick around without really any overhead. This is a good way to do bindings for by-value sorts of objects.
Why does luabind create a new object each time you pass it to Lua? Well, what else could it do? It doesn't matter if the address of the passed object is the same, because the original c++ object could have changed or been destroyed since it was first passed to Lua. (Remember, it was copied to Lua by value, not by reference.) So, with only ==, luabind would have to maintain a list of every object of that type which had ever been passed to Lua (possibly weakly) and compare your object against each one to see if it matches. luabind doesn't do this (nor do I think should it).
Now, let's look at the Lua side. Even though luabind creates two different objects, they're still equal, right? Well, the first problem is that, besides certain built-in types, Lua can only hold objects by reference. Each of those "full userdata" that I mentioned before is actually a pointer. That means that they are not identical.
But they are equal, if we define an __eq meta operation. Unfortunately, Lua itself simply does not support this case. Userdata when used as table keys are always compared by identity, no matter what. This actually isn't special for userdata; it is also true for tables. (Note that to properly support this case, Lua would need to override the hashcode operation on the object in addition to __eq. Lua also does not support overriding the hashcode operation.) I can't speak for the authors of Lua why they did not allow this (and it has been suggested before), but there it is.
So, what are the options?
The simplest thing would be to convert test_param to an object once (explicitly), and then use that object to index the table both times. However, I suspect that while this fixes your toy example, it isn't very helpful in practice.
Another option is simply not to use such types as keys. Actually, I think this is a very good suggestion, since this kind of light-weight binding is quite useful, and the only other option is to discard it.
It looks like you can define a custom conversion on your type. In your example, it might be reasonable to convert your type to a Lua number which will behave well as a table index.
Use a different kind of binding. There will be some overhead, but if you want identity, you'll have to live with it. It sounds like luabind has some support for wrappers, which you may need to use to preserve identity:
When a pointer or reference to a registered class with a wrapper is passed to Lua, luabind will query for it's dynamic type. If the dynamic type inherits from wrap_base, object identity is preserved.

c++ map type of attribute

I have a map, where string representing the name of attribute and the second value representing the type, which this attribute should have.
map.insert(make_pair("X", iDatatype::iUint));
map.insert(make_pair("Y", iDatatype::iUint));
map.insert(make_pair("RADIANT", iDatatype::iFloat));
where iDatatype is just enumeration of all possible types.
typedef enum
{
iUnknown = 0,
iByte = 1,
iChar = 2,
iUChar = 3,
iShort = 4.....
} iDatatype;
If the program gets the command to create, for example, "RADIANT" than it look at the map, find the iDatatype value (iter->second) and go to switch case.
switch (iter->second) {
case iDatatype::iUint:
uint value = ......
// You gotta do what you gonna do
break;
} .......
In Switch case, the function, which depends on type of attribute, will be called.
This code works. But I am not sure, if it the best solution to map string with the types.
And the problem that I don't know what should I look for? Could you recommend what methods or techniques are commonly used for such purpose? Thank you a lot.

Unless you need the map for some other reference, another approach will be:
if(command == "X" || command == "Y") // make sure command is std::string
// or use strcmp
{
// create uint
}
else if(command == "RADIANT")
{
// create float
}
However I'm not sure how much faster this will be than using a map, because a map uses binary search while this uses iterative search.
If you want to gain the boost of binary search while no need for another enum you can use a map of functions:
std::map<std::string, std::function<void()> map;
map.insert(make_pair("X", &create_uint));
map.insert(make_pair("Y", &create_uint));
map.insert(make_pair("RADIANT", &create_float));
and later call it like:
std::string key = "X";
map[key]();
you can also pass parameters to it like:
void create_uint(std::string param) { /* ... */ }
std::map<std::string, std::function<void(std::string)> map;
map.insert(make_pair("X", &create_uint));
std::string key = "X";
std::string param = "XYZ";
map[key](param);

C/C++: any way to get reflective enums?

I've encountered this situation so many times...
enum Fruit {
Apple,
Banana,
Pear,
Tomato
};
Now I have Fruit f; // banana and I want to go from f to the string "Banana"; or I have string s = "Banana" and from that I want to go to Banana // enum value or int.
So far I've been doing this.. Assuming the enum is in Fruit.h:
// Fruit.cpp
const char *Fruits[] = {
"Apple",
"Banana",
"Pear",
"Tomato",
NULL
};
Obviously that's a messy solution. If a developer adds a new fruit to the header and doesn't add a new entry in Fruits[] (can't blame him, they have to be in two different files!) the application goes boom.
Is there a simple way to do what I want, where everything is in one file? Preprocessor hacks, alien magic, anything..
PS: This, contrary to reflection "for everything", would be really trivial to implement in compilers. Seeing how common a problem it is (at least for me) I really can't believe there is no reflective enum Fruit.. Not even in C++0x.
PS2: I'm using C++ but I tagged this question as C as well because C has the same problem. If your solution includes C++ only things, that's ok for me.

This one requires the fruits to be defined in an external file.
This would be the content of fruit.cpp:
#define FRUIT(name) name
enum Fruit {
#include "fruit-defs.h"
NUM_FRUITS
};
#undef FRUIT
#define FRUIT(name) #name
const char *Fruits [] = {
#include "fruit-defs.h"
NULL
};
#undef FRUIT
And this would be fruit-defs.h:
FRUIT(Banana),
FRUIT(Apple),
FRUIT(Pear),
FRUIT(Tomato),
It works as long as the values start in 0 and are consecutive...
Update: mix this solution with the one from Richard Pennington using C99 if you need non-consecutive values. Ie, something like:
// This would be in fruit-defs.h
FRUIT(Banana, 7)
...
// This one for the enum
#define FRUIT(name, number) name = number
....
// This one for the char *[]
#define FRUIT(name, number) [number] = #name

A c99 way that I've found helps reduce mistakes:
enum Fruit {
APPLE,
BANANA
};
const char* Fruits[] = {
[APPLE] = "APPLE",
[BANANA] = "BANANA"
};
You can add enums, even in the middle, and not break old definitions. You can still get NULL strings for values you forget, of course.

One trick I've done in the past is to add an extra enum and then do a compile time assert (such as Boost's) to make sure the two are kept in sync:
enum Fruit {
APPLE,
BANANA,
// MUST BE LAST ENUM
LAST_FRUIT
};
const char *FruitNames[] =
{
"Apple",
"Banana",
};
BOOST_STATIC_ASSERT((sizeof(FruitNames) / sizeof(*FruitNames)) == LAST_FRUIT);
This will at least prevent someone from forgetting to add to both the enum and the name array and will let them know as soon as they try to compile.

One comment on the macro solution - you don't need a separate file for the enumerators. Just use another macro:
#define FRUITs \
FRUIT(Banana), \
FRUIT(Apple), \
FRUIT(Pear), \
FRUIT(Tomato)
(I would probably leave the commas out, though, and incorporate them into the FRUIT macro as needed.)

There is also Better Enums, which is a head-only library (file) that requires C++11 and is licensed under the BSD software license. Official description:
Reflective compile-time enums for C+: Better Enums is a single, lightweight header file that makes your compiler generate reflective enum types.
Here is the code example from the official website:
#include <enum.h>
BETTER_ENUM(Channel, int, Red = 1, Green, Blue)
Channel c = Channel::_from_string("Red");
const char *s = c._to_string();
size_t n = Channel::_size();
for (Channel c : Channel::_values()) {
run_some_function(c);
}
switch (c) {
case Channel::Red: // ...
case Channel::Green: // ...
case Channel::Blue: // ...
}
Channel c = Channel::_from_integral(3);
constexpr Channel c =
Channel::_from_string("Blue");
It looks very promising, though I haven't tested it yet. In addition there are plenty of (custom) reflection libraries for C++. I hope something similar to Better Enums will be part of the Standard Template Library (STL) (or at least Boost), sooner or later.

What if you did something like this?
enum Fruit {
Apple,
Banana,
NumFruits
};
const char *Fruits[NumFruits] = {
"Apple",
"Banana",
};
Then if you add a new entry to the Fruit enum, your compiler should complain that there are insufficient entries in the initializer of the array, so you would be forced to add an entry to the array.
So it protects you from having the array be the wrong size, but it doesn't help you ensure that the strings are correct.

Take a look at Metaresc library https://github.com/alexanderchuranov/Metaresc
It provides interface for types declaration that will also generate meta-data for the type. Based on meta-data you can easily serialize/deserialize objects of any complexity. Out of the box you can serialize/deserialize XML, JSON, XDR, Lisp-like notation, C-init notation.
Here is a simple example:
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include "metaresc.h"
TYPEDEF_ENUM (fruit_t,
Apple,
Banana,
Pear,
Tomato,
);
int main (int argc, char * argv[])
{
mr_td_t * tdp = mr_get_td_by_name ("fruit_t");
if (tdp)
{
int i;
for (i = 0; i < tdp->fields_size / sizeof (tdp->fields[0]); ++i)
printf ("[%" SCNd64 "] = %s\n", tdp->fields[i].fdp->param.enum_value, tdp->fields[i].fdp->name.str);
}
return (EXIT_SUCCESS);
}
This program will output
$ ./enum
[0] = Apple
[1] = Banana
[2] = Pear
[3] = Tomato
Library works fine for latest gcc and clang.

Could make a class structure for it:
class Fruit {
int value; char const * name ;
protected:
Fruit( int v, char const * n ) : value(v), name(n) {}
public:
int asInt() const { return value ; }
char const * cstr() { return name ; }
} ;
#define MAKE_FRUIT_ELEMENT( x, v ) class x : public Fruit { x() : Fruit( v, #x ) {} }
// Then somewhere:
MAKE_FRUIT_ELEMENT(Apple, 1);
MAKE_FRUIT_ELEMENT(Banana, 2);
MAKE_FRUIT_ELEMENT(Pear, 3);
Then you can have a function that takes a Fruit, and it will even be more type safe.
void foo( Fruit f ) {
std::cout << f.cstr() << std::endl;
switch (f.asInt()) { /* do whatever * } ;
}
The sizeof this is 2x bigger than just an enum. But more than likely that doesn't matter.

As the other people answering the question have shown, there isn't really a clean ("D.R.Y.") way to do this using the C preprocessor alone. The problem is that you need to define an array of size of your enum containing strings corresponding to each enum value, and the C preprocessor isn't smart enough to be able to do that. What I do is to create a text file something like this:
%status ok
%meaning
The routine completed its work successfully.
%
%status eof_reading_content
%meaning
The routine encountered the end of the input before it expected
to.
%
Here %'s mark delimiters.
Then a Perl script, the working part of which looks like this,
sub get_statuses
{
my ($base_name, $prefix) = #_;
my #statuses;
my $status_txt_file = "$base_name.txt";
my $status_text = file_slurp ($status_txt_file);
while ($status_text =~
m/
\%status\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\n
\%meaning\s*(.*?)\s*\n\%\s*\n
/gxs) {
my ($code, $meaning) = ($1, $2);
$code = $prefix."_$code";
$meaning =~ s/\s+/ /g;
push #statuses, [$code, $meaning];
}
return #statuses;
}
reads this file and writes a header file:
typedef enum kinopiko_status {
kinopiko_status_ok,
kinopiko_status_eof_reading_content,
and a C file:
/* Generated by ./kinopiko-status.pl at 2009-11-09 23:45. */
#include "kinopiko-status.h"
const char * kinopiko_status_strings[26] = {
"The routine completed its work successfully.",
"The routine encountered the end of the input before it expected to. ",
using the input file at the top. It also calculates the number 26 here by counting the input lines. (There are twenty-six possible statuses in fact.)
Then the construction of the status string file is automated using make.

I don't like macro solutions, in general, though I admit it's kind of difficult there to avoid them.
Personally I opted for a custom class to wrap my enums in. The goal was to offer a bit more that traditional enums (like iteration).
Under the cover, I use a std::map to map the enum to its std::string counterpart. Then I can use this to both iterate over the enum and "pretty print" my enum or initialize it from a string read in a file.
The problem, of course, is the definition, since I have to first declare the enum and then map it... but that's the price you pay for using them.
Also, I then use not a real enum, but a const_iterator pointing to the map (under the covers) to represent the enum value (with end representing an invalid value).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js