libclang get primitive value - c++

How can I get the value of a primitive literal using libclang?
For example, if I have a CXCursor of cursor kind CXCursor_IntegerLiteral, how can I extract the literal value?
UPDATE:
I've run into so many problems using libclang. I highly recommend avoiding it entirely and using the C++ interface that clang provides instead. The C++ interface is highly usable and very well documented: http://clang.llvm.org/doxygen/annotated.html
The only purpose I see of libclang now is to generate the ASTUnit object for you as with the following code (it's not exactly easy otherwise):
ASTUnit * astUnit;
{
index = clang_createIndex(0, 0);
tu = clang_parseTranslationUnit(
index, 0,
clangArgs, nClangArgs,
0, 0, CXTranslationUnit_None
);
astUnit = static_cast<ASTUnit *>(tu->TUData);
}
Now you might say that libclang is stable and the C++ interface isn't. That hardly matters, as the time you spend figuring out the AST with libclang and creating kludges with it wastes so much of your time anyway. I'd just as soon spend a few hours fixing up code that does not compile after a version upgrade (if even needed).

Instead of reparsing the original, you already have all the information you need inside the translation unit:
if (kind == CXCursor_IntegerLiteral)
{
CXSourceRange range = clang_getCursorExtent(cursor);
CXToken *tokens = 0;
unsigned int nTokens = 0;
clang_tokenize(tu, range, &tokens, &nTokens);
for (unsigned int i = 0; i < nTokens; i++)
{
CXString spelling = clang_getTokenSpelling(tu, tokens[i]);
printf("token = %s\n", clang_getCString(spelling));
clang_disposeString(spelling);
}
clang_disposeTokens(tu, tokens, nTokens);
}
You will see that the first token is the integer itself; the next one is not relevant (e.g. it's ; for int i = 42;).
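The token spelling is still text, so it has to be converted to get a numeric value. A minimal sketch of that step, assuming the spelling is a well-formed integer literal that fits in unsigned long long; parseIntegerLiteral is a hypothetical helper, not part of libclang:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: converts the spelling of a C/C++ integer literal
// token (e.g. "42", "0x2A", "052", "0b101010", "1'000u") to its value.
// Assumes a well-formed literal; std::stoull throws otherwise.
unsigned long long parseIntegerLiteral(std::string spelling) {
    // Drop C++14 digit separators such as 1'000.
    spelling.erase(std::remove(spelling.begin(), spelling.end(), '\''),
                   spelling.end());

    // std::stoull does not understand the 0b prefix, so handle it here.
    if (spelling.size() > 2 && spelling[0] == '0' &&
        (spelling[1] == 'b' || spelling[1] == 'B')) {
        return std::stoull(spelling.substr(2), nullptr, 2);
    }

    // Base 0 auto-detects decimal, octal (leading 0) and hex (0x).
    // Parsing stops at the first suffix character such as u or L.
    return std::stoull(spelling, nullptr, 0);
}
```

You would pass it clang_getCString(spelling) for the first token, before disposing of the CXString.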

If you have access to a CXCursor, you can make use of the clang_Cursor_Evaluate function, for example:
CXChildVisitResult var_decl_visitor(
CXCursor cursor, CXCursor parent, CXClientData data) {
auto kind = clang_getCursorKind(cursor);
switch (kind) {
case CXCursor_IntegerLiteral: {
auto res = clang_Cursor_Evaluate(cursor);
auto value = clang_EvalResult_getAsInt(res);
clang_EvalResult_dispose(res);
std::cout << "IntegerLiteral " << value << std::endl;
break;
}
default:
break;
}
return CXChildVisit_Recurse;
}
Outputs:
IntegerLiteral 42

I found a way to do this by referring to the original files:
std::string getCursorText (CXCursor cur) {
CXSourceRange range = clang_getCursorExtent(cur);
CXSourceLocation begin = clang_getRangeStart(range);
CXSourceLocation end = clang_getRangeEnd(range);
CXFile cxFile;
unsigned int beginOff;
unsigned int endOff;
clang_getExpansionLocation(begin, &cxFile, 0, 0, &beginOff);
clang_getExpansionLocation(end, 0, 0, 0, &endOff);
ClangString filename = clang_getFileName(cxFile);
unsigned int textSize = endOff - beginOff;
FILE * file = fopen(filename.c_str(), "rb"); // binary mode, so byte offsets are not shifted by newline translation
if (file == 0) {
exit(ExitCode::CANT_OPEN_FILE);
}
fseek(file, beginOff, SEEK_SET);
char buff[4096];
char * pBuff = buff;
if (textSize + 1 > sizeof(buff)) {
pBuff = new char[textSize + 1];
}
pBuff[textSize] = '\0';
fread(pBuff, 1, textSize, file);
std::string res(pBuff);
if (pBuff != buff) {
delete [] pBuff;
}
fclose(file);
return res;
}

You can actually use a combination of libclang and the C++ interface.
The libclang CXCursor type contains a data field which contains references to the underlying AST nodes.
I was able to successfully access the IntegerLiteral value by casting data[1] to the IntegerLiteral type.
I'm implementing this in Nim so I will provide Nim code, but you can likely do the same in C++.
let literal = cast[clang.IntegerLiteral](cursor.data[1])
echo literal.getValue().getLimitedValue()
The IntegerLiteral type is wrapped like so:
type
APIntObj* {.importcpp: "llvm::APInt", header: "llvm/ADT/APInt.h".} = object
# https://github.com/llvm-mirror/llvm/blob/master/include/llvm/ADT/APInt.h
APInt* = ptr APIntObj
IntegerLiteralObj* {.importcpp: "clang::IntegerLiteral", header: "clang/AST/Expr.h".} = object
IntegerLiteral* = ptr IntegerLiteralObj
proc getValue*(i: IntegerLiteral): APIntObj {.importcpp: "#.getValue()".}
# This is implemented by the superclass: https://clang.llvm.org/doxygen/classclang_1_1APIntStorage.html
proc getLimitedValue*(a: APInt | APIntObj): culonglong {.importcpp: "#.getLimitedValue()".}
Hope this helps someone :)

Related

a value of type "void *" cannot be assigned to an entity of type "RANDOMSTRUCT *"

I am working with malloc, which returns void *, and I have this code:
int iInitRandomPhaseArrays(WS_ELEMENT *Aufbau, RANDOMSTRUCT **random)
{
WS_ELEMENT *Act;
int iCounter = 0, i;
RANDOMSTRUCT *dummy;
Act = Aufbau;
if (*random != NULL)
return -1;
while (Act != NULL)
{
if (Act->operation == Linsenarray)
iCounter++;
Act = Act->pNext;
}
if (iCounter)
{
dummy = malloc(iCounter * sizeof(random));
ran1_3ARG(&ran1_idum, &ran1_iy, ran1_iv);
dummy[0].idum = ran1_idum;
dummy[0].iy = ran1_iy;
memcpy(dummy[0].iv, ran1_iv, sizeof(ran1_iv));
for (i = 0; i < iCounter; i++)
ran1_3ARG(&dummy[i].idum, &dummy[i].iy, dummy[i].iv);
dummy[0].Anzahl = iCounter;
*random = dummy;
}
return iCounter;
}
Here is the error:
a value of type "void *" cannot be assigned to an entity of type "RANDOMSTRUCT *"
Can anyone help me solve it?
Change the line:
dummy = malloc(iCounter * sizeof(random));
to say:
dummy = (RANDOMSTRUCT *)malloc(iCounter * sizeof(RANDOMSTRUCT));
dummy = malloc(iCounter * sizeof(random));
This allocates the wrong amount of memory (a multiple of the pointer size, not of the pointed-to type's size) and returns a void*. In C++, void* doesn't implicitly convert to other pointer types. In C it does.
Assuming you actually mean to use C-isms in C++ code, write this:
template<class T>
T* typed_malloc( std::size_t count = 1 ) {
return static_cast<T*>(malloc( sizeof(T)*count ));
}
This function is a type-safe version of malloc that handles 9999/10000 uses and prevents an annoying class of bugs.
Then change the line of code to:
dummy = typed_malloc<RANDOMSTRUCT>(iCounter);
Sometimes malloc isn't easy to remove from C++ code, because your code interacts with C code. This kind of change can eliminate bugs before they happen as you port C code to C++ relatively transparently.
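To make the size difference concrete, here is a small sketch. Sample is a hypothetical stand-in for RANDOMSTRUCT (whose definition is not shown in the question), and the two byte-count helpers exist only for illustration:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for RANDOMSTRUCT; the real layout is not shown.
struct Sample {
    long idum;
    long iy;
    long iv[32];
};

// Type-safe wrapper from the answer above.
template <class T>
T* typed_malloc(std::size_t count = 1) {
    return static_cast<T*>(std::malloc(sizeof(T) * count));
}

// What the original line asks for: `random` is a pointer to a pointer,
// so sizeof(random) is the size of a pointer, not of the struct.
std::size_t bytesRequestedOriginally(Sample** random, std::size_t count) {
    return count * sizeof(random);
}

// What the array of structs actually needs.
std::size_t bytesNeeded(std::size_t count) {
    return count * sizeof(Sample);
}
```

Memory obtained from typed_malloc still has to be released with free, and the result should be checked for a null pointer before use.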

C++ from linux to windows: 'does not evaluate to a constant'

I am trying to port this function from Linux to windows:
template<class TDescriptor, class F>
bool TemplatedVocabulary<TDescriptor,F>::loadFromBinaryFile(const std::string &filename) {
fstream f;
f.open(filename.c_str(), ios_base::in|ios::binary);
unsigned int nb_nodes, size_node;
f.read((char*)&nb_nodes, sizeof(nb_nodes));
f.read((char*)&size_node, sizeof(size_node));
f.read((char*)&m_k, sizeof(m_k));
f.read((char*)&m_L, sizeof(m_L));
f.read((char*)&m_scoring, sizeof(m_scoring));
f.read((char*)&m_weighting, sizeof(m_weighting));
createScoringObject();
m_words.clear();
m_words.reserve(pow((double)m_k, (double)m_L + 1));
m_nodes.clear();
m_nodes.resize(nb_nodes+1);
m_nodes[0].id = 0;
char buf[size_node];// fails
int nid = 1;
while (!f.eof()) {
f.read(buf, size_node);
m_nodes[nid].id = nid;
// FIXME
const int* ptr=(int*)buf;
m_nodes[nid].parent = *ptr;
//m_nodes[nid].parent = *(const int*)buf;
m_nodes[m_nodes[nid].parent].children.push_back(nid);
m_nodes[nid].descriptor = cv::Mat(1, F::L, CV_8U);
memcpy(m_nodes[nid].descriptor.data, buf+4, F::L);
m_nodes[nid].weight = *(float*)(buf+4+F::L);
if (buf[8+F::L]) { // is leaf
int wid = m_words.size();
m_words.resize(wid+1);
m_nodes[nid].word_id = wid;
m_words[wid] = &m_nodes[nid];
}
else
m_nodes[nid].children.reserve(m_k);
nid+=1;
}
f.close();
return true;
}
This line:
char buf[size_node];
will not compile, giving the error:
expression did not evaluate to a constant.
I have tried using:
std::vector<char> buf(size_node)
and:
char buf[size_node] = new char[];
but I see the same error. It seems like this is related to a run time constant vs compile time constant, as stated in the answer here:
Tuple std::get() Not Working for Variable-Defined Constant
But I am not sure how to get around it in this case. Thank you.
It should be
char *buf = new char[size_node];
Remember to delete the memory after use.
Or, just use std::vector. It's much safer.
std::vector<char> buf(size_node);
Then you'd have to change how buf is used. For example:
f.read(buf, size_node);
should become
f.read(buf.data(), size_node); // std::vector::data() requires C++11 or later

Create a function with unique function pointer in runtime

When calling WinAPI functions that take callbacks as arguments, there's usually a special parameter to pass some arbitrary data to the callback. In case there's no such thing (e.g. SetWinEventHook), the only way we can understand which of the API calls resulted in the call of the given callback is to have distinct callbacks. When we know at compile time all the cases in which the given API is called, we can always create a class template with a static method and instantiate it with different template arguments at different call sites. That's a lot of work, and I'd rather avoid it.
How do I create callback functions at runtime so that they have different function pointers?
I saw a solution (sorry, in Russian) with runtime assembly generation, but it wasn't portable across x86/x64 architectures.
You can use the closure API of libffi. It allows you to create trampolines each with a different address. I implemented a wrapping class here, though that's not finished yet (only supports int arguments and return type, you can specialize detail::type to support more than just int). A more heavyweight alternative is LLVM, though if you're dealing only with C types, libffi will do the job fine.
I've come up with this solution which should be portable (but I haven't tested it):
#define ID_PATTERN 0x11223344
#define SIZE_OF_BLUEPRINT 128 // needs to be adapted if uniqueCallbackBlueprint is complex...
typedef int (__cdecl * UNIQUE_CALLBACK)(int arg);
/* blueprint for unique callback function */
int uniqueCallbackBlueprint(int arg)
{
int id = ID_PATTERN;
printf("%x: Hello unique callback (arg=%d)...\n", id, arg);
return (id);
}
/* create a new unique callback */
UNIQUE_CALLBACK createUniqueCallback(int id)
{
UNIQUE_CALLBACK result = NULL;
char *pUniqueCallback;
char *pFunction;
int pattern = ID_PATTERN;
char *pPattern;
char *startOfId;
int i;
int patterns = 0;
pUniqueCallback = malloc(SIZE_OF_BLUEPRINT);
if (pUniqueCallback != NULL)
{
pFunction = (char *)uniqueCallbackBlueprint;
#if defined(_DEBUG)
pFunction += 0x256; // variable offset depending on debug information????
#endif /* _DEBUG */
memcpy(pUniqueCallback, pFunction, SIZE_OF_BLUEPRINT);
result = (UNIQUE_CALLBACK)pUniqueCallback;
/* replace ID_PATTERN with requested id */
pPattern = (char *)&pattern;
startOfId = NULL;
for (i = 0; i < SIZE_OF_BLUEPRINT; i++)
{
if (pUniqueCallback[i] == *pPattern)
{
if (pPattern == (char *)&pattern)
startOfId = &(pUniqueCallback[i]);
if (pPattern == ((char *)&pattern) + sizeof(int) - 1)
{
pPattern = (char *)&id;
for (i = 0; i < sizeof(int); i++)
{
*startOfId++ = *pPattern++;
}
patterns++;
break;
}
pPattern++;
}
else
{
pPattern = (char *)&pattern;
startOfId = NULL;
}
}
printf("%d pattern(s) replaced\n", patterns);
if (patterns == 0)
{
free(pUniqueCallback);
result = NULL;
}
}
return (result);
}
Usage is as follows:
int main(void)
{
UNIQUE_CALLBACK callback;
int id;
int i;
id = uniqueCallbackBlueprint(5);
printf(" -> id = %x\n", id);
callback = createUniqueCallback(0x4711);
if (callback != NULL)
{
id = callback(25);
printf(" -> id = %x\n", id);
}
id = uniqueCallbackBlueprint(15);
printf(" -> id = %x\n", id);
getch();
return (0);
}
I've noted an interesting behavior when compiling with debug information (Visual Studio). The address obtained by pFunction = (char *)uniqueCallbackBlueprint; is off by a variable number of bytes. The difference can be obtained using the debugger, which displays the correct address. This offset changes from build to build, and I assume it has something to do with the debug information. This is no problem for the release build, so maybe this should be put into a library which is built as "release".
Another thing to consider would be the byte alignment of pUniqueCallback, which may be an issue. But aligning the beginning of the function to a 64-bit boundary is not hard to add to this code.
Within pUniqueCallback you can implement anything you want (remember to update SIZE_OF_BLUEPRINT so you don't miss the tail of your function). The function is compiled and the generated code is reused at runtime. The initial value of id is replaced when creating the unique function so the blueprint function can process it.
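If a fixed upper bound on the number of simultaneous callbacks is acceptable, the class-template idea from the question can be automated without generating or copying machine code. The sketch below is an assumption-laden illustration, not a tested WinAPI solution: CallbackPool and RawCallback are hypothetical names, the int(int) signature mirrors UNIQUE_CALLBACK above, C++14 is required, and the pool is not thread-safe.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical callback signature with no user-data parameter.
using RawCallback = int (*)(int);

// Pre-instantiates N distinct trampoline functions at compile time and
// hands them out at run time, each bound to its own handler.
template <std::size_t N>
class CallbackPool {
public:
    // Binds fn to the next free slot and returns that slot's distinct
    // function pointer, or nullptr when all N slots are taken.
    static RawCallback bind(std::function<int(int)> fn) {
        if (used() == N) {
            return nullptr;
        }
        const std::size_t slot = used()++;
        handlers()[slot] = std::move(fn);
        return table()[slot];
    }

private:
    // One real function per index, so each has its own address.
    template <std::size_t I>
    static int trampoline(int arg) { return handlers()[I](arg); }

    template <std::size_t... Is>
    static std::array<RawCallback, N> makeTable(std::index_sequence<Is...>) {
        return {{&CallbackPool::trampoline<Is>...}};
    }

    static const std::array<RawCallback, N>& table() {
        static const std::array<RawCallback, N> t =
            makeTable(std::make_index_sequence<N>{});
        return t;
    }

    static std::array<std::function<int(int)>, N>& handlers() {
        static std::array<std::function<int(int)>, N> h;
        return h;
    }

    static std::size_t& used() {
        static std::size_t n = 0;
        return n;
    }
};
```

Each call to CallbackPool<16>::bind(...) then yields a different plain function pointer whose handler can capture whatever identifies the call. The trade-off is that the pool size is fixed at compile time and slots are never released in this sketch.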

How to know written var type with Clang using C API instead of actual?

I'm trying to use Clang via the C API, the indexing part to be specific. The problem is that some types are returned not as they are written, but as the compiler sees them. For example "Stream &" becomes "int &" and "byte" becomes "int".
Some test lib:
// TODO make it a subclass of a generic Serial/Stream base class
class FirmataClass
{
public:
FirmataClass(Stream &s);
void setFirmwareNameAndVersion(const char *name, byte major, byte minor);
I'm using the code to get method information:
void showMethodInfo(const CXIdxDeclInfo *info) {
int numArgs = clang_Cursor_getNumArguments(info->cursor);
fprintf(stderr, " %i args:\n", numArgs);
for (int i=0; i<numArgs; i++) {
CXCursor argCursor = clang_Cursor_getArgument(info->cursor, i);
CXString name = clang_getCursorDisplayName(argCursor);
CXString spelling = clang_getCursorSpelling(argCursor);
CXType type = clang_getCursorType(argCursor);
CXString typeSpelling = clang_getTypeSpelling(type);
CXCursorKind kind = clang_getCursorKind(argCursor);
fprintf(stderr, " kind=[%s (%i)], type=[%s], spelling=[%s]\n",
cursor_kinds[kind], kind, clang_getCString(typeSpelling),
clang_getCString(spelling));
clang_disposeString(name);
clang_disposeString(spelling);
clang_disposeString(typeSpelling);
}
// return type
CXType returnType = clang_getCursorResultType(info->cursor);
CXString returnTypeSpelling = clang_getTypeSpelling(returnType);
fprintf(stderr, " returns %s\n", clang_getCString(returnTypeSpelling));
clang_disposeString(returnTypeSpelling);
}
Output:
[105:10 4689] access=[CX_CXXPublic]
kind=[CXIdxEntity_CXXInstanceMethod] (21)
name=[setFirmwareNameAndVersion] is_container=[0] 3 args:
kind=[CXCursor_ParmDecl (10)], type=[const char *], spelling=[name]
kind=[CXCursor_ParmDecl (10)], type=[int], spelling=[major]
kind=[CXCursor_ParmDecl (10)], type=[int], spelling=[minor]
returns void
So you can see that byte function arguments are described as int.
How can I get the actual spelling?
Is byte declared via a typedef, or a #define?
When I declare these types:
typedef int MyType_t;
#define MyType2_t int
class Foo
{
public:
bool bar( MyType_t a, MyType2_t b );
};
And then print the type names I get from clang_getTypeSpelling, this is what I get:
bool Foo_bar( MyType_t a, int b )
Libclang presumably can't print the #defined name because the preprocessor has already replaced it with int by the time the parse tree is built.
I've solved this a few days ago.
"Stream &" becomes "int &" and "byte" becomes "int".
libclang doesn't know what Stream or byte are until you add the standard headers manually using the flag -isystem <pathToStdHeaderDirectory>
I wrote a C# function that retrieves all the visual studio VC headers include directory:
private static string[] GetStdIncludes()
{
using (RegistryKey key = Registry.LocalMachine.OpenSubKey(@"SOFTWARE\Wow6432Node\Microsoft\VisualStudio"))
{
if (key != null)
{
var lastVcVersions = key.GetSubKeyNames()
.Select(s =>
{
float result = 0;
if (float.TryParse(s, System.Globalization.NumberStyles.Float, System.Globalization.CultureInfo.InvariantCulture, out result))
return result;
else return 0F;
}).Where(w => w > 0F)
.OrderByDescending(or => or)
.Select(s => s.ToString("n1", System.Globalization.CultureInfo.InvariantCulture))
.ToArray();
foreach (var v in lastVcVersions)
{
using (var vk = key.OpenSubKey(v))
{
var val = (string)vk.GetValue("Source Directories");
if (!string.IsNullOrEmpty(val))
return val.Split(';');
}
}
}
}
throw new Exception("Couldn't find VC runtime include directories");
}
Hope that helps.
I was having the same issue with my own classes.
You need to pass the same flags you would use for compiling with clang to either clang_parseTranslationUnit or clang_createTranslationUnit, in particular the -I flags which are used to look up the header files where your class or type definitions are.
It seems that if libclang can't find a type declaration, it just defaults all of them to int.
Calling clang_createIndex ( 1, 1 ) should provide you with hints on what you are missing via stderr.
Here is some sample code that works for me now:
int main ( int argc, char* argv[] )
{
const char *clang_args[] =
{
"-I.",
"-I./include",
"-I../include",
"-x",
"c++",
"-Xclang",
"-ast-dump",
"-fsyntax-only",
"-std=c++1y"
};
CXIndex Idx = clang_createIndex ( 1, 1 );
CXTranslationUnit TU = clang_parseTranslationUnit ( Idx, argv[1], clang_args, 9, NULL, 0, CXTranslationUnit_Incomplete | CXTranslationUnit_SkipFunctionBodies );
clang_visitChildren ( clang_getTranslationUnitCursor ( TU ),
TranslationUnitVisitor, NULL );
clang_disposeTranslationUnit ( TU );
return 0;
}
I am trying to get the AST for a header file, hence the CXTranslationUnit_Incomplete | CXTranslationUnit_SkipFunctionBodies flags and the -ast-dump -fsyntax-only command line options. You may want to omit them if you don't need them, and of course add and change the -I parameters according to your needs.
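When the include directories are only known at run time, the argument array can be built instead of hard-coded, which also avoids keeping a literal count such as 9 in sync. A minimal sketch using only the standard library; makeClangArgs and toArgv are hypothetical helper names:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a clang argument list for parsing C++
// with the given language standard and include directories.
std::vector<std::string> makeClangArgs(
        const std::vector<std::string>& includeDirs,
        const std::string& standard) {
    std::vector<std::string> args = {"-x", "c++", "-std=" + standard};
    for (const std::string& dir : includeDirs) {
        args.push_back("-I" + dir);
    }
    return args;
}

// Produces the pointer array the C API expects. The pointers stay valid
// only while `args` is alive and unmodified.
std::vector<const char*> toArgv(const std::vector<std::string>& args) {
    std::vector<const char*> argv;
    argv.reserve(args.size());
    for (const std::string& arg : args) {
        argv.push_back(arg.c_str());
    }
    return argv;
}
```

The result would then be passed as argv.data() and static_cast<int>(argv.size()) in place of clang_args and 9 in the call above.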

Function has corrupt return value

I have a situation in Visual C++ 2008 that I have not seen before. I have a class with 4 STL objects (list and vector to be precise) and integers.
It has a method:
inline int id() { return m_id; }
The return value from this method is corrupt, and I have no idea why.
debugger screenshot http://img687.imageshack.us/img687/6728/returnvalue.png
I'd like to believe it's a stack smash, but as far as I know, I have no buffer overruns or allocation issues.
Some more observations
Here's something that puts me off. The debugger prints right values in the place mentioned // wrong ID.
m_header = new DnsHeader();
assert(_CrtCheckMemory());
if (m_header->init(bytes, size))
{
eprintf("0The header ID is %d\n", m_header->id()); // wrong ID!!!
inside m_header->init()
m_qdcount = ntohs(h->qdcount);
m_ancount = ntohs(h->ancount);
m_nscount = ntohs(h->nscount);
m_arcount = ntohs(h->arcount);
eprintf("The details are %d,%d,%d,%d\n", m_qdcount, m_ancount, m_nscount, m_arcount);
// copy the flags
// this doesn't work with a bitfield struct :(
// memcpy(&m_flags, bytes + 2, sizeof(m_flags));
//unpack_flags(bytes + 2); //TODO
m_init = true;
}
eprintf("Assigning an id of %d\n", m_id); // Correct ID.
return
m_header->id() is an inline function in the header file
inline int id() { return m_id; }
I don't really know how best to post the code snippets I have , but here's my best shot at it. Please do let me know if they are insufficient:
Class DnsHeader has an object m_header inside DnsPacket.
Main body:
DnsPacket *p ;
p = new DnsPacket(r);
assert (_CrtCheckMemory());
p->add_bytes(buf, r); // add bytes to a vector m_bytes inside DnsPacket
if (p->parse())
{
read_packet(sin, *p);
}
p->parse:
size_t size = m_bytes.size(); // m_bytes is a vector
unsigned char *bytes = new u_char[m_bytes.size()];
copy(m_bytes.begin(), m_bytes.end(), bytes);
m_header = new DnsHeader();
eprintf("m_header allocated at %x\n", m_header);
assert(_CrtCheckMemory());
if (m_header->init(bytes, size)) // just set the ID and a bunch of other ints here.
{
size_t pos = DnsHeader::SIZE; // const int
if (pos != size)
; // XXX perhaps generate a warning about extraneous data?
if (ok)
m_parsed = true;
}
else
{
m_parsed = false;
}
if (!ok) {
m_parsed = false;
}
return m_parsed;
}
read_packet:
DnsHeader& h = p.header();
eprintf("The header ID is %d\n", h.id()); // ID is wrong here
...
DnsHeader constructor:
m_id = -1;
m_qdcount = m_ancount = m_nscount = m_arcount = 0;
memset(&m_flags, 0, sizeof(m_flags)); // m_flags is a struct
m_flags.rd = 1;
p.header():
return *m_header;
m_header->init: (u_char* bytes, int size)
header_fmt *h = (header_fmt *)bytes;
m_id = ntohs(h->id);
eprintf("Assigning an id of %d/%d\n", ntohs(h->id), m_id); // ID is correct here
m_qdcount = ntohs(h->qdcount);
m_ancount = ntohs(h->ancount);
m_nscount = ntohs(h->nscount);
m_arcount = ntohs(h->arcount);
You seem to be using a pointer to an invalid class somehow. The return value shown is the value that VS usually uses to initialize memory with:
2^32 - 842150451 = 0xCDCDCDCD
You probably have not initialized the class that this function is a member of.
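The arithmetic above can be checked directly: reinterpreting the signed value as 32 unsigned bits gives the repeated 0xCD byte, which the MSVC debug runtime writes into freshly allocated, uninitialized heap memory. A small sketch; asHex32 is a hypothetical helper name:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: shows the 32-bit pattern behind a signed value.
std::string asHex32(std::int32_t value) {
    // Conversion to unsigned is modular, i.e. value + 2^32 if negative.
    const std::uint32_t bits = static_cast<std::uint32_t>(value);
    char buf[11];  // "0x" + 8 hex digits + terminating NUL
    std::snprintf(buf, sizeof(buf), "0x%08lX",
                  static_cast<unsigned long>(bits));
    return buf;
}
```

Calling asHex32(-842150451) returns "0xCDCDCDCD".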
Without seeing more of the code in context, it is hard to say, but it might be that m_id is not in the scope you expect it to be in.
Reinstalled VC++. That fixed everything.
Thank you for your time and support everybody! :) Appreciate it!