gdb python: turn expression into address - gdb

I want to get the address from a value in the same way that the x examine command works. Internally, this seems to use value_as_address, which turns any gdb expression into a core address. I have not managed to find a binding for this anywhere in the Python API surface. Further, the workarounds I've seen so far don't seem to accomplish this:
It is not reasonable to cast the type to a void pointer because this is language dependent (will not work in Rust executables, for instance):
some_Value.cast(gdb.lookup_type('void').pointer())
will fail if there is no "void" type, as in the case of Rust.
I want to accept integers, which should come out unchanged.
In particular, I want a function that takes strings such as "0x5555555740a0", "main", etc and turns them into an integer containing their address.

Related

Which extension contains glProgramParameteriARB?

According to the documentation glProgramParameteriARB is a part of ARB_geometry_shader4. I have a graphics card which doesn't support ARB_geometry_shader4:
glxinfo | grep ARB_geometry_shader4
When I call glXGetProcAddress((const GLubyte*)"glProgramParameteriARB") I get a function address and everything works fine. Does that mean the documentation has a bug? How can I find the extension that contains glProgramParameteriARB?
glXGetProcAddress can be called without having a current OpenGL context (unlike wglGetProcAddress). As such, the function pointers it returns are independent of the current context. Because of that, it will return valid function pointers for any OpenGL function. It uses delayed binding for this kind of stuff.
If you want to know whether you can use a function pointer, check the extension strings, not whether you get a valid pointer.
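A minimal sketch of such a check, assuming you already have the space-separated extension string (for example from glGetString(GL_EXTENSIONS) or glXQueryExtensionsString). The helper name has_extension is made up for illustration. It compares whole tokens, because a plain substring search would also match longer extension names that merely start with the one you want.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true only if `name` appears as a complete
// space-separated token in `extensions`.
bool has_extension(const std::string& extensions, const std::string& name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) return true;
    }
    return false;
}
```

Only if this returns true for "GL_ARB_geometry_shader4" would it be safe to call the pointer returned by glXGetProcAddress.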

Make BFD library find the location of a class member function

I am using the function bfd_find_nearest_line to find the source location of a function (from an executable with debugging symbols --compiled with -g). Naturally one of the arguments is a pointer to the function I want to locate:
boolean
_bfd_elf_find_nearest_line (abfd,
section,
symbols,
offset,
filename_ptr,
functionname_ptr, // <- HERE!
line_ptr)
https://sourceware.org/ml/binutils/2000-08/msg00248.html
After quite a bit of (pure C) boilerplate, I got this to work with ordinary functions (where the function pointer is cast to void*).
For example, this works:
int my_function(){return 5;}
int main(){
_bfd_elf_find_nearest_line (...,
(void*)(&my_function),
...);
}
The question is if bfd_find_nearest_line can be used to locate the source code of a class member function.
struct A{
int my_member_function(){return 5.;}
};
_bfd_elf_find_nearest_line (...,
what_should_I_put_here??,
...)
Pointers to member functions (in this case of type int (A::*)()) are not ordinary function pointers, and in particular cannot be cast to any function pointer, not even to void*. See here: https://isocpp.org/wiki/faq/pointers-to-members#cant-cvt-memfnptr-to-voidptr
I completely understand the logic behind this; however, the member-function pointer is the only handle I have from which BFD could identify the function. I don't want to use this pointer to call the function.
I know more or less how C++ works, the compiler will generate silently an equivalent free-C function,
__A_my_member_function(A* this){...}
But I don't know how to access the address of this free function, or whether that is even possible, or whether the bfd library will be able to find the source location of the original my_member_function via this pointer.
(For the moment at least I am not interested in virtual functions.)
In other words,
1) I need to know if bfd will be able to locate a member function,
2) and if it can how can I map the member function pointer of type int (A::*)() to an argument that bfd can take (void*).
I know by other means (stack trace) that the pointer exists, for example I can get that the free function is called in this case _ZN1A18my_member_functionEv, but the problem is how I can get this from &(A::my_member_function).
Okay, there's good news and bad news.
The good news: It is possible.
The bad news: It's not straightforward.
You'll need the c++filt utility.
And, some way to read the symbol table of your executable, such as readelf. If you can enumerate the [mangled] symbols with a bfd_* call, you may be able to save yourself a step.
Also, here is a biggie: You'll need the c++ name of your symbol in a text string. So, for &(A::my_member_function), you'll need it in a form: "A::my_member_function()" This shouldn't be too difficult since I presume you have a limited number of them that you care about.
You'll need to get a list of symbols and their addresses from readelf -s <executable>. Be prepared to parse this output. You'll need to decode the hex address from the string to get its binary value.
These will be the mangled names. For each symbol, do c++filt -n mangled_name and capture the output (i.e. a pipe) into something (e.g. nice_name). It will give you back the demangled name (i.e. the nice c++ name you'd like).
Now, if nice_name matches "A::my_member_function()", you have a match: you already have the mangled name but, more importantly, the hex address of the symbol. Feed this hex value [suitably cast] to bfd where you were stuffing functionname_ptr.
Note: The above works but can be slow with repeated invocations of c++filt
A faster way to do this is to capture the piped output of:
readelf -s <executable> | c++filt
It's also [probably] easier to do it this way since you only have to parse the filtered output and look for the matching nice name.
Also, if you had multiple symbols that you cared about, you could get all the addresses in a single invocation.
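A rough sketch of the parsing step, assuming each symbol row of the filtered output has the usual readelf -s columns (Num, Value, Size, Type, Bind, Vis, Ndx, Name). The function name address_if_match and the sample row in the usage note are made up for illustration, and std::optional needs C++17. The demangled name is read as "the rest of the line" because it can contain spaces, e.g. "add(int, int)".

```cpp
#include <cstdint>
#include <exception>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: parses one row of `readelf -s <executable> | c++filt`.
// Returns the symbol's address if the demangled name equals `wanted`,
// otherwise std::nullopt.
std::optional<std::uint64_t> address_if_match(const std::string& line,
                                              const std::string& wanted) {
    std::istringstream in(line);
    std::string num, value, size, type, bind, vis, ndx;
    if (!(in >> num >> value >> size >> type >> bind >> vis >> ndx)) {
        return std::nullopt;  // blank line or unexpected layout
    }
    std::string name;
    std::getline(in >> std::ws, name);  // rest of the line is the name
    if (name != wanted) return std::nullopt;
    try {
        return std::stoull(value, nullptr, 16);  // Value column is hex
    } catch (const std::exception&) {
        return std::nullopt;  // Value column was not a hex number
    }
}
```

For an illustrative row such as "52: 0000000000401136 15 FUNC WEAK DEFAULT 14 A::my_member_function()", a call with wanted set to "A::my_member_function()" would yield 0x401136. You would call this once per line read from the pipe.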
Ok, I found a way. First, I discovered that bfd is pretty happy detecting member functions debug information from member pointers, as long as the pointer can be converted to void*.
I was using clang which wouldn't allow me to cast the member function pointer to any kind of pointer or integer.
GCC allows this conversion but emits a warning.
There is even a flag, -Wno-pmf-conversions, that silences the warning about such pointer-to-member casts.
With that information in mind I did my best to convert a member function pointer into void* and I ended up doing this using unions.
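struct A{
int my_member_function(){return 5;}
};
// Reading casted_value after writing value is not guaranteed by the C++ standard;
// it relies on how the compiler lays out member function pointers.
union void_caster_t{
int (A::*value)();
void* casted_value;
};
void_caster_t void_caster = {&A::my_member_function};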
_bfd_elf_find_nearest_line (...,
void_caster.casted_value,
...)
Finally bfd is able to give me debug information of a member function.
What I haven't figured out yet is how to get a pointer to the constructor and destructor member functions.
For example
void_caster_t void_caster = {&A::~A};
Gives compiler error: "you can't take the address of the destructor".
For the constructor I wasn't even able to find the correct syntax, since this fails as a syntax error.
void_caster_t void_caster = {&A::A};
Again, the rationale for forbidding this is to prevent nonsensical calls through such pointers, but my case is different: I want the pointer (or address) only to look up debug information, not to call anything.

How to modify C++ code from user-input

I am currently writing a program that sits on top of a C++ interpreter. The user inputs C++ commands at runtime, which are then passed into the interpreter. For certain patterns, I want to replace the command given with a modified form, so that I can provide additional functionality.
I want to replace anything of the form
A->Draw(B1, B2)
with
MyFunc(A, B1, B2).
My first thought was regular expressions, but that would be rather error-prone, as any of A, B1, or B2 could be arbitrary C++ expressions. As these expressions could themselves contain quoted strings or parentheses, it would be quite difficult to match all cases with a regular expression. In addition, there may be multiple, nested forms of this expression.
My next thought was to call clang as a subprocess, use "-Xclang -ast-dump" to get the abstract syntax tree, modify that, then rebuild it into a command to be passed to the C++ interpreter. However, this would require keeping track of any environment changes, such as include files and forward declarations, in order to give clang enough information to parse the expression. As the interpreter does not expose this information, this seems infeasible as well.
The third thought was to use the C++ interpreter's own internal parsing to convert to an abstract syntax tree, then build from there. However, this interpreter does not expose the ast in any way that I was able to find.
Are there any suggestions as to how to proceed, either along one of the stated routes, or along a different route entirely?
What you want is a Program Transformation System.
These are tools that generally let you express changes to source code, written in source level patterns that essentially say:
if you see *this*, replace it by *that*
but operating on Abstract Syntax Trees, so the matching and replacement process is far more trustworthy than what you get with string hacking.
Such tools have to have parsers for the source language of interest.
The source language being C++ makes this fairly difficult.
Clang sort of qualifies; after all it can parse C++. OP objects that it cannot do so without all the environment context. To the extent that OP is typing (well-formed) program fragments (statements, etc.) into the interpreter, Clang may [I don't have much experience with it myself] have trouble getting focused on what the fragment is (statement? expression? declaration? ...). Finally, Clang isn't really a PTS; its tree modification procedures are not source-to-source transforms. That matters for convenience but might not stop OP from using it; surface syntax rewrite rules are convenient, but you can always substitute procedural tree hacking with more effort. When there are more than a few rules, this starts to matter a lot.
GCC with Melt sort of qualifies in the same way that Clang does. I'm under the impression that Melt makes GCC at best a bit less intolerable for this kind of work. YMMV.
Our DMS Software Reengineering Toolkit with its full C++14 [EDIT July 2018: C++17] front end absolutely qualifies. DMS has been used to carry out massive transformations on large scale C++ code bases.
DMS can parse arbitrary (well-formed) fragments of C++ without being told in advance what the syntax category is, and return an AST of the proper grammar nonterminal type, using its pattern-parsing machinery. [You may end up with multiple parses, e.g. ambiguities, that you'll have to decide how to resolve; see Why can't C++ be parsed with a LR(1) parser? for more discussion.] It can do this without resorting to "the environment" if you are willing to live without macro expansion while parsing, and insist that the preprocessor directives (they get parsed too) are nicely structured with respect to the code fragment (#if foo{#endif not allowed), but that's unlikely to be a real problem for interactively entered code fragments.
DMS then offers a complete procedural AST library for manipulating the parsed trees (search, inspect, modify, build, replace) and can then regenerate surface source code from the modified tree, giving OP text to feed to the interpreter.
Where it shines in this case is OP can likely write most of his modifications directly as source-to-source syntax rules. For his example, he can provide DMS with a rewrite rule (untested but pretty close to right):
rule replace_Draw(A:primary,B1:expression,B2:expression):
primary->primary
"\A->Draw(\B1, \B2)" -- pattern
rewrites to
"MyFunc(\A, \B1, \B2)"; -- replacement
and DMS will take any parsed AST containing the left hand side "...Draw..." pattern and replace that subtree with the right hand side, after substituting the matches for A, B1 and B2. The quote marks are metaquotes and are used to distinguish C++ text from rule-syntax text; the backslash is a metaescape used inside metaquotes to name metavariables. For more details of what you can say in the rule syntax, see DMS Rewrite Rules.
If OP provides a set of such rules, DMS can be asked to apply the entire set.
So I think this would work just fine for OP. It is a rather heavyweight mechanism to "add" to the package he wants to provide to a 3rd party; DMS and its C++ front end are hardly "small" programs. But then modern machines have lots of resources, so I think it's a question of how badly OP needs to do this.
Try modifying the headers to suppress the method; when you then compile, the errors will show you every call that needs replacing.
Since you have a C++ interpreter (such as CERN's ROOT), I guess you can use the compiler to intercept all the Draw calls. An easy and clean way to do that is to declare the Draw method as private in the headers, using some defines:
class ItemWithDrawMethod
{
....
public:
#ifdef CATCHTHEMETHOD
private:
#endif
void Draw(A,B);
#ifdef CATCHTHEMETHOD
public:
#endif
....
};
Then compile as:
gcc -DCATCHTHEMETHOD=1 yourfilein.cpp
If the user needs to input complex algorithms into the application, I suggest integrating a scripting language (e.g. Python, Perl, JS) into the app. The user can then write code (a function or algorithm in a defined form) that the app executes in the interpreter to get the final results.
Since you need C++ in the interpreter http://chaiscript.com/ would be a suggestion.
What happens when someone gets ahold of the Draw member function (auto draw = &A::Draw;) and then starts using draw? Presumably you'd want the same improved Draw-functionality to be called in this case too. Thus I think we can conclude that what you really want is to replace the Draw member function with a function of your own.
Since it seems you are not in a position to modify the class containing Draw directly, a solution could be to derive your own class from A and override Draw in there. Then your problem reduces to having your users use your new improved class.
You may again consider the problem of automatically translating uses of class A to your new derived class, but this still seems pretty difficult without the help of a full C++ implementation. Perhaps there is a way to hide the old definition of A and present your replacement under that name instead, via clever use of header files, but I cannot determine whether that's the case from what you've told us.
Another possibility might be to use some dynamic linker hackery using LD_PRELOAD to replace the function Draw that gets called at runtime.
There may be a way to accomplish this mostly with regular expressions.
Since anything that appears after Draw( is already formatted correctly as parameters, you don't need to fully parse them for the purpose you have outlined.
Fundamentally, the part that matters is the "SYMBOL->Draw("
SYMBOL could be any expression that resolves to an object that overloads -> or to a pointer of a type that implements Draw(...). If you reduce this to two cases, you can short-cut the parsing.
For the first case, use a simple regular expression that matches any valid C++ symbol, something similar to "[A-Za-z_][A-Za-z0-9_\.]*", followed by the literal text "->Draw(". This gives you the portion that must be rewritten, since the code following it is already formatted as valid C++ parameters.
The second case is for complex expressions that return an overloaded object or pointer. This requires a bit more effort, but a short parsing routine to walk backward through just a complex expression can be written surprisingly easily, since you don't have to support blocks (blocks in C++ cannot return objects, since lambda definitions do not call the lambda themselves, and actual nested code blocks {...} can't return anything directly inline that would apply here). Note that if the expression doesn't end in ) then it has to be a valid symbol in this context, so if you find a ) just match nested ) with ( and extract the symbol preceding the nested SYMBOL(...(...)...)->Draw() pattern. This may be possible with regular expressions, but should be fairly easy in normal code as well.
As soon as you have the symbol or expression, the replacement is trivial, going from
SYMBOL->Draw(...
to
YourFunction(SYMBOL, ...
without having to deal with the additional parameters to Draw().
As an added benefit, chained function calls are parsed for free with this model, since you can recursively iterate over the code such as
A->Draw(B...)->Draw(C...)
The first iteration identifies the first A->Draw( and rewrites the whole statement as
YourFunction(A, B...)->Draw(C...)
which then identifies the second ->Draw with an expression "YourFunction(A, ...)->" preceding it, and rewrites it as
YourFunction(YourFunction(A, B...), C...)
where B... and C... are well-formed C++ parameters, including nested calls.
Without knowing the C++ version that your interpreter supports, or the kind of code you will be rewriting, I really can't provide any sample code that is likely to be worthwhile.
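For what it's worth, here is a rough C++ sketch of the backward-scanning idea described above. It is not the answerer's code: the names rewrite_draw_calls and receiver_start are made up, and MyFunc is taken from the question. It handles only the two cases discussed (a plain symbol, or a symbol followed by balanced parentheses). It does not understand string literals, comments, templates, subscripts, or receivers such as a->b, so treat it as a starting point only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Walks backward from `end` (the index of "->") to find where the receiver
// expression starts. Returns npos if parentheses are unbalanced.
static std::size_t receiver_start(const std::string& s, std::size_t end) {
    std::size_t i = end;
    while (i > 0) {
        const char c = s[i - 1];
        if (c == ')') {
            int depth = 0;  // skip back over one balanced (...) group
            while (i > 0) {
                const char d = s[--i];
                if (d == ')') ++depth;
                else if (d == '(' && --depth == 0) break;
            }
            if (depth != 0) return std::string::npos;
        } else if (std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '.') {
            --i;  // part of a symbol such as obj.member
        } else {
            break;
        }
    }
    return i;
}

// Rewrites every "X->Draw(args" into "func(X, args", left to right, so that
// chained and nested calls are picked up on later iterations.
std::string rewrite_draw_calls(std::string s, const std::string& func = "MyFunc") {
    const std::string needle = "->Draw(";
    std::size_t pos = 0;
    while ((pos = s.find(needle, pos)) != std::string::npos) {
        const std::size_t start = receiver_start(s, pos);
        if (start == std::string::npos || start == pos) {
            pos += needle.size();  // no usable receiver; leave this call alone
            continue;
        }
        const std::string receiver = s.substr(start, pos - start);
        const std::size_t after = pos + needle.size();
        const std::size_t k = s.find_first_not_of(" \t", after);
        const bool no_args = (k != std::string::npos && s[k] == ')');
        const std::string head = func + "(" + receiver + (no_args ? "" : ", ");
        s.replace(start, after - start, head);
        pos = start + head.size();  // continue scanning inside the arguments
    }
    return s;
}
```

With this, "A->Draw(B)->Draw(C)" comes out as "MyFunc(MyFunc(A, B), C)", matching the two-step rewrite walked through above.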
One way is to load user code as a DLL (something like plugins).
This way, you don't need to recompile your actual application; only the user code is compiled, and your application loads it dynamically.

Name variable Lua

I have the following code in Lua:
ABC:
test (X)
The test function is implemented in C++. My problem is this: I need to know the name of the variable passed as the parameter (in this case X). In C++ I only have access to the value of this variable, but I need to know its name.
Help please
Functions are not passed variables; they are passed values. Variables are just locations that store values.
When you say X somewhere in your Lua code, that means to get the value from the variable X (note: it's actually more complicated than that, but I won't get into that here).
So when you say test(X), you're saying, "Get the value from the variable X and pass that value as the first parameter to the function test."
What it seems like you want to do is change the contents of X, right? You want to have the test function modify X in some way. Well, you can't really do that directly in Lua. Nor should you.
See, in Lua, you can return values from functions. And you can return multiple values. Even from C++ code, you can return multiple values. So whatever it is you wanted to store in X can just be returned:
X = test(X)
This way, the caller of the function decides what to do with the value, not the function itself. If the caller wants to modify the variable, that's fine. If the caller wants to stick it somewhere else, that's also fine. Your function should not care one way or the other.
Also, this allows the user to do things like test(5). Here, there is no variable; you just pass a value directly. That's one reason why functions cannot modify the "variable" that is passed; because it doesn't have to be a variable. Only values are passed, so the user could simply pass a literal value rather than one stored in a variable.
In short: you can't do it, and you shouldn't want to.
The correct answer is that Lua doesn't really support this, but there is the debug interface. See this question for the solution you're looking for. If you can't get a call to debug to work directly from C++, then wrap your function call with a Lua function that first extracts the debug results and then calls your C++ function.
If what you're after is a string representation of the argument, then you're kind of stuck in lua.
I'm thinking something like in C:
assert( x==y );
Which generates a nice message on failure. In C this is done through macros.
Something like this (untested and probably broken).
#define assert(X) if(!(X)) { printf("ASSERTION FAILED: %s\n", #X ); abort(); }
Here #X means the string form of the arguments. In the example above that is "x==y". Note that this is subtly different from a variable name - it's just the string used in the parser when expanding the macro.
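A small C++ sketch of the same stringification trick, written so that it returns a message instead of aborting. The names CHECK and describe_failure are made up for illustration.

```cpp
#include <string>

// Hypothetical helper: empty string on success, otherwise a message that
// contains the source text of the failed expression.
inline std::string describe_failure(bool ok, const char* expr_text) {
    return ok ? std::string() : std::string("ASSERTION FAILED: ") + expr_text;
}

// #X turns the tokens of the macro argument into a string literal, so the
// helper receives both the evaluated result and the text that produced it.
#define CHECK(X) describe_failure(static_cast<bool>(X), #X)
```

CHECK(x==y) therefore passes the literal "x==y" along with the result, which is exactly the information a function receiving only a value does not have.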
Unfortunately there's no such corresponding functionality in lua. For my lua testing libraries I end up passing the stringified version as part of the expression, so in lua my code looks something like this:
assert( x==y, "x==y")
There may be ways to make this work as assert("x==y") using some kind of string evaluation and closure mechanism, but it seemed too tricky to be worth doing to me.
EDIT:
While this doesn't appear to be possible in pure lua, there's a patched version that does seem to support macros: http://lua-users.org/wiki/LuaMacros . They even have an example of a nicer assert.

Function pointers and unknown number of arguments in C++

I came across the following weird chunk of code. Imagine you have the following typedef:
typedef int (*MyFunctionPointer)(int param_1, int param_2);
And then, in a function, we are trying to run a function from a DLL in the following way:
LPCWSTR DllFileName; //Path to the dll stored here
LPCSTR _FunctionName; // (mangled) name of the function I want to test
MyFunctionPointer functionPointer;
HINSTANCE hInstLibrary = LoadLibrary( DllFileName );
FARPROC functionAddress = GetProcAddress( hInstLibrary, _FunctionName );
functionPointer = (MyFunctionPointer) functionAddress;
//The values are arbitrary
int a = 5;
int b = 10;
int result = 0;
result = functionPointer( a, b ); //Possible error?
The problem is that there isn't any way of knowing whether the function whose address we got with GetProcAddress takes two integer arguments. The DLL name is provided by the user at runtime, then the names of the exported functions are listed and the user selects the one to test (again, at runtime).
So, by doing the function call in the last line, aren't we opening the door to possible stack corruption? I know that this compiles, but what sort of run-time error is going to occur in the case that we are passing wrong arguments to the function we are pointing to?
There are three errors I can think of if the expected and used number or type of parameters and calling convention differ:
if the calling convention is different, wrong parameter values will be read
if the function actually expects more parameters than given, random values will be used as parameters (I'll let you imagine the consequences if pointers are involved)
in any case, the return address will be complete garbage, so random code with random data will be run as soon as the function returns.
In two words: Undefined behavior
I'm afraid there is no way to know - the programmer is required to know the prototype beforehand when getting the function pointer and using it.
If you don't know the prototype beforehand then I guess you need to implement some sort of protocol with the DLL where you can enumerate any function names and their parameters by calling known functions in the DLL. Of course, the DLL needs to be written to comply with this protocol.
If it's a __stdcall function and they've left the name mangling intact (both big ifs, but certainly possible nonetheless) the name will have @nn at the end, where nn is a number. That number is the number of bytes the function expects as arguments, and will clear off the stack before it returns.
So, if it's a major concern, you can look at the raw name of the function and check that the amount of data you're putting onto the stack matches the amount of data it's going to clear off the stack.
Note that this is still only a protection against Murphy, not Machiavelli. When you're creating a DLL, you can use an export file to change the names of functions. This is frequently used to strip off the name mangling -- but I'm pretty sure it would also let you rename a function from xxx@12 to xxx@16 (or whatever) to mislead the reader about the parameters it expects.
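A minimal sketch of that check, assuming a 32-bit x86 DLL whose export names still carry the __stdcall decoration. The helper name stdcall_arg_bytes and the example name "_Add@8" are made up for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the argument byte count encoded in a
// __stdcall-decorated name such as "_Add@8", or -1 if the name does not
// end in '@' followed only by digits.
int stdcall_arg_bytes(const std::string& name) {
    const std::size_t at = name.rfind('@');
    if (at == std::string::npos || at + 1 == name.size()) return -1;
    int bytes = 0;
    for (std::size_t i = at + 1; i < name.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(name[i]))) return -1;
        bytes = bytes * 10 + (name[i] - '0');
    }
    return bytes;
}
```

For the MyFunctionPointer typedef in the question you would expect 8 (two 4-byte ints). Any other result, including -1, means the name gives you no confirmation that the call is safe.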
Edit: (primarily in reply to msalters's comment): it's true that you can't apply __stdcall to something like a member function, but you can certainly use it on things like global functions, whether they're written in C or C++.
For things like member functions, the exported name of the function will be mangled. In that case, you can use UnDecorateSymbolName to get its full signature. Using that is somewhat nontrivial, but not outrageously complex either.
I do not think so; it is a good question. The only requirement is that you MUST know what the parameters are for the function pointer to work. If you don't, and blindly stuff in parameters and call it, it will crash or jump off into the woods, never to be seen again... It is up to the programmer to know what the function expects and the types of its parameters. Failing that, you could disassemble it and work out the parameters from how the code accesses the stack through the stack pointer (sp).
Using PE Explorer for instance, you can find out what functions are used and examine the disassembly dump...
Hope this helps,
Best regards,
Tom.
It will either crash in the DLL code (since it got passed corrupt data), or: I think Visual C++ adds code in debug builds to detect this type of problem. It will say something like: "The value of ESP was not saved across a function call", and will point to code near the call. It helps but isn't totally robust - I don't think it'll stop you passing in the wrong but same-sized argument (eg. int instead of a char* parameter on x86). As other answers say, you just have to know, really.
There is no general answer. The Standard mandates that certain exceptions be thrown in certain circumstances, but aside from that describes how a conforming program will be executed, and sometimes says that certain violations must result in a diagnostic. (There may be something more specific here or there, but I certainly don't remember one.)
What the code is doing there isn't according to the Standard, and since there is a cast the compiler is entitled to go ahead and do whatever stupid thing the programmer wants without complaint. This would therefore be an implementation issue.
You could check your implementation documentation, but it's probably not there either. You could experiment, or study how function calls are done on your implementation.
Unfortunately, the answer is very likely to be that it'll screw something up without being immediately obvious.
Generally, if you are calling LoadLibrary and GetProcAddress, you have documentation that tells you the prototype. Even more commonly, as with the Windows DLLs, you are provided a header file. While a wrong prototype will cause an error, it's usually very easy to observe and not the kind of error that will sneak into production.
Most C/C++ compilers have the caller set up the stack before the call, and readjust the stack pointer afterwards. If the called function does not use pointer or reference arguments, there will be no memory corruption, although the results will be worthless. And as rerun says, pointer/reference mistakes almost always show up with a modicum of testing.