using c++filt from inside c++ program [duplicate] - c++

I have previously, here, been shown that C++ functions aren't easily represented in assembly. Now I am interested in reading them one way or another because Callgrind, part of Valgrind, show them demangled while in assembly they are shown mangled.
So I would like to either mangle the Valgrind function output or demangle the assembly names of functions. Anyone ever tried something like that? I was looking at a website and found out the following:
Code to implement demangling is part of the GNU Binutils package;
see libiberty/cplus-dem.c and include/demangle.h.
Has anyone ever tried something like that? I want to demangle/mangle in C.
My compiler is gcc 4.x.

Use the c++filt command line tool to demangle the name.

Here is my C++11 implementation, derived from the following page:
http://gcc.gnu.org/onlinedocs/libstdc++/manual/ext_demangling.html
#include <cxxabi.h> // needed for abi::__cxa_demangle
std::shared_ptr<char> cppDemangle(const char *abiName)
{
int status;
char *ret = abi::__cxa_demangle(abiName, 0, 0, &status);
/* NOTE: must free() the returned char when done with it! */
std::shared_ptr<char> retval;
retval.reset( (char *)ret, [](char *mem) { if (mem) free((void*)mem); } );
return retval;
}
To make the memory management easy on the returned (char *), I'm using a std::shared_ptr with a custom lambda 'deleter' function that calls free() on the returned memory. Because of this, I don't ever have to worry about deleting the memory on my own, I just use it as needed, and when the shared_ptr goes out of scope, the memory will be free'd.
Here's the macro I use to access the demangled type name as a (const char *). Note that you must have RTTI turned on to have access to 'typeid'
#define CLASS_NAME(somePointer) ((const char *) cppDemangle(typeid(*somePointer).name()).get() )
So, from within a C++ class I can say:
printf("I am inside of a %s\n",CLASS_NAME(this));

This is a slight variation on Dave's version above. This is a unique_ptr version with a little bit of checking on the return type, though it looks like you could just ignore that, but somehow that just seems unclean.
auto cppDemangle (const char *abiName)
{
//
// This function allocates and returns storage in ret
//
int status;
char *ret = abi::__cxa_demangle(abiName, 0 /* output buffer */, 0 /* length */, &status);
auto deallocator = ( [](char *mem) { if (mem) free((void*)mem); } );
if (status) {
// 0: The demangling operation succeeded.
// -1: A memory allocation failure occurred.
// -2: mangled_name is not a valid name under the C++ ABI mangling rules.
// -3: One of the arguments is invalid.
std::unique_ptr<char, decltype(deallocator) > retval(nullptr, deallocator);
return retval;
}
//
// Create a unique pointer to take ownership of the returned string so it
// is freed when that pointers goes out of scope
//
std::unique_ptr<char, decltype(deallocator) > retval(ret, deallocator);
return retval;
}

Related

Safely reading string from Lua stack

How is it possible to safely read string value from Lua stack? The functions lua_tostring and lua_tolstring both can raise a Lua error (longjmp / exception of a strange type). Therefore the functions should be called in protected mode using lua_pcall probably. But I am not able to find a nice solution how to do that and get the string value from Lua stack to C++. Is it really needed to call lua_tolstring in protected mode using lua_pcall?
Actually using lua_pcall seems bad, because the string I want to read from Lua stack is an error message stored by lua_pcall.
Use lua_type before lua_tostring: If lua_type returns LUA_TSTRING, then you can safely call lua_tostring to get the string and no memory will be allocated.
lua_tostring only allocates memory when it needs to convert a number to a string.
Ok, When you call lua_pcall failed, it will return an error code. When you call lua_pcall successfully, you will get zero. So, first you should see the returned value by lua_pcall, then use the lua_type to get the type, at last, use the lua_to* functions the get the right value.
int iRet = lua_pcall(L, 0, 0, 0);
if (iRet)
{
const char *pErrorMsg = lua_tostring(L, -1); // error message
cout<<pErrorMsg<<endl;
lua_close(L);
return 0;
}
int iType = lua_type(L, -1);
switch (iType)
{
//...
case LUA_TSTRING:
{
const char *pValue = lua_tostring(L, -1);
// ...
}
}
It's all.
Good luck.
You can use the lua_isstring function to check if the value can be converted to a string without an error.
Here's how it's done in OpenTibia servers:
std::string LuaState::popString()
{
size_t len;
const char* cstr = lua_tolstring(state, -1, &len);
std::string str(cstr, len);
pop();
return str;
}
Source: https://github.com/opentibia/server/blob/master/src/lua_manager.cpp

How to jump the program execution to a specific address in C?

I want the program to jump to a specific address in memory and continue execution from that address. I thought about using goto but I don't have a label rather just an address in memory.
There is no need to worry about return back from the jump address.
edit: using GCC compiler
Inline assembly might be the easiest and most "elegant" solution, although doing this is highly unusual, unless you are writing a debugger or some specialized introspective system.
Another option might be to declare a pointer to a void function (void (*foo)(void)), then set the pointer to contain your address, and then invoke it:
void (*foo)(void) = (void (*)())0x12345678;
foo();
There will be things pushed on the stack since the compiler thinks you are doing a subroutine call, but since you don't care about returning, this might work.
gcc has an extension that allows jumping to an arbitrary address:
void *ptr = (void *)0x1234567; // a random memory address
goto *ptr; // jump there -- probably crash
This is pretty much the same as using a function pointer that you set to a fixed value, but it will actually use a jump instruction rather than a call instruction (so the stack won't be modified)
#include <stdio.h>
#include <stdlib.h>
void go(unsigned int addr) {
(&addr)[-1] = addr;
}
int sub() {
static int i;
if(i++ < 10) printf("Hello %d\n", i);
else exit(0);
go((unsigned int)sub);
}
int main() {
sub();
}
Of course, this invokes undefined behavior, is platform-dependent, assumes that code addresses are the same size as int, etc, etc.
It should look something like this:
unsigned long address=0x80;
void (*func_ptr)(void) = (void (*)(void))address;
func_ptr();
However, it is not a very safe operation, jumping to some unknown address will probably result in a crash!
Since the question has a C++ tag, here's an example of a C++ call to a function with a signature like main()--int main(int argc, char* argv[]):
int main(int argc, char* argv[])
{
auto funcAddr = 0x12345678; //or use &main...
auto result = reinterpret_cast<int (*)(int, char**)>(funcAddr)(argc, argv);
}
Do you have control of the code at the address that you intend to jump to? Is this C or C++?
I hesitantly suggest setjmp() / longjmp() if you're using C and can run setjmp() where you need to jump back to. That being said, you've got to be VERY careful with these.
As for C++, see the following discussion about longjmp() shortcutting exception handling and destructors destructors. This would make me even more hesitant to suggest it's use in C++.
C++: Safe to use longjmp and setjmp?
This is what I am using for my bootstrap loader(MSP430AFE253,Compiler = gcc,CodeCompeserStudio);
#define API_RESET_VECT 0xFBFE
#define JUMP_TO_APP() {((void (*)()) (*(uint16_t*)API_RESET_VECT)) ();}
I Propos this code:
asm(
"LDR R0,=0x0a0000\n\t" /* Or 0x0a0000 for the base Addr. */
"LDR R0, [R0, #4]\n\t" /* Vector+4 for PC */
"BX R0"
);

Problems with LD_PRELOAD and calloc() interposition for certain executables

Relating to a previous question of mine
I've successfully interposed malloc, but calloc seems to be more problematic.
That is with certain hosts, calloc gets stuck in an infinite loop with a possible internal calloc call inside dlsym. However, a basic test host does not exhibit this behaviour, but my system's "ls" command does.
Here's my code:
// build with: g++ -O2 -Wall -fPIC -ldl -o libnano.so -shared Main.cc
#include <stdio.h>
#include <dlfcn.h>
bool gNanoUp = false;// global
// Function types
typedef void* (*MallocFn)(size_t size);
typedef void* (*CallocFn)(size_t elements, size_t size);
struct MemoryFunctions {
MallocFn mMalloc;
CallocFn mCalloc;
};
MemoryFunctions orgMemFuncs;
// Save original methods.
void __attribute__((constructor)) __nano_init(void) {
fprintf(stderr, "NANO: init()\n");
// Get address of original functions
orgMemFuncs.mMalloc = (MallocFn)dlsym(RTLD_NEXT, "malloc");
orgMemFuncs.mCalloc = (CallocFn)dlsym(RTLD_NEXT, "calloc");
fprintf(stderr, "NANO: malloc() found #%p\n", orgMemFuncs.mMalloc);
fprintf(stderr, "NANO: calloc() found #%p\n", orgMemFuncs.mCalloc);
gNanoUp = true;
}
// replacement functions
extern "C" {
void *malloc(size_t size) {
if (!gNanoUp) __nano_init();
return orgMemFuncs.mMalloc(size);
}
void* calloc(size_t elements, size_t size) {
if (!gNanoUp) __nano_init();
return orgMemFuncs.mCalloc(elements, size);
}
}
Now, When I do the following, I get an infinite loop followed by a seg fault, eg:
% setenv LD_PRELOAD "./libnano.so"
% ls
...
NANO: init()
NANO: init()
NANO: init()
Segmentation fault (core dumped)
However if I comment out the calloc interposer, it almost seems to work:
% setenv LD_PRELOAD "./libnano.so"
% ls
NANO: init()
NANO: malloc() found #0x3b36274dc0
NANO: calloc() found #0x3b362749e0
NANO: init()
NANO: malloc() found #0x3b36274dc0
NANO: calloc() found #0x3b362749e0
<directory contents>
...
So somethings up with "ls" that means init() gets called twice.
EDIT
Note that the following host program works correctly - init() is only called once, and calloc is successfully interposed, as you can see from the output.
// build with: g++ test.cc -o test
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char* argv[]) {
void* p = malloc(123);
printf("HOST p=%p\n", p);
free(p);
char* c = new char;
printf("HOST c=%p\n", c);
delete c;
void* ca = calloc(10,10);
printf("HOST ca=%p\n", ca);
free(ca);
}
% setenv LD_PRELOAD "./libnano.so"
% ./test
NANO: init()
NANO: malloc() found #0x3b36274dc0
NANO: calloc() found #0x3b362749e0
HOST p=0x601010
HOST c=0x601010
HOST ca=0x601030
I know I am a bit late (6 years). But I wanted to override calloc() today and faced a problem because dlsym() internally uses calloc(). I solved it using a simple technique and thought of sharing it here:
static unsigned char buffer[8192];
void *calloc(size_t nmemb, size_t size)
{
if (calloc_ptr == NULL) // obtained from dlsym
return buffer;
init(); // uses dlsym() to find address of the real calloc()
return calloc_ptr(len);
}
void free(void *in)
{
if (in == buffer)
return;
free_ptr(in);
}
buffer satisfies the need of dlsym() till the real calloc() has been located and my calloc_ptr function pointer initialized.
With regard to __nano_init() being called twice: You've declared the function as a constructor, so it's called when the library is loaded, and it's called a second time explicitly when your malloc() and calloc() implementations are first called. Pick one.
With regard to the calloc() interposer crashing your application: Some of the functions you're using, including dlsym() and fprintf(), may themselves be attempting to allocate memory, calling your interposer functions. Consider the consequences, and act accordingly.
Using dlsym based hooking can result in crashes, as dlsym calls back into the memory allocator. Instead use malloc hooks, as I suggested in your prior question; these can be installed without actually invoking dlsym at all.
You can get away with a preliminary poor calloc that simply returns NULL. This actually works on Linux, YMMV.
static void* poor_calloc(size_t nmemb, size_t size)
{
// So dlsym uses calloc internally, which will lead to infinite recursion, since our calloc calls dlsym.
// Apparently dlsym can cope with slightly wrong calloc, see for further explanation:
// http://blog.bigpixel.ro/2010/09/interposing-calloc-on-linux
return NULL; // This is a poor implementation of calloc!
}
You can also use sbrk to allocate the memory for "poor calloc".

Function to mangle/demangle functions

I have previously, here, been shown that C++ functions aren't easily represented in assembly. Now I am interested in reading them one way or another because Callgrind, part of Valgrind, show them demangled while in assembly they are shown mangled.
So I would like to either mangle the Valgrind function output or demangle the assembly names of functions. Anyone ever tried something like that? I was looking at a website and found out the following:
Code to implement demangling is part of the GNU Binutils package;
see libiberty/cplus-dem.c and include/demangle.h.
Has anyone ever tried something like that? I want to demangle/mangle in C.
My compiler is gcc 4.x.
Use the c++filt command line tool to demangle the name.
Here is my C++11 implementation, derived from the following page:
http://gcc.gnu.org/onlinedocs/libstdc++/manual/ext_demangling.html
#include <cxxabi.h> // needed for abi::__cxa_demangle
std::shared_ptr<char> cppDemangle(const char *abiName)
{
int status;
char *ret = abi::__cxa_demangle(abiName, 0, 0, &status);
/* NOTE: must free() the returned char when done with it! */
std::shared_ptr<char> retval;
retval.reset( (char *)ret, [](char *mem) { if (mem) free((void*)mem); } );
return retval;
}
To make the memory management easy on the returned (char *), I'm using a std::shared_ptr with a custom lambda 'deleter' function that calls free() on the returned memory. Because of this, I don't ever have to worry about deleting the memory on my own, I just use it as needed, and when the shared_ptr goes out of scope, the memory will be free'd.
Here's the macro I use to access the demangled type name as a (const char *). Note that you must have RTTI turned on to have access to 'typeid'
#define CLASS_NAME(somePointer) ((const char *) cppDemangle(typeid(*somePointer).name()).get() )
So, from within a C++ class I can say:
printf("I am inside of a %s\n",CLASS_NAME(this));
This is a slight variation on Dave's version above. This is a unique_ptr version with a little bit of checking on the return type, though it looks like you could just ignore that, but somehow that just seems unclean.
auto cppDemangle (const char *abiName)
{
//
// This function allocates and returns storage in ret
//
int status;
char *ret = abi::__cxa_demangle(abiName, 0 /* output buffer */, 0 /* length */, &status);
auto deallocator = ( [](char *mem) { if (mem) free((void*)mem); } );
if (status) {
// 0: The demangling operation succeeded.
// -1: A memory allocation failure occurred.
// -2: mangled_name is not a valid name under the C++ ABI mangling rules.
// -3: One of the arguments is invalid.
std::unique_ptr<char, decltype(deallocator) > retval(nullptr, deallocator);
return retval;
}
//
// Create a unique pointer to take ownership of the returned string so it
// is freed when that pointers goes out of scope
//
std::unique_ptr<char, decltype(deallocator) > retval(ret, deallocator);
return retval;
}

Can I use a static var to "cache" the result? C++

I am using a function that returns a char*, and right now I am getting the compiler warning "returning address of local variable or temporary", so I guess I will have to use a static var for the return, my question is can I make something like if(var already set) return var else do function and return var?
This is my function:
char * GetUID()
{
TCHAR buf[20];
StringCchPrintf(buf, 20*sizeof(char), TEXT("%s"),
someFunction());
return buf;
}
And this is what I want to do:
char * GetUID()
{
static TCHAR buf[20];
if(strlen(buf)!=0) return buf;
StringCchPrintf(buf, 20*sizeof(char), TEXT("%s"),
someFunction());
return buf;
}
Is this a well use of static vars? And should I use ZeroMemory(&buf, 20*sizeof(char))? I removed it because if I use it above the if(strlen...) my TCHAR length is never 0, should I use it below?
The reason you're getting a warning is because the memory allocated within your function for buf is going to be popped off the stack once the function exits. If you return a pointer to that memory address, you have a pointer to undefined memory. It may work, it may not - it's not safe regardless.
Typically the pattern in C/C++ is to allocate a block of memory and pass a pointer to that block into your function. e.g.
void GetUID( char* buf )
{
if(strlen(buf)!=0) return;
StringCchPrintf(buf, 20*sizeof(char), TEXT("%s"), someFunction());
}
If you want the function (GetUID) itself to handle caching the result, then you can use a static, a singleton (OOP), or consider thread local storage.
(e.g. in Visual C++)
__declspec(thread) TCHAR buf[20];
It's OK if your code is single threaded. The buffer will be set to contain all zeros when the function is entered for the very first time, so there is no need to explicitly set its contents to zero. But these days all code eventually tends to become multi-threaded, so I would not do this, if I were you.
What you should do instead is allocate buf dynamically using new, and then return it. But be aware that the caller would be responsible for deallocating the memory.
Better yet, use std::string instead of char *.
You could do that, but it won't be thread-safe. You should also be careful with what you do with the result, since you can not store it between subsequent calls to the function (without copying it, of course).
You should also initialize the static variable to the empty string.
This is how I would do it.
It caches; it does not rely on the buffer being 0.
It does have the implicit assumption that 'buf' will be identical from thread to thread, which is not (to my knowledge) correct. I would use a global for that purpose.
//returns a newly allocated buffer, every time. remember to delete [] it.
char * GetUID()
{
static TCHAR buf[20];
static bool run = false;
TCHAR ret_mem = new TCHAR[20];
if(run)
{ return ret_mem; }
//do stuff to the buf
//assuming - dst, src, size.
memcpy(ret_mem, buf, 20);
run = true;
return ret_mem;
}