clang AddressSanitizer instruments code improperly, false-positive result - c++

FOREWORD
This question is fairly long and relates to my master's thesis, so I humbly ask for your patience. About half a year ago I ran into the problem described below. It needed an outside look, because at that point I was really stuck and had nobody to help me. In the end I gave up on it, but now I am back in business (a second wind, so to speak).
INTRODUCTION
Crucial technologies used in the project:
C++, llvm/clang 13.0.1, ASAN, libFuzzer
The underlying idea behind the project I was writing is:
Write a parser of C-code projects to find functions that are presumed to be vulnerable (for the purposes of this question it does not matter how I decide that they are vulnerable).
When I find a vulnerable function, I generate fuzzer code for it with libFuzzer.
At this point I have an IR file with the vulnerable function and an IR file with the fuzzer code, so it is time to compile the two files separately. During compilation clang instruments them with ASAN and libFuzzer.
The two files are then linked together and I have an executable called, for example, 'fuzzer'. Theoretically, I can run this executable and libFuzzer is going to fuzz my vulnerable function.
ACTUAL PROBLEM (PART 1)
ASAN somehow instruments my code badly. It gives me the wrong result.
How do I know that?
I found and took a vulnerable function. This function is from an old version of libcurl and is called sanitize_cookie_path. I reproduced the bug with AFL++ and it gave me what I wanted. If you pass a single quote to the function, it is going to 'blow'. I wanted to do something similar with libFuzzer and ASAN, but as I mentioned earlier these two did not give me the expected result. Having spent some time on the problem, I can say that something is wrong with ASAN.
PROBLEM REPRODUCTION
I have the code (see below) in the file sanitize_cookie_path.c:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stddef.h>
static char* sanitize_cookie_path(const char* cookie_path) {
size_t len;
char* new_path = strdup(cookie_path);
if (!new_path) {
return NULL;
}
if (new_path[0] == '\"') {
memmove((void *)new_path, (const void*)(new_path + 1), strlen(new_path));
}
if (new_path[strlen(new_path) - 1] == '\"') {
new_path[strlen(new_path) - 1] = 0x0;
}
if (new_path[0] !='/') {
free(new_path);
new_path = strdup("/");
return new_path;
}
len = strlen(new_path);
if (1 < len && new_path[len - 1] == '/') {
new_path[len - 1] = 0x0;
}
return new_path;
}
int main(int argc, char** argv) {
if (argc != 2) {
exit(1);
}
sanitize_cookie_path("\""); /* pass a string holding one quote, not a char literal */
return 0;
}
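For reference, the failure that a lone quote is expected to trigger can be traced from the listing above without touching any memory. The following is a minimal sketch, not part of libcurl: the helper name trailing_quote_index is hypothetical, and it only mirrors the leading-quote strip before computing the index that the check new_path[strlen(new_path) - 1] would read.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index that the trailing-quote check
// new_path[strlen(new_path) - 1] would read for a given input.
std::size_t trailing_quote_index(const std::string& cookie_path) {
    std::string path = cookie_path;
    // Mirrors the memmove that drops a leading double quote.
    if (!path.empty() && path[0] == '"') {
        path.erase(0, 1);
    }
    // size() is unsigned, so an empty string wraps around to the maximum
    // value; the original code would then read one byte before the buffer.
    return path.size() - 1;
}
```

For an input consisting of a single quote the string is empty after the strip, so the index wraps around. That out-of-bounds read is the report one would expect from ASAN for this function, as opposed to a report inside strdup.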
My C++ code compiles it with the command:
clang -O0 -emit-llvm path/to/sanitize_cookie_path.c -S -o path/to/sanitize_cookie_path.ll > /dev/null 2>&1
At the IR level of the above code I get rid of 'main', so only the 'sanitize_cookie_path' function remains.
I generate the simple fuzzer code (see below) for this function:
#include <cstdio>
#include <cstdint>
static char* sanitize_cookie_path(const char* cookie_path) ;
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
(void) sanitize_cookie_path((char*) data);
return 0;
}
Then I compile it with the command:
clang -O0 -emit-llvm path/to/fuzz_sanitize_cookie_path.cc -S -o path/to/fuzz_sanitize_cookie_path.ll > /dev/null 2>&1
The two IR files are compiled separately. NOTE that before the separate compilation I do some work to make them fit each other. For instance, I drop the 'static' keyword and resolve the name mangling between the C++ and C code.
I compile them both together with the command:
clang++ -O0 -g -fno-omit-frame-pointer -fsanitize=address,fuzzer -fsanitize-coverage=trace-cmp,trace-gep,trace-div path/to/sanitize_cookie_path.ll path/to/fuzz_sanitize_cookie_path.ll -o path-to/fuzzer > /dev/null 2>&1
The final 'fuzzer' executable is ready.
ACTUAL PROBLEM (PART 2)
If you execute the fuzzer program, it does not give the same results as AFL++. My fuzzer crashes in the '__interceptor_strdup' function, which is ASAN's interceptor for strdup (see error snippet below). The crash report produced by libFuzzer is literally empty (0 bytes), whereas ideally it should have found that the error is triggered by a quote ("). Having done my own research, I concluded that ASAN instrumented the code badly and gives me a false-positive result. Frankly speaking, I can fuzz the 'printf' function from stdio.h and find the same error.
[sanitize_cookie_path]$ ./fuzzer
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1016408680
INFO: Loaded 1 modules (11 inline 8-bit counters): 11 [0x5626d4c64c40, 0x5626d4c64c4b),
INFO: Loaded 1 PC tables (11 PCs): 11 [0x5626d4c64c50,0x5626d4c64d00),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
=================================================================
==2804==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011 at pc 0x5626d4ba7671 bp 0x7ffe43152df0 sp 0x7ffe431525a0
READ of size 2 at 0x602000000011 thread T0
#0 0x5626d4ba7670 in __interceptor_strdup (/path/to/fuzzer+0xdd670)
#1 0x5626d4c20127 in sanitize_cookie_path (/path/to/fuzzer+0x156127)
#2 0x5626d4c20490 in LLVMFuzzerTestOneInput (/path/to/fuzzer+0x156490)
#3 0x5626d4b18940 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/path/to/fuzzer+0x4e940)
#4 0x5626d4b1bae6 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (/path/to/fuzzer+0x51ae6)
#5 0x5626d4b1c052 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (/path/to/fuzzer+0x52052)
#6 0x5626d4b0100b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/path/to/fuzzer+0x3700b)
#7 0x5626d4af0297 in main (/path/to/fuzzer+0x26297)
#8 0x7f8e6442928f (/usr/lib/libc.so.6+0x2928f)
#9 0x7f8e64429349 in __libc_start_main (/usr/lib/libc.so.6+0x29349)
#10 0x5626d4af02e4 in _start /build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:115
I used gdb to step into strdup(cookie_path). gdb shows me that the fuzzer crashes at address 0x0000555555631687.
0x0000555555631684 <+452>: mov %rbp,%rsi
0x0000555555631687 <+455>: addr32 call 0x555555674100 <_ZN6__asan18ReportGenericErrorEmmmmbmjb>
0x000055555563168d <+461>: pop %rax
WHAT I TRIED TO DO
I tried to instrument my sanitize_cookie_path.c and fuzz_sanitize_cookie_path.cc with ASAN right at the beginning, not at the IR level, but whatever I did, nothing worked.
I passed the 'fuzzer' a so-called corpus directory with pre-cooked inputs. I even passed the quote explicitly to the 'fuzzer', but to no avail. Example (in the same directory as the fuzzer):
$ mkdir corpus/; echo "\"" > corpus/input; hexdump corpus/input
0000000 0a22
0000002
$ ./fuzzer corpus/
I also googled everything I could about libFuzzer and ASAN, but nothing gave me the results.
I changed the compilation command: I got rid of '-fno-omit-frame-pointer' and '-fsanitize-coverage=trace-cmp,trace-gep,trace-div'.
If there are some uncertainties in the details I have provided, do not hesitate to ask about them and I will iron them out to be more clear for you.
What are some other sites/forums where I can possibly get heard? I would ideally want to contact the developers of ASAN.
I will be more than happy for any help.
UPDATE 04/10/2022
llvm/clang have been upgraded from 13.0.1 to the latest available version in the Arch repository - 14.0.6. The problem still persists.
Opened an issue in the google/sanitizers repository.

Once more I have reread my question and comments, looked again at the code and additionally ran into this thought:
AddressSanitizer is not expected to produce false positives. If you
see one, look again; most likely it is a true positive!
As @Richard Critten and @chi have correctly pointed out in the comments section, the strdup function needs a null-terminated string, so I changed my solution
from
(void) sanitize_cookie_path((char*) data);
to
char* string_ = new char[size + 1];
memcpy(string_, data, size);
string_[size] = 0x0;
(void) sanitize_cookie_path(string_);
delete[] string_;
The above solution converts the raw byte array data to a null-terminated string string_ and passes it to the function. This solution works as expected.
It was just a stupid mistake that I had overlooked. Thanks again to @Richard Critten and @chi and everyone that tried to help.
Since there is no bug, I am going to retract my false accusations in google/sanitizers.
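For completeness, the same fix can be written as a whole harness. This is a minimal sketch under a few assumptions: the helper to_c_string is a hypothetical name, std::string replaces the manual new[]/delete[] pair, and a stand-in sanitize_cookie_path is defined only so that the sketch links on its own (in the real build it would be removed and the IR file with the real function linked instead).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Stand-in target so that this sketch links on its own (hypothetical).
// In the real build, remove it and link the IR file that contains the
// real sanitize_cookie_path instead.
extern "C" char* sanitize_cookie_path(const char* cookie_path) {
    return ::strdup(cookie_path);
}

// Copies the raw fuzzer bytes into a std::string, whose c_str() is always
// null-terminated, unlike the buffer libFuzzer hands to the callback.
static std::string to_c_string(const std::uint8_t* data, std::size_t size) {
    return std::string(reinterpret_cast<const char*>(data), size);
}

extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    const std::string input = to_c_string(data, size);
    char* result = sanitize_cookie_path(input.c_str());
    std::free(result);  // the target returns memory obtained from strdup
    return 0;
}
```

Freeing the returned pointer is an addition to the snippet in the post. Without it, the leak detector that ships with ASAN may report every iteration as a leak.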

Related

Stack is totally messed up by trying to produce a buffer overflow

After hours of debugging without any success, I hope to find some help here on StackOverflow.
I'm currently on a PTP training and, since I only use Linux, I also want to practice the very first labs on my local machine.
What I have to do is exploit a very simple program via a buffer overflow. Only the sources are given:
goodpwd.cpp:
#include <iostream>
#include <cstring>
int bf_overflow(char *str){
char buffer[10]; //our buffer
strcpy(buffer,str); //the vulnerable command
return 0;
}
int good_password(){ // a function which is never executed
printf("Valid password supplied\n");
printf("This is good_password function \n");
return 0;
}
int main(int argc, char *argv[])
{
int password=0; // controls whether password is valid or not
printf("You are in goodpwd.exe now\n");
bf_overflow(argv[1]); //call the function and pass user input
if ( password == 1) {
good_password(); //this should never happen
}
else {
printf("Invalid Password!!!\n");
}
printf("Quitting sample1.exe\n");
return 0;
}
I compiled it to get an executable by using
gcc -fno-stack-protector -z execstack -o goodpwd goodpwd.cpp -ggdb -m32 -lstdc++ -no-pie -O0
(I also already tried it without -no-pie and -O0 but I thought maybe the optimization could be the problem..)
I used gdb to debug the executable:
gdb goodpwd -tui -q
After setting a breakpoint to line 6 (the one with the vulnerable strcpy) I executed the following command:
(gdb) run AAAAAAAAAAAAAABCDE
after pressing n to go to the next line, I had a look into the stack:
(gdb) x/20x $esp
this gave me the following result:
0xffffd6f0: 0xffffd748 0x4141a8b0 0x41414141 0x41414141
0xffffd700: 0x41414141 0x45444342 0xffffd700 0x0804923b
0xffffd710: 0xffffd99c 0xf7fe4bd0 0xffffd800 0x08049209
0xffffd720: 0x00000002 0xffffd7f4 0xffffd800 0x00000000
0xffffd730: 0x0804c000 0x00000002 0x08049080 0xffffd760
I cannot explain to myself why:
there are two A's at 0xffffd6f4
there are no A's at 0xffffd6f6
I got 16 A's starting at 0xffffd6f8
I got EDCB at 0xffffd704 (because of little endian, thank you @1201ProgramAlarm)
$ebp is 0xffffd708 and $eip is 0x80491a7, but after two more steps (leaving the function) $eip is set to 0x804923e, although after all I've learned I'm pretty sure it should be 0x08049209
after those two steps I get this error: main (argc=<error reading variable: Cannot access memory at address 0x4141a8b0>,
argv=<error reading variable: Cannot access memory at address 0x4141a8b4>) at goodpwd.cpp:21
I'd really appreciate if there's someone who's able to help me.
Struggling in module 3 of 43 is not the best feeling I've ever got :D
Edit:
ASLR should be deactivated:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Maybe it was a bit too late yesterday. But today I found out that @1201ProgramAlarm made a very good point.
Because the system is little-endian, 0xffffd704 was right.
My confusion about 0xffffd6f4 and 0xffffd6f6 was irrelevant because it did not influence the result.
The value of the old $EIP was still in 0xffffd70e, but I never touched it.
I just had to lengthen the string in the argument, and afterwards I was able to exploit the vulnerability.
It was a lot of fun. Thanks for the advice.
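The little-endian point can be checked with a few lines of code. This is a minimal sketch (the function name as_little_endian_word is hypothetical): it combines four consecutive bytes the way a 32-bit word at that address is displayed by x/x on x86.

```cpp
#include <cassert>
#include <cstdint>

// Combines four consecutive bytes the way a little-endian x86 machine does
// when they are read as one 32-bit word: the byte at the lowest address
// becomes the least significant byte.
std::uint32_t as_little_endian_word(const char* bytes) {
    const unsigned char* b = reinterpret_cast<const unsigned char*>(bytes);
    return static_cast<std::uint32_t>(b[0])
         | static_cast<std::uint32_t>(b[1]) << 8
         | static_cast<std::uint32_t>(b[2]) << 16
         | static_cast<std::uint32_t>(b[3]) << 24;
}
```

Called with "BCDE" this yields 0x45444342, which is the word shown at 0xffffd704 in the dump, so the bytes are in memory in the order they were typed.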

c++ (on Clion) for loop stops in the middle with no errors (exit code 0) [duplicate]

When using CLion I have found the output sometimes cuts off.
For example when running the code:
main.cpp
#include <stdio.h>
int main() {
int i;
for (i = 0; i < 1000; i++) {
printf("%d\n", i);
}
fflush(stdout); // Shouldn't be needed as each line ends with "\n"
return 0;
}
Expected Output
The expected output is obviously the numbers 0-999, each on a new line
Actual Output
After executing the code multiple times within CLion, the output often changes:
Sometimes it executes perfectly and shows all the numbers 0-999
Sometimes it cuts off at different points (e.g. 0-840)
Sometimes it doesn't output anything
The return code is always 0!
Running the code in a terminal (i.e. not in CLion itself)
However, the code outputs the numbers 0-999 perfectly when compiling and running the code using the terminal.
I have spent so much time on this thinking it was a problem with my code and a memory issue until I finally realised that this was just an issue with CLion.
OS: Ubuntu 14.04 LTS
Version: 2016.1
Build: #CL-145.258
Update
A suitable workaround is to run the code in debug mode, no breakpoint required (thanks to @olaf).
The consensus is that this is an IDE issue. Therefore, I have reported the bug.
I will update this question, as soon as this bug is fixed.
Update 1
WARNING: You should not change information in registry unless you have been asked specifically by JetBrains. Registry is not in the main menu for a reason! Use the following solution at your own risk!!!
JetBrains have contacted me and provided a suitable solution:
Go to the Find Action Dialog box (CTRL+SHIFT+A)
Search for "Registry..."
Untick run.processes.with.pty
Should then work fine!
Update 2
The bug has been added here:
https://youtrack.jetbrains.com/issue/CPP-6254
Feel free to upvote it!

Removing OPENSSL_cleanse from OpenSSL-1.0.1r

I found out that OPENSSL_cleanse wastes a lot of time in my project. For example, if it runs for 25 seconds, 3 seconds are wasted by OPENSSL_cleanse. I checked the code of this function and decided that it isn't doing anything very useful for me. I know it fills memory with garbage data for security reasons but I don't really care about it. So I decided to place return; just before the start of any operations in this function.
void OPENSSL_cleanse(void *ptr, size_t len)
{
return;
// original OpenSSL code goes here
}
I'm using Mac OS and Xcode. I've compiled the lib and installed it in /Users/ForceBru/Desktop/openssl via the --openssldir option of the Configure script. I've added it to my project in Build Settings->Link Binary With Libraries and added include dirs in Build Settings->Search Paths->Header Search Paths and Build Settings->Search Paths->Library Search Paths.
The project compiled fine, but the time profiler still shows pretty expensive calls to OPENSSL_cleanse.
Edit: the C tag is because OpenSSL is written in C, and the C++ tag is because my code is in C++. Maybe this information will be helpful.
The question is, what am I doing wrong? How do I remove the calls to OPENSSL_cleanse? I think this has to do with linking, because the command line includes -lcrypto, which means this library can actually be taken from anywhere (right?), not necessarily from /Users/ForceBru/Desktop/openssl.
Edit #2: I've edited the linker options to use the .a file in /Users/ForceBru/Desktop/openssl and removed it from Build Settings->Link Binary With Libraries. Still no effect.
It turns out that OpenSSL has lots of assembly code generated by some Perl scripts that are located in the crypto directory (*cpuid.pl). These scripts generate assembly code for the following architectures: alpha, armv4, ia64, ppc, s390x, sparc, x86 and x86_64.
When make runs, the appropriate script fires generating a *cpuid.S (where * is one of the architectures mentioned earlier). These files are compiled into the library and seem to override the OPENSSL_cleanse implemented in crypto/mem_clr.c.
What I had to do is to simply change the body of OPENSSL_cleanse to ret in x86_64cpuid.pl:
.globl OPENSSL_cleanse
.type OPENSSL_cleanse,\@abi-omnipotent
.align 16
OPENSSL_cleanse:
ret
# loads of OPENSSL assembly
.size OPENSSL_cleanse,.-OPENSSL_cleanse
This isn't quite the answer that you were looking for, but it may help you along...
Removing OPENSSL_cleanse from OpenSSL-1.0.1r...
I checked the code of this function and decided that it isn't doing anything very useful for me...
That's probably a bad idea, but we would need to know more about your threat model. Zeroization allows you to deterministically remove sensitive material from memory.
It's also a Certification and Accreditation (C&A) item. For example, FIPS 140-2 requires zeroization even at Level 1.
Also, you can't remove OPENSSL_cleanse per se because OPENSSL_clear_realloc, OPENSSL_clear_free and friends call it. Also see the OPENSSL_cleanse man page.
For example, if it runs for 25 seconds, 3 seconds are wasted by OPENSSL_cleanse
OK, so this is a different problem. OPENSSL_cleanse is kind of convoluted, and it does waste some cycles in an effort to survive the optimization pass.
If you check Commit 380f18ed5f140e0a, then you will see it has been changed in OpenSSL 1.1.0 to the following. Maybe you could use it instead?
diff --git a/crypto/mem_clr.c b/crypto/mem_clr.c
index e6450a1..3389919 100644 (file)
--- a/crypto/mem_clr.c
+++ b/crypto/mem_clr.c
@@ -59,23 +59,16 @@
#include <string.h>
#include <openssl/crypto.h>
-extern unsigned char cleanse_ctr;
-unsigned char cleanse_ctr = 0;
+/*
+ * Pointer to memset is volatile so that compiler must de-reference
+ * the pointer and can't assume that it points to any function in
+ * particular (such as memset, which it then might further "optimize")
+ */
+typedef void *(*memset_t)(void *,int,size_t);
+
+static volatile memset_t memset_func = memset;
void OPENSSL_cleanse(void *ptr, size_t len)
{
- unsigned char *p = ptr;
- size_t loop = len, ctr = cleanse_ctr;
-
- if (ptr == NULL)
- return;
-
- while (loop--) {
- *(p++) = (unsigned char)ctr;
- ctr += (17 + ((size_t)p & 0xF));
- }
- p = memchr(ptr, (unsigned char)ctr, len);
- if (p)
- ctr += (63 + (size_t)p);
- cleanse_ctr = (unsigned char)ctr;
+ memset_func(ptr, 0, len);
}
Also see Issue 455: Reimplement non-asm OPENSSL_cleanse() on OpenSSL's GitHub.
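The idea behind the changed version can be tried outside of OpenSSL. The following is a minimal C++ sketch of the same technique, not OpenSSL's code: the names memset_fn, memset_ptr and cleanse_sketch are hypothetical, and the null/zero-length guard is an addition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using memset_fn = void* (*)(void*, int, std::size_t);

// The pointer itself is volatile, so the compiler has to load it at run
// time and cannot assume that the call is a plain memset it may drop.
static memset_fn volatile memset_ptr = std::memset;

// Hypothetical name; zeroes len bytes starting at ptr.
void cleanse_sketch(void* ptr, std::size_t len) {
    if (ptr == nullptr || len == 0) {
        return;
    }
    memset_ptr(ptr, 0, len);
}
```

Compared with the counter-based loop being removed in the diff, this is a single call to memset through an indirection, so it keeps the zeroization while likely costing far less time.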
How do I remove the calls to OPENSSL_cleanse?
OK, so this is a different problem. You have to locate all callers and do something with each. It looks like there's about 185 places you will need to modify things:
$ cd openssl
$ grep -IR _cleanse * | wc -l
185
Instead of this:
void OPENSSL_cleanse(void *ptr, size_t len)
{
return;
// original OpenSSL code goes here
}
Maybe you can delete the function, and then:
#define OPENSSL_cleanse(x, y)
Then the function call becomes a macro that simply disappears during preprocessing. Be sure to perform a make clean after changing from a function to a macro.
But I would not advise doing so.
The project compiled fine, but the time profiler still shows pretty expensive calls to OPENSSL_cleanse.
My guess here is either (1) you did not perform a make clean after the changes to the OpenSSL library, or (2) you compiled and linked to the wrong version of the OpenSSL library. But I could be wrong on both.
You can see what your executable's runtime dependencies are with otool -L. Make sure it's the expected one. Also keep in mind OpenSSL does not use -install_name.
Before you run your executable, you can set DYLD_LIBRARY_PATH to ensure the dylib you are modifying is loaded. Also see the dyld(1) man pages.

Analyze backtrace of a crash occurring due to a faulty library

In my application I have set up a signal handler to catch segfaults and print backtraces.
My application loads some plugin libraries when the process starts.
If my application crashes with a segfault, due to an error in the main executable binary, I can analyze the backtrace with:
addr2line -Cif -e ./myapplication 0x4...
It accurately displays the function and the source_file:line_no
However, how do I analyze the crash if it occurs due to an error in the plugin, as in the backtrace below?
/opt/myapplication(_Z7sigsegvv+0x15)[0x504245]
/lib64/libpthread.so.0[0x3f1c40f500]
/opt/myapplication/modules/myplugin.so(_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi+0x6af)[0x7f5588fe4bbf]
/opt/myapplication/modules/myplugin.so(_Z11myplugin_reqmodP12CONNECTION_TP7Filebuf+0x68)[0x7f5588fe51e8]
/opt/myapplication(_ZN10Processors7ExecuteEiP12CONNECTION_TP7Filebuf+0x5b)[0x4e584b]
/opt/myapplication(_Z15process_requestP12CONNECTION_TP7Filebuf+0x462)[0x4efa92]
/opt/myapplication(_Z14handle_requestP12CONNECTION_T+0x1c6d)[0x4d4ded]
/opt/myapplication(_Z13process_entryP12CONNECTION_T+0x240)[0x4d79c0]
/lib64/libpthread.so.0[0x3f1c407851]
/lib64/libc.so.6(clone+0x6d)[0x3f1bce890d]
Both my application and plugin libraries have been compiled with gcc and are unstripped.
My application when executed, loads the plugin.so with dlopen
Unfortunately, the crash is occurring at a site where I cannot run the application under gdb.
Googled around frantically for an answer but all sites discussing backtrace and addr2line exclude scenarios where analysis of faulty plugins may be required.
I hope some kind-hearted hacker knows a solution to this dilemma and can share some insights. It would be invaluable for fellow programmers.
Tons of thanks in advance.
Here are some hints that may help you debug this:
The address in your backtrace is an address in the address space of the process at the time it crashed. That means that, if you want to translate it into a 'physical' address relative to the start of your library, you have to subtract the start address of the library's mapping, as shown by pmap, from the address in your backtrace.
Unfortunately, this means that you need a pmap of the process before it crashed. I admittedly have no idea whether loading addresses for libraries on a single system are constant if you close and rerun it (imaginably there are security features which randomize this), but it certainly isn't portable across systems, as you have noticed.
In your position, I would try:
demangling the symbol names with c++filt -n or manually. I don't have a shell right now, so here is my manual attempt: _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi is ICAPSection::process(CONNECTION_T *, Filebuf *, int). This may already be helpful. If not:
use objdump or nm (I'm pretty sure they can do that) to find the address corresponding to the mangled name, then add the offset (+0x6af as per your stacktrace) to this, then look up the resulting address with addr2line.
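If the crash handler itself should print readable names, the demangling step can also be done in code. This is a minimal sketch assuming a GCC or Clang toolchain, where abi::__cxa_demangle from <cxxabi.h> is available. The wrapper name demangle is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the input
// unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // __cxa_demangle allocates the result with malloc
    return result;
}
```

Passing _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi should give the same ICAPSection::process signature as the manual attempt above, possibly with different spacing around the pointer stars. Note that this allocates memory, so calling it from inside a SIGSEGV handler is not strictly safe; post-processing the logged backtrace is the more cautious option.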
us2012's answer was quite the trick required to solve the problem. I am just trying to restate it here just to help any other newbie struggling with the same problem, or if somebody wishes to offer improvements.
In the backtrace it is clearly visible that the flaw exists in the code for myplugin.so. And the backtrace indicates that it exists at:
/opt/myapplication/modules/myplugin.so(_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi+0x6af)[0x7f5588fe4bbf]
The problem of locating the line corresponding to this fault cannot be determined as simplistically as:
addr2line -Cif -e /opt/myapplication/modules/myplugin.so 0x7f5588fe4bbf
The correct procedure here would be to use nm or objdump to determine the address pointing to the mangled name. (Demangling as done by us2012 is not really necessary at this point). So using:
nm -Dlan /opt/myapplication/modules/myplugin.so | grep "_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi"
I get:
0000000000008510 T _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi /usr/local/src/unstable/myapplication/sources/modules/myplugin/myplugin.cpp:518
It is interesting to note that myplugin.cpp:518 actually points to the line where the opening "{" of the function ICAPSection::process(CONNECTION_T *, Filebuf *, int) is located.
Next we add 0x6af to the address (revealed by the nm output above) 0000000000008510 using linux shell command
printf '0x%x\n' $(( 0x0000000000008510 + 0x6af ))
And that results in 0x8bbf
And this is the actual source_file:line_no of the faulty code, and can be precisely determined with addr2line as:
addr2line -Cif -e /opt/myapplication/modules/myplugin.so 0x8bbf
Which displays:
std::char_traits<char>::length(char const*)
/usr/include/c++/4.4/bits/char_traits.h:263
std::string::assign(char const*)
/usr/include/c++/4.4/bits/basic_string.h:970
std::string::operator=(char const*)
/usr/include/c++/4.4/bits/basic_string.h:514
??
/usr/local/src/unstable/myapplication/sources/modules/myplugin/myplugin.cpp:622
I am not too sure why the function name was not displayed here, but myplugin.cpp:622 was quite precisely where the fault was.
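Both routes discussed in this thread come down to simple address arithmetic, which the sketch below spells out. The function names are hypothetical. The load base used in the second route is an assumption: it would have to be read from pmap or /proc/<pid>/maps of the crashed process, and it assumes the library's first segment is mapped at file offset 0.

```cpp
#include <cassert>
#include <cstdint>

// Route 1: address of the symbol inside the .so (from nm) plus the offset
// printed after '+' in the backtrace frame.
std::uint64_t offset_from_symbol(std::uint64_t symbol_addr,
                                 std::uint64_t frame_offset) {
    return symbol_addr + frame_offset;
}

// Route 2: runtime address printed in brackets minus the address at which
// the .so was mapped into the process.
std::uint64_t offset_from_load_base(std::uint64_t runtime_addr,
                                    std::uint64_t load_base) {
    assert(runtime_addr >= load_base);
    return runtime_addr - load_base;
}
```

With the numbers from this question, offset_from_symbol(0x8510, 0x6af) gives 0x8bbf. For the second route to give the same result from 0x7f5588fe4bbf, the load base would have to be 0x7f5588fdc000; that value is inferred by subtraction and does not appear in the original post.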

How to execute a bitcode file with LLVM 3.3?

I'm starting to program with LLVM, and trying to execute a bitcode.
I came up with this code, adapted from old examples (my doubt is about the creation of the MemoryBuffer; getFile(string) does not exist anymore):
string *errorString = new string;
LLVMContext context;
OwningPtr<MemoryBuffer> *mb = new OwningPtr<MemoryBuffer>;
MemoryBuffer::getFileOrSTDIN(argv[1], *mb);
Module *m = ParseBitcodeFile(mb->take(), context, errorString);
ExecutionEngine *ee = EngineBuilder(m).create();
Function *main = m->getFunction("main");
From this line on nothing works (segmentation fault)
1 - "standard" approach?
void * f = ee->getPointerToFunction(main);
void (*FP)() = (void (*)()) f;
2 - lli's approach, not sure about the '0' for envp
vector<string> *argList = new vector<string>;
ee->runFunctionAsMain(main, *argList, 0);
3 - a generalization of 2.
vector<struct GenericValue> *argList = new vector<struct GenericValue>;
ee->runFunction(main, *argList);
The lli tool is your reference here. As an official LLVM tool and part of the repository and releases, it is always up to date with the latest LLVM APIs. The file tools/lli/lli.cpp is just ~500 lines of code, much of it header files, option definitions and comments. The main function contains the exact flow of execution and is cleanly structured and commented.
You can pick one of two approaches:
Start with lli.cpp as is, gradually stripping things you don't need.
Take the relevant parts from lli.cpp into your own main file.
If the problem is rather with your main, you can always find examples of bitcode files that actually run with lli within the LLVM tests - test/ExecutionEngine - most tests there are bitcode files on which lli is invoked and runs successfully.
After running into the same problem as you, I searched through lli.cpp for all non-optional invocations to modules, enginebuilders etc...
I believe what you are missing is a call to "ee->runStaticConstructorsDestructors(false)"
At least, this fixed the issue for me.
Note: This is under LLVM 3.4, but I have verified that the same method also exists in LLVM 3.1, indicating it probably exists in 3.3 as well.