Understanding Clang CFG for try/catch statements - llvm

I'm trying to understand Clang's CFG by looking at its dumped output and it's unclear to me how try/catch statements are represented in the CFG.
Consider this little snippet:
int func(int x);
int func2(int x) {
try {
return func(x);
} catch(...) {
return 0;
}
}
The dumped CFG is the following:
$ clang++ -Xclang -analyze -Xclang -analyzer-checker=debug.DumpCFG -fsyntax-only test.cpp
int func2(int x)
[B4 (ENTRY)]
Succs (1): B3
[B1]
T: try ...
Succs (1): B2
[B2]
catch (...):
1: catch (...) {
[B2.3]}
2: 0
3: return [B2.2];
Preds (1): B1
Succs (1): B0
[B3]
1: func
2: [B3.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(int))
3: x
4: [B3.3] (ImplicitCastExpr, LValueToRValue, int)
5: [B3.2]([B3.4])
6: return [B3.5];
Preds (1): B4
Succs (1): B0
[B0 (EXIT)]
Preds (2): B2 B3
I do not understand how the B1 basic block is linked to the others. The entry block seemingly jumps directly to B3, which contains the body of the try statement, and B3 has the exit block as its only successor. So B1 and B2 seem to be disconnected from the main flow of the function.
How do I have to interpret the CFG in this case?

Clang's CFG support for try statements isn't complete.
12 years ago, Clang used to add an edge from every function call to these exception blocks. But https://github.com/llvm/llvm-project/commit/04c6851cd6053c638e68bf1d7b99dda14ea267fb undid that in the name of build performance (exceptions aren't only difficult to reason about for humans), and that's how things still mostly look today: https://github.com/llvm/llvm-project/blob/d677a7cb056b17145a50ec8ca2ab6d5f4c494749/clang/lib/Analysis/CFG.cpp#L2636
There's a bool to turn these edges on, but it's not hooked up to any clang command-line flag.
So these CFG blocks for try statements are mostly not used. They have the catch blocks as successors, and reachable-code analysis just treats them as additional roots (essentially assuming that it's always possible for something in the try block to throw). (See mentions of "CXXTryStmt" in https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/AnalysisBasedWarnings.cpp)
The one thing that does add an edge to a CFG block for a try statement is an explicit throw in the try body:
% cat foo.cc
int func() {
try {
throw 0;
} catch(...) {
return 0;
}
}
% clang++ -Xclang -analyze -Xclang -analyzer-checker=debug.DumpCFG -fsyntax-only foo.cc
int func()
[B4 (ENTRY)]
Succs (1): B3
[B1]
T: try ...
Preds (1): B3
Succs (1): B2
[B2]
catch (...):
1: catch (...) {
[B2.3]}
2: 0
3: return [B2.2];
Preds (1): B1
Succs (1): B0
[B3]
1: 0
2: throw [B3.1]
Preds (1): B4
Succs (1): B1
[B0 (EXIT)]
Preds (1): B2

gdb breakpoint at the end of void function [duplicate]

I have a C++ function with many return statements in various places. How can I set a breakpoint at the return statement where the function actually returns?
And what does the "break" command without an argument mean?
Contrary to the answers so far, most compilers will create a single return assembly instruction, regardless of how many return statements are in the function (it is convenient for the compiler to do that, so there is only a single place to perform all the stack frame cleanup).
If you want to stop on that instruction, all you have to do is disas the function, look for retq (or whatever the return instruction for your processor is), and set a breakpoint on it. For example:
int foo(int x)
{
switch(x) {
case 1: return 2;
case 2: return 3;
default: return 42;
}
}
int main()
{
return foo(0);
}
(gdb) disas foo
Dump of assembler code for function foo:
0x0000000000400448 <+0>: push %rbp
0x0000000000400449 <+1>: mov %rsp,%rbp
0x000000000040044c <+4>: mov %edi,-0x4(%rbp)
0x000000000040044f <+7>: mov -0x4(%rbp),%eax
0x0000000000400452 <+10>: mov %eax,-0xc(%rbp)
0x0000000000400455 <+13>: cmpl $0x1,-0xc(%rbp)
0x0000000000400459 <+17>: je 0x400463 <foo+27>
0x000000000040045b <+19>: cmpl $0x2,-0xc(%rbp)
0x000000000040045f <+23>: je 0x40046c <foo+36>
0x0000000000400461 <+25>: jmp 0x400475 <foo+45>
0x0000000000400463 <+27>: movl $0x2,-0x8(%rbp)
0x000000000040046a <+34>: jmp 0x40047c <foo+52>
0x000000000040046c <+36>: movl $0x3,-0x8(%rbp)
0x0000000000400473 <+43>: jmp 0x40047c <foo+52>
0x0000000000400475 <+45>: movl $0x2a,-0x8(%rbp)
0x000000000040047c <+52>: mov -0x8(%rbp),%eax
0x000000000040047f <+55>: leaveq
0x0000000000400480 <+56>: retq
End of assembler dump.
(gdb) b *0x0000000000400480
Breakpoint 1 at 0x400480
(gdb) r
Breakpoint 1, 0x0000000000400480 in foo ()
(gdb) p $rax
$1 = 42
You can use reverse debugging to find out where the function actually returns. Start recording, finish executing the current frame, then do a reverse-step; you should stop at the return statement that just executed.
(gdb) record
(gdb) fin
(gdb) reverse-step
Break on all retq of current function
This Python command puts a breakpoint on every retq instruction of the current function:
class BreakReturn(gdb.Command):
    def __init__(self):
        super().__init__(
            'break-return',
            gdb.COMMAND_RUNNING,
            gdb.COMPLETE_NONE,
            False
        )
    def invoke(self, arg, from_tty):
        frame = gdb.selected_frame()
        # TODO make this work if there is no debugging information, where .block() fails.
        block = frame.block()
        # Find the function block in case we are in an inner block.
        while block:
            if block.function:
                break
            block = block.superblock
        start = block.start
        end = block.end
        arch = frame.architecture()
        pc = gdb.selected_frame().pc()
        instructions = arch.disassemble(start, end - 1)
        for instruction in instructions:
            if instruction['asm'].startswith('retq '):
                gdb.Breakpoint('*{}'.format(instruction['addr']))
BreakReturn()
Source it with:
source gdb.py
and use the command as:
break-return
continue
You should now be at retq.
Step until retq
Just for fun, another implementation that stops when a retq is found (less efficient because there is no hardware support):
class ContinueReturn(gdb.Command):
    def __init__(self):
        super().__init__(
            'continue-return',
            gdb.COMMAND_RUNNING,
            gdb.COMPLETE_NONE,
            False
        )
    def invoke(self, arg, from_tty):
        thread = gdb.inferiors()[0].threads()[0]
        while thread.is_valid():
            gdb.execute('ni', to_string=True)
            frame = gdb.selected_frame()
            arch = frame.architecture()
            pc = gdb.selected_frame().pc()
            instruction = arch.disassemble(pc)[0]['asm']
            if instruction.startswith('retq '):
                break
ContinueReturn()
This will ignore your other breakpoints. TODO: can be avoided?
Not sure if it is faster or slower than reverse-step.
A version that stops at a given opcode can be found at: https://stackoverflow.com/a/31249378/895245
break without arguments stops execution at the next instruction in the currently selected stack frame. You select stack frames via the frame, up, and down commands. If you want to debug the point where you are actually leaving the current function, select the next outer frame and break there.
rr reverse debugging
Similar to GDB record mentioned at https://stackoverflow.com/a/3649698/895245 , but much more functional as of GDB 7.11 vs rr 4.1.0 in Ubuntu 16.04.
Notably, it deals with AVX correctly:
gdb reverse debugging fails with "Process record does not support instruction 0xf0d at address"
"target record-full" in gdb makes "n" command fail on printf with "Process record does not support instruction 0xc5 at address 0x7ffff7dee6e7"?
which prevents it from working with the default standard library calls.
Install on Ubuntu 16.04:
sudo apt-get install rr linux-tools-common linux-tools-generic linux-cloud-tools-generic
sudo cpupower frequency-set -g performance
But also consider compiling from source to get the latest updates; it was not hard.
Test program:
int where_return(int i) {
if (i)
return 1;
else
return 0;
}
int main(void) {
where_return(0);
where_return(1);
}
compile and run:
gcc -O0 -ggdb3 -o reverse.out -std=c89 -Wextra reverse.c
rr record ./reverse.out
rr replay
Now you are left inside a GDB session, and you can properly reverse debug:
(rr) break main
Breakpoint 1 at 0x56057c458619: file a.c, line 9.
(rr) continue
Continuing.
Breakpoint 1, main () at a.c:9
9 where_return(0);
(rr) step
where_return (i=0) at a.c:2
2 if (i)
(rr) finish
Run till exit from #0 where_return (i=0) at a.c:2
main () at a.c:10
10 where_return(1);
Value returned is $1 = 0
(rr) reverse-step
where_return (i=0) at a.c:6
6 }
(rr) reverse-step
5 return 0;
We are now on the correct return line.
If you can change the source code, you might use some dirty trick with the preprocessor:
void on_return() {
}
#define return return on_return(), /* If the function has a return value != void */
#define return return on_return() /* If the function has a return value == void */
/* <<<-- Insert your function here -->>> */
#undef return
Then set a breakpoint to on_return and go one frame up.
Attention: This will not work if a function does not return via a return statement, so ensure that its last line is a return.
Example (shamelessly copied from C code, but will work also in C++):
#include <stdio.h>
/* Dummy function to place the breakpoint */
void on_return(void) {
}
#define return return on_return()
void myfun1(int a) {
if (a > 10) return;
printf("<10\n");
return;
}
#undef return
#define return return on_return(),
int myfun2(int a) {
if (a < 0) return -1;
if (a > 0) return 1;
return 0;
}
#undef return
int main(void)
{
myfun1(1);
myfun2(2);
}
The first macro will change
return;
to
return on_return();
Which is valid, since on_return also returns void.
The second macro will change
return -1;
to
return on_return(), -1;
Which will call on_return() and then return -1 (thanks to the ,-operator).
This is a very dirty trick, but unlike backwards-stepping, it will also work in multi-threaded environments and inlined functions.
Break without argument sets a breakpoint at the current line.
There is no way for a single breakpoint to catch all return paths. Either set a breakpoint at the caller immediately after it returns, or break at all return statements.
Since this is C++, I suppose you could create a local sentry object, and break on its destructor, though.

Create an Arduino ESP8266 library

I would like to create an Arduino library for an ESP8266 or ESP32 microcontroller. I wrote a test library which runs on an Arduino Nano board with no problem. Here is the library cpp file:
#include "Test.h"
Test::Test(){
}
uint32_t Test::libTest(strcttest* t){
uint32_t w;
w = t->a;
return w;
}
Here's the header file:
#include <Arduino.h>
typedef struct {
uint32_t a;
uint32_t b;
}strcttest;
class Test
{
public:
Test();
uint32_t libTest(strcttest* t);
private:
};
And last but not least the Arduino ino file:
#include <Test.h>
//Instantiate Test
Test t;
void setup() {
// put your setup code here, to run once:
Serial.begin(9600);
Serial.println("Start");
}
void loop() {
// put your main code here, to run repeatedly:
//Create structure
strcttest *tt;
tt->a=1;
tt->b=2;
//Output result
Serial.println (t.libTest(tt));
delay(1000);
}
Everything compiles fine for the Arduino Nano board as well as for the ESP8266/ESP32 boards. When I run it on the Nano board I get the expected result:
Start
1
1
1
1
1
1
1
1
1
...
When I run it on the ESP8266 board I get the following crash result:
l*⸮⸮⸮⸮CI>⸮⸮⸮HB⸮⸮Start
Exception (28):
epc1=0x402024f8 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000
ctx: cont
sp: 3ffef7d0 end: 3ffef9a0 offset: 01a0
>>>stack>>>
3ffef970: feefeffe 00000000 3ffee950 40201eb4
3ffef980: feefeffe feefeffe 3ffee96c 40202340
3ffef990: feefeffe feefeffe 3ffee980 40100108
<<<stack<<<
7!a!*6⸮⸮⸮Start
Exception (28):
epc1=0x402024f8 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000
ctx: cont
sp: 3ffef7d0 end: 3ffef9a0 offset: 01a0
>>>stack>>>
3ffef970: feefeffe 00000000 3ffee950 40201eb4
3ffef980: feefeffe feefeffe 3ffee96c 40202340
3ffef990: feefeffe feefeffe 3ffee980 40100108
<<<stack<<<
ĜBs⸮`⸮"⸮⸮Start
...
And last but not least on the ESP Development board I receive:
i%M/⸮`⸮i%M7
⸮⸮%Q=qU=\Md⸮aGd<$⸮Start
Guru Meditation Error of type LoadProhibited occurred on core 1. Exception was unhandled.
Register dump:
PC : 0x400dde93 PS : 0x00060030 A0 : 0x800d0570 A1 : 0x3ffc7390
A2 : 0x3ffc1c30 A3 : 0x00000000 A4 : 0x0800001c A5 : 0xffffffff
A6 : 0xffffffff A7 : 0x00060d23 A8 : 0x800832e9 A9 : 0x3ffc7380
A10 : 0x00000003 A11 : 0x00060023 A12 : 0x00060020 A13 : 0x00000003
A14 : 0x00000001 A15 : 0x00000000 SAR : 0x0000001f EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000 LBEG : 0x400014fd LEND : 0x4000150d LCOUNT : 0xffffffff
Backtrace: 0x400dde93:0x3ffc7390 0x400d0570:0x3ffc73b0 0x400d79b0:0x3ffc73d0
CPU halted.
So my question is: what am I doing wrong? What am I missing? What haven't I understood about the Arduino IDE, C++, pointers, etc.?
Sorry, I forgot: I use Arduino IDE 1.8.2.
strcttest *tt; is your problem. You're not creating an object of type strcttest or allocating memory for one; you're merely allocating memory for a pointer to an object of that type. The pointer is never initialized, so the line tt->a=1; writes through an arbitrary address, which is undefined behaviour and can crash on any board. The fact that it doesn't crash on the Nano is basically dumb luck.
Think of the case where you have a char* variable and then try to copy a string to it: that can crash too, since you don't have any storage space for the string itself; you only have a few bytes allocated that store the address of the string.
The following is a more reasonable implementation of your void loop() function:
void loop() {
// put your main code here, to run repeatedly:
//Create structure
strcttest tt;
tt.a=1;
tt.b=2;
//Output result
Serial.println (t.libTest(&tt));
delay(1000);
}
Another (slower, due to use of new and delete) implementation may look like this:
void loop() {
// put your main code here, to run repeatedly:
//Create structure
strcttest *tt = new strcttest;
tt->a=1;
tt->b=2;
//Output result
Serial.println (t.libTest(tt));
delete tt;
delay(1000);
}
For the ESP32 and ESP8266 there is an excellent tool to help in crash situations like the one you report.
It integrates into the Arduino IDE.
See: https://github.com/me-no-dev/EspExceptionDecoder

Interfacing C++ with Rust - returning CString panics

I am trying to call some functions written in Rust from C++. So far I've been quite successful but I still have one little problem with a CString-related panic during runtime.
The function hello is supposed to take an input string, concatenate it with some other string and return the product.
Here's my fun.rs:
use std::ffi::CString;
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
a + b
}
#[no_mangle]
pub extern "C" fn hello(cs: CString) -> CString {
let slice = cs.to_str().unwrap();
let mut s = "Hello, ".to_string();
s = s + slice;
CString::new(&s[..]).unwrap() // runtime error
// CString::new(cs).unwrap() // empty string if no borrow
// cs // works if no borrow, but this is not what I meant
}
Here's main.cpp:
#include <iostream>
using namespace std;
extern "C" {
int add(int a, int b);
const char* hello(const char*x);
}
int main()
{
int a, b;
cin >> a >> b;
cout << add(a,b) << ";" << hello("Pawel") << std::endl;
return 0;
}
And makefile:
rust:
rustc --crate-type=staticlib -C panic=abort fun.rs
cpp:
g++ -c main.cpp
link:
g++ main.o -L . libfun.a -o main -lpthread -ldl -lgcc_s -lc -lm -lrt -lutil
Commands to run executable:
$ make rust
$ make cpp
$ make link
$ ./main
1 2
Executable output:
1 2
thread '<unnamed>' panicked at 'index 18446744073709551615 out of range for slice of length 0', ../src/libcore/slice.rs:549
note: Run with `RUST_BACKTRACE=1` for a backtrace..
Backtrace:
stack backtrace:
1: 0x435d4f - std::sys::backtrace::tracing::imp::write::h46e546df6e4e4fe6
2: 0x44405b - std::panicking::default_hook::_$u7b$$u7b$closure$u7d$$u7d$::h077deeda8b799591
3: 0x443c8f - std::panicking::default_hook::heb8b6fd640571a4f
4: 0x4099fe - std::panicking::rust_panic_with_hook::hd7b83626099d3416
5: 0x4442a1 - std::panicking::begin_panic::h941ea76fc945d925
6: 0x40b74a - std::panicking::begin_panic_fmt::h30280d4dd3f149f5
7: 0x44423e - rust_begin_unwind
8: 0x451d8f - core::panicking::panic_fmt::h2d3cc8234dde51b4
9: 0x452073 - core::slice::slice_index_len_fail::ha4faf37254d75f20
10: 0x40e903 - std::ffi::c_str::CStr::to_str::ha9642252376bab15
11: 0x4048e0 - hello
12: 0x40476f - main
13: 0x7f78ff688f44 - __libc_start_main
14: 0x404678 - <unknown>
15: 0x0 - <unknown>
Any ideas why Rust is panicking?
Rust's CString is not compatible with C's const char *. Here's the definition of CString from the standard library:
pub struct CString {
inner: Box<[u8]>,
}
This Box<[u8]> type is a fat pointer, i.e. a tuple that contains a pointer to the slice's items and the length of the slice as a usize.
What you should do instead is make your Rust function take a *const c_char argument and then call CStr::from_ptr with that pointer as the argument to obtain a CStr value.
As for the return value, there's a bit of a problem: your function allocates a new string and then returns a pointer to it. Again, you should return a *const c_char, which you can do by calling CString::into_raw on your concatenated CString value. But to avoid memory leaks, you must also provide a Rust function that will take back a pointer returned by hello and call CString::from_raw on it, which will recreate the CString. The CString's destructor will then run, freeing the memory.

Exposing goto labels to symbol table

I want to know whether it is possible to expose a goto label within a function to the symbol table in C/C++.
For instance, I want the ret label in the following snippet to appear in the symbol table so that it can be looked up using standard APIs such as dlsym().
Thanks in advance for your help!
#include <stdio.h>
int main () {
void *ret_p = &&ret;
printf("ret: %p\n", ret_p);
goto *ret_p;
return 1;
ret:
return 0;
}
Thanks to Marc Glisse's comment about using inline asm to define a label, I came up with a workaround for the question. The following snippet shows how I solved the problem.
#include <stdio.h>
int main () {
void *ret_p = &&ret;
printf("ret: %p\n", ret_p);
goto *ret_p;
return 1;
ret:
asm("RET:");
return 0;
}
This will add a symbol table entry as follows.
jikk#sos15-32:~$ gcc -Wl,--export-dynamic t.c -ldl
jikk#sos15-32:~$ readelf -s a.out
39: 08048620 0 FUNC LOCAL DEFAULT 13 __do_global_ctors_aux
40: 00000000 0 FILE LOCAL DEFAULT ABS t.c
41: 0804858a 0 NOTYPE LOCAL DEFAULT 13 RET
42: 08048612 0 FUNC LOCAL DEFAULT 13 __i686.get_pc_thunk.bx
43: 08049f20 0 OBJECT LOCAL DEFAULT 19 __DTOR_END__
jikk#sos15-32:~$ ./a.out
ret: 0x804858a
I'll test this workaround further to verify whether it produces any unexpected side effects.
Thanks