I prepared a C++ interface to a legacy Fortran library.
Some subroutines in the legacy library follow an ugly but usable status code convention to report errors, and I use such status codes to throw a readable exception from my C++ code: it works great.
On the other hand, sometimes the legacy library calls STOP (which terminates the program). And it often does it even though the condition is recoverable.
I would like to capture this STOP from within C++, and so far I have been unsuccessful.
The following code is simple, but exactly represents the problem at hand:
The Fortran legacy library fmodule.f90:
module fmodule
use iso_c_binding
contains
subroutine fsub(x) bind(c, name="fsub")
real(c_double) x
if(x>=5) then
stop 'x >=5 : this kills the program'
else
print*, x
end if
end subroutine fsub
end module fmodule
The C++ Interface main.cpp:
#include<iostream>
// prototype for the external Fortran subroutine
extern "C" {
void fsub(double& x);
}
int main() {
double x;
while(std::cin >> x) {
fsub(x);
}
return 0;
}
The compilation lines (GCC 4.8.1 / OS X 10.7.4; $ denotes command prompt ):
$ gfortran -o libfmodule.so fmodule.f90 -shared -fPIC -Wall
$ g++ main.cpp -L. -lfmodule -std=c++11
The run:
$ ./a.out
1
1.0000000000000000
2
2.0000000000000000
3
3.0000000000000000
4
4.0000000000000000
5
STOP x >=5 : this kills the program
How could I capture the STOP and, say, request another number. Notice that I do not want to touch the Fortran code.
What I have tried:
std::atexit: cannot "come back" from it once I have entered it
std::signal: STOP does not seem to throw a signal which I can capture
You can solve your problem by intercepting the call to the exit function from the Fortran runtime. See below. a.out is created with your code and the compilation lines you give.
Step 1. Figure out which function is called. Fire up gdb
$ gdb ./a.out
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
[...]
(gdb) break fsub
Breakpoint 1 at 0x400888
(gdb) run
Starting program: a.out
5
Breakpoint 1, 0x00007ffff7dfc7e4 in fsub () from ./libfmodule.so
(gdb) step
Single stepping until exit from function fsub,
which has no line number information.
stop_string (string=0x7ffff7dfc8d8 "x >=5 : this kills the programfmodule.f90", len=30) at /usr/local/src/gcc-4.7.2/libgfortran/runtime/stop.c:67
So stop_string is called. We need to know to which symbol this function corresponds.
Step 2. Find the exact name of the stop_string function. It must be in one of the shared libraries.
$ ldd ./a.out
linux-vdso.so.1 => (0x00007fff54095000)
libfmodule.so => ./libfmodule.so (0x00007fa31ab7d000)
libstdc++.so.6 => /usr/local/gcc/4.7.2/lib64/libstdc++.so.6 (0x00007fa31a875000)
libm.so.6 => /lib64/libm.so.6 (0x0000003da4000000)
libgcc_s.so.1 => /usr/local/gcc/4.7.2/lib64/libgcc_s.so.1 (0x00007fa31a643000)
libc.so.6 => /lib64/libc.so.6 (0x0000003da3c00000)
libgfortran.so.3 => /usr/local/gcc/4.7.2/lib64/libgfortran.so.3 (0x00007fa31a32f000)
libquadmath.so.0 => /usr/local/gcc/4.7.2/lib64/libquadmath.so.0 (0x00007fa31a0fa000)
/lib64/ld-linux-x86-64.so.2 (0x0000003da3800000)
I found it in (no surprise) the fortran runtime.
$ readelf -s /usr/local/gcc/4.7.2/lib64/libgfortran.so.3|grep stop_string
1121: 000000000001b320 63 FUNC GLOBAL DEFAULT 11 _gfortran_stop_string##GFORTRAN_1.0
2417: 000000000001b320 63 FUNC GLOBAL DEFAULT 11 _gfortran_stop_string
Step 3. Write a function that will replace that function
I look for the precise signature of the function in the source code (/usr/local/src/gcc-4.7.2/libgfortran/runtime/stop.c see gdb session)
$ cat my_exit.c
#define _GNU_SOURCE
#include <stdio.h>
void _gfortran_stop_string (const char *string, int len)
{
printf("Let's keep on");
}
Step 4. Compile a shared object exporting that symbol.
gcc -Wall -fPIC -c -o my_exit.o my_exit.c
gcc -shared -fPIC -Wl,-soname -Wl,libmy_exit.so -o libmy_exit.so my_exit.o
Step 5. Run the program with LD_PRELOAD so that our new function has precedence over the one form the runtime
$ LD_PRELOAD=./libmy_exit.so ./a.out
1
1.0000000000000000
2
2.0000000000000000
3
3.0000000000000000
4
4.0000000000000000
5
Let's keep on 5.0000000000000000
6
Let's keep on 6.0000000000000000
7
Let's keep on 7.0000000000000000
There you go.
Since what you want would result in non-portable code anyway, why not just subvert the exit mechanism using the obscure long jump mechanism:
#include<iostream>
#include<csetjmp>
#include<cstdlib>
// prototype for the external Fortran subroutine
extern "C" {
void fsub(double* x);
}
volatile bool please_dont_exit = false;
std::jmp_buf jenv;
static void my_exit_handler() {
if (please_dont_exit) {
std::cout << "But not yet!\n";
// Re-register ourself
std::atexit(my_exit_handler);
longjmp(jenv, 1);
}
}
void wrapped_fsub(double& x) {
please_dont_stop = true;
if (!setjmp(jenv)) {
fsub(&x);
}
please_dont_stop = false;
}
int main() {
std::atexit(my_exit_handler);
double x;
while(std::cin >> x) {
wrapped_fsub(x);
}
return 0;
}
Calling longjmp jumps right in the middle of the line with the setjmp call and setjmp returns the value passed as the second argument of longjmp. Otherwise setjmp returns 0. Sample output (OS X 10.7.4, GCC 4.7.1):
$ ./a.out
2
2.0000000000000000
6
STOP x >=5 : this kills the program
But not yet!
7
STOP x >=5 : this kills the program
But not yet!
4
4.0000000000000000
^D
$
No library preloading required (which anyway is a bit more involved on OS X than on Linux). A word of warning though - exit handlers are called in reverse order of their registration. One should be careful that no other exit handlers are registered after my_exit_handler.
Combining the two answers that use a custom _gfortran_stop_string function and longjmp, I thought that raising an exception inside the custom function would be similar, then catch in in the main code. So this came out:
main.cpp:
#include<iostream>
// prototype for the external Fortran subroutine
extern "C" {
void fsub(double& x);
}
int main() {
double x;
while(std::cin >> x) {
try { fsub(x); }
catch (int rc) { std::cout << "Fortran stopped with rc = " << rc <<std::endl; }
}
return 0;
}
catch.cpp:
extern "C" {
void _gfortran_stop_string (const char*, int);
}
void _gfortran_stop_string (const char *string, int len)
{
throw 666;
}
Then, compiling:
gfortran -c fmodule.f90
g++ -c catch.cpp
g++ main.cpp fmodule.o catch.o -lgfortran
Running:
./a.out
2
2.0000000000000000
3
3.0000000000000000
5
Fortran stopped with rc = 666
6
Fortran stopped with rc = 666
2
2.0000000000000000
3
3.0000000000000000
^D
So, seems to work :)
I suggest you fork your process before calling the fortran code and exit 0 (edit: if STOP exits with zero, you will need a sentinel exit code, clanky but does the job) after the fortran execution. That way every fortran call will finish in the same way: the same as if it had stopped. Or, if "STOP" ensure an error, throw the exception when the fortran code stops and send some other message when the fortran execution "completes" normaly.
Below is an example inspire from you code assuming a fortran "STOP" is an error.
int main() {
double x;
pid_t pid;
int exit_code_normal = //some value that is different from all STOP exit code values
while(std::cin >> x) {
pid = fork();
if(pid < 0) {
// error with the fork handle appropriately
} else if(pid == 0) {
fsub(x);
exit(exit_code_normal);
} else {
wait(&status);
if(status != exit_code_normal)
// throw your error message.
}
}
return 0;
}
The exit code could be a constant instead of a variable. I don't think it matters much.
Following a comment, it occurs that the result from the execution would be lost if it sits in the memory of the process (rather than, say, write to a file). If it is the case, I can think of 3 possibilities:
The fortran code messes a whole lot of memory during the call and letting the execution continue beyond the STOP is probably not a good idea in the first place.
The fortran code simply return some value (through it's argument if my fortran is not too rusty) and this could be relayed back to the parent easily through a shared memory space.
The execution of the fortran subroutine acts on an external system (ex: writes to a file) and no return values are expected.
In the 3rd case, my solution above works as is. I prefer it over some other suggested solution mainly because: 1) you don't have to ensure the build process is properly maintained 2) fortran "STOP" still behave as expected and 3) it requires very few lines of code and all the "fortran STOP workaround" logic sits in one single place. So in terms of long term maintenance, I much prefer that.
In the 2nd case, my code above needs small modification but still holds the advantages enumerated above at the price of minimal added complexity.
In the 1st case, you will have to mess with the fortran code no matter what.
Related
I'm trying to debug a C++ program. I'm on macOS, using CLion IDE, clang compiler, LLDB. I stop the program at a breakpoint (marked with >>):
UnicodeString unicodeFromFile(const std::string &file) {
std::ifstream input(file, std::ios::binary);
cout << "open? " << input.is_open() << endl;
>> std::vector<char> bytes(...);
I want to run po input.is_open(). LLDB reports:
(lldb) po input.is_open()
error: expression failed to parse:
error: <user expression 0>:1:7: call to member function 'is_open' is ambiguous
input.is_open()
~~~~~~^~~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.3.sdk/usr/include/c++/v1/fstream:1169:10: candidate function
bool is_open() const;
^
But it's obviously not ambiguous, since the previous line of code calls it without problem. And it's error message is only listing one function.
Previous to this, I had a different problem where it couldn't find the symbol at all. But I used the advice here, which got me to this point. That answer says to add this to to ~/.lldbinit:
settings set target.import-std-module true
My previous errors looked like:
(lldb) po input.is_open()
error: expression failed to parse:
error: Couldn't lookup symbols:
__ZNKSt3__114basic_ifstreamIcNS_11char_traitsIcEEE7is_openEv
When I try this trivial case with the most recent Xcode (or with TOT lldb) calling the is_open works:
> cat tryifstream.cpp
#include <fstream>
int
main() {
std::ifstream my_str("/tmp/tryifstream.cpp");
bool is_it = my_str.is_open();
return is_it ? 0 : 20;
}
> clang++ -g -O0 /tmp/tryifstream.cpp -o /tmp/a.out
> lldb /tmp/a.out
(lldb) b s -p return
Breakpoint 1: where = a.out`main + 88 at tryifstream.cpp:7:10, address = 0x000000010000379c
(lldb) run
Process 38065 launched: '/tmp/a.out' (arm64)
Process 38065 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x000000010000379c a.out`main at tryifstream.cpp:7
4 main() {
5 std::ifstream my_str("/tmp/tryifstream.cpp");
6 bool is_it = my_str.is_open();
-> 7 return is_it ? 0 : 20;
^
8 }
Target 0: (a.out) stopped.
(lldb) v is_it
(bool) is_it = true
(lldb) expr my_str.is_open()
(bool) $0 = true
If this simple case fails for you, you might try upgrading your Xcode. Getting the lldb expression parser to handle all of std templates is an area of ongoing work and newer lldb's are generally better at it.
If that simple case works for you, but your "real" case still doesn't, and you can make your project or some reduced reproducer available to us, please file a bug with the llvm.org bug reporter, including a copy of your reproducer. It will be hard to figure out what's going wrong in this case w/o being able to trace through the lookups.
I would like to learn if existing GDB for RISC-V supports Program Context aware breakpoints?
By program context aware breakpoints : I mean, when there is JAL or JALR instruction PC changes when there is a function call. in other cases in Function call ==> PC = PC + (Current Program Counter + 4)
in Function Return : PC = PC - (Return address (ra register value) ).
I have installed fedora(risc-V) on my ubuntu(virtual machine). Since it is virtual machine I can't print PC register value, that is why I couldn't check if it supports Program Context aware breakpoint or not?
My second question is : How can I print PC register value on my qemu risc-v virtual machine?
#include<stdio.h>
int check_prime(int a)
{
int c;
for (c=2;c<a;c++)
{
if (a%c == 0 ) return 0;
if (c == a-1 ) return 1;
}
}
void oddn(int a)
{
printf("oddn --> %d is an odd number \n",a);
if (check_prime(a)) printf("oddn --> %d is a prime number\n",a);
}
int main()
{
int a;
a=7;
if (check_prime(a)) printf("%d is a prime number \n",a);
if (a%2==1) oddn(a);
}
This is the program I am trying to breakpoint using GDB.
As you see on the picture it breaks twice(which should break once only).
It also gives error :
Error in testing breakpoint condition:
Invalid data type for function to be called
What you're looking for is documented here:
https://sourceware.org/gdb/current/onlinedocs/gdb/Convenience-Funs.html#index-_0024_005fstreq_002c-convenience-function
You should look at $_caller_is, $_caller_matches, $_any_caller_is, and $_any_caller_matches.
As an example, to check if the immediate caller is a particular function we could do this:
break functionD if ($_caller_is ("functionC"))
Then main -> functionD will not trigger the breakpoint, while main -> functionC -> functionD will trigger the breakpoint.
The convenience functions I listed all take a frame offset that can be used to specify which frame GDB will check (for $_caller_is and $_caller_matches) or to limit the range of frames checked (for $_any_caller_is and $_any_caller_matches).
I have a small function in written Haskell with the following type:
foreign export ccall sget :: Ptr CInt -> CSize -> Ptr CSize -> IO (Ptr CInt)
I am calling this from multiple C++ threads running concurrently (via
TBB). During this part of the execution of my program I can barely get a
load average above 1.4 even though I'm running on a six-core CPU (12
logical cores). I therefore suspect that either the calls into Haskell all
get funnelled through a single thread, or there is some significant
synchronization going on.
I am not doing any such thing explicitly, all the function does is operate
on the incoming data (after storing it into a Data.Vector.Storable), and
return the result back as a newly allocated array (from Data.Marshal.Array).
Is there anything I need to do to fully enable concurrent calls like this?
I am using GHC 8.6.5 on Debian Linux (bullseye/testing), and I am compiling with -threaded -O2.
Looking forward to reading some advice,
Sebastian
Using the simple example at the end of this answer, if I compile with:
$ ghc -O2 Worker.hs
$ ghc -O2 -threaded Worker.o caller.c -lpthread -no-hs-main -o test
then running it with ./test occupies only one core at 100%. I need to run it with ./test +RTS -N, and then on my 4-core desktop, it runs at 400% with a load average of around 4.0.
So, the RTS -N flag affects the number of parallel threads that can simultaneously run an exported Haskell function and there is no special action required (other than compiling with -threaded and running with +RTS -n) to fully utilize all available cores.
So, there must be something about your example that's causing the problem. It could be contention between threads over some shared data structure. Or, maybe parallel garbage collection is causing problems; I've observed parallel GC causing worse performance with increasing -N in a simple test case (details forgotten, sadly), so you could try turning off parallel GC with -qg or limiting the number of cores involved with -qn2 or something. To enable these options, you need to call hs_init_with_rtsopts() in place of the usual hs_init() as in my example.
If that doesn't work, I think you'll have to try to narrow down the problem and post a minimal example that illustrates the performance issue to get more help.
My example:
caller.c
#include "HsFFI.h"
#include "Rts.h"
#include "Worker_stub.h"
#include <pthread.h>
#define NUM_THREAD 4
void*
work(void* arg)
{
for (;;) {
fibIO(30);
}
}
int
main(int argc, char **argv)
{
hs_init_with_rtsopts(&argc, &argv);
pthread_t threads[NUM_THREAD];
for (int i = 0; i < NUM_THREAD; ++i) {
int rc = pthread_create(&threads[i], NULL, work, NULL);
}
for (int i = 0; i < NUM_THREAD; ++i) {
pthread_join(threads[i], NULL);
}
hs_exit();
return 0;
}
Worker.hs
module Worker where
import Foreign
fibIO :: Int -> IO Int
fibIO = return . fib
fib :: Int -> Int
fib n | n > 1 = fib (n-1) + fib (n-2)
| otherwise = 1
foreign export ccall fibIO :: Int -> IO Int
Lets say we have a structure with some variables.
Is it possible to values of those variables at a particular point of execution..?
One way might be to print each of them individually.
But my point is, is there a way to check values of all the variables in that structure at a particular point of time, without having to use printf or cout to print each variable value..?
Just wondering if this is possible atleast in gdb..!!
it is possible in gdb, no problem:
For example:
x.C
#include <iostream>
struct A {
int x;
int y;
};
int main(int argc,char **argv) {
A a;
a.x=10;
a.y=11;
std::cout << "Hello world" << std::endl;
}
compiling:
g++ -g -o x x.C
running on gdb
gdb x
(gdb) break main
Breakpoint 1 at 0x40096c: file x.C, line 10.
(gdb) run
Starting program: /home/jsantand/x
Breakpoint 1, main (argc=1, argv=0x7fffffffde98) at x.C:10
10 a.x=10;
(gdb) next
11 a.y=11;
(gdb) next
12 std::cout << "Hello world" << std::endl;
(gdb) print a
$1 = {x = 10, y = 11}
(gdb) quit
Doing that on your code, traces, etc... it will be be more difficult as C++ lacks reflection.
You could do it by hand or if you're adventurous, create something to generate automatically operator<< for your classes an structs/classes so that they provide a string representation. You need some basic C++ parser at least.
Last week I was a debugging a code and a weird situation came up: gdb passes through two different return clauses. I made a simple example that illustrates the situation:
#include <iostream>
using namespace std;
int test() {
string a = "asd";
string b = "asd";
while (true) {
if (a == b) {
return 0;
}
}
return -1;
}
int main() {
int result = test();
cout << "result: " << result << endl;
}
When debugging the code I got:
(gdb) b main
Breakpoint 1 at 0x1d4c: file example.cpp, line 19.
(gdb) r
Starting program: /Users/yuppienet/temp/a.out
Reading symbols for shared libraries +++. done
Breakpoint 1, main () at example.cpp:19
19 int result = test();
(gdb) s
test () at example.cpp:7
7 string a = "asd";
(gdb) n
8 string b = "asd";
(gdb) n
11 if (a == b) {
(gdb) n
12 return 0;
(gdb) n
15 return -1;
(gdb) n
16 }
(gdb) n
main () at example.cpp:20
20 cout << "result: " << result << endl;
(gdb) n
result: 0
21 }
(gdb) n
0x00001ab2 in start ()
I noted that even if gdb shows line 15, the return value is 0 (the finish command confirms this as well).
So the question is: why does gdb show line 15: return -1, even if the function is not really returning this value?
Thanks!
Edit:
I forgot to mention that I compiled with the following line:
g++ -Wall -pedantic -g -pg example.cpp
I suspect you're seeing the function epilogue. Your two strings have destructors, which are being implicitly called on return. Check out what the disassembly says to be sure, but I suspect that both return statements are mapping to something along the lines of:
stash return_value;
goto epilogue;
and correspondingly:
epilogue:
destroy a; // on the stack, so destructor will be called
destroy b;
really_return(stashed value);
The epilogue appears to come from line 15 as a side-effect of how g++ does line numbering - a fairly simple format, really just a list of tags of the form "address X comes from line number Y" - and so it's reporting 15 as the closest match. Confusing in this case, but correct a lot of the time.
Probably because the program counter register passes through the instructions that best map to the final return, i.e. the function's exit sequence. The actual return value is probably kept in a register, so the first return just loads the proper value and jumps to the end of the function, and then that address is "back-mapped" to the source code line of the final return.
You don't say, but if you compiled with optimization that is exactly the kind of behavior you would see in gdb. You see the first line setting up the return value, and then it jumps to the real return instruction but in C++ you see the entire thing including the return value.