GNU Fortran and C interoperability - c++

I have a large, mixed C/Fortran code base, currently compiled using the Intel tools on Windows. I've been asked to port it to the GNU tools on Linux. More or less at random, I've selected version 4.8.
Where a C function is called from Fortran, the interoperability often looks like this:
// C code:
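#include <stdio.h>

void PRINTSTR(char *str, size_t len) {
    /* str is not null-terminated; len is the hidden length supplied by Fortran */
    for (size_t ii = 0; ii < len; ii++) {
        putchar(str[ii]);
    }
    putchar('\n');
}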
!Fortran code:
program test
implicit none
call printstr("Hello, world.")
end
The Intel Fortran compiler always generates upper-case symbols, so this works fine. But the GNU Fortran compiler always generates lower-case symbols and so there is a linker error.
The GNU Fortran compiler used to have an option called -fcase-upper which made it generate upper-case symbols, but it seems this was too configurable for everyone's good and it has been removed (I'm not exactly sure when).
It's possible to use the ISO_C_BINDING facility to force the compiler to generate a case-sensitive name:
program test
interface
subroutine printstr(str) bind(C, name='PRINTSTR')
character :: str(*)
end subroutine
end interface
call printstr("Hello, world.")
end
This resolves the linker error but it changes how string parameters are handled; the length parameter is no longer provided. So to use this method, I'd not only have to add interface definitions for every function that currently works this way, but I'd also have to change how strings are handled in every call to such a function, making sure that all strings are null-terminated.
I could go through and make all such functions lower-case, but of course the Intel compiler still generates upper-case symbols, so that would break the existing build.
Since there are ~2,000 such functions, that seems an infeasible amount of work. So, my question is this: How can I resolve the link errors without changing the function call semantics and without breaking the existing build using the Intel compilers?

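To solve the linker error you can do it the other way around: use the Intel compiler option /names:lowercase to convert external names to lowercase, matching the GNU Fortran default, and convert the name in the C code to lowercase too. Note that gfortran also appends an underscore by default, so a name without one additionally needs -fno-underscoring: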
void printstr(char *str, size_t len) {...}
Personally I would recommend using -funderscoring and Intel's /assume:underscore to distinguish functions that are intended for interoperability.
// C code:
void printstr_(char *str, size_t len) {...}
!Fortran code:
program test
implicit none
call printstr("Hello, world.")
end
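A minimal sketch of the C side under the trailing-underscore convention, assuming the Fortran compiler passes the string length as a hidden trailing argument by value, as in the question. fstr_to_cstr is a hypothetical helper, not part of either code base; it makes a null-terminated copy so the rest of the C code can use ordinary string functions. The type of the hidden length is compiler dependent (gfortran releases before 8 pass an int rather than a size_t), so treat size_t here as an assumption to check against the compiler's documentation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy a Fortran string (not null-terminated, length
   passed separately) into a null-terminated C buffer of size outsz.
   Returns the number of characters copied; longer input is truncated. */
size_t fstr_to_cstr(const char *fstr, size_t flen, char *out, size_t outsz)
{
    if (outsz == 0) {
        return 0;
    }
    size_t n = flen < outsz - 1 ? flen : outsz - 1;
    memcpy(out, fstr, n);
    out[n] = '\0';
    return n;
}

/* Lower-case name with a trailing underscore, as in the answer above.
   The 256-byte buffer is an arbitrary limit chosen for this sketch. */
void printstr_(char *str, size_t len)
{
    char buf[256];
    fstr_to_cstr(str, len, buf, sizeof buf);
    puts(buf);
}
```

Called from Fortran as call printstr("Hello, world."), this should print the string followed by a newline, with no change to how the caller passes its strings.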

Related

C++ function instrumentation via clang++'s -finstrument-functions : how to ignore internal std library calls?

Let's say I have a function like:
template<typename It, typename Cmp>
void mysort( It begin, It end, Cmp cmp )
{
std::sort( begin, end, cmp );
}
When I compile this using -finstrument-functions-after-inlining with clang++ --version:
clang version 11.0.0 (...)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: ...
The instrumentation code blows up the execution time, because my entry and exit functions are called for every call of
void std::__introsort_loop<...>(...)
void std::__move_median_to_first<...>(...)
I'm sorting a really big array, so my program doesn't finish: without instrumentation it takes around 10 seconds, with instrumentation I've cancelled it at 10 minutes.
I've tried adding __attribute__((no_instrument_function)) to mysort (and the function that calls mysort), but this doesn't seem to have an effect as far as these standard library calls are concerned.
Does anyone know if it is possible to ignore function instrumentation for the internals of a standard library function like std::sort? Ideally, I would only have mysort instrumented, so a single entry and a single exit!
I see that clang++ sadly does not yet support anything like -finstrument-functions-exclude-function-list or -finstrument-functions-exclude-file-list, while g++ does not yet support -finstrument-functions-after-inlining, which I would ideally have, so I'm stuck!
EDIT: After playing more, it would appear the effect on execution-time is actually less than that described, so this isn't the end of the world. The problem still remains however, because most people who are doing function instrumentation in clang will only care about the application code, and not those functions linked from (for example) the standard library.
EDIT2: To further highlight the problem now that I've got it running in a reasonable time frame: the resulting trace that I produce from the instrumented code with those two standard library functions is 15GB. When I hard code my tracing to ignore the two function addresses, the resulting trace is 3.7MB!
I've run into the same problem. It looks like support for these flags was once proposed, but never merged into the main branch.
https://reviews.llvm.org/D37622
This is not a direct answer, since the tool doesn't support what you want to do, but I think I have a decent work-around. What I wound up doing was creating a "skip list" of sorts. In the instrumented functions (__cyg_profile_func_enter and __cyg_profile_func_exit), I would guess the part that is contributing most to your execution time is the printing. If you can come up with a way of short-circuiting the profile functions, that should help, even if it's not the most ideal. At the very least it will limit the size of the output file.
Something like
#include <stddef.h>
#include <stdint.h>
uintptr_t skipAddrs[] = {
// assuming 64-bit addresses
0x123456789abcdef, 0x2468ace2468ace24
};
size_t arrSize = 0;
int main(void)
{
...
arrSize = sizeof(skipAddrs)/sizeof(skipAddrs[0]);
// https://stackoverflow.com/a/37539/12940429
...
}
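// the hook itself must not be instrumented
__attribute__((no_instrument_function))
void __cyg_profile_func_enter (void *this_fn, void *call_site) {
for (size_t idx = 0; idx < arrSize; idx++) {
if ((uintptr_t) this_fn == skipAddrs[idx]) {
return;
}
}
// not in the skip list: do the actual tracing here
}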
I use something like objdump -t binaryFile to examine the symbol table and find what the addresses are for each function.
If you specifically want to ignore library calls, something that might work is examining the symbol table of your object file(s) before linking against libraries, then ignoring all the ones that appear new in the final binary.
All this should be possible with things like grep, awk, or python.
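The skip-list idea above can be sketched a little more completely in C. This version registers addresses at run time instead of hard coding them; skip_add, should_skip, MAX_SKIP and the traced_calls counter are hypothetical names introduced for illustration, and the counter stands in for whatever tracing the real hook does.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SKIP 64               /* arbitrary capacity for this sketch */

static uintptr_t skip_addrs[MAX_SKIP];
static size_t skip_count;
unsigned long traced_calls;       /* stand-in for the real trace output */

/* Add a function address to the skip list; returns -1 if the list is full. */
__attribute__((no_instrument_function))
int skip_add(void *fn)
{
    if (skip_count == MAX_SKIP) {
        return -1;
    }
    skip_addrs[skip_count++] = (uintptr_t)fn;
    return 0;
}

/* Linear search; returns 1 if fn is in the skip list, 0 otherwise. */
__attribute__((no_instrument_function))
int should_skip(void *fn)
{
    for (size_t i = 0; i < skip_count; i++) {
        if (skip_addrs[i] == (uintptr_t)fn) {
            return 1;
        }
    }
    return 0;
}

/* Entry hook: return early for skipped functions, trace everything else. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    if (should_skip(this_fn)) {
        return;
    }
    traced_calls++;
}
```

A linear scan is fine for a handful of addresses; for a long list, sorting the array once and using a binary search would keep the per-call overhead low.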
You have to add the attribute __attribute__((no_instrument_function)) to the functions that should not be instrumented. Unfortunately it is not easy to make this work with the C/C++ standard library, because it requires editing all the C++ library functions.
There are some hacks you can use, such as redefining existing macros from include/__config so that they also carry this attribute, e.g.,
-D_LIBCPP_INLINE_VISIBILITY=__attribute__((no_instrument_function,internal_linkage))
Make sure to append no_instrument_function to the macro's existing definition rather than replacing it, to avoid unexpected errors.

C equivalent of IOMemoryDescriptor class

I'm writing some C code using IOKit, and need to use IOMemoryDescriptor methods. Unfortunately, I can only compile pure C sources, and that is a C++ class. So, I'm asking if there is some C interface that lets me perform the same operations.
Specifically, I want a function that does pretty much this, but that can be compiled as C:
#include <IOKit/IOMemoryDescriptor.h>
extern "C" void CopyOut(mach_vm_address_t src, void *dst, size_t size)
{
IOMemoryDescriptor *memDesc;
memDesc = IOMemoryDescriptor::withAddressRange(src, size, kIODirectionOut, current_task());
// Error checking removed for brevity
memDesc->prepare();
memDesc->readBytes(0, dst, size);
memDesc->complete();
memDesc->release();
}
Being based on BSD, xnu has inherited some of BSD's kernel APIs, including the copyin and copyout functions. They are declared in libkern.h, and they do pretty much what you're using an IOMemoryDescriptor for, but nothing else.
You do mention you're using IOKit - if you need anything beyond this out of IOKit's functionality, you'll pretty much have to go with a C++ compiler, or use C to call mangled names directly.
If you're new to using a weird compiler for building kexts, I'll just warn you that kernel code for x86_64 must not use the red zone of the stack, as that can't exist due to interrupt handling. If your compiler assumes a red zone is present, you'll get bizarre crashes. Clang and gcc have corresponding flags for disabling the red zone (-mno-red-zone, if I remember correctly, which is activated automatically by the kernel mode flag). Even if you're using a non-official compiler, linking against an object file built with clang's C++ compiler at the last stage should work fine for wrapping any other C++ APIs.

How to call indirectly a C function

Let's suppose I have the following function:
int func(int a, char* b, float c)
{
return 42;
}
I am curious whether it is possible to call this function without:
explicitly calling it (func(1, "abc", 2.4))
creating a function pointer to it, and then calling it via the function pointer.
The function is written in C (or C++) and might be located either in a library (DLL on Windows) or somewhere compiled in the current application. For now let's assume there are no name mangling issues.
However, I know the following:
the name of the function.
the number and type of parameters as text based input (such as "int", "char*", "float").
its return type
I'm open to any suggestions, but I'm somewhat afraid of some lower level assembly hacks, since I'd like this to be as portable as possible.
I'd prefer a C solution, and I'd like to avoid boost::bind...
Edit - some clarifications ...
The one "calling" the "function" is a scripting language's compiled library (DLL). It loads the script (a source file), which has "bindings" to external "functions" (the ones I am trying to call). When the script says "call this external function", the library tries to call that function, which might be in a DLL or in the application that loaded the scripting language's DLL.
To call functions whose parameter types are not known at compile time, I fear you won't get around said "lower level assembly hacks".
In cases where portability to architectures other than x86 or AMD64 isn't relevant, take a look at this wonderful library. It allows OS-unspecific ways of generating native bytecode at runtime and should be the easiest way to fulfil your wishes.
It's still in beta, but I've been using it for a while now without encountering any problems.

How to Bypass a Standard C++ Function While Maintaining Its Functionality

I am looking for a way to redefine a set of POSIX functions but then end each redefinition with a call to the original function. The idea is to create a layer that restricts which OS APIs can be called depending on which "profile" is active. The profile determines which functions are allowed; any function not listed should not be used.
For example, if in one profile I am not allowed to use strcpy, I would like to be able to either cause a compile time error (via static_assert) or print something to the screen saying "strcpy is not allowed in this profile" such as below:
MY_string.h
#include <string.h>
char *strcpy(char *restrict s1, const char *restrict s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return strcpy(s1, s2);
#endif
}
So that way within main.cpp I can use MY_string.h
#define PROFILE_PASS_THROUGH
#include "MY_string.h"
int main()
{
char temp1[10];
char temp2[10];
sprintf(temp2, "Testing");
if (0 != strcpy(temp1, temp2))
{
printf("temp1 is %s\n", temp1);
}
return 0;
}
Now I realize that the code I have written above will not compile properly due to the redefinition of strcpy, but is there a way to allow this sort of functionality without playing around with macros or creating my own standard C and C++ libraries?
You can write a preprocessor that changes calls to the standard routine to calls to your own routine. Such a preprocessor might be complicated, depending whether you need to recognize the full C++ grammar to distinguish calls using name spaces and so on or you can get away with more casual recognition of the calls.
You can link with your own library, producing a relocatable object module with resolved names stripped. Your library would contain routines with the standard names, such as strcpy, that execute whatever code you desire and call other names, such as Mystrcpy. The object module produced by this is then linked with a second library and with the standard library. The second library contains routines with those names, such as Mystrcpy, that call the original library names strcpy. The details for doing this are of course dependent on your linker. The goal is to have a chain like this: Original code calls strcpy. This is resolved to the version of strcpy in the first library. That version calls Mystrcpy. Mystrcpy calls the standard library strcpy.
You can compile to assembly and edit the names in the assembly so that your routines are called instead of the standard library routines.
On some systems, you can use dlsym and other functions defined in <dlfcn.h> to load the dynamic library that contains the standard implementations and to call them via pointers returned by dlsym instead of by the usual names in source code.
The GNU linker has a --wrap switch (passed through GCC as -Wl,--wrap=foo) that resolves calls to foo to your routine __wrap_foo and resolves calls to __real_foo (which you would use in your implementation) to the real foo.
See also Intercepting Arbitrary Functions on Windows, UNIX, and Macintosh OS X Platforms.
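The last link of the chain described above (a routine under a different name that forwards to the standard strcpy) could look roughly like this in C. Mystrcpy is the name used in the text; the counter mystrcpy_calls and the reuse of the question's PROFILE_PASS_THROUGH macro are assumptions made for illustration. With the --wrap switch, the same body would be named __wrap_strcpy and would call __real_strcpy instead of strcpy.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical counter so that callers can see how often the wrapper ran. */
unsigned long mystrcpy_calls;

/* Forwarding wrapper: optionally warns, then calls the standard strcpy.
   The warning is compiled in only when PROFILE_PASS_THROUGH is defined. */
char *Mystrcpy(char *dst, const char *src)
{
    mystrcpy_calls++;
#if defined(PROFILE_PASS_THROUGH)
    fprintf(stderr, "strcpy is not allowed in this profile\n");
#endif
    return strcpy(dst, src);
}
```

Because the wrapper has its own name, it never collides with the declaration in <string.h>; redirecting existing calls to it is then the job of the linker or preprocessor step described above.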
No, cannot be done in C++. What you want is more akin to a LISP (or derivative) language, where you can grab the slot for an existing function and 'override it in place', potentially punting back to the original implementation.
The typical way of doing this on Unix is via LD_PRELOAD. The example below proxies a function call, malloc in particular (full example):
/**
* malloc() direct call
*/
inline void * libc_malloc(size_t size)
{
typedef void* (*malloc_func_t)(size_t);
static malloc_func_t malloc_func = (malloc_func_t) dlsym(RTLD_NEXT, "malloc");
return malloc_func(size);
}
In your MY_String.h:
... blah blah
using mynamespace::strcpy;
#endif // header guard or maybe not there if using pragma
then all strcpys that are not prefixed with std:: will use yours. If you REALLY want to ban them, grep and take a shotgun with you when you find the person who used it.
If you are using a recent GCC (e.g. version 4.7 or newer) you could also write a GCC plugin or a GCC extension in MELT to replace every call to strcpy with a call to your own mystrcpy. This will probably take some work (perhaps days, not hours) but has the enormous advantage of working inside the compiler, on GCC's internal representation (Gimple). So it will be done even after inlining, etc. And since you extend the compiler, you can tailor its behavior to what you want.
MELT is a domain specific language to extend GCC. It is designed for such tasks.
You cannot prevent these functions from being called.
A C++ program can do anything it wants; it could have some code that loads the strcpy symbol from libc and runs it. If a malicious developer wants to call that function, you have no way to prevent it. To do that you'd need to run the C++ code in some special environment (a sandbox or virtual machine), but I'm afraid such technology is not available.
If you trust the developers, and you're just looking for a way to remind them not to call certain functions, then there could be some solution.
One solution could be to avoid #including libc headers (like cstring), and to include only your own header files, in which you declare only the permitted functions.
Another solution could be to inspect the compiled executable to find out which functions are called, or to LD_PRELOAD a library that redefines (and thus overrides) standard functions to make them print a warning at runtime.
Here is how you would change MY_string.h
#include <cstdio>
#include <cstring>
namespace my_functions{
char *strcpy(char *s1, const char *s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return std::strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return std::strcpy(s1, s2);
#endif
}
}
using namespace my_functions;
For this to work you cannot include <string.h> or have using namespace std;

Quadruple Precision in C++ (GCC)

Just recently, GCC 4.6.0 came out along with libquadmath. Unfortunately, GNU has supported Fortran, but not C or C++ (all that is included is a .so). I have not found a way to use these new features in C++; however, GNU C does support the __float128 type for guaranteed quadruple-precision floats. GNU C does not seem to support the math functions in libquadmath, such as fabsq (absolute value, q being the suffix for quad).
Is there any way to get these functions working in C++, or is there some alternative library that I could use for math functions with __float128? What is the best method for getting quadruple-precision floats working in GCC? Right now, I can add, subtract, and multiply them, but this is useless to me, since I have no way to convert them to strings or to use functions such as truncq and fabsq to write my own string function.
Apparently, this seems to have been an installation error on my part.
While the core C/C++ portion of the GCC includes libquadmath.so, the Fortran version supplies libquadmath.a and quadmath.h, which can be included to access the functions.
#include <quadmath.h>
#include <iostream>
int main()
{
char* y = new char[1000];
quadmath_snprintf(y, 1000, "%Qf", 1.0q);
std::cout << y << std::endl;
return 0;
}
Run nm on the .so file and see what the function names really are. IIRC, Fortran routines have an _ at the end of the name. In C++, you'll need extern "C" {} prototypes. If this is a Fortran interface, then all arguments are passed by reference, so the prototype might be something like
extern "C" { long double fabsq_(long double* x); }