Why does LLVM pass nonsense values to an FFI function? - c++

I have a C++ header declared like:
struct Inner {
// ...
};
struct MyStruct {
Inner* _inner;
size_t _x;
size_t _y;
};
extern "C" {
MyStruct my_fn(MyStruct* s, bool flag_1, bool flag_2);
};
...which I'm compiling with clang into a shared library, whose functions I'm trying to invoke from LLVM code:
%"Inner" = type {i64, i64}
%"MyStruct" = type {%"Inner"*, i64, i64}
declare %"MyStruct" @"my_fn"(%"MyStruct"* %".1", i1 %".2", i1 %".3")
define void @"main"()
{
entry:
%".stack_buf_ptr" = alloca %"MyStruct"
; ... assign elements of %.stack_buf_ptr ...
%".15" = call %"MyStruct" @"my_fn"(%"MyStruct"* %".stack_buf_ptr", i1 true, i1 true)
}
The program crashes inside my_fn. However, what's peculiar is that the passed arguments have nonsense values, which I can see when I print them from my_fn, or if I inspect them in a debugger. Specifically, the first argument has the nonsense pointer value "0x7".
From reading the LLVM IR, I don't know how this is possible— the pointer is provided directly by LLVM's own alloca instruction. I get similar behavior if I try to pass the address of a global variable. More bizarrely, some of my other called functions with similar (but not identical) signatures work just fine.
The only explanation I can think of is a disagreement in calling convention, but it doesn't appear to be wrong in either case— the function is declared inside extern "C" { ... }, and when I use nm on the compiled shared library I can see that my_fn obeys the cdecl naming convention (it's listed as _my_fn). I also read from the documentation that LLVM function declarations default to cdecl.
I don't really know what to look at next. Could there be some reason why my types/signatures aren't matching in binary layout? Are there compiler settings for the shared library I should check that could fix/break things? What could be going on?
Why would the above LLVM code pass nonsense values to my_fn()?

Related

LTO optimizations negative effects and find best solution

I have an MCU with flash memory divided into sections (as usual).
The linker places the .struct_init, .struct_init_const, and .struct_not_init sections at addresses that belong to flash memory section20. This is hardcoded in the linker script.
Consider the following test code:
test.h
typedef struct
{
int val1;
int val2;
} mystruct_t;
test.cpp
#include "test.h"
// each variable is placed in dedicated section
// sections are placed in flash section20
// the linker exports symbols with the address of each section
__attribute__((section(".struct_init")))
mystruct_t struct_init = {
.val1 = 1,.val2 = 2};
__attribute__((section(".struct_init_const")))
extern const mystruct_t struct_init_const = {
.val1 = 1, .val2 = 2};
__attribute__((section(".struct_not_init")))
mystruct_t struct_not_init;
main.cpp
#include <stdint.h>
// These symbols are exported by the linker
// and contain the addresses of the corresponding sections
extern uintptr_t LNK_STRUCT_INIT_ADDR;
extern uintptr_t LNK_STRUCT_INIT_CONST_ADDR;
extern uintptr_t LNK_STRUCT_NOT_INIT_ADDR;
// Pointers for indirect access to data
mystruct_t* struct_init_ptr = (mystruct_t*)LNK_STRUCT_INIT_ADDR;
const mystruct_t* struct_init_const_ptr = (const mystruct_t*)LNK_STRUCT_INIT_CONST_ADDR;
mystruct_t* struct_not_init_ptr = (mystruct_t*)LNK_STRUCT_NOT_INIT_ADDR;
// Extern variables declarations for DIRECT access data
extern mystruct_t struct_init;
extern const mystruct_t struct_init_const;
extern mystruct_t struct_not_init;
// These are some variables representing config values
// They can be more complex objects (classes) with internal state and logic
int param1_direct;
int param1_init_const_direct;
int param1_not_init_direct;
int param1_indirect;
int param2_init_const_indirect;
int param1_not_init_indirect;
int main(void)
{
// local variables init with direct access
int param1_direct_local = struct_init.val1;
int param1_init_const_direct_local = struct_init_const.val1;
int param1_not_init_direct_local = struct_not_init.val1;
// local variables init with indirect access
int param1_indirect_local = struct_init_ptr->val1;
int param2_init_const_indirect_local = struct_init_const_ptr->val1;
int param1_not_init_indirect_local = struct_not_init_ptr->val1;
//global variables init direct
param1_direct = struct_init.val1;
param1_init_const_direct = struct_init_const.val1;
param1_not_init_direct = struct_not_init.val1;
//global variables init indirect
param1_indirect = struct_init_ptr->val1;
param2_init_const_indirect = struct_init_const_ptr->val1;
param1_not_init_indirect = struct_not_init_ptr->val1;
while(1){
// use all variables we init above
// usage of the variables may also occur in functions or methods
// directly or indirectly called from this loop
}
}
I want to be sure that initializing the param1_ variables actually fetches the data from flash, because the data in flash section20 can be changed by the bootloader (while the main firmware is not running).
The question is: can LTO (and other optimizations) throw away the fetches from flash and substitute the initializer values, since they are known at link time?
What approach is better?
If LTO can substitute values, should initialization be avoided?
I know volatile can help, but is it really needed in this situation?
The code example shows different approaches to accessing and initializing the data.
The not_init version seems to be the best, because the compiler can't substitute anything. But it would be good to have some default parameters, so I'd prefer the init version if it can be used.
What approach should be chosen?
Currently I am using GCC 4.9.3, but this is a general question about any C/C++ compiler.
C and C++ both feature extern variables, which let you declare constants without immediately giving away their values:
// .h
extern int const param1;
extern char const* const param2;
// ...
In general you would define them in a (single) source file, which would hide them away from anything not in this source file. This is not LTO resilient, of course, but if you can disable LTO it is an easy enough strategy.
If disabling LTO is not an option, another solution is to not define them, let LTO produce a binary, and then use a script to splice the definitions in the produced binary in the right section (the one that can be flashed).
With the value not available at LTO time, you are guaranteed that it will not be substituted.
As for the solutions you presented, while volatile is indeed a standard compliant solution, it implies that the value is not constant, which prevents caching it during run-time. Whether this is acceptable or not is for you to know, just be aware it might have a performance impact, which as you are using LTO I surmised you would like to avoid.
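One middle ground between "everything volatile" and "nothing volatile" is to read each flash value exactly once through a volatile-qualified pointer and keep the result in an ordinary variable. The sketch below is only an illustration: load_from_flash and init_params are hypothetical names, struct_init is a plain global standing in for the object the linker script would place in section20, and it assumes the compiler treats an access through a volatile lvalue as a real load (GCC and Clang do).

```cpp
struct mystruct_t {
    int val1;
    int val2;
};

// Stand-in for the object the linker script would place in flash section20.
mystruct_t struct_init = {1, 2};

// Hypothetical helper: performs one load through a volatile-qualified lvalue,
// so the optimizer may not replace the read with the initializer it saw.
inline int load_from_flash(const int* address) {
    return *static_cast<const volatile int*>(address);
}

// Ordinary (non-volatile) copy: after the one-time load, later code can
// read and cache it freely.
int param1_direct = 0;

// Hypothetical init routine, meant to run once at startup.
void init_params() {
    param1_direct = load_from_flash(&struct_init.val1);
}
```

After init_params() has run, param1_direct holds whatever value was actually stored at that address, and the rest of the firmware pays no volatile penalty when using it.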

How to check if a function exists in C/C++?

In certain situations in my code, I need to invoke a function only if that function is defined, and otherwise skip the call. How can I achieve this?
For example:
if (function 'sum' exists ) then invoke sum ()
Put another way: how can I determine at runtime whether a function is defined, and if so, invoke it?
When you declare 'sum' you could declare it like:
#define SUM_EXISTS
int sum(std::vector<int>& addMeUp) {
...
}
Then when you come to use it you could go:
#ifdef SUM_EXISTS
int result = sum(x);
...
#endif
I'm guessing you're coming from a scripting language where things are all done at runtime. The main thing to remember with C++ is the two phases:
Compile time:
The preprocessor runs
Template code is turned into real source code
Source code is turned into machine code
Run time:
The machine code is run
So all the #define and things like that happen at compile time.
....
If you really wanted to do it all at runtime .. you might be interested in using some of the component architecture products out there.
Or maybe a plugin kind of architecture is what you're after.
Using GCC you can:
#include <stdio.h>
void func(int argc, char *argv[]) __attribute__((weak)); // weak declaration must always be present
// optional definition:
/*void func(int argc, char *argv[]) {
printf("FOUND THE FUNCTION\n");
for(int aa = 0; aa < argc; aa++){
printf("arg %d = %s \n", aa, argv[aa]);
}
}*/
int main(int argc, char *argv[]) {
if (func){
func(argc, argv);
} else {
printf("did not find the function\n");
}
}
If you uncomment func, the program will run it; otherwise it will print "did not find the function\n".
While the other replies give helpful advice (dlsym, function pointers, ...), you cannot compile C++ code that refers to a function which does not exist. At minimum, the function has to be declared; if it is not, your code won't compile. If nothing (a compilation unit, some object file, some library) defines the function, the linker will complain (unless it is weak, see below).
But you should really explain why you are asking that. I can't guess, and there is some way to achieve your unstated goal.
Notice that dlsym often requires functions without name mangling, i.e. declared as extern "C".
If coding on Linux with GCC, you might also use the weak function attribute in declarations. The linker would then set undefined weak symbols to null.
Addenda
If you are getting the function name from some input, you should be aware that only a subset of functions should be callable that way (if you call an arbitrary function without care, it will crash!) and you'd better explicitly construct that subset. You could then use a std::map, or dlsym (with each function in the subset declared extern "C"). Notice that dlopen with a NULL path gives a handle to the main program, which you should link with -rdynamic to have it work correctly.
You really want to call by their name only a suitably defined subset of functions. For instance, you probably don't want to call this way abort, exit, or fork.
NB. If you know dynamically the signature of the called function, you might want to use libffi to call it.
I suspect that the poster was actually looking for something more along the lines of SFINAE checking/dispatch. With C++ templates, you can define two template functions: one which calls the desired function (if it exists) and one which does nothing (if the function does not exist). You can then make the first template depend on the desired function, such that the template is ill-formed when the function does not exist. This is valid because in C++ a template substitution failure is not an error (SFINAE), so the compiler will just fall back to the second case.
See here for an excellent example: Is it possible to write a template to check for a function's existence?
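A minimal sketch of that two-template dispatch is shown below. All names in it (demo, WithSum, WithoutSum, call_sum_if_exists, try_sum) are made up for the illustration. It assumes the argument is a class type, so that sum can be found by argument-dependent lookup when the template is instantiated.

```cpp
#include <iostream>

namespace demo {
struct WithSum {};
struct WithoutSum {};

// Hypothetical function whose existence we want to detect.
int sum(const WithSum&) { return 42; }
}  // namespace demo

// Preferred overload: takes part in overload resolution only if sum(value)
// is a well-formed expression; otherwise substitution fails silently.
template <class T>
auto call_sum_if_exists(const T& value, int) -> decltype(sum(value), bool()) {
    std::cout << "sum returned " << sum(value) << '\n';
    return true;
}

// Fallback overload: always viable, but a worse match for the literal 0
// because it needs an int-to-long conversion.
template <class T>
bool call_sum_if_exists(const T&, long) {
    return false;
}

// Convenience wrapper: returns true if sum was found and called.
template <class T>
bool try_sum(const T& value) {
    return call_sum_if_exists(value, 0);
}
```

Calling try_sum(demo::WithSum{}) selects the first overload and invokes sum, while try_sum(demo::WithoutSum{}) quietly falls back to the second. The decision is made entirely at compile time.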
Use pointers to functions.
//initialize
typedef void (*PF)();
std::map<std::string, PF> defined_functions;
defined_functions["foo"]=&foo;
defined_functions["bar"]=&bar;
//if defined, invoke it
if(defined_functions.find("foo") != defined_functions.end())
{
defined_functions["foo"]();
}
If you know what library the function you'd like to call is in, then you can use dlsym() and dlerror() to find out whether or not it's there, and what the pointer to the function is.
Edit: I probably wouldn't actually use this approach - instead I would recommend Matiu's solution, as I think it's much better practice. However, dlsym() isn't very well known, so I thought I'd point it out.
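For reference, a runtime existence check with dlsym() could look roughly like the sketch below. It assumes a POSIX system with dynamic linking (older glibc versions also need -ldl at link time), and symbol_exists is a hypothetical helper name. Only unmangled names (C functions, or C++ functions declared extern "C") can be looked up this way.

```cpp
#include <dlfcn.h>  // POSIX: dlopen, dlsym, dlerror, dlclose

// Hypothetical helper: returns true if a symbol with this unmangled name is
// visible in the running program or in the shared libraries it has loaded.
bool symbol_exists(const char* name) {
    void* self = dlopen(nullptr, RTLD_LAZY);  // handle to the main program
    if (self == nullptr) {
        return false;
    }
    dlerror();                                // clear any stale error state
    void* address = dlsym(self, name);
    const bool found = (dlerror() == nullptr) && (address != nullptr);
    dlclose(self);
    return found;
}
```

To actually call the function, the returned address still has to be converted to a function pointer of the correct type, which is exactly where a mismatch between the assumed and the real signature can go wrong.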
You can use #pragma weak for the compilers that support it (see the weak symbol wikipedia entry).
This example and comment is from The Inside Story on Shared Libraries and Dynamic Loading:
#pragma weak debug
extern void debug(void);
void (*debugfunc)(void) = debug;
int main() {
printf("Hello World\n");
if (debugfunc) (*debugfunc)();
}
you can use the weak pragma to force the linker to ignore unresolved
symbols [..] the program compiles and links whether or not debug()
is actually defined in any object file. When the symbol remains
undefined, the linker usually replaces its value with 0. So, this
technique can be a useful way for a program to invoke optional code
that does not require recompiling the entire application.
Another way, if you're using C++11, is to use functors.
You'll need to put this at the start of your file:
#include <functional>
The type of a functor is declared in this format:
std::function< return_type (param1_type, param2_type) >
You could add a variable that holds a functor for sum like this:
std::function<int(const std::vector<int>&)> sum;
To make things easy, let's shorten the param type:
using Numbers = const std::vector<int>&;
Then you could fill in the functor var with any one of:
A lambda:
sum = [](Numbers x) { return std::accumulate(x.cbegin(), x.cend(), 0); }; // std::accumulate comes from #include <numeric>
A function pointer:
int myFunc(Numbers nums) {
int result = 0;
for (int i : nums)
result += i;
return result;
}
sum = &myFunc;
Something that 'bind' has created:
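struct Adder {
int startNumber = 6;
int doAdding(Numbers nums) {
int result = startNumber; // begin from the configured start value
for (int i : nums)
result += i;
return result;
}
};
...
Adder myAdder;
myAdder.startNumber = 2; // Make an adder that starts at two
// the placeholder forwards the argument of sum(x) to doAdding
sum = std::bind(&Adder::doAdding, myAdder, std::placeholders::_1);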
Then finally to use it, it's a simple if statement:
if (sum)
return sum(x);
In summary, functors are the new pointer to a function, but they're more versatile. They may actually be inlined if the compiler is sure enough, but generally they behave the same as a function pointer.
When combined with std::bind and lambdas they're quite superior to old-style C function pointers.
But remember they only work in C++11 and above (not in C or C++03).
In C++, a modified version of the trick for checking if a member exists should give you what you're looking for, at compile time instead of runtime:
#include <iostream>
#include <type_traits>
#include <typeinfo> // typeid
#include <utility> // std::declval
namespace
{
template <class T, template <class...> class Test>
struct exists
{
template<class U>
static std::true_type check(Test<U>*);
template<class U>
static std::false_type check(...);
static constexpr bool value = decltype(check<T>(0))::value;
};
template<class U, class = decltype(sum(std::declval<U>(), std::declval<U>()))>
struct sum_test{};
template <class T>
void validate_sum()
{
if constexpr (exists<T, sum_test>::value)
{
std::cout << "sum exists for type " << typeid(T).name() << '\n';
}
else
{
std::cout << "sum does not exist for type " << typeid(T).name() << '\n';
}
}
class A {};
class B {};
void sum(const A& l, const A& r); // we only need to declare the function, not define it
}
int main(int, const char**)
{
validate_sum<A>();
validate_sum<B>();
}
Here's the output using clang:
sum exists for type N12_GLOBAL__N_11AE
sum does not exist for type N12_GLOBAL__N_11BE
I should point out that weird things happened when I used an int instead of A (sum() has to be declared before sum_test for the exists to work, so maybe exists isn't the right name for this). Some kind of template expansion that didn't seem to cause problems when I used A. Gonna guess it's ADL-related.
This answer applies only to global functions, as a complement to the other answers on testing for methods.
First, provide a fallback dummy function in a separate namespace. Then determine the return type of the function-call, inside a template parameter. According to the return-type, determine if this is the fallback function or the wanted function.
If you are forbidden to add anything in the namespace of the function, such as the case for std::, then you should use ADL to find the right function in the test.
For example, std::reduce() is part of C++17, but early GCC versions that otherwise support C++17 don't define std::reduce(). The following code can detect at compile time whether or not std::reduce is declared. See it work correctly in both cases in Compiler Explorer.
#include <numeric>
#include <type_traits> // std::true_type, std::false_type
#include <utility> // std::declval
namespace fallback
{
// fallback
std::false_type reduce(...) { return {}; }
// Depending on
// std::reduce(Iter from, Iter to) -> decltype(*from)
// we know that a call to std::reduce(T*, T*) returns T
template <typename T, typename Ret = decltype(reduce(std::declval<T*>(), std::declval<T*>()))>
using return_of_reduce = Ret;
// Note that due to ADL, std::reduce is called although we don't explicitly call std::reduce().
// This is critical, since we are not allowed to define any of the above inside std::
}
using has_reduce = fallback::return_of_reduce<std::true_type>;
// using has_sum = std::conditional_t<std::is_same_v<fallback::return_of_sum<std::true_type>,
// std::false_type>,
// std::false_type,
// std::true_type>;
#include <iterator>
int main()
{
if constexpr (has_reduce::value)
{
// must have those, so that the compiler will find the fallback
// function if the correct one is undefined (even if it never
// generates this code).
using namespace std;
using namespace fallback;
int values[] = {1,2,3};
return reduce(std::begin(values), std::end(values));
}
return -1;
}
In cases where, unlike the above example, you can't control the return type, you can use other methods, such as std::is_same and std::conditional.
For example, assume you want to test if function int sum(int, int) is declared in the current compilation unit. Create, in a similar fashion, test_sum_ns::return_of_sum. If the function exists, it will be int and std::false_type otherwise (or any other special type you like).
using has_sum = std::conditional_t<std::is_same_v<test_sum_ns::return_of_sum,
std::false_type>,
std::false_type,
std::true_type>;
Then you can use that type:
if constexpr (has_sum::value)
{
int result;
{
using namespace fallback; // limit this only to the call, if possible.
result = sum(1,2);
}
std::cout << "sum(1,2) = " << result << '\n';
}
NOTE: You must have the using namespace directive, otherwise the compiler will not find the fallback function inside the if constexpr and will complain. In general, you should avoid using namespace, since future changes to the symbols inside the namespace may break your code. In this case there is no other way around it, so at least limit it to the smallest scope possible, as in the above example.

Populate global function pointers in shared library (Solaris, Sun Studio)

I am creating a small C++ wrapper shared library around a Fortran 95 library. Since the Fortran symbols contain . in the symbol name, I have to use dlsym to load the Fortran function into a C++ function pointer.
Currently, I have a bunch of global function pointers in header files:
// test.h
extern void (*f)(int* arg);
and I populate them in the corresponding C++ file:
// test.cc
void (*f)(int*) = reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_"));
Questions:
If I do it this way, when are these pointers populated?
Can I assume them to be loaded in my executable that loads this library?
In particular, can I use these functions in statically created objects in my executable or other libraries? Or does this suffer from the static initialization order fiasco?
If the above way is not correct, what is the most elegant way of populating these pointers such that they can be used in static objects in executables and other libraries?
I am using the Sun Studio compiler on Solaris, if that makes a difference, but I would also be interested in a solution for GCC on Linux.
Where does the line
f = reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_"));
occur in test.cc? The pointer will be initialized when the line is
executed (which of course depends on when the function which contains it
is called). Or did you mean to write
void (*f)(int* ) = reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_"));
? In this case, the pointer will be initialized during static
initialization. Which means that you still have order of initialization
issues if you try to use the pointers in the constructor of a static
object.
The classical solution for this would be to use some sort of singleton:
struct LibraryPointers
{
void (*f)(int* );
// ...
static LibraryPointers const& instance();
private:
LibraryPointers();
};
LibraryPointers const&
LibraryPointers::instance()
{
static LibraryPointers theOneAndOnly;
return theOneAndOnly;
}
LibraryPointers::LibraryPointers()
: f( reinterpret_cast<void(*)(int*)>(dlsym(RTLD_DEFAULT, "real_f.symbol_name_")) )
// , initialization of other pointers...
{
}
Then wrap the library in a C++ class which uses this structure to get
the addresses of the pointers.
And one last remark: the reinterpret_cast you are trying to do isn't
legal, at least not formally. (I think that both Sun CC and g++ will
accept it, however.) According to Posix, the correct way to get a
pointer to function from dlsym would be:
void (*f)(int* );
*reinterpret_cast<void**>(&f) = dlsym(...);
This doesn't lend itself to initializations, however.
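To see why the singleton sidesteps the order-of-initialization problem, here is a small self-contained sketch. So that it runs without the Fortran library, dlsym is replaced by a hypothetical lookup_routine that returns the address of a local stand-in function. fake_routine, lookup_routine, routine_t, EarlyUser and early_user are all made-up names for the illustration.

```cpp
typedef void (*routine_t)(int*);

// Hypothetical stand-in for the Fortran routine.
static void fake_routine(int* arg) { *arg += 1; }

// Hypothetical stand-in for the dlsym() lookup; it ignores the name.
static routine_t lookup_routine(const char*) { return &fake_routine; }

class LibraryPointers {
public:
    routine_t f;

    static const LibraryPointers& instance() {
        // Function-local static: constructed the first time control passes
        // through here, no matter which static object triggers the call.
        static LibraryPointers theOneAndOnly;
        return theOneAndOnly;
    }

private:
    LibraryPointers() : f(lookup_routine("real_f.symbol_name_")) {}
};

// A static object that already needs the pointer in its constructor,
// that is, during static initialization and before main() starts.
struct EarlyUser {
    int value;
    EarlyUser() : value(41) { LibraryPointers::instance().f(&value); }
};
static EarlyUser early_user;
```

Because the pointer table is built on first use rather than at some unspecified point during static initialization, early_user sees a fully populated f even if its constructor runs before any other static in the program.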

Accessing C++ class member in inline assembly

Question: How can I access a member variable in assembly from within a non-POD class?
Elaboration:
I have written some inline assembly code for a class member function but what eludes me is how to access class member variables. I've tried the offsetof macro but this is a non-POD class.
The current solution I'm using is to assign a pointer from global scope to the member variable, but it's a messy solution and I was hoping there was something better that I don't know about.
note: I'm using the G++ compiler. A solution with Intel syntax Asm would be nice but I'll take anything.
example of what I want to do (intel syntax):
class SomeClass
{
int* var_j;
void set4(void)
{
asm("mov var_j, 4"); // sets pointer SomeClass::var_j to address "4"
}
};
current hackish solution:
int* global_j;
class SomeClass
{
int* var_j;
void set4(void)
{
asm("mov global_j, 4"); // sets pointer global_j to address "4"
var_j = global_j; // copy it back to member variable :(
}
};
Those are crude examples but I think they get the point across.
This is all you need:
__asm__ __volatile__ ("movl $4,%[v]" : [v] "+m" (var_j)) ;
Edited to add: The assembler does accept Intel syntax, but the compiler doesn't know it, so this trick won't work using Intel syntax (not with g++ 4.4.0, anyway).
class SomeClass
{
int* var_j;
void set4(void)
{
__asm__ __volatile__("movl $4, (%0,%1)"
:
: "r"(this), "r"((char*)&var_j-(char*)this)
:
);
}
};
This might work too, saving you one register:
__asm__ __volatile__("movl $4, %1(%0)"
:
: "r"(this), "i"((char*)&var_j-(char*)this)
:
);
In fact, since the offset of var_j wrt. this should be known at compile time, the second option is the way to go, even if it requires some tweaking to get it working. (I don't have access to a g++ system right now, so I'll leave this up to you to investigate.)
And don't ever underestimate the importance of __volatile__. It took more of my time than I'd have liked to track down bugs that appeared because I left out the volatile keyword and the compiler took it upon itself to do strange things with my assembly.

Initialize global array of function pointers at either compile-time, or run-time before main()

I'm trying to initialize a global array of function pointers at compile-time, in either C or C++. Something like this:
module.h
typedef int16_t (*myfunc_t)(void);
extern myfunc_t myfunc_array[];
module.cpp
#include "module.h"
int16_t myfunc_1();
int16_t myfunc_2();
...
int16_t myfunc_N();
// the ordering of functions is not that important
myfunc_t myfunc_array[] = { myfunc_1, myfunc_2, ... , myfunc_N };
func1.cpp, func2.cpp, ... funcN.cpp (symbolic links to a single func.cpp file, so that different object files are created: func1.o, func2.o, func3.o, ... , funcN.o. NUMBER is defined using g++ -DNUMBER=N)
#include "module.h"
#define CONCAT2(x, y) x ## y
#define CONCAT(x, y) CONCAT2(x, y)
int16_t CONCAT(myfunc_, NUMBER)() { ... }
When compiled using g++ -DNUMBER=N, after preprocessing becomes:
func1.cpp
...
int16_t myfunc_1() { ... }
func2.cpp
...
int16_t myfunc_2() { ... }
and so on.
The declarations of myfunc_N() and the initialization of myfunc_array[] are not cool, since N changes often and could be anywhere from 10 to 200. I prefer not to use a script or Makefile to generate them either. The ordering of functions is not that important; I can work around that. Is there a neater/smarter way to do this?
How To Make a Low-Level Function Registry
First you create a macro to place pointers to your functions in a special section:
/* original typedef from question: */
typedef int16_t (*myfunc)(void);
#define myfunc_register(N) \
static myfunc registered_##myfunc_##N \
__attribute__((__section__(".myfunc_registry"))) = myfunc_##N
The static variable name is arbitrary (it will never be used) but it's nice to choose an expressive name. You use it by placing the registration just below your function:
myfunc_register(NUMBER);
Now when you compile your file (each time) it will have a pointer to your function in the section .myfunc_registry. This will all compile as-is but it won't do you any good without a linker script. Thanks to caf for pointing out the relatively new INSERT AFTER feature:
SECTIONS
{
.rel.rodata.myfunc_registry : {
PROVIDE(myfunc_registry_start = .);
*(.myfunc_registry)
PROVIDE(myfunc_registry_end = .);
}
}
INSERT AFTER .text;
The hardest part of this scheme is creating the entire linker script: You need to embed that snippet in the actual linker script for your host which is probably only available by building binutils by hand and examining the compile tree or via strings ld. It's a shame because I quite like linker script tricks.
Link with gcc -Wl,-Tlinkerscript.ld ... The -T option will enhance (rather than replace) the existing linker script.
Now the linker will gather all of your pointers with the section attribute together and helpfully provide a symbol pointing before and after your list:
extern myfunc myfunc_registry_start[], myfunc_registry_end[];
Now you can access your array:
/* this cannot be static because it is not known at compile time */
size_t myfunc_registry_size = (myfunc_registry_end - myfunc_registry_start);
size_t i;
for (i = 0; i < myfunc_registry_size; ++i)
(*myfunc_registry_start[i])();
They will not be in any particular order. You could number them by putting them in __section__(".myfunc_registry." #N) and then in the linker gathering *(.myfunc_registry.*), but the sorting would be lexicographic instead of numeric.
I have tested this out with gcc 4.3.0 (although the gcc parts have been available for a long time) and ld 2.18.50 (you need a fairly recent ld for the INSERT AFTER magic).
This is very similar to the way the compiler and linker conspire to execute your global ctors, so it would be a whole lot easier to use a static C++ class constructor to register your functions and vastly more portable.
You can find examples of this in the Linux kernel, for example __initcall is very similar to this.
I was going to suggest this question is more about C, but on second thoughts, what you want is a global container of function pointers, and to register available functions into it. I believe this is called a Singleton (shudder).
You could make myfunc_array a vector, or wrap up a C equivalent, and provide a function to push myfuncs into it. Now finally, you can create a class (again you can do this in C), that takes a myfunc and pushes it into the global array. This will all occur immediately prior to main being called. Here are some code snippets to get you thinking:
// a header
extern vector<myfunc> myfunc_array;
struct _register_myfunc {
_register_myfunc(myfunc lolz0rs) {
myfunc_array.push_back(lolz0rs);
}
};
#define register_myfunc(lolz0rs) static _register_myfunc _unique_name(lolz0rs);
// a source
vector<myfunc> myfunc_array;
// another source
int16_t myfunc_1() { ... }
register_myfunc(myfunc_1);
// another source
int16_t myfunc_2() { ... }
register_myfunc(myfunc_2);
Keep in mind the following:
You can control the order the functions are registered by manipulating your link step.
The initialization of your translation unit-scoped variables occurs before main is called, i.e. the registering will be completed.
You can generate unique names using some macro magic and __COUNTER__. There may be other sneaky ways that I don't know about. See these useful questions:
Unnamed parameters in C
Unexpected predefined macro behaviour when pasting tokens
How to generate random variable names in C++ using macros?
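As a rough illustration of the unique-name point, the sketch below builds the registrar variable name from __COUNTER__, which is a non-standard extension supported by GCC, Clang and MSVC. The names function_registry, registrar_t, REG_CONCAT and REGISTER_MYFUNC are hypothetical. The global vector is also replaced by a function-local static here, which is a deviation from the snippets above, made so that registration does not depend on the vector having been constructed first.

```cpp
#include <cstdint>
#include <vector>

typedef std::int16_t (*myfunc)(void);

// Function-local static instead of a global vector: it is constructed on
// first use, so registrars in any translation unit can safely push into it.
inline std::vector<myfunc>& function_registry() {
    static std::vector<myfunc> registry;
    return registry;
}

// Its constructor runs during static initialization and records the function.
struct registrar_t {
    explicit registrar_t(myfunc fn) { function_registry().push_back(fn); }
};

// Two-level concatenation so that __COUNTER__ is expanded before pasting.
#define REG_CONCAT2(a, b) a##b
#define REG_CONCAT(a, b) REG_CONCAT2(a, b)
#define REGISTER_MYFUNC(fn) \
    static registrar_t REG_CONCAT(registrar_, __COUNTER__)(fn)

static std::int16_t myfunc_1() { return 1; }
REGISTER_MYFUNC(myfunc_1);

static std::int16_t myfunc_2() { return 2; }
REGISTER_MYFUNC(myfunc_2);
```

Within a single translation unit the registrars run in declaration order, so myfunc_1 ends up at index 0 here. Across translation units the order still depends on the link step, as noted above.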
Your solution sounds much too complicated and error prone to me.
You go over your project with a script (or probably make) to place the -D options to the compiler, anyhow. So I suppose you are keeping a list of all your functions (resp. the files defining them).
I'd use proper names for all the functions, nothing of your numbering scheme and then I would produce the file "module.cpp" with that script and initialize the table with the names.
For this you just have to keep a list of all your functions (and perhaps filenames) in one place. I think this could be kept consistent more easily than your current scheme.
Edit: Thinking of it even this might also be overengineering. If you have to maintain a list of your functions somewhere in any case, why not just inside the file "module.cpp"? Just include all the header files of all your functions, there, and list them in the initializer of the table.
Since you allow C++, the answer is obviously yes, with templates:
template<int N> int16_t myfunc() { /* N is a const int here */ }
myfunc_t myfunc_array[] = { myfunc<0>, myfunc<1>, myfunc<2> };
Now, you might wonder if you can create that variable-length initializer list with some macro. The answer is yes, but the macros needed are ugly, so I'm not going to write them here; instead I'll point you to Boost.Preprocessor.
However, do you really need such an array? Do you really need the name myfunc_array[0] for myfunc<0> ? Even if you need a runtime argument (myfunc_array[i]) there are other tricks:
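template <int Nmax> inline int16_t myfunc_wrapper(int i) {
assert(0 <= i && i <= Nmax);
return (i == Nmax) ? myfunc<Nmax>() : myfunc_wrapper<Nmax - 1>(i);
}
template <> inline int16_t myfunc_wrapper<0>(int) { // base case ends the recursion
return myfunc<0>();
}
inline int16_t myfunc_wrapper(int i) {
return myfunc_wrapper<NUMBER>(i); // NUMBER is defined with g++ -DNUMBER=N
}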
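If C++14 is available, the table can also be generated without any macros by expanding a std::index_sequence. This swaps Boost.Preprocessor for a different technique, so treat it as an alternative sketch rather than what the answer above had in mind. make_table and kFunctionCount are hypothetical names, and the body of myfunc<N> is a placeholder.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

typedef std::int16_t (*myfunc_t)(void);

// Placeholder body: each instantiation returns a value derived from N.
template <int N>
std::int16_t myfunc() {
    return static_cast<std::int16_t>(N * 10);
}

// Expands the index pack into { &myfunc<0>, &myfunc<1>, ... }.
template <std::size_t... Is>
constexpr std::array<myfunc_t, sizeof...(Is)> make_table(std::index_sequence<Is...>) {
    return {{ &myfunc<static_cast<int>(Is)>... }};
}

// Hypothetical count; in the question this would come from -DNUMBER=N.
constexpr std::size_t kFunctionCount = 4;

// Built entirely at compile time, so no static initialization order issues.
constexpr auto myfunc_array = make_table(std::make_index_sequence<kFunctionCount>{});
```

With this in place myfunc_array[i]() calls myfunc<i>() for any runtime i below kFunctionCount, and changing the count is a one-line edit.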
Ok I worked out a solution based on Matt Joiner's tip:
module.h
typedef int16_t (*myfunc_t)(void);
extern myfunc_t myfunc_array[];
class FunctionRegistrar {
public:
FunctionRegistrar(myfunc_t fn, int fn_number) {
myfunc_array[fn_number - 1] = fn; // ensures correct ordering of functions (not that important though)
}
};
module.cpp
#include "module.h"
myfunc_t myfunc_array[100]; // The size needs to be #defined by the compiler, probably
func1.cpp, func2.cpp, ... funcN.cpp
#include "module.h"
static int16_t myfunc(void) { ... }
static FunctionRegistrar functionRegistrar(myfunc, NUMBER);
Thanks everyone!