Is there any good way to access structure in user-land? - dtrace

I want to use Dtrace to get the value of a member in a structure in user-land, not kernel.
The C code looks like this:
typedef struct
{
int a;
}st_A;
void fun1(st_A *p)
{
......
}
The DTrace script looks like this:
#!/usr/sbin/dtrace -qs
pid$1::fun1:entry
{
printf("%d\n", *(int*)copyin(arg0, 4));
}
Personally, I think this DTrace script is very clumsy. If the structure contains many members, I need to calculate the offset of every member by hand. If the structure contains an array of pointers, the situation is even worse!
So, is there an easy and graceful way to access the members of a structure in a user-land process? Thanks very much!

The more usual way to do this on Solaris is
typedef struct {
int a;
} st_A;
pid$1::fun1:entry
{
self->kp = (st_A *)copyin(arg0, sizeof (st_A));
printf("a = %d\n", self->kp->a);
}
but you're right: if you want to follow pointers within your structure then you will have to repeat the copyin() for each dereference.
Remember that you can #include a header file if you invoke dtrace(1) with the -C option. In any case, use -32 or -64 to indicate the data model of your victim process: by default, dtrace(1) will interpret any types you specify using the data model of the running kernel.
I think that illumos's DTrace performs automatic copying-in but I haven't looked at it. I don't know about other implementations.

Related

C++ constructor on unnamed object idiom?

I have a C++ class that works like the Linux "devmem" utility for embedded systems. I'll simplify and give an outline of it here:
struct DevMem
{
DevMem(off_t start, off_t end)
{
// map addresses start to end into memory with mmap
}
~DevMem()
{
// release mapped memory with munmap
}
uint32_t read(off_t address)
{
// return the value at mapped address
}
void write(off_t address, uint32_t value)
{
// write value to mapped address
}
};
I use it like this:
void WriteConverter(uint32_t value)
{
off_t base = 0xa0000000;
DevMem dm(base, base+0x100); // set up mapping for region
dm.write(base+0x8, value); // output value to converter
dm.write(base+0x0, 1); // strobe hardware
while (dm.read(base+0x4)) // wait until done
;
}
And this works great. RAII ensures the mapped memory is released when I'm done with it. But some hardware is really simple and only needs a single read or write. It was bothering me that in order to access that hardware, I would have to invent some name for the instantiation of the class:
DevMem whatever(0xa0001000, 0xa0001000); // map the thing
whatever.write(0xa0001000, 42); // do the thing
With the named object and the repetition of the address three times, it's a little verbose. So I made a change to the constructor so that I could leave off the end parameter if I'm only mapping a single address:
DevMem(off_t start, off_t end = 0)
{
// map addresses start to end into memory with mmap
}
And then I overloaded the read and write routines so the address wasn't passed:
uint32_t read()
{
// return the value at the constructor's start address
}
void write(uint32_t value)
{
// write value to the constructor's start address
}
And I discovered that I could then do this:
DevMem(0xa0001000).write(42); // do the thing
And this works. I don't need to invent a name for the object, it's less verbose, the value is written (or read), and RAII cleans it up nicely. What I assume is happening is that C++ is constructing an unnamed object, dereferencing it, using it, and then destructing it.
Is this use of an unnamed object valid? I mean, it compiles okay, GCC and clang don't complain with common warnings cranked up, and it does actually work on the target hardware. I just can't find any examples of such a thing on the Interwebs. Is this a named idiom?
Yep, completely valid. You create a temporary object, use it, and then its destructor kicks in at the end of the full expression (at the semicolon). Your compiler will probably generate the same assembly as in your whatever example if whatever has a reasonable scope.
I don't know of a name for this construct, though, and I wouldn't call it an idiom.

Calling function within C++ class not working

I have been working on this simple hobbyist OS, and I have decided to add some C++ support. Here is the simple script I wrote. When I compile it, I get this message:
cp.o: In function `caller':
test.cpp:(.text+0x3a): undefined reference to `__stack_chk_fail'
Here is the script:
class CPP {
public:
int a;
void test(void);
};
void CPP::test(void) {
// Code here
}
int caller() {
CPP caller;
caller.test();
return CPP.a;
}
Try it like this.
#include <iostream>
class CPP {
public:
int a;
void test(void);
};
void CPP::test(void) {
CPP::a = 4;
}
int caller() {
CPP caller;
caller.test();
return caller.a;
}
int main(){
int called = caller();
std::cout << called << std::endl;
return 0;
}
It seems to me that the linker you are using can't find the library that contains __stack_chk_fail, a security function that crashes the program when it detects stack smashing. (It may be that the compiler doesn't include the function declaration for some reason? I am not sure who actually defines this specific function.) Try compiling with -fno-stack-protector or equivalent.
Which compiler are you using? A workaround might be to define the function yourself with a body such as exit(1); or similar. That would produce the intended effect and fix the link error for now.
I created a test program to show how this actually plays out. Test program:
int main(){
int a[50]; // To have the compiler manage the stack
return 0;
}
With only -O0 as the flag, Ghidra decompiles this to:
undefined8 main(void){
long in_FS_OFFSET;
if (*(long *)(in_FS_OFFSET + 0x28) != *(long *)(in_FS_OFFSET + 0x28)) {
/* WARNING: Subroutine does not return */
__stack_chk_fail();
}
return 0;
}
With -fno-stack-protector:
undefined8 main(void){
return 0;
}
The array was thrown out by Ghidra during decompilation, but we can see that the stack protection is gone when you use the flag. There are also some messed-up parts in the Ghidra output (e.g. int->undefined8), but this is normal for decompilation.
Consequences of using the flag
Compiling without stack protection is not good per se, but it shouldn't affect you much. If you write unsafe code (the kind the compiler warns you about), you can end up with a program that is vulnerable to buffer overflows, which should not be that big of an issue here, in my opinion.
Alternative
Alternatively have a look at this. They are talking about embedded systems, but the topic seems appropriate.
Why is the code there
Look up stack smashing for the details, but I will try to explain it to the best of my knowledge. When the program enters a function (main in this case), it stores the return address, i.e. the location of the instruction to continue with after the function returns, on the stack.
If you write an OS you probably know what the stack is, but for completeness: the stack is just some memory onto which you can push data and off which you can pop data. You always pop the most recently pushed item. C++ and other languages also use the stack to store local variables. On common architectures such as x86, the stack starts at a high address and grows toward lower addresses, so each newly pushed item sits at a lower address than the previous one; it fills up 'backwards'.
You can declare a buffer as a local variable, e.g. char buffer[20]. If you fill the buffer without checking the length, you might write past its end and overwrite other things on the stack. The return address is on the stack as well. So if we have a program like this:
int test(){
int a;
char buffer[20];
int c;
// someCode;
}
Then the stack will look something like this at someCode:
[ Unused space, c, buffer[0], buffer[1] ..., buffer[19], a, Return Address, variables of calling function ]
Now if I fill the buffer without checking the length, I can overwrite a (which is a problem, as I can modify how the program runs) or even the return address (which is a major flaw, as I might be able to execute malicious shellcode by injecting it into the buffer). To avoid this, compilers insert a 'stack cookie' (also called a canary) between a and the return address. If that value has changed, the program should terminate before returning, and that is what __stack_chk_fail() is for. It seems that it is defined in some library, so you might not be able to use it, even though technically the compiler is the one that emits the call.

Several specific methods or one generic method?

This is my first question after a long time reading this marvelous website.
My question is probably a little silly, but I want to know others' opinions about this. Which is better: to create several specific methods or only one generic method? Here is an example...
unsigned char *Method1(CommandTypeEnum command, ParamsCommand1Struct *params)
{
if(params == NULL) return NULL;
// Construct a string (command) with those specific params (params->element1, ...)
return buffer; // buffer is a member of the class
}
unsigned char *Method2(CommandTypeEnum command, ParamsCommand2Struct *params)
{
...
}
unsigned char *Method3(CommandTypeEnum command, ParamsCommand3Struct *params)
{
...
}
unsigned char *Method4(CommandTypeEnum command, ParamsCommand4Struct *params)
{
...
}
or
unsigned char *Method(CommandTypeEnum command, void *params)
{
switch(command)
{
case CMD_1:
{
if(params == NULL) return NULL;
ParamsCommand1Struct *value = (ParamsCommand1Struct *) params;
// Construct a string (command) with those specific params (params->element1, ...)
return buffer;
}
break;
// ...
default:
break;
}
}
The main thing I do not really like of the latter option is this,
ParamsCommand1Struct *value = (ParamsCommand1Struct *) params;
because "params" might not be a pointer to "ParamsCommand1Struct" but a pointer to "ParamsCommand2Struct" or something else.
I really appreciate your opinions!
General Answer
In Writing Solid Code, Steve Maguire's advice is to prefer distinct functions (methods) for specific situations. The reason is that you can assert conditions that are relevant to the specific case, and you can more easily debug because you have more context.
An interesting example is the standard C run-time's functions for dynamic memory allocation. Most of it is redundant, as realloc can actually do (almost) everything you need. If you have realloc, you don't need malloc or free. But when you have such a general function, used for several different types of operations, it's hard to add useful assertions, it's harder to write unit tests, and it's harder to see what's happening when debugging. Maguire takes it a step further and suggests that not only should realloc do just re-allocation, but it should probably be two distinct functions: one for growing a block and one for shrinking a block.
While I generally agree with his logic, sometimes there are practical advantages to having one general purpose method (often when operations are highly data-driven). So I usually decide on a case by case basis, with a bias toward creating very specific methods rather than overly general purpose ones.
Specific Answer
In your case, I think you need to find a way to factor out the common code from the specifics. The switch is often a signal that you should be using a small class hierarchy with virtual functions.
If you like the single method approach, then it probably should be just a dispatcher to the more specific methods. In other words, each of the cases in the switch statement simply calls the appropriate Method1, Method2, etc. If you want the user to see only the general purpose method, then you can make the specific implementations private methods.
Generally, it's better to offer separate functions, because the names and argument lists in their prototypes communicate directly and visibly to the user what is available; this also leads to more straightforward documentation.
The one time I use a multi-purpose function is for something like a query() function, where a number of minor query functions, rather than leading to a proliferation of functions, are bundled into one, with a generic input and output void pointer.
In general, think about what you're trying to communicate to the API user by the API prototypes themselves; a clear sense of what the API can do. He doesn't need excessive minutiae; he does need to know the core functions which are the entire point of having the API in the first place.
First off, you need to decide which language you are using. Tagging the question with both C and C++ here makes no sense. I am assuming C++.
If you can create a generic function then of course that is preferable (why would you prefer multiple, redundant functions?). The question is: can you? However, you seem to be unaware of templates. We would need to see what you have omitted here to tell whether templates are suitable:
// Construct a string (command) with those specific params (params->element1, ...)
In the general case, assuming templates are appropriate, all of that turns into:
template <typename T>
unsigned char *Method(CommandTypeEnum command, T *params) {
// more here
}
On a side note, how is buffer declared? Are you returning a pointer to dynamically allocated memory? Prefer RAII type objects and avoid dynamically allocating memory like that if so.
If you are using C++ then I would avoid using void* as you don't really need to. There is nothing wrong with having multiple methods. Note that you don't actually have to rename the function in your first set of examples - you can just overload a function using different parameters so that there is a separate function signature for each type. Ultimately, this kind of question is very subjective and there are a number of ways of doing things. Looking at your functions of the first type, you would perhaps be well served by looking into the use of function templates.
You could create a struct. That's what I use to handle console commands.
typedef int (* pFunPrintf)(const char*,...);
typedef void (CommandClass::*pKeyFunc)(char *,pFunPrintf);
struct KeyCommand
{
const char * cmd;
unsigned char cmdLen;
pKeyFunc pfun;
const char * Note;
long ID;
};
#define CMD_FORMAT(a) a,(sizeof(a)-1)
static KeyCommand Commands[]=
{
{CMD_FORMAT("one"), &CommandClass::CommandOne, "String Parameter",0},
{CMD_FORMAT("two"), &CommandClass::CommandTwo, "String Parameter",1},
{CMD_FORMAT("three"), &CommandClass::CommandThree, "String Parameter",2},
{CMD_FORMAT("four"), &CommandClass::CommandFour, "String Parameter",3},
};
#define AllCommands (sizeof(Commands)/sizeof(KeyCommand))
And the Parser function
void CommandClass::ParseCmd( char* Argcommand )
{
unsigned int x;
for ( x=0;x<AllCommands;x++)
{
if(!memcmp(Commands[x].cmd,Argcommand,Commands[x].cmdLen ))
{
(this->*Commands[x].pfun)(&Argcommand[Commands[x].cmdLen],&::printf);
break;
}
}
if(x==AllCommands)
{
// Unknown command
}
}
I use a thread safe printf pPrintf, so ignore it.
I don't really know what you want to do, but in C++ you probably should derive multiple classes from a Formatter Base class like this:
class Formatter
{
public:
virtual ~Formatter() {}
virtual void Format(unsigned char* buffer, Command command) const = 0;
};
class YourClass
{
public:
void Method(Command command, const Formatter& formatter)
{
formatter.Format(buffer_, command);
}
private:
unsigned char* buffer_;
};
int main()
{
//
Params1Formatter formatter(/*...*/);
YourClass yourObject;
yourObject.Method(CommandA, formatter);
// ...
}
This removes the responsibility for handling all the params stuff from your class and makes it closed to changes. If new commands or parameters appear during further development, you don't have to modify (and possibly break) existing code; you just add new classes that implement the new stuff.
While not a full answer, this should guide you in the right direction: ONE FUNCTION, ONE RESPONSIBILITY. Prefer code that is responsible for one thing only and does it well. Code with a huge switch statement (which is not bad by itself) where you need to cast void * to some other type is a smell.
By the way, I hope you realise that, according to the standard, you can safely cast from void * to <type> * only when the pointer was originally obtained by converting a <type> * to void *.

How can I emulate constructor and destructor behavior (for particular data types) in C

I have a C (nested) structure that I would like to automagically initialize and destroy in my code.
I am compiling with GCC (4.4.3) on Linux. I am vaguely aware of GCC function attributes constructor and destructor, but the construction/destruction they provide seem to relate to the entire program (i.e. before main() is called etc).
I want to be able to have different init/cleanup funcs for different data types - is this C++ like behaviour something that I can emulate using POC?
I have included the C++ tag because this is really C++ behaviour I am trying to emulate in C.
There's no way to do this automatically, at least not in any portable manner. In C you'd typically have functions that work somewhat like constructors and destructors (they (de)allocate memory and (de)initialize fields), except they have to be called explicitly:
typedef struct{} MyStruct;
MyStruct *MyStruct_New(void);
void MyStruct_Free(MyStruct *obj);
The language was simply not designed for this and you shouldn't try to force it, imo. If you want to have automatic destruction, you shouldn't be using C.
#define your way through the problem...
As pointed out by previous authors there is no automatic way of doing what you are asking, which sadly is kind of obvious since C doesn't have any way of doing true OOP.
But a programmer can always hack him or herself through any kind of obstacle.. At the end of this post I wrote you a sample hack to circumvent the problem.
There are methods of cleaning up the macro provided, though it won't be as portable.
C99 implementation: http://ideone.com/9XcCt
C89 implementation: http://ideone.com/WYrjU
- C99 implementation
#include <stdio.h>
#include <stdlib.h>
...
#define SCOPIFY(TYPE,NAME, ...) { \
ctor_ ## TYPE(& NAME); \
__VA_ARGS__ \
dtor_ ## TYPE(& NAME); \
} (void)0
...
typedef struct {
int * p;
} Obj;
void
ctor_Obj (Obj* this) {
this->p = malloc (sizeof (int));
*this->p = 123;
fprintf (stderr, "Obj::ctor, (this -> %p)\n", (void*)this);
}
void
dtor_Obj (Obj* this) {
free (this->p);
fprintf (stderr, "Obj::dtor, (this -> %p)\n", (void*)this);
}
...
int
main (int argc, char *argv[])
{
Obj o1, o2;
SCOPIFY (Obj, o1,
fprintf (stderr, " o1.p -> %d\n", *o1.p);
SCOPIFY (Obj, o2,
int a, b;
fprintf (stderr, " o2.p -> %d\n", *o2.p);
(*o1.p) += (*o2.p);
);
fprintf (stderr, " o1.p -> %d\n", *o1.p);
);
return 0;
}
output (http://ideone.com/WYrjU)
Obj::ctor, (this -> 0xbf8f05ac)
o1.p -> 123
Obj::ctor, (this -> 0xbf8f05a8)
o2.p -> 123
Obj::dtor, (this -> 0xbf8f05a8)
o1.p -> 246
Obj::dtor, (this -> 0xbf8f05ac)
From what you write, I figure that you know already how to write init and destroy functions that eventually use their counterparts for individual parts recursively.
Yes, there is no standard mechanism in C that would allow for something like automatic construction or destruction.
Construction can be somewhat replaced by writing an initializer macro. Designated initializers come in handy for that
#define TOTO_INITIALIZER(TUTU_PARAM, TATA_PARAM) \
{ \
.tata_member = TATA_INITIALIZER(TATA_PARAM), \
.tutu_member = TUTU_INITIALIZER(TUTU_PARAM), \
}
since they make such code robust against reordering of members.
For destructors there is nothing that can be coupled to a variable or data type. The only possibility I know of is scope-based resource management, which in C you can implement through hidden for-scope local variables.
There's no default way to have a function automatically called when you create a struct. Here's an example of a creation and initialisation function set for a certain type of struct:
#include <stdio.h>
#include <stdlib.h>
// Simple struct that holds an ID number and a file pointer.
typedef struct
{
int id;
FILE *data;
} Datum;
// Function to create a Datum from a given file.
Datum *create_datum(const char *fname)
{
// Create Datum object.
Datum *d = (Datum*)malloc(sizeof(Datum));
// malloc may return NULL if we're out of memory.
if(d)
{
// Initialise ID to something.
d->id = 0;
// Open filename passed.
d->data = fopen(fname, "r");
}
return d;
}
// Function to safely destroy a Datum. This function takes a pointer-pointer so
// that it can set the pointer to NULL after deleting the object. Saves you
// from dangling pointers.
void destroy_datum(Datum **dp)
{
if(!dp)
return;
// Get a plain pointer for convenience
Datum *d = *dp;
if(d)
{
// Close the file, if it was opened successfully.
if(d->data)
fclose(d->data);
// Delete the object.
free(d);
// Set the pointer to NULL.
*dp = NULL;
}
}
// Now use these functions:
int main(void)
{
Datum *datum = create_datum("test.txt");
if(datum)
{
// Do some things!
}
destroy_datum(&datum);
// datum is now equal to NULL.
}
Hope that helps! Like Homunculus has said, C isn't a great language if you need to do a lot of this sort of stuff - but sometimes you just want to abstract away the process of creating a struct, as well as cleaning it up. This is especially helpful in modular design, where a module can provide the create_ and destroy_ interface functions, and hide the actual implementation of those.
I did not see the gcc tag, but since the original poster explicitly mentions the GCC constructor/destructor attributes:
https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Function-Attributes.html#index-g_t_0040code_007bconstructor_007d-function-attribute-2500
I'd like to point out that there is also the cleanup attribute:
https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/Common-Variable-Attributes.html#index-g_t_0040code_007bcleanup_007d-variable-attribute-3486
cleanup (cleanup_function)
The cleanup attribute runs a function when
the variable goes out of scope. This attribute can only be applied to
auto function scope variables; it may not be applied to parameters or
variables with static storage duration. The function must take one
parameter, a pointer to a type compatible with the variable. The
return value of the function (if any) is ignored. If -fexceptions is
enabled, then cleanup_function is run during the stack unwinding that
happens during the processing of the exception. Note that the cleanup
attribute does not allow the exception to be caught, only to perform
an action. It is undefined what happens if cleanup_function does not
return normally.

Initialize global array of function pointers at either compile-time, or run-time before main()

I'm trying to initialize a global array of function pointers at compile-time, in either C or C++. Something like this:
module.h
typedef int16_t (*myfunc_t)(void);
extern myfunc_t myfunc_array[];
module.cpp
#include "module.h"
int16_t myfunc_1();
int16_t myfunc_2();
...
int16_t myfunc_N();
// the ordering of functions is not that important
myfunc_t myfunc_array[] = { myfunc_1, myfunc_2, ... , myfunc_N };
func1.cpp, func2.cpp, ... funcN.cpp (symbolic links to a single func.cpp file, so that different object files are created: func1.o, func2.o, func3.o, ... , funcN.o. NUMBER is defined using g++ -DNUMBER=N)
#include "module.h"
#define CONCAT2(x, y) x ## y
#define CONCAT(x, y) CONCAT2(x, y)
int16_t CONCAT(myfunc_, NUMBER)() { ... }
When compiled using g++ -DNUMBER=N, after preprocessing becomes:
func1.cpp
...
int16_t myfunc_1() { ... }
func2.cpp
...
int16_t myfunc_2() { ... }
and so on.
The declarations of myfunc_N() and the initialization of myfunc_array[] are not cool, since N changes often and could be anywhere from 10 to 200. I prefer not to use a script or Makefile to generate them either. The ordering of functions is not that important; I can work around that. Is there a neater/smarter way to do this?
How To Make a Low-Level Function Registry
First you create a macro to place pointers to your functions in a special section:
/* original typedef from question: */
typedef int16_t (*myfunc)(void);
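/* the extra macro level lets an argument such as NUMBER expand before ## pastes it */
#define myfunc_register_(N) \
static myfunc registered_myfunc_##N \
__attribute__((__used__, __section__(".myfunc_registry"))) = myfunc_##N
#define myfunc_register(N) myfunc_register_(N)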
The static variable name is arbitrary (it will never be used) but it's nice to choose an expressive name. You use it by placing the registration just below your function:
myfunc_register(NUMBER);
Now when you compile your file (each time) it will have a pointer to your function in the section .myfunc_registry. This will all compile as-is but it won't do you any good without a linker script. Thanks to caf for pointing out the relatively new INSERT AFTER feature:
SECTIONS
{
.rel.rodata.myfunc_registry : {
PROVIDE(myfunc_registry_start = .);
*(.myfunc_registry)
PROVIDE(myfunc_registry_end = .);
}
}
INSERT AFTER .text;
The hardest part of this scheme is creating the entire linker script: You need to embed that snippet in the actual linker script for your host which is probably only available by building binutils by hand and examining the compile tree or via strings ld. It's a shame because I quite like linker script tricks.
Link with gcc -Wl,-Tlinkerscript.ld ... The -T option will enhance (rather than replace) the existing linker script.
Now the linker will gather all of your pointers with the section attribute together and helpfully provide a symbol pointing before and after your list:
extern myfunc myfunc_registry_start[], myfunc_registry_end[];
Now you can access your array:
/* this cannot be static because it is not known at compile time */
size_t myfunc_registry_size = (myfunc_registry_end - myfunc_registry_start);
size_t i;
for (i = 0; i < myfunc_registry_size; ++i)
(*myfunc_registry_start[i])();
They will not be in any particular order. You could number them by putting them in __section__(".myfunc_registry." #N) and then in the linker gathering *(.myfunc_registry.*), but the sorting would be lexicographic instead of numeric.
I have tested this out with gcc 4.3.0 (although the gcc parts have been available for a long time) and ld 2.18.50 (you need a fairly recent ld for the INSERT AFTER magic).
This is very similar to the way the compiler and linker conspire to execute your global ctors, so it would be a whole lot easier to use a static C++ class constructor to register your functions and vastly more portable.
You can find examples of this in the Linux kernel, for example __initcall is very similar to this.
I was going to suggest this question is more about C, but on second thoughts, what you want is a global container of function pointers, and to register available functions into it. I believe this is called a Singleton (shudder).
You could make myfunc_array a vector, or wrap up a C equivalent, and provide a function to push myfuncs into it. Now finally, you can create a class (again you can do this in C), that takes a myfunc and pushes it into the global array. This will all occur immediately prior to main being called. Here are some code snippets to get you thinking:
// a header
#include <vector>
extern std::vector<myfunc> myfunc_array;
struct _register_myfunc {
_register_myfunc(myfunc lolz0rs) {
myfunc_array.push_back(lolz0rs);
}
};
#define register_myfunc(lolz0rs) static _register_myfunc _unique_name(lolz0rs);
// a source
std::vector<myfunc> myfunc_array;
// another source
int16_t myfunc_1() { ... }
register_myfunc(myfunc_1);
// another source
int16_t myfunc_2() { ... }
register_myfunc(myfunc_2);
Keep in mind the following:
You can control the order the functions are registered by manipulating your link step.
The initialization of your translation unit-scoped variables occurs before main is called, i.e. the registering will be completed.
You can generate unique names using some macro magic and __COUNTER__. There may be other sneaky ways that I don't know about. See these useful questions:
Unnamed parameters in C
Unexpected predefined macro behaviour when pasting tokens
How to generate random variable names in C++ using macros?
Your solution sounds much too complicated and error prone to me.
You go over your project with a script (or probably make) to pass the -D options to the compiler anyhow. So I suppose you are keeping a list of all your functions (or rather of the files defining them).
I'd use proper names for all the functions, with none of your numbering scheme, and then I would generate the file "module.cpp" with that script and initialize the table with the names.
For this you just have to keep a list of all your functions (and perhaps filenames) in one place. I think this could be kept consistent more easily than your current scheme.
Edit: Thinking about it, even this might be overengineering. If you have to maintain a list of your functions somewhere in any case, why not just inside the file "module.cpp"? Just include the header files of all your functions there, and list them in the initializer of the table.
Since you allow C++, the answer is obviously yes, with templates:
template<int N> int16_t myfunc() { /* N is a const int here */ }
myfunc_t myfunc_array[] = { myfunc<0>, myfunc<1>, myfunc<2> };
Now, you might wonder if you can create that variable-length initializer list with some macro. The answer is yes, but the macros needed are ugly. So I'm not going to write them here, but will point you to Boost.Preprocessor.
However, do you really need such an array? Do you really need the name myfunc_array[0] for myfunc<0> ? Even if you need a runtime argument (myfunc_array[i]) there are other tricks:
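#include <cassert>
template <int Nmax> inline int16_t myfunc_wrapper(int i) {
assert(i >= 0 && i <= Nmax);
return (i == Nmax) ? myfunc<Nmax>() : myfunc_wrapper<Nmax - 1>(i);
}
template <> inline int16_t myfunc_wrapper<0>(int i) { // base case ends the recursion
assert(i == 0);
return myfunc<0>();
}
inline int16_t myfunc_wrapper(int i) {
return myfunc_wrapper<NUMBER>(i); // NUMBER is defined with g++ -DNUMBER=N
}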
Ok I worked out a solution based on Matt Joiner's tip:
module.h
typedef int16_t (*myfunc_t)(void);
extern myfunc_t myfunc_array[];
class FunctionRegistrar {
public:
FunctionRegistrar(myfunc_t fn, int fn_number) {
myfunc_array[fn_number - 1] = fn; // ensures correct ordering of functions (not that important though)
}
};
module.cpp
#include "module.h"
myfunc_t myfunc_array[100]; // The size needs to be #defined by the compiler, probably
func1.cpp, func2.cpp, ... funcN.cpp
#include "module.h"
static int16_t myfunc(void) { ... }
static FunctionRegistrar functionRegistrar(myfunc, NUMBER);
Thanks everyone!