Can I automatically build an argument list with a macro? - c++

I've been writing a little hooking library, which uses MS Detours and can attach to any MS API in any process and log what it's doing. One of the things I've tried to do in my library is to remove all the boilerplate from hooking so that each hook can be written very simply.
I've got it down to this being all that's required to hook the CreateFile function and print its first parameter to the debugger:
Hooker hooker;
Hook(hooker, dllnames::Kernel32, apis::T_CreateFileW, CreateFileW, [](auto lpFilename, auto ...rest) -> auto
{
APILOG(L"%s", lpFilename);
return TruePtr(lpFilename, rest...);
});
The only boilerplate required for this is the function pointer definition apis::T_CreateFileW. However, it strikes me that for many cases I could go even further than this. I'm wondering if I could use a macro to write the above as:
APILOG(hooker, dllnames::Kernel32, CreateFileW, L"%s", lpFilename);
This would mean that I could almost write a printf that would allow me to log what any API was doing, provided its parameters could be reasonably captured with a format string.
But is this possible as a macro? I guess some problems are:
I have to expand the ... of the macro into both auto param1, auto param2 as well as param1, param2.
I'd have to specify and print at least the first N parameters of the real API in my format string whether I want them or not, since only at that point can I pass rest...
If I want to print all the parameters, I wouldn't be able to pass rest...
It's possible I've already simplified this as far as I can, I'm just curious to see if anything approaching it is possible.

I don't think this is possible without using multiple macros and duplicating the parameter names at least one time; you can pass a multiple-args "bundle" using parens, but you can't do something with each argument like adding the auto prefix.
I'd instead consider more advanced code generation techniques using something other than the preprocessor. m4 comes to mind. Even a simple shell script could be used.
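The "bundle using parens" idea can be sketched as follows. This is only an illustration: join3, CALL_WITH_BUNDLE and demo_bundle are hypothetical names, not part of the hooking library. The macro receives the whole parenthesised list as a single argument and can paste it after a function name, but a plain macro like this one never sees the individual elements, so it cannot turn lpFilename into auto lpFilename.

```cpp
#include <string>

// Hypothetical target function, used only for illustration.
inline std::string join3(const std::string& a, const std::string& b, const std::string& c)
{
    return a + "," + b + "," + c;
}

// ARGS is ONE macro argument: a parenthesised, comma-separated list.
// The macro pastes the whole bundle after the function name; it does not
// visit the elements one at a time.
#define CALL_WITH_BUNDLE(fn, ARGS) fn ARGS

// The macro call below expands to: join3("x", "y", "z")
inline std::string demo_bundle()
{
    return CALL_WITH_BUNDLE(join3, ("x", "y", "z"));
}
```

The parameter names would still have to be written a second time (once with auto, once without) to build the lambda's parameter list, which is the duplication mentioned above.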

Related

llvm function wrapper for timing

I would like to add a function wrapper in order to record the entry and exit times of certain functions. It seems that LLVM would be a good tool to accomplish this. However, I've been having trouble finding a tutorial on how to write function wrappers. Any suggestions?
p.s. my target language is C
Assuming you need to call func_start when entering each function and func_return when returning, the easiest way is to do the following:
for each function F
insert a call to func_start(F) before the first instruction in the entry block
for each block B in function F
get the terminator instruction T
if T is a return instruction
insert a call to func_return(F) before T
All in all, including boilerplate code for your FunctionPass, you'll have to write about 40 lines of code for this.
If you really want to go with the wrapper approach you have to do:
for each function F
clone function F (call it G)
delete all instructions in F
insert a call to func_start(F) in F
insert a call to G in F (forwarding the arguments), put the return value in R
insert a call to func_return(F) in F
insert a return instruction returning R in F
The code complexity in this case will be slightly higher and you'll likely incur higher compile-time and run-time overhead.
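Seen at the source level, the wrapper transformation has roughly the effect sketched below. This is not LLVM pass code; it is plain C++ showing what F and G look like afterwards. The names add, add_original and trace_log are hypothetical, and func_start/func_return simply record strings here where a real setup would record timestamps.

```cpp
#include <string>
#include <vector>

// Hypothetical trace sink; a real setup would store entry/exit times.
inline std::vector<std::string>& trace_log()
{
    static std::vector<std::string> log;
    return log;
}

inline void func_start(const char* name)  { trace_log().push_back(std::string("start ") + name); }
inline void func_return(const char* name) { trace_log().push_back(std::string("return ") + name); }

// G: the clone that holds the original body of F.
inline int add_original(int a, int b)
{
    return a + b;
}

// F: the original name now only wraps the clone.
inline int add(int a, int b)
{
    func_start("add");
    int r = add_original(a, b);   // forward the arguments, keep the result in R
    func_return("add");
    return r;                     // return R
}
```

Callers keep calling add and never see the clone, which is why the pass has to empty F and re-fill it instead of renaming call sites.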
I like doing this and use several approaches, depending on the circumstance.
The easiest if you are on a Linux platform is to use the wonderful ltrace utility. You provide the C program you are timing as an argument to ltrace. The "-T" option will output the elapsed call time. If you want a summary of call times use the "-c" option. You can control the amount of output by using the "-e" and "--library" options. Other platforms have somewhat similar tools (like dtrace) but they are not quite as easy to use.
Another, slightly hackish approach is to use macros to redefine the function names. This has all the potential pitfalls of macros but can work well in a controlled environment for smallish programs. The C preprocessor will not recursively expand macros so you can just call the actual function from inside your wrapper macro at the point of call. This avoids the difficulty of placing the "stop timing" code before each potential return in the function body.
#define foo(a,b,c) ({long t0 = now(); int retval = foo(a,b,c); long elapsed = now() - t0; retval;})
Notice the use of the non-standard code block inside an expression. This avoids collisions of the temporary names used for timing and retval. Also by placing retval as the last expression in the statement list this code will time function calls that are embedded in assignments or other expressional contexts (you need to change the type of "retval" to whatever is appropriate for your function).
You must be very careful NOT to include the #define before prototypes and such.
Use your favorite timer function and its appropriate data type (double, long long, whatever). I like <chrono> in C++11 myself.
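For C++ code, the same "time around the call" idea can be written without the non-standard statement expression. The following is a minimal sketch using <chrono>; Timed, timed_call and this particular foo are hypothetical names, and the helper only covers callables that return a value (not void).

```cpp
#include <chrono>
#include <type_traits>
#include <utility>

// Result of a timed call: the callee's return value plus the elapsed time.
template <typename R>
struct Timed
{
    R value;
    std::chrono::nanoseconds elapsed;
};

// Calls fn(args...) and measures the duration with a monotonic clock.
// Assumption: fn returns a non-void value.
template <typename Fn, typename... Args>
auto timed_call(Fn&& fn, Args&&... args)
    -> Timed<typename std::decay<decltype(std::forward<Fn>(fn)(std::forward<Args>(args)...))>::type>
{
    const auto t0 = std::chrono::steady_clock::now();
    auto result = std::forward<Fn>(fn)(std::forward<Args>(args)...);
    const auto t1 = std::chrono::steady_clock::now();
    return { std::move(result),
             std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0) };
}

// Hypothetical function to time.
inline int foo(int a, int b, int c)
{
    return a + b + c;
}
```

A call such as timed_call(foo, 1, 2, 3) yields the return value in .value and the duration in .elapsed. Unlike the macro, each call site has to be changed by hand.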

Spirit Qi semantic actions and parameters for functions unrelated to the parser

How would I declare a semantic action that calls a free function that doesn't use the attribute the rule/parser returned?
Like, let's say we have a parser that returns a string, but I want to call an unrelated function like Beep, which takes two integer values for frequency and duration and does not care for strings?
Is it actually possible to call it directly, or do I always have to write a proxy function which consumes the string and calls, in this case, Beep in its body?
Edit:
My apologies.
I should have mentioned that I used boost::phoenix::bind at first with the syntax Hartmut suggested, which gave me this error:
could not deduce template argument for 'RT (__cdecl *)(T0,T1)' from 'BOOL (__stdcall *)(DWORD,DWORD)'
Is it the calling convention that messes things up here?
Edit2:
Seems that's the problem: Hartmut's code compiles with a plain function that takes the same number and types of arguments as Beep.
example:
bool foo(DWORD a, DWORD b)
{
return true; // body is irrelevant here; only the signature matters
}
px::bind(&foo,123,456); //compiles
px::bind(&Beep,123,456); // doesn't compile and generates the error message above.
A Google search revealed that (most) WINAPI functions use the __stdcall convention, not the default __cdecl convention that C/C++ functions such as foo get with the compiler option /Gd.
So the answers given so far were all correct, the phoenix solution just didn't work out of the box for me.
(Which motivated me to post this question in the first place. I'm sorry for its undignified and confusing nature; maybe this clears it all up now.)
The only thing unclear to me now is...how I would make phoenix get along with __stdcall, but that should probably be a separate question.
As John said you could use boost::bind. You have to be very careful not to mix and match placeholder variables from different libraries, though. When using boost::bind you need to use its placeholder variables, i.e. ::_1 (yes, boost::bind's placeholders are in global namespace - yuck).
The best solution (and the safest in terms of compatibility) is to utilize boost::phoenix::bind. This is compatible with boost::bind and anything you can do there is possible with phoenix::bind as well (and more). Moreover, Spirit 'understands' Phoenix constructs and exposes all of its internals using special placeholder variables which are themselves implemented using Phoenix.
In your case the code would look like:
namespace phx = boost::phoenix;
some_parser[phx::bind(&Beep, 123, 456)];
which will call Beep(123, 456) whenever some_parser matches.
I imagine you can use Boost Bind. Instead of writing a wrapper function and giving its address to Qi, you could do this:
boost::bind( Beep, 123, 456 );
This will build a binding which discards its own arguments and calls Beep(123, 456). If you wanted to pass its argument along as well (not that you do here, just to illustrate something common), you can do this:
boost::bind( Beep, 123, 456, _1 );
Then it will call Beep(123, 456, the_string)
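The "binding discards its own arguments" behaviour can be tried without Boost or Windows. The sketch below uses std::bind as a stand-in for boost::bind (the standard version also evaluates and drops call arguments that no placeholder refers to) and a hypothetical fake_beep in place of the Win32 Beep. The lambda variant sidesteps bind's function-pointer deduction; whether a given Spirit version accepts a lambda directly as a semantic action is an assumption not checked here.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for Beep(frequency, duration); it records its
// arguments so the effect of a call can be observed.
struct BeepRecord { unsigned frequency = 0; unsigned duration = 0; int calls = 0; };

inline BeepRecord& last_beep()
{
    static BeepRecord record;
    return record;
}

inline bool fake_beep(unsigned frequency, unsigned duration)
{
    BeepRecord& r = last_beep();
    r.frequency = frequency;
    r.duration = duration;
    ++r.calls;
    return true;
}

// Both arguments are bound up front and no placeholder is used, so whatever
// the caller passes (here: the parsed string) is ignored.
inline std::function<void(const std::string&)> make_beep_action()
{
    return std::bind(&fake_beep, 440u, 250u);
}

// Same effect with a lambda that names the function directly.
inline std::function<void(const std::string&)> make_beep_action_lambda()
{
    return [](const std::string&) { fake_beep(440u, 250u); };
}
```

Calling make_beep_action()("some attribute") invokes fake_beep(440, 250) and drops the string, which is the shape a semantic action for an unrelated function needs.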

Conversion of JSON logging macros to template functions... parameter names needed in code

I was assigned the task of updating a very old project a while back. The first thing I had to do was to expand the existing code to incorporate a new feature. As part of this I modified existing macros to print JSON representations of incoming messages (over CORBA, into C++ structs). I then incorporated boost program_options and a new logger and now I want to modernise the macros.
The problem is that I have no idea how to implement what I did with the macros with templates. The key problem is that I use the name of the parameters to the macros to access the fields of the struct:
//Defines the string that precedes the variable name in a JSON name-value pair (newline,indent,")
#define JSON_PRE_VNAME _T("%s,\n\t\t\t\t\"")
//Defines the string that follows the variable name in a JSON name-value pair (":) preceding the value
#define JSON_SEP _T("\":")
#define printHex(Y,X) _tprintf(_T("%02X"), (unsigned char)##Y->##X );
// ******** MACRO **********
// printParam (StructureFieldName=X, ParamType=Y)
// prints out a json key value pair.
// e.g. printParam(AgentId, %s) will print "AgentId":"3910"
// e.g. printParam(TempAgent, %d) will print "TempAgent":1
#define printParam(X,Y) if(strcmp(#Y,"%s")==0){\
_byteCount += _stprintf(_logBuf,JSON_PRE_VNAME _T(#X) JSON_SEP _T("\"%s\""),_logBuf,myEvent->##X);\
}else{\
_byteCount += _stprintf(_logBuf,JSON_PRE_VNAME _T(#X) JSON_SEP _T(#Y),_logBuf,myEvent->##X);\
}\
printBufToLog();
And it is used like this:
//CORBA EVENT AS STRUCT "event"
else if(event.type == NI_eventSendInformationToHost ){
evSendInformationToHost *myEvent;
event.data >>= myEvent; //demarshall
printParam(EventTime,%d);
printParam(Id,%d);
printParam(NodeId,%d);
}
and this results in JSON like this:
"EventTime":1299239194,
"Id":1234567,
"NodeId":3
etc...
Obviously I have commented these macros fairly well, but I am hoping that for the sake of anyone else looking at the code that there is a nice way to achieve the same result with templates. I have to say the macros do make it very easy to add new events to the message logger.
Basically how do I do "#X" and ##X with templates?
Any pointers would be appreciated.
Thanks!
There are some things that you cannot really do without macros, and for some specific contexts macros are the solution. I would just leave the macros as they are and move on to the next task.
Well, I would actually try to improve the macros a bit. It is usually recommended not to end a macro with ;, and to wrap macros that contain more than a single statement in a do {} while(0) loop:
#define printHex(Y,X) _tprintf(_T("%02X"), (unsigned char)##Y->##X )
// remove ; ^
// add do while here:
#define printParam(X,Y) do { if(strcmp(#Y,"%s")==0){\
_byteCount += _stprintf(_logBuf,JSON_PRE_VNAME _T(#X) JSON_SEP _T("\"%s\""),_logBuf,myEvent->##X);\
}else{\
_byteCount += _stprintf(_logBuf,JSON_PRE_VNAME _T(#X) JSON_SEP _T(#Y),_logBuf,myEvent->##X);\
}\
printBufToLog();\
} while (false)
This might help avoid small mistakes that would otherwise be hard to fix, as, for example uses of the macros with if:
if (condition) printHex(a,b);
else printHex(c,d);
// looks good, but would originally expand to a syntax error:
if (condition) _tprintf(_T("%02X"), (unsigned char)a->b );;
else ...
Similarly
if (condition) printParam(a,b);
else ...
would expand to a whole lot of nonsense for the compiler even if it looks correct enough to the casual eye.
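A small self-contained sketch of the same idiom, independent of the logging macros: APPEND_TWICE and describe are hypothetical names. Because the macro body is a single do { } while (false) statement that still needs the caller's semicolon, it behaves like an ordinary function call inside if/else.

```cpp
#include <string>

// Two statements wrapped so that the macro expands to exactly one statement.
// Note there is no trailing ';' in the definition: the caller supplies it.
#define APPEND_TWICE(out, text) \
    do {                        \
        (out) += (text);        \
        (out) += (text);        \
    } while (false)

inline std::string describe(bool flag)
{
    std::string out;
    if (flag)
        APPEND_TWICE(out, "yes");   // the ';' completes the do/while statement
    else
        APPEND_TWICE(out, "no");    // the else still pairs with the if above
    return out;
}
```

Without the wrapper, only the first statement would belong to the if branch and the else would no longer compile.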
I think that in many cases it's better to use an external code generator... starting from a nice neutral definition it's easy to generate C++, Javascript and whatnot to handle your data.
C++ templates are quite primitive and structure/class introspection is just absent. By playing some tricks you can be able to do ifs and loops (wow! what an accomplishment) but a lot of useful techniques are just out of reach. Also once you get your hard to debug template trickery working, at the first error the programmer makes you get screens and screens of babbling nonsense instead of a clear error message.
On the other side you have the C preprocessor, that is horribly weak at doing any real processing and is just a little more (and also less) than a regexp search/replace.
Why cling to poor tools instead of just implementing a separate code generation phase (that can easily be integrated in the make process) where you can use a serious language of your choice, able to do both processing and text manipulation easily?
How easy would it be to write a neutral, easy-to-parse file and then use, for example, a Python program to generate the C++ struct declarations, the serialization code and also the JavaScript counterpart for that?

c++ va_arg typecast issue

All,
I am writing a small C++ app and have been stumped by this issue. Is there a way to raise (and later catch) an error when va_arg reads an element from a va_list and the element is not of the expected type? E.g.:
count=va_arg(argp,int);
if (count <= 0 || count > 30)
{
reportParamError(); return;
}
Now, if I am passing a typedef instead of int, I get garbage value on MS compiler but 95% of time count gets value 0 on gcc (on 64 bit sles10 sys). Is there a way I can enforce some typechecking, so that I get an error that can be caught in a catch block?
Any ideas on this would be very helpful to me. Or is there a better way to do this. The function prototype is:-
void process(App_Context * pActx, ...)
The function is called as
process(pAtctx,3,type1,type2,type3);
It is essential for pActx to be passed as 1st parameter and hence cannot pass count as 1st parameter.
Update-1
OK, this sounds strange, but nargs does not seem to be part of va_list on sles10 gcc. I had to put in
#ifdef _WIN32
tempCount=va_arg(argp,int);
#endif
After using this, parameters following nargs do not get garbage values. However, this introduces compiler/platform based #ifdefs....Thanks Chris and Kristopher
If you know a count will always be passed as the second argument, then you could always change the signature to this:
void process(App_Context * pActx, int count, ...)
If that's not an option, then there is really no way to catch it. That's just how the variable-argument-list stuff works: there is no way for the callee to know what arguments are being passed, other than whatever information the caller passes.
If you look into how the va_arg macro and related macros are implemented, you may be able to figure out how to inspect all the stuff on the stack. However, this would not be portable, and it is not recommended except as a debugging aid.
You also might want to look into alternatives to variable-arguments, like function overloading, templates, or passing a vector or list of arguments.
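One of those alternatives, sketched under assumptions: the App_Context below is a hypothetical stand-in with a single member, and the "types" are taken to be plain int values. Passing a std::initializer_list<int> keeps pActx as the first parameter, derives the count from the list itself, and reports a bad count with an exception that a catch block can handle.

```cpp
#include <initializer_list>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the question's context type.
struct App_Context { std::vector<int> seen; };

// Every element is an int by construction; an argument that cannot be used
// as an int without narrowing is rejected at compile time. The count comes
// from the list, so it cannot disagree with the arguments.
inline void process(App_Context* pActx, std::initializer_list<int> types)
{
    if (types.size() == 0 || types.size() > 30)
        throw std::invalid_argument("expected between 1 and 30 types");
    pActx->seen.assign(types.begin(), types.end());
}
```

The call site changes from process(pActx, 3, type1, type2, type3) to process(pActx, {type1, type2, type3}).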
No, there is no way. varargs doesn't provide any way to check the types of parameters passed in. You must only read them with the correct type which means that you need another way of communicating type information.
You are likely to be better off avoiding varargs functionality unless you really need it. It's only really a C++ feature for the sake of legacy functions such as printf and friends.

A Good Way to Store C++ CLI Arguments? (W/O using libraries)

So, I'm writing a CLI application in C++ which will accept a bunch of arguments.
The syntax is pretty typical, -tag arg1 arg2 -tag2 arg1 ...
Right now, I take the char** argv and parse them into an
std::map< std::string, std::list<std::string> >
The key is the tag, and the list holds each token behind that tag but before the next one. I don't want to store my args as just std::strings; but I need to make it more interactive.
By interactive, I mean when a user types './myprog -help' a list of all available commands comes up with descriptions.
Currently, my class to facilitate this is:
class Argument
{
public:
Argument(std::string flag, std::string desc);
std::string getFlag();
std::string getDesc();
std::list<std::string> getArgs();
void setArgs(std::list<std::string> args);
virtual bool validSyntax()=0;
virtual std::string getSyntaxErrorDesc()=0;
};
The std::map structure is in a class ProgramCommands, which handles these Arguments.
Now that the problem description is over, my 4 questions are:
How do I give the rest of the program access to the data in ProgramCommands?
I don't want to make a singleton at all, and I'd prefer not to have to pass ProgramCommands as an arg to almost every function in the program.
Do you have better ideas about storing how I'm doing this?
How best can I add arguments to the program, without hardcoding them into the ProgramCommands, or main?
std::string only allows for 1 line descriptions, does anyone have an elegant solution to this besides using a list of strings or boost?
EDIT
I don't really want to use libraries because this is a school project (sniffing & interpreting packets). I could, if I wanted to, but I'd rather not.
Your choices on storing the command line arguments are either: Make them a global or pass them around to the functions that need them. Which way is best depends on the sorts of options you have.
If MANY places in your program need the options (for instance a 'verbose' option), then I'd just make the structure a global and get on with my life. It doesn't need to be a singleton (you'll still only have one of them, but that's OK).
If you only need the options at startup time (i.e. # of threads to start initially or port # to connect on), then you can keep the parsing local to 'main' and just pass the parameters needed to the appropriate functions.
I tend to just parse options with the venerable getopt library (yes, that's a leftover from C - and it works just fine) and I stuff the option info (flags, values) into a global structure or a series of global variables. I give usage instructions by having a function 'print_usage' that just prints out the basic usage info as a single block of text. I find it works, it's quick, it's simple, and it gets the job done.
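Since the question rules out libraries, here is a minimal hand-rolled sketch of the "parse once, then hand out what is needed" approach instead of getopt. ParsedArgs and parse_args are hypothetical names; the layout follows the std::map of std::list described in the question. One assumption to note: any token longer than one character that starts with '-' is treated as a tag, so negative numbers would need extra handling.

```cpp
#include <list>
#include <map>
#include <string>

typedef std::map<std::string, std::list<std::string> > ParsedArgs;

// Groups tokens under the most recent "-tag". Tokens that appear before
// any tag are stored under the empty key.
inline ParsedArgs parse_args(int argc, const char* const argv[])
{
    ParsedArgs parsed;
    std::string current;
    for (int i = 1; i < argc; ++i)   // argv[0] is the program name
    {
        const std::string token = argv[i];
        if (token.size() > 1 && token[0] == '-')
        {
            current = token.substr(1);
            parsed[current];          // record the tag even if no values follow
        }
        else
        {
            parsed[current].push_back(token);
        }
    }
    return parsed;
}
```

main can call parse_args(argc, argv) once and then either store the result in a global or pass only the relevant values (a port number, a verbosity flag) to the functions that need them.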
I don't understand your objection to using a singleton; this is the sort of thing they are made for. If you want them accessible to every object without passing them as arguments or using a singleton, there are only a couple of tricks I can think of:
-put the parsed arguments into shared memory and then read them from every function that needs them
-write the parsed arguments out to a binary file and then read them from every function that needs them
-global variables
None of these solutions are nearly as elegant as a singleton, are MUCH more labor intensive and are well... sort of silly compared to a singleton... why hamstring yourself?