compile time assertions *across modules* / c,c++ - c++

Recently I discovered, in a relatively large project, that ugly runtime crashes occurred because various headers were included in a different order in different cpp files.
These headers contained #pragma pack directives, and these pragmas were sometimes not 'closed' (I mean, set back to the compiler default with #pragma pack()), resulting in different object layouts in different object files. No wonder the application crashed when it accessed members of structs created in one module and passed to another module, or when derived classes accessed members of base classes.
Since I like to derive a more general debugging and assertion strategy from every bug I find, I would really like to assert that object layouts are always and everywhere the same.
So it would be easy to assert
ASSERT( offsetof(membervar) == 4 )
But this would not catch a different layout in another module, and it would require manual updates whenever the struct layout changes. So my favourite idea would be something like
ASSERT( offsetof(membervar) == offsetof(othermodule_membervar) )
Would this be possible with an assertion? Or is this a case for a unit test?
Thanks,
H

ASSERT( offsetof(membervar) == offsetof(othermodule_membervar) )
I can't see a way to make this technically possible. Further, even if it were physically possible, it isn't practical. You'd need an assert for every pair of source files:
ASSERT( offsetof(A.c::MyClass.membervar) == offsetof(B.c::MyClass.membervar) )
ASSERT( offsetof(A.c::MyClass.membervar) == offsetof(C.c::MyClass.membervar) )
ASSERT( offsetof(A.c::MyClass.membervar) == offsetof(D.c::MyClass.membervar) )
ASSERT( offsetof(B.c::MyClass.membervar) == offsetof(C.c::MyClass.membervar) )
ASSERT( offsetof(B.c::MyClass.membervar) == offsetof(D.c::MyClass.membervar) )
etc

You might be able to get away with this by asserting the sizeof(class) in different files. If the packing is causing the size of the object to be smaller, then I would expect that sizeof() would show that up.
You could also do this as a static assert using C++0x's static assert, or Boost's (or a handrolled one of course)
On the part of not wanting to do this in every file, I would recommend putting together a header file that includes all the headers you're worried about, and the static_asserts.
Personally though, I'd just recommend searching through the code base over the list of pragmas and fix them.
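A minimal sketch of the shared-header idea, assuming a C++11 compiler. The struct `Packet` is hypothetical (it is not from the code base in question), and the expected numbers assume a 4-byte `int` and an 8-byte `double` under default packing, so they are illustrative rather than universal.

```cpp
#include <cstddef>  // offsetof

// Hypothetical struct standing in for a type shared between modules.
struct Packet {
    char   tag;      // offset 0, followed by padding
    int    length;   // offset 4 with default packing
    double payload;  // offset 8 with default packing
};

// Every translation unit that includes this header re-checks the layout.
// If a stray #pragma pack(1) or pack(2) is active here, compilation fails.
static_assert(sizeof(Packet) == 16, "Packet size differs: check #pragma pack state");
static_assert(offsetof(Packet, length) == 4, "Packet::length moved");
static_assert(offsetof(Packet, payload) == 8, "Packet::payload moved");
```

Because the checks live next to the struct definition, they run once per including file without any per-file bookkeeping. The hard-coded values still need a manual update when the struct legitimately changes.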

Wendy,
In Win32, there are single functions that can populate different versions of a given struct. Over the years, the FOOBAR struct might have new features added to it, so they create a FOOBAR2 or FOOBAREX. In some cases there are more than two versions.
Anyway, the way they handle this is to have the caller pass in sizeof(theStruct) in addition to the pointer to the struct:
FOOBAREX foobarex = {0};
long lResult = SomeWin32Api(sizeof(foobarex), &foobarex);
Within the implementation of SomeWin32Api(), they check the first parameter and determine which version of the struct they're dealing with.
You could do something similar in a debug build to assure that the caller and callee agree on the size of the struct being referred to, and assert if the value doesn't match the expected size. With macros, you might even be able to automate/hide this so that it only happens in a debug build.
Unfortunately, this is a run-time check and not a compile-time check...
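A rough sketch of how that size handshake could look. All names here (`FooBarEx`, `fill_foobar`, `FILL_FOOBAR`, `fill_foobar_or_die`) are made up for illustration and are not Win32 APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct shared between caller and callee.
struct FooBarEx {
    int    id;
    double value;
};

// Callee side: returns false when the caller's idea of the struct size
// differs from the callee's, instead of writing through a bad layout.
bool fill_foobar(std::size_t caller_size, FooBarEx* out) {
    if (out == nullptr || caller_size != sizeof(FooBarEx)) return false;
    out->id = 1;
    out->value = 2.5;
    return true;
}

// Caller side: the macro supplies sizeof automatically.
#define FILL_FOOBAR(ptr) fill_foobar(sizeof(*(ptr)), (ptr))

// The result is asserted separately so that the call itself survives
// in builds where NDEBUG removes the assert.
inline void fill_foobar_or_die(FooBarEx* out) {
    const bool ok = FILL_FOOBAR(out);
    assert(ok && "caller and callee disagree on sizeof(FooBarEx)");
    (void)ok;  // avoids an unused-variable warning under NDEBUG
}
```

The check only fires if caller and callee were compiled with different packing, because `sizeof` is evaluated separately in each translation unit. Two layouts that happen to have the same size would still slip through.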

What you want isn't directly possible as such. If you're using VC++, the following may be of interest:
http://blogs.msdn.com/vcblog/archive/2007/05/17/diagnosing-hidden-odr-violations-in-visual-c-and-fixing-lnk2022.aspx
There's probably scope to create some way of semi-automating the process it describes, collating the output and cross-referencing.
To detect this sort of problem somewhat more automatically, the following occurs to me. Create a file that defines a struct that will have a particular size with the designated default packing amount, but a different size with different pack values. Also include some kind of static assert that its size is correct. For example, if the default is 4-byte packing:
struct X {
char c;
int i;
double d;
};
extern const char g_check[sizeof(X)==16?1:-1];
Then #include this file at the start of every header (just write a program to put the extra includes in if there's too many to do by hand), and compile and see what happens. This won't directly detect changes in struct layout, just non-standard packing settings, which is what you're interested in anyway.
(When adding new headers one would put this #include at the top, along with the usual ifdef boilerplate and so on. This is unfortunate but I'm not sure there's any way around it. The best solution is probably to ask people to do it, but assume they'll forget, and run the extra-include-inserting program every now and again...)
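A variant of the probe idea as a sketch, assuming C++11 and a compiler that supports `#pragma pack(push/pop)` (GCC, Clang and MSVC do). `WireHeader` and `PackProbe` are hypothetical names. The probe compares against `alignof(double)` instead of a hard-coded size, which assumes the in-struct alignment of `double` equals `alignof(double)`; that holds on x86-64 but may not hold on every 32-bit ABI.

```cpp
#include <cstddef>  // offsetof

// A header that needs tight packing should save and restore the setting.
#pragma pack(push, 1)
struct WireHeader {      // hypothetical packed struct
    char kind;
    int  length;
};
#pragma pack(pop)        // back to whatever was in effect before the push

// Probe struct: with no packing override, 'd' sits at its natural alignment.
// Under a smaller #pragma pack value it would move to a smaller offset.
struct PackProbe {
    char   c;
    double d;
};

static_assert(sizeof(WireHeader) == 1 + sizeof(int), "WireHeader is not packed");
static_assert(offsetof(PackProbe, d) == alignof(double),
              "non-default #pragma pack is active at this point");
```

If the `#pragma pack(pop)` line is deleted, the second assertion fails, which is exactly the unclosed-pragma situation described in the question.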

Apologies for posting an answer - which this is not - but I don't know how to post code in comments. Sorry.
To wrap Brone's idea in a macro, here is what we currently use (feel free to edit it):
/** Our own assert macro, which will trace a FATAL error message if the assert
* fails. A FATAL trace will cause a system restart.
* Note: I would love to use CPPUNIT_ASSERT_MESSAGE here, for a nice clean
* test failure if testing with CppUnit, but since this header file is used
* by C code and the relevant CppUnit include file uses C++ specific
* features, I cannot.
*/
#ifdef TESTING
/* ToDo: might want to trace a FATAL if integration testing */
#define ASSERT_MSG(subsystem, message, condition) if (!(condition)) {printf("Assert failed: \"%s\" at line %d in file \"%s\"\n", message, __LINE__, __FILE__); fflush(stdout); abort();}
/* we can also use this, which prints the failed condition as its message */
#define ASSERT_CONDITION(subsystem, condition) if (!(condition)) {printf("Assert failed: \"%s\" at line %d in file \"%s\"\n", #condition, __LINE__, __FILE__); fflush(stdout); abort();}
#else
#define ASSERT_MSG(subsystem, message, condition) if (!(condition)) DebugTrace(FATAL, subsystem, __FILE__, __LINE__, "%s", message);
#define ASSERT_CONDITION(subsystem, condition) if (!(condition)) DebugTrace(FATAL, subsystem, __FILE__, __LINE__, "%s", #condition);
#endif

What you would be looking for is an assertion like ASSERT_CONSISTENT(A_x, offsetof(A,x)), placed in a header file. Let me explain why, and what the problem is.
Because the problem exists across translation units, you can only detect the error at link time. That means you need to force the linker to spit out an error. Unfortunately, most cross-translation unit problems are formally of the "no diagnosis needed" kind. The most familiar one is the ODR. We can trivially cause ODR violations with such assertions, but you just can't rely on the linker to warn you about them. If you could, the implementation of the assertion could be as simple as
#define ASSERT_CONSISTENT(label, x) class ASSERT_ ## label { char test[x]; };
But if the linker doesn't notice these ODR violations, this will pass by silently. And here lies the problem: the linker really only needs to complain if it can't find something.
With two macros the problem is solved:
template <int i> class dummy; // needed to differentiate functions
#define ASSERT_DEFINE(label, x) void ASSERT_ ## label(dummy<(x)>&) { }
#define ASSERT_CHECK(label, x) void ASSERT_ ## label(dummy<(x)>&); static void (*check_ ## label)(dummy<(x)>&) = &ASSERT_ ## label;
You'd need to put the ASSERT_DEFINE macro in a .cpp, and ASSERT_CHECK in its header. If the x value checked isn't the x value defined for that label, you're taking the address of an undefined function. Now, a linker doesn't need to warn about multiple definitions, but it must warn about missing definitions.
BTW, for this particular problem, see Diagnosing Hidden ODR Violations in Visual C++ (and fixing LNK2022)

Related

Merging global arrays at link time / filling a global array from multiple compilation units

I want to define an array of things, like event handlers. The contents of
this array is completely known at compile time, but is defined among
multiple compilation units, distributed amongst multiple libraries that
are fairly decoupled, at least until the final (static) link. I'd like
to keep it that way too - so adding or deleting a compilation unit will
also automatically manage the event handler without having to modify a
central list of event handlers.
Here's an example of what I'd like to do (but does not work).
central.h:
typedef void (*callback_t)(void);
callback_t callbacks[];
central.c:
#include "central.h"
void do_callbacks(void) {
int i;
for (i = 0; i < sizeof(callbacks) / sizeof(*callbacks); ++i)
callbacks[i]();
}
foo.c:
#include "central.h"
void callback_foo(void) { }
callback_t callbacks[] = {
&callback_foo
};
bar.c:
#include "central.h"
void callback_bar(void) { }
callback_t callbacks[] = {
&callback_bar
};
What I'd like to happen is to get a single callbacks array, which contains
two elements: &callback_foo and &callback_bar. With the code above, there's
obviously two problems:
The callbacks array is defined multiple times.
sizeof(callbacks) isn't known when compiling central.c.
It seems to me that the first point could be solved by having the linker merge
the two callbacks symbols instead of throwing an error (possibly through some
attribute on the variable), but I'm not sure if there is something like that.
Even if there is, the sizeof problem should somehow also be solved.
I realize that a common solution to this problem is to just have a startup
function or constructor that "registers" the callback. However, I can see only
two ways to implement this:
Use dynamic memory (realloc) for the callbacks array.
Use static memory with a fixed (bigger than usually needed) size.
Since I'm running on a microcontroller platform (Arduino) with limited memory,
neither of these approaches appeal to me. And given that the entire contents of
the array is known at compile time, I'm hoping for a way to let the compiler
also see this.
I've found this and this solution, but those require a custom
linker script, which is not feasible in the compilation environment I'm
running (especially not since this would require explicitly naming each
of these special arrays in the linker script, so just having a single
linker script addition doesn't work here).
This solution is the best I found so far. It uses a linked list
that is filled at runtime, but uses memory allocated statically in each
compile unit separately (e.g. a next pointer is allocated with each
function pointer). Still, the overhead of these next pointers should not
be required - is there any better approach?
Perhaps having a dynamic solution combined with link-time optimization can
somehow result in a static allocation?
Suggestions on alternative approaches are also welcome, though the required
elements are having a static list of things, and memory efficiency.
Furthermore:
Using C++ is fine, I just used some C code above for illustrating the problem, most Arduino code is C++ anyway.
I'm using gcc / avr-gcc and though I'd prefer a portable solution, something that is gcc only is also ok.
I have template support available, but not STL.
In the Arduino environment that I use, I have no Makefile or other way to easily run some custom code at compile time, so I'm looking for something that can be entirely implemented in the code.
As commented in some previous answer, the best option is to use a custom linker script (with a KEEP(*(SORT(.whatever.*))) input section).
Anyway, it can be done without modifying the linker scripts (working sample code below), at least on some platforms with gcc (tested on an xtensa embedded device and cygwin).
Assumptions:
We want to avoid using RAM as much as possible (embedded)
We do not want the calling module to know anything about the modules with callbacks (it is a lib)
No fixed size for the list (unknown size at library compile time)
I am using GCC. The principle may work on other compilers, but I have not tested it
Callback functions in this sample receive no arguments, but it is quite simple to modify if needed
How to do it:
We need the linker to somehow allocate at link time an array of pointers to functions
As we do not know the size of the array, we also need the linker to somehow mark the end of the array
This is quite specific, as the right way is using a custom linker script, but it happens to be feasible without doing so if we find a section in the standard linker script that is always "kept" and "sorted".
Normally, this is true for the .ctors.* input sections (standard ld scripts keep these sections and sort them by name, which is how GCC orders constructors by priority), so we can hack a little and give it a try.
Just take into account that it may not work for all platforms (I have tested it in xtensa embedded architecture and CygWIN, but this is a hacking trick, so...).
Also, as we are putting the pointers in the constructors section, we need to use one byte of RAM (for the whole program) to skip the callback code during C runtime init.
test.c:
A library that registers a module called test, and calls its callbacks at some point
#include "callback.h"
CALLBACK_LIST(test);
void do_something_and_call_the_callbacks(void) {
// ... doing something here ...
CALLBACKS(test);
// ... doing something else ...
}
callme1.c:
Client code registering two callbacks for module test. The generated functions have no name (indeed they do have a name, but it is magically generated to be unique inside the compilation unit)
#include <stdio.h>
#include "callback.h"
CALLBACK(test) {
printf("%s: %s\n", __FILE__, __FUNCTION__);
}
CALLBACK(test) {
printf("%s: %s\n", __FILE__, __FUNCTION__);
}
void callme1(void) {} // stub to be called in the test sample to include the compilation unit. Not needed in real code...
callme2.c:
Client code registering another callback for module test...
#include <stdio.h>
#include "callback.h"
CALLBACK(test) {
printf("%s: %s\n", __FILE__, __FUNCTION__);
}
void callme2(void) {} // stub to be called in the test sample to include the compilation unit. Not needed in real code...
callback.h:
And the magic...
#ifndef __CALLBACK_H__
#define __CALLBACK_H__
#ifdef __cplusplus
extern "C" {
#endif
typedef void (* callback)(void);
int __attribute__((weak)) _callback_ctor_stub = 0;
#ifdef __cplusplus
}
#endif
#define _PASTE(a, b) a ## b
#define PASTE(a, b) _PASTE(a, b)
#define CALLBACK(module) \
static inline void PASTE(_ ## module ## _callback_, __LINE__)(void); \
static void PASTE(_ ## module ## _callback_ctor_, __LINE__)(void); \
static __attribute__((section(".ctors.callback." #module "$2"))) __attribute__((used)) const callback PASTE(__ ## module ## _callback_, __LINE__) = PASTE(_ ## module ## _callback_ctor_, __LINE__); \
static void PASTE(_ ## module ## _callback_ctor_, __LINE__)(void) { \
if(_callback_ctor_stub) PASTE(_ ## module ## _callback_, __LINE__)(); \
} \
inline void PASTE(_ ## module ## _callback_, __LINE__)(void)
#define CALLBACK_LIST(module) \
static __attribute__((section(".ctors.callback." #module "$1"))) const callback _ ## module ## _callbacks_start[0] = {}; \
static __attribute__((section(".ctors.callback." #module "$3"))) const callback _ ## module ## _callbacks_end[0] = {}
#define CALLBACKS(module) do { \
const callback *cb; \
_callback_ctor_stub = 1; \
for(cb = _ ## module ## _callbacks_start ; cb < _ ## module ## _callbacks_end ; cb++) (*cb)(); \
} while(0)
#endif
main.c:
If you want to give it a try... this is the entry point for a standalone program (tested and working on gcc-cygwin)
void do_something_and_call_the_callbacks(void);
int main() {
do_something_and_call_the_callbacks();
}
output:
This is the (relevant) output in my embedded device. The function names are generated at callback.h and can have duplicates, as the functions are static
app/callme1.c: _test_callback_8
app/callme1.c: _test_callback_4
app/callme2.c: _test_callback_4
And in CygWIN...
$ gcc -c -o callme1.o callme1.c
$ gcc -c -o callme2.o callme2.c
$ gcc -c -o test.o test.c
$ gcc -c -o main.o main.c
$ gcc -o testme test.o callme1.o callme2.o main.o
$ ./testme
callme1.c: _test_callback_4
callme1.c: _test_callback_8
callme2.c: _test_callback_4
linker map:
This is the relevant part of the map file generated by the linker
*(SORT(.ctors.*))
.ctors.callback.test$1 0x4024f040 0x0 .build/testme.a(test.o)
.ctors.callback.test$2 0x4024f040 0x8 .build/testme.a(callme1.o)
.ctors.callback.test$2 0x4024f048 0x4 .build/testme.a(callme2.o)
.ctors.callback.test$3 0x4024f04c 0x0 .build/testme.a(test.o)
Try to solve the actual problem. What you need are multiple callback functions, that are defined in various modules, that aren't in the slightest related to each other.
What you have done though, is to place a global variable in a header file, which is accessible by every module including that header. This introduces a tight coupling between all such files, even though they are not related to each other. Furthermore, it seems only the callback handler .c function needs to actually call the functions, yet they are exposed to the whole program.
So the actual problem here is the program design and nothing else.
And there is actually no apparent reason why you need to allocate this array at compile time. The only sane reason would be to save RAM, which is of course a valid reason on an embedded system. In which case the array should be declared as const and initialized at compile time.
You can keep something similar to your design if storing the array as read-write objects. Or if the array must be a read-only one for the purpose of saving RAM, you must do a drastic re-design.
I'll give both versions, consider which one is most suitable for your case:
RAM-based read/write array
(Advantage: flexible, can be changed in runtime. Disadvantages: RAM consumption. Slight over-head code for initialization. RAM is more exposed to bugs than flash.)
Let callback.h and callback.c form a module which is only concerned with the handling of the callback functions. This module is responsible for how the callbacks are allocated and when they are executed.
In callback.h define a type for the callback functions. This should be a function pointer type just as you have done. But remove the variable declaration from the .h file.
In callback.c, declare the callback array of functions as
static callback_t callbacks [LARGE_ENOUGH_FOR_WORST_CASE];
There is no way you can avoid "LARGE_ENOUGH_FOR_WORST_CASE". You are on an embedded system with limited RAM, so you have to actually consider what the worst-case scenario is and reserve enough memory for that, no more, no less. On a microcontroller embedded system, there are no such things as "usually needed" nor "let's save some RAM for other processes". Your MCU either has enough memory to cover the worst case scenario, or it does not, in which case no amount of clever allocations will save you.
In callback.c, declare a size variable that keeps track of how much of the callback array that has been initialized. static size_t callback_size;.
Write an init function void callback_init(void) which initializes the callback module. The prototype should be in the .h file and the caller is responsible for executing it once, at program startup.
Inside the init function, set callback_size to 0. The reason I propose to do this in runtime is because you have an embedded system where a .bss segment may not be present or even undesired. You might not even have a copy-down code that initializes all static variables to zero. Such behavior is non-conformant with the C standard but very common in embedded systems. Therefore, never write code which relies on static variables getting automatically initialized to zero.
Write a function void callback_add (callback_t callback);. Since callback_t is already a function pointer type, it is passed by value. Every module that includes your callback module will call this function to add their specific callback functions to the list.
Keep your do_callbacks function as it is (though as a minor remark, consider renaming to callback_traverse, callback_run or similar).
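The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CALLBACK_CAPACITY` stands in for LARGE_ENOUGH_FOR_WORST_CASE with an arbitrary value, `callback_add` takes the function pointer by value and reports failure with a `bool`, `callback_run` is the renamed `do_callbacks`, and `demo_tick`/`demo_hits` are hypothetical names that exist only to show a registration.

```cpp
#include <cassert>
#include <cstddef>

typedef void (*callback_t)(void);

#define CALLBACK_CAPACITY 8   /* assumed worst case; pick this per project */

static callback_t callbacks[CALLBACK_CAPACITY];
static std::size_t callback_size;

// Called once at startup; does not rely on zero-initialized statics.
void callback_init(void) {
    callback_size = 0;
}

// Returns false when the table is full or the pointer is null.
bool callback_add(callback_t cb) {
    if (cb == nullptr || callback_size >= CALLBACK_CAPACITY) return false;
    callbacks[callback_size++] = cb;
    return true;
}

// Runs every registered callback in registration order.
void callback_run(void) {
    for (std::size_t i = 0; i < callback_size; ++i) {
        assert(callbacks[i] != nullptr);
        callbacks[i]();
    }
}

// Hypothetical client callback, only here to demonstrate usage.
static int demo_hits = 0;
void demo_tick(void) { ++demo_hits; }
```

A client module would call `callback_add(&demo_tick)` from its own init routine after `callback_init()` has run, and the owner of the module decides when `callback_run()` fires.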
Flash-based read-only array
(Advantages: saves RAM, true read-only memory safe from memory corruption bugs. Disadvantages: less flexible, depends on every module used in the project, possibly slightly slower access because it's in flash.)
In this case, you'll have to turn the whole program upside-down. By the nature of compile-time solutions, it will be a whole lot more "hard-coded".
Instead of having multiple unrelated modules including a callback handler module, you'll have to make the callback handler module include everything else. The individual modules still don't know when a callback will get executed or where it is allocated. They just declare one or several functions as callbacks. The callback module is then responsible for adding every such callback function to its array at compile-time.
// callback.c
#include "timer_module.h"
#include "spi_module.h"
...
static const callback_t CALLBACKS [] =
{
&timer_callback1,
&timer_callback2,
&spi_callback,
...
};
The advantage of this is that you'll automatically get the worst case scenario handed to you by your own program. The size of the array is now known at compile time, it is simply sizeof(CALLBACKS)/sizeof(callback_t).
Of course this isn't nearly as elegant as the generic callback module. You get a tight coupling from the callback module to every other module in the project, but not the other way around. Essentially, the callback.c is a "main()".
You can still use a function pointer typedef in callback.h though, but it is no longer actually needed: the individual modules must ensure that they have their callback functions written in the desired format anyhow, with or without such a type present.
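A compilable sketch of the read-only table, with stand-in callbacks defined inline where the real project would include timer_module.h and spi_module.h. The counters are hypothetical and only make the effect observable. One caveat worth hedging: on AVR with avr-gcc, a plain `const` array is typically still copied to RAM, and keeping it in flash needs additional steps such as PROGMEM, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>

typedef void (*callback_t)(void);

// Stand-ins for functions that timer_module.h and spi_module.h would declare.
static int timer_events = 0;
static int spi_events = 0;
void timer_callback1(void) { ++timer_events; }
void timer_callback2(void) { ++timer_events; }
void spi_callback(void)    { ++spi_events; }

// The complete list is fixed at compile time.
static const callback_t CALLBACKS[] = {
    &timer_callback1,
    &timer_callback2,
    &spi_callback,
};

// The element count falls out of the initializer; no worst-case guess needed.
static const std::size_t CALLBACK_COUNT = sizeof(CALLBACKS) / sizeof(CALLBACKS[0]);

void do_callbacks(void) {
    for (std::size_t i = 0; i < CALLBACK_COUNT; ++i) {
        assert(CALLBACKS[i] != nullptr);
        CALLBACKS[i]();
    }
}
```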
I too am faced with a similar problem:
...need are multiple callback functions, that are defined in various
modules, that aren't in the slightest related to each other.
Mine is C, on Atmel XMega processor. You mentioned that you are using GCC. The following doesn't solve your problem, it is a variant on the above #1 solution. It exploits the __attribute__((weak)) directive.
1) For each optional module, have a unique (per module name) but similar (per purpose) callback function. E.g.
fooModule.c:
void foo_eventCallback(void) {
// do the foo response here
}
barModule.c:
void bar_eventCallback(void) {
// do the bar response here
}
yakModule.c:
void yak_eventCallback(void) {
// do the yak response here
}
2) Have a callback start point that looks something like:
__attribute__((weak)) void foo_eventCallback(void) { }
__attribute__((weak)) void bar_eventCallback(void) { }
__attribute__((weak)) void yak_eventCallback(void) { }
void functionThatExcitesCallback(void) {
foo_eventCallback();
bar_eventCallback();
yak_eventCallback();
}
The __attribute__((weak)) qualifier basically creates a default implementation with an empty body, which the linker will replace with a different variant IF it finds a non-weak variant by the same name. It doesn't make it completely decoupled, unfortunately. But you can at least put this big super-set-of-all-callbacks in one and only one place, and not get into header file hell with it. And then your different compilation units basically replace the subsets of the superset that they want to. I would love it if there was a way to do this with using the same named function in all modules and just have those called based on what's linked, but haven't yet found something that does that.

increase c++ code verbosity with macros

I'd like to have the possibility to increase the verbosity for debug purposes of my program. Of course I can do that using a switch/flag during runtime. But that can be very inefficient, due to all the 'if' statements I should add to my code.
So, I'd like to add a flag to be used during compilation in order to include optional, usually slow debug operations in my code, without affecting the performance/size of my program when not needed. here's an example:
/* code */
#ifdef _DEBUG_
/* do debug operations here */
#endif
so, compiling with -D_DEBUG_ should do the trick. without it, that part won't be included in my program.
Another option (at least for i/o operations) would be to define at least an i/o function, like
#ifdef _DEBUG_
#define LOG(x) std::clog << x << std::endl;
#else
#define LOG(x)
#endif
However, I strongly suspect this probably isn't the cleanest way to do that. So, what would you do instead?
I prefer to use #ifdef with real functions so that the function has an empty body if _DEBUG_ is not defined:
void log(std::string x)
{
#ifdef _DEBUG_
std::cout << x << std::endl;
#endif
}
There are three big reasons for this preference:
When _DEBUG_ is not defined, the function definition is empty and any modern compiler will completely optimize out any call to that function (the definition should be visible inside that translation unit, of course).
The #ifdef guard only has to be applied to a small localized area of code, rather than every time you call log.
You do not need to use lots of macros, avoiding pollution of your code.
You can use macros to change implementation of the function (Like in sftrabbit's solution). That way, no empty places will be left in your code, and the compiler will optimize the "empty" calls away.
You can also use two distinct files for the debug and release implementation, and let your IDE/build script choose the appropriate one; this involves no #defines at all. Just remember the DRY rule and make the clean code reusable in debug scenario.
I would say that this actually is very dependent on the actual problem you are facing. Some problems will benefit more from the second solution, whilst simple code might be better off with simple defines.
Both snippets that you describe are correct ways of using conditional compilation to enable or disable the debugging through a compile-time switch. However, your assertion that checking the debug flags at runtime "can be very inefficient, due to all the 'if' statements I should add to my code" is mostly incorrect: in most practical cases a runtime check does not influence the speed of your program in a detectable way, so if keeping the runtime flag offers you potential advantages (e.g. turning the debugging on to diagnose a problem in production without recompiling) you should go for a run-time flag instead.
For the additional checks, I would rely on the assert (see the assert.h) which does exactly what you need: check when you compile in debug, no check when compiled for the release.
For the verbosity, a more C++ version of what you propose would use a simple Logger class with a boolean as template parameter. But the macro is fine as well if kept within the Logger class.
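One possible shape for such a Logger, as a sketch only. The names `Logger`, `Log` and the `enabled` member are made up here, and `_DEBUG_` is simply the flag used in the question.

```cpp
#include <iostream>

// Primary template: logging enabled.
template <bool Enabled>
struct Logger {
    static const bool enabled = true;
    template <typename T>
    static void log(const T& value) { std::clog << value << std::endl; }
};

// Specialization: logging disabled; log() is an empty inline function
// that an optimizing compiler can drop entirely.
template <>
struct Logger<false> {
    static const bool enabled = false;
    template <typename T>
    static void log(const T&) {}
};

// One place decides which variant the rest of the code uses.
#ifdef _DEBUG_
typedef Logger<true> Log;
#else
typedef Logger<false> Log;
#endif
```

Client code then writes `Log::log("starting up");` everywhere, and the single `#ifdef` above is the only place the macro appears.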
For commercial software, having SOME debug output that is available at runtime on customer sites is usually a valuable thing to have. I'm not saying everything has to be compiled into the final binary, but it's not at all unusual that customers do things to your code that you don't expect [or that causes the code to behave in ways that you don't expect]. Being able to tell the customer "Well, if you run myprog -v 2 -l logfile.txt and do your usual thing, then email me logfile.txt" is a very, very useful thing to have.
As long as the "if-statement to decide if we log or not" is not in the deepest, darkest jungle in peru, eh, I mean in the deepest nesting levels of your tight, performance critical loop, then it's rarely a problem to leave it in.
So, I personally tend to go for the "always there, not always enabled" approach. That's not to say that I don't find myself adding some extra logging in the middle of my tight loops sometimes - only to remove it later on when the bug is fixed.
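A small sketch of the "always there, not always enabled" approach, assuming a `-v N` command-line option like the one in the example above. The helper names (`parse_verbosity`, `set_verbosity`, `should_log`, `LOG_AT`) are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <iostream>

static int g_verbosity = 0;  // 0 means quiet

// Scans for "-v N" and returns N, or 0 when the option is absent.
int parse_verbosity(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "-v") == 0) return std::atoi(argv[i + 1]);
    }
    return 0;
}

void set_verbosity(int level) { g_verbosity = level; }

// The only per-call cost when logging is off is this integer comparison.
bool should_log(int level) { return level <= g_verbosity; }

// The message expression is evaluated only when the level is enabled.
#define LOG_AT(level, expr) \
    do { if (should_log(level)) { std::clog << expr << std::endl; } } while (0)
```

A call such as `LOG_AT(2, "cache size: " << n);` stays in the release binary but does no formatting work unless the user asked for level 2 or higher.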
You can avoid the function-like macro when doing conditional compilation. Just define a regular or template function to do the logging and call it inside the:
#ifdef _DEBUG_
/* ... */
#endif
part of the code.
At least in the *Nix universe, the default define for this kind of thing is NDEBUG (read no-debug). If it is defined, your code should skip the debug code. I.e. you would do something like this:
#ifdef NDEBUG
inline void log(...) {}
#else
inline void log(...) { .... }
#endif
An example piece of code I use in my projects. This way, you can use variable argument list and if DEBUG flag is not set, related code is cleared out:
#ifdef DEBUG
#define PR_DEBUG(fmt, ...) \
printf("[DBG] %s: " fmt, __func__, ## __VA_ARGS__)
#else
#define PR_DEBUG(fmt, ...)
#endif
Usage:
#define DEBUG
<..>
ret = do_smth();
PR_DEBUG("some kind of code returned %d", ret);
Output:
[DBG] some_func: some kind of code returned 0
of course, printf() may be replaced by any output function you use. Furthermore, it can be easily modified so additional information, as for example time stamp, is automatically appended.
For me it depends from application to application.
I've had applications where I wanted to always log (for example, we had an application where in case of errors, clients would take all the logs of the application and send them to us for diagnostics). In such a case, the logging API should probably be based on functions (i.e. not macros) and always defined.
In cases when logging is not always necessary or you need to be able to completely disable it for performance/other reasons, you can define logging macros.
In that case I prefer a single-line macro like this:
#ifdef NDEBUG
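// Caution: pasting "/" and "/" does not form a valid token in standard C++,
// so this comment-out trick only works on preprocessors that tolerate it.
#define LOGSTREAM /##/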
#else
#define LOGSTREAM std::clog
// or
// #define LOGSTREAM std::ofstream("output.log", std::ios::out|std::ios::app)
#endif
client code:
LOGSTREAM << "Initializing chipmunk feeding module ...\n";
//...
LOGSTREAM << "Shutting down chipmunk feeding module ...\n";
It's just like any other feature.
My assumptions:
No global variables
System designed to interfaces
For whatever you want verbose output, create two implementations, one quiet, one verbose.
At application initialisation, choose the implementation you want.
It could be a logger, or a widget, or a memory manager, for example.
Obviously you don't want to duplicate code, so extract the minimum variation you want. If you know what the strategy pattern is, or the decorator pattern, these are the right direction. Follow the open closed principle.
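As a sketch of that idea (a plain strategy-style interface, assuming C++11), with hypothetical names `Reporter`, `QuietReporter`, `VerboseReporter`, `make_reporter` and `sum_to`:

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <string>

// The interface the rest of the system is written against.
struct Reporter {
    virtual ~Reporter() {}
    virtual void note(const std::string& message) = 0;
};

// Quiet implementation: does nothing.
struct QuietReporter : Reporter {
    void note(const std::string&) override {}
};

// Verbose implementation: writes every note to std::clog.
struct VerboseReporter : Reporter {
    void note(const std::string& message) override { std::clog << message << '\n'; }
};

// Chosen once, at application initialisation.
std::unique_ptr<Reporter> make_reporter(bool verbose) {
    if (verbose) return std::unique_ptr<Reporter>(new VerboseReporter());
    return std::unique_ptr<Reporter>(new QuietReporter());
}

// Business logic depends only on the interface, not on the verbosity.
int sum_to(int n, Reporter& reporter) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
        reporter.note("running total: " + std::to_string(total));
    }
    return total;
}
```

The result of `sum_to` is the same with either implementation; only the side channel differs, which is the "minimum variation" being extracted.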

How to separate logging logic from business logic in a C program? And in a C++ one?

I am currently coding in C and I have lots of printfs so that I can track, at times, the flow of my application. The problem is that sometimes I want more detail than at others, so I usually spend my time commenting/uncommenting my C code, so I can get the appropriate output.
When using Java or C#, I can generally separate both my implementation code from the logging logic by using Aspects.
Is there any similar technique you use in C to get around this problem?
I know I could put a flag called DEBUG that could be either on or off, so I wouldn't have to go all around and comment/uncomment my whole code every time I want to either show or hide the printfs. The question is I'd like to also get rid of the logging logic in my code.
If instead of C I was coding in C++, would it be any better?
Edit
It seems there is an AspectC++, so for C++ there seems to be a solution. What about C?
Thanks
IME you cannot really separate logging from the algorithms that you want to log about. Placing logging statements strategically takes time and experience. Usually, the code keeps assembling logging statements over its entire lifetime (though it's asymptotic). Usually, the logging evolves with the code. If the algorithm changes often, so will usually the logging code.
What you can do is make logging as unobtrusive as possible. That is, make sure logging statements always are one-liners that do not disrupt reading the algorithm, make it so others can insert additional logging statements into an existing algorithm without having to fully understand your logging lib, etc.
In short, treat logging like you treat string handling: Wrap it in a nice little lib that will be included and used just about everywhere, make that lib fast, and make it easy to use.
Not really.
If you have variadic macros available, you can easily play games like this:
#ifdef NDEBUG
#define log(...) (void)0
#else
#define log(...) do {printf("%s:%d: ", __FILE__, __LINE__); printf(__VA_ARGS__);} while(0)
#endif
You can also have logging that's turn-off-and-onable at a finer granularity:
#define LOG_FLAGS <something>
#define maybe_log(FLAG, ...) do { if ((FLAG) & LOG_FLAGS) printf(__VA_ARGS__); } while(0)
int some_function(int x, int y) {
maybe_log(FUNCTION_ENTRY, "x=%d;y=%d\n", x, y);
... do something ...
maybe_log(FUNCTION_EXIT, "result=%d\n", result);
return result;
}
Obviously this can be a bit tedious with only allowing a single return from each function, since you can't directly get at the function return.
Any of those macros and calls to printf could be replaced with something (other macros, or variadic function calls) that allows the actual logging format and target to be separated from the business logic, but the fact of some kind of logging being done can't be, really.
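As a rough sketch of that last point, the format and the target can live behind one small function so that the business code only ever names the logging call. Everything here (log_sink_fn, set_log_sink, log_msg, capture_sink) is a hypothetical name made up for illustration, and the fixed 512-byte buffer is an arbitrary assumption:

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical names throughout; this is a sketch, not an existing library.
using log_sink_fn = void (*)(const char* text);

// Default sink: write the formatted message to stderr.
static void stderr_sink(const char* text) { std::fputs(text, stderr); }

static log_sink_fn g_sink = stderr_sink;

// Swap the destination (file, syslog, in-memory buffer, ...) in one place.
// Passing a null pointer restores the default stderr sink.
void set_log_sink(log_sink_fn sink) { g_sink = sink ? sink : stderr_sink; }

// Business code calls only this; it never names printf or a target.
void log_msg(const char* format, ...) {
    char buffer[512];  // assumption: longer messages are simply truncated
    va_list ap;
    va_start(ap, format);
    std::vsnprintf(buffer, sizeof buffer, format, ap);
    va_end(ap);
    g_sink(buffer);
}

// Example alternative sink that collects messages in memory, e.g. for tests.
static std::string g_captured;
void capture_sink(const char* text) { g_captured += text; }
```

A caller would write log_msg("x=%d;y=%d\n", x, y); and the decision about where the text ends up is made once, via set_log_sink. The call itself still sits in the business logic, as the answer says.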
aspectc.org does claim to offer a C and C++ compiler with language extensions supporting AOP. I have no idea what state it's in, and if you use it then of course you're not really writing C (or C++) any more.
Remember that C++ has multiple inheritance, which is sometimes helpful with cross-cutting concerns. With enough templates you can do remarkable things, perhaps even implementing your own method dispatch system that allows some sort of join points, but it's a big thing to take on.
On GCC you could use variadic macros: http://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html . They make it possible to define dprintf() with any number of parameters.
Using an additional hidden verbose_level parameter you can filter the messages.
In this case the logging logic will only contain
dprintf_cond(flags_or_verbose_level, msg, param1, param2);
and there will be no need in separating it from the rest of code.
A flag and proper logic is probably the safer way to do it, but you could do the same at compile time, i.e. use #define and #ifdef to include/exclude the printfs.
Hmm, this sounds similar to a problem I encountered when working on a C++ project last summer. It was a distributed app which had to be absolutely bulletproof and this resulted in a load of annoying exception handling bloat. A 10 line function would double in size by the time you added an exception or two, because each one involved building a stringstream from a looong exception string plus any relevant parameters, and then actually throwing the exception maybe five lines later.
So I ended up building a mini exception handling framework which meant I could centralise all my exception messages inside one class. I would initialise that class with my (possibly parameterised) messages on startup, and this allowed me to write things like throw CommunicationException(28, param1, param2) (variadic arguments). I think I'll catch a bit of flak for that, but it made the code infinitely more readable. The only danger, for example, was that you could inadvertently throw that exception with message #27 rather than #28.
#ifndef DEBUG_OUT
# define DBG_MSG(level, format, ...) do{}while(0)
# define DBG_SET_LEVEL(x) do{}while(0)
#else
extern int dbg_level;
# define DBG_MSG(level, format, ...) \
do { \
if ((level) >= dbg_level) { \
fprintf(stderr, (format), ## __VA_ARGS__); \
} \
} while (0)
# define DBG_SET_LEVEL(X) do { dbg_level = (X); } while (0)
#endif
The ## before __VA_ARGS__ is a GCC specific thing that makes , __VA_ARGS__ actually turn into the right code when there are no actual extra arguments.
The do { ... } while (0) stuff is just to make you put ; after the statements when you use them, like you do when you call regular functions.
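A small sketch of why that matters in practice: with plain braces, the semicolon the caller adds becomes an extra empty statement, which breaks an if/else. The macro names LOG_BRACES and LOG_SAFE and the counter g_logged are made up for this illustration:

```cpp
#include <cstdio>

static int g_logged = 0;  // hypothetical counter so the effect is observable

// Brace-only version: `LOG_BRACES("x");` expands to `{ ... };` and the stray
// `;` ends the if statement, so a following `else` fails to compile.
#define LOG_BRACES(msg) { ++g_logged; std::fputs((msg), stderr); }

// do/while(0) version: the expansion plus the caller's `;` is one statement.
#define LOG_SAFE(msg) do { ++g_logged; std::fputs((msg), stderr); } while (0)

// Returns the running number of messages logged so far.
int report_sign(int x) {
    if (x < 0)
        LOG_SAFE("negative\n");      // swapping in LOG_BRACES here is a syntax error
    else
        LOG_SAFE("non-negative\n");
    return g_logged;
}
```

LOG_BRACES is defined only to show the contrast; it is deliberately never used.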
If you don't want to get as fancy you can do away with the debug level part. That just makes it so that, if you want, you can alter the level of debugging/tracing data you get.
You could turn the entire print statement into a separate function (either inline or a regular one) that would be called regardless of the debug level, and would make the decision as to printing or not internally.
#include <stdarg.h>
#include <stdio.h>
int dbg_level = 0;
void DBG_MSG(int level, const char *format, ...) {
va_list ap;
va_start(ap, format);
if (level >= dbg_level) {
vfprintf(stderr, format, ap);
}
va_end(ap);
}
If you are using a *nix system then you should have a look at syslog.
You might also want to search for some tracing libraries. There are a few that do similar things to what I have outlined.

Is there a standardised way to get type sizes in bytes in C++ Compilers?

I was wondering if there is some standardized way of getting type sizes in memory at the pre-processor stage - so in macro form, sizeof() does not cut it.
If there isn't a standardized method, are there conventional methods that most IDEs use anyway?
Are there any other methods that anyone can think of to get such data?
I suppose I could do a two stage build kind of thing, get the output of a test program and feed it back into the IDE, but that's not really any easier than #defining them in myself.
Thoughts?
EDIT:
I just want to be able to swap code around with
#ifdef / #endif
Was it naive of me to think that an IDE or underlying compiler might define that information under some macro? Sure the pre-processor doesn't get information on any actual machine code generation functions, but the IDE and the Compiler do, and they call the pre-processor and declare stuff to it in advance.
EDIT FURTHER
What I imagined as a conceivable concept was this:
The C++ Committee has a standard that says for every type (perhaps only those native to C++) the compiler has to give to the IDE a header file, included by default, that declares the size in memory that every native type uses, like so:
#define CHAR_SIZE 8
#define INT_SIZE 32
#define SHORT_INT_SIZE 16
#define FLOAT_SIZE 32
// etc
Is there a flaw in this process somewhere?
EDIT EVEN FURTHER
In order to get around the multi-platform build stage problem, perhaps this standard could mandate that a simple program like the one shown by lacqui be compiled and run by default. This way, whatever gets the type sizes will be the same machine that compiles the code in the second or 'normal' build stage.
Apologies:
I've been using 'Variable' instead of 'Type'
Depending on your build environment, you may be able to write a utility program that generates a header that is included by other files:
#include <stdio.h>

int main(void) {
    FILE *out = make_header_file(); // defined by you; returns an open FILE*
    size_t intsize = sizeof(int);
    fprintf(out, "#ifndef VARTYPES_H\n#define VARTYPES_H\n");
    if (intsize == 4)
        fprintf(out, "#define INTSIZE_32\n");
    else if (intsize == 8)
        fprintf(out, "#define INTSIZE_64\n");
    // .....
    else
        fprintf(out, "#define INTSIZE_UNKNOWN\n");
    fprintf(out, "#endif\n"); // close the include guard
    fclose(out);
    return 0;
}
Of course, edit it as appropriate. Then include "vartypes.h" everywhere you need these definitions.
EDIT: Alternatively:
fprintf(out, "#define INTSIZE_%d\n", (int)(sizeof(int) * 8));
fprintf(out, "#define INTSIZE %d\n", (int)(sizeof(int) * 8));
Note the lack of underscore in the second one - the first creates INTSIZE_32 which can be used in #ifdef. The second creates INTSIZE, which can be used, for example char bits[INTSIZE];
WARNING: This will only work with an 8-bit char. Most modern home and server computers will follow this pattern; however, some computers may use different sizes of char.
Sorry, this information isn't available at the preprocessor stage. To compute the size of a variable you have to do just about all the work of parsing and abstract evaluation - not quite code generation, but you have to be able to evaluate constant-expressions and substitute template parameters, for instance. And you have to know considerably more about the code generation target than the preprocessor usually does.
The two-stage build thing is what most people do in practice, I think. Some IDEs have an entire compiler built into them as a library, which lets them do things more efficiently.
Why do you need this anyway?
The cstdint include provides typedefs and #defines that describe all of the standard integer types, including typedefs for exact-width int types and #defines for the full value range for them.
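Because those range macros are plain integer constants, the preprocessor can branch on them even though it cannot evaluate sizeof. A minimal sketch, assuming int has no padding bits; MY_INT_BITS, counter_t and int_bits are hypothetical names, and INT_MAX itself comes from climits rather than cstdint:

```cpp
#include <climits>   // INT_MAX, CHAR_BIT
#include <cstdint>   // std::int_least32_t

// Infer the width of int from its range, which #if is able to compare.
#if INT_MAX == 32767
#  define MY_INT_BITS 16
#elif INT_MAX == 2147483647
#  define MY_INT_BITS 32
#elif INT_MAX == 9223372036854775807
#  define MY_INT_BITS 64
#else
#  error "unrecognised int range"
#endif

// Cross-check the guess once the compiler proper can evaluate sizeof.
static_assert(MY_INT_BITS == sizeof(int) * CHAR_BIT,
              "int has padding bits or an unexpected width");

// Swap code around with #if, as the question wanted.
#if MY_INT_BITS >= 32
typedef int counter_t;                 // plain int is wide enough
#else
typedef std::int_least32_t counter_t;  // fall back to a type of at least 32 bits
#endif

int int_bits() { return MY_INT_BITS; }
```

This only covers the built-in integer types whose limits the headers publish; it does not help with the size of structs or user-defined types.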
No, it's not possible. Just for example, it's entirely possible to run the preprocessor on one machine, and do the compilation entirely separately on a completely different machine with (potentially) different sizes for (at least some) types.
For a concrete example, consider that the normal distribution of SQLite is what they call an "amalgamation" -- a single already-preprocessed source code file that you actually compile on your computer.
You want to generate different code based on the sizes of some type? maybe you can do this with template specializations:
#include <iostream>
template <int Tsize>
struct dosomething{
void doit() { std::cout << "generic version" << std::endl; }
};
template <>
void dosomething<sizeof(int)>::doit()
{ std::cout << "int version" << std::endl; }
template <>
void dosomething<sizeof(char)>::doit()
{ std::cout << "char version" << std::endl; }
int main(int argc, char** argv)
{
typedef int foo;
dosomething<sizeof(foo)> myfoo;
myfoo.doit();
}
How would that work? The size isn't known at the preprocessing stage. At that point, you only have the source code. The only way to find the size of a type is to compile its definition.
You might as well ask for a way to get the result of running a program at the compilation stage. The answer is "you can't, you have to run the program to get its output". Just like you need to compile the program in order to get the output from the compiler.
What are you trying to do?
Regarding your edit, it still seems confused.
Such a header could conceivably exist for built-in types, but never for variables. A macro could perhaps be written to replace known type names with a hardcoded number, but it wouldn't know what to do if you gave it a variable name.
Once again, what are you trying to do? What is the problem you're trying to solve? There may be a sane solution to it if you give us a bit more context.
For common build environments, many frameworks have this set up manually. For instance,
http://www.aoc.nrao.edu/php/tjuerges/ALMA/ACE-5.5.2/html/ace/Basic__Types_8h-source.html
defines things like ACE_SIZEOF_CHAR. Another library described in a book I bought called POSH does this too, in a very includable way: http://www.hookatooka.com/wpc/
The term "standardized" is the problem. There's no standard way of doing it, but it's not very difficult to set some pre-processor symbols using a configuration utility of some sort. A really simple one would be to compile and run a small program that checks sizes with sizeof and then outputs an include file with some symbols set.

Using C++ Macros To Check For Variable Existence

I am creating a logging facility for my library, and have made some nice macros such as:
#define DEBUG myDebuggingClass(__FILE__, __FUNCTION__, __LINE__)
#define WARNING myWarningClass(__FILE__, __FUNCTION__, __LINE__)
where myDebuggingClass and myWarningClass both have an overloaded << operator, and do some helpful things with log messages.
Now, I have some base class that users will be overloading called "Widget", and I would like to change these definitions to something more like:
#define DEBUG myDebuggingClass(__FILE__, __FUNCTION__, __LINE__, this)
#define WARNING myWarningClass(__FILE__, __FUNCTION__, __LINE__, this)
so that when users call 'DEBUG << "Some Message"; ' I can check to see if the "this" argument dynamic_casts to a Widget, and if so I can do some useful things with that information, and if not then I can just ignore it. The only problem is that I would like users to be able to also issue DEBUG and WARNING messages from non-member functions, such as main(). However, given this simple macro, users will just get a compilation error because "this" will not be defined outside of class member functions.
The easiest solution is to just define separate WIDGET_DEBUG, WIDGET_WARNING, PLAIN_DEBUG, and PLAIN_WARNING macros and to document the differences to the users, but it would be really cool if there were a way to get around this. Has anyone seen any tricks for doing this sort of thing?
Declare a global Widget* const widget_this = NULL; and a protected member variable widget_this in the Widget class, initialized to this, and do
#define DEBUG myDebuggingClass(__FILE__, __FUNCTION__, __LINE__, widget_this)
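Spelled out, the shadowing trick might look like the sketch below. The helper context_of and the macro DEBUG_CONTEXT are hypothetical stand-ins for the real myDebuggingClass and DEBUG, used only so the result is easy to observe:

```cpp
#include <string>

class Widget;

// Seen by any code that is not inside a Widget member function.
Widget* const widget_this = nullptr;

class Widget {
public:
    Widget() : widget_this(this) {}
    // A copy must point at itself, never at the object it was copied from.
    Widget(const Widget&) : widget_this(this) {}
    virtual ~Widget() = default;
    std::string describe_context();
protected:
    // Shadows the global inside member functions of Widget and derived classes.
    Widget* const widget_this;
};

// Hypothetical stand-in for myDebuggingClass: reports what it was handed.
inline std::string context_of(const Widget* w) { return w ? "widget" : "plain"; }

// Hypothetical stand-in for the DEBUG macro from the question.
#define DEBUG_CONTEXT context_of(widget_this)

std::string Widget::describe_context() { return DEBUG_CONTEXT; }  // member wins
std::string free_function_context() { return DEBUG_CONTEXT; }     // global wins
```

One caveat with this approach: inside a static member function of Widget the name still resolves to the non-static member, so the macro would not compile there.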
Macros are basically a straight text substitution done by the preprocessor. There's no way for a macro to know the context from which it's being called to do the sort of detection you're interested in.
The best solution is probably separate macros as you suspect.
I don't think you can do this with a macro. You can probably manage to do it with SFINAE, but code that uses SFINAE (at least directly) is [1] hard to write, harder to debug, and virtually impossible for anybody but an expert to read or understand. If you really want to do this, I'd try to see if you can get Boost enable_if (or a relative thereof) to handle at least part of the dirty work.
[1] ...at least in every case I've ever seen, and I have a hard time imagining it being otherwise either.
Inspired by solipist, but slightly simpler in the implementation:
class Widget {
protected:
::myDebuggingClass myDebuggingClass(char const* file, char const* function, int line) {
return ::myDebuggingClass(file, function, line, this);
}
// ...
This eliminates the need for a shadowed variable; it relies on simple class name lookup rules.
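A fuller sketch of how this could fit together with the DEBUG macro from the question. The body of myDebuggingClass here is a hypothetical minimal stand-in (the real one is not shown in the thread), has_widget() exists only to make the effect visible, and __func__ is used in place of __FUNCTION__ because it is the standard spelling:

```cpp
#include <iostream>

class Widget;

// Hypothetical minimal logger standing in for the question's myDebuggingClass.
class myDebuggingClass {
public:
    myDebuggingClass(const char* file, const char* function, int line,
                     const Widget* widget = nullptr)
        : widget_(widget) {
        std::cerr << file << ':' << line << " (" << function << ") ";
    }
    bool has_widget() const { return widget_ != nullptr; }
    template <typename T>
    myDebuggingClass& operator<<(const T& value) {
        std::cerr << value;
        return *this;
    }
private:
    const Widget* widget_;
};

#define DEBUG myDebuggingClass(__FILE__, __func__, __LINE__)

class Widget {
public:
    virtual ~Widget() = default;
    // Inside a member, DEBUG finds the member function declared below.
    bool logs_with_widget() { return DEBUG.has_widget(); }
protected:
    // Same name as the global class; forwards `this` to the real constructor.
    ::myDebuggingClass myDebuggingClass(const char* file, const char* function,
                                        int line) {
        return ::myDebuggingClass(file, function, line, this);
    }
};

// Outside any Widget, DEBUG names the class and constructs a temporary.
bool free_function_logs_with_widget() { return DEBUG.has_widget(); }
```

With this in place, DEBUG << "Some Message"; compiles both in main() and in Widget member functions, and only the latter passes a Widget pointer along.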
The only way I can think of to possibly get this to work is to define a global variable:
Widget * this = NULL;
If that even compiles (I have my doubts, but don't have a compiler to test it), member functions will use the nearest scoped variable (the real this pointer), and everything else will get a null. Everyone's happy (so to speak...)
You could use a weak reference (a GCC extension) to detect whether a variable or function exists.
e.g.:
To detect whether int a exists:
extern int a __attribute__((weak));
if (&a)
    /* exists */
else
    /* does not exist */