I'm using ffmpeg in my C++ application.
When trying to play certain files an assertion inside of ffmpeg fails, which causes it to call abort() which terminates my application. I do not want this behavior, rather I want to get the chance to recover, preferably through an exception.
Anyone got any ideas as to how I can get around the problem with ffmpeg/assert potentially terminating my application?
EDIT:
The only way I can think of right now is to change the ffmpeg assert macro so that it causes an access violation which I can catch through SEH exceptions. Ugly and potentially bad solution?
If the "exception" needs to be compiled as C, you could use a setjmp/longjmp pair. Put the setjmp in your error-handling code, and the longjmp in place of the abort() in the FFmpeg code.
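A minimal sketch of the setjmp/longjmp idea, assuming the assert macro has been patched to jump instead of calling abort(). The names g_recover_point, failing_library_call and call_with_recovery are hypothetical stand-ins, not FFmpeg APIs. Note that longjmp skips C++ destructors, so no objects with non-trivial destructors should live between the two points.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdio>

// Hypothetical recovery point; in a real program it would be shared with
// the patched assert macro through a header.
static std::jmp_buf g_recover_point;

// Stand-in for patched library code: where the original macro calls
// abort(), this version jumps back to the recovery point instead.
static void failing_library_call(bool trigger) {
    if (trigger) {
        std::fprintf(stderr, "Assertion failed, recovering\n");
        std::longjmp(g_recover_point, 1);
    }
}

// Returns true if the call completed, false if the assertion path was taken.
bool call_with_recovery(bool trigger) {
    if (setjmp(g_recover_point) != 0) {
        return false;  // we arrived here via longjmp
    }
    failing_library_call(trigger);
    return true;
}
```

The caller checks the return value of call_with_recovery() and decides how to continue, for example by skipping the offending file.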
If you really want a true exception to catch, a divide by zero might be safer than a random access violation.
This code is from the FFmpeg doxygen documentation:
/*
* copyright (c) 2010 Michael Niedermayer <michaelni@gmx.at>
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef AVUTIL_AVASSERT_H
#define AVUTIL_AVASSERT_H
#include <stdlib.h>
#include "avutil.h"
#include "log.h"
#define av_assert0(cond) do {                                           \
    if (!(cond)) {                                                      \
        av_log(NULL, AV_LOG_FATAL, "Assertion %s failed at %s:%d\n",    \
               AV_STRINGIFY(cond), __FILE__, __LINE__);                 \
        abort();                                                        \
    }                                                                   \
} while (0)
#if defined(ASSERT_LEVEL) && ASSERT_LEVEL > 0
#define av_assert1(cond) av_assert0(cond)
#else
#define av_assert1(cond) ((void)0)
#endif
#if defined(ASSERT_LEVEL) && ASSERT_LEVEL > 1
#define av_assert2(cond) av_assert0(cond)
#else
#define av_assert2(cond) ((void)0)
#endif
#endif /* AVUTIL_AVASSERT_H */
You could simply redefine the av_assert macros to throw instead of calling abort().
If you can't/don't want to re-work the ffmpeg code, then I'd say fork off another process to do the ffmpeg operations and then exit. You can wait for that process to exit one way or another in your main process and determine how it went, without risk of your main process being terminated.
It may not be the best solution in the world, but it gets you the isolation you need, with some hope of knowing what happened, and without having to do too much violence to the FFmpeg code.
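The process-isolation idea could be sketched as follows, assuming a POSIX system (fork and waitpid); on Windows a separate helper process started with CreateProcess would play the same role. The names run_isolated, decode_ok and decode_asserts are hypothetical, and the two decode functions merely stand in for the real FFmpeg work.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildResult { Ok, Failed, Aborted };

// Runs `work` in a forked child process and reports how the child ended.
ChildResult run_isolated(int (*work)()) {
    pid_t pid = fork();
    if (pid < 0) return ChildResult::Failed;  // fork itself failed
    if (pid == 0) {
        _exit(work());                        // child: run the work, then exit
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return ChildResult::Failed;
    if (WIFSIGNALED(status)) return ChildResult::Aborted;  // e.g. SIGABRT from abort()
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return ChildResult::Ok;
    return ChildResult::Failed;
}

// Hypothetical stand-ins for the decoding work.
int decode_ok() { return 0; }
int decode_asserts() { std::abort(); }
```

A failed assertion in the child then shows up in the parent as ChildResult::Aborted, and the parent keeps running.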
While developing in Xcode it is common to switch between Debug and Release mode, using some parts of the code in Debug mode only and leaving them out in Release mode.
I often strip NSLog calls with a #define rule that lets the preprocessor remove those calls when they are not needed in a Release build. I do this because final testing has to prove that everything works as expected and that errors are still handled properly, without interference from an NSLog I may have forgotten. This matters, for example, in audio development, where logging is generally counterproductive but needed while debugging. Wrapping everything in #ifdef DEBUG is cumbersome and makes the code look wild, so my #defines keep the code simple and readable: I don't have to worry about NSLog calls left in releases, and I can still log on purpose when needed. This practice works really well for me and gives a proper test scenario with pure Release code.
But this leads to compiler warnings that some variables are not used at all. That is not much of a problem, but I want to go one step further and get rid of those warnings too. I could turn those warnings off in Xcode, but I would rather keep them enabled and suppress them only where my #defines override NSLog.
So instead of logging to /dev/null, I remove (nullify) all code that is wrapped by NSLog(...) and use an extra rule called ALLWAYSLog() that keeps logging in Release builds on purpose. It also switches from NSLog to fprintf to avoid printing the app origin and timestamp.
Here are my rules:
#ifdef DEBUG
#define NSLog(FORMAT, ...) fprintf(stderr, "%s \n", [[NSString stringWithFormat:FORMAT, ##__VA_ARGS__] UTF8String])
#else
#define NSLog(FORMAT, ...) {;}
#endif
#define ALLWAYSLog(FORMAT, ...) fprintf(stderr, "%s \n", [[[NSString alloc] initWithFormat:FORMAT, ##__VA_ARGS__] UTF8String])
To get rid of those unused variable warnings we often use
#pragma unused(variablename)
to inform the compiler that we did that on purpose.
Question:
Is it possible to write a #define rule that makes use of #pragma unused(x)? Or how can the __unused attribute be integrated into such a rule?
In the #else case, you can put the function call on the right side of the && operator with 0 on the left side. That will ensure that variables are "used" while also ensuring that the function doesn't actually get called and that the parameters are not evaluated.
#ifdef DEBUG
#define NSLog(FORMAT, ...) fprintf(stderr, "%s \n", [[NSString stringWithFormat:FORMAT, ##__VA_ARGS__] UTF8String])
#else
#define NSLog(FORMAT, ...) (0 && fprintf(stderr, "%s \n", [[NSString stringWithFormat:FORMAT, ##__VA_ARGS__] UTF8String]))
#endif
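To illustrate the short-circuit trick outside of Objective-C, here is a plain C++ sketch. LOG_DISABLED, expensive_value and demo_disabled_log are hypothetical names; the (void) cast is an addition that is meant to keep the compiler from complaining about an unused expression result.

```cpp
#include <cassert>
#include <cstdio>

// Release-style variant: the arguments still count as "used" for the
// compiler, but nothing right of && is evaluated because of short-circuiting.
#define LOG_DISABLED(...) ((void)(0 && std::fprintf(stderr, __VA_ARGS__)))

static int g_calls = 0;
int expensive_value() { ++g_calls; return 42; }

// Returns how often expensive_value() ran while "logging".
int demo_disabled_log() {
    int check = 333;  // referenced below, so no unused-variable warning
    LOG_DISABLED("check=%d value=%d\n", check, expensive_value());
    return g_calls;
}
```

Because the right-hand side of && is never evaluated, expensive_value() does not run and nothing is printed.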
After testing, and still not believing that there is no "official way" of doing this, I ended up reading my header files again (usr/include/sys/cdefs.h),
where __unused is declared as __attribute__((__unused__)).
This seems to be the official way of telling the Apple Clang (GCC) compiler that a specific variable is intentionally unused: place __unused at the right place in the code, for example in front of a variable declaration or after a function declaration. See the Stack Overflow discussion that started in 2013 and is still ongoing.
@dbush's answer was and is nice because it suppresses the unused variable warning by making use of the passed arguments and introducing nullifying logic that does no harm. However, the compiler still sees the expression and reports "Expression result unused". That was pretty close to my goal and is possibly still the simplest solution.
For example:
#define NSLog(...) (0 && fprintf(stderr,"%s",[[NSString alloc] initWithFormat:__VA_ARGS__].UTF8String))
// applied to
int check = 333;
NSLog(@"findMeAgainInPreProcess %d",check);
// preprocesses to
int check = 333;
(0 && fprintf(__stderrp,"%s",[[NSString alloc] initWithFormat:@"findMeAgainInPreProcess %d",check].UTF8String));
While this prints nothing, the compiler then flags the expression result as unused.
But this left me with the question of how marking something as unused is done properly. So I tried the opposite approach: distinguish between Debug and Release and make use of __unused in combination with my first approach, like so:
#define ALLWAYSLog(...) fprintf(stderr,"%s \n",[[NSString alloc] initWithFormat:__VA_ARGS__].UTF8String)
#ifdef DEBUG
#define IN_RELEASE__unused
#define NSLog(...) ALLWAYSLog(__VA_ARGS__)
#else
#define IN_RELEASE__unused __unused
//#define NSLog(...) (0&&ALLWAYSLog(__VA_ARGS__)) //@dbush solution
//#define NSLog(...) NSCAssert(__VA_ARGS__,"")
//#define NSLog(...) {;}
#define NSLog(...) /*__VA_ARGS__*/
#endif
In Debug, IN_RELEASE__unused expands to nothing, so the unused variable warning is not silenced there; in Release it expands to __unused and silences the warning. This is a little extra work, but it helps to see which parts are unused on purpose.
This means I can write code like the following:
IN_RELEASE__unused int check = 333;
NSLog(@"findMeAgainInPreProcess %d",check);
// produces for DEBUG
int check = 333; //no warning, var is used below
fprintf(__stderrp,"%s \n",[[NSString alloc] initWithFormat:@"findMeAgainInPreProcess %d", check].UTF8String);
// produces for RELEASE
__attribute__((__unused__)) int check = 333; //no warning intentionally
; // no print, nothing
This keeps NSLog in place in the code, marks the unused variables so that the warning is silenced properly, and strips NSLog completely in Release. And I can still force prints in both modes with ALLWAYSLog.
Conclusion: dbush's solution is still more straightforward.
I just reinstalled MinGW and the Codelite IDE on my Windows PC, however I'm now unable to compile/build a project.
It is odd because every time I change a setting or make a new project, I am able to run it once, then it stops working.
I've already tried reinstalling MinGW...
This may be a gcc bug that occurs when compiling with C++11 or a newer standard, i.e. when adding the flag "-std=c++11" or "-std=c++0x".
I fixed it by adding #include "io.h" to the file stdio.h.
Go to your include path, "c:/mingw/include", and edit "stdio.h":
/*
* stdio.h
*
* Definitions of types and prototypes of functions for operations on
* standard input and standard output streams.
*
* $Id: stdio.h,v 8863016e809f 2018/12/04 19:00:29 keith $
*
* Written by Colin Peters
* Copyright (C) 1997-2005, 2007-2010, 2014-2018, MinGW.org Project.
*
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice, this permission notice, and the following
* disclaimer shall be included in all copies or substantial portions of
* the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OF OR OTHER
* DEALINGS IN THE SOFTWARE.
*
* NOTE: The file manipulation functions provided by Microsoft seem to
* work with either slash (/) or backslash (\) as the directory separator;
* (this is consistent with Microsoft's own documentation, on MSDN).
*
*/
#include "io.h" // include added here
#ifndef _STDIO_H
#pragma GCC system_header
/* When including <wchar.h>, some of the definitions and declarations
* which are nominally provided in <stdio.h> must be duplicated. Rather
* than require duplicated maintenance effort, we provide for partial
* inclusion of <stdio.h> by <wchar.h>; only when not included in
* this partial fashion...
*/
If there are any problems or a better solution, I would appreciate your feedback.
I would have assumed this would be a widely asked question, but still I have yet to find an answer.
I was debugging some C++ code that failed in a subtle way only with certain function handles as inputs. Long story short, I fixed the problem, but along the way I defined this in the .cpp file:
#define DEBUG(x) do { std::cerr << x << std::endl; } while (0)
Needless to say, the code is now littered with:
DEBUG("Foo's address");
DEBUG(&Foo);
Now I assumed that in a "Release" build the compiler would ignore all of this debug output. But it doesn't!
So how does one do this in practice (I want to keep the output statements for future additions, but obviously don't want them in release versions)? I'm trying out CLion, which uses CMake; is this something IDE/compiler specific?
Thanks
Depending on your compiler it may define something that tells you that compilation is in debug mode (or you can do that yourself on the command line), then:
#ifndef _DEBUG // works in VS
#define DEBUG(x)
#else
#define DEBUG(x) do { std::cerr << x << std::endl; } while (0)
#endif
For more discussion on which macro to use, see this question.
A portable way of doing this would be to use NDEBUG:
#ifdef NDEBUG
#define DEBUG(x)
#else
#define DEBUG(x) do { std::cerr << x << std::endl; } while (0)
#endif
See: C / C++ : Portable way to detect debug / release?
It is usually a mistake to eliminate logging messages in so-called "release versions". After all, "release versions" are where you will need the logged information most! How else will you even hope to analyse and fix the problem when the end user will tell you nothing but "it doesn't work" or "it crashed"?
So instead of eliminating the valuable information that your software creates to aid you in your bugfixing sessions, think about how to save and persist it such that it can be easily transmitted to you by the end user if problems arise. Like, redirecting the log messages to a log file when the application runs on the user's machine (perhaps with the application itself offering a "Send log file to support" feature, or something like that).
Code like
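DEBUG("Foo's address");
DEBUG(&Foo);
should be replaced with something like:
Log("Foo's address");
std::ostringstream address; // std::to_string has no overload for pointers (needs <sstream>)
address << static_cast<const void*>(&Foo); // assumes Foo is an object, not a function
Log(address.str());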
Then inside of your Log function, which may have a signature like void Log(std::string const& message), you can check your DEBUG macro and act accordingly:
void Log(std::string const& message)
{
#ifdef DEBUG
// write message to std::cerr
#else
// write message to log file
#endif
}
Now, of course, DEBUG is not a standard macro (unlike NDEBUG, which turns assert on and off). It's not implicitly defined. You have to define it yourself when you invoke your compiler. For example, /DDEBUG with MSVC or -DDEBUG with GCC. Chances are that your IDE adds such a flag, or something similar like -D_DEBUG when it runs the compiler, but still, that's not standard and not part of the compiler itself. (Actually, you might consider a different name for the macro anyway if you are going to use it like this, something like LOG_TO_CONSOLE.)
In any case, this is just to give you an inspiration of what to do. You may prefer a std::ostream-based approach instead of a function taking a std::string. There are a lot of questions and answers on Stack Overflow about this.
The important point is: Don't throw away valuable log information under the assumption that you won't need it once your software is released. There will be bugs and vague error descriptions.
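As a rough illustration of the std::ostream-based alternative mentioned above, here is a minimal sketch. The Logger class and log_demo function are hypothetical; a real application would hand the logger std::cerr or a std::ofstream for the log file, depending on the build.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical logger that forwards everything to whatever stream it is given.
class Logger {
public:
    explicit Logger(std::ostream& sink) : sink_(sink) {}

    template <typename T>
    Logger& operator<<(const T& value) {
        sink_ << value;  // any type with an ostream operator<< works
        return *this;
    }

private:
    std::ostream& sink_;
};

// Small demonstration: a string stream stands in for the log file.
std::string log_demo() {
    std::ostringstream captured;
    Logger log(captured);
    log << "value=" << 42 << '\n';
    return captured.str();
}
```

The call sites stay the same in debug and release builds; only the stream passed to the Logger changes.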
Recently I discovered in a relatively large project that ugly runtime crashes occurred because various headers were included in a different order in different cpp files.
These headers contained #pragma pack directives, and these pragmas were sometimes not 'closed' (I mean, set back to the compiler default with #pragma pack()), resulting in different object layouts in different object files. No wonder the application crashed when it accessed struct members created in one module and passed to another module, or when derived classes accessed members of base classes.
Since I like to derive a more general debugging and assertion strategy from every bug I find, I would really like to assert that object layouts are the same always and everywhere.
So it would be easy to assert
ASSERT( offsetof(membervar) == 4 )
But this would not catch a different layout in another module, and it would require manual updates whenever the struct layout changes. So my favourite idea would be something like
ASSERT( offsetof(membervar) == offsetof(othermodule_membervar) )
Would this be possible with an assertion? Or is this a case for a unit test?
Thanks,
H
ASSERT( offsetof(membervar) == offsetof(othermodule_membervar) )
I can't see a way to make this technically possible. Further, even if it were possible, it wouldn't be practical. You'd need an assert for every pair of source files:
ASSERT( offsetof(A.c::MyClass.membervar) == offsetof(B.c::MyClass.membervar) )
ASSERT( offsetof(A.c::MyClass.membervar) == offsetof(C.c::MyClass.membervar) )
ASSERT( offsetof(A.c::MyClass.membervar) == offsetof(D.c::MyClass.membervar) )
ASSERT( offsetof(B.c::MyClass.membervar) == offsetof(C.c::MyClass.membervar) )
ASSERT( offsetof(B.c::MyClass.membervar) == offsetof(D.c::MyClass.membervar) )
etc
You might be able to get away with this by asserting the sizeof(class) in different files. If the packing is causing the size of the object to be smaller, then I would expect sizeof() to show that up.
You could also do this as a static assert using C++0x's static_assert, or Boost's (or a hand-rolled one, of course).
As for not wanting to do this in every file, I would recommend putting together a header file that includes all the headers you're worried about, along with the static_asserts.
Personally, though, I'd just recommend searching the code base for the pragmas and fixing them.
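A small sketch of what such a checking header might contain, assuming C++11 static_assert and a platform with a 4-byte, 4-byte-aligned int. Packet and PackedPacket are hypothetical structs; the second one shows how a pack setting changes the size that the assertion looks at.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct shared between modules, declared with default packing.
struct Packet {
    char tag;
    int  length;
};

// If a stray #pragma pack were still active at this point, the layout would
// differ and compilation would stop here.
static_assert(sizeof(Packet) == 8, "Packet: unexpected size (stray #pragma pack?)");
static_assert(offsetof(Packet, length) == 4, "Packet: unexpected member offset");

// The same members under 1-byte packing, for comparison.
#pragma pack(push, 1)
struct PackedPacket {
    char tag;
    int  length;
};
#pragma pack(pop)

static_assert(sizeof(PackedPacket) == 5, "PackedPacket: padding was expected to be removed");
```

Including this header last in every translation unit (or after each suspicious header) would make a leftover pack setting fail the build in the file where it leaks.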
Wendy,
In Win32, there are single functions that can populate different versions of a given struct. Over the years, the FOOBAR struct might have new features added to it, so they create a FOOBAR2 or FOOBAREX. In some cases there are more than two versions.
Anyway, the way they handle this is to have the caller pass in sizeof(theStruct) in addition to the pointer to the struct:
FOOBAREX foobarex = {0};
long lResult = SomeWin32Api(sizeof(foobarex), &foobarex);
Within the implementation of SomeWin32Api(), they check the first parameter and determine which version of the struct they're dealing with.
You could do something similar in a debug build to assure that the caller and callee agree on the size of the struct being referred to, and assert if the value doesn't match the expected size. With macros, you might even be able to automate/hide this so that it only happens in a debug build.
Unfortunately, this is a run-time check and not a compile-time check...
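The run-time size check described above could look roughly like this. Settings, apply_settings_checked and APPLY_SETTINGS are hypothetical names; the point is only that sizeof is evaluated in the caller's translation unit and compared in the callee's.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct passed between modules.
struct Settings {
    int width;
    int height;
};

// Callee side: compares the caller's idea of the struct size with its own.
bool apply_settings_checked(std::size_t caller_size, const Settings* s) {
    if (caller_size != sizeof(Settings)) {
        return false;  // caller and callee disagree about the layout
    }
    return s != nullptr;
}

// Caller side: the macro captures sizeof where the call is written, so a
// different packing in the calling module yields a different value.
#define APPLY_SETTINGS(ptr) apply_settings_checked(sizeof(*(ptr)), (ptr))
```

In a debug build the false return could be turned into an assertion failure, while a release build could drop the size parameter altogether.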
What you want isn't directly possible as such. If you're using VC++, the following may be of interest:
http://blogs.msdn.com/vcblog/archive/2007/05/17/diagnosing-hidden-odr-violations-in-visual-c-and-fixing-lnk2022.aspx
There's probably scope to create some way of semi-automating the process it describes, collating the output and cross-referencing.
To detect this sort of problem somewhat more automatically, the following occurs to me. Create a file that defines a struct that will have a particular size with the designated default packing amount, but a different size with different pack values. Also include some kind of static assert that its size is correct. For example, if the default is 4-byte packing:
struct X {
char c;
double d;
};
/* sizeof(X) is 9, 10, 12 or 16 for pack values 1, 2, 4 or 8 (assuming an 8-byte double) */
extern const char g_check[sizeof(X)==12?1:-1];
Then #include this file at the start of every header (just write a program to put the extra includes in if there are too many to do by hand), and compile and see what happens. This won't directly detect changes in struct layout, just non-standard packing settings, which is what you're interested in anyway.
(When adding new headers one would put this #include at the top, along with the usual ifdef boilerplate and so on. This is unfortunate but I'm not sure there's any way around it. The best solution is probably to ask people to do it, but assume they'll forget, and run the extra-include-inserting program every now and again...)
Apologies for posting an answer - which this is not - but I don't know how to post code in comments. Sorry.
To wrap Brone's idea in a macro, here is what we currently use (feel free to edit it):
/** Our own assert macro, which will trace a FATAL error message if the assert
* fails. A FATAL trace will cause a system restart.
* Note: I would love to use CPPUNIT_ASSERT_MESSAGE here, for a nice clean
* test failure if testing with CppUnit, but since this header file is used
* by C code and the relevant CppUnit include file uses C++ specific
* features, I cannot.
*/
#ifdef TESTING
/* ToDo: might want to trace a FATAL if integration testing */
#define ASSERT_MSG(subsystem, message, condition) if (!(condition)) {printf("Assert failed: \"%s\" at line %d in file \"%s\"\n", message, __LINE__, __FILE__); fflush(stdout); abort();}
/* we can also use this, which prints the failed condition as its message */
#define ASSERT_CONDITION(subsystem, condition) if (!(condition)) {printf("Assert failed: \"%s\" at line %d in file \"%s\"\n", #condition, __LINE__, __FILE__); fflush(stdout); abort();}
#else
#define ASSERT_MSG(subsystem, message, condition) if (!(condition)) DebugTrace(FATAL, subsystem, __FILE__, __LINE__, "%s", message);
#define ASSERT_CONDITION(subsystem, condition) if (!(condition)) DebugTrace(FATAL, subsystem, __FILE__, __LINE__, "%s", #condition);
#endif
What you would be looking for is an assertion like ASSERT_CONSISTENT(A_x, offsetof(A,x)), placed in a header file. Let me explain why, and what the problem is.
Because the problem exists across translation units, you can only detect the error at link time. That means you need to force the linker to spit out an error. Unfortunately, most cross-translation-unit problems are formally of the "no diagnostic required" kind. The most familiar one is the ODR (One Definition Rule). We can trivially cause ODR violations with such assertions, but you just can't rely on the linker to warn you about them. If you can, the implementation of the assertion can be as simple as
#define ASSERT_CONSISTENT(label, x) class ASSERT_ ## label { char test[x]; };
But if the linker doesn't notice these ODR violations, this will pass by silently. And here lies the problem: the linker really only needs to complain if it can't find something.
With two macros the problem is solved:
template <int i> class dummy; // needed to differentiate functions
#define ASSERT_DEFINE(label, x) void ASSERT_ ## label(dummy<x>&) { }
#define ASSERT_CHECK(label, x) void ASSERT_ ## label(dummy<x>&); static void (*check_ ## label)(dummy<x>&) = &ASSERT_ ## label;
You'd need to put the ASSERT_DEFINE macro in a .cpp, and ASSERT_CHECK in its header. If the x value checked isn't the x value defined for that label, you're taking the address of an undefined function. Now, a linker doesn't need to warn about multiple definitions, but it must warn about missing definitions.
BTW, for this particular problem, see Diagnosing Hidden ODR Violations in Visual C++ (and fixing LNK2022)
Original Question
What I'd like is not a standard C pre-processor, but a variation on it which would accept from somewhere - probably the command line via -DNAME1 and -UNAME2 options - a specification of which macros are defined, and would then eliminate dead code.
It may be easier to understand what I'm after with some examples:
#ifdef NAME1
#define ALBUQUERQUE "ambidextrous"
#else
#define PHANTASMAGORIA "ghostly"
#endif
If the command were run with '-DNAME1', the output would be:
#define ALBUQUERQUE "ambidextrous"
If the command were run with '-UNAME1', the output would be:
#define PHANTASMAGORIA "ghostly"
If the command were run with neither option, the output would be the same as the input.
This is a simple case - I'd be hoping that the code could handle more complex cases too.
To illustrate with a real-world but still simple example:
#ifdef USE_VOID
#ifdef PLATFORM1
#define VOID void
#else
#undef VOID
typedef void VOID;
#endif /* PLATFORM1 */
typedef void * VOIDPTR;
#else
typedef mint VOID;
typedef char * VOIDPTR;
#endif /* USE_VOID */
I'd like to run the command with -DUSE_VOID -UPLATFORM1 and get the output:
#undef VOID
typedef void VOID;
typedef void * VOIDPTR;
Another example:
#ifndef DOUBLEPAD
#if (defined NT) || (defined OLDUNIX)
#define DOUBLEPAD 8
#else
#define DOUBLEPAD 0
#endif /* NT */
#endif /* !DOUBLEPAD */
Ideally, I'd like to run with -UOLDUNIX and get the output:
#ifndef DOUBLEPAD
#if (defined NT)
#define DOUBLEPAD 8
#else
#define DOUBLEPAD 0
#endif /* NT */
#endif /* !DOUBLEPAD */
This may be pushing my luck!
Motivation: large, ancient code base with lots of conditional code. Many of the conditions no longer apply - the OLDUNIX platform, for example, is no longer made and no longer supported, so there is no need to have references to it in the code. Other conditions are always true. For example, features are added with conditional compilation so that a single version of the code can be used for both older versions of the software where the feature is not available and newer versions where it is available (more or less). Eventually, the old versions without the feature are no longer supported - everything uses the feature - so the condition on whether the feature is present or not should be removed, and the 'when feature is absent' code should be removed too. I'd like to have a tool to do the job automatically because it will be faster and more reliable than doing it manually (which is rather critical when the code base includes 21,500 source files).
(A really clever version of the tool might read #include'd files to determine whether the control macros - those specified by -D or -U on the command line - are defined in those files. I'm not sure whether that's truly helpful except as a backup diagnostic. Whatever else it does, though, the pseudo-pre-processor must not expand macros or include files verbatim. The output must be source similar to, but usually simpler than, the input code.)
Status Report (one year later)
After a year of use, I am very happy with 'sunifdef' recommended by the selected answer. It hasn't made a mistake yet, and I don't expect it to. The only quibble I have with it is stylistic. Given an input such as:
#if (defined(A) && defined(B)) || defined(C) || (defined(D) && defined(E))
and run with '-UC' (C is never defined), the output is:
#if defined(A) && defined(B) || defined(D) && defined(E)
This is technically correct because '&&' binds tighter than '||', but it is an open invitation to confusion. I would much prefer it to include parentheses around the sets of '&&' conditions, as in the original:
#if (defined(A) && defined(B)) || (defined(D) && defined(E))
However, given the obscurity of some of the code I have to work with, for that to be the biggest nit-pick is a strong compliment; it is a valuable tool to me.
The New Kid on the Block
Having checked the URL for inclusion in the information above, I see that (as predicted) there is a new program called Coan that is the successor to 'sunifdef'. It is available on SourceForge and has been since January 2010. I'll be checking it out...further reports later this year, or maybe next year, or sometime, or never.
I know absolutely nothing about C, but it sounds like you are looking for something like unifdef. Note that it hasn't been updated since 2000, but there is a successor called "Son of unifdef" (sunifdef).
You can also try this tool: http://coan2.sourceforge.net/
Something like this will remove ifdef blocks:
coan source -UYOUR_FLAG --filter c,h --recurse YourSourceTree
I used unifdef years ago for just the sort of problem you describe, and it worked fine. Even if it hasn't been updated since 2000, the syntax of preprocessor ifdefs hasn't changed materially since then, so I expect it will still do what you want. I suppose there might be some compile problems, although the packages appear recent.
I've never used sunifdef, so I can't comment on it directly.
Around 2004 I wrote a tool that did exactly what you are looking for. I never got around to distributing the tool, but the code can be found here:
http://casey.dnsalias.org/exifdef-0.2.zip (that's a dsl link)
It's about 1.7k lines and implements enough of the C grammar to parse preprocessor statements, comments, and strings using bison and flex.
If you need something similar to a preprocessor, the flexible solution is Wave (from boost). It's a library designed to build C-preprocessor-like tools (including such things as C++03 and C++0x preprocessors). As it's a library, you can hook into its input and output code.