Why is snprintf faster than ostringstream or is it? - c++

I read somewhere that snprintf is faster than ostringstream. Has anyone has any experiences with it? If yes why is it faster.

std::ostringstream is not required to be slower, but it is generally slower when implemented. FastFormat's website has some benchmarks.
The Standard library design for streams supports much more than snprintf does. The design is meant to be extensible, and includes protected virtual methods that are called by the publicly exposed methods. This allows you to derive from one of the stream classes, with the assurance that if you overload the protected method you will get the behavior you want. I believe that a compiler could avoid the overhead of the virtual function call, but I'm not aware of any compilers that do.
Additionally, stream operations often use growable buffers internally; which implies relatively slow memory allocations.

We replaced some stringstreams in inner loops with sprintf (using statically allocated buffers), and this made a big difference, both in msvc and gcc. I imagine that the dynamic memory management of this code:
{
char buf[100];
int i = 100;
sprintf(buf, "%d", i);
// do something with buf
}
is much simpler than
{
std::stringstream ss;
int i = 100;
ss << i;
std::string s = ss.str();
// do something with s
}
but i am very happy with the overall performance of stringstreams.

Some guys would possibly tell you about that the functions can't be faster than each other, but their implementation can. That's right i think i would agree.
You are unlikely to ever notice a difference in other than benchmarks. The reason that c++ streams generally tend to be slower is that they are much more flexible. Flexibility most often comes at the cost of either time or code growth.
In this case, C++ streams are based on stream-buffers. In itself, streams are just the hull that keep formatting and error flags in place, and call the right i/o facets of the c++ standard library (for example, num_put to print numbers), that print the values, well formatted, into the underlying stream-buffer connected to the c++ stream.
All this mechanisms - the facets, and the buffers, are implemented by virtual functions. While there is indeed no mark note, those functions must be implemented to be slower than c stdio pendants that fact will make them somewhat slower than using c stdio functions normally (i benchmark'ed that some time ago with gcc/libstdc++ and in fact noticed a slowdown - but which you hardly notice in day-by-day usage).

Absolutely this is implementation-specific.
But if you really want to know, write two small programs, and compare them. You would need to include typical usage for what you have in mind, the two programs would need to generate the same string, and you would use a profiler to look at the timing information.
Then you would know.

One issue would probably be that the type safety added by ostringstream carries extra overhead. I've not done any measurements, though.

As litb said, standard streams support many things we don't always need.
Some streams implementation get rid of this never used flexibility, see FAStream for instance.

It's quite possible that because sprintf is part of the CRT that is written in assembly. The ostringstream is part of the STL, and probably a little more generically written, and has OOP code/overhead to deal with.

Yes, if you run the function below on a few million numbers with Visual C++ 5.0, the first version takes about twice as long as the second and produces the same output.
Compiling tight loops into a .exe and running the Windows timethis something.exe' or the Linuxtime something' is how I investigate most of my performance curiosities. (`timethis' is available on the web somewhere)
void Hex32Bit(unsigned int n, string &result)
{
#if 0
stringstream ss;
ss
<< hex
<< setfill('0')
<< "0x" << setw(8) << n
;
result = ss.str();
#else
const size_t len = 11;
char temp[len];
_snprintf(temp, len, "0x%08x", n);
temp[len - 1] = '\0';
result = temp;
#endif
}

One reason I know that the printf family of functions are faster than the corresponding C++ functions (cout, cin, and other streams) is that the latter do typechecking. As this usually involves some requests to overloaded operators, it can take some time.
In fact, in programming competitions it is often recommended that you use printf et al rather than cout/cin for precisely this reason.

Related

Safe Use of strcpy

Plain old strcpy is prohibited in its use in our company's coding standard because of its potential for buffer overflows. I was looking the source for some 3rd Party Library that we link against in our code. The library source code has a use of strcpy like this:
for (int i = 0; i < newArgc; i++)
{
newArgv[i] = new char[strlen(argv[i]) + 1];
strcpy(newArgv[i], argv[i]);
}
Since strlen is used while allocating memory for the buffer to be copied to, this looks fine. Is there any possible way someone could exploit this normal strcpy, or is this safe as I think it looks to be?
I have seen naïve uses of strcpy that lead to buffer overflow situations, but this does not seem to have that since it is always allocating the right amount of space for the buffer using strlen and then copying to that buffer using the argv[] as the source which should always be null terminated.
I am honestly curious if someone running this code with a debugger could exploit this or if there are any other tactics someone that was trying to hack our binary (with this library source that we link against in its compiled version) could use to exploit this use of strcpy. Thank you for your input and expertise.
It is possible to use strcpy safely - it's just quite hard work (which is why your coding standards forbid it).
However, the code you have posted is not a vulnerability. There is no way to overwrite bits of memory with it; I would not bother rewriting it. (If you do decide to rewrite it, use std::string instead.
Well, there are multiple problems with that code:
If an allocation throws, you get a memory-leak.
Using strcpy() instead of reusing the length is sub-optimal. Use std::copy_n() or memcpy() instead.
Presumably, there are no data-races, not that we can tell.
Anyway, that slight drop in performance is the only thing "wrong" with using strcpy() there. At least if you insist on manually managing your strings yourself.
Deviating from a coding standard should always be possible, but then document well why you decided to do so.
The main problem with strcpy is that it has no length limitation. When taking care this is no problem, but it means that strcpy always is to be accompanied with some safeguarding code. Many less experienced coders have fallen into this pitfall, hence the coding guideline came into practice.
Possible ways to handle string copy safely are:
Check the string length
Use a safe variant like strlcpy, or on older Microsoft compilers, strncpy_s.
As a general strcpy replacement idiom, assuming you're okay with the slight overhead of the print formatting functions, use snprintf:
snprintf(dest, dest_total_buffer_length, "%s", source);
e.g.
snprintf(newArgv[i], strlen(argv[i]) + 1, "%s", argv[i]);
It's safe, simple, and you don't need to think about the +1/-1 size adjustment.

Why are standard string functions faster than my custom string functions?

I decided to find the speeds of 2 functions :
strcmp - The standard comparison function defined in string.h
xstrcmp- A function that has same parameters and does the same, just that I created it.
Here is my xstrcmp function :
int xstrlen(char *str)
{
int i;
for(i=0;;i++)
{
if(str[i]=='\0')
break;
}
return i;
}
int xstrcmp(char *str1, char *str2)
{
int i, k;
if(xstrlen(str1)!=xstrlen(str2))
return -1;
k=xstrlen(str1)-1;
for(i=0;i<=k;i++)
{
if(str1[i]!=str2[i])
return -1;
}
return 0;
}
I didn't want to depend on strlen, since I want everything user-defined.
So, I found the results. strcmp did 364 comparisons per millisecond and my xstrcmp did just 20 comparisons per millisecond (atleast on my computer!)
Can anyone tell why this is so ? What does the xstrcmp function do to make itself so fast ?
if(xstrlen(str1)!=xstrlen(str2)) //computing length of str1
return -1;
k=xstrlen(str1)-1; //computing length of str1 AGAIN!
You're computing the length of str1 TWICE. That is one reason why your function loses the game.
Also, your implemetation of xstrcmp is very naive compared to the ones defined in (most) Standard libraries. For example, your xstrcmp compares one byte at a time, when in fact it could compare multiple bytes in one go, taking advantage of proper alignment as well, or can do little preprocessing so as to align memory blocks, before actual comparison.
strcmp and other library routines are written in assembly, or specialized C code, by experienced engineers and use a variety of techniques.
For example, the assembly implementation might load four bytes at a time into a register, and compare that register (as a 32-bit integer) to four bytes from the other string. On some machines, the assembly implementation might load eight bytes or even more. If the comparison shows the bytes are equal, the implementation moves on to the next four bytes. If the comparison shows the bytes are unequal, the implementation stops.
Even with this simple optimization, there are a number of issues to be dealt with. If the string addresses are not multiples of four bytes, the processor might not have an instruction that will load four bytes (many processors require four-byte loads to use addresses that are aligned to multiples of four bytes). Depending on the processor, the implementation might have to use slower unaligned loads or to write special code for each alignment case that does aligned loads and shifts bytes in registers to align the bytes to be compared.
When the implementation loads four bytes at once, it must ensure it does not load bytes beyond the terminating null character if those bytes might cause a segment fault (error because you tried to load an address that is not readable).
If the four bytes do contain the terminating null character, the implementation must detect it and not continue comparing further bytes, even if the current four are equal in the two strings.
Many of these issues require detailed assembly instructions, and the required control over the exact instructions used is not available in C. The exact techniques used vary from processor model to processor model and vary greatly from architecture to architecture.
Faster implementation of strlen:
//Return difference in addresses - 1 as we don't count null terminator in strlen.
int xstrlen(char *str)
{
char* ptr = str;
while (*str++);
return str - ptr - 1;
}
//Pretty nifty strcmp from here:
//http://vijayinterviewquestions.blogspot.com/2007/07/implement-strcmpstr1-str2-function.html
int mystrcmp(const char *s1, const char *s2)
{
while (*s1==*s2)
{
if(*s1=='\0')
return(0);
++s1;
++s2;
}
return(*s1-*s2);
}
I'll do the other one later if I have time. You should also note that most of these are done in assembly language or using other optimized means which will be faster than the best stright C implementation you can write.
Aside from the problems in your code (which have been pointed out already), -- at least in the gcc-C-libs, the str- and mem-functions are faster by a margin in most cases because their memory access patterns are higly optimized.
There were some discussions on the topic on SO already.
Try this:
int xstrlen(const char* s){
const char* s0 = s;
while(*s) s++;
return(s - s0);
}
int xstrcmp(const char* a, const char* b){
while(*a && *a==*b){a++; b++;}
return *a - *b;
}
This could probably be sped up with some loop unrolling.
1. Algorithm
Your implementation of strcmp could have a better algorithm. There should be no need to call strlen at all, each call to strlen will iterate over the whole length of the string again. You can find simple but effective implementations online, probably the place to start is something like:
// Adapted from http://vijayinterviewquestions.blogspot.co.uk
int xstrcmp(const char *s1, const char *s2)
{
for (;*s1==*s2;++s1,++s2)
{
if(*s1=='\0') return(0);
}
return(*s1-*s2);
}
That doesn't do everything, but should be simple and work in most cases.
2. Compiler optimisation
It's a stupid question, but make sure you turned on all the optimisation switches when you compile.
3. More sophisticated optimisations
People writing libraries will often use more advanced techniques, such as loading a 4-byte or 8-byte int at once, and comparing it, and only comparing individual bytes if the whole matches. You'd need to be an expert to know what's appropriate for this case, but you can find people discussing the most efficient implementation on stack overflow (link?)
Some standard library functions for some platforms may be hand-written in assembly if the coder can knows there's a more efficient implementation than the compiler can find. That's increasingly rare now, but may be common on some embedded systems.
4. Linker "cheating" with standard library
With some standard library functions, the linker may be able to make your program call them with less overhead than calling functions in your code because it was designed to know more about the specific internals of the functions (link?) I don't know if that applies in this case, it probably doesn't, but it's the sort of thing you have to think about.
5. OK, ok, I get that, but when SHOULD I implement my own strcmp?
Off the top of my head, the only reasons to do this are:
You want to learn how. This is a good reason.
You are writing for a platform which doesn't have a good enough standard library. This is very unlikely.
The string comparison has been measured to be a significant bottleneck in your code, and you know something specific about your strings that mean you can compare them more efficiently than a naive algorithm. (Eg. all strings are allocated 8-byte aligned, or all strings have an N-byte prefix.) This is very, very unlikely.
6. But...
OK, WHY do you want to avoid relying on strlen? Are you worried about code size? About portability of code or of executables?
If there's a good reason, open another question and there may be a more specific answer. So I'm sorry if I'm missing something obvious, but relying on the standard library is usually much better, unless there's something specific you want to improve on.

Create std::stringbuf based on std::vector<char>

I am doing in-memory image conversions between two frameworks (OpenSceneGraph and wxWidgets). Not wanting to care about the underlying classes (osg::Image and wxImage), I use the stream oriented I/O features both APIs provide like so:
1) Create an std::stringstream
2) Write to the stream using OSG's writers
3) Read from the stream using wxWigdets readers
This works fairly well. Until now I've been using direct access to the stream buffer, but my attention has been recently caught by the "non-contiguous underlying buffer" problem of the std::stringstream. I had been using a kludge to get a const char* ptr to the buffer - but it worked (tested on Windows, Linux and OSX, using MSVC 9 and GCC 4.x), so I never fixed it.
Now I understand that this code is a time bomb and I want to get rid of it. This problem has been brought up several times on SO (here for instance), but I could not find an answer that could really help me do the simplest thing that could possibly work.
I think the most reasonable thing to do is to create my own streambuf using a vector behind the scenes - this would guarantee that the buffer is contiguous. I am aware that this would not a generic solution, but given my constraints:
1) the required size is not infinite and actually quite predictable
2) my stream really needs to be an std::iostream (I can't use a raw char array) because of the APIs
anybody knows how I can either a custom stringbuf using a vector of chars ? Please do not answer "use std::stringstream::str()", since I know we can, but I'm precisely looking for something else (even though you'd say that copying 2-3 MB is so fast that I wouldn't even notice the difference, let's consider I am still interested in custom stringbufs just for the beauty of the exercise).
If you can use just an istream or an ostream (rather than
bidirectional), and don't need seeking, it's pretty simple (about 10
lines of code) to create your own streambuf using std::vector<char>.
But unless the strings are very, very large, why bother? The C++11 standard
guarantees that std::string is contiguous; that a char*
obtained by &myString[0] can be used to as a C style array.` And the
reason C++11 added this guarantee was in recognition of existing
practice; there simply weren't any implementations where this wasn't the
case (and now that it's required, there won't be any implementations in
the future where this isn't the case).
boost::iostreams have a few ready made sinks for this. There's array_sink if you have some sort of upper limit and can allocate the chunk upfront, such a sink won't grow dynamically but on the other hand that can be a positive as well. There's also back_inserter_device, which is more generic and works straight up with std::vector for example. An example using back_inserter_device:
#include <string>
#include <iostream>
#include "boost/iostreams/stream_buffer.hpp"
#include "boost/iostreams/device/back_inserter.hpp"
int main()
{
std::string destination;
destination.reserve( 1024 );
boost::iostreams::stream_buffer< boost::iostreams::back_insert_device< std::string > > outBuff( ( destination ) );
std::streambuf* cur = std::cout.rdbuf( &outBuff );
std::cout << "Hello!" << std::endl;
// If we used array_sink we'd need to use tellp here to retrieve how much we've actually written, and don't forgot to flush if you don't end with an endl!
std::cout.rdbuf( cur );
std::cout << destination;
}

Safe counterparts of itoa()?

I am converting some old c program to a more secure version. The following functions are used heavily, could anyone tell me their secure counterparts? Either windows functions or C runtime library functions. Thanks.
itoa()
getchar()
strcat()
memset()
itoa() is safe as long as the destination buffer is big enough to receive the largest possible representation (i.e. of INT_MIN with trailing NUL). So, you can simply check the buffer size. Still, it's not a very good function to use because if you change your data type to a larger integral type, you need to change to atol, atoll, atoq etc.. If you want a dynamic buffer that handles whatever type you throw at it with less maintenance issues, consider an std::ostringstream (from the <sstream> header).
getchar() has no "secure counterpart" - it's not insecure to begin with and has no buffer overrun potential.
Re memset(): it's dangerous in that it accepts the programmers judgement that memory should be overwritten without any confirmation of the content/address/length, but when used properly it leaves no issue, and sometimes it's the best tool for the job even in modern C++ programming. To check security issues with this, you need to inspect the code and ensure it's aimed at a suitable buffer or object to be 0ed, and that the length is computed properly (hint: use sizeof where possible).
strcat() can be dangerous if the strings being concatenated aren't known to fit into the destination buffer. For example: char buf[16]; strcpy(buf, "one,"); strcat(buf, "two"); is all totally safe (but fragile, as further operations or changing either string might require more than 16 chars and the compiler won't warn you), whereas strcat(buf, argv[0]) is not. The best replacement tends to be a std::ostringstream, although that can require significant reworking of the code. You may get away using strncat(), or even - if you have it - asprintf("%s%s", first, second), which will allocate the required amount of memory on the heap (do remember to free() it). You could also consider std::string and use operator+ to concatenate strings.
None of these functions are "insecure" provided you understand the behaviour and limitations. itoa is not standard C and should be replaced with sprintf("%d",...) if that's a concern to you.
The others are all fine to the experienced practitioner. If you have specific cases which you think may be unsafe, you should post them.
I'd change itoa(), because it's not standard, with sprintf or, better, snprintf if your goal is code security. I'd also change strcat() with strncat() but, since you specified C++ language too, a really better idea would be to use std::string class.
As for the other two functions, I can't see how you could make the code more secure without seeing your code.

Which C I/O library should be used in C++ code? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
In new C++ code, I tend to use the C++ iostream library instead of the C stdio library.
I've noticed some programmers seem to stick to stdio, insisting that it's more portable.
Is this really the case? What is better to use?
To answer the original question:
Anything that can be done using stdio can be done using the iostream library.
Disadvantages of iostreams: verbose
Advantages of iostreams: easy to extend for new non POD types.
The step forward the C++ made over C was type safety.
iostreams was designed to be explicitly type safe. Thus assignment to an object explicitly checked the type (at compiler time) of the object being assigned too (generating an compile time error if required). Thus prevent run-time memory over-runs or writing a float value to a char object etc.
scanf()/printf() and family on the other hand rely on the programmer getting the format string correct and there was no type checking (I believe gcc has an extension that helps). As a result it was the source of many bugs (as programmers are less perfect in their analysis than compilers [not going to say compilers are perfect just better than humans]).
Just to clarify comments from Colin Jensen.
The iostream libraries have been stable since the release of the last standard (I forget the actual year but about 10 years ago).
To clarify comments by Mikael Jansson.
The other languages that he mentions that use the format style have explicit safeguards to prevent the dangerous side effects of the C stdio library that can (in C but not the mentioned languages) cause a run-time crash.
N.B. I agree that the iostream library is a bit on the verbose side. But I am willing to put up with the verboseness to ensure runtime safety. But we can mitigate the verbosity by using Boost Format Library.
#include <iostream>
#include <iomanip>
#include <boost/format.hpp>
struct X
{ // this structure reverse engineered from
// example provided by 'Mikael Jansson' in order to make this a running example
char* name;
double mean;
int sample_count;
};
int main()
{
X stats[] = {{"Plop",5.6,2}};
// nonsense output, just to exemplify
// stdio version
fprintf(stderr, "at %p/%s: mean value %.3f of %4d samples\n",
stats, stats->name, stats->mean, stats->sample_count);
// iostream
std::cerr << "at " << (void*)stats << "/" << stats->name
<< ": mean value " << std::fixed << std::setprecision(3) << stats->mean
<< " of " << std::setw(4) << std::setfill(' ') << stats->sample_count
<< " samples\n";
// iostream with boost::format
std::cerr << boost::format("at %p/%s: mean value %.3f of %4d samples\n")
% stats % stats->name % stats->mean % stats->sample_count;
}
It's just too verbose.
Ponder the iostream construct for doing the following (similarly for scanf):
// nonsense output, just to examplify
fprintf(stderr, "at %p/%s: mean value %.3f of %4d samples\n",
stats, stats->name, stats->mean, stats->sample_count);
That would requires something like:
std::cerr << "at " << static_cast<void*>(stats) << "/" << stats->name
<< ": mean value " << std::precision(3) << stats->mean
<< " of " << std::width(4) << std::fill(' ') << stats->sample_count
<< " samples " << std::endl;
String formatting is a case where object-orientedness can, and should be, sidestepped in favour of a formatting DSL embedded in strings. Consider Lisp's format, Python's printf-style formatting, or PHP, Bash, Perl, Ruby and their string intrapolation.
iostream for that use case is misguided, at best.
The Boost Format Library provides a type-safe, object-oriented alternative for printf-style string formatting and is a complement to iostreams that does not suffer from the usual verbosity issues due to the clever use of operator%. I recommend considering it over using plain C printf if you dislike formatting with iostream's operator<<.
Back in the bad old days, the C++ Standards committee kept mucking about with the language and iostreams was a moving target. If you used iostreams, you were then given the opportunity to rewrite parts of your code every year or so. Because of this, I always used stdio which hasn't changed significantly since 1989.
If I were doing stuff today, I would use iostreams.
If, like me, you learned C before learning C++, the stdio libraries seem more natural to use. There are pros and cons for iostream vs. stdio but I do miss printf() when using iostream.
In principle I would use iostreams, in practice I do too much formatted decimals, etc that make iostreams too unreadable, so I use stdio. Boost::format is an improvement, but not quite motivating enough for me. In practice, stdio is nearly typesafe since most modern compilers do argument checking anyway.
It's an area where I'm still not totally happy with any of the solutions.
I'll be comparing the two mainstream libraries from the C++ standard library.
You shouldn't use C-style-format-string-based string-processing-routines in C++.
Several reasons exist to mit their use:
Not typesafe
You can't pass non-POD types to variadic argument lists (i.e., neither to scanf+co., nor to printf+co.),
or you enter the Dark Stronghold of Undefined Behaviour
Easy to get wrong:
You must manage to keep the format string and the "value-argument-list" in sync
You must keep in sync correctly
Subtle bugs introduced at remote places
It is not only the printf in itself that is not good. Software gets old and is refactored and modified, and errors might be introduced from remote places. Suppose you have
.
// foo.h
...
float foo;
...
and somewhere ...
// bar/frob/42/icetea.cpp
...
scanf ("%f", &foo);
...
And three years later you find that foo should be of some custom type ...
// foo.h
...
FixedPoint foo;
...
but somewhere ...
// bar/frob/42/icetea.cpp
...
scanf ("%f", &foo);
...
... then your old printf/scanf will still compile, except that you now get random segfaults and you don't remember why.
Verbosity of iostreams
If you think printf() is less verbose, then there's a certain probability that you don't use their iostream's full force. Example:
printf ("My Matrix: %f %f %f %f\n"
" %f %f %f %f\n"
" %f %f %f %f\n"
" %f %f %f %f\n",
mat(0,0), mat(0,1), mat(0,2), mat(0,3),
mat(1,0), mat(1,1), mat(1,2), mat(1,3),
mat(2,0), mat(2,1), mat(2,2), mat(2,3),
mat(3,0), mat(3,1), mat(3,2), mat(3,3));
Compare that to using iostreams right:
cout << mat << '\n';
You have to define a proper overload for operator<< which has roughly the structure of the printf-thingy, but the significant difference is that you now have something re-usable and typesafe; of course you can also make something re-usable for printf-likes, but then you have printf again (what if you replace the matrix members with the new FixedPoint?), apart from other non-trivialities, e.g. you must pass FILE* handles around.
C-style format strings are not better for I18N than iostreams
Note that format-strings are often thought of being the rescue with internationalization, but they are not at all better than iostream in that respect:
printf ("Guten Morgen, Sie sind %f Meter groß und haben %d Kinder",
someFloat, someInt);
printf ("Good morning, you have %d children and your height is %f meters",
someFloat, someInt); // Note: Position changed.
// ^^ not the best example, but different languages have generally different
// order of "variables"
I.e., old style C format strings lack positional information as much as iostreams do.
You might want to consider boost::format, which offers support for stating the position in the format string explicitly. From their examples section:
cout << format("%1% %2% %3% %2% %1% \n") % "11" % "22" % "333"; // 'simple' style.
Some printf-implementations provide positional arguments, but they are non-standard.
Should I never use C-style format strings?
Apart from performance (as pointed out by Jan Hudec), I don't see a reason. But keep in mind:
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified” - Knuth
and
“Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you have proven that's where the bottleneck is.” - Pike
Yes, printf-implementations are usually faster than iostreams are usually faster than boost::format (from a small and specific benchmark I wrote, but it should largely depend on the situation in particular: if printf=100%, then iostream=160%, and boost::format=220%)
But do not blindly omit thinking about it: How much time do you really spend on text-processing? How long does your program run before exiting?
Is it relevant at all to fall back to C-style format strings, loose type safety, decrease refactorbility,
increase probability of very subtle bugs that may hide themselves for years and may only reveal themselves right
into your favourites customers face?
Personally, I wouldn't fall back if I can not gain more than 20% speedup. But because my applications
spend virtually all of their time on other tasks than string-processing, I never had to. Some parsers
I wrote spend virtually all their time on string processing, but their total runtime is so small
that it isn't worth the testing and verification effort.
Some riddles
Finally, I'd like to preset some riddles:
Find all errors, because the compiler won't (he can only suggest if he's nice):
shared_ptr<float> f(new float);
fscanf (stdout, "%u %s %f", f)
If nothing else, what's wrong with this one?
const char *output = "in total, the thing is 50%"
"feature complete";
printf (output);
For binary IO, I tend to use stdio's fread and fwrite. For formatted stuff I'll usually use IO Stream although as Mikael said, non-trival (non-default?) formatting can be a PITA.
While there are a lot of benefits to the C++ iostreams API, one significant problem is has is around i18n. The problem is that the order of parameter substitutions can vary based on the culture. The classic example is something like:
// i18n UNSAFE
std::cout << "Dear " << name.given << ' ' << name.family << std::endl;
While that works for English, in Chinese the family name is comes first.
When it comes to translating your code for foreign markets, translating snippets is fraught with peril so new l10ns may require changes to the code and not just different strings.
boost::format seems to combine the best of stdio (a single format string that can use the parameters in a different order then they appear) and iostreams (type-safety, extensibility).
I use iostreams, mainly because that makes it easier to fiddle with the stream later on (if I need it). For example, you could find out that you want to display the output in some trace window -- this is relatively easy to do with cout and cerr. You can, off course, fiddle with pipes and stuff on unix, but that is not as portable.
I do love printf-like formatting, so I usually format a string first, and then send it to the buffer. With Qt, I often use QString::sprintf (although they recommend using QString::arg instead). I've looked at boost.format as well, but couldn't really get used to the syntax (too many %'s). I should really give it a look, though.
What I miss about the iolibraries is the formatted input.
iostreams does not have a nice way to replicate scanf() and even boost does not have the required extension for input.
stdio is better for reading binary files (like freading blocks into a vector<unsigned char> and using .resize() etc.). See the read_rest function in file.hh in http://nuwen.net/libnuwen.html for an example.
C++ streams can choke on lots of bytes when reading binary files causing a false eof.
Since iostreams have become a standard you should use them knowing that your code will work for sure with newer versions of compiler. I guess nowadays most of the compilers know very well about iostreams and there shouldn't be any problem using them.
But if you want to stick with *printf functions there can be no problem in my opinion.