Does printf() convert its arguments to string like cout? - c++

I start to learn C++ from C. Recently, I have just read a tutorial book about C++. In section Introduce to streams, the book has noted:
The << operator is overloaded so that the operand on the right can be
a string or any primitive value. If this operand is not a string, the
<< operator converts it to string form before sending it to the output
stream.
So I wonder whether printf() function in C has the same effect. And if it doesn't, please tell me about the differences between both of them.

Well, of course it has to somehow generate a string representation of each argument, that is needed in order to have something to print. Printing involves sending streams of characters to an output device after all, you can't print unless you have a sequence of characters.
The printf() function uses the formatting string to control how to interpret each argument in order to create the character representation, and also how to format that representation when output.
Note that no "conversion" of the arguments happens that is visible externally, of course. There's no way
printf("%d\n", 47);
can make that 47 into a string in place; C uses call by value so the function only gets a copy of the value, and it then uses the type information implicit in the %d conversion specifier to figure out how to generate the two characters '4' and '7' that make up the printed representation.

So I wonder whether printf() function in C has the same effect.
Both C and C++ uses streams for output. In C it is stdout and in C++ it is cout.
Though it is not evident from the statement printf writes to standard output(stdout) say a terminal.
In case of cout, it is evident from a statement itself, where the output is going.
Some subtle differences
With cout you might need to include an additional header - say iomanip - and use some functions - say setw() - to have fine formatting where as in printf you rely on format string.
Performance - Each has its own advantage depending on what you print and where you print. I borrowed this point from here.
Another similarity
Both C++ and C standards mention nothing about the order of evaluation of function arguments. So you must not try fancy stuff with functions. For example neither should you do
printf(%d%d",++i,i++); // The behaviour is undefined.
nor should you do
cout<<++i<<++i; // The behaviour is undefined.
Note:
Remember that the c streams are available in C++ if you include the necessary headers.

Related

How can I replicate compile time hex string interpretation at run time!? c++

In my code the following line gives me data that performs the task its meant for:
const char *key = "\xf1`\xf8\a\\\x9cT\x82z\x18\x5\xb9\xbc\x80\xca\x15";
The problem is that it gets converted at compile time according to rules that I don't fully understand. How does "\x" work in a String?
What I'd like to do is to get the same result but from a string exactly like that fed in at run time. I have tried a lot of things and looked for answers but none that match closely enough for me to be able to apply.
I understand that \x denotes a hex number. But I don't know in which form that gets 'baked out' by the compiler (gcc).
What does that ` translate into?
Does the "\a" do something similar to "\x"?
This is indeed provided by the compiler, but this part is not member of the standard library. That means that you are left with 3 ways:
dynamically write a C++ source file containing the string, and writing it on its standard output. Compile it and (providing popen is available) execute it from your main program and read its input. Pretty ugly isn't it...
use the source of an existing compiler, or directly its internal libraries. Clang is probably a good starting point because it has been designed to be modular. But it could require a good amount of work to find where that damned specific point is coded and how to use that...
just mimic what the compiler does, and write your own parser by hand. It is not that hard, and will learn you why tests are useful...
If it was not clear until here, I strongly urge you to use the third way ;-)
If you want to translate "escape" codes in strings that you get as input at run-time then you need to do it yourself, explicitly.
One way is to read the input into one string. Then copy the characters from that source string into a new destination string, one by one. If you see a backslash then you discard it, fetch the next character, and if it's an x you can use e.g. std::stoi to convert the next few characters into its corresponding integer value, and append that number to the destination string (either adding it with std::to_string, or using output string streams and the normal "output" operator <<).

What is the purpose of * in Fortran input/output

I am learning Fortran because well, yeah, thought I'd learn it. However, I've found utterly no information on what the purpose of * is in print, read, etc:
program main
print *, "Hello, world!"
end program main
What is the purpose of this *? I've done some research however I don't think I understand it properly, I don't want to carry on learning until I actually know the point of *.
From what I've managed to find I think that it's some sort of format specifier, however, I don't understand what that means and I have no idea if I'm even correct. All the tutorials I've found just tell me what to write to print to the console but not what * actually means.
You are correct in that it is a format specifier.
There's a page on Wikibooks to do with IO and that states:
The second * specifies the format the user wants the number read with
when talking about the read statement. This applies to the write and print statements too.
For fixed point reals: Fw.d; w is the total number of spaces alloted for the number, and d is the number of decimal places.
And the example they give is
REAL :: A
READ(*,'(F5.2)') A
which reads a real number with 2 digits before and after the decimal point. So if you were printing a real number you'd use:
PRINT '(F5.2)', A
to get the rounded output.
In your example you're just printing text so there's no special formatting to do. Also if you leave the format specifier as * it will apply the default formatting to reals etc.
The print statement does formatted output to a special place (which we assume is the screen). There are three available ways to specify the format used in this output. One of these, as in the question, is using *.
The * format specifier means that the formatted output will be so-called list-directed output. Conversely, for a read statement, it will be list-directed input. Now that the term has been introduced you will be able to find many other questions here: list-directed input/output has many potentially surprising consequences.
Generally, a format specification item says how to format an item in the input/output list. If we are writing out a real number we'd use a format item which may, say, state the precision of the output or whether it uses scientific notation.
As in the question, when writing out a character variable we may want to specify the width of the output field (so that there is blank padding, or partial output) or not. We could have
print '(A20)', "Hello, world!"
print '(A5)', "Hello, world!"
print '(A)', "Hello, world!"
which offers a selection of effects. [Using a literal character constant like '(A)' is one of the other ways of giving the format.]
But we couldn't have
print '(F12.5)', "Hello, world!" ! A numeric format for a character variable
as our format doesn't make sense with the output item.
List-directed output is different from the above cases. Instead, (quote here and later from the Fortran 2008 draft standard) this
allows data editing according to the type of the list item instead of by a format specification. It also allows data to be free-field, that is, separated by commas (or semicolons) or blanks.
The main effect, and why list-directed is such a convenience, is that no effort is required on the part of the programmer to craft a suitable format for the things to be sent out.
When an output list is processed under list-directed output and a real value appears it is as though the programmer gave a suitable format for that value. And the same for other types.
But the crucial thing here is "a suitable format". The end user doesn't get to say A5 or F12.5. Instead of that latter the compiler gets to choose "reasonable processor-dependent values" for widths, precisions and so on. [Leading to questions like "why so much white space?".] The constraints on the output are given in, for example, Fortran 2008 10.10.4.
So, in summary * is saying, print out the following things in a reasonable way. I won't complain how you choose to do this, even if you give me output like
5*0
1 2
rather than
0 0 0 0 0 1 2
and I certainly won't complain about having a leading space like
Hello, world!
rather than
Hello, world!
For, yes, with list-directed you will (except in cases beyond this answer) get an initial blank on the output:
Except for new records created by [a special case] or by continuation of delimited character sequences, each output record begins with a blank character.

scanf on an istream object

NOTE: I've seen the post What is the cin analougus of scanf formatted input? before asking the question and the post doesn't solve my problem here. The post seeks for C++-way to do it, but as I mentioned already, it is inconvenient to just use C++-way to do it sometimes and I have clear examples for that.
I am trying to read data from an istream object, and sometimes it is inconvenient to just use C++-style ways such as operator>>, e.g. the data are in special form 123:456 so you have to imbue to make ':' as space (which is very hacky, as opposed to %d:%d in scanf), or 00123 where you want to read as string and convert decimal instead of octal (as opposed to %d in scanf), and possibly many other cases.
The reason I chose istream as interface is because it can be derived and therefore more flexible. For example, we can create in-memory streams, or some customized streams that generated on the fly, etc. C-style FILE*, on the other hand, is very limited, at least in a standard-compliant way, on creating customized streams.
So my questions is, is there a way to do scanf-like data extraction on istream object? I think fscanf internally read character by character from FILE* using fgetc, while istream also provides such interface. So it is possible by just copying and pasting the code of fscanf and replace the FILE* with the istream object, but that's very hacky. Is there a smarter and cleaner way, or is there some existing work on this?
Thanks.
You should never, under any circumstances, use scanf or its relatives for anything, for three reasons:
Many format strings, including for instance all the simple uses of %s, are just as dangerous as gets.
It is almost impossible to recover from malformed input, because scanf does not tell you how far in characters into the input it got when it hit something unexpected.
Numeric overflow triggers undefined behavior: yes, that means scanf is allowed to crash the entire program if a numeric field in the input has too many digits.
Prior to C++11, the C++ specification defined istream formatted input of numbers in terms of scanf, which means that last objection is very likely to apply to them as well! (In C++11 the specification is changed to use strto* instead and to do something predictable if that detects overflow.)
What you should do instead is: read entire lines of input into std::string objects with getline, hand-code logic to split them up into fields (I don't remember off the top of my head what the C++-string equivalent of strsep is, but I'm sure it exists) and then convert numeric strings to machine numbers with the strtol/strtod family of functions.
I cannot emphasize this enough: THE ONLY 100% RELIABLE WAY TO CONVERT STRINGS TO NUMBERS IN C OR C++, unless you are lucky enough to have a C++ runtime that is already C++11-conformant in this regard, IS WITH THE strto* FUNCTIONS, and you must use them correctly:
errno = 0;
result = strtoX(s, &ends, 10); // omit 10 for floats
if (s == ends || *ends || errno)
parse_error();
(The OpenBSD manpages, linked above, explain why you have to do this fairly convoluted thing.)
(If you're clever, you can use ends and some manual logic to skip that colon, instead of strsep.)
I do not recommend you to mix C++ input output and C input output. No that they are really incompatible but they could just plain interoperate wrong.
For example Oracle docs recommend not to mix it http://www.oracle.com/technetwork/articles/servers-storage-dev/mixingcandcpluspluscode-305840.html
But no one stops you from reading data into the buffer and parsing it with standard c functions like sscanf.
...
string curString;
int a, b;
...
std::getline(inputStream, curString);
int sscanfResult == sscanf(curString.cstr(), "%d:%d", &a, &b);
if (2 != sscanfResult)
throw "error";
...
But it won't help in some situations when your stream is just one long contiguous sequence of symbols(like some string turned into memory stream).
Making your own fscanf from scratch or porting(?) the original CRT function actually isn't the worst possible idea. Just make sure you have tested it thoroughly(low level custom char manipulation was always a source of pain in C).
I've never really tried the boost\spirit and such parsing infrastructure could really be an overkill for your project. But boost libraries are usually well tested and designed. You could at least try to use it.
Based on #tmyklebu's comment, I implemented streamScanf which wraps istream as FILE* via fopencookie: https://github.com/likan999/codejam/blob/master/Common/StreamScanf.cpp

How to fix fprintf vulnerability?

In my code I used fprintf. I used flawfinder to check the code for vulnerabilities and I got that:
358: [4] (format) fprintf: If format strings can be influenced by
an attacker, they can be exploited. Use a constant for the format
specification.
Can someone explain to me what Use a constant for the format specification actually means? Is there any safe version of fprintf?
The problem is that fprintf determines how many arguments it should get by examining the format string. If the format string doesn't agree with the actual arguments, you have undefined behavior which can manifest as a security vulnerability.
The problem is particularly bad if the string supplied can be influenced by the user of your program, because he can then specifically design the string to make your program do bad things.
There is no safe version of fprintf in the C standard. C++ streams avoid the problem, at the cost of not having format strings and using a far more verbose syntax for specifying formatting options.
A constant string, as in a string literal.
Like in
fprintf(someFile, "%s", someStringVariable);
and not like
fprintf(someFile, someStringVariable);
It means it wants you to write:
fprintf(out, "foo %s", some_string);
instead of what you have, which I guess is something like:
const char *format = "foo %s";
/* some time later */
fprintf(out, format, some_string);
The reason is that it's worried format might come from user input or something, and a malicious user could supply a format foo %s%s%s in order to provoke undefined behavior that they may be able to exploit.
Obviously if you're choosing between n different format strings, all of which are string literals in your code and all use the same format specifiers, but you choose which one at runtime, then following this advice is a bit awkward and wouldn't make your code any safer. But you could have n functions instead of n strings, and each function calls fprintf with a different string literal.
If you're reading the format string out of a config file (which is one fairly crude way of implementing internationalization from scratch) then you're basically out of luck. The linter doesn't trust your translator to use the right format codes for the arguments supplied to the call. And arguably neither should you :-)

printf vs. std::cout [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Should I use printf in my C++ code?
If I just want to print a string on screen, I can do that using those two ways:
printf("abc");
std::cout << "abc" << std::endl;
The case is, and in the examples shown above, is there an advantage of using printf over std::cout, or, vice versa?
While cout is the proper C++ way, I believe that some people and companies (including Google) continue to use printf in C++ code because it is much easier to do formatted output with printf than with cout.
Here's an interesting example that I found here.
Compare:
printf( "%-20s %-20s %5s\n" , "Name" , "Surname" , "Id" );
and
cout << setw( -20 ) << "Name" << setw( 20 ) << "Surname" << setw( 5 ) << "Id" << endl;
printf and its associated friends are C functions. They work in C++, but do not have the type safety of C++ std::ostreams. Problems can arise in programs that use printf functions to format output based on user input (or even input from a file). For example:
int main()
{
char[] a = {'1', '2', '3', '4'}; // a string that isn't 0-terminated
int i = 50;
printf("%s", a); // will continue printing characters until a 0 is found in memory
printf("%s", i); // will attempt to print a string, but this is actually an integer
}
C++ has much stronger type safety (and a std::string class) to help prevent problems like these.
I struggle with this very question myself. printf is in general easier to use for formatted printing, but the iostreams facility in C++ has the big advantage that you can create custom formatters for objects. I end up using both of them in my code as necessary.
The problem with using both and intermixing them is that the output buffers used by printf and cout are not the same, so unless you run unbuffered or explicitly flush output you can end up with corrupted output.
My main objection to C++ is that there is no fast output formatting facility similar to printf, so there is no way to easily control output for integer, hex, and floating point formatting.
Java had this same problem; the language ended up getting printf.
Wikipedia has a good discussion of this issue at http://en.wikipedia.org/wiki/Printf#C.2B.2B_alternatives_to_sprintf_for_numeric_conversion.
Actually for your particular example, you should have asked which is preferable, puts or cout. printf prints formatted text but you are just outputting plain text to the console.
For general use, streams (iostream, of which cout is a part) are more extensible (you can print your own types with them), and are more generic in that you can generate functions to print to any type of stream, not just the console (or redirected output). You can create generic stream behaviour with printf too using fprintf which take a FILE* as a FILE* is often not a real file, but this is more tricky.
Streams are "typesafe" in that you overload with the type you are printing. printf is not typesafe with its use of ellipses so you could get undefined results if you put the wrong parameter types in that do not match the format string, but the compiler will not complain. You may even get a seg-fault / undefined behaviour (but you could with cout if used incorrectly) if you miss a parameter or pass in a bad one (eg a number for %s and it treats it as a pointer anyway).
printf does have some advantages though: you can template a format string then reuse that format string for different data, even if that data is not in a struct, and using formatting manipulations for one variable does not "stick" that format for further use because you specify the format for each variable. printf is also known to be threadsafe whereas cout actually is not.
boost has combined the advantages of each with their boost::format library.
The printf has been borrowed from C and has some limitations. The most common mentioned limitation of printf is type safety, as it relies on the programmer to correctly match the format string with the arguments. The second limitation that comes again from the varargs environment is that you cannot extend the behavior with user defined types. The printf knows how to print a set of types, and that's all that you will get out of it. Still, it for the few things that it can be used for, it is faster and simpler to format strings with printf than with c++ streams.
While most modern compilers, are able to address the type safety limitation and at least provide warnings (the compiler can parse the format string and check the arguments provided in the call), the second limitation cannot be overcome. Even in the first case, there are things that the compiler cannot really help with, as checking for null termination --but then again, the same problem goes with std::cout if you use it to print the same array.
On the other end, streams (including std::cout) can be extended to handle user defined types by means of overloaded std::ostream& operator<<( std::ostream&, type const & ) for any given user defined type type. They are type safe by themselves --if you pass in a type that has no overloaded operator<< the compiler will complain. They are, on the other hand, more cumbersome to produce formatted output.
So what should you use? In general I prefer using streams, as overloading operator<< for my own types is simple and they can be used uniformly with all types.
Those two examples do different things. The latter will add a newline character and flush output (result of std::endl). std::cout is also slower. Other than that, printf and std::cout achieve the same thing and you can choose whichever you prefer. As a matter of preference, I'd use std::cout in C++ code. It's more readable and safer.
See this article if you need to format output using std::cout.
In general, you should prefer cout because it's much type-safer and more generic. printf isn't type-safe, nor is it generic at all. The only reason you might favour printf is speed- from memory, printf is many times faster than cout.