Rationale
I try to avoid assignments in C++ code completely. That is, I use only initialisations and declare local variables as const whenever possible (i.e. always except for loop variables or accumulators).
Now, I’ve found a case where this doesn’t work. I believe this is a general pattern but in particular it arises in the following situation:
Problem Description
Let’s say I have a program that loads the contents of an input file into a string. You can either call the tool by providing a filename (tool filename) or by using the standard input stream (cat filename | tool). Now, how do I initialise the string?
The following doesn’t work:
bool const use_stdin = argc == 1;
std::string const input = slurp(use_stdin ? static_cast<std::istream&>(std::cin)
: std::ifstream(argv[1]));
Why doesn’t this work? Because the prototype of slurp needs to look as follows:
std::string slurp(std::istream&);
That is, the argument is non-const and as a consequence I cannot bind it to a temporary. There doesn’t seem to be a way around this using a separate variable either.
Ugly Workaround
At the moment, I use the following solution:
std::string input;
if (use_stdin)
input = slurp(std::cin);
else {
std::ifstream in(argv[1]);
input = slurp(in);
}
But this is rubbing me the wrong way. First of all it’s more code (in SLOCs) but it’s also using an if instead of the (here) more logical conditional expression, and it’s using assignment after declaration which I want to avoid.
Is there a good way to avoid this indirect style of initialisation? The problem can likely be generalised to all cases where you need to mutate a temporary object. Aren’t streams in a way ill-designed to cope with such cases (a const stream makes no sense, and yet working on a temporary stream does make sense)?
Why not simply overload slurp?
std::string slurp(char const* filename) {
std::ifstream in(filename);
return slurp(in);
}
int main(int argc, char* argv[]) {
bool const use_stdin = argc == 1;
std::string const input = use_stdin ? slurp(std::cin) : slurp(argv[1]);
}
It is a general solution with the conditional operator.
The solution with the if is more or less the standard solution when
dealing with argv:
if ( argc == 1 ) {
process( std::cin );
} else {
for ( int i = 1; i != argc; ++ i ) {
std::ifstream in( argv[i] );
if ( in.is_open() ) {
process( in );
} else {
std::cerr << "cannot open " << argv[i] << std::endl;
}
}
}
This doesn't handle your case, however, since your primary concern is to
obtain a string, not to "process" the filename args.
In my own code, I use a MultiFileInputStream that I've written, which
takes a list of filenames in the constructor, and only returns EOF when
the last has been read: if the list is empty, it reads std::cin. This
provides an elegant and simple solution to your problem:
MultiFileInputStream in(
std::vector<std::string>( argv + 1, argv + argc ) );
std::string const input = slurp( in );
This class is worth writing, as it is generally useful if you often
write Unix-like utility programs. It is definitely not trivial, however,
and may be a lot of work if this is a one-time need.
A more general solution is based on the fact that you can call a
non-const member function on a temporary, and the fact that most of the
member functions of std::istream return a std::istream&—a
non const-reference which will then bind to a non const reference. So
you can always write something like:
std::string const input = slurp(
use_stdin
? std::cin.ignore( 0 )
: std::ifstream( argv[1] ).ignore( 0 ) );
I'd consider this a bit of a hack, however, and it has the more general
problem that you can't check whether the open (called by the constructor
of std::ifstream) worked.
More generally, although I understand what you're trying to achieve, I
think you'll find that IO will almost always represent an exception.
You can't read an int without having defined it first, and you can't
read a line without having defined the std::string first. I agree
that it's not as elegant as it could be, but then, code which correctly
handles errors is rarely as elegant as one might like. (One solution
here would be to derive from std::ifstream to throw an exception if
the open didn't work; all you'd need is a constructor which checked for
is_open() in the constructor body.)
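The derived stream mentioned in the parenthesis could look roughly like this. It is only a sketch: the class name CheckedIfstream and the error text are made up for illustration, and it assumes that throwing std::runtime_error on a failed open is acceptable in your program.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical name, not a standard class: an ifstream that cannot be
// constructed in a "failed to open" state.
class CheckedIfstream : public std::ifstream {
public:
    explicit CheckedIfstream(const std::string& filename)
        : std::ifstream(filename.c_str()) {
        // The base class constructor has already attempted the open.
        if (!is_open()) {
            throw std::runtime_error("cannot open " + filename);
        }
    }
};
```

Used as a temporary in the conditional expression, this would turn a silent open failure into an exception instead of an empty result.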
All SSA-style languages need to have phi nodes to be usable, realistically. You would run into the same problem in any case where you need to construct from two different types depending on the value of the condition. The ternary operator cannot handle such cases. Of course, in C++11 there are other tricks, like moving the stream or suchlike, or using a lambda, and the design of IOstreams is virtually the exact antithesis of what you're trying to do, so in my opinion, you would just have to make an exception.
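For what it's worth, the lambda trick could be sketched as follows. The slurp body here is a stand-in (the question does not show it), and load_input with its fallback parameter is a hypothetical wrapper that exists only so the idea can be tried without the real standard input.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Stand-in for the question's slurp: reads everything left in the stream.
std::string slurp(std::istream& in) {
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}

// Hypothetical wrapper; `fallback` plays the role of std::cin.
std::string load_input(int argc, char* argv[], std::istream& fallback = std::cin) {
    // The lambda is invoked immediately, so `input` is initialised exactly
    // once and can stay const even though the two branches use different
    // stream types.
    std::string const input = [&]() -> std::string {
        if (argc == 1) {
            return slurp(fallback);
        }
        std::ifstream in(argv[1]);
        return slurp(in);
    }();
    return input;
}
```

The if is still there, but it is confined to the initialiser, so there is no assignment after declaration.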
Another option might be an intermediate variable to hold the stream:
std::istream&& is = argc==1? std::move(cin) : std::ifstream(argv[1]);
std::string const input = slurp(is);
Taking advantage of the fact that named rvalue references are lvalues.
Related
I am a beginner at C++ programming, and I encountered an issue.
I want to be able to convert the contents of a file to a char*, and I used file and string streams. However, it's not working.
This is my function that does the work:
char* fileToChar(std::string const& file){
std::ifstream in(file);
if (!in){
std::cout << "Error: file does not exist\n";
exit(EXIT_FAILURE);
}
std::stringstream buffer;
buffer << in.rdbuf() << std::flush;
in.close();
return const_cast<char *>(buffer.str().c_str());
}
However, when I test the method out by outputting its contents into another file like this:
std::ofstream file("test.txt");
file << fileToChar("fileTest.txt");
I just get tons of strange characters like this:
îþîþîþîþîþîþîþîþîþîþîþîþîþîþîþîþîþ[...etc]
What exactly is going on here? Is there anything I missed?
And if there's a better way to do this, I would be glad to know!
return const_cast<char *>(buffer.str().c_str());
returns a pointer to the internal char buffer of a temporary copy of the internal buffer of the local stringstream. Long story short: As soon as you exit the function, this pointer points to garbage.
Btw, even if that was not a problem, the const_cast would be dangerous nonsense, you are not allowed to write through the pointer std::string::c_str returns. Legitimate uses of const_cast are extremely rare.
And for the better way: The best and easiest way would be returning std::string. Only if this is not allowed, a std::vector<char> (preferred) or new char[somelength] (frowned on) would be viable solutions.
char* fileToChar(std::string const& file){
This line already shows that something is going in the wrong direction. You return a pointer to some string, and it's completely unclear to the user of the function who is responsible for releasing the allocated memory, if it has to be released at all, if nullptr can be returned, and so on.
If you want a string, then by all means use std::string!
std::string fileToChar(std::string const& file){
return const_cast<char *>(buffer.str().c_str());
Another line that should make all alarms go off. const_cast is always a workaround to some underlying problem (or some problem with external code).
There is usually a good reason why something is const. By forcing the compiler to turn off the security check and allowing it to attempt modifications of unmodifiable data, you typically turn compilation errors into hard-to-diagnose run-time errors.
Even if this function worked correctly, any attempt to modify the result would be undefined behaviour:
char* file_contents = fileToChar("foo.txt");
file_contents[0] = 'x'; // undefined behaviour
But it does not work correctly anyway. buffer.str() returns a temporary std::string object. c_str() returns a pointer to that temporary object's internally managed memory. The object's lifetime ends when the full expression return const_cast<char *>(buffer.str().c_str()) has been evaluated. Using the resulting pointer is therefore undefined behaviour, too.
The problems sound complicated, but the fix is easy. Make the function return std::string and turn the last statement into return buffer.str();.
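A minimal sketch of that fix, assuming an exception is an acceptable replacement for the exit() call in the question. The names fileToString and streamToString are chosen here for illustration only.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Copies everything left in the stream into a string owned by the caller.
std::string streamToString(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();  // returned by value, so nothing dangles
}

std::string fileToString(std::string const& file) {
    std::ifstream in(file.c_str(), std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + file);
    }
    return streamToString(in);
}
```

If some API really needs a char const*, call c_str() on the returned string while that string is still alive.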
If your question is how to read the content of a file into a buffer, consider the following suggestion. But take care that the buffer is big enough for the file content. A file size check and preallocation of the memory is advised before calling fileToChar().
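// Needs <cstdio> and <string>.
// buffer_size: capacity of buffer on entry, number of bytes read on return.
bool fileToChar(std::string const& file, char* buffer, unsigned int &buffer_size )
{
FILE *f = fopen( file.c_str(), "rb" );
if( f == nullptr )
{
return false;
}
fseek(f , 0, SEEK_END );
const long size = ftell( f );
rewind( f );
// Refuse to read if the size is unknown or the file does not fit.
if( size < 0 || static_cast<unsigned long>( size ) > buffer_size )
{
fclose( f );
return false;
}
buffer_size = static_cast<unsigned int>( fread( buffer, 1, static_cast<size_t>( size ), f ) );
fclose( f );
return true;
}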
I have the following mock up code of a class which uses an attribute to set a filename:
#include <iostream>
#include <iomanip>
#include <sstream>
class Test {
public:
Test() { id_ = 1; }
/* Code which modifies ID */
void save() {
std::string filename ("file_");
filename += getID();
std::cout << "Saving into: " << filename <<'\n';
}
private:
const std::string getID() {
std::ostringstream oss;
oss << std::setw(4) << std::setfill('0') << id_;
return oss.str();
}
int id_;
};
int main () {
Test t;
t.save();
}
My concern is about the getID method. At first sight it seems pretty inefficient since I am creating the ostringstream and its corresponding string to return. My questions:
1) Since it returns const std::string is the compiler (GCC in my case) able to optimize it?
2) Is there any way to improve the performance of the code? Maybe move semantics or something like that?
Thank you!
Creating an ostringstream, just once, prior to an expensive operation like opening a file, doesn't matter to your program's efficiency at all, so don't worry about it.
However, you should worry about one bad habit exhibited in your code. To your credit, you seem to have identified it already:
1) Since it returns const std::string is the compiler (GCC in my case) able to optimize it?
2) Is there any way to improve the performance of the code? Maybe move semantics or something like that?
Yes. Consider:
class Test {
// ...
const std::string getID();
};
int main() {
std::string x;
Test t;
x = t.getID(); // HERE
}
On the line marked // HERE, which assignment operator is called? We want to call the move assignment operator, but that operator is prototyped as
string& operator=(string&&);
and the argument we're actually passing to our operator= is of type "reference to an rvalue of type const string" — i.e., const string&&. The rules of const-correctness prevent us from silently converting that const string&& to a string&&, so when the compiler is creating the set of assignment-operator functions it's possible to use here (the overload set), it must exclude the move-assignment operator that takes string&&.
Therefore, x = t.getID(); ends up calling the copy-assignment operator (since const string&& can safely be converted to const string&), and you make an extra copy that could have been avoided if only you hadn't gotten into the bad habit of const-qualifying your return types.
Also, of course, the getID() member function should probably be declared as const, because it doesn't need to modify the *this object.
So the proper prototype is:
class Test {
// ...
std::string getID() const;
};
The rule of thumb is: Always return by value, and never return by const value.
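If you want to see the overload resolution at work, here is a small probe. The type Probe and the two factory functions are invented for this illustration; only the assignment operators are counted, so the result does not depend on copy elision.

```cpp
// Hypothetical probe type that records which assignment operator ran.
struct Probe {
    static int copy_assigns;
    static int move_assigns;

    Probe() = default;
    Probe(const Probe&) = default;
    Probe(Probe&&) = default;

    Probe& operator=(const Probe&) { ++copy_assigns; return *this; }
    Probe& operator=(Probe&&) noexcept { ++move_assigns; return *this; }
};

int Probe::copy_assigns = 0;
int Probe::move_assigns = 0;

// Returning by const value: the result cannot bind to Probe&&.
const Probe make_const() { return Probe(); }

// Returning by plain value: the result binds to Probe&& and can be moved from.
Probe make_plain() { return Probe(); }
```

Assigning the result of make_const() to an existing Probe bumps copy_assigns, while assigning the result of make_plain() bumps move_assigns.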
1) Since it returns const std::string is the compiler (GCC in my case)
able to optimize it?
Makes no sense to return a const object unless returning by reference
2) Is there any way to improve the performance of the code? Maybe move
semantics or something like that?
If id_ does not change, just create the value in the constructor; using a static method may help:
static std::string format_id(int id) {
std::ostringstream oss;
oss << std::setw(4) << std::setfill('0') << id;
return oss.str();
}
And then:
Test::Test()
: id_(1)
, id_str_(format_id(id_))
{ }
Update:
This answer is not totally valid for the problem because id_ does change. I will not remove it, since someone may find it useful for their case. Anyway, I wanted to clarify some thoughts:
Must be static in order to be used in variable initialization
There was a mistake in the code (now corrected), which used the member variable id_.
It makes no sense to return a const object by value, because returning by value will just copy (ignoring optimizations) the result to a new variable, which is in the scope of the caller (and might be not const).
My advice
An option is to update the id_str_ field any time id_ changes (you must have a setter for id_). Given that you're already changing the member id_, I assume there will be no issues updating another.
This approach allows to implement getID() as a simple getter (should be const, btw) with no performance issues, and the string field is computed only once.
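Roughly, and assuming that every change to id_ can be routed through a single setter (setID is a hypothetical name, the question only says "code which modifies ID"), it might look like this:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

class Test {
public:
    Test() : id_(1), id_str_(format_id(id_)) {}

    // Hypothetical setter: the only place where id_ changes, so the
    // cached string is refreshed here and nowhere else.
    void setID(int id) {
        id_ = id;
        id_str_ = format_id(id);
    }

    // Plain getter: no stream is created and no string is copied.
    const std::string& getID() const { return id_str_; }

private:
    static std::string format_id(int id) {
        std::ostringstream oss;
        oss << std::setw(4) << std::setfill('0') << id;
        return oss.str();
    }

    int id_;              // declared first, so it is initialised first
    std::string id_str_;  // cached, zero-padded form of id_
};
```

Here returning a const reference is fine, because the string it refers to lives as long as the object does.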
One possibility would be to do something like this:
std::string getID(int id) {
std::string ret = std::string(4, '0') + std::to_string(id);
return ret.substr(ret.length()-4);
}
If you're using an implementation that includes the short string optimization (e.g., VC++) chances are pretty good that this will give a substantial speed improvement (a quick test with VC++ shows it at around 4-5 times as fast).
OTOH, if you're using an implementation that does not include short string optimization, chances are pretty good it'll be substantially slower. For example, running the same test with g++, produces code that's about 4-5 times slower.
One more point: if your ID number might be more than 4 digits long, this doesn't give the same behavior--it always returns a string of exactly 4 characters rather than the minimum of 4 created by the stringstream code. If your ID numbers may exceed 9999, then this code simply won't work for you.
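If longer IDs matter, a variant that pads only when needed keeps the "minimum of 4" behaviour. This is a sketch under the assumption that IDs are never negative; pad_id is a name made up here.

```cpp
#include <string>

// Hypothetical helper: zero-pads to at least 4 characters and leaves
// longer numbers untouched. Assumes id >= 0.
std::string pad_id(int id) {
    std::string digits = std::to_string(id);
    if (digits.length() < 4) {
        digits = std::string(4 - digits.length(), '0') + digits;
    }
    return digits;
}
```

I have not timed this version, so the speed figures above do not necessarily carry over to it.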
You could change getID in this way:
std::string getID() {
thread_local std::ostringstream oss;
oss.str(""); // replaces the input data with the given string
oss.clear(); // resets the error flags
oss << std::setw(4) << std::setfill('0') << id_;
return oss.str();
}
This way it won't create a new ostringstream every single time.
In your case it isn't worth it (as Chris Dodd says, opening a file and writing to it is likely to be 10-100x more expensive), but it is good to know.
Also consider that in any reasonable library implementation std::to_string will be at least as fast as stringstream.
1) Since it returns const std::string is the compiler (GCC in my case)
able to optimize it?
There is a rationale for this practice, but it's essentially obsolete (e.g. Herb Sutter recommended returning const values for non-primitive types).
With C++11 it is strongly advised to return values as non-const so that you can take full advantage of rvalue references.
About this topic you can take a look at:
Purpose of returning by const value?
Should I return const objects?
I have a function definition like the one below:
void ConvertString(std::string &str)
{
size_t pos = 0;
while ((pos = str.find("&", pos)) != std::string::npos) {
str.replace(pos, 1, "and");
pos += 3;
}
}
The purpose of this function is to find & and replace it with and. The function itself works fine. I wrote it to handle any string, but in one place I am calling it in the following way:
char mystr[80] = "ThisIsSample&String";
ConvertString((std::string)mystr);
printf(mystr);
With the above call I expect the console to print the modified string containing "and".
But the string modification does not take effect. Is there an error in the function?
This code:
char mystr[80] = "ThisIsSample&String";
ConvertString((std::string)mystr);
printf(mystr);
… creates a temporary string object and passes that as argument.
Since the formal argument type is by reference to non-const, this should not compile, but Visual C++ supports it as a language extension (for class types only, IIRC).
Instead do like
string s = "Blah & blah";
ConvertString( s );
cout << s << endl;
By the way, C style casts are in general an invitation to bugs, because the basic nature of such a cast can change very silently from e.g. const_cast to reinterpret_cast when the code is maintained.
It's safe enough in the hands of an experienced programmer, like a power tool such as a chain saw can be safe in the hands of an experienced woodsman, but it's not a thing that a novice should use just to save a little work.
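Another way to sidestep the temporary problem altogether is to take the string by value and return the result. A sketch, with ReplaceAmpersands as a name invented here rather than anything from the question:

```cpp
#include <string>

// Works on its own copy, so it is happy with temporaries, string
// literals and char arrays alike.
std::string ReplaceAmpersands(std::string str) {
    std::string::size_type pos = 0;
    while ((pos = str.find('&', pos)) != std::string::npos) {
        str.replace(pos, 1, "and");
        pos += 3;  // skip past the inserted "and"
    }
    return str;
}
```

The caller then writes std::string const converted = ReplaceAmpersands(mystr); and prints converted, with no cast needed.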
It's because you create a temporary std::string object (whose initial content is the content of the array mystr), and pass that temporary object by reference to the function. This temporary object is then destructed when the call is done.
Did you read some documentation of std::string and of printf?
You need
std::string mystr = "ThisIsSample&String";
ConvertString(mystr);
printf(mystr.c_str());
You obviously want to pass by reference a string variable (technically an l-value) to your ConvertString
I believe your problem is that you cast char array to string.
ConvertString((std::string)mystr);
this line creates a new variable of type std::string and passes it by reference. What you want is to convert it this way:
std::string convertedStr = (std::string)mystr;
ConvertString(convertedStr);
printf(convertedStr.c_str());
I am not very well aware of C++ pointer and reference syntax, but it's similar to this
What you are doing is not correct! You should not convert a char* to a std::string with a C-style cast. What you should do is more like:
std::string mystr( "ThisIsSample&String" );
ConvertString(mystr);
edit:
thx for -reputation... this code isn't even compiling...
http://ideone.com/bCsmgf
When I attempt to run this code it crashes. There are no error messages. When the program compiles and runs, it just displays the windows 7 message, "this program has stopped working.":
void readGameFile(string ** entries, int * num_entries, string ** story, int * num_lines)
{
ifstream madlib("madlibs1.txt");
string line;
getline(madlib, line);
*num_entries=stoi(line);
*entries=new string [*num_entries];
for (int i=0; i<*num_entries; i++)
{
getline(madlib,*entries[i]);
}
I did a few tests, and it seems to assign entries[0] a value, and then crashes when attempting to assign entries[1] a value. I am forced to use this function name, with those function parameters and parameter types specifically. I also may not use malloc, vector or other answers I've seen.
I think the issue is one of precedence: you almost certainly
want:
getline( madlib, (*entries)[i] );
Otherwise, you're indexing from the string**, then
dereferencing: *(entries[i]).
You also want to check the results of getline, possibly in the
loop:
for ( int i = 0; madlib && i != *num_entries; ++ i )...
as well as before the std::stoi.
And finally: I don't know why you are forced to use this
function signature. It is horrible C++, and you should never
write anything like this. Logically, std::vector<string>
would be a better solution, but even without it: your function
has 4 out parameters. This would be better handled by returning
a struct. And failing that, out parameters in C++ are
usually implemented by non-const reference, not by a pointer.
While there are arguments for using the pointer in some cases,
when it results in a pointer to a pointer, it's evil. If
nothing else:
bool // Because we have to indicate whether it succeeded or failed
readGameFile( std::string* &entries, int &num_entries, std::string* &story, int &num_lines )
// ...
(This actually looks more like it should be a constructor,
however, of a class with two data elements, entries and
story.)
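Putting the precedence fix and the getline checks together, while keeping the pointer-to-pointer parameters you are forced to use, might look like the sketch below. readEntries is a hypothetical helper that covers only the entries part and takes the stream as a parameter so it can be tried without madlibs1.txt.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>

// Reads a count from the first line, then that many lines into a newly
// allocated array. On success the caller owns *entries (delete[] it).
// Note: std::stoi throws if the first line is not a number; that is left
// uncaught in this sketch.
bool readEntries(std::istream& in, std::string** entries, int* num_entries) {
    *entries = nullptr;
    *num_entries = 0;

    std::string line;
    if (!std::getline(in, line)) {
        return false;
    }
    const int count = std::stoi(line);
    if (count < 0) {
        return false;
    }

    *entries = new std::string[count];
    for (int i = 0; i != count; ++i) {
        // Parentheses matter: dereference first, then index.
        if (!std::getline(in, (*entries)[i])) {
            delete[] *entries;
            *entries = nullptr;
            return false;
        }
    }
    *num_entries = count;
    return true;
}
```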
Assignment:
Read in info from text file (done)
Retrieve only parts of text file using substr method (done)
Store info into instance variables (need help)
Here is the code I am having trouble with:
string* lati;
lati = new string(data.substr(0, data.find_first_of(",")));
double* latDub;
latDub = new double(atof((char *)lati));
this->latitude = *latDub;
I need to store the latitude into the instance variable latitude.
The variable data is the read-in text file.
this->latitude is declared as a double.
I have tested and the variable lati is the correct value, but once I try to convert it into a double the value changes to 0 for some reason. I am specifically supposed to use the atof method when converting!
(char *)lati doesn't do what you think it does. What you're clearly trying to do there is get the char sequence associated with lati, but what you're actually doing is just squeezing a string* into a char* which is all kinds of bad.
There's a member function on std::string that will give you exactly what you want. You should review the documentation for string, and replace (char *)lati with a call to that function.
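In case the hint is too terse: the member function in question is c_str(). A sketch of the whole step, still using atof as the assignment requires (parseLatitude is a name made up here, and no dynamic allocation is needed):

```cpp
#include <cstdlib>
#include <string>

// Takes everything before the first comma and converts it with atof.
double parseLatitude(const std::string& data) {
    const std::string lati = data.substr(0, data.find_first_of(","));
    // c_str() yields the character sequence that atof expects.
    return std::atof(lati.c_str());
}
```

Keep in mind that atof cannot report errors; it simply returns 0.0 when no conversion can be performed.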
Why your code compiles, but gives meaningless results has already been explained by adpalumbo. There are two fundamental problems in your code leading to that error, on which I want to expand here.
One is that you use a C-style cast: (T)obj. Basically, that just tells the compiler to shut up, you know what you are doing. That is rarely ever a good idea, because when you do know what you are doing, you can usually do without such casts.
The other one is that you are using objects allocated dynamically on the heap. In C++, objects should be created on the stack, unless you have very good reasons for using dynamic objects. And dynamic objects are usually hidden inside objects on the stack. So your code should read like this:
string lati(data.substr(0, data.find_first_of(",")));
double latDub = /* somehow create double from lati */;
this->latitude = latDub;
Of course, latDub is completely unnecessary, you could just as well write to this->latitude directly.
Now, the common way to convert a string into some other type would be streaming it through a string stream. Removing the unnecessary variables you introduced, your code would then look like this:
std::istringstream iss(data.substr(0, data.find_first_of(",")));
if( !(iss >> this->latitude) ) throw "Dude, you need error handling here!";
Usually you want to pack that conversion from a string into a utility function which you could reuse throughout your code:
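inline double convert3double(const std::string& str)
{
std::istringstream iss(str);
double result;
if( !(iss >> result) ) // parentheses needed: ! binds tighter than >>
throw std::runtime_error("Dang!"); // from <stdexcept>; std::exception has no standard string constructor
return result;
}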
However, since the very same algorithm can be used for all types (for which operator>> is overloaded meaningfully with an input stream as the left operand), just make this a template:
template< typename T >
inline T convert3double(const std::string& str)
{
std::istringstream iss(str);
T result; // presumes default constructor
if( !(iss >> result) ) // presumes operator>>; parentheses needed because ! binds tighter than >>
throw std::runtime_error("Dang!"); // from <stdexcept>
return result;
}
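For trying the generic version out, here is a self-contained variant with the headers it needs. The name from_string is hypothetical, and the sketch assumes std::runtime_error is an acceptable way to report a failed conversion.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical name for the generic conversion helper.
template <typename T>
T from_string(const std::string& str) {
    std::istringstream iss(str);
    T result;  // presumes a default constructor
    if (!(iss >> result)) {  // presumes operator>> for T
        throw std::runtime_error("cannot convert \"" + str + "\"");
    }
    return result;
}
```

The latitude assignment then becomes this->latitude = from_string<double>(data.substr(0, data.find_first_of(",")));. Trailing characters after the number are silently ignored by this sketch.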