I'm relearning C++ and I found myself often writing pieces of code like this:
vector<string> Text;
string Line;
while (getline(cin, Line))
{
Text.push_back(Line);
}
I was wondering if there is a more compact way to write the loop using only basic features (no user-written functions, for example) - more or less, putting everything in the condition?
You can use a for loop. We can declare Line in the variable declaration part and use the condition and increment parts to read and place the line. That gives you
for(string Line; getline(cin, Line); Text.push_back(Line));
more or less, putting everything in the condition?
You can do this
while (getline(cin, Line) && (Text.push_back(Line),true)) {}
It works because && is short-circuited and because the comma-operator evaluates the first operand, discards the result and returns the result of the second operand.
So it is possible, but why would you want to do that? Making code as dense as possible is rarely good for readability (actually your original code is more readable and uses less characters).
Well, at the expense of some obfuscation,
while (getline(cin, Line) && (Text.push_back(Line), 1));
would do it: note the use of the expression separator operator which, informally speaking, "converts" the void return type of push_back to an int so enabling its use with the short-circuiting &&.
But as a rule of thumb, work with the language, not against it. My answer is doing the latter. The way you present the code in your question is adequate.
At the risk of "but that is exactly OP's code!", I would personally favor this version if this is the entire body of a scope (e.g. function that parses the text):
vector<string> Text;
string Line;
while (getline(cin, Line))
Text.push_back(Line);
Alternatively, if part of a larger scope, I would probably group all four lines together and add empty lines before and after for visual coherence (and maybe add a short comment before it):
// [lots of other code]
// Gather input from cin.
vector<string> Text;
string Line;
while (getline(cin, Line))
Text.push_back(Line);
// [lots of other code]
I am aware that this introduces no clever tricks, but this is the most compact and readable form of the given code, at least to me.
If you wanted compactness above all else, you could choose garbage variables, omit all unnecessary whitespace and even alias the types beforehand (since we are "often writing" this kind of code this is a one-off) to, say, V<S> t;S l;while(getline(cin,l))t.push_back(l); but nobody wants to read that.
So clearly there is more than compactness at play. As for me, I'm looking to keep noise to a minimum while retaining intuitive readability, and I would suggest this is an agreeable goal.
I would never use the "throw everything into the loop condition" suggestions because that very much breaks how I expect code to be structured: The main purpose of your loop goes into the loop body. You may disagree/have different expectations, but in my eyes everything else is just an attempt to show off your minifying skills, it does not produce good code.
The above accomplishes just that: The braces are noise for this simple operation, and the important part stands out as the loop body. "But is getline not also important?" - It is, and I would honestly prefer a version where it is in the loop body, such as a hypothetical
vector<string> Text;
while (cin.hasLine())
Text.push_back(readLine(cin));
This would be an ideal loop to me: The condition only checks for termination and the loop body is only the operation we want to repeat.
Even better would be a standard algorithm, but I unaware of any that would help here (ranges or boost might provide, I don't know).
On a more abstract level, if OP frequently writes this exact code, it should obviously be a separate function. But even if not, the "lots of other code" example would benefit from that abstraction too.
Loop with a single instruction. you can write it in one line but I don't recommend it
while (getline(cin, Line)) Text.push_back(Line);
Related
so I made a function to parse a given string with a comma delimiter last semester during a haze. Its very likely I took much of it from guides online, but it worked for the overall project so I did it. Now i'm going back and reviewing it, and i'm confused. Here is the code
`vector parsedString(string line){
vector splitStrings;
stringstream inputString(line);
while(inputString.good()){
string substr;
getline(inputString,substr,',');
splitStrings.push_back(substr);
substr = "";
}
return splitStrings;
}`
The purpose was to put each part of the line thats seperated into a vector, then take that vector back where its needed with all the parts. However, I do NOT understand the stringstream aspects of this.
To be specific, when I wrote code to check stringstreams contents during the loop, it stayed the same for the entire time. If getline() is supposed to track where the last delimiter was, why does it not show in the contents?
also if possible, an explanation on how .good() works in this case would be phenomenal. I understand stringstream is a stream, and function of that sort are supposed to check if streams are finished or not, but again I don't understand how the program would know that.
Everything works as intended, there is no mistakes being made from what I can see. I just fundamentally can't seem to grasp why its working, and I don't want my lack of knowledge to come back to bite me.
A istringstream is not just a string. If it were, it would be redundant.
For a first approximation, you could think of it as a class whose instances contain a string and a position (i.e., a string index). When you construct the istringstream from a string, the position is initialised to 0. When you read a character from the istringstream, you get the character at the position, and the position is incremented. So each time you read a character, you get the next one. (Actually, a stringstream has two positions, one for reading and one for writing, because it's a combination of an istringstream and an ostreamstring. But only the input part is relevant to your question.)
All other stream input operations are based on reading a single character, although implementations are allowed to be more efficient if the results are the same.
The above was an oversimplification, of course. A stream has other state: status bits, formatting parameters, locale settings, and more stuff I'm forgetting. See this overview for more details. But the basic point stands: the string is only a part of a stringstream's state: the rest of the state is used to make it look like an I/O stream. Which turns out to be useful if you want to pick it apart sequentially into tokens.
Please note, I've never used streams before today, so my understanding of them remains rather vague. Apologies when I say something appallingly stupid.
Here I have a short bit of code that splits up a stringstream into a bunch of strings at each space.
vector<string> words;
stringstream ss("some random words that I wrote just now");
string word;
while(getline(ss, word, ' ')){
words.push_back(word);
}
I'm wondering why we're using a stringstream here, rather than just a string.
This would work like:
Create a string object at memory location x
When referenced, go through each character and check if it is a space. All previous characters should be saved somewhere temporary.
If it is a space, grab all the stuff we've just stored and stick it on the end of the vector, then, clear the temp storage thing. If it's not a space, go back to step 2
What's storing "some random words that I wrote just now" as a stringstream going to do to help us here?
Is it just making a stream of characters so that we can check through them? Is this necessary? Are we always doing this, even in other languages?
I'm wondering why we're using a stringstream here, rather than just a string.
If this is the question, then one big reason why stringstream is used is simply -- because it works with little effort by the programmer. The less code you write, the less chance for bugs to occur.
Your method of using just std::string and searching for spaces requires the C++ programmer to write all of those steps (create a string, manually search for spaces, etc). It may be trivial to write, but even the best programmers can make mistakes in trivial code. The code may have bugs, may not cover all of the corner cases, etc.
As to ease of use:
When a C++ programmer sees stringstream with respect to usage of separating a sting with whitespace, the purpose of the code is immediately known.
If on the other hand, a programmer decides to manually parse the data by using just string and searching for spaces, the code is not immediately realized as to what it does when another programmer reads the code. Sure, it may be a quick realization of the code by the other programmer, but I can bet the other programmer will say "why didn't you use stringstream?".
What's storing "some random words that I wrote just now" as a stringstream going to do to help us here? Is it just making a stream of characters so that we can check through them? Is this necessary?
std::stringstream just allows you to use the usual input/output operations such as >> and std::getline on a string. You can't use std::getline to read from an std::string, so you put the string in a std::streamstream first. You can totally parse a string by looping over the characters yourself as you described.
Are we always doing this, even in other languages?
Not in Python at least. There you would just do words = line.split(' ').
I want to read a file line by line and capture one particular line of input. For maximum performance I could do this in a low level way by reading the entire file in and just iterating over its contents using pointers, but this code is not performance critical so therefore I wish to use a more readable and typesafe std library style implementation.
So what I have is this:
std::string line;
line.reserve(1024);
std::ifstream file(filePath);
while(file)
{
std::getline(file, line);
if(line.substr(0, 8) == "Whatever")
{
// Do something ...
}
}
While this isn't performance critical code I've called line.reserve(1024) before the parsing operation to preclude multiple reallocations of the string as larger lines are read in.
Inside std::getline the string is erased before having the characters from each line added to it. I stepped through this code to satisfy myself that the memory wasn't being reallocated each iteration, what I found fried my brain.
Deep inside string::erase rather than just resetting its size variable to zero what it's actually doing is calling memmove_s with pointer values that would overwrite the used part of the buffer with the unused part of the buffer immediately following it, except that memmove_s is being called with a count argument of zero, i.e. requesting a move of zero bytes.
Questions:
Why would I want the overhead of a library function call in the middle of my lovely loop, especially one that is being called to do nothing at all?
I haven't picked it apart myself yet but under what circumstances would this call not actually do nothing but would in fact start moving chunks of buffer around?
And why is it doing this at all?
Bonus question: What the C++ standard library tag?
This is a known issue I reported a year ago, to take advantage of the fix you'll have to upgrade to a future version of the compiler.
Connect Bug: "std::string::erase is stupidly slow when erasing to the end, which impacts std::string::resize"
The standard doesn't say anything about the complexity of any std::string functions, except swap.
std::string::clear() is defined in terms of std::string::erase(),
and std::string::erase() does have to move all of the characters after
the block which was erased. So why shouldn't it call a standard
function to do so? If you've got some profiler output which proves that
this is a bottleneck, then perhaps you can complain about it, but
otherwise, frankly, I can't see it making a difference. (The logic
necessary to avoid the call could end up costing more than the call.)
Also, you're not checking the results of the call to getline before
using them. Your loop should be something like:
while ( std::getline( file, line ) ) {
// ...
}
And if you're so worried about performance, creating a substring (a new
std::string) just in order to do a comparison is far more expensive
than a call to memmove_s. What's wrong with something like:
static std::string const target( "Whatever" );
if ( line.size() >= target.size()
&& std::equal( target.begin(), target().end(), line.being() ) ) {
// ...
}
I'ld consider this the most idiomatic way of determining whether a
string starts with a specific value.
(I might add that from experience, the reserve here doesn't buy you
much either. After you've read a couple of lines in the file, your
string isn't going to grow much anyway, so there'll be very few
reallocations after the first couple of lines. Another case of
premature optimization?)
In this case, I think the idea you mention of reading the entire file and iterating over the result may actually give about as simple of code. You're simply changing: "read line, check for prefix, process" to "read file, scan for prefix, process":
size_t not_found = std::string::npos;
std::istringstream buffer;
buffer << file.rdbuf();
std::string &data = buffer.str();
char const target[] = "\nWhatever";
size_t len = sizeof(target)-1;
for (size_t pos=0; not_found!=(pos=data.find(target, pos)); pos+=len)
{
// process relevant line starting at contents[pos+1]
}
When would I use std::istringstream, std::ostringstream and std::stringstream and why shouldn't I just use std::stringstream in every scenario (are there any runtime performance issues?).
Lastly, is there anything bad about this (instead of using a stream at all):
std::string stHehe("Hello ");
stHehe += "stackoverflow.com";
stHehe += "!";
Personally, I find it very rare that I want to perform streaming into and out of the same string stream.
Usually I want to either initialize a stream from a string and then parse it; or stream things to a string stream and then extract the result and store it.
If you're streaming to and from the same stream, you have to be very careful with the stream state and stream positions.
Using 'just' istringstream or ostringstream better expresses your intent and gives you some checking against silly mistakes such as accidental use of << vs >>.
There might be some performance improvement but I wouldn't be looking at that first.
There's nothing wrong with what you've written. If you find it doesn't perform well enough, then you could profile other approaches, otherwise stick with what's clearest. Personally, I'd just go for:
std::string stHehe( "Hello stackoverflow.com!" );
A stringstream is somewhat larger, and might have slightly lower performance -- multiple inheritance can require an adjustment to the vtable pointer. The main difference is (at least in theory) better expressing your intent, and preventing you from accidentally using >> where you intended << (or vice versa). OTOH, the difference is sufficiently small that especially for quick bits of demonstration code and such, I'm lazy and just use stringstream. I can't quite remember the last time I accidentally used << when I intended >>, so to me that bit of safety seems mostly theoretical (especially since if you do make such a mistake, it'll almost always be really obvious almost immediately).
Nothing at all wrong with just using a string, as long as it accomplishes what you want. If you're just putting strings together, it's easy and works fine. If you want to format other kinds of data though, a stringstream will support that, and a string mostly won't.
In most cases, you won't find yourself needing both input and output on the same stringstream, so using std::ostringstream and std::istringstream explicitly makes your intention clear. It also prevents you from accidentally typing the wrong operator (<< vs >>).
When you need to do both operations on the same stream you would obviously use the general purpose version.
Performance issues would be the least of your concerns here, clarity is the main advantage.
Finally there's nothing wrong with using string append as you have to construct pure strings. You just can't use that to combine numbers like you can in languages such as perl.
istringstream is for input, ostringstream for output. stringstream is input and output.
You can use stringstream pretty much everywhere.
However, if you give your object to another user, and it uses operator >> whereas you where waiting a write only object, you will not be happy ;-)
PS:
nothing bad about it, just performance issues.
std::ostringstream::str() creates a copy of the stream's content, which doubles memory usage in some situations. You can use std::stringstream and its rdbuf() function instead to avoid this.
More details here: how to write ostringstream directly to cout
To answer your third question: No, that's perfectly reasonable. The advantage of using streams is that you can enter any sort of value that's got an operator<< defined, while you can only add strings (either C++ or C) to a std::string.
Presumably when only insertion or only extraction is appropriate for your operation you could use one of the 'i' or 'o' prefixed versions to exclude the unwanted operation.
If that is not important then you can use the i/o version.
The string concatenation you're showing is perfectly valid. Although concatenation using stringstream is possible that is not the most useful feature of stringstreams, which is to be able to insert and extract POD and abstract data types.
Why open a file for read/write access if you only need to read from it, for example?
What if multiple processes needed to read from the same file?
My brother recently started learning C++. He told me a problem he encountered while trying to validate input in a simple program. He had a text menu where the user entered an integer choice, if they entered an invalid choice, they would be asked to enter it again (do while loop). However, if the user entered a string instead of an int, the code would break.
I read various questions on stackoverflow and told him to rewrite his code along the lines of:
#include<iostream>
using namespace std;
int main()
{
int a;
do
{
cout<<"\nEnter a number:"
cin>>a;
if(cin.fail())
{
//Clear the fail state.
cin.clear();
//Ignore the rest of the wrong user input, till the end of the line.
cin.ignore(std::numeric_limits<std::streamsize>::max(),\
'\n');
}
}while(true);
return 0;
}
While this worked ok, I also tried a few other ideas:
1. Using a try catch block. It didn't work. I think this is because an exception is not raised due to bad input.
2. I tried if(! cin){//Do Something} which didn't work either. I haven't yet figured this one out.
3. Thirdly, I tried inputting a fixed length string and then parsing it. I would use atoi(). Is this standards compliant and portable? Should I write my own parsing function?
4. If write a class that uses cin, but dynamically does this kind of error detection, perhaps by determining the type of the input variable at runtime, would it have too much overhead? Is it even possible?
I would like to know what is the best way to do this kind of checking, what are the best practices?
I would like to add that while I am not new to writing C++ code, I am new to writing good standards compliant code. I am trying to unlearn bad practices and learn the right ones. I would be much obliged if answerers give a detailed explanation.
EDIT: I see that litb has answered one of my previous edits. I'll post that code here for reference.
#include<iostream>
using namespace std;
int main()
{
int a;
bool inputCompletionFlag = true;
do
{
cout<<"\nEnter a number:"
cin>>a;
if(cin.fail())
{
//Clear the fail state.
cin.clear();
//Ignore the rest of the wrong user input, till the end of the line.
cin.ignore(std::numeric_limits<std::streamsize>::max(),\
'\n');
}
else
{
inputCompletionFlag = false;
}
}while(!inputCompletionFlag);
return 0;
}
This code fails on input like "1asdsdf". I didn't know how to fix it but litb has posted a great answer. :)
Here is code you could use to make sure you also reject things like
42crap
Where non-number characters follow the number. If you read the whole line and then parse it and execute actions appropriately it will possibly require you to change the way your program works. If your program read your number from different places until now, you then have to put one central place that parses one line of input, and decides on the action. But maybe that's a good thing too - so you could increase the readability of the code that way by having things separated: Input - Processing - Output
Anyway, here is how you can reject the number-non-number of above. Read a line into a string, then parse it with a stringstream:
std::string getline() {
std::string str;
std::getline(std::cin, str);
return str;
}
int choice;
std::istringstream iss(getline());
iss >> choice >> std::ws;
if(iss.fail() || !iss.eof()) {
// handle failure
}
It eats all trailing whitespace. When it hits the end-of-file of the stringstream while reading the integer or trailing whitespace, then it sets the eof-bit, and we check that. If it failed to read any integer in the first place, then the fail or bad bit will have been set.
Earlier versions of this answer used std::cin directly - but std::ws won't work well together with std::cin connected to a terminal (it will block instead waiting for the user to input something), so we use a stringstream for reading the integer.
Answering some of your questions:
Question: 1. Using a try catch block. It didn't work. I think this is because an exception is not raised due to bad input.
Answer: Well, you can tell the stream to throw exceptions when you read something. You use the istream::exceptions function, which you tell for which kind of error you want to have an exception thrown:
iss.exceptions(ios_base::failbit);
I did never use it. If you do that on std::cin, you will have to remember to restore the flags for other readers that rely on it not throwing. Finding it way easier to just use the functions fail, bad to ask for the state of the stream.
Question: 2. I tried if(!cin){ //Do Something } which didn't work either. I haven't yet figured this one out.
Answer: That could come from the fact that you gave it something like "42crap". For the stream, that is completely valid input when doing an extraction into an integer.
Question: 3. Thirdly, I tried inputting a fixed length string and then parsing it. I would use atoi(). Is this standards compliant and portable? Should I write my own parsing function?
Answer: atoi is Standard Compliant. But it's not good when you want to check for errors. There is no error checking, done by it as opposed to other functions. If you have a string and want to check whether it contains a number, then do it like in the initial code above.
There are C-like functions that can read directly from a C-string. They exist to allow interaction with old, legacy code and writing fast performing code. One should avoid them in programs because they work rather low-level and require using raw naked pointers. By their very nature, they can't be enhanced to work with user defined types either. Specifically, this talks about the function "strtol" (string-to-long) which is basically atoi with error checking and capability to work with other bases (hex for example).
Question: 4. If I write a class that uses cin, but dynamically do this kind of error detection, perhaps by determining the type of the input variable at runtime, will it have too much overhead? Is it even possible?
Answer: Generally, you don't need to care too much about overhead here (if you mean runtime-overhead). But it depends specifically on where you use that class. That question will be very important if you are writing a high performance system that processes input and needs to have high throughout. But if you need to read input from a terminal or a file, you already see what this comes down to: Waiting for the user to input something takes really so long, you don't need to watch runtime costs at this point anymore on this scale.
If you mean code overhead - well it depends on how the code is implemented. You would need to scan your string that you read - whether it contains a number or not, whether some arbitrary string. Depending on what you want to scan (maybe you have a "date" input, or a "time" input format too. Look into boost.date_time for that), your code can become arbitrarily complex. For simple things like classifying between number or not, I think you can get away with small amount of code.
This is what I do with C but it's probably applicable for C++ as well.
Input everything as a string.
Then, and only then, parse the string into what you need. It's sometimes better to code your own than try to bend someone else's to your will.
In order to get the exceptions with iostreams you need to set the proper exception flag for the stream.
And I would use get_line to get the whole line of input and then handle it accordingly - use lexical_cast, regular expressions (for example Boost Regex or Boost Xpressive, parse it with Boost Spirit, or just use some kind of appropriate logic
What I would do is twofold: First, try to validate the input, and extract the data, using a regular expression, if the input is somewhat not trivial. It can be very helpful also even if the input is just a series of numbers.
Then, I like to use boost::lexical_ cast, that can raise a bad_ lexical_ cast exception if the input cannot be converted.
In your example:
std::string in_str;
cin >> in_str;
// optionally, test if it conforms to a regular expression, in case the input is complex
// Convert to int? this will throw bad_lexical_cast if cannot be converted.
int my_int = boost::lexical_cast<int>(in_str);
Forget about using formatted input (the >> operator) directly in real code. You will always need to read raw text with std::getline or similar and then use your own input parsing routines (which may use of the >> operator) to parse the input.
How about a combination of the various approaches:
Snag the input from std::cin using std::getline(std::cin, strObj) where strObj is a std::string object.
Use boost::lexical_cast to perform a lexical translation from strObj to either a signed or unsigned integer of largest width (e.g., unsigned long long or something similar)
Use boost::numeric_cast to cast the integer down to the expected range.
You could just fetch the input with std::getline and then call boost::lexical_cast to the appropriately narrow integer type as well depending on where you want to catch the error. The three step approach has the benefit of accepting any integer data and then catch narrowing errors separately.
I agree with Pax, the simplest way to do this is to read everything as string, then use TryParse to verify the input. If it is in the right format, then proceed, otherwhise just notify the user and use continue on the loop.
One thing that hasn't been mentioned yet is that it is usually important that you test to see if the cin >> operation worked before using the variable that supposedly got something from the stream.
This example is similar to yours, but makes that test.
#include <iostream>
#include <limits>
using namespace std;
int main()
{
while (true)
{
cout << "Enter a number: " << flush;
int n;
if (cin >> n)
{
// do something with n
cout << "Got " << n << endl;
}
else
{
cout << "Error! Ignoring..." << endl;
cin.clear();
cin.ignore(numeric_limits<streamsize>::max(), '\n');
}
}
return 0;
}
This will use the usual operator >> semantics; it will skip whitespace first, then try to read as many digits as it can and then stop. So "42crap" will give you the 42 then skip over the "crap". If that isn't what you want, then I agree with the previous answers, you should read it into a string and then validate it (perhaps using a regular expression - but that may be overkill for a simple numeric sequence).