How do I write a function that reads an std::istream and sets proper flags if the stream contained unexpected content, ended before expected, or was not fully consumed?
For concreteness, suppose I'm expecting the stream to contain a string of alpha characters followed by a separator and then some digits, like foo:55. I'd like to read something like
struct var {
std::string name;
double value;
};
from the stream. I can of course write the operator as
std::istream& operator>>(std::istream& s, var& x) {
std::string str;
s >> str;
size_t sep = str.find(':');
x.name = str.substr(0,sep);
x.value = atof(str.substr(sep+1).c_str());
return s;
}
But can I do without copying the stream content to a string? Also, this doesn't work with spaces, in the sense that str won't contain the whole stream content.
I asked a similar question about a week ago, but it got no response, probably because I framed it in the context of boost::program_options, and such questions don't seem to get much attention here.
You can use std::getline instead of s >> str to read up to ':', and then read the number directly into the double, like this:
std::istream& operator>>(std::istream& s, var& x) {
// Skip over the leading whitespace
while (s.peek() == '\n' || s.peek() == ' ') {
s.get();
}
std::getline(s, x.name, ':');
s >> x.value;
return s;
}
Why not let the stream do the work for you? You can use getline(), >> and istream::ignore() to read the input.
std::istream& operator>>(std::istream& s, var& x) {
    // get the string part and throw out the ':'
    std::getline(s, x.name, ':');
    // get the number part
    s >> x.value;
    // consume the newline so the next call to getline won't include it in the string part
    // (std::numeric_limits requires #include <limits>)
    s.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return s;
}
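Neither version above reports malformed input, which is what the question asks about. Here is a minimal sketch of one way to do it, assuming the name must be non-empty and purely alphabetic. The helper fully_consumed is a hypothetical name introduced for illustration, not a standard facility.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

struct var {
    std::string name;
    double value;
};

// Reads "name:value". Sets failbit if the separator is missing,
// the name is empty or non-alphabetic, or the number cannot be parsed.
std::istream& operator>>(std::istream& s, var& x) {
    var tmp{};
    s >> std::ws;                               // skip leading whitespace
    if (!std::getline(s, tmp.name, ':'))
        return s;                               // nothing could be read
    bool ok = !s.eof() && !tmp.name.empty();    // eof here means ':' was never found
    for (unsigned char c : tmp.name)
        ok = ok && std::isalpha(c);
    if (!ok) {
        s.setstate(std::ios_base::failbit);     // unexpected content
        return s;
    }
    if (s >> tmp.value)
        x = tmp;                                // commit only on full success
    return s;
}

// Hypothetical helper: true if only whitespace remains in the stream.
bool fully_consumed(std::istream& s) {
    return s.eof() || (s >> std::ws).eof();
}
```

With this, "foo:55" parses cleanly, "foo55" and "f1o:55" leave the stream in a failed state, and "foo:55 bar" parses but fully_consumed returns false, so the caller can tell the three situations apart.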
Related
The contents of my file look like:
Barr,3145,7
Rab,12,5513,1412,221,232,179,7121231
Bean,1,1231,219,21,337,9239,312,764,640391,4,7966346,22278,5,116364,56350
Earl,132,230,121,32,425,67
Donut,112,5525,23121,2123,65432,8790,3,4567,444
I want each line to be stored in a 2D vector (vector<vector<string>>) without the comma.
I have tried doing:
while(getline(filestream, line)){
stringstream linestream(line);
while(getline(linestream, anotherLine, ',')){
oneDvector.push_back(anotherLine);
}
twoDvector.push_back(oneDvector);
oneDvector.clear();
}
But this does not seem to work. What can I do?
IMHO, you don't want to use a 2d vector or array. You want a std::vector of a class:
struct Record
{
std::string text;
std::vector<int> data;
friend std::istream& operator>>(std::istream& input, Record& r);
};
std::istream& operator>>(std::istream& input, Record& r)
{
    std::string text_line;
    if (!std::getline(input, text_line))
        return input; // no more lines; leave r untouched
    std::istringstream text_stream(text_line);
    std::getline(text_stream, r.text, ',');
    r.data.clear(); // discard numbers left over from a previous record
    int value;
    char comma;
    while (text_stream >> value)
    {
        r.data.push_back(value);
        text_stream >> comma;
    }
    return input;
}
Note: In the above input function, a record is read by line into a string. This will make the reading of the numbers easier.
Your input loop could look like:
Record r;
std::vector<Record> database;
while (file_stream >> r)
{
database.push_back(r);
}
Modeling a structure on the input record usually makes for a better program, one that is easier to develop and debug.
The parsing could be simplified further by replacing the second and all later commas with spaces; the first comma still ends the text field.
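If the two-dimensional vector of strings from the question is still what is wanted, the same line-then-field splitting can be sketched as a small function. read_rows is a hypothetical name, and the sketch assumes fields never contain quoted or embedded commas.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits every line of `in` at commas; returns one inner vector per line.
std::vector<std::vector<std::string>> read_rows(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                    // tolerate Windows line endings
        std::vector<std::string> fields;        // fresh vector per line, no clear() needed
        std::istringstream line_stream(line);
        std::string field;
        while (std::getline(line_stream, field, ','))
            fields.push_back(field);
        rows.push_back(fields);
    }
    return rows;
}
```

Declaring the inner vector inside the loop means it starts empty on every line, so no state is carried over between rows.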
I have the following struct:
struct Person{
std::string name;
std::string address;
std::string& personName(){ return name; }
std::string& personAddress(){return address;}
};
The exercise is to write a read function that will read name and address. For example, the function I first wrote was this:
std::istream &read(std::istream &is, Person &person){
is >> person.name >> person.address;
return is;
}
However this function fails to take more than a word for address. For example if input is:
Lee Goswell Road
The output will be person.name = "Lee" and person.address = "Goswell". What I want is the function to read the entire address basically. I did try solving this problem as follows, but I am not sure it is right because address is changed implicitly:
std::istream &read(std::istream &is, Person &person){
is >> person.name;
std::getline(std::cin, person.address);
return is;
}
One more thing to consider before suggesting separate functions: the task is to write a single function that reads both the name and the address.
You can use operator>> in tandem with std::getline but you'll probably want to eat the white-space from the stream first.
Also rather than read, you should just create your own operator>>:
std::istream& operator>>(std::istream& is, Person& person){
is >> person.name >> std::ws;
std::getline(is, person.address);
return is;
}
You can then use this as follows:
std::istringstream foo("Lee Goswell Road\nJon Lois Lane");
Person bar;
foo >> bar;
std::cout << bar.name << std::endl << bar.address << std::endl;
Just read a word, skip leading whitespace, then read to a delimiter:
if (is >> person.name >> std::ws
    && std::getline(is, person.address)) {
// do something with the input
}
else {
// deal with input failure
}
std::ws simply skips leading whitespace and std::getline() reads to delimiter with '\n' being the default.
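Since the exercise asks for a single read function rather than an operator, the same idea can be sketched under that name. This is only an illustration: the temporary Person is an added assumption so that the caller's object is left untouched when input fails. Note that std::ws also skips newlines, so a line holding only a name would take the following line as its address.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Person {
    std::string name;
    std::string address;
};

// Reads one word as the name and the rest of the line as the address.
std::istream& read(std::istream& is, Person& person) {
    Person tmp;
    if (is >> tmp.name >> std::ws && std::getline(is, tmp.address))
        person = tmp;   // only overwrite the caller's object on success
    return is;
}
```

Because read returns the stream, it can be used directly as a loop condition, for example while (read(is, person)).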
I am basically reading a .txt file and storing values.
For example:
Student- Mark
Tennis
It will store Mark into memory as the studentName.
Now...If it is just
Student-
Tennis
Then it will work fine and produce an error.
However, if the file looks like this
Student-(space)(nothing here)
Tennis
It will store Tennis into memory as the studentName, when in fact it should store nothing and produce an error. I use the '\n' character to determine if there is anything after the - character. This is my code...
istream& operator>> (istream& is, Student& student)
{
is.get(buffer,200,'-');
is.get(ch);
if(is.peek() == '\n')
{
cout << "Nothing there" << endl;
}
is >> ws;
is.getline(student.studentName, 75);
}
I think it is because the is.peek() is recognizing white space, but then if I try removing white space using is >> ws, it removes the '\n' character and still stores Tennis as the studentName.
Would really mean a lot if someone could help me solve this problem.
If you want to ignore whitespace but not '\n', you can't use std::ws as easily: it skips all whitespace, and besides ' ' the characters '\n' (newline), '\t' (tab), '\r' (carriage return), '\v' (vertical tab), and '\f' (form feed) are considered whitespace in the default locale. You could redefine what whitespace means for your stream (by replacing the stream's std::locale with one containing a custom std::ctype<char> facet that has a different idea of what whitespace means), but that's probably a bit more advanced (as far as I can tell, only a handful of people could do that right away; ask about it and I'll answer that question if I notice it, though...). An easier approach is to simply read the tail of the line using std::getline() and see what's in there.
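A rough sketch of that easier approach follows. It is a variation that reads the whole line and then inspects what comes after the '-'. The function name read_student_name is hypothetical, and a std::string is assumed for the name instead of the char array used in the question.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads a line of the form "Student- NAME" and stores NAME in `name`.
// Sets failbit if the line has no '-' or only spaces/tabs follow it.
std::istream& read_student_name(std::istream& is, std::string& name) {
    std::string line;
    if (!std::getline(is, line))
        return is;
    std::string::size_type dash = line.find('-');
    std::string::size_type first = (dash == std::string::npos)
        ? std::string::npos
        : line.find_first_not_of(" \t\r", dash + 1);
    if (first == std::string::npos) {
        is.setstate(std::ios_base::failbit);   // no '-' or no name after it
        return is;
    }
    std::string::size_type last = line.find_last_not_of(" \t\r");
    name = line.substr(first, last - first + 1);
    return is;
}
```

Because the newline is consumed together with the line, the following "Tennis" line is never mistaken for the name; after a failure the caller can clear() the stream and continue with the next line.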
Another alternative is to create your own manipulator, let's say skipspace, and use it before checking for the newline:
std::istream& skipspace(std::istream& in) {
std::istreambuf_iterator<char> it(in), end;
std::find_if(it, end, [](char c){ return c != ' '; });
return in;
}
// ...
if ((in >> skipspace).peek() != '\n') {
// ....
}
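The version above needs <algorithm> and <iterator>. An equivalent manipulator can also be sketched with peek() and ignore() alone; this is an alternative formulation for illustration, not the answer's original code.

```cpp
#include <istream>
#include <sstream>

// Manipulator: discards spaces only, leaving '\n' and all other characters in place.
std::istream& skipspace(std::istream& in) {
    while (in.peek() == ' ')
        in.ignore();
    return in;
}
```

It is used the same way: (in >> skipspace).peek() yields '\n' when nothing but spaces preceded the end of the line.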
You don't need to peek characters. I would use std::getline() and let it handle line breaks for you, then use std::istringstream for parsing:
std::istream& operator>> (std::istream& is, Student& student)
{
std::string line;
if (!std::getline(is, line))
{
std::cout << "Can't read student name" << std::endl;
return is;
}
std::istringstream iss(line);
std::string ignore;
std::getline(iss, ignore, '-');
iss >> std::ws;
iss.getline(student.studentName, 75);
/*
read and store className if needed ...
if (!std::getline(is, line))
{
std::cout << "Can't read class name" << std::endl;
return is;
}
std::istringstream iss2(line);
iss2.getline(student.className, ...);
*/
return is;
}
Or, if you can change Student::studentName into a std::string instead of a char[]:
std::istream& operator>> (std::istream& is, Student& student)
{
std::string line;
if (!std::getline(is, line))
{
std::cout << "Can't read student name" << std::endl;
return is;
}
std::istringstream iss(line);
std::string ignore;
std::getline(iss, ignore, '-');
iss >> std::ws >> student.studentName;
/*
read and store className if needed ...
if (!std::getline(is, student.className))
{
std::cout << "Can't read class name" << std::endl;
return is;
}
*/
return is;
}
I've got an object (of class myObj) that contains multiple strings (a string pointer). I want to overload the >>operator so that I can read in multiple strings at a time.
These overloaded operator functions accept statements like:
cin >> str;
cout << str;
The only problem is that when I fill in a series of strings, it seems that only the first string gets correctly processed in the stream.
To insert:
istream &operator>>(istream &is, myObj &obj)
{
std::string line;
while (true)
{
std::getline(is, line);
if (not is)
break;
obj.add(line);
is >> line;
}
return is;
}
To extract
ostream &operator<<(ostream &os, myObj const &obj)
{
for(size_t idx = 0; idx != obj.size(); ++idx)
os << obj[idx];
return os;
}
The code compiles fine but when I cout the object only the first string is printed and the other strings are omitted.
So when I provide cin with:
Hi
Stack
Exchange
only Hi will be displayed.
Does anyone know why this happens?
Thanks in advance!
P.S I am new to Stack Exchange and I am working hard to formulate the problems as best as I can :)
Your loop works like this:
std::getline(is, line);
Extracts and stores "Hi" in line, extracts the newline
obj.add(line);
Adds "Hi" to obj
is >> line;
Extracts and stores "Stack" in line, does not extract the following newline
std::getline(is, line);
Extracts and stores an empty string in line, because next read char is a newline
obj.add(line);
Adds empty string "" to obj
is >> line;
Extracts and stores "Exchange" in line
std::getline(is, line);
Extracts nothing (end of input stream)
if (not is)
break;
Then the stream is at end, your loop exits.
Conclusion: you stored only "Hi" and an empty string.
is >> line leaves the trailing newline in the stream, and std::getline() reads it on the next iteration. Because std::getline() stops at the first newline, it stores an empty string in line. The is >> line calls also consume the remaining words without adding them to obj, so the loop runs out of input after storing only "Hi" and an empty string.
There doesn't seem to be a need for that last read. You can remove it.
Also, this is a more idiomatic way to loop with input. This way you don't have to check the state of the stream within the loop body:
for (std::string line; std::getline(is, line); )
{
obj.add(line);
}
And since there's only one token per line, you can use a formatted extractor:
for (std::string word; is >> word; )
{
obj.add(word);
}
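To see the difference concretely, the original loop and the getline-only loop can be compared side by side. In this sketch a std::vector<std::string> stands in for myObj, and the names read_original and read_fixed are made up for the comparison.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// The loop from the question: getline followed by a formatted read.
std::vector<std::string> read_original(std::istream& is) {
    std::vector<std::string> stored;
    std::string line;
    while (true) {
        std::getline(is, line);
        if (!is)
            break;
        stored.push_back(line);
        is >> line;   // consumes the next word and discards it
    }
    return stored;
}

// The getline-only loop.
std::vector<std::string> read_fixed(std::istream& is) {
    std::vector<std::string> stored;
    for (std::string line; std::getline(is, line); )
        stored.push_back(line);
    return stored;
}
```

Fed the three lines Hi, Stack, Exchange (with no newline after the last one), read_original stores "Hi" and an empty string, while read_fixed stores all three words.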
I have an input stream containing integers and special meaning characters '#'. It looks as follows:
... 12 18 16 # 22 24 26 15 # 17 # 32 35 33 ...
The tokens are separated by space. There's no pattern for the position of '#'.
I was trying to tokenize the input stream like this:
int value;
std::ifstream input("data");
if (input.good()) {
string line;
while(getline(data, line) != EOF) {
if (!line.empty()) {
sstream ss(line);
while (ss >> value) {
//process value ...
}
}
}
}
The problem with this code is that the processing stops when the first '#' is encountered.
The only solution I can think of is to extract each individual token into a string (not '#') and use atoi() function to convert the string to an integer. However, it's very inefficient as the majority tokens are integer. Calling atoi() on the tokens introduces big overhead.
Is there a way I can parse the individual token by its type? ie, for integers, parse it as integers while for '#', skip it. Thanks!
One possibility would be to explicitly skip whitespace (ss >> std::ws), and then to use ss.peek() to find out if a # follows. If yes, use ss.get() to read it and continue, otherwise use ss >> value to read the value.
If the positions of # don't matter, you could also remove all '#' from the line before initializing the stringstream with it.
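The peek-based approach might look roughly like the sketch below. Since the question does not say what '#' means, the sketch simply assumes each '#' starts a new group of integers; parse_groups is a hypothetical name.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Parses integers separated by whitespace; each '#' starts a new group.
std::vector<std::vector<int>> parse_groups(std::istream& in) {
    std::vector<std::vector<int>> groups(1);
    const int eof = std::char_traits<char>::eof();
    while (true) {
        in >> std::ws;                  // skip whitespace before the next token
        int next = in.peek();
        if (next == eof)
            break;                      // end of input (or stream already failed)
        if (next == '#') {
            in.get();                   // consume the '#'
            groups.emplace_back();      // and open a new group
        } else {
            int value;
            if (!(in >> value))
                break;                  // neither '#' nor an integer: stop
            groups.back().push_back(value);
        }
    }
    return groups;
}
```

The integers are read with the formatted extractor directly, so no intermediate string is created per token.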
It is usually not worth testing against good():
if (input.good()) {
Unless your next operation is generating an error message or exception. If it is not good all further operations will fail anyway.
Don't test against EOF.
while(getline(data, line) != EOF) {
The result of std::getline() is not an integer. It is a reference to the input stream. The input stream is convertible to a bool-like object that can be used in a boolean context (while, if, etc.). So what you want is:
while(getline(data, line)) {
I am not sure I would read a line. You could just read a word (since the input is space separated), using the >> operator on a string:
std::string word;
while(data >> word) { // reads one space separated word
Now you can test the word to see if it is your special character:
if (word[0] == '#')
If not convert the word into a number.
This is what I would do:
// define a class that will read either value from a stream
class MyValue
{
public:
bool isSpec() const {return isSpecial;}
int value() const {return intValue;}
friend std::istream& operator>>(std::istream& stream, MyValue& data)
{
    std::string item;
    if (stream >> item) // only touch data if a token was actually read
    {
        if (item[0] == '#') {
            data.isSpecial = true;
        } else {
            data.isSpecial = false;
            data.intValue = std::atoi(item.c_str()); // needs <cstdlib>
        }
    }
    return stream;
}
private:
bool isSpecial;
int intValue;
};
// Now your loop becomes:
MyValue val;
while(file >> val)
{
if (val.isSpec()) { /* Special processing */ }
else { /* We have an integer */ }
}
Maybe you can read all values as std::string and then check if it's "#" or not (and if not - convert to int)
int value;
std::ifstream input("data");
std::string chunk;
while (std::getline(input, chunk, '#')) { // outer loop: split the input at each '#'
    std::istringstream ss(chunk);
    while (ss >> value) {                 // inner loop: split the chunk at whitespace
        // process value ...
    }
}
Here the first while loop splits the input at each '#', and the second while loop splits each chunk at whitespace and converts the pieces to integers.
Personally, if your separator is always going to be space regardless of what follows, I'd recommend you just take the input as string and parse from there. That way, you can take the string, see if it's a number or a # and whatnot.
I think you should re-examine your premise that "Calling atoi() on the tokens introduces big overhead."
There is no magic to std::cin >> val. Under the hood, it ends up calling (something very similar to) atoi.
If your tokens are huge, there might be some overhead to creating a std::string but as you say, the vast majority are numbers (and the rest are #'s) so they should mostly be short.
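If the tokens are read as strings, the conversion step also needs to tell a number apart from anything else, which atoi() cannot do because it returns 0 when no conversion is possible. A small sketch using std::strtol instead (to_int is a hypothetical helper name):

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Converts `token` to int; returns false unless the whole token is a valid int.
bool to_int(const std::string& token, int& out) {
    const char* begin = token.c_str();
    char* end = nullptr;
    errno = 0;
    long parsed = std::strtol(begin, &end, 10);
    if (end == begin || *end != '\0')
        return false;                       // not a number, or trailing junk
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX)
        return false;                       // out of range for int
    out = static_cast<int>(parsed);
    return true;
}
```

A token such as "#" then simply makes to_int return false, so the caller can treat it as the special marker without a separate check on the first character.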