How does char extraction differ from string extraction? - c++

When whitespace skipping is disabled, extraction into a char and into a std::string behaves differently. It seems the only way to extract an entire string (including whitespace characters) is to extract chars with noskipws. The same approach does not work with a std::string, because extraction stops at the first space.
std::string test = "a b c";
char c;
std::istringstream iss(test);
iss.unsetf(std::ios_base::skipws);
while (iss >> c)
    std::cout << c;
This outputs a b c, but if c is changed to a std::string, it outputs only a.

The >> operator for a string extracts words, and stops at the
first white space it sees. If it doesn't skip initial white
space, then it stops immediately, and returns an empty string.
You don't say how you want the string to be delimited. To read
until the end of line, just use std::getline. To read until
the end of file, you can use something like:
std::ostringstream collector;
collector << iss.rdbuf();
std::string results = collector.str();
It's not the most efficient, but if the file is small, it will
do.
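To make the two options above concrete, here is a minimal sketch. The helper names read_line and read_rest are hypothetical, and read_rest uses std::istreambuf_iterator rather than the rdbuf() copy shown in the answer; both approaches keep whitespace.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Reads up to the next '\n', keeping interior spaces.
// The newline itself is consumed but not stored.
std::string read_line(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}

// Reads everything left in the stream, whitespace included.
// istreambuf_iterator works on the raw buffer, so skipws has no effect.
std::string read_rest(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With an std::istringstream holding "a b c", read_rest returns the whole text including the spaces, with no need to unset skipws.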

Related

Switching from formatted to unformatted input in C++

I have an input text file. The first line has two int numbers a and b, and the second line is a string. I want to use formatted input to do file >> a >> b, and then unformatted input to get the characters of the string one by one. In between the two steps, I need to skip over the '\n' character at the end of the first line. I used
while(file.get()<=' ' && !file.eof()); // skip all unprintable chars
if(!file.eof()) file.unget(); // keep the eof sign once triggered
to make the input format more flexible. The user can now separate the numbers a and b from the string using an arbitrary number of empty lines '\n', tabs '\t', and/or spaces ' ', which is the same freedom they have when separating the numbers a and b. It even works when reading, on Linux, a text file copied from Windows, where every end of line is "\r\n".
Is there an ifstream function that does the same thing (skip all chars <=' ' until the next printable char or EOF is reached)? The ignore function does not seem to do that.
Yes, there is: the std::ws manipulator. It skips whitespace characters until a non-whitespace character is found or the end of the stream is reached. It is similar to using a whitespace character in a scanf format string.
In fact, formatted input uses it before it actually starts parsing characters.
You can use it like this:
int x;
std::string str;
std::cin >> x >> std::ws;
std::getline(std::cin, str);
//...
//std::vector<int> vec;
for(auto& e: vec) {
std::cin >> e;
}
std::getline(std::cin >> std::ws, str);
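Applied to the situation in the question (two ints on the first line, then a line of text), a minimal sketch could look like this. The function name read_header_and_text and its parameters are assumptions made for illustration; it takes any std::istream, so it works with an ifstream as well as with the istringstream used for testing.

```cpp
#include <sstream>
#include <string>

// Reads two ints with formatted input, then the next non-blank line.
// Returns false if any of the steps fails.
bool read_header_and_text(std::istream& in, int& a, int& b, std::string& text) {
    if (!(in >> a >> b)) return false;
    // std::ws discards spaces, tabs, '\r' and '\n' up to the next
    // non-whitespace character, then getline reads the rest of that line.
    return static_cast<bool>(std::getline(in >> std::ws, text));
}
```

Note that std::ws only removes leading whitespace. If the text line itself ends in "\r\n", the '\r' stays at the end of text and has to be trimmed separately.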

Using getline on a string

How can I use getline() to read words from a line that I have stored in a string?
For example:
char ch[100],w[20];
cin.getline(ch,100);
Now can I use getline to count the number of words in ch? I don't want to use a delimiter to directly count words from ch. I want to read words from ch and store them in w.
I have tried using ch in getline as a parameter.
getline is implemented in the standard as either a stream method, or a string method which takes a stream: http://en.cppreference.com/w/cpp/string/basic_string/getline
There is no standard implementation of getline which does not require a stream.
That said, you can use ch to seed an istringstream and count the words in the string. However, basic_istream<CharT, Traits>& getline(basic_istream<CharT, Traits>&& input, basic_string<CharT, Traits, Allocator>& str) assumes a newline as the delimiter, so it is not what you want for counting words. Similarly, a getline that takes a delimiter will only break on one specific character. Instead you could use basic_istream& basic_istream::operator>>, which splits words on all whitespace characters:
istringstream foo(ch);
for(auto i = 1; foo >> w; ++i) {
cout << i << ": " << w << endl;
}
Just as a note here, defining char w[20] is just begging for an out of bounds write. At a minimum you need to define that such that if ch is filled with non-whitespace characters, w can contain it all. You could do that by defining char w[100].
But if someone later increased the size of ch without changing the size of w, you'd be in trouble again. In C++17 you could define w as char w[std::size(ch)]; prior to that you could use char w[sizeof(ch) / sizeof(ch[0])].
But your best option is probably to just make both w and ch strings so they can dynamically resize to accommodate user input.
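A minimal sketch of that last suggestion, using std::string throughout so nothing can overflow. The function name split_words is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a line into words on any run of whitespace.
// Each word is a std::string, so its length is not limited by a fixed buffer.
std::vector<std::string> split_words(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> words;
    std::string w;
    while (in >> w) words.push_back(w);
    return words;
}
```

The word count is then simply split_words(line).size(), where line was read with std::getline(std::cin, line).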

C++ getline() behaves strangely when reading stringstream containing \0 [duplicate]

This question already has answers here:
How do you construct a std::string with an embedded null?
I'm trying to read a large buffer from a socket which uses \0 to delimit pieces of data and \n to delimit lines.
I thought getline() would be an easy way to get each line but it's behaving strangely.
I'm using \n as the delimiter in getline().
string line;
string test1 = "aaa,123\nbbb\nccc,456\n";
stringstream ss1(test1);
while(std::getline(ss1, line, '\n')) {
cout << line << endl;
}
// outputs:
// aaa,123
// bbb
// ccc,456
string test2 = "aaa\0123\0\nbbb\0\nccc\0456\0\n";
stringstream ss2(test2);
while(std::getline(ss2, line, '\n')) {
cout << line << endl;
}
// outputs:
// aaa
// 3
Why is this happening in test2? Where is the 3 coming from? Must I remove the \0 to make this work? Is there an easier/better way to mark strings in my buffer when I do a socket recv()?
\0 is a special symbol. It marks where the string ends.
For example, if you type in "a string", the compiler automatically adds a \0 on the end, which signifies the end of the string. However, it is legal to have a \0 in the middle of the string, it just means that everything after it is ignored.
So basically, any operation you do on the string, not just the getline, will treat the string as "aaa", ignoring everything after the first \0 that is found. But...
As @Fred Larson points out
Oh, I see where the 3 comes from. The first \0 isn't a null, it's the start of \012, which is an octal escape for a line feed ('\n'). Then the 3 follows.
So actually, the string is being treated as "aaa\n3". Which is why you get the output you do.
Edit: And thanks to Galik, I will also add that these rules I mention might only apply to a string literal / c-string. It may be a different case with std::strings, in which the length of the string is known ahead of time.
\0 is the standard string terminator symbol. As such, you may either read character by character or avoid using \0 as a delimiter.
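If the \0 bytes have to stay in the data, one option is to construct the std::string from a pointer and an explicit length, so that it does not stop at the first \0. The sketch below assumes the buffer and its length come from somewhere like a recv() call; split_lines is a hypothetical name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Splits a raw buffer into lines on '\n', keeping embedded '\0' bytes.
// The (pointer, length) constructor copies exactly len bytes,
// unlike the const char* constructor, which stops at the first '\0'.
std::vector<std::string> split_lines(const char* data, std::size_t len) {
    std::string buffer(data, len);
    std::istringstream in(buffer);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line, '\n')) lines.push_back(line);
    return lines;
}
```

When writing such data as a literal for testing, split the literal after each \0 that is followed by a digit, for example "aaa\0" "123", so that the compiler does not read \012 as a single octal escape.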

How can I get a word from string

I have, for example, "I am working as a nurse."
How can I, or which function should I use to, get the text from letter number 1 up to a space or up to letter number 11?
So I should get " am working ".
To read a word from a stream use operator>> on a string
std::stringstream stream("I am working as a nurse.");
std::string word;
stream >> word; // word now holds 'I'
stream >> word; // word now holds 'am'
stream >> word; // word now holds 'working'
// .. etc
It is not totally clear what you want, but from your example it seems like you want the substring that starts at character 1 and ends on the character 11 places later (that's 12 characters total). That means you want string::substr:
std::string str("I am working as a nurse");
std::string sub = str.substr(1,12);
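char source[] = "I am working as a nurse.";
char dest[11];
strncpy(dest, &source[1], 10); // copies " am workin"; needs <cstring>
dest[10] = '\0'; // strncpy does not add the terminator in this case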

Is there a better way to parse a line of text like this?

I have a text file with lines of text that consist of a string, then another string in double quotes, followed by up to 4 integers,
ex:
clear "clear water.png" 5 7
wet "wet water.png" 9 5 33 17
soft "soft rain falling.png"
The only way I see it is:
read until space is found
set string to wet
read until double quote
read until second double quote
set second string to wet water.png
while not end of line
read until space
put string through string stream
push resulting integer into vector of int
Is there a better way to do this?
Thanks
This is the sort of task for which scanf and company truly shine.
char first_string[33], second_string[129];
int first_int, second_int, third_int, fourth_int;
sscanf(input_string,
       "%32s%*[^\"]%*c%128[^\"]%*c %d %d %d %d",
       first_string,
       second_string,
       &first_int,
       &second_int,
       &third_int,
       &fourth_int);
You probably want to do that in an if statement so you can test the return value, to tell you how many of those fields converted (e.g., so you know how many integers you read at the end).
Edit: perhaps some explanation would be helpful. Let's dissect that:
%32s reads a string to the first white-space (or 32 characters, whichever comes first).
%*[^\"] ignores input up to the first ".
%*c ignores one more byte of input (the quote itself)
%128[^\"] reads the string in the quote (i.e., up to the next quote character).
%*c Ignores the closing quote
%d Reads an int (which we've done four times).
The space before each %d is really unnecessary -- it'll skip whitespace, but without the space, %d will skip leading whitespace anyway. I've included them purely to make it a little more readable.
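As a sketch of the return-value check mentioned above: sscanf returns the number of fields it assigned (the %* conversions are not counted), so subtracting the two strings gives the number of integers read. The ParsedLine struct and parse_line function are hypothetical names introduced for this illustration.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical holder for one parsed line.
struct ParsedLine {
    std::string name;
    std::string file;
    std::vector<int> values;
};

// Parses: word "quoted string" followed by 0 to 4 integers.
// Returns false unless both strings were converted.
bool parse_line(const char* input, ParsedLine& out) {
    char first[33], second[129];
    int v[4] = {0, 0, 0, 0};
    int n = std::sscanf(input, "%32s%*[^\"]%*c%128[^\"]%*c %d %d %d %d",
                        first, second, &v[0], &v[1], &v[2], &v[3]);
    if (n < 2) return false;            // also covers EOF (a negative value)
    out.name = first;
    out.file = second;
    out.values.assign(v, v + (n - 2));  // keep only the integers actually read
    return true;
}
```

For the line soft "soft rain falling.png" this leaves values empty, and for the wet line it holds all four integers.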
Ugly, with no error-checking, but no dependencies on any non-standard libraries:
string s;
while(getline(fin, s))
{
    string word, quoted_string;
    vector<int> vec;
    istringstream line(s);
    line >> word;
    line.ignore(numeric_limits<streamsize>::max(), '"');
    getline(line, quoted_string, '"');
    int n;
    while(line >> n) vec.push_back(n);
    // do something with word, quoted_string and vec...
}
Depending on the restrictions on the input strings, you could try splitting on the double quote and then splitting on spaces.
Yes
Use getline to read one line at a time. Parse the lines using a regular expression library.
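The answer does not name a library. One possible sketch uses std::regex from the standard library (C++11); the pattern, the Entry struct and the parse_with_regex function are assumptions made for illustration, not something given in the question.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one parsed line.
struct Entry {
    std::string name;
    std::string file;
    std::vector<int> values;
};

// Matches: word, "quoted string", then up to four whitespace-separated integers.
bool parse_with_regex(const std::string& line, Entry& out) {
    static const std::regex pattern(
        R"re(\s*(\S+)\s+"([^"]*)"((?:\s+-?\d+){0,4})\s*)re");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return false;
    out.name = m[1].str();
    out.file = m[2].str();
    out.values.clear();
    // Group 3 holds all the integers as one string; split it with a stream.
    std::istringstream nums(m[3].str());
    for (int n; nums >> n;) out.values.push_back(n);
    return true;
}
```

Each line would come from std::getline on the input file. Unlike the stream-based version, a line that does not fit the expected shape is rejected as a whole.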