Demonstration of noskipws in C++ - c++

I was trying out the noskipws manipulator in C++ and wrote the following code.
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
string first, middle, last;
istringstream("G B Shaw") >> first >> middle >> last;
cout << "Default behavior: First Name = " << first << ", Middle Name = " << middle << ", Last Name = " << last << '\n';
istringstream("G B Shaw") >> noskipws >> first >> middle >> last;
cout << "noskipws behavior: First Name = " << first << ", Middle Name = " << middle << ", Last Name = " << last << '\n';
}
I expect the following output:
Expected Output
Default behavior: First Name = G, Middle Name = B, Last Name = Shaw
noskipws behavior: First Name = G, Middle Name = , Last Name = B
Output
Default behavior: First Name = G, Middle Name = B, Last Name = Shaw
noskipws behavior: First Name = G, Middle Name = , Last Name = Shaw
I modified this code to make it work for chars like this and it works perfectly fine.
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
char first, middle, last;
istringstream("G B S") >> first >> middle >> last;
cout << "Default behavior: First Name = " << first << ", Middle Name = " << middle << ", Last Name = " << last << '\n';
istringstream("G B S") >> noskipws >> first >> middle >> last;
cout << "noskipws behavior: First Name = " << first << ", Middle Name = " << middle << ", Last Name = " << last << '\n';
}
I know how cin works and I wasn't able to figure out why it works this way in case of string.

std::istringstream("G B S") >> std::noskipws >> first >> middle >> last;
When an extraction is performed on strings, the string is first cleared and characters are inserted into its buffer.
21.4.8.9 Inserters and extractors
template<class charT, class traits, class Allocator>
basic_istream<charT, traits>&
operator>>(basic_istream<charT, traits>& is,
basic_string<charT, traits, Allocator>& str);
Effects: Behaves as a formatted input function (27.7.2.2.1). After constructing a sentry object, if the sentry converts to true, calls str.erase() and then extracts characters from is and appends them to str as if by calling str.append(1, c). [...]
The first read will extract the string "G" into first. For the second extraction, nothing will be extracted because the std::noskipws format flag is set, which disables the skipping of leading whitespace. The string is cleared, and the extraction then fails because the next character is whitespace and no characters were put in. Here is the continuation of the above clause:
21.4.8.9 Inserters and extractors (Cont.)
[...] Characters are extracted and appended until any of the following occurs:
n characters are stored;
end-of-file occurs on the input sequence;
isspace(c, is.getloc()) is true for the next available input
character c.
When the stream determines a failed extraction the std::ios_base::failbit is set in the stream state indicating an error.
From this point on any and all attempts at I/O will fail unless the stream state is cleared. The extractor becomes inoperable and it will not run given a stream state not cleared of all its errors. This means that the extraction into last doesn't do anything and it retains the value it had at the previous extraction (the one without std::noskipws) because the stream did not clear the string.
As for the reason why using char works: Characters have no formatting requirements in C or C++. Any and all characters can be extracted into an object of type char, which is why you're seeing the correct output despite std::noskipws being set:
27.7.2.2.3/1 [istream::extractors]
template<class charT, class traits>
basic_istream<charT, traits>& operator>>(basic_istream<charT, traits>& in,
charT& c);
Effects: Behaves like a formatted input member (as described in 27.7.2.2.1) of in. After a sentry object is constructed a character is extracted from in, if one is available, and stored in c. Otherwise, the function calls in.setstate(failbit).
The extractor stores a character into its operand if one is available. It doesn't treat whitespace as a delimiter: a space is extracted just like any other character. Only when no character is available (end of input) does it set failbit.

The basic algorithm for >> of a string is:
1) skip whitespace
2) read and extract until next whitespace
If you use noskipws, then the first step is skipped.
After the first read, you are positioned on a whitespace character, so the next (and all following) reads will stop immediately, extracting nothing.
For more information you can see this.
From cplusplus.com:
many extraction operations consider the whitespaces themselves as the terminating character, therefore, with the skipws flag disabled, some extraction operations may extract no characters at all from the stream.
So remove noskipws when extracting into strings.

The reason is that in the second example you are not reading into the last variable at all; instead, you are printing its old value.
std::string first, middle, last;
std::istringstream iss("G B S"); // note: the last name is S here
iss >> first >> middle >> last;
std::cout << "Default behavior: First Name = " << first
<< ", Middle Name = " << middle << ", Last Name = " << last << '\n';
std::istringstream iss2("G B T"); // note: the last name is T here, not S
iss2 >> std::noskipws >> first >> middle >> last;
std::cout << "noskipws behavior: First Name = " << first
<< ", Middle Name = " << middle << ", Last Name = " << last << '\n';
Default behavior: First Name = G, Middle Name = B, Last Name = S
noskipws behavior: First Name = G, Middle Name = , Last Name = S
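This happens because after the first read (into first) the stream is positioned on whitespace. With noskipws set, the read into middle extracts nothing and sets failbit, so the read into last never takes place and last keeps its old value.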

Related

When parsing a space-delimited string, is there any advantage to using getline over stringstream::operator>>?

#include <iostream>
#include <sstream>
#include <string>
int main()
{
std::string s = "my name is joe";
std::stringstream ss{s};
std::string temp;
while(std::getline(ss, temp, ' '))
{
std::cout << temp.size() << " " << temp << std::endl;
}
//----------------------------//
ss = std::stringstream{s};
while(ss >> temp)
{
std::cout << temp.size() << " " << temp << std::endl;
}
}
I've always used the former, but I'm wondering if there's any advantage to using the latter. I've typically used the former because I feel that if someone were to change the string to a comma-delimited string, all I would need to do is put in a new delimiter, whereas operator>> would read in the commas. But for space delimiting, it seems there is no difference.
std::getline() and operator>> are intended for different purposes. It is not a matter of which one is more advantageous than the other. Use the one that is better suited for the task at hand.
operator>> is for formatted input. It reads in and parses many different data types, including strings. If there is no error state on the input stream, it skips leading whitespace (unless the skipws flag on the input stream is disabled, such as with the std::noskipws manipulator), and then it reads and parses characters until it encounters whitespace, a character that does not belong to the data type being parsed, or the end of the stream.
std::getline() is for unformatted input of strings only. If there is no error state on the input stream, it does not skip leading whitespace, and then it reads characters until it encounters the specified delimiter (or '\n' if not specified), or the end of the stream.

About c++ input behavior

#include <iostream>
using namespace std;
const int ArrSize = 400;
int main()
{
char arr1[ArrSize];
char arr2[ArrSize];
char arr3[ArrSize];
cout << "enter the first string ";
cin >> arr1;
cout << "enter the second string ";
cin.get(arr2, ArrSize);
cout << "enter the thrid string ";
cin>>arr3;
cout << endl << endl;
cout << "first string is: " << arr1 << "\n";
cout << "second string is: " << arr2 << "\n";
cout << "thrid string is: " << arr3 << "\n";
return 0;
}
execution result is
input :
"abc\n"
output :
first string is: abc
second string is:
thrid string is:(strange characters)
Can you explain why the second read (cin.get) didn't get any input?
I expected that cin would read the leading whitespace from the stream buffer, ignore it, and then read a new string.
Let's start by adjusting the program to check for errors.
#include <iostream>
using namespace std;
const int ArrSize = 400;
int main()
{
char arr1[ArrSize];
char arr2[ArrSize];
char arr3[ArrSize];
cout << "enter the first string ";
if (!(cin >> arr1))
{
cout << "Failed cin >> arr1\n";
}
cout << "enter the second string ";
if (!cin.get(arr2, ArrSize))
{
cout << "Failed cin.get(arr2, ArrSize)\n";
}
cout << "enter the third string ";
if (!(cin>>arr3))
{
cout << "Failed cin >> arr3\n";
}
cout << endl << endl;
cout << "first string is: " << arr1 << "\n";
cout << "second string is: " << arr2 << "\n";
cout << "third string is: " << arr3 << "\n";
return 0;
}
The results should be something like
enter the first string abc
enter the second string Failed cin.get(arr2, ArrSize)
enter the third string Failed cin >> arr3
first string is: abc
second string is:
third string is: <garbage here>
We can see that the second and third reads failed. Why is that? To find out, we need to do a little reading. Here's some high-quality documentation for std::istream::get
The relevant overload is number 3, but 3 just calls number 4 with the delimiter set to '\n' and 4 says two important things,
Characters are extracted and stored until any of the following occurs:
count-1 characters have been stored
end of file condition occurs in the input sequence (setstate(eofbit) is called)
the next available input character c equals delim, as determined by Traits::eq(c, delim). This character is not extracted (unlike basic_istream::getline())
If no characters were extracted, calls setstate(failbit). In any case, if count>0, a null character (CharT()) is stored in the next successive location of the array.
So if you only get a newline, delim in this case, the output string arr2 is null terminated and the stream is placed into fail state because no characters were extracted from the stream, making the stream unreadable until the failure is acknowledged by clearing it. This is what we are seeing: an empty string and fail bit.
Why is the string empty? Why didn't it prompt for input? Because cin >> arr1 reads one whitespace-delimited token from the stream. It will ignore all whitespace up to the start of the token, but it leaves the whitespace after the token in the stream.
If you type abc and hit enter, "abc\n" goes into the stream. cin >> arr1 reads "abc" into arr1. The "\n" stays in the stream where cin.get(arr2, ArrSize) finds it. The get exit condition is immediately satisfied by the "\n", so get stops and leaves the "\n" in the stream. No characters were extracted. Fail bit is set and arr2 is null terminated.
cin>>arr3 subsequently fails because you can't read from a failed stream. Nothing is placed in arr3, so when arr3 is printed, it is still uninitialized and unterminated, and << keeps printing until it happens to find a terminator. Those are the garbage characters, though technically anything can happen.
The question does not specify what is to be done with data left over after cin >> arr1. Common solutions are to remove everything up to and including the newline character from the stream with
cin.ignore(numeric_limits<streamsize>::max(), '\n');
but if you want to use any characters left on the line for arr2, you'll have to be trickier. For example, always read lines, build an istringstream out of the line, and then parse the istringstream as is done in option 2 of this answer.
Side note: Reading into character arrays with >> is always risky because it will keep reading until whitespace is found. If the stream supplies as many characters as the array holds (or more) without any whitespace, >> writes past the end of the array, and sucks to be you. get knows to stop before the array overflows. >> doesn't. On the other hand, get will read until it finds the end of the line, not just a single whitespace-delimited token.
>> into a std::string will do the right thing and resize the string to fit the input. Generally prefer std::string to char arrays. And if you are using std::string prefer std::getline to get or istream's getline.

Capitalizing letters in C++

I have an assignment where the user enters a student name in the format ( last name, first name). Can you help me figure out how to capitalize the first letter for both the first name and the last name?
I was using this to turn the user input into an array, so I could have the first letter capitalized, but when I did this, I had trouble getting it to work outside of the for loop.
for (int x = 0; x < fName.length(); x++)
{
fName[x] = tolower(fName[x]);
}
fName[0] = toupper(fName[0]);
I used your code and just added some parsing around it. You really are very close.
I can't help myself. For user input, I always use getline() followed by a stringstream to parse the words from the line. I find it avoids a lot of edge cases that get me into quicksand.
When getline() reads a line successfully, the stream it returns evaluates to true; once input fails, it evaluates to false. If the user enters Ctrl-D, it will return false. Ctrl-D is basically an EOF (end-of-file) signal, which works well in this case (as long as you are not trying to enter the Ctrl-D from inside your debugger; mine does not like that).
Note that I am using std::string in place of an array. std::string can be treated like an array for subscripting, but it prints nicely and has other functions that make it better for processing character strings.
#include <iostream>
#include <string> // Allow you to use strings
#include <sstream>
#include <cctype> // tolower, toupper
int main(){
std::string input_line;
std::string fName;
std::string lName;
std::cout << "Please enter students as <lastname>, <firstname>\n"
"Press ctrl-D to exit\n";
while(std::getline(std::cin, input_line)){
std::istringstream ss(input_line);
ss >> lName;
// remove trailing comma. We could leave it in and all would work, but
// it just feels better to remove the comma and then add it back in
// on the output.
if(!lName.empty() && lName[lName.size() - 1] == ',') // guard against an empty line
lName = lName.substr(0, lName.size() - 1); // Substring without the comma
ss >> fName;
for (int x = 0; x < fName.length(); x++) // could start at x=1, but this works.
{
fName[x] = tolower(fName[x]); // make all chars lower case
}
fName[0] = toupper(fName[0]);
for (int x = 0; x < lName.length(); x++)
{
lName[x] = tolower(lName[x]);
}
lName[0] = toupper(lName[0]);
std::cout << "Student: " << lName << ", " << fName << std::endl;
}
}

Why does std::operator>>(istream&, char&) extract whitespace?

I was compiling the following program and I learned that the extractor for a char& proceeds to extract a character even if it is a whitespace character. I disabled the skipping of leading whitespace characters, expecting the subsequent read attempts to fail (because formatted extraction stops at whitespace), but was surprised when they succeeded.
#include <iostream>
#include <sstream>
int main()
{
std::istringstream iss("a b c");
char a, b, c;
iss >> std::noskipws;
if (iss >> a >> b >> c)
{
std::cout << "a = \"" << a
<< "\"\nb = \"" << b
<< "\"\nc = \"" << c << "\"\n";
}
}
Output:
a = "a"
b = " "
c = "b"
As you can see from the output, b was given the value of the space between "a" and "b"; and c was given the following character "b". I was expecting both b and c to not have a value at all since the extraction should fail because of the leading whitespace. What is the reason for this behavior?
In IOStreams, characters have virtually no formatting requirements. Any and all characters in the character sequence are valid candidates for an extraction. For the extractors that use the numeric facets, extraction is defined to stop at whitespace. However, the extractor for charT& works directly on the buffer, indiscriminately returning the next available character presumably by a call to rdbuf()->sbumpc().
Do not assume that this behavior extends to the extractor for pointers to characters; for them, extraction is explicitly defined to stop at whitespace.

c++ Reading text file into array of structs not working

I have been working on this for a while and can't fix it. I am very new to C++. So far I can get 10 things into my array but the output is not legible, it's just a bunch of numbers. I have read other posts with similar code but for some reason mine isn't working.
The input text file is 10 lines of fake data like this:
56790 "Comedy" 2012 "Simpsons" 18.99 1
56791 "Horror" 2003 "The Ring" 11.99 7
My code is here:
(My output is below my code)
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
struct DVD {
int barcode;
string type;
int releaseDate;
string name;
float purchaseprice;
int rentaltime;
void printsize();
};
int main () {
ifstream in("textfile.exe");
DVD c[10];
int i;
for (int i=0; i < 10; i++){
in >> c[i].barcode >> c[i].type >> c[i].releaseDate >>
c[i].name >> c[i].purchaseprice >> c[i].rentaltime;
}
for (int i=0;i< 10;i++) {
cout << c[i].barcode<<" ";
cout << c[i].type<<" ";
cout << c[i].releaseDate<<" ";
cout << c[i].name << " ";
cout << c[i].purchaseprice << " ";
cout << c[i].rentaltime << "\n";
}
return 0;
}
My output looks similar to garbage, but there are 10 lines of it like my array:
-876919876 -2144609536 -2.45e7 2046
A comment on what to study to modify my code would be appreciated.
As suggested by cmbasnett, ifstream in("textfile.exe") reads in an executable file. If you wish for the program to read in a text file, changing it to ifstream in("textfile.txt") should work.
You always need to check that your input is actually correct. Since it may fail prior to reading 10 lines, you should probably also keep a count of how many entries you could successfully read:
int i(0);
for (; i < 10
&& in >> c[i].barcode >> c[i].type >> c[i].releaseDate
>> c[i].name >> c[i].purchaseprice >> c[i].rentaltime; ++i) {
// ???
}
Your actual problem reading the second line is that your strings are quoted, but formatted reading of strings doesn't care about quotes. Instead, strings are terminated by whitespace: the formatted input for strings skips leading whitespace and then reads characters until another whitespace character is found. On your second line, it will read "The and then stop. The attempt to read the purchaseprice will fail because Ring" isn't a valid numeric value.
To deal with that problem you might want to make name a quoted_string and define input and output operators for it, e.g.:
struct quoted_string { std::string value; };
std::istream& operator>> (std::istream& in, quoted_string& string) {
std::istream::sentry cerberos(in); // skips leading whitespace, etc.
if (in && in.peek() == '"') {
std::getline(in.ignore(), string.value, '"');
}
else {
in.setstate(std::ios_base::failbit);
}
return in;
}
std::ostream& operator<< (std::ostream& out, quoted_string const& string) {
return out << '"' << string.value << '"';
}
(note that the code isn't tested, but I'm relatively confident that it might work).
Just to briefly explain how the input operator works:
The sentry is used to prepare the input operation:
It flushes the tie()d std::ostream (if any; normally there is none except for std::cin).
It skips leading whitespace (if any).
It checks if the stream is still not in failure mode (i.e., neither std::ios_base::failbit nor std::ios_base::badbit are set).
To see if the input starts with a quote, in.peek() is used: this function returns an int indicating either that the operation failed (i.e., it returns std::char_traits<char>::eof()) or the next character in the stream. The code just checks if it returns " as it is a failure if the stream returns an error or any other character is present.
If there is a quote, the quote is skipped using in.ignore(), which by default ignores just one character (it can ignore more characters and take a character at which to stop).
After skipping the leading quote, std::getline() is used to read from in into string.value until another quote is found. The last parameter defaults to '\n', but for reading a quoted string '"' is the correct value to use. The terminating character is, conveniently, not stored.