Limiting input size with std::setw in std::cin - c++

Let's say I have sample code:
std::string s;
std::cin >> std::setw(4) >> s;
std::cout << s;
Now for input abcdef the result will be abcd, and for abcd it will be abcd too. The question is: how can I check whether the string was cut off in the middle because of the limit, or whether the result is the complete input? I need to know whether the input fit or some data was skipped.

Although I know that the stream's width is considered when reading into a char* I wasn't aware that it is also considered when reading into a std::string. Assuming it is, reading would stop under three conditions:
The stream is completely read in which case eof() is set.
The next character is a space.
The requested number of characters has been read.
That is, you can check in.eof() and std::isspace(in.peek()). Well, if there is a funny std::ctype<char> facet used by the stream you'd really need to use
std::isspace(std::char_traits<char>::to_char_type(in.peek()),
in.getloc());
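The checks described above could be combined roughly as follows. This is only a sketch: read_limited and its parameters are hypothetical names, not part of the standard library, and it assumes a token counts as cut off exactly when a non-space character follows it directly.

```cpp
#include <iomanip>
#include <istream>
#include <locale>
#include <string>

// Hypothetical helper (not from the original posts): reads one token of at
// most `limit` characters and reports through `truncated` whether the token
// was cut off by the limit. Returns false if no token could be read.
bool read_limited(std::istream& in, std::string& token, int limit, bool& truncated) {
    typedef std::istream::traits_type traits;
    truncated = false;
    if (!(in >> std::setw(limit) >> token)) {
        return false;                 // nothing could be extracted
    }
    if (in.eof()) {
        return true;                  // input ended: the token is complete
    }
    const traits::int_type next = in.peek();
    if (traits::eq_int_type(next, traits::eof())) {
        return true;                  // nothing follows the token
    }
    // The token was cut short only if a non-space character follows it.
    truncated = !std::isspace(traits::to_char_type(next), in.getloc());
    return true;
}
```

Checking eof() before calling peek() matters here, because peek() on a stream that already has eofbit set would also set failbit.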

Related

How exactly does the extract>> operator work in C++

I am a computer science student, and so do not have much experience with the C++ language (considering it is my first semester using this language), or with coding for that matter.
I was given an assignment to read integers from a text file in the simple form of:
19 3 -2 9 14 4
5 -9 -10 3
.
.
.
This sent me off on a journey to understand I/O operators better, since I am required to do certain things with this stream (duh.)
I was looking everywhere and could not find a simple explanation of how the extraction operator>> works internally. Let me clarify my question:
I know that the extraction operator>> would extract one continuous element until it hits a space, tab, or newline. What I am trying to figure out is where the pointer(?) or read-location(?) will be AFTER it extracts an element. Will it be on the last char of the element just removed, or was that removed and therefore gone? Will it be on the space/tab/'\n' character itself? Perhaps at the beginning of the next element to extract?
I hope I was clear enough. I lack all the appropriate jargon to describe my problem clearer.
Here is why I need to know this: (in case anyone is wondering...)
One of the requirements is to sum all integers in each line separately.
I have created a loop to extract all integers one-by-one until it reaches the end of the file. However, I soon learned that the extract>> operator ignores space/tab/newline. What I want to try is to extract>> an element, and then use inputFile.get() to get the space/tab/newline. Then, if it's a newline, do what I gotta do.
This will only work if the stream pointer will be in a good position to extract the space/tab/newline after the last extraction>>.
In my previous question, I tried to solve it using getline() and an sstring.
SOLUTION:
For the sake of answering my specific question, of how operator>> works, I had to accept Ben Voigt's answer as the best one.
I have used the other solutions suggested here (using an sstring for each line) and they did work! (you can see it in my previous question's link) However, I implemented another solution using Ben's answer and it also worked:
.
.
.
if(readFile.is_open()) {
    while (readFile >> newInput) {
        char isNewLine = readFile.get(); //get() the next char after extraction
        if(isNewLine == '\n') //This is just a test!
            cout << isNewLine; //If it's a newline, feed a newline.
        else
            cout << "X" << isNewLine; //Else, show X & feed a space or tab
        lineSum += newInput;
        allSum += newInput;
        intCounter++;
        minInt = min(minInt, newInput);
        maxInt = max(maxInt, newInput);
        if(isNewLine == '\n') {
            lineCounter++;
            statFile << "The sum of line " << lineCounter
                     << " is: " << lineSum << endl;
            lineSum = 0;
        }
    }
.
.
.
Regardless of my numerical values, the form is correct! Both spaces and '\n's were caught.
Thank you Ben Voigt :)
Nonetheless, this solution is very format dependent and is very fragile. If any of the lines has anything else before '\n' (like space or tab), the code will miss the newline char. Therefore, the other solution, using getline() and sstrings, is much more reliable.
After extraction, the stream pointer will be placed on the whitespace that caused extraction to terminate (or on another character that cannot be part of the value; the failbit is set only if no value could be parsed at all).
This doesn't really matter though, since you aren't responsible for skipping over that whitespace. The next extraction will ignore whitespaces until it finds valid data.
In summary:
leading whitespace is ignored
trailing whitespace is left in the stream
There's also the noskipws modifier which can be used to change the default behavior.
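A small sketch can make the summary above concrete. The helper names next_after_int and reads_two_ints are made up for illustration, and a std::istringstream stands in for the file stream.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: extracts one int and returns the character that is
// left at the front of the stream afterwards (peek() does not consume it).
char next_after_int(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return static_cast<char>(in.peek());
}

// Hypothetical helper: tries to extract two ints in a row, optionally with
// leading-whitespace skipping switched off via std::noskipws.
bool reads_two_ints(const std::string& text, bool skip_whitespace) {
    std::istringstream in(text);
    if (!skip_whitespace) {
        in >> std::noskipws;
    }
    int first = 0;
    int second = 0;
    return static_cast<bool>(in >> first >> second);
}
```

For the input "19 3", next_after_int returns the space, which shows that the separator is still in the stream. reads_two_ints succeeds with the default behavior and fails with noskipws, because the second extraction then starts on that space.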
The operator>> leaves the current position in the file one
character beyond the last character extracted (which may be at
end of file). Which doesn't necessarily help with your problem;
there can be spaces or tabs after the last value in a line. You
could skip forward reading each character and checking whether
it is a white space other than '\n', but a far more idiomatic
way of reading line oriented input is to use std::getline to
read the line, then initialize an std::istringstream to
extract the integers from the line:
std::string line;
while ( std::getline( source, line ) ) {
std::istringstream values( line );
// ...
}
This also ensures that in case of a format error in the line,
the error state of the main input is unaffected, and you can
continue with the next line.
According to cppreference.com the standard operator>> delegates the work to std::num_get::get. This takes an input iterator. One of the properties of an input iterator is that you can dereference it multiple times without advancing it. Thus when a non-numeric character is detected, the iterator will be left pointing to that character.
In general, the behavior of an istream is not set in stone. There exist multiple flags to change how any istream behaves, which you can read about here. In general, you should not really care where the internal pointer is; that's why you are using a stream in the first place. Otherwise you'd just dump the whole file into a string or equivalent and manually inspect it.
Anyway, going back to your problem, a possible approach is to use the getline method provided by istream to extract a string. From the string, you can either manually read it, or convert it into a stringstream and extract tokens from there.
Example:
std::ifstream ifs("myFile");
std::string str;
while ( std::getline(ifs, str) ) {
std::stringstream ss( str );
double sum = 0.0, value;
while ( ss >> value ) sum += value;
// Process sum
}

What's the difference between read, readsome, get, and getline?

What is the difference between these functions? When I use them they all do the same thing. For example, all four calls return "hello":
#include <iostream>
#include <sstream>
int main()
{
    std::stringstream ss("hello");
    char x[10] = {0};
    ss.read(x, sizeof(x)); // #1
    std::cout << x << std::endl;
    ss.clear();
    ss.seekg(0, ss.beg);
    ss.readsome(x, sizeof(x)); // #2
    std::cout << x << std::endl;
    ss.clear();
    ss.seekg(0, ss.beg);
    ss.get(x, sizeof(x)); // #3
    std::cout << x;
    ss.clear();
    ss.seekg(0, ss.beg);
    ss.getline(x, sizeof(x)); // #4
    std::cout << x << std::endl;
}
get and getline are quite similar, when get is called with parameters ( char_type* s, std::streamsize count ). However, get reads from the stream until a delimiter is found, and then leaves it there. getline by comparison will pull the delimiter off the stream, but then drop it. It won't be added to the buffer it fills.
get looks for \n, and when a specific number of characters is provided in an argument (say, count) it will read up to count - 1 characters before stopping. read will pull in all count of them.
You could envisage read as being an appropriate action on a binary datasource, reading a specific number of bytes. get would be more appropriate on a text stream, when you're reading into a string that you'd like null-terminated, and where things like newlines have useful syntactic meanings splitting up text.
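To see the count versus count - 1 difference described above, one could compare gcount() after each call. This is a sketch with made-up helper names (chars_from_read, chars_from_get), and it assumes count is at least 1.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: how many characters read() stores when asked for `count`.
std::streamsize chars_from_read(const std::string& text, std::streamsize count) {
    std::istringstream in(text);
    std::string buffer(static_cast<std::size_t>(count), '\0');
    in.read(&buffer[0], count);   // raw bytes, newlines included, no terminator
    return in.gcount();
}

// Hypothetical helper: how many characters get() stores for the same buffer size.
std::streamsize chars_from_get(const std::string& text, std::streamsize count) {
    std::istringstream in(text);
    std::string buffer(static_cast<std::size_t>(count), '\0');
    in.get(&buffer[0], count);    // stops at '\n' and reserves one slot for '\0'
    return in.gcount();
}
```

With the text "hello world" and a count of 4, read() stores 4 characters while get() stores 3 plus the terminator. With "hi\nthere" and a count of 8, read() stores all 8 characters but get() stops after "hi".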
readsome only returns characters that are immediately available in the underlying buffer, something which is a bit nebulous and implementation specific. This probably includes characters returned to the stream using putback, for example. The fact that you can't see the difference between read and readsome just shows that the two might share an implementation on the particular stream type and library you are using.
I've observed the difference between read() and readsome() on a flash filing system.
The underlying stream reads 8k blocks and the read method will go for the next block to satisfy the caller, whereas the readsome method is allowed to return less than the request in order to avoid spending time fetching the next block.
The main difference between get() and getline() is that get() leaves the newline character in the input stream, making it the first character seen by the next input operation, whereas getline() extracts and discards the newline character from the input stream.
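The delimiter handling can be observed by peeking at the stream after each call. A minimal sketch, with hypothetical helper names and a fixed 64-byte buffer chosen only for the example:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: reads one line with get() and returns the next
// character still waiting in the stream.
char next_after_get(const std::string& text) {
    std::istringstream in(text);
    char line[64] = {};
    in.get(line, sizeof line);      // stops before '\n' and leaves it in place
    return static_cast<char>(in.peek());
}

// Hypothetical helper: same as above, but using getline().
char next_after_getline(const std::string& text) {
    std::istringstream in(text);
    char line[64] = {};
    in.getline(line, sizeof line);  // extracts '\n' and discards it
    return static_cast<char>(in.peek());
}
```

For the input "hi\nyo", the first helper returns '\n' and the second returns 'y'.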

C++ fstream: how to know size of string when reading?

...as someone may remember, I'm still stuck on C++ strings. Ok, I can write a string to a file using a fstream as follows
outStream.write((char *) s.c_str(), s.size());
When I want to read that string, I can do
inStream.read((char *) s.c_str(), s.size());
Everything works as expected. The problem is: if I change the length of my string after writing it to a file and before reading it again, printing that string won't bring me back my original string but a shorter/longer one. So: if I have to store many strings on a file, how can I know their size when reading it back?
Thanks a lot!
You shouldn’t be using the unformatted I/O functions (read() and write()) if you just want to write ordinary human-readable string data. Generally you only use those functions when you need to read and write compact binary data, which for a beginner is probably unnecessary. You can write ordinary lines of text instead:
std::string text = "This is some test data.";
{
std::ofstream file("data.txt");
file << text << '\n';
}
Then read them back with getline():
{
std::ifstream file("data.txt");
std::string line;
std::getline(file, line);
// line == text
}
You can also use the regular formatting operator >> to read, but when applied to string, it reads tokens (nonwhitespace characters separated by whitespace), not whole lines:
{
std::ifstream file("data.txt");
std::vector<std::string> words;
std::string word;
while (file >> word) {
words.push_back(word);
}
// words == {"This", "is", "some", "test", "data."}
}
All of the formatted I/O functions automatically handle memory management for you, so there is no need to worry about the length of your strings.
Although your writing solution is more or less acceptable, your reading solution is fundamentally flawed: it uses the internal storage of your old string as a character buffer for your new string, which is very, very bad (to put it mildly).
You should switch to a formatted way of reading and writing the streams, like this:
Writing:
outStream << s;
Reading:
inStream >> s;
This way you would not need to bother determining the lengths of your strings at all.
This code is different in that it stops at whitespace characters; you can use getline if you want to stop only at \n characters.
You can write the strings and write an additional 0 (null terminator) to the file. Then it will be easy to separate strings later. Also, you might want to read and write lines
outfile << string1 << endl;
getline(infile, string2, '\n');
If you want to use unformatted I/O your only real options are to either use a fixed size or to prepend the size somehow so you know how many characters to read. Otherwise, when using formatted I/O it somewhat depends on what your strings contain: if they can contain all viable characters, you would need to implement some sort of quoting mechanism. In simple cases, where strings consist e.g. of space-free sequence, you can just use formatted I/O and be sure to write a space after each string. If your strings don't contain some character useful as a quote, it is relatively easy to process quotes:
// Manipulator: consumes the opening quote, or sets failbit if it is missing.
std::istream& quote(std::istream& in) {
    char c;
    if (in >> c && c != '"') {
        in.setstate(std::ios_base::failbit);
    }
    return in;
}
out << '"' << string << '"';
std::getline(in >> std::ws >> quote, string, '"');
Obviously, you might want to bundle this functionality into a class.

Convert string to int and get the number of characters consumed in C++ with stringstream

I am new to C++ (coming from a C# background) and am trying to learn how to convert a string to an int.
I got it working by using a stringstream and outputting it into a double, like so:
const char* inputIndex = "5+2";
double number = 0;
stringstream ss(inputIndex);
ss >> number;
// number = 5
This works great. The problem I'm having is that the strings I'm parsing start with a number, but may have other, non-digit characters after the digits (e.g. "5+2", "9-(3+2)", etc). The stringstream parses the digits at the beginning and stops when it encounters a non-digit, like I need it to.
The problem comes when I want to know how many characters were used to parse into the number. For example, if I parse 25+2, I want to know that two characters were used to parse 25, so that I can advance the string pointer.
So far, I got it working by clearing the stringstream, inputting the parsed number back into it, and reading the length of the resulting string:
ss.str("");
ss << number;
inputIndex += ss.str().length();
While this does work, it seems really hacky to me (though that might just be because I'm coming from something like C#), and I have a feeling that might cause a memory leak because the str() creates a copy of the string.
Is there any other way to do this, or should I stick with what I have?
Thanks.
You can use std::stringstream::tellg() to find out the current get position in the input stream. Store this value in a variable before you extract from the stream. Then get the position again after you extract from the stream. The difference between these two values is the number of characters extracted.
double x = 3435;
std::stringstream ss;
ss << x;
double y;
std::streampos pos = ss.tellg();
ss >> y;
std::cout << (ss.tellg() - pos) << " characters extracted" << std::endl;
The solution above using tellg() can fail on modern compilers (such as gcc-4.6).
The reason is that when the extraction consumes the entire input, eofbit is set, and tellg() then reports failure (-1) instead of a position. See e.g. "file stream tellg/tellp and gcc-4.6 is this a bug?"
Therefore you also need to test for eof() (meaning the entire input was consumed).
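Putting the tellg() idea and the eof() check together might look like this. It is a sketch only: parse_prefix is a made-up name, and it assumes the number starts at the very beginning of the text.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: parses a number at the start of `text` and reports how
// many characters it occupied. Returns false if no number could be parsed.
bool parse_prefix(const std::string& text, double& value, std::size_t& consumed) {
    std::istringstream in(text);
    if (!(in >> value)) {
        return false;
    }
    if (in.eof()) {
        consumed = text.size();       // the whole input was used; skip tellg()
        return true;
    }
    const std::streampos pos = in.tellg();
    if (pos == std::streampos(-1)) {
        return false;                 // position not available
    }
    consumed = static_cast<std::size_t>(pos);
    return true;
}
```

For "25+2" this reports 2 consumed characters, so the caller can advance its string pointer by that amount. For "42" it reports the full length without relying on tellg().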

C++: Why does space always terminate a string when read?

Using type std::string to accept a sentence, for practice (I haven't worked with strings in C++ much) I'm checking if a character is a vowel or not. I got this:
for(i = 0; i <= analyse.length(); i++) {
if(analyse[i] == 'a' || analyse[i] == 'e' [..etc..]) {
...vowels++;
} else { ...
...consonants++;
}
This works fine if the string is all one word, but the second I add a space (e.g. aeio aatest) it will only count the first block, count the space as a consonant, and quit reading the sentence (exiting the for loop or something).
Does a space count as no character == null? Or is it some oddity with std::string? It would be helpful to know why that is happening!
EDIT:
I'm simply accepting the string through std::cin, such as:
std::string analyse = "";
std::cin >> analyse;
I'd guess you're reading your string with something like your_stream >> your_string;. Operator >> for strings is defined to work (about) the same as scanf's %s conversion, which reads up until it encounters whitespace -- therefore, operator>> does the same.
You can read an entire line of input instead with std::getline. You might also want to look at an answer I posted to a previous question (provides some alternatives to std::getline).
I can't tell from the code that you have pasted, but I'm going to go out on a limb and guess that you're reading into the string using the stream extraction operator (stream >> string).
The stream extraction operator stops when it encounters whitespace.
If this isn't what's going on, can you show us how you're populating your string, and what its contents are?
If I'm right, then you're going to want a different method of reading content into the string. std::getline() is probably the easiest method of reading from a file. It stops at newlines instead of at whitespace.
Edit based on edited question:
use this (doublecheck the syntax. I'm not in front of my compiler.):
std::getline(std::cin, analyse);
This ought to stop reading when you press "enter".
If you want to read in an entire line (including the blanks) then you should read using getline. Schematically it looks like this:
#include <string>
istream& std::getline( istream& is, string& s );
To read the whole line you do something like this:
string s;
getline( cin, s );
cout << "You entered " << s << endl;
PS: the word is "consonant", not "consenent".
The >> operator on an istream separates strings on whitespace. If you want to get a whole line, you can use std::getline(std::cin, destination_string).