How exactly does the extract>> operator works in C++ - c++

I am a computer science student, an so do not have much experience with the C++ language (considering it is my first semester using this language,) or coding for that matter.
I was given an assignment to read integers from a text file in the simple form of:
19 3 -2 9 14 4
5 -9 -10 3
.
.
.
This sent me of on a journey to understand I/O operators better, since I am required to do certain things with this stream (duh.)
I was looking everywhere and could not find a simple explanation as to how does the extract>> operator works internally. Let me clarify my question:
I know that the extractor>> operator would extract one continues element until it hits space, tab, or newline. What I try to figure out is, where would the pointer(?) or read-location(?) be AFTER it extracts an element. Will it be on the last char of the element just removed or was it removed and therefore gone? will it be on the space/tab/'\n' character itself? Perhaps the beginning of the next element to extract?
I hope I was clear enough. I lack all the appropriate jargon to describe my problem clearer.
Here is why I need to know this: (in case anyone is wondering...)
One of the requirements is to sum all integers in each line separately.
I have created a loop to extract all integers one-by-one until it reaches the end of the file. However, I soon learned that the extract>> operator ignores space/tab/newline. What I want to try is to extract>> an element, and then use inputFile.get() to get the space/tab/newline. Then, if it's a newline, do what I gotta do.
This will only work if the stream pointer will be in a good position to extract the space/tab/newline after the last extraction>>.
In my previous question, I tried to solve it using getline() and an sstring.
SOLUTION:
For the sake of answering my specific question, of how operator>> works, I had to accept Ben Voigt's answer as the best one.
I have used the other solutions suggested here (using an sstring for each line) and they did work! (you can see it in my previous question's link) However, I implemented another solution using Ben's answer and it also worked:
.
.
.
if(readFile.is_open()) {
while (readFile >> newInput) {
char isNewLine = readFile.get(); //get() the next char after extraction
if(isNewLine == '\n') //This is just a test!
cout << isNewLine; //If it's a newline, feed a newline.
else
cout << "X" << isNewLine; //Else, show X & feed a space or tab
lineSum += newInput;
allSum += newInput;
intCounter++;
minInt = min(minInt, newInput);
maxInt = max(maxInt, newInput);
if(isNewLine == '\n') {
lineCounter++;
statFile << "The sum of line " << lineCounter
<< " is: " << lineSum << endl;
lineSum = 0;
}
}
.
.
.
With no regards to my numerical values, the form is correct! Both spaces and '\n's were catched:
Thank you Ben Voigt :)
Nonetheless, this solution is very format dependent and is very fragile. If any of the lines has anything else before '\n' (like space or tab), the code will miss the newline char. Therefore, the other solution, using getline() and sstrings, is much more reliable.

After extraction, the stream pointer will be placed on the whitespace that caused extraction to terminate (or other illegal character, in which case the failbit will also be set).
This doesn't really matter though, since you aren't responsible for skipping over that whitespace. The next extraction will ignore whitespaces until it finds valid data.
In summary:
leading whitespace is ignored
trailing whitespace is left in the stream
There's also the noskipws modifier which can be used to change the default behavior.

The operator>> leaves the current position in the file one
character beyond the last character extracted (which may be at
end of file). Which doesn't necessarily help with your problem;
there can be spaces or tabs after the last value in a line. You
could skip forward reading each character and checking whether
it is a white space other than '\n', but a far more idiomatic
way of reading line oriented input is to use std::getline to
read the line, then initialize an std::istringstream to
extract the integers from the line:
std::string line;
while ( std::getline( source, line ) ) {
std::istringstream values( line );
// ...
}
This also ensures that in case of a format error in the line,
the error state of the main input is unaffected, and you can
continue with the next line.

According to cppreference.com the standard operator>> delegates the work to std::num_get::get. This takes an input iterator. One of the properties of an input iterator is that you can dereference it multiple times without advancing it. Thus when a non-numeric character is detected, the iterator will be left pointing to that character.

In general, the behavior of an istream is not set in stone. There exist multiple flags to change how any istream behaves, which you can read about here. In general, you should not really care where the internal pointer is; that's why you are using a stream in the first place. Otherwise you'd just dump the whole file into a string or equivalent and manually inspect it.
Anyway, going back to your problem, a possible approach is to use the getline method provided by istream to extract a string. From the string, you can either manually read it, or convert it into a stringstream and extract tokens from there.
Example:
std::ifstream ifs("myFile");
std::string str;
while ( std::getline(ifs, str) ) {
std::stringstream ss( str );
double sum = 0.0, value;
while ( ss >> value ) sum += value;
// Process sum
}

Related

Searching for char from end of file using seekg() c++

I have a file and I want to only output the last line to the console.
Here's my thoughts on how I would do it. Use file.seekg(0, ios::end) to put myself at the end of file.
Then, I could create a decrement variable int decrement = -1; and use a while loop
while (joke.peek() != '\n')
{
decrement--;
}
and get the starting position for my final line (going backwards from the end).
Knowing this, joke.seekg(decrement, ios::end); should set me to the beginning of the final line, and assuming I previously declared string input;
I would think that
getline(joke, input);
cout << input << endl;
would output it to the console.
Full code
void displayLastLine(ifstream &joke)
{
string input;
int decrement = -1;
joke.clear();
joke.seekg(0, ios::end);
while (joke.peek() != '\n')
{
decrement--;
}
joke.clear();
joke.seekg(decrement, ios::end);
getline(joke, input);
cout << input << endl;
}
The problem is, when I go to call this method for the file, nothing happens. When I step through it, the decrement just keeps on subtracting one, far beyond where a '\n' would be. To give an example of a text file, it would look something like this:
junk
garbage
skip this line
This is the line that we're looking for!
joke.seekg(0, ios::end);
This positions the file at the end.
while (joke.peek() != '\n')
Well, here's problem #1. When you're at the end of the file, peek() always returns EOF.
decrement--;
You write:
When I step through it, the decrement just keeps on subtracting one,
Well, what did you expect to happen, since that's the only thing that the loop does? The only thing your for loop does is subtract 1 from decrement. So that's what happens.
This a common problem: a computer does only what you tell it to do, instead of what you want it to do.
Although this is not optimal, your missing step is that before you peek(), you need to seek() back by one character. Then, peek() shows you the character at the current cursor position. Then, seek() back by one more character, and check peek() again, and so on.
But that still will not be sufficient for you. Most text files end with a newline character. That is, a newline is the last character in the file. So, even if you add back the missing seek(), in nearly all cases, what your code will end up doing is finding the last character in the file, the final newline character.
My recommendation for you is to stop writing code for a moment, and, instead, come up with a logical process for doing what you want to do, and describe this process in plain words. Then, discuss your proposed course of action with your rubber duck. Only after your rubber duck agrees that what you propose will work, then translate your plain language description into code.
peek does not move the file pointer, it reads the character without extracting it. So, you are constantly peeking the same value (character) and ending up in an infinite loop.
What would you need to do is something like:
while (joke.peek() != '\n')
{
joke.seek(-1, ios::cur);
}
That would put the input position at the \n, using the 2nd overload of seekg.
Please note that this is not a perfect solution. You need to check for errors and boundary conditions, but it explains your observed behaviour and gives you something to start fixing your code.
Your loop is actually only decrementing "decrement" and not using it to make the next search.
while (joke.peek() != '\n')
{
joke.seekg(decrement, std::ios::end);
decrement--;
}

Can't get ios::beg to go back to the beginning of the file

It always seems to be the things that should be no problem that cause problems for me. I don't get it. :/
So I'm trying to make sure that I understand how to manipulate text files. I've got two files, "infile.txt" and "outfile.txt". "infile.txt" has six numbers in it and nothing else. Here is the code I used to manipulate the files.
#include<fstream>
using std::ifstream;
using std::ofstream;
using std::fstream;
using std::endl;
using std::ios;
int main()
{
ifstream inStream;
ofstream outStream;//create streams
inStream.open("infile.txt", ios::in | ios::out);
outStream.open("outfile.txt");//attach files
int first, second, third;
inStream >> first >> second >> third;
outStream << "The sum of the first 3 nums is " << (first+second+third) << endl;
//make two operations on the 6 numbers
inStream >> first >> second >> third;
outStream << "The sum of the second 3 nums is " << (first+second+third) << endl;
inStream.seekg(0); //4 different ways to force the program to go back to the beginning of the file
//2. inStream.seekg(0, ios::beg);
//3. inStream.seekg(0, inStream.beg);
//4. inStream.close(); inStream.open("infile.txt");
//I have tried all four of these lines and only #4 works.
//There has got to be a more natural option than just
//closing and reopening the file. Right?
inStream >> first >> second >> third;
outStream << "And again, the sum of the first 3 nums is " << (first+second+third) << endl;
inStream.close();
outStream.close();
return 0;
}
Maybe I don't understand quite how the stream works, but I've seen a few sources that said that seekg(0) should move the index back to the start of the file. Instead, this is what I get out of it.
The sum of the first 3 nums is 8
The sum of the second 3 nums is 14
And again, the sum of the first 3 nums is 14
It went back, but not nearly in the way I would have hoped. Any idea why this happened? Why did my first three attempts fail?
As Bo Persson states, it may be because your input has
encountered end of file; it shouldn't, because in C++, a text
file is defined as being terminated by a '\n', but practically
speaking, if you're working under Windows, a lot of ways of
generating a file will omit this final '\n'—although it
is formally required, practical considerations will mean that
you'll make sure that it works even if the final '\n' is
missing. And I can't think of any other reason off hand why the
seekg's wouldn't work. inStream.seekg( 0 ) is, of course,
undefined behavior, but in practice, it will work pretty much
everywhere. inStream.seekg( 0, ios::beg ) is guaranteed to
work if inStream.good(), and is, IMHO, preferable to the
first form. (The single argument form of seekg is normally
only used with the results of a tellg as an argument.) And of
course, it only works if the actual input source supports
seeking: it won't work if you're reading from a keyboard or
a pipe (but presumably, "infile.txt" is neither).
In general, you should check the status of inStream after each
read, before using the results. But if the only problem is that
the file doesn't end with '\n', it's probable that the status
will be OK (!fail()) after the final read, even if you've
encountered end of file. In which case, you'll need clear()
anyway.
Note that the above comments are valid for C++-03 and precedent.
C++11 has changed the specification of the single argument form
of seekg, and requires it to reset eofbit before anything
else. (Why is this change only for the single argument form of
seekg, and not the two argument form? Oversight?)
The second input reaches end-of-file for the stream. That state sticks until you call inStream.clear() to clear its state (in addition to the seek).
With a C++11 compliant compiler, option 4 should also work as close and reopen will now clear the previous state. Older compilers might not do that.
Try:
inStream.seekg(0, ios_base::beg);

Line Breaks when reading an input file by character in C++

Ok, just to be up front, this IS homework, but it isn't due for another week, and I'm not entirely sure the final details of the assignment. Long story short, without knowing what concepts he'll introduce in class, I decided to take a crack at the assignment, but I've run into a problem. Part of what I need to do for the homework is read individual characters from an input file, and then, given the character's position within its containing word, repeat the character across the screen. The problem I'm having is, the words in the text file are single words, each on a different line in the file. Since I'm not sure we'll get to use <string> for this assignment, I was wondering if there is any way to identify the end of the line without using <string>.
Right now, I'm using a simple ifstream fin; to pull the chars out. I just can't figure out how to get it to recognize the end of one word and the beginning of another. For the sake of including code, the following is all that I've got so far. I was hoping it would display some sort of endl character, but it just prints all the words out run together style.
ifstream fin;
char charIn;
fin.open("Animals.dat");
fin >> charIn;
while(!fin.eof()){
cout << charIn;
fin >> charIn;
}
A few things I forgot to include originally:
I must process each character as it is input (my loop to print it out needs to run before I read in the next char and increase my counter). Also, the length of the words in 'Animals.dat' vary which keeps me from being able to just set a number of iterations. We also haven't covered fin.get() or .getline() so those are off limits as well.
Honestly, I can't imagine this is impossible, but given the restraints, if it is, I'm not too upset. I mostly thought it was a fun problem to sit on for a while.
Why not use an array of chars? You can try it as follow:
#define MAX_WORD_NUM 20
#define MAX_STR_LEN 40 //I think 40 is big enough to hold one word.
char words[MAX_WROD_NUM][MAX_STR_LEN];
Then you can input a word to the words.
cin >> words[i];
The >> operator ignores whitespace, so you'll never get the newline character. You can use c-strings (arrays of characters) even if the <string> class is not allowed:
ifstream fin;
char animal[64];
fin.open("Animals.dat");
while(fin >> animal) {
cout << animal << endl;
}
When reading characters from a c-string (which is what animal is above), the last character is always 0, sometimes represented '\0' or NULL. This is what you check for when iterating over characters in a word. For example:
c = animal[0];
for(int i = 1; c != 0 && i < 64; i++)
{
// do something with c
c = animal[i];
}

How can you ignore from cin until it goes to the next line?

If you have a text file that you're reading character by character with cin:
char text;
cin >> text;
cout << char << endl;
Suppose you want to ignore any lines that start with ">" until the new line, how can you do that?
You can compare the char read for '>' using::
int strncmp ( const char * str1, const char * str2, size_t num );
If found, skip till char read equals '\n' i.e., skip till strncmp returns 0 for ( char, '\n', 1 )
Typically you want to use std::istream::ignore, something like:
static const int max_line = 65536;
if (text == ">")
std::cin.ignore(max_line, '\n');
Note that I've specified a maximum distance to skip of 64K bytes. Many people recommend something like std::numeric_limits<std::streamsize>::max(), which basically means to skip any amount of text until you find the delimiter (new-line, in this case).
IMO, specifying such a huge number is usually a poor idea -- if you go for too long without seeing a new-line, it's safe to stop and assume that you've gotten bad data. As soon as you've read enough to be reasonably certain there's a problem, it's better to stop and warn the user, rather than spending minutes with the program apparently locked up, reading gigabytes of useless data (and then probably giving the user an error message anyway).
Another possibility (especially if you're fairly sure you'll get good input) is to start by reading a full line (e.g., with std::getline), then if it starts with a >, just skip processing the line, and go back to read the next one.

C++: Why does space always terminate a string when read?

Using type std::string to accept a sentence, for practice (I haven't worked with strings in C++ much) I'm checking if a character is a vowel or not. I got this:
for(i = 0; i <= analyse.length(); i++) {
if(analyse[i] == 'a' || analyse[i] == 'e' [..etc..]) {
...vowels++;
} else { ...
...consonants++;
}
This works fine if the string is all one word, but the second I add a space (IE: aeio aatest) it will only count the first block and count the space as a consonant, and quit reading the sentence (exiting the for loop or something).
Does a space count as no character == null? Or some oddity with std::string?, It would be helpful to know why that is happening!
EDIT:
I'm simply accepting the string through std::cin, such as:
std::string analyse = "";
std::cin >> analyse;
I'd guess you're reading your string with something like your_stream >> your_string;. Operator >> for strings is defined to work (about) the same as scanf's %s conversion, which reads up until it encounters whitespace -- therefore, operator>> does the same.
You can read an entire line of input instead with std::getline. You might also want to look at an answer I posted to a previous question (provides some alternatives to std::getline).
I can't tell from the code that you have pasted, but I'm going to go out on a limb and guess that you're reading into the string using the stream extraction operator (stream >> string).
The stream extraction operator stops when it encounters whitespace.
If this isn't what's going on, can you show us how you're populating your string, and what its contents are?
If I'm right, then you're going to want a different method of reading content into the string. std::getline() is probably the easiest method of reading from a file. It stops at newlines instead of at whitespace.
Edit based on edited question:
use this (doublecheck the syntax. I'm not in front of my compiler.):
std::getline(std::cin, analyze);
This ought to stop reading when you press "enter".
If you want to read in an entire line (including the blanks) then you should read using getline. Schematically it looks like this:
#include <string>
istream& std::getline( istream& is, string& s );
To read the whole line you do something like this:
string s;
getline( cin, s );
cout << "You entered " << s << endl;
PS: the word is "consonant", not "consenent".
The >> operator on an istream separates strings on whitespace. If you want to get a whole line, you can use readline(cin,destination_string).