Storing data line by line into arrays - C++

I need to store data into two arrays. The data is stored in the order Name, then Number.
Ex:
Kara
000131012
Tucker
002102000
I understand how to use the single-line method:
while (infile >> a >> b)
{
    // process pair (a, b)
}
But this doesn't work for the way this data is stored.

I am not sure how sensible it is to add an answer so late (and to this easy question), but I want to be clear since there were some misunderstandings in the comments.
How does >> work?
The operator>> first discards all leading whitespace characters (spaces, tabs, newlines, maybe more; this depends on the locale). Then it will try to read as many non-whitespace characters as possible (so int i; cin >> i; with input 123123jj sets i to 123123). Then it will potentially set the failbit, eofbit, or badbit, which influence the boolean value of the stream.
What does that mean for your code?
If your names consistently do not include a space character, your code will work perfectly, completely independent of the number of words per line. If you have that guarantee of spaceless names, I would recommend this way since it is easy, and you don't get a problem if your input is a bit faulty and has, for example, a double newline at some point.
If the names might contain spaces, your code above will fail. Then you have to use std::getline. Its usage is well documented on the linked page.
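For completeness, here is a minimal sketch of the getline approach. The file name data.txt and the use of std::vector as the "arrays" are my assumptions, not part of the question; the numbers are kept as strings so their leading zeros survive:

#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::ifstream infile("data.txt");  // assumed input file name
    std::vector<std::string> names;
    std::vector<std::string> numbers;  // strings preserve the leading zeros

    std::string name, number;
    // Each record is a name line followed by a number line.
    while (std::getline(infile, name) && std::getline(infile, number))
    {
        names.push_back(name);
        numbers.push_back(number);
    }

    for (std::size_t i = 0; i < names.size(); ++i)
        std::cout << names[i] << " -> " << numbers[i] << '\n';
}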

Related

How to ignore line break after reading two ints with cin?

I am new to C++; I practically started it today. Anyway, I am facing a problem where the first line contains two integers, and the next lines contain operations to be done. It's a fairly weird problem, actually, so I won't go into the details. Well, I am having trouble reading the first line and then the follow-up operations.
My code looks like this so far:
int m, l;
string comando;
string OPR;
cin >> m >> l;
while (getline(cin, comando)) {
    OPR = comando.substr(0, 3);
}
The problem is: apparently, whenever I write both m and l on the same line, the \n stays in the buffer and gets read by getline, causing a problem when I try to take the substring. I tried adding a char variable that would be read after m and l, which would, supposedly, get the \n. However, it gets the first letter of the next line instead, which then messes up my code. I tried to see if I had any syntax errors or anything, but that isn't it. I also looked for ways to ignore the \n char, but everything I found was related to strings, or to reading from files.
I know I could read the line and then convert the two ints from string to int, but that seems like a bad way to do it (at least it would be in C).
Anyways, if any one can help me, that would be awesome, thanks!
P.S.: I don't do a check before the substr operation because, by the definition of the problem, the line will have a 3-char operation, a space and then an integer.
A good place to look for tips for common problems like this is your favorite reference:
When used immediately after whitespace-delimited input, e.g. after int n; std::cin >> n;, getline consumes the endline character left on the input stream by operator>>, and returns immediately. A common solution is to ignore all leftover characters on the line of input with cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); before switching to line-oriented input.
From here.
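Applied to the code in the question, that looks roughly like this. It is a sketch only; the bounds check before substr is my addition, not something the original code had:

#include <iostream>
#include <limits>
#include <string>

int main()
{
    int m, l;
    std::cin >> m >> l;                       // formatted input stops at the '\n' and leaves it behind
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // discard the rest of that line

    std::string comando;
    while (std::getline(std::cin, comando))   // line-oriented input now starts on a fresh line
    {
        if (comando.size() >= 3)              // my addition; the problem guarantees a 3-char operation
        {
            std::string OPR = comando.substr(0, 3);
            std::cout << "operation: " << OPR << '\n';
        }
    }
}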

Why does the cin command leave a '\n' in the buffer?

This is related to: cin and getline skipping input. But they don't answer why it happens, just how to fix it.
Why does cin leave a '\n' in the buffer, but cin.getline takes it?
for example:
cin >> foo;
cin >> bar; // No problem
cin >> baz; // No problem.
But with cin.getline
cin >> foo;
cin.getline(bar, 100); // will take the '\n'
So why it doesn't happen with cin but it does with cin.getline?
Because when you say getline, you say you want to get a line... A line is a string that ends with \n; the ending is an integral part of it.
When you say cin >> something, you want to get precisely something, nothing more. The end-line marker is not part of it, so it's not consumed. Unless you had a special type for a line, but there is no such thing in the standard library.
Without citations from the standard this might be taken as opinion, but this is the logic behind it. They have different semantics. There is also another difference: getline works as unformatted input, while operator>> works as formatted input. I strongly suggest reading those links to understand the differences; they also say that the two are semantically different.
Another answer (whether it is better or not is debatable) would be to quote the standard, which I am sure specifies how getline behaves and how operator>> behaves for different types, and to say it works like this because the standard says so. This would be valid, because the standard absolutely defines how things work and can do so arbitrarily... but it rarely explains the motivation and logic behind the design.
You are not comparing cin to cin.getline, but rather cin.operator>> and cin.getline, and that is exactly what these functions are defined to do. The answer to the question "why" is "by definition". If you want rationale, I can't give it to you.
cin.getline reads and consumes characters until a newline \n is encountered. cin.operator>> does not consume this newline. The latter performs formatted input: it skips leading whitespace and reads until the object it is reading (in your case whatever foo is) "stops" (if foo is an int, at the first character that isn't a digit). A newline is what remains when the number has been consumed from the input line. cin.getline reads a line and consumes the newline by definition.
Make sure to always check for error on each stream operation:
if(cin >> foo)
or
if(std::getline(cin, some_string))
Note I used std::getline instead of the stream's member because this way there's no need for any magic numbers (the 100 in your code).
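To make the difference concrete, here is a small self-contained sketch; the sample input is my own, not from the question:

#include <iostream>
#include <string>

int main()
{
    int foo;
    std::string line;

    // With input "42\nhello\n":
    if (std::cin >> foo)                  // reads 42 and stops at the '\n', leaving it in the buffer
    {
        std::getline(std::cin, line);     // consumes that leftover '\n' -> line is empty
        std::getline(std::cin, line);     // reads "hello"
        std::cout << foo << ' ' << line << '\n';
    }
}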

C++ tokenization

I am writing a lexer in C++ and I am reading from a file character by character. However, how do you do tokenization in this case? I can't use strtok since I have a character, not a string. Somehow I need to keep reading until I reach a delimiter?
The answer is Yes. You need to keep reading until you hit a delimiter.
There are multiple solutions.
The simplest thing to do is exactly that: keep a buffer (std::string) of the characters you already read until you reach a delimiter. At that point, you build a token from the accumulated characters in the buffer, clear the buffer, and push the delimiter (if necessary) in the buffer.
Another solution would be to read ahead of time: i.e., pick up the entire line with std::getline (for example), and then check what's on that line. In general the end of line is a natural token delimiter.
This works well... when delimiters are easy.
Unfortunately some languages, like C++, have awkward grammars. For example, in C++ >> can be either:
the operator >> (for right-shift and stream extraction)
the end of two nested templates (i.e. could be rewritten as > >)
In those cases... well, just don't bother with the difference in the tokenizer, and let your AST building pass disambiguate, it's got more information.
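Here is a rough sketch of the first solution (buffer accumulation). The particular delimiter set is my own arbitrary choice, not something a real lexer would settle for:

#include <cctype>
#include <iostream>
#include <string>
#include <vector>

// Accumulate characters into a buffer until a delimiter appears, then emit a token.
std::vector<std::string> tokenize(std::istream& in)
{
    std::vector<std::string> tokens;
    std::string buffer;
    char c;
    while (in.get(c))                       // read character by character
    {
        if (std::isspace(static_cast<unsigned char>(c)))
        {
            if (!buffer.empty()) { tokens.push_back(buffer); buffer.clear(); }
        }
        else if (c == '(' || c == ')' || c == ';' || c == ',')
        {
            if (!buffer.empty()) { tokens.push_back(buffer); buffer.clear(); }
            tokens.push_back(std::string(1, c));   // the delimiter is a token itself
        }
        else
        {
            buffer += c;                    // keep accumulating until a delimiter shows up
        }
    }
    if (!buffer.empty()) tokens.push_back(buffer);
    return tokens;
}

int main()
{
    for (const std::string& t : tokenize(std::cin))  // e.g. feed a source file on stdin
        std::cout << t << '\n';
}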
On the basis of the information you provided:
If you want to read up to a delimiter from a file, use the getline(char*, int, char) function.
getline() is used to read up to n characters or up to a delimiter.
Example:
#include <fstream>
#include <iostream>
using namespace std;

int main()
{
    fstream f;
    f.open("test.cpp", ios::in);
    char c[8];                 // a real buffer instead of an uninitialized pointer
    f.getline(c, 2, ' ');
    cout << c;                 // up to 1 char or till a space
}

Why would I even use istream::ignore when checking for valid input?

The C++ FAQ over at parashift uses something similar to the following:
while (cout << "Enter an integer: " && !(cin >> foo))
{
    cin.clear();
    // feel free to replace this with just (80, '\n') for my point
    cin.ignore(numeric_limits<streamsize>::max(), '\n');
}
The cin.ignore (...), however, seems unnecessary. Why can't I just use cin.sync()? It's shorter and does not require a length. It's also more versatile as it will work the same way whether or not there are any characters in the input buffer in the first place. I've tested this once in the same loop as I used with ignore and it worked the same way. Yet it seems every example involving this type of input validation uses ignore instead of sync.
What (if any) was the reasoning behind using ignore when there's a much simpler alternative?
If it matters:
Windows
GCC
MinGW
On an ifstream, the effect of sync() is implementation defined (per C++11, §27.9.1.5/19) -- there's no guarantee that it'll do what you want (and no real guarantee of what it'll do at all). In a typical case, it will be about equivalent to the ignore if and only if the stream is line buffered -- but if the stream is unbuffered, it probably won't do anything, and if the stream is fully buffered, it'll probably do bad things.
Both do different things. sync discards characters already read ahead, no matter how many there are, or what they are. On the other hand, ignore discards characters until a certain character is encountered, no matter whether those characters have already been read, or whether there are more characters already read ahead.
For example, imagine that cin has a 40 byte buffer, but your line had 80 bytes. Then most likely the first 40 bytes have been read into cin's buffer. After you've interpreted the beginning of those, calling sync discards the rest of the 40 characters you have already read, but not the other 40 characters of the line. On the other hand, your input might come from a pipe, where no line buffering is typically done. In that case, you may discard not only the current line, but also parts of the next line which have been read ahead.
With ignore, however, you always know for sure that you'll read up to the next \n (assuming the maximal number of characters to ignore is high enough to encounter it).
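For reference, here is the ignore-based FAQ loop as a complete, runnable sketch; the surrounding main and the final output line are mine:

#include <iostream>
#include <limits>

int main()
{
    int foo;
    while (std::cout << "Enter an integer: " && !(std::cin >> foo))
    {
        std::cin.clear();   // clear failbit so the stream can be used again
        std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // discard the bad line
    }
    std::cout << "Got " << foo << '\n';
}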

How do I set EOF on an istream without reading formatted input?

I'm doing a read-in on a file character by character using istream::get(). How do I end this function with something that checks whether there's nothing left to read as formatted input in the file (e.g. only whitespace) and sets the corresponding flags (EOF, bad, etc.)?
Construct an istream::sentry on the stream. This will have a few side effects, the one we care about being:
If its skipws format flag is set, and the constructor is not passed true as second argument (noskipws), all leading whitespace characters (locale-specific) are extracted and discarded. If this operation exhausts the source of characters, the function sets both the failbit and eofbit internal state flags
You can strip any amount of leading (or trailing, as it were) whitespace from a stream at any time by reading to std::ws. For instance, if we were reading a file from STDIN, we would do:
std::cin >> std::ws
Credit to this comment on another version of this question, asked four years later.
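A small sketch of the std::ws approach; the helper function and its name are mine, not from the linked comment:

#include <istream>

// Returns true when nothing but whitespace (or nothing at all) is left in the stream.
bool only_whitespace_left(std::istream& in)
{
    in >> std::ws;    // extract and discard leading whitespace; sets eofbit if that exhausts the stream
    return in.eof();
}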
How do I end this function with something that checks whether there's nothing left to read as formatted input in the file (e.g. only whitespace)?
Whitespace characters are characters in the stream. You cannot assume that the stream will do intelligent processing for you, unless and until you write your own filtering stream.
By default, all of the formatted extraction operations (overloads of operator>>()) skip over whitespace before extracting an item -- are you sure you want to part ways with this approach?
If yes, then you could probably achieve what you want by deriving a new class, my_istream, from istream, and overriding each operator>>() to call the following method at the end:
void skip_whitespace() {
    char ch;
    ios_base::fmtflags old_flags = flags(ios_base::skipws); // save the old flags, enable skipws
    *this >> ch; // Skips over whitespace to read a character
    flags(old_flags);
    if (*this) { // I.e. not at end of file and no errors occurred
        unget();
    }
}
It's quite a bit of work. I'm leaving out a few details here (such as the fact that a more general solution would be to override the class template basic_istream<CharT, Traits>).
istream is not going to help a lot - it functions as designed. However, it delegates the actual reading to streambufs. If your streambuf wrapper trims trailing whitespace, an istream reading from that streambuf won't notice it.