#include <string>
#include <iostream>
int main() {
std::string str;
char magic[9];
std::cin.read((char *)magic, sizeof(magic));
std::cin.seekg(0, std::ios::beg);
while (std::cin >> str) {
std::cout << str << std::endl;
}
}
My code calls seekg(0) on std::cin.
It does not behave as expected on some files
when run as
./a.out < filename
The files that misbehave contain fewer than 9 characters (counting newline characters and other whitespace); 9 is the number of characters read from cin before the seekg.
If the file contains 9 or more characters, it behaves as expected.
for example:
123456789
will give output as
123456789
while a file containing fewer than 9 characters gives no output
for example:
1234
will give no output
With a file of less than nine characters, you have already attempted to read past the end with your initial read. That means the eof (end of file) and fail flags have been set for the stream and, while seekg may reset eof, it does not reset fail (a).
You can check that by inserting:
std::cout << "eof/fail=" << std::cin.eof() << '/' << std::cin.fail() << '\n';
immediately before and after the seekg. For file sizes of 8, 9, and 10 respectively, you get:
eof/fail=1/1
eof/fail=0/1
eof/fail=0/0
eof/fail=0/0
12345678
eof/fail=0/0
eof/fail=0/0
123456789
You can see the first failure results in no output because the fail bit is still set. The second and third have output because it was never set (the output is the characters shown plus one newline).
To repair this, you can clear the fail bit simply by inserting the following before your seekg:
std::cin.clear();
Then running that code on the eight-character file gives:
eof/fail=1/1
eof/fail=0/0
1234567
showing that the clear has indeed cleared the fail bit.
You might also want to keep in mind that it's not a requirement for a stream to be seekable, especially if it's just coming in via standard input. You may find for certain sized files that you cannot seek back an arbitrary amount if you've read through a large chunk of the stream.
(a) For the language lawyers amongst us, Unformatted input functions (C++11 27.7.2.3/41, C++14 27.7.2.3/41 and C++17 30.7.4.3/41) all have essentially the same text on how seekg works (my emphasis):
After constructing a sentry object, if fail() != true, executes ...
I was given a question where the input will be like:
10 8
4 9
6 12
5 4
3
1
Here I don't know the number of lines that contains 2 integers. Those sets of 2 integers will be taken into an array. But when the program encounters "3", it will start taking input in another array.
I have tried this with
while (cin >> a >> b) { /* some process with a and b */ }
but it doesn't work because it recognizes 3 and 1 as another set of two integers. Please help me to solve this problem.
cin >> a >> b skips not only spaces but every whitespace character ('\n', '\t', ' '), so line breaks are invisible to it.
Here you may want to read the input line by line and then check whether each line holds two integers or one. Consider using std::getline to retrieve each line of text. You can then wrap the string you read in a std::istringstream and extract from it, counting how many numbers you managed to read.
So think about your problem. Essentially it is, read one line at a time, and if it contains two numbers do one thing, but if it contains one number do something else.
But the code you have written reads numbers not lines. That is where the problem is.
Instead, write your code to read one line at a time, analyse that line to see if it contains one or two numbers (or something else), and then proceed from there.
What you need is the ability to read a line of text into a string, and then read from that string into your numbers. To do that you use an istringstream. Something like this
#include <iostream>
#include <sstream>
#include <string>
int a, b;
std::string s;
std::getline(std::cin, s); // read one line from standard input
std::istringstream line(s); // put that string into a stream we can read from
if (line >> a) // try and read the first number from the stream
{
// got the first number
if (line >> b) // try and read the second number from the stream
{
// got the second number
...
}
else
{
// only one number
...
}
}
else
{
// didn't get any numbers, some sort of error
...
}
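Putting the pieces together, here is a minimal sketch of the whole loop. The names ParsedInput and parse_lines are hypothetical, and the sketch assumes that every single-number line belongs in the second array and that blank or non-numeric lines can simply be skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for the two kinds of lines.
struct ParsedInput {
    std::vector<std::pair<int, int>> pairs; // lines holding two integers
    std::vector<int> singles;               // lines holding one integer
};

ParsedInput parse_lines(std::istream& in) {
    ParsedInput result;
    std::string text;
    while (std::getline(in, text)) {        // one line per iteration
        std::istringstream line(text);
        int a = 0;
        int b = 0;
        if (!(line >> a)) {
            continue;                       // no number on this line: skip it
        }
        if (line >> b) {
            result.pairs.emplace_back(a, b);
        } else {
            result.singles.push_back(a);
        }
    }
    return result;
}
```

With the input from the question, parse_lines(std::cin) would put the four pairs in pairs and the values 3 and 1 in singles.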
My understanding of iostream has always been that when an input conversion fails, the data remains in the stream to be read after clearing the error. But I have an example that does not always work that way, and I would like to know if the behavior is correct. If so, could someone point me at some good documentation of the actual rules?
This small program represents the idea of using an i/o failure to read an optional repeat count and a string.
#include <iostream>
#include <string>
int main()
{
int cnt;
std::cout << "in: ";
std::cin >> cnt;
if(!std::cin) {
cnt = 1;
std::cin.clear();
}
std::string s;
std::cin >> s;
std::cout << "out: " << cnt << " [" << s << "]" << std::endl;
}
So, here's how it runs:
[me#localhost tmp]$ ./bother
in: 16 boxes
out: 16 [boxes]
[me#localhost tmp]$ ./bother
in: hatrack
out: 1 [hatrack]
[me#localhost tmp]$ ./bother
in: some things
out: 1 [some]
[me#localhost tmp]$ ./bother
in: 23miles
out: 23 [miles]
[me#localhost tmp]$ ./bother
in: #(#&$(##&$ computer
out: 1 [#(#&$(##&$]
So it mostly works. When there's a number first, it is read, then the string is. When I give a non-numeric first, the read fails, the count is set to 1 and the non-numeric input is read. But this breaks:
[me#localhost tmp]$ ./bother
in: + smith
out: 1 [smith]
The + fails the integer read because it's not enough to make a number, but the + does not remain on the stream to be picked up by the string read. Likewise with -. If it reads the + or - as a zero, that would be reasonable, but then the read should succeed and cnt should show that zero.
Perhaps this is correct behavior, and I've just always been wrong about what it is supposed to do. If so, what are the rules?
Your advice appreciated.
Plus and minus are valid parts of an integer, so the sign character is read. When the next character is read, the conversion fails and that character is left in the stream, but the leading sign character is not put back into the stream.
See https://en.cppreference.com/w/cpp/locale/num_get/get for the full rules
I'm new to C++ and fairly new to Linux. I have a simple project that needs to parse CPU stats from the /proc/stat file and compute CPU usage. I have tried doing it entirely in a bash script, but what I need is C++. I just need a little help. /proc/stat gives a lot of numbers, and I know the different columns represent different things, like User, Nice, System, Idle, etc. For example, I just want to get the Idle value and store it as an integer using C++. How would I do it? Please help. What I have tried so far is just getting the whole line I need using ifstream and getline()
std::ifstream filestat("/proc/stat");
std::string line;
std::getline(filestat,line);
and what i get is this.
cpu 349585 0 30513 875546 0 935 0 0 0 0
To clarify my question: for example, I want to get the 875546 value and store it in an integer using C++. How would I do it? Thank you.
The format of stat is described in detail under the proc(5) manual page.
You can see it either by running the command man 5 proc from a Linux terminal or online.
The methods described above for parsing the stat file are fine for academic purposes, but a production grade parser should take extra precaution when using these methods.
If you need a production grade parser in C++ for files in /proc, you can check out pfs - A library for parsing the procfs. (Disclaimer: I'm the author of the library)
The biggest issue is usually the comm field (The second field in the file).
According to the man pages, this field is a string that should be "scanned" using some scanf flavor and the formatter %s. But that is wrong!
The comm field is controlled by the application (Can be set using prctl(PR_SET_NAME, ...)) and can easily include spaces or brackets, easily causing 99% of the parsers out there to fail.
And a simple change like that won't just return a bad comm value, it will also corrupt all the values that come after it.
The right way to parse the file is one of the following:
Option #1:
Read the entire content of the file
Find the first occurrence of '('
Find the last occurrence of ')'
Assign to comm the string between those indices
Parse the rest of the file after the last occurrence of ')'
Option #2:
Read the PID (the first value in the file)
Read 18 bytes (16 is the largest comm value + 2 for the wrapping brackets)
Extract the comm value from that buffer just like we did for option #1
Find out the actual length of the value, fix your stream and continue reading from there
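Option #1 can be sketched as follows. This is only an illustration of the steps listed above, not code taken from the pfs library; StatFields and split_stat are hypothetical names, and the sketch assumes the whole file content is already in a std::string.

```cpp
#include <string>

// Hypothetical result type: the three parts of the stat content.
struct StatFields {
    std::string pid;   // text before the first '('
    std::string comm;  // text between the first '(' and the last ')'
    std::string rest;  // everything after the last ')'
    bool ok;           // false when the brackets were not found
};

StatFields split_stat(const std::string& content) {
    StatFields f{"", "", "", false};
    const std::string::size_type open = content.find('(');   // first '('
    const std::string::size_type close = content.rfind(')'); // last ')'
    if (open == std::string::npos || close == std::string::npos || close < open) {
        return f;
    }
    f.pid = content.substr(0, open);
    f.pid.erase(f.pid.find_last_not_of(' ') + 1);     // trim trailing spaces
    f.comm = content.substr(open + 1, close - open - 1);
    f.rest = content.substr(close + 1);
    f.rest.erase(0, f.rest.find_first_not_of(' '));   // trim leading spaces
    f.ok = true;
    return f;
}
```

The remaining fields in rest can then be read with an ordinary std::istringstream, since none of them contain spaces.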
You really need to study up on how file input works. This should be simple enough. You just need to ignore the first 3 characters "cpu" and then read through 4 integer values:
unsigned n;
if(std::ifstream("/proc/stat").ignore(3) >> n >> n >> n >> n)
{
// use n here...
std::cout << n << '\n';
}
Alternatively if you already have the line (maybe you are reading the file one line at a time) you can use std::istringstream to turn the line into a new input stream:
std::ifstream filestat("/proc/stat");
std::string line;
std::getline(filestat, line);
unsigned n;
if(std::istringstream(line).ignore(3) >> n >> n >> n >> n)
{
// use n here...
std::cout << n << '\n';
}
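If you would rather keep the individual columns than overwrite one variable four times, a small variation is to read them into named fields. This is a sketch under the assumption that the line starts with the label cpu followed by the user, nice, system and idle columns in that order; CpuTimes and parse_cpu_line are hypothetical names.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the first four columns of the "cpu" line.
struct CpuTimes {
    unsigned long long user;
    unsigned long long nice;
    unsigned long long system;
    unsigned long long idle;
};

// Returns true when the line starts with "cpu" and four numbers follow.
bool parse_cpu_line(const std::string& line, CpuTimes& out) {
    std::istringstream in(line);
    std::string label;
    if (!(in >> label) || label != "cpu") {
        return false;
    }
    return static_cast<bool>(in >> out.user >> out.nice >> out.system >> out.idle);
}
```

After a successful call, out.idle holds the fourth number on the line.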
There are several ways to solve the problem. You can use a regular expression library to extract that part of the string, or, if you know the value is always going to be the 5th element, you can use this:
std::string text = "cpu 349585 0 30513 875546 0 935 0 0 0 0";
std::istringstream iss(text);
std::vector<std::string> results(std::istream_iterator<std::string>{iss}, std::istream_iterator<std::string>());
int data = std::stoi( results[4] ); //check size before accessing
std::cout << data << std::endl;
I hope it helps.
I have a question about the difference between these two pieces of code:
char buffer5[5];
cin.get(buffer5, 5);
cout << buffer5;
cin.get(buffer5, 5);
cout << buffer5;
and
char buffer4;
while (cin.get(buffer4))
{
cout << buffer4;
}
In the first piece of code, the code gets 5 characters and puts it in buffer5. However, because you press enter, a newline character isn't put into the stream when calling get(), so the program will terminate and will not ask you for another round of 5 characters.
In the second piece of code, cin.get() waits for input to the input stream, so the loop doesn't just terminate (I think). Let's say I input "Apple" into the input stream. This will put 5 characters into the input stream, and the loop will print all characters to the output. However, unlike the first piece of code, it does not stop, even after two inputs, as I can keep inputting continuously.
Why is it that I can continuously input character sequences into the terminal in the second piece of code and not the first?
First off, "pressing enter" has no special meaning to the IOStreams beyond entering a newline character (\n) into the input sequence (note, when using text streams the platform specific end of line sequences are transformed into a single newline character). When entering data on a console, the data is normally line buffered by the console and only forwarded to the program when pressing enter (typically this can be turned off but the details of this are platform specific and irrelevant to this question anyway).
With this out of the way, let's turn our attention to the behavior of s.get(buffer, n) for an std::istream s and a pointer to an array of at least n characters buffer. The description of what this does is quite trivial: it calls s.get(buffer, n, s.widen('\n')). Since we are talking about std::istream and you probably haven't changed the std::locale we can assume that s.widen('\n') just returns '\n', i.e., the call is equivalent to s.get(buffer, n, '\n') where '\n' is called a delimiter and the question becomes what this function does.
Well, this function extracts up to m = 0 < n? n - 1: 0 characters, stopping when either m is reached or when the next character is identical to the delimiter, which is left in the stream (you'd use std::istream::getline() if you wanted the delimiter to be extracted). Any extracted character is stored in the corresponding location of buffer and, if 0 < n, a null character is stored into the location right after the last extracted character. If no character is extracted, std::ios_base::failbit is set.
OK, with this we should have all ingredients to the riddle in place: When you entered at least one character but less than 5 characters the first call to get() succeeded and left the newline character as next character in the buffer. The next attempt to get() more characters immediately found the delimiter, stored no character, and indicated failure by setting std::ios_base::failbit. It is easy to verify this theory:
#include <iostream>
int main()
{
char buffer[5];
for (int count(0); std::cin; ++count) {
if (std::cin.get(buffer, 5)) {
std::cout << "get[" << count << "]='" << buffer << "'\n";
}
else {
std::cout << "get[" << count << "] failed\n";
}
}
}
If you enter no character, the first call to std::cin.get() fails. If you enter 1 to 4 characters, the first call succeeds but the second one fails. If you enter more than 4 characters, the second call also succeeds, etc. There are several ways to deal with the potentially stuck newline character:
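Just use std::istream::getline() which behaves the same as std::istream::get() but also extracts the delimiter if this is why it stopped reading. This may chop one line into multiple reads, however, which may or may not be desired. Note that getline() also sets std::ios_base::failbit when a line does not fit into the buffer, so the stream needs a clear() before the next read in that case.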
To avoid the limitation of a fixed line length, you could use std::getline() together with an std::string (i.e., std::getline(std::cin, string)).
After a successful get() you could check if the next character is a newline using std::istream::peek() and std::istream::ignore() it when necessary.
Which of these approaches meets your needs depends on what you are trying to achieve.
To simplify, I'm trying to read the content of a CSV-file using the ifstream class and its getline() member function. Here is this CSV-file:
1,2,3
4,5,6
And the code:
#include <iostream>
#include <typeinfo>
#include <fstream>
using namespace std;
int main() {
char csvLoc[] = "/the_CSV_file_localization/";
ifstream csvFile;
csvFile.open(csvLoc, ifstream::in);
char pStock[5]; //we use a 5-char array just to get rid of unexpected
//size problems, even though each number is of size 1
int i =1; //this will be helpful for the diagnostic
while(csvFile.eof() == 0) {
csvFile.getline(pStock,5,',');
cout << "Iteration number " << i << endl;
cout << *pStock<<endl;
i++;
}
return 0;
}
I'm expecting all the numbers to be read, since getline is supposed to take what has been written since the last read, and to stop when encountering ',' or '\n'.
But it appears that it reads everything well, EXCEPT '4', i.e. the first number of the second line (cf. console):
Iteration number 1
1
Iteration number 2
2
Iteration number 3
3
Iteration number 4
5
Iteration number 5
6
Thus my question: what makes this '4' after (I guess) the '\n' so specific that getline doesn't even try to take it into account ?
(Thank you !)
You are reading comma separated values so in sequence you read: 1, 2, 3\n4, 5, 6.
You then print the first character of the array each time: i.e. 1, 2, 3, 5, 6.
What were you expecting?
Incidentally, your check for eof is in the wrong place. You should check whether the getline call succeeds. In your particular case it doesn't currently make a difference, because getline reads something and triggers EOF all in one action, but in general it might fail without reading anything, and your current loop would still process pStock as if it had been repopulated successfully.
More generally something like this would be better:
while (csvFile.getline(pStock,5,',')) {
cout << "Iteration number " << i << endl;
cout << *pStock<<endl;
i++;
}
AFAIK if you use the terminator parameter, getline() reads until it finds that delimiter. This means that in your case, it has read
3\n4
into the array pStock, but you only print the first character, so you get 3 only.
The problem with your code is that getline, when a delimiter is specified (',' in your case), uses it and ignores the default delimiter '\n'. If you want to scan that file, you can use a tokenization function.
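One way to tokenize the file is to read whole lines with std::getline and then split each line on ',' using a std::istringstream. The following is a minimal sketch; parse_csv is a hypothetical name, and the code assumes a simple file with no quoted fields or embedded commas.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits the input into rows (by '\n') and each row
// into cells (by ','). Quoted fields are not handled.
std::vector<std::vector<std::string>> parse_csv(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {            // outer loop: one row per line
        std::istringstream fields(line);
        std::vector<std::string> row;
        std::string cell;
        while (std::getline(fields, cell, ',')) { // inner loop: one cell per comma
            row.push_back(cell);
        }
        rows.push_back(row);
    }
    return rows;
}
```

With the file from the question, the second row starts with "4", because the newline is consumed by the outer loop before the comma-delimited read ever sees it.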