Buffered input versus standard input - c++

I was trying to read a long list of numbers (Around 10^7) from input file. Through some searching I found that reading the contents using buffer gives more performance when compared to reading the number one by one.
My second program is performing better than the first program. I am using a cin stream object in the first program and stringstream object in the second program. What is the difference between these two in terms of I/O performance?
#include <iostream>
using namespace std;
int main()
{
int n,k;
cin >> n >> k;
int count = 0;
while ( n-- > 0 )
{
int num;
cin >> num;
if( num % k == 0 )
count++;
}
cout << count << endl;
return 0;
}
This program is taking a longer time when compared to the following code using buffered input.
#include <iostream>
#include <sstream>
using namespace std;
int main()
{
cin.seekg(0, cin.end);
int length = cin.tellg();
cin.seekg(0, cin.beg);
char *buffer = new char[length];
cin.read(buffer,length);
stringstream ss(buffer);
int n,k;
ss >> n >> k;
int result = 0;
while( n-- )
{
int num;
ss >> num;
if( num % k == 0 )
result++;
}
cout << result << endl;
return 0;
}

The second one will require ~twice the file's size in memory, otherwise, since it reads the entire file in one call, it will likely read data into memory as fast as the underlying storage can feed it, and then process it as fast as the CPU can do so.
It'd be good to avoid the memory cost, and in that respect, your first program is better. On my system, using an input called test.txt that looks like:
10000000 2
13
13
< 10000000-2 more "13"s. >
and your first program called a, and your second called b. I get:
% time ./a <test.txt
0
./a < test.txt 1.70s user 0.01s system 99% cpu 1.709 total
% time ./b <test.txt
0
./b < test.txt 0.76s user 0.04s system 100% cpu 0.806 total
cin is not buffered by default, to keep "synchronized" with stdio. See this excellent answer for a good explanation. To make it buffered, I added cin.sync_with_stdio(false) to the top of your first program, and called the result c, which runs perhaps slightly faster:
% time ./c <test.txt
0
./c < test.txt 0.72s user 0.01s system 100% cpu 0.722 total
(Note: the times waffle around a bit, and I only ran a few tests, but c seems to be at least as fast as b.)
Your second program runs quickly because while not buffered, we can just issue one read call. The first program must issue a read call for each cin >>, whereas the third program can buffer (issue a read call every now and then).
Note that adding this line means you can't read from stdin using the C FILE * by that name, or call any library methods that would do so. In practice, this is likely to not be an issue.

Related

Strange Behavior in C++ Input/Output When Parsing Large Integer Input

I have the following piece of code:
#include <iostream>
using namespace std;
int main() {
// Number of inputs
int N;
scanf("%d", &N);
// Takes in input and simply outputs the step it's on
for (int i = 0; i < N; i++) {
int Temp;
scanf("%d", &Temp);
printf("%d ", i);
}
}
When taking in a large amount of integer input, C++ stops at a certain point in printing output, seemingly waiting for more input to come.
Given an input of 2049 1's, the program stops after printing 2048 integers (0 up to 2047), and does not print the final 2048 (the 2049th integer). 2048 looks suspicious, being a power of 2.
It seems to be the case that the larger the input values, the quicker the program decides to stop, and in this case after what looks like a random number of steps. For example, I gave it 991 integers (up to the ten thousands), and the program stopped outputting after iteration 724.
Note that I copied and pasted the numbers as a whole block, rather than typing and entering them one by one, but I doubt this plays a role. I also tried cin and cout, but they did not help.
Could someone please explain the reasons behind this phenomenon?
I have found the answer to my question. The reason behind the failure is indeed due to copying and pasting large chunks of input, as many have suggested, and I thank everyone for their help. There were no incorrect characters, though, and cause of this problem is instead the 4096 character limit posed by canonical mode.
In canonical mode, the terminal lets the user navigate the input, using arrow keys, backspace, etc. It sends the text to the processor only when there is a newline or the buffer is full. The size of this buffer being 4096 characters, it becomes clear why the code fails to parse more input than that, i.e. 2049 "1 "s is 4098 characters. One can switch to noncanonical mode, which allows larger input at the expense of not being able to navigate it, using stty -icanon. Entering stty icanon takes it back to canonical mode.
Practically speaking, entering the input with newlines separating the numbers seems like the easiest fix.
This source was quite helpful to me: http://blog.chaitanya.im/4096-limit.
This post on unix stack exchange is similar to my problem: https://unix.stackexchange.com/questions/131105/how-to-read-over-4k-input-without-new-lines-on-a-terminal.
My first thought was that you're reaching some sort of barrier on the input side...
The limit for the length of a command line is not [typically] imposed by the shell, but by the operating system.
Bash Command Line and Input Limit - SO
However, this is probably not the case.
First, focus on your data, make sure your data is what you think it is (no unexpected characters) and then try to debug your reads, make sure the values are making it into memory like you intend.
Try separating out your read and write into two loops, this might help your debugging a little easier depending on your skill level, but again, making sure something funky isn't going on with your reads. Suspicion is high with the reads on this one...
Here's a couple of cracks at it below... haven't tested. Hope this helps!
#include <iostream>
int main() {
int N;
std::cin >> N;
// Read N integers, storing to array
int* numbers = new int[N];
for (int i = 0; i < N; i++) {
std::cin >> numbers[i];
}
// Print
for (int i = 0; i < N; i++) {
std::cout << numbers[i] << " ";
}
std::cout << std::endl;
// Free the dynamically allocated memory
delete[] numbers;
return 0;
}
Okay... maybe a little more optimized...
#include <iostream>
int main() {
int N;
std::cin >> N;
// fixed-size on the stack
int numbers[N];
// cin.tie(nullptr) and ios::sync_with_stdio(false) might improve perf.
std::cin.tie(nullptr);
std::ios::sync_with_stdio(false);
// Read N integers, storing to array
for (int i = 0; i < N; i++) {
std::cin >> numbers[i];
}
// Print
for (int i = 0; i < N; i++) {
std::cout << numbers[i] << " ";
}
std::cout << std::endl;
return 0;
}

Google KickStart Round B Bus Routes Problem wrong answer?

I was practicing with google kickstart's round B bus route problem. I actually looked at their analysis and implemented their alternative answer.
I'll also paste the problem prompt below my code.
https://codingcompetitions.withgoogle.com/kickstart/round/000000000019ffc8/00000000002d83bf
And my solution passes the first test set but gets a wrong answer on the second test set. I have no idea what the second test set is, only that it's very big. I'm pretty confused as my solution follows the analysis of the problem, and is actually an implementation of the alternate provided solution to a T. I also have no idea how to figure out which test case could be giving a wrong answer, there seems to be so many possibilities!
I have no idea how to even debug such a vague answer. Maybe there are some edge cases I'm not considering?
#include <iostream> // includes cin to read from stdin and cout to write to stdout
#include <bits/stdc++.h>
using namespace std;
int main() {
int t, n, d;
cin >> t; // read t. cin knows that t is an int, so it reads it as such.
for (int i = 1; i <= t; ++i) {
cin >> n >> d; // read n and then m.
stack <int> bus;
for(int j=0; j<n; j++){
int x;
cin >> x;
bus.push(x);
}
while(!bus.empty()){
int b = bus.top();
bus.pop();
d = d - d%b;
}
cout << "Case #" << i << ": " << d << endl;
}
return 0;
}
****Here's a shortened version of the Problem Prompt ****
Problem
Bucket is planning to make a very long journey across the countryside by bus. Her journey consists of N bus routes, numbered from 1 to N in the order she must take them. The buses themselves are very fast, but do not run often. The i-th bus route only runs every Xi days.
More specifically, she can only take the i-th bus on day Xi, 2Xi, 3Xi and so on. Since the buses are very fast, she can take multiple buses on the same day.
Bucket must finish her journey by day D, but she would like to start the journey as late as possible. What is the latest day she could take the first bus, and still finish her journey by day D?
It is guaranteed that it is possible for Bucket to finish her journey by day D.
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each test case begins with a line containing the two integers N and D. Then, another line follows containing N integers, the i-th one is Xi.
Output
For each test case, output one line containing Case #x: y, where x is the test case number (starting from 1) and y is the latest day she could take the first bus, and still finish her journey by day D.
Limits
Time limit: 10 seconds per test set.
Memory limit: 1GB.
my guess is using int is not sufficient as D can be up to 10^12 in test set 2
Edited: I verified my guess. You will be able to solve the problem by fixing this bug. And I believe the use of long long is common for coding contests, i.e. take note of the input/output constraints every time.
#include <iostream>
#include <bits/stdc++.h>
using namespace std;
int main ()
{
int t;
long long int x,i,j,n,d;
cin >> t;
for ( i = 0; i < t; i++)
{
cin >> n >> d;
stack < long long int >route;
for (j = 0; j < n; j++)
{
cin >> x;
route.push (x);
}
while (!route.empty ())
{
long long int c = route.top ();
route.pop ();
d = d - d % c;
}
cout << "Case #" << i+1 << ": " << d << endl;
}
return 0;
}

Time Limit Exceeded in C++ Online Judge

I just sent the solution to a problem COJ written in C ++. The problem is this link: http://coj.uci.cu/24h/problem.xhtml?pid=1839
This is my solution:
#include<iostream>
using namespace std;
unsigned int t, n;
int main(){
cin >> t;
while(t > 0 && cin >> n){
cout<< ( n * 8 ) + 42 << endl;
t--;
}
return 0;
}
To this the judge online of COJ says: "Time Limit Exceeded". Can someone explain why?
Your could try to get rid of std::endl because it is slow. Instead, you use '\n'. The code is below:
#include<iostream>
using namespace std;
unsigned int t, n;
int main(){
cin >> t;
while(t > 0 && cin >> n){
cout<< ( n * 8 ) + 42 << '\n';
t--;
}
cout.flush();
return 0;
}
Or you could try printf() to fasten your I/O.
This algorithm grows linearly, it should not be a limit factor. I believe I/O is the limiting factor. If you could find a algorithm that is log N, please tell me.
Someone vote it down because they believe doing I/O with standard out(stdout) is fast. However, std::endl flush after every iteration, which makes it slow.
Another idea is to use recursion, and you hope the judge system has good optimization on tail recursion(it is possible.). It may get a little faster, too.
On your own computer, you could even try to profile it to find the limiting factor, and if it is I/O, try the way above.
If it is still not fast enough, get rid of object-oriented design, use C and write direct to stdout instead use printf.
Good luck!

Read a sequence of ints from cin and store them in a vector

This is what I have done to read integers with std::cin and store them in a vector:
int number;
vector<int>ivec;
while (cin>>number)
{
ivec.push_back(number);
}
for (auto v: ivec){
cout << v;
}
Then, I am stuck with the problem that how to stop entering integers and move to the next process of printing the vector out. Any pointer will be appreciated.
It depends on the terminal in use and the precise mechanism varies quite a lot but, conventionally, typing Ctrl+D (Linux) or Ctrl+Z (Windows) will result in an end-of-file "signal" being transmitted along the pipe, causing the EOF bit to be set on cin, and thus the next cin >> number attempt to fail.
That will break the loop.
Conveniently, the same will happen if you ran your executable with redirected input from a file. Which is kind of the point.
The code you posted reads numbers from cin as long as it succeeds. There are two ways to have it stop succeeding: If you enter something that is not a number, reading data still succeeds, but converting it to number fails. This puts cin into the bad state. It can be recovered from using the clear methode. The other way is making the reading of characters from cin fail (for example the end of a file that gets used as input). This puts cin into the failed state. Usually, recovering from a failed state is impossible.
To produce the you can no longer read state at end of file when entering data at the keyboard, operating system specific methods have to be used (likely Control-D or Control-Z). This is final for the invocation of your program.
If you need a way for the user to signal: "Please go on, but let me enter other stuff later", the most clean way is likely reading cin line-by-line and parse the input using strtol or a stringstream, and comparing for a magic stop-token (e.g. empty line, "end") to exit the loop.
You can do it the following way
#include <vector>
#include <iterator>
//...
vector<int> ivec( std::istream_iterator<int>( std::cin ),
std::istream_iterator<int>() );
for ( auto v: ivec ) std::cout << v << ' ';
std::cout << std::endl;
In Windows you have to press key combination Ctrl + z and in UNIX system Ctrl + d
Or you can introduce a sentinel. In this case the loop can look like
int number;
const int sentinel = SOME_VALUE;
for ( std::cin >> number && number != sentinel ) vec.push_back( number );
Please find a simple solution to your problem, let me know if you see any issue with this solution.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector<int> va;
int x;
while ( cin >> x ) {
va.push_back(x);
if ( cin.get() == '\n' ) break;
}
//Vector output
for ( int i = 0; i < va.size(); i++ ) {
cout << va[i] <<" ";
}
return 0;
}

I'm Having Trouble Skipping Certain Characters from an Input Text File (C++)

(I apologize that this is so low level compared to most of the questions I have seen on this website, but I have run out of ideas and I do not know who else to ask.)
I am working on a school project that requires me to read basketball statistics from a file named in06.txt. The file in06.txt looks exactly as follows:
5
P 17 24 9 31 28
R 4 5 1 10 7
A 9 2 3 6 8
S 3 4 0 5 4
I am required to read and store the first number, 5, into a variable called "games." From there, I must read the numbers from the second line and determine the high, the low, and the average. I must do the same thing for lines 3, 4, and 5. (FYI, the letters P, R, A, and S are there to indicate "Points," "Rebounds," "Assists," and "Steals.")
Since I only have been learning about programming for a few weeks, I do not want to overwhelm myself by jumping right into dealing with every aspect of the project. So, I am first working on determining the average from each line. My plan is to keep a running total of each line and then divide the running total by the number of games, which is 5 in this case.
This is my code:
#include <iostream>
#include <fstream>
#include <cstdlib>
using namespace std;
int main()
{
int games;
int points_high, points_low, points_total;
int rebounds_high, rebounds_low, rebounds_total;
int assists_high, assists_low, assists_total;
int steals_high, steals_low, steals_total;
double points_average, rebounds_average, assists_average, steals_average;
ifstream fin;
ofstream fout;
fin.open("in06.txt");
if( fin.fail() ) {
cout << "\nInput file opening failed.\n";
exit(1);
}
else
cout << "\nInput file was read successfully.\n";
int tempint1, tempint2, tempint3, tempint4;
char tempchar;
fin >> games;
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character P from the text file.
while( fin >> tempint1 ) {
points_total += tempint1;
}
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character R from the text file.
while( fin >> tempint2 ) {
rebounds_total += tempint2;
}
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character A from the text file.
while( fin >> tempint3 ) {
assists_total += tempint3;
}
fin.get(tempchar); // Takes the endl; from the text file.
fin.get(tempchar); // Takes the character S from the text file.
while( fin >> tempint4 ) {
steals_total += tempint4;
}
cout << "The total number of games is " << games << endl;
cout << "The value of total points is " << points_total << endl;
cout << "The value of total rebounds is " << rebounds_total << endl;
cout << "The value of total assists is " << assists_total << endl;
cout << "The value of total steals is " << steals_total << endl;
return 0;
}
And this is the (incorrect) output:
Input file was read successfully.
The total number of games is 5
The value of total points is 111
The value of total rebounds is 134522076
The value of total assists is 134515888
The value of total steals is 673677934
I have been reading about file input in my textbook for hours, hoping that I will find something that will indicate why my program is outputting the incorrect values. However, I have found nothing. I have also researched similar problems on this forum as well as other forums, but the solutions use methods that I have not yet learned about and thus, my teacher would not allow them in my project code. Some of the methods I saw were arrays and the getline function. We have not yet learned about either.
Note: My teacher does not want us to store every integer from the input file. He wants us to open the file a single time and store the number of games, and then use loops and if statements for determining the high, average, and low numbers from each line.
If anyone could help me out, I would GREATLY appreciate it!
Thanks!
You have all these variables declared:
int games;
int points_high, points_low, points_total;
int rebounds_high, rebounds_low, rebounds_total;
int assists_high, assists_low, assists_total;
int steals_high, steals_low, steals_total;
double points_average, rebounds_average, assists_average, steals_average;
And then you increment them:
points_total += tempint1;
Those variables were never initialzed to a known value (0), so they have garbage in them. You need to initialize them.
Besides what OldProgrammer said, you've approached the reading of integers incorrectly. A loop like this
while( fin >> tempint2 ) {
rebounds_total += tempint2;
}
will stop when an error occurs. That is, either it reaches EOF or the extraction encounters data that cannot be formatted as an integer - or in other words, good() returns false. It does not, as you seem to think, stop reading at the end of a line. Once an error flag is set, all further extractions will fail until you clear the flags. In your case, a loop starts reading after P, extracts five intergers, but then it encounters the R from the next line and errors out.
Change this to a loop that reads a fixed number of integers or alternatively, read a whole line using std::getline into a std::string, put it into a std::stringstream and read from there.
In any case, learn to write robust code. Check for success of extractions and count how many elements you get.
An example of a loop that reads at most 5 integers:
int i;
int counter = 0;
while (counter < 5 && file >> i) {
++counter;
// do something with i
}
if (counter < 5) {
// hm, got less than 5 ints...
}