Problem reading a formatted text file in C++

Officially my first post. I'm sure the Stack is full of answers, but the problem that I need help with is a little bit specific. So here goes nothing...
The Task:
I'm doing a small school project and in one part of my program I need to read the temperature measurements at different locations, all from a single formatted text file. The data inside the file is written as follows:
23/5/2016
Location 1
-7,12,-16,20,18,13,6
9/11/2014
Location 2
−1,3,6,10,8
9/11/2014
Location 3
−5,−2,0,3,1,2,−1,−4
The first row represents the date, the second row the location, and the third row all the measurements that were taken on that day (degrees Celsius).
The code that I wrote for this part of the program looks something like this:
tok.seekg(0, std::ios::beg);
int i = 0;
double element;
char sign = ',';
while (!tok.eof()) {
    vector_measurements.resize(vector_measurements.size() + 1);
    tok >> vector_measurements.at(i).day >> sign >> vector_measurements.at(i).month >> sign >> vector_measurements.at(i).year >> std::ws;
    std::getline(tok, vector_measurements.at(i).location);
    sign = ',';
    while (tok && sign == ',') {
        tok >> element;
        vector_measurements.at(i).measurements.push_back(element);
        sign = tok.get();
    }
    if (!tok.eof() && !tok) {
        tok.clear();
        break;
    }
    vector_measurements.at(i).SetAverage();
    i++;
}
The code that I'm presenting is linked to a class:
struct Data {
    std::string location;
    std::vector<int> measurements;
    int day, month, year;
    double average = 0;
    void SetAverage();
    int GetMinimalTemperature();
    int GetMaximalTemperature();
};
I've already checked and confirmed that the file exists and the stream is opened in the correct mode without any errors; all class methods working as intended. But here's the problem. Later on, after the data is sorted (the part of data that has been successfully read), it fails to correctly print the data on the screen. I get something like:
Location 2
Date: 9/11/2014
Minimal temperature: 0
Maximal temperature: 0
Average temperature: 0
Location 1
Date: 23/5/2016
Minimal temperature: -16
Maximal temperature: 20
Average temperature: 6.57143
but I expect:
Location 3
----------
Date: 9/11/2014
Minimal temperature: -5
Maximal temperature: 3
Average temperature: -0.75
Location 2
----------
Date: 9/11/2014
Minimal temperature: -1
Maximal temperature: 10
Average temperature: 5.20
Location 1
----------
Date: 23/5/2016
Minimal temperature: -16
Maximal temperature: 20
Average temperature: 6.57143
The Problem:
The order of the locations is good, since I'm sorting from the lowest to the highest average temperature. But no matter the number of locations, the first location is always correct, the second one only has zeros, and every other location isn't even printed on the screen.
What do I need to change in order for my program to read the data properly? Or am I just missing something? Forgive me for any spelling mistakes I made since English isn't my native language. Thank you all in advance, any help is appreciated!

The issue is that there is some garbage in your text file. I believe these are \0 characters, but I am not sure. They show up as ? characters in the Atom text editor.
You're quite lucky Stack Overflow didn't sanitize them; otherwise nobody would have been able to help you.
After I cleaned up the text file, your code worked. You also need to end the loop and drop the last item when the file ends. I did it like this; it's not optimal, but it works.
while (!tok.eof())
{
    vector_measurements.resize(vector_measurements.size() + 1);
    Data& currentItem = vector_measurements[i];
    tok >> currentItem.day >> sign >> currentItem.month >> sign >> currentItem.year >> std::ws;
    // If the file ends, the data is invalid and the last item can be thrown away
    if (tok.eof())
    {
        vector_measurements.pop_back();
        break;
    }
    std::getline(tok, currentItem.location);
    sign = ',';
    while (tok && sign == ',')
    {
        tok >> element;
        currentItem.measurements.push_back(element);
        sign = tok.get();
    }
    if (!tok.eof() && !tok)
    {
        tok.clear();
        break;
    }
    currentItem.SetAverage();
    i++;
}
Please inspect your file with a hex editor and observe the weird characters, then figure out how to get rid of them.

Related

Time limit exceeded on test 10 code forces

Hello, I am a beginner in programming and am currently on the array lessons. I only know the very basics, such as if conditions, loops and data types, and I ran into trouble when trying to solve this problem.
Problem Description
When Serezha was three years old, he was given a set of cards with letters for his birthday. They were arranged into words in the way which formed the boy's mother favorite number in binary notation. Serezha started playing with them immediately and shuffled them because he wasn't yet able to read. His father decided to rearrange them. Help him restore the original number, on condition that it was the maximum possible one.
Input Specification
The first line contains a single integer n (1 ≤ n ≤ 10^5), the length of the string. The second line contains a string consisting of English lowercase letters: 'z', 'e', 'r', 'o' and 'n'.
It is guaranteed that it is possible to rearrange the letters in such a way that they form a sequence of words, each being either "zero" which corresponds to the digit 0 or "one" which corresponds to the digit 1.
Output Specification
Print the maximum possible number in binary notation. Print binary digits separated by a space. The leading zeroes are allowed.
Sample input:
4
ezor
Output:
0
Sample Input:
10
nznooeeoer
Output:
1 1 0
I got "Time limit exceeded" on test 10 on Codeforces, and this is my code:
#include <iostream>
using namespace std;
int main()
{
    int n;
    char arr[10000];
    cin >> n;
    for (int i = 0; i < n; i++) {
        cin >> arr[i];
    }
    for (int i = 0; i < n; i++) {
        if (arr[i] == 'n') {
            cout << "1"
                 << " ";
        }
    }
    for (int i = 0; i < n; i++) {
        if (arr[i] == 'z') {
            cout << "0"
                 << " ";
        }
    }
}

Your problem is a buffer overrun. You put an awful 10K array on the stack, but the problem description says you can have up to 100K characters.
After your array fills up, you start overwriting the stack, including the variable n. This makes you try to read too many characters. When your program gets to the end of the input, it waits forever for more.
Instead of putting an even more awful 100K array on the stack, just count the number of z's and n's as you're reading the input, and don't bother storing the string at all.

According to the compromise (applicable to homework and challenge questions) described here
How do I ask and answer homework questions?
I will hint, without giving a code solution.
In order to fix TLEs you need to be more efficient.
In this case I'd start by getting rid of one of the three loops and of all of the array accesses.
You only need to count two things during input and one output loop.

how to ignore n integers from input

I am trying to read the last integer from an input such as:
100 121 13 ... 7 11 81
I'm only interested in the last integer and hence want to ignore all previous integers.
I thought of using cin.ignore, but that won't work here because the integers have an unknown number of digits (100 has 3 digits, while 13 has 2, and so on).
I can input integer by integer using a loop and do nothing with them. Is there a better way?

It all depends on the use case that you have.
Reading an unspecified number of integers from std::cin is not as easy as it may seem because, in contrast to reading from a file, you will not normally get an EOF condition. If you were reading from a file stream, it would be very simple.
int value{};
while (fileStream >> value)
    ;
If you are using std::cin, you could try pressing CTRL-D or CTRL-Z or whatever works on your terminal to produce an EOF (End Of File) condition. But the usual approach is to use std::getline to read a complete line until the user presses enter, then put this line into a std::istringstream and extract from there.
In that respect, one of the other answers given below is not that good.
So, next solution:
std::string line{};
std::getline(std::cin, line);
std::istringstream iss{line};
int value{};
while (iss >> value)
    ;
You were asking
Is there a better way?
That also depends a little. If you are just reading a few integers, then please go with the above approach. If you had very many values, you might waste time unnecessarily converting many substrings to integers.
In that case it would be better to first read the complete string, then use rfind to find the last space in the string and use std::stoi to convert the last substring to an integer.
Caveat: In this case you must be sure (or check with more lines of code) that there is no whitespace at the end and that the last substring really is a number. That is a lot of string/character fiddling, which can most probably be avoided.
So, I would recommend the getline-stringstream approach.
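For completeness, here is roughly what the string-based variant with its extra checks could look like. This is a sketch, not a recommendation: lastInteger is a hypothetical helper, and it uses find_last_not_of / find_last_of instead of a plain rfind(' ') so that trailing whitespace and tabs are handled too.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: stores the last whitespace-separated integer of `line`
// in `out`. Returns false if the line is blank or the last token is not an int.
bool lastInteger(const std::string& line, int& out) {
    const char* const ws = " \t\r\n";

    // Position of the last non-whitespace character (skips trailing blanks).
    const std::size_t end = line.find_last_not_of(ws);
    if (end == std::string::npos) {
        return false;  // empty or whitespace-only line
    }

    // The last token starts right after the preceding whitespace, if any.
    const std::size_t sep = line.find_last_of(ws, end);
    const std::size_t begin = (sep == std::string::npos) ? 0 : sep + 1;
    const std::string token = line.substr(begin, end - begin + 1);

    try {
        std::size_t used = 0;
        const int value = std::stoi(token, &used);
        if (used != token.size()) {
            return false;  // trailing junk such as "81abc"
        }
        out = value;
        return true;
    } catch (const std::logic_error&) {
        return false;  // std::invalid_argument or std::out_of_range
    }
}
```

Only one substring is converted regardless of how many numbers precede it, which is the saving described above, at the cost of the extra checks.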

You can try this simple solution, which ignores all of the values except the last one:
int count = 0;
int values, lastValue; // lastValue used for future use
std::cout << "Enter your input: ";
while (std::cin >> values) {
    lastValue = values; // must be used, otherwise values = 0 when loop ends
    count++;
}
std::cout << lastValue; // prints
Note: A non-numeric character is required to stop the while loop, so it's best to put a . at the end of the input.
Output example
Enter your input: 3 2 4 5 6 7.
7

Try this:
for( int i=0; i<nums_to_ignore; i++) {
    int ignored;
    std::cin >> ignored;
}

Comparing lines of data in the same file

I’m currently working on a project for my intro CS class. We are still pretty new to C++ and working with rudimentary concepts like while and for loops as well as file streams. The below problem is supposed to be resolved without resort to advanced features like arrays, vectors or functions.
Basically, I take a text file (FILE ONE) that has student and course data and create a new file. File one (where I’m inputting the data from) has 6k lines. Here’s an example below:
20424297 1139 CSCI 16000 W -1 3.00 RNL
20424297 1142 PSYCH 18000 W -1 3.00 RLA
20424297 1142 PSYCH 22000 W -1 3.00 RLA
20608974 1082 ENGL 12000 A- 3.7 3.00 RECR
20608974 1082 HIST 15200 B+ 3.3 3.00 FUSR
20608974 1082 PHILO 10100 A+ 4 3.00 FISR
See that very first column? Each unique set of numbers represents a student (also known as an eiD). File one is a giant list of every class a student took, and includes the subject, courses and grades they got.
The point of this project is to create a new text file that summarizes the GPAs of each student. That part I’m fairly confident I could figure out (taking cumulative GPA data). What confuses me is how I’m supposed to compare lines within the file to one another.
My professor did make things easy by having all the data grouped together by student. That lightens my load a little bit. I basically have to go through this file, line by line, and compare it with the next line to see if it has the same student ID number.
My first inclination was to create a series of nested while loops. The first loop would be active as long as data was being read. My next inclination was to repeat this in another loop. I would then create variables to hold the previous line’s student ID number and the current line’s student ID number, creating conditions that would be active depending on whether or not they were the same:
while (sdStream2 >> eiD_2 >> semester_2 >> subject_2 >> coursenumSD_2 >> grade_2 >> gpa_2 >> courseHours2 >> code_2) // This loop will keep running until there's no data left
{
    string eiD_base = eiD_2; // eiD_base was the variable I made to hold the "previous" student's ID, for comparison to the next line
    while (sdStream2 >> eiD_2 >> semester_2 >> subject_2 >> coursenumSD_2 >> grade_2 >> gpa_2 >> courseHours2 >> code_2) // This loop unfortunately reads the entire file, defeating its intent
    {
        string eiD_temp = eiD_2; // eiD_temp was the variable I made to hold the current student ID, for comparison
        if (eiD_base == eiD_temp)
        {
            outputStream2 << "Same line :( " << endl;
        }
        else
        {
            outputStream2 << eiD_2 << endl; // this is where you post the student data from the previous line!
        }
    }
}
After compiling and running the above, I came to the realization that this approach would not work because the second, nested loop, would run through every line in the FILE ONE without touching the first loop. I eventually figured out another method that used a counter instead:
// NOTE: The logic of the below code is as follows:
// Create a counter to note what the first student ID is.
// Store that value in eiD_Base when counter = 0. Increment counter.
// Now change eiD_Base everytime you find a line where eiD_temp
// differs from eiD_base.
string eiD_base;
string eiD_temp;
int counter = 0; // counter to help figure out what the first student ID was
while (sdStream2 >> eiD_2 >> semester_2 >> subject_2 >> coursenumSD_2 >> grade_2 >> gpa_2 >> courseHours2 >> code_2)
{
    eiD_temp = eiD_2;
    if (counter == 0)
    {
        eiD_base = eiD_2; // basically, set the first student ID to eiD_base when counter is 0. This counter is incremented only once.
        counter++;
    }
    if (eiD_base == eiD_temp)
    {
        outputStream2 << "Same ID: " << eiD_2 << endl;
        // NOTE: This is my first instinct as to where the code for calculating GPAs should go.
        // The problem is that if the code is here, how do I factor in GPA data
        // from a line that doesn't meet (eiD_base == eiD_temp)? I feel like that data would
        // be jettisoned from calculations.
    }
    else
    {
        outputStream2 << "Previous ID: " << eiD_base << " and this is what eiD is now is now: " << eiD_temp << endl; // This is my first instinct for
        eiD_base = eiD_2; // if eiD_base != eiD_temp, have eiD_base reset here.
    }
}
That seemed closer to what I needed. However, I noticed another issue. With this method, when the variables I created to note changes in student id (eiD_base & eiD_temp) are not equal on a line of data, it seems like that line is jettisoned. Given that I need to calculate a number of things like GPA data for each student, having a method that doesn’t allow to accumulate data for the first line of a different student isn’t a good solution.
I don't know if I should dispense with the counter method entirely (in which case I would welcome recommendations of how best to replace it) or if my counter method is workable by placing the code for calculating GPAs more strategically. Any insight or help would be most welcome!

My answer style was my attempt of following: https://meta.stackexchange.com/questions/10811/how-do-i-ask-and-answer-homework-questions
One question you have is whether you should dispense with the counter method entirely (in which case you would welcome recommendations on how best to replace it) or whether your counter method is workable if the code for calculating GPAs is placed more strategically.
For the former, LiMuBei mentioned the method already. When you calculate more than one GPA (major GPA, GPA for just comp sci classes), you sum up each GPA in its own variable.
For the latter, consider the unknowns that distinguish the scenarios handled by each if/while statement. (counter == 0) is the scenario for the first line. (eiD_base == eiD_temp) covers the first line and also the scenario where there are at least 2 lines and the current line has the same ID as the previous line. (eiD_base != eiD_temp) is the scenario where there are at least 2 lines and the current line has a different ID from the previous line. So the unknowns are {1 line, at least 2 lines} and {same ID, different ID}. To handle {1 line}, you have to modify both (counter == 0) and (eiD_base == eiD_temp). In (counter == 0), you put the code that applies only to the first (and possibly only) line. In (eiD_base == eiD_temp), which applies both to {1 line} and to {at least 2 lines, same ID}, the code has to work for both scenarios.
For the complete solution, you declare variables before the while loop, aggregate into them in (eiD_base == eiD_temp), print the GPA values of the previous ID and set the variables for the first line of a new student in (eiD_base != eiD_temp), and print the GPA values of the last ID after the while loop.
double csci_Grade_Point;
// more variables for doing the calculation
while (sdStream2 >> eiD_2 >> semester_2 >> subject_2 >> coursenumSD_2 >> grade_2 >> gpa_2 >> courseHours2 >> code_2) {
    eiD_temp = eiD_2;
    if (counter == 0)
    {
        eiD_base = eiD_2;
        counter++;
        csci_Grade_Point = 0.0;
        // more initialization of variables for doing the calculation
    }
    if (eiD_base == eiD_temp)
    {
        csci_Grade_Point = csci_Grade_Point + (gpa_2 * courseHours2);
        // more sum calculation, such as total csci credit hours
    }
    else
    {
        outputStream2 << "Previous ID: " << eiD_base << " and this is what eiD is now is now: " << eiD_temp << endl;
        eiD_base = eiD_2;
        // for the previous ID, calculate gpa for just comp sci classes
        // for the previous ID, calculate more gpa's
        // set the variable to include the first line of data of a new student
        csci_Grade_Point = (gpa_2 * courseHours2);
        // set more variables for doing the calculation
    }
}
// for the last ID, calculate gpa for just comp sci classes
// for the last ID, calculate more gpa's
Another question you have is about data calculation in (eiD_base == eiD_temp).
When a line doesn't meet (eiD_base == eiD_temp), the current line is different from the previous line. You factor in GPA data from the data you aggregate in (eiD_base == eiD_temp) and the data you set for the first line of a new student in (eiD_base != eiD_temp).
If the problem is hard to solve as a whole, you probably want to solve a simpler version first, for example with a file containing only 1 line and then one containing 2 lines.

End array input with a newline?

Not sure if the title is properly worded, but what I am trying to ask is how you would signify the end of input for an array using a newline. Take the following code for example. No matter how many numbers (more or less) you type during the input for score[6], it must take 6 before you can proceed. Is there a way to change it so that an array can store 6 or 100 variables, but you can decide how many of them actually contain values? The only way I can think of doing this is to somehow incorporate '\n', so that pressing enter once creates a newline and pressing enter again signifies that you don't want to set any more values. Or is something like this not possible?
#include <iostream>
using namespace std;
int main()
{
    int i,score[6],max;
    cout<<"Enter the scores:"<<endl;
    cin>>score[0];
    max = score[0];
    for(i = 1;i<6;i++)
    {
        cin>>score[i];
        if(score[i]>max)
            max = score[i];
    }
    return 0;
}

To detect "no input was given", you will need to read the input as an input line (string) rather than using cin >> x; no matter what the type of x is, cin >> x; will skip over "whitespace", such as newlines and spaces.
The trouble with reading the input as lines is that you then have to "parse" the input into numbers. You can use std::stringstream or similar to do this, but it's quite a bit of extra code compared to what you have now.
The typical way to solve this kind of problem, however, is to use a "sentinel" value: for example, if your input is always going to be greater than or equal to zero, you can use -1 as the sentinel. So you enter
1 2 3 4 5 -1
This keeps the amount of extra code relatively small: just check whether the input is -1, such as
while(cin >> score[i] && score[i] >= 0)
{
...
}
(This will also detect end-of-file, so you could end the input with CTRL-Z or CTRL-D as appropriate for your platform)
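For comparison, the line-based approach mentioned at the start of this answer could be sketched as follows. The helper name readScoresLine and its signature are assumptions for illustration; it reads one line, stores at most capacity integers from it, and returns how many it stored, so an empty line simply yields 0.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads a single line from `in` and stores at most
// `capacity` integers from it in `scores`. Returns the number stored.
std::size_t readScoresLine(std::istream& in, int scores[], std::size_t capacity) {
    std::string line;
    if (!std::getline(in, line)) {
        return 0;  // end of input
    }

    std::istringstream iss(line);
    std::size_t count = 0;
    int value = 0;
    // Stop at the end of the line, at the first non-number,
    // or when the array is full.
    while (count < capacity && iss >> value) {
        scores[count++] = value;
    }
    return count;
}
```

With this, the user types any number of scores followed by enter, and the caller loops over the returned count (with std::cin as the stream and an array of 6 or 100 elements) to find the maximum.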

Reading from a text file properly

I am reading strings from a line in a text file and for some reason the code will not read the whole text file. It reads to some random point and then stops and leaves out several words from a line or a few lines. Here is my code.
string total;
while(file >> word){
    if(total.size() <= 40){
        total += ' ' + word;
    }
    else{
        my_vector.push_back(total);
        total.clear();
    }
}
Here is an example of a file
The programme certifies that all nutritional supplements and/or ingredients that bear the Informed-Sport logo have been tested for banned substances by the world class sports anti-doping lab, LGC. Athletes choosing to use supplements can use the search function above to find products that have been through this rigorous certification process.
It reads until "through" and leaves out the last four words.
I expected the output to be the whole file. not just part of it.
This is how I printed the vector.
for(int x = 0; x< my_vector.size(); ++x){
    cout << my_vector[x];
}

You missed two things here:
First: when total.size() is not <= 40 (i.e. > 40), execution moves to the else part, where you just update my_vector but ignore the current data in word which you read from the file. You actually need to update total with it after total.clear().
Second: when your loop terminates, you ignore the data still held in total as well. You need to take it into account and push_back() it into the vector (if required, depending on your program logic).
So overall your code is going to look like this.
string total;
while(file >> word)
{
    if(total.size() <= 40)
    {
        total += ' ' + word;
    }
    else
    {
        my_vector.push_back(total);
        total.clear();
        total += ' ' + word;
    }
}
my_vector.push_back(total); // this step depends on your logic,
                            // i.e. on what you actually want to do

Your loop finishes when the end of file is read. However at this point you still have data in total. Add something like this after the loop:
if(!total.empty()) {
    my_vector.push_back(total);
}
to add the last bit to the vector.

There are two problems:
When 40 < total.size(), only total is pushed to my_vector but the current word is not. You should probably unconditionally append the word to total and then my_vector.push_back(total) if 40 < total.size().
When the loop terminates, you still need to push_back() the content of total, as it may not have reached a size of more than 40. That is, if total is non-empty after the loop terminates, you still need to append it to my_vector.
