Parsing columns into arrays, while discriminating whats in the rows - c++

I'm trying to parse a text file that is outputted like the example below, my example has limited entries but my actual one has over 15000 lines, so i can't read these in individually:
ID IC TIME
15:23:43.867 /g/mydata/dataoutputfile.txt identifier
0003 1233 abcd
0043 eb54 abf3
000f 0bb4 ac24
000a a325 ac75
0023 0043 ac91
15:23:44.000 /g/mydata/dataoutputfile.txt identifier
0003 1233 abcd
0043 eb54 abf3
000f 0bb4 ac24
000a a325 ac75
0023 0043 ac91
Is kind of the output I have. The time column resets every so often.
What I am doing now is making 2 additional columns in addition to the 3 i have in my example. The first column is the conversion of the ID column, into a translation into an understandable message. The second additional column will calculate the difference between each time code, except when the time code resets.
My logic is, is to read each column into an array so I can perform the necessary translations and operations.
I am focusing on getting the timecode differential first, as I think getting the translation will be a bit simpler.
The problem I'm having is getting the entries read into their matrices:
my code looks a bit like this:
while(readOK && getline(myfile,line))
{
stringstream ss(line);
string ident,IC,timehex,time,filelocation;
string junk1,junk2;
int ID[count];
int timecode[count2];
int idx=0;
if(line.find("ID") !=string::npos)
{
readOK=ss>>ident>>IC>>timehex;
myfile2<<ident<<"\t\t"<<IC<<"\t\t"<<timehex<<"\t\t"<<"ID Decoded"<<"\t\t"<<"DT"<<endl;
myfile3<<"headers read"<<endl
}
else if(line.find("identifier") != string::npos)
{
readOK=ss>>time>>filelocation;
myfile3<<"time and location read";
myfile2<<time<<"\t\t"<<filelocation<<endl;
}
else //this is for the hex code lines
{
readOK=ss>>hex>>ID[idx]>>IC>>timecode[idx];
if (readOK)
{
myfile2<<setw(4)<<setfill('0')<<hex<<ID[1000]<<"\t\t"<<IC<<"\t\t"<<timecode[1000]<<endl;
myfile3<<"success reading info into arrays"<<endl;
}
else
myfile3<<"error reading hex codes"<<endl;
}
idx++;
}
Although this code doesn't work correctly. I can't just read in every line quite the same because of the intervening time and file location entries that are inserted to help keep track of when I am looking at in my code.
My gut is telling me that I'm calling the matrix entries too early and they haven't been filled yet, because if I cout number 1000, I get a 0 (i have well over 15000 lines in my input file and I have the boundaries of my arrays set dynamically in another part of my program).
I can't seem to figure out how to get the entries assigned correctly as I am having some inheritance issues with the count variable resetting to 0 every time through the loop.

Define int idx outside of the scope of the while loop (before the while). As it is now, each time through the loop it will be reset.

Related

Read a file in by columns in C++

I have a file that contains numbers:
23 899 234 12
12 366 100 14
10 256 500 23
14 888 564 30
How can I read this file by column using C++? I searched YouTube but they only read file by row. If I have to find the highest value from the first column, I need to read it by columns right?
Try initialising a variable called column =0
Now when you access row iterate through column using a for loop until end of column which can be found out by no of lines in a file, then implement the operation what you are willing to do, now increase column no for going to next column.
And you can get value in form of n*vectors
Where n is no of rows
Dimension of vector is length of column
Files are storages with sequential access, so , generally you have to read everything. Essentially it's like reading from a tape. And format of your file doesn't offer any shortcuts.
But, you can fast-forward and rewind along file, using seek() if it's a file on permanent storage. It's not effective and you have to know position where to go. If your records are of same size, you can advance by fixed amount of bytes.
That is usually done with binary formats and those formats are designed to have some kind directory or other auxiliary data to help searching for proper position..
Read each line and split it by space and store the first value as highest value. repeat the steps for all rows until you reach end of file while comparing the first value again with already stored value as highest.
I don't have C++ compiler available with myself to test. it should work for you conceptually.
#include <fstream>
#include <string>
#include <iostream>
#include<sstream>
using namespace std;
int main() {
ifstream inFile("Read.txt");
string line;
int greatest = 0;
while (getline(inFile, line)) {
stringstream ss(line);
string FirstColumn;
while (ss >> FirstColumn) {
if (stoi(FirstColumn) > greatest)
greatest = stoi(FirstColumn);
}
cout << greatest << endl;
}
}

How can I read from the same input file multiple times from different points within the file using sentinel values (-1) in c++?

I have a task where I have to read different sections of an input file(.txt) of integers in c++. The file contains an unknown number of positive integers, each separated by white-space with several sentinel values of -1 placed randomly in the list to "break-up" the list into sections and another -1 at the end of the file.
Here is a sample of my input file(.txt):
3 54 35 4 9 16 -1 14 57 32 4 6 8 41 2 -1 5 6 54 21 3 -1
Here is what I've attempted so far:
int data[20],
index = 0;
ifstream fin;
fin.open("data_file.txt");
while (index < 20 && data[index] != -1 && fin >> data[index])
{
cout << data[index] << endl;
index++;
}
I can't get this to read past the first SV even if I repeat this while loop. It always just starts at the beginning of the file.
How do I read again STARTING AFTER the first SV to the second SV? The only methods I know involve reading a file from beginning to end. How do I read seperate sections?
Thanks in advance for any help,
Cheers
It sounds like you just want to group information from the file. I will not provide code since you didn't, but I may help you with the logic:
Create a file object, 2d vector, and a string
Read from the file object to the string
if the value is equal to "-1", then add a new row. Else, add a new column
The result will be a 2d vector with the rows being each group, and the columns being each positive number in that group.

Reading from a file with multiple columns of integers and putting them into arrays

I am creating a command-line minesweeper game which has a save and continue capability. My code generates a file called "save.txt" which stores the position of the mines and the cells that the player has opened. It is separated into two columns delimited by a space where the left column represents the row of the cell and the right column represents the column of the cell in the matrix generated by my code. The following is the contents of save.txt after a sample run:
3 7
3 9
5 7
6 7
8 4
Mine end
2 9
1 10
3 5
1 1
Cell open end
You may have noticed Mine end and Cell open end. These two basically separate the numbers into two groups where the first one is for the position of the mines and the latter is for the position of the cells opened by the player. I have created a code which generates an array for each column provided that the text file contains integers:
int arrayRow[9];
int arrayCol[9];
ifstream infile("save.txt");
int a, b;
while(infile >> a >> b){
for(int i = 0; i < 9; i++){
arrayRow[i] = a;
arrayCol[i] = b;
}
}
As you can see, this won't quite work with my text file since it contains non-integer text. Basically, I want to create four arrays: mineRow, mineCol, openedRow, and openedCol as per described by the first paragraph.
Aside from parsing the string yourself and doing string operations, you can probably redefine the file format to have a header. Then you can parse the once and keep everything in numbers. Namely:
Let the Header be the first two rows
Row 1 = mineRowLen mineColLen
Row 2 = openedRowLen openedColLen
Row 3...N = data
save.txt:
40 30
20 10
// rest of the entries
Then you just read 40 for the mineRow, 30 for mineCol, 20 for openedRow, 10 for openedCol since you know their lengths. This will be potentially harder to debug, but would allow you to hide the save state better to disallow easy modification of it.
You can read the file line by line.
If the line matches "Mine end" or "Cell open end", continue;
Else, split the line by space (" "), and fill the array.

Reading integers from a file with mixed integers, letters, and spaces C++

This is a sort of self-imposed extra credit problem I'm adding to my current programming assignment which I finished a week early. The assignment involved reading in integers from a file with multiple integers per line, each separated by a space. This was achieved easily using while(inFile >> val) .
The challenge I put myself up to was to try and read integers from a file of mixed numbers and letters, pulling out all contiguous digits as separate integers composed of those digits. For examples if I was reading in the following line from a text file:
12f 356 48 r56 fs6879 57g 132e efw ddf312 323f
The values that would be read in (and stored) would be
12f 356 48 r56 fs6879 57g 132e efw ddf312 323f
or
12, 356, 48, 56, 6879, 57, 132, 312, and 323
I've spent all afternoon digging through cplusplus.com and reading cover to cover the specifics of get, getline, cin etc. and I am unable to find an elegant solution for this. Every method I can deduce involves exhaustive reading in and storing of each character from the entire file into a container of some sort and then going through one element at a time and pulling out each digit.
My question is if there is a way to do this during the process of reading them in from a file; ie does the functionality of get, getline, cin and company support that complex of an operation?
Read one character at a time and inspect it. Have a variable that maintains the number currently being read, and a flag telling you if you are in the middle of processing a number.
If the current character is a digit then multiple the current number by 10 and add the digit to the number (and set the "processing a number" flag).
If the current character isn't a digit and you were in the middle of processing a number, you have reached the end of the number and should add it to your output.
Here is a simple such implementation:
std::vector<int> read_integers(std::istream & input)
{
std::vector<int> numbers;
int number = 0;
bool have_number = false;
char c;
// Loop until reading fails.
while (input.get(c)) {
if (c >= '0' && c <= '9') {
// We have a digit.
have_number = true;
// Add the digit to the right of our number. (No overflow check here!)
number = number * 10 + (c - '0');
} else if (have_number) {
// It wasn't a digit and we started on a number, so we hit the end of it.
numbers.push_back(number);
have_number = false;
number = 0;
}
}
// Make sure if we ended with a number that we return it, too.
if (have_number) { numbers.push_back(number); }
return numbers;
}
(See a live demo.)
Now you can do something like this to read all integers from standard input:
std::vector<int> numbers = read_integers(std::cin);
This will work equally well with an std::ifstream.
You might consider making the function a template where the argument specifies the numeric type to use -- this will allow you to (for example) switch to long long int without altering the function, if you know the file is going to contain large numbers that don't fit inside of an int.

C++ check if the last thing read from file was a number

this is quite a primitive problem, so I guess the solution shouldn't be hard, but I didn't find a way how to do it simply, neither have I summarized it to actually find it in the internet.So going to the question, I have a file of information like this:
1988 Godfather 3 33 42
1991 Dance with Wolves 3 35 43
1992 Silence of the lambs 3 33 44
And I have a requirement to put all the information in a data structure, so lets say it will be int year, string name and three more int types for numbers. But how do I know if the next thing I read is a number or not? I never know how long is the word.Thank you in advance for anyone who took their time with such a primitive problem. :)
EDIT: Don't consider movies with numbers in their title.
You're going to have some major issues when you go to try to parse other movies, like, Free Willy 2.
You might try instead to treat it as a std::stringstream and rely on the last three chunks being the data you're looking for rather than generalizing with a Regular Expression.
your best bet would be to use C++ regex
That would give you a more fine grained control over what you want to parse.
examples:
year -> \d{4}
word -> \w+
number->\d+
If you do not have control over the file format, you may want to do something along these lines (pseudo-process):
1) read in the line from the file
2) reverse the order of the "words" in the file
3) read in the 3 ints first
4) read in the rest of the stream as a string
4) reverse the "words" in the new string
5) read in the year
6) the remainder will be the movie title
Read every field as a string and then convert the appropriate string to integers.
1)initially
1983
GodFather
3
33
45
are all strings and stored in a vector of strings (vector<string>).
2)Then 1983(1st string is converted to integer using atoi) and last three strings are also converted to integers. Rest of the strings constitute the movie_name
Following code has been written under the assumption that input file has already been validated for the format.
// open the input file for reading
ifstream ifile(argv[1]);
string input_str;
//Read each line
while(getline(ifile,input_str)) {
stringstream sstr(input_str);
vector<string> strs;
string str;
while(sstr>>str)
strs.push_back(str);
//use the vector of strings to initialize the variables
// year, movie name and last three integers
unsigned int num_of_strs = strs.size();
//first string is year
int year = atoi(strs[0].c_str());
//last three strings are numbers
int one_num = atoi(strs[num_of_strs-3].c_str());
int two_num = atoi(strs[num_of_strs-2].c_str());
int three_num = atoi(strs[num_of_strs-1].c_str());
//rest correspond to movie name
string movie_name("");
//append the strings to form the movie_name
for(unsigned int i=1;i<num_of_strs-4;i++)
movie_name+=(strs[i]+string(" "));
movie_name+=strs[i];
IMHO Changing delimiters in the file from space to some other character like , or ; or : , will simplify the parsing significantly.
For example , if later on the data specifications change and instead of only last three , either last three or last four can be integers then the code above will need major refactoring.