Is there any way to reset the filein to the initial state? - c++

I am trying to input data from a text file in C++.
The text file has this format:
4 15
3 516
25 52 etc.
Each line contains two integers. I don't know the number of lines in the file in advance, so I can't allocate enough memory up front. This is what I came up with to work around that:
ifstream filein;
filein.open("text.txt",ios::in);
int count=0;
while (!filein.eof())
{
count++;
filein>>temporary;
}
count=count/2; // This is the number of lines in the text file.
My problem is that I can't figure out a way to reset filein to its initial state (to the beginning of the file, so I can actually input the data) other than closing the input stream and opening it again. Is there any other way to do that?

Rather than answer the question you asked, I'm going to answer the question you didn't ask, namely:
Q: How can I read in all the lines of the file if I don't know how many lines there are?
A: Use a std::vector<>.
If you want to read in all of the numbers, regardless of pairing:
// all code fragments untested. typos are possible
int i;
std::vector<int> all_of_the_values;
while(filein >> i)
all_of_the_values.push_back(i);
If you want to read in all of the numbers, putting alternating numbers into different data structures:
int i, j;
std::vector<int> first_values;
std::vector<int> second_values;
while(filein >> i >> j) {
first_values.push_back(i);
second_values.push_back(j);
}
If you want to read in all of the numbers, storing them in some sort of data structure:
int i, j;
struct S {int i; int j;};
std::vector<S> values;
while(filein >> i >> j) {
S s = {i, j};
values.push_back(s);
}
Finally, if you want to read the file a line at a time, keeping the first two numbers from each line, discarding the remainder of each line, and storing them in a user-defined data structure:
std::vector<MyClass> values;
std::string sline;
while(std::getline(filein, sline)) {
std::istringstream isline(sline);
int i, j;
if(isline >> i >> j) {
values.push_back(MyClass(i, j));
}
}
Aside: never use eof() or good() in a loop conditional. Doing so almost always produces buggy code, as it would have in your case. Instead prefer invoking the input function in the condition, as I have done above.
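To make the aside concrete, here is a minimal sketch comparing the two loop styles on an in-memory stream. The function names are made up for illustration, and std::istringstream stands in for the file; the point is only that eof() becomes true after a read has already failed, so the eof-style loop runs one extra time whenever the input ends with a newline.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Counts iterations the way the question does: test eof() first, then extract.
// The final pass is counted even though its extraction fails.
std::size_t count_with_eof_loop(const std::string& text) {
    std::istringstream in(text);
    std::size_t count = 0;
    int value;
    while (!in.eof()) {
        ++count;
        in >> value;  // on the last pass this fails, but it was already counted
    }
    return count;
}

// Counts only the extractions that actually succeeded.
std::size_t count_with_read_in_condition(const std::string& text) {
    std::istringstream in(text);
    std::size_t count = 0;
    int value;
    while (in >> value) {
        ++count;
    }
    return count;
}
```

For the input "4 15\n3 516\n" the first function returns 5 and the second returns 4. The question's count/2 hides the off-by-one through integer division, but only by luck.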

I think @Robᵩ has pretty much the right idea: instead of reading through all the data just to count the number of lines, then reading through the whole file again to actually read the data, use something like std::vector (or std::deque) that will expand as needed as you read the data.
In a typical case, however, the two numbers on a line are going to be related to each other, and you typically want to store them in a way that shows that association directly. For example, they might be the X and Y coordinates of points, in which case you want to read points:
struct point { // a struct, so x and y are public and operator>> can access them
int x, y;
};
std::istream &operator>>(std::istream &is, point &p) {
return is >> p.x >> p.y;
}
std::ifstream in("myfile.txt");
// create the vector from the data in the file:
std::vector<point> points((std::istream_iterator<point>(in)),
std::istream_iterator<point>());
On a slightly different note: even if you decide you want to use an explicit loop, please don't use while (!whatever.eof()) to do it -- that's pretty much guaranteed to fail. You want to check that reading data succeeded, so (for example) using the point class above, you could use something like:
point p;
while (in >> p)
points.push_back(p);

The function is: filein.seekg (0, ios::beg);
Here is a Reference
You should also call filein.clear() before seeking, to reset the eof and fail bits left behind by the counting loop; a seek on a stream that is still in a failed state has no effect.
And, of course, if you want the best method for what you are ultimately trying to do, Robᵩ's answer is much better, albeit more involved.


C++ Read in file element by element, but executing functions every line

I have a file that I need to read in. Each line of the file is exceedingly long, so I'd rather not read each line to a temporary string and then manipulate those strings (unless this isn't actually inefficient - I could be wrong). Each line of the file contains a string of triplets - two numbers and a complex number, separated by a colon (as opposed to a comma, which is used in the complex number). My current code goes something like this:
while (states.eof() == 0)
{
std::istringstream complexString;
getline(states, tmp_str, ':');
tmp_triplet.row() = stoi(tmp_str);
getline(states, tmp_str, ':');
tmp_triplet.col() = stoi(tmp_str);
getline(states, tmp_str, ':');
complexString.str (tmp_str);
complexString >> tmp_triplet.value();
// Then something useful done with the triplet before moving onto the next one
}
tmp_triplet is a variable that stores these three numbers. I want some way to run a function every line (specifically, the triplets in every line are pushed into a vector, and each line in the file denotes a different vector). I'm sure there's an easy way to go about this, but I just want a way to check whether the end of the line has been reached, and to run a function when this is the case.
When trying to plan stuff out, abstraction can be your best friend. If you break down what you want to do by abstract functionality, you can more easily decide what data types should be used and how different data types should be planned out, and often you can find some functions almost write themselves. And typically, your code will be more modular (almost by definition), which will make it easy to reuse, maintain, and adapt if future changes are needed.
For example, it sounds like you want to parse a file. So that should be a function.
To do that function, you want to read in the file lines then process the file lines. So you can make two functions, one for each of those actions, and just call the functions.
To read in file lines you just want to take a file stream, and return a collection of strings for each line.
To process file lines you want to take a collection of strings and for each one parse the string into a triplet value. So you can create a method that takes a string and breaks it into a triplet, and just use that method here.
To process a string you just need to take a string and assign the first part as the row, the second part as the column, and the third part as the value.
struct TripletValue
{
int Row;
int Col;
int Val;
};
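// Forward declarations, so the functions below can be defined top-down.
std::vector<std::string> ReadFileLines(std::istream& inputStream);
std::vector<TripletValue> GetValuesFromData(std::vector<std::string> data);
TripletValue ParseLine(std::string fileLine);
std::vector<TripletValue> ParseFile(std::istream& inputStream)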
{
std::vector<std::string> fileLines = ReadFileLines(inputStream);
std::vector<TripletValue> parsedValues = GetValuesFromData(fileLines);
return parsedValues;
}
std::vector<std::string> ReadFileLines(std::istream& inputStream)
{
std::vector<std::string> fileLines;
std::string fileLine;
// Test the read itself, not eof(), so a failed final read adds no bogus empty line.
while (getline(inputStream, fileLine))
{
fileLines.push_back(fileLine);
}
return fileLines;
}
std::vector<TripletValue> GetValuesFromData(std::vector<std::string> data)
{
std::vector<TripletValue> values;
for (std::size_t i = 0; i < data.size(); i++)
{
TripletValue parsedValue = ParseLine(data[i]);
values.push_back(parsedValue);
}
return values;
}
TripletValue ParseLine(std::string fileLine)
{
std::stringstream sstream;
sstream << fileLine;
TripletValue parsedValue;
std::string strValue;
sstream >> strValue;
parsedValue.Row = stoi(strValue);
sstream >> strValue;
parsedValue.Col = stoi(strValue);
sstream >> strValue;
parsedValue.Val = stoi(strValue);
return parsedValue;
}
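The ParseLine above splits on whitespace and stores an integer value, while the question describes colon-separated triplets that end in a complex number. Here is a hedged sketch of a per-line parser for that case. It assumes a line looks like 1:2:(3,4):5:6:(7.5,-1), with the complex number in the (re,im) form that operator>> for std::complex accepts; the Triplet struct and the function name are hypothetical, not the asker's types.

```cpp
#include <complex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the asker's triplet type.
struct Triplet {
    int row;
    int col;
    std::complex<double> value;
};

// Parses one line of colon-separated triplets, e.g. "1:2:(3,4):5:6:(7.5,-1)".
// Stops at the first field that does not convert.
std::vector<Triplet> parse_triplet_line(const std::string& line) {
    std::vector<Triplet> triplets;
    std::istringstream in(line);
    std::string row_str, col_str, value_str;
    while (std::getline(in, row_str, ':') &&
           std::getline(in, col_str, ':') &&
           std::getline(in, value_str, ':')) {
        std::istringstream row_in(row_str);
        std::istringstream col_in(col_str);
        std::istringstream value_in(value_str);
        Triplet t;
        if (!(row_in >> t.row) || !(col_in >> t.col) || !(value_in >> t.value)) {
            break;
        }
        triplets.push_back(t);
    }
    return triplets;
}
```

Calling this once per std::getline of the file gives one vector per line, which is the "run something at the end of each line" behaviour the question asks for.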

How does one correctly store data into an array struct with stringstream? [duplicate]

I was wondering how to store data from a CSV file into a structured array. I realize I need to use getline and such and so far I have come up with this code:
This is my struct:
struct csvData //creating a structure
{
string username; //a string called username
float gpa; //a float called gpa
int age; //an int called age
};
This is my data reader and the part that stores the data:
csvData arrayData[10];
string data;
ifstream infile; //creating object with ifstream
infile.open("datafile.csv"); //opening file
if (infile.is_open()) //error check
int i=0;
while(getline(infile, data));
{
stringstream ss(data);
ss >> arrayData[i].username;
ss >> arrayData[i].gpa;
ss >> arrayData[i].age;
i++;
}
Further, this is how I was attempting to print out the information:
for (int z = 0; z<10; z++)
{
cout<<arrayData[z].username<<arrayData[z].gpa<<arrayData[z].age<<endl;
}
However, when running this command, I get a cout of what seem to be random numbers:
1.83751e-0383 03 4.2039e-0453 1.8368e-0383 07011688
I assume this has to be the array running not storing the variables correctly and thus I am reading out random memory slots, however, I am unsure.
Lastly, here is the CSV file I am attempting to read.
username,gpa,age
Steven,3.2,20
Will,3.4,19
Ryan,3.6,19
Tom,3,19
There's nothing in your parsing code that actually attempts to parse the single line into the individual fields:
while(getline(infile, data));
{
This correctly reads a single line from the input file into the data string.
stringstream ss(data);
ss >> arrayData[i].username;
ss >> arrayData[i].gpa;
ss >> arrayData[i].age;
You need to try to explain to your rubber duck how this is supposed to take a single line of comma-separated values, like the one you showed in your question:
Steven,3.2,20
and separate that string into the individual values, by commas. There's nothing about the >> operator that will do this. operator>> separates input on whitespace, not commas. Your suspicions were correct: you were not parsing the input correctly.
This is a task that you have to do yourself. I am presuming that you would like, as a learning experience, or as a homework assignment, to do this yourself, manually. Well, then, do it yourself. You have a single line in data. Use any number of tools that C++ gives you: the std::string's find() method, or std::find() from <algorithm>, to find each comma in the data string, then extract each individual portion of the string that's between each comma. Then, you still need to convert the two numeric fields into the appropriate datatypes. And that's when you put each one of them into a std::istringstream, and use operator>> to convert them to numeric types.
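As a rough sketch of the manual approach, assuming exactly three comma-separated fields per line as in the sample file: the Record struct and parse_csv_line below are illustrative names, not part of the question's code.

```cpp
#include <sstream>
#include <string>

// Illustrative record type mirroring the three CSV columns.
struct Record {
    std::string username;
    float gpa;
    int age;
};

// Splits "name,gpa,age" at the two commas found with std::string::find,
// then converts the numeric fields with operator>>.
// Returns false if a comma is missing or a number does not convert.
bool parse_csv_line(const std::string& line, Record& out) {
    const std::string::size_type first = line.find(',');
    if (first == std::string::npos) return false;
    const std::string::size_type second = line.find(',', first + 1);
    if (second == std::string::npos) return false;

    out.username = line.substr(0, first);
    std::istringstream gpa_in(line.substr(first + 1, second - first - 1));
    std::istringstream age_in(line.substr(second + 1));
    if (!(gpa_in >> out.gpa)) return false;
    if (!(age_in >> out.age)) return false;
    return true;
}
```

The header line username,gpa,age is rejected by this function because "gpa" does not convert to a float, which is one simple way to skip it.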
But, having said all that, there's an alternative dirty trick, to solve this problem quickly. Recall that the original line in data contains
Steven,3.2,20
All you have to do is replace the commas with spaces, turning it into:
Steven 3.2 20
Replacing commas with spaces is trivial with std::replace(), or with a small loop. Then, you can stuff the result into a std::istringstream, and use operator>> to extract the individual whitespace-delimited values into the discrete variables, using the code that you've already written.
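A minimal sketch of that trick, with made-up names (Row, parse_by_replacing_commas). It assumes no field contains a space of its own; a username such as "Mary Ann" would be split in two.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Illustrative record type mirroring the three CSV columns.
struct Row {
    std::string username;
    float gpa;
    int age;
};

// Takes the line by value so the commas can be overwritten in a local copy.
bool parse_by_replacing_commas(std::string line, Row& out) {
    std::replace(line.begin(), line.end(), ',', ' ');
    std::istringstream in(line);
    if (in >> out.username >> out.gpa >> out.age) {
        return true;
    }
    return false;
}
```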
Just a small word of warning: if this was indeed your homework assignment, to write code to manually parse and extract comma-delimited values, it's not guaranteed that your instructor will give you the full grade for taking the dirty-trick approach...
Ton, nice try and nice complete question. Here is the answer:
1) You have a semicolon after the loop:
while(getline(infile, data));
delete it.
How did I figure that out easily? I compiled with all the warnings enabled, like this:
C02QT2UBFVH6-lm:~ gsamaras$ g++ -Wall main.cpp
main.cpp:24:33: warning: while loop has empty body [-Wempty-body]
while(getline(infile, data));
^
main.cpp:24:33: note: put the semicolon on a separate line to silence this warning
1 warning generated.
In fact, you should get that warning without -Wall as well, but get into the habit of using it; it will do you good! :)
2) Then, you read some elements, but not 10, so why do you print 10? Print as many as the ones you actually read, i.e. i.
When you try to print all 10 elements of your array, you print elements that are not initialized, since you didn't initialize your array of structs.
Moreover, the number of lines in datafile.csv was less than 10. So you started populating your array, but you stopped when the file didn't have more lines. As a result, some of the elements of your array (the last 5 elements, since the file has 5 lines including the header) remained uninitialized.
Printing uninitialized data causes Undefined Behavior; that's why you see garbage values.
3) Also this:
if (infile.is_open()) //error check
could be written like this:
if (!infile.is_open())
cerr << "Error Message by Mr. Tom\n";
Putting them all together:
WILL STILL NOT WORK, BECAUSE ss >> arrayData[i].username; eats the entire input line and the next two extractions fail, as Pete Becker said, but I leave it here, so that others won't make the same attempt!!!!!!!
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;
struct csvData //creating a structure
{
string username; //a string called username
float gpa; //a float called gpa
int age; //an int called age
};
int main()
{
csvData arrayData[10];
string data;
ifstream infile; //creating object with ifstream
infile.open("datafile.csv"); //opening file
if (!infile.is_open()) { cerr << "File is not opened..\n"; }
int i=0;
while(getline(infile, data))
{
stringstream ss(data);
ss >> arrayData[i].username;
ss >> arrayData[i].gpa;
ss >> arrayData[i].age;
i++;
}
for (int z = 0; z< i; z++)
{
cout<<arrayData[z].username<<arrayData[z].gpa<<arrayData[z].age<<endl;
}
return 0;
}
Output:
C02QT2UBFVH6-lm:~ gsamaras$ g++ -Wall main.cpp
C02QT2UBFVH6-lm:~ gsamaras$ ./a.out
username,gpa,age00
Steven,3.2,2000
Will,3.4,1900
Ryan,3.6,1900
Tom,3,1900
But wait a minute, so now it works, but why this:
while(getline(infile, data));
{
...
}
didn't?
Because putting a semicolon right after the loop condition is equivalent to this:
while(getline(infile, data))
{
;
}
because, as you probably already know, loops with only one statement as a body do not require curly brackets, and a lone semicolon is an empty statement.
And what happened to what I thought was the body of the loop (i.e. the part where you use std::stringstream)?
It got executed! But only once!
You see, a pair of curly brackets alone means something, it's an anonymous scope/block.
So this:
{
stringstream ss(data);
ss >> arrayData[i].username;
ss >> arrayData[i].gpa;
ss >> arrayData[i].age;
i++;
}
ran on its own, without being part of the while loop as you intended!
And why did it work?! Because you had declared i before the loop! ;)

How to read in a data file of unknown dimensions in C/C++

I have a data file which contains data in row/column form. I would like a way to read this data into a 2D array in C or C++ (whichever is easier), but I don't know how many rows or columns the file might have before I start reading it in.
At the top of the file is a commented line giving a series of numbers relating to what each column holds. Each row holds the data for each number at a point in time, so an example data file (a small one - the ones I'm using are much bigger!) could be like:
# 1 4 6 28
21.2 492.1 58201.5 586.2
182.4 1284.2 12059. 28195.2
.....
I am currently using Python to read in the data using numpy.loadtxt which conveniently splits the data in row/column form whatever the data array size, but this is getting quite slow. I want to be able to do this reliably in C or C++.
I can see some options:
1. Add a header tag with the dimensions from my extraction program
# 1 4 6 28
# xdim, ydim
21.2 492.1 58201.5 586.2
182.4 1284.2 12059. 28195.2
.....
but this requires rewriting my extraction programs and programs which use the extracted data, which is quite intensive.
2. Store the data in a database file, e.g. MySQL, SQLite etc. Then the data could be extracted on demand. This might be a requirement further along in the development process so it might be good to look into anyway.
3. Use Python to read in the data and wrap C code for the analysis. This might be easiest in the short run.
4. Use wc on linux to find the number of lines and number of words in the header to find the dimensions.
echo $((`cat FILE | wc -l` - 1)) # get number of rows (-1 for header line)
echo $((`cat FILE | head -n 1 | wc -w` - 1)) # get number of columns (-1 for '#' character)
5. Use C/C++ code
This question is mostly related to point 5 - if there is an easy and reliable way to do this in C/C++. Otherwise any other suggestions would be welcome
Thanks
Create table as vector of vectors:
std::vector<std::vector<double> > table;
Inside infinite (while(true)) loop:
Read line:
std::string line;
std::getline(ifs, line);
If something went wrong (probably EOF), exit the loop:
if(!ifs)
break;
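Skip that line if it's empty or a comment:
if(line.empty() || line[0] == '#')
continue;
Read row contents into vector, parsing only the line that was just read (reading from ifs here would swallow the rest of the file):
std::vector<double> row;
std::istringstream iss(line);
std::copy(std::istream_iterator<double>(iss),
std::istream_iterator<double>(),
std::back_inserter(row));
Add row to table: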
table.push_back(row);
At the time you're out of the loop, "table" contains the data:
table.size() is the number of rows
table[i] is row i
table[i].size() is the number of cols. in row i
table[i][j] is the element at the j-th col. of row i
How about:
Load the file.
Count the number of rows and columns.
Close the file.
Allocate the memory needed.
Load the file again.
Fill the array with data.
Every .obj (3D model file) loader I've seen uses this method. :)
Figured out a way to do this. Thanks go mostly to Manuel as it was the most informative answer.
std::vector< std::vector<double> > readIn2dData(const char* filename)
{
/* Function takes a char* filename argument and returns a
* 2d dynamic array containing the data
*/
std::vector< std::vector<double> > table;
std::fstream ifs;
/* open file */
ifs.open(filename);
while (true)
{
std::string line;
double buf;
if (!getline(ifs, line))
// mainly catch EOF
break;
if (line.empty() || line[0] == '#')
// catch empty lines or comment lines (test empty() first, before touching line[0])
continue;
std::stringstream ss(line);
std::vector<double> row;
while (ss >> buf)
row.push_back(buf);
table.push_back(row);
}
ifs.close();
return table;
}
Basically create a vector of vectors. The only difficulty was splitting by whitespace which is taken care of with the stringstream object. This may not be the most effective way of doing it but it certainly works in the short term!
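Also I'm looking for a replacement for the atof function (it is not actually deprecated, but it cannot report conversion errors, whereas strtod and, in C++11, std::stod can), but never mind. Just needs some memory leak checking (it shouldn't have any since most of the objects are std objects) and I'm done.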
Thanks for all your help
Do you need a square or a ragged matrix? If the latter, create a structure like this:
std::vector < std::vector <double> > data;
Now read each line at a time into a:
std::vector <double> d;
and add the vector to the ragged matrix:
data.push_back( d );
All data structures involved are dynamic, and will grow as required.
I've seen your answer, and while it's not bad, I don't think it's ideal either. At least as I understand your original question, the first comment basically specifies how many columns you'll have in each of the remaining rows. e.g. the one you've given ("1 4 6 28") contains four numbers, which can be interpreted as saying each succeeding line will contain 4 numbers.
Assuming that's correct, I'd use that data to optimize reading the data. In particular, after that, (again, as I understand it) the file just contains row after row of numbers. That being the case, I'd put all the numbers together into a single vector, and use the number of columns from the header to index into the rest:
class matrix {
std::vector<double> data;
int columns;
public:
// a matrix is 2D, with fixed number of columns, and arbitrary number of rows.
matrix(int cols) : columns(cols) {}
// just read raw data from stream into vector:
std::istream &read(std::istream &stream) {
std::copy(std::istream_iterator<double>(stream),
std::istream_iterator<double>(),
std::back_inserter(data));
return stream;
}
// Do 2D addressing by converting rows/columns to a linear address
// If you want to check subscripts, use vector.at(x) instead of vector[x].
double operator()(size_t row, size_t col) {
return data[row*columns+col];
}
};
This is all pretty straightforward -- the matrix knows how many columns it has, so you can do x,y indexing into the matrix, even though it stores all its data in a single vector. Reading the data from the stream just means copying that data from the stream into the vector. To deal with the header, and simplify creating a matrix from the data in a stream, we can use a simple function like this:
matrix read_data(std::string name) {
// read one line from the stream.
std::ifstream in(name.c_str());
std::string line;
std::getline(in, line);
// break that up into space-separated groups:
std::istringstream temp(line);
std::vector<std::string> counter;
std::copy(std::istream_iterator<std::string>(temp),
std::istream_iterator<std::string>(),
std::back_inserter(counter));
// the number of columns is the number of groups, -1 for the leading '#'.
matrix m(counter.size()-1);
// Read the remaining data into the matrix.
m.read(in);
return m;
}
As it's written right now, this depends on your compiler implementing the "Named Return Value Optimization" (NRVO). Without that, the compiler will copy the entire matrix (probably a couple of times) when it's returned from the function. With the optimization, the compiler pre-allocates space for a matrix, and has read_data() generate the matrix in place. In C++11 and later, even when NRVO is not applied, the matrix is moved rather than copied, so the vector's buffer is handed over instead of duplicated.
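To show the flat-vector idea end to end, here is a small self-contained sketch that works on any std::istream. FlatMatrix and read_flat_matrix are hypothetical names, not the class above; the sketch assumes the first line is the '#' header and that every later row has as many values as the header has numbers.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// All values live in one vector, stored row after row (row-major order).
struct FlatMatrix {
    std::vector<double> data;
    std::size_t columns;

    std::size_t rows() const {
        return columns ? data.size() / columns : 0;
    }
    // Bounds-checked 2D access: element (row, col) lives at row * columns + col.
    double at(std::size_t row, std::size_t col) const {
        return data.at(row * columns + col);
    }
};

FlatMatrix read_flat_matrix(std::istream& in) {
    FlatMatrix m;
    m.columns = 0;

    // The header looks like "# 1 4 6 28": count its tokens, minus one for '#'.
    std::string header;
    std::getline(in, header);
    std::istringstream tokens(header);
    std::string token;
    std::size_t count = 0;
    while (tokens >> token) {
        ++count;
    }
    m.columns = count > 0 ? count - 1 : 0;

    // Everything after the header is just whitespace-separated numbers.
    double value;
    while (in >> value) {
        m.data.push_back(value);
    }
    return m;
}
```

For the sample file in the question this gives columns == 4 and rows() == 2, and at(1, 2) is the value 12059. from the second data row.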

Fastest way to read a file line by line with an arbitrary number of characters in each

Ok, I'm trying to figure out which way would be faster to read a text file that I'm working with. The contents of the file look like this
1982 3923 3542 4343
2344 3453 2
334 423423 32432 23423
They're basically just an arbitrary number of int numbers and I need to read line by line. Would it be better to use getline or the extraction (>>) operator? I, personally, think it would be a lot easier to implement by using the extraction operator but I don't know how I would make the program so that it reads all of the int numbers in the same line until it reaches the end. I was thinking of setting it up like the following:
ifstream input;
input.open("someFile.txt");
if (input) {
char* ch;
while (ch != '\n\)
getline(input, buffer, ' ')
The only problem is that I have to do a conversion to an int, then put each int in an array. My desired end goal is to produce a two-dimensional array where each line of int's is an array of int's. Any suggestions as to the best implementation is appreciated!
I would keep it real simple:
ifstream in(...);
string line;
while (getline(in, line)) {
istringstream line_in(line);
int val = 0;
while (line_in >> val) {
// Do something with val
}
// eol
}
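Building on the loop above, a sketch of how the two-dimensional result the question asks for could be collected. The function name read_rows is an assumption, and a vector of vectors is used instead of raw arrays because the row lengths differ.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one std::vector<int> per input line; lines may have different lengths.
std::vector<std::vector<int> > read_rows(std::istream& in) {
    std::vector<std::vector<int> > rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream line_in(line);
        std::vector<int> row;
        int val;
        while (line_in >> val) {
            row.push_back(val);
        }
        rows.push_back(row);  // an empty line yields an empty row
    }
    return rows;
}
```

Passing an std::ifstream opened on someFile.txt works the same way as the in-memory stream used for testing.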
You'd have to benchmark to get a correct answer.
The speed of the two functions is implementation defined. You might get different results on different compilers.
The fastest way to do it would probably be to use a custom-made finite state machine. But those are about as unreadable as you get.
Produce correct code first. Then fine tune it if you need to later.

reading data from files, file name as input

I am writing a program which reads data from different files, which are given as input strings, and stores them into a vector of vectors. The problem is that I am not able to debug the loop which reads the different files. I have closed the ifstream object and cleared the string using the empty function... but it still just terminates when I give a second file name as input.
I am copying the code for your perusal. It is a function called by another function. Transposectr transposes a matrix.
code:
vector<vector<float> > store1,store2;
ifstream bb;
string my_string;
float carrier;
vector<float> buffer;
cout<<"enter the file name"<<endl;
getline(cin,my_string);
while (my_string!="end")
{
bb.open(my_string.c_str());
while (!bb.eof())
{
bb >> carrier;
if (bb.peek() == '\n' || bb.eof() )
{
buffer.push_back(carrier);
store1.push_back(buffer);
buffer.clear();
}
else
{
buffer.push_back(carrier);
}
}
bb.close();
buffer.clear();
transposectr1(store1);
storex.push_back(store1[1]);
storey.push_back(store1[0]);
store1.clear();
my_string.empty();
cout<<"done reading the file"<<endl;
cout<<"enter the file name"<<endl;
getline(cin,my_string);
}
I'm really not clear what you are trying to do. But I have one golden rule when it comes to using istreams:
Never use the eof() function!
It almost certainly does not do what you think it does. Instead you should test if a read operation succeeded.
int x;
while( in >> x ) {
// I read something successfully
}
You might also want to avoid peek() too. Try re-writing your code with this advice in mind.
Add
bb.clear();
after the bb.close() and you may get the right result. I think bb.close() doesn't reset the stream's error flags (eofbit and failbit), so they are still set when the next file is opened.
Neil Butterworth is right
Never use the eof() function!
This link explains why.
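Putting the advice from these answers together, here is a hedged sketch of one way to restructure the reading code: test each read directly, split rows with getline instead of peek(), and open a fresh ifstream per file so no error state carries over. Table, read_table and read_table_from_file are illustrative names, not the asker's.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::vector<float> > Table;

// Reads whitespace-separated floats, one row per line; blank lines are skipped.
Table read_table(std::istream& in) {
    Table table;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream line_in(line);
        std::vector<float> row;
        float value;
        while (line_in >> value) {
            row.push_back(value);
        }
        if (!row.empty()) {
            table.push_back(row);
        }
    }
    return table;
}

// A new ifstream is constructed on every call, so flags left over from a
// previous file cannot affect the next one. Returns false if the open fails.
bool read_table_from_file(const std::string& name, Table& out) {
    std::ifstream file(name.c_str());
    if (!file) {
        return false;
    }
    out = read_table(file);
    return true;
}
```

The outer loop from the question then reduces to reading a name with getline(cin, my_string), stopping at "end", and calling read_table_from_file for each name. Note also that my_string.empty() only tests whether the string is empty; it does not clear it, and it isn't needed because getline overwrites the string anyway.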