C++ line jumping and file reading


I have a file with numbers in it. I would like to read certain lines (ones that haven't been read yet but are no longer easily accessible because of the way my code runs).
For example, I have code like this:
for (c=0; c < 5;c++)
{
in >> tmp;
}
When run, this reads the 5 values on the first line (all lines are the same length).
I want to be able to call this same section of code again and have it read the second line, then the third, and so on.
What do I need to do to make this work?

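Assuming in is an input stream (istream), you can use its seekg method to seek back to the beginning of the file. If an earlier read failed (for example by running past the end of the file), call in.clear() first, because seekg does nothing while the stream is in a failed state.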
// read it the first time
for (c=0; c < 5;c++)
{
in >> tmp;
}
in.seekg(0, in.beg); // seek to the beginning
// read it the second time
for (c=0; c < 5;c++)
{
in >> tmp;
}
Check out the documentation of the seekg method.
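Worth noting: a stream keeps its read position between calls, so running the same loop again simply continues where the previous read stopped. If the goal is "one line per call", a minimal sketch could look like the following. The helper names read_next_line and rewind_stream are made up for illustration, and the values are assumed to be whitespace-separated integers.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the next unread line from `in` and parses up to
// `count` whitespace-separated integers from it. Each call moves on by one line.
std::vector<int> read_next_line(std::istream& in, std::size_t count) {
    assert(count > 0);  // asking for zero values is almost certainly a mistake
    std::vector<int> values;
    std::string line;
    if (!std::getline(in, line)) {
        return values;  // no more lines, or the stream is in a failed state
    }
    std::istringstream fields(line);  // parse the line in isolation
    int tmp;
    while (values.size() < count && fields >> tmp) {
        values.push_back(tmp);
    }
    return values;
}

// Hypothetical helper: makes the stream readable from the start again.
void rewind_stream(std::istream& in) {
    in.clear();                  // reset eofbit/failbit, otherwise seekg is ignored
    in.seekg(0, std::ios::beg);  // move the read position back to offset 0
}
```

Calling read_next_line(in, 5) repeatedly returns the first line's values, then the second line's, and so on; rewind_stream(in) starts over from the first line.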

Related

Is there a data structure for implementing a function equivalent to 'tail -n' command in C++?

I want to write a function equivalent to the Linux tail -n command in C++. Currently I parse the file line by line, incrementing a line count as I go, but if the file gets really big (gigabytes), this takes a lot of time. Is there a better approach or a data structure to implement this function?
Here are my 2 methods:
int File::countlines()
{
int lineCount = 0;
string str;
if (file)
{
while (getline(file, str))
{
lineCount += 1;
}
}
return lineCount;
}
void File::printlines()
{
int lineCount = 0;
string line;
if (file)
{
lineCount = countlines();
file.clear();
file.seekg(0, ios::beg);
if (lineCount <= 10)
{
while (getline(file, line))
{
cout << line << endl;
}
}
else
{
int position = lineCount - 10;
while (position--)
{
getline(file, line);
}
while (getline(file, line))
{
cout << line << endl;
}
}
}
}
This method becomes time-consuming as the file size increases, so I want to either replace it with another data structure or write more efficient code.

One of the things slowing your program down is that it reads the file twice. Instead, you could keep the last n EOL positions as you read (n = 10 in your program). The most convenient data structure for that is a circular buffer, which as far as I know the standard library does not provide (Boost has one). It can be implemented with a std::vector of size n and an index that is taken modulo n after each increment.
With that circular buffer, you can jump straight to the lowest stored offset (the slot after the most recent one if the buffer is full) and print the needed lines.
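A rough sketch of that idea, under a few assumptions: the function name tail_single_pass is made up, it stores the start offset of each line (the position just after the previous EOL) rather than the EOL itself, and it returns the lines instead of printing them so it is easy to try on an in-memory stream.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns the last `n` lines of `in` while scanning the whole stream only once.
std::vector<std::string> tail_single_pass(std::istream& in, std::size_t n) {
    assert(n > 0);
    std::vector<std::string> result;
    std::vector<std::istream::pos_type> starts(n);  // circular buffer of line starts
    std::size_t seen = 0;                           // number of lines read so far
    std::string line;
    for (;;) {
        const std::istream::pos_type start = in.tellg();
        if (!std::getline(in, line)) break;
        starts[seen % n] = start;  // overwrite the oldest slot once the buffer is full
        ++seen;
    }
    if (seen == 0) return result;

    // If the buffer wrapped around, the oldest remembered line sits in the slot
    // that would be overwritten next; otherwise it is simply slot 0.
    const std::size_t oldest = (seen < n) ? 0 : seen % n;
    in.clear();               // the scan above ended in a failed state
    in.seekg(starts[oldest]);
    while (std::getline(in, line)) result.push_back(line);
    return result;
}
```

This still reads every byte once, so it removes the second pass but not the cost of scanning a multi-gigabyte file.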

When I've done this, I've made a generous estimate of the maximum length of a line (e.g., one kilobyte), seeked to n times that distance from the end, and read lines into a circular buffer until the end of the file.
In nearly every case you get more than n lines, so you just print out the contents of the circular buffer and you're done. Note, however, that you do need to ensure that you read more than n lines, not just n lines. The first line you read will usually be only a partial line, so if you read exactly n lines, the first would probably be incomplete.
On rare occasion, you haven't gotten the required number of lines, so you seek back twice as far (or other factor of your choice), and restart. If you want to get really fancy, you can extrapolate the number of lines you'll need based on the average length of the lines you did read (but honestly, this is such a rare situation it's not worth a lot of work to optimize it).
This normally works essentially instantly, regardless of file size. I suppose (in theory) for a file with incredibly long lines, it would get slower, but if that's the case, the user has probably made a mistake, and tried to tail something that isn't a text file (which is generally useless anyway).
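Here is a minimal sketch of the seek-from-the-end approach, not a definitive implementation. The name tail_from_end and the guess_per_line parameter are invented for the example, a std::deque capped at n entries stands in for the circular buffer, and the stream is assumed to be seekable (for a real file, opening it in binary mode avoids surprises with text-mode offsets).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>
#include <string>

// Returns the last `n` lines of `in` by reading only a region near the end,
// and widening that region if it turned out to hold too few lines.
std::deque<std::string> tail_from_end(std::istream& in, std::size_t n,
                                      std::streamoff guess_per_line = 1024) {
    assert(n > 0 && guess_per_line > 0);
    std::deque<std::string> lines;
    in.clear();
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return lines;  // empty or non-seekable stream

    // Aim for n + 1 lines, because the first one read is usually partial.
    std::streamoff back = guess_per_line * static_cast<std::streamoff>(n + 1);
    for (;;) {
        const bool from_start = back >= size;
        in.clear();
        in.seekg(from_start ? std::streamoff(0) : size - back, std::ios::beg);

        lines.clear();
        std::string line;
        if (!from_start) std::getline(in, line);  // drop the probably partial line
        while (std::getline(in, line)) {
            lines.push_back(line);
            if (lines.size() > n) lines.pop_front();  // keep only the newest n
        }
        if (lines.size() == n || from_start) return lines;
        back *= 2;  // too few complete lines: look twice as far back and retry
    }
}
```

When the guess is generous, the loop body runs once and touches only the last few kilobytes, which is why the file size hardly matters.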

Recover ifstream from failed read

I currently have a piece of code that reads in text from a file using an ifstream. Each line of this file corresponds to a different piece of data that must be encoded into a struct. My "encodeLine" function takes care of this.
For safety, I want my system to be able to handle data that is too big to fit into its variable. For example, if the number 999999999 is read into a short, I want the program to be able to continue on reading the rest of the lines.
Currently, when I encounter data like this, I print out "ERROR" and clear the stream. However, when I perform more reads, the data that is read is corrupted. For example, on the next line the number "1" should be read, but instead something like 27021 is read.
How can the ifstream be reset to continue with valid reads?
Here is my code:
ifstream inputStream;
inputStream.open("foo.txt");
char token[64];
int totalSize = 0;
// Priming read
inputStream >> token;
while(inputStream.good())
{
// Read and encode line of data from file
totalSize = totalSize + encodeLine(inputStream, &recordPtr, header, filetypeChar);
if(!inputStream.eof() && !inputStream.good())
{
printf("ERROR");
inputStream.clear();
}
else if(inputStream.eof())
{
break;
}
inputStream >> token;
}
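One common way to keep a single bad field from affecting later reads is to pull each line out with std::getline and parse it through its own std::istringstream, so a failure is confined to that line and the file stream never enters a failed state. The sketch below assumes a made-up two-field Record layout and a hypothetical read_records function; the real encodeLine is not shown in the question, so this is only an illustration of the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record layout, for illustration only.
struct Record {
    short id;
    int value;
};

// Reads one record per line. Lines that fail to parse (for example a number
// too large for a short) are counted in `bad_lines` and skipped.
std::vector<Record> read_records(std::istream& in, std::size_t& bad_lines) {
    std::vector<Record> records;
    bad_lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);  // a parse error only affects this object
        Record r;
        if (fields >> r.id >> r.value) {
            records.push_back(r);
        } else {
            ++bad_lines;  // `in` itself is still good and positioned at the next line
        }
    }
    return records;
}
```

With this structure there is nothing to reset on the file stream, because the out-of-range value only sets failbit on the temporary per-line stream.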

does fstream move to the next position after read in a binary integer (c++)

I am trying to read in a binary file and write it out in chunks to multiple output files. Say there are 25 4-byte numbers in total and the chunk size is set to 20: the program generates two output files, one with 20 integers and the other with 5. However, if I have an input file with 40 integers, my program generates three files. The first two files each have 20 numbers, but the third file has one number, which is the last one from the input file and is already included in the second output file. How do I force the read position to move forward every time a number is read?
ifstream fin("in.txt", ios::in | ios::binary);
if(fin.is_open())
{
while(!fin.eof()){
//set file name for each output file
fname[0] = 0;
strcpy(fname, "binary_chunk");
index[0] = 0;
sprintf(index, "%d", fcount);
strcat(fname, index);
// open output file to write
fout.open(fname);
for(i = 0; i < chunk; i++)
{
fin.read((char *)(&num), INT_SIZE);
fout << num << "\n";
if(fin.eof())
{
fout.close();
fin.close();
return;
}
}
fcount ++;
fout.close();
}
fout.close();
}

The problem is most likely your use of while (!fin.eof()). The eofbit flag is not set until after you have tried to read beyond the end of the file, so the loop runs one extra time without you noticing.
Instead, remember that most input operations (read, getline, operator>>) return the stream object, and that stream objects can be used as boolean conditions. That means you can write:
while (fin.read(...))
This is safe from the problems with looping on !fin.eof().
And to answer the question in your title: yes, the file position moves when you successfully read anything. If you read X bytes, the read position moves forward X bytes as well.
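To make the while (fin.read(...)) pattern concrete, here is a small sketch that groups the integers into chunks. It is written against std::istream and returns the chunks in memory (the function name read_chunks is invented, and writing each chunk to its own file is left out) so that it can be tried without real files. It assumes the numbers are stored as 4-byte integers in the machine's native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <vector>

// Splits a binary stream of 32-bit integers into groups of at most `chunk`.
std::vector<std::vector<std::int32_t>> read_chunks(std::istream& in,
                                                   std::size_t chunk) {
    assert(chunk > 0);
    std::vector<std::vector<std::int32_t>> chunks;
    std::int32_t num;
    // The loop body runs only when a full 4-byte value was actually read,
    // so a failed read at end of file never produces a stale extra value.
    while (in.read(reinterpret_cast<char*>(&num), sizeof num)) {
        if (chunks.empty() || chunks.back().size() == chunk) {
            chunks.emplace_back();  // start a new chunk only when there is data for it
        }
        chunks.back().push_back(num);
    }
    return chunks;
}
```

Because a new chunk is opened only after a successful read, 40 integers with a chunk size of 20 give exactly two chunks, and 25 integers give one chunk of 20 and one of 5.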

reading until the end of file in C++

I'm trying to read to the end of a file for a phonebook app that I'm converting from C to C++. When I print the results from the file I get this:
johnny smith
(Home)3
(Cell)4
x☺> x☺>
(Home)4
(Cell)4
It should print:
johnny smith
(Home)3
(Cell)4
Right now I'm using while(!infile.eof()), which I've read is poor practice, but when I use infile.getline() I get a repeat of the first and last name, and the format is all messed up. Is there any way to get rid of the junk at the end of the input, or another way to read to the end of a file in C++ that fixes this? I've been reading about different solutions, but the one a lot of sites seem to agree on is fgets, which is what I had in the original C version; obviously fgets doesn't work with ifstream, which is what I'm using. Here is the code:
void contacts:: readfile(contacts*friends ,int* counter, int i,char buffer[],char user_entry3[])
{
ifstream read;
read.open(user_entry3,ios::in);
int len;
contacts temp;
*counter=0;
i=0;
while (!read.eof()) {
temp.First_Name=(char*)malloc(36);
temp.Last_Name=(char*)malloc(36);
read>>temp.First_Name>>temp.Last_Name;
read>>buffer;
len=strlen(buffer);
if(buffer[len-1]=='\n')
buffer[len-1]='\0';
temp.home=(char*)malloc(20);
strcpy(temp.home, buffer);
read>>buffer;
len=strlen(buffer);
if(buffer[len-1]=='\n')
buffer[len-1]='\0';
temp.cell=(char*)malloc(20);
strcpy(temp.cell, buffer);
friends[i].First_Name=(char*)malloc(MAXNAME);
friends[i].Last_Name=(char*)malloc(MAXNAME);
friends[i].home=(char*)malloc(MAXPHONE);
friends[i].cell=(char*)malloc(MAXPHONE);
//adds file content to the structure
strcpy(friends[*counter].First_Name,temp.First_Name);
strcpy(friends[*counter].Last_Name,temp.Last_Name);
strcpy(friends[*counter].home,temp.home);
strcpy(friends[*counter].cell,temp.cell);
(*counter)++;
i++;
}
//closes file and frees memory
read.close();
free(temp.Last_Name);
free(temp.First_Name);
free(temp.home);
free(temp.cell);
}

Don't use !eof(). It checks whether the last read failure was due to reaching the end of the file. It does not predict the future.
Don't use malloc in C++. If you do, check the return value for errors!
Don't use operator>> for char *. There's no size check so that's just asking for buffer overflows.
The '\n' check on buffer is useless. operator>> for strings stops at whitespace.
You're blindly strcpying a string of unknown length into temp.home of size 20. That's another buffer overflow.
... I kind of stopped reading there. If you want to read stuff from a file but stop on eof/error, you can do something like this:
string a, b, c;
while (true) {
if (!(in >> a)) break;
if (!(in >> b)) break;
if (!(in >> c)) break;
do_stuff_with(a, b, c);
}

Do not use eof() to determine whether you have reached the end of the file. Instead, read what you want to read and then check whether the read succeeded. Once a read has failed, you may use eof() to determine whether the failure was due to reaching the end of the file before you report a format error.
Since you mentioned that you read that using !infile.eof() is good practice: can you point us at the source of this wrong information? It needs correcting.
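As a sketch of "read first, then check, and only then ask eof() why it failed", the following uses std::string fields instead of malloc'd buffers. The Contact struct, the ReadResult enum and read_contact are hypothetical names, and the record is assumed to be four whitespace-separated tokens as in the sample output above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record type; std::string grows as needed, so no buffer overflow.
struct Contact {
    std::string first_name, last_name, home, cell;
};

enum class ReadResult { Ok, EndOfFile, FormatError };

// Tries to read one contact and reports why reading stopped if it did.
ReadResult read_contact(std::istream& in, Contact& c) {
    if (!(in >> c.first_name)) {
        // Nothing could be read at all: a clean end of file if eofbit is set.
        return in.eof() ? ReadResult::EndOfFile : ReadResult::FormatError;
    }
    if (!(in >> c.last_name >> c.home >> c.cell)) {
        return ReadResult::FormatError;  // the record started but was cut short
    }
    return ReadResult::Ok;
}
```

A caller would loop while read_contact returns ReadResult::Ok and store each contact, so a trailing newline at the end of the file no longer produces a junk entry.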

Using multiple instances of getline in C++

I've been working on a class assignment for C++ and we're required to acquire input from a text file and assign those values to arrays: one holds strings, the second ints, and the third doubles.
We've only been introduced to arrays and I don't know anything yet about pointers or linked lists, or any of the higher end stuff, so I feel like I'm somewhat limited in my options. I've worked all day trying to figure out a way to acquire input from the text file and assign it to the appropriate array. I've tried to use getline to read the input file and set a delimiter to separate each piece of data but I get an error when I try to use it more than once. From what I've read, this has to do with how I'm overloading the function but I'm at a loss at resolving it. Every explanation I've read about it goes beyond my current level of familiarity. Right now, I'm focused on this fragment of code:
for (int i = 0; i < EMP_NUM; i++) // Get input from text file for names.
getline(inFile, nameAr[i], '*');
for (int i = 0; i < EMP_NUM; i++) // Input for hours.
getline(inFile, hoursAr[i], '*');
for (int i=0; i < EMP_NUM; i++) // Input for hourly rate.
getline(inFile, hrateAr[i], '*');
I'm trying to use getline three times and write the data to three separate arrays, then make a series of calculations with them later and output them to another text file. The first instance of getline doesn't produce any compiler errors but the latter two do. I'm not quite sure of another solution to get the data into my arrays, so I'm at a loss. Any help would be great!

If I understand correctly you merely have three values in a file: a string, an int and a double. I assume they are delimited by whitespace.
If that is so then you don't need std::getline(). Rather, use the extraction operator:
std::ifstream file("input.txt");
std::string s;
if( ! (file >> s) ) { // a single word extracted from the file
// failure
}
int n;
// ...

1) Instead of three separate loops, use only one.
2) Pass a string object to getline; it cannot read directly into an int or a double.
string buf;
for (int i = 0; i < EMP_NUM; i++) // Read one employee record per iteration.
{
getline(inFile, buf, '*');
nameAr[i] = buf;
getline(inFile, buf, '*'); // assuming the delimiter is again *
hoursAr[i] = atoi(buf.c_str()); // the C way of doing it; in C++ you would typically use a stringstream
getline(inFile, buf);
hrateAr[i] = atof(buf.c_str());
}
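For the stringstream route mentioned in the comment above, a minimal sketch might look like this. It assumes each record sits on one line in the form name*hours*rate, which is a guess at the file layout, and the Employee struct and read_employee function are made up for the example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record type for one line of the form "name*hours*rate".
struct Employee {
    std::string name;
    int hours;
    double rate;
};

// Reads one employee; returns false at end of input or on a malformed record.
bool read_employee(std::istream& in, Employee& e) {
    std::string hours_text, rate_text;
    // std::getline only fills std::string objects, so numbers are read as text first.
    if (!std::getline(in, e.name, '*')) return false;
    if (!std::getline(in, hours_text, '*')) return false;
    if (!std::getline(in, rate_text)) return false;  // rest of the line
    // Convert the text with string streams instead of atoi/atof, which lets
    // a non-numeric field be detected as a failure.
    std::istringstream hours_in(hours_text);
    std::istringstream rate_in(rate_text);
    return (hours_in >> e.hours) && (rate_in >> e.rate);
}
```

In the assignment's terms, the loop body would call read_employee once per index and copy the three fields into nameAr[i], hoursAr[i] and hrateAr[i].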

What do the compiler errors say? Are you sure that the error is caused by getline? Maybe it's not because of the getline calls but because of multiple declarations of i.
