Read a line of a file in C++

I'm just trying out the fstream library and I want to read a given line of a file.
I came up with this, but I don't know if it is the most efficient way.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(){
    fstream input2;
    string line;
    int countLine = 0;
    input2.open("theinput.txt");
    if (input2.is_open()) {
        while (getline(input2, line)) {
            countLine++;
            if (countLine == 1) { // 1 is the line I want to read.
                cout << line << endl;
            }
        }
    }
}
Is there another way?

This does not appear to be the most efficient code, no.
In particular, you're currently reading the entire input file even though you only care about one line of the file. Unfortunately, doing a good job of skipping a line is somewhat difficult. Quite a few people recommend using code like:
your_stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
...for this job. This can work, but has a couple of shortcomings. First and foremost, if you try to use it on a non-text file (especially one that doesn't contain new-lines) it can waste inordinate amounts of time reading an entire huge file, long after you've read enough that you would normally realize that there must be a problem. For example, if you're reading a "line", that's a pretty good indication that you're expecting a text file, and you can pretty easily set a much lower limit on how long that first line could reasonably be, such as (say) a megabyte, and usually quite a lot less than that.
You also usually want to detect whether it stopped reading because it reached that maximum, or because it got to the end of the line. Skipping a line "succeeded" only if a new-line was encountered before reaching the specified maximum. To do that, you can use gcount() to compare against the maximum you specified. If you stopped reading because you reached the specified maximum, you typically want to stop processing that file (and log the error, print out an error message, etc.)
With that in mind, we might write code like this:
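bool skip_line(std::istream &in) {
    const std::streamsize max = 0xfffff; // just under one megabyte
    in.ignore(max, '\n');
    // fewer than max characters consumed: we stopped before hitting the limit
    return in.gcount() < max;
}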
Depending on the situation, you might prefer to pass the maximum line size as a parameter (probably with a default) instead:
bool skip_line(std::istream &in, std::streamsize max = 0xfffff) {
    in.ignore(max, '\n');
    return in.gcount() < max;
}
With this, you can skip up to a megabyte by default, but if you want to specify a different maximum, you can do so quite easily.
Either way, with that defined, the remainder becomes fairly trivial, something like this:
int main(){
std::ifstream in("theinput.txt");
if (!skip_line(in)) {
std::cerr << "Error reading file\n";
return EXIT_FAILURE;
}
// copy the second line:
std::string line;
if (std::getline(in, line))
std::cout << line;
}
Of course, if you want to skip more than one line, you can do that pretty easily as well by putting the call to skip_line in a loop--but note that you still usually want to test the result from it, and break the loop (and log the error) if it fails. You don't usually want something like:
for (int i=0; i<lines_to_skip; i++)
skip_line(in);
With this, you'd lose one of the basic benefits of assuring that your input really is what you expected, and you're not producing garbage.

I think you can condense your code to this. if (input) is sufficient to check for failure.
#include <iostream>
#include <fstream>
#include <limits>
#include <string>
int main()
{
std::ifstream input("file.txt");
int row = 5;
int count = 0;
if (input)
{
while (count++ < row) input.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
std::string line;
std::getline(input, line);
std::cout << line;
}
}

Related

Is there a data structure for implementing a function equivalent to 'tail -n' command in C++?

I want to write a function equivalent to the Linux tail -n command in C++. Currently I parse the file line by line, incrementing a line count, but if the file gets really big (gigabytes) this takes a lot of time. Is there a better approach or a data structure for implementing this function?
Here are my 2 methods:
int File::countlines()
{
int lineCount = 0;
string str;
if (file)
{
while (getline(file, str))
{
lineCount += 1;
}
}
return lineCount;
}
void File::printlines()
{
int lineCount = 0;
string line;
if (file)
{
lineCount = countlines();
file.clear();
file.seekg(0, ios::beg);
if (lineCount <= 10)
{
while (getline(file, line))
{
cout << line << endl;
}
}
else
{
int position = lineCount - 10;
while (position--)
{
getline(file, line);
}
while (getline(file, line))
{
cout << line << endl;
}
}
}
}
This method gets slow as the file size increases, so I want to either replace it with another data structure or write more efficient code.
One of the things slowing down your program is reading the file twice. Instead, you could remember where the last n lines start (n=10 in your program) while you read. The most convenient data structure for that is a circular buffer. The standard library doesn't provide one as far as I know (Boost has one), but it can be implemented with a std::vector of size n and an index that is taken modulo n after each increment.
With that circular buffer, once you reach the end of the file you can jump straight to the oldest stored offset (the slot after the current index if the buffer is full) and print the needed lines.
When I've done this, I've done a generous estimate of the maximum length of a line (e.g., one kilobyte), seeked to that distance from the end, and started reading lines into a circular buffer until the end of the file.
In nearly every case, you get more than n lines, so you just print out the contents of the circular buffer, and you're done. Note, however, that you do need to assure that you read more than n lines, not just n lines. The first line you read will usually only be a partial line, so if you read exactly n lines, the first would probably be only a partial line.
On rare occasion, you haven't gotten the required number of lines, so you seek back twice as far (or other factor of your choice), and restart. If you want to get really fancy, you can extrapolate the number of lines you'll need based on the average length of the lines you did read (but honestly, this is such a rare situation it's not worth a lot of work to optimize it).
This normally works essentially instantly, regardless of file size. I suppose (in theory) for a file with incredibly long lines, it would get slower, but if that's the case, the user has probably made a mistake, and tried to tail something that isn't a text file (which is generally useless anyway).

Filling a cstring using <cstring> with text from a textfile using File I/O C++

I began learning strings yesterday and wanted to manipulate it around by filling it with a text from a text file. However, upon filling it the cstring array only prints out the last word of the text file. I am a complete beginner, so I hope you can keep this beginner friendly. The lines I want to print from the file are:
"Hello World from UAE" - First line
"I like to program" - Second line
Now I did look around and eventually found a way and that is to use std::skipary or something like that but that did not print it the way I had envisioned, it prints letter by letter and skips each line in doing so.
here is my code:
#include <fstream>
#include <iostream>
#include <cstring>
#include <cctype>
using namespace std;
int main() {
ifstream myfile;
myfile.open("output.txt");
int vowels = 0, spaces = 0, upper = 0, lower = 0;
//check for error
if (myfile.fail()) {
cout << "Error opening file: ";
exit(1);
}
char statement[100];
while (!myfile.eof()) {
myfile >> statement;
}
for (int i = 0; i < 30; ++i) {
cout << statement << " ";
}
}
I'm not exactly sure what you're trying to do with the contents of output.txt, but a clean way to read through a file's contents using C++ strings goes like this:
if (std::ifstream in("output.txt"); in.good()) {
for (std::string line; std::getline(in, line); ) {
// do something with line
std::cout << line << '\n';
}
}
You wouldn't want to use char[] for that, in fact raw char arrays are hardly ever useful in modern C++.
Also - As you can see, it's much more concise to check if the stream is good than checking for std::ifstream::fail() and std::ifstream::eof(). Be optimistic! :)
Whenever you encounter output issues - either wrong or no output, the best practise is to add print (cout) statements wherever data change is occurring.
So I first modified your code as follows:
while (!myfile.eof()) {
myfile >> statement;
std::cout<<statement;
}
This way, the output I got was - all lines are printed but the last line gets printed twice.
So,
We understood that data is being read correctly and stored in statement.
This raises 2 questions. One is your question, other is why last line is printed twice.
To answer your question exactly: in every loop iteration, myfile >> statement reads the next word into statement, overwriting the existing value. So only the value you read last is stored.
Once you fix that, you might come across the second question. It's very common and I myself came across that issue long back. So I'm gonna answer that as well.
Let's say your file has 3 lines:
line1
line2
line3
Initially your file position is at the beginning, exactly where line1 starts. After some iterations it has read line3. We know that is the last line, since we wrote the data, but the loop doesn't: eof() only becomes true after a read has actually run into the end of the file, and for all the loop knows there could be a million more lines. So, when the file ends with a new-line as text files usually do, the loop body runs one more time, the read fails and leaves statement unchanged, and the final value is printed twice.

C++ Read file into Array / List / Vector

I am currently working on a small program to join two text files (similar to a database join). One file might look like:
269ED3
86356D
818858
5C8ABB
531810
38066C
7485C5
948FD4
The second one is similar:
hsdf87347
7485C5
rhdff
23487
948FD4
Both files have over 1.000.000 lines and are not limited to a specific number of characters. What I would like to do is find all matching lines in both files.
I have tried a few things (arrays, vectors, lists), but I am currently struggling to decide what the best (fastest and most memory-friendly) way is.
My code currently looks like:
#include <iostream>
#include <fstream>
#include <string>
#include <ctime>
#include <list>
#include <algorithm>
#include <iterator>
using namespace std;
int main()
{
string line;
clock_t startTime = clock();
list<string> data;
//read first file
ifstream myfile ("test.txt");
if (myfile.is_open())
{
for(line; getline(myfile, line);/**/){
data.push_back(line);
}
myfile.close();
}
list<string> data2;
//read second file
ifstream myfile2 ("test2.txt");
if (myfile2.is_open())
{
for(line; getline(myfile2, line);/**/){
data2.push_back(line);
}
myfile2.close();
}
else cout data2[k], k++
//if data[j] > a;
return 0;
}
My thinking is: With a vector, random access on elements is very difficult and jumping to the next element is not optimal (not in the code, but I hope you get the point). It also takes a long time to read the file into a vector by using push_back and adding the lines one by one. With arrays the random access is easier, but reading >1.000.000 records into an array will be very memory intense and takes a long time as well. Lists can read the files faster, random access is expensive again.
Eventually I will not only look for exact matches, but also for the first 4 characters of each line.
Can you please help me decide what the most efficient way is? I have tried arrays, vectors and lists, but am not satisfied with the speed so far. Is there any other way to find matches that I have not considered? I am very happy to change the code completely, and I look forward to any suggestions!
Thanks a lot!
EDIT: The output should list the matching values / lines. In this example the output is supposed to look like:
7485C5
948FD4
Reading 2 million lines won't be too slow; what might be slowing you down is your comparison logic.
Use std::set_intersection:
data1.sort(); // std::list::sort takes no iterators, O(N1 log N1)
data2.sort(); // O(N2 log N2)
std::vector<std::string> v; // receives the matching elements
std::set_intersection(data1.begin(), data1.end(),
                      data2.begin(), data2.end(),
                      std::back_inserter(v));
// At most 2(N1+N2)-1 comparisons
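Put together as a self-contained sketch, here using std::vector and std::sort instead of std::list (that swap and the name matching_lines are choices made for this illustration, not part of the answer):

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Return the lines that occur in both inputs. The arguments are taken by
// value so they can be sorted without touching the caller's data.
std::vector<std::string> matching_lines(std::vector<std::string> a,
                                        std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> common;
    std::set_intersection(a.begin(), a.end(),
                          b.begin(), b.end(),
                          std::back_inserter(common));
    return common;
}
```

With the sample files from the question this yields 7485C5 and 948FD4, in sorted order.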
You can also try std::set: insert the lines of one file, then test each line of the other file for membership. Note that inserting the lines of both files into a single set only gives you the unique lines, not the matches.
If the values for this are unique in the first file, this becomes trivial when exploiting the O(nlogn) characteristics of a set. The following stores all lines in the first file passed as a command-line argument to a set, then performs a O(logn) search for each line in the second file.
EDIT: Added 4-char-only preamble searching. To do this, the set contains only the first four chars of each line, and the search from the second looks for only the first four chars of each search-line. The second-file line is printed in its entirety if there is a match. Printing the first file full-line in entirety would be a bit more challenging.
#include <cstdlib>
#include <iostream>
#include <fstream>
#include <string>
#include <set>
int main(int argc, char *argv[])
{
    if (argc < 3)
        return EXIT_FAILURE;
    // load set with the first four chars of each line of the first file
    std::ifstream inf(argv[1]);
    std::set<std::string> lines;
    std::string line;
    while (std::getline(inf, line))
        lines.insert(line.substr(0,4));
    // read second file, printing every line whose first four chars are in the set
    std::ifstream inf2(argv[2]);
    while (std::getline(inf2, line))
    {
        if (lines.find(line.substr(0,4)) != lines.end())
            std::cout << line << std::endl;
    }
    return 0;
}
One solution is to read the entire file at once.
Use istream::seekg and istream::tellg to figure the size of the two files. Allocate a character array large enough to store them both. Read both files into the array, at appropriate location, using istream::read.
Here is an example of the above functions.

Error reading and printing a text file with C++

I have a bug in my code (the code is at the end of the question). The purpose of my C++ executable is to read a file that contains numbers, copy them into a std::vector and
then just print the contents to stdout. Where is the problem? (atoi?)
I have a simple text file that contains the following numbers (each line has one number)
mini01:algorithms ios$ cat numbers.txt
1
2
3
4
5
When I execute the program I receive one more line:
mini01:algorithms ios$ ./a.out
1
2
3
4
5
0
Why do I get the 6th line in stdout?
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
using namespace std;
void algorithm(std::vector<int>& v) {
for(int i=0; i < v.size(); i++) {
cout << v[i] << endl;
}
}
int main(int argc, char **argv) {
string line;
std::vector<int> vector1;
ifstream myfile("numbers.txt");
if ( myfile.is_open()) {
while( myfile.good() )
{
getline(myfile, line);
vector1.push_back(atoi(line.c_str()));
}
myfile.close();
}
else {
cout << "Unable to open file" << endl;
}
algorithm(vector1);
return 0;
}
You should not use while (myfile.good()), as it will loop once too many.
Instead use
while (getline(...))
The reason you can't use the flags to check for looping, is that they don't get set until after an input/output operation notices the problem (error or end-of-file).
Don't use good() as the condition of your extraction loop. It does not accurately indicate whether the next read will succeed or not. Move your call to getline into the condition:
while(getline(myfile, line))
{
vector1.push_back(atoi(line.c_str()));
}
The reason it is failing in this particular case is because text files typically have an \n at the end of the file (that is not shown by text editors). When the last line is read, this \n is extracted from the stream. Yes, that may be the very last character in the file, but getline doesn't care to look any further than the \n it has extracted. It's done. It does not set the EOF flag or do anything else to cause good() to return false.
So at the next iteration, good() is still true, the loop continues and getline attempts to extract from the file. However, now there's nothing left to extract and you just get line set to an empty string. This then gets converted to an int and pushed into the vector1, giving you the extra value.
In fact, the only robust way to check if there is a problem with extraction is to check the stream's status bits after extracting. The easiest way to do this is to make the extraction itself the condition.
You read one line too many, since the while condition only becomes false AFTER you have had a "bad read".
Welcome to the wonderful world of C++. Before we get to the bug, I would advise you to drop the std:: qualification when defining or declaring a vector, as you already have
using namespace std;
A second piece of advice would be to use the pre-increment operator ++i instead of i++ wherever feasible. You can see more details on that here.
Coming to your problem in itself, the issue is an empty new line being read at the end of file. A simple way to avoid this would be to check the length of line before using it.
getline(myfile, line);
if (line.size()) {
vector1.push_back(atoi(line.c_str()));
}
This would also let your program read a file interspersed with empty lines. To be even more foolproof, you can check the line for any non-numeric characters before using atoi on it. However, the best solution, as mentioned, is to move the read into the loop condition.

For loops and inputing data?

I'm trying to figure out how to make a little inventory program and I can't for the life of me figure out why it isn't working.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
struct record
{
int item_id;
string item_type;
int item_price;
int num_stock;
string item_title;
string item_author;
int year_published;
};
void read_all_records(record records[]);
const int max_array = 100;
int main()
{
record records[max_array];
read_all_records(records);
cout << records[2].item_author;
return 0;
}
void read_all_records(record records[])
{
ifstream invfile;
invfile.open("inventory.dat");
int slot = 0;
for (int count = 0; count<max_array; count++);
{
invfile >> records[slot].item_id >> records[slot].item_type >> records[slot].item_price >> records[slot].num_stock >> records[slot].item_title >> records[slot].item_author >> records[slot].year_published;
slot++;
}
invfile.close();
}
I'm testing it by having it print the second item from records author. When I run it, it doesn't show the authors name at all. The .dat file is located in just about every folder where the project is (I forgot which folder it needs to be in) so it's there.
The issue isn't that the file isn't working. It's the array not printing off anything.
my inv file is basically:
123456
book
69.99
16
title
etc
etc
and repeats for different books/cds etc all on one line, all without spaces. Should just next in.
You should check to see that the file is open.
invfile.open("inventory.dat");
if (!invfile.is_open())
throw std::runtime_error("couldn't open inventory file");
You should check that your file reads are working, and break out of the loop when you hit the end of the file.
invfile >> records[slot].item_id >> records[slot].item_type ...
if (invfile.bad())
throw std::runtime_error("file handling didn't work");
if (invfile.eof())
break;
You probably want to read one record at a time, as it isn't clear from this code how the C++ streams are supposed to differentiate between the fields.
Usually you'd expect to use std::getline, split the fields on however you delimit them, and then use something like boost::lexical_cast to do the type parsing.
If I were doing this, I think I'd structure it quite a bit differently.
First, I'd overload operator>> for a record:
std::istream &operator>>(std::istream &is, record &r) {
    // read a single `record`, in the same field order as `read_all_records`,
    // and return the stream when done reading from it
    return is >> r.item_id >> r.item_type >> r.item_price >> r.num_stock
              >> r.item_title >> r.item_author >> r.year_published;
}
Then I'd use an std::vector<record> instead of an array -- it's much less prone to errors.
To read the data, I'd use std::istream_iterators, probably supplying them to the constructor for the vector<record>:
std::ifstream invfile("inventory.dat");
std::vector<record> records((std::istream_iterator<record>(invfile)),
std::istream_iterator<record>());
In between those (i.e., after creating the file, but before the vector) is where you'd insert your error handling, roughly on the order of what @Tom Kerr recommended: checks for is_open(), bad(), eof(), etc., to figure out what (if anything) is going wrong in attempting to open the file.
Add a little check:
if (!invfile.is_open()) {
cout<<"file open failed";
exit(1);
}
So that way, you don't need to copy your input file everywhere like you do now ;-)
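You are reading the fields in a specific order, so your input file should have the same order and the required number of inputs.
You are printing the third element of the records array, so the file must hold at least three records. Also note the stray semicolon after the for (...) header in read_all_records: it makes the loop body empty, so the block below it runs only once and only records[0] is ever filled. It would be a lot easier to say more if you could post your sample input file.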