Reading a big text file in C++

I want to know if there is another way to read a big file such as this one:
Hans //name
Bachelor // study_path
WS10_11 //semester
22 // age
other than like this:
fout >> name; //string
fout >> study_path; //string
fout >> Semester ; //string
fout >> age; //int
When my file grows to more than 20 lines, do I have to write 20+ fout >> statements?
Is there another way?

You could define a class to hold the data for each person:
class Person {
public:
std::string name;
std::string study_path;
std::string semester;
unsigned int age;
};
Then you can define a stream extraction operator for that class:
std::istream & operator>>(std::istream & stream, Person & person) {
stream >> person.name >> person.study_path >> person.semester >> person.age;
return stream;
}
And then you can read the entire file like this:
std::ifstream file("datafile.txt");
std::vector<Person> data;
std::copy(std::istream_iterator<Person>(file), std::istream_iterator<Person>(),
std::back_inserter(data));
This will read the entire file and store all the extracted records in a vector. If you know the number of records you will read in advance, you can call data.reserve(number_of_records) before reading the file. Thus, the vector will have enough memory to store all records without reallocation, which might potentially speed up the loading if the file is large.
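Putting the pieces above together, a minimal sketch might look like the following. The function name load_people and its expected_count parameter are assumptions made for illustration, and the extraction operator assumes the file holds only whitespace-separated values, without the trailing // comments shown in the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

struct Person {
    std::string name;
    std::string study_path;
    std::string semester;
    unsigned int age = 0;
};

// Reads four whitespace-separated fields into one Person.
std::istream& operator>>(std::istream& stream, Person& person) {
    return stream >> person.name >> person.study_path >> person.semester >> person.age;
}

// Hypothetical helper: reads every record from `in`.
// expected_count is only a hint; pass 0 if the number of records is unknown.
std::vector<Person> load_people(std::istream& in, std::size_t expected_count = 0) {
    std::vector<Person> data;
    data.reserve(expected_count);  // avoids reallocation while loading
    std::copy(std::istream_iterator<Person>(in), std::istream_iterator<Person>(),
              std::back_inserter(data));
    return data;
}
```

Because load_people takes a std::istream, the same function works with a std::ifstream opened on the real file or with a std::istringstream holding test data.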

If you are on Linux, you can mmap() the big file and use the data as if it were in memory.
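As a rough sketch of the mmap() idea, assuming a POSIX system: the function below maps a file read-only and scans the mapped bytes directly. The name count_newlines_mmap is made up for this example, and counting newlines is just a stand-in for whatever parsing you actually need.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: returns the number of '\n' bytes in the file,
// or -1 if the file cannot be opened or mapped.
long count_newlines_mmap(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1) return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  // mmap rejects a zero length

    std::size_t size = static_cast<std::size_t>(st.st_size);
    void* addr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return -1;

    // The file contents can now be read like an ordinary array of bytes.
    const char* bytes = static_cast<const char*>(addr);
    long count = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] == '\n') ++count;
    }

    munmap(addr, size);
    return count;
}
```

The operating system pages the file in on demand, so the program does not copy it into its own buffer first.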

Related

Counting the no. of words in a csv file in c++

I am trying to count the no. of dates stored in the first line of a CSV file (separated by commas):
State,Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20
I have to count the no. of dates after Long (i.e. output = 5).
I have written the code to read the CSV file, shown below, but how do I count the no. of dates after Long? I would highly appreciate your help. Please feel free to ask for any other information. Thanks.
char** readCSV(const char* csvFileName, int& csvLineCount)
{
ifstream fin(csvFileName);
if (!fin)
{
return nullptr;
}
csvLineCount = 0;
char line[1024];
while(fin.getline(line, 1024))
{
csvLineCount++;
};
char **lines = new char*[csvLineCount];
fin.clear();
fin.seekg(0, ios::beg);
for (int i=0; i<csvLineCount; i++)
{
fin.getline(line, 1024);
lines[i] = new char[strlen(line)+1];
strcpy(lines[i], line);
};
fin.close();
return lines;
}
This looks like you are reading a record that contains one or more subrecords.
I recommend using a Date class as well as a Record class.
class Date
{
public:
unsigned int month_num;
unsigned int day_num;
unsigned int year_num;
friend std::istream& operator>>(std::istream& input, Date& d);
};
std::istream& operator>>(std::istream& input, Date& d)
{
char forward_slash;
input >> d.month_num; input >> forward_slash;
input >> d.day_num; input >> forward_slash;
input >> d.year_num;
return input;
}
Overloading operator>> for the Date class will come in handy later.
Now the record class:
class Record
{
public:
std::string state;
std::string region;
std::string latitude;
std::string longitude;
// A container for the date subrecords
std::vector<Date> dates;
friend std::istream& operator>>(std::istream& input, Record& r);
};
std::istream& operator>>(std::istream& input, Record& r)
{
std::string text_line;
// Stop at end of file so that the caller's read loop terminates.
if (!std::getline(input, text_line))
{
return input;
}
std::istringstream record_stream(text_line);
std::getline(record_stream, r.state, ',');
std::getline(record_stream, r.region, ',');
std::getline(record_stream, r.latitude, ',');
std::getline(record_stream, r.longitude, ',');
// Discard dates left over from a previously read record.
r.dates.clear();
Date d;
while (record_stream >> d)
{
r.dates.push_back(d);
char comma;
record_stream >> comma;
}
return input;
}
The above function reads a text line of input, since the records are terminated by a newline. A stream is created from the text line. The stream helps in reading a variable quantity of dates.
Edit 1: Reading in the file
Your input code would look something like this:
std::vector<Record> database;
Record r;
while (fin >> r)
{
database.push_back(r);
}
Yes, you can use C-strings and arrays, but C++ streams and std::vector simplify the code, and simple code has fewer defects than complicated code.
Also, the std::vector and std::string classes manage dynamic memory for you. They expand as necessary. No need for checking for array overflows. They are also easier to pass. Passing arrays requires passing the capacity also (and maybe the quantity of elements in the array).
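If all you need is the count from the header line, a shorter sketch along the same lines is to split the line on commas and count the fields after the fourth one. The helper name count_date_columns is hypothetical, and it assumes the first four fields are always State, Region, Lat and Long.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts the comma-separated fields that follow the
// first four (State, Region, Lat, Long) in a header line.
std::size_t count_date_columns(const std::string& header_line) {
    std::istringstream fields(header_line);
    std::string field;
    std::size_t index = 0;
    std::size_t dates = 0;
    while (std::getline(fields, field, ',')) {
        // Fields 0..3 are the fixed columns; anything non-empty after them is a date.
        if (index >= 4 && !field.empty()) ++dates;
        ++index;
    }
    return dates;
}
```

For the header shown in the question this returns 5, since five fields follow Long.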

How to read from a file to constructor in C++?

I have a players.txt file that I need to read, passing the values I read to a constructor, so that I can build a list of TennisPlayer objects. I am stuck on how to do it. I can read the file word by word or line by line, but I can't work out how to pass what I read into the constructor.
My constructor has five inputs:
TennisPlayer(string firstName, string lastName, int ranking, int totalPoints, string country)
And part of my players.txt file is here:
Novak Djokovic 16790 Serbia
Andy Murray 8945 Great Britain
Secondly, how can I read "Great Britain" as one string?
I am really new in C++ and in a desperate position. Thank you all for your help.
Create a stream from a file and then pass it to a class like this. It will iterate over the file, store the values in a vector and then initialise a vector of those objects in the constructor. To read the country, we keep reading in words until we find the end of the line. Therefore, I am working under the assumption that each tennis player is specified in a single line.
Note that there is no error checking here; you may wish to add some.
class TennisPlayersList
{
public:
struct TennisPlayer
{
std::string first_name;
std::string second_name;
int ranking;
int total_points;
std::string country;
};
TennisPlayersList(std::istream& filestream)
:players(ReadPlayersFromFile(filestream))
{}
private:
std::vector<TennisPlayer> players;
static std::vector<TennisPlayer> ReadPlayersFromFile(std::istream& filestream)
{
std::vector<TennisPlayer> p;
std::string line;
// Loop on getline itself: testing eof() first would add an empty player after the last line
while (std::getline(filestream, line))
{
if (!line.empty())
p.push_back(ReadPlayer(line));
}
return p;
}
static TennisPlayer ReadPlayer(const std::string& line)
{
TennisPlayer player;
std::stringstream ss(line);
ss >> player.first_name;
ss >> player.second_name;
ss >> player.ranking;
ss >> player.total_points;
// country could be several words, keep appending until we have finished
std::string country_part;
while (ss >> country_part)
{
if (!player.country.empty())
player.country += " ";
player.country += country_part;
}
return player;
}
};
Call it with something like this:
std::fstream players;
players.open("tennis_players.txt", std::ios::in);
TennisPlayersList list(players);
players.close();
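An alternative sketch for the multi-word country: once the fixed fields have been extracted, std::getline can take the rest of the line in one go. This assumes each line holds first name, last name, ranking, total points and then the country, as in the constructor above. The function name parse_player and the bool return convention are choices made for this example only.

```cpp
#include <sstream>
#include <string>

struct TennisPlayer {
    std::string first_name;
    std::string last_name;
    int ranking = 0;
    int total_points = 0;
    std::string country;
};

// Hypothetical helper: parses one line into `out`.
// Returns false (leaving `out` untouched) if a field is missing or malformed.
bool parse_player(const std::string& line, TennisPlayer& out) {
    std::istringstream ss(line);
    TennisPlayer p;
    if (!(ss >> p.first_name >> p.last_name >> p.ranking >> p.total_points)) {
        return false;
    }
    // Skip the blanks after the last number, then take everything that is left.
    if (!std::getline(ss >> std::ws, p.country) || p.country.empty()) {
        return false;
    }
    out = p;
    return true;
}
```

With a line such as "Andy Murray 2 8945 Great Britain" (the ranking value 2 is made up here), country ends up as "Great Britain".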

Reading data from a file and storing it into a vector

I'm trying to read a list of items from a file and then store them in a vector. The issue is that my code adds the last item to the vector twice, and I'm not sure why it keeps reading the file even though the program has reached the end.
Here's what's in the text file. The "Oranges" line appears twice when I display the contents of the vector.
Apples-pounds-10 2
Oranges-pounds-5 6
Here's the code
//Read the contents of the list to a file
while (!inputFile.fail())
{
//Extract the line from the list
getline(inputFile,item_name,'-');
getline(inputFile,item_unit,'-');
inputFile >> item_amount;
inputFile >> item_price;
//Create an instance of the item object
Item New_Item(item_name, item_unit, item_amount,item_price);
//Push it to the list vector
list.push_back(New_Item);
}
//Close the file
inputFile.close();
This is a typical symptom of the while (!infile.fail()) anti-pattern.
I'd define a struct and overload operator>> for that type:
struct item {
std::string name;
std::string unit;
int amount;
int price;
};
std::istream &operator>>(std::istream &is, item &i) {
// skip the newline left over from the previous record before reading the name
getline(is >> std::ws, i.name, '-');
getline(is, i.unit, '-');
is >> i.amount;
return is >> i.price;
}
With those defined, reading the data borders on trivial:
std::ifstream inputFile("fileNameHere");
std::vector<item> items { std::istream_iterator<item>(inputFile),
std::istream_iterator<item>() };
[I changed it from list to vector, because, well, you really don't want list. You can change it back, but probably shouldn't.]
The problem is that the "fail" flag is not set until you make an attempt at reading some more data from the file. Here is a quick way of fixing this:
for (;;) {
//Extract the line from the list
getline(inputFile,item_name,'-');
getline(inputFile,item_unit,'-');
inputFile >> item_amount;
inputFile >> item_price;
if (inputFile.fail()) break;
//Create an instance of the item object
Item New_Item(item_name, item_unit, item_amount,item_price);
//Push it to the list vector
list.push_back(New_Item);
}
If this is for a learning exercise and you have not learned about overloading operator>> yet, this should do it. Otherwise, the operator>> approach is better.
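For comparison, here is a sketch of the operator>> approach written as an explicit loop. The extraction itself is the loop condition, so a failed read never reaches push_back. The Item struct and the read_items name are assumptions for this example, and the fields are assumed to be laid out as in the question's file.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

struct Item {
    std::string name;
    std::string unit;
    int amount = 0;
    int price = 0;
};

std::istream& operator>>(std::istream& is, Item& item) {
    // std::ws discards the newline left behind by the previous record.
    std::getline(is >> std::ws, item.name, '-');
    std::getline(is, item.unit, '-');
    return is >> item.amount >> item.price;
}

// Hypothetical helper: reads items until an extraction fails.
std::vector<Item> read_items(std::istream& in) {
    std::vector<Item> items;
    Item item;
    while (in >> item) {  // the body runs only after a complete, successful read
        items.push_back(item);
    }
    return items;
}
```

Fed the two-line file from the question, read_items returns exactly two items, so the "Oranges" entry is stored once.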

Reading an object from a file in c++

If I have a file containing 10 rows, where each row contains information about the following object,
class student
{
public:
string name;
int age;
int rollnum;
int year;
string father;
string mother;
string category;
string region;
char sex;
string branch;
int semester;
};
How can I read all the 10 objects information from a file? ( I am guessing I will have to take an array of 10 objects for this )
istream& operator >>(istream& in, student& val) {
return in >> val.name >> val.age >> val.rollnum >> val.year; // ...
}
Then you can do this:
for (string line; getline(infile, line); ) {
istringstream stream(line);
student person;
stream >> person;
}
Now person is being populated once per line. I do it this way rather than directly streaming from the file because this way is safer: if a line has the wrong number of tokens it won't misunderstand what the columns are, whereas without getline() you might naively parse 10 tokens of an 11 token line, then think that the 11th token on the line is the first token of the next record. Just a typical mistake that happens in beginner C++ text parsing code.
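A sketch of that line-by-line loop which also keeps the parsed records might look like this. It uses only the first four fields of the class to stay short, and it simply skips lines that fail to parse. The names Student and read_students are assumptions for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Cut-down version of the question's class, for illustration only.
struct Student {
    std::string name;
    int age = 0;
    int rollnum = 0;
    int year = 0;
};

std::istream& operator>>(std::istream& in, Student& val) {
    return in >> val.name >> val.age >> val.rollnum >> val.year;
}

// Hypothetical helper: one record per line; malformed lines are skipped.
std::vector<Student> read_students(std::istream& infile) {
    std::vector<Student> students;
    for (std::string line; std::getline(infile, line); ) {
        std::istringstream stream(line);
        Student person;
        if (stream >> person) {  // a bad line only affects itself
            students.push_back(person);
        }
    }
    return students;
}
```

Because every line gets its own istringstream, a line with missing or non-numeric fields cannot shift the columns of the records that follow it.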

Reading in a .txt file word by word to a struct in C++

I am having some trouble with my lab assignment for my CMPT class...
I am trying to read a text file that has two words and a string of numbers per line, and the file can be as long as anyone makes it.
An example is
Xiao Wang 135798642
Lucie Chan 122344566
Rich Morlan 123456789
Amir Khan 975312468
Pierre Guertin 533665789
Marie Tye 987654321
I have to make each line a separate "student", so I was thinking of using struct to do so, but I don't know how to do that as I need the first, last, and ID number to be separate.
struct Student{
string firstName;
string secondName;
string idNumber;
};
All of the tries done to read in each word separately have failed (ended up reading the whole line instead) and I am getting mildly frustrated.
With help from @Sylence I have managed to read in each line separately. I am still confused about how to split the lines by whitespace, though. Is there a split function in ifstream?
Sylence, is 'parts' going to be an array? I saw you had indexes in []'s.
What exactly does the students.add( stud ) do?
My code so far is:
int getFileInfo()
{
Student stdnt;
ifstream stdntFile;
string fileName;
char buffer[256];
cout<<"Please enter the filename of the file";
cin>>fileName;
stdntFile.open(fileName.c_str());
while(!stdntFile.eof())
{
stdntFile.getline(buffer,100);
}
return 0;
}
This is my modified and final version of getFileInfo(), thank you Shahbaz, for the easy and quick way to read in the data.
void getFileInfo()
{
int failed=0;
ifstream fin;
string fileName;
vector<Student> students; // A place to store the list of students
Student s; // A place to store data of one student
cout<<"Please enter the filename of the student grades (ex. filename_1.txt)."<<endl;
do{
if(failed>=1)
cout<<"Please enter a correct filename."<<endl;
cin>>fileName;
fin.open(fileName.c_str());// Open the file
failed++;
}while(!fin.good());
while (fin >> s.firstName >> s.secondName >> s.idNumber)
students.push_back(s);
fin.close();
cout<<students.max_size()<<endl<< students.size()<<endl<<students.capacity()<<endl;
return;
}
What I am confused about now is how to access the data that was read in! I know it was put into a vector, but how do I go about accessing the individual elements of the vector, and how exactly is the data stored in it? If I try to cout an element of the vector, I get an error because Visual Studio doesn't know what to output, I guess.
The other answers are good, but they look a bit complicated. You can do it simply by:
vector<Student> students; // A place to store the list of students
Student s; // A place to store data of one student
ifstream fin("filename"); // Open the file
while (fin >> s.firstName >> s.secondName >> s.idNumber)
students.push_back(s);
Note that if istream fails, such as when the file finishes, the istream object (fin) will evaluate to false. Therefore while (fin >> ....) will stop when the file finishes.
P.S. Don't forget to check if the file is opened or not.
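On the follow-up about accessing the stored data: each element of the vector is a whole Student, so you index the vector and then pick a member. std::cout cannot print a Student directly unless you define operator<< for it, which is probably why printing a vector element failed. A small sketch, where describe and print_students are hypothetical helper names:

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct Student {
    std::string firstName;
    std::string secondName;
    std::string idNumber;
};

// Builds a printable line from the members of one student.
std::string describe(const Student& s) {
    return s.firstName + " " + s.secondName + " " + s.idNumber;
}

// Prints every student, one per line, to the given stream.
void print_students(const std::vector<Student>& students, std::ostream& out) {
    for (std::size_t i = 0; i < students.size(); ++i) {
        // students[i] is a Student; its members are reached with the dot operator.
        out << i << ": " << describe(students[i]) << '\n';
    }
}
```

Calling print_students(students, std::cout) after the read loop lists everything that was loaded, and students[0].firstName gives a single field of the first record.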
Define a stream reader for student:
std::istream& operator>>(std::istream& stream, Student& data)
{
std::string line;
std::getline(stream, line);
std::stringstream linestream(line);
linestream >> data.firstName >> data.secondName >> data.idNumber;
return stream;
}
Now you should be able to stream objects from any stream, including a file:
int main()
{
std::ifstream file("data");
Student student1;
file >> student1; // Read 1 student;
// Or Copy a file of students into a vector
std::vector<Student> studentVector;
std::copy(std::istream_iterator<Student>(file),
std::istream_iterator<Student>(),
std::back_inserter(studentVector)
);
}
Simply read a whole line and then split the string at the spaces and assign the values to an object of the struct.
pseudo code:
while( !eof )
line = readline()
parts = line.split( ' ' )
Student stud = new Student()
stud.firstName = parts[0]
stud.secondName = parts[1]
stud.idNumber = parts[2]
students.add( stud )
end while
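The pseudo code above could be turned into C++ roughly as follows. Here parts is a std::vector<std::string>, and students.add( stud ) corresponds to push_back on a std::vector<Student>. The helper names split_on_whitespace and read_students are made up for this sketch, and lines that do not have exactly three parts are skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Student {
    std::string firstName;
    std::string secondName;
    std::string idNumber;
};

// C++ has no built-in split; extracting words from a string stream does the job.
std::vector<std::string> split_on_whitespace(const std::string& line) {
    std::istringstream words(line);
    std::vector<std::string> parts;
    for (std::string word; words >> word; ) {
        parts.push_back(word);
    }
    return parts;
}

std::vector<Student> read_students(std::istream& in) {
    std::vector<Student> students;
    for (std::string line; std::getline(in, line); ) {
        std::vector<std::string> parts = split_on_whitespace(line);
        if (parts.size() != 3) continue;  // skip blank or malformed lines
        Student stud;
        stud.firstName = parts[0];
        stud.secondName = parts[1];
        stud.idNumber = parts[2];
        students.push_back(stud);  // the "students.add( stud )" step
    }
    return students;
}
```

The loop condition is the getline call itself rather than an eof() test, so the body never runs on a failed read.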