I have a Person class and a Pet class.
A person has a name and an age.
A pet has an animal type and a colour.
The Person class holds a vector of pets.
I am trying to read data from a file to fill a vector of Person objects (each of which has its own vector of pets).
file structured like so:
sally 32
cat brown
-1
tom 49
dog white
dog brown
-1
sue 54
lizard green
-1
emily 18
cat white
cat brown
cat black
-1
-1 being a "flag" that the entry is finished.
I can't figure out how to account for the variable number of pets per person, let alone how to stop at -1.
Here is the code that I have so far (it works on a data file that is linear, without variability in the number of pets):
void fillArray(vector<Person> &people){
ifstream peopleFile("people.dat");
string name;
int age;
string animal;
string colour;
while(!peopleFile.eof()){
peopleFile>>name>>age;
peopleFile>>animal>>colour;
Person p(name,age);
Pet pet(animal,colour);
p.addPet(pet);
people.push_back(p);
..
}
}
It is not so complicated. We will break up the big problem into smaller problems and then solve that part by part.
We will use the iostream extractor operator to get all data from the stream. For the Pet class, we will first read the complete line, with the animal and the color. We will use the most used std::getline for that purpose.
We always read the complete line first to avoid problems with the line end. Then we stuff the just read line into a std::istringstream. This is also a stream, and with that, we can read from a string like from any other stream.
From that stream we simply extract the animal and the color. Not so complicated.
Reading a person is a little bit more complex. A new entry in a file could start with an empty line. So we will first read the empty lines and discard them.
Then, again, we put the line in an std::istringstream and then extract the name and the age. Simple.
Next, there may be one or more pets. The delimiter is a line containing -1. So, we read lines in a while loop, until the line is -1. In the case of -1, we will not execute the loop body.
If it is not -1, but valid data, we put the line again into a std::istringstream and extract one Pet from this stream. The line Pet pTemp; iss2 >> pTemp; will call the Pet's overloaded extractor operator. Then we add the new pet to our internal std::vector.
That is basically all.
In main, we open the file and check, if that worked.
Then we define a variable of type vector (persons) and use its range constructor to initialize it. As iterator for the range constructor, we will use the std::istream_iterator for the Person class. And this will simply call the Person extractor, until all data are read. So, we will read all data with a one liner in the end.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iterator>
struct Pet {
std::string animal{};
std::string color{};
};
// Define Extractor for a pet
std::istream& operator >> (std::istream& is, Pet& p) {
// Read a line containing the animal name and the color
if (std::string line{}; std::getline(is, line)) {
// Now put line in istringstream, so that we can use iostream operations to extract data
std::istringstream iss{ line };
// Extract info for one pet
iss >> p.animal >> p.color;
}
return is;
}
// Define Inserter for a pet for easier output
std::ostream& operator << (std::ostream & os, const Pet & p) {
return os << "Pet: \t" << p.animal << "\t\tColor: \t" << p.color << '\n';
}
struct Person {
std::string name{};
unsigned int age{};
std::vector<Pet> pets{};
};
// Define an extractor for a person
std::istream& operator >> (std::istream& is, Person& p) {
std::string line{};
// Read empty strings and discard them
while (std::getline(is, line) && line == "")
;
if (is) {
// Read a line containing the name and the age
// Now put line in istringstream, so that we can use iostream operations to extract data
std::istringstream iss{ line };
// Extract name and age
iss >> p.name >> p.age;
p.pets.clear();
// Next, we want to read all pets, line by line, until we find -1
while (std::getline(is, line) && line != "-1") {
// Now put line in istringstream, so that we can use iostream operations to extract data
std::istringstream iss2{ line };
// Extract pet from this line. Call overloaded extractor from Pet
Pet pTemp; iss2 >> pTemp;
// Add new pet to vector
p.pets.push_back(std::move(pTemp));
}
}
return is;
}
// Define Inserter for a person for easier output
std::ostream& operator << (std::ostream& os, const Person& p) {
os << "\n\nName:\t" << p.name << "\t\tAge: \t" << p.age <<'\n';
for (const Pet& pet : p.pets) os << pet;
return os;
}
int main() {
// Open file and check, if it could be opened
if (std::ifstream datFileStream{ "r:\\people.dat" }; datFileStream) {
// Read complete source file with the vectors range constructor
std::vector persons(std::istream_iterator<Person>(datFileStream), {});
// Show debug output
for (const Person& p : persons) std::cout << p;
}
else
std::cerr << "\n\n*** Error: Could not open source file\n";
return 0;
}
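If you would rather keep the shape of the original fillArray function, with constructors and addPet, the same line-based idea could be sketched like this. The Person and Pet definitions are minimal stand-ins and an assumption, because the original classes were not shown. The function takes any std::istream, so it can be fed from a file or from a string, and it assumes the flag line contains exactly -1 with no trailing characters.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-ins for the classes from the question (assumed interface)
class Pet {
public:
    Pet(std::string animal, std::string colour)
        : animal_(std::move(animal)), colour_(std::move(colour)) {}
    const std::string& animal() const { return animal_; }
    const std::string& colour() const { return colour_; }
private:
    std::string animal_;
    std::string colour_;
};

class Person {
public:
    Person(std::string name, int age) : name_(std::move(name)), age_(age) {}
    void addPet(const Pet& pet) { pets_.push_back(pet); }
    const std::string& name() const { return name_; }
    int age() const { return age_; }
    const std::vector<Pet>& pets() const { return pets_; }
private:
    std::string name_;
    int age_;
    std::vector<Pet> pets_;
};

// Read persons until the stream is exhausted; each pet list ends at the "-1" line
void fillArray(std::istream& in, std::vector<Person>& people) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream header{ line };
        std::string name;
        int age{};
        if (!(header >> name >> age)) continue;   // skip empty or malformed header lines

        Person p(name, age);
        // Pet lines follow until the "-1" flag (or the end of the input)
        while (std::getline(in, line) && line != "-1") {
            std::istringstream petLine{ line };
            std::string animal, colour;
            if (petLine >> animal >> colour) p.addPet(Pet(animal, colour));
        }
        people.push_back(p);
    }
}
```

With a file you would call it as fillArray(peopleFile, people) after opening std::ifstream peopleFile("people.dat").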
I'm new to file read/write in C++. Could someone please show me the best way to read a file like the one shown below into a class like this:
class Student
{
public:
string fName;
string sName;
int mark;
};
// file.txt: each record ends with a newline and the fields are separated by ';'
Firstname1;SecondName1;Mark1
Firstname2;SecondName2;Mark2
Firstname3;SecondName3;Mark3
Firstname4;SecondName4;Mark4
Firstname5;SecondName5;Mark5
Could someone please help me find the best way?
So, we have here 2 problems to solve.
read data from a file
split the read data into its parts
The format of the source text file is the so-called "CSV" format, for "Comma Separated Values". Instead of a comma, any other meaningful separator may be used.
So, what needs to be done?
Open the file and check, if it could be opened
In a loop read line by line from the file
split the values into its parts
store the parts in your struct
The challenge for you is here the "splitting" of the string. There are many many many potential solutions, let me share one often used approach with std::getline.
std::getline basically reads characters from an input stream and places them into a string. Reading is done until a delimiter is found. If you do not specifically provide a delimiter as input parameter, then '\n' is assumed. So, it will read a complete line from a stream (file).
Then, because mixing formatted and unformatted input has pitfalls that many people do not know about, the line just read is put into a std::istringstream. This is also more robust against malformed input. Then we can extract the parts of the string, again using std::getline.
One potential example could be:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
class Student
{
public:
std::string fName{};
std::string sName{};
int mark{};
};
int main() {
// Open file and check, if it could be opened
if (std::ifstream sourceFileStream{ "file.txt" }; sourceFileStream) {
// The file could be opened. Define a vector, where we store all Student data
std::vector<Student> students{};
// Read all lines of the file
for (std::string line; std::getline(sourceFileStream, line); ) {
// Put the line into an std::istringstream, to extract the parts
std::istringstream iss{ line };
// Define a temporary Student for storing the parts
Student tempStudent{};
// Get the parts
std::getline(iss, tempStudent.fName, ';');
std::getline(iss, tempStudent.sName, ';');
// Use formatted input function to get the mark and convert it to an int
iss >> tempStudent.mark;
// So, now all parts are in the tempStudent. Add this to the result
students.push_back(tempStudent);
}
// For debug purposes, we show all read data
for (const Student& student : students)
std::cout << student.fName << '\t' << student.sName << "\t Mark: " << student.mark << '\n';
}
else
// File could not be opened. Show error message
std::cerr << "\n\nError: 'file.txt' could not be opened\n";
}
For the more experienced users: in C++ we often use a more object-oriented approach. We store the data, and the methods operating on that data, in the class.
C++ allows us to overload the extraction operator >> and the inserter operator <<. Written as friend functions inside the class, they make IO easier.
Please see the slightly more advanced solution below:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iterator>
class Student
{
public:
std::string fName{};
std::string sName{};
int mark{};
friend std::istream& operator >> (std::istream& is, Student& s) {
if (std::string line{}; std::getline(is, line)) {
std::istringstream iss{ line };
// Both names end at a ';', so both getline calls need the delimiter
std::getline(std::getline(iss, s.fName, ';'), s.sName, ';') >> s.mark;
}
return is;
}
friend std::ostream& operator << (std::ostream& os, const Student& s) {
return os << s.fName << '\t' << s.sName << "\t Mark: " << s.mark;
}
};
class Students
{
public:
std::vector<Student> data{};
friend std::istream& operator >> (std::istream& is, Students& s) {
s.data = { std::istream_iterator<Student>(is),{} };
return is;
}
friend std::ostream& operator << (std::ostream& os, const Students& s) {
for (const Student& d : s.data) os << d << '\n';
return os;
}
};
int main() {
// Open file and check, if it could be opened
if (std::ifstream sourceFileStream{ "file.txt" }; sourceFileStream) {
// The file could be opened. Define an instance of Students
Students students{};
// Read all students
sourceFileStream >> students;
// Show all data
std::cout << students;
}
else
// File could not be opened. Show error message
std::cerr << "\n\nError: 'file.txt' could not be opened\n";
}
I have written a simple C++ program to gain knowledge in C++. It's a game that stores its data (score, name, etc.) in a file and reads it back in.
Each line in the file stores the content of one Player object.
Ex: ID Age Name etc.
I now wanted to change to comma separation in the file, but then I faced the issue of how to read each line and write the Player object into a vector of Player objects (std::vector<Player>) correctly.
My code today looks like this:
std::vector<Player> readPlayerToVector()
{
// Open the File
std::ifstream in("players.txt");
std::vector<Player> players; // Empty player vector
while (in.good()) {
Player temp; //
in >> temp.pID;
....
players.push_back(temp);
}
in.close();
return players;
}
How should I change this code to be compatible with comma separation? Right now it works with space separation, using the overload of >>.
Be aware that I am a beginner in C++. I've tried looking at examples where std::getline(ss, line) with a stringstream is used, but I can't figure out a good way to fill the Player object with that method.
I will try to help and explain all the steps to you. I will first show a little bit of theory and then an easy solution, some alternative solutions and the C++ (object-oriented) approach.
So, we will go from super easy to a more modern C++ solution.
Let’s start. Assume that you have a player with some attributes. Attributes could be for example: ID Name Age Score. If you store this data in a file, it could look like:
1 Peter 23 0.98
2 Carl 24 0.75
3 Bert 26 0.88
4 Mike 24 0.95
But at some point in time, we notice that this nice and simple format will not work any longer. The reason is that formatted input functions with the extractor operator >> will stop the conversion at a white space. And this will not work for the following example:
1 Peter Paul 23 0.98
2 Carl Maria 24 0.75
3 Bert Junior 26 0.88
4 Mike Senior 24 0.95
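To make the problem concrete, here is a small sketch (the function name parseSpaceSeparated is made up for illustration) that tries to read one record with plain >>. For a line with a one-word name it succeeds. For a line with a two-word name, the extraction of the age meets the second word and the stream goes into the fail state.

```cpp
#include <sstream>
#include <string>

// Returns true only if the whole record could be extracted with plain >>
bool parseSpaceSeparated(const std::string& line) {
    std::istringstream iss{ line };
    unsigned long id{};
    std::string name;
    int age{};
    double score{};
    // >> stops each string at the first white space, so name receives only one word
    return static_cast<bool>(iss >> id >> name >> age >> score);
}
```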
Then the statement fileStream >> id >> name >> age >> score; will not work any longer, and everything will fail. Therefore storing data in a CSV (Comma Separated Values) format is widely chosen.
The file would then look like:
1, Peter Paul, 23, 0.98
2, Carl Maria, 24, 0.75
3, Bert Junior, 26, 0.88
4, Mike Senior, 24, 0.95
And with that, we can clearly see, what value belongs to which attribute. But unfortunately, this will make reading more difficult. Because you do need to follow 3 steps now:
Read a complete line as a std::string
Split this string into substrings using the comma as a separator
Convert the substrings to the required format, for example from a string to a number for the age
So, let us solve this step by step.
Reading a complete line is easy. For this we have the function std::getline. It will read a line (all text up to the end-of-line character ‘\n’) from a stream (from any istream, like std::cin, a std::ifstream or also a std::istringstream) and store it in a std::string variable. Please read the description of the function in the CPP reference.
Now, splitting a CSV string in its parts. There are so many methods available, that it is hard to tell what is the good one. I will also show several methods later, but the most common approach is done with std::getline. (My personal favorite is the std::sregex_token_iterator, because it fits perfectly into the C++ algorithm world. But for here, it is too complex).
OK, std::getline. As you have read in the CPP reference, std::getline reads characters until it finds a delimiter. If you do not specify a delimiter, then it will read until the end of line \n. But you can also specify a different delimiter. And this we will do in our case. We will choose the delimiter ‘,’.
But, additional problem, after reading a complete line in step 1, we have this line in a std::string. And, std::getline wants to read from a stream. So, the std::getline with a comma as delimiter cannot be used with a std::string as source. Fortunately also here is a standard approach available. We will convert the std::string into a stream, by using a std::istringstream. You can simply define a variable of this type and pass the just read string as parameter to its constructor. For example:
std::istringstream iss(line);
And now we can use all iostream functions also with this “iss”. Cool. We will use std::getline with a ‘,’ delimiter and receive a substring.
The 3rd and last step is unfortunately also necessary. Now we have a bunch of substrings. But we also have 3 numbers as attributes. The “ID” is an unsigned long, the “Age” is an int and the “Score” is a double. So we need to use string conversion functions to convert the substrings to numbers: std::stoul, std::stoi and std::stod. If the input data is always OK, then this is sufficient, but if we need to validate the input, then it will be more complicated. Let us assume that we have good input.
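For the validation case, one possible sketch is a small wrapper around std::stoi that reports failure through std::optional instead of throwing. The name toInt is hypothetical, and the rule that trailing spaces are tolerated while any other trailing character is rejected is an assumption about what "valid" should mean. It needs C++17.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Convert a whole field to int; returns std::nullopt instead of throwing
std::optional<int> toInt(const std::string& field) {
    try {
        std::size_t pos{};
        const int value = std::stoi(field, &pos);   // pos = index of first unconverted character
        // Allow trailing spaces or tabs, but reject anything else after the number
        if (field.find_first_not_of(" \t", pos) != std::string::npos) return std::nullopt;
        return value;
    }
    catch (const std::invalid_argument&) { return std::nullopt; }   // no digits at all
    catch (const std::out_of_range&)     { return std::nullopt; }   // does not fit into an int
}
```

The same pattern would work for std::stoul and std::stod.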
Then, one of really many many possible examples:
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
unsigned long ID{};
std::string name{};
int age{};
double score{};
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("players.txt");
// Here we will store all players that we read
std::vector<Player> players{};
// We will read a complete line and store it here
std::string line{};
// Read all lines of the source CSV file
while (std::getline(in, line)) {
// Now we read a complete line into our std::string line
// Put it into a std::istringstream to be able to extract it with iostream functions
std::istringstream iss(line);
// We will use a vector to store the substrings
std::string substring{};
std::vector<std::string> substrings{};
// Now, in a loop, get the substrings from the std::istringstream
while (std::getline(iss, substring, ',')) {
// Add the substring to the std::vector
substrings.push_back(substring);
}
// Now store the data for one player in a Player struct
Player player{};
player.ID = std::stoul(substrings[0]);
player.name = substrings[1];
player.age = std::stoi(substrings[2]);
player.score = std::stod(substrings[3]);
// Add this new player to our player list
players.push_back(player);
}
// Debug output
for (const Player& p : players) {
std::cout << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score << '\n';
}
}
You see, it is getting more complex.
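One detail that the loop above leaves open: the sample file has a space after each comma, so substrings[1] is stored as " Peter Paul" with a leading space (the numeric conversions skip leading white space on their own). A small helper, sketched here under the hypothetical name trim, could be applied to each substring before it is stored.

```cpp
#include <string>

// Return a copy of s without leading and trailing spaces or tabs
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return {};      // empty or only white space
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}
```

In the example this would read player.name = trim(substrings[1]);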
If you are more experienced, then you can also use other mechanisms. But then you need to understand the difference between formatted and unformatted input and need to have a little bit more practice. This is complex. (So, do not use that in the beginning):
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
unsigned long ID{};
std::string name{};
int age{};
double score{};
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("players.txt");
// Here we will store all players that we read
Player player{};
std::vector<Player> players{};
char comma{}; // Some dummy for reading a comma
// Read all lines of the source CSV file
// Note: std::getline already consumes the ',' after the name, so no dummy comma is read there
while (std::getline(in >> player.ID >> comma >> std::ws, player.name, ',') >> player.age >> comma >> player.score) {
// Add this new player to our player list
players.push_back(player);
}
// Debug output
for (const Player& p : players) {
std::cout << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score << '\n';
}
}
As said, do not use in the beginning.
But, what you should try to learn and understand is: C++ is an object-oriented language. This means we do not only put the data into the Player struct, but also the methods that operate on this data.
And those are at the moment just input and output. And as you already know, input and output is done using iostream functionality with the extractor operator >> and the inserter operator <<. But, how to do this? Our Player struct is a custom type. It has no built-in >> and << operators.
Fortunately, C++ is a powerful language and allows us to add such functionality easily.
The declaration of the struct would then look like:
struct Player {
// The data part
unsigned long ID{};
std::string name{};
int age{};
double score{};
// The methods part
friend std::istream& operator >> (std::istream& is, Player& p);
friend std::ostream& operator << (std::ostream& os, const Player& p);
};
And, after writing the code for these operators using the above-mentioned method, we will get:
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
// The data part
unsigned long ID{};
std::string name{};
int age{};
double score{};
// The methods part
friend std::istream& operator >> (std::istream& is, Player& p) {
std::string line{}, substring{}; std::vector<std::string> substrings{};
// Read one line. If this fails (for example at the end of the file), leave p untouched
if (std::getline(is, line)) {
std::istringstream iss(line);
// Read all substrings
while (std::getline(iss, substring, ','))
substrings.push_back(substring);
// Now store the data for one player in the given Player struct
p.ID = std::stoul(substrings[0]);
p.name = substrings[1];
p.age = std::stoi(substrings[2]);
p.score = std::stod(substrings[3]);
}
return is;
}
friend std::ostream& operator << (std::ostream& os, const Player& p) {
return os << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score;
}
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("players.txt");
// Here we will store all players that we read
Player player{};
std::vector<Player> players{};
// Read all lines of the source CSV file into players
while (in >> player) {
// Add this new player to our player list
players.push_back(player);
}
// Debug output
for (const Player& p : players) {
std::cout << p << '\n';
}
}
It is simply reusing everything from what we learned above. Just put it at the right place.
We can even go one step further. The player list, the std::vector<Player>, can also be wrapped in a class and extended with iostream functionality.
By knowing all of the above, this will be really simple now. See:
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
// The data part
unsigned long ID{};
std::string name{};
int age{};
double score{};
// The methods part
friend std::istream& operator >> (std::istream& is, Player& p) {
char comma{}; // Some dummy for reading a comma
// std::getline already consumes the ',' after the name, so the age follows directly
return std::getline(is >> p.ID >> comma >> std::ws, p.name, ',') >> p.age >> comma >> p.score;
}
friend std::ostream& operator << (std::ostream& os, const Player& p) {
return os << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score;
}
};
struct Players {
// The data part
std::vector<Player> players{};
// The methods part
friend std::istream& operator >> (std::istream& is, Players& ps) {
Player player{};
while (is >> player) ps.players.push_back(player);
return is;
}
friend std::ostream& operator << (std::ostream& os, const Players& ps) {
for (const Player& p : ps.players) os << p << '\n';
return os;
}
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("players.txt");
// Here we will store all players that we read
Players players{};
// Read the complete CSV file and store everything in the players list at the correct place
in >> players;
// Debug output of complete players data. Ultra short.
std::cout << players;
}
I would be happy, if you could see the simple and yet powerful solution.
At the very end, as promised. Some further methods to split a string into substrings:
Splitting a string into tokens is a very old task. There are many many solutions available. All have different properties. Some are difficult to understand, some are hard to develop, some are more complex, slower or faster or more flexible or not.
Alternatives
Handcrafted, many variants, using pointers or iterators; may be hard to develop and error prone.
Using the old-style std::strtok function. It may be unsafe and should probably not be used any longer.
std::getline. The most used implementation, but actually a "misuse" and not so flexible.
Using a dedicated modern function, specifically developed for this purpose; most flexible and fitting well into the STL environment and algorithm landscape, but slower.
Please see 4 examples in one piece of code.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <regex>
#include <algorithm>
#include <iterator>
#include <cstring>
#include <forward_list>
#include <deque>
#include <vector>
#include <type_traits>
using Container = std::vector<std::string>;
std::regex delimiter{ "," };
int main() {
// Some function to print the contents of an STL container
auto print = [](const auto& container) -> void { std::copy(container.begin(), container.end(),
std::ostream_iterator<std::decay_t<decltype(*container.begin())>>(std::cout, " ")); std::cout << '\n'; };
// Example 1: Handcrafted -------------------------------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Search for comma, then take the part and add to the result
for (size_t i{ 0U }, startpos{ 0U }; i <= stringToSplit.size(); ++i) {
// So, if there is a comma or the end of the string
if ((stringToSplit[i] == ',') || (i == (stringToSplit.size()))) {
// Copy substring
c.push_back(stringToSplit.substr(startpos, i - startpos));
startpos = i + 1;
}
}
print(c);
}
// Example 2: Using very old strtok function ----------------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Split string into parts in a simple for loop
#pragma warning(suppress : 4996)
for (char* token = std::strtok(const_cast<char*>(stringToSplit.data()), ","); token != nullptr; token = std::strtok(nullptr, ",")) {
c.push_back(token);
}
print(c);
}
// Example 3: Very often used std::getline with additional istringstream ------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Put string in an std::istringstream
std::istringstream iss{ stringToSplit };
// Extract string parts in simple for loop
for (std::string part{}; std::getline(iss, part, ','); c.push_back(part))
;
print(c);
}
// Example 4: Most flexible iterator solution ------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
//
// Everything done already with range constructor. No additional code needed.
//
print(c);
// Works also with other containers in the same way
std::forward_list<std::string> c2(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
print(c2);
// And works with algorithms
std::deque<std::string> c3{};
std::copy(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {}, std::back_inserter(c3));
print(c3);
}
return 0;
}
Happy coding!
I provided a similar solution here:
read .dat file in c++ and create to multiple data types
#include <iostream>
#include <sstream>
#include <vector>
#include <string>
struct Coefficients {
unsigned A;
std::vector<double> B;
std::vector< std::vector<double> > C;
};
std::vector<double> parseFloats( const std::string& s ) {
std::istringstream isf( s );
std::vector<double> res;
// Extract values until an extraction fails, so that no invalid value is stored
double value;
while ( isf >> value ) {
res.push_back( value );
}
return res;
}
void readCoefficients( std::istream& fs, Coefficients& c ) {
fs >> c.A;
std::ws( fs );
std::string line;
std::getline( fs, line );
c.B = parseFloats( line );
while ( std::getline( fs, line ) ) {
c.C.push_back( parseFloats( line ) );
}
}
This one also might apply:
Best way to read a files contents and separate different data types into separate vectors in C++
std::vector<int> integers;
std::vector<std::string> strings;
// open file and iterate
std::ifstream file( "filepath.txt" );
// read one line per iteration; the loop ends when no further line can be read
std::string line;
while ( std::getline(file, line) ) {
// create stream for fields
std::istringstream ils( line );
std::string token;
// read integer (I like to parse the token first and convert it in a separate step)
if ( !std::getline(ils, token, ',') ) continue;
int ivalue;
try {
ivalue = std::stoi( token );
} catch (...) {
continue;
}
integers.push_back( ivalue );
// Read string
if ( !std::getline( ils, token, ',' )) continue;
strings.push_back( token );
}
You could separate each variable by line rather than by comma. I find this approach much simpler, as you can use the getline function.
Have a read of the documentation of ifstream/ofstream. I've done several projects based on this documentation alone!
C++ fstream reference
I read configuration files of the following format into my C++ code:
# name score
Marc 19.7
Alex 3.0
Julia 21.2
So far, I have adapted a solution found here: Parse (split) a string in C++ using string delimiter (standard C++). For example, the following code snippet reads in the file line by line, and for each line calls parseDictionaryLine, which discards the first line, splits the string as described in the original thread, and inserts the values into a (self-implemented) hash table.
void parseDictionaryLine(std::string &line, std::string &delimiter, hash_table &table) {
size_t position = 0;
std::string name;
float score;
while((position = line.find(delimiter)) != std::string::npos) {
name = line.substr(0, position);
line.erase(0, position + delimiter.length());
score = stof(line);
table.hinsert(name, score);
}
}
void loadDictionary(const std::string &path, hash_table &table) {
std::string line;
std::ifstream fin(path);
std::string delimiter = " ";
int lineNumber = 0;
if(fin.is_open()) {
while(getline(fin, line)) {
if(lineNumber++ < 1) {
continue; // first line
}
parseDictionaryLine(line, delimiter, table);
}
fin.close();
}
else {
std::cerr << "Unable to open file." << std::endl;
}
}
My question would be: is there a more elegant way in C++ to achieve this task? In particular, is there (1) a better split function, as for example in Python, (2) a better method to test if a line is a comment line (starting with #), like startsWith, (3) potentially even an iterator that handles files similar to a context manager in Python and makes sure the file will actually be closed? My solution works for the simple cases shown here but becomes more clunky with more complicated variations, such as several comment lines at unpredictable positions and more parameters. Also, it worries me that my solution does not check if the file actually agrees with the prescribed format (two values per line, the first a string, the second a float). Implementing these checks with my method seems very cumbersome.
I understand there is JSON and other file formats with libraries made for this use case, but I am dealing with legacy code and cannot go there.
I will try to answer all your questions.
First, for splitting a string, you should not use the linked question/answer. It is from 2010 and rather outdated. Or, you need to scroll to the very bottom. There you will find more modern answers.
In C++ many things are done with iterators, because a lot of algorithms and constructors in C++ work with iterators. So, the better approach for splitting a string is to use iterators. This will then always result in a one liner.
Background: a std::string is also a container, and you can iterate over elements in it, for example words or values. In case of space-separated values you can use the std::istream_iterator on a std::istringstream. But for years there has been a dedicated iterator for iterating over patterns in a string:
The std::sregex_token_iterator. And because it is specifically designed for that purpose, it should be used.
And if it is used for splitting strings, the overhead of using regexes is also minimal. So, you may split on strings, commas, colons or whatever. Example:
#include <iostream>
#include <string>
#include <vector>
#include <regex>
const std::regex re(";");
int main() {
// Some test string to be split
std::string test{ "Label;42;string;3.14" };
// Split and store whatever number of elements in the vector. One Liner
std::vector data(std::sregex_token_iterator(test.begin(), test.end(), re, -1), {});
// Some debug output
for (const std::string& s : data) std::cout << s << '\n';
}
So, regardless of the number of patterns, it will copy all data parts into the std::vector.
So, now you have a one liner solution for splitting strings.
For checking whether the first character is a '#', you may use
the index operator (if (string[0] == '#'))
or, the std::string's front function (if (string.front() == '#'))
or again a regex
But here you need to be careful. The string must not be empty, so better write:
if (not string.empty() and string.front() == '#')
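If comment lines may be indented, or if you want something like Python's startswith, two small helpers could look like the sketch below. The names isCommentLine and startsWith are made up for illustration. Since C++20 std::string also has a starts_with member function, which would make the second helper unnecessary.

```cpp
#include <string>

// True if the first non-blank character of the line is '#'
bool isCommentLine(const std::string& line) {
    const auto pos = line.find_first_not_of(" \t");
    return pos != std::string::npos && line[pos] == '#';
}

// Prefix test that also works before C++20
bool startsWith(const std::string& text, const std::string& prefix) {
    return text.compare(0, prefix.size(), prefix) == 0;
}
```

Both functions are safe to call with an empty string.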
Closing file or iterating over files.
If you use a std::ifstream, then the constructor will open the file for you and the destructor will automatically close it when the stream variable runs out of scope. The typical pattern here is:
// Open the file and check, if it could be opened
if (std::ifstream fileStream{"test.txt"}; fileStream) {
// Do things
} // <-- This will close the file automatically for you
Then, in general you should use a more object-oriented approach. Data, and methods operating on this data, should be encapsulated in one class. Then you would overload the extractor operator >> and the inserter operator << to read and write the data. This is because only the class should know how to handle the data. And if you decide to use a different mechanism, you modify your class and the rest of the outside world will still work.
In your example case, input and output are so simple that the easiest IO will work. No splitting of strings is necessary.
Please see the following example.
And note especially how few statements there are in main.
If you change something inside the classes, it will simply continue to work.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
// Data in one line
struct Data {
// Name and score
std::string name{};
double score{};
// Extractor and inserter
friend std::istream& operator >> (std::istream& is, Data& d) { return is >> d.name >> d.score; }
friend std::ostream& operator << (std::ostream& os, const Data& d) { return os << d.name << '\t' << d.score; }
};
// Database, so all data from the source file
struct DataBase {
std::vector<Data> data{};
// Extractor
friend std::istream& operator >> (std::istream& is, DataBase& d) {
// Clear old data
d.data.clear();
Data element{};
// Read all lines from source stream
for (std::string line{}; std::getline(is, line);) {
// Ignore empty lines and comment lines
if (not line.empty() and line.front() != '#') {
// Call the extractor of the Data class to get the data
std::istringstream(line) >> element;
// And save the new data in the database
d.data.push_back(std::move(element));
}
}
return is;
}
// Inserter. Output all data
friend std::ostream& operator << (std::ostream& os, const DataBase& d) {
std::copy(d.data.begin(), d.data.end(), std::ostream_iterator<Data>(os, "\n"));
return os;
}
};
int main() {
// Open file and check, if it is open
if (std::ifstream ifs{ "test.txt" }; ifs) {
// Our database
DataBase db{};
// Read all data
ifs >> db;
// Debug output show all data
std::cout << db;
}
else std::cerr << "\nError: Could not open source file\n";
}
You can use operator>> to split at delimiters for you, like this:
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
std::istringstream input{
"# name score\n"
"Marc 19.7\n"
"Alex 3.0\n"
"Julia 21.2\n"
};
auto ReadDictionary(std::istream& stream)
{
// unordered_map has O(1) lookup on average, map has O(log n) lookup
// so I prefer unordered maps as dictionaries.
std::unordered_map<std::string, double> dictionary;
std::string header;
// read the first line from input (the comment line or header)
std::getline(stream, header);
std::string name;
std::string score;
// read name and score from line (>> will split at delimiters for you)
while (stream >> name >> score)
{
dictionary.insert({ name, std::stod(score) });
}
return dictionary;
}
int main()
{
auto dictionary = ReadDictionary(input); // todo replace with file stream
// range based for loop : https://en.cppreference.com/w/cpp/language/range-for
// structured binding : https://en.cppreference.com/w/cpp/language/structured_binding
for (const auto& [name, score] : dictionary)
{
std::cout << name << ": " << score << "\n";
}
return 0;
}
I want to parse a file with the following content:
2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13
Here, 2 is the number of lines, 300 is the weight.
Each line has a string and a number(price). Same with 3 and 400.
I need to store 130, 456 in an array.
Currently, I am reading the file and each line is processed as std::string. I need help to progress further.
Code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
//void processString(string line);
void process2(string line);
int main(int argc, char ** argv) {
cout << "You have entered " << argc <<
" arguments:" << "\n";
for (int i = 1; i < argc; ++i)
cout << argv[i] << "\n";
//2, 4 are the file names
//Reading file - market price file
string line;
ifstream myfile(argv[2]);
if (myfile.is_open()) {
while (getline(myfile, line)) {
// cout << line << '\n';
}
myfile.close();
} else cout << "Unable to open market price file";
//Reading file - price list file
string line_;
ifstream myfile2(argv[4]);
int c = 1;
if (myfile2.is_open()) {
while (getline(myfile2, line_)) {
// processString(line_);
process2(line_);
}
myfile2.close();
} else cout << "Unable to open price lists file";
//processString(line_);
return 0;
}
void process2(string line) {
string word = "";
for (auto x: line) {
if (x == ' ') {
word += " ";
} else {
word = word + x;
}
}
cout << word << endl;
}
Is there a split function like in Java, so I can split and store everything as tokens?
You have 2 questions in your post:
How do I parse this file in cpp?
Is there a split function like in Java, so I can split and store everything as tokens?
I will answer both questions and show a demo example.
Let's start with splitting a string into tokens. There are several possibilities. We start with the easy ones.
Since the tokens in your string are delimited by whitespace, we can take advantage of the functionality of the extractor operator (>>). It reads characters from an input stream up to the next whitespace and converts them into the type of the target variable. You know that this operation can be chained.
Then for the example string
const std::string line{ "Token1 Token2 Token3 Token4" };
you can simply put that into a std::istringstream and then extract the variables from the stream:
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
The disadvantage is that you need to write a lot of stuff and you have to know the number of elements in the string.
We can overcome this problem by using a vector as the target data store and filling it with its range constructor. The vector's range constructor takes a begin and an end iterator and copies the data between them into the vector.
As iterator we use the std::istream_iterator. This will, in simple terms, call the extractor operator (>>) until all data is consumed, whatever the number of elements is.
This will then look like the below:
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
This may look complicated, but is not. We define a variable "token" of type std::vector. We use its range constructor.
And, we can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction", C++17 required).
Additionally, you can see that I do not use the "end()"-iterator explicitly.
This iterator is constructed from the empty braces {} as a default-constructed iterator. Its type is deduced to be the same as the type of the first argument, because the std::vector range constructor requires both iterators to have the same type.
There is an additional solution. It is the most powerful solution and hence maybe a little bit too complicated in the beginning.
With that, we can avoid the usage of std::istringstream and directly convert the string into tokens using std::sregex_token_iterator. It is very simple to use. And the result is a one-liner for splitting the original string:
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
So, modern C++ has a built-in functionality which is exactly designed for the purpose of tokenizing strings. It is called std::sregex_token_iterator. What is this thing?
As its name says, it is an iterator. It will iterate over a string (hence the 's' in its name) and return the split up tokens. The tokens will be matched against a regular expression. Or, alternatively, the delimiter will be matched and the rest will be seen as token and returned. This is controlled via the last parameter of its constructor.
Let's have a look at this constructor:
token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
The first parameter is where it should start in the source string, the 2nd parameter is the end position, up to which the iterator should work. The last parameter is:
0, if you want the complete positive match for the regex (a value n > 0 selects the n-th capture group instead)
-1, if you want everything that does not match the regex
And last but not least, the 3rd parameter is the regex itself. Please read up on regexes on the net. There are tons of pages available.
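To make the effect of the last constructor argument visible, here is a small sketch. The helper name tokens is an assumption for this illustration only. It collects whatever the iterator yields for a given submatch index:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns everything a std::sregex_token_iterator yields
// for the given submatch index (-1: text between matches, 0: whole match,
// n > 0: n-th capture group)
std::vector<std::string> tokens(const std::string& text, const std::regex& re, int submatch) {
    return std::vector<std::string>(
        std::sregex_token_iterator(text.begin(), text.end(), re, submatch),
        std::sregex_token_iterator());
}
```

With the text "abc12 bcd22", the regex " " and index -1 give the two words, while the regex "([a-z]+)([0-9]+)" with index 1 gives only the letter parts and with index 2 only the digit parts.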
Please see a demo for all 3 solutions here:
#include <iostream>
#include <string>
#include <vector>
#include <regex>
#include <sstream>
#include <iterator>
#include <algorithm>
/// Split string into tokens
int main() {
// White space separated tokens in a string
const std::string line{ "Token1 Token2 Token3 Token4" };
// Solution 1: Use extractor operator ----------------------------------
// Here, we will store the result
std::string subString1{}, subString2{}, subString3{}, subString4{};
// Put the line into an istringstream for easier extraction
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
// Show result
std::cout << "\nSolution 1: Use extractor operator\n- Data: -\n" << subString1 << "\n"
<< subString2 << "\n" << subString3 << "\n" << subString4 << "\n";
// Solution 2: Use istream_iterator ----------------------------------
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
// Show result
std::cout << "\nSolution 2: Use istream_iterator\n- Data: -\n";
std::copy(token.begin(), token.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
// Solution 3: Use std::sregex_token_iterator ----------------------------------
const std::regex re(" ");
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
// Show result
std::cout << "\nSolution 3: Use sregex_token_iterator\n- Data: -\n";
std::copy(token2.begin(), token2.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
So, now the answer on how you could read your text file.
It is essential to create the correct data structures. Then, overload the inserter and extractor operators and put the above functionality in them.
Please see the below demo example. Of course there are many other possible solutions:
#include <string>
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
struct ItemAndPrice {
// Data
std::string item{};
unsigned int price{};
// Extractor
friend std::istream& operator >> (std::istream& is, ItemAndPrice& iap) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
// Read the item and price from that line and check, if that worked
if (std::istringstream iss(line); !(iss >> iap.item >> iap.price))
// There was an error, while reading item and price. Set failbit of input stream
is.setstate(std::ios::failbit);
}
return is;
}
// Inserter
friend std::ostream& operator << (std::ostream& os, const ItemAndPrice& iap) {
// Simple output of our internal data
return os << iap.item << " " << iap.price;
}
};
struct MarketPrice {
// Data
std::vector<ItemAndPrice> marketPriceData{};
size_t numberOfElements() const { return marketPriceData.size(); }
unsigned int weight{};
// Extractor
friend std::istream& operator >> (std::istream& is, MarketPrice& mp) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
size_t numberOfEntries{};
// Read the number of following entries and the weight from that line and check, if that worked
if (std::istringstream iss(line); (iss >> numberOfEntries >> mp.weight)) {
mp.marketPriceData.clear();
// Now copy the numberOfEntries next lines into our vector
std::copy_n(std::istream_iterator<ItemAndPrice>(is), numberOfEntries, std::back_inserter(mp.marketPriceData));
}
else {
// There was an error while reading the number of following entries and the weight. Set failbit of input stream
is.setstate(std::ios::failbit);
}
}
return is;
}
// Inserter
friend std::ostream& operator << (std::ostream& os, const MarketPrice& mp) {
// Simple output of our internal data
os << "\nNumber of Elements: " << mp.numberOfElements() << " Weight: " << mp.weight << "\n";
// Now copy all market price data to output stream
if (os) std::copy(mp.marketPriceData.begin(), mp.marketPriceData.end(), std::ostream_iterator<ItemAndPrice>(os, "\n"));
return os;
}
};
// For this example I do not use argv and argc and file streams,
// because on Stackoverflow I do not have files.
// So, I put the file data in an istringstream. But for the below example,
// there is no difference between a file stream and a string stream
std::istringstream sourceFile{R"(2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13)"};
int main() {
// Here we will store all the resulting data
// So, read the complete source file, parse the data and store result in vector
std::vector mp(std::istream_iterator<MarketPrice>(sourceFile), {});
// Now, all data are in mp. You may work with that now
// Show result on display
std::copy(mp.begin(), mp.end(), std::ostream_iterator<MarketPrice>(std::cout, "\n"));
return 0;
}
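If only the prices of each block are needed (the question asks to store 130, 456 in an array), a plain loop is a possible alternative. The following is a sketch under the assumption that neither item names nor numbers contain whitespace, so that >> alone can do the splitting. PriceBlock and readPriceBlocks are names invented for this illustration:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: the weight from a "count weight" header line
// plus the prices of the entries that follow it
struct PriceBlock {
    unsigned int weight{};
    std::vector<unsigned int> prices{};
};

// Reads blocks of the form "N W" followed by N entries "item price".
// Stops at the end of the stream or at the first incomplete block.
std::vector<PriceBlock> readPriceBlocks(std::istream& is) {
    std::vector<PriceBlock> result{};
    std::size_t count{};
    PriceBlock block{};
    while (is >> count >> block.weight) {
        block.prices.clear();
        std::string item{};
        unsigned int price{};
        // Read at most `count` entries; stop early if extraction fails
        for (std::size_t i = 0; i < count && (is >> item >> price); ++i)
            block.prices.push_back(price);
        if (block.prices.size() != count) break; // incomplete block is dropped
        result.push_back(block);
    }
    return result;
}
```

Compared with the operator-based version, this trades encapsulation for brevity. It also ignores line boundaries, so it would not notice an entry that is split over two lines.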
I'm trying to parse a .tsv file and store the values of each cell of the row in a struct. Each row forms the struct and is appended to a list. If a cell is empty the getline while loop ends abruptly
The .tsv file looks like this:
No Name Age Grade
1 Andy 17 A
2 Drew 16 B
3 Brad 17 B
4 Cam A
5 Sam 18 B
Sample code
std::ifstream tsvFile(filePath);
if (!tsvFile.good()) return;
for (std::string line; std::getline(tsvFile, line); )
{
example item;
tsvFile >> example.s_no >> example.name >> example.age >> example.grade;
tsv_list.push_back(item);
}
tsvFile.close();
It should loop through all rows and not stop abruptly. Is there a better way to parse a tsv line by line and use a specific tab delimiter? I tried using line, but the value doesn't seem correct. Printing line gives me an integer number and not the entire row every time I iterate through.
This is a typical example for reading records from a file.
We will choose an Object Oriented Approach and put all data into a struct, and, since the struct should know how to read and write its data, add an extractor and inserter operator.
The extractor will read a complete line and then put the data into the struct's members. In case of an error, default values can be used.
In main, we just define a vector of Roster and use the range constructor to read the complete input file during the variable definition.
Afterwards, we print the result.
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <vector>
#include <iterator>
// Test Data. Same as reading from file
std::istringstream testFile(R"#(1 Andy 17 A
2 Drew 16 B
3 Brad 17 B
4 Cam A
5 Sam 18 B
)#");
struct Roster
{
// Roster Data
size_t No{}; std::string Name{}; size_t Age{}; char Grade{};
// Extractor operator >> for Roster
friend std::istream& operator >> (std::istream& is, Roster& r) {
std::string line{}; // Here we will store the read line
std::getline(is, line); // Read complete line
std::istringstream iss(line); // Copy to an istringstream for extraction
if (!(iss >> r.No >> r.Name >> r.Age >> r.Grade)) { // Extract
// In case of error: Reset all values
r.No = 0; r.Name = "ERROR"; r.Age = 0; r.Grade = '#';
};
return is;
}
// Inserter operator << . Print space delimited data
friend std::ostream& operator << (std::ostream& os, const Roster& r) {
return os << r.No << ' ' << r.Name << ' ' << r.Age << ' ' << r.Grade;
}
};
int main()
{
// Read the complete TSV data
std::vector<Roster> roster{ std::istream_iterator<Roster>(testFile), std::istream_iterator<Roster>() };
// Copy all data to output
std::copy(roster.begin(), roster.end(), std::ostream_iterator<Roster>(std::cout, "\n"));
return 0;
}
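If a row with an empty cell should be kept instead of being replaced by default values, the line can be split at the tab character itself. The sketch below assumes that the cells in the real file are separated by exactly one tab each (the listing above shows them with spaces). The name splitTabs is made up for this example:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line at every tab.
// An empty cell between two tabs becomes an empty string.
std::vector<std::string> splitTabs(const std::string& line) {
    std::vector<std::string> cells{};
    std::istringstream iss(line);
    // std::getline with a delimiter stops at each tab and keeps empty fields
    for (std::string cell; std::getline(iss, cell, '\t');) cells.push_back(cell);
    // Nothing is reported after a trailing tab, so add that last empty cell by hand
    if (!line.empty() && line.back() == '\t') cells.emplace_back();
    return cells;
}
```

For the row of Cam this yields four cells with an empty third one, which the extractor could then convert field by field, for example with std::stoul for the non-empty numeric cells.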