How do I parse this file in cpp?

I want to parse a file with the following content:
2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13
Here, 2 is the number of lines, 300 is the weight.
Each line has a string and a number(price). Same with 3 and 400.
I need to store 130, 456 in an array.
Currently, I am reading the file and each line is processed as std::string. I need help to progress further.
Code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
//void processString(string line);
void process2(string line);
int main(int argc, char ** argv) {
cout << "You have entered " << argc <<
" arguments:" << "\n";
for (int i = 1; i < argc; ++i)
cout << argv[i] << "\n";
//2, 4 are the file names
//Reading file - market price file
string line;
ifstream myfile(argv[2]);
if (myfile.is_open()) {
while (getline(myfile, line)) {
// cout << line << '\n';
}
myfile.close();
} else cout << "Unable to open market price file";
//Reading file - price list file
string line_;
ifstream myfile2(argv[4]);
int c = 1;
if (myfile2.is_open()) {
while (getline(myfile2, line_)) {
// processString(line_);
process2(line_);
}
myfile2.close();
} else cout << "Unable to open price lists file";
//processString(line_);
return 0;
}
void process2(string line) {
string word = "";
for (auto x: line) {
if (x == ' ') {
word += " ";
} else {
word = word + x;
}
}
cout << word << endl;
}
Is there a split function like in Java, so I can split and store everything as tokens?

You have 2 questions in your post:
How do I parse this file in cpp?
Is there a split function like in Java, so I can split and store everything as tokens?
I will answer both questions and show a demo example.
Let's start with splitting a string into tokens. There are several possibilities. We start with the easy ones.
Since the tokens in your string are delimited by whitespace, we can take advantage of the extractor operator (>>). It reads data from an input stream up to the next whitespace and converts what it read into the type of the target variable. Extractions can be chained.
Then for the example string
const std::string line{ "Token1 Token2 Token3 Token4" };
you can simply put that into a std::istringstream and then extract the variables from the stream:
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
The disadvantage is that you need to write a lot of code and you have to know the number of elements in the string in advance.
We can overcome this problem by using a vector as the target data store and filling it with its range constructor. The vector's range constructor takes a begin and an end iterator and copies the data between them into the vector.
As the iterator we use std::istream_iterator. In simple terms, it calls the extractor operator (>>) until all data is consumed, however many elements there are.
This will then look like the below:
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
This may look complicated, but is not. We define a variable "token" of type std::vector. We use its range constructor.
And, we can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction", C++17 required).
Additionally, you can see that I do not use the "end()"-iterator explicitly.
This iterator will be constructed from the empty brace-enclosed default initializer with the correct type, because it will be deduced to be the same as the type of the first argument due to the std::vector constructor requiring that.
There is an additional solution. It is the most powerful one and hence maybe a little bit too complicated in the beginning.
With it, we can avoid std::istringstream and convert the string directly into tokens using std::sregex_token_iterator. It is simple to use, and the result is a one-liner for splitting the original string:
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
So, modern C++ has built-in functionality that is designed exactly for the purpose of tokenizing strings. It is called std::sregex_token_iterator. What is this thing?
As its name says, it is an iterator. It will iterate over a string (hence the 's' in its name) and return the split-up tokens. Either the tokens themselves are matched against a regular expression, or, conversely, the delimiter is matched and everything in between is returned as tokens. This is controlled via the last argument of its constructor.
Let's have a look at this constructor:
token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
The first parameter is where it should start in the source string, the 2nd parameter is the end position up to which the iterator should work, and the 3rd parameter is the regex itself. The last parameter selects what is returned:
0 (the default), if you want the text that matches the regex; 1, 2, ... select the corresponding capture group of each match
-1, if you want everything that does not match the regex, i.e. the text between the matches
Please read up on regexes on the net. There are tons of pages available.
Please see a demo for all 3 solutions here:
#include <iostream>
#include <string>
#include <vector>
#include <regex>
#include <sstream>
#include <iterator>
#include <algorithm>
/// Split string into tokens
int main() {
// White space separated tokens in a string
const std::string line{ "Token1 Token2 Token3 Token4" };
// Solution 1: Use extractor operator ----------------------------------
// Here, we will store the result
std::string subString1{}, subString2{}, subString3{}, subString4{};
// Put the line into an istringstream for easier extraction
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
// Show result
std::cout << "\nSolution 1: Use extractor operator\n- Data: -\n" << subString1 << "\n"
<< subString2 << "\n" << subString3 << "\n" << subString4 << "\n";
// Solution 2: Use istream_iterator ----------------------------------
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
// Show result
std::cout << "\nSolution 2: Use istream_iterator\n- Data: -\n";
std::copy(token.begin(), token.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
// Solution 3: Use std::sregex_token_iterator ----------------------------------
const std::regex re(" ");
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
// Show result
std::cout << "\nSolution 3: Use sregex_token_iterator\n- Data: -\n";
std::copy(token2.begin(), token2.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
So, now the answer to how you could read your text file.
It is essential to create the correct data structures. Then, overload the inserter and extractor operators and put the above functionality in them.
Please see the demo example below. Of course there are many other possible solutions:
#include <string>
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
struct ItemAndPrice {
// Data
std::string item{};
unsigned int price{};
// Extractor
friend std::istream& operator >> (std::istream& is, ItemAndPrice& iap) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
// Read the item and price from that line and check, if that worked
if (std::istringstream iss(line); !(iss >> iap.item >> iap.price))
// There was an error while reading item and price. Set failbit of input stream
is.setstate(std::ios::failbit);
}
return is;
}
// Inserter
friend std::ostream& operator << (std::ostream& os, const ItemAndPrice& iap) {
// Simple output of our internal data
return os << iap.item << " " << iap.price;
}
};
struct MarketPrice {
// Data
std::vector<ItemAndPrice> marketPriceData{};
size_t numberOfElements() const { return marketPriceData.size(); }
unsigned int weight{};
// Extractor
friend std::istream& operator >> (std::istream& is, MarketPrice& mp) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
size_t numberOfEntries{};
// Read the number of following entries and the weight from that line and check, if that worked
if (std::istringstream iss(line); (iss >> numberOfEntries >> mp.weight)) {
mp.marketPriceData.clear();
// Now copy the numberOfEntries next lines into our vector
std::copy_n(std::istream_iterator<ItemAndPrice>(is), numberOfEntries, std::back_inserter(mp.marketPriceData));
}
else {
// There was an error while reading the number of following entries and the weight. Set failbit of input stream
is.setstate(std::ios::failbit);
}
}
return is;
};
// Inserter
friend std::ostream& operator << (std::ostream& os, const MarketPrice& mp) {
// Simple output of our internal data
os << "\nNumber of Elements: " << mp.numberOfElements() << " Weight: " << mp.weight << "\n";
// Now copy all market price data to output stream
if (os) std::copy(mp.marketPriceData.begin(), mp.marketPriceData.end(), std::ostream_iterator<ItemAndPrice>(os, "\n"));
return os;
}
};
// For this example I do not use argv, argc and file streams,
// because on Stack Overflow I do not have access to files.
// So, I put the file data in an istringstream. For the example below,
// there is no difference between a file stream and a string stream
std::istringstream sourceFile{R"(2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13)"};
int main() {
// Here we will store all the resulting data
// So, read the complete source file, parse the data and store result in vector
std::vector mp(std::istream_iterator<MarketPrice>(sourceFile), {});
// Now, all data are in mp. You may work with that now
// Show result on display
std::copy(mp.begin(), mp.end(), std::ostream_iterator<MarketPrice>(std::cout, "\n"));
return 0;
}

Related

Read from comma separated file into vector of objects

I have written a simple C++ program to gain knowledge in C++. It's a game which, at the end, stores data such as score and name to a file and reads it back.
Each line in the file stores the content for one Player object.
Ex: ID Age Name etc.
I now want to change to comma separation in the file, but then I face the issue of how to read each line and correctly write the Player objects into a std::vector of Player objects.
My code today looks like this:
std::vector<Player> readPlayerToVector()
{
// Open the File
std::ifstream in("players.txt");
std::vector<Player> players; // Empty player vector
while (in.good()) {
Player temp; //
in >> temp.pID;
....
players.push_back(temp);
}
in.close();
return players;
}
How should I change this code to be compatible with comma separation? Right now it works with space separation through the overload of >>.
Be aware that I am a beginner in C++. I've tried looking at examples where std::getline(ss, line) with a stringstream is used, but I can't figure out a good way to fill the Player object with that method.

I will try to help and explain all the steps. I will first show a little bit of theory, then an easy solution, some alternative solutions and finally the C++ (object-oriented) approach.
So, we will go from super easy to a more modern C++ solution.
Let's start. Assume that you have a player with some attributes. Attributes could be for example: ID Name Age Score. If you store this data in a file, it could look like:
1 Peter 23 0.98
2 Carl 24 0.75
3 Bert 26 0.88
4 Mike 24 0.95
But at some point in time, we notice that this nice and simple format will not work any longer. The reason is that formatted input functions with the extractor operator >> will stop the conversion at a white space. And this will not work for the following example:
1 Peter Paul 23 0.98
2 Carl Maria 24 0.75
3 Bert Junior 26 0.88
4 Mike Senior 24 0.95
Then the statement fileStream >> id >> name >> age >> score; will not work any longer, and everything will fail. Therefore storing data in a CSV (Comma Separated Values) format is widely chosen.
The file would then look like:
1, Peter Paul, 23, 0.98
2, Carl Maria, 24, 0.75
3, Bert Junior, 26, 0.88
4, Mike Senior, 24, 0.95
And with that, we can clearly see which value belongs to which attribute. Unfortunately, this makes reading more difficult, because you now need to follow 3 steps:
Read a complete line as a std::string
Split this string into substrings using the comma as a separator
Convert the substrings to the required format, for example from string to number age
So, let us solve this step by step.
Reading a complete line is easy. For this we have the function std::getline. It will read a line (text up to the end-of-line character '\n') from a stream (from any istream, like std::cin, a std::ifstream or also a std::istringstream) and store it in a std::string variable. Please read the description of the function in the CPP Reference.
Now, splitting a CSV string into its parts. There are so many methods available that it is hard to tell which one is best. I will also show several methods later, but the most common approach uses std::getline. (My personal favorite is the std::sregex_token_iterator, because it fits perfectly into the C++ algorithm world. But for here, it is too complex.)
OK, std::getline. As you have read in the CPP reference, std::getline reads characters until it finds a delimiter. If you do not specify a delimiter, it will read until the end of line '\n'. But you can also specify a different delimiter, and this is what we will do in our case. We will choose the delimiter ','.
But there is an additional problem: after reading a complete line in step 1, we have this line in a std::string, and std::getline wants to read from a stream. So, std::getline with a comma as delimiter cannot be used directly with a std::string as source. Fortunately, there is a standard approach for this as well. We will convert the std::string into a stream by using a std::istringstream. You can simply define a variable of this type and pass the string just read as a parameter to its constructor. For example:
std::istringstream iss(line);
And now we can use all iostream functions with this "iss" as well. Cool. We will use std::getline with a ',' delimiter and receive a substring.
The 3rd and last step is unfortunately also necessary. Now we have a bunch of substrings, but 3 of our attributes are numbers. The "ID" is an unsigned long, the "Age" is an int and the "Score" is a double. So we need to use string conversion functions to convert the substrings to numbers: std::stoul, std::stoi and std::stod. If the input data is always OK, then this is sufficient, but if we need to validate the input, it gets more complicated. Let us assume that we have good input.
Then, one of really many possible examples:
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
unsigned long ID{};
std::string name{};
int age{};
double score{};
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("players.txt");
// Here we will store all players that we read
std::vector<Player> players{};
// We will read a complete line and store it here
std::string line{};
// Read all lines of the source CSV file
while (std::getline(in, line)) {
// Now we read a complete line into our std::string line
// Put it into a std::istringstream to be able to extract it with iostream functions
std::istringstream iss(line);
// We will use a vector to store the substrings
std::string substring{};
std::vector<std::string> substrings{};
// Now, in a loop, get the substrings from the std::istringstream
while (std::getline(iss, substring, ',')) {
// Add the substring to the std::vector
substrings.push_back(substring);
}
// Now store the data for one player in a Player struct
Player player{};
player.ID = std::stoul(substrings[0]);
player.name = substrings[1];
player.age = std::stoi(substrings[2]);
player.score = std::stod(substrings[3]);
// Add this new player to our player list
players.push_back(player);
}
// Debug output
for (const Player& p : players) {
std::cout << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score << '\n';
}
}
You see, it is getting more complex.
If you are more experienced, you can also use other mechanisms. But then you need to understand the difference between formatted and unformatted input and need a little more practice. This is complex. (So, do not use it in the beginning):
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
unsigned long ID{};
std::string name{};
int age{};
double score{};
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("r:\\players.txt");
// Here we will store all players that we read
Player player{};
std::vector<Player> players{};
char comma{}; // Some dummy for reading a comma
// Read all lines of the source CSV file.
// std::getline already consumes the ',' after the name, so no dummy read is needed before the age
while (std::getline(in >> player.ID >> comma >> std::ws, player.name, ',') >> player.age >> comma >> player.score) {
// Add this new player to our player list
players.push_back(player);
}
// Debug output
for (const Player& p : players) {
std::cout << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score << '\n';
}
}
As said, do not use this in the beginning.
But what you should try to learn and understand is: C++ is an object-oriented language. This means we put not only the data into the Player struct, but also the methods that operate on this data.
At the moment those are just input and output. As you already know, input and output are done using iostream functionality with the extractor operator >> and the inserter operator <<. But how to do this? Our Player struct is a custom type. It has no built-in >> and << operators.
Fortunately, C++ is a powerful language and allows us to add such functionality easily.
The declaration of the struct would then look like:
struct Player {
// The data part
unsigned long ID{};
std::string name{};
int age{};
double score{};
// The methods part
friend std::istream& operator >> (std::istream& is, Player& p);
friend std::ostream& operator << (std::ostream& os, const Player& p);
};
And, after writing the code for these operators using the above-mentioned method, we will get:
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
// The data part
unsigned long ID{};
std::string name{};
int age{};
double score{};
// The methods part
friend std::istream& operator >> (std::istream& is, Player& p) {
std::string line{}, substring{}; std::vector<std::string> substrings{};
// Read a complete line. Stop here at the end of the file or on a broken stream
if (!std::getline(is, line)) return is;
std::istringstream iss(line);
// Read all substrings
while (std::getline(iss, substring, ','))
substrings.push_back(substring);
// A valid line needs 4 fields. Otherwise mark the stream as failed
if (substrings.size() < 4) { is.setstate(std::ios::failbit); return is; }
// Now store the data for one player in the given Player struct
p.ID = std::stoul(substrings[0]);
p.name = substrings[1];
p.age = std::stoi(substrings[2]);
p.score = std::stod(substrings[3]);
return is;
}
friend std::ostream& operator << (std::ostream& os, const Player& p) {
return os << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score;
}
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("r:\\players.txt");
// Here we will store all players that we read
Player player{};
std::vector<Player> players{};
// Read all lines of the source CSV file into players
while (in >> player) {
// Add this new player to our player list
players.push_back(player);
}
// Debug output
for (const Player& p : players) {
std::cout << p << '\n';
}
}
It simply reuses everything we learned above. We just put it in the right place.
We can even go one step further. The player list, the std::vector<Player>, can also be wrapped in a class and extended with iostream functionality.
Knowing all of the above, this will be really simple now. See:
#include <iostream>
#include <fstream>
#include <vector>
#include <sstream>
#include <string>
struct Player {
// The data part
unsigned long ID{};
std::string name{};
int age{};
double score{};
// The methods part
friend std::istream& operator >> (std::istream& is, Player& p) {
char comma{}; // Some dummy for reading a comma
// std::getline already consumes the ',' after the name, so the age can be read directly
return std::getline(is >> p.ID >> comma >> std::ws, p.name, ',') >> p.age >> comma >> p.score;
}
friend std::ostream& operator << (std::ostream& os, const Player& p) {
return os << p.ID << "\t" << p.name << '\t' << p.age << '\t' << p.score;
}
};
struct Players {
// The data part
std::vector<Player> players{};
// The methods part
friend std::istream& operator >> (std::istream& is, Players& ps) {
Player player{};
while (is >> player) ps.players.push_back(player);
return is;
}
friend std::ostream& operator << (std::ostream& os, const Players& ps) {
for (const Player& p : ps.players) os << p << '\n';
return os;
}
};
// !!! Demo. All without error checking !!!
int main() {
// Open the source CSV file
std::ifstream in("players.txt");
// Here we will store all players that we read
Players players{};
// Read the complete CSV file and store everything in the players list at the correct place
in >> players;
// Debug output of complete players data. Ultra short.
std::cout << players;
}
I would be happy if you could see how simple and yet powerful this solution is.
At the very end, as promised, some further methods to split a string into substrings:
Splitting a string into tokens is a very old task. There are many, many solutions available. All have different properties. Some are difficult to understand, some are hard to develop, and they differ in complexity, speed and flexibility.
Alternatives
Handcrafted, many variants, using pointers or iterators, maybe hard to develop and error-prone.
Using the old-style std::strtok function. Maybe unsafe. Maybe should not be used any longer.
std::getline. Most used implementation. But actually a "misuse" and not so flexible.
Using a dedicated modern function, specifically developed for this purpose, most flexible and fitting well into the STL environment and algorithm landscape. But slower.
Please see 4 examples in one piece of code.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <regex>
#include <algorithm>
#include <iterator>
#include <cstring>
#include <forward_list>
#include <deque>
#include <vector>
#include <type_traits>
using Container = std::vector<std::string>;
std::regex delimiter{ "," };
int main() {
// Some function to print the contents of an STL container
auto print = [](const auto& container) -> void { std::copy(container.begin(), container.end(),
std::ostream_iterator<std::decay_t<decltype(*container.begin())>>(std::cout, " ")); std::cout << '\n'; };
// Example 1: Handcrafted -------------------------------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Search for comma, then take the part and add to the result
for (size_t i{ 0U }, startpos{ 0U }; i <= stringToSplit.size(); ++i) {
// So, if there is a comma or the end of the string
if ((stringToSplit[i] == ',') || (i == (stringToSplit.size()))) {
// Copy substring
c.push_back(stringToSplit.substr(startpos, i - startpos));
startpos = i + 1;
}
}
print(c);
}
// Example 2: Using very old strtok function ----------------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Split string into parts in a simple for loop
#pragma warning(suppress : 4996)
for (char* token = std::strtok(const_cast<char*>(stringToSplit.data()), ","); token != nullptr; token = std::strtok(nullptr, ",")) {
c.push_back(token);
}
print(c);
}
// Example 3: Very often used std::getline with additional istringstream ------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Put string in an std::istringstream
std::istringstream iss{ stringToSplit };
// Extract string parts in simple for loop
for (std::string part{}; std::getline(iss, part, ','); c.push_back(part))
;
print(c);
}
// Example 4: Most flexible iterator solution ------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
//
// Everything done already with range constructor. No additional code needed.
//
print(c);
// Works also with other containers in the same way
std::forward_list<std::string> c2(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
print(c2);
// And works with algorithms
std::deque<std::string> c3{};
std::copy(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {}, std::back_inserter(c3));
print(c3);
}
return 0;
}
Happy coding!

I provided a similar solution here:
read .dat file in c++ and create to multiple data types
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
struct Coefficients {
unsigned A;
std::vector<double> B;
std::vector< std::vector<double> > C;
};
std::vector<double> parseFloats( const std::string& s ) {
std::istringstream isf( s );
std::vector<double> res;
// Store a value only if its extraction succeeded,
// so trailing whitespace does not add a bogus entry
double value;
while ( isf >> value ) {
res.push_back( value );
}
return res;
}
void readCoefficients( std::istream& fs, Coefficients& c ) {
fs >> c.A;
std::ws( fs );
std::string line;
std::getline( fs, line );
c.B = parseFloats( line );
while ( std::getline( fs, line ) ) {
c.C.push_back( parseFloats( line ) );
}
}
This one also might apply:
Best way to read a files contents and separate different data types into separate vectors in C++
std::vector<int> integers;
std::vector<std::string> strings;
// open file and iterate
std::ifstream file( "filepath.txt" );
while ( file ) {
// read one line
std::string line;
std::getline(file, line, '\n');
// create stream for fields
std::istringstream ils( line );
std::string token;
// read integer (I like to parse it and convert separated)
if ( !std::getline(ils, token, ',') ) continue;
int ivalue;
try {
ivalue = std::stoi( token );
} catch (...) {
continue;
}
integers.push_back( ivalue );
// Read string
if ( !std::getline( ils, token, ',' )) continue;
strings.push_back( token );
}

You could separate each variable by line rather than by comma. I find this approach much simpler, as you can use the getline function directly.
Have a read of the documentation of ifstream/ofstream. I've done several projects based on this documentation alone!
C++ fstream reference

Reading custom file formats in C++

I read configuration files of the following format into my C++ code:
# name score
Marc 19.7
Alex 3.0
Julia 21.2
So far, I have adapted a solution found here: Parse (split) a string in C++ using string delimiter (standard C++). For example, the following code snippet reads in the file line by line, and for each line calls parseDictionaryLine, which discards the first line, splits the string as described in the original thread, and inserts the values into a (self-implemented) hash table.
void parseDictionaryLine(std::string &line, std::string &delimiter, hash_table &table) {
size_t position = 0;
std::string name;
float score;
while((position = line.find(delimiter)) != std::string::npos) {
name = line.substr(0, position);
line.erase(0, position + delimiter.length());
score = stof(line);
table.hinsert(name, score);
}
}
void loadDictionary(const std::string &path, hash_table &table) {
std::string line;
std::ifstream fin(path);
std::string delimiter = " ";
int lineNumber = 0;
if(fin.is_open()) {
while(getline(fin, line)) {
if(lineNumber++ < 1) {
continue; // first line
}
parseDictionaryLine(line, delimiter, table);
}
fin.close();
}
else {
std::cerr << "Unable to open file." << std::endl;
}
}
My question would be, is there a more elegant way in C++ to achieve this task? In particular, is there (1) a better split function, as for example in Python, (2) a better method to test if a line is a comment line (starting with #), like startsWith, (3) potentially even an iterator that handles files similar to a context manager in Python and makes sure the file will actually be closed? My solution works for the simple cases shown here but becomes more clunky with more complicated variations such as several comment lines at unpredictable positions and more parameters. Also, it worries me that my solution does not check if the file actually agrees with the prescribed format (two values per line, first is string, second is float). Implementing these checks with my method seems very cumbersome.
I understand there is JSON and other file formats with libraries made for this use case, but I am dealing with legacy code and cannot go there.

I will try to answer all your questions.
First, for splitting a string, you should not use the linked question/answer. It is from 2010 and rather outdated. Or, you need to scroll to the very bottom. There you will find more modern answers.
In C++ many things are done with iterators, because a lot of algorithms and constructors in C++ work with iterators. So, the better approach for splitting a string is to use iterators. This will then always result in a one-liner.
Background: a std::string is also a container, and you can iterate over elements in it, for example words or values. In the case of space-separated values you can use the std::istream_iterator on a std::istringstream. But for years there has been a dedicated iterator for iterating over patterns in a string:
The std::sregex_token_iterator. And because it is specifically designed for that purpose, it should be used.
And if it is used for splitting strings, the overhead of using regexes is also minimal. So, you may split on strings, commas, colons or whatever. Example:
#include <iostream>
#include <string>
#include <vector>
#include <regex>
const std::regex re(";");
int main() {
// Some test string to be split
std::string test{ "Label;42;string;3.14" };
// Split and store whatever number of elements in the vector. One Liner
std::vector data(std::sregex_token_iterator(test.begin(), test.end(), re, -1), {});
// Some debug output
for (const std::string& s : data) std::cout << s << '\n';
}
So, regardless of the number of patterns, it will copy all data parts into the std::vector.
So, now you have a one-liner solution for splitting strings.
For checking whether the first character of a string is '#', you may use
the index operator (if (string[0] == '#'))
or the std::string's front function (if (string.front() == '#'))
or again a regex
But here you need to be careful. The string must not be empty, so better write:
if (not string.empty() and string.front() == '#')
Closing files or iterating over files.
If you use a std::ifstream, then the constructor will open the file for you and the destructor will automatically close it when the stream variable runs out of scope. The typical pattern here is:
// Open the file and check, if it could be opened
if (std::ifstream fileStream{"test.txt"}; fileStream) {
// Do things
} // <-- This will close the file automatically for you
Then, in general you should use a more object-oriented approach. Data, and the methods operating on this data, should be encapsulated in one class. Then you would overload the extractor operator >> and the inserter operator << to read and write the data. This is because only the class should know how to handle the data. And if you decide to use a different mechanism, you modify your class and the rest of the outside world will still work.
In your example case, input and output are so simple that the easiest IO will work. No splitting of strings is necessary.
Please see the following example.
And note especially how few statements there are in main.
If you change something inside the classes, it will simply continue to work.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
// Data in one line
struct Data {
// Name and score
std::string name{};
double score{};
// Extractor and inserter
friend std::istream& operator >> (std::istream& is, Data& d) { return is >> d.name >> d.score; }
friend std::ostream& operator << (std::ostream& os, const Data& d) { return os << d.name << '\t' << d.score; }
};
// Database, so all data from the source file
struct DataBase {
std::vector<Data> data{};
// Extractor
friend std::istream& operator >> (std::istream& is, DataBase& d) {
// Clear old data
d.data.clear(); Data element{};
// Read all lines from source stream
for (std::string line{}; std::getline(is, line);) {
// Ignore empty and comment lines
if (not line.empty() and line.front() != '#') {
// Call extractor from Data class and get the data
std::istringstream(line) >> element;
// And save new data in the database
d.data.push_back(std::move(element));
}
}
return is;
}
// Inserter. Output all data
friend std::ostream& operator << (std::ostream& os, const DataBase& d) {
std::copy(d.data.begin(), d.data.end(), std::ostream_iterator<Data>(os, "\n"));
return os;
}
};
int main() {
// Open file and check, if it is open
if (std::ifstream ifs{ "test.txt" }; ifs) {
// Our database
DataBase db{};
// Read all data
ifs >> db;
// Debug output show all data
std::cout << db;
}
else std::cerr << "\nError: Could not open source file\n";
}

You can use operator>> to split at delimiters for you, like this:
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
std::istringstream input{
"# name score\n"
"Marc 19.7\n"
"Alex 3.0\n"
"Julia 21.2\n"
};
auto ReadDictionary(std::istream& stream)
{
// unordered_map has O(1) average lookup, map has O(log n) lookup
// so I prefer unordered maps as dictionaries.
std::unordered_map<std::string, double> dictionary;
std::string header;
// read the first line from input (the comment line or header)
std::getline(stream, header);
std::string name;
std::string score;
// read name and score from line (>> will split at delimiters for you)
while (stream >> name >> score)
{
dictionary.insert({ name, std::stod(score) });
}
return dictionary;
}
int main()
{
auto dictionary = ReadDictionary(input); // todo replace with file stream
// range based for loop : https://en.cppreference.com/w/cpp/language/range-for
// captured binding : https://en.cppreference.com/w/cpp/language/structured_binding
for (const auto& [name, score] : dictionary)
{
std::cout << name << ": " << score << "\n";
}
return 0;
}

Is it possible to print specific lines out of this code?

I am trying to print only the necessary output from my program. It takes a long list from a text file, sorts it by first choice and GPA, and puts it into a vector.
I managed to sort by first choice and GPA. However, how can I remove whatever output isn't necessary?
I know I asked this before, but I think I didn't ask correctly previously, and I have already edited some of it.
This is an example of my Txt File (The sequence of each line is 1st choice, 2nd choice, 3rd choice, GPA, Name):
CC,DR,TP,3.8,AlexKong
SN,SM,TP,4,MarcusTan
DR,TP,SC,3.6,AstaGoodwin
SC,TP,DR,2.8,MalcumYeo
SN,SM,TP,3.7,DavidLim
SN,SM,TP,3.2,SebastianHo
SC,TP,DR,4,PranjitSingh
DR,TP,SC,3.7,JacobMa
and so on...
This is my output now (it is a long vector):
TP,DR,SC,4,SitiZakariah
TP,DR,SC,3.9,MuttuSami
TP,DR,SC,3.5,SabrinaEster
TP,DR,SC,3,KarimIlham
TP,DR,SC,3,AndryHritik
SN,SM,TP,4,JasonTan
SN,SM,TP,3.8,MarcusOng
SN,SM,TP,3.7,DavidLim
SN,SM,TP,3.4,MollyLau
SN,SM,TP,3.2,SebastianHo
SN,SM,TP,3.2,NurAfiqah
SN,SM,TP,2.4,TanXiWei
SC,TP,DR,4,SallyYeo
SC,TP,DR,4,PranjitSingh
SC,TP,DR,3.6,RanjitSing
SC,TP,DR,2.8,MalcumYeo
SC,TP,DR,2.8,AbdulHalim
SC,TP,DR,2.7,AlifAziz
DR,TP,SC,3.9,SitiAliyah
DR,TP,SC,3.9,LindaChan
DR,TP,SC,3.8,SohLeeHoon
DR,TP,SC,3.7,PrithikaSari
DR,TP,SC,3.7,NurAzizah
DR,TP,SC,3.7,JacobMa
DR,TP,SC,3.6,AstaGoodwin
CC,DR,TP,3.9,MuruArun
CC,DR,TP,3.7,DamianKoh
CC,DR,TP,3.3,MattWiliiams
CC,DR,TP,3.3,IrfanMuhaimin
And this is the output that I need (Basically students with CC as their 1st choice without displaying the 3 options. I don't want the other options without CC as their first option. I already manage to print the output without the 3 choices as follow.):
3.9,MuruArun
3.8,AlexKong
3.7,DamianKoh
3.3,MattWiliiams
3.3,IrfanMuhaimin
This is my program:
#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <algorithm>
using namespace std;
struct greater
{
template<class T>
bool operator()(T const &a, T const &b) const { return a > b; }
};
void main()
{
vector<string> v;
ifstream File;
File.open("DSA.txt");
if (!File.is_open()) return;
string line;
string Name;
string GPA;
string First;
string Second;
string Third;
getline(File, First, ',');
getline(File, Second, ',');
getline(File, Third, ',');
getline(File, Name, ',');
getline(File, GPA, '\n');
cout << "Round 1:\n";
if (First == "CC")
while (File>>line)
{
v.push_back(line);
}
sort(v.begin(), v.end(), greater());
for (int i = 0; i < v.size(); i++)
{
cout << v[i].substr(9) << endl; //remove first 3 choices from output
}
}
This is my attempt to filter out my output:
if (First == "CC")
while (File>>line)
{
v.push_back(line);
}
sort(v.begin(), v.end(), greater());
for (int i = 0; i < v.size(); i++)
{
cout << v[i].substr(9) << endl;
}
I thought that if I use getline and add an if condition to check for CC (if the first choice is CC, the condition is true), then I would print only the ones with CC as first choice and ignore the rest. So basically I try to search for CC as the first choice.
But obviously I was very wrong. So I was hoping someone knows how to filter the output.

Previous point:
As was noted in the comment section, using namespace std; is a bad choice, and your code shows one of the reasons why: it redefines greater, which is already present in that namespace.
The provided link has further explanation and alternatives.
As for your code, if the goal is to output the lines starting with CC, without the options and ordered by GPA, as I understand it, there are simpler ways of doing it. For instance, you can use std::string::find to keep only the lines with "CC" at the beginning and work from there.
You could also use std::string::starts_with; however, it's only available from C++20, so I'll go with the first option.
Live demo
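#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <functional>
#include <iostream>
#include <string>
#include <vector>
int main()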
{
std::vector<std::string> v;
std::ifstream File;
File.open("DSA.txt");
if (!File.is_open())
return EXIT_FAILURE;
std::string line;
while (File >> line)
{
if (line.find("CC") == 0) // find lines that start with CC
v.push_back(&line[9]); // add lines without the options
} // or v.push_back(line.substr(9));
std::sort(v.begin(), v.end(), std::greater<std::string>()); //sort the lines
std::cout << "GPA" << "\t" << "Name" <<"\n\n"; // Title for the table
for (auto& str : v) //print the lines
{
std::replace(str.begin(), str.end(), ',', '\t'); //let's replace the comma
std::cout << str << "\n";
}
return EXIT_SUCCESS;
}
Taking your sample, this will output:
GPA Name
3.9 MuruArun
3.7 DamianKoh
3.3 MattWiliiams
3.3 IrfanMuhaimin
Lines with "CC" in second or third options will not be parsed, as is our goal.
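As a side note on the prefix test: line.find("CC") == 0 gives the right answer, but on a line that does not start with "CC" it may scan the whole line before reporting a miss. A small sketch of a pre-C++20 alternative that only looks at the first characters follows; the helper name startsWith is an assumption, not part of the standard library:

```cpp
#include <string>

// Hypothetical helper: true if `line` begins with `prefix`.
// compare(0, n, prefix) looks only at the first n characters of `line`,
// whereas find(prefix) == 0 may search the whole line before giving up.
bool startsWith(const std::string& line, const std::string& prefix) {
    return line.compare(0, prefix.size(), prefix) == 0;
}
```

In the loop above, the condition would then read if (startsWith(line, "CC")).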
Note:
Sorting the lines as strings works in this case because all GPA values are lower than 10. Otherwise we would have to convert the GPA field and sort the lines by its numeric value: 10 is larger than 9, but as a string "9" would be considered larger, because strings are ordered lexicographically and the character 9 is larger than the character 1.
As you can see, I used the default std::greater template. In this case you don't need to write your own; you can just use this one.
One more thing: main must have an int return type.
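If the values could reach 10 or more, the note about lexicographic ordering can be addressed with a numeric comparator. The following is only a sketch: the helper name sortByGpaDescending is made up, and it assumes every line already has the form "GPA,Name" with a parsable number at the front (std::stod throws otherwise):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts "GPA,Name" lines by descending numeric GPA.
// std::stod parses the leading number and stops at the comma.
// Assumption: every line starts with a valid number.
void sortByGpaDescending(std::vector<std::string>& lines) {
    std::sort(lines.begin(), lines.end(),
              [](const std::string& a, const std::string& b) {
                  return std::stod(a) > std::stod(b);
              });
}
```

With the lines "9.5,A", "10,B" and "3.3,C", this puts "10,B" first, whereas sorting the same lines with std::greater<std::string> would put "9.5,A" first.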

Note that sorting and filtering records of data is a classical task for a DBMS.
So instead of writing a program, consider loading your CSV into a DBMS of your choice (MonetDB is a nice FOSS DBMS for analytics), say into a table named people then issuing an appropriate query, e.g.
SELECT * FROM people WHERE first_choice = 'CC' ORDER BY gpa DESC;
(that is an SQL query) to get the output you want.
Some DBMSes even work natively with CSV files, in which case you won't need to load anything, just point your DBMS at the CSV file.
Finally, and sorry for suggesting something crude, but if you are willing to be more "manual" about this, a spreadsheet application like LibreOffice Calc or MS Excel can import the CSV. You can then use the AutoFilter functionality to display only people with CC as the first option, and sort by descending GPA using the AutoFilter drop-down menu on the GPA column.
PS - This is not to detract from other valid answers of course.

Obviously you are using the wrong approach. This must be changed.
At first, we need to analyze what the problem is about. So, we have a file with many lines. Each line contains the values for one student. The values are delimited by commas.
Such data is usually referred to as CSV (comma-separated values).
There are tons of posts here on SO that explain how to read CSV files.
Anyway. After having done the initial analysis, we must now start to think about how we could solve the problem. Looking at the data in one line, we notice that it is always structured in the same way. For that reason, we define a structure which will contain the values for one student. We call this new structure "Student". It will be defined like this:
// Data for one student
struct Student {
std::string first{};
std::string second{};
std::string third{};
double GPA{};
std::string name{};
};
Please note that the GPA will be stored as a double value and not as a string, because maybe we want to do some mathematical calculations.
The next requirement is that we have many lines with student data. So, we will store the many students in a container. And here we select the std::vector, because it can grow dynamically. So all data for all students can be stored in
// Here we will store all students with all data
std::vector<Student> student{};
Next, we want to read all data from a file. So, let us define a file stream variable and give the filename as a constructor parameter. This will try to open the file automatically. And, when this variable is no longer used and falls out of scope, the file will be closed automatically.
We check if the file is open. Because the bool operator and the ! operator of streams are overloaded to return the status of the stream, we can simply write:
if (fileStream) {
Next we want to read many lines of the file, containing student data. We will use the std::getline function to read a complete line. And we will use this function in a while loop. The function returns a reference to the ifstream again. And, as written above, this has a bool operator. So, the reading will stop at EOF (end of file).
while (std::getline(fileStream, line))
Then, we have a complete line in our variable "line". This needs to be split into its sub components. "SN,SM,TP,3.7,DavidLim" needs to be split into "SN", "SM","TP", 3.7, "DavidLim".
There are many many possible solutions for splitting a CSV-string. I will use an approach with std::getline. At the bottom of this post, I show some further examples.
So, in order to use iostream facilities to extract data from a string, we can use the std::istringstream. We will put the line into it and can then extract data as from any other stream:
std::istringstream iss{ line };
Then we will use again the std::getline function to extract our needed data from the stringstream. And, after having extracted everything, we will add a complete Student record to our target vector:
student.push_back(tempStudent);
Now we have all student data in our vector, and we can use all functions and algorithms of C++ to do all kinds of operations on the data.
For the filtering, we will iterate over all data in the vector and then use an if statement to find out whether the current student record fulfills the condition. If so, we print it.
See the following example program:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
// Data for one student
struct Student {
std::string first{};
std::string second{};
std::string third{};
double GPA{};
std::string name{};
};
const std::string fileName{ "r:\\DSA.txt" };
int main() {
// Here we will store all students with all data
std::vector<Student> student{};
// Open the source file
std::ifstream fileStream{ fileName };
// Check, if file could be opened
if (fileStream) {
// One complete line of the source file
std::string line{};
// Now read all lines of the source file
while (std::getline(fileStream, line)) {
// Now we have a complete line like "SN,SM,TP,4,MarcusTan\n" in the variable line
// In order to extract from this line, we will put it in a std::istringstream
std::istringstream iss{ line };
// Now we can extract from this string stream all our needed strings
Student tempStudent{};
// Extract all data
std::getline(iss, tempStudent.first,',');
std::getline(iss, tempStudent.second,',');
std::getline(iss, tempStudent.third, ',');
std::string tempGPA{}; std::getline(iss, tempGPA, ','); tempStudent.GPA = std::stod(tempGPA);
std::getline(iss, tempStudent.name);
// Add this data for one student to the vector with all students
student.push_back(tempStudent);
}
// Now, all Students are available
// If you want to sort, then do it now. We can sort by any field.
// As an example, we sort by name. Anything else also possible
std::sort(student.begin(), student.end(), [](const Student& s1, const Student& s2) { return s1.name < s2.name; });
// Now, we make a filtered output
// Iterate over all students
for (const Student& s : student) {
// Check if the condition is fulfilled
if (s.first == "CC") {
std::cout << s.GPA << ", " << s.name << '\n';
}
}
}
else {
// There was a problem with opening the input source file. Show error message.
std::cerr << "\n\nError: Could not open file '" << fileName << "'\n\n";
}
}
But this is very C-style. In modern C++ we would go a different way. The object-oriented approach keeps data and the methods operating on that data together in one class or struct.
So, basically we would define an extractor and an inserter operator for the struct, because only this object should know how to read and write its data.
Then things will be really simple and compact.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
// Data for one student
struct Student {
std::string first{};
std::string second{};
std::string third{};
double GPA{};
std::string name{};
friend std::istream& operator >> (std::istream& is, Student& s) {
char comma{};
return std::getline(std::getline(std::getline(std::getline(is, s.first,','), s.second,','), s.third,',') >> s.GPA >> comma, s.name);
}
friend std::ostream& operator << (std::ostream& os, const Student& s) {
return os << s.first << '\t' << s.second << '\t' << s.third << '\t' << s.GPA << '\t' << s.name;
}
};
const std::string fileName{ "r:\\DSA.txt" };
int main() {
// Open the source file and check, if it could be opened
if (std::ifstream fileStream{ fileName }; fileStream) {
// Read the complete CSV file and parse it
std::vector student(std::istream_iterator<Student>(fileStream), {});
// Show all records with first==CC
std::copy_if(student.begin(), student.end(), std::ostream_iterator<Student>(std::cout, "\n"), [](const Student& s) { return s.first == "CC"; });
}
return 0;
}
So, you have a one-liner for reading all student data. And then you can apply all kinds of algorithms from the standard library.
That's the way to go.
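As one illustration of applying standard algorithms to the parsed records, here is a sketch that selects the students with a given first choice and orders them by descending GPA, which is the order asked for in the question. The helper name selectByFirstChoice is an assumption, and the struct is repeated without its stream operators only so that the snippet compiles on its own:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Same fields as the Student struct above, repeated so the sketch compiles alone.
// Drop this definition when pasting the helper into the program above.
struct Student {
    std::string first{};
    std::string second{};
    std::string third{};
    double GPA{};
    std::string name{};
};

// Hypothetical helper: students whose first choice equals `choice`,
// ordered by descending GPA. Ties keep their input order (stable sort).
std::vector<Student> selectByFirstChoice(const std::vector<Student>& all,
                                         const std::string& choice) {
    std::vector<Student> result;
    std::copy_if(all.begin(), all.end(), std::back_inserter(result),
                 [&choice](const Student& s) { return s.first == choice; });
    std::stable_sort(result.begin(), result.end(),
                     [](const Student& a, const Student& b) { return a.GPA > b.GPA; });
    return result;
}
```

Filtering before sorting keeps the sort small, since only the matching records are compared.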
Splitting a string
Splitting a string into tokens is a very old task. There are many solutions available, and all have different properties. Some are difficult to understand, some are hard to develop, and some are more complex, slower, faster, or more flexible than others.
Alternatives
Handcrafted: many variants, using pointers or iterators; may be hard to develop and error-prone.
Using the old-style std::strtok function. It may be unsafe and perhaps should not be used any longer.
std::getline. The most used implementation, but actually a "misuse" and not so flexible.
Using a dedicated modern function (std::sregex_token_iterator), specifically developed for this purpose. The most flexible, and it fits well into the STL environment and algorithm landscape, but it is slower.
Please see the 4 examples in one piece of code.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <regex>
#include <algorithm>
#include <iterator>
#include <cstring>
#include <forward_list>
#include <deque>
#include <vector>
#include <type_traits>
using Container = std::vector<std::string>;
std::regex delimiter{ "," };
int main() {
// Some function to print the contents of an STL container
auto print = [](const auto& container) -> void { std::copy(container.begin(), container.end(),
std::ostream_iterator<std::decay_t<decltype(*container.begin())>>(std::cout, " ")); std::cout << '\n'; };
// Example 1: Handcrafted -------------------------------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Search for comma, then take the part and add to the result
for (size_t i{ 0U }, startpos{ 0U }; i <= stringToSplit.size(); ++i) {
// So, if there is a comma or the end of the string
if ((stringToSplit[i] == ',') || (i == (stringToSplit.size()))) {
// Copy substring
c.push_back(stringToSplit.substr(startpos, i - startpos));
startpos = i + 1;
}
}
print(c);
}
// Example 2: Using very old strtok function ----------------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Split string into parts in a simple for loop
#pragma warning(suppress : 4996)
for (char* token = std::strtok(const_cast<char*>(stringToSplit.data()), ","); token != nullptr; token = std::strtok(nullptr, ",")) {
c.push_back(token);
}
print(c);
}
// Example 3: Very often used std::getline with additional istringstream ------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c{};
// Put string in an std::istringstream
std::istringstream iss{ stringToSplit };
// Extract string parts in simple for loop
for (std::string part{}; std::getline(iss, part, ','); c.push_back(part))
;
print(c);
}
// Example 4: Most flexible iterator solution ------------------------------------------------
{
// Our string that we want to split
std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
Container c(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
//
// Everything done already with range constructor. No additional code needed.
//
print(c);
// Works also with other containers in the same way
std::forward_list<std::string> c2(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
print(c2);
// And works with algorithms
std::deque<std::string> c3{};
std::copy(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {}, std::back_inserter(c3));
print(c3);
}
return 0;
}
Please compile with C++17 enabled.
What a pity that nobody will read that.

Extract all numbers from stringstream

I want to read string and extract all numbers.
Input: 5a3 1f a0aaaa f1fg3
Output: 53 1 0 13
I tried this code:
string s;
getline(cin, s);
stringstream str_strm(s);
int found;
string temp;
while (!str_strm.eof()) {
str_strm >> temp;
if (stringstream(temp) >> found)
{
cout << found << endl;
}
}
but when found 5 (from example)after that automatically start to check the other string. How can I extract all numbers?

Here's a possible solution: a while loop splits the input at spaces, and then the digits are extracted from each substring.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
int main()
{
std::stringstream ss("5a3 1f a0aaaa f1fg3");
std::string str;
while (std::getline(ss, str, ' ')) {
// Drop every character that is not a digit
str.erase(std::remove_if(str.begin(), str.end(), [](unsigned char c) { return !std::isdigit(c); }), str.end());
std::cout << str << " ";
}
}

You could read each space separated word, and then remove the non-digits, like this
std::string word;
while (std::cin >> word)
{
word.erase(std::remove_if(word.begin(), word.end(),
[](unsigned char c) { return not std::isdigit(c); }),
word.end());
std::cout << word << " ";
}
For the input of 5a3 1f a0aaaa f1fg3, it prints 53 1 0 13.
This admittedly odd way of removing elements from a range is a common idiom, known as the erase-remove idiom.
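To make the two steps of that idiom visible, here is the same operation split over two statements. This is only a sketch; the helper name keepDigits is an assumption:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `text` with every non-digit removed.
std::string keepDigits(std::string text) {
    // Step 1: remove_if moves the kept characters to the front and returns
    // an iterator to the new logical end; the string's size is unchanged.
    auto newEnd = std::remove_if(text.begin(), text.end(),
                                 [](unsigned char c) { return !std::isdigit(c); });
    // Step 2: erase drops the leftover tail, which shrinks the string.
    text.erase(newEnd, text.end());
    return text;
}
```

The lambda takes unsigned char because passing a negative char value to std::isdigit is undefined behavior.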
You could even avoid the loop entirely, if you have the input on a single line
std::string word;
std::getline(std::cin, word);
word.erase(std::remove_if(word.begin(), word.end(),
[](unsigned char c) { return not std::isdigit(c)
and not std::isspace(c); }),
word.end());
std::cout << word;

Please see here the ultra simple example. (There is an even simpler solution at the bottom of this post)
It is using modern C++ elements and algorithms. And has only a few lines of code.
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
#include <algorithm>
#include <sstream>
int main() {
// Read a string from the console
if (std::string line{}; std::getline(std::cin, line)) {
// Put the complete line into a std::istringstream
std::istringstream iss{line};
// Print result
std::transform(std::istream_iterator<std::string>(iss), {}, std::ostream_iterator<std::string>(std::cout, " "),
[](const std::string& s) { return std::regex_replace(s, std::regex{ R"([^\d])" }, ""); });
}
return 0;
}
So, what's going on here. Let us look at it statement by statement. So, first:
if (std::string line{}; std::getline(std::cin, line)) {
This is an if statement with initializer. If you look up if in the C++ reference, then you can see that we can now have an additional initialization statement as the first part of the if. And why are we using that? Because it is an additional measure for scoping. The variable "line" is only used within the scope of the if statement. It is not needed outside the if. From a functional point of view, it is the same as writing:
std::string line{};
if (std::getline(std::cin, line)) {
But then, "line" would also be visible outside of the if statement. And, because we want to prevent pollution of the outer scope, we select this method.
Next is std::getline. This will read a complete line from the input stream, so, from the console (std::cin), and put it into the string. std::getline returns a reference to the stream. The stream has a bool conversion operator that reports whether the stream is still in a usable state, so it yields false after a failure (for example, when end of file was hit and nothing could be read). So, the if statement checks whether the input operation worked. By the way: all IO operations should be checked to see whether they worked or failed.
Good, now we have the complete line of the user input in our variable "line".
With
std::istringstream iss{line};
we put the string into a std::istringstream. We do this because we want to make use of the C++ "iostream" library. The std::istringstream behaves like any other stream, for example std::cin, and you can extract values from it that are separated by white space, as in std::cin >> v1 >> v2. The disadvantage of such an approach is that you need to know the number of values in advance, or use a dynamically growing container and a loop.
And this brings us to the next construct that I want to explain. You may have heard about "iterators". Iterators are like pointers, and a pair of them can denote a range of elements. If you have a std::vector or any other container, then you can iterate with the begin() and end() iterators over all elements without knowing how many elements it contains.
And for input streams, we have something similar: the std::istream_iterator. This iterator will iterate over the elements in the std::istringstream and return the type of variable given in its template parameter, by repeatedly calling the extractor operator >>. Here, in our case, a std::string. You may now ask: Until when? Where is the end? If you look at the description of constructor number 1 of the std::istream_iterator, then you will see that the default constructor constructs the end-of-stream iterator, and a default-constructed object can be created with the empty braced initializer {}. So {} is the end iterator.
If we want to read all std::strings from the std::istringstream, then we read between
std::istream_iterator<std::string>(iss) and {}. So every string that is in the std::istringstream.
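The iterator pair described above can also feed a container directly. A minimal sketch, with the helper name splitWords chosen only for illustration, and with the end iterator spelled out instead of written as {}:

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` at whitespace into a vector of words.
std::vector<std::string> splitWords(const std::string& line) {
    std::istringstream iss{line};
    // The first iterator reads words with operator>>; the default-constructed
    // iterator is the end-of-stream marker.
    return std::vector<std::string>(std::istream_iterator<std::string>(iss),
                                    std::istream_iterator<std::string>());
}
```

For the input "5a3 1f a0aaaa f1fg3" this yields the four words 5a3, 1f, a0aaaa and f1fg3, without the number of words being known in advance.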
Good. Next, there is a similar thing for output, the std::ostream_iterator. It calls the inserter operator << for every element written through it. And we can specify to which stream it should send the data, here std::cout, and additionally a separator string, which will be appended to each outputted value.
OK, next: std::transform. As its name says, it transforms the elements in a range, between a begin and an end iterator, into another range. So, it will take the elements, as shown above, from the std::istringstream and send them to the std::ostream_iterator. So, we read the source value, transform it, then write it.
But how do we transform? For the transformation, we give a simple lambda function, which calls the std::regex_replace function. This is a standard function to replace parts of a string with other string data. And what will be replaced is specified by a std::regex. This is a special pattern that is defined in a kind of meta language and matches specified parts of a string. In our case we use [^\d], which means: not a digit. You can test regexes here. You can also learn about them here.
And now, all together, explains the above solution.
All this can be further optimized to 2 statements:
#include <iostream>
#include <string>
#include <regex>
int main() {
// Read a string from the console
if (std::string line{}; std::getline(std::cin, line)) {
// Remove unnecessary characters
std::cout << std::regex_replace(line, std::regex{ R"([^\d ])" }, "") << "\n";
}
return 0;
}
I cannot think of a simpler solution.
In case of questions, please ask.

You can use get from istream to get each character, including whitespace, and then isdigit to check for a digit character...
#include <iostream>
#include <cctype>
int main()
{
char ch;
std::cin.get(ch);
while (!std::cin.eof())
{
if (std::isdigit(static_cast<unsigned char>(ch)) || ch == ' ' || ch == '\n')
{
std::cout << ch;
}
std::cin.get(ch);
}
return 0;
}
However, you can avoid using std::cin.eof() as the condition of your while loop as follows...
#include <iostream>
#include <cctype>
int main()
{
char ch;
while (std::cin.get(ch))
{
if (std::isdigit(static_cast<unsigned char>(ch)) || ch == ' ' || ch == '\n')
{
std::cout << ch;
}
}
return 0;
}

Regular expression pattern matching can be used to find all the digits in the input string.
Here is an example program to find the digits:
// C++ program to find all digits in a string
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
string inputString;
cout << "Enter the input string: ";
getline(cin, inputString);
cout << "Digits found: ";
// Define the regular expression matcher and pattern
smatch matcher;
regex pattern("[[:digit:]]");
while (regex_search(inputString, matcher, pattern)) {
// Show the match
cout << matcher.str(0);
// Continue searching the rest of the string
inputString = matcher.suffix().str();
}
return 0;
}
Output:
Enter the input string: sdfh354 eutyt;ljkn756897490uiotureu 587689jkgf 90
Digits found: 35475689749058768990
Here is another approach of finding the numbers in the string, without using the regular expression pattern matching:
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
string rawInput;
cout <<"Enter input string: ";
getline(cin, rawInput);
// Get all words from the input string
stringstream allWords(rawInput);
// Find and print digits in each word
string word;
while(allWords >> word) {
for(int i = 0; word[i]; i++) {
// Print only the numbers in the word
if(isdigit(word[i])) {
cout<<word[i];
}
}
cout<<" ";
}
cout<<"\n";
return 0;
}
Output:
Enter input string: ghjg45 jsdfj 897897 343yut45 90
45 897897 34345 90

How can I extract all numbers?
When you KNOW that the input numbers are all hex values ... (and how many)
stringstream ss ("5a3 1f a0aaaa f1fg3");
for (int i=0; i<4; ++i)
{
int k;
ss >> hex >> k;
cout << k << endl;
}
with output
1443
31
10529450
3871
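If the number of values is not known in advance, one possible variation is to read word by word and let std::stoi convert the leading hex digits of each word. This is a sketch under the same assumption that the words are meant as hex values; the helper name parseHexPrefixes is made up:

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads each whitespace-separated word and converts its
// leading hexadecimal digits to an int. Words that do not start with a hex
// number, or whose value does not fit into an int, are skipped.
std::vector<int> parseHexPrefixes(const std::string& text) {
    std::vector<int> values;
    std::istringstream iss{text};
    for (std::string word; iss >> word;) {
        try {
            // Base 16; conversion stops at the first non-hex character
            values.push_back(std::stoi(word, nullptr, 16));
        } catch (const std::exception&) {
            // std::invalid_argument or std::out_of_range: ignore this word
        }
    }
    return values;
}
```

For "5a3 1f a0aaaa f1fg3" this produces the same four values as above, because the conversion of f1fg3 stops at the g.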

Reading two columns in CSV file in c++

I have a CSV file in the form of two columns: name, age
To read and store the info, I did this
struct person
{
string name;
int age;
}
person record[10];
ifstream read("....file.csv");
However, when I did
read >> record[0].name;
read.get();
read >> record[0].age;
read>>name gave me the whole line instead of just the name. How could I possibly avoid this problem so that I can read the integer into age?
Thank you!

You can first read the whole line with std::getline, then parse it via a std::istringstream (you must #include <sstream>), like this:
std::string line;
while (std::getline(read, line)) // read whole line into line
{
std::istringstream iss(line); // string stream
std::getline(iss, record[0].name, ','); // read first part up to comma, ignore the comma
iss >> record[0].age; // read the second part
}
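The loop above always writes into record[0]. A sketch of the same idea that stores every line in a growing std::vector instead of a fixed array could look like this; the helper name readPeople and the decision to skip malformed lines are assumptions:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct person {
    std::string name;
    int age{};
};

// Hypothetical helper: reads "name,age" lines from any input stream.
// Lines whose age field is missing or not a number are skipped.
std::vector<person> readPeople(std::istream& in) {
    std::vector<person> people;
    for (std::string line; std::getline(in, line);) {
        std::istringstream iss{line};
        person p;
        // Name is everything up to the comma, age is the number after it
        if (std::getline(iss, p.name, ',') && iss >> p.age) {
            people.push_back(p);
        }
    }
    return people;
}
```

Because the function takes a std::istream&, it works with a std::ifstream for the real file as well as with a std::istringstream for testing.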
Below is a fully working general example that tokenizes a CSV file (live on Ideone):
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
// in your case you'll have a file
// std::ifstream ifile("input.txt");
std::stringstream ifile("User1, 21, 70\nUser2, 25,68");
std::string line; // we read the full line here
while (std::getline(ifile, line)) // read the current line
{
std::istringstream iss{line}; // construct a string stream from line
// read the tokens from current line separated by comma
std::vector<std::string> tokens; // here we store the tokens
std::string token; // current token
while (std::getline(iss, token, ','))
{
tokens.push_back(token); // add the token to the vector
}
// we can now process the tokens
// first display them
std::cout << "Tokenized line: ";
for (const auto& elem : tokens)
std::cout << "[" << elem << "]";
std::cout << std::endl;
// map the tokens into our variables, this applies to your scenario
std::string name = tokens[0]; // first is a string, no need for further processing
int age = std::stoi(tokens[1]); // second is an int, convert it
int height = std::stoi(tokens[2]); // same for third
std::cout << "Processed tokens: " << std::endl;
std::cout << "\t Name: " << name << std::endl;
std::cout << "\t Age: " << age << std::endl;
std::cout << "\t Height: " << height << std::endl;
}
}

read>>name gave me the whole line instead of just the name. How could I possibly avoid this problem so that I can read the integer into age?
read >> name will read everything into name until a white space is encountered.
If you have a comma separated line without white spaces, it makes sense that the entire line is read into name.
You can use std::getline to read the entire line to one string. Then use various methods of tokenizing a std::string.
Sample SO posts that address tokenizing a std::string:
How do I tokenize a string in C++?
c++ tokenize std string
Splitting a C++ std::string using tokens, e.g. ";"

You maybe could use stringstreams for that, but I wouldn't trust this, if I'm honest.
If I were you, I would write a small function that reads the whole line into a string and then searches for the separator character in it. Everything in front of the separator is the first column, and everything behind it is the second one. With the string operations provided by C++ you can move these parts into your variables (and convert them to the correct type if you need to).
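A minimal sketch of such a function, using only std::string::find and substr. The name splitTwoColumns and the behavior for a line without a separator are assumptions, and this is not the code from the library mentioned below:

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits `line` at the first `sep` into two columns.
// If `sep` is absent, the whole line is the first column and the second is empty.
std::pair<std::string, std::string> splitTwoColumns(const std::string& line, char sep = ',') {
    const std::string::size_type pos = line.find(sep);
    if (pos == std::string::npos) {
        return {line, std::string{}};
    }
    return {line.substr(0, pos), line.substr(pos + 1)};
}
```

The second column can then be converted with std::stoi to obtain the age as an int.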
I wrote a small C++ Library for CSV parsing, maybe a look at it helps you. You can find it on GitHub.
EDIT:
In this Gist you can find the parsing function
