Parse a file using C++, load the value to a structure - c++

I have the following file/line:
pc=1 ct=1 av=112 cv=1100 cp=1700 rec=2 p=10001 g=0 a=0 sz=5 cr=200
pc=1 ct=1 av=113 cv=1110 cp=1800 rec=2 p=10001 g=0 a=10 sz=5 cr=200
and so on.
I wish to parse this and take the key value pairs and put them in a structure:
struct pky
{
pky() :
a_id(0),
sz_id(0),
cr_id(0),
cp_id(0),
cv_id(0),
ct_id(0),
fr(0),
g('U'),
a(0),
pc(0),
p_id(0)
{ }
};
wherein either all the structure fields are used or some might be omitted.
How do I create a C++ class, which will do the same? I am new to C++ and not aware of any functions or library which would do this work.
Each line is to be processed, and the structure will be populated with one line each time and used, before it is flushed. The structure is later used as a parameter to a function.

You can do something like this:
std::string line;
std::map<std::string, std::string> props;
std::ifstream file("foo.txt");
while(std::getline(file, line)) {
std::string token;
std::istringstream tokens(line);
while(tokens >> token) {
std::size_t pos = token.find('=');
if(pos != std::string::npos) {
props[token.substr(0, pos)] = token.substr(pos + 1);
}
}
/* work with those keys/values by doing props["name"] */
Line l(props["pc"], props["ct"], ...);
/* clear the map for the next line */
props.clear();
}
I hope that's helpful. Line can look like this:
struct Line {
std::string pc, ct;
Line(std::string const& pc, std::string const& ct):pc(pc), ct(ct) {
}
};
Now, that works only if the delimiter is whitespace. You can make it work with other delimiters too. Change
while(tokens >> token) {
into for example the following, if you want to have a semicolon:
while(std::getline(tokens, token, ';')) {
Actually, it looks like you have only integers as values, and whitespace as delimiters. You might want to change
std::string token;
std::istringstream tokens(line);
while(tokens >> token) {
std::size_t pos = token.find('=');
if(pos != std::string::npos) {
props[token.substr(0, pos)] = token.substr(pos + 1);
}
}
into this then:
int value;
std::string key;
std::istringstream tokens(line);
while(tokens >> std::ws && std::getline(tokens, key, '=') &&
tokens >> std::ws >> value) {
props[key] = value;
}
std::ws just eats whitespace. You should change the type of props to
std::map<std::string, int> props;
as well, and make Line accept int instead of std::string. I hope this is not too much information at once.
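To connect the map back to the structure from the question, here is a minimal sketch, assuming C++11. The Record type, its field names, and the helper names are hypothetical stand-ins, since the question does not show the member declarations of pky. Keys that are missing from a line simply leave the default value in place, which covers the "some might be omitted" case.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical record type: the fields and their types are assumptions.
struct Record {
    int pc = 0;
    int ct = 0;
    int cv = 0;
    int cp = 0;
    int sz = 0;
    int cr = 0;
};

// Split one line of whitespace-separated "key=value" tokens into a map.
// Tokens without '=' or with a non-numeric value are skipped.
std::map<std::string, int> parse_pairs(const std::string& line) {
    std::map<std::string, int> props;
    std::istringstream tokens(line);
    std::string token;
    while (tokens >> token) {
        std::size_t pos = token.find('=');
        if (pos == std::string::npos) continue;
        std::istringstream value_stream(token.substr(pos + 1));
        int value;
        if (value_stream >> value) props[token.substr(0, pos)] = value;
    }
    return props;
}

// Copy a value into `field` only if `key` is present, so that
// omitted keys keep the default set by the struct.
void assign_if_present(const std::map<std::string, int>& props,
                       const std::string& key, int& field) {
    auto it = props.find(key);
    if (it != props.end()) field = it->second;
}

// Build a Record from the parsed pairs; unknown keys are ignored.
Record to_record(const std::map<std::string, int>& props) {
    Record r;
    assign_if_present(props, "pc", r.pc);
    assign_if_present(props, "ct", r.ct);
    assign_if_present(props, "cv", r.cv);
    assign_if_present(props, "cp", r.cp);
    assign_if_present(props, "sz", r.sz);
    assign_if_present(props, "cr", r.cr);
    return r;
}
```

Inside the getline loop you would then call to_record(parse_pairs(line)) once per line and pass the result to your function.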

This is the perfect place to define the stream operators for your structure:
#include <string>
#include <fstream>
#include <sstream>
#include <istream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <limits> // std::numeric_limits
std::istream& operator>> (std::istream& str,pky& value)
{
std::string line;
std::getline(str,line);
std::stringstream dataStr(line);
static const std::streamsize max = std::numeric_limits<std::streamsize>::max();
// Code assumes the ordering is always as follows
// pc=1 ct=1 av=112 cv=1100 cp=1700 rec=2 p=10001 g=0 a=0 sz=5 cr=200
dataStr.ignore(max,'=') >> value.pc;
dataStr.ignore(max,'=') >> value.ct_id;
dataStr.ignore(max,'=') >> value.a; // Guessing av=
dataStr.ignore(max,'=') >> value.cv_id;
dataStr.ignore(max,'=') >> value.cp_id;
dataStr.ignore(max,'=') >> value.fr; // Guessing rec=
dataStr.ignore(max,'=') >> value.p_id;
dataStr.ignore(max,'=') >> value.g;
dataStr.ignore(max,'=') >> value.a_id;
dataStr.ignore(max,'=') >> value.sz_id;
dataStr.ignore(max,'=') >> value.cr_id;
return str;
}
int main()
{
std::ifstream file("plop");
std::vector<pky> v;
pky data;
while(file >> data)
{
// Do something with data
v.push_back(data);
}
// Even use the istream_iterators
std::ifstream file2("plop2");
std::vector<pky> v2;
std::copy(std::istream_iterator<pky>(file2),
std::istream_iterator<pky>(),
std::back_inserter(v2)
);
}

This seemed to do the trick. Of course you'd extract the code I've written in main and stick it in a class or something, but you get the idea.
#include <sstream>
#include <string>
#include <vector>
#include <map>
using namespace std;
vector<string> Tokenize(const string &str, const string &delim)
{
vector<string> tokens;
size_t p0 = 0, p1 = string::npos;
while(p0 != string::npos)
{
p1 = str.find_first_of(delim, p0);
if(p1 != p0)
{
string token = str.substr(p0, p1 - p0);
tokens.push_back(token);
}
p0 = str.find_first_not_of(delim, p1);
}
return tokens;
}
int main()
{
string data = "pc=1 ct=1 av=112 cv=1100 cp=1700 rec=2 p=10001 g=0 a=0 sz=5 cr=200 pc=1 ct=1 av=113 cv=1110 cp=1800 rec=2 p=10001 g=0 a=10 sz=5 cr=200";
vector<string> entries = Tokenize(data, " ");
map<string, int> items;
for (size_t i = 0; i < entries.size(); ++i)
{
string item = entries[i];
size_t pos = item.find_first_of('=');
if(pos == string::npos)
continue;
string key = item.substr(0, pos);
int value;
stringstream stream(item.substr(pos + 1));
stream >> value;
items.insert (pair<string, int>(key, value));
}
}

Unfortunately, your source data file is human-oriented, which means that you're going to have to do a bunch of string parsing to get it into the structure. If the data had instead been written directly as a binary file, you could just use fread() to read it straight into the struct.
If you want an "elegant" (i.e., ugly and minimalistic) approach, you could write a loop that parses each line: use strchr() to find the '=' character, then the next space, use atoi() to convert each number into a real int, and then use some pointer hackery to push them all into the structure. The obvious disadvantage is that if the structure changes, or is even just reorganized, the whole algorithm silently breaks.
So, for something more maintainable and readable (at the cost of more code), you could push each value into a vector, and then go through the vector and copy each value into the appropriate structure field.
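The vector-based variant described above could be sketched roughly like this. The function name extract_values is made up for illustration, and strtol() is swapped in for atoi() because it reports where the number ended. Note that the result depends purely on the order of the fields in the line, so the caller still has to know which position belongs to which structure member.

```cpp
#include <cstdlib>
#include <cstring>
#include <vector>

// Collect the integer that follows every '=' in a C string,
// in order of appearance. Entries without a number are skipped.
std::vector<int> extract_values(const char* line) {
    std::vector<int> values;
    const char* p = line;
    while ((p = std::strchr(p, '=')) != nullptr) {
        ++p;  // step past '='
        char* end = nullptr;
        long v = std::strtol(p, &end, 10);
        if (end != p) {
            // At least one digit was consumed.
            values.push_back(static_cast<int>(v));
        }
        p = end;  // continue searching after the number
    }
    return values;
}
```

After the call, values[0] would hold pc, values[1] would hold ct, and so on, assuming every line lists the keys in the same order.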

What you get taught here, are monstrosities.
http://en.wikipedia.org/wiki/Scanf
Do not use this function to extract strings from untrusted data, but as long as you either trust the data or only read numbers, why not.
If you are familiar with Regular Expressions from using another language, use std::tr1::regex or boost::regex - they are the same. If not familiar, you will do yourself a favor by familiarizing yourself.
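For the regular-expression route, a minimal sketch could look like the following. It assumes a C++11 compiler, where the regex facility is available as std::regex, and the function name parse_with_regex is made up for illustration. The pattern only accepts integer values, so anything else on the line is ignored.

```cpp
#include <map>
#include <regex>
#include <string>

// Find every "key=integer" pair in the line and store it in a map.
// std::stoi throws std::out_of_range if a value does not fit in an int.
std::map<std::string, int> parse_with_regex(const std::string& line) {
    static const std::regex pair_re("(\\w+)=(-?\\d+)");
    std::map<std::string, int> props;
    for (std::sregex_iterator it(line.begin(), line.end(), pair_re), end;
         it != end; ++it) {
        // Sub-match 1 is the key, sub-match 2 is the numeric value.
        props[(*it)[1].str()] = std::stoi((*it)[2].str());
    }
    return props;
}
```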

Related

Reading a csv file and storing the data in an array using C++ in Visual Studio Code

I have a csv file called "input.csv" (UTF-8, no CR, LF only). It looks like
it has 51 rows and 3 columns. I have to read the CSV file and perform some mathematical operations on the data. I have tried a couple of code samples for reading a CSV file in C++, but none of them seem to work. I tried these links:
How to read a csv file data into an array?,
How can I read and parse CSV files in C++?
I am using that code with the file name changed, but I am not getting any output. I cannot figure out the problem. Currently I am using this code, taken from the second link:
#include <iterator>
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <string>
#include <string_view> // std::string_view (C++17)
class CSVRow
{
public:
std::string_view operator[](std::size_t index) const
{
return std::string_view(&m_line[m_data[index] + 1], m_data[index + 1] - (m_data[index] + 1));
}
std::size_t size() const
{
return m_data.size() - 1;
}
void readNextRow(std::istream& str)
{
std::getline(str, m_line);
m_data.clear();
m_data.emplace_back(-1);
std::string::size_type pos = 0;
while((pos = m_line.find(',', pos)) != std::string::npos)
{
m_data.emplace_back(pos);
++pos;
}
// This checks for a trailing comma with no data after it.
pos = m_line.size();
m_data.emplace_back(pos);
}
private:
std::string m_line;
std::vector<int> m_data;
};
std::istream& operator>>(std::istream& str, CSVRow& data)
{
data.readNextRow(str);
return str;
}
int main()
{
std::ifstream file("input.csv");
CSVRow row;
while(file >> row)
{
std::cout << "4th Element(" << row[3] << ")\n";
}
}
Does anyone know where the problem is?
The first issue is that you are reading floating point numbers into integer vector, so that is going to lose information.
// std::vector<int> m_data; // Don't need this if you know
// there is always three values.
// std::string m_line; // Don't need this.
// Have a member for each column
double col1;
double col2;
double col3;
Your second issue is that the code above is used to parse strings. You need to convert that into reading floating point numbers.
void readNextRow(std::istream& str)
{
std::string line;
std::getline(str, line);
std::stringstream linestream(line);
char comma1;
char comma2;
bool ok = false;
if (linestream >> col1 >> comma1 >> col2 >> comma2 >> col3) {
// Read worked.
if (comma1 == ',' && comma2 == ',') {
// There were commas
ok = true;
}
}
if (!ok) {
str.setstate(std::ios::failbit);
}
}
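Putting the two pieces together, a self-contained version might look like the sketch below. It assumes every row holds exactly three numbers separated by commas; the type name CsvRow3 is made up for illustration. Note also that with three columns the valid indices are 0 to 2, so the row[3] in the question's main would be out of range.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical row type: one member per column.
struct CsvRow3 {
    double col1 = 0.0;
    double col2 = 0.0;
    double col3 = 0.0;
};

// Read one line of the form "a,b,c" into a CsvRow3.
// Sets failbit on the stream if the line does not match.
std::istream& operator>>(std::istream& in, CsvRow3& row) {
    std::string line;
    if (!std::getline(in, line)) return in;  // end of input
    std::istringstream linestream(line);
    char comma1 = 0;
    char comma2 = 0;
    CsvRow3 tmp;
    if (linestream >> tmp.col1 >> comma1 >> tmp.col2 >> comma2 >> tmp.col3 &&
        comma1 == ',' && comma2 == ',') {
        row = tmp;  // only overwrite on success
    } else {
        in.setstate(std::ios::failbit);
    }
    return in;
}
```

With this in place, a loop such as while (file >> row) visits one row per iteration and stops at the end of the file or at the first malformed line.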

separating 2 words from a string

I have done a lot of reading on this topic online, and cannot figure out if my code is working. I am working on my phone with the c4droid app, and the debugger is nearly useless as far as I can tell.
As the title says, I need to separate 2 words out of one input. Depending on what the first word is, the second may or may not be used. If I do not need the second word, everything is fine. If I need and have the second word, it works, or seems to. But if I need a second word and only have the first, it compiles but crashes with an out of range exception.
ActionCommand is a vector of strings with 2 elements.
void splitstring(std::string original)
{
std::string x;
std::istringstream OrigStream(original);
OrigStream >> x;
ActionCommand.at(0) = x;
OrigStream >> x;
ActionCommand.at(1) = x;
return;
}
this code will separate the words right?
any help would be appreciated.
more of the code:
called from main-
void DoAction(Character & Player, room & RoomPlayerIn)
{
ParseAction(Player, GetAction(), RoomPlayerIn);
return;
}
std::string GetAction()
{
std::string action;
std::cout<< ">";
std::cin>>action;
action = Lowercase(action);
return action;
}
maybe Lowercase is the problem.
std::string Lowercase(std::string sourceString)
{
std::string destinationString;
destinationString.resize(sourceString.size());
std::transform(sourceString.begin(), sourceString.end(), destinationString.begin(), ::tolower);
return destinationString;
}
void ParseAction(Character & Player, std::string CommandIn, room & RoomPlayerIn)
{
std::vector<std::string> ActionCommand;
splitstring(CommandIn, ActionCommand);
std::string action = ActionCommand.at(0);
if (ActionCommand.size() >1)
std::string action2 = ActionCommand.at(1);
skipping some ifs
if (action =="wield")
{
if(ActionCommand.size() >1)
DoWield(action2);
else std::cout<<"wield what??"<<std::endl;
return;
}
and splitstring now looks like this
void splitstring(std::string const &original, std::vector<std::string> &ActionCommand)
{
std::string x;
std::istringstream OrigStream(original);
if (OrigStream >>x)
ActionCommand.push_back(x);
else return;
if (OrigStream>>x)
ActionCommand.push_back(x);
return;
}
#include <sstream>
#include <vector>
#include <string>
std::vector<std::string> ActionCommand;
void splitstring(std::string const &original)
{
std::string x;
std::istringstream OrigStream{ original };
if(OrigStream >> x)
ActionCommand.push_back(x);
else return;
if(OrigStream >> x)
ActionCommand.push_back(x);
}
Another idea would be to use the standard library. You can split a string into tokens (using spaces as dividers) with the following function:
#include <string>
#include <vector>
#include <sstream>
#include <iterator>
inline auto tokenize(const std::string &String)
{
auto Stream = std::stringstream(String);
return std::vector<std::string>{std::istream_iterator<std::string>{Stream}, std::istream_iterator<std::string>{}};
}
Here, the result is created in place by using an std::istream_iterator, which basically stands in for the >> operation in your example.
Warning:
This code needs at least C++14 to compile, because the auto return type of the function is deduced from its return statement.
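One more thing worth checking in the question's code: std::cin >> action in GetAction extracts a single whitespace-delimited word, so a second word would not reach the splitting function at all. A minimal sketch that reads the whole line first and then splits it is shown below; the helper names read_command_line and split_words are made up for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read one whole line (not just one word) from the stream.
std::string read_command_line(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}

// Split a line into at most max_words whitespace-separated words.
// The result may be shorter, so check size() before indexing.
std::vector<std::string> split_words(const std::string& line,
                                     std::size_t max_words) {
    std::vector<std::string> words;
    std::istringstream stream(line);
    std::string word;
    while (words.size() < max_words && stream >> word) {
        words.push_back(word);
    }
    return words;
}
```

Calling split_words(read_command_line(std::cin), 2) then gives a vector with zero, one, or two elements, and the caller can test size() before touching the second word.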

Using Delimiters to Parse an Address By 'Street; City; State; Country' and Storing Each Area into a Different Variable

So I'm having trouble storing the information after parsing a text-file. The text file has something like this inside it
1234 Main St; Oakland; CA; USA
2134 1st St; San Fransico; CA; USA
etc. etc.
I currently have these variables that I'm going to use to store the address's information
vector <string> addressInfo;
vector <string> street;
vector <string> city;
vector <string> state;
vector <string> country;
I'm also currently able to get the program to remove the ";" from the file and store all the information into a single vector using getline
while(read == true)
{
getline(in, line, ';');
if (in.fail())
{
read = false;
}
else
{
addressInfo.push_back(line);
}
}
When I do a for-loop to output what is inside the addressInfo vector, I get
1234 Main St
Oakland
CA
USA
etc. etc.
I know that I might have to use stringstream but I don't know how to store each line from the vector into the different variables.
I don't think you should store your data like this:
vector <string> addressInfo;
vector <string> street;
vector <string> city;
vector <string> state;
vector <string> country;
I think it should look like this:
struct address_info {
std::string street;
std::string city;
std::string state;
std::string country;
address_info() {}
// from C++11, I prefer below style
//address_info() = default;
address_info(std::string street_, std::string city_, std::string state_, std::string country_)
: street(street_), city(city_), state(state_), country(country_)
{}
};
int main()
{
std::vector<address_info> list;
// Let's assume that you know how to get this
std::string line = "1234 Main St; Oakland; CA; USA";
std::string street;
std::string city;
std::string state;
std::string country;
std::istringstream iss(line);
// remember to trim the string, I don't put it here
getline(iss, street, ';');
getline(iss, city, ';');
getline(iss, state, ';');
getline(iss, country, ';');
// This is the C++11 code to add to vector
//list.emplace_back(street, city, state, country);
// Pre-C++11 style
list.push_back(address_info(street, city, state, country));
}
Anyway, you can go search for a csv library.
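The trimming step that the code above leaves out could be sketched as follows. The Address type and the names trim and parse_address are hypothetical and only mirror the address_info idea; the sketch assumes the four fields are always separated by semicolons.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the address_info struct above.
struct Address {
    std::string street;
    std::string city;
    std::string state;
    std::string country;
};

// Remove leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();
    std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "street; city; state; country" into out.
// Returns false if one of the four fields is missing.
bool parse_address(const std::string& line, Address& out) {
    std::istringstream iss(line);
    std::string street, city, state, country;
    if (!std::getline(iss, street, ';')) return false;
    if (!std::getline(iss, city, ';')) return false;
    if (!std::getline(iss, state, ';')) return false;
    if (!std::getline(iss, country)) return false;
    out.street = trim(street);
    out.city = trim(city);
    out.state = trim(state);
    out.country = trim(country);
    return true;
}
```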
Here is a C++14 version using a tokenizing algorithm (pretty similar to STL style). It's C++14 only because I am using a generic lambda, but it can easily be made C++11 compatible as well.
#include <algorithm> // std::find_first_of
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <iterator>
template <typename Iter, typename Executor>
void for_each_token(Iter first, Iter last,
Iter dfirst, Iter dlast,
Executor ex)
{
if (first == last) return;
auto tmp = first;
while (first != last) {
first = std::find_first_of(first, last, dfirst, dlast);
ex(tmp, first);
if (first == last) break;
first++;
tmp = first;
}
return;
}
template <typename Executor>
void for_each_token_str(const std::string& str, const std::string& delims, Executor ex)
{
for_each_token(std::begin(str), std::end(str), std::begin(delims), std::end(delims), ex);
}
int main() {
std::ifstream in("parse.txt");
if (not in) return 1;
std::string line;
std::vector<std::string> tokens;
std::vector <std::string> addressInfo;
std::vector <std::string> city;
std::vector <std::string> state;
std::vector <std::string> country;
while (std::getline(in, line)) {
for_each_token_str(line, ";", [&](auto f, auto l) {
tokens.emplace_back(f, l);
});
int idx = 0;
addressInfo.emplace_back(tokens[idx++]);
city.emplace_back(tokens[idx++]);
state.emplace_back(tokens[idx++]);
country.emplace_back(tokens[idx++]);
tokens.clear();
}
auto print = [](std::vector<std::string>& v) {
for (auto & e : v) std::cout << e << ' ';
std::cout << std::endl;
};
print(addressInfo);
print(city);
print(state);
print(country);
return 0;
}
I am assuming that you are using a vector for every field following SOA (Struct of arrays) principle. If not, I would rather have them grouped in a structure.
NOTE: I have skipped some error checking which you should not.
Push the strings into their respective vectors with push_back. Newline is the default delimiter for getline, so the last call needs no delimiter argument.
string street_name;
string city_name;
string state_name;
string country_name;
while(getline(cin, street_name, ';') && getline(cin, city_name, ';') &&
getline(cin, state_name, ';') && getline(cin, country_name))
{
street.push_back(street_name);
city.push_back(city_name);
state.push_back(state_name);
country.push_back(country_name);
}

Reading an Input File And Store The Data Into an Array (beginner)!

The Input file:
1 4 red
2 0 blue
3 1 white
4 2 green
5 2 black
what I want to do is take every row and store it into 2D array.
for example:
array[0][0] = 1
array[0][1] = 4
array[0][2] = red
array[1][0] = 2
array[1][1] = 0
array[1][2] = blue
etc..
The code I am working on:
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
using namespace std;
int convert_str_to_int(const string& str) {
int val;
stringstream ss;
ss << str;
ss >> val;
return val;
}
string getid(string str){
istringstream iss(str);
string pid;
iss >> pid;
return pid;
}
string getnumberofcolors(string str){
istringstream iss(str);
string pid,c;
iss >> pid>>c;
return c;
}
int main() {
string lineinfile ;
vector<string> lines;
ifstream infile("myinputfile.txt");
if ( infile ) {
while ( getline( infile , lineinfile ) ) {
lines.push_back(lineinfile);
}
}
//first line - number of items
int numofitems = convert_str_to_int(lines[0]);
//loop over the items info
string ar[numofitems ][3];
int i = 1;
while(i<=numofitems ){
ar[i][0] = getid(lines[i]);
i++;
}
while(i<=numofitems ){
ar[i][1] = getarrivel(lines[i]);
i++;
}
infile.close( ) ;
return 0 ;
}
When I add the second while loop, my program stops working for some reason!
Is there any other way to do this, or a way to fix my program?
It's better to show you how to do it much better:
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int main() {
ifstream infile("myinputfile.txt"); // Streams skip spaces and line breaks
//first line - number of items
size_t numofitems;
infile >> numofitems;
//loop over the items info
vector<pair<int, pair<int, string>>> ar(numofitems); // Or use std::tuple
for(size_t i = 0; i < numofitems; ++i){
infile >> ar[i].first >> ar[i].second.first >> ar[i].second.second;
}
// infile.close( ) ; // Not needed -- closed automatically
return 0 ;
}
You are probably solving some kind of simple algorithmic task. Take a look at std::pair and std::tuple, which are useful not only as container for two elements, but because of their natural comparison operators.
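To illustrate the comparison operators mentioned above, here is a small sketch using std::tuple, assuming C++11. The alias Item and the function names are made up, and the meaning of the columns (id, count, color) is a guess based on the sample input. Unlike the code above, it reads rows until the input ends rather than relying on a leading count.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>  // handy for trying this out on in-memory input
#include <string>
#include <tuple>
#include <vector>

// Hypothetical alias: id, count, color.
typedef std::tuple<int, int, std::string> Item;

// Read "int int word" rows until the input runs out.
std::vector<Item> read_items(std::istream& in) {
    std::vector<Item> items;
    int id = 0;
    int count = 0;
    std::string color;
    while (in >> id >> count >> color) {
        items.push_back(std::make_tuple(id, count, color));
    }
    return items;
}

// Sort by count first and by color second. std::tie builds tuples
// of references, and tuple's operator< compares them lexicographically.
void sort_by_count(std::vector<Item>& items) {
    std::sort(items.begin(), items.end(), [](const Item& a, const Item& b) {
        return std::tie(std::get<1>(a), std::get<2>(a)) <
               std::tie(std::get<1>(b), std::get<2>(b));
    });
}
```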
The answer given is indeed a much better solution than yours. I figured I should point out some of your design flaws and give some tips to improve it.
You redefined a function that already exists in the standard library: std::stoi() converts a string to an integer. Remember, if a function already exists, it's OK to reuse it; you don't have to reinvent what's already been invented. If you're not sure, search your favorite C++ reference.
The solution stores the data "as is", while you store each line as a full string. This doesn't really make sense. You know what the data is beforehand, so use that to your advantage. Plus, when you store a line of data like that, it must be parsed, converted, and then constructed before it can be used in any way, whereas in the solution the data is constructed once and only once.
Because the format of the data is known beforehand, an even better way to load the information is to define a structure along with input/output operators. This would look something like this:
struct MyData
{
int num1;
int num2;
std::string color;
friend std::ostream& operator << (std::ostream& os, const MyData& d);
friend std::istream& operator >> (std::istream& is, MyData& d); // non-const: the operator writes into d
};
Then you could simply do something like this:
...
MyData tmp;
infile >> tmp;
vData.push_back(tmp);
...
There is no question of intent: we are obviously reading a data type from a stream and storing it in a container. If anything, it's clearer what you are doing than in either your original solution or the provided one.
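The two operators are only declared above. One possible way to write them is sketched below, assuming C++11 and that each record is three whitespace-separated fields. The type is renamed ColorEntry here to make clear it is an illustration rather than the original MyData.

```cpp
#include <istream>
#include <ostream>
#include <sstream>  // handy for trying this out on in-memory input
#include <string>

// Illustrative stand-in for MyData.
struct ColorEntry {
    int num1 = 0;
    int num2 = 0;
    std::string color;
};

// Read "num1 num2 color". The target is only modified if all three
// fields were read successfully.
std::istream& operator>>(std::istream& is, ColorEntry& d) {
    ColorEntry tmp;
    if (is >> tmp.num1 >> tmp.num2 >> tmp.color) {
        d = tmp;
    }
    return is;
}

// Write the fields back in the same space-separated layout.
std::ostream& operator<<(std::ostream& os, const ColorEntry& d) {
    return os << d.num1 << ' ' << d.num2 << ' ' << d.color;
}
```

A reading loop then becomes while (infile >> tmp) vData.push_back(tmp); with tmp declared as a ColorEntry.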

C++ formatted input must match value

I'm reading a file with C++; the file looks like:
tag1 2345
tag2 3425
tag3 3457
I would like to have something like
input>>must_be("tag1")>>var1>>must_be("tag2")>>var2>>must_be("tag3")>>var3;
Where everything blows up if what's being taken in doesn't match the argument of must_be() and, when done, var1=2345, var2=3425, var3=3457.
Is there a standard way of doing this? (Hopefully where "tag1" need not necessarily be a string, but this is not a requirement.) fscanf from C made it quite easy.
Thanks!
To clarify, each >> reads in one whitespace-delimited set of characters from input. I want to match some of the in-coming blocks of characters (tagX) against strings or data I have specified.
You need to implement operator>> for your class. Something like this :
#include <string>
#include <iostream>
#include <fstream>
#include <sstream>
struct A
{
A(const int tag_):tag(tag_),v(0){}
int tag;
int v;
};
#define ASSERT_CHECK( chk, err ) \
if ( !( chk ) ) \
throw std::string(err);
std::istream& operator>>( std::istream & is, A &a )
{
std::string tag;
is >> tag;
ASSERT_CHECK( tag.size() == 4, "tag size" );
std::stringstream ss(std::string(tag.begin()+3,tag.end()));
int tagVal;
ss >> tagVal;
std::cout<<"tag="<<tagVal<<" a.tag="<<a.tag<<std::endl;
ASSERT_CHECK( a.tag == tagVal,"tag value" );
is >> a.v;
return is;
}
int main() {
A a1(1);
A a2(2);
A a3(4);
try{
std::fstream f("in.txt" );
f >> a1 >> a2 >> a3;
}
catch(const std::string &e)
{
std::cout<<e<<std::endl;
}
std::cout<<"a1.v="<<a1.v<<std::endl;
std::cout<<"a2.v="<<a2.v<<std::endl;
std::cout<<"a3.v="<<a3.v<<std::endl;
}
Note that for a wrong tag value an exception will be thrown (meaning the tag must match).
Can't you read it line by line, and match the tag on each line? If the tag doesn't match what you expect, you just skip the line and move on to the next.
Something like this:
const char *tags[] = {
"tag1",
"tag2",
"tag3",
};
int current_tag = 0; // tag1
const int tag_count = 3; // number of entries in the tags array
std::map<std::string, int> values;
std::string line;
while (current_tag < tag_count && std::getline(input, line))
{
std::istringstream is(line);
std::string tag;
int value;
is >> tag >> value;
if (tag == tags[current_tag])
values[tag] = value;
// else skip line (print error message perhaps?)
current_tag++;
}
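The must_be syntax from the question can also be approximated directly with a small helper type and an operator>> overload. This is only a sketch: must_be is not a standard facility, and here it compares whole whitespace-delimited tokens as strings. A mismatch sets failbit on the stream, so the extractions that follow it do nothing.

```cpp
#include <istream>
#include <sstream>  // handy for trying this out on in-memory input
#include <string>

// Hypothetical helper: carries the token that must come next.
struct must_be {
    explicit must_be(const std::string& expected_token)
        : expected(expected_token) {}
    std::string expected;
};

// Read one whitespace-delimited token and set failbit if it
// differs from the expected one.
std::istream& operator>>(std::istream& in, const must_be& m) {
    std::string token;
    if (in >> token && token != m.expected) {
        in.setstate(std::ios::failbit);
    }
    return in;
}
```

If you want everything to "blow up" rather than fail quietly, you can call input.exceptions(std::ios::failbit) beforehand, which makes the stream throw std::ios_base::failure when failbit is set.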