Reading key value pairs from a file and ignoring # comment lines - c++

I would like to read key value pairs from a file, while ignoring comment lines.
Imagine a file like:
key1=value1
#ignore me!
I came up with this,
a) it seems very clunky and
b) it doesn't work if the '=' isn't surrounded by spaces. The lineStream isn't being properly split and the entire line is being read into "key".
std::ifstream infile(configFile);
std::string line;
std::map<std::string, std::string> properties;
while (getline(infile, line)) {
//todo: trim start of line
if (line.length() > 0 && line[0] != '#') {
std::istringstream lineStream(line);
std::string key, value;
char delim;
if ((lineStream >> key >> delim >> value) && (delim == '=')) {
properties[key] = value;
}
}
}
Also, comments on my code style are welcome :)

I recently had to write an interpreter that read in a configuration file and stored its values. This is the way I did it; it ignores lines starting with #:
typedef std::map<std::string, std::string> ConfigInfo;
ConfigInfo configValues;
std::string line;
while (std::getline(fileStream, line))
{
std::istringstream is_line(line);
std::string key;
if (std::getline(is_line, key, '='))
{
std::string value;
if (key[0] == '#')
continue;
if (std::getline(is_line, value))
{
configValues[key] = value;
}
}
}
Here fileStream is an fstream opened on the file.
Partly from https://stackoverflow.com/a/6892829/1870760

Doesn't look that bad. I just would use string::find to find the equal sign, instead of generating the lineStream. Then take the substring from index zero to the equal-sign position and trim it. (Unfortunately, you have to write the trim routine yourself or use the boost one.) Then take the substring behind the equal sign and also trim it.

Related

How to read input from file and pair into a map in C++

I am trying to read through a text file that can possibly look like below.
HI bye
goodbye
foo bar
boy girl
one two three
I am trying to take the lines with only two words and store them in a map, the first word would be the key and second word would be the value.
below is the code I came up with but I can't figure out how to ignore the lines that do not have two words on them.
this only works properly if every line has two words. I understand why this is only working if every line has two words but, I'm not sure what condition I can add to prevent this.
pair<string, string> myPair;
map<string, string> myMap;
while(getline(file2, line, '\0'))
{
stringstream ss(line);
string word;
while(!ss.eof())
{
ss >> word;
myPair.first = word;
ss >> word;
myPair.second = word;
myMap.insert(myPair);
}
}
map<string, string>::iterator it=myMap.begin();
for(it=myMap.begin(); it != myMap.end(); it++)
{
cout<<it->first<<" "<<it->second<<endl;
}
Read two words into a temporary pair. If you can't, do not add the pair to the map. If you can read two words, see if you can read a third word. If you can, you have too many words on the line. Do not add.
Example:
while(getline(file2, line)) // default '\n' delimiter; with '\0' the whole file would arrive as a single "line"
{
stringstream ss(line);
pair<string,string> myPair;
string junk;
if (ss >> myPair.first >> myPair.second && !(ss >> junk))
{ // successfully read into pair, but not into a third junk variable
myMap.insert(myPair);
}
}
Let me suggest a slightly different implementation:
std::string line;
while (std::getline(infile, line)) {
// Vector of string to save tokens
vector <string> tokens;
// stringstream class check1
stringstream check1(line);
string intermediate;
// Tokenizing w.r.t. space ' '
while(getline(check1, intermediate, ' ')) {
tokens.push_back(intermediate);
}
if (tokens.size() == 2) {
// your condition of 2 words in a line apply
// process 1. and 2. item of vector here
}
}
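You can use fscanf to read input from a file and sscanf to read formatted input from a string. sscanf returns the number of items it successfully read with the given format, so asking for up to three words tells you whether a line has one, two, or more than two words.
#include <cstdio>
#include <iostream>
using namespace std;
int main()
{
char line[100];
FILE *fp = fopen("inp.txt", "r");
if (!fp) return 1; // could not open the file
// the leading space skips newlines; %99[^\n] reads the rest of a line, at most 99 chars
while(fscanf(fp, " %99[^\n]", line) == 1)
{
cout<<line<<endl;
char s1[100], s2[100], s3[100];
// take is the number of words read, capped at 3; exactly 2 means a two-word line
int take = sscanf(line, "%99s %99s %99s", s1, s2, s3);
cout<<take<<endl;
}
fclose(fp);
return 0;
}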

How do I convert a string to a double in C++?

I've attempted to use atof() (which I think is way off) and stringstream. I feel like stringstream is the answer, but I'm so unfamiliar with it. Based on some Google searches, YouTube videos, and some time at cplusplus.com, my syntax looks like below. I'm pulling data from a .csv file and attempting to put it into a std::vector<double>:
while (file.good() )
{
getline(file,line,',');
stringstream convert (line);
convert = myvector[i];
i++;
}
If you are reading doubles from a stream (file) we can simplify this:
double value;
while(file >> value) {
myvector.push_back(value);
}
The operator>> will read from a stream into the type you want and do the conversion automatically (if the conversion exists).
You could use a stringstream as an intermediate if each line had more information on it, like a word, an integer, and a double.
std::string line;
while(std::getline(file, line)) {
std::stringstream lineStream(line);
std::string word;
int integer;
double real;
lineStream >> word >> integer >> real;
}
But this is overkill if you just have a single number on each line.
Now let's look at a CSV file.
This is a line-based file, but each value is separated by a comma. So here you would read a line, then loop over that line and read each value followed by the comma.
std::string line;
while(std::getline(file, line)) {
std::stringstream lineStream(line);
double value;
char comma;
while(lineStream >> value) {
// You successfully read a value
if (!(lineStream >> comma && comma == ',')) {
break; // No comma so this line is finished break out of the loop.
}
}
}
Don't put a test for good() in the while condition.
See: Why is iostream::eof inside a loop condition considered wrong?
Also worth a read:
How can I read and parse CSV files in C++?
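To answer the title question directly: since C++11 a std::string can be converted with std::stod. The following is only a sketch under some assumptions: parseCsvLine is a made-up name, fields are separated by plain commas with no quoting, and fields that are not numbers are silently skipped rather than reported.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert one comma-separated line into doubles.
std::vector<double> parseCsvLine(const std::string& line) {
    std::vector<double> values;
    std::istringstream lineStream(line);
    std::string field;
    while (std::getline(lineStream, field, ',')) {   // one field per comma
        try {
            values.push_back(std::stod(field));      // std::stod skips leading whitespace
        } catch (const std::invalid_argument&) {
            // field does not start with a number: skipped in this sketch
        } catch (const std::out_of_range&) {
            // number does not fit in a double: skipped in this sketch
        }
    }
    return values;
}
```

Call it once per line obtained from std::getline(file, line) and append the result to your vector. Note that std::stod accepts a numeric prefix, so a field like "3.5abc" converts to 3.5; pass its second argument (a std::size_t*) if you need to detect trailing characters.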

Reading in file with delimiter

How do I read in lines from a file and assign specific segments of that line to the information in structs? And how can I stop at a blank line, then continue again until end of file is reached?
Background: I am building a program that will take an input file, read in information, and use double hashing for that information to be put in the correct index of the hashtable.
Suppose I have the struct:
struct Data
{
string city;
string state;
string zipCode;
};
But the lines in the file are in the following format:
20

85086,Phoenix,Arizona
56065,Minneapolis,Minnesota

85281
56065
Sorry but I still cannot seem to figure this out. I am having a really hard time reading in the file. The first line is basically the size of the hash table to be constructed. The next blank line should be ignored. Then the next two lines are information that should go into the struct and be hashed into the hash table. Then another blank line should be ignored. And finally, the last two lines are input that need to be matched to see if they exist in the hash table or not. So in this case, 85281 is not found. While 56065 is found.
As the other two answers point out you have to use std::getline, but this is how I would do it:
if (std::getline(is, zipcode, ',') &&
std::getline(is, city, ',') &&
std::getline(is, state))
{
d.zipCode = zipcode; // Data::zipCode is a std::string, so no std::stoi conversion is needed
}
The only real change I made is that I encased the extractions within an if statement so you can check if these reads succeeded. Moreover, in order for this to be done easily (you wouldn't want to type the above out for every Data object), you can put this inside a function.
You can overload the >> operator for the Data class like so:
std::istream& operator>>(std::istream& is, Data& d)
{
std::string zipcode;
if (std::getline(is, zipcode, ',') &&
std::getline(is, d.city, ',') &&
std::getline(is, d.state))
{
d.zipCode = zipcode; // Data::zipCode is a std::string, so no std::stoi conversion is needed
}
return is;
}
Now it becomes as simple as doing:
Data d;
if (std::cin >> d)
{
std::cout << "Yes! It worked!";
}
You can use a getline function from <string> like this:
string str; // This will store your tokens
ifstream file("data.txt");
while (getline(file, str, ',')) // You can have a different delimiter
{
// Process your data
}
You can also use stringstream:
stringstream ss(line); // Line is from your input data file
while (ss >> str) // str is to store your token
{
// Process your data here
}
It's just a hint. Hope it helps you.
All you need is the function std::getline.
For example:
std::string s;
std::getline( YourFileStream, s, ',' );
To convert a string to an int you can use the function std::stoi.
Or you can read a whole line and then use std::istringstream to extract each field with the same function std::getline. For example:
Data d = {};
std::string line;
std::getline( YourFileStream, line );
std::istringstream is( line );
std::string zipCode;
std::getline( is, zipCode, ',' );
d.zipCode = zipCode; // Data::zipCode is a std::string; call std::stoi( zipCode ) only if you need an int
std::getline( is, d.city, ',' );
std::getline( is, d.state, ',' );
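For the blank-line part of the question, one possible sketch is to treat every blank line as a section separator. ParsedInput and parseInput are invented names, and the layout (table size, blank line, zip,city,state records, blank line, zip codes to look up) is assumed from the description in the question:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Data {
    std::string city;
    std::string state;
    std::string zipCode;
};

// Hypothetical container for everything the file holds.
struct ParsedInput {
    int tableSize = 0;
    std::vector<Data> records;         // lines of the form zip,city,state
    std::vector<std::string> queries;  // zip codes to look up afterwards
};

ParsedInput parseInput(std::istream& in) {
    ParsedInput result;
    std::string line;
    int section = 0;          // 0 = table size, 1 = records, 2 = queries
    bool sawContent = false;  // has the current section held a line yet?
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF files
        if (line.empty()) {
            // A blank line ends the current section; repeated blanks count once.
            if (sawContent) { ++section; sawContent = false; }
            continue;
        }
        sawContent = true;
        if (section == 0) {
            std::istringstream sizeStream(line);
            sizeStream >> result.tableSize;
        } else if (section == 1) {
            std::istringstream fields(line);
            Data d;
            if (std::getline(fields, d.zipCode, ',') &&
                std::getline(fields, d.city, ',') &&
                std::getline(fields, d.state)) {
                result.records.push_back(d);  // only keep fully parsed records
            }
        } else {
            result.queries.push_back(line);
        }
    }
    return result;
}
```

Hashing the records and checking the queries can then happen after parsing, which keeps the file handling separate from the hash table code.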

How to read a file word by word and find the position of each word?

I'm trying to read a file word by word and do some processing on each word. Later I want to know the position of each word, meaning its line number and its character position within that line. If the character position is not available, I at least need to know when reading moves on to the next line. This is the sample code I have now:
string tmp;
while(fin>>tmp){
mylist.push_back(tmp);
}
I need to know when fin is going to next line?!
"I need to know when fin is going to next line"
This is not possible with the stream's operator>>, because it skips newlines like any other whitespace. You can read the input line by line and process each line separately using a temporary istringstream object:
std::string line, word;
while (std::getline(fin, line)) {
// skip empty lines:
if (line.empty()) continue;
std::istringstream lineStream(line);
for (int wordPos = 0; lineStream >> word; wordPos++) {
...
mylist.push_back(word);
}
}
Just don't forget to #include <sstream>.
One simple way to solve this problem would be using std::getline, run your own counter, and split line's content into words using an additional string stream, like this:
string line;
int line_number = 0;
for (;;) {
if (!getline(fin, line)) {
break;
}
istringstream iss(line);
string tmp;
while (iss >> tmp) {
mylist.push_back(tmp);
}
line_number++;
}

Tokenize stringstream based on type

I have an input stream containing integers and special meaning characters '#'. It looks as follows:
... 12 18 16 # 22 24 26 15 # 17 # 32 35 33 ...
The tokens are separated by space. There's no pattern for the position of '#'.
I was trying to tokenize the input stream like this:
int value;
std::ifstream input("data");
if (input.good()) {
string line;
while(getline(data, line) != EOF) {
if (!line.empty()) {
sstream ss(line);
while (ss >> value) {
//process value ...
}
}
}
}
The problem with this code is that the processing stops when the first '#' is encountered.
The only solution I can think of is to extract each individual token into a string, check that it is not '#', and use the atoi() function to convert the string to an integer. However, that seems very inefficient, as the majority of tokens are integers. Calling atoi() on every token introduces big overhead.
Is there a way I can parse the individual token by its type? ie, for integers, parse it as integers while for '#', skip it. Thanks!
One possibility would be to explicitly skip whitespace (ss >> std::ws), and then to use ss.peek() to find out if a # follows. If yes, use ss.get() to read it and continue, otherwise use ss >> value to read the value.
If the positions of # don't matter, you could also remove all '#' from the line before initializing the stringstream with it.
Usually not worth testing against good()
if (input.good()) {
Unless your next operation is generating an error message or exception. If it is not good all further operations will fail anyway.
Don't test against EOF.
while(getline(data, line) != EOF) {
The result of std::getline() is not an integer. It is a reference to the input stream. The input stream is convertible to a bool-like object that can be used in a boolean context (like while, if, etc.). So what you want to do is:
while(getline(data, line)) {
I am not sure I would read a line. You could just read a word (since the input is space separated). Using the >> operator on string
std::string word;
while(data >> word) { // reads one space separated word
Now you can test the word to see if it is your special character:
if (word[0] == '#') // compare against the character '#', not the string literal "#"
If not convert the word into a number.
This is what I would do:
// define a class that will read either value from a stream
class MyValue
{
public:
bool isSpec() const {return isSpecial;}
int value() const {return intValue;}
friend std::istream& operator>>(std::istream& stream, MyValue& data)
{
std::string item;
if (stream >> item) { // only update data if a token was actually read
if (item[0] == '#') {
data.isSpecial = true;
} else {
data.isSpecial = false;
data.intValue = atoi(item.c_str());
}
}
return stream;
}
private:
bool isSpecial;
int intValue;
};
// Now your loop becomes:
MyValue val;
while(file >> val)
{
if (val.isSpec()) { /* Special processing */ }
else { /* We have an integer */ }
}
Maybe you can read all values as std::string and then check if it's "#" or not (and if not - convert to int)
int value;
std::ifstream input("data");
if (input.good()) {
std::string chunk;
while (std::getline(input, chunk, '#')) { // split the input at each '#'
std::istringstream ss(chunk); // a fresh stream per chunk, so no stale state to reset
while (ss >> value) { // split the chunk on whitespace and convert to int
//process values ...
}
}
}
Here we first split the input at each '#' in the outer while loop; in the inner while loop, operator>> splits each chunk on whitespace and converts every token to an int.
Personally, if your separator is always going to be space regardless of what follows, I'd recommend you just take the input as string and parse from there. That way, you can take the string, see if it's a number or a # and whatnot.
I think you should re-examine your premise that "Calling atoi() on the tokens introduces big overhead."
There is no magic to std::cin >> val. Under the hood, it ends up calling (something very similar to) atoi.
If your tokens are huge, there might be some overhead to creating a std::string but as you say, the vast majority are numbers (and the rest are #'s) so they should mostly be short.