How do I read in lines from a file and assign specific segments of that line to the information in structs? And how can I stop at a blank line, then continue again until end of file is reached?
Background: I am building a program that will take an input file, read in information, and use double hashing for that information to be put in the correct index of the hashtable.
Suppose I have the struct:
struct Data
{
string city;
string state;
string zipCode;
};
But the lines in the file are in the following format:
20

85086,Phoenix,Arizona
56065,Minneapolis,Minnesota

85281
56065
Sorry, but I still cannot figure this out. I am having a really hard time reading in the file. The first line is the size of the hash table to be constructed. The blank line after it should be ignored. The next two lines hold the information that should go into the struct and be hashed into the hash table. Then another blank line should be ignored. Finally, the last two lines are keys that need to be looked up to see whether they exist in the hash table. In this case, 85281 is not found, while 56065 is found.
As the other two answers point out you have to use std::getline, but this is how I would do it:
if (std::getline(is, zipcode, ',') &&
std::getline(is, city, ',') &&
std::getline(is, state))
{
d.zipCode = zipcode; // zipCode is a std::string in Data; use std::stoi(zipcode) only if you make it an int
}
The only real change I made is that I encased the extractions within an if statement so you can check if these reads succeeded. Moreover, in order for this to be done easily (you wouldn't want to type the above out for every Data object), you can put this inside a function.
You can overload the >> operator for the Data class like so:
std::istream& operator>>(std::istream& is, Data& d)
{
std::string zipcode;
if (std::getline(is, zipcode, ',') &&
std::getline(is, d.city, ',') &&
std::getline(is, d.state))
{
d.zipCode = zipcode; // zipCode is a std::string in Data; use std::stoi(zipcode) only if you make it an int
}
return is;
}
Now it becomes as simple as doing:
Data d;
if (std::cin >> d)
{
std::cout << "Yes! It worked!";
}
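To connect this back to the file layout in the question (table size, blank line, records, blank line, lookup keys), here is a minimal sketch of reading the sections in order. The names InputFile, parseRecord and readInput are hypothetical helpers made up for this example, zipCode stays a std::string as in the question's struct, and the hashing itself is left out.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Data
{
    std::string city;
    std::string state;
    std::string zipCode;
};

// Hypothetical container for everything the input file holds.
struct InputFile
{
    int tableSize = 0;
    std::vector<Data> records;        // lines to insert into the hash table
    std::vector<std::string> queries; // keys to look up afterwards
};

// Split one "zip,city,state" line into a Data object.
bool parseRecord(const std::string& line, Data& d)
{
    std::istringstream is(line);
    return std::getline(is, d.zipCode, ',') &&
           std::getline(is, d.city, ',') &&
           std::getline(is, d.state);
}

// Read: size line, blank line(s), records up to the next blank line, then queries.
bool readInput(std::istream& in, InputFile& out)
{
    std::string line;
    if (!std::getline(in, line)) return false;
    out.tableSize = std::stoi(line); // throws if the first line is not a number

    // Skip the blank line(s) in front of the records.
    while (std::getline(in, line) && line.empty()) {}

    // 'line' now holds the first record (if any); stop at the next blank line.
    while (in && !line.empty())
    {
        Data d;
        if (!parseRecord(line, d)) return false;
        out.records.push_back(d);
        if (!std::getline(in, line)) line.clear();
    }

    // Every remaining non-blank line is a key to look up.
    while (std::getline(in, line))
    {
        if (!line.empty()) out.queries.push_back(line);
    }
    return true;
}
```

With a std::ifstream opened on the input file, readInput(file, input) would fill input.records with the lines to hash and input.queries with the zip codes to search for.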
You can use a getline function from <string> like this:
string str; // This will store your tokens
ifstream file("data.txt");
while (getline(file, str, ',')) // You can have a different delimiter
{
// Process your data
}
You can also use stringstream:
stringstream ss(line); // Line is from your input data file
while (ss >> str) // str is to store your token
{
// Process your data here
}
It's just a hint. Hope it helps you.
All you need is function std::getline
For example
std::string s;
std::getline( YourFileStream, s, ',' );
To convert a string to int you can use function std::stoi
Or you can read a whole line and then use std::istringstream to extract each field with the same function std::getline. For example
Data d = {};
std::string line;
std::getline( YourFileStream, line );
std::istringstream is( line );
std::string zipCode;
std::getline( is, zipCode, ',' );
d.zipCode = zipCode; // zipCode is a std::string in Data; use std::stoi( zipCode ) only if you make it an int
std::getline( is, d.city, ',' );
std::getline( is, d.state, ',' );
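Note that std::stoi throws when the text is not a number, so if the input may be malformed it can help to wrap the conversion. A minimal sketch, where parseInt is a hypothetical helper name chosen for this example:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert text to int; returns false instead of throwing on bad input.
bool parseInt(const std::string& text, int& value)
{
    try
    {
        std::size_t used = 0;
        const int parsed = std::stoi(text, &used);
        if (used != text.size()) return false; // trailing junk such as "12x"
        value = parsed;
        return true;
    }
    catch (const std::invalid_argument&) { return false; } // no digits at all
    catch (const std::out_of_range&)     { return false; } // does not fit in an int
}
```

The second argument of std::stoi reports how many characters were consumed, which is what lets the helper reject input like "12x".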
I have a std::string line which contains a line of text read from a file. I then create a istringstream:
std::istringstream str(line);
to read this line. To get the first word, I do this:
std::string word;
str >> word;
Is there a way to get the word directly from str, without declaring the intermediate variable word?
For example, I would like to do something like:
if (str.get_next_word_directly() == "yes")
do_something();
You have to use a named variable to do this; it is possible to extract into a temporary but then you have no way of identifying that temporary to call operator== on it.
One option would be to wrap the extraction in a function, e.g.
std::string get_word(std::istream& is)
{
std::string word;
is >> word;
return word;
}
and then you can write
if( get_word(str) == "yes" )
NB. A unified call syntax proposal considered for C++17 would have allowed str.get_word() instead of get_word(str), but it was not adopted, so you're still stuck with get_word(str).
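A small self-contained sketch of the wrapper approach. The helper first_word_is is a hypothetical name added here for illustration only:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Extract the next whitespace-delimited word; yields "" if nothing could be read.
std::string get_word(std::istream& is)
{
    std::string word;
    is >> word;
    return word;
}

// Hypothetical helper: does the first word of 'line' equal 'expected'?
bool first_word_is(const std::string& line, const std::string& expected)
{
    std::istringstream str(line);
    return get_word(str) == expected;
}
```

The named variable still exists, but it is hidden inside get_word, so the call site reads as a single expression.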
I am quite new to C++ and I have a txt file with data which looks something like this:
test:123:lock
qwerty:4321:unlock
asdf:12:lock
Is it possible for me to read the data line by line into a variable / array using ":" as the delimiter?
I tried doing something like:
while(!myfile.eof()) {
for (int i = 0; i < 3; i++) {
getline(myfile,UserN[i],':');
}
}
What I want to achieve is to store the data of the first line in UserN[0], UserN[1], and UserN[2]. When it starts reading the second line, the data on the second line should replace the values in UserN[0], UserN[1], and UserN[2]. Thanks in advance!
Read the line first, then tokenize it with std::stringstream:
#include <sstream>
...
std::string line;
while(std::getline(myfile, line)) { // cache the line
std::istringstream tokenizer(line);
std::getline(tokenizer, UserN[0], ':'); // then get the tokens from it
std::getline(tokenizer, UserN[1], ':');
std::getline(tokenizer, UserN[2]); // last token: get the remainder
// of the line.
if(tokenizer) {
// success!
} else {
// There were fewer than two colons in the line
}
}
In essence, std::istringstream wraps a string in a stream interface -- the resulting stream behaves (roughly) like a file with the same contents as the string with which it was built. It is then possible to use >> or getline or anything else that you could use on files or std::cin or other input streams with it, and here we use it to take the string apart into the tokens you require.
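As a self-contained illustration of the same idea, the sketch below collects the three tokens of every well-formed line. The Record alias and the readRecords function are hypothetical names chosen for this example, and lines with too few tokens are simply skipped.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One line of the file: name, number, lock state (all kept as text here).
using Record = std::array<std::string, 3>;

std::vector<Record> readRecords(std::istream& in)
{
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line))              // read one whole line
    {
        std::istringstream tokenizer(line);     // wrap it in a stream
        Record r;
        if (std::getline(tokenizer, r[0], ':') &&
            std::getline(tokenizer, r[1], ':') &&
            std::getline(tokenizer, r[2]))      // last token: rest of the line
        {
            records.push_back(r);
        }
        // Lines with missing tokens are skipped.
    }
    return records;
}
```

Because the function takes a std::istream&, it accepts an open std::ifstream as well as a std::istringstream, which is convenient for trying it out without a file.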
You can do this simply with
ifstream myfile( "aFile.txt" );
// .. check whether the file is open: if( !myfile.is_open() ) error
for( string userN[3]
; getline( getline( getline( myfile >> ws, userN[0], ':' ), userN[1], ':' ), userN[2] ); )
{
// userN[0..2] is read correctly
}
or in a more elegant way that is perhaps better suited to your requirements. I assume that the second field is always a number and the third is either 'lock' or 'unlock', i.e. something like an enum.
enum class LockState
{
lock, unlock
};
// -- reading a LockState
// note: the word is read up to the next white space character (Space, LF, ..) or the end of input
std::istream& operator>>(std::istream& in, LockState& s)
{
std::string word;
if( in >> word )
{
if( word == "lock" )
s = LockState::lock;
else if( word == "unlock" )
s = LockState::unlock;
else
in.setstate( std::ios_base::failbit );
}
return in;
}
struct Entry // change the name 'Entry' of the struct suitable for Your requirements
{
std::string someText;
int aNr;
LockState lockState;
};
// -- function to read an 'Entry'-object
std::istream& operator>>(std::istream& in, Entry& e)
{
char colon;
if( getline( in >> std::ws, e.someText, ':' ) >> e.aNr >> colon
&& colon != ':' )
in.setstate( std::ios_base::failbit );
else
in >> e.lockState;
return in;
}
and later in Your main-program
ifstream myfile( "aFile.txt" );
// .. check whether the file is open: if( !myfile.is_open() ) error
for( Entry e; myfile >> e; )
{
// use here the Entry-object 'e'
}
if( myfile.eof() )
cout << "Ok - You read the file till the end" << endl;
Avoid trouble here and use the split function from Boost:
#include <fstream>
#include <vector>
#include <string>
#include <boost/algorithm/string.hpp>
// ...
// Open the file (add your own error handling).
std::ifstream infile;
infile.open(file_name);
std::string line;
while (std::getline(infile, line))
{
// Split the line at every ':'.
std::vector<std::string> strings;
boost::split(strings, line, boost::is_any_of(":"));
// You have now a vector of strings, which you can process...
}
I need to cut a string stream according to a custom separator. My current code only splits on the standard separators. How can I define a custom delimiter and split the stringstream into strings with it?
std::istringstream input;
input.str("1\n2\n3\n4\n5\n6\n7\n");
int sum = 0;
for (std::string line; std::getline(input, line); )
{
cout<<line;
}
If you have one delimiter you want to use and it's a single character, you can just pass it to the 3-parameter overload of std::getline():
std::istringstream input;
input.str("1;2;3;4;5;6;7;");
int sum = 0;
for (std::string field; std::getline(input, field, ';'); )
{
std::cout<<field;
}
For other situations (multi-character delimiter, multiple delimiters), you might want to consider using Boost.Tokenizer.
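If pulling in Boost is not an option, a multi-character delimiter can also be handled with std::string::find alone. The function name split_on below is hypothetical; it is a minimal sketch that keeps empty fields and returns the whole text as one field when the delimiter is empty.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split 'text' at every occurrence of 'delimiter' (which may be longer than one character).
std::vector<std::string> split_on(const std::string& text, const std::string& delimiter)
{
    std::vector<std::string> fields;
    if (delimiter.empty())
    {
        fields.push_back(text); // nothing to split on
        return fields;
    }
    std::size_t start = 0;
    std::size_t pos = 0;
    while ((pos = text.find(delimiter, start)) != std::string::npos)
    {
        fields.push_back(text.substr(start, pos - start));
        start = pos + delimiter.size(); // continue after the delimiter
    }
    fields.push_back(text.substr(start)); // remainder after the last delimiter
    return fields;
}
```

Unlike the std::getline loop, this version reports a trailing empty field, so "1;2;" split on ";" yields three entries.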
Use third argument of overloaded std::getline
for (std::string line; std::getline(input, line, delimiter ); )
{
std::cout<< line <<'\n';
}
I am trying to parse a simple CSV file, with data in a format such as:
20.5,20.5,20.5,0.794145,4.05286,0.792519,1
20.5,30.5,20.5,0.753669,3.91888,0.749897,1
20.5,40.5,20.5,0.701055,3.80348,0.695326,1
So, a very simple and fixed format file. I am storing each column of this data into a STL vector. As such I've tried to stay the C++ way using the standard library, and my implementation within a loop looks something like:
string field;
getline(file,line);
stringstream ssline(line);
getline( ssline, field, ',' );
stringstream fs1(field);
fs1 >> cent_x.at(n);
getline( ssline, field, ',' );
stringstream fs2(field);
fs2 >> cent_y.at(n);
getline( ssline, field, ',' );
stringstream fs3(field);
fs3 >> cent_z.at(n);
getline( ssline, field, ',' );
stringstream fs4(field);
fs4 >> u.at(n);
getline( ssline, field, ',' );
stringstream fs5(field);
fs5 >> v.at(n);
getline( ssline, field, ',' );
stringstream fs6(field);
fs6 >> w.at(n);
The problem is, this is extremely slow (there are over 1 million rows per data file), and seems to me to be a bit inelegant. Is there a faster approach using the standard library, or should I just use stdio functions? It seems to me this entire code block would reduce to a single fscanf call.
Thanks in advance!
Using 7 string streams when you can do it with just one sure doesn't help wrt. performance.
Try this instead:
string line;
getline(file, line);
istringstream ss(line); // note we use istringstream, we don't need the o part of stringstream
char c1, c2, c3, c4, c5; // to eat the commas
ss >> cent_x.at(n) >> c1 >>
cent_y.at(n) >> c2 >>
cent_z.at(n) >> c3 >>
u.at(n) >> c4 >>
v.at(n) >> c5 >>
w.at(n);
If you know the number of lines in the file, you can resize the vectors prior to reading and then use operator[] instead of at(). This way you avoid bounds checking and thus gain a little performance.
I believe the major bottleneck (put aside the getline()-based non-buffered I/O) is the string parsing. Since you have the "," symbol as a delimiter, you may perform a linear scan over the string and replace all "," by "\0" (the end-of-string marker, zero-terminator).
Something like this:
// tmp array for the line part values
double parts[MAX_PARTS];
while(getline(file, line))
{
    if(line.empty()) { continue; }
    size_t len = line.length();
    const char* last_start = &line[0];
    int num_parts = 0;
    for(size_t j = 0; j < len && num_parts < MAX_PARTS; j++)
    {
        if(line[j] == ',')
        {
            line[j] = '\0'; // terminate the current field in place
            parts[num_parts++] = atof(last_start);
            last_start = &line[j + 1]; // the next field starts right after the comma
        }
    }
    // the last field has no trailing comma, so parse it after the loop
    if(num_parts < MAX_PARTS) { parts[num_parts++] = atof(last_start); }
    /// do whatever you need with the parts[] array
}
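A variant of the same idea that leaves the string unmodified is to let std::strtod report where each number ended. This is only a sketch under the assumption that the fields are plain decimal numbers separated by single commas; parse_doubles is a hypothetical name, and whether it actually beats the stream-based version should be measured on your own data.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Parse comma-separated doubles from one line and append them to 'out'.
// Returns the number of values parsed; stops at the first field that is not a number.
std::size_t parse_doubles(const std::string& line, std::vector<double>& out)
{
    const char* p = line.c_str();
    std::size_t count = 0;
    while (*p != '\0')
    {
        char* end = nullptr;
        const double value = std::strtod(p, &end);
        if (end == p) break;   // no number at this position
        out.push_back(value);
        ++count;
        p = end;
        if (*p == ',') ++p;    // step over the delimiter
    }
    return count;
}
```

Calling out.reserve() with the expected number of rows before the read loop avoids repeated reallocation, in the same spirit as resizing the vectors up front.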
I don't know if this will be quicker than the accepted answer, but I might as well post it anyway in case you wish to try it.
You can load in the entire contents of the file using a single read call by knowing the size of the file using some fseek magic. This will be much faster than multiple read calls.
You could then do something like this to parse your string:
//Delimited string to vector
vector<string> dstov(string& str, string delimiter)
{
//Vector to populate
vector<string> ret;
//Current position in str
size_t pos = 0;
//While the string from point pos contains the delimiter
while(str.substr(pos).find(delimiter) != string::npos)
{
//Insert the substring from pos to the start of the found delimiter to the vector
ret.push_back(str.substr(pos, str.substr(pos).find(delimiter)));
//Move the pos past this found section and the found delimiter so the search can continue
pos += str.substr(pos).find(delimiter) + delimiter.size();
}
//Push back the final element in str when str contains no more delimiters
ret.push_back(str.substr(pos));
return ret;
}
string rawfiledata;
//This call will parse the raw data into a vector containing lines of
//20.5,30.5,20.5,0.753669,3.91888,0.749897,1 by treating the newline
//as the delimiter
vector<string> lines = dstov(rawfiledata, "\n");
//You can then iterate over the lines and parse them into variables and do whatever you need with them.
for(size_t itr = 0; itr < lines.size(); ++itr)
vector<string> line_variables = dstov(lines[itr], ",");
std::ifstream file{ InputFilename };
std::vector<std::string> line_elements;
for (std::string line; std::getline(file, line);)
{
line_elements.clear();
std::istringstream ss(line);
for (std::string value; std::getline(ss, value, ',');)
{
line_elements.push_back(std::move(value));
}
// Do something with the line_elements.
}
I would like some help understanding how to deal with istringstream objects.
I am trying to tokenize each line of a file so I can rewrite it in another format after checking certain data values in the tokens. I am loading each line into a tokenVector and iterating through the vector. My code works, but what concerns me is that I have to instantiate an istringstream object for each iteration, otherwise it does not work. That does not feel right. Here is my code:
std::string line;//each file line
std::ifstream myFile (info.txt.c_str());
if(myFile.is_open()){
getline(myFile, line);
std::vector<std::string> tokenVector;
//create an istringstream object for tokenizing each line of the file
std::istringstream hasTokens(line);
while(hasTokens)
{
std::string substring;
if(! getline(hasTokens, substring,','))
break;
tokenVector.push_back(substring);
}
//look for some known header names for validation
if(!tokenVector.empty()){
if(!(tokenVector[0]=="Time")&&(tokenVector[1] == "Group")&&(tokenVector[2]=="Perception")&&(tokenVector[3] == "Sign")){
setErrorMesssage("Invalid Header in myFile");
return false;
}
tokenVector.clear();
}
//clear the istringstream object
hasTokens.str(std::string());
//if header validates, do rest of file
while(myFile.good()){
getline(myFile , line);
//break line into tokens using istringstream
std::istringstream hasTokens(line);
//reload the vector of tokens for each line
while(hasTokens)
{
std::string substring;
if(! getline(hasTokens, substring,','))
break;
tokenVector.push_back(substring);
}
otherFileWritingFunction(tokenVector[0], tokenVector[2], tokenVector[4]);
tokenVector.clear();
hasTokens.str(std::string());
}//end while
}//end if is_open
This code works, but it's not correct because I should only have to instantiate istringstream once (I think). If I try "hasTokens.str(line)" for each iteration using just the original instantiation of hasTokens, as some examples have suggested, it does not work, so I would really appreciate a suggestion.
Thanks
Nope, your worries are misplaced. Create a new stream object when you need it, and dispose of it when you're done. That's the spirit of C++. An object for each purpose, and a purpose for each object (misquoting Frank Herbert). There's nothing "expensive" about constructing a string stream that wouldn't also happen when you reassign the string data of an existing string stream.
Your code is very noisy and redundant, though. The standard idiom goes like this:
std::string line;
while (std::getline(infile, line))
{
std::istringstream iss(line);
std::string token;
while (iss >> token) { /* do stuff */ }
}
Compressed version (some would call this abuse):
for (std::string line; std::getline(infile, line); )
{
std::istringstream iss(line);
for (std::string token; iss >> token; ) { /* ... */ }
}
The second std::istringstream declaration has an entirely different scope and is being constructed in each iteration so hasTokens.str(std::string()); has no effect.
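You could reuse the same object if you called hasTokens.clear() followed by hasTokens.str(line) in the while loop instead. The clear() call is needed because str() does not reset the stream state, so the eofbit and failbit left over from the previous line would make every later read fail.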
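A minimal sketch of reusing a single stream object across lines. The function name tokenize_lines is hypothetical; the important detail is that clear() is called before str(), because a fully consumed stream still has its eofbit and failbit set.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split every line of 'in' at commas, reusing one istringstream for all lines.
std::vector<std::vector<std::string>> tokenize_lines(std::istream& in)
{
    std::vector<std::vector<std::string>> rows;
    std::istringstream hasTokens;   // constructed once
    std::string line;
    while (std::getline(in, line))
    {
        hasTokens.clear();          // reset eofbit/failbit left by the previous line
        hasTokens.str(line);        // replace the buffer contents
        std::vector<std::string> tokens;
        for (std::string token; std::getline(hasTokens, token, ','); )
        {
            tokens.push_back(token);
        }
        rows.push_back(tokens);
    }
    return rows;
}
```

Whether this is worth it over constructing a fresh std::istringstream per line is a matter of taste; the per-line construction in the idiom above is simpler and harder to get wrong.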