Need to parse a string of ints and get the white space - c++

I have a file filled with ints (variable amount on a line), delimited by a space. I would like to parse out the int, then space, then int, then space ... until the newline char then start at a new line until the eof. An example file would look something like this:
1 1 324 234 12 123
2 2 312 403 234 234 123 125 23 34
...
To grab the ints I can do something like this:
std::ifstream inStream("file.txt");
std::string line;
int myInt = 0;
while(getline(inStream, line)) {
std::stringstream ss(line);
while(ss) {
ss >> myInt;
//process...
}
}
My question is that is there an easy way to also get the whitespace and endline char from the ss? Or is my best bet to write my program assuming a space after each index and a newline at the end of the ss? something like this:
std::ifstream inStream("file.txt");
std::string line;
int myInt = 0;
while(getline(inStream, line)) {
std::stringstream ss(line);
while(ss) {
ss >> myInt;
// process...
// done with myInt
char mySpace = ' ';
// now process mySpace
}
char myNewLine = '\n';
// now process myNewLine
}

If performance is not the most important issue, the following would be a general-purpose tokenizer for your input format. Whether this is a feasible solution depends of course on what you actually want to do with the input.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
static void handle_number_string(std::string& literal) {
if (!literal.empty()) {
std::istringstream iss {literal};
int value;
if (iss >> value) {
std::clog << "<" << value << ">";
} else {
// TODO: Handle malformed integer literal
}
literal.clear();
}
}
int main(int argc, char** argv) {
for (int i = 1; i < argc; i++) {
std::string aux;
std::ifstream istr {argv[i]};
std::clog << argv[i] << ": ";
int next;
// Stop at end of file before the EOF marker can be appended to aux.
while ((next = istr.get()) != std::char_traits<char>::eof()) {
switch (next) {
case ' ':
handle_number_string(aux);
std::clog << "<SPC>";
break;
case '\n':
handle_number_string(aux);
std::clog << "<EOL>";
break;
default:
aux.push_back(next);
}
}
// Handle case that the last line was not terminated with '\n'.
handle_number_string(aux);
std::clog << std::endl;
}
return 0;
}
Addendum: I'd only do this if I absolutely had to. Handling all possibilities (multiple spaces, non-breaking spaces, tabs, \r\n,…) correctly will be a lot of work. If what you actually want to handle are the logical tokens field separator and end of line, manually parsing whitespace seems to be the wrong way to go. It would be sad if your program crashes just because a user has justified the columns in the input file (thus using a variable number of spaces).
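If the logical tokens are what matter, one possible sketch is to let operator>> absorb the whitespace and report only numbers, field separators and line ends. The Token struct and the name tokenize_logical are hypothetical, invented here for illustration. Note the assumption that a malformed field simply ends its line; a real program would want to report it.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token type for this sketch (not part of the answer above).
struct Token {
    enum Kind { Number, Separator, EndOfLine } kind;
    int value;  // only meaningful when kind == Number
};

// Emits Number, Separator and EndOfLine tokens; any run of spaces, tabs
// or a trailing '\r' between two numbers counts as one separator.
std::vector<Token> tokenize_logical(std::istream& in) {
    std::vector<Token> tokens;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        bool first = true;
        int value = 0;
        // operator>> skips leading whitespace before each number.
        while (fields >> value) {
            if (!first) {
                tokens.push_back({Token::Separator, 0});
            }
            tokens.push_back({Token::Number, value});
            first = false;
        }
        tokens.push_back({Token::EndOfLine, 0});
    }
    return tokens;
}
```

With this shape, justified columns or Windows line endings produce the same token sequence as single spaces and '\n'.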

Try something like this:
std::ifstream inStream("file.txt");
std::string line;
int myInt;
while (std::getline(inStream, line))
{
std::stringstream ss(line);
ss >> myInt;
if (ss)
{
do
{
// process...
// done with myInt
ss >> myInt;
if (!ss) break;
char mySpace = ' ';
// now process mySpace
}
while (true);
}
char myNewLine = '\n';
// now process myNewLine
}

Read file and input into vector

Given a text file with the following fixed contents (only two lines, nothing more):
43 65 123 13 41
83 67 22
I need it to turn into a vector array where I can apply the name.size() to find out the length of the array and also to make use of the integers inside for other purposes.
The result should be:
Array 1: 43 65 123 13 41
Array 2: 83 67 22
Array 1 Length: 5
Array 2 Length: 3
I have successfully read the text file and added the content using .push_back(). However, when I print the vector's .size(), it returns 2, and I realised that it treats each line as one element of the array.
What is the approach to tackling this?
Edit:
Added snippet of the code:
vector <string> readFile(const string& fileName)
{
ifstream source;
source.open(fileName);
vector <string> lines;
string line;
while(getline(source, line))
{
lines.push_back(line);
}
return lines;
}
int main(int argc, char ** argv)
{
string inputFile(argv[1]);
vector <string> fileData = readFile(inputFile);
// Check vector
for(auto i : fileData)
cout << i << endl;
// Check vector length
cout << fileData.size() << endl;
}
As already mentioned, std::getline and std::istringstream are good starting points.
Then you can use std::istream_iterator to create the vectors directly using the iterator overload of the std::vector constructor.
Perhaps something like this:
std::vector<std::vector<int>> read_file(std::istream& input)
{
std::vector<std::vector<int>> lines;
std::string line;
while (std::getline(input, line))
{
std::istringstream line_stream(line);
lines.emplace_back(std::istream_iterator<int>(line_stream),
std::istream_iterator<int>());
}
return lines;
}
This handles an arbitrary number of lines containing an arbitrary number of integer values.
Can be used something like this:
int main(int, char* argv[])
{
std::ifstream input(argv[1]);
auto data = read_file(input);
std::cout << "First line of data: ";
for (auto value : data[0])
{
std::cout << value << ' ';
}
std::cout << '\n';
std::cout << "Second line of data: ";
for (auto value : data[1])
{
std::cout << value << ' ';
}
std::cout << '\n';
}
Important note: The example above lacks any kind of error or range checking. It's left as an exercise for the reader.
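One way that exercise might start is sketched below. The function read_file_checked and its error message format are assumptions for illustration, not part of the answer. It reads each field as text first and accepts it only if the whole field parses as an int.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical checked variant of read_file: throws on any field that is
// not a complete int instead of silently cutting the line short.
std::vector<std::vector<int>> read_file_checked(std::istream& input)
{
    std::vector<std::vector<int>> lines;
    std::string line;
    for (std::size_t line_no = 1; std::getline(input, line); ++line_no)
    {
        std::istringstream line_stream(line);
        std::vector<int> row;
        std::string field;
        while (line_stream >> field)
        {
            std::istringstream field_stream(field);
            int value = 0;
            // The field must parse as an int and leave nothing behind.
            if (!(field_stream >> value) ||
                field_stream.peek() != std::char_traits<char>::eof())
            {
                throw std::runtime_error("line " + std::to_string(line_no) +
                                         ": bad field '" + field + "'");
            }
            row.push_back(value);
        }
        lines.push_back(std::move(row));
    }
    return lines;
}
```

For the range checking half, data.at(0) can be used in place of data[0]; at() throws std::out_of_range when the file has fewer lines than expected.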
You could read each one as a text line, then extract the numbers from the text.
std::vector<std::vector<int>> database;
std::string text_line;
while (std::getline(std::cin, text_line))
{
std::vector<int> numbers_row;
std::istringstream number_stream(text_line);
int value;
while (number_stream >> value)
{
numbers_row.push_back(value);
}
database.push_back(numbers_row);
}
The inner loop creates a vector based on the numbers in that text line.
The outer loop creates vectors and appends them to the database.
The common way is to use std::getline to read line by line, to put it in a std::istringstream and then to use formatted input from that stream - but you've already gotten answers for that so I'll throw in an alternative.
This uses formatted input directly from the ifstream (or any istream descendant) and peek()s to see if a newline (\n) is the next character.
It could look like this:
#include <istream>
#include <vector>
std::vector<std::vector<int>> read_stream(std::istream& is) {
std::vector<std::vector<int>> res;
std::vector<int> cur;
int x;
while(true) {
if(is.peek() == '\n') {
// save what we've gotten on the current line
res.emplace_back(std::move(cur));
cur.clear();
}
if(is >> x) cur.push_back(x);
else break;
}
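// keep a final line that was not terminated by '\n'
if(!cur.empty()) res.emplace_back(std::move(cur));
return res;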
}

How to read the 2nd, 3rd and 4th line of a text file. C++

I am trying to read each line of a text file in C++. I have it working for the first line in a text file, but how do I read the others? Also, I am a C++ super noob, please don't roast me.
Code:
void Grade::readInput()
{
ifstream infile("Grades.txt");
if (infile.good())
{
string sLine;
getline(infile, sLine);
cout << "This is the val " << sLine << endl;
}
infile.close();
}
This is the output I want:
This is the val 12
This is the val 17
This is the val 1
This is the val 29
I'll give you some hints - this is probably more like CodeReview.SE now...
I would recommend
separating the reading from printing
treating the grades as numbers (e.g. int or double) so you can actually work with them, and have some validation that you're not reading nonsense
using idiomatic loops instead of a bare if (infile.good()); you can't tell whether you've reached end-of-file before you try to read
Fixing up the interface, I'd suggest something like this
struct Grade {
void readInput(std::string fname);
void print(std::ostream& os = std::cout) const;
private:
static constexpr auto max_line_length = std::numeric_limits<std::streamsize>::max(); // needs <limits>
std::vector<int> grades;
};
Note that readInput reads into grades without printing anything. Note also that it takes the name of the file to read as the argument, instead of hardcoding some filename.
int main() {
Grade grading;
grading.readInput("grades.txt");
grading.print();
}
This would be the main program.
The simplified/extended version of readInput could be:
void Grade::readInput(std::string fname) {
std::ifstream infile(fname);
infile.ignore(max_line_length, '\n'); // ignore the first line
int grade = 0;
while (infile >> grade) {
grades.push_back(grade);
infile.ignore(max_line_length, '\n'); // ignore rest eating newline
}
}
Note how we ignore lines or parts thereof that we are not interested in. For extra control consider disabling white-space skipping:
infile >> std::noskipws;
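As a rough illustration of what that switch changes: once whitespace skipping is disabled, a formatted read no longer steps over leading spaces or newlines, so it fails where it would otherwise succeed. The helper can_read_int is hypothetical and exists only to show the difference.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: tries to read one int from text, with or without
// the default skipping of leading whitespace.
bool can_read_int(const std::string& text, bool skip_whitespace) {
    std::istringstream in(text);
    if (!skip_whitespace) {
        in >> std::noskipws;
    }
    int value = 0;
    return static_cast<bool>(in >> value);
}
```

Here can_read_int("  12", true) succeeds and can_read_int("  12", false) fails, which means a loop that disables skipping has to consume separators and newlines itself.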
The print function could be a simple:
void Grade::print(std::ostream& os) const {
os << "Grades:";
for (int g : grades) {
os << " " << g;
}
os << std::endl;
}
Full Demo
#include <fstream>
#include <iostream>
#include <limits>
#include <string>
#include <vector>
struct Grade {
void readInput(std::string fname);
void print(std::ostream& os = std::cout) const;
private:
static constexpr auto max_line_length = std::numeric_limits<std::streamsize>::max();
std::vector<int> grades;
};
int main() {
Grade grading;
grading.readInput("grades.txt");
grading.print();
}
void Grade::readInput(std::string fname) {
std::ifstream infile(fname);
infile.ignore(max_line_length, '\n'); // ignore the first line
int grade = 0;
while (infile >> grade) {
grades.push_back(grade);
infile.ignore(max_line_length, '\n'); // ignore rest eating newline
}
}
void Grade::print(std::ostream& os) const {
os << "Grades:";
for (int g : grades) {
os << " " << g;
}
os << std::endl;
}
Prints
Grades: 12 17 1 29
Given grades.txt:
Ignore the first line
12
17
1
29
A simple version:
std::string line1;
std::string line2;
std::string line3;
std::string line4;
std::getline(infile, line1);
std::getline(infile, line2);
std::getline(infile, line3);
std::getline(infile, line4);
With a loop:
static const unsigned int LINES_TO_READ = 3;
std::string line1_ignored;
std::getline(infile, line1_ignored);
for (unsigned int i = 0; (i < LINES_TO_READ); ++i)
{
std::string text_line;
if (std::getline(infile, text_line))
{
std::cout << text_line << std::endl;
}
else
{
break;
}
}
Both versions read the first line. Ignore the contents in the variable if you wish.
The simple method reads each text line, into separate variables.
The second reads text lines using a known-quantity loop.

How to terminate using getline()

I am trying to store some integers in a file, using ',' as the delimiter. When I read the file back, I read each line with getline() and use a tokenizer to split it. However, I cannot terminate the inner loop; I need some bool condition for getline to stop on.
while(getline(read,line)) {
std::cout<<line<<std::endl;
std::istringstream tokenizer(line);
std::string token;
int value;
while(????CONDN???) {
getline(tokenizer,token,',');
std::istringstream int_value(token);
int_value>>value;
std::cout<<value<<std::endl;
}
}
Please advise.
In your case it is enough to use getline in the same way as you do in the outer loop:
while(getline(tokenizer, token, ','))
That said, I would most likely do something like this:
while(std::getline(read,line)) { // read line by line
std::replace(line.begin(), line.end(), ',', ' ' ); // get rid of commas
std::istringstream tokenizer(line);
int number;
while(tokenizer >> number) // read the ints
std::cout<<number<<std::endl;
}
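For comparison, here is a sketch of the first suggestion with getline as the inner loop condition, wrapped in a function. The name parse_comma_separated is an assumption, as is the choice to skip fields that do not parse as an int.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line on ',' and keeps the fields that
// parse as int; empty or non-numeric fields are skipped.
std::vector<int> parse_comma_separated(const std::string& line) {
    std::vector<int> values;
    std::istringstream tokenizer(line);
    std::string token;
    // getline fails once the line is exhausted, which ends the loop.
    while (std::getline(tokenizer, token, ',')) {
        std::istringstream field(token);
        int value = 0;
        if (field >> value) {
            values.push_back(value);
        }
    }
    return values;
}
```

Unlike the std::replace variant, this keeps empty fields visible as separate tokens, so a program could report them instead of skipping them.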
And two other alternatives - that use Boost.
String Algorithms:
#include <boost/algorithm/string.hpp>
...
std::vector<std::string> strings;
boost::split(strings, "1,3,4,5,6,2", boost::is_any_of(","));
or tokenizer:
#include <boost/tokenizer.hpp>
typedef boost::char_separator<char> separator_t;
typedef boost::tokenizer<separator_t> tokenizer_t;
...
separator_t sep(","); // split on commas
tokenizer_t tokens(line, sep);
for(tokenizer_t::iterator it = tokens.begin(); it != tokens.end(); ++it)
std::cout << *it << std::endl;
If you expect to encounter non-int, non-separator characters, e.g. 1 3 2 XXXX 4, then you'll have to decide what to do in such a case. tokenizer >> number will stop at something that is not an int, and the istringstream error flags will be set. boost::lexical_cast is also your friend:
#include <boost/lexical_cast.hpp>
...
try
{
int x = boost::lexical_cast<int>( "1a23" );
}
catch(const boost::bad_lexical_cast &)
{
std::cout << "Error: input string was not valid" << std::endl;
}
Finally, in C++11 you have the stoi/stol/stoll functions:
#include <iostream>
#include <string>
int main()
{
std::string test = "1234";
std::cout << std::stoi(test) << std::endl;
}
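One difference from boost::lexical_cast is worth a sketch: std::stoi stops at the first character it cannot use, so std::stoi("1a23") returns 1 rather than reporting an error. The wrapper below (to_int is a hypothetical name) uses the optional position argument of std::stoi to reject such input.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: returns true only if the whole string is one int.
bool to_int(const std::string& text, int& out) {
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);
        if (used != text.size()) {
            return false;  // trailing characters, e.g. "1a23"
        }
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits at all
    } catch (const std::out_of_range&) {
        return false;  // does not fit in an int
    }
}
```

The output parameter is left untouched on failure, so a caller can keep a default value in it.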

Reading a mix of integers and characters from a file in C++

I have some trouble reading a file in C++. I am able to read only integers or only letters, but I am not able to read a mix of both, for example 10af or ff5a. My procedure is as follows:
int main(int argc, char *argv[]) {
if (argc < 2) {
std::cerr << "You should provide a file name." << std::endl;
return -1;
}
std::ifstream input_file(argv[1]);
if (!input_file) {
std::cerr << "I can't read " << argv[1] << "." << std::endl;
return -1;
}
std::string line;
for (int line_no = 1; std::getline(input_file, line); ++line_no) {
//std::cout << line << std::endl;
-----------
}
return 0;
}
So what I am trying to do is let the user specify the input file to read, and I am using getline to obtain each line. I can use tokens to read only integers or only letters, but I am not able to read a mix of both. If my input file is
2 1 89ab
8 2 16ff
What is the best way to read this file?
Thanks a lot in advance for your help!
I'd use a std::stringstream, and use std::hex since 89ab and 16ff look like hex numbers.
Should look like this:
std::string line;
for (int line_no = 1; std::getline(input_file, line); ++line_no)
{
std::stringstream ss(line);
int a, b, c;
ss >> a;
ss >> b;
ss >> std::hex >> c;
}
You will need to #include <sstream>
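Wrapped in a function with a success check, the same idea might look like the sketch below. The name parse_record and the two-decimal-then-one-hex layout are assumptions taken from the sample input.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical record layout: two decimal ints followed by one hex int.
bool parse_record(const std::string& line, int& a, int& b, int& c) {
    std::stringstream ss(line);
    // std::hex stays in effect on ss once applied, so it is inserted
    // only after the two decimal fields have been read.
    return static_cast<bool>(ss >> a >> b >> std::hex >> c);
}
```

For the line "2 1 89ab" this yields a == 2, b == 1 and c == 0x89ab. Hex values too large for an int would make the read fail, in which case an unsigned type may be the better fit.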
Using
std::string s;
while (input_file >> s) {
//add s to an array or process s
...
}
you can read inputs of type std::string, which can be any combination of digits and letters. You don't necessarily need to read the input line by line and then parse it; the >> operator treats both spaces and newlines as delimiters.

How to read formatted data in C++?

I have formatted data like the following:
Words 5
AnotherWord 4
SomeWord 6
It's in a text file and I'm using ifstream to read it, but how do I separate the number and the word? The word will only consist of alphabets and there will be certain spaces or tabs between the word and the number, not sure of how many.
Assuming there is no whitespace within the "word" (otherwise it would not really be one word), here is a sample of how to read up to the end of the file:
std::ifstream file("file.txt");
std::string str;
int i;
while(file >> str >> i)
std::cout << str << ' ' << i << std::endl;
The >> operator is overloaded for std::string and uses whitespace as a separator, so:
ifstream f("file.txt");
string str;
int i;
while (f >> str >> i) // the loop ends as soon as a read fails
{
// do work
}
sscanf is good for that:
#include <cstdio>
#include <cstdlib>
int main ()
{
char sentence []="Words 5";
char str [100];
int i;
sscanf (sentence,"%99s %d",str,&i);
printf ("%s -> %d\n",str,i);
return EXIT_SUCCESS;
}
It's actually very easy, you can find the reference here
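When using sscanf this way it helps to check its return value, which is the number of fields that were converted. A small sketch follows; the function name parse_word_and_number and the 100-byte buffer are assumptions for illustration.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: succeeds only if both the word and the number
// were converted.
bool parse_word_and_number(const char* line, std::string& word, int& number) {
    char buffer[100];
    // %99s limits the word so the terminating '\0' still fits in buffer.
    if (std::sscanf(line, "%99s %d", buffer, &number) != 2) {
        return false;
    }
    word = buffer;
    return true;
}
```

The single space in the format string matches any amount of whitespace, including tabs, so the unknown spacing between the word and the number is not a problem.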
If you are using tabs as delimiters, you can use getline instead and set the delim argument to '\t'.
A longer example would be:
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
struct Line {
std::string text;
int number;
};
int main(){
std::ifstream is("myfile.txt");
std::vector<Line> lines;
while (is){
Line line;
// std::ws skips the newline left over from the previous line
std::getline(is >> std::ws, line.text, '\t');
is >> line.number;
if (is){
lines.push_back(line);
}
}
for (std::size_t i = 0 ; i < lines.size() ; ++i){
std::cout << "Line " << i << " text: \"" << lines[i].text
<< "\", number: " << lines[i].number << std::endl;
}
}