Hi, I'd like to ask how to parse multiple integers, separated by "/" and spaces, from a string.
The text format from the file is "f 1/1/1 2/2/2 3/3/3 4/4/4".
I need to parse every integer from this line of text into several int variables, which are then used to construct a "face" object (see below).
int a(0),b(0),c(0),d(0),e(0);
int t[4]={0,0,0,0};
//parsing code goes here
faces.push_back(new face(b,a,c,d,e,t[0],t[1],t[2],t[3],currentMaterial));
I could do it with sscanf(), but I've been warned away from that by my uni lecturer, so I am looking for an alternative. I am also not allowed to use third-party libraries, including Boost.
Regular expressions and parsing with stringstream() have been mentioned, but I don't really know much about either, and would appreciate some advice.
If you're reading the file with std::ifstream, there's no need for std::istringstream in the first place (although using the two is very similar because they inherit from the same base class). Here's how to do it with std::ifstream:
ifstream ifs("Your file.txt");
vector<int> numbers;
while (ifs)
{
while (ifs.peek() == ' ' || ifs.peek() == '/')
ifs.get();
int number;
if (ifs >> number)
numbers.push_back(number);
}
Taking into account your example f 1/1/1 2/2/2 3/3/3 4/4/4 what you need to read is: char int char int char int int char int char int int char int char int
To do this:
istringstream is(str);
char f, c;
int d[12];
bool success = (is >> f) && (f == 'f')
&& (is >> d[0]) && (is >> c) && (c == '/')
&& (is >> d[1]) && (is >> c) && (c == '/') &&
..... && (is >> d[11]);
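The elided chain above can also be written as a loop. This is only a minimal sketch, assuming the line always holds exactly four triples; parse_face is a hypothetical helper name, not something from the question:

```cpp
#include <array>
#include <sstream>
#include <string>

// Hypothetical helper: parses "f 1/1/1 2/2/2 3/3/3 4/4/4" into 12 ints.
// Returns false if the line does not have that shape.
bool parse_face(const std::string& line, std::array<int, 12>& d)
{
    std::istringstream is(line);
    char f = 0;
    if (!(is >> f) || f != 'f')
        return false;

    for (int i = 0; i < 12; ++i) {
        if (!(is >> d[i]))
            return false;
        // The first two numbers of every triple must be followed by '/'.
        if (i % 3 != 2) {
            char c = 0;
            if (!(is >> c) || c != '/')
                return false;
        }
    }
    return true;
}
```

Because operator>> skips whitespace before each character, this also accepts input such as "1 / 1 / 1"; add std::noskipws handling if that matters.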
The way I would do this is to change the interpretation of space to include the other separators. If I were to get fancy, I would use different std::istream objects, each with a std::ctype<char> facet set up to deal with one separator, and a shared std::streambuf.
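A rough sketch of the simpler variant, a single stream whose facet counts '/' as whitespace. The names slash_is_space and read_face_indices are made up for illustration, and the input is assumed to be the face line from the question:

```cpp
#include <istream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// A ctype facet that classifies '/' as whitespace in addition to the
// usual whitespace characters, so operator>> skips it automatically.
class slash_is_space : public std::ctype<char>
{
public:
    slash_is_space() : std::ctype<char>(make_table()) {}

private:
    static const mask* make_table()
    {
        // Start from the classic "C" locale table and add the space bit for '/'.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table['/'] |= space;
        return table.data();
    }
};

// Hypothetical helper: reads every integer from a line like "f 1/1/1 2/2/2".
std::vector<int> read_face_indices(const std::string& line)
{
    std::istringstream in(line);
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue(std::locale(in.getloc(), new slash_is_space));

    std::vector<int> values;
    std::string tag;
    if (in >> tag && tag == "f") {
        int v;
        while (in >> v)
            values.push_back(v);
    }
    return values;
}
```

The trade-off is that the stream no longer checks where the slashes are: "1//2" and "1 2" read the same way, so this suits input you already trust.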
If you want to make the use of separators explicit, you could instead use a suitable manipulator to skip the separator or, if it is absent, indicate failure:
template <char Sep>
std::istream& sep(std::istream& in) {
    // Skip leading whitespace, then require the separator character.
    if ((in >> std::ws).peek() != std::char_traits<char>::to_int_type(Sep)) {
        in.setstate(std::ios_base::failbit);
    }
    else {
        in.ignore();
    }
    return in;
}
std::istream& (* const slash)(std::istream&) = sep<'/'>;
The code isn't tested and was typed on a mobile device, so it probably contains small errors. You'd read data like this:
if (in >> v1 >> v2 >> slash >> v3 /*...*/) {
deal_with_input(v1, v2, v3);
}
Note: the above use assumes input as
1.0 2.0/3.0
i.e. a space after the first value and a slash after the second value.
You can use boost::split.
Sample example is:
string line("test\ttest2\ttest3");
vector<string> strs;
boost::split(strs,line,boost::is_any_of("\t"));
cout << "* size of the vector: " << strs.size() << endl;
for (size_t i = 0; i < strs.size(); i++)
cout << strs[i] << endl;
more information here:
http://www.boost.org/doc/libs/1_51_0/doc/html/string_algo.html
and also related:
Splitting the string using boost::algorithm::split
Beginner programmer here,
Say I want to obtain initial starting coordinates in the form (x,y), so I ask the user to enter in a point using the specific form "(x,y)". Is there a way that I could recognize the format and parse the string so that I could obtain the x and y values?
Read a line of text using:
char line[200]; // Make it large enough.
int x;
int y;
fgets(line, sizeof(line), stdin);
Then, use sscanf to read the numbers from the line of text.
if ( sscanf(line, "(%d,%d)", &x, &y) != 2 )
{
// Deal with error.
}
else
{
// Got the numbers.
// Use them.
}
If you want to use iostreams instead of stdio, use getline instead of fgets.
You can use regex to find the matching sequence anywhere in the input.
#include <iostream>
#include <string>
#include <regex>
int main() {
std::string line;
std::getline(std::cin, line);
std::regex e(R"R(\(([-+]?(?:\d*[.])?\d+)\s*,\s*([-+]?(?:\d*[.])?\d+)\))R");
std::smatch sm;
if (std::regex_search(line, sm, e)) {
auto x = std::stod(sm[1]);
auto y = std::stod(sm[2]);
std::cout << "Numbers are: " << x << ", " << y << std::endl;
}
return 0;
}
In order to parse stuff you need a parser. There are many ways to write a parser, but generally parsers read tokens and decide what to do next based on which token it is. (The terms "parser" and "token" are important; look them up.)
In your case you don't have to explicitly introduce a separate token entity. Reading elements from the input stream with the >> operator will do.
You need to:
read a character; verify it's a '('
read a number
read a character; verify it's a ','
read a number
read a character; verify it's a ')'
If any step fails, the entire parsing fails.
You can see the same basic step is done three times, so you can write a function for it.
bool expect_char(std::istream& is, char what)
{
char ch;
return is >> ch && ch == what;
}
This works because is >> ch returns the stream after the read operation, and the stream can be viewed as a boolean value: true if the last operation succeeded, false otherwise.
Now you can compose your parser:
bool get_vector (std::istream& is, int& x, int& y)
{
return expect_char(is, '(') &&
is >> x &&
expect_char(is, ',') &&
is >> y &&
expect_char(is, ')');
}
This method has a nice property that blanks are allowed between the numbers and the symbols.
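To see the whitespace tolerance in action, the two functions can be driven from a std::istringstream. The sketch below repeats them so that it compiles on its own; parse_point is a hypothetical wrapper added only for the demonstration:

```cpp
#include <istream>
#include <sstream>
#include <string>

bool expect_char(std::istream& is, char what)
{
    char ch;
    return is >> ch && ch == what;
}

bool get_vector(std::istream& is, int& x, int& y)
{
    return expect_char(is, '(') &&
           is >> x &&
           expect_char(is, ',') &&
           is >> y &&
           expect_char(is, ')');
}

// Hypothetical wrapper: parses a point from a string instead of a stream.
bool parse_point(const std::string& text, int& x, int& y)
{
    std::istringstream is(text);
    return get_vector(is, x, y);
}
```

With this, parse_point("( 3 , 4 )", x, y) succeeds just like parse_point("(3,4)", x, y), while "(3;4)" is rejected at the comma check.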
Now this may look like a lot of stuff to type compared to the solution that uses sscanf:
bool get_numbers2 (std::istream& is, int& x, int& y)
{
std::string s;
return std::getline(is, s) &&
(std::sscanf(s.c_str(), "(%d,%d)", &x, &y) == 2);
}
But sscanf is:
dangerous (there is no typechecking of its arguments)
less powerful (the input format is very rigid)
not generic (doesn't work in a template)
It's OK to use the scanf functions family where appropriate, but I don't recommend it for new C++ programmers.
How do I get rid of the leading ' ' and '\n' symbols when I'm not sure I'll get a cin, before the getline?
Example:
int a;
char s[1001];
if(rand() == 1){
cin >> a;
}
cin.getline(s, sizeof(s));
If I put a cin.ignore() before the getline, I may lose the first symbol of the string, so is my only option to put it after every use of 'cin >>'? That's not a very efficient way to do it when you are working on a big project.
Is there a better way than this:
int a;
string s;
if(rand() == 1){
cin >> a;
}
do getline(cin, s); while(s == "");
Like this:
std::string line, maybe_an_int;
if (rand() == 1)
{
if (!std::getline(std::cin, maybe_an_int))
{
std::exit(EXIT_FAILURE);
}
}
if (!std::getline(std::cin, line))
{
std::exit(EXIT_FAILURE);
}
int a = std::stoi(maybe_an_int); // this may throw an exception
You can parse the string maybe_an_int in several different ways. You could also use std::strtol, or a string stream (under the same condition as the first if block):
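std::istringstream iss(maybe_an_int);
int a;
// Succeed only if an int parses and nothing but whitespace follows it.
// eof() is checked rather than get(), because std::ws may set failbit
// when the stream is already at its end.
if (!(iss >> a) || !(iss >> std::ws).eof())
{
    std::exit(EXIT_FAILURE);
}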
You could of course handle parsing errors more gracefully, e.g. by running the entire thing in a loop until the user inputs valid data.
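As a sketch of that retry loop, under the assumption that each attempt is one full line of input: prompt_for_int is a hypothetical helper, and it takes the streams as parameters so it can be used with std::cin and std::cout or tested with string streams.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: keeps reading lines until one holds exactly one int.
// Returns false only if the input ends before a valid line is seen.
bool prompt_for_int(std::istream& in, std::ostream& out, int& value)
{
    std::string line;
    for (;;) {
        out << "Enter an integer: ";
        if (!std::getline(in, line))
            return false; // input ended, nothing more to try

        std::istringstream iss(line);
        int parsed = 0;
        char extra = 0;
        // Accept the line if an int parses and no non-whitespace
        // character follows it.
        if ((iss >> parsed) && !(iss >> extra)) {
            value = parsed;
            return true;
        }
        out << "Not a valid integer, try again.\n";
    }
}
```

A call such as prompt_for_int(std::cin, std::cout, a) then replaces the std::exit branches above.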
Both the space character and the newline character are classified as whitespace by standard IOStreams. If you are mixing formatted I/O with unformatted I/O and you need to clear the stream of residual whitespace, use the std::ws manipulator:
if (std::getline(std::cin >> std::ws, s)) {
}
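Applied to the situation in the question, where the formatted read may or may not happen, that could look roughly like this. read_record and its with_number flag are hypothetical stand-ins for the rand() == 1 condition:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: optionally reads an int, then reads the next line
// of text, whether or not the int was read first.
bool read_record(std::istream& in, bool with_number, int& number,
                 std::string& text)
{
    if (with_number && !(in >> number))
        return false;
    // std::ws discards the newline left behind by operator>> (and any
    // other leading whitespace) before getline starts reading.
    return static_cast<bool>(std::getline(in >> std::ws, text));
}
```

Note that this also drops leading spaces and blank lines that belong to the text itself, so it is only appropriate when those are not significant.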
I have a c_str which contains [51,53]. I want to split these pairs in two integers. They are in a c_str because I read them from an input file.
There must be an easy way to parse them. I was thinking of using the .at function, but I am sure I made it way too complicated. Plus, it does not work, since it outputs:
pair: 0x7ffffcfc2998
pair: 0x7ffffcfc2998
etc
string pairstring = buffertm.c_str();
stringstream pair1, pair2;
int pairint1, pairint2;
pair1 << pairstring.at(1) << pairstring.at(2);
cout << "pair: " << pair1;
pair1 >> pairint1;
pair2 << pairstring.at(4) << pairstring.at(5);
//cout << "pair: " << pair2;
pair2 >> pairint2;
Any better ways to do this?
Something like this:
char c1, c2, c3;
int first, second;
std::istringstream iss(str);
if (iss >> c1 >> first >> c2 >> second >> c3
&& c1 == '[' && c2 == ',' && c3 == ']' )
{
// success
}
You might want to throw in an additional check to see if there are more characters after the closing bracket:
if ((iss >> std::ws).peek() != EOF) {
//^^^^^^^^^^^^^^
// eats whitespace chars and returns reference to iss
/* there are redundant characters */
}
Try this using C++11 std::stoi:
char c_str[] = "[51,53]";
std::string s(c_str);
int num1 = std::stoi(s.substr(1, 2));
int num2 = std::stoi(s.substr(4, 2));
If the numbers may fall outside the range 10-99, use this instead:
char c_str[] = "[5156789,5]";
std::string s(c_str);
s.assign(s.substr(1, s.size() - 2)); // Trim away '[' and ']'
std::string::size_type middle = s.find(','); // Find position of ','
int num1 = std::stoi(s.substr(0, middle));
int num2 = std::stoi(s.substr(middle + 1, s.size() - (middle + 1)));
The function stoi will throw std::invalid_argument if the number can't be parsed, and std::out_of_range if it does not fit in an int.
Edit:
For a more robust solution that will only parse base-10 numbers, you should use something like this:
char c_str[] = "[51,0324]";
int num1, num2;
try {
    std::string s(c_str);
    s.assign(s.substr(1, s.size() - 2)); // Trim away '[' and ']'
    std::string::size_type middle = s.find(','); // Find position of ','
    std::size_t pos = 0; // Receives how many characters stoi consumed.
    std::string numText1 = s.substr(0, middle);
    num1 = std::stoi(numText1, &pos); // Try parsing first number.
    if (pos < numText1.size()) {
        // Trailing garbage after the number: report the offending character.
        throw std::invalid_argument{std::string(1, numText1.at(pos))};
    }
    std::string numText2 = s.substr(middle + 1, s.size() - (middle + 1));
    num2 = std::stoi(numText2, &pos); // Try parsing second number.
    if (pos < numText2.size()) {
        throw std::invalid_argument{std::string(1, numText2.at(pos))};
    }
} catch (const std::invalid_argument& e) {
    std::cerr << "Could not parse number" << std::endl;
    std::exit(EXIT_FAILURE);
}
It will throw std::invalid_argument when asked to parse strings such as "1337h4x0r", whereas std::istringstream would silently read 1337 and stop at the 'h'.
For a number of reasons, I'd recommend a two-step process rather than trying to do it all at once.
Also, some of this depends heavily on what kinds of assumptions you can safely make.
1) Just tokenize. If you know that it will start with a "[", you know there's going to be a comma, and you know you'll only get two numbers, search the string for the "[", then use substr() to get everything between it and the comma. That's one token. Then do something similar from the comma to the "]" to get the second token.
2) Once you have two strings, it's relatively trivial to convert them to integers, and there are a number of ways to do it.
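The two steps could be sketched as follows. parse_pair is a hypothetical name, and the choice to throw std::invalid_argument for a missing delimiter is an assumption made here, not part of the answer:

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Step 1: cut out the text between '[' and ',' and between ',' and ']'.
// Step 2: convert each token with std::stoi.
// Throws std::invalid_argument if a delimiter or a number is missing.
std::pair<int, int> parse_pair(const std::string& text)
{
    const std::string::size_type open = text.find('[');
    if (open == std::string::npos)
        throw std::invalid_argument("missing '['");

    const std::string::size_type comma = text.find(',', open + 1);
    if (comma == std::string::npos)
        throw std::invalid_argument("missing ','");

    const std::string::size_type close = text.find(']', comma + 1);
    if (close == std::string::npos)
        throw std::invalid_argument("missing ']'");

    // Step 1: the two tokens.
    const std::string first = text.substr(open + 1, comma - open - 1);
    const std::string second = text.substr(comma + 1, close - comma - 1);

    // Step 2: the conversion. std::stoi skips leading whitespace itself.
    return {std::stoi(first), std::stoi(second)};
}
```

Keeping the steps apart means the conversion can later be swapped (for std::strtol, say) without touching the tokenizing code.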
A few answers have been added while I've been typing, but I still recommend a two-step approach.
If you're reasonably confident about the format of the string, you could put the whole thing into a stringstream, and then use the extraction operator; e.g.:
stringstream strm(pairstring);
int pairint1 = 0, pairint2 = 0;
char c = 0;
strm >> c >> pairint1 >> c >> pairint2 >> c;
In reality you'd want some error checking in there too, but hopefully that's enough of an idea to get you started.
Try splitting your string with the strtok function.
For the conversion, use atoi.
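A minimal sketch of that suggestion, assuming the input looks like "[51,53]"; split_pair is a hypothetical helper name:

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits "[51,53]" on '[', ',' and ']' with strtok
// and converts each piece with atoi.
std::vector<int> split_pair(const std::string& text)
{
    // strtok writes '\0' into its input, so work on a mutable copy.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<int> numbers;
    for (char* token = std::strtok(buffer.data(), "[,]");
         token != nullptr;
         token = std::strtok(nullptr, "[,]")) {
        // atoi returns 0 when the token is not a number, so errors are
        // indistinguishable from a genuine 0.
        numbers.push_back(std::atoi(token));
    }
    return numbers;
}
```

Both functions come from C: strtok keeps hidden state between calls and atoi cannot report failure, so std::strtol is the usual substitute when bad input must be detected.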
I have a c_str which contains [51,53]
No, c_str() is not a data structure, it's a method of std::basic_string that's returning a constant C string with data equivalent to those stored in the std::string.
string pairstring = buffertm.c_str(); is cumbersome; just do string pairstring = buffertm; or use buffertm directly.
Secondly, you're using your stringstream the wrong way: you should use an istringstream (here you are using it as an ostringstream):
int i, j;
istringstream iss(buffertm);
iss.get();  // skip '['
iss >> i;
iss.get();  // skip ','
iss >> j;
iss.get();  // skip ']'
I have this situation where I need to get two int values from each row inside a file with this format:
43=>113
344=>22
Is it possible to do something like setting a delimiter equal to => and then use the >> operator to assign the ints?
ifstream iFile("input.in");
int a,b;
iFile >> a >> b;
Can this also be done automatically on output, with a similar format?
oFile << a << b;
instead of
oFile << a << "=>" << b;
Thanks.
You can't do it directly, without any extra code when reading or
writing, but you can write a manipulator which handles it for
you more explicitly:
std::istream&
mysep( std::istream& source )
{
source >> std::ws; // Skip whitespace.
if ( source.get() != '=' || source.get() != '>' ) {
// We didn't find the separator, so it's an error
source.setstate( std::ios_base::failbit );
}
return source;
}
Then, if you write:
iFile >> a >> mysep >> b;
you will get an error if the separator is absent.
On output, you can use a similar manipulator:
std::ostream&
mysep( std::ostream& dest )
{
dest << "=>";
return dest;
}
This has the advantage of keeping the information as to what the
separator is isolated in these two specific functions (which
would be defined next to one another, in the same source file),
rather than spread out wherever you are reading or writing.
Also, these data presumably represent some particular type of
information in your code. If so, you should probably define it
as a class, and then define operators >> and << over that
class.
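A sketch of what such a class might look like. Mapping and its members from and to are invented names, since the question does not say what the two numbers mean:

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical type: one "a=>b" record from the file.
struct Mapping
{
    int from = 0;
    int to = 0;
};

// Reads "43=>113". On any mismatch the stream is put into a failed state
// and the target object is left untouched.
std::istream& operator>>(std::istream& in, Mapping& m)
{
    Mapping tmp;
    char eq = 0, gt = 0;
    if (in >> tmp.from >> eq >> gt >> tmp.to) {
        if (eq == '=' && gt == '>')
            m = tmp; // commit only on success
        else
            in.setstate(std::ios_base::failbit);
    }
    return in;
}

// Writes the record back in the same "a=>b" format.
std::ostream& operator<<(std::ostream& out, const Mapping& m)
{
    return out << m.from << "=>" << m.to;
}
```

Reading and writing then become iFile >> m and oFile << m, and the separator is spelled out in exactly one place. As written, the extraction skips whitespace, so "43 = > 113" is accepted too.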
Given a and b are variables of inbuilt types, you can not define your own user-defined operators for streaming them (the Standard library already provides such functions).
You could just write out code with the behaviour you want...
int a, b;
char eq, gt;
// this is probably good enough, though it would accept e.g. "29 = > 37" too.
// disable whitespace skipping with <iomanip>'s std::noskipws if you care....
if (iFile >> a >> eq >> gt >> b && eq == '=' && gt == '>')
...
OR wrap a and b into a class or struct, and provide user-defined operators for that. There are plenty of SO questions with answers explaining how to write such streaming functions.
OR write a support function...
#include <iomanip>
std::istream& skip_eq_gt(std::istream& is)
{
char eq = 0, gt = 0; // initialised so the check below is safe if a read fails
// save current state of skipws...
bool skipping = is.flags() & std::ios_base::skipws;
// putting noskipws between eq and gt means whatever the skipws state
// has been will still be honoured while seeking the first character - 'eq'
is >> eq >> std::noskipws >> gt;
// restore the earlier skipws setting...
if (skipping)
is.flags(is.flags() | std::ios_base::skipws);
// earlier ">>" operations may have set fail and/or eof, but check extra reasons to do so
if (eq != '=' || gt != '>')
is.setstate(std::ios_base::failbit);
return is;
}
...then use it like this...
if (std::cin >> a >> skip_eq_gt >> b)
...use a and b...
This function "works" because streams are designed to accept "io manipulator" functions that reconfigure some aspect of the stream (for example, std::noskipws), but for a function to be called it just has to match the prototype for an (input) io manipulator: std::istream& (std::istream&).
If you always have => as the delimiter, you can write a function that will parse the lines of the document.
void Parse(ifstream& i)
{
    string l;
    while (getline(i, l))
    {
        string::size_type sep = l.find("=>");
        if (sep == string::npos)
            continue; // No separator on this line, skip it.
        //First part
        string first = l.substr(0, sep);
        //Second part
        string second = l.substr(sep + 2);
        //Do whatever you want to do with them.
    }
}
I want to read data in files that are formatted like:
Point1, [3, 4]
I'm using the delimiters '[', ']' and ',' and replacing them with ' ' (a space). My code is now working fine. The problem is that once Point1, [3, 4] has been read, I want it to be unique: it should not be stored again if the same data appears later in the text file.
Here is what I have:
string line, name;
char filename[50];
int x,y;
cout << "Please enter filename : ";
cin >> filename;
ifstream myfile(filename);
if (myfile.is_open()) {
while ( myfile.good() ) {
getline(myfile, line);
for (unsigned int i = 0; i < line.size(); ++i) {
if (line[i] == '[' || line[i] == ']' || line[i] == ',') {
line[i] = ' ';
}
istringstream in(line);
in >> name >> x >> y;
}
cout <<name <<endl;
if (name=="Point") {
p.push_back(Point(x,y));
}
count++;
}
myfile.close();
cout << count;
}
else cout<< "Unable to open file";
How do I do this? I tried adding this after if (name=="Point"):
for (int j=0; j<p.size(); j++) {
if(p.getX != x && p.getY) != y) {
p.push_back(Point(x,y))
}
}
...but this is not working properly, as the data was not stored in the vector.
Can anyone help?
Instead of storing your data into vectors you can store it into sets. A set contains only unique values, so you don't have to check the uniqueness of your Points, the set will manage that.
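Instead of defining vector<Point>, you have to define set<Point>, and instead of using p.push_back to add points to your vector, you have to use p.insert. Note that std::set needs a way to order its elements, so Point must provide operator< (or you must pass a comparison function to the set).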
You can check the set documentation here.
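A small sketch of the set approach. The question does not show its Point class, so the struct below, its public x and y members, and the add_unique helper are all assumptions made for illustration:

```cpp
#include <set>
#include <tuple>

// Hypothetical Point type; the real class from the question is not shown.
struct Point
{
    int x;
    int y;
};

// std::set needs a strict weak ordering to detect duplicates.
// std::tie compares x first, then y.
bool operator<(const Point& a, const Point& b)
{
    return std::tie(a.x, a.y) < std::tie(b.x, b.y);
}

// Returns true if the point was new, false if it was already stored.
bool add_unique(std::set<Point>& points, int x, int y)
{
    return points.insert(Point{x, y}).second;
}
```

One consequence of this design is that the set keeps the points sorted by value rather than in the order they were read from the file.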
Assuming you want to keep your data stored in a std::vector<Point>, you could just check that no corresponding point already exists. Assuming there is an equality operator defined, it is as easy as this:
if (p.end() == std::find(p.begin(), p.end(), Point(x, y))) {
p.push_back(Point(x, y));
}
If your Point type doesn't have an equality operator and shouldn't get one, you can use a function object together with find_if() instead, e.g.:
if (p.end() == std::find_if(p.begin(), p.end(),
[=](Point const& v) { return x == v.x && y == v.y; })) {
...
You should separate your loops from other operations: The loop you propose to check if the point already exists is basically what std::find() does except that you insert a new point in each iteration! You first want to go through all existing points and see if it exists anywhere. Only if it does not, you'd insert a new point.
Note that you made a similar mistake when replacing characters in your string: you try to decode the string after checking each character. This isn't so much a semantic problem, but it is a major performance problem. You should replace the characters in a loop, and after that loop you should decode the string just once. Of course, when decoding the string you need to check that the format was OK, as it is always necessary to determine whether an input operation was successful. That is, after your loop replacing the characters you want something like:
std::istringstream in(line);
if (in >> name >> x >> y) {
// now process the data
}
... or, if you are like me and don't like naming things which are only around temporarily:
if (std::istringstream(line) >> std::ws >> name >> x >> y) {
As another note: checking a stream for good() is generally the wrong thing to do, because a stream can be in perfectly good condition except that it has seen EOF at some point and has thus set std::ios_base::eofbit. More importantly, you need to check after the input operation, not before! That is, your first loop should start with something like this:
while (std::getline(myfile, line)) {
...
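Putting the pieces of this answer together, the whole reading loop might look roughly like the sketch below. The Point struct and read_unique_points are assumptions (the question's Point class is not shown), and the name check from the question is left out because it is not clear which names should be accepted:

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical Point type with an equality operator for std::find.
struct Point
{
    int x;
    int y;
};

bool operator==(const Point& a, const Point& b)
{
    return a.x == b.x && a.y == b.y;
}

// Reads lines such as "Point1, [3, 4]" and keeps each (x, y) only once.
std::vector<Point> read_unique_points(std::istream& in)
{
    std::vector<Point> points;
    std::string line;
    while (std::getline(in, line)) { // test the read itself, not good()
        // First replace all delimiters with spaces...
        std::replace_if(line.begin(), line.end(),
                        [](char c) { return c == '[' || c == ']' || c == ','; },
                        ' ');
        // ...then decode the line exactly once and check that it worked.
        std::istringstream fields(line);
        std::string name;
        int x = 0, y = 0;
        if (!(fields >> name >> x >> y))
            continue; // malformed line, skip it

        // Search all existing points first; insert only if none matched.
        const Point candidate{x, y};
        if (std::find(points.begin(), points.end(), candidate) == points.end())
            points.push_back(candidate);
    }
    return points;
}
```

Called as read_unique_points(myfile), this keeps the points in file order, at the cost of a linear search for every line read.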