Tokenizing string in c++ using stl [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Splitting a string in C++
I am trying to read data from a file where each line has 15 fields separated by commas and spaces. The fields are not all of the same type. Currently I read the data line by line, pass each line to an istringstream, and between each read I do the following:
ins.ignore(25,','); //ins is the istringstream
However, I don't like this method and would like a cleaner one. What would be a better way of doing it? Also, I would like to use only the STL and no external libraries. Basically, what I want is to tokenize each line using the comma as the delimiter.

Just use a custom manipulator:
std::istream& comma(std::istream& in) {
if ((in >> std::ws).get() != std::char_traits<char>::to_int_type(',')) {
in.setstate(std::ios_base::failbit);
}
return in;
}
...
in >> v0 >> comma >> v1 >> comma ...
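To make the manipulator concrete, here is a minimal sketch that parses a three-field line with it. The `Record` struct, its fields, and the `parse_record` helper are hypothetical names introduced only for illustration; the real data has 15 fields of whatever types the file uses.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Manipulator from the answer above: skip whitespace, then require a ','.
std::istream& comma(std::istream& in) {
    if ((in >> std::ws).get() != std::char_traits<char>::to_int_type(',')) {
        in.setstate(std::ios_base::failbit);
    }
    return in;
}

// Hypothetical record type, used only for this illustration.
struct Record {
    int id;
    double value;
    std::string name;
};

// Parses "id, value, name"; returns false if a field or a comma is missing.
bool parse_record(const std::string& line, Record& out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out.id >> comma >> out.value >> comma >> out.name);
}
```

Because the manipulator sets failbit when the next non-whitespace character is not a comma, a malformed line makes the whole extraction chain fail instead of silently misreading fields.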

Cleaner method (if I understand right) is just to read the comma into a dummy variable
char comma;
ins >> comma;
This will skip any whitespace and then read the comma, which you can then ignore.

Related

How to keep reading using cin until a blank line is reached

I am trying to use standard input(cin) to read in inputs until a blank line is hit. I tried many times but still fail to achieve it. Can anyone help me out?
The following is the input format. Notes: 1. Lines starting with // are comments. 2. Comments can appear anywhere after the second line of the input, so I also need to skip them, and I'm not sure how to do that. 3. The first line has to be a single letter. 4. The second line has to be an integer.
A
8
good
great
amazing
wonderful
fantastic
terrific
//These are some random comments
brilliant
genius
stackoverflow
The following is what I have right now. I'm trying to use getline, but the program just reads in the first two lines (the letter and the number) and then ends. Not sure what is going wrong:
void read() {
vector<string> my_vec;
char my_letter;
cin >> my_letter;
int my_num;
cin >> my_num;
string current_word;
while (getline(cin, current_word)) {
if (current_word.empty()) {
break;
}
if (current_word[0] != '/' ) {
my_vec.push_back(current_word);
}
}
}
The extraction cin >> my_num; does not extract the newline (which is whitespace), so the next getline call extracts an empty line.
Alternative ways to solve this:
Always use line-based string extraction and subordinate string streams.
Use std::cin >> my_num >> std::ws to gobble whitespace.
Use std::cin.ignore(1, '\n') to gobble the one newline.
Use a dummy std::getline(std::cin, current_word) call to gobble the one newline.

Switching from formatted to unformatted input in C++

I have an input text file. The first line has two int numbers a and b, and the second line is a string. I want to use formatted input to do file >> a >> b, and then unformatted input to get the characters of the string one by one. In between the two steps, I need to skip over the '\n' character at the end of the first line. I used
while(file.get()<=' ' && !file.eof()); // skip all unprintable chars
if(!file.eof()) file.unget(); // keep the eof sign once triggered
to make the input format more flexible. The user can now separate the numbers a and b from the string with an arbitrary number of newlines '\n', tabs '\t', and/or spaces ' ', the same freedom they have when separating a from b. It also handles reading, on Linux, a text file copied from Windows, where every end of line is "\r\n".
Is there an ifstream function that does the same thing (skip all chars <=' ' until the next printable char or EOF is reached)? The ignore function does not seem to do that.
Yes, there is: the std::ws manipulator. It skips whitespace characters until a non-whitespace character is found or the end of the stream is reached. It is similar to using a whitespace character in a scanf format string.
In fact, it is used by formatted input before actually starting to parse characters.
You can use it like this:
int x;
std::string str;
std::cin >> x >> std::ws;
std::getline(std::cin, str);
//...
//std::vector<int> vec;
for(auto& e: vec) {
std::cin >> e;
}
std::getline(std::cin >> std::ws, str);

Sentence into word c++ string [duplicate]

This question already has answers here:
How do I iterate over the words of a string?
(84 answers)
Closed 8 years ago.
How to break down a sentence of string type into words and store it in a vector of string type in c++?
Example
String str="my name";
Into
Vector word={" my","name"}
You can write a simple loop:
std::vector<std::string> words;
std::istringstream is("my name");
std::string word;
while (is >> word) {
// ...
words.push_back(word);
// ...
}
which in my opinion is a good idea because you'll most likely need to do other things with those words apart from simply extracting them. The body of the loop can be easily extended.
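As one possible illustration of extending the loop body, the sketch below lower-cases each word before storing it. The function name `split_words_lower` and the choice of lower-casing are assumptions made for the example, not part of the original answer.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits on whitespace and lower-cases each word.
std::vector<std::string> split_words_lower(const std::string& sentence) {
    std::vector<std::string> words;
    std::istringstream is(sentence);
    std::string word;
    while (is >> word) {
        // Extra per-word processing goes here; lower-casing is just an example.
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
        words.push_back(word);
    }
    return words;
}
```

Since `operator>>` skips leading whitespace, runs of spaces between words do not produce empty entries.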

Reading Data from a Text File and Ignoring Others

For a small portion of my project, I'm supposed to extract data from a text file using cin; the program knows which file to read from based on command line arguments. My issue is how to extract the four pieces of data and ignore the commas. For example, the .txt file will look like the following:
(1,2,3,.)
(2,1,3,#)
(3,1,0,.)
In this case I need to extract the 1, the 2, the 3, and the . from the first line, then move on to the second line. When a blank line is reached, I can exit the getline() while loop.
I know I need to use getline() and I was able to extract the data by using the .at() function of the string generated by getline(). I became confused however when a coordinate such as the 1, the 2, or the 3, could be double digits. When this happened, my previous algorithm didn't work so I feel I'm overthinking things and there should be a simpler way to parse this data.
Thanks!
You can just use the >> operator to a dummy 'char' variable to read in the separators. This assumes you don't care about the 4th token and that it's always a single character:
char ch;
while (ss >> ch)
{
int a,b,c;
ss >> a >> ch >> b >> ch >> c >> ch >> ch >> ch;
}
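If the fourth token does matter, the same dummy-variable idea can be wrapped in a small function that parses one line obtained from getline and checks the punctuation. This is only a sketch: the `Cell` struct and the `parse_cell` name are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for one "(a,b,c,symbol)" line.
struct Cell {
    int a;
    int b;
    int c;
    char symbol;
};

// Returns true only if the line matches "(int,int,int,char)".
bool parse_cell(const std::string& line, Cell& out) {
    std::istringstream ss(line);
    char open = 0, sep1 = 0, sep2 = 0, sep3 = 0, close = 0;
    if (!(ss >> open >> out.a >> sep1 >> out.b >> sep2 >> out.c >> sep3
             >> out.symbol >> close)) {
        return false;  // a field was missing or not a number
    }
    return open == '(' && sep1 == ',' && sep2 == ',' && sep3 == ',' && close == ')';
}
```

Because the integers are read with `>>`, multi-digit coordinates such as 12 or 30 need no special handling.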
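A simple approach is to use sscanf: pass the string you read from cin to it as the first argument (as a C string, e.g. s.c_str() for a std::string), followed by one pointer for each of the four conversions:
sscanf(s, "(%d,%d,%d,%c)", &a, &b, &c, &ch);
If you want to parse the string from scratch, just focus on the pattern.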
In this case, the pattern is
'(', number, ',', number, ',', number, ',', char, ')'
So you can locate the three commas, then simply extract three numbers from between them.
A more complicated method is regex.
C++ has native support for that only since C++11 (std::regex in <regex>); before that you need a library such as Boost.

getline end of line?

I have a text file with binary values in n columns and y rows.
I am using getline to extract every row of the file, where each row consists of a series of '0' or '1' separated by spaces, and assign them to a vector:
std::vector< std::vector<int> > matrix; // to hold everything.
std::string line;
while(std::getline(file,line))
{
std::stringstream linestream(line);
int a,b,c,d;
linestream >> a >> sep >> b >> sep >> c >> sep >> d;
std::vector <int> vi;
vi.push_back(a);
vi.push_back(b);
vi.push_back(c);
vi.push_back(d);
matrix.push_back(vi);
}
Now the problem is that I do not know in advance how many columns there are in the file. How can I loop through every line until I reach the end of that line?
The obvious way would be something like:
while (linestream >> temp >> sep)
vi.push_back(temp);
Though this may well fail for the last item, which may not be followed by a separator. You have a couple of choices to handle that properly. One would be the typical "loop and a half" idiom. Another would be a locale that treats your separator characters as white space.
When/if you do that, you can/could also use a standard algorithm:
std::copy(std::istream_iterator<int>(linestream),
std::istream_iterator<int>(),
std::back_inserter(vi));
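For the whitespace-separated case described in the question, no custom locale is needed, and the whole read can be sketched as below. The function name `read_matrix` and the `std::istream&` parameter are assumptions for illustration.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: one row per line, any number of ints per row.
std::vector<std::vector<int>> read_matrix(std::istream& file) {
    std::vector<std::vector<int>> matrix;
    std::string line;
    while (std::getline(file, line)) {
        std::istringstream linestream(line);
        std::vector<int> row;
        // Copy ints until the line's stream runs out of input.
        std::copy(std::istream_iterator<int>(linestream),
                  std::istream_iterator<int>(),
                  std::back_inserter(row));
        matrix.push_back(row);
    }
    return matrix;
}
```

Each row ends up as long as its line, so rows of different lengths are accepted as-is; a blank line yields an empty row, which a caller may want to skip.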
Why not
while (in1 >> i) row.push_back( i );
Which does not require a separator?
Check for a newline character (\n). When you find one, you've completed the line.