How to read formatted data in C++?

I have formatted data like the following:
Words 5
AnotherWord 4
SomeWord 6
It's in a text file and I'm using ifstream to read it, but how do I separate the number from the word? The word consists only of letters, and there will be some spaces or tabs between the word and the number, but I'm not sure how many.

Assuming there is no whitespace within the "word" (otherwise it would not really be one word), here is a sample of how to read up to the end of the file:
std::ifstream file("file.txt");
std::string str;
int i;
while(file >> str >> i)
std::cout << str << ' ' << i << std::endl;

The >> operator is overloaded for std::string and uses whitespace as a separator, so:
ifstream f("file.txt");
string str;
int i;
// Test the extraction itself: looping on !f.eof() would run once more after the last record
while ( f >> str >> i )
{
    // do work
}

sscanf is good for that:
#include <cstdio>
#include <cstdlib>
int main ()
{
char sentence []="Words 5";
char str [100];
int i;
sscanf (sentence,"%99s %d",str,&i); // %99s reads the word (at most 99 chars), %d reads the number
printf ("%s -> %d\n",str,i);
return EXIT_SUCCESS;
}

It's actually very easy, you can find the reference here
If you are using tabs as delimiters, you can use getline instead and set the delim argument to '\t'.
A longer example would be:
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
struct Line {
    std::string text;
    int number;
};
int main(){
    std::ifstream is("myfile.txt");
    std::vector<Line> lines;
    Line line;
    // std::ws skips the newline left by the previous record; then read up to the tab, then the number
    while (std::getline(is >> std::ws, line.text, '\t') && is >> line.number){
        lines.push_back(line);
    }
    for (std::vector<Line>::size_type i = 0 ; i < lines.size() ; ++i){
        std::cout << "Line " << i << " text: \"" << lines[i].text
                  << "\", number: " << lines[i].number << std::endl;
    }
}

Related

Read array of objects from file by custom separator

I have model class for my objects:
class customClass {
string s1;
string s2;
string s3;
};
and file like this:
text1;text1;text1 text1 text1...
text2;text2;text2 text2 text2...
...
and I want to make an array of objects where
s1 = "text1"
s2 = "text1"
s3 = "text1 text1 text1..."
...
My code:
infile.open("file.txt");
if (infile.is_open())
{
string line;
for (int i = 0; i < 3; i++)
{
infile >> line;
stringstream ss(line);
while (ss.good())
{
string substring;
getline(ss, substring, ';');
cout << substring <<endl;
}
}
}
But it splits on every single word. How can I keep the whitespace so that my third string is read as one piece of text rather than as single words?
The reason it doesn't work for you is that infile >> line; reads up to the first whitespace character instead of the whole line (this is when you need getline). Maybe something like this:
#include <string>
#include <iostream>
#include <fstream>
int main()
{
std::ofstream outfile("file.txt");
outfile <<
R"(text1;text1;text1 text1 text1...
text2;text2;text2 text2 text2...)";
outfile.close();
// read file
std::ifstream infile("file.txt");
std::string par1, par2, par3;
while (std::getline(infile, par1, ';') && std::getline(infile, par2, ';') && std::getline(infile, par3))
std::cout << par1 << " | " << par2 << " | " << par3 << std::endl;
}
Demo: http://coliru.stacked-crooked.com/view?id=eb52001b5d4ecbed
Read the whole line with the getline function. By default, operator>> only reads up to the next whitespace character (space, tab, or newline).
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;
class CustomClass
{
public:
string s1;
string s2;
string s3;
};
int main()
{
ifstream infile;
infile.open("test.txt");
if (infile.is_open())
{
string line;
// test getline itself, so a failed read at the end of the file is not processed
while (getline(infile, line))
{
stringstream ss(line);
CustomClass cls;
getline(ss, cls.s1, ';');
getline(ss, cls.s2, ';');
getline(ss, cls.s3, ';');
cout << cls.s1 << " - " << cls.s2 << " - " << cls.s3 << endl;
}
}
return 0;
}
Just don't read a single string at the beginning. I also added a loop that runs until the end of the file, so your code would look like:
std::ifstream infile("file.txt");
if (infile.is_open())
{
    std::string line;
    while (std::getline(infile, line)) {
        std::stringstream ss(line);
        std::string substring;
        while (std::getline(ss, substring, ';')) {
            std::cout << substring << std::endl;
        }
    }
}

C++ Replacing a word in an array of characters

I'm working on a problem where I need to have the user input a message and then replace the word "see" with "c". I wanted to read in the array message[200] and then break it down into individual words. I tried a for loop, but when I concatenate, it just adds the previous words. I am only allowed to use arrays of characters, no strings.
const int MAX_SIZE = 200;
int main(){
char message[MAX_SIZE]; //message array the user will enter
int length; // count of message length
int counter, i, j; //counters for loops
char updateMessage[MAX_SIZE]; //message after txt update
//prompt user to
cout << "Please type a sentence" << endl;
cin.get(message, MAX_SIZE, '\n');
cin.ignore(100, '\n');
length = strlen(message);
//Lower all characters
for( i = 0; i < length; ++i)
{
message[i] = tolower(message[i]);
}
//echo back sentence
cout << "You typed: " << message << endl;
cout << "Your message length is " << length << endl;
for( counter = 0; counter <= length; ++counter)
{
updateMessage[counter] = message[counter];
if(isspace(message[counter]) || message[counter] == '\0')
{
cout << "Space Found" << endl;
cout << updateMessage << endl;
cout << updateMessage << " ** " << endl;
}
}
return 0;
}
After each space is found, I would like to output only one word at a time.
You should really try to learn some modern C++ and standard library features, so you don't end up writing C code in C++. As an example, this is how a C++14 program makes use of standard algorithms from the library to do the job in 10-15 lines of code:
#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
using namespace std::string_literals;
std::istringstream input("Hello I see you, now you see me");
std::string str;
// get the input from the stream (use std::cin if you read from console)
std::getline(input, str);
// tokenize
std::vector<std::string> words;
std::istringstream ss(str);
for(std::string word ; ss >> word; words.push_back(word));
// replace
std::replace(words.begin(), words.end(), "see"s, "c"s);
// flatten back to a string from the tokens
str.clear();
for(auto& elem: words)
{
str += elem + ' ';
}
// display the final string
std::cout << str;
}
Live on Coliru
This is not the most efficient way of doing it, as you can perform replacement in place, but the code is clear and if you don't need to save every bit of CPU cycles it performs decently.
Below is a solution that avoids the std::vector and performs the replacement in place:
#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
std::istringstream input("Hello I see you, now you see me");
std::string str;
// get the input from the stream (use std::cin if you read from console)
std::getline(input, str);
// tokenize and replace in place
std::istringstream ss(str);
std::string word;
str.clear();
while (ss >> word)
{
if (word == "see")
str += std::string("c") + ' ';
else
str += word + ' ';
}
// display the final string
std::cout << str;
}
Live on Coliru

C++ Stringstream: Accepts a string, but not a string stored in a variable. Why?

I've been trying to split up an input string into smaller strings delineated by whitespace. I found this code from here:
stringstream ss ("bla bla");
string s;
while (getline(ss, s, ' ')) {
cout << s << endl;
}
which works just fine. However, if I replace "bla bla" with a variable containing a string:
string userInput;
cin >> userInput;
stringstream ss (userInput);
string s;
while (getline(ss, s, ' ')) {
cout << s << endl;
}
only the first word/char/string prints out. Why is that? Is there a way to fix it? I've looked around at some stringstream questions, but the problem is that I don't really know what I'm looking for.
Your problem isn't stringstream ss (userInput);, it's the behavior of std::cin >> userInput. Whitespace ends the extraction of a formatted string, so the input bla bla leaves userInput holding only "bla", with the second "bla" still waiting in the input buffer.
If you want to read the whole line, use std::getline(std::cin,userInput) instead. Have a look at this little demonstration, which compares std::getline to operator>> on your input bla bla:
Source:
#include <iostream>
#include <string>
int main(){
std::string userInput;
std::cout << "Using std::getline(std::cin,userInput) on input \"bla bla\"." << std::endl;
std::getline(std::cin,userInput);
std::cout << "userInput contains \"" << userInput << "\"" << std::endl;
std::cout << "std::cin >> userInput on input \"bla bla\"." << std::endl;
std::cin >> userInput;
std::cout << "userInput contains \"" << userInput << "\"" << std::endl;
return 0;
}
Result:
Using std::getline(std::cin,userInput) on input "bla bla".
userInput contains "bla bla"
std::cin >> userInput on input "bla bla".
userInput contains "bla"
See also:
std::getline from <string> (alternative resource).
noskipws (This will only prevent skipping leading whitespaces, a whitespace will still terminate the extraction).
istream::operator>>
This does what you said:
#include "stdafx.h"
#include <string>
#include <sstream>
#include <iostream>
int _tmain(int argc, _TCHAR* argv[])
{
std::string test("bla bla");
std::stringstream stream(test);
std::string temp;
while (getline(stream, temp, ' ')) {
std::cout << temp << std::endl;
}
return 0;
}
It is even what you said you did. But since it works, where does it differ from your code?
And for those who do not have a Microsoft Visual C++ compiler handy and do not understand the differences, here's a code snippet:
std::string test("bla bla");
std::stringstream stream(test);
std::string temp;
while (getline(stream, temp, ' ')) {
std::cout << temp << std::endl;
}
The includes required by that snippet are <string>, <sstream>, and <iostream>. Insert it into the appropriate function.

Trying to read from a file and skip punctuation in C++, tips?

I'm trying to read from a file, and make a vector of all the words from the file. What I tried to do below is have the user input the filename, and then have the code open the file, and skip characters if they aren't alphanumeric, then input that to a file.
Right now it just closes immediately when I input the filename. Any idea what I could be doing wrong?
#include <vector>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;
int main()
{
string line; //for storing words
vector<string> words; //unspecified size vector
string whichbook;
cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
cin >> whichbook;
cout << endl;
ifstream bookread;
//could be issue
//ofstream bookoutput("results.txt");
bookread.open(whichbook.c_str());
//assert(!bookread.fail());
if(bookread.is_open()){
while(bookread.good()){
getline(bookread, line);
cout << line;
while(isalnum(bookread)){
words.push_back(bookread);
}
}
}
cout << words[];
}
I think I'd do the job a bit differently. Since you want to ignore all but alphanumeric characters, I'd start by defining a locale that treats all other characters as white space:
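#include <algorithm>
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>
// Despite its name, this facet keeps letters as well as digits; everything else counts as white space
struct digits_only: std::ctype<char> {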
digits_only(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::space);
std::fill(&rc['0'], &rc['9']+1, std::ctype_base::digit);
std::fill(&rc['a'], &rc['z']+1, std::ctype_base::lower);
std::fill(&rc['A'], &rc['Z']+1, std::ctype_base::upper);
return &rc[0];
}
};
That makes reading words/numbers from the stream quite trivial. For example:
int main() {
char const test[] = "This is a bunch=of-words and 2#numbers#4(with)stuff to\tseparate,them, I think.";
std::istringstream infile(test);
infile.imbue(std::locale(std::locale(), new digits_only));
std::copy(std::istream_iterator<std::string>(infile),
std::istream_iterator<std::string>(),
std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
For the moment, I've copied the words/numbers to standard output, but copying to a vector just means giving a different iterator to std::copy. For real use, we'd undoubtedly want to get the data from an std::ifstream as well, but (again) it's just a matter of supplying the correct iterator. Just open the file, imbue it with the locale, and read your words/numbers. All the punctuation, etc., will be ignored automatically.
The following reads every line, skips non-alphanumeric characters, and adds each line as an item to the output vector. You can adapt it so that it outputs words instead of lines. I did not want to provide the entire solution, as this looks a bit like a homework problem.
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    string line; // current line of the file
    vector<string> words; // one cleaned-up line per element
    string whichbook;
    cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
    cin >> whichbook;
    cout << endl;
    ifstream bookread(whichbook.c_str());
    if (bookread.is_open()) {
        // test getline itself rather than eof(), so a failed read at the end is not processed
        while (getline(bookread, line)) {
            string lineToAdd;
            for (string::size_type i = 0; i < line.size(); ++i) {
                // keep letters, digits and spaces; drop everything else
                if (isalnum(static_cast<unsigned char>(line[i])) || line[i] == ' ')
                    lineToAdd += line[i];
            }
            words.push_back(lineToAdd);
        }
    }
    for (vector<string>::size_type i = 0; i < words.size(); ++i)
        cout << words[i] + " ";
    return 0;
}

getline in C++ - Help needed

I am using getline to read up to the newline, but C++ getline only gives me the text up to the first space.
I have txt file data as
address(tab char)1420 Happy Lane
When I do
getline(reader, ss, '\t') I get address in the ss string.
When I do getline(reader, ss, '\n') I just get 1420.
I want the full "1420 Happy Lane". How do I get it?
Thanks.
#include <fstream>
#include <string>
#include <iostream>
#include <vector>
#include <sstream>
using namespace std;
int main( int argc, char *argv[] )
{
if( argc < 2 )
{
cout << "Missing filename as first argument" << "\n";
exit(2);
}
vector<string> myvector;
string ss;
int i=0, j=0;
ifstream reader(argv[1]);
if (! reader )
{
cout << "Error opening input file : " << " " << argv[1] << '\n';
return -1;
}
while( !reader.eof())
{
if ((i+1) % 2 == 0 )
getline(reader, ss, '\n');
else
getline(reader, ss, '\t');
if (ss[0] == '#')
{
//Skip
getline(reader,ss, '\n');i=0;
continue;
}
i++;
myvector.push_back(ss);
}
reader.close();
vector<string>::iterator it;
stringstream stream;
int vecloc=1;
string tag;
string sData;
cout << "myvector contains: \n";
for ( it=myvector.begin() ; it < myvector.end(); it++ )
{
switch (vecloc)
{
case 1: stream << *it; stream >> tag; vecloc++;break;
case 2:
stream << *it; stream >> sData;
// Do job
cout << tag << " " << sData << "\n";
// Reset.
vecloc=1; break;
default : break;
}
// Clear String stream
stream.str(""); stream.clear();
}
return(0);
}
output
/home/sr/utl
cat abc.txt
hey c++ making me nuts.
/home/sr/utl
a.out abc.txt
myvector contains:
hey c++
Paste the actual code from your editor and double check that there isn't a newline (or maybe other unexpected non-printing characters) in your data file.
This works as expected here:
#include <iostream>
#include <sstream>
using namespace std;
int main()
{
stringstream reader("address\t1420 Happy Lane\n");
string ss;
getline(reader, ss, '\t');
cout << "1: " << ss << endl;
getline(reader, ss, '\n');
cout << "2: " << ss << endl;
}
Output:
1: address
2: 1420 Happy Lane
I have a split() function you can use for that. Use \t as the delimiter:
void split(std::string &string, std::vector<std::string> &tokens, const char &delim) {
std::string ea;
std::stringstream stream(string);
while(getline(stream, ea, delim))
tokens.push_back(ea);
}
You're trying to alternate between grabbing up until a \t and grabbing up until a \n. But the times that you find a '#' comment line throw off your alternation.
By far the easiest and most robust way to handle this sort of thing is to read each line first, and then re-parse the line.