How to parse a table of numbers in C++

I need to parse a table of numbers formatted as ASCII text. There are 36 space-delimited signed integers per line of text and about 3000 lines in the file. The input file is generated by me in Matlab, so I could modify the format. On the other hand, I also want to be able to parse the same file in VHDL, so ASCII text is about the only format possible.
So far, I have a little program like this that can loop through all the lines of the input file. I just haven't found a way to get individual numbers out of the line. I am not a C++ purist. I would consider fscanf(), but 36 numbers is a bit much for that. Please suggest practical ways to get numbers out of a text file.
int main()
{
string line;
ifstream myfile("CorrOut.dat");
if (!myfile.is_open())
cout << "Unable to open file";
else{
while (getline(myfile, line))
{
cout << line << '\n';
}
myfile.close();
}
return 0;
}

Use std::istringstream. Here is an example:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
string line;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
istringstream strm(line);
while ( strm >> num )
cout << num << " ";
cout << "\n";
}
}
If you want to create a table, use a std::vector or other suitable container:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string line;
// our 2 dimensional table
vector<vector<int>> table;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
vector<int> vInt;
istringstream strm(line);
while ( strm >> num )
vInt.push_back(num);
table.push_back(vInt);
}
}
The table vector gets populated, row by row. Note we created an intermediate vector to store each row, and then that row gets added to the table.
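Since the question states that every line should hold the same number of integers, it may be worth checking that while reading. The following is a minimal sketch of that idea, not part of the answer above: readTable and its expectedCols parameter are hypothetical names, and the choice to skip blank lines and throw on a malformed row is an assumption about the desired behaviour.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated integers, one row per line,
// and throws if a non-blank row does not hold exactly `expectedCols` values.
std::vector<std::vector<int>> readTable(std::istream& in, std::size_t expectedCols)
{
    std::vector<std::vector<int>> table;
    std::string line;
    std::size_t lineNo = 0;
    while (std::getline(in, line)) {
        ++lineNo;
        std::istringstream strm(line);
        std::vector<int> row;
        int num;
        while (strm >> num)
            row.push_back(num);
        if (row.empty() && strm.eof())
            continue;  // blank line: nothing to store
        // Extraction must have stopped at the end of the line (not at a bad
        // token), and the row must have the expected width.
        if (!strm.eof() || row.size() != expectedCols)
            throw std::runtime_error("bad row at line " + std::to_string(lineNo));
        table.push_back(row);
    }
    return table;
}
```

For the file described in the question, this could be called with an std::ifstream and an expectedCols of 36.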

You can use a few different approaches. The one offered above is probably the quickest of them, but if your data uses a different delimiter you may want to consider one of the following solutions.
The first solution reads the file line by line. It then uses the find function to locate the first occurrence of the delimiter, extracts the number in front of it, erases that part of the line, and repeats until the delimiter is no longer found.
You can customize the delimiter by changing the value of the delimiter variable.
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <cstdlib> // atoi
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
string delimiter = " ";
size_t pos = 0;
string token;
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
pos = 0;
while ((pos = line.find(delimiter)) != std::string::npos) {
token = line.substr(0, pos);
std::cout << token << std::endl;
line.erase(0, pos + delimiter.length());
temp.push_back(atoi(token.c_str()));
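}
// the text left after the last delimiter is the final number on the line
if (!line.empty())
temp.push_back(atoi(line.c_str()));
data.push_back(temp);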
}
myfile.close();
}
return 0;
}
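As a variation on the delimiter search above, std::getline also accepts a single delimiter character, which avoids the manual find and erase bookkeeping. This is only a sketch: splitInts is a hypothetical helper name, and skipping empty fields (for example from two delimiters in a row) is an assumption about what the data needs.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on a single delimiter character and
// converts each non-empty field to int. std::stoi throws
// std::invalid_argument if a field does not start with a number.
std::vector<int> splitInts(const std::string& line, char delimiter)
{
    std::vector<int> values;
    std::istringstream fields(line);
    std::string token;
    while (std::getline(fields, token, delimiter)) {
        if (token.empty())
            continue;  // skip empty fields produced by repeated delimiters
        values.push_back(std::stoi(token));
    }
    return values;
}
```

Unlike the find-based loop, this version only supports one-character delimiters.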
The second solution makes use of regex and doesn't care which delimiter is used; it searches for and matches any integers found in the string.
#include <iostream>
#include <string>
#include <regex> // regular expression library introduced in C++11
#include <fstream>
#include <vector>
#include <cstdlib> // atoi
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
std::smatch m;
std::regex e("[-+]?\\d+");
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
while (regex_search(line, m, e)) {
for (auto x : m) {
std::cout << x.str() << " ";
temp.push_back(atoi(x.str().c_str()));
}
std::cout << std::endl;
line = m.suffix().str();
}
data.push_back(temp);
}
myfile.close();
}
return 0;
}
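The same regex idea can also be written with std::sregex_iterator, which walks over all matches without repeatedly copying the remaining suffix of the line. This is a sketch under the same pattern as above; extractInts is a hypothetical helper name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every signed integer found in `line`,
// regardless of what separates the numbers.
std::vector<int> extractInts(const std::string& line)
{
    static const std::regex intPattern("[-+]?\\d+");
    std::vector<int> values;
    for (std::sregex_iterator it(line.begin(), line.end(), intPattern), end;
         it != end; ++it) {
        values.push_back(std::stoi(it->str()));  // it->str() is the matched text
    }
    return values;
}
```

std::stoi throws std::out_of_range if a matched number does not fit in an int, so very long digit runs would need separate handling.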

Related

Is there a way to print individual words from a .txt file without leaving out lines?

#include <iostream>
#include <string>
#include <fstream>
using namespace std;
string readFileToString(string fileName) {
fstream file;
string word;
string returnMe;
returnMe.resize(200);
file.open(fileName.c_str());
while (file >> word) {
returnMe += word + " ";
}
file.close();
return returnMe;
}
int main() {
string fileName = "example.txt";
cout << readFileToString(fileName);
}
I have this code but I have several lines in my txt file and it completely ignores them.
If you want to print out all the words in the text file then you can use the following program:
#include <iostream>
#include <sstream>
#include <fstream>
int main()
{
std::ifstream inputFile("input.txt");
std::string word, line;
if(inputFile)
{
while(std::getline(inputFile, line)) //go line by line
{
//std::cout<<line<<std::endl; //this prints the line
std::istringstream ss(line);
while(ss >> word) //go word by word
{
std::cout << word << std::endl;
}
}
}
else
{
std::cout << "File cannot be opened" << std::endl;
}
return 0;
}

How to get input an array of strings with \n as delimiter?

#include<bits/stdc++.h>
using namespace std;
int main()
{
int i=0;
char a[100][100];
do {
cin>>a[i];
i++;
}while( strcmp(a[i],"\n") !=0 );
for(int j=0;j<i;i++)
{
cout<<a[i]<<endl;
}
return 0;
}
Here, I want to exit the do-while loop as soon as the user hits Enter, but the code never leaves the loop.
The following reads one line and splits it on whitespace. This code is not something one would normally expect a beginner to write from scratch. However, searching on DuckDuckGo or Stack Overflow will reveal lots of variations on this theme. When programming, know that you are probably not the first to need the functionality you seek. The engineering way is to find the best and learn from it. Study the code below. From one tiny example, you will learn about getline, string streams, iterators, copy, back_inserter, and more. What a bargain!
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <vector>
int main() {
using namespace std;
vector<string> tokens;
{
string line;
getline(cin, line);
istringstream stream(line);
copy(istream_iterator<string>(stream),
istream_iterator<string>(),
back_inserter(tokens));
}
for (auto s : tokens) {
cout << s << '\n';
}
return 0;
}
First of all, we need to read the line until the '\n' character, which we can do with getline(). The extraction operator >> won't work here, since it will also stop reading input upon reaching a space. Once we have the whole line, we can put it into a stringstream (call it ss) and use ss >> str or getline(ss, str, ' ') to read the individual strings.
Another approach might be to take advantage of the fact that the extraction operator will leave the delimiter in the stream. We can then check if it's a '\n' with cin.peek().
Here's the code for the first approach:
#include <iostream> //include the standard library files individually
#include <vector> //#include <bits/stdc++.h> is terrible practice.
#include <sstream>
#include <string>
int main()
{
std::vector<std::string> words; //vector to store the strings
std::string line;
std::getline(std::cin, line); //get the whole line
std::stringstream ss(line); //create stringstream containing the line
std::string str;
while(std::getline(ss, str, ' ')) //loops until the input fails (when ss is empty)
{
words.push_back(str);
}
for(std::string &s : words)
{
std::cout << s << '\n';
}
}
And for the second approach:
#include <iostream> //include the standard library files individually
#include <vector> //#include <bits/stdc++.h> is terrible practice.
#include <string>
int main()
{
std::vector<std::string> words; //vector to store the strings
while(std::cin.peek() != '\n') //loop until next character to be read is '\n'
{
std::string str; //read a word
std::cin >> str;
words.push_back(str);
}
for(std::string &s : words)
{
std::cout << s << '\n';
}
}
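You can use getline to detect the Enter key: an empty line ends the input. This version was run on Windows, and strcpy_s as used here relies on Microsoft's compiler: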
//#include<bits/stdc++.h>
#include <iostream>
#include <string> // for getline()
#include <cstring> // for strcpy_s
using namespace std;
int main()
{
int i = 0;
char a[100][100];
string temp;
do {
getline(std::cin, temp);
if (temp.empty())
break;
strcpy_s(a[i], temp.substr(0, 99).c_str()); // at most 99 characters, leaving room for the terminating '\0'
} while (++i < 100);
for (int j = 0; j<i; j++)
{
cout << a[j] << endl;
}
return 0;
}
Each getline call reads a whole line, so input such as "hello world" is read in one go. If you need the individual words, split the line afterwards.

Locate and tag words in text file

I need to read in a text file of 500 words or more (a real-world article from a newspaper, etc.), locate and tag words like this, <location> word <location/>, and then print the entire article on the screen. I'm using boost regex right now and it's working OK. I want to try to use a list, an array, or some other data structure to hold the states and major cities, and then search the article for them. Right now I'm using an array, but I'm willing to use anything. Any ideas or clues?
#include <boost/regex.hpp>
#include <iostream>
#include <string>
#include <boost/iostreams/filter/regex.hpp>
#include <fstream>
using namespace std;
int main()
{
string cities[389];
string states [60];
string filename, line,city,state;
ifstream file,cityfile, statefile;
int i=0;
int j=0;
cityfile.open("c:\\cities.txt");
while (!cityfile.eof())
{
getline(cityfile,city);
cities[i]=city;
i++;
//for (int i=0;i<500;i++)
//file>>cities[i];
}
cityfile.close();
statefile.open("c:\\states.txt");
while (!statefile.eof())
{
getline(statefile,state);
states[j]=state;
//for (int i=0;i<500;i++)
//cout<<states[j];
j++;
}
statefile.close();
//4cout<<cities[4];
cout<<"Please enter the path and file name "<<endl;
cin>>filename;
file.open(filename);
while (!file.eof())
{
while(getline(file, line)
{
}
while(getline(file, line))
{
//string text = "Hello world";
boost::regex re("[A-Z/]\.[A-Z\]\.|[A-Z/].*[:space:][A-Z/]|C........a");
//boost::regex re(
string fmt = "<locations>$&<locations\>";
if(boost::regex_search(line, re))
{
string result = boost::regex_replace(line, re, fmt);
cout << result << endl;
}
/*else
{
cout << "Found Nothing" << endl;
}*/
}
}
file.close();
cin.get(),cin.get();
return 0;
}
If you are after asymptotic complexity, the Aho-Corasick algorithm searches a string for all entries of a dictionary in linear time: O(n+m) plus the number of matches reported, where n is the length of the text and m is the total length of the dictionary words.
An alternative is to put the tokenized words in a map (where the value is a list of the positions of that word in the stream), and then look up each search string in that tree. The complexity will be O(|S| * (n log n + m log n)), where m is the number of searched words, n is the number of words in the text, and |S| is the average word length.
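The map-based alternative described above could be sketched roughly as follows. The function names indexWords and findLocations are hypothetical, positions are counted as 0-based word offsets, and words are split on whitespace only, so punctuation attached to a word and multi-word names such as "New York" are not handled.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Builds an index from each whitespace-separated word to the positions
// (0-based word offsets) at which it occurs in `text`.
std::map<std::string, std::vector<std::size_t>> indexWords(const std::string& text)
{
    std::map<std::string, std::vector<std::size_t>> index;
    std::istringstream words(text);
    std::string word;
    std::size_t position = 0;
    while (words >> word)
        index[word].push_back(position++);
    return index;
}

// Returns the word offsets of every occurrence of any dictionary term,
// grouped in dictionary order.
std::vector<std::size_t> findLocations(
    const std::map<std::string, std::vector<std::size_t>>& index,
    const std::vector<std::string>& dictionary)
{
    std::vector<std::size_t> hits;
    for (const std::string& term : dictionary) {
        auto found = index.find(term);  // logarithmic lookup in the tree
        if (found != index.end())
            hits.insert(hits.end(), found->second.begin(), found->second.end());
    }
    return hits;
}
```

The returned offsets can then be used to decide which words to wrap in tags when the article is printed.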
You can use any container that has a .find() method or supports std::find(). I'd use set, since set::find() runs in less than linear time.
Here is a program which does what you talk about. Note that the parsing doesn't work great, but that's not what I'm trying to demonstrate. You could continue to find the words using your parser, and use the call to set::find() to determine if they are locations.
#include <set>
#include <string>
#include <iostream>
#include <sstream>
const std::set<std::string> locations { "Springfield", "Illinois", "Pennsylvania" };
int main () {
std::string line;
while(std::getline(std::cin, line)) {
std::istringstream iss(line);
std::string word;
while(iss >> word) {
if(locations.find(word) == locations.end())
std::cout << word << " ";
else
std::cout << "<location>" << word << "</location> ";
}
std::cout << "\n";
}
}

C++ Putting text from a text file into an array as individual characters

I want to put some text from a text file into an array, but have the text in the array as individual characters.
How would I do that?
Currently I have
#include <iostream>
#include <fstream>
#include <string>
#include <cmath>
#include <vector>
#include <sstream>
using namespace std;
int main()
{
string line;
ifstream myfile ("maze.txt");
if (myfile.is_open())
{
while ( myfile.good() )
{
getline (myfile,line);
// --------------------------------------
string s(line);
istringstream iss(s);
do
{
string sub;
iss >> sub;
cout << "Substring: " << sub << endl;
} while (iss);
// ---------------------------------------------
}
myfile.close();
}
else cout << "Unable to open file";
system ("pause");
return 0;
}
I'm guessing getline gets one line at a time. Now how would I split that line into individual characters, and then put those characters in an array?
I am taking a C++ course for the first time so I'm new, be nice :p
std::ifstream file("hello.txt");
if (file) {
std::vector<char> vec(std::istreambuf_iterator<char>(file),
(std::istreambuf_iterator<char>()));
} else {
// ...
}
Very elegant compared to the manual approach using a loop and push_back.
#include <vector>
#include <fstream>
int main() {
std::vector< char > myvector;
std::ifstream myfile("maze.txt");
char c;
while(myfile.get(c)) {
myvector.push_back(c);
}
}

Trying to read from a file and skip punctuation in C++, tips?

I'm trying to read from a file, and make a vector of all the words from the file. What I tried to do below is have the user input the filename, and then have the code open the file, and skip characters if they aren't alphanumeric, then input that to a file.
Right now it just closes immediately when I input the filename. Any idea what I could be doing wrong?
#include <vector>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;
int main()
{
string line; //for storing words
vector<string> words; //unspecified size vector
string whichbook;
cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
cin >> whichbook;
cout << endl;
ifstream bookread;
//could be issue
//ofstream bookoutput("results.txt");
bookread.open(whichbook.c_str());
//assert(!bookread.fail());
if(bookread.is_open()){
while(bookread.good()){
getline(bookread, line);
cout << line;
while(isalnum(bookread)){
words.push_back(bookread);
}
}
}
cout << words[];
}
I think I'd do the job a bit differently. Since you want to ignore all but alphanumeric characters, I'd start by defining a locale that treats all other characters as white space:
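#include <algorithm>
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>
// Despite its name, this facet keeps letters as well as digits; every other character counts as white space.
struct digits_only: std::ctype<char> {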
digits_only(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::space);
std::fill(&rc['0'], &rc['9']+1, std::ctype_base::digit);
std::fill(&rc['a'], &rc['z']+1, std::ctype_base::lower);
std::fill(&rc['A'], &rc['Z']+1, std::ctype_base::upper);
return &rc[0];
}
};
That makes reading words/numbers from the stream quite trivial. For example:
int main() {
char const test[] = "This is a bunch=of-words and 2#numbers#4(with)stuff to\tseparate,them, I think.";
std::istringstream infile(test);
infile.imbue(std::locale(std::locale(), new digits_only));
std::copy(std::istream_iterator<std::string>(infile),
std::istream_iterator<std::string>(),
std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
For the moment, I've copied the words/numbers to standard output, but copying to a vector just means giving a different iterator to std::copy. For real use, we'd undoubtedly want to get the data from an std::ifstream as well, but (again) it's just a matter of supplying the correct iterator. Just open the file, imbue it with the locale, and read your words/numbers. All the punctuation, etc., will be ignored automatically.
The following would read every line, skip non-alphanumeric characters, and add each line as an item to the output vector. You can adapt it so it outputs words instead of lines. I did not want to provide the entire solution, as this looks a bit like a homework problem.
#include <cctype> // isalnum
#include <vector>
#include <sstream>
#include <string>
#include <iostream>
#include <iomanip>
#include <fstream>
using namespace std;
int main()
{
string line; //for storing words
vector<string> words; //unspecified size vector
string whichbook;
cout << "Welcome to the book analysis program. Please input the filename of the book you would like to analyze: ";
cin >> whichbook;
cout << endl;
ifstream bookread;
//could be issue
//ofstream bookoutput("results.txt");
bookread.open(whichbook.c_str());
//assert(!bookread.fail());
if(bookread.is_open()){
while(getline(bookread, line)){ // stops cleanly once no more lines can be read
string lineToAdd = "";
for(size_t i = 0 ; i < line.size(); ++i)
{
// keep letters, digits and spaces; drop everything else
if(isalnum(static_cast<unsigned char>(line[i])) || line[i] == ' ')
lineToAdd += line[i];
}
words.push_back(lineToAdd);
}
}
for(int i = 0 ; i < words.size(); ++i)
cout << words[i] + " ";
return 0;
}