Skip non-integers from text file C++

I have a program which reads integers from a text file and skips non-integers and strange symbols. The text file looks like:
# Matrix A // this line should be skipped because it contains # symbol
1 1 2
1 1$ 2.1 // this line should be skipped because it contains 2.1 and $
3 4 5
I have to print out the matrix without the lines that contain strange symbols or non-integers. That is, the output should be:
1 1 2
3 4 5
My code
ifstream matrixAFile("a.txt", ios::in); // open file a.txt
if (!matrixAFile)
{
cerr << "Error: File could not be opened !!!" << endl;
exit(1);
}
int i, j, k;
while (matrixAFile >> i >> j >> k)
{
cout << i << ' ' << j << ' ' << k;
cout << endl;
}
But it fails when it reaches the first # symbol. Can anyone help, please?

Your problem is with this code.
int i, j, k;
while (matrixAFile >> i >> j >> k)
The assignment is "Find out if the line contains integers"
But your code is saying "I already know that the line contains integers"

If you are set to three integers per row, I suggest this pattern:
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
std::ifstream infile("matrix.txt");
for (std::string line; std::getline(infile, line); )
{
int a, b, c;
if (!(std::istringstream(line) >> a >> b >> c))
{
std::cerr << "Skipping unparsable line '" << line << "'\n";
continue;
}
std::cout << a << ' ' << b << ' ' << c << std::endl;
}
If the number of numbers per line is variable, you could use a skip condition like this:
line.find_first_not_of(" 0123456789") != std::string::npos
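A minimal sketch of how that skip condition might be combined with extraction for rows of any length. The helper name `parse_row` is made up for illustration, and the accepted character set (digits, spaces and tabs) is an assumption, so negative numbers are rejected and overflow is not handled. Checking the characters first also rejects a line such as `1 2 3.5`, which plain extraction of three ints would accept as `1 2 3`:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: fills `row` with the integers on `line`.
// Returns false when the line contains anything other than digits,
// spaces and tabs, or when it contains no number at all.
bool parse_row(const std::string& line, std::vector<int>& row)
{
    row.clear();
    if (line.find_first_not_of(" \t0123456789") != std::string::npos)
        return false;                   // e.g. '#', '$' or '.' is present

    std::istringstream iss(line);
    for (int value; iss >> value; )     // operator>> skips the blanks
        row.push_back(value);

    return !row.empty();                // a blank line is skipped too
}
```

Inside the `std::getline` loop, a line would be printed only when `parse_row(line, row)` returns true.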

Of course, this fails at the # character: the # isn't an integer and, thus, reading it as an integer fails. What you could do is try to read three integers. If this fails and you haven't reached EOF (i.e., matrixAFile.eof() yields false), you can clear() the error flags and ignore() everything up to a newline. The error recovery would look something like this:
matrixAFile.clear();
matrixAFile.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
Note that you need to bail out if the read failed because eof() is true.
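Put together, the recovery described above might look like the following sketch. The `Row` struct and the `read_rows` function are illustrative names, not part of the original code. Note that this approach is not line-aware: a line such as `1 1 2.1` would still yield `1 1 2` before the `.1` triggers the recovery, so it fits best when bad lines fail on or before the third number:

```cpp
#include <istream>
#include <limits>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <vector>

struct Row { int i, j, k; };   // illustrative container for one matrix row

// Sketch: read triples of ints; whenever extraction fails before end-of-file,
// clear the error flags and discard the rest of the current line.
std::vector<Row> read_rows(std::istream& in)
{
    std::vector<Row> rows;
    for (;;) {
        Row r;
        if (in >> r.i >> r.j >> r.k) {
            rows.push_back(r);
            continue;
        }
        if (in.eof())
            break;                      // failed because the input ran out
        in.clear();                     // reset failbit so reading can resume
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return rows;
}
```

Passing the `ifstream` from the question (or a `std::istringstream` holding the sample text) to `read_rows` returns only the rows that parsed.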

Since it's an assignment, I'm not giving a full answer.
Read the data line by line into a string (call it str).
Split str into substrings.
For each substring, check whether it is convertible to an integer value.
Another trick is to read a line, then check that every char is a digit between 0 and 9 or whitespace. It works if you don't need to consider negative numbers.
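The three steps above can be sketched as follows. This is one possible reading of "convertible to an integer", using `std::stoi` and requiring that it consumes the whole token; the helper names `is_integer` and `parse_integers` are hypothetical:

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: true when the whole token is one integer ("12", "-7"),
// false for "2.1", "1$", "#" or values that do not fit in an int.
bool is_integer(const std::string& token, int& value)
{
    try {
        std::size_t used = 0;
        value = std::stoi(token, &used);
        return used == token.size();    // reject trailing junk such as "1$"
    } catch (const std::exception&) {
        return false;                   // invalid_argument or out_of_range
    }
}

// Splits `line` on whitespace; succeeds only if every token is an integer
// and there is at least one token.
bool parse_integers(const std::string& line, std::vector<int>& out)
{
    out.clear();
    std::istringstream iss(line);
    for (std::string token; iss >> token; ) {
        int value = 0;
        if (!is_integer(token, value))
            return false;
        out.push_back(value);
    }
    return !out.empty();
}
```

Unlike the digits-only check, this variant accepts negative numbers, at the cost of a conversion attempt per token.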

I think I'd just read a line at a time as a string. I'd copy the string to the output as long as it contains only digits, white-space and (possibly) -.

Related

read aligned data in c++

I want to read data from a file with a quite strange structure. The file looks like this below:
some lines with text....
10 1000 10
1 1 1
1 100 100
.
.
.
some lines with text...
again data like above..
some lines with text... etc
So I have two questions:
How can I read only the specific lines with the data?
How can I read these right aligned data?
Here is one of my trials:
string line;
ifstream myfile ("aa.txt");
double a,b,c;
while (! myfile.eof() )
{
for (int lineno = 0; getline (myfile,line); lineno++)
if (lineno>2 && lineno<5){
myfile>>a>>b>>c;
cout<<lineno<<" " << line << endl;}
}
myfile.close();
how can I read only the specific lines with the data?
Well, read all the lines, and write a function that detects whether the current line is a "data" line or not.
What are the characteristics of your data line? It consists only of digits and spaces? Could there be tabs? What about the columns, are they fixed width? Your predicate function can check there are spaces in the required columns if so.
how can I read these right aligned data?
You want to extract the integer values? Well, you can create a std::istringstream for your line (once you've checked it is data), and then use the >> stream extraction operator to read values into variables of the appropriate type.
Read up on how it handles whitespace (and/or experiment) - it might just do what you need with no effort.
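As a rough illustration of the predicate-plus-istringstream idea, here is a sketch that assumes a data line contains only digits and blanks (so values with a decimal point or a sign would be rejected). The names `is_data_line` and `read_triple` are invented for this example:

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Assumption: a data line holds only digits and blanks, with at least one digit.
bool is_data_line(const std::string& line)
{
    bool has_digit = false;
    for (unsigned char ch : line) {
        if (std::isdigit(ch))
            has_digit = true;
        else if (ch != ' ' && ch != '\t' && ch != '\r')
            return false;               // text, dots, punctuation: not data
    }
    return has_digit;
}

// operator>> skips leading whitespace before each value, so right-aligned
// columns need no special handling.
bool read_triple(const std::string& line, double& a, double& b, double& c)
{
    std::istringstream iss(line);
    return static_cast<bool>(iss >> a >> b >> c);
}
```

In a `std::getline` loop, each line would first be passed to `is_data_line`, and only the accepted ones to `read_triple`.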
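This is just a simple example. Declare three variables a, b and c as you did (here as integers), plus a string sLine. Open the file and read one whitespace-separated token into sLine, then try to convert it to an integer. If the conversion succeeds, assign the result to a; if it fails, leave b and c alone and keep reading until a token converts. Once a is valid, read b and c the same way. Note that atoi returns 0 when the conversion fails, so this sketch cannot tell a literal 0 from a non-numeric token:
#include <cstdlib>
#include <iostream>
#include <string>
#include <fstream>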
int main()
{
std::ifstream in("data.txt");
int a = 0, b = 0, c = 0;
std::string sLine;
if(in.is_open())
{
while(in >> sLine)
{
if( a = atoi(sLine.c_str()))
std::cout << "a: " << a << std::endl;
if(a)
{
in >> sLine;
if( b = atoi(sLine.c_str()))
std::cout << "b: " << b << std::endl;
}
if(b)
{
in >> sLine;
if( c = atoi(sLine.c_str()))
std::cout << "c: " << c << std::endl;
}
}
}
in.close();
std::cout << std::endl;
return 0;
}

Count first digit on each line of a text file

My project takes a filename and opens it. I need to read each line of a .txt file until the first digit occurs, skipping whitespace, chars, zeros, or special chars. My text file could look like this:
1435 //1, nextline
0 //skip, next line
//skip, nextline
(*Hi 245*) 2 //skip until second 2 after comment and count, next line
345 556 //3 and count, next line
4 //4, nextline
My desired output would be all the way up to nine but I condensed it:
Digit Count Frequency
1: 1 .25
2: 1 .25
3: 1 .25
4: 1 .25
My code is as follows:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main() {
int digit = 1;
int array[8];
string filename;
//cout for getting user path
//the compiler parses string literals differently so use a double backslash or a forward slash
cout << "Enter the path of the data file, be sure to include extension." << endl;
cout << "You can use either of the following:" << endl;
cout << "A forwardslash or double backslash to separate each directory." << endl;
getline(cin,filename);
ifstream input_file(filename.c_str());
if (input_file.is_open()) { //if file is open
cout << "open" << endl; //just a coding check to make sure it works ignore
string fileContents; //string to store contents
string temp;
while (!input_file.eof()) { //not end of file I know not best practice
getline(input_file, temp);
fileContents.append(temp); //appends file to string
}
cout << fileContents << endl; //prints string for test
}
else {
cout << "Error opening file check path or file extension" << endl;
}
In this file format, (* signals the beginning of a comment, so everything from there to a matching *) should be ignored (even if it contains a digit). For example, given input of (*Hi 245*) 6, the 6 should be counted, not the 2.
How do I iterate over the file only finding the first integer and counting it, while ignoring comments?
One way to approach your problem is the following:
Create a std::map<int, int> where the key is the digit and the value is the count. This allows you to compute statistics on your digits such as the count and the frequency after you have parsed the file. Something similar can be found in this SO answer.
Read each line of your file as a std::string using std::getline as shown in this SO answer.
For each line, strip the comments using a function such as this:
std::string& strip_comments(std::string & inp,
std::string const& beg,
std::string const& fin = "") {
std::size_t bpos;
while ((bpos = inp.find(beg)) != std::string::npos) {
if (fin != "") {
std::size_t fpos = inp.find(fin, bpos + beg.length());
if (fpos != std::string::npos) {
inp = inp.erase(bpos, fpos - bpos + fin.length());
} else {
// else don't erase because fin is not found, but break
break;
}
} else {
inp = inp.erase(bpos, inp.length() - bpos);
}
}
return inp;
}
which can be used like this:
std::string line;
std::getline(input_file, line);
line = strip_comments(line, "(*", "*)");
After stripping the comments, use the string member function find_first_of to find the first digit:
std::size_t dpos = line.find_first_of("123456789");
What is returned here is the index in the string of the first non-zero digit. You should check that the returned position is not std::string::npos, as that would indicate that no digit was found. If a digit is found, the corresponding character can be extracted using const char c = line[dpos]; and converted to an integer with c - '0' (std::atoi expects a null-terminated string rather than a single char).
Increment the count for that digit in the std::map as shown in that first linked SO answer. Then loop back to read the next line.
After reading all lines from the file, the std::map will contain the counts for all first digits found in each line stripped of comments. You can then iterate over this map to retrieve all the counts, accumulate the total count over all digits found, and compute the frequency for each digit. Note that digits not found will not be in the map.
I hope this helps you get started. I leave the writing of the code to you. Good luck!
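For readers who want to see the steps above assembled, here is one possible sketch. It inlines a simplified version of the comment stripping, assumes a comment opens and closes on the same line, and uses the invented name `count_first_digits`; it is an illustration rather than the only way to structure this:

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Sketch: counts the first non-zero digit on each line, ignoring (* ... *)
// comments. Assumption: a comment never spans more than one line.
std::map<int, int> count_first_digits(std::istream& in)
{
    std::map<int, int> counts;
    for (std::string line; std::getline(in, line); ) {
        std::size_t b;
        while ((b = line.find("(*")) != std::string::npos) {
            const std::size_t e = line.find("*)", b + 2);
            if (e == std::string::npos)
                break;                  // unterminated comment: left untouched
            line.erase(b, e - b + 2);   // drop the comment and both markers
        }
        const std::size_t dpos = line.find_first_of("123456789");
        if (dpos != std::string::npos)
            ++counts[line[dpos] - '0']; // digit character -> int key
    }
    return counts;
}
```

The frequency of a digit is then its count divided by the sum of all counts in the map, computed in floating point.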

Reading numbers then letters in a text file in C++

I'm trying to do something in C++ and I'm having a little bit of trouble figuring out exactly how to do it.
I have a text file that contains information that I need to parse and act on. The format of the file is multiple lines, each containing a number followed by a letter:
1234 A
5678 B
9101 C
What I need to do is, line by line, read the number and do a calculation based on it. Then I need to do an operation depending on the value of the letter on that line. Once both operations are finished, I repeat with the next line until all lines from the file have been parsed.
I've found some articles on how to read into strings line by line, but I can't quite figure out how to separate the lines into number and letter.
Any assistance is greatly appreciated!
The following example might help:
#include <iostream>
#include <fstream>
#include <string>
int main(int argc, char** argv)
{
std::ifstream fp("input.dat");
char ch;
int n;
while (fp >> n && fp >> ch)
{
std::cout << "Here is your number: " << n << std::endl;
std::cout << "Here is your char: " << ch << std::endl;
}
return 0;
}
In the above, the file input.dat contains your input. This example doesn't care if numbers and letters are separated by newlines, tabs, spaces or any other whitespace. If you really need to read a particular format you could look into fscanf or std::getline (http://www.cplusplus.com/reference/string/string/getline/).
using namespace std;
ifstream file("file.txt");
int number;
char character;
while( file >> number >> character ) {
// read each line as a number followed by a character
cout << number << " " << character << endl;
switch( character ) {
case 'A':
break;
case 'B':
break;
case 'C':
break;
}
}
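If each record really must sit on its own line, a line-oriented variant is possible: read the line with `std::getline`, then parse it with a `std::istringstream`. The sketch below uses a hypothetical helper `parse_record` and additionally rejects lines that have anything after the letter, which is an assumption about the format rather than something the question states:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses one line of the form "<number> <letter>".
// Returns false if either field is missing or if anything other than
// whitespace follows the letter.
bool parse_record(const std::string& line, int& number, char& letter)
{
    std::istringstream iss(line);
    if (!(iss >> number >> letter))
        return false;
    char extra;
    return !(iss >> extra);   // succeeds only when nothing else remains
}
```

A caller would loop with `for (std::string line; std::getline(file, line); )` and act on `number` and `letter` only when `parse_record` returns true.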

Reading a File through get, getline and read : C++

I am trying to explore the ifstream class and have written the code below, which reads a file Test.txt
'Test.txt' - Content
This is Line One
This is Line Two
This is Line Three
This is Line Four
This is Line Five
Code Written:
#include <iostream>
#include <fstream>
#include <limits>
using namespace std;
int main()
{
char buff[50];
char ch;
ifstream is("test.txt");
if (!is)
cout << "Unable to open " << endl;
while(is)
{
ch=(char)is.get();
if(ch != EOF)//If EOF is not checked then
//EOF converted as a char is displyed as
// last char of the file
cout << ch;
}
cout << "\n\n###########\n\n";
is.clear(); //clearing ios_base::eofbit which was set
//in previous action
is.seekg(0,ios_base::beg); //Going back to start of File
while(is)
{
is.get(buff,50,'\n');
cout << buff ;
cout << "\n--------------\n";
is.ignore(std::numeric_limits<std::streamsize>::max(),'\n');
//Flushing the is stream as '\n' was left by get fn
}
cout << "\n\n##############\n\n";
is.clear();
is.seekg(0,ios_base::beg);
while(!is.eof())
{
is.getline(buff,50,'\n');
cout << buff;
cout << "\n--------------\n";
//No need to flush the is stream as '\n'
//was extracted and discarded by getline
}
cout << "\n\n$$$$$$$$$$$$$$\n\n";
is.clear();
is.seekg(0,ios_base::end);
int size=is.tellg();
is.seekg(0,ios_base::beg);
cout << "size : " << size << endl;
//char* readBuff = (char *) ::operator new(sizeof(char)*size);
char* readBuff = new char[size];
is.read(readBuff,size);
cout << readBuff;
delete(readBuff);
is.close();
return 0;
}
OutPut:
Gaurav@Gaurav-PC /cygdrive/d/Trial
$ ./Trial
This is Line One
This is Line Two
This is Line Three
This is Line Four
This is Line Five
###########
This is Line One
--------------
This is Line Two
--------------
This is Line Three
--------------
This is Line Four
--------------
This is Line Five
--------------
##############
This is Line One
--------------
This is Line Two
--------------
This is Line Three
--------------
This is Line Four
--------------
This is Line Five
--------------
$$$$$$$$$$$$$$
size : 92
This is Line One
This is Line Two
This is Line Three
This is Line Four
This is Line Five▒u
There are some issues which I want to ask and get clarified:
1) When I use get as below
while(is)
{
is.get(buff,50,'\n');
cout << buff ;
// cout << "\n--------------\n";
is.ignore(std::numeric_limits<std::streamsize>::max(),'\n');
//Flushing the is stream as '\n' was left by get fn
}
i.e. I commented out cout << "\n--------------\n"; then the file is read as
###########
This is Line Fivee
i.e. it misses the first four lines and reads only the last one, with an extra 'e'. I am not able to figure out why.
2) When I use getline as below:
// while(!is.eof())
while(is)
{
is.getline(buff,50,'\n');
cout << buff;
cout << "\n--------------\n";
//No need to flush the is stream as '\n'
//was extracted and discarded by getline
}
i.e. I used while(is) instead of while(!is.eof()) - I got the output:
##############
This is Line One
--------------
This is Line Two
--------------
This is Line Three
--------------
This is Line Four
--------------
This is Line Five
--------------
--------------
i.e. after the last line I get two extra lines. Again, I am not able to figure out why.
3) With the read function, the size I am getting is 92, whereas the total number of characters in the file is 89, including EOF, spaces and '\n'. Also, the last line shows two garbage characters after reading the last character of the file. Why such behavior?
cout << "\n\n$$$$$$$$$$$$$$\n\n";
is.clear();
is.seekg(0,ios_base::end);
int size=is.tellg();
is.seekg(0,ios_base::beg);
cout << "size : " << size << endl;
//char* readBuff = (char *) ::operator new(sizeof(char)*size);
char* readBuff = new char[size];
is.read(readBuff,size);
cout << readBuff;
delete(readBuff);
OutPut:
$$$$$$$$$$$$$$
size : 92
This is Line One
This is Line Two
This is Line Three
This is Line Four
This is Line Five▒u
Thanks
EDIT:
As per the answer received from Mats Peterson, I tried the code below:
while(is.get(buff,50,'\n'))
{
cout << buff ;
//cout << "\n--------------\n";
is.ignore(std::numeric_limits<std::streamsize>::max(),'\n');
//Flushing the is stream as '\n' was left by get fn
}
cout << "\n\n##############\n\n";
is.clear();
is.seekg(0,ios_base::beg);
// while(!is.eof())
while(is.getline(buff,50,'\n'))
{
cout << buff;
//cout << "\n--------------\n";
//No need to flush the is stream as '\n'
//was extracted and discarded by getline
}
But got the output:
###########
This is Line Fivee
##############
This is Line Fivee
i.e. only the last line is read. If I uncomment //cout << "\n--------------\n"; I get proper reading.
@Down voters: please at least comment on what made you do so. I faced this issue, which is why I asked here to gain more insight from experts.
In the first two questions, the problem is that you are reading "one more than you have", which is a typical consequence of "the failure state is not set until we have tried to read past the end". This is why you should use
while(is.get(... ))
while(is.getline(...))
as the conditions for ending loops, because then the loop body will not run when the read fails.
The third issue arises because Windows uses "CR+LF" for newlines, and reading a file in text mode (which is the default) collapses each pair into a single newline character. So the size of your file according to is.tellg() is larger by one character for each newline than the data you actually read. You can use is.gcount() to see how many characters you ACTUALLY read (and if (!is.read(...)) actual = is.gcount(); else actual = size; should give you a complete piece of code).
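A sketch of the gcount() idea, written against any `std::istream` so it can be tried without a file; the function name `read_all` is invented here. Two further points are worth hedging in: printing a `char*` with `cout <<` expects a terminating '\0', which a buffer filled by `read` does not have, so stray characters after the last line can come from that alone; and memory obtained with `new char[size]` should be released with `delete[]`. Building a `std::string` from a pointer and a length sidesteps both:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Sketch: reads the whole stream and keeps only the characters actually read.
std::string read_all(std::istream& is)
{
    is.clear();                                 // drop a leftover eofbit/failbit
    is.seekg(0, std::ios_base::end);
    const std::streamoff size = is.tellg();     // may exceed what a text-mode read delivers
    is.seekg(0, std::ios_base::beg);
    if (size <= 0)
        return std::string();

    std::vector<char> buffer(static_cast<std::size_t>(size));
    is.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    const std::streamsize actual = is.gcount(); // characters really delivered
    is.clear();                                 // a short read sets eofbit and failbit

    // Constructing from (pointer, length) does not rely on a '\0' terminator.
    return std::string(buffer.data(), static_cast<std::size_t>(actual));
}
```

Opening the file with `std::ios::binary` is another option when the byte count from `tellg()` must match what `read` returns.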
One of the main reasons for reading an extra value or line (a "garbage" value, as I think of it) is the use of eof(), which you have used. When it is the loop condition, for example while reading characters from a file, the last character is read but the end-of-file state is not set yet, so the loop does not end at that point and reads an extra value. So the main thing I want to say is: avoid eof() as the condition of a reading loop, and test the result of the input operation itself instead.
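To make the loop idiom concrete, here is a small sketch (the name `collect_lines` is made up) that uses the result of `getline` as the loop condition. It also strips a trailing '\r' from each line. The "This is Line Fivee" output in the question would be consistent with each line still carrying a '\r' from a CR+LF file, so that lines printed without "\n" overwrite one another on the terminal; that is an inference from the output shown, not something the answers confirm:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Sketch: collects lines using the read itself as the loop condition, so the
// body never runs after a failed read. Lines of 50 characters or more would
// set failbit and end the loop early with this fixed-size buffer.
std::vector<std::string> collect_lines(std::istream& is)
{
    std::vector<std::string> lines;
    char buff[50];
    while (is.getline(buff, sizeof buff, '\n')) {
        std::string line(buff);
        if (!line.empty() && line.back() == '\r')
            line.pop_back();            // drop the CR of a CR+LF line ending
        lines.push_back(line);
    }
    return lines;
}
```

With the five-line file from the question this yields exactly five entries, with no extra empty line at the end.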

why does vector.size() read in one line too little?

When running the following code, the number of lines read is one less than there actually are (whether the input file is main itself or any other file).
Why is this, and how can I change it (besides just adding 1)?
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
// open text file for input
string file_name;
cout << "please enter file name: ";
cin >> file_name;
// associate the input file stream with a text file
ifstream infile(file_name.c_str());
// error checking for a valid filename
if ( !infile ) {
cerr << "Unable to open file "
<< file_name << " -- quitting!\n";
return( -1 );
}
else cout << "\n";
// some data structures to perform the function
vector<string> lines_of_text;
string textline;
// read in text file, line by line
while (getline( infile, textline, '\n' )) {
// add the new element to the vector
lines_of_text.push_back( textline );
// print the 'back' vector element - see the STL documentation
cout << "line read: " << lines_of_text.back() << "\n";
}
cout<<lines_of_text.size();
return 0;
}
The code you have is sound. Here's a small test case that might help:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
void read_lines(std::istream& input) {
using namespace std;
vector<string> lines;
for (string line; getline(input, line);) {
lines.push_back(line);
cout << "read: " << lines.back() << '\n';
}
cout << "size: " << lines.size() << '\n';
}
int main() {
{
std::istringstream ss ("abc\n\n");
read_lines(ss);
}
std::cout << "---\n";
{
std::istringstream ss ("abc\n123\n");
read_lines(ss);
}
std::cout << "---\n";
{
std::istringstream ss ("abc\n123"); // last line missing newline
read_lines(ss);
}
return 0;
}
Output:
read: abc
read:
size: 2
---
read: abc
read: 123
size: 2
---
read: abc
read: 123
size: 2
I think I have tracked down the source of your problem. In Code::Blocks, a completely empty file will report that there is 1 line in it (the current one) in the gizmo on the status bar at the bottom of the IDE. This means that were you actually to enter a line of text, it would be line 1. In other words, Code::Blocks will normally over-report the number of actual lines in a file. You should never depend on CB, or any other IDE, to find out info on files - that's not what they are for.
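Well, if your file ends with '\n', the empty line that follows that last newline is not pushed into the vector. If you want it to be there, change the loop to:
bool last_line_terminated = false;
while (getline( infile, textline, '\n' ))
{
// eof() is set here only if the line ended at end-of-file instead of at '\n'
last_line_terminated = !infile.eof();
// add the new element to the vector
lines_of_text.push_back( textline );
// print the 'back' vector element - see the STL documentation
cout << "line read: " << lines_of_text.back() << "\n";
}
// the file ended with '\n': add the empty line that follows it
if (last_line_terminated) lines_of_text.push_back( "" );
Note that gcount() cannot be used for this check: std::getline for std::string does not update it. Testing eof() after each successful read works instead, because it is set only when the line was ended by end-of-file rather than by the delimiter.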
Ok so here is an explanation that you will hopefully understand. Your code should work fine if the file we're talking about doesn't end with newline. But what if it does? Let's say it looks like this:
"line 1"
"line 2"
""
Or as a sequence of characters:
line 1\nline 2\n
This file has THREE lines -- the last one being empty but it's there. After calling getline twice, you've read all the characters from the file. The third call to getline will say oops, end of file, sorry no more characters so you'll see only two lines of text.