Checking whether a file contains only whitespace in C++

I am trying to read input from a file in C++, and I need to give an error message if there is no input.
This statement works if the file is completely empty:
if (f.peek() == std::ifstream::traits_type::eof()) return error("Empty file");
where error() is a simple function:
int error(string message){
cerr << "ERROR: " << message << "\n";
return -1;
}
How can I check for files that contain only whitespace so I can raise the same error? Such as 7 newlines?
My program should continue executing normally if the file contains anything other than whitespace characters.

Scanning the file in advance to check whether it contains only whitespace would be very inefficient.
I believe what you really need is to keep track of whether any non-whitespace data could be read at all, and raise the error if not:
std::string line;
std::vector<std::string> lines_with_data;
while(std::getline(f,line)) {
// keep the line only if it contains at least one non-whitespace character
// (an empty line has none, so it is skipped as well)
if(std::find_if_not(line.begin(),line.end(),
[](unsigned char ch){ return std::isspace(ch); }) != line.end()) {
lines_with_data.push_back(line);
}
}
if(lines_with_data.empty()) { // No data could be found
return error("Empty file");
}

Reading characters one at a time and checking them with std::isspace may help. Something like this:
char c;
bool isEmpty = true;
while (f.get(c))
{
if(!std::isspace(static_cast<unsigned char>(c)))
{
isEmpty = false;
break;
}
}
This way, isEmpty ends up true only if the file is empty or contains nothing but whitespace.

You could use a std::istream_iterator over the file input (with std::noskipws set on the stream, since the iterator otherwise skips whitespace) and check individual characters with std::isspace(), I suppose.
But I wonder if this is really what you want to do. Why not try to read whatever it is you want to read, and only worry about things when you fail your reading/parsing?

Related

Why is C++'s file I/O ignoring initial empty lines when reading a text file? How can I make it NOT do this?

I'm trying to build myself a mini programming language using my own custom regular expression and abstract syntax tree parsing library 'srl.h' (aka. "String and Regular-Expression Library") and I've found myself an issue I can't quite seem to figure out.
The problem is this: When my custom code encounters an error, it obviously throws an error message, and this error message contains information about the error, one bit being the line number from which the error was thrown.
The issue comes in the fact that C++ seems to just be flat out ignoring the existence of lines which contain no characters (ie. line that are just the CRLF) until it finds a line which does contain characters, after which point it stops ignoring empty lines and treats them properly, thus giving all errors thrown an incorrect line number, with them all being incorrect by the same offset.
Basically, if given a file which contains the contents "(crlf)(crlf)abc(crlf)def", it'll be read as though its content were "abc(crlf)def", ignoring the initial new lines and thus reporting the wrong line number for any and all errors thrown.
Here's a copy of the (very messily coded) function I'm using to get the text of a text file. If one of y'all could tell me what's going on here, that'd be awesome.
template<class charT> inline std::pair<bool, std::basic_string<charT>> load_text_file(const std::wstring& file_path, const char delimiter = '\n') {
std::ifstream fs(file_path);
std::string _nl = srl::get_nlp_string<char>(srl::newline_policy);
if (fs.is_open()) {
std::string s;
char b[SRL_TEXT_FILE_MAX_CHARS_PER_LINE];
while (!fs.eof()) {
if (s.length() > 0)
s += _nl;
fs.getline(b, SRL_TEXT_FILE_MAX_CHARS_PER_LINE, delimiter);
s += std::string(b);
}
fs.close();
return std::pair<bool, std::basic_string<charT>>(true, srl::string_cast<char, charT>(s));
}
else
return std::pair<bool, std::basic_string<charT>>(false, std::basic_string<charT>());
}
std::ifstream::getline() does not store the delimiter (in this case, '\n') in the string; it extracts it from the stream and discards it, which is why all the newlines from the file (including the leading ones) are dropped upon reading.
The reason it seems the program does not ignore newlines between other lines is because of:
if (s.length() > 0)
s += _nl;
All the newlines are really coming from here, but this cannot happen at the very beginning, since the string is empty.
This can be verified with a small test program:
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::ifstream inFile{ "test.txt" }; //(crlf)(crlf)(abc)(crlf)(def) inside
char line[80]{};
int lineCount{ 0 };
std::string script;
while (inFile.peek() != EOF) {
inFile.getline(line, 80, '\n');
lineCount++;
script += line;
}
std::cout << "***Captured via getline()***" << std::endl;
std::cout << script << std::endl; //prints "abcdef"
std::cout << "***End***" << std::endl << std::endl;
std::cout << "Number of lines: " << lineCount; //result: 5, so leading \n processed
}
If the if condition is removed, so the program has just:
s += _nl;
then newlines will be inserted in place of the ones discarded from the file, but as long as '\n' is the delimiter, std::ifstream::getline() will continue discarding the original ones.
As a final touch, I would suggest using
while (fs.peek() != EOF){};
instead of
while(fs){}; or while(!fs.eof()){};
If you look at int lineCount's final value in the test program, the latter two give 6 instead of 5, as they make a redundant iteration in the end.

Reading from a file without skipping whitespaces

I'm trying to write a program that replaces one given word in a file with another. The program copies word by word: if it's a normal word, it just writes it into the output file, and if it's the word I need to change, it writes the replacement instead. However, I've encountered a problem: the program does not put whitespace where it is in the input file. I don't know the solution to this problem, and I have no idea if I can use noskipws since I wouldn't know where the file ends.
Please keep in mind I'm a complete newbie and I have no idea how things work. I don't know if the tags are visible enough, so I will mention again that I use C++
Since each read of a word ends at either a whitespace character or the end of the file, you could simply check whether the thing that stopped your read is the end of the file or a whitespace character:
if ( reached the end of file ) {
// What I have encountered is end of file
// My job is done
} else {
// What I have encountered is a whitespace
// I need to output a whitespace and back to work
}
The problem, then, is how to check for EOF (end of file).
Since you are using ifstream, this is quite simple.
Once a read on an ifstream has hit the end of the file (all the meaningful data has been read), the ifstream::eof() function will return true.
Let's assume the ifstream instance that you have is called input.
if ( input.eof() == true ) {
// What I have encountered is end of file
// My job is done
} else {
// What I have encountered is a whitespace
// I need to output a whitespace and back to work
}
PS : ifstream::good() will return false when it reaches the eof or an error occurs. Checking whether input.good() == false instead can be a better choice here.
First, I would advise you not to read and write the same file (at least not while reading), because it makes your program much harder to write and read.
Second, if you want to keep all whitespace, the easiest way is to read whole lines with getline().
A program that copies text from one file to another while replacing a word could look something like the following:
void read_file()
{
ifstream file_read;
ofstream file_write;
// File from which you read some text.
file_read.open ("read.txt");
// File in which you will save modified text.
file_write.open ("write.txt");
string line;
// Word that you look for to modify.
string word_to_modify = "something";
string word_new = "something_new";
// You need to look in every line from input file.
// getline() goes from the beginning of the file to the end.
while ( getline (file_read,line) ) {
// find() returns string::npos when there is no match, so keep the full-width type.
string::size_type index = line.find(word_to_modify);
// Replace every occurrence of the target word in this line.
while (index != string::npos) {
line.replace(index, word_to_modify.length(), word_new);
index = line.find(word_to_modify, index + word_new.length());
}
cout << line << '\n';
file_write << line + '\n';
}
file_read.close();
file_write.close();
}

How do I know if the specified file has been read correctly?

Why does ifstream set the failbit to 1 after reading the last line of the specified file? How do I know if the specified file has been read correctly?
bool read_csv_file(const char* filename, vector<string>& lines, bool adding = false)
{
if( !adding ) lines.clear();
ifstream csvfile;
csvfile.open(filename);
if( csvfile.is_open() )
{
string line;
while( csvfile.good() && getline(csvfile,line) )
{
lines.push_back(line);
cout << "fail: " << csvfile.fail() << endl;
}
cout << "fail: " << csvfile.fail() << endl;
csvfile.close();
return (!csvfile.fail());
}
return false;
}
The fail bit is set after you run off the end of the file. Once that happens, you mustn't attempt to interpret the result of your input operation. That's perfectly fine, though, and getline will not set the fail bit while there's still any data to be read. So the following standard loop extracts all the lines:
for (std::string line; std::getline(csvfile, line); )
{
// process "line"
}
// all done
The only reason failbit could be set after reading the last
line (or any line) is if there were an error in the library, and
I don't really believe it. If failbit is set, it means that
you didn't read anything. In your case, it will never be set
when you're in the loop; if it were set, getline would have
evaluated to false, and you wouldn't have entered the loop.
And of course, the loop terminates precisely because getline
fails (or would fail—normally, you would not test for
good before doing input, and if you do, consider that the
failbit was set, regardless, if the test fails).
The usual pattern for this sort of thing is:
while ( someInput ) {
// ...
}
if ( csvfile.bad() ) {
// Serious error (disk read error, etc.)...
} else if ( ! csvfile.eof() ) {
// Formatting error...
} else {
// Normal end of file...
}
When someInput is std::getline(), however, you will never
fail because of a formatting error, so the else if above will
never be true (and a lot of code treats hard disk errors as if
they were an end of file, and so ignores the if part as well).
To check for erroneous reads, you must test badbit using stream.bad().
Failbit indicates that the operation itself failed, and getline does set it when it hits EOF without having extracted any characters (confirmed on my machine).

getline() reads an extra line

ifstream file("file.txt");
if(file.fail())
{
cout<<"Could not open the file";
exit(1);
}
else
{
while(file)
{
file.getline(line[l],80);
cout<<line[l++]<<"\n";
}
}
I am using a two dimensional character array to keep the text (more than one line) read from a file to count the number of lines and words in the file but the problem is that getline always reads an extra line.
Your code as I'm writing this:
ifstream file("file.txt");
if(file.fail())
{
cout<<"Could not open the file";
exit(1);
}
else
{
while(file)
{
file.getline(line[l],80);
cout<<line[l++]<<"\n";
}
}
The first time getline fails, you still increment the line counter and output the (non-existing) line.
Always check for an error.
extra advice: use std::string from the <string> header, and use its getline function.
cheers & hth.
The problem is when you're at the end of the file the test on file will still succeed because you have not yet read past the end of file. So you need to test the return from getline() as well.
Since you need to test the return from getline() to see if it succeeded, you may as well put it right in the while loop:
while (file.getline(line[l], 80))
cout << line[l++] << "\n";
This way you don't need a separate test on file and getline().
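Combined with the earlier advice to use std::string, counting lines and words could be sketched as below. The function name and the definition of a "word" (any whitespace-separated token) are assumptions made for this example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: returns {number of lines, number of words}.
std::pair<std::size_t, std::size_t> count_lines_and_words(std::istream& in) {
    std::size_t lines = 0;
    std::size_t words = 0;
    std::string line;
    std::string word;
    while (std::getline(in, line)) {       // the loop body runs only after a successful read
        ++lines;
        std::istringstream fields(line);
        while (fields >> word) {           // operator>> splits on whitespace
            ++words;
        }
    }
    return {lines, words};
}
```

Since the read is the loop condition, no extra line is counted whether or not the file ends with a newline.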
This will solve your problem:
ifstream file("file.txt");
if(!file.good())
{
cout<<"Could not open the file";
exit(1);
}
else
{
while(file)
{
file.getline(line[l],80);
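if(file) // print only if getline() succeeded; testing eof() here would drop a last line that has no trailing newline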
cout<<line[l++]<<"\n";
}
}
It's more robust.
Does the file end with a newline? If it does, the EOF flag will not be triggered until one extra loop passes. For example, if the file is
abc\n
def\n
Then the loop will be run 3 times, the first time it will get abc, the second time it will get def and the third time it will get nothing. That's probably why you see an additional line.
Try checking the failbit on the stream AFTER the getline.
Only do the cout if the getline() call succeeded, i.e. if file.fail() is false afterwards. The extra line you're seeing comes from the last call to file.getline(), which reads past the end of the file.

getline and file handling

I want to read the first lines of 2 separate files and then compare them. The following is the code I use, but it gives me an "istream to string" error. Do I need to use a while condition to start reading the files first?
ifstream data_real(filename.c_str()); /*input streams to check if the flight info
are the same*/
ifstream data_test("output_check.txt");
string read1, read2;
string first_line_input = getline(is,read1);
string first_line_output_test = getline(data_test,read2);
string test_string1, test_string2;
int num_lines_output_test, num_lines_input;
if((first_line_input.substr(0,3)==first_line_output_test.substr(0,3)))
{
while(!data_test.eof()) // count the number of lines for the output test file with the first flight info
{
getline(data_test,test_string1);
num_lines_output_test++;
}
while(getline(is,test_string2)) // count the number of lines for the output test file with the first flight info
{
if(test_string2.substr(0,3)!="ACM")
num_lines_input++;
else
break;
}
}
getline(istream, string) returns a reference to the istream, not a string.
So, comparing the first line of each file could be something like:
string read1, read2;
if (!(getline(is,read1) && getline(data_test,read2))){
// Reading failed
// TODO: Handle and/or report error
}
else{
if(read1.substr(0,3) == read2.substr(0,3)){
//...
Also: Never use eof() as a termination condition for a stream reading loop. The idiomatic way to write it is:
while(getline(data_test,test_string1)) // count the number of lines for the output test file with the first flight info
{
num_lines_output_test++;
}
Try adding this helper function:
std::string next_line(std::istream& is) {
std::string result;
if (!std::getline(is, result)) {
throw std::ios::failure("Failed to read a required line");
}
return result;
}
Now you can use lines from the file the way you want (i.e. to initialize strings, rather than modify them):
string first_line_input = next_line(is);
string first_line_output_test = next_line(data_test);