Reading from a file without skipping whitespaces - c++

I'm trying to write a program that finds one given word in a file and replaces it with another. The program copies the input word by word: if it's an ordinary word it just writes it to the output file, and if it's the word I need to change it writes the replacement instead. However, I've encountered a problem. The program is not putting whitespace where it was in the input file. I don't know the solution to this problem, and I have no idea if I can use noskipws since I wouldn't know where the file ends.
Please keep in mind I'm a complete newbie and I have no idea how things work. I don't know if the tags are visible enough, so I will mention again that I use C++

Since each read of a word is ended by either whitespace or the end of the file, you can simply check whether the thing that stopped the read was the end of the file, or otherwise whitespace:
if ( reached the end of file ) {
// What I have encountered is end of file
// My job is done
} else {
// What I have encountered is a whitespace
// I need to output a whitespace and back to work
}
The remaining question is how to detect the end of file (EOF).
Since you are using an ifstream, this is quite simple.
Once an ifstream has tried to read past the end of the file (all the meaningful data has been read), ifstream::eof() returns true.
Let's assume the ifstream instance that you have is called input.
if ( input.eof() == true ) {
// What I have encountered is end of file
// My job is done
} else {
// What I have encountered is a whitespace
// I need to output a whitespace and back to work
}
PS : ifstream::good() will return false when it reaches the eof or an error occurs. Checking whether input.good() == false instead can be a better choice here.
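The idea above can also be sketched by reading one character at a time, which never skips whitespace and ends naturally when a read fails. This is only a minimal sketch: the function name replace_word and its parameters are made up for illustration, and "word" is assumed to mean a run of non-whitespace characters.

```cpp
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>  // std::istringstream / std::ostringstream for trying it without files
#include <string>

// Hypothetical helper: copies `in` to `out`, replacing every whole word equal
// to `from` with `to` and writing each whitespace character exactly as read.
void replace_word(std::istream& in, std::ostream& out,
                  const std::string& from, const std::string& to)
{
    std::string word;
    char c;
    // get(c) does not skip whitespace; the loop ends when a read fails (EOF).
    while (in.get(c)) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            // Whitespace ends the current word: write the word, then the whitespace.
            out << (word == from ? to : word) << c;
            word.clear();
        } else {
            word += c;
        }
    }
    // Flush the last word if the input did not end with whitespace.
    out << (word == from ? to : word);
}
```

The same function works with an ifstream and an ofstream, since both derive from the stream types used in the signature.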

First, I would advise you not to read and write the same file (at least not while you are still reading it), because that makes the program much harder to write and to read.
Second, if you want to keep all the whitespace, the easiest way is to read a whole line at a time with getline().
A program that replaces words while copying one file to another could look something like this:
void read_file()
{
ifstream file_read;
ofstream file_write;
// File from which you read some text.
file_read.open ("read.txt");
// File in which you will save modified text.
file_write.open ("write.txt");
string line;
// Word that you look for to modify.
string word_to_modify = "something";
string word_new = "something_new";
// You need to look in every line from input file.
// getline() reads one line per call until the input is exhausted.
while ( getline (file_read,line) ) {
string::size_type index = line.find(word_to_modify);
// Replace every occurrence of the target word in this line.
while (index != string::npos) {
line.replace(index, word_to_modify.length(), word_new);
index = line.find(word_to_modify, index + word_new.length());
}
cout << line << '\n';
file_write << line + '\n';
}
file_read.close();
file_write.close();
}
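The replacement loop in that program can be pulled out into a small function, which makes it easier to test on its own. This is a sketch only; replace_all is a made-up name, and like the program above it replaces substrings, so a match inside a longer word is replaced too.

```cpp
#include <string>

// Hypothetical helper: returns `line` with every occurrence of `from`
// replaced by `to`. All other characters, including whitespace, are kept.
std::string replace_all(std::string line, const std::string& from,
                        const std::string& to)
{
    if (from.empty()) return line;  // an empty pattern would never advance
    std::string::size_type index = line.find(from);
    while (index != std::string::npos) {
        line.replace(index, from.length(), to);
        // Continue after the inserted text so it is not matched again.
        index = line.find(from, index + to.length());
    }
    return line;
}
```

Comparing against std::string::npos avoids relying on how the "not found" value behaves when stored in a narrower unsigned type.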

Related

Read a file line by line in C++

I wrote the following C++ program to read a text file line by line and print out the content of the file line by line. I entered the name of the text file as the only command line argument into the command line.
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
char buf[255] = {};
if (argc != 2)
{
cout << "Invalid number of files." << endl;
return 1;
}
ifstream f(argv[1], ios::in | ios::binary);
if (!f)
{
cout << "Error: Cannot open file." << endl;
return 1;
}
while (!f.eof())
{
f.get(buf,255);
cout << buf << endl;
}
f.close();
return 0;
}
However, when I ran this code in Visual Studio, the Debug Console was completely blank. What's wrong with my code?
Apart from the errors mentioned in the comments, the program has a logical error because istream& istream::get(char* s, streamsize n) does not do what you (or I, until I debugged it) thought it does. Yes, it reads to the next newline; but it leaves the newline in the input!
The next time you call get(), it will see the newline immediately and return with an empty line in the buffer, for ever and ever.
The best way to fix this is to use the appropriate function, namely istream::getline() which extracts, but does not store the newline.
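The difference between the two functions can be observed by looking at which character is waiting in the stream after the first line has been read. The two helper functions below are invented for this demonstration and read from a string instead of a file.

```cpp
#include <sstream>
#include <string>

// Hypothetical demo: returns the character that is next in the stream after
// reading the first line with get(char*, n).
int next_after_get(const std::string& text)
{
    std::istringstream in(text);
    char buf[255];
    in.get(buf, sizeof buf);      // stops before '\n' and leaves it in the stream
    return in.peek();
}

// Same, but reading the first line with getline(char*, n).
int next_after_getline(const std::string& text)
{
    std::istringstream in(text);
    char buf[255];
    in.getline(buf, sizeof buf);  // extracts '\n' but does not store it
    return in.peek();
}
```

For the input "ab\ncd\n", the first function reports '\n' and the second reports 'c'.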
The EOF issue
is worth mentioning. The canonical way to read lines (if you want to write to a character buffer) is
while (f.getline(buf, bufSz))
{
cout << buf << "\n";
}
getline() returns a reference to the stream which in turn has a conversion function to bool, which makes it usable in a boolean expression like this. The conversion is true if input could be obtained. Interestingly, it may have encountered the end of file, and f.eof() would be true; but that alone does not make the stream convert to false. As long as it could extract at least one character it will convert to true, indicating that the last input operation made input available, and the loop will work as expected.
The next read after encountering EOF would then fail because no data could be extracted: After all, the read position is still at EOF. That is considered a read failure. The condition evaluates to false and the loop is exited, which was exactly the intent.
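The same loop shape works with std::string, which removes the fixed buffer altogether. A minimal sketch (read_lines is a made-up name) that also shows a last line without a trailing newline is still delivered:

```cpp
#include <istream>
#include <sstream>  // std::istringstream for trying it without a file
#include <string>
#include <vector>

// Hypothetical helper: collects every line of `in`. The loop condition is the
// stream itself, so the body runs only when a line was actually extracted.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

With the input "first\nsecond" the second read hits the end of the input and sets eofbit, yet the loop body still runs for "second"; only the following read fails and ends the loop.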
The buffer size issue
is worth mentioning, as well. The standard draft says in 30.7.4.3:
Characters are extracted and stored until one of the following occurs:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
traits::eq(c, delim) for the next available input character c (in which case the input character is extracted but not stored);
n is less than one or n - 1 characters are stored (in which case the function calls setstate(failbit)).
The conditions are tested in that order, which means that if n-1 characters have been stored and the next character is a newline (the default delimiter), the input was successful (and the newline is extracted as well).
This means that if your file contains a single line 123 you can read that successfully with f.getline(buf, 4), but not a line 1234 (both may or may not be followed by a newline).
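The 123 versus 1234 case can be checked with a few lines of code. The helper below is hypothetical and uses a string stream in place of a file; the internal buffer size of 16 is an arbitrary choice for the demonstration.

```cpp
#include <sstream>
#include <string>

// Hypothetical demo: reports whether istream::getline() succeeds when asked
// to read the first line of `text` into a buffer limited to `n` chars.
bool line_fits(const std::string& text, std::streamsize n)
{
    char buf[16] = {};
    if (n > 16) return false;  // never let getline write past the demo buffer
    std::istringstream in(text);
    // The stream converts to false only if failbit (or badbit) was set.
    return static_cast<bool>(in.getline(buf, n));
}
```

With n equal to 4, the lines 123 and 123 followed by a newline both succeed, while 1234 sets failbit because a fourth character remains before any delimiter is seen.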
The line ending issue
Another complication here is that on Windows a file created with a typical editor will have a hidden carriage return before the newline, i.e. a line actually looks like "123\r\n" ("\r" and "\n" each being a single character with the values 13 and 10, respectively). Because you opened the file with the binary flag the program will see the carriage return; all lines will contain that "invisible" character, and the number of visible characters fitting in the buffer will be one shorter than one would assume.
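One common way to deal with this, sketched here with a made-up helper name, is to drop a trailing carriage return after each line has been read:

```cpp
#include <string>

// Hypothetical helper: removes a single trailing '\r', as left behind when a
// file with Windows line endings is read in binary mode.
void strip_trailing_cr(std::string& line)
{
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
}
```

Lines that do not end in a carriage return are left unchanged, so the call is safe for files with Unix line endings too.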
The console issue ;-)
Oh, and your Console was not entirely empty; it's just that modern computers are too fast and the first line which was probably printed (it was in my case) scrolled away faster than anybody could switch windows. When I looked closely there was a cursor in the bottom left corner where the program was busy printing line after line of nothing ;-).
The conclusion
Debug your programs. It's very easy with VS.
Use getline(istream, string).
Use the return value of input functions (typically the stream) as a boolean in a while loop: "As long as you can extract any input, use that input."
Beware of line ending issues.
Consider C I/O (printf, scanf) for anything non-trivial (I didn't discuss this in my answer but I think that's what many people do).

std::getline is reading line where specified delimiter is not present?

I want to print each object in console from the array of the following string (stored in a file):
{ beforechars [{Object1},{Object2},{Object3}] afterchars }
I'm doing it as follows:
std::ifstream is("content.txt");
std::getline(is, content, '[');
while (std::getline(is,content,'{')) {
std::getline(is,content,'}');
std::cout << content << std::endl;
}
is.close();
But i am getting this output:
Object1
Object2
Object3
] afterchars }
My understanding is that after Object3 iteration, the ifstream should have "}] afterchars }" and the while's guard shouldn't be true because there isn't any '{' char... Am i right? Where is the mistake?
The while condition doesn't work as you expect: getline() reads successfully up to the next '{', or up to the end of the file if there is none.
So what happens here?
When you've displayed Object3, your position in the stream is just after its closing '}'.
The getline() in the while condition reads all the rest of the file into content, because it encounters no '{'. As it could read something successfully, the condition evaluates to true.
The getline() within the while block then fails to read anything, so content remains unchanged. The stream is now in a fail state, and no subsequent operation will succeed until you clear that state. But nothing visible happens yet in your code.
After this last result is displayed, the next loop condition fails.
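The behaviour of the loop condition can be reproduced in isolation. The struct and function below are invented for this demonstration; they run a single std::getline with '{' as the delimiter on a string and report what happened.

```cpp
#include <sstream>
#include <string>

// Hypothetical result type for the demonstration.
struct ReadResult {
    bool ok;           // did std::getline report success?
    bool at_eof;       // was eofbit set afterwards?
    std::string text;  // what was read
};

// Reads from `input` up to the first '{', as the while condition does.
ReadResult read_until_brace(const std::string& input)
{
    std::istringstream is(input);
    ReadResult r;
    r.ok = static_cast<bool>(std::getline(is, r.text, '{'));
    r.at_eof = is.eof();
    return r;
}
```

For the input "] afterchars }" the read succeeds even though no '{' exists: the whole remainder is returned and eofbit is set. That combination, success plus eof(), is what tells you the delimiter was never found.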
Simple workaround:
A very easy workaround would be to keep the current position in the stream before looking for '{', and in case it was not found, go back to this position. Attention: this way of parsing files is not so nice from point of view of performance, but it's ok for small files.
std::getline(is, content, '[');
auto pos = is.tellg(); // read current position
while (std::getline(is,content,'{') && !is.eof()) {
std::getline(is,content,'}');
pos = is.tellg(); // update position before iterating again
std::cout << content << std::endl;
}
is.seekg(pos); // and get back to last position
The trick here is that if '{' is not found, after the getline() the stream is not yet in fail state, but eof() is already true. We can then end the loop and go back to the last recorded position.
std::getline reads characters until it finds the delimiter (which it consumes) or reaches the end of the stream. It sets failbit on the stream only if no characters were consumed (for example when called on an empty or invalid stream).
So your loop terminates only when the stream is empty.
The stream interface only lets you look at the next character; there is no way to scan ahead and read only if a specific character is present.
If you need random access to the characters, read the input into a string and then parse it (with regular expressions or something else).
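Parsing from a string could look like the sketch below. The function name extract_objects is made up, and the code assumes the format shown in the question: objects are not nested and all of them sit between one '[' and the following ']'.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the text between each '{' and '}' pair found
// inside the first [...] section of `text`. Nested braces are not handled.
std::vector<std::string> extract_objects(const std::string& text)
{
    std::vector<std::string> objects;
    const std::string::size_type first = text.find('[');
    if (first == std::string::npos) return objects;
    // `last` may be npos if there is no closing ']'; then the rest is searched.
    const std::string::size_type last = text.find(']', first);

    std::string::size_type open = text.find('{', first);
    while (open != std::string::npos && open < last) {
        const std::string::size_type close = text.find('}', open + 1);
        if (close == std::string::npos || close > last) break;
        objects.push_back(text.substr(open + 1, close - open - 1));
        open = text.find('{', close + 1);
    }
    return objects;
}
```

Because find() reports a missing character with npos instead of consuming input, the trailing "] afterchars }" is never mistaken for an object.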

Searching for char from end of file using seekg() c++

I have a file and I want to only output the last line to the console.
Here are my thoughts on how I would do it. Use file.seekg(0, ios::end) to put myself at the end of the file.
Then, I could create a decrement variable int decrement = -1; and use a while loop
while (joke.peek() != '\n')
{
decrement--;
}
and get the starting position for my final line (going backwards from the end).
Knowing this, joke.seekg(decrement, ios::end); should set me to the beginning of the final line, and assuming I previously declared string input;
I would think that
getline(joke, input);
cout << input << endl;
would output it to the console.
Full code
void displayLastLine(ifstream &joke)
{
string input;
int decrement = -1;
joke.clear();
joke.seekg(0, ios::end);
while (joke.peek() != '\n')
{
decrement--;
}
joke.clear();
joke.seekg(decrement, ios::end);
getline(joke, input);
cout << input << endl;
}
The problem is, when I go to call this method for the file, nothing happens. When I step through it, the decrement just keeps on subtracting one, far beyond where a '\n' would be. To give an example of a text file, it would look something like this:
junk
garbage
skip this line
This is the line that we're looking for!
joke.seekg(0, ios::end);
This positions the file at the end.
while (joke.peek() != '\n')
Well, here's problem #1. When you're at the end of the file, peek() always returns EOF.
decrement--;
You write:
When I step through it, the decrement just keeps on subtracting one,
Well, what did you expect to happen, since that's the only thing the loop does? The only thing your while loop does is subtract 1 from decrement. So that's what happens.
This is a common problem: a computer does only what you tell it to do, instead of what you want it to do.
Although this is not optimal, your missing step is that before you peek(), you need to seekg() back by one character. Then peek() shows you the character at the current position. Then seekg() back by one more character, check peek() again, and so on.
But that still will not be sufficient for you. Most text files end with a newline character. That is, a newline is the last character in the file. So, even if you add back the missing seek(), in nearly all cases, what your code will end up doing is finding the last character in the file, the final newline character.
My recommendation for you is to stop writing code for a moment, and, instead, come up with a logical process for doing what you want to do, and describe this process in plain words. Then, discuss your proposed course of action with your rubber duck. Only after your rubber duck agrees that what you propose will work, then translate your plain language description into code.
peek does not move the file pointer, it reads the character without extracting it. So, you are constantly peeking the same value (character) and ending up in an infinite loop.
What you would need to do is something like:
while (joke.peek() != '\n')
{
joke.seekg(-1, ios::cur);
}
That would put the input position at the \n, using the 2nd overload of seekg.
Please note that this is not a perfect solution. You need to check for errors and boundary conditions, but it explains your observed behaviour and gives you something to start fixing your code.
Your loop is actually only decrementing "decrement" and not using it to make the next search.
while (joke.peek() != '\n')
{
joke.seekg(decrement, std::ios::end);
decrement--;
}
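Putting the points from these answers together (seek before each peek, skip the newline that ends the file, stop at the beginning), one possible version is sketched below. It is not the only way to do it; the name last_line is made up, and it assumes a stream where byte offsets are meaningful, such as a file opened in binary mode or a string stream.

```cpp
#include <istream>
#include <sstream>  // std::istringstream for trying it without a file
#include <string>

// Hypothetical helper: returns the last line of a seekable stream.
std::string last_line(std::istream& in)
{
    in.clear();
    in.seekg(0, std::ios::end);
    std::streamoff pos = in.tellg();   // size of the stream, or -1 on failure
    if (pos <= 0) return std::string();

    --pos;                             // offset of the final character
    in.seekg(pos, std::ios::beg);
    if (in.peek() == '\n') --pos;      // ignore one newline that ends the file

    // Step backwards until a newline is found or the beginning is passed.
    while (pos >= 0) {
        in.seekg(pos, std::ios::beg);
        if (in.peek() == '\n') break;
        --pos;
    }

    in.clear();
    in.seekg(pos + 1, std::ios::beg);  // first character of the last line
    std::string line;
    std::getline(in, line);
    return line;
}
```

With Windows line endings read in binary mode, the returned line would still end in a carriage return. Seeking once per character is slow for large files, so reading a block from the end and searching it in memory would be the usual refinement.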

seekg() not working as expected

I have a small program, that is meant to copy a small phrase from a file, but it appears that I am either misinformed as to how seekg() works, or there is a problem in my code preventing the function from working as expected.
The text file contains:
//Intro
previouslyNoted=false
The code is meant to copy the word "false" into a string
std::fstream stats("text.txt", std::ios::out | std::ios::in);
//String that will hold the contents of the file
std::string statsStr = "";
//Integer to hold the index of the phrase we want to extract
int index = 0;
//COPY CONTENTS OF FILE TO STRING
while (!stats.eof())
{
static std::string tempString;
stats >> tempString;
statsStr += tempString + " ";
}
//FIND AND COPY PHRASE
index = statsStr.find("previouslyNoted="); //index is equal to 8
//Place the get pointer where "false" is expected to be
stats.seekg(index + strlen("previouslyNoted=")); //get pointer is placed at 24th index
//Copy phrase
stats >> previouslyNotedStr;
//Output phrase
std::cout << previouslyNotedStr << std::endl;
But for whatever reason, the program outputs:
=false
What I expected to happen:
I believe that I placed the get pointer at the 24th index of the file, which is where the phrase "false" begins. Then the program would've inputted from that index onward until a space character would have been met, or the end of the file would have been met.
What actually happened:
For whatever reason, the get pointer started an index before expected. And I'm not sure as to why. An explanation as to what is going wrong/what I'm doing wrong would be much appreciated.
Also, I do understand that I could simply make previouslyNotedStr a substring of statsStr, starting from where I wish, and I've already tried that with success. I'm really just experimenting here.
The VisualC++ tag means you are on windows. On Windows the end of line takes two characters (\r\n). When you read the file in a string at a time, this end-of-line sequence is treated as a delimiter and you replace it with a single space character.
Therefore, after you read the file, your statsStr does not match the contents of the file. Everywhere there is a newline in the file you have replaced two characters with one. Hence when you use seekg to position yourself in the file based on offsets you got from the statsStr string, you end up in the wrong place.
Even if you get the new line handling correct, you will still encounter problems if the file contains two or more consecutive white space characters, because these will be collapsed into a single space character by your read loop.
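If the goal of the experiment is to have string indices that line up with stream offsets, the contents have to be copied without any whitespace handling at all. A minimal sketch, assuming the stream delivers the bytes unchanged (for a file that means opening it in binary mode); slurp is a made-up name:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: copies everything that is left in `in` into a string,
// byte for byte, by streaming the underlying buffer.
std::string slurp(std::istream& in)
{
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

For the contents "//Intro\r\npreviouslyNoted=false", find("previouslyNoted=") on the resulting string returns 9 rather than 8, because the two line-ending characters are both still present, and seeking to 9 plus the key length lands on the f of false.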
You are reading the file word by word. There are better methods:
while (getline(stats, statsStr))
{
// An entire line is read into statsStr.
std::string::size_type posn = statsStr.find("previouslyNoted=");
// ...
}
By reading entire text lines into a string, there is no need to reposition the file.
Also, there is a white-space issue when reading by word. This will affect where you think the text is in the file. For example, white space is skipped, and there is no telling how many spaces, newlines or tabs were skipped.
By the way, don't even think about replacing the text in the same file. Replacement of text only works if the replacement text has the same length as the original text in the file. Write to a new file instead.
Edit 1:
A better method is to declare your key strings as arrays. This helps with computing positions within a string:
static const char key_text[] = "previouslyNoted=";
while (getline(stats, statsStr))
{
std::string::size_type key_position = statsStr.find(key_text);
std::string::size_type value_position = key_position + sizeof(key_text) - 1; // -1 excludes the nul terminator.
// value_position points to the character after the '='.
// ...
}
You may want to save programming time by making your data file conform to an existing format, such as INI or XML, and using an appropriate library to parse it.
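The line-by-line approach from this answer can be completed into a small lookup function. This is a sketch under the assumption that each key=value pair sits on its own line; value_for_key is a made-up name.

```cpp
#include <istream>
#include <sstream>  // std::istringstream for trying it without a file
#include <string>

// Hypothetical helper: returns the text that follows `key` on the first line
// containing it, or an empty string if no line contains the key.
std::string value_for_key(std::istream& in, const std::string& key)
{
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type pos = line.find(key);
        if (pos != std::string::npos) {
            std::string value = line.substr(pos + key.length());
            // Drop a carriage return left over from a Windows line ending.
            if (!value.empty() && value.back() == '\r') value.pop_back();
            return value;
        }
    }
    return std::string();
}
```

No seekg is involved, so differences between file offsets and string indices no longer matter.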

.get function with an opened file stream in C++

I have an open input file stream. It is able to open the other file (a txt file) successfully. And by making adjustments to the code right below I can get it to read and output the other txt file (all ASCII characters, just letters) just fine. However, I was playing around with the function below. It results in only one line being read, when there are in fact three lines. I want to know why. The size of the array is not the problem, i.e., making it larger does not seem to fix anything.
void DispFile(fstream& inFile)
{
char fileChar[256];
while (inFile.get(fileChar,256))
{
cout << fileChar;
}
}
Here is the code that WORKS:
void DispFile(fstream& inFile)
{
char fileChar[256];
while (inFile.getline(fileChar,256))
{
cout << fileChar;
cout << endl;
}
}
OR
void DispFile(fstream& inFile)
{
char file;
while (inFile.get(file))
{
cout << file;
}
}
So why does using inFile.get(array, dimension) result in only one line being read, while the others work like a charm (so to speak).
In the first version, .get(array, size) extracts characters until it reaches the delimiting character. By default this is a newline, '\n'. However, once it reaches this character, it does not extract it from the input stream but leaves it for the next input attempt. Therefore the next time you call get() it finds the newline left by the previous get() and immediately stops.
.getline() works because it extracts the newline, and .get(char) works because it simply gets one character at a time until the end of the file.
get(char*, int) reads until the delimiter ('\n') but does not extract it, so it gets "stuck" on it. getline removes it.
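If get(char*, n) has to be used anyway, the newline it leaves behind can be removed explicitly before the next call. The sketch below uses a made-up function name and returns the pieces instead of printing them. It has one known limitation: an empty line makes get() store nothing, which sets failbit and ends the loop early.

```cpp
#include <istream>
#include <sstream>  // std::istringstream for trying it without a file
#include <string>
#include <vector>

// Hypothetical helper: reads with get(char*, n) and discards the newline that
// get() leaves in the stream. Lines longer than the buffer arrive in pieces.
std::vector<std::string> read_with_get(std::istream& in)
{
    std::vector<std::string> pieces;
    char buf[256];
    while (in.get(buf, sizeof buf)) {
        pieces.push_back(buf);
        // get() stopped either at a newline or because the buffer was full;
        // only in the first case is there a delimiter to throw away.
        if (in.peek() == '\n') {
            in.ignore();
        }
    }
    return pieces;
}
```

In practice getline() remains the simpler choice, since it handles the delimiter and empty lines without extra code.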