How to read the substitution character (SUB in ASCII) with ifstream in C++?

I am having a hard time finding out why I can't read all characters with the fstream get function.
My code is the following:
ifstream input_stream(input_filename.c_str(), ios::in);
string input;
if(input_stream)
{
char character;
while(input_stream.get(character))
{
input += character;
}
input_stream.close();
}
else
cerr << "Error" << endl;
Testing a little, I found that the problem occurs when the character is 26 (SUB in ASCII): on reaching it, input_stream.get(character) returns false and the while loop exits.
I would like my string input to contain all characters from the file, including SUB.
I tried the getline function at first and had a similar problem.
Could you help me, please?

You need to open the stream in binary mode, not text mode. SUB ('\x1a', that is 26) is a control character in ASCII or UTF-8, not a printable one, and the text mode of some platforms (notably Windows) treats it as an end-of-file marker. Use ios::binary at opening time:
ifstream input_stream(input_filename.c_str(), ios::in | ios::binary);
Maybe you would then code
do {
int c= input_stream.get();
if (c==std::char_traits<char>::eof()) break;
input += (char)c;
} while (!input_stream.fail());
Did you consider using std::getline to read an entire line, assuming the input file is still organized in ('\n' terminated) lines?
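As a minimal sketch of the binary-mode approach above, the whole file can also be pulled into a string in one step. The helper name read_all_bytes is hypothetical and not from the original post; the sketch assumes the file fits in memory.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Reads every byte of a file into a string, including control characters
// such as SUB (0x1A). Returns an empty string if the file cannot be opened.
// read_all_bytes is a hypothetical helper name used for illustration.
std::string read_all_bytes(const std::string& path) {
    // ios::binary disables any text-mode translation the platform applies.
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        return std::string();
    }
    // istreambuf_iterator reads unformatted characters from the stream buffer,
    // so nothing is skipped or interpreted.
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling read_all_bytes(input_filename) would then replace the manual loop, at the cost of not distinguishing an empty file from one that failed to open.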

Related

Find and Replace a string in a text file and output to another file

I'm trying to write a program that can open a text file, find a certain string and substitute it with another string and then write the altered text to an output file.
This is what I've coded so far. It works fine, except that the output file is missing spaces and newline characters.
I need to preserve all spaces and new line characters. How do I do it?
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string search = "HELLO"; //String to find
string replace = "GOODBYE"; //String that will replace the string we find
string filename = ""; //User-provided filename of the input file
string temp; //temp variable for our loop to hold the characters from the file stream
char c;
cout << "Input filename? ";
cin >> filename;
ifstream filein(filename); //File to read from
ofstream fileout("temp.txt"); //Temporary file
if (!fileout || !filein) //if either file is not available
{
cout << "Error opening " << filename << endl;
return 1;
}
while (filein >> temp) //While the stream continues
{
if (temp == search) //Check if the temp variable has captured the string we are looking for
{
temp = replace; //When we found the string, we substitute it with the replacement string
}
fileout << temp; //Dump everything to fileout (our temp.txt file)
}
//Close our file streams
filein.close();
fileout.close();
return 0;
}
UPDATE:
I followed your advice and did the following, but now it doesn't work at all (the previous code worked fine, except for white spaces). Could you kindly tell me what I'm doing wrong here?
Thank you.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string search = "or"; //String to find
string replace = "OROROR"; //String that will replace the string we find
string filename = ""; //User-provided filename of the input file
string temp = ""; //temp variable for our loop to hold the characters from the file stream
char buffer;
cout << "Input filename? ";
cin >> filename;
ifstream filein(filename); //File to read from
ofstream fileout("temp.txt"); //Temporary file
if (!fileout || !filein) //if either file is not available
{
cout << "Error opening " << filename << endl;
return 1;
}
while (filein.get(buffer)) //While the stream continues
{
if (buffer == ' ') //check if space
{
if (temp == search) //if matches pattern,
{
temp = replace; //replace with replace string
}
}
temp = string() + buffer;
for (int i = 0; temp.c_str()[i] != '\0'; i++)
{
fileout.put(temp.c_str()[i]);
}
return 0;
}
}
while (filein >> temp)
This temp variable is a std::string. The formatted extraction operator, >>, overload for a std::string skips all whitespace characters (spaces, tabs, newlines) in the input and completely discards them. This formatted extraction operator discards all whitespace until the first non-whitespace character, then extracts it and all following non-whitespace characters and places them into your std::string, which is this temp variable. This is how it works.
Subsequently:
fileout << temp;
This then writes out this string to the output. There's nothing in the shown code that tells your computer to copy all whitespace from the input to the output, as is. The only thing that the shown code does is extract every sequence of non-space characters from the input file, immediately throwing on the floor all spaces and newlines, never to be seen again; and then write what's left (with the appropriate changes) to the output file. And a computer will always do exactly what you tell it to do, and not what you want it to do.
while (filein >> temp)
This is where all spaces in the input file get thrown in the trash and discarded. Therefore, if you wish to preserve them and copy them to the output file as is, you will have to replace this.
There are several approaches that can be used here. The simplest solution is to simply read the input file one character at a time. If it's not a whitespace character, add it to the temp buffer. If it's a whitespace character, and temp is not empty, then you've just read a complete word; check if it needs replacing; write it out to the output file; clear the temp buffer (in preparation for reading the next word); and then manually write the just-read whitespace character to the output file. In this manner you will copy the input to the output, one character at a time, including spaces, but buffering non-space character into the temp buffer, until each complete word gets read, before copying it to the output file. And you will also need to handle the edge case of handling the very last word in the file, without any trailing whitespace.
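The character-by-character approach described above could be sketched roughly as follows. The names replace_words and replace_words_in_file are hypothetical, and the sketch assumes that a "word" means any run of non-whitespace characters, so "HELLO," with a trailing comma would not match "HELLO".

```cpp
#include <cctype>
#include <fstream>
#include <istream>
#include <ostream>
#include <string>

// Copies `in` to `out` one character at a time, replacing every
// whitespace-delimited word equal to `search` with `replace`.
// replace_words is a hypothetical helper name used for illustration.
void replace_words(std::istream& in, std::ostream& out,
                   const std::string& search, const std::string& replace) {
    std::string word;  // non-whitespace characters collected so far
    char ch;
    while (in.get(ch)) {
        if (std::isspace(static_cast<unsigned char>(ch))) {
            if (!word.empty()) {
                // A complete word has been read: write it or its replacement.
                out << (word == search ? replace : word);
                word.clear();
            }
            out.put(ch);  // copy the whitespace character unchanged
        } else {
            word += ch;
        }
    }
    // Edge case: the last word may not be followed by whitespace.
    if (!word.empty()) {
        out << (word == search ? replace : word);
    }
}

// Convenience wrapper for files; returns false if either file cannot be opened.
bool replace_words_in_file(const std::string& in_path, const std::string& out_path,
                           const std::string& search, const std::string& replace) {
    std::ifstream in(in_path);
    std::ofstream out(out_path);
    if (!in || !out) return false;
    replace_words(in, out, search, replace);
    return true;
}
```

With the question's variables, a call such as replace_words_in_file(filename, "temp.txt", search, replace) would stand in for the original while loop.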

Infinite loop with get function

Can anyone tell me what's wrong with using the get function here instead of getline? get works perfectly when reading a single line without a loop. Why isn't it working here? It results in an infinite loop.
int main() {
ofstream outfile;
outfile.open("Myfile.txt", ios::trunc);
outfile <<"aabc"<<endl;
outfile <<"Hello Helloo"<<endl;
outfile <<"3abc"<<endl;
outfile <<"Somee text here "<<endl;
outfile.close();
ifstream infile;
infile.open("Myfile.txt");
char ch[20];
while(!infile.eof()) {
infile.get(ch,20);
cout<<ch;
}
infile.close();
return 0;
}
When called with a char*, as in your get(ch,20), the get method will read up to 19 characters or until it reaches a delimiter (\n by default).
The delimiting character is explicitly not read, so it's still the next character. So when you call it a second time, without having done anything to read that character, it immediately returns the 0-length string up to that same delimiter, over and over again.
Since that behavior is the key difference between get and getline, if it's not the behavior you want, just don't use it.
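If get has to be used anyway, one possible sketch is to consume the delimiter explicitly after each call. The helper name read_lines_with_get is hypothetical. Note the limitation stated in the comments: get sets failbit when it extracts no characters, so an empty line ends this loop early, which std::getline would not do.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Reads a file line by line with the char* overload of get(), consuming the
// '\n' delimiter by hand. read_lines_with_get is a hypothetical helper name.
// Limitation: an empty line makes get() fail (no characters extracted),
// which ends the loop; std::getline does not have this problem.
std::vector<std::string> read_lines_with_get(const std::string& path) {
    std::vector<std::string> lines;
    std::ifstream infile(path);
    char buf[20];
    std::string current;  // pieces of a line longer than 19 characters
    // Loop on the read itself rather than on eof().
    while (infile.get(buf, sizeof buf)) {
        current += buf;
        if (infile.peek() == '\n') {
            infile.ignore();  // consume the delimiter that get() left behind
            lines.push_back(current);
            current.clear();
        }
    }
    // Last line without a trailing newline.
    if (!current.empty()) {
        lines.push_back(current);
    }
    return lines;
}
```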

C++ how to remove all chars and special characters from a file

I have seen how to remove specific chars from a string but I am not sure how to do it with a file open or if you can even do that. Basically a file will be open with anything in it, my goal is to remove all the letters a-z, special characters, and whitespace that may appear so that all that is left is my numbers. Can you easily remove all chars rather than specifying a,b,c etc when the file is open or would I have to convert it to a string? Also would it be better to do this in memory?
My code so far is as follows:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main() {
string filename;
cout << "Enter the name of the data file to open" << endl;
cin >> filename >> endl;
ofstream myfile;
myfile.open(filename);
if (myfile.is_open()) { //if file is open then
while(!myfile.eof()){ //while not end of file
//remove all chars, special and whitespace
}
}
else{
cout << "Error in opening file" << endl;
}
return 0;
}
Preliminary remarks
If I understand correctly, you want to keep only the numbers. It is probably easier to retain the characters that are ASCII digits and drop everything else than to eliminate many other character classes and hope that only numbers remain.
Also, never loop on eof to read a file. Loop on the stream instead.
Finally, you should read from an ifstream and write to an ofstream.
First approach: reading strings
You can read/write the file line by line. You need enough memory to store the largest line, but you benefit from buffering effect.
if (myfile.is_open()) { //if file is open then
string line;
while(getline(myfile, line)){ //while successful read
line.erase(remove_if(line.begin(), line.end(), [](unsigned char c) { return !isdigit(c); } ), line.end()); // needs <algorithm> and <cctype>
... // then write the line in the output file
}
}
else ...
Second approach: reading chars
You can read/write char by char, which gives very flexible options for handling individual characters (toggle string flags, etc.). You also benefit from buffering, but you have function call overhead for every single char.
if (myfile) { //if file is open then
int c;
while((c = myfile.get())!=EOF){ //while successful read
//remove all chars, special and whitespace
if (isdigit(c) || c=='\n')
... .put(c); // then write the char to the output file
}
}
else ...
Other approaches
You could also read a large fixed size buffer, and operate similarly as with the strings (but don't eliminate LF then). The advantage is that the memory need is not impacted by some very large lines in the file.
You could also determine the file size, and try to read the full file at once (or in very large chunks). You'd then maximize performance at the cost of memory consumption.
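A rough sketch of the fixed-size buffer variant mentioned above follows. The helper name filter_digits_chunked and the 4096-byte chunk size are illustrative assumptions. Unlike the earlier char-by-char snippet it drops line feeds as well, so keep '\n' in the test if line structure matters.

```cpp
#include <cctype>
#include <fstream>
#include <string>

// Reads the file in fixed-size chunks and keeps only the decimal digits,
// so memory use does not depend on the length of the longest line.
// filter_digits_chunked and the chunk size are illustrative choices.
std::string filter_digits_chunked(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::string digits;
    char buffer[4096];
    // read() sets failbit on a short final chunk, so gcount() is checked too.
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0) {
        const std::streamsize got = in.gcount();
        for (std::streamsize i = 0; i < got; ++i) {
            // Cast to unsigned char before calling isdigit to avoid
            // passing a negative value.
            if (std::isdigit(static_cast<unsigned char>(buffer[i]))) {
                digits.push_back(buffer[i]);
            }
        }
    }
    return digits;
}
```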
This is just an example in order to extract all chars you want from a file with a dedicated filter:
std::string get_purged_file(const std::string& filename) {
std::string strbuffer;
std::ifstream infile;
infile.open(filename, std::ios_base::in);
if (infile.fail()) {
// throw an error
}
char c;
while (infile >> c) { // formatted extraction skips whitespace; stops at EOF or on error
if (std::isdigit(static_cast<unsigned char>(c)) || c == '.') {
strbuffer.push_back(c);
}
}
infile.close();
return strbuffer;
}
Note: this is just an example and it leaves room for optimization. Just to give you an idea:
Read more than one char at a time (with a proper buffer).
Reserve memory in the string.
Once you have the "purged" buffer, you can overwrite your file or save the content into another file.

Write a C++ pgm to read from a file "input.txt" , whenever a period is encountered insert newline and write modified content to "output.txt"

Question: Write a program in C++ to read from a file "input.txt" and whenever a period is encountered in the file "input.txt" insert a newline character and then write the modified contents to a new file "output.txt" and save it. Finally print the number of periods encountered.
I wrote the following program. It compiles fine but it doesn't execute, so please help me out. Thanks and regards.
#include<iostream>
#include<fstream>
using namespace std;
int main(){
int count = 0;
ofstream myFile;
char ch;
myFile.open("D:\\Files\\input.txt");
myFile<<"Hi this is Yogish. I'm from Bengaluru, India. And you are ??"<<endl;
myFile.close();
ofstream myHandler;
myHandler.open("D:\\Files\\output.txt");
fstream handler;
handler.open("D:\\Files\\input.txt");
if(handler.is_open()){
while(!handler.eof()){
handler>>ch;
if(ch != '.'){
handler<<ch;
}
else{
myHandler<<ch<<'\n';
handler<<'.'<<'\n';
count++;
}
}
}
cout<<"The number of periods : "<<count++<<endl;
system("pause");
}
I assume the question means that you only have to write the modified contents to the new file output.txt. At present you are trying to write into input file as well.
You should read the entire line in one string and then use std::replace_if algorithm from the <algorithm> header.
Also, in general, you should avoid using file.eof() as the loop termination condition, since the eof bit is only set after a read operation has failed. The final read can therefore fail while the loop body still runs once more, on a character that was never read (typically a repeat of the previous one, or an uninitialized value if the file is empty), and you would write this invalid character to the file.
Instead you should try something like:
bool isDot( const char& character ) {
return character == '.';
}
And in your main function:
std::string newLine;
// enter the loop only if the read operation is successful
while ( getline( handler, newLine ) ) {
count += std::count_if( newLine.begin(), newLine.end(), isDot );
std::replace_if( newLine.begin(), newLine.end(), isDot, '\n' );
myHandler << newLine;
}
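Note that std::replace_if substitutes each period with a newline, and getline discards the line breaks already present in the input. If the period should be kept and a newline inserted after it, as the task statement suggests, a character-by-character sketch could look like this. The helper name split_on_periods is hypothetical.

```cpp
#include <fstream>
#include <string>

// Copies in_path to out_path, writing a newline after every period.
// Returns the number of periods seen, or -1 if a file cannot be opened.
// split_on_periods is a hypothetical helper name used for illustration.
int split_on_periods(const std::string& in_path, const std::string& out_path) {
    std::ifstream in(in_path);
    std::ofstream out(out_path);
    if (!in || !out) {
        return -1;
    }
    int count = 0;
    char ch;
    while (in.get(ch)) {  // loop on the read itself, not on eof()
        out.put(ch);      // every input character is preserved
        if (ch == '.') {
            out.put('\n');
            ++count;
        }
    }
    return count;
}
```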

How do I copy the binary code of an executable into a new file without using a system copy command?

This is the code I have, but the file is a little smaller and doesn't execute:
int WriteFileContentsToNewFile(string inFilename, string outFilename)
{
ifstream infile(inFilename.c_str(), ios::binary);
ofstream outfile(outFilename.c_str(), ios::binary);
string line;
// Initial read
infile >> line;
outfile << line;
// Read the rest
while( infile )
{
infile >> line;
outfile << line;
}
infile.close();
outfile.close();
return 0;
}
What am I doing wrong? Is there a better way to read in the binary of an executable file and immediately write it out to another name? Any code examples?
I need to do it without a system copy in order to simulate writing to disk.
One way is to use the stream inserter for a streambuf:
int WriteFileContentsToNewFile(string inFilename, string outFilename)
{
ifstream infile(inFilename.c_str(), ios::binary);
ofstream outfile(outFilename.c_str(), ios::binary);
outfile << infile.rdbuf();
return 0;
}
The stream operator>>() performs formatted input even if you open the stream in binary mode. Formatted input expects to see strings of printable characters separated by spaces, but this is not what binary files like executables consist of. You need to read the file with the stream's read() function, and write it with the output stream's write() function.
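A minimal sketch of the read()/write() loop described above, under the assumption that a 4096-byte buffer is acceptable. The helper name copy_binary is hypothetical.

```cpp
#include <fstream>
#include <string>

// Copies a file byte for byte using unformatted read() and write().
// Returns false if a file cannot be opened or the output stream fails.
// copy_binary is a hypothetical helper name used for illustration.
bool copy_binary(const std::string& in_path, const std::string& out_path) {
    std::ifstream in(in_path, std::ios::binary);
    std::ofstream out(out_path, std::ios::binary);
    if (!in || !out) {
        return false;
    }
    char buffer[4096];
    // read() sets failbit on a short final chunk, so gcount() is checked too.
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0) {
        out.write(buffer, in.gcount());  // write exactly the bytes just read
    }
    out.flush();
    return out.good();
}
```

Because nothing is parsed, whitespace bytes, embedded zeros, and control characters all reach the output unchanged.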
Off the top of my head: (no error checking)
EDIT: Changed to fix feof bug.
int WriteFileContentsToNewFile(string inFilename, string outFilename)
{
FILE* in = fopen(inFilename.c_str(),"rb");
FILE* out = fopen(outFilename.c_str(),"wb");
char buf[4096]; // 4096 is most likely your block size; 2<<13 (16384) would also work
size_t len; // fread returns size_t
while( (len = fread(buf,1,sizeof buf,in)) > 0 ) // read up to a full buffer each time
{
fwrite(buf,1,len,out);
}
fclose(in);
fclose(out);
return 0;
}
On Unix, the system cp command copies not only the contents of the file but also (some of) the file permissions, which include the execute bit.
Make sure your copy also sets the execute bit on the output file as appropriate.
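One way to carry the permission bits over, sketched under the assumption that C++17's std::filesystem is available (it postdates the answers above). The helper name copy_permissions is hypothetical.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Gives `to` the same permission bits as `from`; on POSIX systems this
// carries over the execute bit. Returns false if either step fails.
// copy_permissions is a hypothetical helper name; requires C++17.
bool copy_permissions(const fs::path& from, const fs::path& to) {
    std::error_code ec;
    const fs::file_status st = fs::status(from, ec);
    if (ec) {
        return false;
    }
    // perm_options::replace overwrites the destination's permission bits.
    fs::permissions(to, st.permissions(), fs::perm_options::replace, ec);
    return !ec;
}
```

This would be called after the output stream has been closed, for example copy_permissions(inFilename, outFilename).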