Removing "funny" characters from a file in C++ [closed] - c++

Closed 7 years ago.
I have a text file that contains "funny" non-printable characters such as NUL, RS, and CAN, each displayed as a black square. When I read the file line by line, reading stops on each line where one of these appears.
All I want to do is to copy the same file only without these characters.
How to do that?

Let's say you are reading the file line by line and writing the output to a different file like this:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main() {
string inPath("a.txt");
string outPath("b.txt");
string line;
ifstream in(inPath.c_str(), ifstream::in | ifstream::binary);
if ( ! in.is_open() ) {
cerr << "Error: Failed to read file \"" << inPath << "\"." << endl;
return EXIT_FAILURE;
}
ofstream out(outPath.c_str(), ofstream::out | ofstream::binary);
if ( ! out.is_open() ) {
cerr << "Error: Failed to write file \"" << outPath << "\"." << endl;
return EXIT_FAILURE;
}
while ( getline(in, line) ) {
// getline() strips the line terminator, so write it back
out << line << '\n';
}
cout << "Done." << endl;
return EXIT_SUCCESS;
}
The problem is that the input stream gets interpreted if it is not opened in binary mode. In text mode, control characters (the ones an editor such as Notepad++ shows in black boxes) may not be handled as ordinary characters but in a special way.
Depending on the library implementation, the read operation may stop, ignore those characters, convert them into different character sequences, or treat them in their special way (for example as an end-of-text mark).
You can check whether a character is a control character with iscntrl(), for example.
To remove these characters from every line, you can use the following code:
#include <algorithm>
#include <cstdlib>
#include <cctype>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main() {
string inPath("a.txt");
string outPath("b.txt");
string line;
ifstream in(inPath.c_str(), ifstream::in | ifstream::binary);
if ( ! in.is_open() ) {
cerr << "Error: Failed to read file \"" << inPath << "\"." << endl;
return EXIT_FAILURE;
}
ofstream out(outPath.c_str(), ofstream::out | ofstream::binary);
if ( ! out.is_open() ) {
cerr << "Error: Failed to write file \"" << outPath << "\"." << endl;
return EXIT_FAILURE;
}
while ( getline(in, line) ) {
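/* getline() already dropped the line feed; this removes every remaining
control character, including carriage return and tab. Taking the argument
as unsigned char avoids undefined behavior for bytes >= 0x80. */
line.erase(remove_if(line.begin(), line.end(),
[](unsigned char c) { return iscntrl(c) != 0; }), line.end());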
out << line << '\n';
}
cout << "Done." << endl;
return EXIT_SUCCESS;
}

You can loop through each char in the file and use classification functions such as isprint() and isspace() (or isalpha(), isalnum(), and isdigit() if you only want letters and digits) to keep the characters you want and skip the others.
See http://www.cplusplus.com/reference/cctype/isalpha/

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string>
using std::string;
int die(string &msg) {
fprintf(stderr, "%s\n", msg.c_str());
exit(-1);
return -1; // Not really.
}
int
main(int argc, char **argv) {
string msg;
string inpt;
FILE *INPT;
string oupt;
FILE *OUPT;
int c;
(argc > 1) ||
(die(msg += "Missing filename arg."));
inpt += argv[1];
(oupt += inpt) += ".nxt";
(INPT = fopen(inpt.c_str(), "rb")) ||
(die(((msg += "Can't open \"") += inpt) += "\" for input."));
(OUPT = fopen(oupt.c_str(), "wb")) ||
(die(((msg += "Can't open \"") += oupt) += "\" for output."));
for (;(c = fgetc(INPT)) != EOF;) {
((unsigned)c < 0x80u) &&
(
(isprint(c)) ||
((iscntrl(c)) && (isspace(c)))
) &&
(fputc(c, OUPT));
}
fclose(OUPT);
fclose(INPT);
return 0;
}

Related

Problem with getting text from a .txt file in c++ using fstream

I am trying to read the text written in a .txt file called CodeHere.txt. Here is my main.cpp:
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, const char * argv[]) {
string line;
string lines[100];
ifstream myfile ("CodeHere.txt");
int i = 0;
if (myfile.is_open())
{
while ( getline (myfile,line) )
{
lines[0] = line;
i++;
}
myfile.close();
}
else cout << "Unable to open file";
cout << lines[0];
myfile.close();
return 0;
}
And the output is: Writing this to a file.
Program ended with exit code: 0
But my CodeHere.txt contains: hello
I tried saving it, but the result didn't change. I'm not sure what's going on. Can anyone help?
Are you sure that your .txt file is in the directory the program runs from? A relative path is resolved against the working directory, so to me it looks like the path is wrong and the program is opening a different file. Try the absolute (full) path. Another option is that you haven't saved the text file yet and are only editing it, so the file on disk does not contain what you see in the editor.
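To see which file a relative name such as CodeHere.txt actually refers to, the program can print the absolute path it resolves to. A minimal sketch, assuming a C++17 compiler (for std::filesystem); resolveAgainstCwd is a hypothetical helper name:

```cpp
#include <filesystem>
#include <string>

// Returns the absolute path that a relative file name resolves to,
// i.e. the location an ifstream opened with `name` would look at.
// resolveAgainstCwd is a hypothetical helper, not part of any library.
std::string resolveAgainstCwd(const std::string& name) {
    namespace fs = std::filesystem;
    return fs::absolute(fs::path(name)).string();
}
```

Printing resolveAgainstCwd("CodeHere.txt") before opening the file shows whether the IDE runs the program from the folder you expect.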
This should work, using a vector<string> to store the lines read from the file:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int main(int argc, const char * argv[]) {
string line;
vector<string> lines;
ifstream myfile ("CodeHere.txt");
int i = 0;
if (myfile.is_open())
{
while ( getline(myfile, line) )
{
lines.push_back(line);
i++;
}
myfile.close();
}
else {
cout << "Unable to open file";
return -1;
}
if (!lines.empty()) // lines[0] does not exist for an empty file
cout << lines[0] << '\n';
return 0;
}
Try this:
vector<string> lines;
if (file.is_open()) {
// read all lines from the file
std::string line;
while (getline(file, line)) {
lines.emplace_back(line);
}
file.close();
}
else {
cout << "Unable to open file";
return -1;
}
cout << "file has " << lines.size() << " lines." << endl;
for (const auto& l : lines) {
cout << l << endl;
}

ifstream: /dev/stdin is not working the same as std::cin

For my training course, an exercise asks us to create a program similar to the Linux 'cat' command.
To read the file, I use an ifstream, and everything works fine for regular files.
But not when I try to open /dev/ files like /dev/stdin: pressing Enter is not detected, so getline only really returns when the fd is closed (with CTRL-D).
The problem seems to lie in how ifstream or getline handles reading, because with the regular 'read' function from libc this problem does not occur.
Here is my code:
#include <iostream>
#include <string>
#include <fstream>
#include <errno.h>
#include <string.h> /* strerror */
#ifndef PROGRAM_NAME
# define PROGRAM_NAME "cato9tails"
#endif
int g_exitCode = 0;
void
displayErrno(std::string &file)
{
if (errno)
{
g_exitCode = 1;
std::cerr << PROGRAM_NAME << ": " << file << ": " << strerror(errno) << std::endl;
}
}
void
handleStream(std::string file, std::istream &stream)
{
std::string read;
stream.peek(); /* try to read: will set fail bit if it is a folder. */
if (!stream.good())
displayErrno(file);
while (stream.good())
{
std::getline(stream, read);
std::cout << read;
if (stream.eof())
break;
std::cout << std::endl;
}
}
int
main(int argc, char **argv)
{
if (argc == 1)
handleStream("", std::cin);
else
{
for (int index = 1; index < argc; index++)
{
errno = 0;
std::string file = std::string(argv[index]);
std::ifstream stream(file, std::ifstream::in);
if (stream.is_open())
{
handleStream(file, stream);
stream.close();
}
else
displayErrno(file);
}
}
return (g_exitCode);
}
We can only use functions from the C++ standard library.
I have searched for this problem for a long time, and I only found this post, where they seem to have a problem very similar to mine:
https://github.com/bigartm/bigartm/pull/258#issuecomment-128131871
But I found no really usable solution there.
I tried a very ugly workaround, but... well...:
bool
isUnixStdFile(std::string file)
{
return (file == "/dev/stdin" || file == "/dev/stdout" || file == "/dev/stderr"
|| file == "/dev/fd/0" || file == "/dev/fd/1" || file == "/dev/fd/2");
}
...
if (isUnixStdFile(file))
handleStream(file, std::cin);
else
{
std::ifstream stream(file, std::ifstream::in);
...
As you can see, a lot of files are missing, so this can only be called a temporary solution.
Any help would be appreciated!
The following code worked for me to deal with /dev/fd files or when using shell process substitution syntax:
std::ifstream stream(file_name);
std::cout << "Opening file '" << file_name << "'" << std::endl;
if (stream.fail() || !stream.good())
{
std::cout << "Error: Failed to open file '" << file_name << "'" << std::endl;
return false;
}
while (!stream.eof() && stream.good() && stream.peek() != EOF)
{
std::getline(stream, buffer);
std::cout << buffer << std::endl;
}
stream.close();
Basically std::getline() fails when content from the special file is not ready yet.

How to append to one file, then copy said file into another file

I feel like I've tried everything, I can get the first file to append to the second but cannot get the second file into a third. What am I doing wrong?
To be clear I need to take one file, append it to a second file, then put the contents of that second file into a third. I was able to simulate this outcome by putting both files into strings and then putting those strings into a third file, but that's not 'correct' in this problem.
I'm not particular to any way or any technique, I've tried a few and nothing works. This is the latest attempt, still doesn't work for the last step.
Here's my code:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main()
{
string a,b,c;
cout << "Enter 3 file names: ";
cin >> a >> b >> c;
fstream inf;
ifstream two;
fstream outf;
string content = "";
string line = "";
int i;
string ch;
inf.open(a, ios::in | ios:: out | ios::app);
two.open(b);
outf.open(c, ios::in);
//check for errors
if (!inf)
{
cerr << "Error opening file" << endl;
exit(1);
}
if (!two)
{
cerr << "Error opening file" << endl;
exit(1);
}
if (!outf)
{
cerr << "Error opening file" << endl;
exit(1);
}
for(i=0; two.eof() != true; i++)
content += two.get();
i--;
content.erase(content.end()-1);
two.close();
inf << content;
inf.clear();
inf.swap(outf);
outf.close();
inf.close();
return 0;
}
Here's an idea:
#include <fstream>
#include <iostream>
using namespace std;
void appendf( const char* d, const char* s )
{
ofstream os( d, ios::app );
if ( ! os )
throw "could not open destination";
ifstream is( s );
if ( ! is )
throw "could not open source";
os << is.rdbuf();
}
int main()
{
try
{
appendf( "out.txt", "1.txt" );
return 0;
}
catch ( const char* x )
{
cout << x;
return -1;
}
}

Need help diagnosing errors in C++ program designed to extract timestamps from an XML file

With some help I've almost completed a program that extracts the timestamps (e.g. timestamp="2014-07-08T18:14:16.468Z"), and only the timestamps, from an XML file and writes them to a designated output file. However, there are a handful of errors left in my code that have me at my wits' end and that I can't seem to fix. Would someone more experienced with C++ mind helping me out?
The errors appear in lines 35,38, & 47.
Screenshot of errors: http://i.imgur.com/jVUig4T.jpg
Link to XML file: http://pastebin.com/DLVF0cXY
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main()
{
using namespace std;
string tempStr;
// escaped double qoute.
string findStr = "timestamp=\"";
ifstream inFile;
ofstream outFile;
outFile.open("Outputdata.txt");
inFile.open("Groupproject.xml");
if (inFile.fail()) {
cout << "Error Opening File" << endl;
system("pause");
exit(1);
}
size_t found;
while (inFile) {
getline(inFile, tempStr);
found = tempStr.find(findStr);
if (found != std::string::npos)
{
break;
}
}
// Erases from beggining to end of timestamp="
tempStr.erase(tempStr.begin(), (found + tempStr.length()));
// Finds index of next double qoute.
found = tempStr.findStr("\"");
if (found = std::string::npos)
{
cerr << "Could not find matching qoute:";
exit(1);
}
// Erases from matching qoute to the end of the string.
tempStr.erase(found, tempStr.end());
cout << "timestamp found" << tempStr << "Saving to outFile" << endl;
outFile << tempStr;
inFile.close();
outFile.close();
system("pause");
return 0;
}
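Are you sure you carefully read the reference for all the functions you are using? The three errors are: string::erase() takes either two iterators or an index and a count, not a mix of both; the member function is called find(), not findStr(); and the comparison with npos needs ==, not =. A corrected version: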
#include <iostream>
#include <string>
#include <fstream>
#include <cstdlib>
using namespace std;
int main()
{
string tempStr;
string findStr = "timestamp=\""; // escaped double quote
ifstream inFile;
ofstream outFile;
outFile.open( "Outputdata.txt" );
inFile.open( "Groupproject.xml" );
if ( inFile.fail() )
{
cout << "Error Opening File" << endl;
cin.get();
exit( 1 );
}
size_t found;
while ( inFile )
{
getline( inFile, tempStr );
cout << tempStr << endl;
found = tempStr.find( findStr );
if ( found != string::npos )
break;
}
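if ( found == string::npos ) // reached the end of the file without a match
{
cerr << "No timestamp found" << endl;
exit( 1 );
}
tempStr.erase( 0, found + findStr.length() ); // erases from beginning to end of timestamp="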
found = tempStr.find( "\"" ); // finds index of next double quote
if ( found == string::npos )
{
cerr << "Could not find matching quote" << endl;
exit( 1 );
}
tempStr.erase( found, string::npos ); // erases from matching quote to the end of the string.
cout << "timestamp found:" << tempStr << " Saving to outFile" << endl;
outFile << tempStr;
inFile.close();
outFile.close();
cin.get();
return 0;
}

C++ fstream multiple input files

I am writing a simple program to take in two files. The terminal command line looks like this.
./fileIO foo.code foo.encode
When it runs, the second file is not read in. When I enter
./fileIO foo.code foo.code
it works. I can't seem to figure out why the second one is not opening. Any ideas? Thanks!
#include <fstream>
#include <iostream>
#include <queue>
#include <iomanip>
#include <map>
#include <string>
#include <cassert>
using namespace std;
int main( int argc, char *argv[] )
{
// convert the C-style command line parameter to a C++-style string,
// so that we can do concatenation on it
assert( argc == 3 );
const string code = argv[1];
const string encode = argv[2];
string firstTextFile = code;
string secondTextFile = encode;
//manipulate the first infile
ifstream firstFile( firstTextFile.c_str(), ios::in );
if( !firstFile )
{
cerr << "Cannot open text file for input" << endl;
return 1;
}
string lineIn;
string codeSubstring;
string hexSubstring;
while( getline( firstFile, lineIn ) )
{
hexSubstring = lineIn.substr(0, 2);
codeSubstring = lineIn.substr(4, lineIn.length() );
cout << hexSubstring << ", " << codeSubstring << endl;
}
//manipulate the second infile
ifstream secondFile( secondTextFile.c_str(), ios::in );
if( !secondFile )
{
cerr << "Cannot open text file for input" << endl;
return 1;
}
char characterIn;
while( secondFile.get( characterIn ) )
{
cout << characterIn << endl;
}
return 0;
}
One thing you might want to try is adding the close() call as is standard procedure after you're done using files. Sometimes issues arise with re-opening files if they were not closed properly in a previous run.
firstFile.close();
secondFile.close();
Also, you may try restarting the computer if there is some lingering file handle that hasn't been released.
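Before restarting anything, it may help to find out why the second open fails, since the program prints the same message for both files. The sketch below reports the file name and the system's reason. Two assumptions apply: describeOpenFailure is a hypothetical helper name, and errno is usually set by the underlying open call but the C++ standard does not guarantee it, so the reason should be treated as a hint.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <string>

// Tries to open `path` for reading. Returns an empty string on success and
// a human-readable message on failure.
// describeOpenFailure is a hypothetical name used for this sketch.
std::string describeOpenFailure(const std::string& path) {
    errno = 0;
    std::ifstream file(path.c_str());
    if (file) {
        return "";
    }
    // errno is an implementation detail here, not a standard guarantee.
    const std::string reason = errno ? std::strerror(errno) : "unknown reason";
    return "Cannot open \"" + path + "\": " + reason;
}
```

Printing the result for argv[2] would show, for example, whether foo.encode is simply missing from the working directory or misspelled.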