C++ most robust way to copy a file

Okay, I know that disk write errors are very rare, but please look past that, because the data I am working with is incredibly important (like SSIDs kind of important). I want to copy a file in the most robust way possible while using the minimal amount of memory. This is as far as I have got. It uses a lot of memory, but I can't find the cause. It works by re-reading each byte many times until it gets a confirmed result (this may greatly increase the number of false positives for errors, but it might reduce the chance of an actual error by a big margin). Also, the sleep at the bottom is there so you have time to analyze the program's overall performance using the Windows Task Manager.
#include <cstdio> // fopen, fclose, fread, fwrite, BUFSIZ
#include <cstdlib>
#include <unistd.h>
#include <iostream>
using namespace std;
__inline__ bool copy_file(const char* From, const char* To)
{
FILE infile = (*fopen(From, "rb"));
FILE outfile = (*fopen(To, "rwb+"));
setvbuf( &infile, nullptr, _IONBF, 0);
setvbuf( &outfile, nullptr, _IONBF, 0);
fseek(&infile,0,SEEK_END);
long int size = ftell(&infile);
fseek(&infile,0,SEEK_SET);
unsigned short error_amount;
bool success;
char c;
char w;
char l;
for ( fpos_t i=0; (i != size); ++i ) {
error_amount=0;
fsetpos( &infile, &i );
c = fgetc(&infile);
fsetpos( &infile, &i );
success=true;
for ( l=0; (l != 126); ++l ) {
fsetpos( &infile, &i );
success = ( success == ( fgetc(&infile)==c ) );
}
while (success==false) {
fsetpos( &infile, &i );
if (error_amount==32767) {
cerr << "There were 32768 failed attemps at accessing a part of the file! exiting the program...";
return false;
}
++error_amount;
//cout << "an error has occured at position ";
//printf("%d in the file.\n", (int)i);
c = fgetc(&infile);
fsetpos( &infile, &i );
success=true;
for ( l=0; (l != 126); ++l ) {
fsetpos( &infile, &i );
success = ( success == ( fgetc(&infile)==c ) );
}
}
fsetpos( &infile, &i );
fputc( c, &outfile);
fsetpos( &outfile, &i );
error_amount=0;
w = fgetc(&infile);
fsetpos( &outfile, &i );
success=true;
for ( l=0; (l != 126); ++l ) {
fsetpos( &outfile, &i );
success = ( success == ( fgetc(&outfile)==w ) );
}
while (success==false) {
fsetpos( &outfile, &i );
fputc( c, &outfile);
if (error_amount==32767) {
cerr << "There were 32768 failed attemps at writing to a part of the file! exiting the program...";
return false;
}
++error_amount;
w = fgetc(&infile);
fsetpos( &infile, &i );
success=true;
for ( l=0; (l != 126); ++l ) {
fsetpos( &outfile, &i );
success = ( success == ( fgetc(&outfile)==w ) );
}
}
fsetpos( &infile, &i );
}
fclose(&infile);
fclose(&outfile);
return true;
}
int main( void )
{
int CopyResult = copy_file("C:\\Users\\Admin\\Desktop\\example file.txt","C:\\Users\\Admin\\Desktop\\example copy.txt");
std::cout << "Could it copy the file? " << CopyResult << '\n';
sleep(65535);
return 1;
}
So, if my code is on the right track, what can be done to improve it? If it is totally off, what is the best solution? Please note that this question is essentially about detecting rare disk write errors when copying extremely important data.

I would just copy the file without any special checks, and in the end I would read the file and compare its hash value to the expected one. For a hash function, I would use MD5 or SHA-1.
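A minimal sketch of that copy-then-verify idea, assuming C++17. The standard library has no MD5 or SHA-1, so this sketch swaps in 64-bit FNV-1a purely as a stand-in checksum; a real deployment would call a cryptographic hash from a library instead. The names file_checksum and copy_and_verify are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// 64-bit FNV-1a over the file's bytes; a stand-in for MD5/SHA-1, not a replacement.
// Returns std::nullopt if the file cannot be opened or a read fails.
std::optional<std::uint64_t> file_checksum(const fs::path& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::nullopt;

    std::uint64_t hash = 14695981039346656037ULL;   // FNV offset basis
    char buffer[4096];                              // small fixed buffer keeps memory use low
    // read() fails on the final short block, so gcount() is checked as well.
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0) {
        for (std::streamsize i = 0; i < in.gcount(); ++i) {
            hash ^= static_cast<unsigned char>(buffer[i]);
            hash *= 1099511628211ULL;               // FNV prime
        }
    }
    if (in.bad()) return std::nullopt;              // unrecoverable I/O error
    return hash;
}

// Copies `from` to `to`, then re-reads both files and compares their checksums.
bool copy_and_verify(const fs::path& from, const fs::path& to)
{
    assert(from != to);                             // precondition: two different paths

    std::error_code ec;
    fs::copy_file(from, to, fs::copy_options::overwrite_existing, ec);
    if (ec) return false;

    const auto expected = file_checksum(from);
    const auto actual = file_checksum(to);
    return expected && actual && *expected == *actual;
}
```

Note that re-reading a file right after writing it may be served from the operating system's cache rather than from the disk, so a check like this validates the copy logic more than the storage medium.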

#include <boost/filesystem.hpp>
#include <iostream>
int main()
{
try
{
boost::filesystem::copy_file( "C:\\Users\\Admin\\Desktop\\example file.txt",
"C:\\Users\\Admin\\Desktop\\example copy.txt" );
}
catch ( boost::filesystem::filesystem_error const & ex )
{
std::cerr << "Copy failed: " << ex.what();
}
}
This will call the arguably most robust implementation available -- the one provided by the operating system -- and report any failure.
My point being:
The chance of having your saved data end up corrupted is astronomically small to begin with.
Any application where this might actually be an issue should be running on redundant storage, a.k.a. RAID arrays, filesystems doing checksums (like Btrfs, ZFS) etc., again reducing chance of failure significantly.
Doing complex things in home-grown I/O functions, on the other hand, increases the probability of mistakes and / or false negatives immensely.

Related

How to stop all pthreads when one has completed its work?

I'm trying to write code to brute-force a random string, but running it on one thread takes too long (as expected). I'm fiddling around with pthreads and this is what I've come up with:
void*
bruteForce ( void* ARGS )
{
args *arg = ( args * ) ARGS;
string STRING= arg->STRING;
string charSet = arg->charSet;
string guess = arg->guess;
char c;
int size;
int pos;
int lenght;
int j = 0;
char CHAR[STRING.length ( )];
size = charSet.length ( );
do
{
for ( j = 0; j < STRING.length ( ); j++ )
{
pos = rand ( ) % size;
CHAR[j] = charSet[pos];
guess = string ( CHAR );
//cout << guess[j];
}
//cout << guess << endl;
}
while ( STRING!= guess );
}
int
main ( int argc, char** argv )
{
srand ( ( unsigned ) ( time ( NULL ) ) );
const int NUMBER_OF_THREADS = 10;
args arg;
ifstream myFile;
string STRING;
string charSet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
string guess;
pthread_t threads[NUMBER_OF_THREADS];
void* status;
arg.charSet = charSet;
arg.STRING= STRING;
char c;
int size;
int pos;
int lenght;
int j = 0;
myFile.open ( "string.txt" );
getline ( myFile, STRING);
size = charSet.length ( );
int rc;
//Creating threads for cracking the string
for ( int i = 0; i < NUMBER_OF_THREADS; i++ )
{
rc = pthread_create ( &threads[i], NULL, bruteForce, ( void* ) &arg );
if ( rc )
{
cout << "Couldnt create thread";
exit ( 1 );
}
}
//Joining threads
for ( int i = 0; i < NUMBER_OF_THREADS; i++ )
{
rc = pthread_join ( threads[i], &status );
if ( rc )
{
cout << "thread number " << i << " was unable to join: " << rc << endl;
exit ( 1 );
}
}
}
Now, I need some way of signaling that one of the threads has already guessed the string correctly and terminating the others. I read some of the documentation for the pthread library and couldn't find anything. Any help is appreciated.
PS: I know the brute-force algorithm is by far not the best.

As long as you don't want your program to run any longer after the answer is found, you can just call exit(0) from the thread which found the answer.
do
{
// ...
}
while ( STRING!= guess );
std::cout << guess << std::endl;
std::exit(0);

Clumsy but workable in your case:
Add a DONE flag in global scope. Set it when a result is found by any thread.
Make each thread's loop be dependent on the flag.
#include <atomic>
std::atomic<bool> DONE(false); // set to true to stop other threads; atomic avoids a data race
void* bruteForce ( void* ARGS )
{ ...
do
{ <try a string>
}
while ( !DONE && STRING != guess );
DONE = true; // set redundantly but that doesn't hurt
return NULL;
}
Your main program can still do the join to collect finished pthreads, and then continue on with any work it might want to do on the guessed answer.
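The flag approach can be sketched as a complete function. This is only an illustration under stated assumptions: it uses std::thread and std::atomic from C++11 rather than pthreads, the helper name brute_force is hypothetical, and it assumes every character of the target occurs in the character set (otherwise no guess can ever match).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <random>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: several threads guess random strings until one matches `target`.
std::string brute_force(const std::string& target, const std::string& charset,
                        unsigned thread_count)
{
    assert(!charset.empty() && thread_count > 0);

    std::atomic<bool> done(false);   // shared stop flag; atomic so reads and writes do not race
    std::mutex result_mutex;         // protects `result`
    std::string result;

    auto worker = [&](unsigned seed) {
        std::mt19937 rng(seed);      // one generator per thread instead of the shared rand()
        std::uniform_int_distribution<std::size_t> pick(0, charset.size() - 1);
        std::string guess(target.size(), '\0');
        while (!done.load()) {
            for (char& c : guess) c = charset[pick(rng)];
            if (guess == target) {
                std::lock_guard<std::mutex> lock(result_mutex);
                result = guess;
                done.store(true);    // tells the other threads to leave their loops
            }
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < thread_count; ++i) threads.emplace_back(worker, i + 1);
    for (std::thread& t : threads) t.join();   // each thread exits once it sees the flag
    return result;
}
```

Because each worker only checks the flag between guesses, the other threads stop shortly after the first match rather than instantly, which is usually acceptable when one guess is cheap.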

Pointer Arithmetic, Pass by Reference

I got the following question from a past exam paper:
Consider the following source code:
using namespace std;
int main()
{
char dest[20];
printf( "%s\n", altcopy( dest, "demo" ) );
printf( "%s\n", altcopy( dest, "demo2" ) );
printf( "%s\n", altcopy( dest, "longer" ) );
printf( "%s\n", altcopy( dest, "even longer" ) );
printf( "%s\n", altcopy( dest, "and a really long string" ) );
}
Provide an implementation for the function called altcopy() which uses pointer arithmetic to copy alternate characters of a C-type string to the destination (i.e. the first, third, fifth etc character). Your answer must not use the [] operator to access an array index. The above code would then output the following:
dm
dm2
lne
ee ogr
adaral ogsrn
And I have attempted as follows:
using namespace std;
char* altcopy (char* dest, const char* str)
{
char* p = dest;
const char* q = str;
for ( int n=0; n<=20; n++ )
{
*p = *q;
p++;
q++;
q++;
}
return dest;
}
int main()
{
char dest[20];
printf( "%s\n", altcopy( dest, "demo" ) );
printf( "%s\n", altcopy( dest, "demo2" ) );
printf( "%s\n", altcopy( dest, "longer" ) );
printf( "%s\n", altcopy( dest, "even longer" ) );
printf( "%s\n", altcopy( dest, "and a really long string" ) );
}
And the results are:
dm
dm2lne
lne
ee ogradaral ogsrn
adaral ogsrn
I'm not sure why some lines of output have the next call's result appended instead of matching what the question asks for. Any help here?

Your function is invalid, at the very least because it uses the magic number 20.
The function should be similar to the standard function strcpy. That is, it has to copy the source string until the terminating zero is encountered.
Here is a simple implementation:
#include <iostream>
char * altcopy( char *dest, const char *str )
{
char *p = dest;
while ( *p++ = *str++ )
{
if ( *str ) ++str;
}
return dest;
}
int main()
{
char dest[20];
std::cout << altcopy( dest, "demo" ) << std::endl;
std::cout << altcopy( dest, "demo2" ) << std::endl;
std::cout << altcopy( dest, "longer" ) << std::endl;
std::cout << altcopy( dest, "even longer" ) << std::endl;
std::cout << altcopy( dest, "and a really long string" ) << std::endl;
return 0;
}
The output is
dm
dm2
lne
ee ogr
adaral ogsrn
Enjoy!:)

Since in your loop:
for ( int n=0; n<=20; n++ )
{
*p = *q;
p++;
q++;
q++;
}
you loop 21 times (n runs from 0 to 20) regardless of the length of the string, so you read memory past the end of the string and also write past the end of the 20-byte dest. In most cases you probably read a 0 right after the real characters and write that next, which terminates the string as far as printf is concerned. Sometimes, though, because of the way the strings happen to be stored in memory, you pick up some characters from the next string.

It's because q++ is being done twice, and it skips over the terminating null, and into the next string.
In fact it doesn't even check for the terminating null.
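Both problems described above (no check for the terminator, and a fixed iteration count unrelated to the buffer) can be avoided together. The following is a sketch of a hypothetical variant, altcopy_n, that takes the destination capacity as an extra parameter; that parameter is an assumption and not part of the exam question's signature.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant of altcopy that also takes the destination capacity.
// Copies the 1st, 3rd, 5th, ... characters of `src` and always NUL-terminates `dest`.
char* altcopy_n(char* dest, std::size_t capacity, const char* src)
{
    assert(dest != nullptr && src != nullptr && capacity > 0);

    char* out = dest;
    char* const last = dest + capacity - 1;   // slot reserved for the terminator
    while (*src != '\0' && out != last) {
        *out++ = *src++;                      // keep this character
        if (*src != '\0') ++src;              // skip the next one unless it is the terminator
    }
    *out = '\0';
    return dest;
}
```

Called as altcopy_n(dest, sizeof dest, "demo"), it stops at whichever comes first: the end of the source string or the end of the destination buffer.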

Is it possible to use a C++ stream class to buffer reads from a pipe?

In short, is it possible to do buffered reads from a pipe using a stream class, along the lines of what this pseudo-example describes?
Please ignore any pedantic problems you see (like not checking errors & the like); I'm doing all that in my real code, this is just a pseudo-example to get across my question.
#include <iostream> // or istream, ifstream, strstream, etc; whatever stream could pull this off
#include <unistd.h>
#include <stdlib.h>
#include <sstream>
void myFunc() {
int pipefd[2][2] = {{0,0},{0,0}};
pipe2( pipefd[0], O_NONBLOCK );
pipe2( pipefd[1], O_NONBLOCK );
if( 0 == fork() ) {
close( pipefd[0][1] );
close( pipefd[1][1] );
dup2( pipefd[0][0], stdout );
dup2( pipefd[1][0], stderr );
execv( /* some arbitrary program */ );
} else {
close( pipefd[0][0] );
close( pipefd[1][0] );
/* cloudy bubble here for the 'right thing to do'.
* Obviously this is faulty code; look at the intent,
* not the implementation.
*/
#ifdef RIGHT_THING_TO_DO
for( int ii = 0; ii < 2; ++ii ) {
cin.tie( pipefd[ii][1] );
do {
cin.readline( /* ... */ );
} while( /* ... */ );
}
#else
// This is what I'm doing now; it works, but I'm
// curious whether it can be done more concisely
do {
do {
select( /* ... */ );
for( int ii = 0; ii < 2; ++ii ) {
if( FD_SET( fd[ii][1], &rfds ) ) {
read( fd[ii][1], buff, 4096 );
if( /* read returned a value > 0 */ ) {
myStringStream << buff;
} else {
FD_CLR( fd[ii][1], &rfds );
}
}
}
} while( /* select returned a value > 0 */ );
} while( 0 == waitpid( -1, 0, WNOHANG ) );
#endif
}
}
Edit
Here's a simple example of how to use boost::file_descriptor to work with a pipe; should work with sockets too, didn't test though.
This is how I compiled it:
g++ -m32 -DBOOST_IOSTREAMS_NO_LIB -isystem ${BOOST_PATH}/include \
${BOOST_SRC_PATH}/libs/iostreams/src/file_descriptor.cpp blah.cc -o blah
Here's the example:
#include <fcntl.h>
#include <stdio.h>
#include <string.h> // memset
#include <unistd.h> // pipe, fork, dup2, close, STDOUT_FILENO
#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/iostreams/stream.hpp>
int main( int argc, char* argv[] ) {
// if you just do 'using namespace...', there's a
// namespace collision with the global 'write'
// function used in the child
namespace io = boost::iostreams;
int pipefd[] = {0,0};
pipe( pipefd ); // If you use pipe2() with O_NONBLOCK, you'll have to
// add some extra checks to the loop so
// it will wait until the child is finished.
if( 0 == fork() ) {
// child
close( pipefd[0] ); // read handle
dup2( pipefd[1], STDOUT_FILENO );
printf( "This\nis\na\ntest\nto\nmake sure that\nit\nis\nworking as expected.\n" );
return 0; // ya ya, shoot me ;p
}
// parent
close( pipefd[1] ); // write handle
char *buff = new char[1024];
memset( buff, 0, 1024 );
io::stream<io::file_descriptor_source> fds(
io::file_descriptor_source( pipefd[0], io::never_close_handle ) );
// this should work with std::getline as well
while( fds.getline( buff, 1024 )
&& fds.gcount() > 0 // this condition is not enough if you use
// O_NONBLOCK; it should only bail if this
// is false AND the child has exited
) {
printf( "%s,", buff );
}
printf( "\n" );
}

There sure is. There's an example from the book "The C++ Standard Library: a Tutorial and Reference" for how to make a std::streambuf that wraps file descriptors (like those you get from pipe()). From that creating a stream on top of it is trivial.
Edit: here's the book: http://www.josuttis.com/libbook/
And here's an example output buffer using file descriptors: http://www.josuttis.com/libbook/io/outbuf2.hpp.html
Also, here's an example input buffer: http://www.josuttis.com/libbook/io/inbuf1.hpp.html

You'd want a stream that can be created with an existing file descriptor, or a stream that creates a pipe itself. Unfortunately there's no such standard stream type.
You could write your own or use, for example, boost::iostreams::file_descriptor.
Writing your own entails creating a subclass of basic_streambuf, and then creating a very simple subclass of basic_istream or basic_ostream that does little more than hold your streambuf class and provide convenient constructors.
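As a rough illustration of the "write your own" route, here is a minimal sketch for the input side only, assuming a POSIX system and a blocking descriptor. The class names FdInBuf and FdIStream and the helper read_lines are hypothetical; the sketch keeps no putback area, so unget() right after a refill will fail.

```cpp
#include <cassert>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>
#include <unistd.h>   // read (POSIX)

// Hypothetical minimal input buffer over a POSIX file descriptor.
class FdInBuf : public std::streambuf {
public:
    explicit FdInBuf(int fd) : fd_(fd)
    {
        assert(fd >= 0);
        setg(buffer_, buffer_, buffer_);   // empty get area forces underflow() on first read
    }

protected:
    // Called by the stream when the get area is exhausted: refill it from the descriptor.
    int_type underflow() override
    {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        const ssize_t n = ::read(fd_, buffer_, sizeof buffer_);
        if (n <= 0) return traits_type::eof();   // end of file or error
        setg(buffer_, buffer_, buffer_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    int fd_;
    char buffer_[1024];
};

// Hypothetical convenience stream that does little more than hold the buffer.
class FdIStream : public std::istream {
public:
    explicit FdIStream(int fd) : std::istream(nullptr), buf_(fd) { rdbuf(&buf_); }

private:
    FdInBuf buf_;
};

// Usage sketch: collect every line from the descriptor until end of file.
std::vector<std::string> read_lines(int fd)
{
    FdIStream in(fd);
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) lines.push_back(line);
    return lines;
}
```

With a non-blocking descriptor, read() can return -1 with EAGAIN, which this sketch would report as end of file, so the select() loop from the question would still be needed in that case.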

Downloading Binary Files With Wininet

I am currently writing a simple program that I want to distribute to my friends. When the program starts, I want to download some external binary files from the internet into a buffer. To do this, I am using Windows Internet (WinINet). Currently, I use InternetReadFile to read the file into a buffer that I use later in the program. However, the file is not read completely: the resulting size is much smaller than the size of the file on the server, when it should be the same.
I would like to do this, without using any external libraries.
Any idea of what could solve my problem?
Thanks,
Andrew

The documentation makes the following remarks:
InternetReadFile operates much like the base ReadFile function, with a few exceptions. Typically, InternetReadFile retrieves data from an HINTERNET handle as a sequential stream of bytes. The amount of data to be read for each call to InternetReadFile is specified by the dwNumberOfBytesToRead parameter and the data is returned in the lpBuffer parameter. A normal read retrieves the specified dwNumberOfBytesToRead for each call to InternetReadFile until the end of the file is reached. To ensure all data is retrieved, an application must continue to call the InternetReadFile function until the function returns TRUE and the lpdwNumberOfBytesRead parameter equals zero.
Basically, there is no guarantee that the function reads exactly dwNumberOfBytesToRead bytes. Check how many bytes were actually read using the lpdwNumberOfBytesRead parameter.
Moreover, as soon as the total file size is larger than dwNumberOfBytesToRead, you will need to call the function multiple times, because it cannot read more than dwNumberOfBytesToRead bytes at once.
If you have the total file size in advance, the loop takes the following form:
::DWORD error = ERROR_SUCCESS;
::BYTE data[SIZE]; // total file size.
::DWORD size = 0;
::DWORD read = 0;
do {
::BOOL result = ::InternetReadFile(stream, data+size, SIZE-size, &read);
if ( result == FALSE ) {
error = ::GetLastError();
}
}
while ((error == ERROR_SUCCESS) && (read > 0) && ((size+=read) < SIZE));
// check that `SIZE` was correct.
if (size != SIZE) {
}
If not, then you need to write the data in the buffer to another file instead of accumulating it.
EDIT (SAMPLE TEST PROGRAM):
Here's a complete program that fetches StackOverflow's front page. This downloads about 200K of HTML code in 1K chunks and the full page is retrieved. Can you run this and see if it works?
#include <Windows.h>
#include <Wininet.h>
#include <iostream>
#include <fstream>
namespace {
::HINTERNET netstart ()
{
const ::HINTERNET handle =
::InternetOpenW(0, INTERNET_OPEN_TYPE_DIRECT, 0, 0, 0);
if ( handle == 0 )
{
const ::DWORD error = ::GetLastError();
std::cerr
<< "InternetOpen(): " << error << "."
<< std::endl;
}
return (handle);
}
void netclose ( ::HINTERNET object )
{
const ::BOOL result = ::InternetCloseHandle(object);
if ( result == FALSE )
{
const ::DWORD error = ::GetLastError();
std::cerr
<< "InternetClose(): " << error << "."
<< std::endl;
}
}
::HINTERNET netopen ( ::HINTERNET session, ::LPCWSTR url )
{
const ::HINTERNET handle =
::InternetOpenUrlW(session, url, 0, 0, 0, 0);
if ( handle == 0 )
{
const ::DWORD error = ::GetLastError();
std::cerr
<< "InternetOpenUrl(): " << error << "."
<< std::endl;
}
return (handle);
}
void netfetch ( ::HINTERNET istream, std::ostream& ostream )
{
static const ::DWORD SIZE = 1024;
::DWORD error = ERROR_SUCCESS;
::BYTE data[SIZE];
::DWORD size = 0;
do {
::BOOL result = ::InternetReadFile(istream, data, SIZE, &size);
if ( result == FALSE )
{
error = ::GetLastError();
std::cerr
<< "InternetReadFile(): " << error << "."
<< std::endl;
}
else
{
// write only after a successful read, so stale bytes are never emitted
ostream.write((const char*)data, size);
}
}
while ((error == ERROR_SUCCESS) && (size > 0));
}
}
int main ( int, char ** )
{
const ::WCHAR URL[] = L"http://stackoverflow.com/";
const ::HINTERNET session = ::netstart();
if ( session != 0 )
{
const ::HINTERNET istream = ::netopen(session, URL);
if ( istream != 0 )
{
std::ofstream ostream("output.txt", std::ios::binary);
if ( ostream.is_open() ) {
::netfetch(istream, ostream);
}
else {
std::cerr << "Could not open 'output.txt'." << std::endl;
}
::netclose(istream);
}
::netclose(session);
}
}
#pragma comment ( lib, "Wininet.lib" )

Outputting from file one line at a time

I'm trying to output text from a file one line at a time. I'm currently hardcoding it and I have this so far:
int main(int argc, char *argv[])
{
int x;
int k;
int limit = 5;
FILE *file;
file = fopen("C:\\Documents and Settings\\jon\\My Documents\\Visual Studio 2008\\Projects\\Project1\\Assignment8_2\\Debug\\TestFile1.txt", "r");
if (file == NULL) {
perror("Error");
}
for (k = 1; k <= limit; k++) {
while ((x = fgetc(file)) != '\n') {
printf("%c", x);
}
}
fclose(file);
}
I was wondering where in the code above, if at all, I can check for EOF. I assume I need to do that, but not sure why. Still learning.... Thanks!

If you can bound the maximum length of a line, fgets may be a better way to read each line; but since you mention C++, you might instead consider getline (caveat: fgets also puts the \n in the buffer it fills, getline doesn't). Both make it easy to check for end of file (fgets returns NULL on EOF; getline sets the eofbit on its istream argument, which it also returns).

Maybe you can try this:
#include <iostream>
#include <iomanip>
#include <fstream>
#include <cstdlib> // exit
using namespace std;
int main() {
int sum = 0;
int x;
ifstream inFile;
inFile.open("test.txt");
if (!inFile) {
cout << "Unable to open file";
exit(1); // terminate with error
}
while (inFile >> x) {
sum = sum + x;
}
inFile.close();
cout << "Sum = " << sum << endl;
return 0;
}

fgets() for C, getline() for C++.
C:
#include <stdio.h>
#include <stdlib.h>
// adjust as appropriate
size_t const MAX_LINE_LENGTH = 1024;
int main()
{
FILE * in;
char line[ MAX_LINE_LENGTH ];
if ( ( in = fopen( "test.txt", "r" ) ) == NULL )
{
puts( "Failed to open test.txt." );
return EXIT_FAILURE;
}
while ( fgets( line, MAX_LINE_LENGTH, in ) != NULL )
{
printf( "%s", line );
}
fclose( in );
return EXIT_SUCCESS;
}
C++:
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::ifstream in( "test.txt" );
std::string line;
while ( getline( in, line ) )
{
std::cout << line << std::endl;
}
in.close();
return 0;
}

You can call feof() to check for EOF, or check whether the return value of fgetc() is EOF.
I'm adding both versions to your code. I'm not sure what the loops (especially the outer one) are supposed to do, but within the context of your sample, EOF checking would look like this:
/* EOF would now terminate both loops, using feof() and fgetc() return to check EOF */
for (k = 1; k <= limit && !feof(file); k++) {
while ((x = fgetc(file))!='\n' && x!=EOF) {
printf("%c", x);
}
}

You should check for EOF in the return value of fgetc:
...
x = fgetc(file);
while (x != '\n' && x != EOF) {
...
See the fgetc manual for details.
