Checking file existence, size and similarity in C++

I am new to C++ and I am trying to do a few things with my code. I have been researching on how to do them but haven't been able to get my head around it and have been fairly unsuccessful.
#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std;
bool Copy(char filenamein[], char filenameout[]);
int main(int argc, char **argv)
{
if (argc !=3) {
cerr << "Usage: " << argv[0] << " <input filename> <output filename>" << endl;
int keypress; cin >> keypress;
return -1;
}
if (Copy(argv[1], argv[2]))
cout << "Copy completed" << endl;
else
cout << "Copy failed!" << endl;
system("pause");
return 0;
}
bool Copy(char filenamein[], char filenameout[])
{
ifstream fin(filenamein);
if(fin.is_open())
{
ofstream fout(filenameout);
char c;
while(fin.get(c))
{
fout.put(c);
}
fout.close();
fin.close();
return true;
}
return false;
}
This code already gives me two text files, input.txt and output.txt, and both contain the same characters.
What I'm trying to do is check whether input.txt already exists before trying to copy it.
I also want to compare the two files to make sure they are the same, and check that their sizes are equal.
How do I go about doing this?

For general filesystem operations there's Boost Filesystem.
http://www.boost.org/doc/libs/1_57_0/libs/filesystem/doc/index.htm
To compare files you can calculate hashes and compare the hashes. For two files it would be just as efficient to compare them character by character but for more than two files comparing hashes wins.
For this there's Crypto++.
http://www.cryptopp.com/
Example of using the two libraries to solve the 3 problems in the question.
// C++ standard library
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <string>
// Boost
#include <boost/filesystem.hpp>
// Crypto++
#include <cryptopp/sha.h>
#include <cryptopp/hex.h>
#include <cryptopp/files.h>
using std::string;
const string file_hash(const boost::filesystem::path &file);
int main( int argc, char** argv) {
if (argc != 3)
{
std::cout << "Usage: " << argv[0] << " filepath1 filepath2\n";
return 1;
}
const string filename1(argv[1]);
const string filename2(argv[2]);
std::cout << "filename 1: " << filename1 << std::endl;
std::cout << "filename 2: " << filename2 << std::endl;
// file existence
const bool file_exists1 = boost::filesystem::exists(filename1);
const bool file_exists2 = boost::filesystem::exists(filename2);
std::cout << "file 1 exists: " << std::boolalpha << file_exists1 << std::endl;
std::cout << "file 2 exists: " << std::boolalpha << file_exists2 << std::endl;
if (!file_exists1 || !file_exists2)
return EXIT_SUCCESS;
// file size
const boost::filesystem::path file_path1(filename1);
const boost::filesystem::path file_path2(filename2);
const uintmax_t file_size1 = boost::filesystem::file_size(file_path1);
const uintmax_t file_size2 = boost::filesystem::file_size(file_path2);
std::cout << "file 1 size: " << file_size1 << std::endl;
std::cout << "file 2 size: " << file_size2 << std::endl;
// comparing files
const string hash1 = file_hash(file_path1);
const string hash2 = file_hash(file_path2);
std::cout << "hash1: " << hash1 << std::endl;
std::cout << "hash2: " << hash2 << std::endl;
const bool same_file = hash1 == hash2;
std::cout << "same file: " << same_file << std::endl;
}
const string file_hash(const boost::filesystem::path& file)
{
string result;
CryptoPP::SHA1 hash;
CryptoPP::FileSource(file.string().c_str(),true,
new CryptoPP::HashFilter(hash, new CryptoPP::HexEncoder(
new CryptoPP::StringSink(result), true)));
return result;
}
Compilation on my laptop (the directories will of course be specific to wherever you have the headers and libraries but these are how homebrew installs them on OS X):
clang++ -I/usr/local/include -L/usr/local/lib -lcryptopp -lboost_system -lboost_filesystem demo.cpp -o demo
Example usage:
$ ./demo demo.cpp demo.cpp
filename 1: demo.cpp
filename 2: demo.cpp
file 1 exists: true
file 2 exists: true
file 1 size: 2084
file 2 size: 2084
hash1: 57E2E81D359C01DA02CB31621C9565DF0BCA056E
hash2: 57E2E81D359C01DA02CB31621C9565DF0BCA056E
same file: true
$ ./demo demo.cpp Makefile
filename 1: demo.cpp
filename 2: Makefile
file 1 exists: true
file 2 exists: true
file 1 size: 2084
file 2 size: 115
hash1: 57E2E81D359C01DA02CB31621C9565DF0BCA056E
hash2: 02676BFDF25FEA9E3A4D099B16032F23C469E70C
same file: false
Boost Filesystem will throw exceptions if you try to do stuff like get the size of a file that doesn't exist. You should be prepared to catch those exceptions so you don't need to explicitly test for file existence since you should have a catch block anyway. (If all you want to know is if a file exists but you don't want to do stuff with the file then it makes sense to test for existence explicitly.)
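As a rough sketch of the catch-block approach: the snippet below uses std::filesystem from C++17 rather than Boost (a swap on my part, so it compiles without extra libraries; the Boost version throws boost::filesystem::filesystem_error in the same way). The helper name try_file_size is made up for illustration.

```cpp
#include <cstdint>
#include <filesystem>
#include <iostream>
#include <string>

// Hypothetical helper: stores the size of `path` in `size` and returns true,
// or returns false if the filesystem call throws (e.g. the file is missing).
bool try_file_size(const std::string& path, std::uintmax_t& size)
{
    try {
        size = std::filesystem::file_size(path);
        return true;
    } catch (const std::filesystem::filesystem_error& e) {
        // No separate existence test is needed: a missing file lands here.
        std::cerr << "filesystem error: " << e.what() << '\n';
        return false;
    }
}
```

Called on a path that does not exist, this should return false and leave size untouched, without a separate exists() call.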
This is how I would go about doing these things in practice. If what you're asking is how to do them without libraries, you can check whether a file exists by trying to open it with the C or C++ standard library and checking whether that succeeded. To get a file's size, you can open the file, seek to the end, and read the resulting position, which is the offset from the beginning of the file.
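A minimal sketch of the library-free approach, assuming binary mode and an ordinary regular file. The function names can_open and size_by_seek are made up for illustration, and the seek-based size is not guaranteed to be a byte count on every platform.

```cpp
#include <fstream>
#include <string>

// Returns true if the file can be opened for reading. A file that exists
// but is unreadable (e.g. because of permissions) also yields false.
bool can_open(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    return in.is_open();
}

// Returns the position reached by opening at the end of the file,
// or -1 if the file cannot be opened.
long long size_by_seek(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary | std::ios::ate);
    if (!in)
        return -1;
    const std::streamoff end = in.tellg();
    return static_cast<long long>(end);
}
```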
However, it's preferable to rely on operating system support to interact with filesystems in general.
https://www.securecoding.cert.org/confluence/display/seccode/FIO19-C.+Do+not+use+fseek%28%29+and+ftell%28%29+to+compute+the+size+of+a+regular+file
fstat(), for example, is specific to Unix and Unix-like systems and returns a struct containing the file size, whereas on Microsoft systems you use GetFileSizeEx() to get a file size. Because of this, a portable solution has to use a library that talks to the various operating systems for you and presents a consistent API across them.
Comparing files using only standard library support can be done by either implementing hashing functions or comparing files character by character.
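The character-by-character variant could look roughly like this. It is only a sketch: same_contents is a made-up name, and treating unopenable files as "not equal" is my own choice.

```cpp
#include <fstream>
#include <string>

// Returns true if both files can be opened and hold exactly the same bytes.
bool same_contents(const std::string& a, const std::string& b)
{
    std::ifstream fa(a.c_str(), std::ios::binary);
    std::ifstream fb(b.c_str(), std::ios::binary);
    if (!fa || !fb)
        return false;
    char ca, cb;
    for (;;) {
        const bool got_a = static_cast<bool>(fa.get(ca));
        const bool got_b = static_cast<bool>(fb.get(cb));
        if (!got_a || !got_b)
            return got_a == got_b;  // equal only if both ended together
        if (ca != cb)
            return false;           // first differing byte
    }
}
```

Because the loop stops at the first difference and also catches one file being a prefix of the other, a separate size comparison is not strictly needed, although checking sizes first is a cheap way to bail out early.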

Look at fstat, it will tell you the file size (or return an error if it does not exist).
You could also force the last update date of the copied file to be the same as the source file, so that if the source file changes but keeps the same size you will notice it (look at futimes to do so).
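For reference, a small sketch of the fstat route, assuming a POSIX system (it will not build on Windows as written). The function name posix_file_size is made up for illustration.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns the file size in bytes, or -1 if the file cannot be opened
// (for example because it does not exist) or cannot be stat'ed.
long long posix_file_size(const char* path)
{
    const int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    struct stat info;
    const int rc = fstat(fd, &info);
    close(fd);
    if (rc != 0)
        return -1;
    return static_cast<long long>(info.st_size);
}
```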

Related

How to read data from AVRO file using C++ interface?

I'm attempting to write a simple program to extract some data from a bunch of AVRO files. The schema for each file may be different so I would like to read the files generically (i.e. without having to pregenerate and then compile in the schema for each) using the C++ interface.
I have been attempting to follow the generic.cc example but it assumes a separate schema where I would like to read the schema from each AVRO file.
Here is my code:
#include <fstream>
#include <iostream>
#include "Compiler.hh"
#include "DataFile.hh"
#include "Decoder.hh"
#include "Generic.hh"
#include "Stream.hh"
const std::string BOLD("\033[1m");
const std::string ENDC("\033[0m");
const std::string RED("\033[31m");
const std::string YELLOW("\033[33m");
int main(int argc, char**argv)
{
std::cout << "AVRO Test\n" << std::endl;
if (argc < 2)
{
std::cerr << BOLD << RED << "ERROR: " << ENDC << "please provide an "
<< "input file\n" << std::endl;
return -1;
}
avro::DataFileReaderBase dataFile(argv[1]);
auto dataSchema = dataFile.dataSchema();
// Write out data schema in JSON for grins
std::ofstream output("data_schema.json");
dataSchema.toJson(output);
output.close();
avro::DecoderPtr decoder = avro::binaryDecoder();
auto inStream = avro::fileInputStream(argv[1]);
decoder->init(*inStream);
avro::GenericDatum datum(dataSchema);
avro::decode(*decoder, datum);
std::cout << "Type: " << datum.type() << std::endl;
return 0;
}
Every time I run the code, no matter which file I use, I get this:
$ ./avrotest twitter.avro
AVRO Test
terminate called after throwing an instance of 'avro::Exception'
what(): Cannot have negative length: -40
Aborted
In addition to my own data files, I have tried using the data files located here: https://github.com/miguno/avro-cli-examples, with the same result.
I tried using the avrocat utility on all of the same files and it works fine. What am I doing wrong?
(NOTE: outputting the data schema for each file in JSON works correctly as expected)
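After a bunch more fooling around, I figured it out. The raw binaryDecoder was reading the container file's header as if it were record data (the -40 is the leading 'O' of the file's "Obj" magic bytes decoded as a length). You're supposed to use DataFileReader templated on GenericDatum, which knows about the container format. The end result looks something like this: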
#include <fstream>
#include <iostream>
#include "Compiler.hh"
#include "DataFile.hh"
#include "Decoder.hh"
#include "Generic.hh"
#include "Stream.hh"
const std::string BOLD("\033[1m");
const std::string ENDC("\033[0m");
const std::string RED("\033[31m");
const std::string YELLOW("\033[33m");
int main(int argc, char**argv)
{
std::cout << "AVRO Test\n" << std::endl;
if (argc < 2)
{
std::cerr << BOLD << RED << "ERROR: " << ENDC << "please provide an "
<< "input file\n" << std::endl;
return -1;
}
avro::DataFileReader<avro::GenericDatum> reader(argv[1]);
auto dataSchema = reader.dataSchema();
// Write out data schema in JSON for grins
std::ofstream output("data_schema.json");
dataSchema.toJson(output);
output.close();
avro::GenericDatum datum(dataSchema);
while (reader.read(datum))
{
std::cout << "Type: " << datum.type() << std::endl;
if (datum.type() == avro::AVRO_RECORD)
{
const avro::GenericRecord& r = datum.value<avro::GenericRecord>();
std::cout << "Field-count: " << r.fieldCount() << std::endl;
// TODO: pull out each field
}
}
return 0;
}
Perhaps an example like this should be included with libavro...

fstream fails to write/open files on raspberry pi

I am trying to run a cpp program on raspberry pi 3 b+ (from 'pi' user) but when I try to open a file with 'fstream' library it doesn't work.
I am using the following code (from main):
std::ios::sync_with_stdio(false);
std::string path = "/NbData";
std::ofstream nbData(path);
if (!nbData) {
std::cout << "Error during process...";
return 0;
}
nbData.seekp(std::ios::beg);
The program always fails there and stops because no file is created (I don't get a fatal error but the test fails and it outputs 'Error during process' which means no file was created).
I am compiling with the following command (there are no issues when I compile):
g++ -std=c++0x nbFinder.cpp -o nbFinder
I have already tried my program on Xcode and everything worked perfectly...
The problem is your path. With std::string path = "/NbData"; you are asking for a file directly in the filesystem root, and an ordinary user such as pi most likely has no permission to create files there, so the open fails. Note also that std::ofstream never creates missing directories: if any directory in the path does not exist, the open fails as well.
To be able to open your file, make sure the directory exists and is writable. Try the code below: it checks whether the directory exists, creates it if it does not, and then tries to open your file.
#include <fstream>
#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/stat.h>
int main() {
    std::ios::sync_with_stdio(false);
    std::string path = "./test_dir/";
    std::string file = "test.txt";
    // Check whether the directory exists; if not, create it
    struct stat info;
    if (stat(path.c_str(), &info) != 0) {
        std::cout << "cannot access " << path << ", creating it" << std::endl;
        mkdir(path.c_str(), 0755);
    } else if (S_ISDIR(info.st_mode)) {
        std::cout << "is a directory: " << path << std::endl;
    } else {
        // Something that is not a directory already has this name
        std::cout << "is not a directory: " << path << std::endl;
        return 1;
    }
    std::ofstream nbData((path + file).c_str());
    if (!nbData) {
        std::cout << "Error during process...";
        return 1;
    }
    nbData.seekp(0, std::ios::beg);
    return 0;
}
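To find out why an open fails (permissions versus a missing directory, say), it can help to print the system error. This is only a sketch: the C++ standard does not promise that stream operations set errno, although on Linux they typically do, and why_open_fails is a made-up helper name. Note that it creates or truncates the file when the open succeeds.

```cpp
#include <cerrno>
#include <cstring>
#include <fstream>
#include <string>

// Tries to open `path` for writing. Returns an empty string on success,
// or a human-readable reason on failure (assumes errno is set, see above).
std::string why_open_fails(const std::string& path)
{
    errno = 0;
    std::ofstream out(path.c_str());
    if (out)
        return std::string();
    return errno != 0 ? std::strerror(errno) : "unknown error";
}
```

On a typical Linux system, something like why_open_fails("/NbData") run as an unprivileged user would be expected to report a permission error, which points at the directory rather than at fstream itself.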

Protobuf fails at runtime with "Failed to parse address book" in the latest version 3.0.0?

I have created an addressbook.proto, and from it I can generate these two files:
addressbook.pb.h
addressbook.pb.cc
using protoc -I=$SRC_DIR --cpp_out=$DST_DIR $SRC_DIR/addressbook.proto
I have code that reads my address book, in a file named readproto.cc:
int main(int argc, char* argv[]) {
if (argc != 2) {
cerr << "Usage: " << argv[0] << " ADDRESS_BOOK_FILE" << endl;
return -1;
}
tutorial::AddressBook address_book;
{
// Read the existing address book.
fstream input(argv[1], ios::in | ios::binary);
if (!input) {
cout << argv[1] << ": File not found. Creating a new file." << endl;
} else if (!address_book.ParseFromIstream(&input)) {
cerr << "Failed to parse address book." << endl;
return -1;
}
}
----
}
and I compile it with
c++ readproto.cc addressbook.pb.cc `pkg-config --cflags --libs protobuf`
I get an executable without any problems, but my question is: which file should I pass to this executable?
I tried
./a.out addressbook.proto
but I'm not sure which file it needs. Is addressbook.proto the right one?
Result: Failed to parse address book.
I am new to protobuf and need help with this. I have been struggling for the last three days and this is my last hope. Please help, thank you.
Since you are using the protobuf tutorial, call your program like this:
./a.out my_adress_book.bin
It will create an empty my_adress_book.bin and then prompt you to add entries.
This part handles the case where the file does not exist yet:
fstream input(argv[1], ios::in | ios::binary);
if (!input) {
cout << argv[1] << ": File not found. Creating a new file." << endl;
}
....
You have to pass a file containing the actual binary data. The .proto file only defines the data format. Write the data out to a file first, and then read that file.

Is it possible to grab data from an .exe file in C++?

I am new to C/C++.
Basically, I want to call an .exe file that displays two numbers and grab those two numbers so I can use them in my code.
To call the .exe file I used the system command, but I still can't get hold of the two numbers that the .exe file displays.
char *files = "MyPath\file.exe";
system (files);
I think this is a better approach:
Here you just create a new process and read the data that the process writes to its output. I tested this on OS X 10.11 with a .sh file and it works like a charm. It should probably work on Windows as well, where the function is called _popen.
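FILE *fp = popen("path to exe", "r");
if (fp == NULL)
{
    std::cout << "Popen is null" << std::endl;
}
else
{
    char buff[100];
    // read the process's output line by line
    while (fgets(buff, sizeof(buff), fp) != NULL)
    {
        std::cout << buff;
    }
    pclose(fp); // close the pipe and wait for the process to finish
}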
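Since the goal is to get the two numbers into variables rather than just echo them, here is a hedged sketch that parses them straight from the pipe. It assumes a POSIX popen (on Windows the equivalent is _popen/_pclose) and that the program prints two whitespace-separated integers first. The name read_two_numbers is made up for illustration.

```cpp
#include <stdio.h>

// Runs `command` through the shell and parses two integers from the start
// of its output. Returns true only if both numbers were read.
bool read_two_numbers(const char* command, int& first, int& second)
{
    FILE* pipe = popen(command, "r");
    if (pipe == NULL)
        return false;
    const int parsed = fscanf(pipe, "%d %d", &first, &second);
    pclose(pipe);  // close the pipe and wait for the child process
    return parsed == 2;
}
```

Usage would be along the lines of int a, b; if (read_two_numbers("MyPath/file.exe", a, b)) { ... }, with the path adjusted to wherever the program actually lives.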
You need to escape backslashes in C++ string literals, so:
// note the double "\\"
const char* files = "MyPath\\file.exe";
Or just use forward slashes:
const char* files = "MyPath/file.exe";
It's not very efficient, but one thing you can do with std::system is redirect the output to a file and then read the file:
#include <cstdlib>
#include <fstream>
#include <iostream>
int main()
{
// redirect > the output to a file called output.txt
if(std::system("MyPath\\file.exe > output.txt") != 0)
{
std::cerr << "ERROR: calling system\n";
return 1; // error code
}
// open the file that now holds the output data
std::ifstream ifs("output.txt");
if(!ifs.is_open())
{
std::cerr << "ERROR: opening output file\n";
return 1; // error code
}
int num1, num2;
if(!(ifs >> num1 >> num2))
{
std::cerr << "ERROR: reading numbers\n";
return 1; // error code
}
// do something with the numbers here
std::cout << "num1: " << num1 << '\n';
std::cout << "num2: " << num2 << '\n';
}
NOTE: (thanks @VermillionAzure)
Note that system doesn't always work everywhere because unicorn
environments. Also, shells can differ from each other, like cmd.exe
and bash. – VermillionAzure
When using std::system, the results are platform dependent: not all shells have redirection or use the same syntax, and a shell may not even exist!

Nicolai Josuttis says in his book that the open member function doesn't clear the state flags. That's not what I found in VS2010. Is this a MS issue?

Nicolai Josuttis, on page 547 of his book "The C++ Standard Library", says the following about the code below:
Note that after the processing of a file, clear() must be called to clear the state flags that are set at end-of-file. This is required because the stream object is used for multiple files. The member function open() does not clear the state flags. open() never clears any state flags. Thus, if a stream was not in a good state, after closing and reopening it you still have to call clear() to get to a good state. This is also the case, if you open a different file.
// header files for file I/O
#include <fstream>
#include <iostream>
using namespace std;
/* for all file names passed as command-line arguments
* - open, print contents, and close file
*/
int main (int argc, char* argv[])
{
ifstream file;
// for all command-line arguments
for (int i=1; i<argc; ++i) {
// open file
file.open(argv[i]);
// write file contents to cout
char c;
while (file.get(c)) {
cout.put(c);
}
// clear eofbit and failbit set due to end-of-file
file.clear();
// close file
file.close();
}
}
My code below works without a problem in VS2010. Note that after the file "data.txt" is created, it's read twice without clearing the input stream flags.
#include <iostream>
#include <fstream>
#include <string>
int main()
{
// Create file "data.txt" for writing, write 4 lines into the file and close the file.
std::ofstream out("data.txt");
out << "Line 1" << '\n' << "Line 2" << '\n' << "Line 3" << '\n' << "Line 4" << '\n';
out.close();
// Open the file "data.txt" for reading and write file contents to cout
std::ifstream in("data.txt");
std::string s;
while( std::getline(in, s) ) std::cout << s << '\n';
std::cout << '\n';
std::cout << std::boolalpha << "ifstream.eof() before close - " << in.eof() << '\n';
// Close the file without clearing its flags
in.close();
std::cout << std::boolalpha << "ifstream.eof() after close - " << in.eof() << '\n';
// Open the file "data.txt" again for reading
in.open("data.txt");
std::cout << std::boolalpha << "ifstream.good() after open - " << in.good() << '\n';
std::cout << '\n';
// Read and print the file contents
while( std::getline(in, s) ) std::cout << s << '\n';
std::cout << '\n';
}
That was changed for C++11. The C++98 rule (as correctly described by Josuttis) was clearly wrong, so I wouldn't be surprised if implementations didn't honor it.
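If the code has to behave the same under both rules, calling clear() yourself before each open() is harmless in C++11 (where a successful open() clears the state anyway) and required under the C++98 wording. A minimal sketch, with count_chars as a made-up function name:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Counts the characters in each file, reusing one stream object throughout.
// A file that cannot be opened is reported as 0 characters.
std::vector<std::size_t> count_chars(const std::vector<std::string>& names)
{
    std::vector<std::size_t> counts;
    std::ifstream file;
    for (std::size_t i = 0; i < names.size(); ++i) {
        file.clear();  // reset eofbit/failbit left over from the previous file
        file.open(names[i].c_str(), std::ios::binary);
        std::size_t n = 0;
        char c;
        while (file.get(c))
            ++n;
        counts.push_back(n);
        file.close();
    }
    return counts;
}
```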