In C++, seekg seems to include CR chars, but read() drops them

I'm currently trying to read the contents of a file into a char array.
For instance, I have the following text in a file, 42 bytes long:
{
type: "Backup",
name: "BackupJob"
}
This file was created on Windows, and I'm using Visual Studio C++, so there are no OS compatibility issues.
However, when I execute the following code, at the completion of the loop I get Index: 39, with no 13 displayed before the 10s.
// Create the file stream and open the file for reading
ifstream fs;
fs.open("task.txt", ifstream::in);
int index = 0;
int ch = fs.get();
while (fs.good()) {
cout << ch << endl;
ch = fs.get();
index++;
}
cout << "----------------------------";
cout << "Index: " << index << endl;
return;
However, when I create a char array the length of the file, reading the file size as shown below counts the 3 additional CR chars, so length equals 42, and the end of the array is left filled with dodgy bytes.
// Create the file stream and open the file for reading
ifstream fs;
fs.seekg(0, std::ios::end);
length = fs.tellg();
fs.seekg(0, std::ios::beg);
// Create the buffer to read the file
char* buffer = new char[length];
fs.read(buffer, length);
buffer[length] = '\0';
// Close the stream
fs.close();
Using a hex viewer, I have confirmed that file does indeed contain the CRLF (13 10) bytes in the file.
There seems to be a disparity with getting the end of the file, and what the get() and read() methods actually return.
Could anyone please help with this?
Cheers,
Justin

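You should open your file in binary mode. In text mode on Windows, the stream translates each CRLF pair into a single LF, so get() and read() never see the CR, while the size obtained with seekg()/tellg() still counts it. Binary mode stops read() from dropping the CR.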
fs.open("task.txt", ifstream::in|ifstream::binary);
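As a minimal sketch of that fix (the helper name readWholeFile is hypothetical, not from the question): open the file in binary mode so that the size reported by tellg() and the bytes delivered by read() agree, and let gcount() say how many bytes actually arrived. A std::vector owns the buffer here, which also avoids writing a terminator one past the end of the array.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the raw bytes of a file, CR characters included.
std::vector<char> readWholeFile(const std::string& path)
{
    // Binary mode disables the CRLF -> LF translation done in text mode on Windows.
    std::ifstream fs(path, std::ios::in | std::ios::binary);
    if (!fs) {
        return {};
    }

    fs.seekg(0, std::ios::end);
    const std::streamoff length = fs.tellg();
    fs.seekg(0, std::ios::beg);
    if (length <= 0) {
        return {};
    }

    std::vector<char> buffer(static_cast<std::size_t>(length));
    fs.read(buffer.data(), length);
    // Keep only what was actually read, in case the stream came up short.
    buffer.resize(static_cast<std::size_t>(fs.gcount()));
    return buffer;
}
```

Calling readWholeFile("task.txt") should then return a vector whose size() matches the size shown in a hex viewer. If a C-style string is needed, allocate length + 1 bytes before writing the terminator.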

Related

Read big files in C++ but also small files as well in C++?

I want to make a C++ program to read huge files (like 50Gb each) while you have only 4 or 8Gb of RAM.
I want this algorithm to be faster and work with small files as well.
This is the code I have until now:
#include<iostream>
#include<fstream>
#include<string>
using namespace std;
//Making a buffer to store the chunks of the file read:
// Buffer size 1 Megabyte (or any number you like)
size_t buffer_size = 1<<20;
char *buffer = new char[buffer_size];
int main(){
string filename="stats.txt";
//compute file size
size_t iFileSize = 0;
std::ifstream ifstr(filename.c_str(), std::ios::binary); // create the file stream - this is scoped for destruction
if(!ifstr.good()){
cout<<"File is not valid!"<<endl;
exit(EXIT_FAILURE);
}
//get the file size
iFileSize = ifstr.tellg();
ifstr.seekg( 0, std::ios::end ); // open file at the end to get the size
iFileSize = (int) ifstr.tellg() - iFileSize;
cout<<"File size is: "<<iFileSize<<endl;
//close the file and reopen it for reading:
ifstr.close();
cout<<"Buffer size before check is:"<<buffer_size<<endl;
if(buffer_size>iFileSize){
buffer_size=iFileSize;
}
cout<<"Buffer size after check is:"<<buffer_size<<endl;
ifstream myFile;
myFile.open(filename);
if(myFile.fail()){
cerr<<"Error opening file!"<<endl;
exit(EXIT_FAILURE);
}
if(!myFile.good()){
cout<<"File is not valid!"<<endl;
exit(EXIT_FAILURE);
}
if(!myFile.is_open()){
cout<<"File is NOT opened anymore!"<<endl;
return 1;
}
while(myFile.is_open()&&myFile){
// Try to read next chunk of data
myFile.read(buffer, buffer_size);
// Get the number of bytes actually read
size_t count = myFile.gcount();
// If nothing has been read, break
if (!count){
break;
}
// Do whatever you need with first count bytes in the buffer:
string line;
while(getline(myFile, line)){
if(!line.empty()){
cout <<"Line: '" << line << "'" <<endl;
}
}
}
delete[] buffer;
buffer = NULL;
myFile.close();
return 0;
}
My files can have blank lines between the text lines, and even the first lines can be blank.
I tested the program on a small file (128 KB) to see how it works, but it doesn't: it displays no lines on the screen even though the file is so small.
What is wrong? Also, if I change the buffer size to a very small number, it reads just the first one or two lines. Why doesn't it loop to the end of the file and display all of the lines from that small file? Any help, please?
Thank you in advance!
This is the test file: (It starts with a few blank lines also.)
Population UK: 97876876723
Population France: 898989
This is the test end of the file: Yay!
This is the result:
And no line from file is displayed.
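One likely cause, offered as a hedged reading of the code above: myFile.read() consumes the chunk (the whole file, when the file is no larger than the buffer) before getline runs on the same stream, so getline finds nothing left to read. A minimal sketch, assuming line-by-line processing is the goal: std::getline already reads through the stream's internal buffer, so the file never has to fit in memory and a manual chunk buffer is not needed. The names forEachNonEmptyLine and onLine are hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>

// Hypothetical helper: calls onLine for every non-empty line and returns how many were seen.
std::size_t forEachNonEmptyLine(const std::string& path,
                                const std::function<void(const std::string&)>& onLine)
{
    std::ifstream in(path);
    if (!in) {
        return 0;
    }
    std::size_t count = 0;
    std::string line;
    while (std::getline(in, line)) {
        // Strip a trailing CR in case a Windows file is read on another platform.
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        if (line.empty()) {
            continue; // skip blank lines, including leading ones
        }
        onLine(line);
        ++count;
    }
    return count;
}
```

For the test file above, passing a callback that prints each line should display the three text lines and skip the blank ones.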

C++ - Read the bytes of any file into an unsigned char array

I have an assignment where I have to implement the Rijndael Algorithm for AES-128 Encryption. I have the algorithm operational, but I do not have proper file input/output.
The assignment requires us to use parameters passed in from the command line. In this case, the parameter will be the file path to the particular file the user wishes to encrypt.
My problem is, I am lost as to how to read in the bytes of a file and store these bytes inside an array for later encryption.
I have tried using ifstream and ofstream to open, read, write, and close the files and it works fine for plaintext files. However, I need the application to take ANY file as input.
When I tried my method of using fstream with a pdf as input, it would crash my program. So, I now need to learn how to take the bytes of a file, store them inside an unsigned char array for Encryption, and then store them inside another file. This process of encryption and storage of ciphertext needs to occur in 16 byte intervals.
The below implementation is my first attempt to read files in binary mode and then write whatever was read in another file also in binary mode.
The output is readable in a hex reader.
int main(int argc, char* argv[])
{
if (argc < 2)
{
cerr << "Use: " << argv[0] << " SOURCE_FILEPATH" << endl << "Ex. \"C\\Users\\Anthony\\Desktop\\test.txt\"\n";
return 1;
}
// Store the Command Line Parameter inside a string
// In this case, a filepath.
string src_fp = argv[1];
string dst_fp = src_fp.substr(0, src_fp.find('.', 0)) + ".enc";
// Open the filepaths in binary mode
ifstream srcF(src_fp, ios::in | ios::binary);
ofstream dstF(dst_fp, ios::out | ios::binary);
// Buffer to handle the input and output.
unsigned char fBuffer[16];
srcF.seekg(0, ios::beg);
while (!srcF.eof())
{
srcF >> fBuffer;
dstF << fBuffer << endl;
}
dstF.close();
srcF.close();
}
The code implementation does not work as intended.
Any direction on how to solve my dilemma would be greatly appreciated.
Like you, I really struggled to find a way to read a binary file into a byte array in C++ that would output the same hex values I see in a hex editor. After much trial and error, this seems to be the fastest way to do so without extra casts.
It would go faster without the counter, but then sometimes you end up with wide chars. To truly get one byte at a time I haven't found a better way.
By default it loads the entire file into memory, but only prints the first 800 bytes.
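string Filename = "BinaryFile.bin";
FILE* pFile = fopen(Filename.c_str(), "rb");
uint8_t* ByteArray = NULL;
size_t size = 0;
// Check that the file opened before seeking in it
if (pFile != NULL)
{
    // Seek to the end to find the file size, then rewind
    fseek(pFile, 0L, SEEK_END);
    size = ftell(pFile);
    fseek(pFile, 0L, SEEK_SET);
    ByteArray = new uint8_t[size];
    // Read one byte at a time; stop at size so the array is not overrun
    size_t counter = 0;
    int c;
    while (counter < size && (c = fgetc(pFile)) != EOF) {
        ByteArray[counter++] = (uint8_t)c;
    }
    size = counter;
    fclose(pFile);
}
// Print at most the first 800 bytes
for (size_t i = 0; i < size && i < 800; i++) {
    printf("%02X ", ByteArray[i]);
}
delete[] ByteArray;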
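For the 16-byte requirement in the question, a stream-based sketch may be closer to what is needed than loading the whole file. It uses the unformatted read() and write() calls rather than >> and <<, since formatted extraction skips whitespace bytes and expects a terminated string. The function name copyInBlocks is hypothetical, the zero padding of a short last block is only an assumption for illustration (the padding scheme is up to the assignment), and the encryption step is left as a comment.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: copies src to dst in 16-byte blocks and returns the block count.
std::size_t copyInBlocks(const std::string& src, const std::string& dst)
{
    std::ifstream in(src, std::ios::binary);
    std::ofstream out(dst, std::ios::binary);
    if (!in || !out) {
        return 0;
    }

    const std::size_t kBlock = 16;
    unsigned char block[kBlock];
    std::size_t blocks = 0;

    for (;;) {
        in.read(reinterpret_cast<char*>(block), kBlock);
        const std::streamsize got = in.gcount();
        if (got <= 0) {
            break; // nothing left to read
        }
        // Assumption: pad a short final block with zeros.
        for (std::size_t i = static_cast<std::size_t>(got); i < kBlock; ++i) {
            block[i] = 0;
        }
        // The encryption of block[0..15] would go here.
        out.write(reinterpret_cast<const char*>(block), kBlock);
        ++blocks;
    }
    return blocks;
}
```

Because the loop is driven by gcount() rather than eof(), the last partial block is handled once and no stale block is written after the end of the file.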

ifstream::read keeps returning incorrect value

I am currently working my way through teaching myself how to work with files in c++, and I am having a good bit of difficulty extracting binary information from files.
My code:
std::string targetFile = "simplehashingfile.txt";
const char* filename = targetFile.c_str();
std::ifstream file;
file.open( filename, std::ios::binary | std::ios::in );
file.seekg(0, std::ios::end); // go to end of file
std::streamsize size = file.tellg(); // get size of file
std::vector<char> buffer(size); // create vector of file size bytes
file.read(buffer.data(), size); // read file into buffer vector
int totalread = file.gcount();
// Check that data was read
std::cout<<"total read: " << totalread << std::endl;
// check buffer:
std::cout<<"from buffer vector: "<<std::endl;
for (int i=0; i<size; i++){
std::cout << buffer[i] << std::endl;
}
std::cout<<"\n\n";
The "simplehashingfile.txt" file only contains 50 bytes of normal text. The size is correctly determined to be 50 bytes, but gcount returns 0 chars read, and the buffer output is (understandably from the gcount) a 50 line list of nothing.
For the life of me I cannot figure out where I went wrong! I made this test code earlier:
// Writing binary to file
std::ofstream ofile;
ofile.open("testbinary", std::ios::out | std::ios::binary);
uint32_t bytes4 = 0x7FFFFFFF; // max 32-bit value
uint32_t bytes8 = 0x12345678; // some 32-bit value
ofile.write( (char*)&bytes4 , 4 );
ofile.write( (char*)&bytes8, 4 );
ofile.close();
// Reading from file
std::ifstream ifile;
ifile.open("testbinary", std::ios::out | std::ios::binary);
uint32_t reading; // variable to read data
uint32_t reading2;
ifile.read( (char*)&reading, 4 );
ifile.read( (char*)&reading2, 4 );
std::cout << "The file contains: " << std::hex << reading << std::endl;
std::cout<<"next 4 bytes: "<< std::hex << reading2 << std::endl;
And that test code wrote and read perfectly. Any idea what I am doing wrong? Thank you to anyone who can point me in the right direction!
You never move the file position back to the beginning before you read:
std::streamsize size = file.tellg(); // <- the position is still at the end of the file here
std::vector<char> buffer(size); // create vector of file size bytes
file.read(buffer.data(), size); // <- so this read starts at the end and reads nothing
int totalread = file.gcount();
You need to call seekg() again to reset the file pointer to the beginning. To do that, use
file.seekg(0, std::ios::beg);
before
file.read(buffer.data(), size);
It would be worth returning to the beginning of the file before trying to read:
file.seekg(0, std::ios::beg);
I think the problem is that you do a seek to the end to get the file size, but don't seek back to the beginning before trying to read the file.
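Putting the answers together, a minimal sketch of the corrected sequence might look like this. The function name readAll is hypothetical; the calls are the same seekg/tellg/read/gcount calls used in the question, with the rewind added.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file and reports how many bytes read() delivered.
std::streamsize readAll(const std::string& path, std::vector<char>& buffer)
{
    std::ifstream file(path, std::ios::binary | std::ios::in);
    if (!file) {
        return 0;
    }
    file.seekg(0, std::ios::end);               // go to end of file
    const std::streamoff size = file.tellg();   // position at the end == file size
    if (size <= 0) {
        return 0;
    }
    file.seekg(0, std::ios::beg);               // rewind before reading
    buffer.resize(static_cast<std::size_t>(size));
    file.read(buffer.data(), size);
    return file.gcount();
}
```

With the 50-byte file from the question, readAll("simplehashingfile.txt", buffer) should return 50 instead of 0.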

Retrieving File Data Stored in Buffer

I'm new to the forum, but not to this website. I've been searching for weeks on how to process a large data file quickly using C++11. I'm trying to write a function that takes the trace file name, then opens and processes the data. The trace file contains 2 million lines of data, and each line is structured with a read/write operation and a hex address:
r abcdef123456
However, with a file having that much data, I need to read in and parse those 2 values quickly. My first attempt to read the file was the following:
void getTraceData(string filename)
{
ifstream inputfile;
string file_str;
vector<string> op, addr;
// Open input file
inputfile.open(filename.c_str());
cout << "Opening file for reading: " << filename << endl;
// Determine if file opened successfully
if(inputfile.fail())
{
cout << "Text file failed to open." << endl;
cout << "Please check file name and path." << endl;
exit(1);
}
// Retrieve and store address values and operations
if(inputfile.is_open())
{
cout << "Text file opened successfully." << endl;
while(inputfile >> file_str)
{
if((file_str == "r") || (file_str == "w"))
{
op.push_back(file_str);
}
else
{
addr.push_back(file_str);
}
}
}
inputfile.close();
cout << "File closed." << endl;
}
It worked, it ran, and read in the file. Unfortunately, it took the program 8 minutes to run and read the file. I modified the first program to the second program, to try and read the file in faster. It did, reading the file into a buffer in a fraction of a second versus 8 mins. using ifstream:
void getTraceData()
{
// Setup variables
char* fbuffer;
ifstream ifs("text.txt");
long int length;
clock_t start, end;
// Start timer + get file length
start = clock();
ifs.seekg(0, ifs.end);
length = ifs.tellg();
ifs.seekg(0, ifs.beg);
// Setup buffer to read & store file data
fbuffer = new char[length];
ifs.read(fbuffer, length);
ifs.close();
end = clock();
float diff((float)end - (float)start);
float seconds = diff / CLOCKS_PER_SEC;
cout << "Run time: " << seconds << " seconds" << endl;
delete[] fbuffer;
}
But when I added the parsing portion of the code, to get each line, and parsing the buffer contents line-by-line to store the two values in two separate variables, the program silently exits at the while-loop containing getline from the buffer:
void getTraceData(string filename)
{
// Setup variables
char* fbuffer;
ifstream ifs("text.txt");
long int length;
string op, addr, line;
clock_t start, end;
// Start timer + get file length
start = clock();
ifs.seekg(0, ifs.end);
length = ifs.tellg();
ifs.seekg(0, ifs.beg);
// Setup buffer to read & store file data
fbuffer = new char[length];
ifs.read(fbuffer, length);
ifs.close();
// Setup stream buffer
const int maxline = 20;
char* lbuffer;
stringstream ss;
// Parse buffer data line-by-line
while(ss.getline(lbuffer, length))
{
while(getline(ss, line))
{
ss >> op >> addr;
}
ss.ignore( strlen(lbuffer));
}
end = clock();
float diff((float)end - (float)start);
float seconds = diff / CLOCKS_PER_SEC;
cout << "Run time: " << seconds << " seconds" << endl;
delete[] fbuffer;
delete[] lbuffer;
}
I was wondering, once my file is read into a buffer, how do I retrieve it and store it into variables? For added value, my benchmark time is under 2 mins. to read and process the data file. But right now, I'm just focused on the input file, and not the rest of my program or the machine it runs on (the code is portable to other machines). The language is C++ 11 and the OS is a Linux computer. Sorry for the long posting.
Your stringstream ss is not associated with fbuffer at all. You are trying to getline from an empty stringstream, so nothing happens. Try this instead (passing length explicitly, because fbuffer is not null-terminated):
string inputedString(fbuffer, length);
istringstream ss(inputedString);
And before calling ss.getline(lbuffer, length), allocate memory for lbuffer; as written, it is an uninitialized pointer.
You can also read the file directly into a string to avoid the extra copy. See "Reading directly from an std::istream into an std::string".
Last but not least, since your vectors are quite large, you should reserve enough space before pushing items one by one. When a vector reaches its capacity, the next push_back triggers a reallocation and a copy of all existing items to keep the storage contiguous. With millions of items, that will happen quite a few times.
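The points above can be combined in a short sketch. It assumes every line has exactly the "operation address" shape shown in the question; the function name loadTrace and the expectedLines parameter are hypothetical. The file is pulled into a string through rdbuf(), the istringstream is built from that string, and both vectors are reserved up front.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: fills op and addr from a trace file with lines like "r abcdef123456".
bool loadTrace(const std::string& path,
               std::vector<std::string>& op,
               std::vector<std::string>& addr,
               std::size_t expectedLines)
{
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) {
        return false;
    }
    // Read the whole file into a string in one go.
    std::ostringstream contents;
    contents << ifs.rdbuf();

    // Reserve up front to avoid repeated reallocation during push_back.
    op.reserve(expectedLines);
    addr.reserve(expectedLines);

    // The istringstream is constructed from the data, so extraction has something to read.
    std::istringstream ss(contents.str());
    std::string o, a;
    while (ss >> o >> a) {
        op.push_back(o);
        addr.push_back(a);
    }
    return true;
}
```

For the trace file in the question, a call such as loadTrace("text.txt", op, addr, 2000000) would be the natural use; whether it meets the 2-minute target depends on the machine and would need measuring.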

Read from a simple encrypted file C++

I am trying to write a program which will output an XOR-encrypted string to a file, then read this string back and decrypt it. To encrypt my string I have used a simple XOR encryption (thanks to Kyle W. Banks' site):
string encryptDecrypt(string toEncrypt)
{
char key = 'K'; //Any char will work
string output = toEncrypt;
for (int i = 0; i < toEncrypt.size(); i++)
output[i] = toEncrypt[i] ^ key;
return output;
}
Then in my program I use the following code to write and then read the string:
string encrypted = encryptDecrypt("Some text");
cout << "Encrypted:" << encrypted << "\n";
ofstream myFile("test.txt");
myFile << encrypted;
// Read all the txt file in binary mode to obtain the txt file in one string
streampos size;
char * memblock;
ifstream file ("test.txt", ios::in|ios::binary|ios::ate);
if (file.is_open())
{
size = file.tellg();
memblock = new char [size];
file.seekg (0, ios::beg);
file.read (memblock, size);
file.close();
}
//Convert the memblock into string and show the result of decrypted string
string result(memblock);
string decrypted = encryptDecrypt(result);
cout << "Decrypted:" << decrypted << "\n";
As a result I get:
Encrypted : ,<.c%.;%
Decrypted : Õ52E65AD0
Maybe saving the file alters some of the bytes, so the program can't retrieve the same bytes when it reads the string back, but I'm not sure at all.
Best regards
Since you're not closing the output, there's a fair chance that your OS won't let you open the file for reading.
You're decrypting regardless of whether the file was successfully read.
If it wasn't successfully read, you'll have undefined behaviour due to memblock not being initialised - most likely getting a result constructed from random garbage data.
Once you get that fixed, you need to zero-terminate memblock to make it a "proper" C-style string.
Encryption with XOR is kind of dangerous. Assume your plain text contains the letter 'K', the encrypted string will contain a '\0' at this position. Your string will be cut off there.
Same thing for the other direction, you are reading the encrypted string. Converting the memory block to a string will result in a shorter string because std::string::string(const char*) will stop reading at '\0'.
Apart from that, memblock isn't initialized when the file could not be opened, so put the decryption part inside the if (file.is_open()) block.
As Zuppa said, using it this way is dangerous: the string may terminate unexpectedly because of a '\0'.
You should determine the length of the text you are dealing with up front; this is easily done with stream.seekg(0, ios_base::end) followed by tellg(),
and you can then use the read and write functions to get the text from the file or write it:
ifstream file ("test.txt", ios::in|ios::binary|ios::ate);
file.seekg(0,ios::end);
int length=file.tellg();//length of the text in the file
file.seekg(0);
char *memblock=new char[length];
file.read(memblock,length);
You may also consult "Simple xor encryption".
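The advice in these answers can be sketched as one round trip. The helper name roundTrip is hypothetical; encryptDecrypt is the function from the question with a const reference parameter. The output stream is closed before the file is reopened, both streams use binary mode, and the string is built from the stream's full contents rather than from a char* so an embedded '\0' does not truncate it.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Same XOR scheme as in the question; applying it twice restores the input.
std::string encryptDecrypt(const std::string& input)
{
    const char key = 'K';
    std::string output = input;
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = input[i] ^ key;
    }
    return output;
}

// Hypothetical helper: writes the encrypted text, reads it back and decrypts it.
std::string roundTrip(const std::string& plain, const std::string& path)
{
    {
        std::ofstream out(path, std::ios::binary);
        const std::string encrypted = encryptDecrypt(plain);
        out.write(encrypted.data(), static_cast<std::streamsize>(encrypted.size()));
    } // out is closed here, before the file is reopened

    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return std::string();
    }
    // Built from iterators, so embedded '\0' bytes do not cut the string short.
    const std::string encrypted((std::istreambuf_iterator<char>(in)),
                                std::istreambuf_iterator<char>());
    return encryptDecrypt(encrypted);
}
```

A call such as roundTrip("Some text", "test.txt") should return the original text, including for inputs that contain the key character 'K'.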