ifstream::read doesn't tell how many bytes it really reads? - c++

I'm using ifstream::read to read a file,
ifstream ifs("a.txt");
char buf[1024];
ifs.read(buf, 1024);
But a.txt's size might be less than 1000 bytes, so how am I supposed to know how many bytes have been read from ifs?

You can get the number of characters extracted by the last read operation with std::ifstream::gcount:
ifstream ifs("a.txt");
char buf[1024];
ifs.read(buf, 1024);
size_t extracted = ifs.gcount();
or
ifstream ifs("a.txt");
char buf[1024];
size_t extracted = ifs.read(buf, 1024).gcount();
since read(...) returns *this.
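As a minimal sketch of the same idea, the helper below wraps the read and gcount pair and accepts any std::istream. The names read_some and demo_short_read are assumptions made for illustration, not part of the standard library. Note that a read which stops early sets eofbit and failbit on the stream, but the count reported by gcount() is still valid.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>

// Hypothetical helper: reads up to `capacity` bytes into `buf` and
// returns how many bytes were actually stored there.
std::streamsize read_some(std::istream& in, char* buf, std::size_t capacity)
{
    in.read(buf, static_cast<std::streamsize>(capacity));
    // A short read sets eofbit and failbit, but gcount() still reports
    // the number of bytes that were extracted.
    return in.gcount();
}

// Example: a 5-byte source read into a 1024-byte buffer.
std::streamsize demo_short_read()
{
    std::istringstream src("hello");   // stands in for a.txt
    char buf[1024];
    return read_some(src, buf, sizeof buf);   // 5
}
```

Only the first gcount() bytes of the buffer hold file data; the rest of the buffer is left untouched.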

Related

Reading and Writing any file in C++

I have a program where I need to operate on different types of files.
I want the input and output files of the following program to be the same.
#include<iostream>
#include<string>
#include<fstream>
#include<sstream>
typedef unsigned char u8;
using namespace std;
char* readFileBytes(string name)
{
ifstream fl(name);
fl.seekg( 0, ios::end );
size_t len = fl.tellg();
char *ret = new char[len];
fl.seekg(0, ios::beg);
fl.read(ret, len);
fl.close();
return ret;
}
int main(int argc, char *argv[]){
string name = "file.pdf";
u8* file = (u8*) readFileBytes(name);
// cout<<str<<endl;
int len = 0;
while(file[len] != '\0')
len++;
cout<<"FILESIZE : "<<len<<endl;
string filename = "file2.pdf";
ofstream outfile(filename,ios::out | ios::binary);
outfile.write((char*) file,len);
outfile.close();
exit(0);
}
The difference between the output and input files is checked using diff
diff file.pdf file2.pdf
What should I do to make file2.pdf the same as file.pdf?
I have tried using xxd to change the binary into hexadecimal but the disadvantage is that the overall size doubles. So therefore I want to operate in binary only.
size_t len = fl.tellg();
char *ret = new char[len];
In this manner the shown code determines the number of characters in the file. This is fine. The only problem with it is that after this number of characters is read, this very important information is completely forgotten and thrown away. This function returns only this ret pointer, and the actual number of characters in it is now an unsolvable mystery.
But then, main() attempts to solve this mystery as follows:
int len = 0;
while(file[len] != '\0')
len++;
This attempts to reverse-engineer the number of characters by looking for the first 0 byte in the buffer.
Which has absolutely nothing to do with anything. The first character in the file may be a 0 byte, so this will calculate that the file is empty, and not ten gigabytes in size.
Or the file can contain just the text "Hello world" with no 0 byte at all, in which case this while loop will happily run past the end of the buffer and start rooting around in whatever memory follows it, resulting in undefined behavior.
That's the fatal logical flaw in the shown code: the actual size of the file is thrown away, and instead reverse-engineered in a flawed way.
You will need to rework the code so that the number of characters in the file, the original len, is also returned to main(), and it uses that, instead of attempting to guess what it originally was.
P.S. Releasing the ret buffer with delete[] after you're done with it would also be a good idea. An even better idea is to avoid new altogether and use std::vector instead, which will happily give you its size() any time you ask for it, and you won't have to worry about deleting the allocated memory.
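One way to follow that advice, sketched under the assumption that the whole file fits in memory, is to return a std::vector<char> so that the size travels with the data. The names read_all_bytes and demo_size_with_zero_bytes are hypothetical. The function takes a std::istream, so an ifstream opened with ios::binary can be passed in.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical replacement for readFileBytes: the vector carries its own
// size, so nothing has to be guessed from a terminating 0 byte.
std::vector<char> read_all_bytes(std::istream& in)
{
    in.seekg(0, std::ios::end);
    const std::streamoff len = in.tellg();   // -1 if the position is unknown
    in.seekg(0, std::ios::beg);
    if (len <= 0) return {};

    std::vector<char> bytes(static_cast<std::size_t>(len));
    in.read(bytes.data(), len);
    // Keep only what was really read, in case the read stopped early.
    bytes.resize(static_cast<std::size_t>(in.gcount()));
    return bytes;
}

// Example: embedded 0 bytes are counted like any other byte.
std::size_t demo_size_with_zero_bytes()
{
    std::istringstream src(std::string("ab\0cd", 5));
    return read_all_bytes(src).size();   // 5
}
```

In main(), the output side would then become outfile.write(bytes.data(), bytes.size()), with no counting loop at all.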
In order to correctly process binary data, the size must be stored and cannot be computed from a sentinel null byte, because null bytes can be legitimate bytes in a binary file. So you should return the read length in addition to the buffer, or even better, copy the input to the new file one buffer at a time until you have exhausted the input file:
int main(int argc, char *argv[]){
    constexpr size_t sz = 10240; // size of buffer
    char buffer[sz];
    string name = "file.pdf";
    string filename = "file2.pdf";
    ifstream fl(name, ios::in | ios::binary);
    ofstream outfile(filename, ios::out | ios::binary);
    size_t len = 0;
    for (;;) {
        fl.read(buffer, sz); // read() returns the stream, not a byte count
        streamsize buflen = fl.gcount(); // bytes actually read by the last read()
        if (buflen == 0) break; // reached EOF (or the file could not be read)
        len += static_cast<size_t>(buflen);
        if (!outfile.write(buffer, buflen)) {
            // display an error message
            return 1;
        }
    }
    fl.close();
    outfile.close();
    cout<<"FILESIZE : "<<len<<endl;
    return 0;
}

Trying to read from a file using file descriptor prints numbers and slashes to console

I am trying to write a simple program that reads a file by encapsulating functions like open, lseek, pread.
My file for test contains:
first second third forth fifth sixth
seventh eighth
my main function that tries to read 20 bytes with offset 10 from the file:
#include <iostream>
#include "CacheFS.h"
using namespace std;
int main(int argc, const char * argv[]) {
char * filename1 = "/Users/Desktop/File";
int fd1 = CacheFS_open(filename1);
//read from file and print it
void* buf[20];
CacheFS_pread(fd1, &buf, 20, 10);
cout << (char*)buf << endl;
}
implementation of the functions that the main is using:
int CacheFS_open(const char *pathname)
{
mode_t modes = O_SYNC | 0 | O_RDONLY;
int fd = open(pathname, modes);
return fd;
}
int CacheFS_pread(int file_id, void *buf, size_t count, off_t offset)
{
off_t seek = lseek(file_id, offset, SEEK_SET);
off_t fileLength = lseek(file_id, 0, SEEK_END);
if (count + seek <= fileLength) //this case we are not getting to the file end when readin this chunk
{
pread(file_id, &buf, count, seek);
} else { //count is too big so we can only read a part of the chunk
off_t size = fileLength - seek;
pread(file_id, &buf, size, seek);
}
return 0;
}
My main function prints this to the console:
\350\366\277_\377
I would expect it to print some values from the file itself, and not some numbers and slashes that represent something I do not really understand.
Why does this happen?
The following changes will make your program work:
Your buffer has to be an actual char array, and CacheFS_pread must then be called without the address-of operator &. Also pass the buffer size minus 1, because pread just copies n bytes from the file and does not add a terminating \0; reading a full 20 bytes would overwrite the last zero. I use a zero-initialized char array here so that there is always a terminating \0 at the end.
char buf[20] = { '\0' }; // declare and initialize with zeros
CacheFS_pread(fd1, buf, sizeof(buf) - 1, 10);
Your function header should accept only a char pointer, for type-safety reasons.
int CacheFS_pread(int file_id, char* buf, size_t count, off_t offset)
Your pread call is then without the address operator &:
pread(file_id, buf, count, seek);
Output: nd third forth fift, because only 19 bytes are read into the 20-byte buffer.
Also, I would check whether your calculations and your if statements are right; I have the feeling that they are not exactly right. I would also recommend using the return value of pread.
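On that last point, pread already reports how many bytes it transferred and returns 0 at end of file, so the file-length arithmetic with lseek is not needed. The following is only a sketch for POSIX systems; the name CacheFS_pread_checked and the choice to return the byte count (or -1 on error) are assumptions, not part of the original interface.

```cpp
#include <cstddef>
#include <fcntl.h>      // open(), used by callers to obtain the descriptor
#include <sys/types.h>  // ssize_t, off_t
#include <unistd.h>     // pread()

// Hypothetical reworking of CacheFS_pread: returns the number of bytes
// stored in buf (possibly fewer than count near the end of the file),
// or -1 if pread reports an error.
ssize_t CacheFS_pread_checked(int file_id, char* buf, size_t count, off_t offset)
{
    size_t total = 0;
    while (total < count) {
        ssize_t n = pread(file_id, buf + total, count - total,
                          offset + static_cast<off_t>(total));
        if (n < 0) return -1;   // errno describes the failure
        if (n == 0) break;      // end of file reached
        total += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

With the test file from the question and a zero-initialized char buf[20], CacheFS_pread_checked(fd1, buf, sizeof(buf) - 1, 10) should return 19, and the caller can print exactly that many bytes.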

read from binary file and store to a buffer

Can somebody tell if this is correct?
I am trying to read from a binary file line by line and store it in a buffer. Does each new line stored in the buffer overwrite the previously stored line?
ifs.open(filename, std::ios::binary);
for (std::string line; getline(ifs, line,' '); )
{
ifs.read(reinterpret_cast<char *> (buffer), 3*h*w);
}
For some reason you are mixing getline, which is text-based reading, with read(), which is binary reading.
Also, it's completely unclear what buffer is and what its size is. So here is a simple example to get you started:
ifs.open(filename, std::ios::binary); // assume, that everything is OK
constexpr size_t bufSize = 256;
char buffer[bufSize];
size_t charsRead{ 0 };
do {
ifs.read(buffer, bufSize);
charsRead = static_cast<size_t>(ifs.gcount()); // read() returns the stream, gcount() the byte count
// check if charsRead == 0, if it's ok
// do something with filled buffer.
// Note, that last read will have less than bufSize characters,
// So, query charsRead each time.
} while (charsRead == bufSize);
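To answer the second part of the question: yes, each read into the same buffer overwrites what the previous read stored there. If every record has to be kept, one option is to copy each record into a container. The sketch below assumes fixed-size records (for example 3*h*w bytes each, as in the question). The names read_records and demo_record_count, and the decision to discard an incomplete trailing record, are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: splits a binary stream into records of
// `record_size` bytes. An incomplete record at the end is discarded.
std::vector<std::vector<char>> read_records(std::istream& in, std::size_t record_size)
{
    assert(record_size > 0);
    std::vector<std::vector<char>> records;
    std::vector<char> buffer(record_size);
    // read() returns the stream, which converts to false once a full
    // record could not be extracted.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))) {
        records.push_back(buffer);   // copy, because buffer is reused
    }
    return records;
}

// Example: 7 bytes with 3-byte records give 2 complete records.
std::size_t demo_record_count()
{
    std::istringstream src("abcdefg");
    return read_records(src, 3).size();   // 2
}
```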

How to read a file in multiple chunks until EOF (C++)

So, here's my problem: I want to make a program that reads chunks of data from a file. Let's say, 1024 bytes per chunk.
So I read the first 1024 bytes, perform various operations and then read the next 1024 bytes, without re-reading the old data. The program should keep reading data until the EOF is reached.
I'm currently using this code:
std::fstream fin("C:\\file.txt");
vector<char> buffer (1024,0); //reads only the first 1024 bytes
fin.read(&buffer[0], buffer.size());
But how can I read the next 1024 bytes? I was thinking by using a for loop, but I don't really know how. I'm totally a noob in C++, so if anyone can help me out, that would be great. Thanks!
You can do this with a loop:
std::ifstream fin("C:\\file.txt", std::ifstream::binary);
std::vector<char> buffer (1024,0); // holds one 1024-byte chunk at a time
while(!fin.eof()) {
fin.read(buffer.data(), buffer.size());
std::streamsize s=fin.gcount(); // bytes in this chunk; less than 1024 for the last one
// do something with the first s bytes of buffer
}
##EDITED
http://en.cppreference.com/w/cpp/io/basic_istream/read
The accepted answer doesn't work for me: it doesn't read the last partial chunk. This does:
void readFile(std::istream &input, UncompressedHandler &handler) {
std::vector<char> buffer (1024,0); //reads only 1024 bytes at a time
while (!input.eof()) {
input.read(buffer.data(), buffer.size());
std::streamsize dataSize = input.gcount();
handler({buffer.begin(), buffer.begin() + dataSize});
}
}
Here UncompressedHandler accepts std::string, so I use constructor from two iterators.
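Both loops above test eof() before reading, so when the file size is an exact multiple of the chunk size they run one extra iteration with gcount() == 0, and they never terminate if the stream fails for a reason other than end of file (for example, if the file could not be opened). A common alternative is to let the read itself drive the loop while still accepting a final partial chunk. This is a sketch only; for_each_chunk and demo_chunk_sizes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: calls `handle(data, size)` once per chunk,
// including a final chunk that is shorter than `chunk_size`.
void for_each_chunk(std::istream& in, std::size_t chunk_size,
                    const std::function<void(const char*, std::size_t)>& handle)
{
    assert(chunk_size > 0);
    std::vector<char> buffer(chunk_size);
    // A full read keeps the stream good; a short read fails it but still
    // leaves gcount() > 0, so the last partial chunk is processed too.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0) {
        handle(buffer.data(), static_cast<std::size_t>(in.gcount()));
    }
}

// Example: 2500 bytes in 1024-byte chunks arrive as 1024, 1024, 452.
std::vector<std::size_t> demo_chunk_sizes()
{
    std::istringstream src(std::string(2500, 'x'));
    std::vector<std::size_t> sizes;
    for_each_chunk(src, 1024,
                   [&](const char*, std::size_t n) { sizes.push_back(n); });
    return sizes;
}
```

Once a read extracts nothing, gcount() is 0 and the loop ends, whether the cause was end of file or an error.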
I think you missed the fact that the stream keeps a pointer to the last place you visited in the file, so when you read a second time you do not start from the beginning again, but from the point where the previous read stopped.
Have a look at this code:
std::ifstream fin("C:\\file.txt");
char buffer[1024]; //I prefer array more than vector for such implementation
fin.read(buffer,sizeof(buffer));//first read get the first 1024 byte
fin.read(buffer,sizeof(buffer));//second read get the second 1024 byte
That is how you can think about this concept.
I think this will work:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <fstream>
// Buffer size: 16 megabytes (or any size you like)
size_t buffer_size = 1 << 24; // 1 << 20 would be 1 megabyte
char* buffer = new char[buffer_size];
std::streampos fsize = 0;
std::ifstream file("c:\\file.bin", std::ios::binary);
fsize = file.tellg();
file.seekg(0, std::ios::end);
fsize = file.tellg() - fsize;
file.seekg(0, std::ios::beg); // go back to the start, otherwise every read below hits EOF
int loops = fsize / buffer_size;
int lastChunk = fsize % buffer_size;
for (int i = 0; i < loops; i++) {
file.read(buffer, buffer_size);
// DO what needs with the buffer
}
if (lastChunk > 0) {
file.read(buffer, lastChunk);
// DO what needs with the buffer
}
delete[] buffer;

C++ fread() into a std::string

Like always, problems with pointers. This time I am trying to read a file (opened in binary mode) and store some portion of it in a std::string object.
Let's see:
FILE* myfile = fopen("myfile.bin", "rb");
if (myfile != NULL) {
short stringlength = 6;
string mystring;
fseek(myfile , 0, SEEK_SET);
fread((char*)mystring.c_str(), sizeof(char), (size_t)stringlength, myfile);
cout << mystring;
fclose(myfile );
}
Is this possible? I don't get any message. I am sure the file is O.K. When I try with char* it does work but I want to store it directly into the string. Thanks for your help!
Set the string to be large enough first to avoid buffer overrun, and access the byte array as &mystring[0] to satisfy const and other requirements of std::string.
FILE* myfile = fopen("myfile.bin", "rb");
if (myfile != NULL) {
short stringlength = 6;
string mystring( stringlength, '\0' );
fseek(myfile , 0, SEEK_SET);
fread(&mystring[0], sizeof(char), (size_t)stringlength, myfile);
cout << mystring;
fclose(myfile );
}
There are many, many issues in this code but that is a minimal adjustment to properly use std::string.
I would recommend this as the best way to do such a thing. Also, you should check that all the bytes were read.
FILE* sFile = fopen(this->file.c_str(), "rb"); // "rb": binary mode, as in the question
// if unable to open file
if (sFile == nullptr)
{
return false;
}
// seek to end of file
fseek(sFile, 0, SEEK_END);
// get current file position which is end from seek
size_t size = ftell(sFile);
std::string ss;
// allocate string space and set length
ss.resize(size);
// go back to beginning of file for read
rewind(sFile);
// read 1*size bytes from sfile into ss
fread(&ss[0], 1, size, sFile);
// close the file
fclose(sFile);
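The check mentioned above could look like the following sketch. fread returns the number of items it actually read, so comparing that value with the requested count detects a short read. The helper name read_exact is an assumption.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: reads exactly `count` bytes from `fp` into `out`.
// On a short read it returns false and `out` holds only the bytes read.
bool read_exact(std::FILE* fp, std::size_t count, std::string& out)
{
    out.assign(count, '\0');   // make room before writing into the string
    if (count == 0) return true;
    const std::size_t got = std::fread(&out[0], 1, count, fp);
    out.resize(got);           // drop the part that was not filled
    return got == count;
}
```

Because the length is tracked by the string itself, 0 bytes inside the data are preserved.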
string::c_str() returns const char* which you can not modify.
One way to do this would be to use a char* first and construct a string from it.
Example
char* buffer = (char*)malloc(stringlength * sizeof(char));
size_t got = fread(buffer, sizeof(char), (size_t)stringlength, myfile);
string mystring(buffer, got); // pass the length: the buffer is not null-terminated
free(buffer);
But then again, if you want a string, you should perhaps ask yourself Why am I using fopen and fread in the first place??
fstream would be a much much better option.
You can read more about it here
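A minimal sketch of the stream-based version, assuming the same goal of reading the first few bytes into a std::string. The names read_prefix and demo_prefix are hypothetical. The function accepts any std::istream, so a std::ifstream opened with std::ios::binary can be passed in.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns up to `count` bytes from the current
// position of `in`. The result is shorter if the stream ends early.
std::string read_prefix(std::istream& in, std::size_t count)
{
    std::string s(count, '\0');
    if (count == 0) return s;
    in.read(&s[0], static_cast<std::streamsize>(count));
    s.resize(static_cast<std::size_t>(in.gcount()));
    return s;
}

// Example with an in-memory stream standing in for the file.
std::string demo_prefix()
{
    std::istringstream src("myfile contents");
    return read_prefix(src, 6);   // "myfile"
}
```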
Please check out the following regarding c_str to see some things that are wrong with your program. The issues include that the result of c_str is not modifiable, and that it returns a pointer to your string's contents even though the string is still empty, so there is no room for the data.
http://www.cplusplus.com/reference/string/string/c_str/
As for resolving it... you could try reading into a char* and then initializing your string from that.
No, it is not. The std::string::c_str() method does not return a modifiable character sequence, as you can validate from here. A better solution would be to use a char array as a buffer. Here is an example:
FILE* myfile = fopen("myfile.bin", "rb");
if (myfile != NULL) {
char buffer[7] = { '\0' }; // zero-initialized so the 6 bytes read stay null-terminated. Or you can use malloc() / new instead.
short stringlength = 6;
fseek(myfile , 0, SEEK_SET);
fread(buffer, sizeof(char), (size_t)stringlength, myfile);
string mystring(buffer);
cout << mystring;
fclose(myfile );
//use free() or delete if buffer is allocated dynamically
}