How to read a file containing '\0' characters? - c++

I wrote a simple program to read a TXT file. The problem is the file contains some '\0' characters. Here's a sample :
And here's the solution I've found to solve my problem :
FILE *pInput = fopen("Encoded.txt", "rb");
fseek(pInput, 0, SEEK_END);
size_t size = ftell(pInput);
fseek(pInput, 0, SEEK_SET);
char *buffer = new char[size];
for (int i = 0; i < size; i++)
buffer[i] = fgetc(pInput);
I would like to replace the following code :
for (int i = 0; i < size; i++)
buffer[i] = fgetc(pInput);
By just a simple function call. Is there a function which can do this job ?
I tried fread and fgets, but they seem to stop reading at the first '\0' character.
Thanks a lot in advance for your help.

fread is fine for reading arbitrary binary; it returns the number of elements read, which is a value you should store and use in all dealings with your buffer. (Read some documentation on fread to find out how it works.)
(On the other hand, with fgets you won't be able to find out how many characters were read because a pointer to a [assumedly null-terminated] C-string is all you get out of it.)
You need to ensure that your handling of the resulting buffer is zero-safe. That means no strlen or the like, which all assume a null-terminated string and stop at the first '\0'.
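As a minimal sketch of the advice above, the loop below reads a file with fread and keeps the returned count instead of relying on a terminator. The helper name read_all_bytes and the 4096-byte chunk size are assumptions made for illustration, not part of the original question.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads the whole file in binary mode and returns its bytes.
// The vector's size() is the byte count, so embedded '\0' bytes are preserved.
std::vector<char> read_all_bytes(const char* path)
{
    std::vector<char> data;
    std::FILE* in = std::fopen(path, "rb");
    if (in == nullptr)
        return data;                 // an empty result stands in for "could not open"

    char chunk[4096];
    std::size_t got = 0;
    // fread returns the number of elements actually read; use that, never strlen.
    while ((got = std::fread(chunk, 1, sizeof chunk, in)) > 0)
        data.insert(data.end(), chunk, chunk + got);

    std::fclose(in);
    return data;
}
```

Calling read_all_bytes("Encoded.txt") would then give a buffer whose length comes from size(), not from scanning for '\0'.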

Quoting cplusplus.com and removing the plumbing that you'll find in the link:
// Open the file with the pointer at the end
ifstream file("example.bin", ios::in|ios::binary|ios::ate);
// Get the file size
streampos size = file.tellg();
// Allocate a block
char* memblock = new char [size];
// We were at the end; go back to the beginning
file.seekg(0, ios::beg);
// Read the whole file
file.read(memblock, size);
Et voilà!

Related

Weird seek behaviour in C and C++ [duplicate]

I wrote a sample project to read a file into a buffer.
When I use the tellg() function, it gives me a larger value than the
number of bytes the read function actually reads from the file. I think there is a bug.
Here is my code:
EDIT:
void read_file (const char* name, int *size , char*& buffer)
{
ifstream file;
file.open(name,ios::in|ios::binary);
*size = 0;
if (file.is_open())
{
// get length of file
file.seekg(0,std::ios_base::end);
int length = *size = file.tellg();
file.seekg(0,std::ios_base::beg);
// allocate buffer in size of file
buffer = new char[length];
// read
file.read(buffer,length);
cout << file.gcount() << endl;
}
file.close();
}
main:
void main()
{
int size = 0;
char* buffer = NULL;
read_file("File.txt",&size,buffer);
for (int i = 0; i < size; i++)
cout << buffer[i];
cout << endl;
}
tellg does not report the size of the file, nor the offset
from the beginning in bytes. It reports a token value which can
later be used to seek to the same place, and nothing more.
(It's not even guaranteed that you can convert the type to an
integral type.)
At least according to the language specification: in practice,
on Unix systems, the value returned will be the offset in bytes
from the beginning of the file, and under Windows, it will be
the offset from the beginning of the file for files opened in
binary mode. For Windows (and most non-Unix systems), in text
mode, there is no direct and immediate mapping between what
tellg returns and the number of bytes you must read to get to
that position. Under Windows, all you can really count on is
that the value will be no less than the number of bytes you have
to read (and in most real cases, won't be too much greater,
although it can be up to two times more).
If it is important to know exactly how many bytes you can read,
the only way of reliably doing so is by reading. You should be
able to do this with something like:
#include <limits>
file.ignore( std::numeric_limits<std::streamsize>::max() );
std::streamsize length = file.gcount();
file.clear(); // Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );
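The snippet above can be wrapped into a small function, shown here as a sketch. The name readable_length and the -1 failure value are assumptions; the calls themselves are the ones from the answer.

```cpp
#include <fstream>
#include <ios>
#include <limits>

// Hypothetical helper: counts the bytes that can actually be read from an
// already-open stream, then rewinds it. Returns -1 if the rewind fails.
std::streamsize readable_length(std::ifstream& file)
{
    // Skip everything up to end of file; gcount() reports how much was skipped.
    file.ignore(std::numeric_limits<std::streamsize>::max());
    const std::streamsize length = file.gcount();
    file.clear();                          // ignore() will have set eofbit
    file.seekg(0, std::ios_base::beg);     // back to the start for the real read
    return file ? length : -1;
}
```

The price of this approach is that the file is read twice, once to count and once to load the data.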
Finally, two other remarks concerning your code:
First, the line:
*buffer = new char[length];
shouldn't compile: you have declared buffer to be a char*,
so *buffer has type char, and is not a pointer. Given what
you seem to be doing, you probably want to declare buffer as
a char**. But a much better solution would be to declare it
as a std::vector<char>& or a std::string&. (That way, you
don't have to return the size as well, and you won't leak memory
if there is an exception.)
Second, the loop condition at the end is wrong. If you really
want to read one character at a time,
while ( file.get( buffer[i] ) ) {
++ i;
}
should do the trick. A better solution would probably be to
read blocks of data:
while ( file.read( buffer + i, N ) || file.gcount() != 0 ) {
i += file.gcount();
}
or even:
file.read( buffer, size );
size = file.gcount();
EDIT: I just noticed a third error: if you fail to open the
file, you don't tell the caller. At the very least, you should
set the size to 0 (but some sort of more precise error
handling is probably better).
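The remarks above (fill a std::vector<char>&, read in blocks, and tell the caller when the file cannot be opened) could be combined roughly as follows. This is only a sketch: the bool return value and the 4096-byte block size are assumptions, not something the answer prescribes.

```cpp
#include <fstream>
#include <vector>

// Sketch of the question's read_file, reworked to fill a std::vector<char>
// and to report failure through the return value.
bool read_file(const char* name, std::vector<char>& buffer)
{
    buffer.clear();
    std::ifstream file(name, std::ios::in | std::ios::binary);
    if (!file.is_open())
        return false;                      // the caller learns that the open failed

    char block[4096];
    // read() reports failure on the last, short block, so also check gcount().
    while (file.read(block, sizeof block) || file.gcount() != 0)
        buffer.insert(buffer.end(), block, block + file.gcount());

    return file.eof();                     // true only if we stopped at end of file
}
```

After a successful call, buffer.size() is the number of bytes read, so no separate size parameter is needed.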
In C++17 there are std::filesystem file_size methods and functions, so that can streamline the whole task.
std::filesystem::file_size - cppreference.com
std::filesystem::directory_entry::file_size - cppreference.com
With those functions/methods there's a chance that the file does not have to be opened at all and that cached data is used instead (especially with the std::filesystem::directory_entry::file_size method).
Those functions also require only directory read permissions and not file read permission (as tellg() does)
void read_file (int *size, char* name,char* buffer)
*buffer = new char[length];
These lines do look like a bug: you create a char array and try to store its address in buffer[0], which is a single char. Then you read the file into buffer, which is still uninitialized.
You need to pass buffer by pointer:
void read_file (int *size, char* name,char** buffer)
*buffer = new char[length];
Or by reference, which is the c++ way and is less error prone:
void read_file (int *size, char* name,char*& buffer)
buffer = new char[length];
...
fseek(fptr, 0L, SEEK_END);
filesz = ftell(fptr);
will give the file size if the file was opened with fopen.
Using ifstream,
in.seekg(0,ifstream::end);
filesz = in.tellg();
would do something similar.

I have an issue with fgets in c++

I'm doing a small exercise to read a file which contains one long string and load this into an array of strings. So far I have:
char* data[11];
char buf[15];
int i = 0;
FILE* indata;
indata = fopen( "somefile.txt", "r" );
while( i < 11)
{
fgets(buf, 16, indata);
data[i] = buf;
i++;
}
fclose( indata );
somefile.txt: "aaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbaahhhhhbbbbdddddddddddddbbbbb"
etc..
This reads in 15 characters, adds that string to the array and gets the next 15. The problem is the array always equals the last string, so if the last string is "ccccv" the whole array, data[0] = "ccccv", data[1] = "ccccv", data[2] = "ccccv" and so on.
Does anyone know why this is happening and whether there is a better way to do it? Thanks
Each pointer in data will point to the same memory area, which is buf.
You need to use strcpy + malloc.
Also, it seems like you have a "minor" buffer overflow: buf has size 15, but you're telling fgets it may store 16 characters (including the terminating '\0').
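The answer suggests strcpy + malloc; the sketch below swaps in std::string and std::vector instead, which make the same per-chunk copy without manual memory management. The function name read_chunks is an assumption, and the buffer is sized 16 so that 15 characters plus the terminator fit.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: splits a text file into pieces of at most 15 characters.
std::vector<std::string> read_chunks(const char* path)
{
    std::vector<std::string> data;
    std::FILE* in = std::fopen(path, "r");
    if (in == nullptr)
        return data;

    char buf[16];                          // 15 characters plus the terminating '\0'
    // fgets stores at most sizeof buf - 1 characters and returns NULL at end of file.
    while (std::fgets(buf, sizeof buf, in) != nullptr)
        data.push_back(buf);               // std::string copies the characters out of buf

    std::fclose(in);
    return data;
}
```

Because every element owns its own copy, later reads into buf no longer change the strings already stored. Keep in mind that fgets also stops at a newline, so a chunk can be shorter than 15 characters.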

C++ Char pointer to char array

None of the posted answers I've read work, so I'm asking again.
I'm trying to copy the string data pointed to by a char pointer into a char array.
I have a function that reads from an ifstream into a char array:
char* FileReader::getNextBytes(int numberOfBytes) {
char *buf = new char[numberOfBytes];
file.read(buf, numberOfBytes);
return buf;
}
I then have a struct :
struct Packet {
char data[MAX_DATA_SIZE]; // can hold file name or data
} packet;
I want to copy what is returned from getNextBytes(MAX_DATA_SIZE) into packet.data;
EDIT: Let me show you what I'm getting with all the answers gotten below (memcpy, strcpy, passing as parameter). I'm thinking the error comes from somewhere else. I'm reading a file as binary (it's a png). I'll loop while the fstream is good() and read from the fstream into the buf (which might be the data array). I want to see the length of what I've read :
cout << strlen(packet.data) << endl;
This returns different sizes every time:
8
529
60
46
358
66
156
After that, apparently there are no bytes left to read although the file is 13K + bytes long.
This can be done using the standard library function strcpy, which is declared in <string.h> / <cstring>:
strcpy(packet.data, buf);
This requires that file.read produce a proper character sequence that ends with '\0'. You might also want to ensure numberOfBytes is big enough to accommodate the whole string. Otherwise you could get a segmentation fault.
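// if buf is not properly null-terminated, put a null character in its last slot
// (index numberOfBytes itself would be one past the end of the allocation)
buf[numberOfBytes - 1] = '\0';
// copy the string from buf to the struct
strcpy(packet.data, buf);
// or, with an explicit upper bound on the number of characters copied
strncpy(packet.data, buf, MAX_DATA_SIZE);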
Edit:
Whether or not this is being handled as a string is a very important distinction. In your question, you referred to it as a "string", which is what got us all confused.
Without any library assistance:
char* result = reader.getNextBytes(MAX_DATA_SIZE);
for (int i = 0; i < MAX_DATA_SIZE; ++i) {
packet.data[i] = result[i];
}
delete [] result;
Using #include <cstring>:
memcpy(packet.data, result, MAX_DATA_SIZE);
Or for extra credit, rewrite getNextBytes so it has an output parameter:
char* FileReader::getNextBytes(int numberOfBytes, char* buf) {
file.read(buf, numberOfBytes);
return buf;
}
Then it's just:
reader.getNextBytes(MAX_DATA_SIZE, packet.data);
Edit 2:
To get the length of a file:
file.seekg (0, ios::end);
int length = file.tellg();
file.seekg (0, ios::beg);
And with that in hand...
char* buffer = new char[length];
file.read(buffer, length);
Now you have the entire file in buffer.
strlen is not a valid way to determine the amount of binary data. strlen just reads until it finds '\0', nothing more. If you want to read a chunk of binary data, just use a std::vector, resize it to the amount of bytes you read from the file, and return it as value. Problem solved.
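A sketch of that std::vector suggestion, written as a free function so that it stands alone. The name get_next_bytes and the explicit stream parameter are assumptions; in the question this would be a member of FileReader.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical free-function version of getNextBytes: the returned vector's
// size() is the number of bytes actually read, which may be less than requested.
std::vector<char> get_next_bytes(std::ifstream& file, std::size_t numberOfBytes)
{
    std::vector<char> buf(numberOfBytes);
    file.read(buf.data(), static_cast<std::streamsize>(buf.size()));
    // gcount() is the real byte count; shrink the vector to match it.
    buf.resize(static_cast<std::size_t>(file.gcount()));
    return buf;
}
```

The caller can then loop until the returned vector is empty and copy size() bytes with memcpy, without ever calling strlen on binary data.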

c++ use a buffer in memory instead reading directly a file

I have this code that works fine:
FILE *fp;
fp = fopen(filename.c_str(), "rb");
char id[5];
fread(id,sizeof(char),4,fp);
Now I've changed something in my architecture: instead of the filename (the full path of the file), I have a char pointer that contains the file's data. So I no longer need to open the file (fopen, etc.), only to read from the char* buffer.
how can I do this?
thanks in advance
If I'm understanding your question correctly, you want to access a four-character ID somewhere in the middle of your buffer. The easiest way to do this is just to copy the data into a new buffer and add a null terminator.
size_t index = 0;
// ...
char id[5];
memcpy(id, &myData[index], 4);
id[4] = '\0';
index += 4;
You can then read through your buffer sequentially by updating the index value every time you read something.
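The index bookkeeping described above can be bundled into a small cursor type, sketched here. The struct name MemReader and its members are hypothetical; its read function is modeled on fread(dst, 1, count, fp), so existing call sites would change very little.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical cursor over a block of file data that is already in memory.
struct MemReader {
    const char* data;    // start of the buffer
    std::size_t size;    // total number of bytes in the buffer
    std::size_t pos;     // current read position

    // Copies up to count bytes into dst and returns how many were copied,
    // which is less than count once the end of the buffer is reached.
    std::size_t read(void* dst, std::size_t count)
    {
        const std::size_t left = size - pos;
        const std::size_t n = count < left ? count : left;
        std::memcpy(dst, data + pos, n);
        pos += n;
        return n;
    }
};
```

With a reader set up as MemReader r = {myData, myDataSize, 0}, where myDataSize is the (assumed known) length of the data, the original fread(id, sizeof(char), 4, fp) becomes r.read(id, 4).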
char id[5];
strncpy(id,bfr,4);
id[4]='\0';
Where bfr is the buffer with your file data.
I also strongly advise you to read the chapter on pointers and strings in K&R: The C Programming Language.

using fread to read into int buffer

I would like to know if I can use fread to read data into an integer buffer.
I see fread() takes void * as the first parameter. So can't I just pass an integer
buffer (typecast to void *) and then use this to read however many bytes I want from the file, as long as the buffer is big enough?
i.e. can't I do:
int buffer[10];
fread((void *)buffer, sizeof(int), 10, somefile);
// print contents of buffer
for(int i = 0; i < 10; i++)
cout << buffer[i] << endl;
What is wrong here ?
Thanks
This should work if you wrote the ints to the file using something like fwrite ("binary" write). If the file is human-readable (you can open it with a text editor and see numbers that make sense) you probably want fscanf / cin.
As others have mentioned fread should be able to do what you want
provided the input is in the binary format you expect. One caveat
I would add is that the code will have platform dependencies and
will not function correctly if the input file is moved between
platforms with differently sized integers or different
endianness.
Also, you should always check your return values; fread could fail.
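One way around the platform dependency is to fix the file format and decode it byte by byte. The sketch below assumes the file stores 32-bit unsigned values in little-endian order. That format choice and the name read_u32_le are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: reads up to count 32-bit values stored as little-endian
// bytes and returns how many complete values were read.
std::size_t read_u32_le(std::FILE* in, std::uint32_t* out, std::size_t count)
{
    std::size_t done = 0;
    unsigned char b[4];
    // Stop at the first short read, so a truncated trailing value is not counted.
    while (done < count && std::fread(b, 1, sizeof b, in) == sizeof b) {
        out[done] = static_cast<std::uint32_t>(b[0])
                  | static_cast<std::uint32_t>(b[1]) << 8
                  | static_cast<std::uint32_t>(b[2]) << 16
                  | static_cast<std::uint32_t>(b[3]) << 24;
        ++done;
    }
    return done;
}
```

Because the value is assembled with shifts, the result does not depend on the byte order or the int size of the machine doing the reading.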
Yes, you can use fread to read into an array of integers:
int buffer[10];
size_t readElements = fread((void *)buffer, sizeof(int), 10, somefile);
for(size_t i = 0; i < readElements; i++)
cout << buffer[i] << endl;
You can use the number of elements that fread returns to decide how many values to print.
EDIT: provided you are reading from a file in binary mode and the values were written as cnicutar mentioned with fwrite.
I was trying the same thing and got the same result as you: a large int value when trying to read an integer from a file using fread(). I finally found the reason for it.
Suppose your input file contains only one of the following:
"5"
"5 5 5"
I got the details from http://www.programmersheaven.com/mb/beginnercpp/396198/396198/fread-returns-invalid-integer/
fread() reads binary data (even if the file is opened in 'text' mode). The number 540352565 in hex is 0x20352035; 0x20 is the ASCII code of a space and 0x35 is the ASCII code of a '5' (they appear in reversed order because the machine is little-endian).
So what fread does is read the ASCII codes from the file and builds an int from it, expecting binary data. This should explain the behavior when reading the '5 5 5' file. The same happens when reading the file with a single '5', but only one byte can be read (or two if it is followed by a newline) and fread should fail if it reads less than sizeof(int) bytes, which is 4 (in this case).
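For a file like the '5 5 5' example, where the numbers are stored as text, formatted input is the tool an earlier answer points to (fscanf / cin). A minimal sketch using fscanf follows; the helper name read_text_ints is an assumption.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: parses up to count whitespace-separated decimal integers
// from a text file and returns how many were successfully converted.
std::size_t read_text_ints(std::FILE* in, int* out, std::size_t count)
{
    std::size_t done = 0;
    // fscanf returns the number of items converted; 1 means one int was parsed.
    while (done < count && std::fscanf(in, "%d", &out[done]) == 1)
        ++done;
    return done;
}
```

Here the characters '5', ' ', '5', ' ', '5' are parsed into three ints with the value 5, instead of being copied verbatim into the bytes of a single int.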
Since the reaction to the earlier responses is that it still does not work, here is complete code that you can try out.
Please note that the following code does NOT contain proper checks and CAN crash if the file does not exist, there is no memory left, permissions are missing, etc.
Checks should be added for each open, close, read, and write operation.
Moreover, I would allocate the buffer dynamically:
int* buffer = new int[10];
That is because I am not comfortable with a plain array being used as a pointer. But whatever. Please also note that you should pick the right type (uint32_t, uint16_t, uint8_t, int, short...) for the range of numbers you store, in order to save space.
The following code creates a file and writes correct data to it, which you can then read back.
FILE* somefile;
somefile = fopen("/root/Desktop/CAH/scripts/cryptor C++/OUT/TOCRYPT/wee", "wb");
int buffer[10];
for(int i = 0; i < 10; i++)
buffer[i] = 15;
fwrite((void *)buffer, sizeof(int), 10, somefile);
// print contents of buffer
for(int i = 0; i < 10; i++)
cout << buffer[i] << endl;
fclose(somefile);
somefile = fopen("/root/Desktop/CAH/scripts/cryptor C++/OUT/TOCRYPT/wee", "rb");
fread((void *)buffer, sizeof(int), 10, somefile);
// print contents of buffer
for(int i = 0; i < 10; i++)
cout << buffer[i] << endl;
fclose(somefile);
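The checks that the answer says are missing could look roughly like this. It is a sketch only: the helper names write_ints and read_ints are made up, and a real program would probably report which step failed rather than return a bare bool.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes count ints in binary form.
// Returns true only if the open, the write, and the close all succeeded.
bool write_ints(const char* path, const int* values, std::size_t count)
{
    std::FILE* f = std::fopen(path, "wb");
    if (f == nullptr)
        return false;
    const bool wrote = std::fwrite(values, sizeof(int), count, f) == count;
    const bool closed = std::fclose(f) == 0;   // fclose flushes, so it can fail too
    return wrote && closed;
}

// Hypothetical helper: reads exactly count ints written by write_ints.
// Returns false if the file cannot be opened or holds fewer than count values.
bool read_ints(const char* path, int* values, std::size_t count)
{
    std::FILE* f = std::fopen(path, "rb");
    if (f == nullptr)
        return false;
    const bool got = std::fread(values, sizeof(int), count, f) == count;
    std::fclose(f);
    return got;
}
```

As noted earlier in the thread, a file written this way is only readable on platforms with the same int size and byte order.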