I have a program which uses stdio for reading and writing a binary file. It caches the current stream position and will not seek if the read/write offset is already at the desired position.
However, an interesting problem appears: when a byte is read and the following byte is then written, the write doesn't actually happen!
Here is a program to reproduce the problem:
#include <cstdio>
int main() {
FILE *f = fopen("test.bin", "wb");
unsigned char d[1024] = { 0 };
fwrite(d, 1, 1024, f);
fclose(f);
f = fopen("test.bin", "rb+");
for (size_t i = 0; i < 1024; i++) {
unsigned char a[1] = { 255 - (unsigned char)(i) };
fflush(f);
fwrite(a, 1, 1, f);
fflush(f);
fseek(f, i, SEEK_SET);
fread(a, 1, 1, f);
printf("%02X ", a[0]);
}
fclose(f);
return 0;
}
You are supposed to see it write the bytes FF down to 00; however, only the first byte is written, because it is the only write that does not immediately follow an fread.
If it seeks before fwrite, it acts correctly.
The problem happens on Visual Studio 2010/2012 and TDM-GCC 4.7.1 (Windows); however, it works on codepad, which I guess is because it runs on Linux.
Any idea why this happens?
C99 §7.19.5.3/6 (quoted from N869 final draft):
“When a file is opened with update mode (’+’ as the second or third character in the above list of mode argument values) […] input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file.”
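As a minimal sketch of what that rule means for the reproducer above: a positioning call, even a no-op `fseek(f, 0, SEEK_CUR)`, goes between each fread and the fwrite that follows it. The helper name `write_descending` is made up for illustration, and the function assumes the file already exists and is at least `count` bytes long.

```cpp
#include <cstdio>

// Hypothetical helper: rewrites the first `count` bytes of an existing file
// as FF, FE, FD, ... and reads each byte back right after writing it.
bool write_descending(const char *path, std::size_t count) {
    std::FILE *f = std::fopen(path, "rb+");
    if (!f) return false;
    bool ok = true;
    for (std::size_t i = 0; ok && i < count; i++) {
        unsigned char a = static_cast<unsigned char>(255 - i);
        // The previous iteration ended with fread, so a positioning call is
        // required before writing. Seeking to the current position is enough.
        ok = std::fseek(f, 0, SEEK_CUR) == 0
          && std::fwrite(&a, 1, 1, f) == 1
          // Output followed by input needs fflush or a positioning call;
          // this fseek back to byte i covers that direction.
          && std::fseek(f, static_cast<long>(i), SEEK_SET) == 0
          && std::fread(&a, 1, 1, f) == 1;
    }
    return std::fclose(f) == 0 && ok;
}
```

If the program caches the stream position to avoid seeks, the cache would also need to remember whether the last operation was a read or a write, and skip the seek only when the direction has not changed.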
I'm trying to use a sparse file to store a sparse array of data. Logically I thought the code had no bugs, but the unit tests keep failing. After many inspections of the code I decided to check the file content after every step and found that the holes were not being created. That is: writing the first element, seeking forward by x elements, and writing the second element ends up with the first and second elements written back to back in the file, with no space between them.
My simplified code:
FILE* file = fopen64(fn.c_str(), "ar+b");
auto const entryPoint = 220; //calculated at runtime, the size of each element is 220 bytes
auto r = fseeko64(file, entryPoint, SEEK_SET);
if(r!=0){
std::cerr << "Error seeking file" << std::endl;
}
size_t written = fwrite(&page->entries[0], sizeof(page->entries), 1, file);
if(written != 1) {
perror("error writing file");
}
fclose(file);
The offset is being calculated correctly. The intended behavior is to write the first element, leave 20 elements empty, then write the 22nd element. When inspecting the file with a hex dump, however, it shows 2 elements at offsets 0 and 220 (directly after the first element). The unit tests also fail because reading the 2nd element actually returns element number 22.
Could anyone explain what is wrong with my code? Maybe I misunderstood the concept of holes?
------Edit1------
Here's my full code
Read function:
FILE* file = fopen64(fn.c_str(), "r+b");
if(file == nullptr){
memset(page->entries, 0, sizeof(page->entries));
return ;
}
MoveCursor(file, id, sizeof(page->entries));
size_t read = fread(&page->entries[0], sizeof(page->entries), 1, file);
fclose(file);
if(read != 1){ //didn't read a full page.
memset(page->entries, 0, sizeof(page->entries));
}
Write function:
auto fn = dir.path().string() + std::filesystem::path::preferred_separator + GetFileId(page->pageId);
FILE* file = fopen64(fn.c_str(), "ar+b");
MoveCursor(file, page->pageId, sizeof(page->entries));
size_t written = fwrite(&page->entries[0], sizeof(page->entries), 1, file);
if(written != 1) {
perror("error writing file");
}
fclose(file);
void MoveCursor(FILE* file, TPageId pid, size_t pageMultiplier){
auto const entryPoint = pid * pageMultiplier;
auto r = fseeko64(file, entryPoint, SEEK_SET);
if(r!=0){
std::cerr << "Error seeking file" << std::endl;
}
}
And here's a simplified page class:
template<typename TPageId, uint32_t EntriesCount>
struct PODPage {
bool dirtyBit = false;
TPageId pageId;
uint32_t entries[EntriesCount];
};
The reason I'm saying it is an fseeko problem when writing is that inspecting the file content with xxd shows the data is out of order. Breakpoints in the MoveCursor function show the offset is calculated correctly, and manual inspection of the FILE fields shows the offset is set correctly; however, the write doesn't leave a hole.
=============Edit2============
Minimal reproducer. The logic goes: write the first chunk of data, seek to position 900, write the second chunk of data, then read from position 900 and compare with the data that was supposed to be there. Each operation opens and closes the file, which is what happens in my original code; keeping the file open is not allowed.
The expected behavior is to create a hole in the file; the actual behavior is that the file is written sequentially without holes.
#include <iostream>
#define _FILE_OFFSET_BITS 64
#define __USE_FILE_OFFSET64 1
#include <stdio.h>
#include <cstring>
int main() {
uint32_t data[10] = {1,2,3,4,5,6,7,8,9};
uint32_t data2[10] = {9,8,7,6,5,4,3,2,1};
{
FILE* file = fopen64("data", "ar+b");
if(fwrite(&data[0], sizeof(data), 1, file) !=1) {
perror("err1");
return 0;
}
fclose(file);
}
{
FILE* file = fopen64("data", "ar+b");
if (fseeko64(file, 900, SEEK_SET) != 0) {
perror("err2");
return 0;
}
if(fwrite(&data2[0], sizeof(data2), 1, file) !=1) {
perror("err3");
return 0;
}
fclose(file);
}
{
FILE* file = fopen64("data", "r+b");
if (fseeko64(file, 900, SEEK_SET) != 0) {
perror("err4");
return 0;
}
uint32_t data3[10] = {0};
if(fread(&data3[0], sizeof(data3), 1, file)!=1) {
perror("err5");
return 0;
}
fclose(file);
if (memcmp(&data2[0],&data3[0],sizeof(data))!=0) {
std::cerr << "err6";
return 0;
}
}
return 0;
}
I think your problem is the same as discussed here:
fseek does not work when file is opened in "a" (append) mode
Does fseek() move the file pointer to the beginning of the file if it was opened in "a+b" mode?
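Summary of the two above: if a file is opened for appending (using "a"), then fseek only applies to the read position, not to the write position. Every write goes to the end of the file.
You can fix this by opening the file with "w" or "w+" instead. Both worked for me with your minimal code example. Be aware that "w" and "w+" truncate an existing file on every open; since you reopen the file for each operation, "r+b" (which requires the file to already exist) is the mode that keeps the earlier contents and still lets fseek position the write.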
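Here is a rough sketch of a positioned write that survives separate open/close cycles. The helper name `write_at` is invented for this example, and it uses plain `fseek` with a `long` offset rather than the `fseeko64` from the question, so treat it as an illustration, not a drop-in replacement for large files.

```cpp
#include <cstdio>

// Hypothetical helper: writes `len` bytes at byte `offset` of `path`,
// keeping whatever the file already contains.
bool write_at(const char *path, long offset, const void *buf, std::size_t len) {
    // "r+b" keeps existing data and honors fseek for writes,
    // but it fails if the file does not exist yet.
    std::FILE *f = std::fopen(path, "r+b");
    // Fall back to "w+b" to create the file. This sketch assumes the first
    // open failed because the file is missing, so there is nothing to truncate.
    if (!f) f = std::fopen(path, "w+b");
    if (!f) return false;
    bool ok = std::fseek(f, offset, SEEK_SET) == 0
           && std::fwrite(buf, 1, len, f) == len;
    return std::fclose(f) == 0 && ok;
}
```

Calling `write_at("data", 0, data, sizeof(data))` followed by `write_at("data", 900, data2, sizeof(data2))` should leave the first chunk at offset 0 and the second at offset 900. Whether the gap is stored as an actual hole on disk depends on the filesystem.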
I came across a problem when writing a binary file. To debug it, I made the output file stream unbuffered by calling setbuf(pf, 0). The following is the part of the code that shows the problem. Before calling fwrite, pos0 = 943195. The fwrite call writes one int to the file. However, after calling fwrite, pos = 943200, whereas I think it should be 943199.
FILE* pf = fopen(szFilePath, "r+b");
setbuf(pf, 0);
// lots of read and write operations on this file
int pos0 = ftell(pf);
fwrite(&some_num, sizeof(int), 1, pf);
int pos = ftell(pf);
// more read and write operations on this file.
fclose(pf);
Is this some kind of alignment requirement when writing a binary file on Solaris? Since 943195 is not divisible by 4, does fwrite start at 943196? The content written to the file also shows that fwrite starts writing at a 1-byte offset. If it is an alignment problem, can this feature be disabled somehow? The code works as expected on other OSs such as Windows and CentOS 7.
Update:
The following system calls:
1334: llseek(18, 0, SEEK_CUR) = 943195
1334: write(18, "B1", 1) = 1
1334: write(18, " !CC13\0", 4) = 4
1334: llseek(18, 0, SEEK_CUR) = 943200
are generated by these three lines of code:
int pos0 = ftell(pf);
fwrite(&some_num, sizeof(int), 1, pf);
int pos = ftell(pf);
There is an unexpected extra write call before the int is written:
1334: write(18, "B1", 1) = 1
I have no idea where this call comes from. The application is single-threaded and the file is not opened anywhere else.
I want to write three characters to a file, then a struct, then one more character.
Finally I would like to read the character before the struct, the struct itself, the character after the struct and display them on the screen.
struct stuff{
int a;
int b;
};
int main(){
FILE * fp = fopen("input.txt", "w+");
char charA = 'z';
char charB = 's';
char charC = 'q';
char charD = 'e';
//create a struct of type stuff
stuff s;
s.a = 123;
s.b = 2111;
//fwrite three first chars
fwrite(&charA, 1, sizeof(char), fp);
fwrite(&charB, 1, sizeof(char), fp);
fwrite(&charC, 1, sizeof(char), fp);
//fwrite the struct
fwrite(&s, 1, sizeof(struct stuff), fp);
//fwrite the last char
fwrite(&charD, 1, sizeof(char), fp);
//read the char before the struct, the struct itself,
// and the char after the struct
char expectedCharC;
stuff expectedStructS;
char expectedCharD;
fseek(fp, sizeof(struct stuff) + sizeof(char), SEEK_END);
fread(&expectedCharC, 1, sizeof(char), fp);
fread(&expectedStructS, 1, sizeof(struct stuff), fp);
fseek(fp, sizeof(char)*3 + sizeof(struct stuff), SEEK_SET);
fread(&expectedCharD, 1, sizeof(char), fp);
cout<<expectedCharC<<" "<<expectedStructS.a<<" ";
cout<<expectedStructS.b<<" "<<expectedCharD<<endl;
fclose(fp);
return 0;
}
Instead of this result:
q 123 2111 e
I get this result:
4197174 0 e
I don't know what I'm doing wrong. I'm writing bytes to the file, reading them back and displaying them on the screen. What goes wrong?
thank you in advance
Wow, lots of problems in your code. Let's tackle them one by one.
As mentioned by unwind, the mode you're using to open the file doesn't match what you're trying to do. "w+" does allow both reading and writing, but it opens the file in text mode, and you're reading and writing raw bytes, so you want "wb+".
You're using fwrite wrongly. It goes fwrite(pointer to data, size of each data, number of data, FILE pointer);.
You're using fseek wrongly. I see you're confused with the offset parameter. This offset defines a signed distance from the origin specified as the last argument to fseek. Therefore, if you're at SEEK_END, you should be moving backwards by having your offset be a negative number.
I've done these changes myself and now it works. Output: q 123 2111 e
Here's a nice little website for you too. Helped me with your problem.
Thank you for reading.
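For reference, the three fixes listed above (binary update mode, fwrite argument order, negative offset with SEEK_END) could look roughly like the sketch below. The function name `round_trip` and its out-parameters are made up for this example, and it only reads the data back within the same process, so struct layout is not a concern here.

```cpp
#include <cstdio>

struct stuff {
    int a;
    int b;
};

// Hypothetical helper: writes 'z', 's', 'q', a struct, then 'e', and reads back
// the char before the struct, the struct itself, and the char after it.
bool round_trip(const char *path, char &c, stuff &s, char &d) {
    std::FILE *fp = std::fopen(path, "wb+");  // binary mode, read and write
    if (!fp) return false;
    const char head[3] = {'z', 's', 'q'};
    const stuff in = {123, 2111};
    const char tail = 'e';
    // fwrite(pointer, size of each element, number of elements, stream)
    bool ok = std::fwrite(head, sizeof head[0], 3, fp) == 3
           && std::fwrite(&in, sizeof in, 1, fp) == 1
           && std::fwrite(&tail, sizeof tail, 1, fp) == 1
           // Negative offset from the end: skip back over 'e', the struct,
           // and one more char to land on 'q'.
           && std::fseek(fp, -static_cast<long>(sizeof(stuff) + 2), SEEK_END) == 0
           && std::fread(&c, sizeof c, 1, fp) == 1
           && std::fread(&s, sizeof s, 1, fp) == 1
           && std::fread(&d, sizeof d, 1, fp) == 1;
    return std::fclose(fp) == 0 && ok;
}
```

The fseek between the last fwrite and the first fread also satisfies the rule that output on an update stream must not be directly followed by input.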
First, as has been pointed out, you must open the file in binary
mode. Even then, just dumping the bytes of a struct means
that you won't be able to read it correctly some time in the
future. But as long as you're reading from the same process, it
should be OK.
The real problem is what you are doing with all of the fseek:
before the first fread, you do an fseek beyond the end of
the file. Any read from that position is guaranteed to fail.
You really should check the status of the file, and ensure that
the fread has succeeded before accessing any of the values you
read. If it failed, accessing the variables (at least those in
stuff) is undefined behavior; most likely, you'll get some
random garbage.
Your first fseek should probably be to the beginning of the file, or
else:
fseek( fp, -(long)(sizeof( stuff ) + 4), SEEK_END );
If you've just read the struct, then the second fseek is
unnecessary as well. (In your case, it means that the final
'e' is correctly read.)
You must open your file in binary mode for this to work.
FILE * fp = fopen("input.txt", "wb+");
                                 ^
                                 |
                               blam!
Your wanted result is also a bit unclear, shouldn't it start with the three characters 'z', 's' and 'q', and then have the integers? Note that the integers are likely to appear byte-swapped if you're on a little-endian machine.
To help debug the code, you should add return-value checking to all I/O calls, since I/O can fail. Also note that sizeof (char) is always 1, so it's not very beneficial to write it like that.
I would like to know if I can use fread to read data into an integer buffer.
I see fread() takes void * as the first parameter. So can't I just pass an integer
buffer (typecast to void *) and then use this to read however many bytes I want from the file, as long as the buffer is big enough?
i.e., can't I do:
int buffer[10];
fread((void *)buffer, sizeof(int), 10, somefile);
// print contents of buffer
for(int i = 0; i < 10; i++)
cout << buffer[i] << endl;
What is wrong here ?
Thanks
This should work if you wrote the ints to the file using something like fwrite ("binary" write). If the file is human-readable (you can open it with a text editor and see numbers that make sense) you probably want fscanf / cin.
As others have mentioned fread should be able to do what you want
provided the input is in the binary format you expect. One caveat
I would add is that the code will have platform dependencies and
will not function correctly if the input file is moved between
platforms with differently sized integers or different
endianness.
Also, you should always check your return values; fread could fail.
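A small sketch of what such a check might look like. The helper `read_ints` and its `failed` flag are names invented here; the point is only that a short count from fread can mean either end-of-file or an error, and ferror tells the two apart.

```cpp
#include <cstdio>

// Hypothetical helper: reads up to `max` ints from `f` into `out`.
// Returns the number of ints actually read and sets *failed to true only
// when the short count was caused by a stream error rather than end-of-file.
std::size_t read_ints(std::FILE *f, int *out, std::size_t max, bool *failed) {
    std::size_t n = std::fread(out, sizeof *out, max, f);
    *failed = (n < max) && std::ferror(f) != 0;
    return n;
}
```

The caller should then only look at the first `n` elements of the buffer; anything beyond that was never written by fread.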
Yes you can use fread to read into an array of integers
int buffer[10];
size_t readElements = fread((void *)buffer, sizeof(int), 10, somefile);
for(size_t i = 0; i < readElements; i++)
cout << buffer[i] << endl;
You can check the number of elements fread returns to print out.
EDIT: provided you are reading from a file in binary mode and the values were written as cnicutar mentioned with fwrite.
I was trying the same thing and getting the same result as you: a large int value when trying to read an integer from a file using fread(). I finally found the reason for it.
Suppose your input file contains only "5", or only "5 5 5".
The details I got from http://www.programmersheaven.com/mb/beginnercpp/396198/396198/fread-returns-invalid-integer/
fread() reads binary data (even if the file is opened in 'text' mode). The number 540352565 in hex is 0x20352035; 0x20 is the ASCII code of a space and 0x35 is the ASCII code of a '5' (they appear in reversed order because this is a little-endian machine).
So what fread does is read the ASCII codes from the file and builds an int from it, expecting binary data. This should explain the behavior when reading the '5 5 5' file. The same happens when reading the file with a single '5', but only one byte can be read (or two if it is followed by a newline) and fread should fail if it reads less than sizeof(int) bytes, which is 4 (in this case).
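To make the arithmetic concrete, here is a small sketch that assembles four bytes in little-endian order, which is what the hardware effectively does when fread drops them into an int on such a machine. The function name `le_bytes_to_u32` is invented for this example, and it assumes an ASCII character set.

```cpp
#include <cstdint>

// Hypothetical helper: combines four bytes the way a little-endian machine
// interprets them as a 32-bit integer (first byte is least significant).
std::uint32_t le_bytes_to_u32(const unsigned char *b) {
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}
```

Feeding it the first four bytes of the "5 5 5" file, that is '5', ' ', '5', ' ' (0x35 0x20 0x35 0x20), gives 0x20352035, which is 540352565 in decimal.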
Since the reaction to the responses is that it still does not work, I will provide complete code here so you can try it out.
Please note that the following code does NOT contain proper checks and CAN crash if the file does not exist, there is no memory left, permissions are missing, etc.
Checks should be added for each open, close, read, and write operation.
Moreover, I would allocate the buffer dynamically.
int* buffer = new int[10];
That is because I do not feel comfortable when a normal array is used as a pointer. But whatever. Please also note that the correct type (uint32_t, 16, 8, int, short...) should be chosen according to the number range, to save space.
The following code creates the file and writes correct data to it, which you can then read back.
FILE* somefile;
somefile = fopen("/root/Desktop/CAH/scripts/cryptor C++/OUT/TOCRYPT/wee", "wb");
int buffer[10];
for(int i = 0; i < 10; i++)
buffer[i] = 15;
fwrite((void *)buffer, sizeof(int), 10, somefile);
// print contents of buffer
for(int i = 0; i < 10; i++)
cout << buffer[i] << endl;
fclose(somefile);
somefile = fopen("/root/Desktop/CAH/scripts/cryptor C++/OUT/TOCRYPT/wee", "rb");
fread((void *)buffer, sizeof(int), 10, somefile);
// print contents of buffer
for(int i = 0; i < 10; i++)
cout << buffer[i] << endl;
fclose(somefile);
Everything I'm finding via google is garbage... Note that I want the answer in C, however if you supplement your answer with a C++ solution as well then you get bonus points!
I just want to be able to read some floats into an array from a binary file
EDIT: Yes I know about Endian-ness... and no I don't care how it was stored.
How you have to read the floats from the file completely depends on how the values were saved there in the first place. One common way could be:
void writefloat(float v, FILE *f) {
fwrite((void*)(&v), sizeof(v), 1, f);
}
float readfloat(FILE *f) {
float v;
fread((void*)(&v), sizeof(v), 1, f);
return v;
}
float f;
if(read(fd,&f,sizeof(f))==sizeof(f))
printf("%f\n",f);
else
printf("oops\n");
Provided that it's written as compatible binary representation.
read for file descriptors, fread for FILE*s and istream::read for c++ iostreams. Pick whatever pleases you:
read(fd,&f,sizeof(f))==sizeof(f)
fread(&f,sizeof(f),1,fp)==1
fin.read((char*)&f,sizeof(f)).gcount()==sizeof(f)
You could use fread. (Note that the API is for C, even though the website says C++ reference :))
Use fread() from <stdio.h>. The assertions should be replaced with actual error handling code.
#include <stdio.h>
#include <assert.h>
#define countof(ARRAY) (sizeof (ARRAY) / sizeof *(ARRAY))
float data[5];
FILE *file = fopen("foo.bin", "rb");
assert(file);
size_t n = fread(data, sizeof(float), countof(data), file);
assert(n == countof(data));
Keep in mind that you might run into endian issues if you transfer files between different architectures.
If the file contains only floats and you want to read all of them, all you have to do is this:
FILE *fp;
if((fp=fopen("filename.whatever", "rb"))==NULL)
    return 0;
fseek(fp, 0, SEEK_END);
long size = ftell(fp); // file size in bytes
fseek(fp, 0, SEEK_SET);
size_t count = (size_t)size / sizeof(float); // number of floats in the file
float *f = (float *)malloc(sizeof(float)*count);
if(f==NULL)
{
    fclose(fp);
    return 0;
}
if(fread(f, sizeof(float), count, fp)!=count)
{
    free(f);
    fclose(fp);
    return 0;
}
fclose(fp);
// do something with f, then free(f)
FILE *thisFile = fopen("filename", "r");
float myFloat;
fscanf(thisFile, "%f", &myFloat);
fclose(thisFile);
This works if the data was written as text using fprintf (implementation specific).
However, you can also typecast your float to int32, save it, load it, and typecast it back.
std::fstream thisFile;
thisFile.open("filename", std::ios::in | std::ios::binary);
float myFloat;
thisFile >> myFloat; // formatted (text) extraction, like the fscanf version above
thisFile.close();
May be wrong (I haven't used the C++ F.IO functions for a loooong loooong time)
If these values are placed sequentially in a binary file, you can read sizeof(float) bytes per float value into a character array and then copy those bytes into a float (for example with memcpy). Simply casting the array pointer to float* risks alignment and aliasing problems.
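A minimal sketch of that byte-array approach, assuming the file was written on a machine with the same float representation and byte order. The helper name `read_float` is made up for this example.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: reads one float's worth of raw bytes from `f`
// and copies them into *out. Returns false on a short read.
bool read_float(std::FILE *f, float *out) {
    unsigned char raw[sizeof(float)];
    if (std::fread(raw, 1, sizeof raw, f) != sizeof raw) return false;
    // memcpy copies the object representation without any alignment
    // or aliasing concerns.
    std::memcpy(out, raw, sizeof raw);
    return true;
}
```

Calling it in a loop until it returns false reads every float in the file. If the byte order of the file differs from the host, the bytes in `raw` would need to be reversed before the memcpy.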