Problems with fwrite - c++

I'm doing an external merge sort for an assignment and I'm given two structs:
// This is the definition of a record of the input file. Contains three fields, recid, num and str
typedef struct {
unsigned int recid;
unsigned int num;
char str[STR_LENGTH];
bool valid; // if set, then this record is valid
int blockID; //The block the record belongs too -> Used only for minheap
} record_t;
// This is the definition of a block, which contains a number of fixed-sized records
typedef struct {
unsigned int blockid;
unsigned int nreserved; // how many reserved entries
record_t entries[MAX_RECORDS_PER_BLOCK]; // array of records
bool valid; // if set, then this block is valid
unsigned char misc;
unsigned int next_blockid;
unsigned int dummy;
} block_t;
Also I'm given this:
void MergeSort (char *infile, unsigned char field, block_t *buffer,
unsigned int nmem_blocks, char *outfile,
unsigned int *nsorted_segs, unsigned int *npasses,
unsigned int *nios)
Now, at phase 0 I'm allocating memory like this:
buffer = (block_t *) malloc (sizeof(block_t)*nmem_blocks);
//Allocate disc space for records in buffer
record_t *records = (record_t*)malloc(nmem_blocks*MAX_RECORDS_PER_BLOCK*sizeof(record_t));
And then after I read the records from a binary file (runs smoothly), I write them to multiple files (after sorting of course and some other steps) with this command:
outputfile = fopen(name.c_str(), "wb");
fwrite(records, recordsIndex, sizeof(record_t), outputfile);
and read like this:
fread(&buffer[b].entries[rec],sizeof(record_t),1,currentFiles[b])
And it works! Then I want to combine some of these files to produce a larger sorted file using a priority_queue turned to minheap (it's tested, it works), but when I try to write to files using this command:
outputfile = fopen(outputName.c_str(), "ab"); //Opens file for appending
fwrite(&buffer[nmem_blocks-1].entries, buffer[nmem_blocks-1].
nreserved, sizeof(record_t), outputfile);
It writes nonsense in the file, as if it reads random parts of memory.
I know that the code is probably not nearly enough, but all of it is quite large.
I'm making sure I'm closing the output file before I open it again using a new name. Also I use memset() (and not free()) to clear the buffer before I fill it again.

In the end, the main problem was the way I was opening the file:
outputfile = fopen(outputName.c_str(), "ab"); //Opens file for appending
Instead I should have used "wb" again:
outputfile = fopen(outputName.c_str(), "wb"); //Opens file for writing (creates it, or truncates it if it exists)
The file was never closed in the meantime, so the code was trying to open an already open file for appending, and that did not work well. You couldn't have known, since you didn't have the full code. Thank you for your help though! :)
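The difference between the two modes can be sketched as follows. This is a minimal illustration, not the original assignment code: rec_t is a simplified stand-in for record_t, and write_run and count_records are hypothetical helper names. It also passes the element size before the element count to fwrite and checks the returned count.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical, simplified stand-in for record_t.
struct rec_t {
    unsigned int recid;
    unsigned int num;
};

// Writes n records to path using the given fopen mode ("wb" or "ab").
// Returns true only if every record was written and the file closed cleanly.
bool write_run(const char* path, const char* mode, const rec_t* recs, std::size_t n) {
    std::FILE* f = std::fopen(path, mode);
    if (!f) return false;
    // fwrite(ptr, element size, element count, stream) returns the count written.
    std::size_t written = std::fwrite(recs, sizeof(rec_t), n, f);
    // Close before the same path is opened again, so buffered data is flushed.
    bool closed = (std::fclose(f) == 0);
    return written == n && closed;
}

// Counts how many whole records the file holds.
std::size_t count_records(const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return 0;
    rec_t r;
    std::size_t n = 0;
    while (std::fread(&r, sizeof(rec_t), 1, f) == 1) ++n;
    std::fclose(f);
    return n;
}
```

Calling write_run with "wb" replaces whatever the file held, while "ab" adds records after the existing ones. Either way the file is closed before it is opened again.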

Related

create a buffer to store binary data to save to file later

I'm reading a file in binary mode with std::ifstream. Here is the way I'm reading from the file:
enum class Resources : uint64_t
{
Resource_1 = 0xA97082C73B2BC4BB,
Resource_2 = 0xB89A596B420BB2E2,
};
struct ResourceHeader
{
Resources hash;
uint32_t size;
};
int main()
{
std::ifstream input(path, std::ios::binary);
while (true)
{
if (input.eof()) break;
ResourceHeader RCHeader{};
input.read((char*)&RCHeader, sizeof(ResourceHeader));
uint16_t length = 0;
input.read((char*)&length, sizeof(uint16_t));
std::string str;
str.resize(length);
input.read(&str[0], length); /* here I get a string from binary file */
/* and some other reading */
}
}
After reading, I want to make some changes in some data that I read from the file, and after all changes then write the changed data in a new file.
So I want to know, how can I store the edited char into a buffer (BTW, I don't know the exact size of the buffer with edits)? Also, I need to be able to go back and forth in new created buffer and edit some data again.
So, how can I achieve this?
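One possible approach, offered as a sketch rather than an answer from the thread, is to use a std::vector<char> as a growable in-memory buffer: append bytes as they are produced, overwrite earlier bytes by offset, and write everything to the new file at the end. The helper names (append_value, append_string, patch_value, save_buffer) are hypothetical, and the 16-bit length prefix is assumed from the reading code in the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Appends the raw bytes of a trivially copyable value to the buffer.
template <typename T>
void append_value(std::vector<char>& buf, const T& value) {
    const char* p = reinterpret_cast<const char*>(&value);
    buf.insert(buf.end(), p, p + sizeof(T));
}

// Appends a 16-bit length followed by the string bytes, mirroring the read code.
// Assumes the string is shorter than 65536 bytes.
void append_string(std::vector<char>& buf, const std::string& s) {
    append_value(buf, static_cast<std::uint16_t>(s.size()));
    buf.insert(buf.end(), s.begin(), s.end());
}

// Overwrites sizeof(T) bytes at a known offset; returns false if out of range.
template <typename T>
bool patch_value(std::vector<char>& buf, std::size_t offset, const T& value) {
    if (offset > buf.size() || sizeof(T) > buf.size() - offset) return false;
    std::memcpy(buf.data() + offset, &value, sizeof(T));
    return true;
}

// Writes the finished buffer to a file in one call.
bool save_buffer(const std::vector<char>& buf, const std::string& path) {
    std::ofstream out(path, std::ios::binary);
    out.write(buf.data(), static_cast<std::streamsize>(buf.size()));
    return static_cast<bool>(out);
}
```

Because the vector grows on demand, the final size does not have to be known in advance. Remembering buf.size() before appending a field gives the offset needed to patch that field later.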

Fetch and output binary data from a file

I've been trying in several ways to tackle this problem, but it is likely that I misunderstood how I'm supposed to actually read the data from the file, here's what I came up with:
#define LENGTH 0xF8C
.
.
.
unsigned short int eval_checksum()
{
unsigned short int checksum=255;
int seekhead=0x2598;
char * buffer = new char [10000];
ifstream binaryFile(SAVEFILE_NAME, ios::in | ios::out | ios::binary);
binaryFile.read (buffer,LENGTH); //how does it read???
checksum-=int(buffer[0]);
return checksum;
}
which is obviously not correct. Before this I tried with a for loop
#define SEEK seekhead+i
unsigned short int eval_checksum()
{
...
for(int i=0;i<0xF8C;i++)
{
binaryFile.read(buffer,SEEK);
checksum-=buffer[0];
cout<<checksum<<endl;
}
}
Again, no luck. I don't need to store the whole thing in a buffer, so I figured I could just store the last byte on an array of size 1. The only thing I need to do is to sequentially read every byte and calculate the checksum accordingly (start from 255 and subtract the value of each byte).
I managed to successfully locate a byte in a specific offset and change it, but I couldn't really read and output what's in there.
What would be an elegant way to read each byte?
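The checksum described above (start from 255 and subtract each byte) could be sketched like this. It is only an illustration under assumptions: the function takes the path, offset and length as parameters instead of using SAVEFILE_NAME, seekhead and LENGTH, and it reports failure through its return value.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Starts at 255 and subtracts each byte in [offset, offset + length).
// Returns false if the file cannot be opened or is too short.
bool eval_checksum(const std::string& path, std::streamoff offset,
                   std::size_t length, unsigned short& checksum) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.seekg(offset, std::ios::beg);
    checksum = 255;
    char c;
    for (std::size_t i = 0; i < length; ++i) {
        if (!in.get(c)) return false;  // read failed or end of file reached
        // Convert through unsigned char so bytes >= 0x80 count as 128..255.
        checksum = static_cast<unsigned short>(checksum - static_cast<unsigned char>(c));
    }
    return true;
}
```

get(c) advances the read position by one byte on each call, so no seek is needed inside the loop. The second argument of read() is a byte count, not a file position, which is why read(buffer, SEEK) did not behave as a seek.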

ifstream not reading the same characters as they are written in the file

Simple explanation: ifstream's get() is reading the wrong chars (console is different from file) and I need to know why.
I am recording registers into a file as a char array. The write succeeds: when I open the file I find the chars I intended, except that Notepad apparently shows the Unicode character 0000 (NUL) as a space.
For instance, the entries
id = 1000; //an 8-byte long long
name = "stack"; //variable size
surname = "overflow"; //variable size
degree = "internet"; //variable size
sex = 'c'; //1-byte char
birthdate = 256; //4-byte int
become this on the file:
& èstackoverflowinternetc
or, putting the number of unicode characters that disappear when posted here between brackets:
&[3]| [1]è|stack|overflow|internet|c| [1] | //separating each section with a | for easier reading. Some unicode characters disappear when I post them here, but I assure you they are the correct ones
SIZE| ID | name| surname| degree |g| birth
(writing is working fine and puts the expected characters)
Trouble is, when the console in the code below prints what the buffer is reading from the file, it gives me the following record (extra spaces included)
Þstackoverflowinternetc
Which is bad because it gives me the wrong ID and birthdate: either "-21" and "4747968" or "Ù" and "-1066252288". Other fields are unaffected. This is weird, because the size bytes show up as empty space in the console, so it shouldn't be able to split name, surname, degree and sex.
ifstream infile("alumni.freire", ios::binary);
if(infile.is_open()){
infile.seekg(pos, ios::beg);
int size;
size = infile.get();
char charreg[size];
charreg[0] = size;
//testing what buffer gives me
for(int i = 1; i < size; i++){
charreg[i] = infile.get();
cout << charreg[i];
}
}
What am I doing wrong?
EDIT: to explain better what I did:
I get the entries in the first "code" section from user input and use them as parameters when creating an instance of a "reg" class I implemented. The reg class then does the conversion to strings (adequately, I've already tested it) and calculates a hidden four-element char array containing instance size, name size, surname size and degree size. When the program writes the instance to the file, it is written perfectly, as I showed in the second "code" section. (If you do the calculations you'll see '&' equals the size of the entire thing, for example.) When I read it from the file, it appears differently on the console for some reason. Different characters. But it reads the right number of characters, because "name", "surname" and "degree" appear correctly.
EDIT n2: I made "charreg[]" into an int array and printed it and the values are correct. I have no idea what's happening anymore.
EDIT n3: Apparently the reason I was getting the wrong chars is that I should have used unsigned chars...
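The signed/unsigned effect mentioned in the last edit can be reproduced in isolation. The sketch below uses hypothetical helper names (read_byte, through_plain_char, through_unsigned_char) and an in-memory stream instead of the real file. Whether plain char is signed depends on the platform, so the plain-char result is not fixed.

```cpp
#include <sstream>
#include <string>

// Reads one byte with get() and returns it as a value in 0..255.
// Returns -1 at end of input.
int read_byte(std::istream& in) {
    // get() returns an int: 0..255 for a byte, or the end-of-file marker.
    int value = in.get();
    return in ? value : -1;
}

// Shows what happens when the same byte is stored in a plain char first.
// On platforms where char is signed, bytes >= 0x80 come back negative.
int through_plain_char(unsigned char byte) {
    char c = static_cast<char>(byte);
    return c;
}

// Storing in unsigned char keeps the value in 0..255 on every platform.
int through_unsigned_char(unsigned char byte) {
    unsigned char c = byte;
    return c;
}
```

The byte 0xE8 (the low byte of 1000) is 232 as an unsigned char. In a signed char it reads as -24, which changes any integer rebuilt from it.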
Writing your structure to the file as-is is a good idea, but your approach is wrong.
You must have something to separate your fields.
For example, you know that your ID is 8 bytes long, so you can read 8 bytes:
long long id;
read(fd, &id, 8);
In your example you got -24 because you read only the first byte of the full ID.
But for the rest of the file, how can you know the length of the first name and the last name?
You could read byte by byte until you find a null byte.
But I suggest using a more structured file.
For example, you can define a structure like this:
long long id; // 8 bytes
char firstname[256]; // 256 bytes
char lastname[256]; // 256 bytes
char sex; // 1 byte
int birthdate; // 4 bytes
With this structure you can read and write very easily:
struct my_struct s;
read(fd, &s, sizeof(struct my_struct)); // read the whole structure: 8+256+256+1+4 bytes plus any padding
s.birthdate = 128;
write(fd, &s, sizeof(struct my_struct));// write the structure
Of course you lose the "variable length" of the first name and last name. Do you really need more than 100 chars for a name?
If you really need it, you could put a header in front of each variable-length value. But you lose the ability to read everything at once.
long long id;
int foo_size;
char *foo;
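And then to read it:
struct my_struct s;
read(fd, &s, 12); // read the header, 8 + 4 bytes
char foo[s.foo_size]; // variable-length array: a compiler extension in C++
read(fd, foo, s.foo_size); // read the value into foo, not over the header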
You should define exactly what you need to save. Define a precise data structure that you can easily parse when reading, and avoid things like "oh, let's read until a null byte".
I used C functions for the explanation because they are much more explicit: you know what you read and what you write.
Start by playing with this, and then try the same with C++ streams/functions.
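The length-header idea above might look like this with C++ streams. It is a sketch under assumptions: Person, write_person and read_person are hypothetical names, only one variable-length field is shown, and the bytes are written in the host's native byte order.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical record with one variable-length field.
struct Person {
    std::int64_t id;
    std::string name;
};

// Writes the fixed-size id, then a 32-bit length header, then the name bytes.
void write_person(std::ostream& out, const Person& p) {
    const std::uint32_t len = static_cast<std::uint32_t>(p.name.size());
    out.write(reinterpret_cast<const char*>(&p.id), sizeof(p.id));
    out.write(reinterpret_cast<const char*>(&len), sizeof(len));
    out.write(p.name.data(), len);
}

// Reads the fields back in the same order; returns false on a short read.
bool read_person(std::istream& in, Person& p) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&p.id), sizeof(p.id))) return false;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof(len))) return false;
    p.name.resize(len);
    if (len > 0 && !in.read(&p.name[0], len)) return false;
    return true;
}
```

Each field is read with exactly the size it was written with, so the reader never has to guess where one field ends and the next begins.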
I don't know how you are writing the information back to the file, but here is how I would do it; I hope it is a fairly simple way. Keep in mind I have no idea what kind of file you are actually working with.
long long id = 1000;
std::string name = "name";
std::string surname = "overflow";
std::string degree = "internet";
unsigned char sex = 'c';
int birthdate = 256;
ofstream outfile("test.txt", ios::binary);
if (outfile.is_open())
{
const char* idBytes = static_cast<char*>(static_cast<void*>(&id));
const char* nameBytes = name.c_str();
const char* surnameBytes = surname.c_str();
const char* degreeBytes = degree.c_str();
const char* birthdateBytes = static_cast<char*>(static_cast<void*>(&birthdate));
outfile.write(idBytes, sizeof(id));
outfile.write(nameBytes, name.length());
outfile.write(surnameBytes, surname.length());
outfile.write(degreeBytes, degree.length());
outfile.put(sex);
outfile.write(birthdateBytes, sizeof(birthdate));
outfile.flush();
outfile.close();
}
and here is how I am going to output it, which to me seems to be coming out as expected.
ifstream infile("test.txt", std::ifstream::ate | ios::binary);
if (infile.is_open())
{
std::size_t fileSize = infile.tellg();
infile.seekg(0);
for (std::size_t i = 0; i < fileSize; i++) // same type as fileSize, avoids a signed/unsigned comparison
{
char c = infile.get();
std::cout << c;
}
std::cout << std::endl;
}

How to check whether ifstream is end of file in C++

I need to read all blocks of one large file (about 10GB) sequentially. The file contains many floats and a few strings, like this (each item separated by '\n'):
6.292611
-1.078219E-266
-2.305673E+065
sod;eiwo
4.899747e-237
1.673940e+089
-4.515213
I read MAX_NUM_PER_FILE items each time, process them and write them to another file, but I don't know when the ifstream has ended.
Here is my code:
ifstream file_input(path_input); //my file is a text file, but i tried both text and binary mode, both failed.
if(file_input)
{
file_input.seekg(0,file_input.end);
unsigned long long length = file_input.tellg(); //get file size
file_input.seekg(0,file_input.beg);
char * buffer = new char [MAX_NUM_PER_FILE+MAX_NUM_PER_LINE];
int i=1,j;
char c,tmp[3];
while(file_input.tellg()<length)
{
file_input.read(buffer,MAX_NUM_PER_FILE);
j=MAX_NUM_PER_FILE;
while(file_input.get(c)&&c!='\n')
buffer[j++]=c; //get a complete item
//process with buffer...
itoa(i++,tmp,10); //int2char
string out_name="out"+string(tmp)+".txt";
ofstream file_output(out_name);
file_output.write(buffer,j);
file_output.close();
}
file_input.close();
delete[] buffer;
}
My code goes wrong: length is bigger than the real file size. I have tried file_input.good() and !file_input.eof(), but they didn't work. getline(file_input,s) works, but it is much slower than read. I want to use read, but I don't know how to check whether the ifstream is at end-of-file.
I am working on Windows 7 with VS2010.
I have searched, but I found no answer to this; the question "How to open a file using ifstream and keep reading it until the end" doesn't answer my question.
Update, Problem solved
Hi everyone, I have figured out that it was my fault. Both while(file_input.tellg()<length) and while(file_input.peek()!=EOF) work fine! while(file_input.peek()!=EOF) is recommended.
The extra items written after the end-of-file were leftover items in the buffer from the previous write.
Here is the correct code:
ifstream file_input(path_input);
if(file_input)
{
//file_input.seekg(0,file_input.end);
//unsigned long long length = file_input.tellg(); //get file size
//file_input.seekg(0,file_input.beg);
char * buffer = new char [MAX_NUM_PER_FILE+MAX_NUM_PER_LINE];
int i=1,j;
char c,tmp[16]; //tmp[3] would overflow once i reaches 100
while(file_input.peek()!=EOF)
{
memset(buffer,0,sizeof(char)*(MAX_NUM_PER_FILE+MAX_NUM_PER_LINE)); //clear first!
file_input.read(buffer,MAX_NUM_PER_FILE);
j=MAX_NUM_PER_FILE;
while(file_input.get(c)&&c!='\n')
buffer[j++]=c;
itoa(i++,tmp,10);//int2char
string out_name="out"+string(tmp)+".txt";
ofstream file_output(out_name);
file_output.write(buffer,strlen(buffer)); //use the correct buffer size instead of j
file_output.close();
}
file_input.close();
delete[] buffer;
}
while( file_input.peek() != EOF )
{
// code
}
Basically peek() will read the next char without extracting it.
So you can simply compare it to EOF.
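A variation on the loop above, as a sketch only: gcount() reports how many bytes the last read() actually delivered, so the chunk length does not have to be recovered with memset() and strlen(). The function names read_chunk and split_stream are hypothetical, and the chunks are returned in memory instead of being written to out1.txt, out2.txt and so on.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads up to chunk_size bytes, then extends the chunk to the end of the
// current line so that no item is split. The newline that ends the extended
// line is consumed but not stored.
std::string read_chunk(std::istream& in, std::size_t chunk_size) {
    std::string chunk(chunk_size, '\0');
    in.read(&chunk[0], static_cast<std::streamsize>(chunk_size));
    // gcount() reports how many bytes the last read() actually stored.
    chunk.resize(static_cast<std::size_t>(in.gcount()));
    char c;
    // Finish the partial line unless the chunk already ends on '\n'.
    if (!chunk.empty() && chunk.back() != '\n') {
        while (in.get(c) && c != '\n') chunk.push_back(c);
    }
    return chunk;
}

// Splits the whole stream into chunks until peek() reports end of file.
std::vector<std::string> split_stream(std::istream& in, std::size_t chunk_size) {
    std::vector<std::string> chunks;
    if (chunk_size == 0) return chunks;  // a zero-size chunk would never advance
    while (in.peek() != std::char_traits<char>::eof()) {
        chunks.push_back(read_chunk(in, chunk_size));
    }
    return chunks;
}
```

The last chunk is simply shorter than the others, because it is resized to the number of bytes that were really read.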

Load a formatted binary file and assign information to structure c++

I've finally figured out how to write some specifically formatted information to a binary file, but now my problem is reading it back and building it back the way it originally was.
Here is my function to write the data:
void save_disk(disk aDisk)
{
ofstream myfile("disk01", ios::out | ios::binary);
int32_t entries;
entries = (int32_t) aDisk.current_file.size();
char buffer[10];
sprintf(buffer, "%d",entries);
myfile.write(buffer, sizeof(int32_t));
std::for_each(aDisk.current_file.begin(), aDisk.current_file.end(), [&] (const file_node& aFile)
{
myfile.write(aFile.name, MAX_FILE_NAME);
myfile.write(aFile.data, BLOCK_SIZE - MAX_FILE_NAME);
});
}
and my structure that it originally was created with and what I want to load it back into is composed as follows.
struct file_node
{
char name[MAX_FILE_NAME];
char data[BLOCK_SIZE - MAX_FILE_NAME];
file_node(){};
};
struct disk
{
vector<file_node> current_file;
};
I don't really know how to read it back in so that it is arranged the same way, but here is my pathetic attempt anyway (I just tried to reverse what I did for saving):
void load_disk(disk aDisk)
{
ifstream myFile("disk01", ios::in | ios::binary);
char buffer[10];
myFile.read(buffer, sizeof(int32_t));
std::for_each(aDisk.current_file.begin(), aDisk.current_file.end(), [&] (file_node& aFile)
{
myFile.read(aFile.name, MAX_FILE_NAME);
myFile.read(aFile.data, BLOCK_SIZE - MAX_FILE_NAME);
});
}
^^ This is absolutely wrong. ^^
I understand the basic operations of the ifstream, but really all I know how to do with it is read in a file of text, anything more complicated than that I'm kind of lost.
Any suggestions on how I can read this in?
You're very close. You need to write and read the length as binary.
This part of your length-write is wrong:
char buffer[10];
sprintf(buffer, "%d",entries);
myfile.write(buffer, sizeof(int32_t));
It writes the first four bytes of buffer, but buffer holds the decimal text produced by sprintf() (followed by whatever uninitialized bytes come after it), not the integer itself. You need to write the binary value of entries (the integer):
// writing your entry count.
uint32_t entries = (uint32_t)aDisk.current_file.size();
entries = htonl(entries); // host to network byte order; declared in <arpa/inet.h> (POSIX) or <winsock2.h> (Windows)
myfile.write((char*)&entries, sizeof(entries));
Then on the read:
// reading the entry count
uint32_t entries = 0;
myFile.read((char*)&entries, sizeof(entries));
entries = ntohl(entries);
// Use this to resize your vector; for_each has places to stuff data now.
aDisk.current_file.resize(entries);
std::for_each(aDisk.current_file.begin(), aDisk.current_file.end(), [&] (file_node& aFile)
{
myFile.read(aFile.name, MAX_FILE_NAME);
myFile.read(aFile.data, BLOCK_SIZE - MAX_FILE_NAME);
});
Or something like that.
Note 1: this does NO error checking. The htonl()/ntohl() pair stores the entry count in network byte order, so the count stays readable when the file is written on a big-endian machine and read on a little-endian one; if you drop those calls, the file is only portable between machines with the same endianness. That's probably OK for your needs, but you should at least be aware of it.
Note 2: Pass your input disk parameter to load_disk() by reference:
void load_disk(disk& aDisk)
EDIT Cleaning file_node content on construction
struct file_node
{
char name[MAX_FILE_NAME];
char data[BLOCK_SIZE - MAX_FILE_NAME];
file_node()
{
memset(name, 0, sizeof(name));
memset(data, 0, sizeof(data));
}
};
If you are using a compliant C++11 compiler:
struct file_node
{
char name[MAX_FILE_NAME];
char data[BLOCK_SIZE - MAX_FILE_NAME];
file_node() : name(), data() {}
};