How would I write a class object to a file? - C++

Alright, I have an object:
LivingObject* myPlayer=new LivingObject(...);
And I would like to write it to a file on exit. Here is what I have so far:
std::fstream myWrite;
myWrite.open("Character.dat",std::ios::binary|std::ios::app);
myWrite.write((char*)myPlayer,sizeof(myPlayer));
myWrite.close();
I watched the file on exit and its size did not increase at all (so I assume nothing was written). What did I do wrong?

This code writes only the first 4 (or 8 on 64-bit) bytes of the object to the file, not the whole object: sizeof(myPlayer) is the size of the pointer, not of the object it points to. To write the whole object use:
myWrite.write((char*)myPlayer,sizeof(LivingObject));
As for the size of the file: some operating systems report file size as the space allocated to the file on disk, which is a multiple of the physical block size. So as long as the write does not push the file past the next block boundary, you will not see an increase in the reported file size.
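For completeness, a minimal sketch of the corrected write, using the LivingObject type from the question. Note that dumping raw bytes like this only round-trips safely if the class is trivially copyable (no virtual functions, pointers, or std::string members):
#include <fstream>

// Sketch: write the whole object rather than the pointer.
// Only safe for trivially copyable types.
void savePlayer(const LivingObject* myPlayer)
{
    std::ofstream myWrite("Character.dat",
                          std::ios::binary | std::ios::app);
    myWrite.write(reinterpret_cast<const char*>(myPlayer),
                  sizeof(LivingObject));
}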

myPlayer is a pointer to a LivingObject
myWrite.write((char*)myPlayer,sizeof(myPlayer)); On this line you're casting one pointer to another pointer type, and then taking sizeof a pointer type (which is usually 4 or 8). So you'd be writing those few bytes of data (the address), not the object itself.
So instead, what you'll need to do is serialize the class, either to a packed binary format or to another format (XML, JSON, etc.), and write that to the file.
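For example, a minimal hand-rolled sketch; the hp/mp/name fields here are invented for illustration, and the real LivingObject will differ:
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical fields, for illustration only.
struct LivingObject {
    int32_t hp;
    int32_t mp;
    std::string name;

    // Write each field explicitly instead of dumping raw memory.
    void serialize(std::ostream& out) const {
        out.write(reinterpret_cast<const char*>(&hp), sizeof(hp));
        out.write(reinterpret_cast<const char*>(&mp), sizeof(mp));
        uint32_t len = static_cast<uint32_t>(name.size());
        out.write(reinterpret_cast<const char*>(&len), sizeof(len));
        out.write(name.data(), len);
    }

    // Read the fields back in the same order.
    void deserialize(std::istream& in) {
        in.read(reinterpret_cast<char*>(&hp), sizeof(hp));
        in.read(reinterpret_cast<char*>(&mp), sizeof(mp));
        uint32_t len = 0;
        in.read(reinterpret_cast<char*>(&len), sizeof(len));
        name.resize(len);
        if (len) in.read(&name[0], len);
    }
};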

Search the web for "boost serialize". The operation you are performing is called serialization.
If you want to share data between platforms, you will need to either choose a format that is not binary or write down the binary format precisely; be sure to note which multi-byte quantities are little-endian and which are big-endian.
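A minimal Boost.Serialization sketch, with invented fields (Boost handles the format details; you add a serialize() member to your class and stream objects through an archive):
#include <fstream>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/access.hpp>

// Hypothetical fields; the real LivingObject will differ.
class LivingObject {
    friend class boost::serialization::access;
    int hp = 0;
    int level = 1;

    // Boost calls this for both saving and loading.
    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & hp;
        ar & level;
    }
};

void save(const LivingObject& player) {
    std::ofstream ofs("Character.dat");
    boost::archive::text_oarchive oa(ofs);
    oa << player; // writes a portable text representation
}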

Related

C++: How can I change the size of a void* according to a file I want to process

I am currently trying to make a program that can read a .blend file. Well, trying is the important part, since I am already stuck on reading the file-block info.
I'm going to quickly explain my problem; please refer to this page for context.
So in the .blend header there is a char that determines whether the pointer size, used later in the file-block info (or just fileBlock on the linked webpage) among other things, is 4 or 8 bytes. From what I have read, in C++ a void pointer only changes size according to the target platform it was compiled for (8 bytes for 64-bit, 4 bytes for 32-bit). However, .blend files can have either one, regardless of the platform, I presume.
Now, since blender itself also reads its own files using C, there must be a way to make the pointer match the required pointer size given in the header. My best guess would be to dynamically allocate a void-pointer array of either one or two pointers, which then makes actually using the data even more complicated.
Please help me find the intended way of handling the different pointer sizes!
Go back to the top of the wiki page and you will find the File Header structure. The header of a blend file starts with "BLENDER", which is followed by the pointer size for the file:
Size of a pointer
All pointers in the file are stored in this format
'_' (underscore) means 4 bytes or 32 bits
'-' (minus) means 8 bytes or 64 bits.
So by reading the eighth byte of the file you know the size of the pointers in the file.
if (file_bytes[7] == '_')
    ptr_size = 4;
else if (file_bytes[7] == '-')
    ptr_size = 8;
The copy of blender creating the file determines the sizes used in the file, so a 32-bit build will save 32-bit pointers while a 64-bit build will save 64-bit pointers.
You should also read the next byte: it tells you whether the file was saved as big- or little-endian, so you know whether you need to do any byte swapping. The use of blender on big-endian machines may be shrinking, but you can still come across big-endian files.
Another important thing that doesn't seem to be mentioned is that blend files can be, and often are, compressed. Reading a compressed blend file means using gzread() to read it. A compressed file has its first two bytes set to 0x1f 0x8b.
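Putting those pieces together, a rough sketch of validating the header. This assumes the 12-byte layout described on the wiki page, and that 'v'/'V' in byte 8 mark little/big endian, which matches the spec as I remember it:
#include <cstdio>
#include <cstring>

// Parse the 12-byte .blend header: "BLENDER", pointer-size char,
// endianness char, then a 3-byte version (e.g. "BLENDER-v279").
bool read_blend_header(FILE *fp, int &ptr_size, bool &little_endian)
{
    unsigned char header[12];
    if (fread(header, 1, sizeof(header), fp) != sizeof(header))
        return false;

    // Compressed files start with the gzip magic 0x1f 0x8b and must
    // be decompressed (e.g. via gzread()) before parsing.
    if (header[0] == 0x1f && header[1] == 0x8b)
        return false;

    if (memcmp(header, "BLENDER", 7) != 0)
        return false;

    ptr_size = (header[7] == '_') ? 4 : 8; // '_' = 32-bit, '-' = 64-bit
    little_endian = (header[8] == 'v');    // 'v' = little, 'V' = big
    return true;
}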
You will find the code that blender uses to read blend files in source/blender/blenloader.
Yup, that's painful. The solution is not to treat them as C++ pointers at all. Instead, create your own class BlendPointer to abstract this away. Those would be read from a BlendFile, and that BlendFile would store whether its BlendPointers are 4 or 8 bytes on disk.
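A sketch of what that abstraction could look like; the names and layout are invented, and endianness handling is omitted for brevity:
#include <cstdint>
#include <fstream>

// A file-level pointer whose on-disk width depends on the file,
// not on the host platform.
struct BlendPointer {
    uint64_t address = 0; // wide enough for either on-disk width
};

class BlendFile {
public:
    std::ifstream stream;
    int ptr_size = 8; // 4 or 8, taken from the file header

    // Read one pointer of the file's width into the common type.
    BlendPointer readPointer() {
        BlendPointer p;
        if (ptr_size == 4) {
            uint32_t v = 0;
            stream.read(reinterpret_cast<char*>(&v), sizeof(v));
            p.address = v;
        } else {
            stream.read(reinterpret_cast<char*>(&p.address), sizeof(p.address));
        }
        return p;
    }
};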

Writing to a text file, binary vs ASCII

So I am having the hardest time trying to understand this concept. I have a program that reads a text file and writes it to another file, replacing the most common words with unsigned chars. But what I cannot for the life of me understand is how I then tell the two apart.
If I write to the new file either the original char I read in or an unsigned char value in the range 1-255, how do I tell which is which when I go back to reconstruct the original file contents?
When you write a file as binary, a number such as "1253553" is written using 2 or 4 bytes (depending on the size of int on the platform). So, in a binary file, you will see a sequence of 2 or 4 bytes representing that number. For chars it makes no difference, as each char is represented in one byte.
Usually, you have to have some well known and obvious way to determine the format of your file.
One way to do this is to create your own file extension. You could naively expect that any file with that extension is in your compressed format, but it's actually quite likely other files out there have the same extension (e.g., ".dat" is probably a bad choice). So, you'll want to take further steps, like having the first few bytes of the file be something that is unlikely to be there in any other file (some "magic numbers"). Let's use two bytes, and let's simply choose 0xAB 0xCD as those two bytes.
So, when your program is presented with a file that has the proper extension, open it and read the first two bytes. If they're 0xAB and 0xCD, you can assume you're reading your special format.
This isn't a very strong way of accomplishing this task, but it is one way of doing it. You could get more extravagant if you like.
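A quick sketch of both sides of that check, using the 0xAB 0xCD bytes chosen above:
#include <fstream>

// The two magic bytes chosen above.
const unsigned char MAGIC[2] = {0xAB, 0xCD};

// Stamp a freshly opened output file with the magic bytes.
void writeMagic(std::ofstream &out)
{
    out.write(reinterpret_cast<const char*>(MAGIC), sizeof(MAGIC));
}

// Check whether a file starts with the magic bytes.
bool hasMagic(const char *path)
{
    std::ifstream in(path, std::ios::binary);
    unsigned char buf[2] = {0, 0};
    in.read(reinterpret_cast<char*>(buf), sizeof(buf));
    return in && buf[0] == MAGIC[0] && buf[1] == MAGIC[1];
}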
For more information, you might want to read the Wikipedia page on the subject. It's a start.

Portability for Binary File in C++?

I have a question regarding binary I/O and the portability of binary files.
Let's say the PC running my software uses 8 bytes to store a double variable.
The binary file generated will contain 8 bytes for each double variable.
Now say the file is opened on a PC which uses 6 bytes for a double variable (just assuming).
Then the application will read only 6 bytes from the file and store them in the double variable in memory.
Not only does this result in underflow/overflow of the data, but everything read after the double will definitely be incorrect because of the 2-byte offset created by under-reading.
I want to support not only 32-bit and 64-bit machines, but also both Windows and Ubuntu PCs.
So how do you make sure that the data read from the same file is the same on any PC?
In general, you should wrap the data to be stored in your own data structures and implement platform-independent read/write operations for them. Basically, the size of a binary data structure written to disk should be the same on all platforms (the maximum possible size of the elementary data across all supported platforms).
When writing data on a platform with a smaller data size, the data should be padded with extra 0 bytes to keep the recorded size the same.
When reading, the whole record can be read in fixed blocks of known size, and conversion performed depending on the platform it was written on/is being read on. That should take care of endianness too. You may want to include a small header indicating the sizes of the data, to distinguish between files recorded on different platforms.
This gives truly platform-independent serialization for a binary file.
Example for doubles
#include <fstream>

class CustomDouble
{
public:
    double val;
    static const int DISK_SIZE;

    void toFile(std::ofstream &file)
    {
        int bytesWritten(0);
        // Write the platform's native representation of the double...
        file.write(reinterpret_cast<const char*>(&val), sizeof(val));
        bytesWritten += sizeof(val);
        // ...then pad with zero bytes up to the fixed on-disk size.
        while (bytesWritten < CustomDouble::DISK_SIZE)
        {
            char byte(0);
            file.write(&byte, sizeof(byte));
            bytesWritten += sizeof(byte);
        }
    }
};

const int CustomDouble::DISK_SIZE = 8;
This ensures you always write 8 bytes regardless of the size of double on your platform. When you read the file, you always read those 8 bytes as binary, and do conversions if necessary depending on which platform it was written on/is being read on (you will probably add a small header to the file to identify the platform it was recorded on).
While custom conversion does add some overhead, it is far less than that of storing values as text, and normally you will only perform conversions for incompatible platforms; for the same platform there is no overhead.
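A matching read-side sketch (assuming, as above, that no platform's double is larger than DISK_SIZE):
#include <fstream>

// Always consume DISK_SIZE bytes so the stream stays aligned,
// regardless of sizeof(double) on the reading platform.
void fromFile(CustomDouble &d, std::ifstream &file)
{
    file.read(reinterpret_cast<char*>(&d.val), sizeof(d.val));
    // Skip any padding written by a platform with a smaller double.
    file.ignore(CustomDouble::DISK_SIZE - static_cast<int>(sizeof(d.val)));
}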
cstdint provides type definitions with a fixed size, so an int32_t will always be 4 bytes long. You can use these in place of the regular types whenever the exact size of a type matters to you.
Use Google Protocol Buffers or any other cross-platform serialization library. You can also roll your own solution, based on the fact that char is guaranteed to be 1 byte (i.e. serialize everything into char arrays).
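For the roll-your-own route, a small sketch that pins down both the size and the byte order of an int32_t on disk:
#include <cstdint>

// Store an int32_t in a char array in fixed little-endian order,
// so every platform writes and reads the same four bytes.
void put_int32le(unsigned char *buf, int32_t value)
{
    uint32_t u = static_cast<uint32_t>(value);
    buf[0] = static_cast<unsigned char>(u & 0xFF);
    buf[1] = static_cast<unsigned char>((u >> 8) & 0xFF);
    buf[2] = static_cast<unsigned char>((u >> 16) & 0xFF);
    buf[3] = static_cast<unsigned char>((u >> 24) & 0xFF);
}

int32_t get_int32le(const unsigned char *buf)
{
    uint32_t u = static_cast<uint32_t>(buf[0])
               | (static_cast<uint32_t>(buf[1]) << 8)
               | (static_cast<uint32_t>(buf[2]) << 16)
               | (static_cast<uint32_t>(buf[3]) << 24);
    return static_cast<int32_t>(u);
}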

Most efficient to read file into separate variables using fstream

I have tons of files which look a little like:
12-3-125-BINARYDATA
What would be the most efficient way to save the 12, 3 and 125 as separate integer variables, and the BINARYDATA as a char-vector?
I'd really like to use fstream, but I don't exactly know how to (got it working with std::strings, but the BINARYDATA part was all messed up).
The most efficient method for reading data is to read many "chunks", or records, into memory using the fewest I/O function calls, then parse the data in memory.
For example, reading 5 records with one fread call is more efficient than 5 separate fread calls, one per record. Accessing memory is always faster than accessing external data such as files.
Some platforms have the ability to memory-map a file. This may be more efficient than reading the file using I/O functions. Profiling will determine which is most efficient.
Fixed-length records are always more efficient than variable-length records. Variable-length records involve either reading until a fixed size has been read or reading until a terminal (sentinel) value is found. For example, a text line is a variable-length record and must be read one byte at a time until the terminating end-of-line marker is found. Buffering may help in this case.
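For the question's "12-3-125-BINARYDATA" layout, a minimal fstream sketch, assuming the three numbers are ASCII digits separated by literal '-' characters, with the raw payload after the final '-':
#include <fstream>
#include <vector>

// Parse the three integers (operator>> stops at the non-digit '-'),
// then slurp the rest of the file as raw bytes.
bool readRecord(const char *path, int &a, int &b, int &c,
                std::vector<char> &data)
{
    std::ifstream in(path, std::ios::binary);
    char sep1, sep2, sep3;
    if (!(in >> a >> sep1 >> b >> sep2 >> c >> sep3))
        return false; // header did not match the expected format

    // Everything after the third '-' is the binary payload.
    std::streampos start = in.tellg();
    in.seekg(0, std::ios::end);
    std::streamsize len = in.tellg() - start;
    in.seekg(start);

    data.resize(static_cast<size_t>(len));
    return static_cast<bool>(in.read(data.data(), len));
}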
What do you mean by binary data? Is it "010101000" written out char by char, or "real" binary data? If it is real binary data, just read the file as a binary file: read 2 bytes for the first int, the next byte for the '-', 1 byte for the '3', and so on; once you reach the start of the binary data, get the file length and read all the rest of it.

Reading Superblock into a C Structure

I have a disk image which contains a standard filesystem image, accessed using FUSE. The superblock contains the following, and I have a function read_superblock(*buf) that returns the raw data below:
Bytes 0-3: Magic Number (0xC0000112)
4-7: Block Size (1024)
8-11: Total file system size (in blocks)
12-15: FAT length (in blocks)
16-19: Root Directory (block number)
20-1023: NOT USED
I am very new to C, and to get me started on this project I am curious: what is a simple way to read this into a structure or some variables, and simply print them out with printf for debugging?
I was initially thinking of doing something like the following, figuring I could see the raw data, but I think this is not the case: there is no structure for me to grab the data out of, and I am trying to read it in as a string, which also seems terribly wrong. Is there a way for me to specify the structure and define the number of bytes in each variable?
char *buf;
read_superblock(*buf);
printf("%s", buf);
Yes, I think you'd be better off reading this into a structure. The fields containing useful data are all 32-bit integers, so you could define a structure that looks like this (using the types defined in the standard header file stdint.h):
typedef struct SuperBlock_Struct {
    uint32_t magic_number;
    uint32_t block_size;
    uint32_t fs_size;
    uint32_t fat_length;
    uint32_t root_dir;
} SuperBlock_t;
You can pass the address of the structure, cast to char*, when calling read_superblock, like this:
SuperBlock_t sb;
read_superblock((char*) &sb);
Now to print out your data, you can make a call like the following:
printf("%d %d %d %d\n",
sb.magic_number,
sb.block_size,
sb.fs_size,
sb.fat_length,
sb.root_dir);
Note that you need to be aware of your platform's endianness when using a technique like this, since you're reading integer data (i.e., you may need to swap bytes when reading your data). You should be able to determine that quickly using the magic number in the first field.
Note that it's usually preferable to pass a structure like this without casting it; this allows you to take advantage of the compiler's type-checking and eliminates potential problems that casting may hide. However, that would entail changing your implementation of read_superblock to read data directly into a structure. This is not difficult and can be done using the standard C runtime function fread (assuming your data is in a file, as hinted at in your question), like so:
fread(&sb.magic_number, sizeof(sb.magic_number), 1, fp);
fread(&sb.block_size, sizeof(sb.block_size), 1, fp);
...
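For illustration, one hypothetical shape for that reworked read_superblock, assuming it is handed an open FILE*; reading field by field also sidesteps any struct-padding issues:
#include <stdio.h>

/* Fill the struct field by field from the open file; returns nonzero
   on success, zero if any read comes up short. */
int read_superblock(FILE *fp, SuperBlock_t *sb)
{
    return fread(&sb->magic_number, sizeof(sb->magic_number), 1, fp) == 1
        && fread(&sb->block_size,   sizeof(sb->block_size),   1, fp) == 1
        && fread(&sb->fs_size,      sizeof(sb->fs_size),      1, fp) == 1
        && fread(&sb->fat_length,   sizeof(sb->fat_length),   1, fp) == 1
        && fread(&sb->root_dir,     sizeof(sb->root_dir),     1, fp) == 1;
}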
Two things to add here:
It's a good idea, when pulling raw data into a struct, to make sure the struct has no padding, even if it's entirely composed of 32-bit unsigned integers. In gcc you do this with #pragma pack(1) before the struct definition and #pragma pack() after it.
For dealing with potential endianness issues, two calls to look at are ntohs() and ntohl(), for 16- and 32-bit values respectively. Note that these swap from network byte order (big-endian) to host byte order; if the two are the same (which they aren't on x86-based platforms), they do nothing. You go from host to network byte order with htons() and htonl(). However, since this data is coming from your filesystem and not the network, I don't know whether endianness is an issue here. It should be easy enough to figure out by comparing the values you expect (e.g. the block size) with the values you get, in hex.
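If the fields do turn out to be big-endian on disk, the conversion is one line per field (ntohl() lives in arpa/inet.h on POSIX systems; on Windows it's in winsock2.h):
#include <arpa/inet.h>

/* Swap each field from big-endian (network order) to host order;
   these calls are no-ops on big-endian hosts. */
void superblock_to_host_order(SuperBlock_t *sb)
{
    sb->magic_number = ntohl(sb->magic_number);
    sb->block_size   = ntohl(sb->block_size);
    sb->fs_size      = ntohl(sb->fs_size);
    sb->fat_length   = ntohl(sb->fat_length);
    sb->root_dir     = ntohl(sb->root_dir);
}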
It's not difficult to print the data once you have successfully copied it into the structure Emerick proposed. Suppose the instance of the structure you use to hold the data is named SuperBlock_t_Instance.
Then you can print its fields like this:
printf("Magic Number:\t%u\nBlock Size:\t%u\n etc",
SuperBlock_t_Instance.magic_number,
SuperBlock_t_Instance.block_size);