writing structs and classes to disk


The following function writes a struct to a file.
#define PAGESIZE sizeof(BTPAGE)
#define HEADERSIZE 2L
int btwrite(short rrn, BTPAGE *page_ptr)
{
    long addr;
    addr = (long) rrn * (long) PAGESIZE + HEADERSIZE;
    lseek(btfd, addr, SEEK_SET);
    return (write(btfd, page_ptr, PAGESIZE));
}
The following is the struct.
typedef struct {
    short keycount;          /* number of keys in page */
    int key[MAXKEYS];        /* the actual keys */
    int value[MAXKEYS];      /* the actual values */
    short child[MAXKEYS+1];  /* ptrs to rrns of descendants */
} BTPAGE;
What would happen if I changed the struct to a class, would it still work the same?
If I added class functions, would the size it takes up on disk increase?

There's a lot you need to learn here.
First of all, you're treating a structure as an array of bytes. For a trivially copyable struct such as this one, that is not undefined behavior in itself (the aliasing rules allow any object to be read through a char or unsigned char pointer), but the bytes you get are implementation-specific: they include padding and depend on the compiler's type sizes and the machine's byte order. For a type that is not trivially copyable, reading the bytes back into an object is undefined behavior, and anything can happen. So don't do it. Use proper serialization (for example via Boost) instead. Yes, it's tedious. Yes, it's necessary.
Even if you ignore that and choose to become dependent on one particular compiler implementation (which may change even in the next compiler version), there are still reasons not to do it.
If you save a file on one machine, then load it on another, you may get garbage, because the second machine uses a different float representation, or a different endianness, or has different alignment rules, etc.
If your struct contains any pointers, it's very likely that saving them verbatim and then loading them back will give you an address that does not point to any meaningful place.
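The portability problems listed above mostly disappear if each field is written explicitly with a fixed width and byte order. Here is a minimal hand-rolled sketch (not Boost.Serialization), assuming 16-bit key counts and child links, 32-bit keys and values, and little-endian output. MAXKEYS = 4 and the names Page, put_le, get_le, serialize and deserialize are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int MAXKEYS = 4;  // assumed value; the question does not give it

struct Page {
    std::int16_t keycount;
    std::int32_t key[MAXKEYS];
    std::int32_t value[MAXKEYS];
    std::int16_t child[MAXKEYS + 1];
};

// Append the low `bytes` bytes of v, least significant byte first.
inline void put_le(std::vector<std::uint8_t>& out, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFFu));
}

// Read `bytes` little-endian bytes starting at pos and advance pos.
inline std::uint32_t get_le(const std::vector<std::uint8_t>& in, std::size_t& pos, int bytes) {
    assert(pos + static_cast<std::size_t>(bytes) <= in.size());
    std::uint32_t v = 0;
    for (int i = 0; i < bytes; ++i)
        v |= static_cast<std::uint32_t>(in[pos++]) << (8 * i);
    return v;
}

// Field-by-field encoding: no padding, no dependence on host byte order.
inline std::vector<std::uint8_t> serialize(const Page& p) {
    std::vector<std::uint8_t> out;
    put_le(out, static_cast<std::uint16_t>(p.keycount), 2);
    for (std::int32_t k : p.key)   put_le(out, static_cast<std::uint32_t>(k), 4);
    for (std::int32_t v : p.value) put_le(out, static_cast<std::uint32_t>(v), 4);
    for (std::int16_t c : p.child) put_le(out, static_cast<std::uint16_t>(c), 2);
    return out;
}

inline Page deserialize(const std::vector<std::uint8_t>& in) {
    Page p{};
    std::size_t pos = 0;
    p.keycount = static_cast<std::int16_t>(static_cast<std::uint16_t>(get_le(in, pos, 2)));
    for (auto& k : p.key)   k = static_cast<std::int32_t>(get_le(in, pos, 4));
    for (auto& v : p.value) v = static_cast<std::int32_t>(get_le(in, pos, 4));
    for (auto& c : p.child) c = static_cast<std::int16_t>(static_cast<std::uint16_t>(get_le(in, pos, 2)));
    return p;
}
```

The byte vector returned by serialize is what you would hand to write(), and the on-disk size no longer depends on how the compiler pads the struct.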
Typically when you add a member function, this happens:
the function's machine code is stored in a place shared by all the class instances (it wouldn't make sense to duplicate it, since it's logically immutable)
a hidden "this" pointer is passed to the function when it's called, so it knows which object it's been called on.
none of this requires any storage space in the instances.
However, when you add at least one virtual function, the compiler typically needs to add a hidden pointer to each object, referring to a per-class table of function addresses called a vtable (read up on it). This makes it possible to call different code depending on the current runtime type of the object (aka polymorphism). So the first virtual function you add to the class likely does increase the object size.
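The size effects described above can be checked at compile time. This is a small sketch with made-up type names (PlainData, WithMethods, WithVirtual). The exact sizes are implementation-specific, so it only asserts relations that hold on mainstream compilers.

```cpp
#include <type_traits>

struct PlainData {
    int x;
};

struct WithMethods {
    int x;
    int twice() const { return 2 * x; }  // non-virtual: no per-object storage
};

struct WithVirtual {
    int x;
    virtual ~WithVirtual() = default;    // virtual: typically adds a hidden vtable pointer
    virtual int twice() const { return 2 * x; }
};

// Non-virtual member functions do not change the object representation.
static_assert(sizeof(WithMethods) == sizeof(PlainData),
              "member functions should not take space in the object");
static_assert(std::is_trivially_copyable<WithMethods>::value,
              "still fine to copy as raw bytes");

// A class with virtual functions is no longer trivially copyable,
// so writing its bytes to disk and reading them back is not valid.
static_assert(!std::is_trivially_copyable<WithVirtual>::value,
              "virtual functions rule out raw byte copies");
```

On typical implementations sizeof(WithVirtual) is larger than sizeof(PlainData) by at least the size of one pointer, but that is a property of the compiler's ABI, not of the language.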

In C++, the difference between a struct and a class is simply that the members and base classes of a struct are public by default, whereas for a class they are private by default.
The technique of simply writing the bytes of the struct to a file and then reading them back in again only works if the struct is a plain old data (POD) type. If you modify your struct so that it is no longer POD, this technique is not guaranteed to work (the rules describing what makes a POD struct are listed in the answers to the linked question).
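One way to keep the raw-bytes technique honest is to let the compiler reject types it is not valid for. In C++11 and later, the property that matters for copying bytes is std::is_trivially_copyable. The sketch below uses hypothetical helpers write_raw, read_raw and roundtrip, plus a made-up Record type. It demonstrates the guard on an in-memory stream; a file stream opened in binary mode would be used the same way.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Write the object representation of a trivially copyable value.
template <typename T>
bool write_raw(std::ostream& out, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte I/O is only valid for trivially copyable types");
    out.write(reinterpret_cast<const char*>(&value), sizeof(T));
    return static_cast<bool>(out);
}

// Read the bytes back into an existing object of the same type.
template <typename T>
bool read_raw(std::istream& in, T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte I/O is only valid for trivially copyable types");
    in.read(reinterpret_cast<char*>(&value), sizeof(T));
    return static_cast<bool>(in);
}

struct Record {  // illustrative type, loosely modelled on the question's struct
    short count;
    int key[4];
};

// Round-trips a Record through an in-memory binary stream.
inline bool roundtrip(const Record& in_rec, Record& out_rec) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    return write_raw(buf, in_rec) && read_raw(buf, out_rec);
}
```

If Record later gains a virtual function or a std::string member, both helpers stop compiling instead of silently writing garbage. The data is still only readable by a program built with the same compiler and target.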

If the class has any virtual function, then you're in trouble; if no virtual functions, you should still be OK (the same applies to a struct, of course, since it, too, could have virtual functions: the difference between struct and class is just that the default visibility in struct is public, in class it's private).

If you are doing more serialisation of classes, consider using Google Protocol Buffers or something similar.

Related

Memory-mapped C++ objects non hardware members

I am developing a driver for a piece of memory mapped hardware using C++ and I defined a class that represents this device. It looks something like this:
class device
{
public:
    void method1();
    void method2();
private:
    device_register reg1;
    device_register reg2;
};
The class has two files: device.cpp and device.h. The member variables represent actual registers on the device itself. Now I want to define more members that are not actual registers on the device, but I cannot define them inside the class, because if I do they will be placed at the memory-mapped location of the device, which could contain other devices or registers. If I define them as public, that breaks the standard layout and the class won't work anymore.
So what I did was define them as global variables outside the class definition. The problem is that if I define more than one object, they all share these global variables, but I want each object to have its own. How can I do this?

Perhaps you can use a separate class with a pointer to the current "device":
class device_glob {
private:
    device * mmaped_device;
    int other_attrib;
};
You can probably construct the device at the memory-mapped location in the constructor. You then need to integrate this with whatever registry system you currently use to get references or pointers to a device.

You should create a separate POD (Plain Old Data) struct for the registers, because a non-POD class/struct could ruin your memory mapping even without adding extra data members. In POD types, the offset of the first data member is guaranteed to be 0, which is what you need. In non-POD types this contract is not always followed, e.g. it may be disrupted by vtables or RTTI. True, you could dutifully avoid only the changes that would change the offset of the first data member, but a more robust solution is to stick to all the POD requirements so that the standard requires your compiler to do what you want.
The requirements for a POD type can be found on cppreference, but an easy mnemonic is that a struct with nothing but native type members will always be POD.
You should also make sure the registers are marked "volatile," though I have a hunch that if you already have a special "device_register" typedef, it includes a volatile specifier. In which case, you don't need an extra one in your own struct.
Example:
struct MyPeripheralRegisters
{
    volatile device_register reg1;
    volatile device_register reg2;
};
class MyPeripheralDriver
{
private:
    MyPeripheralRegisters* hardware = MY_MEMORY_MAPPED_ADDR;
    int my_private_state;
public:
    void method1();
    void method2();
};
This pattern for memory mapped components is often used in embedded driver libraries, so it has the principle of least astonishment on its side.
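The layout guarantees this pattern relies on can be asserted at compile time, and the driver can be exercised against an ordinary struct in RAM standing in for the hardware. The sketch below assumes 32-bit registers (the device_register alias here is a guess, not the real typedef). The constructor parameter and the enable, status and enable_count members are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

using device_register = volatile std::uint32_t;  // assumption: 32-bit registers

struct MyPeripheralRegisters {
    device_register reg1;
    device_register reg2;
};

// Compile-time checks that the struct can be overlaid on the register block.
static_assert(std::is_standard_layout<MyPeripheralRegisters>::value,
              "register block must be standard layout");
static_assert(offsetof(MyPeripheralRegisters, reg1) == 0,
              "first register must sit at offset 0");
static_assert(offsetof(MyPeripheralRegisters, reg2) == sizeof(std::uint32_t),
              "registers must be contiguous");

class MyPeripheralDriver {
public:
    explicit MyPeripheralDriver(MyPeripheralRegisters* hw) : hardware_(hw) {
        assert(hw != nullptr);
    }
    void enable() { hardware_->reg1 = 1u; ++enable_count_; }
    std::uint32_t status() const { return hardware_->reg2; }
    int enable_count() const { return enable_count_; }

private:
    MyPeripheralRegisters* hardware_;  // points at the mapped registers
    int enable_count_ = 0;             // per-object state, lives in normal RAM
};
```

On the target, the pointer passed to the constructor would come from the device's mapped address. In a host-side test it can simply be the address of a local MyPeripheralRegisters object, and each driver instance keeps its own state without touching the register block.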

Are classes larger in memory than their members in C++?

Let's say I have some class whose only member is an int. If it wasn't in a class, the int alone would be 4 bytes. Does the class take more than 4 bytes of memory (in C++)?

The decision about how big a class ends up being is implementation-specific and depends on a lot of different factors. Sometimes, due to structure and class padding, a class might end up bigger than the combined size of its members. If you have any virtual functions in your class, then you'll typically end up with a virtual function table pointer (vtable pointer) at the front of each object that adds a bit of space. And it's entirely possible that the compiler might just For The Heck Of It make your class bigger than the size of its members if it thinks that will help out in some way (or if you have a lazy compiler!).
In your case, with a single 32-bit integer, I'd be surprised if the class ended up being any larger than the integer itself, since you aren't using any virtual functions and there aren't any members to insert padding bytes between. However, you cannot necessarily rely on this across systems.
If you're working on an application where it's absolutely essential that your class be the same size as the fields - perhaps, for example, if you're reading raw bytes and want to reinterpret them as class objects - you could use a static_assert to check for this:
class MyClass {
...
};
static_assert(sizeof(MyClass) == sizeof(int), "MyClass must have the same size as an integer.");
Many compilers have custom options (often through #pragma directives) that you can tune to ensure that classes get sized in a way that you'd like, so you could also consider reading up on that.
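To make the padding point concrete, here is a small sketch with two made-up types. It deliberately asserts only relations rather than exact byte counts, since the actual numbers depend on the compiler and target.

```cpp
#include <cstddef>

struct Padded {
    char tag;    // 1 byte
    int  value;  // usually aligned to its own size, so padding follows tag
};

struct Wrapper {
    int value;   // single member: nothing to insert padding between
};

// A class is never smaller than the sum of its members, but may be larger.
static_assert(sizeof(Padded) >= sizeof(char) + sizeof(int),
              "size covers at least the members");

// Each member starts at an offset suitable for its type.
static_assert(offsetof(Padded, value) % alignof(int) == 0,
              "value must be suitably aligned");

// The check suggested above, applied to the single-int case.
static_assert(sizeof(Wrapper) == sizeof(int),
              "Wrapper must have the same size as an int");
```

On common targets with 4-byte int, Padded usually comes out at 8 bytes rather than 5, which is exactly why summing the member sizes is unreliable.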

The actual size is implementation-dependent, so it can change across different compilers and architectures due to padding and other implementation details. Never trust a simple sum like in the following pseudocode:
size = sizeof(member1) + ... + sizeof(memberN)
Also if the class has virtual functions, yes, it can be more than 4 bytes.
Moreover, with virtual functions and class inheritance, the size can be hard to work out at first sight:
Each class that includes virtual functions has a vtable in memory holding pointers to those virtual functions, and each object of the class stores a pointer to that table.
A class A with virtual functions that inherits from another class B that also has virtual functions may need more than one table to store both A's and B's function pointers.
See this answer for more details: how to determine sizeof class with virtual functions?

Choosing data for parcelable in C++

In a previous program that I wrote in C, I needed a single object holding several pieces of "core" data that could be accessed by all the functions in my program. I ended up picking a struct and used a pointer to it for reading and writing data. It was fast and good for the job, and it was cheap, because accessing a pointer is probably one of the cheapest things you can do in C. I have never found anything better, so I'm happy with this solution.
Now in C++ I have the same problem: I need to share a state composed of some primitive types. I'm tempted to use one of the so-called POD types, which basically means a struct again, but this time with references for safety.
Supposing that I need this "blob" of data to be carried around my program, is a struct accessed by reference the fastest thing in C++? How much can a getter method cost?

If your getter code is inline (in the header file), then the compiler can eliminate the need to call a function in the machine code it outputs.
eg:
class Data
{
private:
    int number_;
public:
    int GetNumber() const { return number_; }
};
The compiler will see GetNumber's definition, will know that what it does is simple, and wherever you've called GetNumber() it will simply substitute number_. So using a getter versus accessing the member directly results in equivalent code, and both perform the same.
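If you want to check this on your own compiler, a pair of functions that differ only in how the value is read makes the comparison easy. This is a sketch with made-up names (Blob, sum_direct, sum_getter). Whether the generated code is really identical depends on the compiler and optimisation level, so compare the assembly output of an optimised build rather than taking it on faith.

```cpp
struct Blob {
    int number;  // public member, read directly
};

class Data {
public:
    explicit Data(int n) : number_(n) {}
    int GetNumber() const { return number_; }  // defined in the class body, so implicitly inline

private:
    int number_;
};

// Reads the member directly through a reference.
inline long sum_direct(const Blob& b, int times) {
    long total = 0;
    for (int i = 0; i < times; ++i) total += b.number;
    return total;
}

// Reads the same value through the getter.
inline long sum_getter(const Data& d, int times) {
    long total = 0;
    for (int i = 0; i < times; ++i) total += d.GetNumber();
    return total;
}
```

Both functions compute the same result, and optimising compilers typically inline the getter so that the two loops compile to the same instructions.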

How to implement virtual table in c++

A virtual table is an array of function pointers.
How can I implement it when every function has a different signature?

You don't implement it.
The compiler generates it (or something with equivalent functionality), and it's not constrained by the type system so it can simply store the function addresses and generate whatever code is needed to call them correctly.
You can implement something vaguely similar using a struct containing different types of function pointer, rather than an array. That's quite a common way of implementing dynamic polymorphism in C; for example, the Linux kernel provides polymorphic behaviour for file-like objects by defining an interface along the lines of:
struct fileops {
    int (*fo_read) (struct file *fp, ...);
    int (*fo_write) (struct file *fp, ...);
    // and so on
};
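The same idea can be written out in full as a hand-made dispatch table. The sketch below is not how any particular compiler lays out its vtables; Shape, ShapeOps and the rect_/tri_ functions are invented for illustration. Each slot in the table has its own function pointer type, which is how the different signatures are accommodated.

```cpp
#include <string>

struct Shape;  // forward declaration so the table can mention it

// The hand-written "vtable": one typed function pointer per operation.
struct ShapeOps {
    double      (*area)(const Shape* self);
    std::string (*name)(const Shape* self);
    void        (*scale)(Shape* self, double factor);
};

struct Shape {
    const ShapeOps* ops;  // plays the role of the hidden vtable pointer
    double a;
    double b;
};

inline double rect_area(const Shape* s) { return s->a * s->b; }
inline std::string rect_name(const Shape*) { return "rect"; }
inline double tri_area(const Shape* s) { return 0.5 * s->a * s->b; }
inline std::string tri_name(const Shape*) { return "triangle"; }
inline void scale_both(Shape* s, double f) { s->a *= f; s->b *= f; }

// One table per "class", shared by all objects of that kind.
const ShapeOps rect_ops = { rect_area, rect_name, scale_both };
const ShapeOps tri_ops  = { tri_area,  tri_name,  scale_both };

// Dynamic dispatch: the call goes through whichever table the object points to.
inline double area_of(const Shape& s) { return s.ops->area(&s); }
```

A caller builds an object with, for example, Shape r{&rect_ops, 2.0, 3.0} and then calls area_of(r) without knowing which table is behind it.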

If functions in a virtual table have different signatures, you'll have to implement it as a structure type containing members with heterogeneous types.
Alternately, if you have other information telling you what the signatures are, you can cast a function pointer to another function pointer type, as long as you cast it back to the correct type before calling it.
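The cast-and-cast-back approach might look like the following sketch. C++ guarantees that converting a function pointer to another function pointer type and back yields the original pointer, as long as the call is made only through the original type. GenericFn, Sig, Slot and the helper functions are hypothetical names; the Sig tag is the "other information" that records each real signature.

```cpp
#include <cassert>

using GenericFn = void (*)();  // used for storage only, never called through

enum class Sig { IntToInt, DoubleDoubleToDouble };

struct Slot {
    Sig       sig;  // records the real signature of fn
    GenericFn fn;   // the function pointer, stored under a generic type
};

inline int    negate_int(int x)                { return -x; }
inline double add_doubles(double a, double b)  { return a + b; }

const Slot table[] = {
    { Sig::IntToInt,             reinterpret_cast<GenericFn>(&negate_int) },
    { Sig::DoubleDoubleToDouble, reinterpret_cast<GenericFn>(&add_doubles) },
};

// Cast back to the exact original type before calling.
inline int call_int(const Slot& s, int x) {
    assert(s.sig == Sig::IntToInt);
    return reinterpret_cast<int (*)(int)>(s.fn)(x);
}

inline double call_dd(const Slot& s, double a, double b) {
    assert(s.sig == Sig::DoubleDoubleToDouble);
    return reinterpret_cast<double (*)(double, double)>(s.fn)(a, b);
}
```

Calling through the wrong type is undefined behavior, which is why the tag is checked before every call.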

If you know every function at compile time, then you could use a struct of differently typed function pointers (however, if you know every function at compile time, why wouldn't you just use a class with virtual methods?).
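If you want to do this at runtime, then an array of a generic function pointer type such as void (*)() would probably suffice (an array of void* is a common shortcut, but converting function pointers to void* is only conditionally supported in C++). You'd need to cast the pointers in when you store them and out (to the correct type) again before you call them. Of course, you'll need to keep track of the function types (including calling convention) somewhere else.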
Without knowing what you're planning to do with this it's very difficult to give a more useful answer.
There are valid reasons for implementing vtables in code. They're an implementation detail though, so you'll need to be targeting a known ABI rather than just 'C++'. The only time I've done this was an experiment to dynamically create new COM classes at runtime (the ABI expected of a COM object is a pointer to a vtable that contains functions following the __stdcall calling convention where the first 3 functions implement the IUnknown interface).

Dynamic structures in C++

I am running a simulation in which I have objects of a class which use different models. These models are randomly selected for some objects of the class and specifically decided for some objects too. These objects communicate with each other for which I am using structures (aka struct) in C++ which has some
standard variables and
some additional variables which depends on models which the objects communicating with each other have.
So, how can I do this?
Thanks in advance.

You can hack around with:
the preprocessor;
template meta-programming;
inheritance/polymorphism.
Each gives a different way of producing a different user-defined type, based on different kinds of conditions.
Without knowing what you're trying to accomplish, this is the best I can do.

All instances of a structure or class have the same structure. Luckily, there are some tricks that can be used to 'simulate' what you try to do.
The first trick (which can also be used in C), is to use a union, e.g.:
struct MyStruct
{
    int field1;
    char field2;
    int type;
    union
    {
        int field3a;
        char field3b;
        double field3c;
    } field3;
};
In a union, all members take up the same space in memory. As a programmer you have to be careful. You can only get out of the union what you put in. If you initialize one member of a union but read another, you will probably get garbage, and in C++ it is formally undefined behavior (unless you want to do some low-level hacks, but don't do this unless you are very experienced).
Unions often come together with another field (outside the union) that indicates which member is actually used in the union. You could consider this your 'condition'.
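A sketch of how the tag and the union work together is shown below. It reuses the MyStruct layout from above but replaces the plain int tag with an enum; FieldType, make_int, make_double and as_double are names made up for this example.

```cpp
enum FieldType { TYPE_INT, TYPE_CHAR, TYPE_DOUBLE };

struct MyStruct {
    int field1;
    char field2;
    FieldType type;  // says which member of field3 is currently valid
    union {
        int field3a;
        char field3b;
        double field3c;
    } field3;
};

// Each factory writes one union member and records which one in the tag.
inline MyStruct make_int(int v) {
    MyStruct s{};
    s.type = TYPE_INT;
    s.field3.field3a = v;
    return s;
}

inline MyStruct make_double(double v) {
    MyStruct s{};
    s.type = TYPE_DOUBLE;
    s.field3.field3c = v;
    return s;
}

// Reads only the member that was last written, as indicated by the tag.
inline double as_double(const MyStruct& s) {
    switch (s.type) {
        case TYPE_INT:    return s.field3.field3a;
        case TYPE_CHAR:   return s.field3.field3b;
        case TYPE_DOUBLE: return s.field3.field3c;
    }
    return 0.0;
}
```

As long as every write goes through a function that also sets the tag, readers never touch a member that was not the last one written.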
A second trick is to use the 'state' pattern (see http://en.wikipedia.org/wiki/State_pattern). From the outside world, the context class always looks the same, but internally, the different states can contain different kinds of information.
A somewhat simplified approach for state is to use simple inheritance and dynamic casts. Depending on your 'condition', use a different subclass, and perform a dynamic cast to get the specific information.
E.g., suppose that we have a Country class. Some countries have a president, others have a king, others have an emperor. You could do something like this:
class Country
{
    ...   // needs at least one virtual function (e.g. a virtual destructor) for dynamic_cast to work
};
class Republic : public Country
{
public:
    const string &getPresident() const;
    const string &getVicePresident() const;
};
class Monarchy : public Country
{
public:
    const string &getKing() const;
    const string &getQueen() const;
};
In your application you could work with pointers to Country, and do a dynamic cast to Republic or Monarchy where the president or king is needed.
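Filled in with just enough detail to compile, the dynamic-cast approach could look like this. It is a sketch under assumptions: the constructors, the stored strings and the headOfState helper are additions for illustration, and the base class gets a virtual destructor so that dynamic_cast is allowed.

```cpp
#include <string>
#include <utility>

class Country {
public:
    virtual ~Country() = default;  // makes the hierarchy polymorphic
};

class Republic : public Country {
public:
    explicit Republic(std::string president) : president_(std::move(president)) {}
    const std::string& getPresident() const { return president_; }

private:
    std::string president_;
};

class Monarchy : public Country {
public:
    explicit Monarchy(std::string king) : king_(std::move(king)) {}
    const std::string& getKing() const { return king_; }

private:
    std::string king_;
};

// Works on any Country and picks the model-specific data via dynamic_cast.
inline std::string headOfState(const Country& c) {
    if (const auto* r = dynamic_cast<const Republic*>(&c)) return r->getPresident();
    if (const auto* m = dynamic_cast<const Monarchy*>(&c)) return m->getKing();
    return "unknown";
}
```

The pointer form of dynamic_cast returns a null pointer when the object is not of the requested type, so each branch is only taken for the matching subclass.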
This example can be easily transformed into one using the 'state' pattern, but I leave this as an exercise for you.
Personally, I would go for the state pattern. I'm not a big fan of dynamic casts; they always seem like a kind of hack to me.

If it's at compile-time, a simple #ifdef or template specialization will serve this purpose just fine. If it's at run-time and you need value semantics, you can use a boost::optional<my_struct_of_optional_members>, and if you're fine with reference semantics, inheritance will solve the problem at hand.
A union and that kind of dirty trick is not necessary.

There are several common approaches for "dynamic" attributes/properties in languages, and a few that tend to work well in C++.
For example, you can make a C++ class called "MyProperties" that has a sparse set of values, and your MyStructureClass would have its well-known members, plus a single MyProperties instance which may have zero-or-more values.
Similarly, languages like Python and Perl make extensive use of Associative Arrays/Dictionaries/Hashes to achieve this: The (string) key uniquely identifies the value. In C++, you can index your MyProperties class with a string or any type you want (after overloading the operator[]()), and the value can be a string, a MyVariant, or any other pointer-or-type that you want to inspect. The values are dynamically added to the parent container as they are assigned (e.g., the class "remembers" the last value it is given, uniquely identified by key).
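A minimal version of such a property bag might look like the sketch below. The choice of std::map with string values is an assumption made to keep the example short; the has and size members and the id and weight fields are invented for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>

class MyProperties {
public:
    // Creates the entry on first use, like a dictionary in Python or Perl.
    std::string& operator[](const std::string& key) { return values_[key]; }

    bool has(const std::string& key) const { return values_.count(key) != 0; }
    std::size_t size() const { return values_.size(); }

private:
    std::map<std::string, std::string> values_;  // sparse: only assigned keys exist
};

struct MyStructureClass {
    int id;               // well-known members shared by every model
    double weight;
    MyProperties extras;  // zero or more model-specific values
};
```

An object whose model needs extra data just assigns it, for example s.extras["decay"] = "0.5", while objects of other models carry an empty MyProperties.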
Finally, in the "olden days", what you describe was commonly done for distributed application processing: You defined a C-struct with "well-known" (typed) fields/members, and the last field was a char* member. Then, that char* member would identify the start of a serialized stream of bytes that were also part of that struct (you merely serialized that array of chars when you marshalled the struct across systems). In the context of C++, you could similarly extract your values dynamically from that char* stream buffer on-access-demand (which logically should be "owned" by the class). This worked for marshalling across systems because the size of the struct was the size of everything (including the last char* member), but the "allocation" for that struct was much larger (e.g., the size of the struct itself, which was logically a "header", plus a certain number of bytes after that header, which represented the "payload" and which was indexed by the last member, the char* member.) Thus, it was a contiguous-block-of-memory struct, with dynamic size. (This would also work in C++ as long as you passed-by-reference, and never by value.)

Embed a union into your structure, and use a flag to tell which part of the union is valid.
enum struct_type
{
    cool,
    fine,
    bad
};
struct demo
{
    struct_type type;
    union
    {
        struct
        {
            double cool_factor;
        } cool_part;
        struct
        {
            int fineness;
        } fine_part;
        struct
        {
            char *bad_stuff;
        } bad_part;
    };
    struct
    {
        int life_is_cool;
    } common_part;
};

The pure and simple C++ answer is: use classes.
I can't determine from your question what you are trying to achieve: runtime variation or compile time variation, but either way, I doubt you'll get a workable implementation any other way. (Template metaprogramming aside... which isn't for the faint of heart.)
