When we declare object of a class is its memory layout successive(One after the other)?If its successive than does padding occurs in it (like structure padding)?Please help me out with the concepts of memory layout for a class
Thanks in advance.
When we declare object of a class is
its memory allocation successive(One
after the other)?
The Standard doesn't give any such guarantee. Object memory layout is implementation-defined.
Usually, memory address for data members increases in the order they're defined in the class . But this order may be disrupted at any place where the access-specifiers (private, protected, public) are encountered. This has been discussed in great detail in Inside the C++ Object Model by Lippman.
An excerpt from C/C++ Users Journal,
The compiler isn't allowed to do this
rearrangement itself, though. The
standard requires that all data that's
in the same public:, protected:, or
private: must be laid out in that
order by the compiler. If you
intersperse your data with access
specifiers, though, the compiler is
allowed to rearrange the
access-specifier-delimited blocks of
data to improve the layout, which is
why some people like putting an access
specifier in front of every data
member.
Interesting, isn't it?
Related
Looking around I found many places where the way to get the size of a certain object (class or struct) is explained. I read about the padding, about the fact that virtual function table influences the size and that "pure method" object has size of 1 byte. However I could not find whether these are facts about implementation or C++ standard (at least I was not able to find all them).
In particular I am in the following situation: I'm working with some data which are encoded in some objects. These objects do not hold pointers to other data. They do not inherit from any other class, but they have some methods (non virtual). I have to put these data in a buffer to send them via some socket. Now reading what I mentioned above, I simply copy my objects on the sender buffer, noticing that the data are "serialized" correctly, i.e. each member of the object is copied, and methods do not affect the byte structure.
I would like to know if what I get is just because of the implementation of the compiler or if it is prescribed by the standard.
The memory layouts of classes are not specified in the C++ standard precisely. Even the memory layout of scalar objects such as integers aren't specified. They are up to the language implementation to decide, and generally depend on the underlying hardware. The standard does specify restrictions that the implementation specific layout must satisfy.
If a type is trivially copyable, then it can be "serialised" by copying its memory into a buffer, and it can be de-it serialised back as you describe. However, such trivial serialisation only works when the process that de-serialises uses the same memory layout. This cannot generally be assumed to be the case since the other process may be running on entirely different hardware and may have been compiled with a different (version of) compiler.
You should use POD (plain-old-data). A structure is POD if it hasn't virtual table, some constructors, private methods and many other things.
There is garantee the pod data is placed in memory in declaration order.
There is alignment in pod data. And you should specify right alignment (it's your decision). See #pragma pack (push, ???).
So I have read this about if class definitions occupy memory and this about if function occupy memory. This is what I do not get: How come class definitions do not occupy memory if functions do, or their code does. I mean, class definitions are also code, so shouldn't that occupy memory just like function code does?
It is not entirely correct to say that class definitions do not occupy memory: any class with member functions may place some code in memory, although the amount of code and its actual placement depends heavily on function inlining.
The Q&A at the first link talks about sizeof, which shows a per-instance memory requirement of the class, which excludes memory requirements for storing member functions, static members, inlined functions, dispatch tables, and so on. This is because all these elements are shared among all instances of the class.
You don't need to keep the class definition anywhere, because the details of how to create an instance of a class are encoded in its constructors.
(In a sense, the class definition is code, it's just not represented explicitly.)
All you need to know in order to create an object is
How big it is,
Which constructor to use for creating it, and
Which its virtual functions are.
To create an instance of class A:
Reserve a piece of memory of size sizeof(A) (or be handed one),
Associate that piece of memory with the virtual functions of A, if any (usually held in a table in a predetermined location), and
Tell the relevant A constructor where the A should be created, and then let it do the actual work.
You don't need to know a thing about the types of member variables or anything like that, the constructors know what to do once they know where the object is to be created.
(Every member variable can be found at an offset from the beginning of the object, so the constructor knows where things must be.)
To create a function, on the other hand, you would need to store its definition in some form and then generate the code at runtime. (This is usually called "Just-in-time" compilation.)
This requires a compiler, which means that you need to either
Include a compiler in every executable, or
Provide (or require everyone to install) a shared compiler for all executables (Java VMs usually contain at least one).
C++ compilers instead generate the functions in advance.
Abusing terminology a little, you could say that the functions are "instantiated" by the compilation process, with the source code as a blueprint.
This question is likely a "what does the C++ standard say" thing, but my Google searching hasn't given me the answer I'm looking for.
I know that when you have classes, and you have one class inherit from another class, you get into the world of virtual function tables, since the code needs to figure out which class contains the function you're trying to call.
But what about inheritance between structs that only contain data? For example, if you have a widget struct, and then you want a specialized version of that struct that has a few extra variables, but you still want to be able to pass its original data to functions that handle widgets, it would be simpler to inherit from the original widget struct than to make your code handle two types of widget structs. Is there any overhead when there is only data involved in the inheritance? Is the specialized widget still a simple struct (in terms of memory layout) with both data combined, or is the original widget data stored separate from the new data?
Ultimately, I'd like to keep my data simple and contiguous, as a basic struct would be, and I don't know if inheriting data would break that.
In the C++ memory model an object is always laid out in contiguous memory. You need to use members pointing to data outside this object if you want to have non-contiguous memory. That is, if you inherit any class whether it is a struct or has virtual function, the actual object is always contiguous. There are few other implications about types which may be of interested: if a class is a standard layout type you can e.g. memcpy() the object. I'm not sure what C++2011 says about inheritance and standard layout type but I'm pretty sure that C++2003 didn't allow inheritance and C++2011 allows it.
know that when you have classes, and you have one class inherit from another class, you get into the world of virtual function tables
only if you have virtual functions...
so to answer your question: if you have a plain struct without member functions, then the compiler won't generate a virtual function table.
and BTW you shouldn't be worrying about it, that table is per class, and you only need a simple extra pointer per instance (if you use simple inheritance).
I have several questions to ask that pertains to data position and alignment in C++. Do classes have the same memory placement and memory alignment format as structs?
More specifically, is data loaded into memory based on the order in which it's declared? Do functions affect memory alignment and data position or are they allocated to another location? Generally speaking, I keep all of my memory alignment and position dependent stuff like file headers and algorithmic data within a struct. I'm just curious to know whether or not this is intrinsic to classes as it is to structs and whether or not it will translate well into classes if I chose to use that approach.
Edit: Thanks for all your answers. They've really helped a lot.
Do classes have the same memory placement and memory alignment format
as structs?
The memory placement/alignment of objects is not contingent on whether its type was declared as a class or a struct. The only difference between a class and a struct in C++ is that a class have private members by default while a struct have public members by default.
More specifically, is data loaded into memory based on the order in
which it's declared?
I'm not sure what you mean by "loaded into memory". Within an object however, the compiler is not allowed to rearrange variables. For example:
class Foo {
int a;
int b;
int c;
};
The variables c must be located after b and b must be located after a within a Foo object. They are also constructed (initialized) in the order shown in the class declaration when a Foo is created, and destructed in the reverse order when a Foo is destroyed.
It's actually more complicated than this due to inheritance and access modifiers, but that is the basic idea.
Do functions affect memory alignment and data position or are they
allocated to another location?
Functions are not data, so alignment isn't a concern for them. In some executable file formats and/or architectures, function binary code does in fact occupy a separate area from data variables, but the C++ language is agnostic to that fact.
Generally speaking, I keep all of my memory alignment and position
dependent stuff like file headers and algorithmic data within a
struct. I'm just curious to know whether or not this is intrinsic to
classes as it is to structs and whether or not it will translate well
into classes if I chose to use that approach.
Memory alignment is something that's almost automatically taken care of for you by the compiler. It's more of an implementation detail than anything else. I say "almost automatically" since there are situations where it may matter (serialization, ABIs, etc) but within an application it shouldn't be a concern.
With respect with reading files (since you mention file headers), it sounds like you're reading files directly into the memory occupied by a struct. I can't recommend that approach since issues with padding and alignment may make your code work on one platform and not another. Instead you should read the raw bytes a couple at a time from the file and assign them into the structs with simple assignment.
Do classes have the same memory placement and memory alignment format as structs?
Yes. Technically there is no difference between a class and a struct. The only difference is the default member access specification otherwise they are identical.
More specifically, is data loaded into memory based on the order in which it's declared?
Yes.
Do functions affect memory alignment and data position or are they allocated to another location?
No. They do not affect alignment. Methods are compiled separately. The object does not contain any reference to methods (to those that say virtual tables do affect members the answer is yes and no but this is an implementation detail that does not affect the relative difference between members. The compiler is allowed to add implementation specific data to the object).
Generally speaking, I keep all of my memory alignment and position dependent stuff like file headers and algorithmic data within a struct.
OK. Not sure how that affects anything.
I'm just curious to know whether or not this is intrinsic to classes as it is to structs
Class/Structs different name for the same thing.
and whether or not it will translate well into classes if I chose to use that approach.
Choose what approach?
C++ classes simply translate into structs with all the instance variables as the data contained inside the structs, while all the functions are separated from the class and are treated like functions with accept those structs as an argument.
The exact way instance variables are stored depends on the compiler used, but they generally tend to be in order.
C++ classes do not participate in "persistence", like binary-mode structures, and shouldn't have alignment attached to them. Keep the classes simple.
Putting alignment with classes may have negative performance benefits and may have side effects too.
May order of members in binary architecture of objects of a class somehow have an impact on performance of applications which use that class? and I'm wondering about how to decide order of members of PODs in case the answer is yes since programmer defines order of members via order of their declaraions
Absolutely. C++ guarantees that the order of objects in memory is the same as the order of declaration, unless an access qualifier intervenes.
Objects which are directly adjacent are more likely to be on the same cacheline, so one memory access will fetch them both (or flush both from the cache). Cache effectiveness may also be improved as the proportion of useful data inside it may be higher. Simply put, spatial locality in your code translates to spatial locality for performance.
Also, as Jerry notes in the comments, order may affect the amount of padding. Sort the members by decreasing size, which is also by decreasing alignment (usually treat an array as just one element of its type, and a member struct as its most-aligned member). Unnecessary padding may increase the total size of the structure, leading to higher memory traffic.
C++03 ยง9/12:
Nonstatic data members of a
(non-union) class declared without an
intervening access-specifier are allocated so that later members have
higher addresses within a class
object. The order of allocation of
nonstatic data members separated by an
access-specifier is unspecified
(11.1). Implementation alignment
requirements might cause two
adjacent members not to be allocated
immediately after each other; so might
requirements for space for managing
virtual functions (10.3) and virtual
base classes (10.1).
Absolutely agree with Potatoswatter. However one more point should be added about the CPU cache lines.
If your application is multithreaded and different threads read/write members of your structure - it's very important to make sure those members are not within the same cache line.
The point is that whenever a thread modifies a memory address that is cached in other CPU - that CPU immediately invalidates the cache line containing that address. So that improper members order may lead to the unjustified cache invalidation and performance degradation.
In addition to the runtime performance, described in the cache-line related answers, I think one should also consider memory performance, i.e. the size of the class object.
Due to the padding, the size of the class object is dependent on the order of member variable declaration.
The following declaration would probably take 12 bytes
class foo {
char c1;
int i;
char c2;
}
However, upon simple re-ordering of the order of member declaration, the following would probably take 8 bytes
class bar {
int i;
char c1;
char c2;
}
In machines aligned with 4-byte words:
sizeof( foo ) = 12
but
sizeof( bar ) = 8