Writing into the last byte of a class - c++

Given a standard layout class with standard layout members such as:
struct foo {
int n;
int m;
unsigned char garbage;
};
will it always be safe, according to the standard, to write in the last byte of the struct without writing into the memory areas of n and m (but possibly writing into garbage)? E.g.,
foo f;
*(static_cast<unsigned char *>(static_cast<void *>(&f)) + (sizeof(foo) - 1u)) = 0u;
After spending some time reading the C++11 standard, it seems to me like the answer might be yes.
From 9.2/15:
Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so
that later members have higher addresses within a class object. The order of allocation of non-static data
members with different access control is unspecified (11). Implementation alignment requirements might
cause two adjacent members not to be allocated immediately after each other; so might requirements for
space for managing virtual functions (10.3) and virtual base classes (10.1).
Hence, the garbage member has a higher address than the other two members (which are stored contiguously themselves as they have standard layout), and the last byte of the struct must hence either belong to garbage or be part of the final padding.
Is this reasoning correct? Am I meddling with the f object lifetime here? Is writing into padding bytes a problem?
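The layout part of this reasoning can at least be checked on a given implementation. The following is a minimal sketch, not an answer to what the standard guarantees: the helper name last_byte_write_preserves_members is hypothetical, and the static_asserts only confirm what the compiler in use actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct foo {
    int n;
    int m;
    unsigned char garbage;
};

static_assert(std::is_standard_layout<foo>::value,
              "offsetof is only well-defined for standard-layout types");

// garbage starts at or after the end of m, so it cannot overlap n or m.
static_assert(offsetof(foo, garbage) >= offsetof(foo, m) + sizeof(int),
              "garbage must be allocated after m");

// The last byte is therefore either garbage itself or trailing padding.
static_assert(sizeof(foo) - 1 >= offsetof(foo, garbage),
              "last byte lies at or beyond garbage");

// Writes zero into the last byte of a foo and reports whether n and m
// kept their values.
inline bool last_byte_write_preserves_members() {
    foo f;
    f.n = 1234;
    f.m = -5678;
    f.garbage = 0xFF;
    unsigned char *bytes = reinterpret_cast<unsigned char *>(&f);
    bytes[sizeof(foo) - 1] = 0u;
    return f.n == 1234 && f.m == -5678;
}
```

Whether the write lands in garbage or in padding depends on the platform's alignment rules; the checks above hold in both cases.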
EDIT
In reply to the comments, what I am trying to achieve here has to do with a variant-like class I am writing.
If I proceed in a straightforward way (i.e., place an int member in the variant class to record which type is being stored), the padding will make the class almost 50% bigger than it needs to be.
What I am trying to do is to make sure that every last byte of each class type I am going to store in the variant is writable, so I can incorporate the storage flag into the raw storage (aligned raw char array) I am using in the variant. In my specific case, this eliminates most of the wasted space.
EDIT 2
As an actual example, consider these two classes to be stored in a variant on a typical 64-bit machine:
// Small dynamic vector class storing 8-bit integers.
class first {
std::int8_t *m_ptr;
unsigned short m_size_capacity; // Size and capacity packed into a single ushort.
};
// Vector class with static storage.
class second {
std::int8_t m_data[15];
std::uint8_t m_size;
};
class variant
{
char m_data[...]; // Properly sized and aligned for first and second.
bool m_flag; // Flag to signal which class is being stored.
};
The size of these two classes is 16 on my machine, the extra member needed in the variant class makes the size go to 24. If I now add the garbage byte in the end:
// Small dynamic vector class storing 8-bit integers.
class first {
std::int8_t *m_ptr;
unsigned short m_size_capacity; // Size and capacity packed into a single ushort.
unsigned char m_garbage;
};
// Vector class with static storage.
class second {
std::int8_t m_data[14]; // Note I lost a vector element here.
std::uint8_t m_size;
unsigned char m_garbage;
};
The size of both classes will still be 16, but if now I can use the last byte of each class freely I can do away with the flag member in the variant, and the final size will still be 16.

Instead, you should put the tag first, followed by the other small members.
// Small dynamic vector class storing 8-bit integers.
struct first
{
unsigned char m_tag;
std::uint8_t m_size;
std::uint8_t m_capacity;
std::int8_t *m_ptr;
};
// Vector class with static storage.
struct second
{
unsigned char m_tag;
std::uint8_t m_size;
std::int8_t m_data[14];
};
Then, the language rules allow you to put these into a union and use either one to access m_tag, even if that wasn't the "active" member of the union, because the initial layout is the same (special rule for common initial sequence of members).
union tight_vector
{
first dynamic;
second small_opt;
};
tight_vector v;
if (v.dynamic.m_size < 4) throw std::runtime_error("Not enough data"); // needs <stdexcept>
if (v.dynamic.m_tag == DYNAMIC) { /* use v.dynamic */ }
else { /* use v.small_opt */ }
The rule in question is 9.2/18:
If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them. Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.
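As a rough, self-contained sketch of the common-initial-sequence approach: the tag values DYNAMIC and SMALL and the helpers make_small and is_small are hypothetical names introduced for illustration. The common initial sequence of the two structs is m_tag followed by m_size, so both may be read through either union member.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical tag values; the original only names DYNAMIC.
enum tag_value : unsigned char { DYNAMIC = 0, SMALL = 1 };

struct first {
    unsigned char m_tag;
    std::uint8_t m_size;
    std::uint8_t m_capacity;
    std::int8_t *m_ptr;
};

struct second {
    unsigned char m_tag;
    std::uint8_t m_size;
    std::int8_t m_data[14];
};

union tight_vector {
    first dynamic;
    second small_opt;
};

static_assert(std::is_standard_layout<first>::value, "rule needs standard layout");
static_assert(std::is_standard_layout<second>::value, "rule needs standard layout");

// Makes small_opt the active member, with an empty data array.
inline tight_vector make_small(std::uint8_t size) {
    tight_vector v;
    v.small_opt = second{SMALL, size, {}};
    return v;
}

// Reads the tag through the other union member, which is permitted
// because m_tag is part of the common initial sequence.
inline bool is_small(const tight_vector &v) {
    return v.dynamic.m_tag == SMALL;
}
```

On a typical 64-bit platform sizeof(tight_vector) would be 16, but that figure depends on pointer size and alignment, so the sketch does not assert it.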

Yes, in C++ any object, including a class, is represented as a series of addressable char objects ("bytes"), and objects declared in sequence in a class without intervening access specifiers have sequentially ascending addresses. Therefore storage for garbage must have a higher address (when addressed as char *) than n or m.
In theory the compiler could store a base class, or something like a vtable pointer, at the end of the object, but in practice such things always go at the beginning for the sake of simplicity. I'm not sure what the standard guarantees about the size of a standard-layout class, and hence whether padding may be added or whether an implementation could rely on the presence of padding for some purpose. It probably comes down to the implementation being allowed to use padding that happens to be there, but not being allowed to add any (and access would perhaps not be simple or efficient anyway).
What I am trying to do is to make sure that every last byte of each class type I am going to store in the variant is writable
How is this different from using the garbage member itself? If you know it's there, presumably you can simply access it.

The last byte of the struct will not be part of n and m, but how do you know the compiler hasn't stored something else in the last byte, such as type information?
I don't recall the standard guaranteeing any such thing. Only that a memcpy of sizeof(T) into and out will result in the same value, which doesn't mean that the final byte doesn't hold information.

Related

Variable class/struct structure? (Not template & not union?)

I have tried union...
struct foo
{
union
{
struct // 2 bytes
{
char var0_1;
};
struct // 5 bytes
{
char var1_1;
int var1_2;
};
};
};
Problem: Unions do what I want, except they will always take the size of the biggest datatype. In my case I need struct foo to have some initialization that allows me to tell it which of the two structures to choose (if that is even legal), as shown below.
So after that, I tried class template overloading...
template <bool B>
class foo { };
template <>
class foo<true>
{
char var1;
};
template <>
class foo<false>
{
char var0;
int var1;
};
Problem: I was really happy with templates and the fact that I could use the same variable name on the char and int, but the problem was the syntax. Because the classes are created on compile-time, the template boolean variable needed to be a hardcoded constant, but in my case the boolean needs to be user-defined on runtime.
So I need something of the two "worlds." How can I achieve what I'm trying to do?
!!NOTE: The foo class/struct will later be inherited, therefore as already mentioned, size of foo is of utmost importance.
EDIT#1::
Application:
Basically this will be used to read/write (using a pointer as an interface) a specific data buffer and also allow me to create (new instance of the class/struct) the same data buffer. The variables you see above specify the length. If it's a smaller data buffer, the length is written in a char/byte. If it's a bigger data buffer, the first char/byte is null as a flag, and the int specifies the length instead. After the length it's obvious that the actual data follows, hence why the inheritance. Size of class is of the utmost importance. I need to have my cake and eat it too.
A layer of abstraction.
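#include <cstddef>
#include <cstdint>
#include <cstring>
struct my_buffer_view{
std::size_t size()const{
if (!m_ptr)return 0;
// A non-zero first byte is the length itself.
if (*m_ptr)return *m_ptr;
// Otherwise a 32-bit length follows; it may be unaligned, so copy it out instead of casting.
std::uint32_t len;
std::memcpy(&len,m_ptr+1,sizeof len);
return len;
}
std::uint8_t const* data() const{
if(!m_ptr)return nullptr;
// Skip the 1-byte header, or the flag byte plus the 4-byte length.
if(*m_ptr)return m_ptr+1;
return m_ptr+5;
}
std::uint8_t const* begin()const{return data();}
std::uint8_t const* end()const{return data()+size();}
my_buffer_view(std::uint8_t const*ptr=nullptr):m_ptr(ptr){}
my_buffer_view(my_buffer_view const&)=default;
my_buffer_view& operator=(my_buffer_view const&)=default;
private:
std::uint8_t const* m_ptr=nullptr;
};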
No variable-sized data anywhere. I could have used a union for the size, etc.:
struct header{
std::uint8_t short_len;
union {
struct{
std::uint32_t long_len;
std::uint8_t long_buf[1];
} long_form;
struct {
std::uint8_t short_buf[1];
} short_form;
} body;
};
but I just did pointer arithmetic instead.
Writing such a buffer to a bytestream is another problem entirely.
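For what it's worth, here is one way the encoding side might look. This is only a sketch under assumptions: encode_buffer, decoded_size and header_size are hypothetical names, the 32-bit length is stored in native byte order, and an empty payload uses the long form because a zero first byte is the flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Encodes payload with the header described above: a single non-zero length
// byte for 1..255 bytes, otherwise a zero flag byte plus a 32-bit length.
inline std::vector<std::uint8_t> encode_buffer(const std::vector<std::uint8_t> &payload) {
    std::vector<std::uint8_t> out;
    if (!payload.empty() && payload.size() <= 255) {
        out.push_back(static_cast<std::uint8_t>(payload.size()));
    } else {
        const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
        std::uint8_t raw[sizeof len];
        std::memcpy(raw, &len, sizeof len);  // native byte order (assumption)
        out.push_back(0);                    // zero flag byte selects the long form
        out.insert(out.end(), raw, raw + sizeof len);
    }
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Number of header bytes preceding the payload.
inline std::size_t header_size(const std::uint8_t *p) {
    return *p ? 1 : 1 + sizeof(std::uint32_t);
}

// Payload length stored in the header; memcpy avoids an unaligned read.
inline std::size_t decoded_size(const std::uint8_t *p) {
    if (*p) return *p;
    std::uint32_t len;
    std::memcpy(&len, p + 1, sizeof len);
    return len;
}
```

A stream exchanged between machines would additionally need a fixed byte order, which this sketch does not attempt.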
Your solution does not make sense. Think about it: you could define two independent classes, fooTrue and fooFalse, with the corresponding members and get exactly the same result.
You are probably looking for a different solution, such as inheritance. For example, fooTrue becomes baseFoo, and fooFalse becomes derivedFoo, which has baseFoo as its base and extends it with another int member.
In this case, polymorphism gives you a way to make the choice at run time.
You can't have your cake and eat it too.
The point of templates is that the specialisation happens at compile time. At run time, the size of the class is fixed (albeit, in an implementation-defined manner).
If you want the choice to be made at run time, then you can't use a mechanism that determines size at compile-time. You will need a mechanism that accommodates both possible needs. Practically, that means your base class will need to be large enough to contain all required members - which is essentially what is happening with your union based solution.
In reference to your "!!NOTE". What you are doing qualifies as premature optimisation. You are trying to optimise size of a base class without any evidence (e.g. measurement of memory usage) that the size difference is actually significant for your application (e.g. that it causes your application to exhaust available memory). The fact that something will be a base for a number of other classes is not sufficient, on its own, to worry about its size.

Structure Packing. Is there an automatic way to do it?

Question: Is there an automatic way to do structure packing?
Background:
Structure packing is very useful for reducing the memory cost of certain fundamental data. Basically, it is the trick of achieving minimum memory cost by reordering the members. My question is: is there an automatic way to do that? For example, I have a struct Foo here (assume a 32-bit platform).
struct Foo {
char flag;
char* p;
short number;
};
After an automatic check (whether by a script or not, native or not), I should get a memory-optimized version of Foo, which is:
struct Foo {
char* p;
short number;
char flag;
};
This is just a toy example. Consider the more difficult situations below, where manual reordering would be quite a lot of work:
struct has dependent struct:
struct Foo {
char* p;
short number;
MoreFoo more_foo; // How to deal with this?
char flag;
};
struct is in legacy code and you are not familiar with codebase.
you want the code to be cross-platform. Sadly enough, this trick is compiler-dependent.
I am not considering using "packed" attribute since it will lead to some performance issue.
Can __attribute__((packed)) affect the performance of a program?
In C++03 you can give the compiler permission to reorder members by placing every one inside a separate access section, e.g.:
struct Foo
{
public:
char* p;
public:
short number;
public:
MoreFoo more_foo;
public:
char flag;
};
Whether a particular compiler uses this additional flexibility or not I do not know.
This doesn't change the declaration order, it simply unlinks memory order from declaration order, so PaulMcKenzie's concern about initialization order does not apply. (And I think he overstated the concern; it is very rare for member initializers to refer to other subobjects in the first place)
This works because it causes the following rule (from 9.2) to no longer apply:
Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
Also, it's questionable whether this still works in C++11, since the wording changed from "without an intervening access-specifier" to "with the same access control":
Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified (Clause 11). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
In C programming, automatic optimization of a struct is not possible because it would go against the very way the language was designed: struct members are laid out in declaration order. C allows low-level access to the hardware; in fact, it is only a step above assembly language. It is designed for writing hardware-dependent code.
Given this, unfortunately, you can only re-order a struct manually. You would probably need to find the size of each of the struct's members, like so:
printf ("Size of char is %zu\n", sizeof (char));
printf ("Size of char* is %zu\n", sizeof (char*));
printf ("Size of short is %zu\n", sizeof (short));
printf ("Size of MoreFoo is %zu\n", sizeof (MoreFoo));
And then order the struct based on these values.
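To see the effect of such a manual reordering, one could compare the two layouts of the toy Foo directly. This is a sketch only: FooOriginal, FooReordered, padding_bytes and print_layouts are hypothetical names, and the exact numbers depend on the platform's pointer size and alignment rules.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct FooOriginal {   // declaration order from the question
    char flag;
    char *p;
    short number;
};

struct FooReordered {  // members sorted by decreasing alignment
    char *p;
    short number;
    char flag;
};

// Bytes of T that are not occupied by any of the three members, i.e. padding.
template <typename T>
constexpr std::size_t padding_bytes() {
    return sizeof(T) - (sizeof(char) + sizeof(char *) + sizeof(short));
}

// Prints the size and padding of both layouts for the current platform.
inline void print_layouts() {
    std::printf("original:  size %zu, padding %zu\n",
                sizeof(FooOriginal), padding_bytes<FooOriginal>());
    std::printf("reordered: size %zu, padding %zu\n",
                sizeof(FooReordered), padding_bytes<FooReordered>());
}
```

Sorting members by decreasing alignment is a common heuristic rather than a guaranteed optimum, so measuring with sizeof on each target platform remains the safer check.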

Can anyone explain to me why the sizeof function returns different values in below code?

Can anyone explain me why the sizeof function returns different values in the code below?
#include <iostream>
//static member
class one
{
public :
static const int a = 10;
};
//non static member
class two
{
public :
int a;
};
int main()
{
std::cout << sizeof(one); //print 1 to lcd
std::cout << sizeof(two); //print 4 to lcd,differ from size of one class
}
The first thing you should learn is that sizeof is not a function, it's an operator just like + or ||.
Then as for your question. Static member variables are not actually in the class the same way non-static member variables are, so a class with only static members would have zero size. But at the same time all objects need to be addressable, and therefore must have a non-zero size, which is why sizeof gives you 1 for the first class.
one has no non-static members, so an instance of it is empty. The static member is not contained in any object of that type, but exists independently of any objects. It has size 1, rather than zero, because C++ doesn't allow types to have size zero (in order to ensure that different objects have different addresses).
two does have a non-static member, so an instance has to be large enough to contain that member. In your case, its size is 4, the same as the size of its int member.
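These points can be checked with a few assertions. The following is a small sketch in which the helper distinct_objects_have_distinct_addresses is a hypothetical name; the exact values of sizeof are implementation-dependent, so only the relationships are asserted.

```cpp
#include <cassert>

class one {
public:
    static const int a = 10;  // static: not stored inside any object of type one
};

class two {
public:
    int a;                    // non-static: stored inside every object of type two
};

// An empty class still has a non-zero size.
static_assert(sizeof(one) >= 1, "complete objects have non-zero size");

// An object of two must be large enough to hold its int member.
static_assert(sizeof(two) >= sizeof(int), "two contains an int");

// Two separate objects of an empty class occupy different addresses,
// which is the reason the size cannot be zero.
inline bool distinct_objects_have_distinct_addresses() {
    one x;
    one y;
    return &x != &y;
}
```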
Static data members are not stored in the class itself and therefore will not contribute to the sizeof the class. We can see this by going to the draft C++ standard section 9.4.2 Static data members which says:
A static data member is not part of the subobjects of a class.
Class one has a size of 1 since complete objects shall have non-zero size, from section 9 Classes which says:
Complete objects and member subobjects of class type shall have nonzero size.
Note, that sizeof is an operator not a function.
The simple answer is that one and two are different classes with different sizes.
two contains an int, which I assume is 4 bytes on your compiler. I think you understand that part.
A static member is not present in every instance of a class; it is effectively a global variable shared between all instances of the class. As such, it is not included in the class size. This is because sizeof is commonly used to allocate memory for an object, and there is no need to allocate memory for a variable which isn't in the class instance. This is why one isn't 4 bytes.
The reason why it is 1 byte is because the C++ standard doesn't allow a class to have a size of 0 bytes, so the compiler has padded it up to a non-0 size.
In one the static variable a would not be considered in the calculation of the size of the class/object.
In two, the a would be considered, in this case equivalent to the sizeof(int).
Notes:
sizeof is an operator, not a function.
The size of a class may not be 0, hence, the size of one must be something, hence it has a size of 1.
Useful reference on the sizeof operator; http://en.cppreference.com/w/cpp/language/sizeof
Note: original question the variable was tow not two.

Why is the size of a class containing union members 1 byte?

I'm working on C++:
Following is the code snippet:
class my
{
union o
{
int i;
unsigned int j;
};
union f
{
int a;
unsigned int b;
};
};
I found the size of class "my" is 1 byte. I don't understand why the value is 1 byte.
Can anyone explain this result to me?
Your class is empty: it contains two union type declarations but no data members. It's usual for empty classes to have a non-zero size, as explained here
You will find that my::o and my::f have a size appropriate for their content (typically 4 bytes). But since neither my::f nor my::o is actually part of your class as such, just declared in the class, they do not take up space within the class.
A similar example would be:
class other
{
typedef char mytype[1024];
};
now, other::mytype would be 1024 bytes long, but other would still not contain a member of mytype.
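The difference between declaring a nested type and declaring a member of that type can be made visible with sizeof. In this sketch, types_only and with_members are hypothetical names, and struct is used instead of class only so that the nested types are accessible from outside.

```cpp
#include <cassert>

// Declares two union types but no data members, like class my above.
struct types_only {
    union o { int i; unsigned int j; };
    union f { int a; unsigned int b; };
};

// Declares the same union types and also one data member of each.
struct with_members {
    union o { int i; unsigned int j; } first_member;
    union f { int a; unsigned int b; } second_member;
};

// Each nested union is large enough for its largest member.
static_assert(sizeof(types_only::o) >= sizeof(int), "union holds an int");

// Only actual data members contribute to the size of the enclosing class.
static_assert(sizeof(with_members) >= 2 * sizeof(int), "two union members stored");
static_assert(sizeof(types_only) < sizeof(with_members), "type declarations take no space");
```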
A union is a user-defined data or class type that, at any given time, contains only one object from its list of members. So don't expect the class to allocate 8 bytes for each union.
You have only defined the union types; you did not declare any members of those types, so no space is allocated for them. Since every complete object must have a non-zero size, the empty class has a size of 1 byte.
@RithchiHindle the answer still exists in the same link.
Although your class declares two union types, it has no data members of those types, so even after you create an object of the class and check its size you will still get 1. If you declare a member of each union type inside the class, you will get a size of 8 bytes (with 4-byte int).

Use the right alignment for a buffer which is supposed to hold a struct in C++

Suppose we have some struct, say
struct S
{
double a, b;
~S(); // S doesn't have to be POD
};
Such a struct should typically have an alignment of 8, as the size of its largest contained type is 8.
Now imagine we want to declare a placeholder struct to hold the value of S:
struct Placeholder
{
char bytes[ sizeof( S ) ];
};
Now we want to place it inside of another class:
class User
{
char someChar;
Placeholder holder;
public:
// Don't mind that this is hacky -- this just shows a possible use but
// that's not the point of the question
User() { new ( holder.bytes ) S; }
~User() { ( ( S * )( holder.bytes ) )->~S(); }
};
Problem is, Placeholder is now aligned incorrectly within User. Since the compiler knows that Placeholder is made of chars, not doubles, it would typically use an alignment of 1.
Is there a way to declare Placeholder with the alignment matching that of S in C++03? Note that S is not a POD type. I also understand C++11 has alignas, but this is not universally available yet, so I'd rather not count on it if possible.
Update: just to clarify, this should work for any S - we don't know what it contains.
You can use a union, if you can make S conform to the requirements of being a member of a union*.
A union is guaranteed to have enough storage for its largest member, and to be aligned for its most restrictive member. So if we make the placeholder a union of both the raw char buffer and all the types that will actually be stored there, you will have sufficient size and correct alignment.
We will never access the members of the union other than the storage itself. They are present only for alignment.
Something along these lines:
struct Placeholder
{
union
{
char bytes [sizeof(S)];
double alignDouble;
};
};
"requirements of being a member of a union" : Members of unions cannot have: non-trivial constructor, non-trivial copy constructor, non-trivial destructor, non-trivial copy-assignment operator.
I believe that boost::aligned_storage may be exactly what you're looking for. It uses the union trick in such a way that your type doesn't matter (you just use sizeof(YourType) to tell it how to align) to make sure the alignment works out properly.
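For comparison, where C++11 is available the alignas specifier mentioned in the question expresses the same thing directly. The sketch below is not a C++03 solution; the value() and storage_is_aligned() helpers are hypothetical additions, and the reinterpret_cast access follows the question's own style (stricter code might keep the pointer returned by placement new instead).

```cpp
#include <cassert>
#include <cstdint>
#include <new>

struct S {
    double a, b;
    ~S() {}  // user-provided destructor, so S is not a POD
};

// Raw storage with the size and alignment of S (C++11 alignas).
struct Placeholder {
    alignas(S) unsigned char bytes[sizeof(S)];
};

static_assert(alignof(Placeholder) == alignof(S), "placeholder inherits S's alignment");

class User {
    char someChar;
    Placeholder holder;
public:
    User() : someChar(0) { new (holder.bytes) S(); }  // construct S in place
    ~User() { value().~S(); }                         // destroy it explicitly
    User(const User &) = delete;
    User &operator=(const User &) = delete;

    S &value() { return *reinterpret_cast<S *>(holder.bytes); }

    // True when the buffer address is a multiple of S's alignment.
    bool storage_is_aligned() const {
        return reinterpret_cast<std::uintptr_t>(holder.bytes) % alignof(S) == 0;
    }
};
```

Because the alignment is attached to the array member, the compiler inserts the necessary padding after someChar without any help from the union trick.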