Does fixed array from derived extend the fixed array from base? - c++

Assuming I have the following:
struct A
{
unsigned x,y;
char b[4];
};
template <unsigned N> struct B : public A
{
static constexpr unsigned L = N + sizeof(A::b);
char e[N];
};
Should I assume that the static array from B will be appended to the static array from A? Such that I could treat the array from A as having a size that would also include the array from B.
For example, the following would output more than 4 bytes:
using T = B<60>;
T o;
snprintf(o.b, T::L, "more than 4 bytes");
puts(o.e);
Which it does. But I'm not an expert in how a more complex compiler actually decides the layout of structures, or in what order it might arrange those types in memory, depending on the requested optimizations.
Which is why I'm asking if this might have unexpected results. And if so, under what circumstances. And what should I expect?
Leaving aside the warnings given by the compiler for "out of range access" (if any).
Also, this is not the actual use case. But rather an example to better describe my question.

Behavior is undefined by the C++ standard (and not just because of alignment). However, on mainstream compilers, IF you add __attribute__((packed)) to both structures (packed is a gcc thing, but other compilers have analogs), then it should work. Your code will depend on compiler implementation details though, so you'll need a static assertion to safeguard against breakage:
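// Needs <cstddef> for offsetof. A must be declared packed as well.
// Dummy class B_ is used in the static assertion inside B, since the
// value of offsetof(B, e) is not yet defined at that point.
template <unsigned N> struct B_ : public A
{
char e[sizeof(A::b) + N];
} __attribute__((packed));
template <unsigned N> struct B : public A
{
// offsetof on a non-standard-layout type is only conditionally supported;
// gcc accepts it but may warn (-Winvalid-offsetof).
static_assert(offsetof(B_<N>, b) + sizeof(A::b) == offsetof(B_<N>, e),
"e must start right after A::b");
char e[sizeof(A::b) + N];
} __attribute__((packed));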
But it's all so ugly. I recommend asking a question closer to your actual code; likely there are easier ways to accomplish what you want.
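One such easier way, if the base class does not really need its own 4-byte array, is to keep a single array whose size is computed at compile time. This is only a sketch under that assumption: the member name text and the helper fill are made up for the illustration and are not from the question.

```cpp
#include <cstdio>

struct A {
    unsigned x, y;
};

// Hypothetical restructuring: the 4 bytes that used to be A::b are folded
// into one array, so every write stays inside a single object.
template <unsigned N>
struct B : public A {
    static constexpr unsigned L = N + 4;
    char text[L];
};

// Hypothetical helper: formats s into the buffer, truncating to L - 1
// characters if needed, and returns what snprintf returns.
template <unsigned N>
int fill(B<N>& o, const char* s) {
    return std::snprintf(o.text, B<N>::L, "%s", s);
}
```

With B<60> o; a call such as fill(o, "more than 4 bytes") never leaves the bounds of o.text, so no packing attribute or layout assertion is needed.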

Related

How to zero array members when my compiler isn't standard conform

My compiler (C++Builder6) syntactically allows array member initialization (at least with zero), but actually it doesn't really do it. So the assert in the example given below fails depending on the context.
#include <assert.h>
struct TT {
char b[8];
TT(): b() {}
};
void testIt() {
TT t;
assert(t.b[7] == 0);
}
Changing the compiler isn't an option at the moment. My question is: what will be the best way to "repair" this flaw with respect to future portability and standard conformance?
Edit:
As it turns out, my first example was too short. It missed the point that the fill level of the array is so essential that it has to be stored very close to the array, that is, in the same class.
Even if the original problem remains, my actual problem pattern is usually this:
struct TT2 {
int size;
char data[8];
// ... some more elements
TT2(): size(0), data() {}
// ... some more methods
};
I think you may use this:
TT() { std::fill(b, b + 8, char()); }
This way you will solve your problem while nothing is wrong with portability and standard conformance!
You may use fill_n like suggested in:
C/C++ initialization of a normal array with one default value
If no fill_n is available, you can always use memset like:
TT() {memset(b, 0, sizeof b);}
I would like to add to the previous posts that if you are using a character array as a string, then it is enough to write in the constructor
TT() { b[0] = '\0'; }
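Applied to the TT2 pattern from the question, the memset variant could look like the following minimal sketch. It assumes only the two members the question shows (int size and char data[8]) and sticks to C++98 constructs, since the compiler in question is old.

```cpp
#include <cstring>

struct TT2 {
    int size;
    char data[8];

    // Initialise size in the initializer list and zero the array explicitly
    // in the body, instead of relying on data() being honoured.
    TT2() : size(0) {
        std::memset(data, 0, sizeof data);
    }
};
```

Because the zeroing happens in the constructor body, it does not depend on how the compiler treats data() in the member initializer list.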

Binary serialization of variable length data and zero length arrays, is it safe?

I did some research but cannot find a definite approval or disapproval.
What I want is, a fixed size structure + variable length part, so that serialization can be expressed in simple and less error prone way.
struct serialized_data
{
int len;
int type;
char variable_length_text[0];
};
And then:
serialized_data *buff = (serialized_data*)malloc(sizeof(serialized_data)+5);
buff->len=5;
buff->type=1;
memcpy(buff->variable_length_text, "abcd", 5);
Unfortunately I can't find out whether MSVC, GCC, Clang etc. are OK with it.
Maybe there is a better way to achieve the same?
I really don't want those ugly casts all around:
memcpy((char*)(((char*)buff)+sizeof(serialized_data)), "abcd", 5);
This program is using a zero length array. This is not C but a GNU extension.
http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
A common idiom in C89, called the struct hack, was to use:
struct serialized_data
{
int len;
int type;
char variable_length_text[1];
};
Unfortunately its common use as a flexible array is not strictly conforming.
C99 comes with something similar to perform the same task: a feature called the flexible array member.
Here is an example right from the Standard (C99, 6.7.2.1p17)
struct s { int n; double d[]; };
int m = 12; // some value
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
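Flexible array members are a C99 feature and not part of standard C++, so in C++ one alternative is to keep the fixed-size header as its own struct and copy header and text into a byte buffer. The following is a sketch under that assumption; serialized_header and serialize are names invented for the illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical fixed-size part of the record.
struct serialized_header {
    int len;
    int type;
};

// Copies the header followed by len bytes of text into one contiguous buffer.
inline std::vector<char> serialize(int type, const char* text, std::size_t len) {
    serialized_header h;
    h.len = static_cast<int>(len);
    h.type = type;

    std::vector<char> out(sizeof h + len);
    std::memcpy(&out[0], &h, sizeof h);
    if (len > 0) {
        std::memcpy(&out[sizeof h], text, len);
    }
    return out;
}
```

A call such as serialize(1, "abcd", 5) yields sizeof(serialized_header) + 5 bytes, with the text starting right after the header and no casts at the call site.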

Best way to initialize a statically initialized per-struct character buffer?

Continuing from Absolute fastest (and hopefully elegant) way to return a certain char buffer given a struct type I want to now initialize once each static character buf per struct individually.
Ie, for:
#pragma pack(push, 1)
struct Header {
int a;
int b;
char c;
};
struct X {
int x;
int y;
};
struct Y {
char someStr[20];
};
struct Msg {
Header hdr;
union {
X x;
Y y;
};
};
#pragma pack(pop)
We have:
template<typename T>
struct Buffer {
static char buffer[sizeof(T)];
};
template<class T>
inline char* get_buffer() {
return Buffer<T>::buffer;
}
The two things I'm looking for are:
There are exactly 2 buffers: 1 for X and one for Y. They should each be the length of sizeof(Msg.hdr) + sizeof(Msg.x) and sizeof(Msg.hdr) + sizeof(Msg.y), respectively.
Each buffer will be retrieved a lot during the application lifetime and only some fields really (or need to) change.
2a. Msg for X backed by its char buffer should be initialized to m.hdr.a = 1, m.hdr.b = 0; and for Msg Y it should be m.hdr.a = 16; m.hdr.b = 1; as an example.
The app will frequently fetch these buffers as type Msg backed by either X or Y (the app would know which one) and then change x and y or someStr only and then output it to the file for example then repeat.
Just wondering what nice way builds on these great examples by @6502 and @Fred Nurk to elegantly initialize these 2 buffers while being human readable. I'd prefer to keep using structs and to limit the use of reinterpret_cast<>() as much as possible, as there may be aliasing issues that might develop.
Please let me know if I'm not clear and I will do my best to answer any questions and/or edit this question description.
Thanks.
*** Update: my usage pattern of these buffers is that I will be copying the char* out to a stream or file, hence I need to get a char* pointer to the underlying data. However, I need to work on the char buffers via their structs for readability and convenience. Also, this char buffer should be decoupled and not necessarily contained in or "attached" to the struct, as the structs are pretty much in separate files and used elsewhere where the buffers are not needed/wanted. Would just doing a simple static X x; static Y y; suffice, or would buffers of length Header + X for X's Msg buffer be better, and then somehow just keep a char* reference to each Msg for X and Y? Will I potentially run into aliasing issues?
If you were writing it in C, you could look into a fairly common C compiler extension called "cast to a union type", but in C++ it is no longer present.
In C++ there is no way around reinterpret_cast<> for what you require, but at least you can do it fairly safely by calculating the member offset on a NULL pointer cast to the union, and then subtracting this offset from your data pointer before casting it to the union. I believe that on most compilers the offset will be 0, but it is better to be on the safe side.
template<class T>
union Aligner {
T t;
char buffer[sizeof(T)];
};
template<class T>
inline char* get_buffer(T* pt) {
return reinterpret_cast<Aligner<T>*>(reinterpret_cast<char*>(pt) - reinterpret_cast<ptrdiff_t>(&reinterpret_cast<Aligner<T>*>(NULL)->t))->buffer;
}
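A different arrangement, sketched here under stated assumptions, is to keep one statically stored Msg per payload type and only view it as char* when writing it out; inspecting an object through a char* is allowed by the aliasing rules. The names InitialHeader, make_msg, get_msg and buffer_size are invented for this illustration, and the header values are the ones given as an example in the question.

```cpp
#include <cstddef>

#pragma pack(push, 1)
struct Header { int a; int b; char c; };
struct X { int x; int y; };
struct Y { char someStr[20]; };
struct Msg {
    Header hdr;
    union { X x; Y y; };
};
#pragma pack(pop)

// Hypothetical trait: the initial header for each payload type.
template <class T> struct InitialHeader;
template <> struct InitialHeader<X> {
    static Header get() { Header h = {1, 0, 0}; return h; }
};
template <> struct InitialHeader<Y> {
    static Header get() { Header h = {16, 1, 0}; return h; }
};

template <class T>
Msg make_msg() {
    Msg m = Msg();                      // value-initialise the message
    m.hdr = InitialHeader<T>::get();    // then apply the per-type header
    return m;
}

// One Msg per payload type, initialised once on first use.
template <class T>
Msg& get_msg() {
    static Msg m = make_msg<T>();
    return m;
}

// Byte view of the same object, for writing to a stream or file.
template <class T>
char* get_buffer() {
    return reinterpret_cast<char*>(&get_msg<T>());
}

// Number of meaningful bytes for payload type T.
template <class T>
std::size_t buffer_size() {
    return sizeof(Header) + sizeof(T);
}
```

The application would edit get_msg<X>().x or get_msg<Y>().y through the struct and then write buffer_size<T>() bytes starting at get_buffer<T>().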

Actual total size of struct's members

I must write array of struct Data to hard disk:
struct Data {
char cmember;
/* padding bytes */
int imember;
};
AFAIK, most compilers will add some padding bytes between the cmember and imember members of Data, but I want to save only the actual data to the file (without padding).
I have the following code for saving an array of Data (into a buffer instead of a file, for simplicity):
bool saveData(Data* data, int dataLen, char* targetBuff, int buffLen)
{
int actualLen = sizeof(char) + sizeof(int); // this code force us to know internal
// representation of Data structure
int actualTotalLen = dataLen * actualLen;
if(actualTotalLen > buffLen) {
return false;
}
for(int i = 0; i < dataLen; i++) {
memcpy(targetBuff, &data[i].cmember, sizeof(char));
targetBuff += sizeof(char);
memcpy(targetBuff, &data[i].imember, sizeof(int));
targetBuff += sizeof(int);
}
return true;
}
As you can see, I calculate actual size of Data struct with the code: int actualLen = sizeof(char) + sizeof(int). Is there any alternative to this ? (something like int actualLen = actualSizeof(Data))
P.S. this is synthetic example, but I think you understand idea of my question...
Just save each member of the struct one at a time. If you overload << to write a variable to a file, you can have
myfile << mystruct.member1 << mystruct.member2;
Then you could even overload << to take an entire struct, and do that inside the struct's operator<<, so in the end you have:
myfile << mystruct;
Resulting in save code that looks like:
myfile << count;
for (int i = 0; i < count; ++i)
myfile << data[i];
IMO all that fiddling about with memory addresses and memcpy is too much of a headache when you could do it this way. This general technique is called serialization - hit google for more, it's a well-developed area.
You will have to pack your structure.
The way to do that changes depending on the compiler you are using.
For visual c++:
#pragma pack(push)
#pragma pack(1)
struct PackedStruct {
/* members */
};
#pragma pack(pop)
This will tell the compiler to not pad members in the structure and restore the pack parameter to its initial value. Be aware that this will affect performance. If this structure is used in critical code, you might want to copy the unpacked structure into a packed structure.
Also, resist the temptation to use the command line parameter that totally disables padding; this will greatly affect performance.
IIUC, you are trying to copy the values of the structure members rather than the structure as a whole and store it to disk. Your approach looks good to me. I do not agree with those suggesting #pragma pack -- since they will help you get a packed structure at runtime.
Few notes:
sizeof(char) == 1, always, by definition
use the offsetof() macro
do not try to instantiate a Data object directly from this targetBuff (i.e. via casting) -- this is when you get into alignment issues and trip. Instead, copy the members out as you did while writing the buffer and you should not have issues
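The last note can be sketched as a counterpart to saveData. The function name loadData and its parameter order are assumptions made for this illustration; it reads each member back with memcpy instead of casting the buffer to Data*.

```cpp
#include <cstring>

struct Data {
    char cmember;
    int imember;
};

// Hypothetical inverse of saveData: copies the unpadded fields out of
// sourceBuff into properly aligned Data objects, one member at a time.
inline bool loadData(const char* sourceBuff, int buffLen, Data* data, int dataLen) {
    const int actualLen = static_cast<int>(sizeof(char) + sizeof(int));
    if (dataLen * actualLen > buffLen) {
        return false;   // the buffer does not hold dataLen records
    }
    for (int i = 0; i < dataLen; i++) {
        std::memcpy(&data[i].cmember, sourceBuff, sizeof(char));
        sourceBuff += sizeof(char);
        std::memcpy(&data[i].imember, sourceBuff, sizeof(int));
        sourceBuff += sizeof(int);
    }
    return true;
}
```

Since every int is copied into an aligned destination, the unaligned position of the value inside the buffer does not matter.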
There is not an easy solution to this problem. You can usually create separate structures and tell the compiler to pack them tightly, something like:
/* GNU has attributes */
struct PackedData {
char cmember;
int imember;
} __attribute__((packed));
or:
/* MSVC has headers and #pragmas */
#include <pshpack1.h>
struct PackedData {
char cmember;
int imember;
};
#include <poppack.h>
Then you have to write code that transforms your unpacked structures into packed structures and vice-versa. If you are using C++, you can create template helper functions that are predicated on the structure type and then specialize them:
template <typename T>
std::ostream& encode_to_stream(std::ostream& os, T const& object) {
return os.write((char const*)&object, sizeof(object));
}
template <typename T>
std::istream& decode_from_stream(std::istream& is, T& object) {
return is.read((char*)&object, sizeof(object));
}
template<>
std::ostream& encode_to_stream<Data>(std::ostream& os, Data const& object) {
encode_to_stream<char>(os, object.cmember);
encode_to_stream<int>(os, object.imember);
return os;
}
template <>
std::istream& decode_from_stream<Data>(std::istream& is, Data& object) {
decode_from_stream<char>(is, object.cmember);
decode_from_stream<int>(is, object.imember);
return is;
}
The bonus is that the defaults will read and write POD objects including the padding. You can specialize as necessary to optimize your storage. However, you probably want to consider endianness, versioning, and other binary storage issues as well. It might be prudent to simply write an archival class that wraps your storage and provides methods for serialization and deserialization of primitives and then an open ended method that you can specialize as needed:
class Archive {
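protected:
typedef unsigned char byte;
std::ofstream m_fstream; // underlying output stream (needs <fstream>); assumed, not shown in the original
void writeBytes(byte const* byte_ptr, std::size_t byte_size) {
m_fstream.write((char const*)byte_ptr, byte_size);
}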
public:
template <typename T>
void writePOD(T const& pod) {
writeBytes((byte const*)&pod, sizeof(pod));
}
// Users are required to specialize this to use it. If it is used
// for a type that it is not specialized for, a link error will occur.
template <typename T> void serializeObject(T const& obj);
};
template<>
void Archive::serializeObject<Data>(Data const& obj) {
writePOD(obj.cmember);
writePOD(obj.imember);
}
This is the approach that I have always ended up at after a bunch of perturbations in between. It is nicely extensible without requiring inheritance and gives you the flexibility to change your underlying data storage format as needed. You can even specialize writePOD to do different things for different underlying data types like ensuring that multibyte integers are written in network order or whatnot.
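As a sketch of the network-order idea mentioned above, a 32-bit value can be split into bytes with shifts, which gives the same byte sequence regardless of the host's endianness. The helper names encode_u32_be and decode_u32_be are invented for this illustration; a writePOD specialization could call something like the encoder before handing the bytes to writeBytes.

```cpp
#include <cstdint>

// Hypothetical helper: stores v in out[0..3], most significant byte first.
inline void encode_u32_be(std::uint32_t v, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    out[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    out[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    out[3] = static_cast<unsigned char>(v & 0xFF);
}

// Hypothetical inverse: rebuilds the value from four big-endian bytes.
inline std::uint32_t decode_u32_be(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}
```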
Don't know if this will help you, but I'm in the habit of ordering the members of the structs that I intend to write to files (or send over networks) so they have as little padding as possible. This is done by putting the members with the widest datatypes and strictest alignment first:
• pointers first
• double
• long long
• long
• float
• int
• short
• char
• bitfields last
Any padding added by the compiler will come at the end of the struct data.
In other words, you could simplify your problem by eliminating the padding (if possible) by reordering the struct members:
struct Data
{
int imember;
char cmember;
/* padding bytes here */
};
Obviously this won't solve your problem if you can't reorder the struct members (because it's used by a third-party API or because you need the initial members to have specific datatypes).
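The effect of the reordering can be checked with offsetof. This is a small sketch; the struct name Reordered and the constant reordered_payload are invented for the illustration, and the exact amount of padding depends on the compiler and platform.

```cpp
#include <cstddef>

// Original order: padding usually sits between cmember and imember.
struct Data {
    char cmember;
    int imember;
};

// Reordered: the members are adjacent and any padding sits after cmember.
struct Reordered {
    int imember;
    char cmember;
};

// Bytes from the start of Reordered up to the end of its last member,
// i.e. the part worth writing if the tail padding is to be skipped.
const std::size_t reordered_payload =
    offsetof(Reordered, cmember) + sizeof(char);
```

On typical platforms reordered_payload equals sizeof(int) + sizeof(char), while sizeof(Reordered) is still rounded up to a multiple of the alignment of int.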
I would say that you are actually looking for serialization.
There are a number of frameworks for serialization, but I personally prefer Google Protocol Buffers over Boost.Serialization and other approaches.
Protocol Buffers has versioning and binary/human readable output.
If you are concerned about size, you always have the possibility of compressing the data. There are lightning fast compression algorithms like LZW which offer a good speed/compression ratio, for example.
Look into the #pragma pack directive for your compiler. Some compilers use #pragma options align=packed or something similar.
As you can see, I calculate actual size of Data struct with the code: int actualLen = sizeof(char) + sizeof(int). Is there any alternative to this ?
No, not in standard C++.
Your compiler might provide a compiler-specific option, though. Packed structs as shown by Graeme and Coincoin might do.
If you don't want to use pragma pack, try to manually re-order the variables,
like
struct Data {
int imember;
char cmember;
};
You told @Coincoin that you cannot pack. If you just need the size for some reason, here is a dirty solution:
#define STRUCT_ELEMENTS char cmember;/* padding bytes */ int imember;
typedef struct
{
STRUCT_ELEMENTS
}paddedData;
#pragma pack(push)
#pragma pack(1)
typedef struct
{
STRUCT_ELEMENTS
}packedData;
#pragma pack(pop)
now you have size of both;
sizeof(packedData);
sizeof(paddedData);
The only reason I can think of why you cannot pack is that you are linking this to another program. In that case you will need to pack your structure and then unpack it when working with the external program.
No, there is no way within the language proper to get this information. One way to approach a solution is to define your data classes indirectly, using some feature of the language - it could be as old-fashioned as macros and the preprocessor, or as new-fangled as tuple templates. You need something which lets you iterate over the class members systematically.
Here's a macro based approach:
#undef Data_MEMBERS
#define Data_MEMBERS(Data_OP) \
Data_OP(c, char) \
Data_OP(i, int)
#undef Data_CLASS_DEFINITION
#define Data_CLASS_DEFINITION(name, type) \
type name##member;
struct Data {
Data_MEMBERS(Data_CLASS_DEFINITION)
};
#define Data_SERIAL_SIZER(name, type) \
sizeof(type) +
#define Data_Serial_Size \
(Data_MEMBERS(Data_SERIAL_SIZER) 0)
And so forth.
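To illustrate what "and so forth" could mean, the same member list can also generate a serializer. The following sketch repeats the macros from above so that it is self-contained; Data_SERIAL_WRITER and serialize are names invented for this illustration.

```cpp
#include <cstddef>
#include <cstring>

#define Data_MEMBERS(Data_OP) \
    Data_OP(c, char)          \
    Data_OP(i, int)

#define Data_CLASS_DEFINITION(name, type) type name##member;
struct Data {
    Data_MEMBERS(Data_CLASS_DEFINITION)
};

#define Data_SERIAL_SIZER(name, type) sizeof(type) +
#define Data_Serial_Size (Data_MEMBERS(Data_SERIAL_SIZER) 0)

// Hypothetical extra operation: copy one member to out and advance out.
#define Data_SERIAL_WRITER(name, type)               \
    std::memcpy(out, &d.name##member, sizeof(type)); \
    out += sizeof(type);

// Writes every member without padding; returns one past the last byte written.
inline char* serialize(const Data& d, char* out) {
    Data_MEMBERS(Data_SERIAL_WRITER)
    return out;
}
```

Adding a member to Data_MEMBERS then updates the struct definition, the size computation and the serializer together.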
If you can rewrite the struct definition, you could try to use bit-field specifiers to get rid of the holes, like so (note that the widths are given in bits, assuming an 8-bit char and a 32-bit int):
struct Data {
char cmember : 8;
int imember : 32;
};
Sadly, this does not guarantee that it still won't place imember 4 bytes after the start of cmember. But many compilers will get the idea and do it anyway.
Other alternatives:
Reorder your members by size (largest first). This is an old embedded world trick to minimize holes.
Use Ada instead.
The code
type Data is record
cmember : character;
imember : integer;
end record;
for Data use record
cmember at 0 range 0..7;
imember at 1 range 0..31;
end record;
Does exactly what you want.

Template Metaprogramming - Difference Between Using Enum Hack and Static Const

I'm wondering what the difference is between using a static const and an enum hack when using template metaprogramming techniques.
EX: (Fibonacci via TMP)
template< int n > struct TMPFib {
static const int val =
TMPFib< n-1 >::val + TMPFib< n-2 >::val;
};
template<> struct TMPFib< 1 > {
static const int val = 1;
};
template<> struct TMPFib< 0 > {
static const int val = 0;
};
vs.
template< int n > struct TMPFib {
enum {
val = TMPFib< n-1 >::val + TMPFib< n-2 >::val
};
};
template<> struct TMPFib< 1 > {
enum { val = 1 };
};
template<> struct TMPFib< 0 > {
enum { val = 0 };
};
Why use one over the other? I've read that the enum hack was used before static const was supported inside classes, but why use it now?
Enums aren't lvalues; static member values are, and if one is passed by reference the static member will be instantiated:
void f(const int&);
f(TMPFib<1>::val);
If you want to do pure compile time calculations etc. this is an undesired side-effect.
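The difference can be sketched with two variants of the Fibonacci template. The names FibEnum, FibConst and by_ref are invented for this illustration. Binding the enumerator to a const int& just creates a temporary, while binding the static const member requires that the member has a definition.

```cpp
// Enum variant: val is a pure compile-time value with no storage.
template <int n> struct FibEnum {
    enum { val = FibEnum<n - 1>::val + FibEnum<n - 2>::val };
};
template <> struct FibEnum<1> { enum { val = 1 }; };
template <> struct FibEnum<0> { enum { val = 0 }; };

// Static const variant: val is an object that may need a definition.
template <int n> struct FibConst {
    static const int val = FibConst<n - 1>::val + FibConst<n - 2>::val;
};
template <> struct FibConst<1> { static const int val = 1; };
template <> struct FibConst<0> { static const int val = 0; };

// Out-of-class definition for the primary template. Without it, binding
// FibConst<n>::val to a reference can fail at link time.
template <int n> const int FibConst<n>::val;

inline int by_ref(const int& x) { return x; }
```

Both by_ref(FibEnum<10>::val) and by_ref(FibConst<10>::val) compile here, but only the second one forces the compiler to instantiate and emit a static data member.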
The main historic difference is that enums also work for compilers where in-class-initialization of member values is not supported, this should be fixed in most compilers now.
There may also be differences in compilation speed between enum and static consts.
There are some details in the boost coding guidelines and an older thread in the boost archives regarding the subject.
For some the former one may seem less of a hack, and more natural. Also it has memory allocated for itself if you use the class, so you can for example take the address of val.
The latter is better supported by some older compilers.
On the flip side of @Georg's answer, when a structure that contains a static const variable is defined in a specialized template, it needs to be declared in source so the linker can find it and actually give it an address to be referenced by. This may unnecessarily (depending on desired effects) cause inelegant code, especially if you're trying to create a header-only library. You could solve it by converting the values to functions that return the value, which could open up the templates to run-time info as well.
The "enum hack" is more constrained and close enough to #define: the enum is initialised once, it's not legal to take the address of an enum anywhere in the program, and it's typically not legal to take the address of a #define either. If you don't want to let people get a pointer or reference to one of your integral constants, an enum is a good way to enforce that constraint. As for how this applies to TMP: during recursion, each instantiation has its own copy of the enum { val = ... }, and each of those val has its proper place in the recursion. As @Kornel Kisielewicz mentioned, the "enum hack" is also supported by older compilers that forbid the in-class specification of initial values for static const members.