sizeof and types, guarantees - c++

I cannot find a proof/disproof that the following code snippet has no design flaws, speaking about the correctness.
template <class Item>
class MyDirtyPool {
public:
template<typename ... Args>
std::size_t add(Args &&... args) {
if (!m_deletedComponentsIndexes.empty()) {
//reuse old block
size_t positionIndex = m_deletedComponentsIndexes.back();
m_deletedComponentsIndexes.pop_back();
void *block = static_cast<void *> (&m_memoryBlock[positionIndex]);
new(block) Item(std::forward<Args>(args) ...);
return positionIndex;
} else {
//not found, add new block
m_memoryBlock.emplace_back();
void *block = static_cast<void *> (&m_memoryBlock.back());
new(block) Item(std::forward<Args>(args) ...);
return m_memoryBlock.size() - 1;
}
}
//...all the other methods omitted
private:
struct Chunk {
char mem[sizeof(Item)]; //is this sane?
};
std::vector<Chunk> m_memoryBlock; //and this one too, safe?
std::deque<std::size_t> m_deletedComponentsIndexes;
};
I am concerned about all the stuff with Chunk, which is used here in essence as a bag of memory having the same size as the supplied type.
I don't want to explicitly create Item objects in the m_memoryBlock, therefore I need some kind of "chunk of memory having enough of space".
Can I be sure that Chunk will have the same size as the supplied type? Please provide some examples where this assumption won't work.
If I am totally wrong in my assumptions, how should I deal with it?

In such designs the memory must be suitably aligned for the type of objects you would like to create there.
Standard built-in types normally have natural alignment, which equals to their sizeof, i.e. sizeof(T) == alignof(T).
The alignment of char array is 1 byte, which is insufficient for anything else.
One simple way to enforce alignment is to use std::max_align_t like this:
union Chunk
{
std::max_align_t max_align;
char mem[sizeof(Item)];
};
That will make Chunk::mem suitably aligned for any standard built-in type.
Another way is to use the exact alignment of the type you would like to place in that char array using C++11 alignas keyword:
struct Chunk
{
alignas(Item) char mem[sizeof(Item)];
};
And this is exactly what std::aligned_storage does for you.
However, that would require exposing the definition of Item, which may be inconvenient or undesirable in some cases.
For example, this method could be used as an optimization for Pimpl idiom to avoid a memory allocation, however, that would require exposing the definition of the implementation in the header file to get its size and alignment, therefore defeating the purpose of Pimpl. See The Fast Pimpl Idiom for more details.
Another detail, is that if you would like to copy/move Chunks and expect the stored objects to remain valid, those stored objects must be trivially copyable, e.g.
static_assert(std::is_trivially_copyable<Item>::value,
"Item cannot be copied byte-wise");
typedef std::aligned_storage<sizeof(Item), alignof(Item)> Chunk;
For a memory pool std::vector<Chunk> is not a good choice because on reallocation when the vector grows all pointers and references to the objects stored in the pool get invalidated.
The objects should not move in a general memory pool.

Related

Idiomatic way to handle T[]-like objects in C++

I am using some C-library in my C++ code. The library wants me to allocate some amount of the memory and pass the pointer to the library. Unfortunately, the exact required memory size is not known in advance, so the library also requires me to provide C-callback with the following signature:
void* callback_realloc(void* ptr, size_t new_size);
where ptr is previously passed memory and new_size is required size, and the callback must return the pointer to newly allocated memory. There is no direct way to store the allocator state. Instead, I need to rely on pointer arithmetic somehow as the following:
template<class T>
struct o_s {
std::aligned_storage_t<sizeof(T), alignof(T)> data;
};
template<class Alloc>
struct o_i: private Alloc {
std::size_t allocated_size;
const Alloc& get_allocator() const { return *this; }
};
template<class T>
struct o: public o_i, public o_s<T> {
void* ptr() {
return &data;
}
// Additionally, override class operator new and operator delete...
};
void* my_callback(void* ptr, size_t new_size) {
auto meta = static_cast<o*>(reinterpret_cast<o_s<char>*>(ptr));
// access to the allocator state ...
}
Then sizeof(o_i) + initial_size memory is allocated and ptr() is passed to the C library.
At this point, I understand that I am not the first person in the world who needs this pattern. Unfortunately (and surprisingly) I have not found anything suitable for this in Boost or STL. I would like to use ready implementation to avoid possible underwater rocks.
The simplest solution is to allocate the memory using std::malloc, and use std::realloc as the callback. C API such as this are the case where using those makes sense in C++.
You don't necessarily have to use std::malloc however. You can implement a custom allocator if you want to. Using some of the allocated storage is one way of storing allocation metadata, and it's an efficient way. That's not necessary either though, since you can also store the metadata separately in a map-like structure.
I don't think you'd find an idiomatic way to do this in C++, as this is a C idiom.
Conceptually, you have two ways of going about this:
Store the metadata alongside the buffer you've allocated (as you seem to be doing). I feel using double inheritance etc. is a bit overkill here, when you could just allocate a char buffer of sizeof(size_t)+allocation_size and use the first part for your metadata.
Allocate and return to the API a raw buffer as needed, and use a separate static data structure to manage this with a map of ptr->allocation size. I suppose this is what malloc/realloc is doing behind the scenes anyway.
Both are valid, and the one you chose depends on the specific details of your application. When dealing with memory it's ok to actually deal with memory, be it pointer arithmetic or what not.
The important thing is probably to keep the ugly bit to a single location, and provide a C++ style API to this on the C++ side of things, so that the client code isn't exposed to implementation details.

Allocate a struct containing a string in a single allocation

I'm working on a program that stores a vital data structure as an unstructured string with program-defined delimiters (so we need to walk the string and extract the information we need as we go) and we'd like to convert it to a more structured data type.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself. The length of the string will always be known at allocation time. We've determined through testing that doubling the number of allocations required for each of these data types is an unnacceptable cost. Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation? If we were using cstrings I'd just have a char * in the struct and point it to the end of the struct after allocating a block big enough for the struct and string, but we'd prefer std::string if possible.
Most of my experience is with C, so please forgive any C++ ignorance displayed here.
If you have such rigorous memory needs, then you're going to have to abandon std::string.
The best alternative is to find or write an implementation of basic_string_ref (a proposal for the next C++ standard library), which is really just a char* coupled with a size. But it has all of the (non-mutating) functions of std::basic_string. Then you use a factory function to allocate the memory you need (your struct size + string data), and then use placement new to initialize the basic_string_ref.
Of course, you'll also need a custom deletion function, since you can't just pass the pointer to "delete".
Given the previously linked to implementation of basic_string_ref (and its associated typedefs, string_ref), here's a factory constructor/destructor, for some type T that needs to have a string on it:
template<typename T> T *Create(..., const char *theString, size_t lenstr)
{
char *memory = new char[sizeof(T) + lenstr + 1];
memcpy(memory + sizeof(T), theString, lenstr);
try
{
return new(memory) T(..., string_ref(theString, lenstr);
}
catch(...)
{
delete[] memory;
throw;
}
}
template<typename T> T *Create(..., const std::string & theString)
{
return Create(..., theString.c_str(), theString.length());
}
template<typename T> T *Create(..., const string_ref &theString)
{
return Create(..., theString.data(), theString.length());
}
template<typename T> void Destroy(T *pValue)
{
pValue->~T();
char *memory = reinterpret_cast<char*>(pValue);
delete[] memory;
}
Obviously, you'll need to fill in the other constructor parameters yourself. And your type's constructor will need to take a string_ref that refers to the string.
If you are using std::string, you can't really do one allocation for both structure and string, and you also can't make the allocation of both to be one large block. If you are using old C-style strings it's possible though.
If I understand you correctly, you are saying that through profiling you have determined that the fact that you have to allocate a string and another data member in your data structure imposes an unacceptable cost to you application.
If that's indeed the case I can think of a couple solutions.
You could pre-allocate all of these structures up front, before your program starts. Keep them in some kind of fixed collection so they aren't copy-constructed, and reserve enough buffer in your strings to hold your data.
Controversial as it may seem, you could use old C-style char arrays. It seems like you are fogoing much of the reason to use strings in the first place, which is the memory management. However in your case, since you know the needed buffer sizes at start up, you could handle this yourself. If you like the other facilities that string provides, bear in mind that much of that is still available in the <algorithm>s.
Take a look at Variable Sized Struct C++ - the short answer is that there's no way to do it in vanilla C++.
Do you really need to allocate the container structs on the heap? It might be more efficient to have those on the stack, so they don't need to be allocated at all.
Indeed two allocations can seem too high. There are two ways to cut them down though:
Do a single allocation
Do a single dynamic allocation
It might not seem so different, so let me explain.
1. You can use the struct hack in C++
Yes this is not typical C++
Yes this requires special care
Technically it requires:
disabling the copy constructor and assignment operator
making the constructor and destructor private and provide factory methods for allocating and deallocating the object
Honestly, this is the hard-way.
2. You can avoid allocating the outer struct dynamically
Simple enough:
struct M {
Kind _kind;
std::string _data;
};
and then pass instances of M on the stack. Move operations should guarantee that the std::string is not copied (you can always disable copy to make sure of it).
This solution is much simpler. The only (slight) drawback is in memory locality... but on the other hand the top of the stack is already in the CPU cache anyway.
C-style strings can always be converted to std::string as needed. In fact, there's a good chance that your observations from profiling are due to fragmentation of your data rather than simply the number of allocations, and creating an std::string on demand will be efficient. Of course, not knowing your actual application this is just a guess, and really one can't know this until it's tested anyways. I imagine a class
class my_class {
std::string data() const { return self._data; }
const char* data_as_c_str() const // In case you really need it!
{ return self._data; }
private:
int _type;
char _data[1];
};
Note I used a standard clever C trick for data layout: _data is as long as you want it to be, so long as your factory function allocates the extra space for it. IIRC, C99 even gave a special syntax for it:
struct my_struct {
int type;
char data[];
};
which has good odds of working with your C++ compiler. (Is this in the C++11 standard?)
Of course, if you do do this, you really need to make all of the constructors private and friend your factory function, to ensure that the factory function is the only way to actually instantiate my_class -- it would be broken without the extra memory for the array. You'll definitely need to make operator= private too, or otherwise implement it carefully.
Rethinking your data types is probably a good idea.
For example, one thing you can do is, rather than trying to put your char arrays into a structured data type, use a smart reference instead. A class that looks like
class structured_data_reference {
public:
structured_data_reference(const char *data):_data(data) {}
std::string get_first_field() const {
// Do something interesting with _data to get the first field
}
private:
const char *_data;
};
You'll want to do the right thing with the other constructors and assignment operator too (probably disable assignment, and implement something reasonable for move and copy). And you may want reference counted pointers (e.g. std::shared_ptr) throughout your code rather than bare pointers.
Another hack that's possible is to just use std::string, but store the type information in the first entry (or first several). This requires accounting for that whenever you access the data, of course.
I'm not sure if this exactly addressing your problem. One way you can optimize the memory allocation in C++ by using a pre-allocated buffer and then using a 'placement new' operator.
I tried to solve your problem as I understood it.
unsigned char *myPool = new unsigned char[10000];
struct myStruct
{
myStruct(char* aSource1, char* aSource2)
{
original = new (myPool) string(aSource1); //placement new
data = new (myPool) string(aSource2); //placement new
}
~myStruct()
{
original = NULL; //no deallocation needed
data = NULL; //no deallocation needed
}
string* original;
string* data;
};
int main()
{
myStruct* aStruct = new (myPool) myStruct("h1", "h2");
// Use the struct
aStruct = NULL; // No need to deallocate
delete [] myPool;
return 0;
}
[Edit] After, the comment from NicolBolas, the problem is bit more clear. I decided to write one more answer, eventhough in reality it is not that much advantageous than using a raw character array. But, I still believe that this is well within the stated constraints.
Idea would be to provide a custom allocater for the string class as specified in this SO question.
In the implementation of the allocate method, use the placement new as
pointer allocate(size_type n, void * = 0)
{
// fail if we try to allocate too much
if((n * sizeof(T))> max_size()) { throw std::bad_alloc(); }
//T* t = static_cast<T *>(::operator new(n * sizeof(T)));
T* t = new (/* provide the address of the original character buffer*/) T[n];
return t;
}
The constraint is that for the placement new to work, the original string address should be known to the allocater at run time. This can be achieved by external explicit setting before the new string member creation. However, this is not so elegant.
In essence, this will require a struct with a field describing what kind of data the struct contains and another field that's a string with the data itself.
I have a feeling that may you are not exploiting C++'s type-system to its maximum potential here. It looks and feels very C-ish (that is not a proper word, I know). I don't have concrete examples to post here since I don't have any idea about the problem you are trying to solve.
Is there any way to allocate the memory for the struct and the std::string contained in the struct in a single allocation?
I believe that you are worrying about the structure allocation followed by a copy of the string to the structure member? This ideally shouldn't happen (but of course, this depends on how and when you are initializng the members). C++11 supports move construction. This should take care of any extra string copies that you are worried about.
You should really, really post some code to make this discussion worthwhile :)
a vital data structure as an unstructured string with program-defined delimiters
One question: Is this string mutable? If not, you can use a slightly different data-structure. Don't store copies of parts of this vital data structure but rather indices/iterators to this string which point to the delimiters.
// assume that !, [, ], $, % etc. are your program defined delims
const std::string vital = "!id[thisisdata]$[moredata]%[controlblock]%";
// define a special struct
enum Type { ... };
struct Info {
size_t start, end;
Type type;
// define appropriate ctors
};
// parse the string and return Info obejcts
std::vector<Info> parse(const std::string& str) {
std::vector<Info> v;
// loop through the string looking for delims
for (size_t b = 0, e = str.size(); b < e; ++b) {
// on hitting one such delim create an Info
switch( str[ b ] ) {
case '%':
...
case '$;:
// initializing the start and then move until
// you get the appropriate end delim
}
// use push_back/emplace_back to insert this newly
// created Info object back in the vector
v.push_back( Info( start, end, kind ) );
}
return v;
}

Creating dynamically sized objects

Removed the C tag, seeing as that was causing some confusion (it shouldn't have been there to begin with; sorry for any inconvenience there. C answer as still welcome though :)
In a few things I've done, I've found the need to create objects that have a dynamic size and a static size, where the static part is your basic object members, the dynamic part is however an array/buffer appended directly onto the class, keeping the memory contiguous, thus decreasing the amount of needed allocations (these are non-reallocatable objects), and decreasing fragmentation (though as a down side, it may be harder to find a block of a big enough size, however that is a lot more rare - if it should even occur at all - than heap fragmenting. This is also helpful on embedded devices where memory is at a premium(however I don't do anything for embedded devices currently), and things like std::string need to be avoided, or can't be used like in the case of trivial unions.
Generally the way I'd go about this would be to (ab)use malloc(std::string is not used on purpose, and for various reasons):
struct TextCache
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextCache* pNext;
char pBuffer[0];
};
TextCache* pCache = (TextCache*)malloc(sizeof(TextCache) + (sizeof(char) * nLength));
This however doesn't sit too well with me, as firstly I would like to do this using new, and thus in a C++ environment, and secondly, it looks horrible :P
So next step was a templated C++ varient:
template <const size_t nSize> struct TextCache
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextCache<nSize>* pNext;
char pBuffer[nSize];
};
This however has the problem that storing a pointer to a variable sized object becomes 'impossible', so then the next work around:
class DynamicObject {};
template <const size_t nSize> struct TextCache : DynamicObject {...};
This however still requires casting, and having pointers to DynamicObject all over the place becomes ambiguous when more that one dynamically sized object derives from it (it also looks horrible and can suffer from a bug that forces empty classes to still have a size, although that's probably an archaic, extinct bug...).
Then there was this:
class DynamicObject
{
void* operator new(size_t nSize, size_t nLength)
{
return malloc(nSize + nLength);
}
};
struct TextCache : DynamicObject {...};
which looks a lot better, but would interfere with objects that already have overloads of new(it could even affect placement new...).
Finally I came up with placement new abusing:
inline TextCache* CreateTextCache(size_t nLength)
{
char* pNew = new char[sizeof(TextCache) + nLength];
return new(pNew) TextCache;
}
This however is probably the worst idea so far, for quite a few reasons.
So are there any better ways to do this? or would one of the above versions be better, or at least improvable? Is doing even considered safe and/or bad programming practice?
As I said above, I'm trying to avoid double allocations, because this shouldn't need 2 allocations, and cause this makes writing(serializing) these things to files a lot easier.
The only exception to the double allocation requirement I have is when its basically zero overhead. the only cause where I have encountered that is where I sequentially allocate memory from a fixed buffer(using this system, that I came up with), however its also a special exception to prevent superflous copying.
I'd go for a compromise with the DynamicObject concept. Everything that doesn't depend on the size goes into the base class.
struct TextBase
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextBase* pNext;
};
template <const size_t nSize> struct TextCache : public TextBase
{
char pBuffer[nSize];
};
This should cut down on the casting required.
C99 blesses the 'struct hack' - aka flexible array member.
§6.7.2.1 Structure and union specifiers
¶16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. With two
exceptions, the flexible array member is ignored. First, the size of the structure shall be
equal to the offset of the last element of an otherwise identical structure that replaces the
flexible array member with an array of unspecified length.106) Second, when a . (or ->)
operator has a left operand that is (a pointer to) a structure with a flexible array member
and the right operand names that member, it behaves as if that member were replaced
with the longest array (with the same element type) that would not make the structure
larger than the object being accessed; the offset of the array shall remain that of the
flexible array member, even if this would differ from that of the replacement array. If this
array would have no elements, it behaves as if it had one element but the behavior is
undefined if any attempt is made to access that element or to generate a pointer one past
it.
¶17 EXAMPLE Assuming that all array members are aligned the same, after the declarations:
struct s { int n; double d[]; };
struct ss { int n; double d[1]; };
the three expressions:
sizeof (struct s)
offsetof(struct s, d)
offsetof(struct ss, d)
have the same value. The structure struct s has a flexible array member d.
106) The length is unspecified to allow for the fact that implementations may give array members different
alignments according to their lengths.
Otherwise, use two separate allocations - one for the core data in the structure and the second for the appended data.
That may look heretic regarding to the economic mindset of C or C++ programmers, but the last time I had a similar problem to solve I chosed to put a fixed size static buffer in my struct and access it through a pointer indirection. If the given struct grew larger than my static buffer the indirection pointer was then allocated dynamically (and the internal buffer unused). That was very simple to implement and solved the kind of issues you raised like fragmentation, as static buffer was used in more than 95% of actual use case, with the remaining 5% needing really large buffers, hence I didn't cared much of the small loss of internal buffer.
I believe that, in C++, technically this is Undefined Behavior (due to alignment issues), although I suspect it can be made to work for probably every existing implementation.
But why do that anyway?
You could use placement new in C++:
char *buff = new char[sizeof(TextCache) + (sizeof(char) * nLength)];
TextCache *pCache = new (buff) TextCache;
The only caveat being that you need to delete buff instead of pCache and if pCache has a destructor you'll have to call it manually.
If you are intending to access this extra area using pBuffer I'd recommend doing this:
struct TextCache
{
...
char *pBuffer;
};
...
char *buff = new char[sizeof(TextCache) + (sizeof(char) * nLength)];
TextCache *pCache = new (buff) TextCache;
pCache->pBuffer = new (buff + sizeof(TextCache)) char[nLength];
...
delete [] buff;
There's nothing wrong with managing your own memory.
template<typename DerivedType, typename ElemType> struct appended_array {
ElemType* buffer;
int length;
~appended_array() {
for(int i = 0; i < length; i++)
buffer->~ElemType();
char* ptr = (char*)this - sizeof(DerivedType);
delete[] ptr;
}
static inline DerivedType* Create(int extra) {
char* newbuf = new char[sizeof(DerivedType) + (extra * sizeof(ElemType))];
DerivedType* ptr = new (newbuf) DerivedType();
ElemType* extrabuf = (ElemType*)newbuf[sizeof(DerivedType)];
for(int i = 0; i < extra; i++)
new (&extrabuf[i]) ElemType();
ptr->lenghth = extra;
ptr->buffer = extrabuf;
return ptr;
}
};
struct TextCache : appended_array<TextCache, char>
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextCache* pNext;
// IT'S A MIRACLE! We have a buffer of size length and pointed to by buffer of type char that automagically appears for us in the Create function.
};
You should consider, however, that this optimization is premature and there are way better ways of doing it, like having an object pool or managed heap. Also, I didn't count for any alignment, however it's my understanding that sizeof() returns the aligned size. Also, this will be a bitch to maintain for non-trivial construction. Also, this is totally untested. A managed heap is a way better idea. But you shouldn't be afraid of managing your own memory- if you have custom memory requirements, you need to manage your own memory.
Just occurred to me that I destructed but not deleted the "extra" memory.

Variable Sized Struct C++

Is this the best way to make a variable sized struct in C++? I don't want to use vector because the length doesn't change after initialization.
struct Packet
{
unsigned int bytelength;
unsigned int data[];
};
Packet* CreatePacket(unsigned int length)
{
Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
output->bytelength = length;
return output;
}
Edit: renamed variable names and changed code to be more correct.
Some thoughts on what you're doing:
Using the C-style variable length struct idiom allows you to perform one free store allocation per packet, which is half as many as would be required if struct Packet contained a std::vector. If you are allocating a very large number of packets, then performing half as many free store allocations/deallocations may very well be significant. If you are also doing network accesses, then the time spent waiting for the network will probably be more significant.
This structure represents a packet. Are you planning to read/write from a socket directly into a struct Packet? If so, you probably need to consider byte order. Are you going to have to convert from host to network byte order when sending packets, and vice versa when receiving packets? If so, then you could byte-swap the data in place in your variable length struct. If you converted this to use a vector, it would make sense to write methods for serializing / deserializing the packet. These methods would transfer it to/from a contiguous buffer, taking byte order into account.
Likewise, you may need to take alignment and packing into account.
You can never subclass Packet. If you did, then the subclass's member variables would overlap with the array.
Instead of malloc and free, you could use Packet* p = ::operator new(size) and ::operator delete(p), since struct Packet is a POD type and does not currently benefit from having its default constructor and its destructor called. The (potential) benefit of doing so is that the global operator new handles errors using the global new-handler and/or exceptions, if that matters to you.
It is possible to make the variable length struct idiom work with the new and delete operators, but not well. You could create a custom operator new that takes an array length by implementing static void* operator new(size_t size, unsigned int bitlength), but you would still have to set the bitlength member variable. If you did this with a constructor, you could use the slightly redundant expression Packet* p = new(len) Packet(len) to allocate a packet. The only benefit I see compared to using global operator new and operator delete would be that clients of your code could just call delete p instead of ::operator delete(p). Wrapping the allocation/deallocation in separate functions (instead of calling delete p directly) is fine as long as they get called correctly.
If you never add a constructor/destructor, assignment operators or virtual functions to your structure using malloc/free for allocation is safe.
It's frowned upon in c++ circles, but I consider the usage of it okay if you document it in the code.
Some comments to your code:
struct Packet
{
unsigned int bitlength;
unsigned int data[];
};
If I remember right declaring an array without a length is non-standard. It works on most compilers but may give you a warning. If you want to be compliant declare your array of length 1.
Packet* CreatePacket(unsigned int length)
{
Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
output->bitlength = length;
return output;
}
This works, but you don't take the size of the structure into account. The code will break once you add new members to your structure. Better do it this way:
Packet* CreatePacket(unsigned int length)
{
size_t s = sizeof (Packed) - sizeof (Packed.data);
Packet *output = (Packet*) malloc(s + length * sizeof(unsigned int));
output->bitlength = length;
return output;
}
And write a comment into your packet structure definition that data must be the last member.
Btw - allocating the structure and the data with a single allocation is a good thing. You halve the number of allocations that way, and you improve the locality of data as well. This can improve the performance quite a bit if you allocate lots of packages.
Unfortunately c++ does not provide a good mechanism to do this, so you often end up with such malloc/free hacks in real world applications.
This is OK (and was standard practice for C).
But this is not a good idea for C++.
This is because the compiler generates a whole set of other methods automatically for you around the class. These methods do not understand that you have cheated.
For Example:
void copyRHSToLeft(Packet& lhs,Packet& rhs)
{
lhs = rhs; // The compiler generated code for assignement kicks in here.
// Are your objects going to cope correctly??
}
Packet* a = CreatePacket(3);
Packet* b = CreatePacket(5);
copyRHSToLeft(*a,*b);
Use the std::vector<> it is much safer and works correctly.
I would also bet it is just as efficient as your implementation after the optimizer kicks in.
Alternatively boost contains a fixed size array:
http://www.boost.org/doc/libs/1_38_0/doc/html/array.html
You can use the "C" method if you want but for safety make it so the compiler won't try to copy it:
struct Packet
{
unsigned int bytelength;
unsigned int data[];
private:
// Will cause compiler error if you misuse this struct
void Packet(const Packet&);
void operator=(const Packet&);
};
I'd probably just stick with using a vector<> unless the minimal extra overhead (probably a single extra word or pointer over your implementation) is really posing a problem. There's nothing that says you have to resize() a vector once it's been constructed.
However, there are several The advantages of going with vector<>:
it already handles copy, assignment & destruction properly - if you roll your own you need to ensure you handle these correctly
all the iterator support is there - again, you don't have to roll your own.
everybody already knows how to use it
If you really want to prevent the array from growing once constructed, you might want to consider having your own class that inherits from vector<> privately or has a vector<> member and only expose via methods that just thunk to the vector methods those bits of vector that you want clients to be able to use. That should help get you going quickly with pretty good assurance that leaks and what not are not there. If you do this and find that the small overhead of vector is not working for you, you can reimplement that class without the help of vector and your client code shouldn't need to change.
There are already many good thoughts mentioned here. But one is missing. Flexible Arrays are part of C99 and thus aren't part of C++, although some C++ compiler may provide this functionality there is no guarantee for that. If you find a way to use them in C++ in an acceptable way, but you have a compiler that doesn't support it, you perhaps can fallback to the "classical" way
If you are truly doing C++, there is no practical difference between a class and a struct except the default member visibility - classes have private visibility by default while structs have public visibility by default. The following are equivalent:
struct PacketStruct
{
unsigned int bitlength;
unsigned int data[];
};
class PacketClass
{
public:
unsigned int bitlength;
unsigned int data[];
};
The point is, you don't need the CreatePacket(). You can simply initialize the struct object with a constructor.
struct Packet
{
unsigned long bytelength;
unsigned char data[];
Packet(unsigned long length = 256) // default constructor replaces CreatePacket()
: bytelength(length),
data(new unsigned char[length])
{
}
~Packet() // destructor to avoid memory leak
{
delete [] data;
}
};
A few things to note. In C++, use new instead of malloc. I've taken some liberty and changed bitlength to bytelength. If this class represents a network packet, you'll be much better off dealing with bytes instead of bits (in my opinion). The data array is an array of unsigned char, not unsigned int. Again, this is based on my assumption that this class represents a network packet. The constructor allows you to create a Packet like this:
Packet p; // default packet with 256-byte data array
Packet p(1024); // packet with 1024-byte data array
The destructor is called automatically when the Packet instance goes out of scope and prevents a memory leak.
You probably want something lighter than a vector for high performances. You also want to be very specific about the size of your packet to be cross-platform. But you don't want to bother about memory leaks either.
Fortunately the boost library did most of the hard part:
struct packet
{
boost::uint32_t _size;
boost::scoped_array<unsigned char> _data;
packet() : _size(0) {}
explicit packet(packet boost::uint32_t s) : _size(s), _data(new unsigned char [s]) {}
explicit packet(const void * const d, boost::uint32_t s) : _size(s), _data(new unsigned char [s])
{
std::memcpy(_data, static_cast<const unsigned char * const>(d), _size);
}
};
typedef boost::shared_ptr<packet> packet_ptr;
packet_ptr build_packet(const void const * data, boost::uint32_t s)
{
return packet_ptr(new packet(data, s));
}
There's nothing whatsoever wrong with using vector for arrays of unknown size that will be fixed after initialization. IMHO, that's exactly what vectors are for. Once you have it initialized, you can pretend the thing is an array, and it should behave the same (including time behavior).
Disclaimer: I wrote a small library to explore this concept: https://github.com/ppetr/refcounted-var-sized-class
We want to allocate a single block of memory for a data structure of type T and an array of elements of type A. In most cases A will be just char.
For this let's define a RAII class to allocate and deallocate such a memory block. This poses several difficulties:
C++ allocators don't provide such API. Therefore we need to allocate plain chars and place the structure in the block ourselves. For this std::aligned_storage will be helpful.
The memory block must be properly aligned. Because in C++11 there doesn't seem to be API for allocating an aligned block, we need to slightly over-allocate by alignof(T) - 1 bytes and then use std::align.
// Owns a block of memory large enough to store a properly aligned instance of
// `T` and additional `size` number of elements of type `A`.
template <typename T, typename A = char>
class Placement {
public:
// Allocates memory for a properly aligned instance of `T`, plus additional
// array of `size` elements of `A`.
explicit Placement(size_t size)
: size_(size),
allocation_(std::allocator<char>().allocate(AllocatedBytes())) {
static_assert(std::is_trivial<Placeholder>::value);
}
Placement(Placement const&) = delete;
Placement(Placement&& other) {
allocation_ = other.allocation_;
size_ = other.size_;
other.allocation_ = nullptr;
}
~Placement() {
if (allocation_) {
std::allocator<char>().deallocate(allocation_, AllocatedBytes());
}
}
// Returns a pointer to an uninitialized memory area available for an
// instance of `T`.
T* Node() const { return reinterpret_cast<T*>(&AsPlaceholder()->node); }
// Returns a pointer to an uninitialized memory area available for
// holding `size` (specified in the constructor) elements of `A`.
A* Array() const { return reinterpret_cast<A*>(&AsPlaceholder()->array); }
size_t Size() { return size_; }
private:
// Holds a properly aligned instance of `T` and an array of length 1 of `A`.
struct Placeholder {
typename std::aligned_storage<sizeof(T), alignof(T)>::type node;
// The array type must be the last one in the struct.
typename std::aligned_storage<sizeof(A[1]), alignof(A[1])>::type array;
};
Placeholder* AsPlaceholder() const {
void* ptr = allocation_;
size_t space = sizeof(Placeholder) + alignof(Placeholder) - 1;
ptr = std::align(alignof(Placeholder), sizeof(Placeholder), ptr, space);
assert(ptr != nullptr);
return reinterpret_cast<Placeholder*>(ptr);
}
size_t AllocatedBytes() {
// We might need to shift the placement of for up to `alignof(Placeholder) - 1` bytes.
// Therefore allocate this many additional bytes.
return sizeof(Placeholder) + alignof(Placeholder) - 1 +
(size_ - 1) * sizeof(A);
}
size_t size_;
char* allocation_;
};
Once we've dealt with the problem of memory allocation, we can define a wrapper class that initializes T and an array of A in an allocated memory block.
template <typename T, typename A = char,
typename std::enable_if<!std::is_destructible<A>{} ||
std::is_trivially_destructible<A>{},
bool>::type = true>
class VarSized {
public:
// Initializes an instance of `T` with an array of `A` in a memory block
// provided by `placement`. Callings a constructor of `T`, providing a
// pointer to `A*` and its length as the first two arguments, and then
// passing `args` as additional arguments.
template <typename... Arg>
VarSized(Placement<T, A> placement, Arg&&... args)
: placement_(std::move(placement)) {
auto [aligned, array] = placement_.Addresses();
array = new (array) char[placement_.Size()];
new (aligned) T(array, placement_.Size(), std::forward<Arg>(args)...);
}
// Same as above, with initializing a `Placement` for `size` elements of `A`.
template <typename... Arg>
VarSized(size_t size, Arg&&... args)
: VarSized(Placement<T, A>(size), std::forward<Arg>(args)...) {}
~VarSized() { std::move(*this).Delete(); }
// Destroys this instance and returns the `Placement`, which can then be
// reused or destroyed as well (deallocating the memory block).
Placement<T, A> Delete() && {
// By moving out `placement_` before invoking `~T()` we ensure that it's
// destroyed even if `~T()` throws an exception.
Placement<T, A> placement(std::move(placement_));
(*this)->~T();
return placement;
}
T& operator*() const { return *placement_.Node(); }
const T* operator->() const { return &**this; }
private:
Placement<T, A> placement_;
};
This type is moveable, but obviously not copyable. We could provide a function to convert it into a shared_ptr with a custom deleter. But this will need to internally allocate another small block of memory for a reference counter (see also How is the std::tr1::shared_ptr implemented?).
This can be solved by introducing a specialized data type that will hold our Placement, a reference counter and a field with the actual data type in a single structure. For more details see my refcount_struct.h.
You should declare a pointer, not an array with an unspecified length.

What is the Performance, Safety, and Alignment of a Data member hidden in an embedded char array in a C++ Class?

I have seen a codebase recently that I fear is violating alignment constraints. I've scrubbed it to produce a minimal example, given below. Briefly, the players are:
Pool. This is a class which allocates memory efficiently, for some definition of 'efficient'. Pool is guaranteed to return a chunk of memory that is aligned for the requested size.
Obj_list. This class stores homogeneous collections of objects. Once the number of objects exceeds a certain threshold, it changes its internal representation from a list to a tree. The size of Obj_list is one pointer (8 bytes on a 64-bit platform). Its populated store will of course exceed that.
Aggregate. This class represents a very common object in the system. Its history goes back to the early 32-bit workstation era, and it was 'optimized' (in that same 32-bit era) to use as little space as possible as a result. Aggregates can be empty, or manage an arbitrary number of objects.
In this example, Aggregate items are always allocated from Pools, so they are always aligned. The only occurrences of Obj_list in this example are the 'hidden' members in Aggregate objects, and therefore they are always allocated using placement new. Here are the support classes:
class Pool
{
public:
Pool();
virtual ~Pool();
void *allocate(size_t size);
static Pool *default_pool(); // returns a global pool
};
class Obj_list
{
public:
inline void *operator new(size_t s, void * p) { return p; }
Obj_list(const Args *args);
// when constructed, Obj_list will allocate representation_p, which
// can take up much more space.
~Obj_list();
private:
Obj_list_store *representation_p;
};
And here is Aggregate. Note that member declaration member_list_store_d:
// Aggregate is derived from Lesser, which is twelve bytes in size
class Aggregate : public Lesser
{
public:
inline void *operator new(size_t s) {
return Pool::default_pool->allocate(s);
}
inline void *operator new(size_t s, Pool *h) {
return h->allocate(s);
}
public:
Aggregate(const Args *args = NULL);
virtual ~Aggregate() {};
inline const Obj_list *member_list_store_p() const;
protected:
char member_list_store_d[sizeof(Obj_list)];
};
It is that data member that I'm most concerned about. Here is the pseudocode for initialization and access:
Aggregate::Aggregate(const Args *args)
{
if (args) {
new (static_cast<void *>(member_list_store_d)) Obj_list(args);
}
else {
zero_out(member_list_store_d);
}
}
inline const Obj_list *Aggregate::member_list_store_p() const
{
return initialized(member_list_store_d) ? (Obj_list *) &member_list_store_d : 0;
}
You may be tempted to suggest that we replace the char array with a pointer to the Obj_list type, initialized to NULL or an instance of the class. This gives the proper semantics, but just shifts the memory cost around. If memory were still at a premium (and it might be, this is an EDA database representation), replacing the char array with a pointer to an Obj_list would cost one more pointer in the case when Aggregate objects do have members.
Besides that, I don't really want to get distracted from the main question here, which is alignment. I think the above construct is problematic, but can't really find more in the standard than some vague discussion of the alignment behavior of the 'system/library' new.
So, does the above construct do anything more than cause an occasional pipe stall?
Edit: I realize that there are ways to replace the approach using the embedded char array. So did the original architects. They discarded them because memory was at a premium. Now, if I have a reason to touch that code, I'll probably change it.
However, my question, about the alignment issues inherent in this approach, is what I hope people will address. Thanks!
Ok - had a chance to read it properly. You have an alignment problem, and invoke undefined behaviour when you access the char array as an Obj_list. Most likely your platform will do one of three things: let you get away with it, let you get away with it at a runtime penalty or occasionally crash with a bus error.
Your portable options to fix this are:
allocate the storage with malloc or
a global allocation function, but
you think this is too
expensive.
as Arkadiy says, make your buffer an Obj_list member:
Obj_list list;
but you now don't want to pay the cost of construction. You could mitigate this by providing an inline do-nothing constructor to be used only to create this instance - as posted the default constructor would do. If you follow this route, strongly consider invoking the dtor
list.~Obj_list();
before doing a placement new into this storage.
Otherwise, I think you are left with non portable options: either rely on your platform's tolerance of misaligned accesses, or else use any nonportable options your compiler gives you.
Disclaimer: It's entirely possible I'm missing a trick with unions or some such. It's an unusual problem.
The alignment will be picked by the compiler according to its defaults, this will probably end up as four-bytes under GCC / MSVC.
This should only be a problem if there is code (SIMD/DMA) that requires a specific alignment. In this case you should be able to use compiler directives to ensure that member_list_store_d is aligned, or increase the size by (alignment-1) and use an appropriate offset.
Can you simply have an instance of Obj_list inside Aggregate? IOW, something along the lines of
class Aggregate : public Lesser
{
...
protected:
Obj_list list;
};
I must be missing something, but I can't figure why this is bad.
As to your question - it's perfectly compiler-dependent. Most compilers, though, will align every member at word boundary by default, even if the member's type does not need to be aligned that way for correct access.
If you want to ensure alignment of your structures, just do a
// MSVC
#pragma pack(push,1)
// structure definitions
#pragma pack(pop)
// *nix
struct YourStruct
{
....
} __attribute__((packed));
To ensure 1 byte alignment of your char array in Aggregate
Allocate the char array member_list_store_d with malloc or global operator new[], either of which will give storage aligned for any type.
Edit: Just read the OP again - you don't want to pay for another pointer. Will read again in the morning.