Does the alignas specifier work with 'new'? - c++

My question is rather simple;
Does the alignas specifier work with 'new'? That is, if a struct is defined to be aligned, will it be aligned when allocated with new?

Before C++17, if your type's alignment is not over-aligned, then yes, the default new will work. "Over-aligned" means that the alignment you specify in alignas is greater than alignof(std::max_align_t). The default new will work with non-over-aligned types more or less by accident; the default memory allocator will always allocate memory with an alignment equal to alignof(std::max_align_t).
If your type's alignment is over-aligned however, your out of luck. Neither the default new, nor any global new operator you write, will be able to even know the alignment required of the type, let alone allocate memory appropriate to it. The only way to help this case is to overload the class's operator new, which will be able to query the class's alignment with alignof.
Of course, this won't be useful if that class is used as the member of another class. Not unless that other class also overloads operator new. So something as simple as new pair<over_aligned, int>() won't work.
C++17 adds a number of memory allocators which are given the alignment of the type being used. These allocators are used specifically for over-aligned types (or more specifically, new-extended over-aligned types). So new pair<over_aligned, int>() will work in C++17.
Of course, this only works to the extent that the allocator handles over-aligned types.

No it does not. The struct will be padded to the alignment requested, but it will not be aligned. There is a chance, however, that this will be allowed in C++17 (the fact that this C++17 proposal exists should be pretty good proof this can't work in C++11).
I have seen this appear to work with some memory allocators, but that was pure luck. For instance, some memory allocators will align their memory allocations to powers of 2 of the requested size (up to 4KB) as an optimization for the allocator (reduce memory fragmentation, possibly make it easier to reuse previously freed memory, etc...). However, the new/malloc implementations that are included in the OS X 10.7 and CentOS 6 systems that I tested do not do this and fail with the following code:
#include <stdlib.h>
#include <assert.h>
struct alignas(8) test_struct_8 { char data; };
struct alignas(16) test_struct_16 { char data; };
struct alignas(32) test_struct_32 { char data; };
struct alignas(64) test_struct_64 { char data; };
struct alignas(128) test_struct_128 { char data; };
struct alignas(256) test_struct_256 { char data; };
struct alignas(512) test_struct_512 { char data; };
int main() {
test_struct_8 *heap_8 = new test_struct_8;
test_struct_16 *heap_16 = new test_struct_16;
test_struct_32 *heap_32 = new test_struct_32;
test_struct_64 *heap_64 = new test_struct_64;
test_struct_128 *heap_128 = new test_struct_128;
test_struct_256 *heap_256 = new test_struct_256;
test_struct_512 *heap_512 = new test_struct_512;
#define IS_ALIGNED(addr,size) ((((size_t)(addr)) % (size)) == 0)
assert(IS_ALIGNED(heap_8, 8));
assert(IS_ALIGNED(heap_16, 16));
assert(IS_ALIGNED(heap_32, 32));
assert(IS_ALIGNED(heap_64, 64));
assert(IS_ALIGNED(heap_128, 128));
assert(IS_ALIGNED(heap_256, 256));
assert(IS_ALIGNED(heap_512, 512));
delete heap_8;
delete heap_16;
delete heap_32;
delete heap_64;
delete heap_128;
delete heap_256;
delete heap_512;
return 0;
}

Related

When may I or may not have unaligned memory access?

I have a little confusion on when I may have unaligned access and when not.
I know that when I have a void pointer that came from network or somewhere else and there is no guarantee that it aligned with my machine word size, when I cast the pointer and de-referencing it, I need to be carful because I can access unaligned memory.
what I don't know, is when I don't need to worry about that, When do I have a guarantee that this will not happen?
considering this function:
static int compare_size_t(
const void * pcv_elem0,
const void * pcv_elem1)
{
size_t sz_elem0;
size_t sz_elem1;
sz_elem0 = *static_cast<size_t*>(pcv_elem0);
sz_elem1 = *static_cast<size_t*>(pcv_elem1);
if (sz_elem0 > sz_elem1)
{
return 1;
}
else if (sz_elem0 < sz_elem1)
{
return -1;
}
return 0;
}
static void foo(bool * pb_test_passed)
{
size_t arr[1024];
// fill the array with data
qsort(
arr,
1024,
sizeof(arr[0]),
compare_size_t);
// some other code...
}
Can I have unaligned access with ARM processor? should I replace
sz_elem0 = *static_cast<size_t*>(pcv_elem0);
sz_elem1 = *static_cast<size_t*>(pcv_elem1);
with
(void) memcpy(
&sz_elem0,
pcv_elem0,
sizeof(sz_elem0));
(void) memcpy(
&sz_elem1,
pcv_elem1,
sizeof(sz_elem1));
?
You don't have to worry about any of this in standard C++ because you can only dereference T* if there exists an object of T at the address. You can only create T by
new T(...) which handles alignment,
placement new new (buff) T where buff must be suitably aligned,
or automatic storage which handles alignment too.
[Other advanced stuff, not needed to demonstrate the issue]
If T is a member, compiler handles alignment during struct/class definition.
If there is no such object at the given address and unless T is char,unsigned char, or std::byte
(which have alignment 1), you are violating strict aliasing rule by dereferencing T* pointer -> UB.
There is a potential issue if you use compiler-specific packing directives. In that case, you must be very careful about passing pointers to unaligned members around. This will generate bus errors if the architecture (ARM) cannot handle unaligned reads/writes or degrade performance(x64) if it can. Accessing the member through its struct like object.member=5; is always safe (but maybe slower) since the compiler can see the misalignment and generate appropriate instructions.
So if there are no size_t objects at pcv_elemX created by one of the above approaches, you must use memcpy, otherwise you have UB. memcpy is safe for this and compiler can often avoid the copy and use the original pointer (while being aware of the possible aliasing).
There is C++20 std::bit_cast explicitly designed for this use case.

sizeof and types, guarantees

I cannot find a proof/disproof that the following code snippet has no design flaws, speaking about the correctness.
template <class Item>
class MyDirtyPool {
public:
template<typename ... Args>
std::size_t add(Args &&... args) {
if (!m_deletedComponentsIndexes.empty()) {
//reuse old block
size_t positionIndex = m_deletedComponentsIndexes.back();
m_deletedComponentsIndexes.pop_back();
void *block = static_cast<void *> (&m_memoryBlock[positionIndex]);
new(block) Item(std::forward<Args>(args) ...);
return positionIndex;
} else {
//not found, add new block
m_memoryBlock.emplace_back();
void *block = static_cast<void *> (&m_memoryBlock.back());
new(block) Item(std::forward<Args>(args) ...);
return m_memoryBlock.size() - 1;
}
}
//...all the other methods omitted
private:
struct Chunk {
char mem[sizeof(Item)]; //is this sane?
};
std::vector<Chunk> m_memoryBlock; //and this one too, safe?
std::deque<std::size_t> m_deletedComponentsIndexes;
};
I am concerned about all the stuff with Chunk, which is used here in essence as a bag of memory having the same size as the supplied type.
I don't want to explicitly create Item objects in the m_memoryBlock, therefore I need some kind of "chunk of memory having enough of space".
Can I be sure that Chunk will have the same size as the supplied type? Please provide some examples where this assumption won't work.
If I am totally wrong in my assumptions, how should I deal with it?
In such designs the memory must be suitably aligned for the type of objects you would like to create there.
Standard built-in types normally have natural alignment, which equals to their sizeof, i.e. sizeof(T) == alignof(T).
The alignment of char array is 1 byte, which is insufficient for anything else.
One simple way to enforce alignment is to use std::max_align_t like this:
union Chunk
{
std::max_align_t max_align;
char mem[sizeof(Item)];
};
That will make Chunk::mem suitably aligned for any standard built-in type.
Another way is to use the exact alignment of the type you would like to place in that char array using C++11 alignas keyword:
struct Chunk
{
alignas(Item) char mem[sizeof(Item)];
};
And this is exactly what std::aligned_storage does for you.
However, that would require exposing the definition of Item, which may be inconvenient or undesirable in some cases.
For example, this method could be used as an optimization for Pimpl idiom to avoid a memory allocation, however, that would require exposing the definition of the implementation in the header file to get its size and alignment, therefore defeating the purpose of Pimpl. See The Fast Pimpl Idiom for more details.
Another detail, is that if you would like to copy/move Chunks and expect the stored objects to remain valid, those stored objects must be trivially copyable, e.g.
static_assert(std::is_trivially_copyable<Item>::value,
"Item cannot be copied byte-wise");
typedef std::aligned_storage<sizeof(Item), alignof(Item)> Chunk;
For a memory pool std::vector<Chunk> is not a good choice because on reallocation when the vector grows all pointers and references to the objects stored in the pool get invalidated.
The objects should not move in a general memory pool.

Creating dynamically sized objects

Removed the C tag, seeing as that was causing some confusion (it shouldn't have been there to begin with; sorry for any inconvenience there. C answer as still welcome though :)
In a few things I've done, I've found the need to create objects that have a dynamic size and a static size, where the static part is your basic object members, the dynamic part is however an array/buffer appended directly onto the class, keeping the memory contiguous, thus decreasing the amount of needed allocations (these are non-reallocatable objects), and decreasing fragmentation (though as a down side, it may be harder to find a block of a big enough size, however that is a lot more rare - if it should even occur at all - than heap fragmenting. This is also helpful on embedded devices where memory is at a premium(however I don't do anything for embedded devices currently), and things like std::string need to be avoided, or can't be used like in the case of trivial unions.
Generally the way I'd go about this would be to (ab)use malloc(std::string is not used on purpose, and for various reasons):
struct TextCache
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextCache* pNext;
char pBuffer[0];
};
TextCache* pCache = (TextCache*)malloc(sizeof(TextCache) + (sizeof(char) * nLength));
This however doesn't sit too well with me, as firstly I would like to do this using new, and thus in a C++ environment, and secondly, it looks horrible :P
So next step was a templated C++ varient:
template <const size_t nSize> struct TextCache
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextCache<nSize>* pNext;
char pBuffer[nSize];
};
This however has the problem that storing a pointer to a variable sized object becomes 'impossible', so then the next work around:
class DynamicObject {};
template <const size_t nSize> struct TextCache : DynamicObject {...};
This however still requires casting, and having pointers to DynamicObject all over the place becomes ambiguous when more that one dynamically sized object derives from it (it also looks horrible and can suffer from a bug that forces empty classes to still have a size, although that's probably an archaic, extinct bug...).
Then there was this:
class DynamicObject
{
void* operator new(size_t nSize, size_t nLength)
{
return malloc(nSize + nLength);
}
};
struct TextCache : DynamicObject {...};
which looks a lot better, but would interfere with objects that already have overloads of new(it could even affect placement new...).
Finally I came up with placement new abusing:
inline TextCache* CreateTextCache(size_t nLength)
{
char* pNew = new char[sizeof(TextCache) + nLength];
return new(pNew) TextCache;
}
This however is probably the worst idea so far, for quite a few reasons.
So are there any better ways to do this? or would one of the above versions be better, or at least improvable? Is doing even considered safe and/or bad programming practice?
As I said above, I'm trying to avoid double allocations, because this shouldn't need 2 allocations, and cause this makes writing(serializing) these things to files a lot easier.
The only exception to the double allocation requirement I have is when its basically zero overhead. the only cause where I have encountered that is where I sequentially allocate memory from a fixed buffer(using this system, that I came up with), however its also a special exception to prevent superflous copying.
I'd go for a compromise with the DynamicObject concept. Everything that doesn't depend on the size goes into the base class.
struct TextBase
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextBase* pNext;
};
template <const size_t nSize> struct TextCache : public TextBase
{
char pBuffer[nSize];
};
This should cut down on the casting required.
C99 blesses the 'struct hack' - aka flexible array member.
§6.7.2.1 Structure and union specifiers
¶16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. With two
exceptions, the flexible array member is ignored. First, the size of the structure shall be
equal to the offset of the last element of an otherwise identical structure that replaces the
flexible array member with an array of unspecified length.106) Second, when a . (or ->)
operator has a left operand that is (a pointer to) a structure with a flexible array member
and the right operand names that member, it behaves as if that member were replaced
with the longest array (with the same element type) that would not make the structure
larger than the object being accessed; the offset of the array shall remain that of the
flexible array member, even if this would differ from that of the replacement array. If this
array would have no elements, it behaves as if it had one element but the behavior is
undefined if any attempt is made to access that element or to generate a pointer one past
it.
¶17 EXAMPLE Assuming that all array members are aligned the same, after the declarations:
struct s { int n; double d[]; };
struct ss { int n; double d[1]; };
the three expressions:
sizeof (struct s)
offsetof(struct s, d)
offsetof(struct ss, d)
have the same value. The structure struct s has a flexible array member d.
106) The length is unspecified to allow for the fact that implementations may give array members different
alignments according to their lengths.
Otherwise, use two separate allocations - one for the core data in the structure and the second for the appended data.
That may look heretic regarding to the economic mindset of C or C++ programmers, but the last time I had a similar problem to solve I chosed to put a fixed size static buffer in my struct and access it through a pointer indirection. If the given struct grew larger than my static buffer the indirection pointer was then allocated dynamically (and the internal buffer unused). That was very simple to implement and solved the kind of issues you raised like fragmentation, as static buffer was used in more than 95% of actual use case, with the remaining 5% needing really large buffers, hence I didn't cared much of the small loss of internal buffer.
I believe that, in C++, technically this is Undefined Behavior (due to alignment issues), although I suspect it can be made to work for probably every existing implementation.
But why do that anyway?
You could use placement new in C++:
char *buff = new char[sizeof(TextCache) + (sizeof(char) * nLength)];
TextCache *pCache = new (buff) TextCache;
The only caveat being that you need to delete buff instead of pCache and if pCache has a destructor you'll have to call it manually.
If you are intending to access this extra area using pBuffer I'd recommend doing this:
struct TextCache
{
...
char *pBuffer;
};
...
char *buff = new char[sizeof(TextCache) + (sizeof(char) * nLength)];
TextCache *pCache = new (buff) TextCache;
pCache->pBuffer = new (buff + sizeof(TextCache)) char[nLength];
...
delete [] buff;
There's nothing wrong with managing your own memory.
template<typename DerivedType, typename ElemType> struct appended_array {
ElemType* buffer;
int length;
~appended_array() {
for(int i = 0; i < length; i++)
buffer->~ElemType();
char* ptr = (char*)this - sizeof(DerivedType);
delete[] ptr;
}
static inline DerivedType* Create(int extra) {
char* newbuf = new char[sizeof(DerivedType) + (extra * sizeof(ElemType))];
DerivedType* ptr = new (newbuf) DerivedType();
ElemType* extrabuf = (ElemType*)newbuf[sizeof(DerivedType)];
for(int i = 0; i < extra; i++)
new (&extrabuf[i]) ElemType();
ptr->lenghth = extra;
ptr->buffer = extrabuf;
return ptr;
}
};
struct TextCache : appended_array<TextCache, char>
{
uint_32 fFlags;
uint_16 nXpos;
uint_16 nYpos;
TextCache* pNext;
// IT'S A MIRACLE! We have a buffer of size length and pointed to by buffer of type char that automagically appears for us in the Create function.
};
You should consider, however, that this optimization is premature and there are way better ways of doing it, like having an object pool or managed heap. Also, I didn't count for any alignment, however it's my understanding that sizeof() returns the aligned size. Also, this will be a bitch to maintain for non-trivial construction. Also, this is totally untested. A managed heap is a way better idea. But you shouldn't be afraid of managing your own memory- if you have custom memory requirements, you need to manage your own memory.
Just occurred to me that I destructed but not deleted the "extra" memory.

preventing unaligned data on the heap

I am building a class hierarchy that uses SSE intrinsics functions and thus some of the members of the class need to be 16-byte aligned. For stack instances I can use __declspec(align(#)), like so:
typedef __declspec(align(16)) float Vector[4];
class MyClass{
...
private:
Vector v;
};
Now, since __declspec(align(#)) is a compilation directive, the following code may result in an unaligned instance of Vector on the heap:
MyClass *myclass = new MyClass;
This too, I know I can easily solve by overloading the new and delete operators to use _aligned_malloc and _aligned_free accordingly. Like so:
//inside MyClass:
public:
void* operator new (size_t size) throw (std::bad_alloc){
void * p = _aligned_malloc(size, 16);
if (p == 0) throw std::bad_alloc()
return p;
}
void operator delete (void *p){
MyClass* pc = static_cast<MyClass*>(p);
_aligned_free(p);
}
...
So far so good.. but here is my problem. Consider the following code:
class NotMyClass{ //Not my code, which I have little or no influence over
...
MyClass myclass;
...
};
int main(){
...
NotMyClass *nmc = new NotMyClass;
...
}
Since the myclass instance of MyClass is created statically on a dynamic instance of NotMyClass, myclass WILL be 16-byte aligned relatively to the beginning of nmc because of Vector's __declspec(align(16)) directive. But this is worthless, since nmc is dynamically allocated on the heap with NotMyClass's new operator, which doesn't nesessarily ensure (and definitely probably NOT) 16-byte alignment.
So far, I can only think of 2 approaches on how to deal with this problem:
Preventing MyClass users from being able to compile the following code:
MyClass myclass;
meaning, instances of MyClass can only be created dynamically, using the new operator, thus ensuring that all instances of MyClass are truly dynamically allocatted with MyClass's overloaded new. I have consulted on another thread on how to accomplish this and got a few great answers:
C++, preventing class instance from being created on the stack (during compiltaion)
Revert from having Vector members in my Class and only have pointers to Vector as members, which I will allocate and deallocate using _aligned_malloc and _aligned_free in the ctor and dtor respectively. This methos seems crude and prone to error, since I am not the only programmer writing these Classes (MyClass derives from a Base class and many of these classes use SSE).
However, since both solutions have been frowned upon in my team, I come to you for suggestions of a different solution.
If you're set against heap allocation, another idea is to over allocate on the stack and manually align (manual alignment is discussed in this SO post). The idea is to allocate byte data (unsigned char) with a size guaranteed to contain an aligned region of the necessary size (+15), then find the aligned position by rounding down from the most-shifted region (x+15 - (x+15) % 16, or x+15 & ~0x0F). I posted a working example of this approach with vector operations on codepad (for g++ -O2 -msse2). Here are the important bits:
class MyClass{
...
unsigned char dPtr[sizeof(float)*4+15]; //over-allocated data
float* vPtr; //float ptr to be aligned
public:
MyClass(void) :
vPtr( reinterpret_cast<float*>(
(reinterpret_cast<uintptr_t>(dPtr)+15) & ~ 0x0F
) )
{}
...
};
...
The constructor ensures that vPtr is aligned (note the order of members in the class declaration is important).
This approach works (heap/stack allocation of containing classes is irrelevant to alignment), is portabl-ish (I think most compilers provide a pointer sized uint uintptr_t), and will not leak memory. But its not particularly safe (being sure to keep the aligned pointer valid under copy, etc), wastes (nearly) as much memory as it uses, and some may find the reinterpret_casts distasteful.
The risks of aligned operation/unaligned data problems could be mostly eliminated by encapsulating this logic in a Vector object, thereby controlling access to the aligned pointer and ensuring that it gets aligned at construction and stays valid.
You can use "placement new."
void* operator new(size_t, void* p) { return p; }
int main() {
void* p = aligned_alloc(sizeof(NotMyClass));
NotMyClass* nmc = new (p) NotMyClass;
// ...
nmc->~NotMyClass();
aligned_free(p);
}
Of course you need to take care when destroying the object, by calling the destructor and then releasing the space. You can't just call delete. You could use shared_ptr<> with a different function to deal with that automatically; it depends if the overhead of dealing with a shared_ptr (or other wrapper of the pointer) is a problem to you.
The upcoming C++0x standard proposes facilities for dealing with raw memory. They are already incorporated in VC++2010 (within the tr1 namespace).
std::tr1::alignment_of // get the alignment
std::tr1::aligned_storage // get aligned storage of required dimension
Those are types, you can use them like so:
static const floatalign = std::tr1::alignment_of<float>::value; // demo only
typedef std::tr1::aligned_storage<sizeof(float)*4, 16>::type raw_vector;
// first parameter is size, second is desired alignment
Then you can declare your class:
class MyClass
{
public:
private:
raw_vector mVector; // alignment guaranteed
};
Finally, you need some cast to manipulate it (it's raw memory until now):
float* MyClass::AccessVector()
{
return reinterpret_cast<float*>((void*)&mVector));
}

Variable Sized Struct C++

Is this the best way to make a variable sized struct in C++? I don't want to use vector because the length doesn't change after initialization.
struct Packet
{
unsigned int bytelength;
unsigned int data[];
};
Packet* CreatePacket(unsigned int length)
{
Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
output->bytelength = length;
return output;
}
Edit: renamed variable names and changed code to be more correct.
Some thoughts on what you're doing:
Using the C-style variable length struct idiom allows you to perform one free store allocation per packet, which is half as many as would be required if struct Packet contained a std::vector. If you are allocating a very large number of packets, then performing half as many free store allocations/deallocations may very well be significant. If you are also doing network accesses, then the time spent waiting for the network will probably be more significant.
This structure represents a packet. Are you planning to read/write from a socket directly into a struct Packet? If so, you probably need to consider byte order. Are you going to have to convert from host to network byte order when sending packets, and vice versa when receiving packets? If so, then you could byte-swap the data in place in your variable length struct. If you converted this to use a vector, it would make sense to write methods for serializing / deserializing the packet. These methods would transfer it to/from a contiguous buffer, taking byte order into account.
Likewise, you may need to take alignment and packing into account.
You can never subclass Packet. If you did, then the subclass's member variables would overlap with the array.
Instead of malloc and free, you could use Packet* p = ::operator new(size) and ::operator delete(p), since struct Packet is a POD type and does not currently benefit from having its default constructor and its destructor called. The (potential) benefit of doing so is that the global operator new handles errors using the global new-handler and/or exceptions, if that matters to you.
It is possible to make the variable length struct idiom work with the new and delete operators, but not well. You could create a custom operator new that takes an array length by implementing static void* operator new(size_t size, unsigned int bitlength), but you would still have to set the bitlength member variable. If you did this with a constructor, you could use the slightly redundant expression Packet* p = new(len) Packet(len) to allocate a packet. The only benefit I see compared to using global operator new and operator delete would be that clients of your code could just call delete p instead of ::operator delete(p). Wrapping the allocation/deallocation in separate functions (instead of calling delete p directly) is fine as long as they get called correctly.
If you never add a constructor/destructor, assignment operators or virtual functions to your structure using malloc/free for allocation is safe.
It's frowned upon in c++ circles, but I consider the usage of it okay if you document it in the code.
Some comments to your code:
struct Packet
{
unsigned int bitlength;
unsigned int data[];
};
If I remember right declaring an array without a length is non-standard. It works on most compilers but may give you a warning. If you want to be compliant declare your array of length 1.
Packet* CreatePacket(unsigned int length)
{
Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
output->bitlength = length;
return output;
}
This works, but you don't take the size of the structure into account. The code will break once you add new members to your structure. Better do it this way:
Packet* CreatePacket(unsigned int length)
{
size_t s = sizeof (Packed) - sizeof (Packed.data);
Packet *output = (Packet*) malloc(s + length * sizeof(unsigned int));
output->bitlength = length;
return output;
}
And write a comment into your packet structure definition that data must be the last member.
Btw - allocating the structure and the data with a single allocation is a good thing. You halve the number of allocations that way, and you improve the locality of data as well. This can improve the performance quite a bit if you allocate lots of packages.
Unfortunately c++ does not provide a good mechanism to do this, so you often end up with such malloc/free hacks in real world applications.
This is OK (and was standard practice for C).
But this is not a good idea for C++.
This is because the compiler generates a whole set of other methods automatically for you around the class. These methods do not understand that you have cheated.
For Example:
void copyRHSToLeft(Packet& lhs,Packet& rhs)
{
lhs = rhs; // The compiler generated code for assignement kicks in here.
// Are your objects going to cope correctly??
}
Packet* a = CreatePacket(3);
Packet* b = CreatePacket(5);
copyRHSToLeft(*a,*b);
Use the std::vector<> it is much safer and works correctly.
I would also bet it is just as efficient as your implementation after the optimizer kicks in.
Alternatively boost contains a fixed size array:
http://www.boost.org/doc/libs/1_38_0/doc/html/array.html
You can use the "C" method if you want but for safety make it so the compiler won't try to copy it:
struct Packet
{
unsigned int bytelength;
unsigned int data[];
private:
// Will cause compiler error if you misuse this struct
void Packet(const Packet&);
void operator=(const Packet&);
};
I'd probably just stick with using a vector<> unless the minimal extra overhead (probably a single extra word or pointer over your implementation) is really posing a problem. There's nothing that says you have to resize() a vector once it's been constructed.
However, there are several The advantages of going with vector<>:
it already handles copy, assignment & destruction properly - if you roll your own you need to ensure you handle these correctly
all the iterator support is there - again, you don't have to roll your own.
everybody already knows how to use it
If you really want to prevent the array from growing once constructed, you might want to consider having your own class that inherits from vector<> privately or has a vector<> member and only expose via methods that just thunk to the vector methods those bits of vector that you want clients to be able to use. That should help get you going quickly with pretty good assurance that leaks and what not are not there. If you do this and find that the small overhead of vector is not working for you, you can reimplement that class without the help of vector and your client code shouldn't need to change.
There are already many good thoughts mentioned here. But one is missing. Flexible Arrays are part of C99 and thus aren't part of C++, although some C++ compiler may provide this functionality there is no guarantee for that. If you find a way to use them in C++ in an acceptable way, but you have a compiler that doesn't support it, you perhaps can fallback to the "classical" way
If you are truly doing C++, there is no practical difference between a class and a struct except the default member visibility - classes have private visibility by default while structs have public visibility by default. The following are equivalent:
struct PacketStruct
{
unsigned int bitlength;
unsigned int data[];
};
class PacketClass
{
public:
unsigned int bitlength;
unsigned int data[];
};
The point is, you don't need the CreatePacket(). You can simply initialize the struct object with a constructor.
struct Packet
{
unsigned long bytelength;
unsigned char data[];
Packet(unsigned long length = 256) // default constructor replaces CreatePacket()
: bytelength(length),
data(new unsigned char[length])
{
}
~Packet() // destructor to avoid memory leak
{
delete [] data;
}
};
A few things to note. In C++, use new instead of malloc. I've taken some liberty and changed bitlength to bytelength. If this class represents a network packet, you'll be much better off dealing with bytes instead of bits (in my opinion). The data array is an array of unsigned char, not unsigned int. Again, this is based on my assumption that this class represents a network packet. The constructor allows you to create a Packet like this:
Packet p; // default packet with 256-byte data array
Packet p(1024); // packet with 1024-byte data array
The destructor is called automatically when the Packet instance goes out of scope and prevents a memory leak.
You probably want something lighter than a vector for high performances. You also want to be very specific about the size of your packet to be cross-platform. But you don't want to bother about memory leaks either.
Fortunately the boost library did most of the hard part:
struct packet
{
boost::uint32_t _size;
boost::scoped_array<unsigned char> _data;
packet() : _size(0) {}
explicit packet(packet boost::uint32_t s) : _size(s), _data(new unsigned char [s]) {}
explicit packet(const void * const d, boost::uint32_t s) : _size(s), _data(new unsigned char [s])
{
std::memcpy(_data, static_cast<const unsigned char * const>(d), _size);
}
};
typedef boost::shared_ptr<packet> packet_ptr;
packet_ptr build_packet(const void const * data, boost::uint32_t s)
{
return packet_ptr(new packet(data, s));
}
There's nothing whatsoever wrong with using vector for arrays of unknown size that will be fixed after initialization. IMHO, that's exactly what vectors are for. Once you have it initialized, you can pretend the thing is an array, and it should behave the same (including time behavior).
Disclaimer: I wrote a small library to explore this concept: https://github.com/ppetr/refcounted-var-sized-class
We want to allocate a single block of memory for a data structure of type T and an array of elements of type A. In most cases A will be just char.
For this let's define a RAII class to allocate and deallocate such a memory block. This poses several difficulties:
C++ allocators don't provide such API. Therefore we need to allocate plain chars and place the structure in the block ourselves. For this std::aligned_storage will be helpful.
The memory block must be properly aligned. Because in C++11 there doesn't seem to be API for allocating an aligned block, we need to slightly over-allocate by alignof(T) - 1 bytes and then use std::align.
// Owns a block of memory large enough to store a properly aligned instance of
// `T` and additional `size` number of elements of type `A`.
template <typename T, typename A = char>
class Placement {
public:
// Allocates memory for a properly aligned instance of `T`, plus additional
// array of `size` elements of `A`.
explicit Placement(size_t size)
: size_(size),
allocation_(std::allocator<char>().allocate(AllocatedBytes())) {
static_assert(std::is_trivial<Placeholder>::value);
}
Placement(Placement const&) = delete;
Placement(Placement&& other) {
allocation_ = other.allocation_;
size_ = other.size_;
other.allocation_ = nullptr;
}
~Placement() {
if (allocation_) {
std::allocator<char>().deallocate(allocation_, AllocatedBytes());
}
}
// Returns a pointer to an uninitialized memory area available for an
// instance of `T`.
T* Node() const { return reinterpret_cast<T*>(&AsPlaceholder()->node); }
// Returns a pointer to an uninitialized memory area available for
// holding `size` (specified in the constructor) elements of `A`.
A* Array() const { return reinterpret_cast<A*>(&AsPlaceholder()->array); }
size_t Size() { return size_; }
private:
// Holds a properly aligned instance of `T` and an array of length 1 of `A`.
struct Placeholder {
typename std::aligned_storage<sizeof(T), alignof(T)>::type node;
// The array type must be the last one in the struct.
typename std::aligned_storage<sizeof(A[1]), alignof(A[1])>::type array;
};
Placeholder* AsPlaceholder() const {
void* ptr = allocation_;
size_t space = sizeof(Placeholder) + alignof(Placeholder) - 1;
ptr = std::align(alignof(Placeholder), sizeof(Placeholder), ptr, space);
assert(ptr != nullptr);
return reinterpret_cast<Placeholder*>(ptr);
}
size_t AllocatedBytes() {
// We might need to shift the placement of for up to `alignof(Placeholder) - 1` bytes.
// Therefore allocate this many additional bytes.
return sizeof(Placeholder) + alignof(Placeholder) - 1 +
(size_ - 1) * sizeof(A);
}
size_t size_;
char* allocation_;
};
Once we've dealt with the problem of memory allocation, we can define a wrapper class that initializes T and an array of A in an allocated memory block.
template <typename T, typename A = char,
typename std::enable_if<!std::is_destructible<A>{} ||
std::is_trivially_destructible<A>{},
bool>::type = true>
class VarSized {
public:
// Initializes an instance of `T` with an array of `A` in a memory block
// provided by `placement`. Callings a constructor of `T`, providing a
// pointer to `A*` and its length as the first two arguments, and then
// passing `args` as additional arguments.
template <typename... Arg>
VarSized(Placement<T, A> placement, Arg&&... args)
: placement_(std::move(placement)) {
auto [aligned, array] = placement_.Addresses();
array = new (array) char[placement_.Size()];
new (aligned) T(array, placement_.Size(), std::forward<Arg>(args)...);
}
// Same as above, with initializing a `Placement` for `size` elements of `A`.
template <typename... Arg>
VarSized(size_t size, Arg&&... args)
: VarSized(Placement<T, A>(size), std::forward<Arg>(args)...) {}
~VarSized() { std::move(*this).Delete(); }
// Destroys this instance and returns the `Placement`, which can then be
// reused or destroyed as well (deallocating the memory block).
Placement<T, A> Delete() && {
// By moving out `placement_` before invoking `~T()` we ensure that it's
// destroyed even if `~T()` throws an exception.
Placement<T, A> placement(std::move(placement_));
(*this)->~T();
return placement;
}
T& operator*() const { return *placement_.Node(); }
const T* operator->() const { return &**this; }
private:
Placement<T, A> placement_;
};
This type is moveable, but obviously not copyable. We could provide a function to convert it into a shared_ptr with a custom deleter. But this will need to internally allocate another small block of memory for a reference counter (see also How is the std::tr1::shared_ptr implemented?).
This can be solved by introducing a specialized data type that will hold our Placement, a reference counter and a field with the actual data type in a single structure. For more details see my refcount_struct.h.
You should declare a pointer, not an array with an unspecified length.