What is the extra cost of overloading the placement new operator?

We want to overload the placement new operator just to verify that the memory provided is large enough for the given class. We know this size. The construct looks more or less like this:
#include <new>      // std::bad_alloc
#include <cstddef>  // size_t

template <size_t MAXSIZE>
class PlacementNewTest {
public:
    void* operator new (size_t size, void* where)
    {
        if (size > MAXSIZE) {
            throw std::bad_alloc();
        }
        return where;
    }
};
Let's say it is used in a simplified context like this:
char buffer[200];
class A : public PlacementNewTest<sizeof buffer> {
public:
    char a[100];
};
class B : public A {
public:
    char b[200];
};
int main() {
    A* a = new (buffer) A; // OK
    a->~A();
    B* b = new (buffer) B; // throws bad_alloc
    b->~B();
}
During the testing phase I have this PlacementNewTest<> class in use, but I am considering removing it from the release code. Based on your experience, how much will it cost us in performance not to remove this extra test class? Is the only cost of this verification the if (size > MAXSIZE) check?
In other words, what is the performance penalty of such a redefinition:
class PlacementNewNOP {
public:
    void* operator new (size_t size, void* where)
    {
        return where;
    }
};
Maybe it is not important for this question, but:
this is, and must be, C++03. We cannot upgrade to C++11, and Boost is not an option either; just plain C++03.

There shouldn't be any overhead apart from the comparison; unless you are using virtual methods, the binding is static.
Of course there is the exception overhead, but since throwing is something that shouldn't happen, you should be safe to ignore it.
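If even the comparison bothers you, here is a minimal sketch (my own, not from the question) of how the check could be compiled out of release builds while keeping the class in place, staying within C++03 and assuming the usual NDEBUG convention to distinguish builds:
#include <new>      // std::bad_alloc
#include <cstddef>  // size_t

template <size_t MAXSIZE>
class PlacementNewTest {
public:
    void* operator new (size_t size, void* where)
    {
#ifndef NDEBUG
        if (size > MAXSIZE) {       // checked only in debug builds
            throw std::bad_alloc();
        }
#endif
        return where;               // release builds: a plain pass-through
    }
};
In a release build the operator collapses to returning where, which is what the built-in placement new does anyway, so the remaining cost is at most a call that the compiler is very likely to inline.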

Related

C++17 polymorphic memory resources not working

I was trying to understand the C++17 pmr, so I wrote the code below, and it is not working as I thought. What could be going wrong?
template <typename T>
class Memory : public std::experimental::pmr::memory_resource {
public:
    Memory() { this->memory = allocate(sizeof(T), alignof(T)); }
    void *getMemory() { return this->memory; }
    ~Memory() { deallocate(this->memory, sizeof(T), alignof(T)); }
private:
    void *do_allocate(std::size_t bytes, std::size_t alignment)
    {
        memory = ::operator new(bytes);
    }
    void do_deallocate(void *p, std::size_t bytes, std::size_t alignment)
    {
        ::operator delete(memory);
    }
    bool do_is_equal(
        const std::experimental::pmr::memory_resource& other) const noexcept
    {
    }
    void *memory;
};
What could be going wrong with my implementation?
This is the client:
Memory<std::string> mem;
std::string * st = (std::string*)mem.getMemory();
st->assign("Pius");
std::cout << *st;
The polymorphic resource allocators allocate memory; that's all they do. Unlike container Allocators, they don't create objects. That's why they return void*s.
Memory resources are not meant to be used by themselves. That's why std::polymorphic_allocator<T> exists. You can also do the object creation/destruction yourself, using placement-new and manual destructor calls.
Also, your memory_resource implementation makes no sense. do_allocate should return the allocated memory, not store it internally. Your function provokes undefined behavior by returning nothing (which your compiler should have warned about).
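Putting both points together, here is a minimal sketch (mine, not the original poster's code) of a resource whose do_allocate returns the memory, used together with placement-new and a manual destructor call. It is written against the same std::experimental::pmr namespace as the question; with a full C++17 library it would be <memory_resource> and std::pmr instead.
#include <experimental/memory_resource>
#include <iostream>
#include <new>
#include <string>

class NewDeleteResource : public std::experimental::pmr::memory_resource {
    void *do_allocate(std::size_t bytes, std::size_t /*alignment*/) override
    {
        return ::operator new(bytes);       // return the storage; don't stash it
                                            // (extended alignment is ignored in this sketch)
    }
    void do_deallocate(void *p, std::size_t /*bytes*/, std::size_t /*alignment*/) override
    {
        ::operator delete(p);
    }
    bool do_is_equal(const std::experimental::pmr::memory_resource& other) const noexcept override
    {
        return this == &other;              // equal only to itself
    }
};

int main()
{
    NewDeleteResource res;
    void *raw = res.allocate(sizeof(std::string), alignof(std::string));
    std::string *st = new (raw) std::string("Pius");   // construct the object yourself
    std::cout << *st << "\n";
    st->~basic_string();                                // and destroy it yourself
    res.deallocate(raw, sizeof(std::string), alignof(std::string));
}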

C++, adding temporary objects to a list, without dynamic memory allocation

I'm writing code for an embedded platform; therefore, I cannot use the normal new operator.
Now I want to add arbitrary objects to a list, just like this.
tp.add(DerivedA("David"));
tp.add(DerivedB("Max"));
tp.add(DerivedC("Thomas"));
To avoid code duplication, I don't want to write something like this:
DerivedA david("David");
tp.add(david);
...
A solution, though not a very pretty one, would be this:
tp.add(new (myalloc(sizeof(DerivedB))) DerivedB("John"));
// using placement-new works
Now I tried to add a temporary object, passed by pointer:
tp.add(&DerivedA("David"));
Theoretically this could work, but the compiler complains (with good reason) about passing a pointer to a temporary object (-fpermissive).
Is there a clean way of doing what I want?
Here is a full example:
#include <iostream>
using namespace std;
class Base // base class
{
public:
    Base();
    int size;
    char name[100];
};
class Derived : public Base
{
public:
    Derived(char* name);
};
class ThirdParty
{
public:
    void add(Base* obj);
    void addTemp(Base* tempObj);
    Base* list[10];
    int index;
};
void* myalloc(int size) {
    void* p;
    // ...
    // allocate memory in a static memory pool
    // ...
    return p;
}
void memcpy(void* to, void* from, int size) {
}
int main()
{
    ThirdParty tp;
    // The ugly style:
    tp.add(new (myalloc(sizeof(Derived))) Derived("John")); // using placement-new works
    // The beauty style (compiler complains here):
    tp.addTemp(&Derived("David")); // create temporary object here, which is copied and added to the list
    tp.addTemp(&Derived("Max"));
    tp.addTemp(&Derived("Thomas"));
    return 0;
}
Base::Base()
{
    size = sizeof(Base);
}
Derived::Derived(char *name)
{
    size = sizeof(Derived); // make size of this object available for a base-pointer
}
void ThirdParty::add(Base *obj)
{
    list[index++] = obj;
}
void ThirdParty::addTemp(Base* tempObj)
{
    Base* newObj = (Base*) myalloc(tempObj->size); // let third party allocate memory
    memcpy(newObj, tempObj, tempObj->size); // copy the temporary object
    list[index++] = newObj;
}
If you use C++11, you could write a forwarding function to do the work for you:
template <typename T, typename... Args>
T* make (Args&&... args) {
    return new (myalloc(sizeof(T))) T { std::forward<Args>(args)... };
}
You'd then add an object to your list like so:
tp.add(make<Derived>("John"));
My preferred solution now is the following macro:
#define m(x) new (myalloc(sizeof(x))) x
Now I can add a new object with this code:
tp.add(m(Derived("Isabella")));
Can you not just overload operator new to use myalloc? If you do not want to do this globally, you can certainly do it for Base.
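A possible sketch of that suggestion (my own code, not from the original post): give Base a class-scope operator new that draws from the static pool via myalloc, so the plain-looking new expression works without touching the global operator.
class Base
{
public:
    Base();
    void* operator new(size_t size) { return myalloc((int)size); }  // class-scope allocation from the pool
    void operator delete(void*) {}                                   // pool memory is reclaimed elsewhere, if at all
    int size;
    char name[100];
};
With that in place, adding an object becomes tp.add(new Derived("John")); and the derived classes pick up the operator automatically.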

Using placement new in a container

I just came across some container implementation in C++. That class uses an internal buffer to manage its objects. This is a simplified version without safety checks:
template <typename E> class Container
{
public:
    Container() : buffer(new E[100]), size(0) {}
    ~Container() { delete [] buffer; }
    void Add() { buffer[size] = E(); size++; }
    void Remove() { size--; buffer[size].~E(); }
private:
    E* buffer;
    int size;
};
AFAIK this will construct/destruct E objects redundantly in Container() and ~Container() if new/delete are not customized. This seems dangerous.
Is using placement new in Add() the best way to prevent dangerous redundant constructor / destructor calls (apart from binding the class to a fully featured pool)?
When using placement new, would new char[sizeof(E)*100] be the correct way for allocating the buffer?
AFAIK this will construct/destruct E objects redundantly
It would appear so. The new[]ed array already default-constructs every element, and the delete[] calls the destructor for all of them as well. In effect, the Add() and Remove() methods do little other than maintain the size counter.
When using placement new, would new char[sizeof(E)*100] be the correct way for allocating the buffer?
The best option would be std::allocator, which already handles all of the memory issues for you (a sketch follows at the end of this answer).
Using a placement new and managing the memory yourself requires you to be aware of a number of issues (including);
Alignment
Allocated and used size
Destruction
Construction issues such as emplacement
Possible aliasing
None of these are impossible to surmount; it has just already been done in the standard library. If you are interested in pursuing a custom allocator, the global allocation functions (void* operator new (std::size_t count);) would be the appropriate starting point for the memory allocations.
Without further explanation on the original purpose of the code - a std::vector or a std::array would be far better options for managing the elements in the container.
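To illustrate the std::allocator suggestion above, here is a rough sketch (my code, not the original implementation) of the same container written against it: the allocator hands out raw, correctly typed storage, and element lifetimes are handled with placement-new and explicit destructor calls.
#include <memory>
#include <new>

template <typename E> class Container
{
public:
    Container() : buffer(alloc.allocate(100)), size(0) {}
    ~Container()
    {
        for (int i = 0; i < size; ++i)
            buffer[i].~E();             // destroy only the elements actually constructed
        alloc.deallocate(buffer, 100);  // release the raw storage
    }
    void Add() { new (buffer + size) E(); size++; }
    void Remove() { size--; buffer[size].~E(); }
private:
    std::allocator<E> alloc;            // declared first so it is ready when buffer is initialized
    E* buffer;
    int size;
};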
There are a number of issues with the code.
If you call Remove() before calling Add(), you will perform an assignment to an already-destroyed object.
And the delete[] buffer will call the destructor of all 100 objects in the array, some of whose destructors may already have been called.
Here's a valid program:
#include <iostream>
#include <new>
int counter = 0;
class Example {
public:
    Example() : ID(++counter) {
        std::cout << "constructing " << ID << std::endl;
    }
    ~Example() {
        std::cout << "destructing " << ID << std::endl;
        ID = -1;
    }
private:
    int ID;
};
template <typename E> class Container
{
public:
    Container() : buffer(new char [100*sizeof(E)]), size(0) {}
    ~Container() {
        for (size_t i = 0; i < size; ++i) {
            reinterpret_cast<E*>(buffer)[i].~E();
        }
        delete [] buffer;
    }
    void Add() { new (buffer + sizeof(E)*size) E(); size++; }
    void Remove() { reinterpret_cast<E*>(buffer)[--size].~E(); }
private:
    char* buffer;   // char*, not void*: pointer arithmetic and delete[] need a real type
    size_t size;
};
int main() {
    Container<Example> empty;
    Container<Example> single;
    Container<Example> more;
    single.Add();
    more.Add();
    more.Remove();
    more.Add();
    more.Add();
    more.Remove();
    return 0;
}

How to force return value optimization in msvc

I have a function in a class that I want the compiler to use NRVO on...all the time...even in debug mode. Is there a pragma for this?
Here is my class that works great in "release" mode:
template <int _cbStack> class CBuffer {
public:
    CBuffer(int cb) : m_p(0) {
        m_p = (cb > _cbStack) ? (char*)malloc(cb) : m_pBuf;
    }
    template <typename T> operator T () const {
        return static_cast<T>(m_p);
    }
    ~CBuffer() {
        if (m_p && m_p != m_pBuf)
            free(m_p);
    }
private:
    char *m_p, m_pBuf[_cbStack];
};
The class is used to make a buffer on the stack unless more than _cbStack bytes are required. Then, when it destructs, it frees memory if it allocated any. It's handy when interfacing with C functions that require a string buffer and you are not sure of the maximum size.
Anyway, I was trying to write a function that could return CBuffer, like in this test:
#include "stdafx.h"
#include <malloc.h>
#include <string.h>
template <int _cbStack> CBuffer<_cbStack> foo()
{
// return a Buf populated with something...
unsigned long cch = 500;
CBuffer<_cbStack> Buf(cch + 1);
memset(Buf, 'a', cch);
((char*)Buf)[cch] = 0;
return Buf;
}
int _tmain(int argc, _TCHAR* argv[])
{
auto Buf = foo<256>();
return 0;
}
I was counting on NRVO to make foo() fast. In release mode, it works great. In debug mode, it obviously fails, because there is no copy constructor in my class. I don't want a copy constructor, since CBuffer will be used by developers who like to copy everything 50 times. (Rant: these guys were using a dynamic array class to create a buffer of 20 chars to pass to WideCharToMultiByte(), because they seem to have forgotten that you can just allocate an array of chars on the stack. I don't know if they even know what the stack is...)
I don't really want to code up the copy constructor just so the code works in debug mode! It gets huge and complicated:
template <int _cbStack>
class CBuffer {
public:
    CBuffer(int cb) : m_p(0) { Allocate(cb); }
    CBuffer(CBuffer<_cbStack> &r) {
        int cb = (r.m_p == r.m_pBuf) ? _cbStack : ((int*)r.m_p)[-1];
        Allocate(cb);
        memcpy(m_p, r.m_p, cb);
    }
    CBuffer(CBuffer<_cbStack> &&r) {
        if (r.m_p == r.m_pBuf) {
            m_p = m_pBuf;
            memcpy(m_p, r.m_p, _cbStack);
        } else {
            m_p = r.m_p;
            r.m_p = NULL;
        }
    }
    template <typename T> operator T () const {
        return static_cast<T>(m_p);
    }
    ~CBuffer() {
        if (m_p && m_p != m_pBuf)
            free((int*)m_p - 1);
    }
protected:
    void Allocate(int cb) {
        if (cb > _cbStack) {
            m_p = (char*)malloc(cb + sizeof(int));
            *(int*)m_p = cb;
            m_p += sizeof(int);
        } else {
            m_p = m_pBuf;
        }
    }
    char *m_p, m_pBuf[_cbStack];
};
This pragma does not work:
#pragma optimize("gf", on)
Any ideas?
It is not hard to make your code both standards-conforming and working.
First, wrap an array of T with optional extra padding. Now you know the layout.
For ownership, use a unique_ptr instead of a raw pointer. If it is non-null, operator T* returns it; otherwise it returns the stack buffer. Now your default move constructor works, and the move covers the case where NRVO doesn't happen.
If you want to support non-POD types, a bit of work will let you support constructors and destructors and move the array elements and padding bit for bit.
The result will be a class that does not behave surprisingly and will not create bugs the first time someone tries to copy or move it - well, not the first time, that would be easy. The code as written will blow up in different ways at different times!
Obey the rule of three.
Here is an explicit example (now that I'm off my phone):
#include <cstddef>
#include <memory>

template <typename T, size_t bufSize = sizeof(T)>
struct CBuffer {
    typedef T value_type;
    explicit CBuffer(size_t count = 1, size_t extra = 0) {
        resize(count, extra);
    }
    void resize(size_t count, size_t extra = 0) {
        size_t amount = sizeof(value_type)*count + extra;
        if (amount > bufSize) {
            m_heapBuffer.reset( new char[amount] );
        } else {
            m_heapBuffer.reset();
        }
    }
    explicit operator value_type const* () const {
        return get();
    }
    explicit operator value_type* () {
        return get();
    }
    T* get() {
        return reinterpret_cast<value_type*>(getPtr());
    }
    T const* get() const {
        return reinterpret_cast<value_type const*>(getPtr());
    }
private:
    std::unique_ptr< char[] > m_heapBuffer;
    char m_Buffer[bufSize];
    char const* getPtr() const {
        if (m_heapBuffer)
            return m_heapBuffer.get();
        return &m_Buffer[0];
    }
    char* getPtr() {
        if (m_heapBuffer)
            return m_heapBuffer.get();
        return &m_Buffer[0];
    }
};
The above CBuffer supports move construction and move assignment, but not copy construction or copy assignment. This means you can return a local instance of these from a function. RVO may occur, but if it doesn't the above code is still safe and legal (assuming T is POD).
Before putting it into production myself, I would add some T must be POD asserts to the above, or handle non-POD T.
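For instance, a one-line version of that assert (my addition, assuming a C++11 compiler and <type_traits>) could sit inside the class body:
#include <type_traits>

// inside the CBuffer definition:
static_assert(std::is_pod<value_type>::value, "CBuffer requires a POD element type");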
As an example of use:
#include <cstring>
#include <iostream>

size_t fill_buff(size_t len, char* buff) {
    char const* src = "This is a string";
    size_t needed = strlen(src) + 1;
    if (len < needed)
        return needed;
    strcpy( buff, src );
    return needed;
}
void test1() {
    size_t amt = fill_buff(0, 0);
    CBuffer<char, 100> strBuf(amt);
    fill_buff( amt, strBuf.get() );
    std::cout << strBuf.get() << "\n";
}
And, for the (hopefully) NRVO'd case:
template<size_t n>
CBuffer<char, n> test2() {
    CBuffer<char, n> strBuf;
    size_t amt = fill_buff(0, 0);
    strBuf.resize(amt);
    fill_buff( amt, strBuf.get() );
    return strBuf;
}
which, if NRVO occurs (as it should), won't need a move; and if NRVO doesn't occur, the implicit move is logically equivalent to not having moved at all.
The point is that the code doesn't rely on NRVO for well-defined behavior. However, NRVO is almost certainly going to occur, and when it does it does something logically equivalent to taking the move-constructor path.
I didn't have to write such a move-constructor, because unique_ptr is move-constructable, as are arrays inside structs. Also note that copy-construction is blocked, because unique_ptr cannot be copy-constructed: this aligns with your needs.
In debug, it is quite possibly true that you'll end up doing a move-construct. But there shouldn't be any harm in that.
I don't think there is a publicly available, fine-grained compiler option that triggers only NRVO.
However, you can still manipulate the compiler optimization flags per source file, either through the project settings, the command line, or #pragma:
http://msdn.microsoft.com/en-us/library/chh3fb0k(v=vs.110).aspx
Try giving /O1 or /O2 to the file you want.
Also, the debug mode in Visual C++ is nothing but a configuration with no optimizations that generates debugging information (a PDB, program database, file).
If you are using Visual C++ 2010 or later, you can use move semantics to achieve an equivalent result. See How to: Write a Move Constructor.
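A minimal sketch of that idea applied to the original class (my code, not from the linked article): a move constructor alone, with copying still blocked, is enough to make return Buf; legal when NRVO doesn't kick in, and it avoids the size bookkeeping the copy constructor needed.
#include <cstdlib>
#include <cstring>

template <int _cbStack> class CBuffer {
public:
    CBuffer(int cb) : m_p(0) {
        m_p = (cb > _cbStack) ? (char*)malloc(cb) : m_pBuf;
    }
    // Move constructor: copy the small stack buffer, or steal the heap pointer.
    CBuffer(CBuffer&& r) {
        if (r.m_p == r.m_pBuf) {
            m_p = m_pBuf;
            memcpy(m_pBuf, r.m_pBuf, _cbStack);
        } else {
            m_p = r.m_p;
            r.m_p = 0;
        }
    }
    template <typename T> operator T () const {
        return static_cast<T>(m_p);
    }
    ~CBuffer() {
        if (m_p && m_p != m_pBuf)
            free(m_p);
    }
private:
    CBuffer(const CBuffer&);   // declared but never defined: copying stays an error
    char *m_p, m_pBuf[_cbStack];
};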

How to do the equivalent of memset(this, ...) without clobbering the vtbl?

I know that memset is frowned upon for class initialization. For example, something like the following:
class X {
public:
    X() { memset( this, 0, sizeof(*this) ) ; }
    ...
};
will clobber the vtbl if there's a virtual function in the mix.
I'm working on a (humongous) legacy codebase that is C-ish but compiled in C++, so all the members in question are typically POD and require no traditional C++ constructors. C++ usage gradually creeps in (like virtual functions), and this bites the developers that don't realize that memset has these additional C++ teeth.
I'm wondering if there is a C++-safe way to do an initial catch-all zero-initialization, which could then be followed by specific per-member initialization where zero initialization isn't appropriate.
I find the similar questions memset for initialization in C++, and zeroing derived struct using memset. Both of these have "don't use memset()" answers, but no good alternatives (esp. for large structures potentially containing many many members).
For each class where you find a memset call, add a memset member function which ignores the pointer and size arguments and assigns to all the data members.
Edit:
Actually, it shouldn't ignore the pointer; it should compare it to this. On a match, do the right thing for the object; on a mismatch, reroute to the global function.
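A minimal sketch of that idea (hypothetical class and members, not from the original codebase): the member function shadows the global memset for calls made inside the class, handles the "zero myself" case safely, and forwards everything else.
#include <cstring>

class X {
public:
    void* memset(void* dest, int ch, std::size_t count) {
        if (dest == this) {   // the legacy "zero myself" call: reset members instead
            a = 0;
            b = 0;
            return this;
        }
        return std::memset(dest, ch, count);   // anything else goes to the real memset
    }
private:
    int a;
    int b;
};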
You could always add constructors to these embedded structures so that they clear themselves, so to speak.
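For example, a small sketch with hypothetical names: the embedded C-style struct zeroes itself in its own constructor (it has no vtable, so memset is safe there), and the enclosing class with virtual functions never touches memset at all.
#include <cstring>

struct LegacyData {                  // plain data, no virtual functions
    int    count;
    double values[16];
    LegacyData() { std::memset(this, 0, sizeof(*this)); }   // safe: no vtbl to clobber
};

class Widget {
public:
    virtual void draw();             // the C++ machinery lives out here
private:
    LegacyData data;                 // zeroed by its own constructor
};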
Try this:
template <class T>
void reset(T& t)
{
    t = T();
}
This will zero your object, whether it is POD or not.
But do not do this:
A::A() { reset(*this); }
This would invoke A::A in infinite recursion!
Try this:
struct AData { ... all A members };
class A {
public:
    A() { reset(data); }
private:
    AData data;
};
This is hideous, but you could overload operator new/delete for these objects (or in a common base class) and have the implementation provide zeroed-out buffers. Something like this:
class HideousBaseClass
{
public:
    void* operator new( size_t nSize )
    {
        void* p = malloc( nSize );
        memset( p, 0, nSize );
        return p;
    }
    void operator delete( void* p )
    {
        if( p )
            free( p );
    }
};
One could also override the global new/delete operators, but this could have negative perf implications.
Edit: I just realized that this approach won't work for stack allocated objects.
Leverage the fact that a static instance is initialised to zero:
https://ideone.com/GEFKG0
#include <cstdio>   // printf

template <class T>
struct clearable
{
    void clear()
    {
        static T _clear;
        *((T*)this) = _clear;
    };
};
class test : public clearable<test>
{
public:
    int a;
};
int main()
{
    test _test;
    _test.a = 3;
    _test.clear();
    printf("%d", _test.a);
    return 0;
}
However the above will cause the constructor (of the templatised class) to be called a second time.
For a solution that causes no ctor call this can be used instead: https://ideone.com/qTO6ka
#include <cstdlib>   // calloc

template <class T>
struct clearable
{
    void *cleared;
    clearable() : cleared(calloc(sizeof(T), 1)) {}
    void clear()
    {
        *((T*)this) = *((T*)cleared);
    };
};
...and if you're using C++11 onwards the following can be used: https://ideone.com/S1ae8G
template <class T>
struct clearable
{
    void clear()
    {
        *((T*)this) = {};
    };
};
The best solution I could find is to create a separate struct where you put the members that must be memset to zero. I'm not sure whether this design is suitable for you.
This struct has no vtable and extends nothing; it is just a chunk of data, so memsetting it is safe.
I have made an example:
#include <iostream>
#include <cstring>
struct X_c_stuff {
    X_c_stuff() {
        memset(this, 0, sizeof(*this));   // sizeof(*this), not sizeof(this): zero the whole struct
    }
    int cMember;
};
class X : private X_c_stuff {
public:
    X()
    : normalMember(3)
    {
        std::cout << cMember << normalMember << std::endl;
    }
private:
    int normalMember;
};
int main() {
    X a;
    return 0;
}
You can use pointer arithmetic to find the range of bytes you want to zero out:
class Thing {
public:
    Thing() {
        memset(&data1, 0, (char*)&lastdata - (char*)&data1 + sizeof(lastdata));
    }
private:
    int data1;
    int data2;
    int data3;
    // ...
    int lastdata;
};
(Edit: I originally used offsetof() for this, but a comment pointed out that this is only supposed to work on PODs, and then I realised that you can just use the member addresses directly.)