Do not ask me what I'm trying to do, this is just a quick test and its only purpose is to see if there is something wrong with placement new.
I've found an issue, or I just misunderstood something.
#include <cstdlib>
#include <new>
#include <vector>
using namespace std;
#define WORKS
int main(int argc, char** argv) {
vector<int>* pp = (vector<int>*)malloc(sizeof(vector<int>)*20);
#ifdef WORKS
for(int i = 0; i < 20; ++i)
new (pp+i) vector<int>;
#else
new (pp) vector<int>[20];
#endif
for(int i = 0; i < 20; ++i)
pp[i].~vector<int>();
}
When you remove the "#define WORKS" line, the program crashes with an access violation. In other words,
for(int i = 0; i < 20; ++i)
new (pp+i) vector<int>;
which works fine, behaves differently from
new (pp) vector<int>[20];
which causes an exception to be thrown at the destruction stage. What's going on here?
I'm working on Windows XP and building with VC++ Express 2010.
§5.3.4/12:
-- new T[5] results in a call of operator new[](sizeof(T)*5+x)
[ ... ]
Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. [ emphasis added ]
To summarize, trying to place the array may require some unspecified amount of overhead that you're not allocating. As long as you place the elements individually, no such overhead is allowed, so the placement new works.
The result of a new expression does not have to be at the same address passed to the placement new operator. And, you are not guaranteed that the size required to allocate an array is strictly the size of a single element times the number of elements.
5.3.4:
A new-expression passes the amount
of space requested to the allocation
function as the first argument of type
std::size_t. That argument shall be
no less than the size of the object
being created; it may be greater
than the size of the object being
created only if the object is an
array.
So, the more correct version of your code would be:
void *ppstorage= malloc(sizeof(vector<int>)*20);
vector<int>* pp= new (ppstorage) vector<int>[20];
for(int i = 0; i < 20; ++i)
pp[i].~vector<int>();
Although you will almost certainly write past the end of ppstorage. The compiler has to store the count of the array somewhere to properly destruct each element, and for MSVC that is stored before the address returned by the new expression.
In theory, you could overload operator new[] to get the actual allocation size of an array:
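// The first parameter of any operator new[] must be the size; placement arguments follow it.
// noexcept lets the new-expression treat the null return as failure and skip construction.
void *operator new[](size_t size, size_t *allocation_size) noexcept
{
*allocation_size= size;
return nullptr;
}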
But I have never tried this.
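As a rough sketch of that idea (untested against MSVC, and with hypothetical names throughout): the overload below takes a `std::size_t*` placement argument, records the size the compiler requested, and returns null. Because it is declared `noexcept`, a null return means no elements are constructed and the new-expression itself yields null. `array_new_request` is an illustrative helper, not a standard facility.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical placement overload: reports the requested size, allocates nothing.
void* operator new[](std::size_t size, std::size_t* requested) noexcept {
    *requested = size;
    return nullptr;
}

// Matching placement delete, provided for symmetry; it is never called here.
void operator delete[](void*, std::size_t*) noexcept {}

// Returns how many bytes the compiler asks for when creating T[count].
template <typename T>
std::size_t array_new_request(std::size_t count) {
    std::size_t requested = 0;
    T* p = new (&requested) T[count];  // null result: no objects are constructed
    assert(p == nullptr);
    (void)p;
    return requested;
}
```

Comparing `array_new_request<std::vector<int>>(20)` with `20 * sizeof(std::vector<int>)` shows how much overhead, if any, a given compiler adds. The value is unspecified and may differ between types and compilers.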
When you use operator new[] you must deallocate with operator delete[]. You cannot allocate with new[] and then deallocate one by one. So instead of your deallocation loop, you would do this:
delete [] pp;
I'm having trouble allocating memory on the heap for an array of strings. Allocating with new works but malloc segfaults each time. The reason I want to use malloc in the first place is that I don't want to call the constructor unnecessarily.
This works fine
std::string* strings = new std::string[6];
This doesn't
std::string* strings = (std::string *)malloc(sizeof(std::string[6]));
One thing I've noticed is that the first variant (using new) allocates 248 bytes of memory while the second allocates only 240. This 8 byte difference is constant no matter the size of the array from what I've gathered and I cannot find what the source of the difference is.
Here's the whole code that segfaults.
#include <cstdlib>
#include <iostream>
#include <string>
void* operator new(size_t size)
{
std::cout << size << std::endl;
return malloc(size);
}
void* operator new [](size_t size)
{
std::cout << size << std::endl;
return malloc(size);
}
int main() {
std::string* strings = new std::string[6];
strings = (std::string *)malloc(sizeof(std::string[6]));
strings[0] = std::string("test");
return 0;
}
Another thing I've noticed is that the above code seems to work if I use memset after malloc to set all of the bytes I allocated with malloc to 0. I don't understand where the 8 byte difference comes from if this works and also why this variant works at all. Why would it work just because I set all of the bytes to 0?
malloc() only allocates raw memory, but it does not construct any objects inside of that memory.
new and new[] both allocate memory and construct objects.
If you really want to use malloc() to create an array of C++ objects (which you really SHOULD NOT do!), then you will have to call the object constructors yourself using placement-new, and also call the object destructors yourself before freeing the memory, eg:
std::string* strings = static_cast<std::string*>(
malloc(sizeof(std::string) * 6)
);
for(int i = 0; i < 6; ++i) {
new (&strings[i]) std::string;
}
...
for(int i = 0; i < 6; ++i) {
strings[i].~basic_string(); // std::string is std::basic_string<char>; "~std::string()" is not valid syntax
}
free(strings);
In C++11 and C++14, you should use std::aligned_storage to help calculate the necessary size of the array memory, eg:
using string_storage = std::aligned_storage<sizeof(std::string), alignof(std::string)>::type;
void *buffer = malloc(sizeof(string_storage) * 6);
std::string* strings = reinterpret_cast<std::string*>(buffer);
for(int i = 0; i < 6; ++i) {
new (&strings[i]) std::string;
}
...
for(int i = 0; i < 6; ++i) {
strings[i].~basic_string(); // std::string is std::basic_string<char>; "~std::string()" is not valid syntax
}
free(buffer);
In C++17 and later, you should use std::aligned_alloc() instead of malloc() directly, eg:
std::string* strings = static_cast<std::string*>(
std::aligned_alloc(alignof(std::string), sizeof(std::string) * 6)
);
for(int i = 0; i < 6; ++i) {
new (&strings[i]) std::string;
}
...
for(int i = 0; i < 6; ++i) {
strings[i].~basic_string(); // std::string is std::basic_string<char>; "~std::string()" is not valid syntax
}
std::free(strings);
Allocating via new means the constructor is run. Please always use new and delete with C++ classes (and std::string is a C++ class), whenever you can.
When you use malloc() / free(), only memory allocation is done; no constructor or destructor is run, so the object is not initialized. Technically you can still use placement new (i.e., new(pointer) Type) to initialize it, but it's simpler and less error-prone to use plain new.
If you want to allocate multiple objects, that's what containers are for. Please use them. Many highly skilled engineers have worked on std::vector<>, std::array<>, std::set<> and std::map<> to make them correct and efficient. It's very hard to beat them on performance, stability or other metrics and, even if you do, the next programmer at the same company has to learn your custom data structures. So prefer a container over custom, locally implemented allocation unless there is a very strong reason (or, of course, for didactic purposes).
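For the six-string case from the question, a container version might look like the sketch below. `make_strings` is a hypothetical helper name; the vector constructs every element and destroys them all when it goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds `count` empty strings and assigns the first one.
std::vector<std::string> make_strings(std::size_t count) {
    std::vector<std::string> strings(count);  // constructs `count` empty strings
    assert(strings.size() == count);
    if (count > 0) {
        strings[0] = "test";  // safe: the element is a fully constructed std::string
    }
    return strings;
}
```

Calling `make_strings(6)` gives the same layout the question was after, with no manual constructor or destructor calls.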
I'm experimenting with placement new to try to understand how it works.
Executing the code below:
#include <cstdlib>
#include <iostream>
#include <memory>
#include <string>
#include <utility>
#define SHOW(x) std::cout << #x ": " << x << '\n'
template<typename T>
static T *my_realloc(T *ptr, size_t count) {
return (T *) realloc((void *) ptr, count * sizeof(T));
}
template<typename T>
static void my_free(T *ptr) {
free((T *) ptr);
}
int main() {
constexpr int count = 40;
int cap = 0;
int size = 0;
std::string *strs = nullptr;
auto tmp_str = std::string();
for(int i = 0; i < count; i++) {
tmp_str = std::to_string(i);
if(size == cap) {
if(cap == 0)
cap = 1;
else
cap *= 2;
strs = my_realloc(strs, cap);
}
new (&strs[size++]) std::string(std::move(tmp_str));
}
for(int i = 0; i < size; i++)
SHOW(strs[i]);
std::destroy_n(strs, size);
my_free(strs);
return 0;
}
I get the errors:
Invalid read of size 1
Invalid free() / delete / delete[] / realloc()
Removing the line
std::destroy_n(strs, size);
the invalid free error goes away, but somehow all of the program's memory is freed and no leaks are reported. I can't find where the std::string destructor is called in the program.
If you want to store non-trivial types (such as std::string), then realloc simply cannot be used. You will find that standard library containers like e.g. std::vector will also not use it.
realloc may either extend the current allocation, without moving it in memory, or it might internally make a new allocation in separate memory and copy the contents of the old allocation to the new one. This step is performed as if by std::memcpy. The problem here is that std::memcpy will only work to actually create new objects implicitly in the new allocation and copy the values correctly if the type is trivially-copyable and if it and all of its subobjects are implicit-lifetime types. This definitively doesn't apply to std::string or any other type that manages some (memory) resource.
You are also forgetting to check the return value of realloc. If allocation fails, it returns a null pointer and leaves the old block untouched. Using that result regardless causes a null pointer dereference, and overwriting strs with it leaks the old allocation.
Instead of using realloc you should make a new allocation for the new size, then placement-new copies of the objects already in the old allocation into the new one and then destroy the objects in the old allocation and free it.
If you want to guarantee that there won't be memory leaks when exceptions are thrown things become somewhat complicated. I suggest you look at the std::vector implementation of one of the standard library implementations if you want to figure out the details.
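A minimal sketch of that allocate, move, destroy, free sequence, under some stated assumptions: `grow` and `join_numbers` are hypothetical names, the element type is fixed to std::string (whose move constructor is noexcept), and exception safety is only partly handled (an exception thrown while building a new element in `join_numbers` would leak the buffer).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Moves `size` strings into a fresh block of `new_cap` elements,
// destroys the originals and frees the old block.
std::string* grow(std::string* old, std::size_t size, std::size_t new_cap) {
    assert(new_cap >= size);
    void* raw = std::malloc(new_cap * sizeof(std::string));
    if (raw == nullptr) throw std::bad_alloc();  // the old block is still intact
    std::string* fresh = static_cast<std::string*>(raw);
    for (std::size_t i = 0; i < size; ++i) {
        new (fresh + i) std::string(std::move(old[i]));  // noexcept move
        old[i].~basic_string();                          // end the old object's lifetime
    }
    std::free(old);  // free(nullptr) is a no-op on the first call
    return fresh;
}

// Usage check: builds "0,1,...,count-1" through the hand-rolled buffer.
std::string join_numbers(int count) {
    std::string* strs = nullptr;
    std::size_t size = 0, cap = 0;
    for (int i = 0; i < count; ++i) {
        if (size == cap) {
            cap = (cap == 0) ? 1 : cap * 2;
            strs = grow(strs, size, cap);
        }
        new (strs + size++) std::string(std::to_string(i));
    }
    std::string out;
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0) out += ',';
        out += strs[i];
    }
    for (std::size_t i = 0; i < size; ++i) strs[i].~basic_string();
    std::free(strs);
    return out;
}
```

Unlike realloc, no string is ever copied byte-for-byte. Each one is moved by its own constructor, so its internal pointers stay valid.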
strs = my_realloc(strs, cap);
strs is a pointer to a std::string, and this will result in the memory it points to being realloc()ed.
This is undefined behavior. C++ objects cannot be malloced, realloced, or freed. Using a wrapper function, or placement new, at some point along the way, does not change that.
Everything from this point on is undefined behavior, and the resulting consequences are of no importance.
Well, first of all I want to know if the following is 'legal' or at least 'not evil'.
Second, I want to know what is happening internally to make it possible! It is amazing and quite strange. I know that a pointer can be seen as an indirect way to access an object through its memory address, so I suppose that is why a pointer can be redirected to point to both single dynamic objects and arrays, even after its declaration and multiple assignments.
Example:
#include <iostream>
#include <string>
int main()
{
size_t size;
std::cout << "Enter a size for an array: ";
std::cin >> size;
/*Creating a pointer to a single dynamic string*/
std::string *pointer = new std::string();
pointer->append("Some text");
std::cout << pointer << std::endl;
delete pointer;
pointer = NULL;
/*setting it to be an array*/
pointer = new std::string[size];
for (size_t i = 0; i < size; i++)
pointer[i] = "Number :" + std::to_string(i);
std::cout << std::endl;
for (size_t i = 0; i < size; i++)
std::cout << pointer[i] << std::endl;
}
Many thanks.
Note:
I tried to repeat this by creating a template class with a lambda inside (using typedef) and it did not work.
The code has defined behavior and works fine. There is no problem with it, except that you are leaking the last allocation, which is not really optimal (but legal). Add a delete[] pointer; at the end.
The array version of new doesn't return the newly created array or a pointer or reference to it. Instead it returns a pointer to the first element in the newly created array.
pointer = new std::string[size];
After this, pointer will point to the first std::string of the newly created std::string[size] array. Just looking at pointer you won't know whether the std::string that it points to is part of an array or not.
This is why you can reference both arrays and single objects through the same pointer type. The same works with automatic arrays, which decay to pointers to their first element when assigned to a pointer.
This is also why you as the programmer need to remember which pointer points to a single object allocated with new and which one points to an array allocated with new[] and how long that array is (because in the former case you need to delete it with delete and in the latter with delete[]).
In practice you should not use raw new like this for that reason (and others). Instead use containers like std::vector if you need dynamically-sized arrays and std::unique_ptr if you need dynamically allocated single objects (although there is also a std::unique_ptr version for arrays).
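As an illustration of that advice, the question's two allocations could be written roughly as follows. `numbered` and `single` are hypothetical helper names, and `std::make_unique` assumes C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Dynamically sized array: the vector knows its own length and frees itself.
std::vector<std::string> numbered(std::size_t size) {
    std::vector<std::string> result;
    result.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        result.push_back("Number :" + std::to_string(i));
    }
    assert(result.size() == size);
    return result;
}

// Single dynamic object: unique_ptr calls delete automatically.
std::unique_ptr<std::string> single() {
    auto p = std::make_unique<std::string>();
    p->append("Some text");
    return p;
}
```

Neither function needs a matching delete or delete[], so the delete-versus-delete[] bookkeeping described above goes away.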
Not sure if "header" is the correct term, so please correct me if it isn't.
I'm trying to first allocate memory, and then use an overloaded (placement) new[] operator to initialize an array of class objects (say, MyClass).
Say, the size of a MyClass object is 0x68, and I want an array of 0x20. So the total size is sizeof(MyClass) * 0x20 = 0xD00, or so I thought.
Now when I use my overloaded placement new[] operator :
pArr = new(pAllocatedMem)MyClass[0x20];
The size_t that the compiler passes to the new[] operator is actually 0xD08. There are an extra 8 bytes. Looking at the value of those 8 bytes, they are used to store the size of the array (0x20 in this case).
So is there a constant definition of what this header size is, say from the WDK, that I can use? Does this size change depending on the compiler, or on anything else?
The amount of that extra space, if any, is compiler dependent. From the language definition [expr.new, paragraph 15]:
[It] is a non-negative unspecified value representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[].
It is typically used by the runtime to know how many objects to destroy when delete is eventually called on the array.
Adding to the answer myself.
Did a little test with a colleague. It appears that if the class to be initialized has a user-defined destructor, the VS compiler needs that overhead to store the array size. No destructor, no overhead. Is that a bug or a feature?
#include <malloc.h>
#include <new>
class Foo
{
public:
//Foo() {}
//~Foo() {} // <-- will cause overhead even with user-allocated memory passed to placement new()
int a;
};
int main()
{
int n = 0x10;
size_t size = sizeof(int) * 2 + sizeof(Foo) * n;
void* p = malloc(size);
*((int*)p) = 0xaaaaaaaa;
*(int*)((char*)p + size - sizeof(int)) = 0xbbbbbbbb;
// Placement new with user-allocated memory
Foo* pf = new ((char*)p + sizeof(int)) Foo[n];
for (int i = 0; i < n; i++)
{
pf[i].a = i;
}
free(p);
return 0;
}
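One hedged way to observe this without writing past a buffer is to let a class-level operator new[] record the size it is handed. In the sketch below, `Recorder`, `Trivial`, `WithDtor`, `g_last_request` and `overhead` are all hypothetical names. The reported overhead is unspecified by the standard, so the numbers will vary by compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Records the size of the most recent array allocation made through Recorder.
static std::size_t g_last_request = 0;

struct Recorder {
    static void* operator new[](std::size_t size) {
        g_last_request = size;
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};

struct Trivial : Recorder { int a; };                    // no user-defined destructor
struct WithDtor : Recorder { int a; ~WithDtor() {} };    // user-defined destructor

// Returns the extra bytes requested beyond n * sizeof(T) for new T[n].
template <typename T>
std::size_t overhead(std::size_t n) {
    T* p = new T[n];
    assert(g_last_request >= n * sizeof(T));
    std::size_t extra = g_last_request - n * sizeof(T);
    delete[] p;
    return extra;
}
```

Printing `overhead<Trivial>(16)` next to `overhead<WithDtor>(16)` on a given compiler shows whether the destructor is what triggers the stored element count there.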
I'm wondering whether built-in types in objects created on the heap with new will be initialized to zero. Is it mandated by the standard or is it compiler specific?
Given the following code:
#include <iostream>
using namespace std;
struct test
{
int _tab[1024];
};
int main()
{
test *p(new test);
for (int i = 0; i < 1024; i++)
{
cout << p->_tab[i] << endl;
}
delete p;
return 0;
}
When run, it prints all zeros.
You can choose whether you want default-initialisation, which leaves fundamental types (and POD types in general) uninitialised, or value-initialisation, which zero-initialises fundamental (and POD) types.
int * garbage = new int[10]; // No initialisation
int * zero = new int[10](); // Initialised to zero.
This is defined by the standard.
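Applied to the struct from the question, a small sketch of the value-initialised form could look like this (`value_initialised_is_zero` is a hypothetical name). The default-initialised form is deliberately not read here, since reading indeterminate values is undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct test {
    int _tab[1024];
};

// Checks that the trailing "()" zero-initialises both a struct and an int array.
bool value_initialised_is_zero() {
    std::unique_ptr<test> p(new test());      // value-initialisation: members zeroed
    std::unique_ptr<int[]> a(new int[10]());  // value-initialisation: elements zeroed
    for (int v : p->_tab) {
        if (v != 0) return false;
    }
    for (std::size_t i = 0; i < 10; ++i) {
        if (a[i] != 0) return false;
    }
    return true;
}
```

With `new test` (no parentheses) the same loop might still print zeros, as in the question, but only because the memory happened to contain zeros.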
No, if you do something like this:
int *p = new int;
or
char *p = new char[20]; // array of 20 bytes
or
struct Point { int x; int y; };
Point *p = new Point;
then the memory pointed to by p will have indeterminate/uninitialized values.
However, if you do something like this:
std::string *pstring = new std::string();
Then you can be assured that the string will have been initialized as an empty string, but that is because of how class constructors work, not because of any guarantees about heap allocation.
It's not mandated by the standard. The memory for the primitive type members may contain any value that was last left in memory.
Some compilers I guess may choose to initialize the bytes. Many do in debug builds of code. They assign some known byte sequence to give you a hint when debugging that the memory wasn't initialized by your program code.
Using calloc will return bytes initialized to 0, but that's not C++-specific: calloc has been around since C, along with malloc. However, you will pay a run-time overhead for using calloc.
The advice given previously about using the std::string is quite sound, because after all, you're using the std, and getting the benefits of class construction/destruction behaviour. In other words, the less you have to worry about, like initialization of data, the less that can go wrong.