C++ string in a C struct: is it illegal?

struct run_male_walker_struct {
string male_user_name;
string show_name;
};
typedef struct run_male_walker_struct run_male_walker_struct_t;
in another function:
run_male_walker_struct_t *p = malloc(sizeof(struct run_male_walker_struct));
Question: is this illegal? Since string is a class, its size can't be determined by sizeof().

This is illegal, but not for the reasons you're thinking.
The difference between std::malloc()/std::free() and new/delete is that the latter will call constructors/destructors, while the former won't. The expression
void* p = std::malloc(sizeof(run_male_walker_struct))
will return a blob of uninitialized memory on which no constructor is called. You shouldn't touch it with a ten foot pole - except for invoking a constructor on it:
run_male_walker_struct* pw = new(p) run_male_walker_struct;
If you do this, you will have to do the reverse, too:
pw->~run_male_walker_struct();
before you free the memory:
std::free(p);
However, that leaves the question why you want to do that.
The only reason to do this should be when you want to separate memory allocation from construction (like, for example, in a pool allocator). But if you need that, it's best hidden behind some interface. A natural one would be overloading new and delete per class. Also, std::vector does this internally.
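As a minimal sketch of the "overload new and delete per class" idea, the struct below routes its allocation through std::malloc and std::free while the new and delete expressions still run the constructors and destructors. The counter g_live_blocks and the helper live_blocks_during_use are assumptions added only to make the effect observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical counter so the effect is observable; not part of the original code.
static int g_live_blocks = 0;

struct run_male_walker_struct {
    std::string male_user_name;
    std::string show_name;

    // Class-specific allocation: only raw memory is obtained here.
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++g_live_blocks;
        return p;
    }

    // Class-specific deallocation: the destructor has already run by now.
    static void operator delete(void* p) noexcept {
        if (p) --g_live_blocks;
        std::free(p);
    }
};

// Allocates through the overloads above; the new/delete expressions
// themselves still invoke the constructors and destructors of the strings.
inline int live_blocks_during_use() {
    run_male_walker_struct* p = new run_male_walker_struct;
    p->male_user_name = "someone";
    int during = g_live_blocks;
    delete p;
    assert(g_live_blocks == during - 1);
    return during;
}
```

Callers keep writing plain new and delete; the separation of allocation from construction stays hidden inside the class.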

Not really sure what you're asking here... Just to be clear, the struct keyword is a valid C++ designation, that functions nearly identically to class except for the default privacy. So if you're compiling with g++, and including the string library, this is a valid statement.
However, calling malloc() will just give you the memory, not actually construct the values inside that struct. You could more appropriately instantiate it with new, which calls its default constructor.

The struct definition itself is fine. It results in a non-POD aggregate. But you should prefer the use of new and delete over malloc and free because these handle construction and destruction properly. If you want to keep using malloc and free you have to use the placement-new to properly construct the object and invoke the destructor manually to destroy it before you free it:
#include <new>
...
run_male_walker_struct *p = (run_male_walker_struct*)
malloc(sizeof(run_male_walker_struct));
new(p) run_male_walker_struct; // <-- placement-new
...
p->~run_male_walker_struct(); // <-- explicit destructor call
free(p);
Or simply:
run_male_walker_struct *p = new run_male_walker_struct;
...
delete p;
BTW: the typedef is not necessary in C++

Try not to use malloc if you are in C++.
Using new is a better alternative. Many implementations of operator new call malloc internally, although the standard does not require it.
The advantage of new is that it also calls the constructor of the class being instantiated.
Another minor comment: the code you provided should not compile as C++:
run_male_walker_struct_t *p = malloc(sizeof(struct run_male_walker_struct));
Should be
run_male_walker_struct_t *p = (run_male_walker_struct_t*)malloc(sizeof(struct run_male_walker_struct));
This is because malloc returns a void*, which C++ does not implicitly convert to another pointer type.

Using malloc() would work, but using it will only create enough space for your struct.
This means that you will not be able to use your strings properly, because they weren't initialised with their constructors.
Note that string classes typically keep their contents in dynamically allocated memory rather than inside the object, so the contents don't affect the size of the struct. All classes and structs have a static size that is known at compile time (provided the struct/class has been defined).
I would suggest using new. Using malloc will stuff up the strings.
This raises a question of my own, how did constructors get called on dynamically allocated instantiation in C (were there no such things as constructors in C?). If so, yet another reason against using pure C.

How about
run_male_walker_struct_t * p = new run_male_walker_struct_t;

I'm fairly sure this is legal because the size of the std::string object will be known even if the lengths of the strings are not known. The results may not be what you expect though because malloc won't call constructors.
Try this:
std::string testString1("babab");
std::string testString2("12345678");
std::string testString3;
std::cout <<" sizeof(testString1)" <<sizeof(testString1) << std::endl;
std::cout <<" sizeof(testString2)" <<sizeof(testString2) << std::endl;
std::cout <<" sizeof(testString3)" <<sizeof(testString3) << std::endl;
On my machine this gives me the following output:
sizeof(testString1)8
sizeof(testString2)8
sizeof(testString3)8
Also is there some reason you are not using:
run_male_walker_struct_t *p = new(struct run_male_walker_struct);
This is the correct way to do it in c++, using malloc is almost certainly a mistake.
EDIT: see this page for a more detailed explanation of new vs malloc in c++:
http://www.codeproject.com/KB/tips/newandmalloc.aspx
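For comparison with the malloc version, here is a small sketch using std::make_unique (C++14), which runs the string constructors and releases the object automatically. The helper name make_walker is an assumption for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct run_male_walker_struct {
    std::string male_user_name;
    std::string show_name;
};

// Returns an owning pointer; both strings are default-constructed (empty)
// and are destroyed automatically when the unique_ptr goes out of scope.
inline std::unique_ptr<run_male_walker_struct> make_walker(const std::string& name) {
    auto p = std::make_unique<run_male_walker_struct>();
    assert(p->show_name.empty());
    p->male_user_name = name;
    return p;
}
```

No explicit delete is needed at the call site, which removes the new/delete pairing problem as well as the missing-constructor problem.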

The answer depends on what you mean by a "C struct".
If you mean "a struct that is valid under the C language", then the answer is obviously: it contains a datatype that isn't valid C, and so the struct itself isn't valid either.
If you mean a C++ POD type, then the answer is no, it is not illegal, but the struct is no longer a POD type (because in order to be POD, all its members must be POD as well, and std::string isn't)
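The POD distinction can be checked at compile time. The sketch below assumes a hypothetical C-compatible variant, plain_walker, and a helper pod_like that combines the two properties POD is built from (trivial and standard-layout).

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical C-compatible variant, shown only for contrast.
struct plain_walker {
    char male_user_name[32];
    char show_name[32];
};

struct run_male_walker_struct {
    std::string male_user_name;
    std::string show_name;
};

// POD is "trivial and standard-layout"; this helper name is an assumption.
template <class T>
constexpr bool pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(pod_like<plain_walker>(), "char arrays keep the struct POD");
static_assert(!std::is_trivial<run_male_walker_struct>::value,
              "std::string members make the struct non-trivial");

// memset is only appropriate for the POD-like variant.
inline plain_walker zeroed_plain_walker() {
    plain_walker w;
    std::memset(&w, 0, sizeof w);
    assert(w.show_name[0] == '\0');
    return w;
}
```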


Initializing an array of trivially_copyable but not default_constructible objects from bytes. Confusion in [intro.object]

We are initializing (large) arrays of trivially_copyable objects from secondary storage, and questions such as this or this leave us with little confidence in our implemented approach.
Below is a minimal example to try to illustrate the "worrying" parts in the code.
Please also find it on Godbolt.
Example
Let's have a trivially_copyable but not default_constructible user type:
struct Foo
{
Foo(double a, double b) :
alpha{a},
beta{b}
{}
double alpha;
double beta;
};
Trusting cppreference:
Objects of trivially-copyable types that are not potentially-overlapping subobjects are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().
Now, we want to read a binary file into a dynamic array of Foo. Since Foo is not default constructible, we cannot simply:
std::unique_ptr<Foo[]> invalid{new Foo[dynamicSize]}; // Error, no default ctor
Alternative (A)
Using uninitialized unsigned char array as storage.
std::unique_ptr<unsigned char[]> storage{
new unsigned char[dynamicSize * sizeof(Foo)] };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << reinterpret_cast<Foo *>(storage.get())[index].alpha << "\n";
Is there UB because objects of actual type Foo are never explicitly created in storage?
Alternative (B)
The storage is explicitly typed as an array of Foo.
std::unique_ptr<Foo[]> storage{
static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
This alternative was inspired by this post. Yet, is it better defined? It seems there is still no explicit creation of objects of type Foo.
It is notably getting rid of the reinterpret_cast when accessing the Foo data member (this cast might have violated the Type Aliasing rule).
Overall Questions
Are any of these alternatives defined by the standard? Are they actually different?
If not, is there a correct way to implement this (without first initializing all Foo instances to values that will be discarded immediately after)?
Is there any difference in undefined behaviours between versions of the C++ standard?
(In particular, please see this comment with regard to C++20)
What you're trying to do ultimately is create an array of some type T by memcpying bytes from elsewhere without default constructing the Ts in the array first.
Pre-C++20 cannot do this without provoking UB at some point.
The problem ultimately comes down to [intro.object]/1, which defines the ways objects get created:
An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]).
If you have a pointer of type T*, but no T object has been created in that address, you can't just pretend that the pointer points to an actual T. You have to cause that T to come into being, and that requires doing one of the above operations. And the only available one for your purposes is the new-expression, which requires that the T is default constructible.
If you want to memcpy into such objects, they must exist first. So you have to create them. And for arrays of such objects, that means they need to be default constructible.
So if it is at all possible, you need a (likely defaulted) default constructor.
In C++20, certain operations can implicitly create objects (provoking "implicit object creation" or IOC). IOC only works on implicit lifetime types, which for classes:
A class S is an implicit-lifetime class if it is an aggregate or has at least one trivial eligible constructor and a trivial, non-deleted destructor.
Your class qualifies, as it has a trivial copy constructor (which is "eligible") and a trivial destructor.
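The properties relied on here can be spelled out with type traits. This is only a sketch: looks_implicit_lifetime is a hypothetical helper that approximates the class-type rule quoted above and ignores the aggregate branch.

```cpp
#include <cassert>
#include <type_traits>

struct Foo {
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};

static_assert(!std::is_default_constructible<Foo>::value, "no default ctor");
static_assert(std::is_trivially_copyable<Foo>::value, "safe to memcpy");

// Approximation of "at least one trivial eligible constructor and a
// trivial destructor"; the aggregate case is deliberately not covered.
template <class T>
constexpr bool looks_implicit_lifetime() {
    return std::is_trivially_destructible<T>::value &&
           (std::is_trivially_default_constructible<T>::value ||
            std::is_trivially_copy_constructible<T>::value ||
            std::is_trivially_move_constructible<T>::value);
}

inline bool foo_qualifies() {
    bool ok = looks_implicit_lifetime<Foo>();
    assert(ok);
    return ok;
}
```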
If you create an array of byte-wise types (unsigned char, std::byte, or char), this is said to "implicitly create objects" in that storage. This property also applies to the memory returned by malloc and operator new. This means that if you do certain kinds of undefined behavior to pointers to that storage, the system will automatically create objects (at the point where the array was created) that would make that behavior well-defined.
So if you allocate such storage, cast a pointer to it to a T*, and then start using it as though it pointed to a T, the system will automatically create Ts in that storage, so long as it was appropriately aligned.
Therefore, your alternative A works just fine:
When you apply [index] to your casted pointer, C++ will retroactively create an array of Foo in that storage. That is, because you used the memory like an array of Foo exists there, C++20 will make an array of Foo exist there, exactly as if you had created it back at the new unsigned char statement.
However, alternative B will not work as is. You did not use new Foo[n] to create the array, so you cannot use delete[] to delete it, which is what the default deleter of unique_ptr<Foo[]> does. You can still use unique_ptr, but you'll have to create a deleter that explicitly calls operator delete[] on the pointer:
struct mem_delete
{
    template<typename T>
    void operator()(T *ptr) const
    {
        ::operator delete[](ptr);
    }
};
std::unique_ptr<Foo[], mem_delete> storage{
static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
Again, storage[index] creates an array of T as if it were created at the time the memory was allocated.
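If relying on implicit object creation is not an option, one conservative sketch is to copy each record into a live temporary and then copy-construct it into a std::vector. This does construct one throwaway Foo per record, so it trades some copying for behavior that is defined in every standard version. The helper name load_foos and the in-memory byte buffer are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

struct Foo {
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};

static_assert(std::is_trivially_copyable<Foo>::value,
              "memcpy needs a trivially copyable type");

// Hypothetical helper: builds Foo objects from `count` serialized records.
inline std::vector<Foo> load_foos(const unsigned char* bytes, std::size_t count) {
    std::vector<Foo> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        Foo tmp{0.0, 0.0};  // a live object, so memcpy into it is well defined
        std::memcpy(&tmp, bytes + i * sizeof(Foo), sizeof(Foo));
        out.push_back(tmp);  // copy-constructs a real Foo inside the vector
    }
    assert(out.size() == count);
    return out;
}
```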
My first question is: What are you trying to achieve?
Is there an issue with reading each entry individually?
Are you assuming that your code will speed up by reading an array?
Is latency really a factor?
Why can't you just add a default constructor to the class?
Why can't you enhance input.read() to read directly into an array? See std::extent_v<T>
Assuming the constraints you defined, I would start with writing it the simple way, reading one entry at a time, and benchmark it.
Having said that, that which you describe is a common paradigm and, yes, can break a lot of rules.
C++ is very (overly) cautious about things like alignment which can be issues on certain platforms and non-issues on others. This is only "undefined behaviour" because no cross-platform guarantees can be given by the C++ standard itself, even though many techniques work perfectly well in practice.
The textbook way to do this is to create an empty buffer and memcpy into a proper object, but as your input is serialised (potentially by another system), there isn't actually a guarantee that the padding and alignment will match the memory layout which the local compiler determined for the sequence so you would still have to do this one item at a time.
My advice is to write a unit-test to ensure that there are no issues and potentially embed that into the code as a static assertion. The technique you described breaks some C++ rules but that doesn't mean it's breaking, for example, x86 rules.
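A sketch of what such static assertions might look like for Foo, assuming the serialized record is exactly two consecutive doubles with no padding (that file layout is an assumption, not something stated in the question):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Foo {
    Foo(double a, double b) : alpha{a}, beta{b} {}
    double alpha;
    double beta;
};

// Assumed on-disk record: alpha followed immediately by beta.
static_assert(std::is_standard_layout<Foo>::value, "offsetof needs standard layout");
static_assert(sizeof(Foo) == 2 * sizeof(double), "unexpected padding in Foo");
static_assert(offsetof(Foo, alpha) == 0, "alpha must come first");
static_assert(offsetof(Foo, beta) == sizeof(double), "beta must follow alpha");

// Size of one record as the reader would consume it.
inline std::size_t foo_record_size() {
    assert(sizeof(Foo) % alignof(Foo) == 0);
    return sizeof(Foo);
}
```

If a compiler or platform lays Foo out differently, the build fails instead of silently misreading the file.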
Alternative (A): Accessing a —non-static— member of an object before its lifetime begins.
The behavior of the program is undefined (See: [basic.life]).
Alternative (B): Implicit call to the implicitly deleted default constructor.
The program is ill-formed (See: [class.default.ctor]).
I'm not sure about the latter. If someone more knowledgeable knows if/why this is UB please correct me.
You can manage the memory yourself, and then return a unique_ptr which uses a custom deleter. Since you can't use new[], you can't use the plain version of unique_ptr<T[]> and you need to manually call the destructor and deleter using an allocator.
#include <cstddef>
#include <memory>
template <class Allocator = std::allocator<Foo>>
struct FooDeleter : private Allocator {
    using traits = std::allocator_traits<Allocator>;
    using pointer = typename traits::pointer;
    explicit FooDeleter(const Allocator &alloc, std::size_t len) : Allocator(alloc), len(len) {}
    void operator()(pointer p) {
        Allocator &alloc = *this;
        // Destroy every element, then release the storage.
        for (std::size_t i = 0; i != len; ++i) {
            traits::destroy(alloc, std::addressof(p[i]));
        }
        traits::deallocate(alloc, p, len);
    }
    std::size_t len;
};
std::unique_ptr<Foo[], FooDeleter<>> create(std::size_t len) {
    using traits = std::allocator_traits<std::allocator<Foo>>;
    std::allocator<Foo> alloc;
    Foo *p = alloc.allocate(len);
    Foo *i = p;
    try {
        for (; i != p + len; ++i) {
            traits::construct(alloc, i, 1.0, 2.0);
        }
    } catch (...) {
        // Destroy only the elements that were fully constructed.
        while (i != p) {
            traits::destroy(alloc, --i);
        }
        alloc.deallocate(p, len);
        throw;
    }
    return std::unique_ptr<Foo[], FooDeleter<>>{p, FooDeleter<>(alloc, len)};
}

Trying to store an object in an array but then how to call that object's methods?

I'm not a very experienced C++ coder and this has me stumped. I am passing an object (created elsewhere) to a function; I want to be able to store that object in some array and then run through the array to call a function on that object. Here is some pseudo code:
void AddObject(T& object) {
object.action(); // this works
T* objectList = NULL;
// T gets allocated (not shown here) ...
T[0] = object;
T[0].action(); // this doesn't work
}
I know the object is passing correctly, because the first call to object.action() does what it should. But when I store object in the array, then try to invoke action() it causes a big crash.
Likely my problem is that I simply tinkered with the .'s and *'s until it compiled; T[0].action() compiles but crashes at runtime.
The simplest answer to your question is that you must declare your container correctly and you must define an appropriate assignment operator for your class. Working as closely as possible from your example:
typedef class MyActionableClass T;
T* getGlobalPointer();
void AddInstance(T const& objInstance)
{
T* arrayFromElsewhere = getGlobalPointer();
//ok, now at this point we have a reference to an object instance
//and a pointer which we assume is at the base of an array of T **objects**
//whose first element we don't mind losing
//**copy** the instance we've received
arrayFromElsewhere[0] = objInstance;
//now invoke the action() method on our **copy**
arrayFromElsewhere[0].action();
}
Note the signature change to const reference which emphasizes that we are going to copy the original object and not change it in any way.
Also note carefully that arrayFromElsewhere[0].action() is NOT the same as objInstance.action() because you have made a copy — action() is being invoked in a different context, no matter how similar.
While it is obvious you have condensed, the condensation makes the reason for doing this much less obvious — specifying, for instance, that you want to maintain an array of callback objects would make a better case for “needing” this capability. It is also a poor choice to use “T” like you did because this tends to imply template usage to most experienced C++ programmers.
The thing that is most likely causing your “unexplained” crash is that assignment operator; if you don't define one, the compiler will automatically generate one that copies member by member, which for raw pointer members is a shallow copy and almost certainly not what you want if your class owns resources rather than just holding simple data types (POD).
For this to work properly on a class of any complexity you will likely need to define a deep copy or use reference counting; for a class that manages raw resources it is almost always a poor choice to let the compiler create any of ctor, dtor, or assignment for you.
And, of course, it would be a good idea to use standard containers rather than the simple array mechanism you implied by your example. In that case you should probably also define a default ctor, a virtual dtor, and a copy ctor because of the assumptions made by containers and algorithms.
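To illustrate the deep-copy point, here is a sketch of a hypothetical class, Samples, that owns a raw array and therefore defines its own copy constructor, assignment operator, and destructor. Nothing in it comes from the asker's class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class owning a raw array, to show why a shallow copy is a problem.
class Samples {
public:
    explicit Samples(std::size_t n) : size_(n), data_(new int[n]()) {}
    // Deep copy: the new object gets its own array.
    Samples(const Samples& other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    Samples& operator=(const Samples& other) {
        if (this != &other) {
            int* fresh = new int[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;  // release the old array only after the copy succeeded
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }
    ~Samples() { delete[] data_; }
    int& operator[](std::size_t i) { assert(i < size_); return data_[i]; }
private:
    std::size_t size_;
    int* data_;
};

// After assignment the two objects no longer share storage.
inline bool copies_are_independent() {
    Samples a(3);
    a[0] = 7;
    Samples b(1);
    b = a;
    b[0] = 42;
    return a[0] == 7 && b[0] == 42;
}
```

With the compiler-generated assignment, both objects would hold the same pointer and the second destructor would delete it twice.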
If, in fact, you do not want to create a copy of your object but want, instead, to invoke action() on the original object but from within an array, then you will need an array of pointers instead. Again working closely to your original example:
typedef class MyActionableClass T;
T** getGlobalPointer();
void AddInstance(T& objInstance)
{
T** arrayFromElsewhere = getGlobalPointer();
//ok, now at this point we have a reference to an object instance
//and a pointer which we assume is at the base of an array of T **pointers**
//whose first element we don't mind losing
//**reference** the instance we've received by saving its address
arrayFromElsewhere[0] = &objInstance;
//now invoke the action() method on **the original instance**
arrayFromElsewhere[0]->action();
}
Note closely that arrayFromElsewhere is now an array of pointers to objects instead of an array of actual objects.
Note that I dropped the const modifier in this case because I don’t know if action() is a const method — with a name like that I am assuming not…
Note carefully the ampersand (address-of) operator being used in the assignment.
Note also the new syntax for invoking the action() method by using the pointer-to operator.
Finally be advised that using standard containers of pointers is fraught with memory-leak peril, but typically not nearly as dangerous as using naked arrays :-/
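The difference between the two approaches can be seen in a short sketch. Counter is a hypothetical stand-in for the asker's class, and std::vector replaces the raw array.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the asker's class.
struct Counter {
    int calls = 0;
    void action() { ++calls; }
};

// Container of objects: action() runs on a copy, the original is untouched.
inline int calls_after_copy_container(Counter& original) {
    std::vector<Counter> copies;
    copies.push_back(original);
    copies[0].action();
    return original.calls;
}

// Container of (non-owning) pointers: action() runs on the original.
inline int calls_after_pointer_container(Counter& original) {
    std::vector<Counter*> refs;
    refs.push_back(&original);
    refs[0]->action();
    assert(refs[0] == &original);
    return original.calls;
}
```

The pointer version is only safe while the original object outlives the container entry.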
I'm surprised it compiles. You declare an array, objectList of 8 pointers to T. Then you assign T[0] = object;. That's not what you want, what you want is one of
T objectList[8];
objectList[0] = object;
objectList[0].action();
or
T *objectList[8];
objectList[0] = &object;
objectList[0]->action();
Now I'm waiting for a C++ expert to explain why your code compiled, I'm really curious.
You can put the object either into a dynamic or a static array:
#include <vector> // dynamic
#include <array> // static
void AddObject(T const & t)
{
std::array<T, 12> arr;
std::vector<T> v;
arr[0] = t;
v.push_back(t);
arr[0].action();
v[0].action();
}
This doesn't really make a lot of sense, though; you would usually have defined your array somewhere else, outside the function.

Using memset on structures in C++

I am working on fixing older code for my job. It is currently written in C++. They converted static allocation to dynamic but didn't edit the memsets/memcmp/memcpy. This is my first programming internship so bear with my newbie-like question.
The following code is in C, but I want to have it in C++ ( I read that malloc isn't good practice in C++). I have two scenarios: First, we have f created. Then you use &f in order to fill with zero. The second is a pointer *pf. I'm not sure how to set pf to all 0's like the previous example in C++.
Could you just do pf = new foo instead of malloc and then call memset(pf, 0, sizeof(foo))?
struct foo { ... } f;
memset( &f, 0, sizeof(f) );
//or
struct foo { ... } *pf;
pf = (struct foo*) malloc( sizeof(*pf) );
memset( pf, 0, sizeof(*pf) );
Yes, but only if foo is a POD. If it's got virtual functions or anything else remotely C++ish, don't use memset on it since it'll stomp all over the internals of the struct/class.
What you probably want to do instead of memset is give foo a constructor to explicitly initialise its members.
If you want to use new, don't forget the corresponding delete. Even better would be to use shared_ptr :)
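For a POD-like struct, value-initialization gives the same zeroed result as memset without touching raw bytes. In this sketch the members of foo are hypothetical (the question elides them), and std::unique_ptr is used in place of the shared_ptr mentioned above.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical members; the original struct body is elided.
struct foo {
    int id;
    double weight;
    char tag[8];
};

static_assert(std::is_trivial<foo>::value, "zeroing by value-initialization assumes a trivial type");

inline std::unique_ptr<foo> make_zeroed_foo() {
    // make_unique<foo>() value-initializes; with no user-provided
    // constructor that zero-initializes every member.
    auto pf = std::make_unique<foo>();
    assert(pf->id == 0);
    return pf;
}
```

For an automatic variable the equivalent is `foo f{};`, and the unique_ptr takes care of the matching delete.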
Can you? Yes, probably. Should you? No.
While it will probably work, you're losing the state that the constructor has built for you. Adding to this, what happens when you decide to implement a subclass of this struct? Then you lose the advantage of reusable code that C++ OOP offers.
What you ought to do instead is create a constructor that initializes the members for you. This way, when you subclass this struct later on down the line, you just use this constructor to aid you in constructing the subclasses. This is free, safe code! Use it!
Edit: The caveat to this is that if you have a huge code base already, don't change it until you start subclassing the structs. It works as it is now.
Yes, that would work. However, I don't think malloc is necessarily bad practice, and I wouldn't change it just to change it. Of course, you should make sure you always match the allocation mechanisms properly (new->delete, malloc->free, etc.).
You could also add a constructor to the struct and use that to initialize the fields.
You could new foo (as is the standard way in C++) and implement a constructor which initialises foo rather than using memset.
E.g.
struct Something
{
Something()
: m_nInt( 5 )
{
}
int m_nInt;
};
Also don't forget if you use new to call delete when you are finished with the object otherwise you will end up with memory leaks.

How should I change this declaration?

I have been given a header with the following declaration:
//The index of 1 is used to make sure this is an array.
MyObject objs[1];
However, I need to make this array dynamically sized once the program is started. I would think I should just declare it as MyObject *objs;, but I figure if the original programmer declared it this way, there is some reason for it.
Is there any way I can dynamically resize this? Or should I just change it to a pointer and then malloc() it?
Could I use the new keyword somehow to do this?
Use an STL vector:
#include <vector>
std::vector<MyObject> objs(size);
A vector is a dynamic array and is a part of the Standard Template Library. It resizes automatically as you push back objects into the array and can be accessed like a normal C array with the [] operator. Also, &objs[0] is guaranteed to point to a contiguous sequence in memory -- unlike a list -- if the container is not empty.
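A small sketch of those properties, with a hypothetical MyObject definition since the real one is not shown:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical definition; the real MyObject is not shown in the question.
struct MyObject {
    int value = 0;
};

inline std::vector<MyObject> make_objects(std::size_t size) {
    std::vector<MyObject> objs(size);  // size chosen at run time
    for (std::size_t i = 0; i < objs.size(); ++i) {
        objs[i].value = static_cast<int>(i);  // indexed like a C array
    }
    objs.push_back(MyObject());  // grows automatically
    // Elements are contiguous: the last one sits size() - 1 slots after the first.
    assert(objs.data() + (objs.size() - 1) == &objs.back());
    return objs;
}
```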
You're correct. If you want to dynamically instantiate its size you need to use a pointer.
(Since you're using C++ why not use the new operator instead of malloc?)
MyObject* objs = new MyObject[size];
Or should I just change it to a pointer and then malloc() it?
If you do that, how are constructors going to be called for the objects in the malloc'd memory? I'll give you a hint - they won't be - you need to use a std::vector.
I have only seen an array used as a pointer inside a struct or union. This was ages ago and was used to treat the len and first char of a string as a hash to improve the speed of string comparisons for a scripting language.
The code was similar to this:
union small_string {
struct {
char len;
char buff[1];
};
short hash;
};
Then small_string was initialised using malloc (the C cast from void* is effectively a static_cast):
small_string *str = (small_string *) malloc(sizeof(small_string) + len);
str->len = (char) len;
strcpy(str->buff, val);
And to test for equality
int fast_str_equal(const small_string *str1, const small_string *str2)
{
    /* hash overlays len and the first character, so a mismatch is a cheap early exit */
    if (str1->hash == str2->hash)
        return strcmp(str1->buff, str2->buff) == 0;
    return 0;
}
As you can see this is not a very portable or safe style of c++. But offered a great speed improvement for associative arrays indexed by short strings, which are the basis of most scripting languages.
I would probably avoid this style of c++ today.
Is this at the end of a struct somewhere?
One trick I've seen is to declare a struct
struct foo {
    /* optional stuff here */
    int arr[1];
};
and malloc more memory than sizeof (struct foo) so that arr becomes a variable-sized array.
This was fairly commonly used in C programs back when I was hacking C, since variable-sized arrays were not available, and doing an additional allocation was considered too error-prone.
The right thing to do, in almost all cases, is to change the array to an STL vector.
Using the STL is best if you want a dynamically sized array. There are several options; one is std::vector. If you mostly insert in the middle and don't need indexing or contiguous storage, you can also use std::list.
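It seems that yes, you can make this change.
But check your code for uses of sizeof(objs):
MyObj *arr1 = new MyObj[1];
MyObj arr2[1];
// sizeof(arr1) != sizeof(arr2) in general: the first is the size of a pointer, the second the size of the whole array
Maybe this fact is used somewhere in your code.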
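A sketch that makes the sizeof difference concrete, with a hypothetical two-int MyObj:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical definition for illustration.
struct MyObj {
    int a;
    int b;
};

// For a real array, sizeof covers every element.
inline std::size_t bytes_seen_for_array() {
    MyObj arr2[4];
    return sizeof(arr2);
}

// For a pointer, sizeof only reports the size of the pointer itself.
inline std::size_t bytes_seen_for_pointer() {
    MyObj* arr1 = new MyObj[4];
    std::size_t n = sizeof(arr1);
    delete[] arr1;
    assert(n == sizeof(MyObj*));
    return n;
}
```

Any existing code that computes an element count as sizeof(objs) / sizeof(objs[0]) would silently change meaning after the switch to a pointer.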
That comment is incredibly bad. A one-element array is an array even though the comment suggests otherwise.
I've never seen anybody try to enforce "is an array" this way. The array syntax is largely syntactic sugar (a[2] gives the same result as 2[a]: i.e., the third element in a (NOTE this is an interesting and valid syntax but usually a very bad form to use because you're going to confuse programmers for no reason)).
Because the array syntax is largely syntactic sugar, switching to a pointer makes sense as well. But if you're going to do that, then going with new[] makes more sense (because you get your constructors called for free), and going with std::vector makes even more sense (because you don't have to remember to call delete[] every place the array goes out of scope due to return, break, the end of statement, throwing an exception, etc.).
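Both points can be shown in a few lines; the function names here are assumptions for illustration only.

```cpp
#include <cassert>
#include <vector>

// a[i], i[a] and *(a + i) are the same expression for built-in arrays.
inline bool subscript_forms_agree() {
    int a[4] = {10, 20, 30, 40};
    return a[2] == 2[a] && a[2] == *(a + 2);
}

// With std::vector there is no delete[] to remember on any exit path.
inline int third_element(bool leave_early) {
    std::vector<int> v = {10, 20, 30, 40};
    if (leave_early) {
        return -1;  // the vector's storage is released here as well
    }
    assert(v.size() == 4);
    return v[2];
}
```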

Issues with C++ 'new' operator?

I've recently come across this rant.
I don't quite understand a few of the points mentioned in the article:
The author mentions the small annoyance of delete vs delete[], but seems to argue that it is actually necessary (for the compiler), without ever offering a solution. Did I miss something?
In the section 'Specialized allocators', in function f(), it seems the problems can be solved with replacing the allocations with: (omitting alignment)
// if you're going to the trouble to implement an entire Arena for memory,
// making an arena_ptr won't be much work. basically the same as an auto_ptr,
// except that it knows which arena to deallocate from when destructed.
arena_ptr<char> string(a); string.allocate(80);
// or: arena_ptr<char> string; string.allocate(a, 80);
arena_ptr<int> intp(a); intp.allocate();
// or: arena_ptr<int> intp; intp.allocate(a);
arena_ptr<foo> fp(a); fp.allocate();
// or: arena_ptr<foo> fp; fp.allocate(a);
// use templates in 'arena.allocate(...)' to determine that foo has
// a constructor which needs to be called. do something similar
// for destructors in '~arena_ptr()'.
In 'Dangers of overloading ::operator new[]', the author tries to do a new(p) obj[10]. Why not this instead (far less ambiguous):
obj *p = (obj *)special_malloc(sizeof(obj[10]));
for(int i = 0; i < 10; ++i)
    new(p + i) obj; // p keeps pointing at the first element
'Debugging memory allocation in C++'. Can't argue here.
The entire article seems to revolve around classes with significant constructors and destructors located in a custom memory management scheme. While that could be useful, and I can't argue with it, it's pretty limited in commonality.
Basically, we have placement new and per-class allocators -- what problems can't be solved with these approaches?
Also, in case I'm just thick-skulled and crazy, in your ideal C++, what would replace operator new? Invent syntax as necessary -- what would be ideal, simply to help me understand these problems better.
Well, the ideal would probably be to not need delete of any kind. Have a garbage-collected environment, let the programmer avoid the whole problem.
The complaints in the rant seem to come down to
"I liked the way malloc does it"
"I don't like being forced to explicitly create objects of a known type"
He's right about the annoying fact that you have to implement both new and new[], but you're forced into that by Stroustrup's desire to maintain the core of C's semantics. Since you can't tell a pointer from an array, you have to tell the compiler yourself. You could fix that, but doing so would mean changing the semantics of the C part of the language radically; you could no longer make use of the identity
*(a+i) == a[i]
which would break a very large subset of all C code.
So, you could have a language which
implements a more complicated notion of an array, and eliminates the wonders of pointer arithmetic, implementing arrays with dope vectors or something similar.
is garbage collected, so you don't need your own delete discipline.
Which is to say, you could download Java. You could then extend that by changing the language so it
isn't strongly typed, so type checking the void * upcast is eliminated,
...but that means that you can write code that transforms a Foo into a Bar without the compiler seeing it. This would also enable ducktyping, if you want it.
The thing is, once you've done those things, you've got Python or Ruby with a C-ish syntax.
I've been writing C++ since Stroustrup sent out tapes of cfront 1.0; a lot of the history involved in C++ as it is now comes out of the desire to have an OO language that could fit into the C world. There were plenty of other, more satisfying, languages that came out around the same time, like Eiffel. C++ seems to have won. I suspect that it won because it could fit into the C world.
The rant, IMHO, is very misleading and it seems to me that the author does understand the finer details, it's just that he appears to want to mislead. IMHO, the key point that shows the flaw in argument is the following:
void* operator new(std::size_t size, void* ptr) throw();
The standard defines that the above function has the following properties:
Returns: ptr.
Notes: Intentionally performs no other action.
To restate that - this function intentionally performs no other action. This is very important, as it is the key to what placement new does: It is used to call the constructor for the object, and that's all it does. Notice explicitly that the size parameter is not even mentioned.
For those without time, to summarise my point: everything that 'malloc' does in C can be done in C++ using "::operator new". The only difference is that if you have non aggregate types, ie. types that need to have their destructors and constructors called, then you need to call those constructor and destructors. Such types do not explicitly exist in C, and so using the argument that "malloc does it better" is not valid. If you have a struct in 'C' that has a special "initializeMe" function which must be called with a corresponding "destroyMe" then all points made by the author apply equally to that struct as they do to a non-aggregate C++ struct.
Taking some of his points explicitly:
To implement multiple inheritance, the compiler must actually change the values of pointers during some casts. It can't know which value you eventually want when converting to a void * ... Thus, no ordinary function can perform the role of malloc in C++--there is no suitable return type.
This is not correct, again ::operator new performs the role of malloc:
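#include <new>
class A1 { };
class A2 { };
class B : public A1, public A2 { };
void foo () {
    void * v = ::operator new (sizeof (B));
    B * b = new (v) B(); // Placement new calls the constructor for B.
    b->~B(); // Destroy explicitly; "delete v" on a void* would be undefined.
    ::operator delete (v);
    v = ::operator new (sizeof(int));
    int * i = reinterpret_cast <int*> (v);
    ::operator delete (v); // Raw storage goes back through operator delete.
}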
As I mention above, we need placement new to call the constructor for B. In the case of 'i' we can cast from void* to int* without a problem, although again using placement new would improve type checking.
Another point he makes is about alignment requirements:
Memory returned by new char[...] will not necessarily meet the alignment requirements of a struct intlist.
The standard under 3.7.3.1/2 says:
The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type and then used to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function).
That to me appears pretty clear.
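A sketch that checks this guarantee for one struct. The layout of intlist below is a guess (the rant's definition is not reproduced here), and converting the address to std::uintptr_t for the modulo test is the usual, implementation-defined way to inspect alignment.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical layout for illustration.
struct intlist {
    int size;
    int* data;
};

// Allocates raw storage with ::operator new and checks that it is
// suitably aligned for an intlist before constructing one in it.
inline bool operator_new_is_aligned_for_intlist() {
    void* raw = ::operator new(sizeof(intlist));
    bool aligned = reinterpret_cast<std::uintptr_t>(raw) % alignof(intlist) == 0;
    intlist* p = new (raw) intlist{0, nullptr};
    assert(p->size == 0);
    p->~intlist();
    ::operator delete(raw);
    return aligned;
}
```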
Under specialized allocators the author describes potential problems that you might have, eg. you need to use the allocator as an argument to any types which allocate memory themselves and the constructed objects will need to have their destructors called explicitly. Again, how is this different to passing the allocator object through to an "initializeMe" call for a C struct?
Regarding calling the destructor, in C++ you can easily create a special kind of smart pointer, let's call it "placement_pointer" which we can define to call the destructor explicitly when it goes out of scope. As a result we could have:
template <typename T>
class placement_pointer {
// ...
~placement_pointer() {
if (*count == 0) {
m_b->~T();
}
}
// ...
T * m_b;
};
void
f ()
{
arena a;
// ...
foo *fp = new (a) foo; // must be destroyed
// ...
fp->~foo ();
placement_pointer<foo> pfp = new (a) foo; // automatically !!destructed!!
// ...
}
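A comparable effect can be sketched with std::unique_ptr and a deleter that only runs the destructor. The names destroy_only and tracked are hypothetical, and a local aligned buffer stands in for the arena.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Deleter that ends the object's lifetime but leaves the memory to its owner.
struct destroy_only {
    template <class T>
    void operator()(T* p) const noexcept { p->~T(); }
};

// Hypothetical type that counts live instances.
struct tracked {
    static int alive;
    tracked() { ++alive; }
    ~tracked() { --alive; }
};
int tracked::alive = 0;

inline int alive_inside_scope() {
    alignas(tracked) unsigned char arena[sizeof(tracked)];  // stand-in for the arena
    int seen = 0;
    {
        std::unique_ptr<tracked, destroy_only> p(new (arena) tracked);
        seen = tracked::alive;
    }  // ~tracked() runs here; the arena memory itself is not freed
    assert(tracked::alive == seen - 1);
    return seen;
}
```

Unlike the reference-counted placement_pointer outlined above, this variant has single ownership, which is often enough for arena-allocated objects.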
The last point I want to comment on is the following:
g++ comes with a "placement" operator new[] defined as follows:
inline void *
operator new[](size_t, void *place)
{
return place;
}
As noted above, not just implemented this way - but it is required to be so by the standard.
Let obj be a class with a destructor. Suppose you have sizeof (obj[10]) bytes of memory somewhere and would like to construct 10 objects of type obj at that location. (C++ defines sizeof (obj[10]) to be 10 * sizeof (obj).) Can you do so with this placement operator new[]? For example, the following code would seem to do so:
obj *
f ()
{
void *p = special_malloc (sizeof (obj[10]));
return new (p) obj[10]; // Serious trouble...
}
Unfortunately, this code is incorrect. In general, there is no guarantee that the size_t argument passed to operator new[] really corresponds to the size of the array being allocated.
But as he highlights by supplying the definition, the size argument is not used in the allocation function. The allocation function does nothing - and so the only effect of the above placement expression is to call the constructor for the 10 array elements as you would expect.
There are other issues with this code, but not the one the author listed.
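One way to sidestep the question of what placement new[] needs is to construct the elements one at a time, as the question itself suggests. In this sketch std::malloc stands in for special_malloc, obj is a hypothetical class with a destructor, and n is assumed to be greater than zero. No cleanup is attempted if a constructor throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class with a destructor, counting live instances.
struct obj {
    static int alive;
    obj() { ++alive; }
    ~obj() { --alive; }
};
int obj::alive = 0;

// Constructs n objects one by one in malloc'd storage; no array
// placement new is involved, so no hidden bookkeeping space is needed.
inline obj* construct_n(std::size_t n) {
    assert(n > 0);
    unsigned char* raw = static_cast<unsigned char*>(std::malloc(n * sizeof(obj)));
    if (!raw) throw std::bad_alloc();
    obj* first = new (raw) obj;
    for (std::size_t i = 1; i < n; ++i) {
        new (raw + i * sizeof(obj)) obj;
    }
    return first;
}

// Destroys in reverse order of construction, then frees the storage.
inline void destroy_n(obj* first, std::size_t n) {
    for (std::size_t i = n; i > 0; --i) {
        first[i - 1].~obj();
    }
    std::free(first);
}
```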