Flexible array member in class with polymorphism - c++

In C99 you can have something like
struct foo
{
int a;
int data[];
};
And then allocate with foo* f=(foo*)malloc(sizeof(foo)+n*sizeof(int)) to have a struct where the length of the array is n.
Can one do something similar in C++ when the class is a subclass with virtual functions?
Like foo being a subclass of bar, then do something like std::unique_ptr<bar> f= std::unique_ptr<foo>((foo*)malloc(sizeof(foo)+n*sizeof(int)))
I know that this code doesn't work, because the memory would be freed with delete while it was allocated with malloc.

Flexible array members are not actually part of the C++ standard; compilers that accept them do so as an extension. However, if you really want to use them and allocate the object with malloc, you need to use placement new to call the constructor, and manually call the destructor (which should be virtual), like f->~bar(), before calling free. Since malloc returns memory that is large enough and suitably aligned for the object, this shouldn't produce undefined behaviour.
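A minimal sketch of the malloc / placement new / explicit destructor sequence described in this answer. The flexible array member itself is left out because it is not standard C++, and the extra bytes are requested only to show the mechanics. The names destroy_and_free, bar_ptr and make_foo are made up for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

struct bar {
    virtual ~bar() {}
    virtual int value() const { return 0; }
};

struct foo : bar {
    int a;
    explicit foo(int a_) : a(a_) {}
    int value() const override { return a; }
};

// Hypothetical deleter: run the virtual destructor, then release with free.
struct destroy_and_free {
    void operator()(bar* p) const {
        if (!p) return;
        // Address of the most derived object, i.e. what malloc returned.
        void* raw = dynamic_cast<void*>(p);
        p->~bar();
        std::free(raw);
    }
};

using bar_ptr = std::unique_ptr<bar, destroy_and_free>;

// Hypothetical factory. The extra bytes are NOT part of the foo object
// as far as the standard is concerned.
bar_ptr make_foo(int a, std::size_t extra_bytes) {
    void* raw = std::malloc(sizeof(foo) + extra_bytes);
    if (!raw) throw std::bad_alloc();
    // foo's constructor cannot throw here; a throwing constructor
    // would additionally need the storage freed on failure.
    return bar_ptr(new (raw) foo(a));
}
```

The custom deleter is what addresses the malloc/delete mismatch from the question: the unique_ptr never calls delete.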

No, it is not possible by standard rules. Variable-length arrays and flexible array members, such as you show in your example, are not allowed in C++ at all. There is no equivalent or alternative either.
Also, malloc cannot be used to create objects in C++ at all. Only new with the correct type given to it can create an object of that type dynamically. Everything else is not allowed and causes undefined behavior if one pretends that an object of the given type was created.
Since C++20 there are some exceptions to the rule above for certain types of objects which may be created implicitly, but still, the size of an object is fixed at compile-time by its type and can not be varied at all.
Overallocating for an object never causes the additional storage to become part of the object, and one is never allowed to access it as if it were.
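For comparison, here is a sketch of what people usually write instead in standard C++. It is not an equivalent of a flexible array member: the elements live in a second allocation rather than inline after the object. The names foo, bar and make_bar follow the question; everything else is an assumption.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct bar {
    virtual ~bar() = default;
    virtual std::size_t size() const = 0;
};

struct foo : bar {
    int a = 0;
    std::vector<int> data;  // run-time length, stored in a separate allocation

    explicit foo(std::size_t n) : data(n) {}
    std::size_t size() const override { return data.size(); }
};

// Plain new, so the default deleter of unique_ptr (delete) is correct.
std::unique_ptr<bar> make_bar(std::size_t n) {
    return std::unique_ptr<bar>(new foo(n));
}
```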

Related

What does the destructor of primitive types actually do? [duplicate]

This question comes from me trying to understand the motivation for smart pointers where you make a wrapper class around the pointer so that you could add a custom destructor. Do pointers (and ints, bools, doubles, etc.) not have a destructor?
Technically speaking, non-class types (the C++ term for what are often called 'primitive types' in lay terms) do not have destructors.
The C++ Standard only speaks of real destructors in the context of classes; see [class.dtor]. Aside from that, C++ also allows calling a destructor on a non-class object using the same notation, i.e. the following code is valid:
void foo(int z) {
using T = int;
z.~T();
}
This is called a 'pseudo-destructor' call and exists exclusively to allow generic templated code to deal with class and non-class types in the same manner. The call does nothing at all. This syntax is defined in [expr.prim.id] in the C++ standard.
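A small sketch of the generic-code use case: one template that works for both int and a class type. The helper destroy and the type Counted are made up for this example.

```cpp
#include <new>

struct Counted {
    static int destroyed;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

// Hypothetical helper: ends the lifetime of *p without releasing its storage.
template <typename T>
void destroy(T* p) {
    p->~T();  // real destructor for class types, pseudo-destructor otherwise
}

int demo_destroy() {
    int i = 5;
    destroy(&i);  // compiles for int; i is not read afterwards

    // Build a Counted in raw storage so its destructor runs only when we say so.
    alignas(Counted) unsigned char storage[sizeof(Counted)];
    Counted* c = new (storage) Counted;
    destroy(c);   // runs Counted::~Counted once

    return Counted::destroyed;
}
```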
Primitive types (and compounds thereof) have trivial destructors. These don't do anything, and have special wording that allows them to be skipped altogether in some cases.
This, however, is orthogonal to why C++ has smart pointers. A raw pointer is non-owning: it points at another object, but does not affect its lifetime. Smart pointers, on the other hand, own (or share ownership of) their pointee, tying its lifetime to their own. This is what is implemented inside, among other special functions, their destructor.
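To make the owning / non-owning distinction concrete, here is a sketch with a made-up type Tracked that counts how many instances are alive.

```cpp
#include <memory>

struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

int alive_after_scope() {
    {
        std::unique_ptr<Tracked> owner(new Tracked);  // owns the object
        Tracked* observer = owner.get();              // raw pointer: only points at it
        (void)observer;
    }  // owner's destructor deletes the Tracked; observer going away does nothing
    return Tracked::alive;
}
```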
In addition to the answers given here so far, since C++20 the pseudo-destructor call on a non-class object will always end its lifetime. Consequently accessing the object's value after the call will have undefined behavior. This does not mean however that the compiler has to emit any code for such a call. It still effectively does nothing.
No, pointers don't have destructors. An object referenced through a plain old pointer has to be deleted to avoid memory leaks, and the object's destructor is called then, but the compiler won't call delete automatically, even when a pointer goes out of scope - what if another part of your program also had a pointer to the same object?
Smart pointers aren't about calling a custom destructor, they're about ensuring that things get cleaned up automatically when they go out of scope. This 'cleaning up' might be deleting owned objects, freeing any malloced memory, closing files, releasing locks, etc.
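As an illustration of "cleaning up" that is not a plain delete, here is a sketch of a unique_ptr that releases malloc'ed memory with free. FreeDeleter, c_string and duplicate are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter that matches malloc: the memory is released with free, not delete.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};

using c_string = std::unique_ptr<char, FreeDeleter>;

// Copies s into malloc'ed storage; returns an empty pointer if malloc fails.
c_string duplicate(const char* s) {
    std::size_t n = std::strlen(s) + 1;  // include the terminating '\0'
    char* copy = static_cast<char*>(std::malloc(n));
    if (!copy) return c_string();
    std::memcpy(copy, s, n);
    return c_string(copy);
}
```

The same pattern should carry over to file handles or locks by swapping the deleter.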
Destructors are used to free the resources that an object may have used.
For pointers, you don't need delete if you are not allocating new memory from the heap.
C and C++ have two common places to store a variable: the stack and the heap.
The stack holds automatic (local) variables, and the compiler takes care of them. The heap is for dynamic memory, and you have to manage it yourself if you use it.
When you declare a local variable of a primitive type, stack memory is allocated for it.
When you create an object with new, it is stored on the heap, and you need to delete it when you have finished using it; otherwise you have a memory leak.
Basically, you only need delete if you new something.
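A sketch of that rule, using a made-up type Probe that counts live instances: the local object cleans itself up, the one created with new waits for a delete.

```cpp
struct Probe {
    static int alive;
    Probe() { ++alive; }
    ~Probe() { --alive; }
};
int Probe::alive = 0;

// Hypothetical function creating one object of each kind.
Probe* make_both() {
    Probe automatic;   // on the stack: destroyed when the function returns
    return new Probe;  // on the heap: lives until someone calls delete
}
```

After make_both() returns, only the heap object is still alive, and it stays alive until the caller deletes the returned pointer.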

Does C++ require a destructor call for each placement new?

I understand that placement new calls are usually matched with explicit calls to the destructor. My question is: if I have no need for a destructor (no code to put there, and no member variables that have destructors) can I safely skip the explicit destructor call?
Here is my use case: I want to write C++ bindings for a C API. In the C API many objects are accessible only by pointer. Instead of creating a wrapper object that contains a single pointer (which is wasteful and semantically confusing), I want to use placement new to construct an object at the address of the C object. The C++ object will do nothing in its constructor or destructor, and its methods will do nothing but delegate to the C methods. The C++ object will contain no virtual methods.
I have two parts to this question.
Is there any reason why this idea will not work in practice on any production compiler?
Does this technically violate the C++ language spec?
If I understand your question correctly you have a C object in memory and you want to initialize a C++ object with the same layout "over the top" of the existing object.
CppObject* cppobject = new (cobject) CppObject;
While there is no problem with not calling a destructor for the old object - whether this causes resource leaks or other issues is entirely down to the type of the old object and is a user code issue, not a language conformance issue - the fact that you reuse the memory for a new object means that the old object is no longer accessible.
Although the placement form of operator new must just return the address that it was given, there is nothing to stop the new expression itself wiping the memory for the new object before any constructor (if any) is called. Members of the new object that are not initialized according to C++ language rules have unspecified contents which definitely does not mean the same as having the contents of any old object that once lived in the memory being reused.
If I understand you correctly, what you are trying to do is not guaranteed to work.
I think the answer is that if your class is POD (which it is, if it's true that it does nothing in the con/destructor, has no virtual member functions, and has no non-static data members with any of those things), then you don't need to call a constructor or a destructor, its lifetime is just the lifetime of the underlying memory. You can use it the same way that a struct is used in C, and you can call its member functions regardless of whether it has been constructed.
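Whether a wrapper really has the properties this answer relies on can be checked at compile time. A sketch, with c_handle standing in for the C API's struct and Handle for the wrapper (both names are assumptions):

```cpp
#include <type_traits>

// Hypothetical stand-in for the C API's struct.
struct c_handle { int id; };

// Hypothetical wrapper: no constructor, no destructor, no virtual functions.
struct Handle {
    c_handle raw;
    int id() const { return raw.id; }
};

static_assert(std::is_trivially_destructible<Handle>::value,
              "skipping the destructor call skips nothing");
static_assert(std::is_trivially_copyable<Handle>::value,
              "Handle can be copied like a C struct");
static_assert(std::is_standard_layout<Handle>::value,
              "Handle has a C-compatible layout");
```

If someone later adds a virtual function or a member with a destructor, the build fails instead of the assumption silently breaking.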
The purpose of placement new is to allow you to create object pools or align multiple objects together in contiguous memory space as with std::vector.
If the objects are C structs then you do not need placement new to do this; you can simply use the C method of allocating the memory based on sizeof(struct Foo), where Foo is the struct name, and if you allocate multiple objects you may need to round the size up to a multiple of the alignment boundary.
However there is no need to placement-new the objects there, you can simply memcpy them in.
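A sketch of the memcpy approach for a trivially copyable struct. Foo, pack and unpack are made-up names; the values are read back by copying into a real Foo rather than by casting the buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

struct Foo { int id; double weight; };
static_assert(std::is_trivially_copyable<Foo>::value,
              "memcpy is only valid for trivially copyable types");

// Copy the bytes of count Foo objects into one contiguous buffer.
std::vector<unsigned char> pack(const Foo* items, std::size_t count) {
    std::vector<unsigned char> bytes(count * sizeof(Foo));
    if (count != 0) std::memcpy(bytes.data(), items, bytes.size());
    return bytes;
}

// Read element i back out by copying its bytes into a real Foo object.
Foo unpack(const std::vector<unsigned char>& bytes, std::size_t i) {
    Foo out;
    std::memcpy(&out, bytes.data() + i * sizeof(Foo), sizeof(Foo));
    return out;
}
```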
To answer your main question, yes you still need to call the destructor because other stuff has to happen.
Is there any reason why this idea will not work in practice on any production compiler?
You had damn well better be sure your C++ object fits within the size of the C object.
Does this technically violate the C++ language spec?
No, but not everything that is to spec will work like you want.
I understand that placement new calls are usually matched with explicit calls to the destructor. If I have no need for a destructor (no code to put there, and no member variables that have destructors) can I safely skip the explicit destructor call?
Yes. If I don't need to fly to New York before posting this answer, can I safely skip the trip? :) However, if the destructor is truly unneeded because it does nothing, then what harm is there in calling it?
If the compiler can figure out a destructor should be a no-op, I'd expect it to eliminate that call. If you don't write an explicit dtor, remember that your class still has a dtor, and the interesting case – here – is whether it is what the language calls trivial.
Solution: destroy previously constructed objects before constructing over them, even when you believe the dtor is a no-op.
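A sketch of that rule as a helper. reconstruct and Session are hypothetical names, and the helper assumes the new object's constructor does not throw (otherwise the slot would be left without a live object).

```cpp
#include <new>
#include <utility>

struct Session {
    static int closed;
    int id;
    explicit Session(int i) : id(i) {}
    ~Session() { ++closed; }
};
int Session::closed = 0;

// Destroy the object at *slot, then construct a new one in the same storage.
template <typename T, typename... Args>
T* reconstruct(T* slot, Args&&... args) {
    slot->~T();  // end the old object's lifetime first
    return new (slot) T(std::forward<Args>(args)...);
}
```

Use the pointer returned by reconstruct to refer to the new object.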
I want to write C++ bindings for a C API. In the C API many objects are accessible only by pointer. Instead of creating a wrapper object that contains a single pointer (which is wasteful and semantically confusing), I want to use placement new to construct an object at the address of the C object.
This is the purpose of layout-compatible classes and reinterpret_cast. Include a static assert (e.g. Boost's macro, C++11's static_assert, etc.) checking size and/or alignment, as you wish, for a short sanity check, but ultimately you have to know a bit of how your implementation lays out the classes. Most provide pragmas (or other implementation-specific mechanisms) to control this, if needed.
The easiest way is to contain the C struct within the C++ type:
// in C header
typedef struct {
int n;
//...
} C_A;
C_A* C_get_a();
// your C++
struct A {
void blah(int n) {
_data.n += n;
}
// convenience functions
static A* from(C_A *p) {
return reinterpret_cast<A*>(p);
}
static A const* from(C_A const *p) {
return reinterpret_cast<A const*>(p);
}
private:
C_A _data; // the only data member
};
void example() {
A *p = A::from(C_get_a());
p->blah(42);
}
I like to keep such conversions encapsulated, rather than strewing reinterpret_casts throughout, and more uniform (i.e. compare call-site for const and non-const), hence the convenience functions. It's also a bit harder to modify the class without noticing this type of use must still be supported.
Depending on the exact class, you might make the data member public.
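The static assert suggested above might look like the sketch below. The C struct is repeated here as a stand-in for the real header, and get(), add() and raw() are hypothetical members added so the example is complete. The asserts only check the layout preconditions; whether the standard strictly permits viewing a C_A that was created by C code through an A* is a separate question.

```cpp
#include <type_traits>

// Stand-in for the C header (hypothetical contents).
extern "C" {
typedef struct { int n; } C_A;
}

struct A {
    int get() const { return data_.n; }
    void add(int k) { data_.n += k; }
    C_A* raw() { return &data_; }

    static A* from(C_A* p) { return reinterpret_cast<A*>(p); }

private:
    C_A data_;  // the only data member
};

static_assert(std::is_standard_layout<A>::value, "A must be standard-layout");
static_assert(sizeof(A) == sizeof(C_A), "A must not add any storage");
static_assert(alignof(A) == alignof(C_A), "A must not change the alignment");
```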
The first question is: why don't you just use a cast? Then there's no issue of the placement new doing anything, and clearly no issue of failing to use a destructor. The result will work if the C and C++ types are layout compatible.
The second question is: what is the point? If you have no virtual functions, you're not using the constructor or destructor, the C++ class doesn't seem to offer any advantages over just using the C type: any methods you write should have been global functions anyhow.
The only advantage I can imagine is if you want to hide the C representation, you can overlay the C object with a class with all private members and use methods for access. Is that your purpose? [That's a reasonable thing to do I think]

Array of pointers member, is it initialized?

If I have a
class A
{
private:
Widget* widgets[5];
};
Is it guaranteed that all pointers are NULL, or do I need to initialize them in the constructor? Is it true for all compilers?
Thanks.
The array is not initialized unless you do it. The standard does not require the array to be initialized.
It is not initialized if it is on the stack or using the default heap allocator (although you can write your own to do so).
If it is a global variable it is zero filled.
This is true for all conformant compilers.
It depends on the platform and how you allocate or declare instances of A. If it's on the stack or heap, you need to explicitly initialize it. If it's with placement new and a custom allocator that initializes memory to zero or you declare an instance at file scope AND the platform has the null pointer constant be bitwise zero, you don't. Otherwise, you should.
EDIT: I suppose I should have stated the obvious which was "don't assume that this happens".
Although in reality the answer is "it depends on the platform". The standard only tells you what happens when you initialize explicitly or at file scope. Otherwise, it is easiest to assume that you are in an environment that will do the exact opposite of what you want it to do.
And if you really need to know (for educational or optimizational purposes), consult the documentation and figure out what you can rely on for that platform.
In general case the array will not be initialized. However, keep in mind that the initial value of the object of class type depends not only on how the class itself is defined (constructor, etc), but might also depend on the initializer used when creating the object.
So, in some particular cases the answer to your question might depend on the initializer you supply when creating the object and on the version of C++ language your compiler implements.
If you supply no initializer, the array will contain garbage.
A* pa = new A;
// Garbage in the array
A a;
// Garbage in the array
If you supply the () initializer, in C++98 the array will still contain garbage. In C++03, however, the object will be value-initialized and the array will contain null pointers.
A* pa = new A();
// Null-pointers in the array in C++03, still garbage in C++98
A a = A();
// Null-pointers in the array in C++03, still garbage in C++98
Also, objects with static storage duration are always zero-initialized before any other initialization takes place. So, if you define an object of A class with static storage duration, the array will initially contain null-pointers.
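A sketch of the two reliable ways to get null pointers, assuming C++11 or later for nullptr. The accessor all_null() and the class B are made up for this example. The uninitialized case (A a;) is deliberately not read, since reading indeterminate values is undefined behaviour.

```cpp
struct Widget;  // only pointers are stored, so a declaration is enough

class A {
public:
    bool all_null() const {
        for (int i = 0; i < 5; ++i)
            if (widgets[i] != nullptr) return false;
        return true;
    }
private:
    Widget* widgets[5];  // no constructor: the value depends on the initializer
};

class B {
public:
    B() : widgets() {}   // value-initializes the array: every element is null
    bool all_null() const {
        for (int i = 0; i < 5; ++i)
            if (widgets[i] != nullptr) return false;
        return true;
    }
private:
    Widget* widgets[5];
};
```

With these definitions, A a = A(); and B b; both end up with null pointers in the array.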

What can be instantiated?

What types in C++ can be instantiated?
I know that the following each directly create a single instance of Foo:
Foo bar;
Foo *bizz = new Foo();
However, what about with built-in types? Does the following create two instances of int, or is instance the wrong word to use and memory is just being allocated?
int bar2;
int *bizz2 = new int;
What about pointers? Did the above example create an int * instance, or just allocate memory for an int *?
Would using literals like 42 or 3.14 create an instance as well?
I've seen the argument that if you cannot subclass a type, it is not a class, and if it is not a class, it cannot be instantiated. Is this true?
So long as we're talking about C++, the only authoritative source is the ISO standard. That doesn't ever use the word "instantiation" for anything but class and function templates.
It does, however, use the word "instance". For example:
An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block.
Note that in C++ parlance, an int lvalue is also an "object":
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a region of storage.
Since new clearly creates regions of storage, anything thus created is an object, and, following the precedent of the specification, can be called an instance.
As far as I can tell, you're really just asking about terminology here. The only real distinction made by the C++ standard is POD types and non-POD types, where non-POD types have features like user-defined constructors, member functions, private variables, etc., and POD types don't. Basic types like int and float are of course PODs, as are arrays of PODs and C-structs of PODs.
Apart from (and overlapping with) C++, the concept of an "instance" in Object-Oriented Programming usually refers to allocating space for an object in memory, and then initializing it with a constructor. Whether this is done on the stack or the heap, or any other location in memory for that matter, is largely irrelevant.
However, the C++ standard seems to consider all data types "objects." For example, in 3.9 it says:
"The object representation of type T
is the sequence of N unsigned char
objects taken up by the object of type
T, where N equals sizeof(T)..."
So basically, the only distinction made by the C++ standard itself is POD versus non-POD.
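The quoted sentence about the object representation can be shown directly. A sketch, with object_bytes as a made-up helper name:

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Returns the sizeof(T) unsigned char values that make up value.
template <typename T>
std::array<unsigned char, sizeof(T)> object_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types may be copied bytewise");
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}
```

This works the same way for an int as for a class type, which fits the point that the standard treats both as objects.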
In C++, 'instance' and 'instantiate' are only associated with classes.
Note, however, that these are also English words that can have a conversational meaning.
'Pointer' is certainly a class of things in the English usage, and a pointer is certainly an instance of that class,
but in C++ speak 'pointer' is not a Class and a pointer is not an Instance of a Class.
See also: how many angels on pinheads.
The concept of an "instance" isn't something that's really intrinsic to C++ -- basically you have "things which have a constructor and things which don't".
So, all types have a size, e.g. an int is commonly 4 bytes, a struct with a couple of ints is going to be 8 and so on. Now, slap a constructor on that struct, and it starts looking (and behaving) like a class. More specifically:
int foo; // <-- 4 bytes, no constructor
struct Foo
{
int foo;
int bar;
}; // <-- 8 bytes, no constructor
struct Foo
{
Foo() : foo(0), bar(0) {}
int foo;
int bar;
}; // <-- 8 bytes, with constructor
Now, any of these types can live on the stack or on the heap. When you create something on the stack, like the "int foo;" above, it goes away when its scope ends (e.g. at the end of the function call). If you create something with "new", it goes on the heap and gets its own place to live in memory until you call delete on it. In both cases the constructor, if there is one, will be called during instantiation.
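The claim that a constructor changes behaviour but not size can be checked at compile time. A sketch, with the two structs renamed to Plain and WithCtor so both can exist in one file; the size equality holds on the usual implementations rather than being spelled out by the standard.

```cpp
#include <type_traits>

struct Plain {
    int foo;
    int bar;
};

struct WithCtor {
    WithCtor() : foo(0), bar(0) {}
    int foo;
    int bar;
};

// Same data members, same size: the constructor adds no storage.
static_assert(sizeof(Plain) == sizeof(WithCtor), "a constructor adds no storage");

// What does change is whether creating an object runs any code.
static_assert(std::is_trivially_default_constructible<Plain>::value,
              "Plain has no constructor code to run");
static_assert(!std::is_trivially_default_constructible<WithCtor>::value,
              "WithCtor runs its constructor on creation");
```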
It is unusual to do "new int", but it's allowed. You can even pass 0 or 1 arguments to the constructor. I'm not sure if "new int()" means it's 0-initialized (I'd guess yes) as distinct from "new int".
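On the guess about "new int()": the empty parentheses value-initialize the int, which for an int means zero, while plain "new int" leaves the value indeterminate. A sketch (the function names are made up):

```cpp
#include <memory>

int zero_from_value_init() {
    std::unique_ptr<int> p(new int());   // value-initialized: *p is 0
    return *p;
}

int five_from_direct_init() {
    std::unique_ptr<int> p(new int(5));  // initialized from one argument
    return *p;
}
```

Reading from a plain "new int" before assigning to it is not shown, because that would read an indeterminate value.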
When you define a value on the stack, it's not usually called "allocating memory" (although it is getting memory on the stack in theory, it's possible that the value lives only in CPU registers).
Literals don't necessarily get an address in program memory; CPU instructions may encode data directly (e.g. put 42 into register B). Probably arbitrary floating point constants have an address.