Suppose I have three classes like these:
class base {
    //some data
    method();
};
class sub1 : base {
    //some data
    //overrides base method
    method();
};
class sub2 : base {
    //some data
    //overrides base method
    method();
};
How can I create an array that mixes sub1 and sub2 objects, and then call the subclass method through base?
OK, let's sort this out. First of all, you probably meant virtual method();, probably with a return type and maybe with parameters. Without virtual, calls made through base class pointers and references won't reach the overriding method. Second, make the destructor virtual. Do this until you know why you need to (delete (base*) new derived;), then keep doing it until your whole neighbourhood knows why you need to. Third, the sad thing is that all standard C++ containers are homogeneous (non-standard heterogeneous container-like objects exist in Boost), so you need to find a single type that is common to your classes and that is somehow able to handle them. Common choices are:
Common base class pointer, in your case, base*. This conventionally owns the objects and is manually (de)allocated (that is, you need to call new and delete). This is the most common choice. You might try smart pointers later, but let's get the basics first.
Common base class reference, in your case, base&. The common convention is that this doesn't own the object (although this is not a language restriction), so it's mainly used for referring to objects that are stored in another container. Note that standard containers cannot hold references directly, so in practice this means something like std::reference_wrapper<base>. Since you need to store the objects somewhere, I wouldn't opt for this now, but it might come in handy later.
std::variant<> (or boost::variant<>), this is a discriminated union, that is, a class that stores one and only one of the listed items and knows which one it stores. You don't need a common base class, but even if you have one, it's cool because it tends to store objects locally, thus might be faster when you have enough cache.
union, which is like variant but does not know the type being stored. Local storage is guaranteed, as is UB if you write one field and read another.
Compiler-specific solutions. If you know that your classes are of the same size (in this case, they are) and you know for sure that you have untyped memory, then you might store the base class and it'll 'just work', provided you always take the address and use the -> operator. Note that this is UB squared; I list it only because you'll likely encounter similar code. Also note that simply having a union does not remove the UB in this case: until we have access to the virtual table pointer, this can only be done by handling virtual functions manually.
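To make the first and third choices concrete, here is a minimal sketch. It assumes method() returns a std::string and takes no parameters (the question does not say), and the helper names call_through_pointers and call_through_variant are invented for illustration. The variant part needs C++17.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

struct base {
    virtual ~base() = default;   // lets delete work correctly through a base*
    virtual std::string method() const { return "base"; }
};

struct sub1 : base {
    std::string method() const override { return "sub1"; }
};

struct sub2 : base {
    std::string method() const override { return "sub2"; }
};

// Choice 1: a container of owning base pointers, allocated and freed by hand.
// (Not exception safe; a sketch of the basics only.)
std::vector<std::string> call_through_pointers() {
    std::vector<base*> items{new sub1, new sub2, new sub1};
    std::vector<std::string> names;
    for (base* item : items) {
        assert(item != nullptr);           // a null entry would be a bug here
        names.push_back(item->method());   // virtual dispatch picks the override
    }
    for (base* item : items) {
        delete item;                       // relies on the virtual destructor
    }
    return names;
}

// Choice 3: std::variant stores the objects in place; no base class is required.
std::vector<std::string> call_through_variant() {
    std::vector<std::variant<sub1, sub2>> items;
    items.emplace_back(sub1{});
    items.emplace_back(sub2{});
    std::vector<std::string> names;
    for (const auto& item : items) {
        // std::visit calls the lambda with whichever alternative is stored.
        names.push_back(std::visit([](const auto& obj) { return obj.method(); }, item));
    }
    return names;
}
```

In the pointer version the container only knows about base*, so every call goes through the virtual table. In the variant version the set of types is fixed at compile time, which is the price for local storage.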
When using C libraries it might be appropriate to derive a class from a C structure and add some methods that operate on it, without adding any data members. For example, you could add a constructor to initialize the members more conveniently. Such objects can then be implicitly upcast and passed to the C APIs.
There might be cases where the API expects an array of the C structures. But is there any guarantee in the C++ language that the derived objects have the same size as the base struct, so that the distances between the objects are correct?
BTW: None of the suggestions of similar questions matches my question.
In general, there is no such guarantee. And in particular, if you introduce virtual member functions for example, then there would typically be additional memory used for the virtual table pointer.
If we add the assumption that the derived class is standard layout, and that no non-standard features such as "packing" are used, then the size would be the same in practice.
However, even if the size is the same, you technically cannot pretend that an array of derived type is an array of base type. In particular, iterating the "pretended" array with a pointer to base would have undefined behaviour. At least that's how it is within C++. Those operations are presumably performed in C across the API. I really don't know what guarantees there are in that case.
I would recommend that if you need to deal with arrays of the C struct (i.e. the pointer would be incremented or subscripted by the API), then instead of wrapping the individual struct, create a C++ wrapper for the entire array.
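A minimal sketch of that recommendation follows. The structure c_point and the function c_sum_x are hypothetical stand-ins for a real C library, and PointArray is an invented wrapper name. The point is that the storage really is an array of the C struct, so the C side may index it freely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C structure and C function, standing in for a real C API.
extern "C" {
struct c_point {
    int x;
    int y;
};

// Sums the x fields, subscripting the array the way a C API would.
int c_sum_x(const c_point* points, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += points[i].x;
    }
    return total;
}
}

// Wraps the whole array instead of each element: convenience lives in the
// wrapper, while the elements stay plain c_point objects.
class PointArray {
public:
    void add(int x, int y) { points_.push_back(c_point{x, y}); }
    std::size_t size() const { return points_.size(); }
    const c_point& at(std::size_t i) const {
        assert(i < points_.size());
        return points_[i];
    }
    // Hands the contiguous storage to the C function.
    int sum_x() const { return c_sum_x(points_.data(), points_.size()); }

private:
    std::vector<c_point> points_;
};

int sum_three_points() {
    PointArray pts;
    pts.add(1, 10);
    pts.add(2, 20);
    pts.add(3, 30);
    return pts.sum_x();
}
```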
But is there any guarantee in the C++ language that the derived objects have the same size as the base struct
In general I would expect that the derived class does not add anything to the memory layout, as long as you do not introduce new data members or virtual functions. Using virtual functions results in a v-table pointer being added.
The implementation is also free to add a v-table pointer if you use virtual inheritance. This also changes the layout for most compilers (clang and g++ use a vtable in that case!).
But all of this is implementation specific, and I do not know of any guarantee in the C++ standard that the class layout lets you use a derived class as the base class without a cast operation.
You also have to think about padding of the data structures, which may be different for the derived class.
Creating something (the derived class) and using it as something different (the base struct) is in general undefined behavior. We are not talking about cast operations! If you cast the derived class to the base class, everything is fine. But packing many instances into a derived class array and simply using it as a base class array is undefined behavior.
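If you go the derivation route anyway, the assumptions can at least be checked at compile time. The sketch below uses a hypothetical C structure c_rect and an invented wrapper Rect. The static_asserts only detect a layout change; they do not make it legal to treat an array of Rect as an array of c_rect.

```cpp
#include <cassert>
#include <type_traits>

struct c_rect {   // hypothetical C structure
    int w;
    int h;
};

// Adds behaviour only: no data members, no virtual functions.
struct Rect : c_rect {
    Rect(int w_, int h_) : c_rect{w_, h_} {
        assert(w_ >= 0 && h_ >= 0);
    }
    int area() const { return w * h; }
};

// Fail the build if someone later adds a member or a virtual function.
static_assert(std::is_standard_layout<Rect>::value, "Rect must stay standard layout");
static_assert(sizeof(Rect) == sizeof(c_rect), "Rect must not grow beyond c_rect");

// Hypothetical C-style function that takes a single struct by pointer.
int c_rect_width(const c_rect* r) { return r->w; }

int width_of_wrapped_rect() {
    Rect r(3, 4);
    return c_rect_width(&r);   // derived-to-base conversion of ONE object is fine
}

int area_of_wrapped_rect() {
    return Rect(3, 4).area();
}
```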
While going through a very good book on templates in C++, I came across an explanation on alternative to templates that I don't understand:
These are bad alternatives to templates in C++
You can write general code for a common base type such as Object or void*.
Reason : If you write general code for a common base class you
lose the benefit of type checking. In addition, classes may be
required to be derived from special base classes, which makes it more
difficult to maintain your code.
Can someone explain this with a code example?
It's not the concept of a common base type that's bad. It's the use of an "Object class" that everything has to be derived from, or worse, writing code that takes void* and then, making assumptions about what the pointer points to, typecasts to a pointer to some other type and hopes for the best. This is best exemplified with containers.
The right way to implement container methods is with templates. For example:
template<typename T> void List<T>::append(const T& obj);
Object class
In the case of an Object base class, what that means is that anything you put in the container must be derived from Object, because all the container methods use Object* for the data in said container. So you get methods like this:
void List::append(Object* obj);
Two bad things here: First, that Object class has to get dragged around with your container wherever you go. Second, it's a horribly generic name and will probably conflict with an Object class from some other library.
Also, your container can never contain types that aren't Object derived directly, including primitive types like int and standard types like std::string. You'd have to "wrap" those types in Object subclasses, and then you'd have to spend time with code to extract the values from those wrapper objects, etc. It's a pain in the rear that you don't need.
void*
So you might think you could use the generic pointer void* instead:
void List::append(void* obj);
But when you do that, there are many things the container may need to do that it can't, because it has no idea what that void* points to:
It can't copy data objects.
It can't compare data objects.
It can't delete data objects.
and so on. (You can avoid these problems in the Object* case by declaring virtual methods in your Object base class, for example something like:
virtual ~Object() {}
virtual Object* clone() const;
virtual int cmp(const Object* rhs) const;
which all subclasses must then override. But now your Object class isn't very lightweight.)
In both cases, you'd be far better off using a templated type for your container's data type. If you are worried about code bloat and have code in your container that doesn't care about the data type (because it doesn't bother with the data, such as when counting contained elements), you can put that code in a base class and have your templated container class derive from it. But most of the time nobody really cares about this "bloat" because it's much smaller than your available memory.
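A minimal sketch of that split, with invented names (ListBase, note_append, demo_sizes) and std::vector used as the storage purely for brevity: the type-independent bookkeeping sits in a plain base class, and only the part that touches the elements is templated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Type-independent part: compiled once, whatever the element type is.
class ListBase {
public:
    std::size_t size() const { return count_; }
    bool empty() const { return count_ == 0; }

protected:
    ~ListBase() = default;          // not meant to be deleted through ListBase*
    void note_append() { ++count_; }

private:
    std::size_t count_ = 0;
};

// Type-dependent part: the compiler checks every append against T.
template <typename T>
class List : public ListBase {
public:
    void append(const T& obj) {
        items_.push_back(obj);      // the container can copy, it knows T
        note_append();
    }
    const T& at(std::size_t i) const {
        assert(i < items_.size());
        return items_[i];
    }

private:
    std::vector<T> items_;
};

std::size_t demo_sizes() {
    List<int> numbers;
    numbers.append(1);
    numbers.append(2);
    List<std::string> words;
    words.append("hello");
    // numbers.append("oops");      // would not compile: not convertible to int
    return numbers.size() + words.size();
}
```

Primitive types and std::string go straight into the container here, with no wrapper objects and no casts on the way out.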
Type checking discarded
If you write general code for a common base class you lose the benefit of type checking.
Since you've typecast to Object* or void*, type checking went out the window for the most part. (Note: this is somewhat dated, since in some cases you can use Runtime Type Identification (RTTI) and the dynamic_cast operation to perform type checking after the fact, to make sure the object you pulled out of the container is the type you expect. But all the above-mentioned limits still apply, since the container still doesn't know what it's containing.)
The old qsort function used void* pointers for the start of the data and for the parameters to the comparison function. You could easily try to sort an array of double with a comparison function that compared int and wind up with a real mess.
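The contrast can be sketched as follows. The function names are invented for illustration. With qsort the comparator receives untyped pointers and has to trust its own cast, whereas std::sort knows the element type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Comparator for qsort: the arguments are void*, so the cast is taken on trust.
int compare_doubles(const void* a, const void* b) {
    const double lhs = *static_cast<const double*>(a);
    const double rhs = *static_cast<const double*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

std::vector<double> sort_with_qsort(std::vector<double> values) {
    // A comparator written for int would compile here just as happily;
    // nothing ties the comparator to the element type.
    std::qsort(values.data(), values.size(), sizeof(double), compare_doubles);
    assert(std::is_sorted(values.begin(), values.end()));
    return values;
}

std::vector<double> sort_with_std_sort(std::vector<double> values) {
    // The template knows it is sorting doubles, so a mismatched comparator
    // would be diagnosed at compile time.
    std::sort(values.begin(), values.end());
    return values;
}
```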
Say I have an object that exists in high quantity, stores little data about itself, but requires several larger functions to act upon itself.
class Foo
{
public:
    bool is_dead();
private:
    float x, y, z;
    bool dead;
    void check_self();
    void update_self();
    void question_self();
};
What behavior can I expect from the compiler - would every new Foo object cause duplicates of its methods to be copied into memory?
If yes, what are good options for managing class-specific (private-like) functions while avoiding duplication?
If not, could you elaborate on this a little?
C++ methods are simply functions (with a convention about this which often becomes the implicit first argument).
Functions are mostly machine code, starting at some specific address. The start address is all that is needed to call the function.
So objects (or their vtable) need at most the address of called functions.
Of course a function takes some place (in the text segment).
But an object won't need extra space for that function. If the function is not virtual, no extra space per object is needed. If the function is virtual, the object has a single vtable pointer (there is one vtable per class with virtual functions). Generally, each object has the pointer to the vtable as its first field. This means 8 bytes per object on x86-64/Linux. Each object (assuming single inheritance) has one vtable pointer, independently of the number or the code size of the virtual functions.
If you have multiple, perhaps virtual, inheritance with virtual methods in several superclasses you'll need several vtable pointers per instance.
So for your Foo example, there is no virtual function (and no superclass containing some of them), so instances of Foo contain no vtable pointer.
If you add one (or many hundreds) of virtual functions to Foo (then you should have a virtual destructor, see rule of three in C++), each instance would have one vtable pointer.
If you want a behavior to be specific to instances (so instances a and b could have different behavior) without using the class machinery for that, you need some member function pointers (in C++03) or (in C++11) some std::function (perhaps anonymous closures). Of course they need space in every instance.
BTW, to know the size of some type or class, use sizeof (it does include the vtable pointer or pointers, if relevant).
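A small sketch to check this with sizeof. The three structs are invented variants of Foo, and the exact sizes are implementation specific, so the code only compares them with each other rather than against fixed numbers. The comparisons hold on mainstream compilers, but the standard does not promise them.

```cpp
#include <cassert>
#include <cstddef>

// Data only.
struct Plain {
    float x, y, z;
    bool dead;
};

// Same data plus non-virtual methods.
struct WithMethods {
    bool is_dead() const { return dead; }
    void check_self() {}
    void update_self() {}
    void question_self() {}
    float x, y, z;
    bool dead;
};

// Same data plus several virtual functions.
struct WithVirtuals {
    virtual ~WithVirtuals() = default;
    virtual void check_self() {}
    virtual void update_self() {}
    virtual void question_self() {}
    float x, y, z;
    bool dead;
};

// Non-virtual methods live in the code segment, not in the object.
bool methods_add_no_size() { return sizeof(WithMethods) == sizeof(Plain); }

// Virtual functions cost one hidden pointer per object, however many there are.
bool virtuals_add_size() { return sizeof(WithVirtuals) > sizeof(Plain); }

void check_sizes() {
    assert(methods_add_no_size());
    assert(virtuals_add_size());
}
```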
Methods exist once per class in the program, not once per object.
Try reading a good book about C++ to learn such basic facts about the language.
Suppose I create a base class called base and derived classes called derived_1, derived_2, etc. I use a collection of instances of the base class, and when I retrieve an element and try to use it, I find that C++ thinks its type is that of the base class, probably because I retrieved it from a std::vector of base. This is a problem when I want to use features that exist only in the specific derived class whose type I knew this object had when I put it into the vector.
So I cast the element into the type it is supposed to be and found this wouldn't work.
(derived_3)obj_to_be_fixed;
And remembered that it's a pointer thing. After some tweaking this now worked.
*((derived_3*)&obj_to_be_fixed);
Is this right or is there for example an abc_cast() function that does it with less mess?
edit:
I had to expand this into another question, the full solutions are shown there. stackoverflow.com ... why-the-polymorphic-types-error-and-cleanup-question
If you store your objects in a std::vector<base> there is simply no way to go back to the derived class. This is because the derived part has been sliced off when storing it in an instance of the base class (after all, your vector contains copies of your data, so it happily copies only the base part of your objects), making the stored object a true instance of the base class instead of a derived class used as a base class.
If you want to store polymorphic objects in the vector, make it a std::vector<base*> (or some kind of smart pointer to base, but not base itself) and use dynamic_cast<derived_3*> to cast an element to the correct type. You can use static_cast instead if the code is performance sensitive and you are confident that you are casting to the correct type, but beware: horrible things will happen if you are wrong.
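A minimal sketch of the smart pointer variant, assuming C++14 for std::make_unique. The member functions one() and three() and the helper names are invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct base {
    virtual ~base() = default;   // makes base polymorphic, so dynamic_cast works
};

struct derived_1 : base {
    int one() const { return 1; }
};

struct derived_3 : base {
    int three() const { return 3; }   // exists only in derived_3
};

// Sums three() over the elements that really are derived_3 objects.
int sum_derived_3(const std::vector<std::unique_ptr<base>>& items) {
    int total = 0;
    for (const auto& item : items) {
        assert(item != nullptr);
        // dynamic_cast returns nullptr when the object is not a derived_3.
        if (const auto* d3 = dynamic_cast<const derived_3*>(item.get())) {
            total += d3->three();
        }
    }
    return total;
}

int sum_in_mixed_vector() {
    std::vector<std::unique_ptr<base>> items;
    items.push_back(std::make_unique<derived_1>());
    items.push_back(std::make_unique<derived_3>());
    items.push_back(std::make_unique<derived_3>());
    return sum_derived_3(items);   // the derived_1 element is skipped
}
```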
If you are using a vector of base then all your instances are base instances and not derived instances.
If you try to insert a derived instance, the object will be sliced. Inserting into a vector always involves a copy and the target type is determined by the type of the object that the vector holds. A vector cannot hold objects of different types.
Most of the time you should not need to do this. A carefully designed class hierarchy can handle this through polymorphism (i.e. virtual functions).
If you really need to cast to the derived type, use the dynamic_cast operator on a pointer or reference to the object.
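Slicing is easy to observe. In this sketch the virtual function name() is an invented example: the object stored by value answers as base, while the one reached through a pointer keeps its derived behaviour.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct base {
    virtual ~base() = default;
    virtual std::string name() const { return "base"; }
};

struct derived_3 : base {
    std::string name() const override { return "derived_3"; }
};

std::string name_after_storing_by_value() {
    std::vector<base> items;
    derived_3 d;
    items.push_back(d);              // copies only the base part: slicing
    assert(items.size() == 1);
    return items.front().name();     // the stored object is a true base
}

std::string name_after_storing_by_pointer() {
    derived_3 d;
    std::vector<base*> items{&d};    // non-owning pointer, d outlives the vector use
    return items.front()->name();    // virtual dispatch reaches derived_3
}
```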
What you are trying to do is not even remotely possible. If the objects stored in your container have type base, then they are base, period. They are not derived objects, they will never become derived objects and they cannot be used as derived objects regardless of what you do.
Your cast through pointers is nothing more than a hack that reinterprets the memory occupied by a base object as a derived object. This is totally meaningless and can only "work" by accident.
If I understand correctly we have at least two different ways of implementing composition. (The case of implementation with smart pointers is excluded for simplicity. I almost don't use STL and have no desire to learn it.)
Let's have a look at Wikipedia example:
class Car
{
private:
    Carburetor* itsCarb;
public:
    Car() {itsCarb=new Carburetor();}
    virtual ~Car() {delete itsCarb;}
};
So, it's one way - we have a pointer to object as private member.
One can rewrite it to look like this:
class Car
{
private:
    Carburetor itsCarb;
};
In that case we have an object itself as private member. (By the way, am I right to call this entity an object from the terminology point of view?)
In the second case there is no need to call the constructor and destructor explicitly (if one needs a non-default constructor, it can be invoked in the initializer list). But that's not a big difference...
And of course in some aspects these two cases differ more appreciably. For example it's forbidden to call non-const methods of Carburetor instance from const methods of Car class in the second case...
Are there any "rules" to decide which one to use? Am I missing something?
In that case we have an object itself as private member. (By the way, am I right to call this entity an object from the terminology point of view?)
Yes, you can say "an object" or "an instance" of the class.
You can also talk about including the data member "by value" instead of "by pointer" (because "by pointer" and "by value" is the normal way to talk about passing parameters, therefore I expect people would understand those terms being applied to data members).
Are there any "rules" to decide which one to use? Am I missing something?
If the instance is shared by more than one container, then each container should include it by pointer instead of value; for example if an Employee has a Boss instance, include the Boss by pointer if several Employee instances share the same Boss.
If the lifetime of the data member isn't the same as the lifetime of the container, then include it by pointer: for example if the data member is instantiated after the container, or destroyed before the container, or destroyed-and-recreated during the lifetime of the container, or if it ever makes sense for the data member to be null.
Another case where you must include by pointer (or by reference) instead of by value is when the type of the data member is an abstract base class.
Another reason for including by pointer is that it might allow you to change the implementation of the data member without recompiling the container. For example, if Car and Carburetor were defined in two different DLLs, you might want to include Carburetor by pointer, because then you might be able to change the implementation of the Carburetor by installing a different Carburetor.dll, without rebuilding the Car.dll.
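The sharing and lifetime rules above can be sketched like this. The members and helper names are invented for illustration, and the Employee does not own its Boss: several Employee instances point at one Boss, while the Car holds its Carburetor by value because the two live and die together.

```cpp
#include <cassert>
#include <string>

struct Boss {
    std::string name;
};

class Employee {
public:
    explicit Employee(Boss* boss) : boss_(boss) {}
    void reassign(Boss* boss) { boss_ = boss; }     // may also be set to nullptr
    bool has_boss() const { return boss_ != nullptr; }
    const std::string& boss_name() const {
        assert(boss_ != nullptr);
        return boss_->name;
    }

private:
    Boss* boss_;   // by pointer: shared between employees, not owned, may be null
};

struct Carburetor {
    int jets = 2;
};

class Car {
public:
    int jets() const { return carb_.jets; }

private:
    Carburetor carb_;   // by value: created and destroyed with the Car
};

bool employees_share_boss() {
    Boss alice{"Alice"};
    Employee e1(&alice);
    Employee e2(&alice);
    alice.name = "Alice B.";   // one change is visible through both employees
    return e1.boss_name() == "Alice B." && e2.boss_name() == "Alice B.";
}
```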
I tend to prefer the first case because the second one requires you to #include Carburetor.h in Car.h.
Since Carburetor is a private member, you should not have to include its definition anywhere other than in the actual Car implementation code. The use of the Carburetor class is clearly an implementation detail, and external objects that use your Car object should not have to worry about including other non-mandatory dependencies. By using a pointer you just need a forward declaration of Carburetor in Car.h.
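A single-file sketch of that layout, with comments marking what would go into Car.h and what into Car.cpp. The member functions jets() and jet_count() are invented for illustration, and copying is disabled because the raw pointer is owned.

```cpp
#include <cassert>

// ---- what would live in Car.h ----
class Carburetor;                  // forward declaration, no #include needed

class Car {
public:
    Car();
    ~Car();
    Car(const Car&) = delete;              // the pointer is owned, so forbid copies
    Car& operator=(const Car&) = delete;
    int jet_count() const;

private:
    Carburetor* itsCarb;           // a pointer to an incomplete type is allowed
};

// ---- what would live in Car.cpp ----
// In a real project this would be: #include "Carburetor.h"
class Carburetor {
public:
    int jets() const { return 4; }
};

Car::Car() : itsCarb(new Carburetor()) {}

Car::~Car() { delete itsCarb; }    // Carburetor is a complete type here

int Car::jet_count() const {
    assert(itsCarb != nullptr);
    return itsCarb->jets();
}

int jet_count_of_new_car() {
    Car car;
    return car.jet_count();
}
```

Code that only includes the header part never sees the Carburetor definition, so changing Carburetor does not force clients of Car to recompile.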
Composition: prefer a member when possible. Use a pointer when polymorphism is needed or when a forward declaration is used. Of course, without smart pointers, manual memory management is needed when using pointers.
If Carb has the same lifetime as Car, then the non-pointer form is better, in my opinion. If you have to replace the Carb in Car, then I'd opt for the pointer version.
Generally, the non-pointer version is easier to use and maintain.
But in some cases it is awkward to use. For example, suppose the car has multiple carburetors that you wish to put in an array, and the Carburetor constructor requires an argument. A plain array cannot default-construct such elements, so without STL containers the usual way out is to create them via new and store them as pointers.