For my master's thesis I need to extend the DuckDB database (hosted on GitHub) with new functionality.
One of the first steps was to create a fixed internal plan that represents something like "select 42;", but at the physical level. To that end I tried to manually create such a plan with the classes used internally by DuckDB.
On compiling I generally get an error message like this:
/home/ubuntu/git/duckdb/src/include/common/helper.hpp: In instantiation of ‘std::unique_ptr<T> duckdb::make_unique(Args&& ...)
[with T = duckdb::Expression; Args = {duckdb::ExpressionType,
duckdb::ExpressionClass, duckdb::TypeId}]’:
/home/ubuntu/git/duckdb/src/execution/physical_plan_generator.cpp:125:155:
required from here
/home/ubuntu/git/duckdb/src/include/common/helper.hpp:24:23: error:
invalid new-expression of abstract class type ‘duckdb::Expression’
return unique_ptr<T>(new T(std::forward<Args>(args)...));
The creation was this line:
unique_ptr<Expression> ProjectionExpression = make_unique<Expression>(ExpressionType::VALUE_CONSTANT, ExpressionClass::BOUND_CONSTANT, TypeId::INTEGER);
The constructor is
Expression::Expression(ExpressionType type, ExpressionClass expression_class, TypeId return_type)
: BaseExpression(type, expression_class), return_type(return_type) {
}
with BaseExpression being
BaseExpression(ExpressionType type, ExpressionClass expression_class)
: type(type), expression_class(expression_class) {
}
virtual ~BaseExpression() {
}
As you can see, the Expression class passes arguments to BaseExpression in its initializer list. As far as I can tell there is no direct inheritance between the two, but clearly I am missing something that is needed to construct the object correctly.
The problem is that in DuckDB these objects normally come from the parser and are built from its output, so I have to guess what the data structure is supposed to look like.
I am having trouble figuring out how to allocate this object directly with make_unique, because Expression clearly requires a BaseExpression of some kind, but BaseExpression itself has a virtual component, so I can't just create that one directly either.
Basically, what I am asking is: How do you make a new unique_ptr object when the class is abstract?
How do you make a new unique_ptr object when the class is abstract?
By creating an instance of a concrete, non-abstract subclass and returning it as a pointer to the abstract base class. This is common practice with the Factory pattern and similar idioms.
Looking at the source code, you can see that Expression has a pure virtual member function (recognizable by virtual and the = 0):
class Expression : public BaseExpression {
//...
virtual unique_ptr<Expression> Copy() = 0;
//...
};
A class with a pure virtual member function is an abstract class. Instances of abstract classes can not be created (whether as variables with automatic storage duration, with new or with std::make_unique). Instead you need to choose the appropriate class derived from Expression that implements all pure virtual methods and create an instance of that class, e.g. by calling std::make_unique<DerivedClass>(...). You can still assign that to std::unique_ptr<Expression> afterwards.
The problem is not virtual member functions in general, only pure virtual member functions. Without pure virtual member functions, classes with virtual functions can be used with std::make_unique without problem.
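As a minimal sketch of that idea: the classes Expr and ConstantExpr and the helpers MakeConstant and EvaluateCopy below are hypothetical stand-ins, not DuckDB's real classes or constructors. They only show the shape: create the concrete type with std::make_unique, then hold it through a unique_ptr to the abstract base.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for illustration; these are not DuckDB's real classes.
class Expr {
public:
    virtual ~Expr() = default;
    virtual std::unique_ptr<Expr> Copy() const = 0;  // pure virtual, so Expr is abstract
    virtual int Evaluate() const = 0;
};

// Concrete subclass: implements every pure virtual function of Expr.
class ConstantExpr : public Expr {
public:
    explicit ConstantExpr(int value) : value_(value) {}
    std::unique_ptr<Expr> Copy() const override {
        return std::make_unique<ConstantExpr>(value_);
    }
    int Evaluate() const override { return value_; }

private:
    int value_;
};

// std::make_unique<Expr>(...) would not compile; creating the concrete type
// and converting to unique_ptr<Expr> on return does.
std::unique_ptr<Expr> MakeConstant(int value) {
    return std::make_unique<ConstantExpr>(value);
}

// Works on any Expr without knowing the concrete type behind it.
int EvaluateCopy(const Expr& e) {
    std::unique_ptr<Expr> copy = e.Copy();
    assert(copy != nullptr);
    return copy->Evaluate();
}
```

For the real code base you would look for the existing subclass of Expression that matches the expression class you are passing in and construct that one instead.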
I'm working on an abstract class that will later be derived from by several subclasses. However, much of the functionality is non-abstract, so why can't I allocate an object of this abstract class and work with it, as long as I don't call any of the pure virtual functions? After all, the size of the abstract class is well known at compile time!?
The definition of abstract class is as:
Defines an abstract type which cannot be instantiated, but can be used as a base class.
From: https://en.cppreference.com/w/cpp/language/abstract_class
To use it in another way would go against the use of abstract classes.
Abstract classes are used to represent general concepts (for example, Shape, Animal), which can be used as base classes for concrete classes (for example, Circle, Dog).
No objects of an abstract class can be created (except for base subobjects of a class derived from it) and no non-static data members of an abstract class can be declared.
See also https://timsong-cpp.github.io/cppwp/n3337/class.abstract
as long as I don't call any of the pure virtual functions?
Except in trivial cases, this can be impossible to prove. Let's look at an example. Suppose this is your abstract class
class Abby { virtual void fun() const = 0; };
and suppose you are able to create an Abby object for testing.
// First source file
Abby test;
In a trivial case, you literally do nothing with this object except cast it to void to avoid a compiler warning about an unused variable. In a non-trivial case, you might pass this object (by reference) to a function defined in a different file than the one where the object is created, perhaps the following function.
// Second source file
void foo(const Abby & o)
{
o.fun();
}
This function expects an object that has Abby as a base class, and calls that object's virtual function. This works as long as it is forbidden to create an Abby object directly. If you relax this restriction, what happens? Should the compiler flag foo() as potentially broken because of what you might do in another translation unit? Should the compiler ignore the situation, leading to a runtime error?
Or maybe you would want a call to foo(test) to be flagged as making the definition of test to be invalid? This would be rather strange, to have the validity of one line (Abby test;) be dependent on what comes after it. It would also greatly limit what you could do in your test, leading compilers to ask why they bother. There is so little benefit to bending the rules this way, so why make a complicated language even more complicated?
Besides, it does not take much programming work to get the result you are looking for. Just comment out the pure virtual function.
class Abby { /* virtual void fun() const = 0; */ };
Since you're not calling it, this should not affect your code, right? If it does you could instead dummy-up the function, as in
class Abby { virtual void fun() const {} };
Revert this change once you are done testing.
Personally I might derive a class from Abby with a dummy fun() member, and use that concrete class for testing instead of the abstract class. However, the OP called that "a nuisance" in a comment.
class Concrete : public Abby { void fun() const override {} };
class base{
.....
public:
    virtual int function1();
    virtual int function2();
};
class derived : public base{
public:
    int function1();
    int function2();
};
int main()
{
    derived d;
    base *b = &d;
    int k = b->function1(); // Why use this instead of the following line?
    int j = d.function1(); // With this, the need for virtual functions is gone, right?
}
I am not a CompSci engineer and I would like to know this. Why use virtual functions if we can avoid base class pointers?
The power of polymorphism isn't really apparent in your simple example, but if you extend it a bit it might become clearer.
class vehicle{
.....
public:
    virtual int getEmission();
};
class car : public vehicle{
public:
    int getEmission();
};
class bus : public vehicle{
public:
    int getEmission();
};
int main()
{
    car a;
    car b;
    car c;
    bus d;
    bus e;
    vehicle *traffic[]={&a,&b,&c,&d,&e};
    int totalEmission=0;
    for(int i=0;i<5;i++)
    {
        totalEmission+=traffic[i]->getEmission();
    }
}
This lets you iterate through a list of pointers and have different methods get called depending on the underlying type. Basically it lets you write code where you don't need to know what the child type is at compile time, but the code will perform the right function anyway.
You're correct, if you have an object you don't need to refer to it via a pointer. You also don't need a virtual destructor when the object will be destroyed as the type it was created.
The utility comes when you get a pointer to an object from another piece of code, and you don't really know what the most derived type is. You can have two or more derived types built on the same base, and have a function that returns a pointer to the base type. Virtual functions will allow you to use the pointer without worrying about which derived type you're using, until it's time to destroy the object. The virtual destructor will destroy the object without you knowing which derived class it corresponds to.
Here's the simplest example of using virtual functions:
base *b = new derived;
b->function1();
delete b;
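To make the role of the virtual destructor visible, here is a small sketch. Shape, Square and DestroyThroughBase are made-up names for illustration, and the log string exists only so the order of destructor calls can be observed. The object is owned through a pointer to the base class, yet the derived destructor still runs first.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical base class; the log pointer only records which destructors run.
struct Shape {
    explicit Shape(std::string* log) : log_(log) {}
    virtual ~Shape() { *log_ += "~Shape;"; }  // virtual: safe to delete via Shape*
    virtual double Area() const = 0;

protected:
    std::string* log_;
};

struct Square : Shape {
    Square(double side, std::string* log) : Shape(log), side_(side) {}
    ~Square() override { *log_ += "~Square;"; }
    double Area() const override { return side_ * side_; }

private:
    double side_;
};

// Creates a Square, uses and destroys it through a base class pointer,
// and returns the recorded destructor order.
std::string DestroyThroughBase() {
    std::string log;
    {
        std::unique_ptr<Shape> s = std::make_unique<Square>(2.0, &log);
        assert(s->Area() == 4.0);  // dispatches to Square::Area
    }  // s goes out of scope here: ~Square runs, then ~Shape
    return log;
}
```

If ~Shape were not virtual, destroying the Square through a Shape pointer would be undefined behavior.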
It's to implement polymorphism. Unless you have a base class pointer
pointing to a derived object, you cannot have polymorphism here.
One of the key features of derived classes is that a pointer to a
derived class is type-compatible with a pointer to its base class.
Polymorphism is the art of taking advantage of this simple but
powerful and versatile feature, that brings Object Oriented
Methodologies to its full potential.
In C++, a special type/subtype relationship exists in which a base
class pointer or a reference can address any of its derived class
subtypes without programmer intervention. This ability to manipulate
more than one type with a pointer or a reference to a base class is
spoken of as polymorphism.
Subtype polymorphism allows us to write the kernel of our application
independent of the individual types we wish to manipulate. Rather, we
program the public interface of the base class of our abstraction
through base class pointers and references. At run-time, the actual
type being referenced is resolved and the appropriate instance of the
public interface is invoked. The run-time resolution of the
appropriate function to invoke is termed dynamic binding (by default,
functions are resolved statically at compile-time). In C++, dynamic
binding is supported through a mechanism referred to as class virtual
functions. Subtype polymorphism through inheritance and dynamic
binding provide the foundation for object-oriented programming.
The primary benefit of an inheritance hierarchy is that we can program
to the public interface of the abstract base class rather than to the
individual types that form its inheritance hierarchy, in this way
shielding our code from changes in that hierarchy. We define eval(),
for example, as a public virtual function of the abstract Query base
class. By writing code such as
_rop->eval();
user code is shielded from the variety and volatility of our query language. This not only allows for the addition, revision,
or removal of types without requiring changes to user programs, but
frees the provider of a new query type from having to recode behavior
or actions common to all types in the hierarchy itself. This is
supported by two special characteristics of inheritance: polymorphism
and dynamic binding. When we speak of polymorphism within C++, we
primarily mean the ability of a pointer or a reference of a base class
to address any of its derived classes. For example, if we define a
nonmember function eval() as follows,
// pquery can address any of the classes derived from Query
void eval( const Query *pquery ) { pquery->eval(); }
we can invoke it legally, passing in the address of an object of any of the
four query types:
int main()
{
AndQuery aq;
NotQuery notq;
OrQuery *oq = new OrQuery;
NameQuery nq( "Botticelli" ); // ok: each is derived from Query
// compiler converts to base class automatically
eval( &aq );
eval( &notq );
eval( oq );
eval( &nq );
}
whereas an attempt to invoke eval() with the address of an object not derived from Query
results in a compile-time error:
int main()
{
string name( "Scooby-Doo" );
// error: string is not derived from Query
eval( &name );
}
Within eval(), the execution of pquery->eval(); must invoke the
appropriate eval() virtual member function based on the actual class
object pquery addresses. In the previous example, pquery in turn
addresses an AndQuery object, a NotQuery object, an OrQuery object,
and a NameQuery object. At each invocation point during the execution
of our program, the actual class type addressed by pquery is
determined, and the appropriate eval() instance is called. Dynamic
binding is the mechanism through which this is accomplished.
In the object-oriented paradigm, the programmer manipulates an unknown
instance of a bound but infinite set of types. (The set of types is
bound by its inheritance hierarchy. In theory, however, there is no
limit to the depth and breadth of that hierarchy.) In C++ this is
achieved through the manipulation of objects through base class
pointers and references only. In the object-based paradigm, the
programmer manipulates an instance of a fixed, singular type that is
completely defined at the point of compilation. Although the
polymorphic manipulation of an object requires that the object be
accessed either through a pointer or a reference, the manipulation of
a pointer or a reference in C++ does not in itself necessarily result
in polymorphism. For example, consider
// no polymorphism
int *pi;
// no language-supported polymorphism
void *pvi;
// ok: pquery may address any Query derivation
Query *pquery;
In C++, polymorphism
exists only within individual class hierarchies. Pointers of type
void* can be described as polymorphic, but they are without explicit
language support — that is, they must be managed by the programmer
through explicit casts and some form of discriminant that keeps track
of the actual type being addressed.
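To illustrate what managing void* "by hand" means, here is a small sketch. Kind, Tagged and AsDouble are invented names for illustration only. The discriminant and the casts are exactly the bookkeeping that virtual functions otherwise do for you.

```cpp
#include <cassert>

// Hand-written discriminant that records what the void* really points to.
enum class Kind { Int, Double };

struct Tagged {
    Kind kind;
    void* ptr;  // the actual type is known only through `kind`
};

// The programmer, not the language, selects the right cast for each kind.
double AsDouble(const Tagged& t) {
    switch (t.kind) {
    case Kind::Int:
        return static_cast<double>(*static_cast<int*>(t.ptr));
    case Kind::Double:
        return *static_cast<double*>(t.ptr);
    }
    assert(false && "unknown Kind");
    return 0.0;
}
```

Every new type means a new enumerator and a new case in every such function, whereas with a class hierarchy the new type only has to override the virtual function.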
You seem to have asked two questions (in the title and in the end):
Why use base class pointers for derived classes?
This is the very use of polymorphism. It allows you to treat objects uniformly while allowing you to have specific implementation. If this bothers you, then I assume you should ask: Why polymorphism?
Why use virtual destructors if we can avoid base class pointers?
The problem here is you cannot always avoid base class pointers to exploit the strength of polymorphism.
Is it possible to return an instance of a class given its type name (as a string) in C++?
I have an abstract base class Base and a few derived classes. Example code:
class Base
{
/* ... */
};
class Der1 : public Base
{
/* ... */
};
class Der2 : public Base
{
/* ... */
};
And I need a function like:
Base *objectByType(const std::string &name);
The number of derived classes may change, and I don't want to switch on the name and return each new object type by hand. Is it possible to do that automatically in C++?
P.S. Usage should look like:
dynamic_cast<Der1*>(objectByType("Der1"));
I need pure, cross-platform C++ code. Using Boost is permissible.
There is a nice trick which allows you to write a factory method without a sequence of if...else if....
(Note that, AFAIK, it is indeed not possible to do exactly what you want in C++, because types are resolved at compile time. The "Factory Method" design pattern exists for this purpose.)
First, you define a global repository for your derived classes. It can be in the form std::map<std::string, Base*>, i.e. a map from the name of a derived class to an instance of that class.
For each derived class you define a default constructor which adds an object of that class to the repository under the class's name. You also define a static instance of the class:
// file: der1.h
#include "repository.h"
class Der1: public Base {
public:
Der1() { repository[std::string("Der1")] = this; }
};
// file: der1.cpp
static Der1 der1Initializer;
Constructors of static variables are run even before main(), so when your main starts you already have the repository initialized with instances of all derived classes.
Your factory method (e.g. Base::getObject(const std::string&)) needs to search the repository map for the class name. It then uses the clone() method of the object it finds to get a new object of the same type. You of course need to implement clone for each subclass.
The advantage of this approach is that when you are adding a new derived class your additions are restricted only to the file(s) implementing the new class. The repository and the factory code will not change. You will still need to recompile your program, of course.
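Here is a self-contained sketch of the registration idea. It is a variation rather than the exact scheme described above: instead of storing prototype objects and calling clone(), it stores creator functions, and the map lives in a function-local static so it does not depend on initialization order across files. Registrar, Creator and the name() member are names invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string name() const = 0;
};

using Creator = std::function<std::unique_ptr<Base>()>;

// Function-local static: constructed on first use, before any registration.
std::map<std::string, Creator>& repository() {
    static std::map<std::string, Creator> repo;
    return repo;
}

// Helper whose constructor registers a creator for T under `name`.
template <class T>
struct Registrar {
    explicit Registrar(const std::string& name) {
        assert(repository().count(name) == 0);  // each name is registered once
        repository()[name] = [] { return std::unique_ptr<Base>(new T()); };
    }
};

class Der1 : public Base {
public:
    std::string name() const override { return "Der1"; }
};
static Registrar<Der1> der1Registrar("Der1");  // would live in der1.cpp

class Der2 : public Base {
public:
    std::string name() const override { return "Der2"; }
};
static Registrar<Der2> der2Registrar("Der2");  // would live in der2.cpp

// The factory itself never changes when new derived classes are added.
std::unique_ptr<Base> objectByType(const std::string& name) {
    auto it = repository().find(name);
    if (it == repository().end()) {
        return nullptr;  // unknown type name
    }
    return it->second();
}
```

One caveat with this kind of self-registration: if the derived classes are built into a static library, the linker may leave out object files that nothing references, and then their registrars never run.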
It's not possible to do this in C++.
One option is to write a factory and switch on the name passed in, but I see you don't want to do that. C++ doesn't provide any real runtime reflection support beyond dynamic_cast, so this type of problem is tough to solve.
Yes that is possible! Check this very funny class called Activator
You can create everything by Type and string and can even give a List of parameters, so the method will call the appropriate constructor with the best set of arguments.
Unless I misunderstood, the typeid keyword should be a part of what you are looking for.
It is not possible. You have to write the objectByType function yourself:
Base* objectByType(const std::string& name) {
if (name == "Der1")
return new Der1;
else if (name == "Der2")
return new Der2;
// other possible tests
throw std::invalid_argument("Unknown type name " + name);
}
C++ doesn't support reflection.
In my opinion this is the single point where Java beats C++.
(Hope not to get too many down votes for this...)
You could achieve something like that by using a custom preprocessor, similar to what MOC does for Qt.
I have a C++ class derived from a base class in a framework.
The derived class doesn't have any data members because I need it to be freely convertible into a base class and back - the framework is responsible for loading and saving the objects and I can't change it. My derived class just has functions for accessing the data.
But there are a couple of places where I need to store some temporary local variables to speed up access to data in the base class.
mydata* MyClass::getData() {
    if ( !m_mydata ) { // set to NULL in the constructor
        m_mydata = some_long_and_complex_operation_to_get_the_data_in_the_base();
    }
    return m_mydata;
}
The problem is that if I just access the object by casting the base class pointer returned from the framework to MyClass*, the constructor for MyClass is never called and m_mydata is junk.
Is there a way of only initializing the m_mydata pointer once?
It doesn't have members and you must maintain bit-for-bit memory layout compatibility… except it does and C++ doesn't have a concept of freely-convertible.
If the existing framework allocates the base objects, you really can't derive from it. In that case, I can think of two options:
Define your own class Cached which links to Base by reference. Make the reference public and/or duplicate Base's interface without inheritance.
Use a hash table, unordered_map< Base *, mydata > mydata_cache;. This seems most appropriate to me. Use free functions to look up cache data before delegating to the Base *.
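A rough sketch of the hash-table option, under the assumption that Base is the framework's class (reduced here to a dummy struct). MyData, computeData and computeCount are placeholders invented for the example; computeCount exists only to show that the expensive step runs once per object.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Base { int raw = 0; };        // dummy stand-in for the framework class
struct MyData { std::string text; }; // placeholder for the cached result

// Stand-in for the long and complex operation on the base object.
inline MyData computeData(const Base& b) { return MyData{std::to_string(b.raw)}; }

// Counts how often the expensive computation actually ran.
inline int& computeCount() {
    static int n = 0;
    return n;
}

// Free function: looks up the side cache, computing the data on first access.
MyData& getData(Base* b) {
    assert(b != nullptr);
    static std::unordered_map<Base*, MyData> cache;
    auto it = cache.find(b);
    if (it == cache.end()) {
        ++computeCount();
        it = cache.emplace(b, computeData(*b)).first;
    }
    return it->second;
}
```

The cache is keyed by address, so an entry has to be erased when the framework destroys the object; otherwise a new object at the same address would see stale data.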
You could initialize your private variables in a separate initialization member function, so something like this:
class MyClass {
public:
    void init() {
        // Assign unconditionally: no constructor ran, so m_mydata holds junk
        // and a null check would not be reliable here.
        m_mydata = f();
    }
};
framework_class_t *fclass = framework.classfactory.makeclass();
MyClass *myclass = (MyClass*)fclass;
myclass->init();
char *mydata = myclass->getData();
It's hard to say if this is a good idea or not without knowing what framework you're using, or seeing your code. This is just the first thing that came to mind after reading your description.
You could create a wrapper for the factory of the framework. The wrapper would have the same interface, delegate calls to the framework but it could initialize the created base class instance before returning it. Of course, this requires you to change your code to use the wrapper everywhere, but if it is possible, after that you can be sure that the initialization happens properly.
A variation on this: use RAII by wrapping the base class instances in a custom smart pointer that does the initialization in its constructor. Again, if you manage to change the code everywhere to use the new wrapper type instead of the derived class directly, you are safe.
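Both ideas could look roughly like this. FrameworkObject and frameworkMake are made-up stand-ins for whatever the real framework provides, and Handle and makeHandle are hypothetical names for the wrapper and the wrapped factory.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Made-up stand-ins for the framework's class and its factory function.
struct FrameworkObject { int payload = 5; };
inline std::unique_ptr<FrameworkObject> frameworkMake() {
    return std::make_unique<FrameworkObject>();
}

// RAII wrapper: owns the framework object and computes the cached value
// exactly once, in its constructor.
class Handle {
public:
    explicit Handle(std::unique_ptr<FrameworkObject> obj) : obj_(std::move(obj)) {
        assert(obj_ != nullptr);
        cached_ = obj_->payload * 2;  // stand-in for the long computation
    }
    int cached() const { return cached_; }
    FrameworkObject& get() { return *obj_; }

private:
    std::unique_ptr<FrameworkObject> obj_;
    int cached_ = 0;
};

// Wrapper around the framework factory: callers only ever see initialized Handles.
inline Handle makeHandle() { return Handle(frameworkMake()); }
```

The cached data lives in the wrapper, so the framework object keeps its original layout and no cast to a derived class is needed.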
I have an abstract base class
class IThingy
{
virtual void method1() = 0;
virtual void method2() = 0;
};
I want to say - "all classes providing a concrete instantiation must provide these static methods too"
I am tempted to do
class IThingy
{
virtual void method1() = 0;
virtual void method2() = 0;
static virtual IThingy Factory() = 0;
};
I know that doesn't compile, and anyway it's not clear how to use it even if it did. And anyway I can just do
Concrete::Factory(); // Concrete is an implementation of IThingy
without mentioning Factory in the base class at all.
But I feel there should be some way of expressing the contract I want the implementations to sign up to.
Is there a well known idiom for this? Or do I just put it in comments? Maybe I should not be trying to force this anyway
Edit: I could feel myself being vague as I typed the question. I just felt there should be some way to express it. Igor gives an elegant answer but in fact it shows that really it doesn't help. I still end up having to do
IThingy *p;
if(..)
p = new Cl1();
else if(..)
p = new Cl2();
else if(..)
p = new Cl3();
etc.
I guess reflective languages like C#, Python or Java could offer a better solution.
The problem that you are having is partly to do with a slight violation of the single responsibility principle. You were trying to enforce the object creation through the interface. The interface should instead be more pure and only contain methods that are integral to what the interface is supposed to do.
Instead, you can take the creation out of the interface (the desired virtual static method) and put it into a factory class.
Here is a simple factory implementation that forces a factory method on a derived class.
template <class TClass, class TInterface>
class Factory {
public:
static TInterface* Create(){return TClass::CreateInternal();}
};
struct IThingy {
virtual void Method1() = 0;
};
class Thingy :
public Factory<Thingy, IThingy>,
public IThingy {
//Note the private constructor, forces creation through a factory method
Thingy(){}
public:
virtual void Method1(){}
//Actual factory method that performs work.
static Thingy* CreateInternal() {return new Thingy();}
};
Usage:
//Thingy thingy; //error C2248: 'Thingy::Thingy' : cannot access private member declared in class 'Thingy'
IThingy* ithingy = Thingy::Create(); //OK
By deriving from Factory<TClass, TInterface>, the derived class is forced by the compiler to have a CreateInternal method as soon as Create() is called. Not defining it will result in an error like this:
error C2039: 'CreateInternal' : is not
a member of 'Thingy'
There is no sure way to prescribe such a contract in C++, as there is also no way to use this kind of polymorphism, since the line
Concrete::Factory()
is always a static compile-time thing, that is, you cannot write this line where Concrete would be a yet unknown client-provided class.
You can make clients implement this kind of "contract" by making it more convenient than not providing it. For example, you could use CRTP:
class IThingy {...};
template <class Derived>
class AThingy : public IThingy
{
public:
AThingy() { &Derived::Factory; } // this will fail if there is no Derived::Factory
};
and tell the clients to derive from AThingy<their_class_name> (you could enforce this with constructor visibility tweaking, but you cannot ensure the clients don't lie about their_class_name).
Or you could use the classic solution, create a separate hierarchy of factory classes and ask the clients to provide their ConcreteFactory object to your API.
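The classic solution might be sketched like this. IThingyFactory, Cl1Factory and useThingy are names made up for the example, and method1 returns int here (instead of void as in the question) only so the result can be observed.

```cpp
#include <cassert>
#include <memory>

struct IThingy {
    virtual ~IThingy() = default;
    virtual int method1() = 0;
};

// Parallel hierarchy: creation is an ordinary virtual function on a factory object.
struct IThingyFactory {
    virtual ~IThingyFactory() = default;
    virtual std::unique_ptr<IThingy> create() const = 0;
};

struct Cl1 : IThingy {
    int method1() override { return 1; }
};

struct Cl1Factory : IThingyFactory {
    std::unique_ptr<IThingy> create() const override {
        return std::make_unique<Cl1>();
    }
};

// The API depends only on the two interfaces, never on a concrete class.
int useThingy(const IThingyFactory& factory) {
    std::unique_ptr<IThingy> thingy = factory.create();
    assert(thingy != nullptr);
    return thingy->method1();
}
```

A client who forgets to implement create() gets a compile error when trying to instantiate their factory, which is about as close to an enforced contract as this gets.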
Static methods cannot be made virtual (or abstract, for that matter) in C++.
To do what you intend, you can have an IThingy::factory method that returns a concrete instance, but you need to give factory some means of creating the instance. For instance, define a function pointer type like IThingy* (*thingy_constructor)() and add a static function to IThingy that accepts such a pointer, which defines how IThingy will construct the instance. Then, in a dependent library or class, you can call this function with an appropriate function that, in turn, knows how to properly construct an object implementing your interface.
If the factory 'initializer' has not been called, you'd want to take appropriate action, such as throwing an exception.
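A possible shape for this, as a sketch only: setConstructor, Constructor, Concrete and makeConcrete are names chosen for the example, method1 returns int so the result is observable, and factory() wraps the raw pointer in a unique_ptr for convenience.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

class IThingy {
public:
    using Constructor = IThingy* (*)();  // signature of the registered function

    virtual ~IThingy() = default;
    virtual int method1() = 0;

    // The 'initializer': tells IThingy how to build a concrete instance.
    static void setConstructor(Constructor c) {
        assert(c != nullptr);
        constructor() = c;
    }

    // Builds an instance through the registered function, or throws if
    // setConstructor was never called.
    static std::unique_ptr<IThingy> factory() {
        if (constructor() == nullptr) {
            throw std::logic_error("IThingy::setConstructor was never called");
        }
        return std::unique_ptr<IThingy>(constructor()());
    }

private:
    // Function-local static holding the registered constructor function.
    static Constructor& constructor() {
        static Constructor c = nullptr;
        return c;
    }
};

// Lives in the dependent library: knows the concrete type.
struct Concrete : IThingy {
    int method1() override { return 42; }
};
IThingy* makeConcrete() { return new Concrete; }
```

Typical usage would be one call to IThingy::setConstructor(&makeConcrete) during start-up, after which any code can call IThingy::factory() without knowing about Concrete.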