Understanding inheritance in Google's V8 C++ code base - c++

I cannot understand the implementation of inheritance in Google's V8 JavaScript engine. It clearly (?) implements an inheritance hierarchy, but seems to completely do away with virtual functions.
This is the inheritance hierarchy as detailed in the objects.h header file:
// Inheritance hierarchy:
// - Object
// - Smi (immediate small integer)
// - HeapObject (superclass for everything allocated in the heap)
// - JSReceiver (suitable for property access)
// - JSObject
// - JSArray
// ... and many more entries
Most object types are derived from Object, which is declared as follows:
// Object is the abstract superclass for all classes in the
// object hierarchy.
// Object does not use any virtual functions to avoid the
// allocation of the C++ vtable.
// Since both Smi and HeapObject are subclasses of Object no
// data members can be present in Object.
class Object {
// ... bunch of method declarations and definitions
};
The relatively simple Smi class is declared next:
class Smi: public Object {
public:
// methods declarations and static member definitions
};
and so on.
For the life of me, I cannot understand how, say, an instance of Smi can be used as an Object; there are no virtual functions and I cannot find overrides in the implementation file, objects.cc. At 17,290 lines, though, trying to understand what is going on is proving a difficult task.
As another difficulty, I found an ObjectVisitor class in the same header file (this one is more classical; it consists of virtual methods). But I could not find the equivalent Accept(Visitor*) (or similar) method in the Object base class.
Concretely, what I am asking for is a minimal example that illustrates how this inheritance pattern works.

The classes in objects.h do not actually define real C++ classes. They do not have any fields. The classes are merely facades to objects managed on the V8 JavaScript heap. Hence they cannot have any virtual functions either, because that would require putting vtable pointers into the JS heap. Instead, all dispatch is done manually, via explicit type checks and down casts.
The this pointer inside methods isn't real either. For smis, this is simply an integer. For everything else it is a pointer into the V8 heap, off by one for tagging. Any actual accessor method masks this pointer and adds an offset to access the appropriate address in the heap. The offsets of each field are also defined manually in the classes.
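A minimal sketch of the idea follows. It is not V8 code: the tag layout, the type id, the field offsets, and every helper name are assumptions made up for illustration. To stay within defined C++ behavior, the facades here wrap a tagged word in a small value class instead of reinterpreting this as V8 did; the manual type checks, casts, and offset-based field reads are the part being shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using Address = std::uintptr_t;

// Hypothetical tag layout: low bit 0 = small integer, low bit 1 = heap pointer.
constexpr Address kHeapTag = 1;

// Facade over a tagged word. No virtual functions, so no vtable pointer.
class Object {
 public:
  explicit Object(Address bits) : bits_(bits) {}
  bool IsSmi() const { return (bits_ & kHeapTag) == 0; }
  bool IsHeapObject() const { return !IsSmi(); }
  Address bits() const { return bits_; }

 private:
  Address bits_;
};

// A small integer lives directly in the tagged word.
class Smi : public Object {
 public:
  static Smi FromInt(std::intptr_t v) { return Smi(static_cast<Address>(v) << 1); }
  static Smi cast(Object o) { assert(o.IsSmi()); return Smi(o.bits()); }
  std::intptr_t value() const { return static_cast<std::intptr_t>(bits()) >> 1; }

 private:
  explicit Smi(Address bits) : Object(bits) {}
};

// A heap object is a tagged pointer; fields are read at manually defined offsets.
class HeapObject : public Object {
 public:
  static HeapObject FromAddress(void* p) {
    return HeapObject(reinterpret_cast<Address>(p) + kHeapTag);
  }
  static HeapObject cast(Object o) { assert(o.IsHeapObject()); return HeapObject(o.bits()); }
  int type() const { return ReadInt(kTypeOffset); }

 protected:
  explicit HeapObject(Address bits) : Object(bits) {}
  static constexpr int kTypeOffset = 0;
  int ReadInt(int offset) const {
    int v;
    // Remove the tag, add the field offset, and copy the bytes out.
    std::memcpy(&v, reinterpret_cast<const char*>(bits() - kHeapTag) + offset, sizeof v);
    return v;
  }
};

class JSArray : public HeapObject {
 public:
  static constexpr int kTypeId = 7;                // made-up type id
  static constexpr int kLengthOffset = sizeof(int);
  // Manual dispatch: an explicit type check followed by a downcast.
  static JSArray cast(Object o) {
    assert(HeapObject::cast(o).type() == kTypeId);
    return JSArray(o.bits());
  }
  int length() const { return ReadInt(kLengthOffset); }

 private:
  explicit JSArray(Address bits) : HeapObject(bits) {}
};
```

A caller would check IsSmi() or IsHeapObject() first and then use the matching cast(), which is the manual replacement for a virtual call.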

Take a look at Object::IsPromise() for a perfect example of how it works:
bool Object::IsPromise(Handle<Object> object) {
if (!object->IsJSObject()) return false;
auto js_object = Handle<JSObject>::cast(object);
// Promises can't have access checks.
if (js_object->map()->is_access_check_needed()) return false;
auto isolate = js_object->GetIsolate();
// TODO(dcarney): this should just be read from the symbol registry so as not
// to be context dependent.
auto key = isolate->promise_status();
// Shouldn't be possible to throw here.
return JSObject::HasRealNamedProperty(js_object, key).FromJust();
}
The way inheritance is used here is static. That is, type queries are done by a proxy or container (using some hidden magic that, at a glance, looks like it uses references to query a tag), and conversions from Object to a derived class are done by static_cast<>(). In that way, the member functions of the derived class can be called.
Note that in the above function, the type query and the cast are performed indirectly by the Handle<> class, not by Object or any of its derived classes.
Note also that the functions which accept ObjectVisitor as a parameter are rather uniformly called Iterate, and that these functions all appear on proxies or handles.

Related

C++ design issue. New to templates

I'm fairly new to c++ templates.
I have a class whose constructor takes two arguments. It's a class that keeps a list of data -- it's actually a list of moves in a chess program.
I need to keep my original class as it's used in other places, but I now need to pass extra arguments to the class, and in doing so have a few extra private data members and specialize only one of the private methods -- everything else will stay the same. I don't think a derived class helps me here, as they aren't going to be similar objects, and also the private methods are called by the constructor and it will call the virtual method of the base class -- not the derived method.
So I guess templates are going to be my answer. Just looking for any hints about how might proceed.
Thanks in advance
Your guess is wrong. Templates are no more the answer for your problem than inheritance is.
As jtbandes said in comment below your question, use composition.
Create another class that contains an instance of your existing class as a member. Forward or delegate operations to that contained object as needed (i.e. a member function in your new class calls member functions of the contained object). Add other members as needed, and operations to work with them.
Write your new code to interact with the new class. When your new code needs to interact with your old code, pass the contained object (or a reference or a pointer to it) as needed.
You might choose to implement the container as a template, but that is an implementation choice, and depends on how you wish to reuse your container.
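The composition advice above could be sketched roughly as follows. Move, MoveList, OrderedMoveList, and the integer score are hypothetical stand-ins, not the asker's real types; the point is only that the new class owns an instance of the old one, forwards to it, and adds its own ordering data.

```cpp
#include <cstddef>
#include <vector>

struct Move { int from; int to; };  // hypothetical move representation

// Stands in for the existing class, which stays unchanged.
class MoveList {
 public:
  void add(Move m) { moves_.push_back(m); }
  std::size_t size() const { return moves_.size(); }
  const Move& at(std::size_t i) const { return moves_.at(i); }

 private:
  std::vector<Move> moves_;
};

// New class: contains a MoveList and keeps an ordering next to it.
class OrderedMoveList {
 public:
  // Records the move in the contained list and keeps ranks sorted by
  // descending score as moves arrive.
  void add(Move m, int score) {
    auto it = scores_.begin();
    while (it != scores_.end() && *it >= score) ++it;
    const auto pos = it - scores_.begin();
    scores_.insert(it, score);
    order_.insert(order_.begin() + pos, list_.size());
    list_.add(m);
  }
  std::size_t size() const { return list_.size(); }  // forwarded
  const Move& at(std::size_t rank) const { return list_.at(order_[rank]); }
  // Hand the contained object to old code that expects a plain MoveList.
  const MoveList& plain() const { return list_; }

 private:
  MoveList list_;
  std::vector<int> scores_;         // scores in rank order
  std::vector<std::size_t> order_;  // rank -> index into list_
};
```

Old code keeps receiving a MoveList through plain(), while new code works with the ordered view.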
Templates are used when you want to pass parameters such as values, type names, or classes at compile time. Templates are used when you want to use exactly the same class with the same methods, but applied to different parameters. I don't think the case you described is one of these.
If they aren't going to be similar objects, you may want to create a specialized class (or collection of functions) to use from the various other classes.
Moreover, you can think of creating a base class and extending it as needed. Using a virtual private method should allow you to select the method implementation of the object at runtime instead of the method of the base class.
We could help you more if you specified what they need to share: what do your classes have in common?
The bare bones of my present code looks like this:
class move_list{
public:
move_list(const position& pos, unsigned char ply):pos_(pos),ply_(ply){
//Calculates moves and calls add_moves(ply,target_bitboard,flags) for each move
}
//Some access functions etc...
private:
//private variables
void add_moves(char,Bitboard,movflags);
};
add_moves places the moves on a vector in no particular order as they are generated. My new class, however, is exactly the same except that it requires extra data:
move_list(const position& pos, unsigned char ply,trans_table& TT,killers& kill,history& hist):pos_(pos),ply_(ply),TT_(TT),kill_(kill),hist_(hist) {
and the function add_moves needs to be changed to use the extra data to place the moves in order as it receives them. Everything else is the same. I guess I could just write an extra method to sort the list after they have all been generated, but from previous experience, sorting the list as it receives it has been quicker.

Interface for library c++

I need to create a simulation of the parabolic flight of a bullet (a simple rectangle). One of the conditions is to do all calculations inside a self-made library and to create an interface (an abstract class) for it.
I am confused about how to implement this:
Make a fully abstract class and a couple of functions (not methods of the
class) that will use the class through "get()" and "set()"?
Make a class with all calculations implemented in its methods, and just
make one "draw" method pure virtual?
I'm using WinAPI, with all graphics done through GDI,
and I would really appreciate any help.
One of the purposes of creating classes is to separate unrelated data and operations into different classes.
In your case, one part is the calculations and the other part is the layout of the results.
So, the best way to implement it is to define a class which provides all calculations and access to the results, and to implement a drawing function which uses an object of your calculation class.
This way, you will be able to use your calculations in another environment (for example, in another project of yours) without changing any code, which is natural. It keeps your platform-independent calculation code portable.
The layout part, which is platform-dependent, should be implemented separately, using only the interface provided by the calculation class.
class Trajectory
{
public:
// Constructor, computation call methods
// "GetResult()" function,
// which will return trajectory in the way you choose
...
private:
// computation functions
};
// somewhere else
void DrawTrajectory(Trajectory t)
{
// here is a place for calling all winapi functions
// with data you get using t.GetResult()
}
If an abstract class is required, you should derive the Trajectory class from an abstract class in which you declare all the functions you have to call.
In this case:
class ITrajectory
{
public:
// virtual /type/ GetResult() = 0;
// virtual /other methods/
};
class Trajectory : public ITrajectory
{
// the same as in previous definition
};
void DrawTrajectory(const ITrajectory& t) // by reference: an abstract class cannot be passed by value
{
// the same as in previous definition
}
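The outline above could be filled in roughly like this. Point, the constructor parameters, and the fixed time step are assumptions chosen for illustration, and DrawTrajectory here only counts the line segments it would draw, as a stand-in for the real GDI calls.

```cpp
#include <cstddef>
#include <vector>

struct Point { double x; double y; };  // hypothetical result type

// The interface: only pure virtual functions, no data.
class ITrajectory {
 public:
  virtual ~ITrajectory() = default;
  virtual std::vector<Point> GetResult() const = 0;
};

// Platform-independent calculation code.
class Trajectory : public ITrajectory {
 public:
  Trajectory(double vx, double vy, double dt, std::size_t steps)
      : vx_(vx), vy_(vy), dt_(dt), steps_(steps) {}

  // Samples x = vx*t and y = vy*t - g*t^2/2 at t = 0, dt, 2*dt, ...
  std::vector<Point> GetResult() const override {
    const double g = 9.81;
    std::vector<Point> pts;
    for (std::size_t i = 0; i < steps_; ++i) {
      const double t = dt_ * static_cast<double>(i);
      pts.push_back({vx_ * t, vy_ * t - 0.5 * g * t * t});
    }
    return pts;
  }

 private:
  double vx_, vy_, dt_;
  std::size_t steps_;
};

// Platform-dependent side. It sees only the interface and takes it by
// reference. Placeholder body: returns the number of segments to draw.
std::size_t DrawTrajectory(const ITrajectory& t) {
  const std::vector<Point> pts = t.GetResult();
  return pts.size() < 2 ? 0 : pts.size() - 1;
}
```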
When you are talking about Windows, libraries, and abstract classes as interfaces, I wonder if you are thinking of sharing classes between DLLs.
There is a __declspec(dllexport) keyword, but using it on classes and/or class members is a bad idea. You end up with all your library code closely coupled and completely dependent on using the same compiler version and settings for everything.
A much better option, which allows you to upgrade compiler for one DLL at a time, for instance, is to pass interface pointers. The key here is that the consumer of the library knows nothing about the class layout. The interface doesn't describe data members or non-virtual functions which might get inlined. Only public virtual functions appear in the interface, which is just a class defined in the public header.
The DLL has the real implementation which inherits from the interface. All the consumer has is the virtual function table and a factory (plain old C-compatible function) which returns a pointer to a new object.
If you do that, you can change the implementation any way you like without changing the binary interface which consumers depend on, so they continue to work without a recompile. This is the basis of how COM objects work in Windows.
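A rough sketch of that arrangement, collapsed into one file for readability. IBullet, CreateBullet, and Release are hypothetical names; in a real DLL the factory would additionally be exported (for example with __declspec(dllexport)), which is left out here so the sketch stays portable.

```cpp
// ---- public header: all the consumer ever sees ----
class IBullet {
 public:
  virtual double HeightAt(double t) const = 0;
  // The object is destroyed by the module that created it; the consumer
  // never calls delete itself.
  virtual void Release() = 0;

 protected:
  ~IBullet() = default;  // non-virtual and protected: no delete via the interface
};

// Plain C-compatible factory function.
extern "C" IBullet* CreateBullet(double vy);

// ---- inside the DLL: the real implementation ----
namespace {
class Bullet final : public IBullet {
 public:
  explicit Bullet(double vy) : vy_(vy) {}
  double HeightAt(double t) const override { return vy_ * t - 0.5 * 9.81 * t * t; }
  void Release() override { delete this; }

 private:
  double vy_;  // data layout is invisible to consumers
};
}  // namespace

extern "C" IBullet* CreateBullet(double vy) { return new Bullet(vy); }
```

Because consumers only hold an IBullet*, members can be added to or removed from Bullet without changing anything they compiled against.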

Ways to make (relatively) safe assumptions about the type of concrete subclasses?

I have an interface (defined as an abstract base class) that looks like this:
class AbstractInterface
{
public:
virtual bool IsRelatedTo(const AbstractInterface& other) const = 0;
};
And I have an implementation of this (constructors etc omitted):
class ConcreteThing : public AbstractInterface
{
public:
virtual bool IsRelatedTo(const AbstractInterface& other) const
{
return m_ImplObject.has_relationship_to(other.m_ImplObject);
}
private:
ImplementationObject m_ImplObject;
};
The AbstractInterface forms an interface in Project A, and the ConcreteThing lives in Project B as an implementation of that interface. This is so that code in Project A can access data from Project B without having a direct dependency on it - Project B just has to implement the correct interface.
Obviously the line in the body of the IsRelatedTo function cannot compile - that instance of ConcreteThing has an m_ImplObject member, but it can't assume that all AbstractInterfaces do, including the other argument.
In my system, I can actually assume that all implementations of AbstractInterface are instances of ConcreteThing (or subclasses thereof), but I'd prefer not to be casting the object to the concrete type in order to get at the private member, or encoding that assumption in a way that will crash without a diagnostic later if this assumption ceases to hold true.
I cannot modify ImplementationObject, but I can modify AbstractInterface and ConcreteThing. I also cannot use the standard RTTI mechanism for checking a type prior to casting, or use dynamic_cast for a similar purpose.
I have a feeling that I might be able to overload IsRelatedTo with a ConcreteThing argument, but I'm not sure how to call it via the base IsRelatedTo(AbstractInterface) method. It wouldn't get called automatically as it's not a strict reimplementation of that method.
Is there a pattern for doing what I want here, allowing me to implement the IsRelatedTo function via ImplementationObject::has_relationship_to(ImplementationObject), without risky casts?
(Also, I couldn't think of a good question title - please change it if you have a better one.)
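One way the overload idea from the question could be wired up without casts or RTTI is double dispatch. The sketch below is only an illustration under stated assumptions: IsTargetOf is a hypothetical extra virtual function, ImplementationObject is reduced to a toy, and the interface needs a forward declaration of ConcreteThing (but not its definition).

```cpp
class ConcreteThing;  // forward declaration only

class AbstractInterface {
 public:
  virtual ~AbstractInterface() = default;
  virtual bool IsRelatedTo(const AbstractInterface& other) const = 0;
  // Second hop of the double dispatch (hypothetical name): the argument is
  // already known to be a ConcreteThing when this is called.
  virtual bool IsTargetOf(const ConcreteThing& source) const = 0;
};

// Toy stand-in for the class that cannot be modified.
struct ImplementationObject {
  int group;
  bool has_relationship_to(const ImplementationObject& o) const {
    return group == o.group;
  }
};

class ConcreteThing : public AbstractInterface {
 public:
  explicit ConcreteThing(int group) : m_ImplObject{group} {}

  // First hop: pass ourselves, with our static type, to the other object.
  bool IsRelatedTo(const AbstractInterface& other) const override {
    return other.IsTargetOf(*this);
  }
  // Second hop: both sides are ConcreteThing now, so the private members of
  // both are accessible and no cast is involved.
  bool IsTargetOf(const ConcreteThing& source) const override {
    return source.m_ImplObject.has_relationship_to(m_ImplObject);
  }

 private:
  ImplementationObject m_ImplObject;
};
```

If another implementation of AbstractInterface appears later, it has to implement IsTargetOf itself, so the assumption fails at compile time rather than silently at run time.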

What are benefits of using PrivateClass containing data of the Class?

class MyClassPrivate
{
//My members.
};
//and then
class MyClass {
private:
MyClassPrivate* const d;
};
What is the reason for using this 'pattern'? What is it properly called?
This is called "Pointer to implementation" or "pimpl". See http://en.wikibooks.org/wiki/C++_Programming/Idioms#Pointer_To_Implementation_.28pImpl.29
When you use this pattern you would forward declare the implementation class, and declare the body elsewhere, i.e.:
// header
class MyClassPrivate;
class MyClass {
public:
MyClass();
~MyClass();
private:
MyClassPrivate* const d;
};
// cpp
class MyClassPrivate {
};
MyClass::MyClass() : d(new MyClassPrivate) {}
MyClass::~MyClass() { delete d; }
The advantage of doing this is that the implementation of MyClass is not exposed to other users of MyClass. If the implementation changes, the other users of MyClass do not need to be recompiled. Any header files that have to be included for members also need not be exposed, which improves compilation time.
The most common use is the Pimpl idiom.
Why should the “PIMPL” idiom be used?
Pimpl idiom
The Pimpl
The Pimpl idiom describes a way for making your header files
impervious to change. You often hear advices like "Avoid change your
public interface!" So you can modify your private interface, but how
can you avoid recompilation when your header file defines the private
methods. This is what the Pimpl does – Reduce compilation damages when
your private interface changes[3].
From Here:
Benefits:
Changing private member variables of a class does not require recompiling classes that depend on it, thus make times are faster, and the FragileBinaryInterfaceProblem is reduced.
The header file does not need to #include classes that are used 'by value' in private member variables, thus compile times are faster.
This is sorta like the way SmallTalk automatically handles classes... more pure encapsulation.
Drawbacks:
More work for the implementer.
Doesn't work for 'protected' members where access by subclasses is required.
Somewhat harder to read code, since some information is no longer in the header file.
Run-time performance is slightly compromised due to the pointer indirection, especially if function calls are virtual (branch prediction for indirect branches is generally poor).
How to do it:
Put all the private member variables into a struct.
Put the struct definition in the .cpp file.
In the header file, put only the ForwardDeclaration of the struct.
In the class definition, declare a (smart) pointer to the struct as the only private member variable.
The constructors for the class need to create the struct.
The destructor of the class needs to destroy the struct (possibly implicitly due to use of a smart pointer).
The assignment operator and CopyConstructor need to copy the struct appropriately or else be disabled.
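The steps above can be sketched as follows, with the header and the .cpp file collapsed into one listing. Widget, Impl, and the name field are made up for the example, and std::unique_ptr plays the role of the smart pointer.

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- widget.h: what clients include ----
class Widget {
 public:
  explicit Widget(std::string name);
  ~Widget();                               // defined where Impl is complete
  Widget(const Widget& other);             // copies the struct
  Widget& operator=(const Widget& other);
  const std::string& name() const;
  void rename(std::string name);

 private:
  struct Impl;                 // forward declaration only
  std::unique_ptr<Impl> d_;    // the only private member variable
};

// ---- widget.cpp: private details live here ----
struct Widget::Impl {
  std::string name;
};

Widget::Widget(std::string name) : d_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;   // unique_ptr destroys the struct
Widget::Widget(const Widget& other) : d_(new Impl(*other.d_)) {}
Widget& Widget::operator=(const Widget& other) {
  *d_ = *other.d_;             // deep copy of the private state
  return *this;
}
const std::string& Widget::name() const { return d_->name; }
void Widget::rename(std::string name) { d_->name = std::move(name); }
```

The destructor is declared in the header but defined after Impl, because std::unique_ptr needs the complete type at the point where it deletes the object.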
You can use this for the PIMPL idiom, when you want to separate interface from implementation.
Many design patterns use a "pointer" to a private attribute as well, such as the Strategy Pattern. This pattern allows you to select a different algorithm at run-time.
Also, if you make the manipulation of your data adhere to the same interface, you can encapsulate the data in a Private Class, make this class a part of a hierarchy and switch between different data implementations during run time (or compile time for that matter :)).
A good example of this is a geometrical class that holds data on polygons. Each Polygon provides access to the points, you can also delete the Polygon edge and do various other topological operations. If you provide an abstract base class for the Polygon class with the methods such as deletePoint, addPoint, swapEdge, you can test different Polygon implementations.
You may define a polygon as a list of Point types directly, and store the points in different containers (list or vector). The Polygon class may be defined via indirect addressing, where the polygon is actually a list of IDs to the list of points (I am talking about lists in the general sense). This way, you can test different algorithms of the PolygonGeometry class and see how they work with different Polygon implementations.
There is a design principle behind this: Prefer Composition to Inheritance. Whenever you are using Composition and you are relying on the type to be determined at run-time, you will have a private attribute pointer.
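A small sketch of the polygon example, assuming hypothetical names throughout (PolygonStorage, DirectStorage, IndexedStorage, sumX). The Polygon class holds a private pointer to an abstract storage, and the concrete storage is picked at run time.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Point { double x; double y; };

// Common interface that every data implementation adheres to.
class PolygonStorage {
 public:
  virtual ~PolygonStorage() = default;
  virtual void addPoint(Point p) = 0;
  virtual std::size_t size() const = 0;
  virtual Point point(std::size_t i) const = 0;
};

// Variant 1: the polygon stores its points directly.
class DirectStorage : public PolygonStorage {
 public:
  void addPoint(Point p) override { points_.push_back(p); }
  std::size_t size() const override { return points_.size(); }
  Point point(std::size_t i) const override { return points_.at(i); }

 private:
  std::vector<Point> points_;
};

// Variant 2: indirect addressing, the polygon stores IDs into a shared pool.
class IndexedStorage : public PolygonStorage {
 public:
  explicit IndexedStorage(std::shared_ptr<std::vector<Point>> pool)
      : pool_(std::move(pool)) {}
  void addPoint(Point p) override {
    pool_->push_back(p);
    ids_.push_back(pool_->size() - 1);
  }
  std::size_t size() const override { return ids_.size(); }
  Point point(std::size_t i) const override { return pool_->at(ids_.at(i)); }

 private:
  std::shared_ptr<std::vector<Point>> pool_;
  std::vector<std::size_t> ids_;
};

// Composition: the private pointer decides the implementation at run time.
class Polygon {
 public:
  explicit Polygon(std::unique_ptr<PolygonStorage> storage) : d_(std::move(storage)) {}
  void addPoint(Point p) { d_->addPoint(p); }
  std::size_t size() const { return d_->size(); }
  Point point(std::size_t i) const { return d_->point(i); }

 private:
  std::unique_ptr<PolygonStorage> d_;
};

// An algorithm written once against Polygon works with either storage.
double sumX(const Polygon& poly) {
  double total = 0.0;
  for (std::size_t i = 0; i < poly.size(); ++i) total += poly.point(i).x;
  return total;
}
```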

Why can't we create objects for an abstract class in C++?

I know it is not allowed in C++, but why? What if it was allowed, what would the problems be?
Judging by your other question, it seems you don't understand how classes operate. Classes are a collection of functions which operate on data.
Functions themselves contain no memory in a class. The following class:
struct dumb_class
{
void foo(){}
void bar(){}
void baz(){}
// .. for all eternity
int i;
};
Has a size of int. No matter how many functions you have ever, this class will only take up the space it takes to operate on an int. When you call a function in this class, the compiler will pass you a pointer to the place where the data in the class is stored; this is the this pointer.
So, the functions lie in memory somewhere, loaded once at the beginning of your program, and wait to be called with data to operate on.
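A quick way to see this for yourself, as a small sketch: one_int and the self() helper are additions for the demonstration and are not part of the class above.

```cpp
struct one_int {
  int i;
};

struct dumb_class {
  void foo() {}
  void bar() {}
  void baz() {}
  // Returns the hidden 'this' pointer the compiler passes to every member function.
  const void* self() const { return this; }
  int i;
};

// Non-virtual member functions add nothing to the size of an object.
static_assert(sizeof(dumb_class) == sizeof(one_int),
              "member functions take no space in the object");
```

Calling d.self() on an object d gives back &d, which is the data the member functions operate on.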
Virtual functions are different. The C++ standard does not mandate how virtual functions are implemented, only what their behavior should be. Typically, implementations use what's called a virtual table, or vtable for short. A vtable is a table of function pointers which, like normal functions, only get allocated once.
Take this class, and assume our implementor uses vtables:
struct base { virtual void foo(void); };
struct derived : base { virtual void foo(void); };
The compiler will need to make two vtables, one for base and one for derived. They will look something like this:
typedef /* some generic function pointer type */ func_ptr;
func_ptr __baseTable[] = {&base::foo};
func_ptr __derivedTable[] = {&derived::foo};
How does it use this table? When you create an instance of a class above, the compiler slips in a hidden pointer, which will point to the correct vtable. So when you say:
derived d;
base* b = &d;
b->foo();
Upon executing the last line, it goes to the correct table (__derivedTable in this case), goes to the correct index (0 in this case), and calls that function. As you can see, that will end up calling derived::foo, which is exactly what should happen.
Note, for later, this is the same as doing derived::foo(b), passing b as the this pointer.
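The mechanism described above can be imitated by hand. This is only a sketch of what a compiler might generate, written with ordinary structs and function pointers; the names (base_obj, derived_obj, vtable, call_foo) are invented for the illustration, and real compilers are free to do it differently.

```cpp
#include <string>

struct base_obj;
using foo_fn = std::string (*)(const base_obj* self);

// One table per "class", allocated once.
struct vtable {
  foo_fn foo;
};

// Every object carries a hidden pointer to its class's table.
struct base_obj {
  const vtable* vptr;
};

// "Derives" from base_obj by embedding it as the first member.
struct derived_obj {
  base_obj head;
  int extra;
};

std::string base_foo(const base_obj*) { return "base::foo"; }
std::string derived_foo(const base_obj*) { return "derived::foo"; }

const vtable base_table = {&base_foo};
const vtable derived_table = {&derived_foo};

base_obj make_base() { return base_obj{&base_table}; }
derived_obj make_derived() { return derived_obj{base_obj{&derived_table}, 0}; }

// Equivalent of b->foo(): follow the hidden pointer, pick the slot,
// and pass the object itself as the 'this' argument.
std::string call_foo(const base_obj* b) { return b->vptr->foo(b); }
```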
So, when virtual methods are present, the size of the class will increase by one pointer (the pointer to the vtable). Multiple inheritance changes this a bit, but it's mostly the same. You can get more details at C++-FAQ.
Now, to your question. I have:
struct base { virtual void foo(void) = 0; }; // notice the = 0
struct derived : base { virtual void foo(void); };
and base::foo has no implementation. This makes base::foo a pure virtual function. So, if I were to call it, like above:
derived d;
base* b = &d;
base::foo(b);
What behavior should we expect? Being a pure virtual method, base::foo doesn't even exist. The above code is undefined behavior, and could do anything from nothing to crashing, with anything in between. (Or worse.)
Think about what a pure virtual function represents. Remember, functions take no data, they only describe how to manipulate data. A pure virtual function says: "I want to call this method and have my data be manipulated. How you do this is up to you."
So when you say, "Well, let's call an abstract method", you're replying to the above with: "Up to me? No, you do it." to which it will reply "##^##^". It simply doesn't make sense to tell someone who's saying "do this", "no."
To answer your question directly:
"why we cannot create an object for an abstract class?"
Hopefully you see now, abstract classes only define the functionality the concrete class should be able to do. The abstract class itself is only a blue-print; you don't live in blue-prints, you live in houses that implement the blue-prints.
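The blueprint analogy can be checked in code. In this sketch, blueprint, house, and build are made-up names; std::is_abstract from <type_traits> reports whether a class has at least one pure virtual function.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// The blueprint: says what must be possible, not how it is done.
struct blueprint {
  virtual ~blueprint() = default;
  virtual std::string describe() const = 0;
};

// The house: implements everything the blueprint asks for.
struct house : blueprint {
  std::string describe() const override { return "a house"; }
};

static_assert(std::is_abstract<blueprint>::value, "blueprint cannot be instantiated");
static_assert(!std::is_abstract<house>::value, "house can be instantiated");

// Writing 'blueprint b;' here would be rejected by the compiler.
// Objects are always of a concrete type, used through the abstract one.
std::unique_ptr<blueprint> build() {
  return std::unique_ptr<blueprint>(new house);
}
```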
The problem is simply this:
what should the program do when an abstract method is called?
and even worse: what should be returned for a non-void function?
The application would probably have to crash or throw a runtime exception, and thus this would cause trouble. You can't dummy-implement every abstract function.
A class can simply be declared abstract where it has no abstract methods. I guess that could be instantiated in theory but the class designer doesn't want you to. It may have unintended consequences.
Usually however abstract classes have abstract methods. They can't be instantiated for the simple reason that they're missing those methods.
Because logically it does not make any sense.
An abstract class is a description that is incomplete.
It indicates what things need to be filled out to make it complete, but without those bits it's not complete.
My first example was a chess game:
The game has lots of pieces of different type (King,Queen,Pawn ... etc).
But there are no actual objects of type piece; all objects are instances of types derived from piece. How can you have an object of something that is not fully defined? There is no point in creating an object of type piece, as the game does not know how it moves (that is the abstract part). It knows it can move, but not how it does it.
Abstract classes are non-instantiable by definition. They require that there be derived, concrete classes. What else would an abstract class be if it didn't have pure virtual (unimplemented) functions?
It's the same class of question as why can't I change the value of a const variable, why can't I access private class members from other classes or why can't I override final methods.
Because that's the purpose of these keywords: to prevent you from doing so. The author of the code deemed doing so dangerous, undesired, or simply impossible for some abstract reason, such as the lack of essential functions that need to be added by specific child classes. It isn't really that you can't instantiate a class because it is abstract. It's that the inability to instantiate a class is what defines it as abstract (and if a class that can't be instantiated isn't abstract, it's an error. The same goes the other way: if an instance of a given class makes sense, the class shouldn't be marked as abstract).
Why can't we create an object of an abstract class?
Simply put, an abstract class contains abstract methods (functions without a body), and the abstract class itself gives them no functionality. If we gave functionality to the abstract methods, there would be no difference between an abstract class and an ordinary class with virtual functions. So if we could create an object of an abstract class, there would be no point in calling its abstract methods, since they have no functionality. That is why languages do not allow us to create an object of an abstract class.
Abstract classes instantiated would be pretty useless, because you would be seeing a lot more of "pure virtual function called". :)
It's like: we all know that a car would have 3 pedals and a steering wheel and a gear stick. Now, if that would be it, and there'd be an instance of 3 pedals and gear stick and a wheel, I'm not buying it, I want a car, like with seats, doors, AC etc. with pedals actually doing something apart from being in existence and that's what abstract class doesn't promise me, the ones implementing it do.
Basically, creating an object is responsible for allocating memory for member variables and member functions. But here, a pure virtual function only has a declaration; its definition is in the derived class. So creating an object generates an error.