I want to create an instance of a class and place it in shared memory so the same instance can be called from multiple processes. However, this class has virtual methods which I think may cause problems as I have read the mapped data can't contain pointers, which would be the case here with the vtable in the class. Will it work?
As Kerrek SB commented, you cannot map a class containing virtual methods. But you can probably make a simple struct or class without virtuals, map that, and then give a pointer to it to another class which does have virtuals and uses the plain struct as its implementation. Basically, the Pimpl idiom.
If needed, you can even do something like virtual dispatch yourself by storing a "type" integer in the plain struct, and inspecting it to decide which functions to invoke.
Can someone explain how the virtual table for each of these classes is stored in memory? When we call a function through a pointer, how is the call made using the address location? Can we get the memory allocation size of a virtual table using a class pointer? I want to see how many memory blocks are used by the virtual table of a class. How can I see it?
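class Base
{
public:
    // FunctionPointer *__vptr;  // conceptual only: the compiler adds the hidden vtable pointer itself
    virtual void function1() {}
    virtual void function2() {}
};

class D1 : public Base
{
public:
    virtual void function1() {}  // overrides Base::function1
};

class D2 : public Base
{
public:
    virtual void function2() {}  // overrides Base::function2
};

int main()
{
    D1 d1;
    Base *dPtr = &d1;
    dPtr->function1();  // dispatches to D1::function1
}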
Thanks in advance!
The first point to keep in mind is a disclaimer: none of this is actually guaranteed by the standard. The standard says what the code needs to look like and how it should work, but doesn't actually specify exactly how the compiler needs to make that happen.
That said, essentially all C++ compilers work quite similarly in this respect.
So, let's start with non-virtual functions. They come in two classes: static and non-static.
The simpler of the two are static member functions. A static member function is almost like a global function that's a friend of the class, except that it also needs the class's name as a prefix to the function name.
Non-static member functions are a little more complex. They're still normal functions that are called directly--but they're passed a hidden pointer to the instance of the object on which they were called. Inside the function, you can use the keyword this to refer to that instance data. So, when you call something like a.func(b);, the code that's generated is pretty similar to the code you'd get for func(&a, b);
Now let's consider virtual functions. Here's where we get into vtables and vtable pointers. We have enough indirection going on that it's probably best to draw some diagrams to see how it's all laid out. Here's pretty much the simplest case: one instance of one class with two virtual functions:
So, the object contains its data and a pointer to the vtable. The vtable contains a pointer to each virtual function defined by that class. It may not be immediately apparent, however, why we need so much indirection. To understand that, let's look at the next (ever so slightly) more complex case: two instances of that class:
Note how each instance of the class has its own data, but they both share the same vtable and the same code--and if we had more instances, they'd still all share the one vtable among all the instances of the same class.
Now, let's consider derivation/inheritance. As an example, let's rename our existing class to "Base", and add a derived class. Since I'm feeling imaginative, I'll name it "Derived". As above, the base class defines two virtual functions. The derived class overrides one (but not the other) of those:
Of course, we can combine the two, having multiple instances of each of the base and/or derived class:
Now let's delve into that in a little more detail. The interesting thing about derivation is that we can pass a pointer/reference to an object of the derived class to a function written to receive a pointer/reference to the base class, and it still works--but if you invoke a virtual function, you get the version for the actual class, not the base class. So, how does that work? How can we treat an instance of the derived class as if it were an instance of the base class, and still have it work? To do it, each derived object has a "base class subobject". For example, let's consider code like this:
struct simple_base {
int a;
};
struct simple_derived : public simple_base {
int b;
};
In this case, when you create an instance of simple_derived, you get an object containing two ints: a and b. The a (base class part) is at the beginning of the object in memory, and the b (derived class part) follows that. So, if you pass the address of the object to a function expecting an instance of the base class, it uses only the part(s) that exist in the base class, which the compiler places at the same offsets in the object as they'd be in an object of the base class, so the function can manipulate them without even knowing that it's dealing with an object of the derived class. Likewise, if you invoke a virtual function, all it needs to know is the location of the vtable pointer. As far as it cares, something like Base::func1 basically just means it follows the vtable pointer, then uses a pointer to a function at some specified offset from there (e.g., the fourth function pointer).
At least for now, I'm going to ignore multiple inheritance. It adds quite a bit of complexity to the picture (especially when virtual inheritance gets involved) and you haven't mentioned it at all, so I doubt you really care.
As to accessing any of this, or using in any way other than simply calling virtual functions: you may be able to come up with something for a specific compiler--but don't expect it to be portable at all. Although things like debuggers often need to look at such stuff, the code involved tends to be quite fragile and compiler-specific.
The virtual table is shared between all instances of a class. More precisely, it lives at the "class" level rather than the instance level. Each instance carries the overhead of a pointer to the virtual table if there are virtual functions or virtual base classes in its hierarchy.
The table itself is at least the size necessary to hold a pointer for each virtual function. Other than that, it is an implementation detail how it's actually defined. Check here for a SO question with more details about this.
First of all, the following answer contains almost everything you want to know regarding virtual tables:
https://stackoverflow.com/a/16097013/8908931
If you are looking for something a little more specific (with the regular disclaimer that this might change between platforms, compilers, and CPU architectures):
When needed, a virtual table is created for a class. The class will have only one instance of the virtual table, and each object of the class will have a pointer to the memory location of this virtual table. The virtual table itself can be thought of as a simple array of pointers.
When you assign the derived pointer to the base pointer, the object it points to still contains the pointer to the virtual table. This means that, through the base pointer, you reach the virtual table of the derived class. The compiler will direct the call to an offset into the virtual table, which contains the actual address of the function from the derived class.
Not really. Usually at the start of an object, there is a pointer to the virtual table itself. But this will not help you too much, as it is just an array of pointers, with no real indication of its size.
Making a very long answer short: for an exact size, you can find this information in the executable (or in the segments loaded from it into memory). With enough knowledge of how the virtual table works, you can get a pretty accurate estimate, given that you know the code, the compiler, and the target architecture.
For the exact size, you can find this information either in the executable or in the memory segments that are loaded from the executable. On Linux and similar systems, an executable is usually an ELF file; this kind of file contains the information needed to run a program. Part of this information is the symbols for various language constructs such as variables, functions and virtual tables. For each symbol, it records the size the symbol takes in memory. So, bottom line, you will need the symbol name of the virtual table and enough knowledge of ELF to extract what you want.
The answer that Jerry Coffin gave is excellent in explaining how virtual function pointers work to achieve runtime polymorphism in C++. However, I believe that it is lacking in answering where in memory the vtable is stored. As others have pointed out this is not dictated by the standard.
However, there is an excellent blog post(s) by Martin Kysel that goes into great detail about where virtual tables are stored. To summarize the blog post(s):
One vtable is created for every class (not instance) with virtual functions. Each instance of this class points to the same vtable in memory
Each vtable is stored in read only memory of the resulting binary file
The disassembly for each function in the vtable is stored in the text section of the resulting ELF binary
Attempting to write over the vtable, located in read only memory, results in a Segmentation fault (as expected)
Each class has a pointer to a list of functions. The functions are in the same order for derived classes, and the specific functions that are overridden change at their position in the list.
When you point with a base pointer type, the pointed to object still has the correct _vptr.
Base's
Base::function1()
Base::function2()
D1's
D1::function1()
Base::function2()
D2's
Base::function1()
D2::function2()
Classes further derived from D1 or D2 will just add their new virtual functions to the list below the 2 current ones.
When calling a virtual function we just call the corresponding index, function1 will be index 0
So your call
dPtr->function1();
is actually
dPtr->_vptr[0]();
I need to be able to retrieve a type based on an ID.
To achieve this, I am intending to create a base class for the types, which declares all the functions as pure virtual functions.
class Base
{
public:
    virtual int function1() = 0;
    virtual int function2() = 0;
};
I then create derived classes, instantiate these and store the pointers to the instances in an std::unordered_map of base pointers.
std::unordered_map<int, Base*> types;
Each element of the std::unordered_map will point to a different derived class type (i.e. no duplicates).
Then I can retrieve the type I need by getting the associated member from the std::unordered_map.
Base* ptr = types[4];
// Use type...
int num = ptr->function1();
Now I know this could come down to opinion, but is this a horrible use of C++? Is there a cleaner way to do this?
Update
I could use instances of the derived classes instead of their respective IDs; however, I would likely have to dynamically allocate each one on the heap (i.e. using new), and I am going to be creating a lot of these, so the overhead from dynamic allocation could have an unwanted effect on performance.
Now I know this could come down to opinion, but is this a horrible use of C++? Is there a cleaner way to do this?
I can see only one thing in the example code that could conceivably be called horrible, and that's the lack of abstraction. However, the concept itself is perfectly OK.
There are two pieces of machinery here: a polymorphic base class, and a map that gives you an object instance based on some kind of opaque id -- an int in the example, but it could also be e.g. an std::string and nothing would really change.
The base class is as classic as it gets, with the provision that it should also have a virtual destructor as per standard good practice. Apart from that there's nothing to see there.
The map is conceptually what you could call a "singleton factory". It's a factory because you call a method passing an id and you get back an object that corresponds to that id, with the only distinction that you get the same instance back every time you call the method with the same id. From a high-level perspective, this is not at all interesting to the client. The client just cares that e.g. every time they pass in 4, they get back a Widget*. How the factory gets hold of that instance in order to return the pointer is simply not important when you are looking at client code. So there's nothing out of the ordinary here also, at least on the design level.
Extending this line of thought leads to the possible deficiencies: there is no need for the client to know how the factory satisfies requests, but exposing the implementation (direct access to std::unordered_map) lets the client know everything, and it also lets them do things with the factory that they should perhaps not be allowed to do (e.g. unregistering types). This can be easily fixed by creating a new class that encapsulates the map and exposes a public interface that you have free rein to design as you see fit.
Is this a horrible use of C++? -- Yes. Perhaps a better way to put it is "no use of polymorphism", i.e. you are not using polymorphism (properly).
You don't need to track types in the first place, as the compiler does it for you through polymorphism:
1) just declare the method virtual in the base class (you already did that).
2) override the method in the child classes.
For example:
class Base {
public:
    virtual int myId() { return 0; }
};
class Child1 : public Base {
public:
    int myId() override { return 1; }  // overrides Base::myId
};
I'm very new to C++ and am trying to find a good pattern for having the following:
A base class that defines several (virtual?) functions and properties.
Several varying classes that inherit from this base class and override some or all of the virtual functions and work with the parent's properties.
Then my plan was to have a single variable that can store any one of the classes and call functions defined in the base one. Sometimes I will swap out the object in this variable for one of the others.
Does this seem sensible and how can I store objects like this of varying classes? I was hoping to just be able to define the variable as BaseClass myCurrentObject; and then do something like myCurrentObject = ChildClassA(); or myCurrentObject = ChildClassB(); etc but it doesn't seem to be that simple!
What you describe is exactly polymorphism, so yes, this is reasonable.
As proposed in the comments, you have to use pointers to refer to the objects, as in
BaseClass* obj = new ChildClassA();
obj->call_some_virtual_function();
// ^ Will call the most derived definition of the function
Of course, if you are writing any serious program, you should not use plain pointers here. They simply make it too likely that you will sooner or later forget to delete the old object pointed to by obj before assigning a new one to it. But since you said you are new to C++ and might just want to try things out, the above could be fine.
The reason why it cannot be as easy as simply writing BaseClass obj = ChildClassA(); is that this defines obj to be an object of type BaseClass. Since ChildClassA might have more members than BaseClass, you cannot store a ChildClassA in a spot that was only intended for a BaseClass object; the initialization copies just the BaseClass part and discards the rest (known as object slicing). Polymorphism therefore needs storage whose size depends on the actual type, which is why you normally refer to the object through a pointer or reference, typically to an object allocated on the heap.
If Base is a base class and Derived a derived class and there are 25 instances of Derived, how are the vtables set up to be accessed by all the instances? Where are they loaded in the memory?
Compilers are allowed to implement dynamic dispatch however they want in C++. I don't think there is actually any requirement to use a vtable at all, but it would be very unusual to find a compiler that didn't.
In most cases I think that each class that contains some virtual methods will own a single vtable (so if I had 5 instances of class A I would still have only 1 vtable), but this behaviour should not be relied upon in any way.
Classes without virtual functions have no need for vtables as far as I know.
Reading your question, it seems as if you might think that each object has its own copy of the code. I'm not sure that's what you meant, but just in case ...
Google something like: "what does a c++ object look like in memory"
There will be one vtable somewhere in memory, probably in the same place as the code.
Each instance of the class will contain a single pointer to the vtable for that class, so in your case all 25 instances will contain a pointer to one copy of the vtable.
Multiple and virtual inheritance complicate things, but the principle is the same.
This question is likely a "what does the C++ standard say" thing, but my Google searching hasn't given me the answer I'm looking for.
I know that when you have classes, and you have one class inherit from another class, you get into the world of virtual function tables, since the code needs to figure out which class contains the function you're trying to call.
But what about inheritance between structs that only contain data? For example, if you have a widget struct, and then you want a specialized version of that struct that has a few extra variables, but you still want to be able to pass its original data to functions that handle widgets, it would be simpler to inherit from the original widget struct than to make your code handle two types of widget structs. Is there any overhead when there is only data involved in the inheritance? Is the specialized widget still a simple struct (in terms of memory layout) with both data combined, or is the original widget data stored separate from the new data?
Ultimately, I'd like to keep my data simple and contiguous, as a basic struct would be, and I don't know if inheriting data would break that.
In the C++ memory model an object is always laid out in contiguous memory. You need to use members pointing to data outside this object if you want to have non-contiguous memory. That is, if you inherit from any class, whether it is a plain struct or has virtual functions, the actual object is always contiguous. There are a few other implications about types which may be of interest: if a class is trivially copyable you can e.g. memcpy() the object, and if it is a standard-layout type its layout is compatible with C. C++2003 didn't allow inheritance in such (POD) types at all; C++2011 allows inheritance in standard-layout types, but only if all non-static data members are declared in the same class of the hierarchy.
I know that when you have classes, and you have one class inherit from another class, you get into the world of virtual function tables
Only if you have virtual functions...
So, to answer your question: if you have a plain struct without virtual functions, then the compiler won't generate a virtual function table.
And BTW, you shouldn't be worrying about it: that table is per class, and you only need a single extra pointer per instance (if you use single inheritance).