In C++, I can easily create a function pointer by taking the address of a member function. However, is it possible to change the address of that function?
I.e. say I have funcA() and funcB() in the same class, defined differently. I'm looking to change the address of funcA() to that of funcB(), such that at run time calling funcA() actually results in a call to funcB(). I know this is ugly, but I need to do this, thanks!
EDIT----------
Background on what I'm trying to do:
I'm hoping to implement unit tests for an existing code base, some of the methods in the base class which all of my modules are inheriting from are non-virtual. I'm not allowed to edit any production code. I can fiddle with the build process and substitute in a base class with the relevant methods set to virtual but I thought I'd rather use a hack like this (which I thought was possible).
Also, I'm interested in the topic out of technical curiosity: while trying to hack around this problem I'm learning quite a bit about how things such as code generation and function look-up work under the hood, which I haven't had a chance to learn in school, having just finished my 2nd year of university. I'm not sure whether I'll ever be taught such things in school, as I'm in a computer engineering program rather than CS.
Back on topic
The methods funcA() and funcB() do indeed have the same signature, so is the problem that I can only get the address of a function using the & operator? Would I be correct in saying that I can't change the address of the function, or swap out the contents at that address, without corrupting portions of memory? Would DLL injection be a good approach for a situation like this if the functions are exported to a DLL?
No. Functions are compiled into the executable, and their address is fixed throughout the life-time of the program.
The closest thing is virtual functions. Give us an example of what you're trying to accomplish, I promise there's a better way.
It cannot be done the way you describe it. The only way to change the target for a statically bound call is by modifying the actual executable code of your program. C++ language has no features that could accomplish that.
If you want function calls to be resolved at run-time you have to either use explicitly indirect calls (call through function pointers), or use language features that are based on run-time call resolution (like virtual functions), or you can use plain branching with good-old if or switch. Which is more appropriate in your case depends on your specific problem.
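To illustrate the "explicitly indirect call" option, here is a minimal sketch using a pointer to member function. The class Widget, the member active and the bodies of funcA/funcB are all made up for the example; the only point is that the call target is looked up at run time.

```cpp
#include <cassert>

// Hypothetical class: Widget, active and the function bodies are illustrative.
class Widget {
public:
    int funcA(int x) { return x + 1; }
    int funcB(int x) { return x * 2; }

    // Pointer to whichever member function should run; starts at funcA.
    int (Widget::*active)(int) = &Widget::funcA;

    // Explicitly indirect call: the target is read from the pointer at run time.
    int call(int x) { return (this->*active)(x); }
};

// Usage: retargeting the pointer changes what call() does.
inline void demo_member_pointer() {
    Widget w;
    assert(w.call(10) == 11);   // dispatches to funcA
    w.active = &Widget::funcB;
    assert(w.call(10) == 20);   // now dispatches to funcB
}
```

Note that this only works because callers go through call(); a direct call to w.funcA() is still bound statically.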
Technically it might be possible for virtual functions by modifying the vtable of the type, but you most certainly cannot do it without violating the standard (causing Undefined Behavior) and it would require knowledge of how your specific compiler handles vtables.
For other functions it is not possible because the addresses of the functions are directly written to program code, which is generally on a read-only memory area.
I am fairly sure this is impossible in pure C++. C++ is not a dynamic language.
What you want is a pointer to a function; you can point it to FuncA or FuncB, assuming that they have the same signature.
You cannot do what you want to do directly. However, you can achieve a similar result with some slightly different criteria, using something you are already familiar with -- function pointers. Consider:
// This type could be whatever you need, including a member function pointer type.
typedef void (*FunctionPointer)();
struct T {
FunctionPointer Function;
};
Now you can set the Function member on any given T instance, and call it. This is about as close as you can reasonably get, and I presume that since you are already aware of function pointers you're already aware of this solution.
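A small, self-contained sketch of how that struct might be used. The free functions funcA and funcB and the last_called variable are assumptions added only so that the effect of swapping the pointer is observable.

```cpp
#include <cassert>

typedef void (*FunctionPointer)();

struct T {
    FunctionPointer Function;
};

// Hypothetical targets; they record which one ran so the swap is visible.
static int last_called = 0;
void funcA() { last_called = 1; }
void funcB() { last_called = 2; }

inline void demo_function_pointer() {
    T t = { &funcA };
    t.Function();
    assert(last_called == 1);   // funcA ran

    t.Function = &funcB;        // retarget at run time
    t.Function();
    assert(last_called == 2);   // funcB ran
}
```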
Why don't you edit your question with a more complete description of the problem you're trying to solve? As it stands it really sounds like you're trying to do something horrible.
It's simple!
For
at run time calling funcA() actually results in a call to funcB().
write funcA() similar to following:
int funcA( int a, int b) {
return funcB( a, b );
}
:-)
Short Version
I'm using the entries from the vtable of a specific object to invoke virtual methods inherited from an interface.
In the end, I'm searching for a way to get the exact offset of each virtual method's address within the vtable of a specific object.
Detailed Version
Disclaimer
I know that this topic is implementation-dependent and that one should not try to do this manually, because the compiler does the (correct) work and the vtable is not part of the standard (let alone its data format).
I hereby testify that I already read dozens of "Don't do it... just, don't do it!"'s and am clear about the outrageous consequences my actions could have.
Therefore (and to favor a constructive discussion) I'll be using g++ (4.x.x) as the compiler on a Linux x64 platform as my reference. Any software compiled using the code presented below will use the same setup, so portability is not a concern here.
Okay, this whole issue of mine is completely experimental, I don't want to use it in production code but to educate myself and my fellow students (my professor asked me if I could write a quick paper about this topic).
What I'm trying to do is basically the automatic invocation of methods using offsets to determine which method to invoke.
The basic outline of the classes is as follows (this is a simplification but shows the composition of my current attempts):
class IMethods
{
virtual double action1(double) = 0;
virtual double action2(double) = 0;
};
As you can see just a class with pure virtual methods which share the same signature.
enum Actions
{
actionID1,
actionID2
};
The enum-items are used to invoke the appropriate method.
class MethodProcessor : public IMethods
{
public:
double action1(double);
double action2(double);
};
Purposely omitted ctor/dtor of above class.
We can safely assume that these are the only virtual methods inherited from the interface and that polymorphism is of no concern.
This concludes the basic outline. Now to the real topic:
Is there a safe way to get the mapping of addresses in the vtable to the inherited virtual methods?
What I'm trying to do is something like this:
MethodProcessor proc;
size_t * vTable = *(size_t**) &proc;
double ret = ((double(*)(MethodProcessor*,double))vTable[actionID2])(&proc, 3.14159265359);
This is working out fine and is invoking action2, but I'm assuming that the address pointing to action2 is equal to index 1 and this part bugs me: What if there is some sort of offset added into the vtable before the address of action1 is defined?
In a book about the data object model of C++ I read that normally the first address in the vtable leads to the RTTI (runtime type information), which in turn I couldn't confirm because vTable[0] is legitimately pointing to action1.
The compiler knows the exact index of every virtual method pointer because, yeah, the compiler is building them and replacing every member invocation of the virtual methods with augmented code which equals the one I used above, but knows the index to use. I, for one, am taking the educated guess that there is no offset before my defined virtual methods.
Can't I use some C++ hack to let the compiler compute the correct index at compile time (or even run time)? I could then use this information to add some offset to my enum items and wouldn't have to worry about casting wrong addresses...
The ABI used by Linux is known: you should look at the section on virtual function table layout of the Itanium ABI. The document also specifies the object layout to find the vtable. I don't know if the approach you outlined works, however.
Note that, despite pointing at the document, I do not recommend using the information to play tricks based on it! You unfortunately didn't explain what your actual goal is, but it seems that using a pointer to member of the virtual functions is a more reliable approach to run-time dispatch to a virtual function:
double (IMethods::*method)(double)
= flag ? &IMethods::action1 : &IMethods::action2;
MethodProcessor mp;
double rc = (mp.*method)(3.14);
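A fuller sketch of the same idea, mapping the enum from the question to pointers to member. Assumptions on my part: the interface methods are made public (taking their address needs access), a virtual destructor is added, and the method bodies are placeholders picked so the results are easy to check.

```cpp
#include <cassert>

class IMethods {
public:
    virtual ~IMethods() {}
    virtual double action1(double) = 0;
    virtual double action2(double) = 0;
};

enum Actions { actionID1, actionID2 };

// Placeholder implementations, chosen only to make results checkable.
class MethodProcessor : public IMethods {
public:
    double action1(double x) { return x + 1.0; }
    double action2(double x) { return x * 2.0; }
};

typedef double (IMethods::*Method)(double);

// Table indexed by the enum. The compiler resolves each pointer to member
// to the right virtual slot, so no vtable layout knowledge is needed.
inline double invoke(IMethods& obj, Actions id, double arg) {
    static const Method table[] = { &IMethods::action1, &IMethods::action2 };
    return (obj.*table[id])(arg);
}

inline void demo_invoke() {
    MethodProcessor proc;
    assert(invoke(proc, actionID1, 3.0) == 4.0);
    assert(invoke(proc, actionID2, 3.0) == 6.0);
}
```

The table order has to match the enum order, but that is checked by you in one place rather than guessed from the ABI.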
I have nearly the same problem; even worse, in my case this is for production code... don't ask me why (just tell me: "Don't do it... just, don't do it", or at least use an existing dynamic compiler such as LLC...).
Of course, this "torture" could be smoothed if compiler designers were asked to follow some generic C++ ABI specifications (or at least specify their own).
Apparently, there is a growing consensus on complying with "Generic C++ ABI" (also often called "ITANIUM C++ ABI", mentioned by Dietmar Kühl).
Because you are using G++ and because this is for educational purposes, I suggest you have a look at what the "-fdump-class-hierarchy" g++ switch does (you will find vtable layouts); this is what I personally use to be sure that I'm not casting wrong addresses.
Note: using a x86 (ia32) MinGW g++-4.7.2 compiler,
In double MethodProcessor::action( double ), the hidden "(MethodProcessor*)this" argument will be passed in a register (%ecx).
Whereas in ((double(*)(MethodProcessor*,double)), the first explicit "this" argument will be passed on stack.
I have a function (actually from ATL, it is ATL::CSoapMSXMLInetClient::SendRequest(LPCTSTR)) whose behaviour should slightly be modified. That is, I just have to add one function call somewhere in the middle of the function.
Taking into consideration that this is not a template method, what is the best practice of changing its behaviour? Do I have to re-write the whole function?
Thanks in advance.
EDIT: Deriving from the class ATL::CSoapMSXMLInetClient and copy-pasting whole function code with a slight modification in subclass function definition does not work because most of the members used in ATL::CSoapMSXMLInetClient::SendRequest are "private" and accessing them in subclass is a compile time error.
Rather than best practice I am looking for a way to do it now, if there is any. :(
Yes you will. If it's in the middle of the function there is no way of getting around it.
There are some refactoring methods you can use. But I cannot think of any pretty ones, and all depend heavily on the code within the class, although for your case it might be tough to find any that work.
Like if you have a line:
do_frobnicate();
dingbat->pling();
And you need to call somefunc() after the dingbat plings. If the dingbat is an interface that you provide, you can make a new dingbat that also does somefunc() when it plings, provided that the only place this dingbat plings is in this function.
Also, if do_frobnicate() is a free function and you want to add the somefunc() after this, you could create a function within the class, or within its namespace that is called the same. That way you make your own do_frobnicate() that also does somefunc().
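A rough sketch of the dingbat idea, assuming the dingbat really is an interface you can supply. Every name here (Dingbat, RealDingbat, PlingThenSomefunc, unmodified_function) is invented for illustration, and a string log stands in for the real side effects.

```cpp
#include <cassert>
#include <string>

// Hypothetical interface mirroring the names used above.
struct Dingbat {
    virtual ~Dingbat() {}
    virtual void pling() = 0;
};

struct RealDingbat : Dingbat {
    std::string* log;
    explicit RealDingbat(std::string* l) : log(l) {}
    void pling() { *log += "pling;"; }
};

// Wraps another Dingbat and runs the extra step after every pling().
struct PlingThenSomefunc : Dingbat {
    Dingbat* inner;
    std::string* log;
    PlingThenSomefunc(Dingbat* d, std::string* l) : inner(d), log(l) {}
    void pling() {
        inner->pling();
        *log += "somefunc;";   // stands in for the call to somefunc()
    }
};

// Stand-in for the function that cannot be edited.
inline void unmodified_function(Dingbat* dingbat) { dingbat->pling(); }

inline void demo_dingbat() {
    std::string log;
    RealDingbat real(&log);
    PlingThenSomefunc wrapped(&real, &log);
    unmodified_function(&wrapped);
    assert(log == "pling;somefunc;");
}
```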
This is not a question about how they work and declared, this I think is pretty much clear to me. The question is about why to implement this?
I suppose the practical reason is to simplify bunch of other code to relate and declare their variables of base type, to handle objects and their specific methods from many other subclasses?
Could this be done by templating and typechecking, like I do it in Objective C? If so, what is more efficient? I find it confusing to declare object as one class and instantiate it as another, even if it is its child.
Sorry for the stupid questions, but I haven't done any real projects in C++ yet, and since I am an active Objective-C developer (it is a much smaller language and thus relies heavily on SDK functionality, as on OS X and iOS) I need a clear view of the parallels between the two cousins.
Yes, this can be done with templates, but then the caller must know what the actual type of the object is (the concrete class) and this increases coupling.
With virtual functions the caller doesn't need to know the actual class - it operates through a pointer to a base class, so you can compile the client once and the implementor can change the actual implementation as much as it wants and the client doesn't have to know about that as long as the interface is unchanged.
Virtual functions implement polymorphism. I don't know Obj-C, so I cannot compare both, but the motivating use case is that you can use derived objects in place of base objects and the code will work. If you have a compiled and working function foo that operates on a reference to base you need not modify it to have it work with an instance of derived.
You could do that (assuming that you had runtime type information) by obtaining the real type of the argument and then dispatching directly to the appropriate function with a switch of sorts, but that would require either manually modifying the switch for each new type (high maintenance cost) or having reflection (unavailable in C++) to obtain the method pointer. Even then, after obtaining a method pointer you would have to call it, which is as expensive as the virtual call.
As to the cost associated to a virtual call, basically (in all implementations with a virtual method table) a call to a virtual function foo applied on object o: o.foo() is translated to o.vptr[ 3 ](), where 3 is the position of foo in the virtual table, and that is a compile time constant. This basically is a double indirection:
From the object o, obtain the pointer to the vtable, index that table to obtain the pointer to the function, and then call it. The extra cost compared with a direct non-polymorphic call is just the table lookup. (In fact, there can be other hidden costs when using multiple inheritance, as the implicit this pointer might have to be shifted.) Overall, the cost of virtual dispatch is very small.
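To make the double indirection concrete, here is a hand-rolled model of the mechanism using plain function pointers. It is only an illustration: the names Object, Slot, vptr and the table layout are my own, and real compilers generate their own (ABI-specific) equivalent.

```cpp
#include <cassert>

// Hand-rolled model of virtual dispatch; layout is illustrative only.
struct Object;
typedef int (*Slot)(Object*);

struct Object {
    const Slot* vptr;   // points at the table for the object's dynamic type
    int value;
};

int base_foo(Object* o)    { return o->value; }
int derived_foo(Object* o) { return o->value * 10; }

// One table per "class"; slot 0 plays the role of foo.
const Slot base_vtable[]    = { base_foo };
const Slot derived_vtable[] = { derived_foo };

// "o.foo()" becomes: load vptr, index the table, call through the pointer.
inline int call_foo(Object* o) { return o->vptr[0](o); }

inline void demo_vtable() {
    Object b = { base_vtable, 7 };
    Object d = { derived_vtable, 7 };
    assert(call_foo(&b) == 7);    // base behavior
    assert(call_foo(&d) == 70);   // "overridden" behavior, same call site
}
```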
I don't know the first thing about Objective-C, but here's why you want to "declare an object as one class and instantiate it as another": the Liskov Substitution Principle.
Since a PDF is a document, and an OpenOffice.org document is a document, and a Word Document is a document, it's quite natural to write
Document *d;
if (ends_with(filename, ".pdf"))
d = new PdfDocument(filename);
else if (ends_with(filename, ".doc"))
d = new WordDocument(filename);
else
// you get the point
d->print();
Now, for this to work, print would have to be virtual, or be implemented using virtual functions, or be implemented using a crude hack that reinvents the virtual wheel. The program needs to know at runtime which of the various print methods to apply.
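A compilable sketch of the document example. Some liberties taken: print() returns a string instead of printing so the result can be checked, the constructors take no filename, and ends_with and open_document are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Document {
    virtual ~Document() {}
    virtual std::string print() const = 0;
};
struct PdfDocument : Document {
    std::string print() const { return "pdf"; }
};
struct WordDocument : Document {
    std::string print() const { return "word"; }
};

// Hypothetical helper assumed by the snippet above.
inline bool ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Picks the concrete type at run time; callers only see Document.
inline std::unique_ptr<Document> open_document(const std::string& filename) {
    if (ends_with(filename, ".pdf"))
        return std::unique_ptr<Document>(new PdfDocument);
    return std::unique_ptr<Document>(new WordDocument);
}

inline void demo_document() {
    assert(open_document("a.pdf")->print() == "pdf");
    assert(open_document("b.doc")->print() == "word");
}
```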
Templating solves a different problem, where you determine at compile time which of the various containers you're going to use (for example) when you want to store a bunch of elements. If you operate on those containers with template functions, then you don't need to rewrite them when you switch containers, or add another container to your program.
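For contrast, a small sketch of the compile-time case: one function template that works unchanged across containers. The function name sum is my own choice; nothing here is decided at run time.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Works with any container exposing value_type and const iterators.
// The concrete container type is fixed at compile time.
template <typename Container>
typename Container::value_type sum(const Container& c) {
    typename Container::value_type total = typename Container::value_type();
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}

inline void demo_sum() {
    std::vector<int> v = { 1, 2, 3 };
    std::list<double> l = { 0.5, 1.5 };
    assert(sum(v) == 6);
    assert(sum(l) == 2.0);
}
```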
A virtual function is important in inheritance. Think of an example where you have a CMonster class and then a CRaidBoss and CBoss class that inherit from CMonster.
Both need to be drawn. A CMonster has a Draw() function, but the way a CRaidBoss and a CBoss are drawn is different. Thus, the implementation is left to them by utilizing the virtual function Draw.
Well, the idea is simply to allow the compiler to perform checks for you.
It's like a lot of features : ways to hide what you don't want to have to do yourself. That's abstraction.
Inheritance, interfaces, etc. allow you to provide an interface to the compiler for the implementation code to match.
If you didn't have the virtual function mechanism, you would have to write:
class A
{
public:
    virtual ~A() {} // needed only so that dynamic_cast below compiles; do_something() itself stays non-virtual
    void do_something();
};
class B : public A
{
public:
    void do_something(); // this one hides A::do_something() when called on a B
};
void DoSomething( A* object )
{
    // calling object->do_something() will ALWAYS call A::do_something()
    // that's not what you want if object is a B...
    // so we have to check manually:
    B* b_object = dynamic_cast<B*>( object );
    if( b_object != NULL ) // OK, it's a B object: call B::do_something()
    {
        b_object->do_something();
    }
    else
    {
        object->do_something(); // it's an A: call A::do_something()
    }
}
Here there are several problems :
you have to write this for each function redefined in a class hierarchy.
you have one additional if for each child class.
you have to touch this function again each time you add a definition to the whole hierarchy.
it's visible code, you can get it wrong easily, each time
So, marking functions virtual does this correctly in an implicit way, rerouting automatically, in a dynamic way, the function call to the correct implementation, depending on the final type of the object.
You don't have to write any logic, so you can't introduce errors in this code and you have one less thing to worry about.
It's the kind of thing you don't want to bother with as it can be done by the compiler/runtime.
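For comparison, here is roughly what the same hierarchy looks like once do_something is virtual. The integer return values are placeholders I added so the routing can be checked.

```cpp
#include <cassert>

class A {
public:
    virtual ~A() {}
    virtual int do_something() { return 1; }   // return values are placeholders
};

class B : public A {
public:
    int do_something() { return 2; }           // overrides A::do_something
};

// No manual type checks: the call is routed by the object's dynamic type.
inline int DoSomething(A* object) { return object->do_something(); }

inline void demo_virtual() {
    A a;
    B b;
    assert(DoSomething(&a) == 1);
    assert(DoSomething(&b) == 2);
}
```

Adding a class C later needs no change to DoSomething at all.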
The use of templates is also technically a form of polymorphism, according to theorists. Yes, both are valid approaches to the problem. The implementation techniques employed explain the better or worse performance of each.
For example, Java implements generics (its counterpart to templates), but through type erasure. This means that it only appears to use templates; under the surface it is plain old polymorphism.
C++ has very powerful templates. The use of templates makes code quicker, though each use of a template instantiates it for the given type. This means that, if you use an std::vector for ints, doubles and strings, you'll have three different vector classes: this means that the size of the executable will suffer.
I would like to run some code (perhaps a function) right before every function call for a class and all functions of the classes that inherit from that class. I'd like to do this without actually editing every function. Is such a thing even possible?
I would settle for having a function called as the first instruction of every function call instead of it being called right before.
AspectC++ is what you want. I haven't used it myself, but Aspect-Oriented Programming paradigm tries to solve this exact problem.
I would suggest using the Non Virtual Interface idiom. All public functions are non-virtual. All virtual functions are protected or private. Public members delegate the calls to virtual members and are usually implemented as inline functions.
This is the way IOStreams are implemented in the C++ standard library. You can read more about it at C++ Wikibooks.
Intent: To modularize/refactor common before and after code fragments (e.g., invariant checking, acquiring/releasing locks) for an entire class hierarchy at one location.
Regards,
Ovanes
The following might be a bit of an overkill - but how about?
http://msdn.microsoft.com/en-us/library/c63a9b7h.aspx
Another thing you could consider is using something like the [boost/C++0X] shared_ptr wrapper, where you call your custom function on the '->' overload before returning the class instance pointer. It involves modifying usage but not the underlying class, and I've used it a couple times to achieve the same effect. Just another thought.
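A minimal sketch of the '->' overload idea, without boost. CallHook and Counter are hypothetical names, and the "custom function" here is just a counter increment; in practice you would call whatever hook you need at that point.

```cpp
#include <cassert>

// Hypothetical wrapper: every member access through -> runs the hook first.
template <typename T>
class CallHook {
public:
    explicit CallHook(T* p) : ptr_(p), calls_(0) {}

    // The hook: runs before the wrapped member function is invoked.
    T* operator->() { ++calls_; return ptr_; }

    int calls() const { return calls_; }

private:
    T* ptr_;      // not owned; the caller keeps the object alive
    int calls_;
};

struct Counter {
    int n;
    Counter() : n(0) {}
    void bump() { ++n; }
};

inline void demo_call_hook() {
    Counter c;
    CallHook<Counter> h(&c);
    h->bump();
    h->bump();
    assert(c.n == 2);         // the wrapped calls went through
    assert(h.calls() == 2);   // and the hook ran once per call
}
```

The catch is the one mentioned above: callers have to use the wrapper, and calls made directly on the object bypass the hook.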
A somewhat inconvenient way would be to build a wrapper class that takes an object of your base type and calls the surrounding function and then the function that you wanted to call. This would be something like a decorator.
The best you can do is to declare a set of virtual functions as protected and have the developers inheriting from the class override the virtual functions. The interface used by the base class can be public, which executes the desired code before passing information to the protected virtual method.
For example:
class Base {
public:
void MyMethod(void) { /* Insert code here */ YourMethod(); }
protected:
virtual void YourMethod(void) {}
};
If the developer knows that he has a specific subclass, he can still bypass your code simply by using a dynamic_cast, and using his own method set. As such, you may want to follow the other suggestions already posted that do not involve the base C++ language.
This sounds like what a profiler does. Have you looked at the source for any profiling tools?
You could also do this with the Curiously recurring template pattern (CRTP).
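One way the CRTP variant could look, as a sketch only. Hooked, Job, run and do_run are names I made up, and a string trace stands in for the code you want executed first.

```cpp
#include <cassert>
#include <string>

// Base template: run() executes the common code, then forwards to the
// derived class without any virtual call.
template <typename Derived>
class Hooked {
public:
    std::string trace;

    void run() {
        trace += "before;";                       // common "before" code
        static_cast<Derived*>(this)->do_run();    // statically bound call
    }
};

class Job : public Hooked<Job> {
public:
    void do_run() { trace += "job;"; }
};

inline void demo_crtp() {
    Job j;
    j.run();
    assert(j.trace == "before;job;");
}
```

As with the non-virtual interface approach, this relies on callers using run() rather than do_run().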
Using g++, you could use the option -pg for the respective compilation units, which makes the compiler generate a call to the function mcount at the start of every function. mcount is usually provided by profiling tools like gprof, but you can also implement it yourself. You should however make sure that
mcount has C linkage (and is not C++-style name-mangled), e.g. by implementing it as a C function and compiling with a pure C compiler like gcc.
the compilation unit containing mcount is not compiled with -pg.
Should be a newbie question...
I have existing code in an existing class, A, that I want to extend in order to override an existing method, A::f().
So now I want to create class B to override f(), since I don't want to just change A::f() because other code depends on it.
To do this, I need to change A::f() to a virtual method, I believe.
My question is besides allowing a method to be dynamically invoked (to use B's implementation and not A's) are there any other implications to making a method virtual? Am I breaking some kind of good programming practice? Will this affect any other code trying to use A::f()?
Please let me know.
Thanks,
jbu
edit: my question was more along the lines of is there anything wrong with making someone else's method virtual? even though you're not changing someone else's implementation, you're still having to go into someone's existing code and make changes to the declaration.
If you make the function virtual inside of the base class, anything that derives from it will also have it virtual.
Once virtual, if you create an instance of A, then it will still call A::f.
If you create an instance of B, store it in a pointer of type A*, and then call f through that pointer (p->f()), it will call B::f.
As for side effects, there probably won't be any side effects, other than a slight (unnoticeable) performance loss.
There is one small possible side effect: there could be a class C that also derives from A and implements C::f, while expecting that a call to f through an A* still runs A::f. But this is not very common.
But more than likely, if C exists, then it does not implement C::f at all, and in which case everything is fine.
Be careful though: if you are using an already compiled library and you are modifying its header files, what you expect to work probably will not. You will need to recompile the library's source files against the changed headers.
You could consider doing the following to avoid side effects:
Create a type A2 that derives from A and make its f virtual
Use pointers of type A2 instead of A
Derive B from type A2.
In this way anything that used A will work in the same way guaranteed
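The A2 steps above can be sketched as follows. The return values are placeholders I chose so that it is visible which f ran.

```cpp
#include <cassert>

class A {                       // existing class, left untouched
public:
    int f() { return 1; }       // non-virtual
};

class A2 : public A {
public:
    virtual ~A2() {}
    virtual int f() { return A::f(); }   // hides A::f; virtual from here down
};

class B : public A2 {
public:
    int f() { return 2; }       // overrides A2::f
};

inline void demo_a2() {
    B b;
    A2* p2 = &b;
    A* p = &b;
    assert(p2->f() == 2);   // virtual dispatch through A2*
    assert(p->f() == 1);    // code using A* still gets A::f, as before
}
```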
Depending on what you need you may also be able to use a has-a relationship instead of a is-a.
There is a small implied performance penalty of a vtable lookup every time a virtual function is called. If the function is not virtual, calls are direct, since the code location is known at compile time, whereas a virtual function's address must be looked up at run time in the vtable of the object you're calling it on.
To do this, I need to change A::f() to a virtual method, I believe.
Nope, you do not need to change it to a virtual method in order to override it. However, if you are using polymorphism you need to, i.e. if you have a lot of different classes deriving from A but stored as pointers to A.
There's also a memory overhead for virtual functions because of the vtable (apart from what spoulson mentioned)
There are other ways of accomplishing your goal. Does it make sense for B to be an A? For example, it makes sense for a Cat to be an Animal, but not for a Cat to be a Dog. Perhaps both A and B should derive from a base class, if they are related.
Is there just common functionality you can factor out? It sounds to me like you'll never be using these classes polymorphically, and just want the functionality. I would suggest you take that common functionality out and then make your two separate classes.
As for cost, if you're using A and B directly, the compiler will bypass any virtual dispatching and just go straight to the function calls, as if they were never virtual. If you pass a B into a place expecting an A (as a reference or pointer), then it will have to dispatch.
There are 2 performance hits when speaking about virtual methods.
vtable dispatching; it's nothing to really worry about
virtual functions usually cannot be inlined when the call goes through a pointer or reference to the base class (unless the compiler can prove the dynamic type). This can be much worse than the previous one: function inlining can really speed things up in some situations.
How kosher it is to change somebody else's code depends entirely on the local mores and customs. It isn't something we can answer for you.
The next question is whether the class was designed to be inherited from. In many cases, classes are not, and changing them to be useful base classes, without changing other aspects, can be tricky. A non-base class is likely to have everything private except the public functions, so if you need to access more of the internals in B you'll have to make more modifications to A.
If you're going to use class B instead of class A, then you can just override the function without making it virtual. If you're going to create objects of class B and refer to them as pointers to A, then you do need to make f() virtual. You also should make the destructor virtual.
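A short sketch of why the destructor matters once objects of B are handled through A*. The b_destroyed counter is an assumption added to make the destructor call observable.

```cpp
#include <cassert>

static int b_destroyed = 0;   // counts ~B() calls, for illustration only

class A {
public:
    virtual ~A() {}                 // virtual, so delete via A* is safe
    virtual int f() { return 1; }
};

class B : public A {
public:
    ~B() { ++b_destroyed; }
    int f() { return 2; }
};

inline void demo_virtual_destructor() {
    int before = b_destroyed;
    A* p = new B;
    assert(p->f() == 2);            // dispatches to B::f
    delete p;                       // runs ~B() and then ~A()
    assert(b_destroyed == before + 1);
}
```

Without the virtual destructor in A, deleting a B through an A* is undefined behavior.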
It is good programming practise to use virtual methods where they are deserved. Virtual methods have many implications as to how sensible your C++ Class is.
Without virtual functions you cannot create interfaces in C++. An interface is a class whose member functions are all pure virtual (declared but not defined).
However sometimes using virtual methods is not good. It doesn't always make sense to use a virtual methods to change the functionality of an object, since it implies sub-classing. Often you can just change the functionality using function objects or function pointers.
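As a sketch of the function-object alternative: behavior is swapped by assigning a callable rather than by deriving a new class. Button, on_click and twice are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <functional>

class Button {
public:
    std::function<int(int)> on_click;   // behavior supplied by the caller

    // Returns 0 when no behavior has been installed.
    int click(int x) { return on_click ? on_click(x) : 0; }
};

inline int twice(int x) { return 2 * x; }

inline void demo_function_object() {
    Button b;
    assert(b.click(5) == 0);                    // nothing installed yet
    b.on_click = twice;                         // plain function
    assert(b.click(5) == 10);
    b.on_click = [](int x) { return x + 1; };   // lambda, no subclass needed
    assert(b.click(5) == 6);
}
```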
As mentioned a virtual function creates a table which a running program will reference to check what function to use.
C++ has many gotchas which is why one needs to be very aware of what they want to do and what the best way of doing it is. There aren't as many ways of doing something as it seems when compared to runtime dynamic OO programming languages such as Java or C#. Some ways will be either outright wrong, or will eventually lead to undefined behavior as your code evolves.
Since you have asked a very good question :D, I suggest you buy Scott Meyers' book Effective C++ and Bjarne Stroustrup's book The C++ Programming Language. These will teach you the subtleties of OO in C++, particularly when to use which feature.
If that's the first virtual method the class is going to have, you're making it no longer a POD. This can break things, although the chances of that are slim.
POD: http://en.wikipedia.org/wiki/Plain_old_data_structures
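If you have a C++11 compiler, you can check this property at compile time with <type_traits>. The two structs below are made-up examples; a POD in the classic sense is a type that is both trivial and standard-layout.

```cpp
#include <type_traits>

struct Plain {
    int x;
    void f() {}            // non-virtual member functions do not affect POD-ness
};

struct WithVirtual {
    int x;
    virtual void f() {}    // the first virtual function changes the picture
};

static_assert(std::is_trivial<Plain>::value &&
              std::is_standard_layout<Plain>::value,
              "Plain is still a POD");

static_assert(!std::is_trivial<WithVirtual>::value,
              "a virtual function makes the type non-trivial");

static_assert(!std::is_standard_layout<WithVirtual>::value,
              "a virtual function makes the type non-standard-layout");
```

Code that copies such objects with memcpy or writes them raw to disk is the kind of thing that can break after the change.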