I am learning how C++ is compiled into assembly, and I find how exceptions work under the hood very interesting. If it's okay to have more than one execution path for exceptions, why not for normal functions?
For example, let's say you have a function that can return a pointer to class A or to something derived from A. The way you're supposed to handle that is with RTTI.
But why not instead have the called function, after computing the return value, jump back into the caller at the specific location that matches the return type? It would be like exceptions, where the execution flow can continue normally or, if something throws, land in one of your catch handlers.
Here is my code:
class A
{
public:
virtual int GetValue() { return 0; }
};
class B : public A
{
public:
int VarB;
int GetValue() override { return VarB; }
};
class C : public A
{
public:
int VarC;
int GetValue() override { return VarC; }
};
A* Foo(int i)
{
if(i == 1) return new B;
if(i == 2)return new C;
return new A;
}
int main()
{
A* a = Foo(2);
if(B* b = dynamic_cast<B*>(a))
{
b->VarB = 1;
}
else if(C* c = dynamic_cast<C*>(a)) // Line 36
{
c->VarC = 2;
}
else
{
assert(a->GetValue() == 0);
}
}
So instead of doing it with RTTI and dynamic_cast checks, why not have the Foo function just jump to the appropriate location in main? In this case Foo returns a pointer to C, so Foo should instead jump directly to line 36.
What's wrong with this? Why aren't people doing this? Is there a performance reason? I would think this would be cheaper than RTTI.
Or is this just a language limitation, regardless of whether it's a good idea?
First of all, there are a million different ways of defining a language. C++ is defined the way it is, and whether that is nice or not does not really matter. If you want to improve the language, you are free to write a proposal to the C++ committee. They will review it and may include it in a future standard. Sometimes this happens.
Second, although exceptions are dispatched under the hood, there is no strong reason to think that this is more efficient than your handwritten code that uses RTTI. Exception dispatch still requires CPU cycles. There is no miracle there. The real difference is that with RTTI you write the dispatch code yourself, while the exception dispatch code is generated for you by the compiler.
You may want to call your function 10000 times and find out which runs faster: RTTI-based code or exception dispatch.
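The measurement suggested above could be set up roughly as follows. This is only a sketch: Base, DerivedB, DerivedC, makeByReturn, makeByThrow and timeMicros are hypothetical names standing in for the question's A, B, C and Foo, and both variants include the cost of creating the object. The numbers depend heavily on the compiler, the flags and the platform, so none are quoted here.

```cpp
#include <cassert>
#include <chrono>
#include <memory>

// Hypothetical hierarchy mirroring the question's A, B and C.
struct Base { virtual ~Base() = default; };
struct DerivedB : Base {};
struct DerivedC : Base {};

// Normal return path: the caller has to discover the dynamic type itself.
std::unique_ptr<Base> makeByReturn(int i) {
    if (i == 1) return std::make_unique<DerivedB>();
    if (i == 2) return std::make_unique<DerivedC>();
    return std::make_unique<Base>();
}

// RTTI-based dispatch: 1 for DerivedB, 2 for DerivedC, 0 for plain Base.
int classifyByReturn(int i) {
    std::unique_ptr<Base> p = makeByReturn(i);
    if (dynamic_cast<DerivedB*>(p.get())) return 1;
    if (dynamic_cast<DerivedC*>(p.get())) return 2;
    return 0;
}

// "Multiple return paths" emulated with exceptions: the callee throws the result.
[[noreturn]] void makeByThrow(int i) {
    if (i == 1) throw DerivedB{};
    if (i == 2) throw DerivedC{};
    throw Base{};
}

// Exception-based dispatch: the matching catch clause plays the role of the landing site.
int classifyByCatch(int i) {
    int kind = -1;
    try {
        makeByThrow(i);
    } catch (const DerivedB&) {
        kind = 1;
    } catch (const DerivedC&) {
        kind = 2;
    } catch (const Base&) {
        kind = 0;
    }
    return kind;
}

// Runs `classify` `iterations` times, adds the results to `checksum` so the
// work cannot be optimized away, and returns the elapsed microseconds.
long long timeMicros(int (*classify)(int), int iterations, long long& checksum) {
    const auto start = std::chrono::steady_clock::now();
    for (int n = 0; n < iterations; ++n) checksum += classify(n % 3);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling timeMicros(classifyByReturn, 10000, sum) and timeMicros(classifyByCatch, 10000, sum) in an optimized build gives the two timings to compare.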
The problem
I am writing a thin C++ wrapper around an object-oriented C library. The idea was to automate memory management, but so far it has not been very automatic. Basically, when I use my wrapper classes, I get all kinds of memory access and inappropriate freeing problems.
Minimal example of C library
Let's say the C library consists of A and B classes, each of which has a few 'methods' associated with it:
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <memory>
#include <string>
extern "C" {
typedef struct {
unsigned char *string;
} A;
A *c_newA(const char *string) {
A *a = (A *) malloc(sizeof(A)); // yes I know, don't use malloc in C++. This is a demo to simulate the C library that uses it.
auto *s = (char *) malloc(strlen(string) + 1);
strcpy(s, string);
a->string = (unsigned char *) s;
return a;
}
void c_freeA(A *a) {
free(a->string);
free(a);
}
void c_printA(A *a) {
std::cout << a->string << std::endl;
}
typedef struct {
A *firstA;
A *secondA;
} B;
B *c_newB(const char *first, const char *second) {
B *b = (B *) malloc(sizeof(B));
b->firstA = c_newA(first);
b->secondA = c_newA(second);
return b;
}
void c_freeB(B *b) {
c_freeA(b->firstA);
c_freeA(b->secondA);
free(b);
}
void c_printB(B *b) {
std::cout << b->firstA->string << ", " << b->secondA->string << std::endl;
}
A *c_getFirstA(B *b) {
return b->firstA;
}
A *c_getSecondA(B *b) {
return b->secondA;
}
}
Test the 'C lib'
void testA() {
A *a = c_newA("An A");
c_printA(a);
c_freeA(a);
// outputs: "An A"
// valgrind is happy =]
}
void testB() {
B *b = c_newB("first A", "second A");
c_printB(b);
c_freeB(b);
// outputs: "first A, second A"
// valgrind is happy =]
}
Wrapper classes for A and B
class AWrapper {
struct deleter {
void operator()(A *a) {
c_freeA(a);
}
};
std::unique_ptr<A, deleter> aptr_;
public:
explicit AWrapper(A *a)
: aptr_(a) {
}
static AWrapper fromString(const std::string &string) { // preferred way of instantiating
A *a = c_newA(string.c_str());
return AWrapper(a);
}
void printA() {
c_printA(aptr_.get());
}
};
class BWrapper {
struct deleter {
void operator()(B *b) {
c_freeB(b);
}
};
std::unique_ptr<B, deleter> bptr_;
public:
explicit BWrapper(B *b)
: bptr_(std::unique_ptr<B, deleter>(b)) {
}
static BWrapper fromString(const std::string &first, const std::string &second) {
B *b = c_newB(first.c_str(), second.c_str());
return BWrapper(b);
}
void printB() {
c_printB(bptr_.get());
}
AWrapper getFirstA(){
return AWrapper(c_getFirstA(bptr_.get()));
}
AWrapper getSecondA(){
return AWrapper(c_getSecondA(bptr_.get()));
}
};
Wrapper tests
void testAWrapper() {
AWrapper a = AWrapper::fromString("An A");
a.printA();
// outputs "An A"
// valgrind is happy =]
}
void testBWrapper() {
BWrapper b = BWrapper::fromString("first A", "second A");
b.printB();
// outputs "first A, second A"
// valgrind is happy =]
}
Demonstration of the problem
Great, so I move on and develop the full wrapper (a lot of classes) and realise that when classes like this (i.e. with an aggregation relationship) are both in scope, C++ will automatically call the destructors of both classes separately, but because of the structure of the underlying library (i.e. the calls to free), we get memory problems:
void testUsingAWrapperAndBWrapperTogether() {
BWrapper b = BWrapper::fromString("first A", "second A");
AWrapper a1 = b.getFirstA();
// valgrind no happy =[
}
Valgrind output
Things I've tried
Cloning not possible
The first thing I tried was to take a copy of A, rather than having them try to free the same A. This, while a good idea, is not possible in my case because of the nature of the library I'm using. There is actually a caching mechanism in place so that when you create a new A with a string it has seen before, it'll give you back the same A. See this question for my attempts at cloning A.
Custom destructors
I took the code for the C library destructors (freeA and freeB here) and copied them into my source code. Then I tried to modify them such that A does not get freed by B. This has partially worked. Some instances of memory problems have been resolved, but because this idea does not tackle the problem at hand (just kind of temporarily glosses over the main issue), new problems keep popping up, some of which are obscure and difficult to debug.
The question
So at last we arrive at the question: How can I modify this C++ wrapper to resolve the memory problems that arise from the interactions between the underlying C objects? Can I make better use of smart pointers? Should I abandon the C++ wrapper completely and just use the library's pointers as is? Or is there a better way I haven't thought of?
Thanks in advance.
Edits: response to the comments
Since asking the previous question (linked above) I have restructured my code so that the wrapper is being developed and built in the same library as the one it wraps. So the objects are no longer opaque.
The pointers are generated from function calls to the library, which uses calloc or malloc to allocate.
In the real code A is raptor_uri* (typedef librdf_uri*) from raptor2 and is allocated with librdf_new_uri, while B is raptor_term* (aka librdf_node*) and is allocated with the librdf_new_node_* functions. The librdf_node has a librdf_uri field.
Edit 2
I can also point to the line of code where the same A is returned if it's the same string. See line 137 here
The problem is that getFirstA and getSecondA return instances of AWrapper, which is an owning type. Constructing an AWrapper means handing over ownership of an A *, but getFirstA and getSecondA don't do that: the pointers from which the returned objects are constructed are still managed by the BWrapper.
The easiest solution is to return an A * instead of the wrapper class. This way you're not passing on ownership of the inner A member. I would also recommend making the pointer-taking constructors of the wrapper classes private, and adding a fromPointer static method, similar to fromString, which takes ownership of the pointer passed to it. This way you won't accidentally create instances of the wrapper classes from raw pointers.
If you want to avoid using raw pointers, or want to have methods on the objects returned from getFirstA and getSecondA, you could write a simple reference wrapper, which has a raw pointer as a member.
class AReference
{
private:
A *a_ref_;
public:
explicit AReference(A *a_ref) : a_ref_(a_ref) {}
// other methods here, such as print or get
};
You are freeing A twice
BWrapper b = BWrapper::fromString("first A", "second A");
When b goes out of scope, c_freeB is called which also calls c_freeA
AWrapper a1 = b.getFirstA();
Wraps A with another unique_ptr, then when a1 goes out of scope it will call c_freeA on the same A.
Note that getFirstA in BWrapper gives ownership of an A to another unique_ptr when using the AWrapper constructor.
Ways to fix this:
Don't let B manage A memory, but since you are using a lib that won't be possible.
Let BWrapper manage A, don't let AWrapper manage A and make sure the BWrapper exists when using AWrapper. That is, use a raw pointer in AWrapper instead of a smart pointer.
Make a copy of A in the AWrapper(A *) constructor, for this you might want to use a function from the library.
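The second option in the list above (BWrapper owns everything, and the A handed out is only a view) might look like the following sketch. The C functions are a trimmed copy of the demo library from the question; freeCountA and the AView class are hypothetical additions, the former existing only so that the number of frees can be observed.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Stand-in for the C library from the question.
struct A { char* string; };
struct B { A* firstA; A* secondA; };
static int freeCountA = 0;  // hypothetical counter, only to observe frees

A* c_newA(const char* s) {
    A* a = static_cast<A*>(std::malloc(sizeof(A)));
    a->string = static_cast<char*>(std::malloc(std::strlen(s) + 1));
    std::strcpy(a->string, s);
    return a;
}
void c_freeA(A* a) { ++freeCountA; std::free(a->string); std::free(a); }
B* c_newB(const char* first, const char* second) {
    B* b = static_cast<B*>(std::malloc(sizeof(B)));
    b->firstA = c_newA(first);
    b->secondA = c_newA(second);
    return b;
}
void c_freeB(B* b) { c_freeA(b->firstA); c_freeA(b->secondA); std::free(b); }

// Non-owning view: never frees, valid only while the owning BWrapper is alive.
class AView {
    A* a_;
public:
    explicit AView(A* a) : a_(a) {}
    std::string str() const { return a_->string; }
};

// Owning wrapper: the only place where c_freeB (and through it c_freeA) runs.
class BWrapper {
    struct Deleter { void operator()(B* b) const { c_freeB(b); } };
    std::unique_ptr<B, Deleter> bptr_;
    explicit BWrapper(B* b) : bptr_(b) {}  // private: no accidental adoption of raw pointers
public:
    static BWrapper fromString(const std::string& first, const std::string& second) {
        return BWrapper(c_newB(first.c_str(), second.c_str()));
    }
    AView getFirstA() const { return AView(bptr_->firstA); }
    AView getSecondA() const { return AView(bptr_->secondA); }
};
```

With this split, a scope that holds both a BWrapper and an AView obtained from it frees each A exactly once, when the BWrapper is destroyed. The price is the lifetime rule: the view must not outlive its BWrapper.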
Edit:
shared_ptr won't work in this case because c_freeB will call c_freeA anyways.
Edit 2:
In this specific case considering the raptor lib you mentioned, you could try the following:
explicit AWrapper(A *a)
: aptr_(raptor_uri_copy(a)) {
}
assuming that A is a raptor_uri. raptor_uri_copy(raptor_uri *) will increase the reference count and return the same passed pointer. Then, even if raptor_free_uri is called twice on the same raptor_uri * it will call free only when the counter becomes zero.
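Why copy-on-wrap avoids the double free can be seen in a small model of a reference-counted handle. Everything in this sketch is hypothetical: Uri, uri_new, uri_copy and uri_free merely imitate the behaviour described above and are not the raptor2 implementation.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical reference-counted C object.
struct Uri { int usage; };
static int realFrees = 0;  // counts how often memory is actually released

Uri* uri_new() {
    Uri* u = static_cast<Uri*>(std::malloc(sizeof(Uri)));
    u->usage = 1;  // the creator holds the first reference
    return u;
}
// Adds a reference and returns the same pointer.
Uri* uri_copy(Uri* u) { ++u->usage; return u; }
// Drops a reference; the memory is released only when the last one goes.
void uri_free(Uri* u) {
    if (--u->usage == 0) { ++realFrees; std::free(u); }
}

// Wrapper that takes its own reference instead of adopting the caller's.
class UriWrapper {
    struct Deleter { void operator()(Uri* u) const { uri_free(u); } };
    std::unique_ptr<Uri, Deleter> ptr_;
public:
    explicit UriWrapper(Uri* u) : ptr_(uri_copy(u)) {}
    int usage() const { return ptr_->usage; }
};
```

One consequence of this design, under the model above: a pointer that comes fresh from the allocating function already carries one reference, so a factory such as fromString would have to release that reference after wrapping, or the object is never freed.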
I am trying to optimize the run time of my code and I was told that removing unnecessary virtual functions was the way to go. With that in mind I would still like to use inheritance to avoid unnecessary code bloat. I thought that if I simply redefined the functions I wanted and initialized different variable values I could get by with just downcasting to my derived class whenever I needed derived class specific behavior.
So I need a variable that identifies the type of class that I am dealing with so I can use a switch statement to downcast properly. I am using the following code to test this approach:
Classes.h
#pragma once
class A {
public:
int type;
static const int GetType() { return 0; }
A() : type(0) {}
};
class B : public A {
public:
int type;
static const int GetType() { return 1; }
B() { type = 1; }
};
Main.cpp
#include "Classes.h"
#include <iostream>
using std::cout;
using std::endl;
using std::getchar;
int main() {
A *a = new B();
cout << a->GetType() << endl;
cout << a->type;
getchar();
return 0;
}
I get the output expected: 0 1
Question 1: Is there a better way to store type so that I do not need to waste memory for each instance of the object created (like the static keyword would allow)?
Question 2: Would it be more effective to put the switch statement in the function to decide that it should do based on the type value, or switch statement -> downcast then use a derived class specific function.
Question 3: Is there a better way to handle this that I am entirely overlooking that does not use virtual functions? For Example, should I just create an entirely new class that has many of the same variables
Question 1: Is there a better way to store type so that I do not need to waste memory for each instance of the object created (like the static keyword would allow)?
typeid() is already available with RTTI, so there's no need to implement that yourself in an error-prone and unreliable way.
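A minimal sketch of a typeid check, using hypothetical Shape and Circle classes. Note that typeid reports the dynamic type only for polymorphic classes, so the base needs at least one virtual function (here the destructor); for a class without any, typeid yields the static type.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic hierarchy; the virtual destructor enables RTTI lookups.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};

// True when the object's dynamic type is exactly Circle.
bool isCircle(const Shape& s) { return typeid(s) == typeid(Circle); }
```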
Question 2: Would it be more effective to put the switch statement in the function to decide that it should do based on the type value, or switch statement -> downcast then use a derived class specific function.
Certainly not! That's a strong indicator of bad (sic!) class inheritance hierarchy design.
Question 3: Is there a better way to handle this that I am entirely overlooking that does not use virtual functions? For Example, should I just create an entirely new class that has many of the same variables
The typical way to realize polymorphism without usage of virtual functions is the CRTP (aka Static Polymorphism).
That's a widely used technique to avoid the overhead of virtual function tables when you don't really need them, and just want to adapt your specific needs (e.g. with small targets, where low memory overhead is crucial).
Given your example (see note 1), that would be something like this:
template<class Derived>
class A {
protected:
int InternalGetType() { return 0; }
public:
int GetType() { return static_cast<Derived*>(this)->InternalGetType(); }
};
class B : public A<B> {
friend class A<B>;
protected:
int InternalGetType() { return 1; }
};
All binding will be done at compile time, and there's zero runtime overhead.
The binding is also checked at compile time: the static_cast produces a compiler error if B doesn't actually inherit from A<B>.
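For illustration, here is the same pattern as a self-contained sketch together with a caller. The typeOf helper is a hypothetical name; it shows that generic code has to be a template itself, because A<B> and any other A<X> are unrelated types with no common base.

```cpp
#include <cassert>

template <class Derived>
class A {
protected:
    int InternalGetType() { return 0; }
public:
    // Resolved at compile time: no vtable, no indirect call.
    int GetType() { return static_cast<Derived*>(this)->InternalGetType(); }
};

class B : public A<B> {
    friend class A<B>;  // lets the base reach the protected override
protected:
    int InternalGetType() { return 1; }
};

// Hypothetical helper: works for any CRTP-derived type, one instantiation per type.
template <class Derived>
int typeOf(A<Derived>& object) { return object.GetType(); }
```

Because there is no shared base class, objects of different derived types cannot be stored in one container of base pointers, which is the trade-off against virtual functions.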
Note (almost disclaimer):
Don't use that pattern as a golden hammer! It has its drawbacks too:
It's harder to provide abstract interfaces, and without prior type trait checks or concepts, you'll confuse your clients with hard-to-read compiler error messages at template instantiation.
It's not applicable to plugin-like architecture models, where you really want late binding and modules loaded at runtime.
If you don't have really heavy restrictions regarding the executable's code size and performance, it's not worth doing the extra work. For most systems you can simply neglect the dispatch overhead of virtual function definitions.
1) The semantics of GetType() aren't necessarily the best, but well ...
Go ahead and use virtual functions, but make sure each of those functions is doing enough work that the overhead of an indirect call is insignificant. That shouldn't be very hard to do; a virtual call is pretty fast, and it wouldn't be part of C++ if it wasn't.
Doing your own pointer casting is likely to be even slower, unless you can use that pointer a significant number of times.
To make this a little more concrete, here's some code:
class A {
public:
int type;
int buffer[1000000];
A() : type(0) {}
virtual void VirtualIncrease(int n) { buffer[n] += 1; }
void NonVirtualIncrease(int n) { buffer[n] += 1; }
virtual void IncreaseAll() { for (int i = 0; i < 1000000; ++i) buffer[i] += 1; }
};
class B : public A {
public:
B() { type = 1; }
virtual void VirtualIncrease(int n) { buffer[n] += 2; }
void NonVirtualIncrease(int n) { buffer[n] += 2; }
virtual void IncreaseAll() { for (int i = 0; i < 1000000; ++i) buffer[i] += 2; }
};
int main() {
A *a = new B();
// easy way with virtual
for (int i = 0; i < 1000000; ++i)
a->VirtualIncrease(i);
// hard way with switch
for (int i = 0; i < 1000000; ++i) {
switch(a->type) {
case 0:
a->NonVirtualIncrease(i);
break;
case 1:
static_cast<B*>(a)->NonVirtualIncrease(i);
break;
}
}
// fast way
a->IncreaseAll();
getchar();
return 0;
}
The code that switches using a type code is not only much harder to read, it's probably slower as well. Doing more work inside a virtual function ends up being both cleanest and fastest.
Given the following:
class ReadWrite {
public:
int Read(size_t address);
void Write(size_t address, int val);
private:
std::map<size_t, int> db;
};
In the Read function, when accessing an address to which no previous write was made, I want to either throw an exception designating the error or allow the access and return 0. In other words, I would like to use either std::map<size_t, int>::operator[]() or std::map<size_t, int>::at(), depending on a bool value that the user can set. So I add the following:
class ReadWrite {
public:
int Read(size_t add) { if (allow) return db[add]; return db.at(add);}
void Write(size_t add, int val) { db[add] = val; }
void Allow() { allow = true; }
private:
bool allow = false;
std::map<size_t, int> db;
};
The problem with that is:
Usually, the program will have one call to Allow() or none at the beginning, followed by many accesses. So, performance-wise, this code is bad because it performs the check if (allow) every time, even though the result is usually either always true or always false.
So how would you solve such a problem?
Edit:
While the described use case of this class (one call to Allow() or none at the start) is very likely, it's not guaranteed, so I must allow the user to call Allow() dynamically.
Another Edit:
Solutions which use a function pointer: what about the performance overhead incurred by a function pointer call that the compiler cannot inline? If we use std::function instead, will that solve the issue?
Usually, the program will have one call of allow or none at the
beginning of the program and then afterwards many accesses. So,
performance wise, this code is bad because it every-time performs the
check if (allow) where usually it's either always true or always
false. So how would you solve such problem?
I won't. The CPU will.
Branch prediction will figure out that the answer is most likely to stay the same for a long time, so it can optimize the branch very well at the hardware level. It will still incur some overhead, but a negligible amount.
If you really need to optimize your program, I think you'd be better off using std::unordered_map instead of std::map, or moving to a faster map implementation such as google::dense_hash_map. The branch is insignificant compared to the map lookup.
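A minimal sketch of that suggestion: the question's class with the container swapped for std::unordered_map and the branch left in place. The member names allow_ and db_ are illustrative renamings; whether the hash map is actually faster for a given key distribution is something to measure.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <unordered_map>

class ReadWrite {
public:
    int Read(std::size_t address) {
        if (allow_) return db_[address];  // inserts and returns 0 for unseen addresses
        return db_.at(address);           // throws std::out_of_range for unseen addresses
    }
    void Write(std::size_t address, int val) { db_[address] = val; }
    void Allow() { allow_ = true; }
private:
    bool allow_ = false;
    std::unordered_map<std::size_t, int> db_;  // average O(1) lookup instead of O(log n)
};
```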
If you want to decrease the time-cost, you have to increase the memory-cost. Accepting that, you can do this with a function pointer. Below is my answer:
class ReadWrite {
public:
void Write(size_t add, int val) { db[add] = val; }
// when allowed, make the function pointer point to read2
void Allow() { Read = &ReadWrite::read2;}
//function pointer that points to read1 by default
int (ReadWrite::*Read)(size_t) = &ReadWrite::read1;
private:
int read1(size_t add){return db.at(add);}
int read2(size_t add) {return db[add];}
std::map<size_t, int> db;
};
Because Read is a pointer to a member function, it has to be invoked with the .* operator rather than with ordinary member-call syntax. As an example:
ReadWrite rwObject;
//some code here
//...
(rwObject.*rwObject.Read)(5); //use of function pointer
//
Note that non-static data member initialization is available since C++11, so int (ReadWrite::*Read)(size_t) = &ReadWrite::read1; may not compile with older compilers. In that case, you have to explicitly declare a constructor in which the function pointer is initialized.
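The constructor-based variant could be sketched like this. It is the same class with the default member initializer moved into a constructor; the call goes through the .* operator because Read is a pointer to a member function.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>

class ReadWrite {
public:
    // Initialize the pointer in the constructor instead of with a default member initializer.
    ReadWrite() : Read(&ReadWrite::read1) {}
    void Write(std::size_t add, int val) { db[add] = val; }
    // When allowed, make the pointer refer to read2.
    void Allow() { Read = &ReadWrite::read2; }
    // Pointer to member function; call it as (object.*object.Read)(address).
    int (ReadWrite::*Read)(std::size_t);
private:
    int read1(std::size_t add) { return db.at(add); }  // throws on unseen addresses
    int read2(std::size_t add) { return db[add]; }     // returns 0 on unseen addresses
    std::map<std::size_t, int> db;
};
```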
You can use a pointer to function.
class ReadWrite {
public:
void Write(size_t add, int val) { db[add] = val; }
int Read(size_t add) { return (this->*Rfunc)(add); }
void Allow() { Rfunc = &ReadWrite::Read2; }
private:
std::map<size_t, int> db;
int Read1(size_t add) { return db.at(add); }
int Read2(size_t add) { return db[add]; }
int (ReadWrite::*Rfunc)(size_t) = &ReadWrite::Read1;
};
If you want runtime dynamic behaviour you'll have to pay for it at runtime (at the point you want your logic to behave dynamically).
You want different behaviour at the point where you call Read depending on a runtime condition and you'll have to check that condition.
No matter whether your overhead is a function pointer call or a branch, you'll find a jump or call to different places in your program depending on allow at the point Read is called by the client code.
Note: Profile and fix real bottlenecks - not suspected ones. (You'll learn more if you profile by either having your suspicion confirmed or by finding out why your assumption about the performance was wrong.)
Recent compilers, e.g. clang, complain if a function tests if "this" is NULL, as this is illegal according to the standard.
I have a program that made heavy use of this, and I am trying to clean it up. Below are some examples that don't warn. Are these safe or not? Is there a good way to get the ->toAA and ->toAB behaviour in a way that is C++ standard compliant? (Ideally without changing the code that is calling these functions, and reasonably fast. See the note below that testing this is faster in GCC 4.6.)
#include <stddef.h>
class ClassAA;
class ClassAB;
class ClassBase {
public:
enum Type {AA, AB};
Type m_type;
ClassAA* toAA();
ClassAB* toAB();
ClassBase(Type t) : m_type(t) {}
virtual ~ClassBase() {}
};
class ClassAA : public ClassBase { public: int a; ClassAA():ClassBase(AA) {} };
class ClassAB : public ClassBase { public: int b; ClassAB():ClassBase(AB) {} };
// toAA and toAB are intended to have same function,
// but toAB is significantly better performing on GCC 4.6.
inline ClassAA* ClassBase::toAA() { return dynamic_cast<ClassAA*>(this); }
inline ClassAB* ClassBase::toAB() { return (this && m_type == AB) ? static_cast<ClassAB*>(this) : NULL; }
int foo(ClassBase* bp) {
if (bp && bp->toAA()) // Legal
return -1;
if (dynamic_cast<ClassAA*>(bp)) // Legal
return -1;
if (!bp->toAA()) // No warning, is this legal?
return -1;
if (bp->toAA()->a) // No warning, is this legal?
return 10;
if (bp->toAB()->b) // Warning due to use of "this", illegal presumably
return 20;
return 0;
}
Make them free functions instead, or static members that take an argument.
Non-static member functions must be invoked on an extant object; period.
Your compiler likely isn't warning because you don't dereference this, so its detection algorithm isn't triggered. But that doesn't make the behaviour any less undefined. The compiler could be omitting the warning then sneaking off to make pancakes, for all you know.
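A sketch of the free-function form, reusing the question's classes. The null check is now performed on an ordinary pointer parameter, which is well defined; the initialization of a and b to 0 is an addition made here only so the members have defined values.

```cpp
#include <cassert>

class ClassBase {
public:
    enum Type { AA, AB };
    Type m_type;
    explicit ClassBase(Type t) : m_type(t) {}
    virtual ~ClassBase() {}
};
class ClassAA : public ClassBase { public: int a; ClassAA() : ClassBase(AA), a(0) {} };
class ClassAB : public ClassBase { public: int b; ClassAB() : ClassBase(AB), b(0) {} };

// Free functions: testing the argument for null is legal, unlike testing `this`.
inline ClassAA* toAA(ClassBase* p) {
    return (p && p->m_type == ClassBase::AA) ? static_cast<ClassAA*>(p) : nullptr;
}
inline ClassAB* toAB(ClassBase* p) {
    return (p && p->m_type == ClassBase::AB) ? static_cast<ClassAB*>(p) : nullptr;
}
```

Call sites change from bp->toAA() to toAA(bp). The result still has to be checked before it is dereferenced: toAA(bp)->a remains undefined behaviour when the cast yields a null pointer.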
In my jpg decoder I have a loop with an if statement that will always be true or always be false depending on the image. I could make two separate functions to avoid the if statement but I was wondering out of curiosity what the effect on efficiency would be using a function pointer instead of the if statement. It will point to the inline function if true or point to an empty inline function if false.
class jpg{
private:
// empty function
void inline nothing();
// real function
void inline function();
// pointer to inline function
void (jpg::*functionptr)() = nullptr;
};
void jpg::nothing(){}
main(){
functionptr = &jpg::nothing;
if(trueorfalse){
functionptr = &jpg::function;
}
while(kazillion){
(this->*functionptr)();
dootherstuff();
}
}
Could this be faster than an if statement? My guess is no, because the inline will be useless since the compiler won't know which function to inline at compile time and the function pointer address resolve is slower than an if statement.
I have profiled my program and while I expected a noticeable difference one way or the other when I ran my program... I did not experience a noticeable difference. So I'm just wondering out of curiosity.
It is very likely that the if statement would be faster than invoking a function, as the if will just be a short jump vs the overhead of a function call.
This has been discussed here: Which one is faster ? Function call or Conditional if Statement?
The "inline" keyword is just a hint to the compiler that it should try to place the function's instructions directly at the call site. If you call an inline function through a function pointer, the inlining optimization generally cannot be applied anyway:
Read: Do inline functions have addresses?
If you feel that the if statement is slowing it too much, you could eliminate it altogether by using separate while statements:
if (trueorfalse) {
while (kazillion) {
trueFunction();
dootherstuff();
}
} else {
while (kazillion) {
dootherstuff();
}
}
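If duplicating the loop body by hand is a concern, one possible variation (not part of the original answer) is to let a bool template parameter select the loop, so the run-time test still happens only once, outside the loop. The names runLoop and decode and the callable parameters are hypothetical.

```cpp
#include <cassert>

// One loop body, instantiated twice. Inside each instantiation the condition
// is a compile-time constant, so the compiler can drop the dead branch.
template <bool CallFunction, class Step, class Other>
void runLoop(int iterations, Step step, Other other) {
    for (int i = 0; i < iterations; ++i) {
        if (CallFunction) step();
        other();
    }
}

// The run-time decision is made once, before entering the loop.
template <class Step, class Other>
void decode(bool trueorfalse, int iterations, Step step, Other other) {
    if (trueorfalse) runLoop<true>(iterations, step, other);
    else runLoop<false>(iterations, step, other);
}
```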
Caution 1: I am not really answering the above question, on purpose. If one wants to know what is faster between an if statement and a function call via a pointer in the above example, then mbonneau gives a very good answer.
Caution 2: The following is pseudo-code.
Curiosity aside, I truly think one should not be asking which is faster between an if statement and a function call in order to optimize code. The gain would certainly be very small, and the resulting code might be twisted in a way that hurts both readability and maintenance.
For my research, I do care about performance; it is a fundamental concern I have to stick with. But I care more about code maintenance, and if I have to choose between a good structure and a slight optimization, I definitely choose the good structure. So, if it were me, I would write the above code as follows (avoiding if statements), using composition through a Strategy Pattern.
class MyStrategy {
public:
virtual void MyFunction( Stuff& ) = 0;
};
class StrategyOne : public MyStrategy {
public:
void MyFunction( Stuff& ); // do something
};
class StrategyTwo : public MyStrategy {
public:
void MyFunction( Stuff &stuff ) { } // do nothing, and if you
// change your mind it could
// do something later.
};
class jpg{
public:
jpg( MyStrategy& strat) : strat(strat) { }
void func( Stuff &stuff ) { return strat.MyFunction( stuff ); }
private:
...
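MyStrategy& strat; // a reference: MyStrategy is abstract and cannot be held by value
};
int main(){
StrategyOne one;
StrategyTwo two;
Stuff stuff;
jpg a( one );
jpg b( two );
std::vector<jpg> v { a, b };
for( auto& e : v )
{
e.func( stuff );
dootherstuff();
}
}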