I wonder whether, and how, writing "almighty" classes in C++ actually impacts performance.
Say I have, for example, a class Point with only int mx; int my; as data, and I have defined virtually everything that math can do to a point as methods. Some of those methods might be huge. The (copy-)constructors do nothing more than initialize the two data members.
class Point
{
    int mx;
    int my;

public:
    Point(int x, int y) : mx(x), my(y) {}
    Point(const Point& other) : mx(other.mx), my(other.my) {}

    // .... HUGE number of methods ....
};
Now I load a big image, create a Point for every pixel, stuff them into a vector, and use them (say, every method gets called once).
This is only meant as a stupid example!
Would it be any slower than the same class without the methods but with a lot of utility functions? I am not talking about virtual functions in any way!
My Motivation for this: I often find myself writing nice and relatively powerful classes, but when I have to initialize/use a ton of them like in the example above, I get nervous.
I think I shouldn't.
What I think I know is:
Methods exist only once in memory (optimizations aside).
Allocation only takes place for the data members, and they are the only thing copied.
So it shouldn't matter. Am I missing something?
You are right: methods exist only once in memory; they're just like normal functions with an extra hidden this parameter.
And of course, only data members are taken into account for allocation. Inheritance may introduce an extra pointer (the vptr) in the object size, but that's not a big deal.
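To make that concrete, here is a minimal sketch (a simplified variant of the Point class above, with public data for brevity): a non-virtual member function behaves like an ordinary free function that receives the object through a hidden this parameter, and it exists exactly once in the binary no matter how many Point objects you create.

class Point
{
public:
    Point(int x, int y) : mx(x), my(y) {}

    // Compiled roughly as "int sum(const Point* this)"; one copy in the binary.
    int sum() const { return mx + my; }

    int mx, my;
};

// Essentially what the member function amounts to: a free function taking
// an explicit "this"-style parameter.
int point_sum(const Point* p) { return p->mx + p->my; }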
You have already got some pretty good technical advice. I want to throw in something non-technical: as the STL showed us all, doing it all in member functions might not be the best way to do this. Rather than piling up arguments, I refer to Scott Meyers' classic article on the subject: How Non-Member Functions Improve Encapsulation.
Although technically there should be no problem, you still might want to review your design from a design POV.
I suppose this is more of an answer than you're looking for, but here goes...
SO is filled with questions where people are worried about the performance of X, Y, or Z, and that worry is a form of guessing.
If you're worried about the performance of something, don't worry, find out.
Here's what to do:
Write the program
Performance tune it
Learn from the experience
What this has taught me, and I've seen it over and over, is this:
Best practice says Don't optimize prematurely.
Best practice says Do use lots of data structure classes, with multiple layers of abstraction, and the best big-O algorithms, "information hiding", with event-driven and notification-style architecture.
Performance tuning reveals where the time is going, which is: Galloping generality, making mountains out of molehills, calling functions & properties with no realization of how long they take, and doing this over multiple layers using exponential time.
Then the question is asked: What is the reason behind the best practice for the big-O algorithms, the event- and notification-driven architecture, etc. The answer comes: Well, among other things, performance.
So in a way, best practice is telling us: optimize prematurely. Get the point? It says "don't worry about performance", and it says "worry about performance", and it causes the very thing we're trying unsuccessfully not to worry about. And the more we worry about it, against our better judgement, the worse it gets.
My constructive suggestion is this: Follow steps 1, 2, and 3 above. That will teach you how to use best practice in moderation, and that will give you the best all-around design.
If you are truly worried, you can tell your compiler to inline the constructors. This optimization step should leave you with clean code and clean execution.
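For example (nothing beyond what the question already shows): a constructor defined inside the class body is implicitly inline, so the compiler is free to expand it at each construction site instead of emitting a call.

class Point
{
public:
    // Defined in the class body, hence implicitly inline; the compiler may
    // expand it directly wherever a Point is constructed.
    Point(int x, int y) : mx(x), my(y) {}

private:
    int mx;
    int my;
};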
These 2 bits of code are identical:
Point x;
int l=x.getLength();
int l=GetLength(x);
given that the class Point has a non-virtual method getLength(). The first invocation effectively calls int getLength(Point& this), a signature identical to the one in the second example. (*)
This of course wouldn't apply if the methods you're calling are virtual, since everything would go through an extra level of indirection (something akin to the C-style int l=x->lpvtbl->getLength(x)), not to mention that instead of 2 ints for every pixel you'd actually have 3, the extra one being the pointer to the virtual table.
(*) This isn't exactly true: the "this" pointer is typically passed in a CPU register rather than on the stack, but the mechanism could just as easily have worked either way.
First: do not optimize prematurely.
Second: clean code is easier to maintain than optimized code.
Methods of classes have the hidden this pointer, but you should not worry about it; most of the time the compiler passes it in a register anyway.
Inheritance and virtual functions introduce indirection in the corresponding calls (inheritance: constructor/destructor calls; virtual functions: every call to such a function).
Short:
Objects you don't create/destroy often can have virtual methods, inheritance and so on as long as it benefits the design.
Objects you create/destroy often should be small (few data members) and should not have many virtual methods (best would be none at all - performance wise).
Try to inline small methods/constructors. This will reduce the overhead.
Go for a clean design and refactor if you don't reach the desired performance.
There is a different discussion about classes having large or small interfaces (for example in one of Scott Meyers' (More) Effective C++ books; he opts for a minimal interface). But this has nothing to do with performance.
I agree with the above comments regarding performance and class layout, and would like to add a comment about design that hasn't been stated yet.
It feels to me like you're over-using your Point class beyond its real design scope. Sure, it can be used that way, but should it?
In past work on computer games I've often faced similar situations, and usually the best end result has been that, for specialized processing (e.g. image processing), a specialized code path working on differently laid-out buffers is more efficient.
This also allows you to optimize performance for the case that matters, in a cleaner way, without making the base code less maintainable.
In theory, I'm sure that there is a crafty way of using a complex combination of template code, concrete class design, etc., and getting nearly the same run-time efficiency ... but I am usually unwilling to make the complexity-of-implementation trade.
Member functions are not copied along with the object. Only data fields contribute to the size of the object.
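A quick way to convince yourself of this (a hedged sketch; the struct names are invented, and equal sizes are what mainstream compilers give you here, even though the standard does not spell out this exact guarantee):

#include <cstdio>

struct JustData
{
    int x, y;
};

struct DataAndMethods
{
    int x, y;
    int length_squared() const { return x * x + y * y; }
    void scale(int f) { x *= f; y *= f; }
};

// Member functions live in the code segment, not in each object.
static_assert(sizeof(JustData) == sizeof(DataAndMethods),
              "methods add nothing to the per-object size");

int main()
{
    std::printf("%zu %zu\n", sizeof(JustData), sizeof(DataAndMethods));
}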
I have created the same Point class as you, except it is a template class and all functions are inline; I expect a performance increase rather than a decrease from this. However, an image of size 800x600 has 480k pixels, and its memory footprint will be close to 4 MB without any color information. And it's not just memory: initializing 480k objects will also take too much time. Therefore, I don't think it's a good idea in that case. However, if you use this class to transform positions in an image, or for graphic primitives (lines, curves, circles, etc.), it should serve you well.
I was told recently in a job interview that their project is focused on building the smallest possible binary for their application (it runs embedded), so I would not be able to use things such as templates or smart pointers, as these would increase the binary size. They generally seemed to imply that using things from std would be a no-go (though not in all cases).
After the interview, I tried to research online which coding practices and which features of the standard library cause large binary sizes, and I could find basically nothing about this. Is there a way to quantify the size impact of using certain features (without having to write, say, 100 smart pointers in a code base versus self-managed memory)?
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I’ve both worked for, and interviewed at companies that enforced these kinds of restrictions, I’ve even voluntarily enforced them myself. It really comes down to this. When you’re running an engineering organization with more than one person with plans to keep hiring, it is wildly impractical to assume everyone on your team is going to fully understand the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing “bad things” without knowing they’re doing “bad things”.
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, the code may actually end up bigger, simply because function calls take several instructions and remove optimization potential.
Another aspect where they can at least not hurt is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extent.
This bare-bones vector type shows what I mean:
class VectorBase
{
protected:
void **start, **end, **capacity;
void push_back(void*);
void* at(std::size_t i);
void clear(void (*cleanup_function)(void*));
};
template<class T>
class Vector: public VectorBase
{
public:
void push_back(T* value)
{ this->VectorBase::push_back(value); }
T* at(std::size_t i)
{ return static_cast<T*>(this->VectorBase::at(i)); }
~Vector()
{ clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type-safety and to provide necessary indirections without emitting any code that wouldn't have been here anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with them is that the programmer is better at reasoning about code and avoiding dead code. For example, even after a unique_ptr has been moved away, the compiler still has to emit destructor code for it. The programmer knows that the value is null at that point; the compiler often doesn't.
Another issue comes up with calling conventions. Objects with destructors are usually passed on the stack, even if you declare them pass-by-value. Same for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz) simply because pointers have to be put on and off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means if a function accepts a complex object like a smart pointer by value, a call to the destructor for that parameter is replicated at every call site, instead of putting it once inside the function. Especially with unique_ptr this is so stupid because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
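A small sketch of the two signatures being compared (foo and bar are the hypothetical names from above; the comments describe typical behaviour under the Itanium/Linux ABI, not something the standard guarantees):

#include <memory>

struct foo { int value; };

// Raw pointer: parameter and return value travel in registers, and the
// caller has no destructor bookkeeping to do.
foo* bar_raw(foo* baz) { return baz; }

// unique_ptr by value: the non-trivial destructor typically forces the
// parameter and return value through memory, and the caller emits the
// destructor call for its argument at every call site.
std::unique_ptr<foo> bar_smart(std::unique_ptr<foo> baz) { return baz; }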
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting and weak pointers, and what indirection is used for destruction? Do you really need two raw pointers per shared pointer, or can the reference counter be reached through the shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and track code size as a metric.
For example I once added a small feature, I don't remember which, and in its error handling it used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't the same as for example Rust's where objects can be moved with a simple memcpy and then "forgetting" their original location. In C++ the destructor for a moved object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
Qt for example accounts for such simple cases in its meta type system.
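A minimal sketch of that point (the Buffer type is invented): because the destructor of a moved-from object still runs in C++, the move constructor has to reset the source, and the destructor has to cope with the emptied state; a plain memcpy-and-forget move would need neither.

#include <cstddef>

class Buffer
{
public:
    explicit Buffer(std::size_t n) : data_(new char[n]), size_(n) {}

    // The source must be left in a state its destructor can handle.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_)
    {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    ~Buffer() { delete[] data_; }  // still runs for moved-from objects

private:
    char* data_;
    std::size_t size_;
};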
I know that virtual functions have an overhead of dereferencing to call a method. But I guess with modern architectural speed it is almost negligible.
Is there any particular reason why all functions in C++ are not virtual as in Java?
As I understand it, declaring a function virtual in the base class is both necessary and sufficient. But when I write a parent class, I might not know which methods will get overridden. Does that mean that, while writing a child class, someone would have to edit the parent class? That sounds inconvenient, and sometimes impossible.
Update:
Summarizing from Jon Skeet's answer below:
It's a trade-off: explicitly making someone realize that they are inheriting functionality (which has its own potential risks; check Jon's response) and gaining potentially small performance wins, in exchange for less flexibility, more code changes, and a steeper learning curve.
Other reasons from different answers:
Virtual functions usually cannot be inlined, because the call target is only resolved at runtime. This has a performance impact when you expect your functions to benefit from inlining.
There might be potentially other reasons, and I would love to know and summarize them.
There are good reasons for controlling which methods are virtual beyond performance. While I don't actually make most of my methods final in Java, I probably should... unless a method is designed to be overridden, it probably shouldn't be virtual IMO.
Designing for inheritance can be tricky - in particular it means you need to document far more about what might call it and what it might call. Imagine if you have two virtual methods, and one calls the other - that must be documented, otherwise someone could override the "called" method with an implementation which calls the "calling" method, unwittingly creating a stack overflow (or infinite loop if there's tail call optimization). At that point you've then got less flexibility in your implementation - you can't switch it round at a later date.
Note that C# is a similar language to Java in various ways, but chose to make methods non-virtual by default. Some other people aren't keen on this, but I certainly welcome it - and I'd actually prefer that classes were uninheritable by default too.
Basically, it comes down to this advice from Josh Bloch: design for inheritance or prohibit it.
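In C++ terms, that advice maps naturally onto virtual, override and final; a small sketch with invented names:

class Widget
{
public:
    virtual void draw() {}        // explicitly designed as a customisation point
    void resize(int, int) {}      // not meant to be overridden, so non-virtual
    virtual ~Widget() = default;
};

class Button final : public Widget   // "prohibit it": no further derivation
{
public:
    void draw() override {}          // the compiler checks this really overrides
};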
One of the main C++ principles is: you only pay for what you use ("zero overhead principle"). If you don't need the dynamic dispatch mechanism, you shouldn't pay for its overhead.
As the author of the base class, you should decide which methods should be allowed to be overridden. If you're writing both, go ahead and refactor what you need. But it works this way, because there has to be a way for the author of the base class to control its use.
But I guess with modern architectural speed it is almost negligible.
This assumption is wrong, and, I guess, the main reason for this decision.
Consider the case of inlining. C++'s sort function performs much faster than C's otherwise similar qsort in some scenarios because it can inline its comparator argument, while C cannot (due to its use of function pointers). In extreme cases, this can mean performance differences of as much as 700% (Scott Meyers, Effective STL).
The same would be true for virtual functions. We’ve had similar discussions before; for instance, Is there any reason to use C++ instead of C, Perl, Python, etc?
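For the record, this is roughly what the sort/qsort comparison looks like (a sketch; the actual speedup depends on the data and the compiler):

#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort goes through a function pointer on every comparison, so this
// usually cannot be inlined.
static int compare_ints(const void* a, const void* b)
{
    return *static_cast<const int*>(a) - *static_cast<const int*>(b);
}

int main()
{
    std::vector<int> v = {3, 1, 2};
    // std::sort receives the comparator as part of the template instantiation,
    // so the lambda is typically inlined into the sorting code.
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });

    int raw[] = {3, 1, 2};
    std::qsort(raw, 3, sizeof(int), compare_ints);
}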
Most answers deal with the overhead of virtual functions, but there are other reasons not to make any function in a class virtual, as the fact that it will change the class from standard-layout to, well, non-standard-layout, and that can be a problem if you need to serialize binary data. That is solved differently in C#, for example, by having structs being a different family of types than classes.
From the design point of view, every public function establishes a contract between your type and the users of the type, and every virtual function (public or not) establishes a different contract with the classes that extend your type. The greater the number of such contracts you sign, the less room for change you have. As a matter of fact, there are quite a few people, including some well-known writers, who argue that the public interface should never contain virtual functions, as your commitments to your clients might be different from the commitments you require from your extensions. That is, the public interface shows what you do for your clients, while the virtual interface shows how others might help you in doing it.
Another effect of virtual functions is that they always get dispatched to the final overrider (unless you explicitly qualify the call), and that means that any function that is needed to maintain your invariants (think the state of the private variables) should not be virtual: if a class extends it, it will have to either make an explicit qualified call back to the parent or else would break the invariants at your level.
This is similar to the example of the infinite loop/stack overflow that Jon Skeet mentioned, just in a different way: you have to document in each function whether it accesses any private attributes so that extensions will ensure that the function is called at the right time. And that in turn means that you are breaking encapsulation and you have a leaking abstraction: your internal details are now part of the interface (documentation + requirements on your extensions), and you cannot modify them as you wish.
Then there is performance... there will be an impact in performance, but in most cases that is overrated, and it could be argued that only in the few cases where performance is critical, you would fall back and declare the functions non-virtual. Then again, that might not be simple on a built product, since the two interfaces (public + extensions) are already bound.
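One common way to act on the advice that invariant-maintaining functions should not be virtual is the non-virtual interface idiom: a public non-virtual function owns the contract with clients and the invariant checks, and delegates to a private virtual extension point. A minimal sketch with invented names:

class Connection
{
public:
    // Public, non-virtual: the contract with clients; invariants are
    // enforced here exactly once.
    void send(const char* data)
    {
        if (!open_) return;    // invariant check stays under our control
        do_send(data);         // extension point for derived classes
    }

    virtual ~Connection() = default;

private:
    // Private virtual: the (separate) contract with extenders.
    virtual void do_send(const char* data) = 0;

    bool open_ = true;
};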
You forget one thing: the overhead is also in memory. There is one virtual table per class, plus a pointer to that table in every object. If a class is expected to have a significant number of instances, that is not negligible: for example, a million instances means roughly 4 megabytes of vptrs (with 4-byte pointers). I agree that for a simple application this is not much, but for real-time devices such as routers it counts.
I'm rather late to the party here, so I'll add one thing that I haven't noticed covered in other answers, and summarise quickly...
Usability in shared memory: a typical implementation of virtual dispatch has a pointer to a class-specific virtual dispatch table in each object. The addresses in these pointers are specific to the process creating them, which means multi-process systems accessing objects in shared memory can't dispatch using another process's object! That's an unacceptable limitation given shared memory's importance in high-performance multi-process systems.
Encapsulation: the ability of a class designer to control the members accessed by client code, ensuring class semantics and invariants are maintained. For example, if you derive from std::string (I may get a few comments for daring to suggest that ;-P) then you can use all the normal insert / erase / append operations and be sure that - provided you don't do anything that's always undefined behaviour for std::string like pass bad position values to functions - the std::string data will be sound. Someone checking or maintaining your code doesn't have to check if you've changed the meaning of those operations. For a class, encapsulation ensures freedom to later modify the implementation without breaking client code. Another perspective on the same statement: client code can use the class any way it likes without being sensitive to the implementation details. If any function can be changed in a derived class, that whole encapsulation mechanism is simply blown away.
Hidden dependencies: when you know neither what other functions are dependent on the one you're overriding, nor that the function was designed to be overridden, then you can't reason about the impact of your change. For example, you think "I've always wanted this", and change std::string::operator[]() and at() to consider negative values (after a type-cast to signed) to be offsets backwards from the end of the string. But, perhaps some other function was using at() as a kind of assertion that an index was valid - knowing it'll throw otherwise - before attempting an insertion or deletion... that code might go from throwing in a Standard-specified way to having undefined (but likely lethal) behaviour.
Documentation: by making a function virtual, you're documenting that it is an intended point of customisation, and part of the API for client code to use.
Inlining - code side & CPU usage: virtual dispatch complicates the compiler's job of working out when to inline function calls, and could therefore provide worse code in terms of both space/bloat and CPU usage.
Indirection during calls: even if an out-of-line call is being made either way, there's a small performance cost for virtual dispatch that may be significant when calling trivially simple functions repeatedly in performance critical systems. (You have to read the per-object pointer to the virtual dispatch table, then the virtual dispatch table entry itself - means the VDT pages are consuming cache too.)
Memory usage: the per-object pointers to virtual dispatch tables may represent significant wasted memory, especially for arrays of small objects. This means fewer objects fit in the cache, which can have a significant performance impact.
Memory layout: it's essential for performance, and highly convenient for interoperability, that C++ can define classes with the exact memory layout of member data specified by network or data standards of various libraries and protocols. That data often comes from outside your C++ program, and may be generated in another language. Such communications and storage protocols won't have "gaps" for pointers to virtual dispatch tables, and as discussed earlier - even if they did, and the compiler somehow let you efficiently inject the correct pointers for your process over incoming data, that would frustrate multi-process access to the data. Crude-but-practical pointer/size based serialisation/deserialisation/comms code would also be made more complicated and potentially slower.
Pay per use (in Bjarne Stroustrup's words).
Seems like this question might have some answers: Virtual functions should not be used excessively - Why? In my opinion, the one thing that stands out is that it just adds more complexity in terms of knowing what can be done with inheritance.
Yes, it's because of performance overhead. Virtual methods are called using virtual tables and indirection.
In Java all methods are virtual and the overhead is also present. But, contrary to C++, the JIT compiler profiles the code during run-time and can inline those methods which don't actually need dynamic dispatch. So the JVM knows where it's really needed and where it isn't, freeing you from making the decision on your own.
The issue is that while Java compiles to code that runs on a virtual machine, that same guarantee can't be made for C++. It's common to use C++ as a more organized replacement for C, and C maps almost 1:1 to assembly.
If you consider that 9 out of 10 microprocessors in the world are not in a personal computer or a smartphone, you'll see the issue when you further consider that a lot of those processors need this low-level access.
C++ was designed to avoid that hidden dereferencing if you didn't need it, thus keeping that 1:1 nature. Some of the first C++ implementations actually had an intermediate step of translating to C before running through a C-to-assembly compiler.
Java method calls are far more efficient than C++ due to runtime optimization.
What we need is to compile C++ into bytecode and run it on JVM.
My class has 3 states. In each state it does some work, and goes to other state, or remains in the same state (in 95% or more cases it will stay in the same state). I can implement state pattern (I assume you know it). The alternative, which I pretty like, is this:
I have a member function per state, and also a pointer to member function, which points to the current state function. When in a state I want to go to another state, I just point that function pointer to another state function. (maybe this isn't completely equivalent to state pattern, but in my case it works fine).
Those two ways are almost identical, I think.
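For reference, a minimal sketch of the pointer-to-member-function approach described above (the three states and the Machine name are made up for illustration):

#include <cstdio>

class Machine
{
public:
    Machine() : state_(&Machine::stateA) {}

    void step() { (this->*state_)(); }   // dispatch to the current state

private:
    void stateA() { std::puts("A"); state_ = &Machine::stateB; }
    void stateB() { std::puts("B"); state_ = &Machine::stateC; }
    void stateC() { std::puts("C"); /* stay in C */ }

    void (Machine::*state_)();           // points to the current state function
};

int main()
{
    Machine m;
    m.step();  // A
    m.step();  // B
    m.step();  // C, and it stays there from now on
}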
So, my questions are:
Which solution is better (depends on what)?
Is it worth declaring a class per state (each of which would have only one function)? I think that would be artificial.
What about performance? Doesn't creating a new state object (in the case of the state pattern) bring a slight overhead with it? (Sure, state classes shouldn't have data members, but it should still cost something.)
You don't really mention the constraints under which your program will run, so it's hard to comment specifically on the overheads of one implementation versus the other; I'll just make a comment about code maintainability.
Personally I think that unless your state machine is extremely simple and will stay simple, declaring a class per state is far more maintainable, extensible and readable. A good rule of thumb might be that if you can't look at the code in your class and keep the entire picture in your head, then your class is probably doing too much. The small overhead you pay in declaring a class per state is likely to be well worth the productivity gains you (or anyone else who ends up maintaining the code) will get from writing modular code. I've come across far too many 'uber' classes, essentially one big (very hard to maintain) state machine that probably started out as a simple one, to recommend otherwise.
The 'S' and 'O' portions of the SOLID acronym (https://en.wikipedia.org/wiki/SOLID_(object-oriented_design)) are always good things to keep in mind.
It depends on whether you need to access private members of your object or not. If not, an out-of-class implementation breaks your code into smaller fragments and may be preferable for that reason (but this is not objective: the two solutions have pros and cons).
It's not necessary, but it adds a layer of abstraction and loosens the coupling. Using an interface, you can change each implementation without affecting the others (e.g. adding class fields...).
It doesn't matter much: allocating a new empty object and calling a function have the same order of magnitude of overhead.
Just wondering whether dividing one class into several classes will reduce the efficiency of a program. In order to make my question clear, I give the following examples:
class OneClass
{
public:
    void Do_the_job()
    {
        Do_function1();
        Do_function2();
        Do_function3();
    }

    void Do_function1();
    void Do_function2();
    void Do_function3();

    // placeholder data members for each sub-task
    int variables_related_to_function1;
    int variables_related_to_function2;
    int variables_related_to_function3;
};
In this class, the functionality can be divided into three sub-functions, and they are in the same class. However, for the purpose of easy maintenance, I decide to make each sub-function a class, and the whole functionality is performed by combining three independent classes in one class:
class OneClass
{
public:
    void Do_the_job()
    {
        class1.do_function1();
        class2.do_function2();
        class3.do_function3();
    }

    // Class1, Class2 and Class3 each hold the data and the do_functionN()
    // that used to live directly in OneClass.
    Class1 class1;
    Class2 class2;
    Class3 class3;
};
My question is: will the new implementation reduce the efficiency of the program? Thanks.
Probably not, and even if it does you're unlikely to notice.
The worst case for calling a method of another object is calling a virtual method via a pointer, which requires a vtable lookup to find the code to run, and that has to be done at runtime so it's not subject to inlining. But even that's pretty quick, and you're not going to notice it except in very specific situations.
Calling ordinary member functions on objects held by value is as fast as calling your own member functions.
Don't worry about things like this unless you can see that they're actually causing an issue. The impact of what kind of function call you're making is negligible next to the potential speed problems caused by flawed algorithms, or even by using postfix increment unnecessarily on some iterator classes.
Suggest you concentrate first of all on getting the logical design right, then think about these kinds of optimizations second. The kinds of optimizations that modern compilers and linkers will perform are quite impressive, and you should only worry about these kinds of performance issues after taking performance measurements / profiling.
In particular, modern linkers can do whole-program optimization, when appropriate, which can essentially inline a method from another module, even if that module is in another statically linked library. So if the methods you're moving out are quite small, the executable code might be essentially the same as the original.
The answer is "Probably Not". While I've never attempted to benchmark this, I do know that most developers write classes with cohesion in mind, rather than a specific number of functions. I tend to only allow 10 or so functions in my class. Any more complicated than that and I need to rethink my design.
That being said, if it is any slower, the difference is probably not discernible. Unless you're doing heavy polymorphic design with many calls to the vtable, you shouldn't notice a difference.
Even if you have multiple classes, invocation of a method still takes the same time as with only one class. So I think there's no reason why you shouldn't split up big classes and benefit from better readability and maintainability.
What really matters is if you're starting to use virtual methods, which requires a lookup for the address of the method at runtime (if the compiler can't resolve/inline it at compile-time). In this case, the invocation takes in the order of 3 instead of 1 machine code instructions.
If you're using dynamic dispatching AND multiple inheritance, you have an overhead in the order of 4-5 machine code instructions instead of one.
That said, I should mention that you probably won't feel a noticeable difference unless you have several thousand objects of a class. But then again you could start devirtualizing your classes to reduce the overhead.
I know there are some positive aspects of inheritance, but I don't know its negative runtime impacts. Can anybody tell me about that? Thanks!
Large inheritance-based systems usually use more memory and have worse data layout than composition-based systems. This has a runtime cost in terms of speed because of how the cache behaves (you want everything related to be packed as tightly as possible).
Virtual function calls require a trip to a virtual function table in order to retrieve the correct function to call; this can be costly due to cache behavior, as the vtable might be far from the calling function.
Multiple inheritance increases the cost of virtual function calls further, as an offset might first need to be computed in order to get the correct vtable.
If you're using RTTI, then you'll usually see additional data at a fixed location relative to the vtable. This affects vtable locality, which once again hurts the cache.
If a base class contains virtual functions then instances of it and its descendants will each have a pointer to a virtual function table, increasing their memory footprint by the size of one pointer. Calls to virtual functions will have an extra level of indirection compared to non-virtual functions, so there is a small call time cost there.
Otherwise, there is no negative impact. Deriving one class from another but not using polymorphism (so, no virtual functions, always calling methods through pointers to the derived class) has no cost over a class with no parent.
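A quick illustration of the size difference (a hedged sketch: the exact numbers are implementation-defined, but on a typical 64-bit ABI the polymorphic version gains a vtable pointer):

#include <cstdio>

struct PlainBase { int a; };
struct PlainDerived : PlainBase { int b; };            // no virtual functions, no vptr

struct PolyBase { virtual ~PolyBase() = default; int a; };
struct PolyDerived : PolyBase { int b; };              // carries a vptr

int main()
{
    std::printf("PlainDerived: %zu bytes\n", sizeof(PlainDerived));
    std::printf("PolyDerived:  %zu bytes\n", sizeof(PolyDerived));
}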
Update: I have addressed the performance impact of inheritance here. Other answers have more to say on OO-correctness.
The benefits of using inheritance greatly outweigh the drawbacks.
The first drawback is the object's size in memory which, when virtual functions are used, includes an extra pointer to the virtual function table.
Virtual function calls also require a few extra steps in the assembly compared to regular calls.
Non-virtual function calls cost the same in terms of performance.
Object size can also increase as an object of class A, if A is derived from B, contains all information from B. Of course, with a well-thought design, this doesn't happen, because even without inheritance, A would contain all information in B.
One more issue would be the use of dynamic_cast or static_cast, which you wouldn't encounter in an inheritance-free environment, but these can also be avoided even using inheritance.
The only runtime impact could be performance in terms of memory and speed. Considering that, functionality-wise, everything can be done without inheritance, the only question is how well it performs compared to the alternatives. That will depend on the specific scenarios you want to compare and on the compiler's generated code.
Inheritance can negatively impact data locality, which is a big deal when you have a lot of numbers to crunch. You also get less control over data layout than when you use composition, so your objects might take up more memory.
If you also use polymorphism, then you spend additional cycles on indirect function calls and get even worse data locality, as you reference virtual function tables.
Generally, the overhead cost of object-oriented programming is fairly small and you only have to think about it when you are processing large amounts of data. Check Sony's Pitfalls of Object Oriented Programming presentation — it looks at OOP performance from a game developer's perspective.
After reading the other (informative!) responses, I believe one potential negative impact wasn't mentioned yet:
Inheritance is often used to achieve polymorphism. In C++, this means that you pass references (C++ references or pointers) to the base type around, instead of passing objects by value, to avoid the slicing problem. In practice, passing references around often means that the scope of an object no longer defines its lifetime, so people start using dynamic memory management (say, new and delete). And this can open a whole can of worms itself.
To make a long story short: very often, inheritance goes hand in hand with dynamic memory allocation, which opens a whole new class of issues.
Since you tagged your post with C++, I'd like to add that one of the most important runtime impacts of using virtual functions in C++ is that they cannot be expanded inline.
In fact, the heaviest performance impact is not due to the virtual function table lookup, but to the fact that the compiler cannot expand a virtual function even if you declare it as inline. This prevents an important optimization that could make your code much faster.
I would think inheritance would only improve runtime: if you rewrite the same code in several places, that code has to be compiled that many more times.