Where is function overriding done? - c++

Where in the process of creating the program, compiler, linker etc., is the overriding of functions and operator overloading done?
I'm particularly interested where it is done in C++, Ruby and Python.

Function overloading is (at least in C++) handled internally inside the compiler. The idea is that the code that the compiler ultimately generates will be hardcoded to call the appropriate function, as if the functions all had different names and you called the function uniquely suited to the arguments. More generally, in most compiled languages that support overloading, the overload resolution is done at compile-time and the emitted code will always call the indicated function. For example, Haskell supports compile-time overloading this way.
Operator overloading is a special case of general overloading, so it's usually handled the same way.
Function overriding (a term that arises in OOP when a derived class inherits from a base class and redefines one of its methods) is almost always resolved at runtime, since a compiler can't always tell which function is going to be invoked without actually knowing about the types at runtime. Some compilers might be able to statically prove that a certain object has a specific type and can then optimize the dynamic dispatch away, but it's impossible to do this in all cases.
I am not aware of any dynamic languages that support overloading, since in theory you could introduce new overload candidates as the program was running. I would love to be enlightened if such a language exists.

For C++, operator overloading is done at the compiler level through a name-mangling process that creates a unique name identifier for every function so that the linker will not complain about duplicate function definitions. In C++, operator overloading is possible because overloadable operations like +, -, *, etc. are actual functions themselves that have the prefix operator followed by the symbol of the operation. So for instance, an overloaded operator+ function with a function signature like
my_type operator+(const my_type& lhs, const my_type& rhs);
will not conflict with another operator+ function with a different signature, even though both functions have the same operator+ name, because each version of the function will have a different name at the assembly-language level after the C++ compiler's name-mangling process is complete. Name-mangling has another benefit in that it allows C and C++ compiled code to be used with the same linker, since no two functions end up with the same symbol name, which would cause a linker error.
Note that in C, if you create two functions with the same name but different signatures, the linker will complain about duplicate definitions of the function, because the C compiler does not do any name-mangling.

Python is not linked/compiled, it is interpreted.
So, the normal overriding is done when class sources are parsed. Of course, due to Python's dynamic nature you can always override at runtime as well.
I suppose that alternate implementations using bytecode compilation do it at compile time.
I also think the above is true for Ruby as well.

Related

Is it possible overriding a global implemented function?

I'm being curious about if it is possible to override an implemented function. I mean, is there any legal syntax of function declaration / implementation that allows alternative implementation?
Why am I asking? (I know it sounds ridiculous)
First, just out of curiosity and to expand my knowledge.
Second, I've learned that the global new can be overridden (although it is strongly not recommended).
Third, assume that I have written a library: AwsomeLibrary.hpp, which my friend wants to include. Among a lot of functions, there is a function like void sort(int* arr), which he thinks that he could implement better (and of course call it with the same name).
I mean, is there any legal syntax of function declaration /
implementation that allows alternative implementation?
No. That would break the one-definition rule (ODR).
Second, I've learned that the global new can be overridden (although
it is strongly not recommended).
Replaceable allocation functions as documented at http://en.cppreference.com/w/cpp/memory/new/operator_new are really just a very special case, a grey area between language and standard library; certainly not something from which you can infer general rules for your own code.
Third, assume that I have written a library: AwsomeLibrary.hpp, which
my friend wants to include. Among a lot of functions, there is a
function like void sort(int* arr), which he thinks that he could
implement better (and of course call it with the same name).
Such problems are beyond the scope of C++. They are more related to source control versioning systems like Git. If, for example, your project is under Git control, then your friend could create a branch of the code with his better implementation.
It is not possible at language level, aside from one "bizarre" language feature you mentioned yourself: replaceable operator new and operator delete functions. These functions can be replaced through a dedicated mechanism, which is why it is formally referred to as replacement (as opposed to overriding or overloading). This feature is not available to the language user for their own functions.
Outside the limits of standard language you can employ such implementation-specific features as weak symbols, which would allow you to create replaceable functions. For example, virtually all functions in GNU standard C library are declared as weak symbols and can be replaced with user-provided implementations.
The latter is exactly what would facilitate replacement of the void sort(int* arr) function in your library. However, this does not look like a good design for a library. Function replacement capability should probably be reserved for debugging/logging and for other internal library-tuning purposes.

Does name mangling apply to virtual functions in c++?

We all know that all the functions in C++ are name mangled during the compile time only, so is this applied to virtual functions too?
Yes. Member function names are mangled. They need to embed their argument types so that you can overload them with different argument types.
In theory, a compiler could encode the argument types in some other way, but at some level each function body needs to be labelled by (and to have references to it resolved using) both the function name and its argument types. All major compilers certainly use mangling.
Name mangling is unrelated to member functions being virtual or not; after all virtual methods can be called non-virtually just like any member function. Only if the compiler could be certain that a virtual method is exclusively called through the vtable, it might avoid generating any linker symbol at all for the method (just inserting its address in the vtable instead). But I don't think there is any practical way a compiler can know that a method is not being called directly in another compilation unit (as it can for functions that are visible only in the current compilation unit).

How does function overloading work at run-time, and why overload?

Let's say I have a class named ClothingStore. That class has 3 member functions, that point a visitor to the right department of the store. Member functions are ChildrenDept, MenDept and WomenDept, depending on whether the visitor is a child, a man or a woman.
Function overloading can be used to make 3 functions that have same name, say, PointToDept, but take different input argument ( child, man, woman ).
What is actually happening on run-time when program is executing ?
My guess is that compiler adds switch statements to the program, to select the right member function. But that makes me wonder - is there any benefit in terms of program performance when using overloaded functions, instead of making your own function with switch statements? Again, my only conclusion on that part is code readability. Thank you.
My guess is that compiler adds switch statements to the program, to select the right member function.
That's a bad guess. C++ is a statically typed language. The type of a variable does not change at runtime. This means the decision as to which non-polymorphic overload to call is one that can always be made at compile time. Section 13.3 in the standard, Overload resolution, ensures that this is the case. There's no reason to have a runtime decision when that decision can be made at compile time. The runtime cost of having a non-polymorphic overloaded function in most implementations is zero. The only exception might be a C++ interpreter.
How does function overloading work at run-time
It doesn't. It works at compile-time. A call to an overloaded function is no different at runtime from a call to a non-overloaded function.
and why overload? ... is there any benefit in terms of program performance when using overloaded functions, instead of making your own function with switch statements?
Yes. There is no runtime overhead at all, compared with 'making your own function with switch statements'.
From Gene's comment:
The compiler sees three different functions just as though they had been differently named.
In the case of most compilers, they are differently named. This is called name mangling: the function name is decorated with an encoding of its parameter types (and, in some ABIs, its return type).

Why are C++11 string new functions (stod, stof) not member functions of the string class?

Why are those C++11 new functions of header <string> (stod, stof, stoull) not member functions of the string class ?
Isn't it more in keeping with C++ to write mystring.stod(...) rather than stod(mystring,...)?
It is a surprise to many, but C++ is not an Object-Oriented language (unlike Java or C#).
C++ is a multi-paradigm language, and therefore tries to use the best tool for the job whenever possible. In this instance, a free-function is the right tool.
Guideline: Prefer non-member non-friend functions to member functions (from Effective C++, Item 23)
Reason: a member function or friend function has access to the class internals whereas a non-member non-friend function does not; therefore using a non-member non-friend function increases encapsulation.
Exception: when a member function or friend function provides a significant advantage (such as performance), then it is worth considering despite the extra coupling. For example even though std::find works really well, associative containers such as std::set provide a member-function std::set::find which works in O(log N) instead of O(N).
The fundamental reason is that they don't belong there. They
don't really have anything to do with strings. Stop and think
about it. User defined types should follow the same rules as
built-in types, so every time you defined a new user type,
you'd have to add a function to std::string. This would
actually be possible in C++: if std::string had a member
function template to, without a generic implementation, you
could add a specialization for each type, and call
str.to<double>() or str.to<MyType>(). But is this really
what you want? It doesn't seem like a clean solution to me,
having everyone writing a new class having to add
a specialization to std::string. Putting these sorts of things
in the string class bastardizes it, and is really the opposite
of what OO tries to achieve.
If you were to insist on pure OO, they would have to be
members of double, int, etc. (A constructor, really. This
is what Python does, for example.) C++ doesn't insist on pure
OO, and doesn't allow basic types like double and int to
have members or special constructors. So free functions are
both an acceptable solution, and the only clean solution
possible in the context of the language.
FWIW: conversion to/from textual representation is always
a delicate problem: if I do it in the target type, then I've
introduced a dependency on the various sources and sinks of text
in the target type---and these can vary in time. If I do it in
the source or sink type, I make them dependent on the type
being converted, which is even worse. The C++ solution is to
define a protocol (in std::streambuf), where the user writes
a new free function (operator<< and operator>>) to handle
the conversions, and counts on operator overload resolution to
find the correct function. The advantage of the free function
solution is that the conversions are part of neither the data
type (which thus doesn't have to know of sources and sinks) nor
the source or sink type (which thus doesn't have to know about
user defined data types). It seems like the best solution to
me. And functions like stod are just convenience functions,
which make one particularly frequent use easier to write.
Actually they are some utility functions and they don't need to be inside the main class. Similar utility functions such as atoi, atof are defined (but for char*) inside stdlib.h and they too are standalone functions.

Member functions for scalar types and operator overloading [closed]

I've been thinking about some possible features C++ could have, does anyone know why they aren't supported?
Member functions for built-in types. This may just not seem necessary, but an interesting feature nonetheless. Example pseudocode: int int::getFirstBit(void) {return *this & 1;} ... int a = 2; a.getFirstBit(); This may seem useless, but it shouldn't be hard to implement either. With this springs up the following thought:
Member functions outside class definition. I don't see why this shouldn't be supported, except for conflicts with access restriction (public,protected,private,etc.) and encapsulation, but perhaps only structs could have this feature.
Operator overloading for non-object types, a use for this could be for pointers or arrays.
I know these features aren't necessary for much, but they still seem cool. Is it because they don't seem necessary or because they can cause many headaches?
I know these features aren't necessary for much, but they still seem cool. Is it because they don't seem necessary or because they can cause many headaches?
Partly one, partly the other. Every new feature that is added to a language increases the complexity of the language, compilers and programs. In general, unless there is a real motivating need (or the new feature will help in writing simpler, safer programs), features are not added.
As for the particular features you suggest:
1- Member functions for built in types
There is no need, you can do anything you want to do with a member function with a free function for the same cost and with the only difference in user code that the argument to the function is before a . or inside the parenthesis.
The only thing that cannot be done with free functions is dynamic dispatch (polymorphism) but since you cannot derive from those types, you could not have polymorphism either. Then again, to be able to do this you would need 2.
2- Member functions outside class definition.
I understand that you mean extension methods as in C#, where new methods can be added to a type externally. There are very few uses of this feature that are not simple enough to implement without it. Then there are the complexities.
Currently, the compiler sees a single definition of the class and is able to determine all member methods that can be applied to an element of the type. This includes virtual functions, which means that the compiler can at once determine the virtual function table (while the vtable is not standard, all implementations use one). If you could add virtual methods outside of the class definition, different translation units would be seeing different, incompatible views of the type. Dispatching to the 3rd virtual function could be calling foo in one .cpp file but bar in another. Solving this without postponing a big part of the linking stage to the loading of the binary into memory for execution would be almost impossible, and postponing it would mean a significant change in the language model.
If you restrict the feature to non-virtual functions, things get simpler as the calls would be dispatched to the function directly, but nevertheless, even this would imply other levels of complexity. With a separate compilation model as in C++, you would end up having to write headers for the original class and the extension methods, and including both in the translation unit from which you want to use it, which in most cases you can simplify by just declaring those same methods in the original headers as real member methods (or free functions, free functions do form part of the interface of user defined types!)
Additionally, allowing this would mean that simple typos in the code could have unexpected results. Currently the compiler verifies that the definition of a member function has a proper declaration, if this was allowed that check would have to be removed, and a simple typo while writing the name in either the declaration or definition would cause two separate functions rather than a quick to fix compiler error.
3- Operator overloading for non-object types
The language allows for overloading of operators for all user defined types, which includes classes and enumerations. For the rest of the types, there is a set of operators that are already defined with precise semantics that cannot be changed. Again with separate compilation model, it would mean that 1+2 could mean different things in different translation units, in particular the exact combination of includes could change the semantics of the program, and that would cause havoc --you remove a dependency in your header, and that removes an include that contains the overload for const char* + int, which in turn means that the semantics of "Hi" + 2 in code that included your header changes, from the user defined operation to yielding a pointer to the nul terminator of the string. This is really dangerous, because it means that a simple change in one part of a program can render other parts of the program incorrect.
Even for combinations for which there is no current meaning (char* + int*) you can use a regular function to provide the same operation. Remember that you should only overload an operator when in the domain that you are modeling that operation is naturally understood to have that particular semantics, which is why you can overload for user defined types, but pointers are not part of your domain, but rather part of the language, and in the language there is no natural definition of what "Hi" + new int(5) means. Operator overloading has the purpose of making code more readable and in any context for which there is no natural definition, operator overloading has the exact opposite effect.
Because you can write a free function that does the same thing. Would the "member function" be so much more desirable so as to offset the tremendous cost of ratifying this in the standard and having compiler vendors implement it? No.
Outside where exactly?
See 1.