I know that when we create multiple objects of a given class type, multiple copies of the member variables are created. Each object has its own separate set of member variables. Does this work the same way with member functions too? If my class has a lot of functions, do the member functions get duplicated for each object that is created? Does each created object have its own set of the member functions?
class demo {
public:
    int height;
    int width;
    void setheight(int height)
    {
        this->height = height;
    }
    int getArea() const
    {
        return height * width;
    }
    // 100 more member functions.
};
This is just a hypothetical example to prove a point about the C++ compiler. Actually, this is related to what I'm doing in my project. Let's suppose I have a class type with only a few member variables but lots and lots of member functions. If I create multiple objects of that class type, will I have duplication of code, with each object having its own copy of the member functions? In that case, would it be better for me to declare the functions as regular standalone global functions which take the object as a parameter instead, in order to avoid growing the executable?
This is just an implementation detail (the standard doesn't mandate anything particular about it), but on pretty much any implementation, class methods are essentially syntactic sugar for "regular" free functions taking this as a hidden parameter. In other words, your proposed optimization is what the compiler already does.
There's some extra machinery involved for virtual methods, as every virtual method generally "costs" one slot in the vtable of the class (and of all its derived classes), but again, it's an O(1) space cost, not O(n) in the number of instances.
On some implementations there's also a difference in calling convention, e.g. on x86 VC++ methods receive this in ecx rather than on the stack, as they would if they were free functions with this as the first parameter, but that's irrelevant for our discussion.
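A minimal sketch of the point above. The names Demo, getAreaFree, OneVirtual and ManyVirtual are made up for illustration, and the size checks reflect what mainstream compilers do rather than anything the standard guarantees:

```cpp
#include <cassert>

// Hypothetical class: two data members and a few member functions.
struct Demo {
    int height = 0;
    int width = 0;
    void setHeight(int h) { height = h; }
    void setWidth(int w) { width = w; }
    int getArea() const { return height * width; }
};

// Roughly what a compiler turns Demo::getArea into: a single free function
// that receives the object explicitly instead of through a hidden `this`.
int getAreaFree(const Demo* self) { return self->height * self->width; }

// On typical implementations the object stores only its two ints;
// the member functions live once in the code segment.
static_assert(sizeof(Demo) == 2 * sizeof(int), "no per-object function storage");

// Virtual functions usually add one hidden pointer per object,
// no matter how many virtual functions the class declares.
struct OneVirtual {
    virtual ~OneVirtual() = default;
};
struct ManyVirtual {
    virtual ~ManyVirtual() = default;
    virtual void a() {}
    virtual void b() {}
    virtual void c() {}
};

bool sameSizeDespiteMoreVirtuals() {
    return sizeof(OneVirtual) == sizeof(ManyVirtual);
}

// The member call and the explicit-pointer call compute the same thing.
void checkEquivalence() {
    Demo d;
    d.setHeight(3);
    d.setWidth(4);
    assert(d.getArea() == getAreaFree(&d));
}
```

Creating a thousand Demo objects therefore costs a thousand pairs of ints and nothing more.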
We can overload functions by giving them a different number of parameters. For example, functions someFunc() and someFunc(int i) can do completely different things.
Is it possible to achieve the same effect on classes? For example, having one class name but creating one class if a function is not called and a different class if that function is called. For example, if I have a dataStorage class, I want the internal implementation to be a list if only add is called, but want it to be a heap if both add and pop are called.
I am trying to implement this in C++, but I am curious if this is even possible. Examples in other languages would also help. Thanks!
The type of an object must be completely known at the point of definition. The type cannot depend on what is done with the object later.
For the dataStorage example, you could define dataStorage as an abstract class. For example:
struct dataStorage {
    virtual ~dataStorage() = default;
    virtual void add(dataType data) = 0;
    // And anything else necessarily common to all implementations.
};
There could be a "default" implementation that uses a list.
struct dataList : public dataStorage {
    void add(dataType data) override;
    // And whatever else is needed.
};
There could be another implementation that uses a heap.
struct dataHeap : public dataStorage {
    void add(dataType data) override;
    void pop(); // Maybe return `dataType`, if desired
    // And whatever else is needed.
};
Functions that need only to add data would work on references to dataStorage. Functions that need to pop data would work on references to dataHeap. When you define an object, you would choose dataList if the compiler allows it, dataHeap otherwise. (The compiler would not allow passing a dataList object to a function that requires a dataHeap&.) This is similar to what you asked for, except it does require manual intervention. On the bright side, you can use the compiler to tell you which decision to make.
A downside of this approach is that changes can get messy. There is additional maintenance and runtime overhead compared to simply always using a heap (one class, no inheritance). You should do some performance measurements to ensure that the cost is worth it. Sometimes simplicity is the best design, even if it is not optimal in all cases.
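To make the idea concrete, here is one way the pieces could fit together. This is only a sketch: dataType is assumed to be int, the containers (std::list and std::priority_queue) are my choice, and fill and largest are hypothetical helper functions.

```cpp
#include <cassert>
#include <list>
#include <queue>

using dataType = int;  // assumption: the element type is left open above

struct dataStorage {
    virtual ~dataStorage() = default;
    virtual void add(dataType data) = 0;
};

// "Default" implementation backed by a list.
struct dataList : public dataStorage {
    void add(dataType data) override { items.push_back(data); }
    std::list<dataType> items;
};

// Heap-backed implementation that additionally supports pop().
struct dataHeap : public dataStorage {
    void add(dataType data) override { heap.push(data); }
    dataType pop() {
        dataType top = heap.top();
        heap.pop();
        return top;
    }
    std::priority_queue<dataType> heap;
};

// Only adds data, so it accepts any dataStorage.
void fill(dataStorage& storage) {
    storage.add(3);
    storage.add(7);
    storage.add(5);
}

// Needs pop(), so it insists on a dataHeap.
dataType largest(dataHeap& heap) { return heap.pop(); }

// Both kinds of storage can be filled; only the heap can be popped.
void checkStorage() {
    dataList list;
    fill(list);
    assert(list.items.size() == 3);

    dataHeap heap;
    fill(heap);
    assert(largest(heap) == 7);
    // largest(list);  // would not compile: a dataList is not a dataHeap
}
```

The commented-out call is the "compiler tells you which decision to make" part: if it is ever needed, the object has to be declared as a dataHeap.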
Sort of a style question here. Say I have a class A which has to do a sequence of reasonably complex things to its member variable B b
class A {
public:
    void DoStuffOnB(){
        DoThing1();
        DoThing2();
        DoThing3();
    }
private:
    B b;
    void DoThing1(){ /* modify b */ }
    void DoThing2(){ /* modify b */ }
    void DoThing3(){ /* modify b */ }
};
where the DoThings functions only depend on b (or other member variables and some passed parameters). If I want to make those functions re-usable in the future outside of that class, I'm better off writing them as:
class A {
public:
    void DoStuffOnB(){
        DoThing1(b);
        DoThing2(b);
        DoThing3(b);
    }
private:
    B b;
    void DoThing1(B& b){ /* modify b */ }
    void DoThing2(B& b){ /* modify b */ }
    void DoThing3(B& b){ /* modify b */ }
};
and then my DoThing functions can just be copied elsewhere in the future. Am I better off writing the function to take all relevant parameters like that, or should the function only take non-member parameters?
In case the answer is "you should write the function to take all relevant parameters", why would one bother to put it in a class?
When should you use a free function, and when should you use a member function?
Assuming from the context that the "do something on B" functions only operate on the B member and not other state in A then:
If the functions directly manipulate/operate on the private state of B then they should be members of B.
Else they should be free functions.
A member function is a member function because its scope has access to the member variables without having to use reference or pointer syntax. As someone mentioned earlier, this most likely makes the code simpler to write and maintain, so use this approach unless you need a free function that can take the same type of data from different classes. In that case you have to pass the data by reference or use pointers to give the function access to the variable.
Should you pass member variables within member functions?
There is no need to pass member variables to member functions, since the member functions have access to all the data members.
It's similar to free standing functions accessing static file local variables. The functions have access to the statically declared variables in the same translation unit.
When should you use a freestanding function and when should you use a member function?
In general, use a member function when the functionality is associated with the object.
Use a freestanding function when
the class has static members
or the functionality is associated with a class and doesn't use static members.
You can also use freestanding functions when the same functionality can apply to different objects.
For example, let's talk serialization or outputting of an object.
One can define a method, load_from_buffer() in an object, but it won't work with POD types.
However, if a function load_from_buffer() is made freestanding, it can be overloaded for different types, such as int, char, double and with templates, an overload can be made to call objects derived from an interface.
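A rough sketch of that overloading idea follows. The buffer layout (raw bytes copied with memcpy), the Loadable interface and the Point class are all assumptions made up for this example, and the interface overload is a plain function taking a base reference rather than a template:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Overloads for built-in types: copy the raw bytes, report how many were consumed.
std::size_t load_from_buffer(const unsigned char* buf, int& out) {
    std::memcpy(&out, buf, sizeof out);
    return sizeof out;
}

std::size_t load_from_buffer(const unsigned char* buf, double& out) {
    std::memcpy(&out, buf, sizeof out);
    return sizeof out;
}

// Hypothetical interface for class types that know how to load themselves.
struct Loadable {
    virtual ~Loadable() = default;
    virtual std::size_t load(const unsigned char* buf) = 0;
};

// One overload covers every class derived from the interface.
std::size_t load_from_buffer(const unsigned char* buf, Loadable& out) {
    return out.load(buf);
}

// Example class: loads its fields by reusing the built-in overloads.
struct Point : Loadable {
    int x = 0;
    double y = 0.0;
    std::size_t load(const unsigned char* buf) override {
        std::size_t used = load_from_buffer(buf, x);
        return used + load_from_buffer(buf + used, y);
    }
};

// Round trip: write an int and a double into a buffer, then load a Point from it.
void checkLoad() {
    unsigned char buf[sizeof(int) + sizeof(double)];
    int x = 42;
    double y = 1.5;
    std::memcpy(buf, &x, sizeof x);
    std::memcpy(buf + sizeof x, &y, sizeof y);

    Point p;
    std::size_t used = load_from_buffer(buf, p);
    assert(used == sizeof buf);
    assert(p.x == 42 && p.y == 1.5);
}
```

The caller writes load_from_buffer(buf, value) for an int, a double, or a Point alike, which a member function could not offer for the built-in types.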
Summary
Prefer to use member methods when they require access to data members of an object. Use static member methods when they access static data members or there is a need for the functionality without an instance of an object (think encapsulation). Freestanding functions also provide the capability of functionality to different objects based on function overloading.
There are no hard rules, just use what you think will be easiest to maintain, assist in correctness and robustness and speed up development.
Just to confuse people, here is an article by Scott Meyers: "How Non-Member Functions Improve Encapsulation".
Remember, in order for a free standing function to access data members of an object, the data members must be given public access or the function needs to be a friend of the object. The classic example is overloading the stream operators for a class.
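As an illustration of the stream-operator case, here is a small sketch. The Temperature class and the toText helper are invented for the example:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class with a private data member.
class Temperature {
public:
    explicit Temperature(double celsius) : celsius_(celsius) {}

    // operator<< must be a non-member because its left operand is the stream,
    // so it is declared a friend to let it read the private member.
    friend std::ostream& operator<<(std::ostream& os, const Temperature& t) {
        return os << t.celsius_ << " C";
    }

private:
    double celsius_;
};

// Helper that formats a Temperature through the overloaded operator.
std::string toText(const Temperature& t) {
    std::ostringstream out;
    out << t;
    return out.str();
}

void checkFormatting() {
    assert(toText(Temperature(21.5)) == "21.5 C");
}
```

Without the friend declaration, operator<< would need a public accessor on Temperature instead.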
What is the advantage of having a free function (in anonymous namespace and accessible only in a single source file) and sending all variables as parameters as opposed to having a private class member function free of any parameters and accessing member variables directly?
header:
class A {
    int myVariable;
    void DoSomething() {
        myVariable = 1;
    }
    int SomeFunction();
};
source:
namespace {
    void DoSomething2(int &a) {
        a = 1;
    }
}
int A::SomeFunction() {
    DoSomething2(myVariable); // calling free function
    DoSomething(); // calling member function
    return myVariable; // a non-void function must return a value
}
If you prefer making them members, then what if I have a case where I first call a function that is not accessing any member variables, but that function calls another function which is accessing a member. Should they both be member functions or free?
see this question: Effective C++ Item 23 Prefer non-member non-friend functions to member functions
and also C++ Member Functions vs Free Functions
You should prefer free functions, to the extent that it promotes loose coupling.
Consider making it a member function only if it works on the guts of your class and you consider it really tied to your class.
This is a point made in the book C++ Coding Standards: 101 Rules, Guidelines, and Best Practices, which says to prefer free functions and static functions over member functions.
Although this may be considered opinion-based, it keeps classes small and separates concerns.
This answer states: "the reason for this rule is that by using member functions you may rely too much on the internals of a class by accident."
One advantage of a non-member function in a source file is similar to the benefits of the Pimpl idiom: clients using your headers do not have to recompile if you change your implementation.
// widget.h
class Widget
{
public:
    void meh();
private:
    int bla_;
};

// widget.cpp
namespace {
    void helper(Widget* w) // clients will never know about this
    { /* yadayada */ }
}

void Widget::meh()
{ helper(this); }
Of course, when written like this, helper() can only use the public interface of Widget, so you gain little. You can put a friend declaration for helper() inside Widget but at some point you better switch to a full-blown Pimpl solution.
The primary advantage of free functions vs member functions is that it helps decouple the interface from the implementation. For example, std::sort doesn't need to know anything about the underlying container on which it operates, just that it's given access to a container (through iterators) that provide certain characteristics.
In your example the DoSomething2 method doesn't do much to decrease coupling since it still has to access the private member by having it passed by reference. It's almost certainly more obvious to just do the state mutation in the plain DoSomething method instead.
When you can implement a task or algorithm in terms of a class's public interface then that makes it a good candidate to make a free function. Scott Meyers summarizes a reasonable set of rules here: http://cpptips.com/nmemfunc_encap
Question about C++:
Why is the minimal number of data members in a class definition zero?
I think it should be one, i.e. the pointer to the virtual table defined by the compiler.
Thanks a lot.
It is often useful to have a class with no data members for use in inheritance hierarchies.
A base class may contain nothing but a few typedefs that are shared by multiple classes. For example, the std::iterator class template (deprecated since C++17) just defines the standard member types so that you don't need to define them in each iterator class.
An interface class typically has no data members, only virtual member functions.
A virtual table has nothing to do with the data members of a class.
I’m working on a library that sometimes even uses types that – gasp! – aren’t even defined, much less have data members!
That is, the type is incomplete, such as
struct foobar;
This is used to create an unambiguous name, nothing more.
So what is this useful for? Well, we use it to create distinct tags, using an additional (empty, but fully defined) type:
template <typename TSpec>
struct Tag {};
Now you can create distinct tags like so (yes, we can declare the type inside the template argument list, we do not need to declare it separately):
using ForwardTag = Tag<struct Forward_>;
using RandomAccessibleTag = Tag<struct RandomAccessible_>;
These in turn can be used to disambiguate specialized overloads. Many STL implementations do something similar:
template <typename Iter>
void sort(Iter begin, Iter end, RandomAccessibleTag const&) …
Strictly speaking, the indirect route via a common Tag class template is redundant, but it was a useful trick for the sake of documentation.
All this just to show that a (strict, static) type system can be used in many different ways than just to bundle and encapsulate data.
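A compact, compilable version of the tag trick might look like this. The Tag template and the two aliases are taken from above, while advanceBy and thirdOf are hypothetical names for the example:

```cpp
#include <cassert>
#include <list>
#include <vector>

template <typename TSpec>
struct Tag {};

// Forward_ and RandomAccessible_ are never defined; they only name the tags.
using ForwardTag = Tag<struct Forward_>;
using RandomAccessibleTag = Tag<struct RandomAccessible_>;

// Generic fallback: step forward one element at a time.
template <typename Iter>
Iter advanceBy(Iter it, int n, ForwardTag const&) {
    while (n-- > 0) ++it;
    return it;
}

// Specialized overload: jump directly, which needs iterator arithmetic.
template <typename Iter>
Iter advanceBy(Iter it, int n, RandomAccessibleTag const&) {
    return it + n;
}

// The caller selects an overload purely by passing an (empty) tag object.
int thirdOf(const std::vector<int>& v) {
    return *advanceBy(v.begin(), 2, RandomAccessibleTag{});
}

int thirdOf(const std::list<int>& l) {
    return *advanceBy(l.begin(), 2, ForwardTag{});
}

void checkTags() {
    assert(thirdOf(std::vector<int>{1, 2, 3}) == 3);
    assert(thirdOf(std::list<int>{4, 5, 6}) == 6);
}
```

The tag objects carry no data at all; only their types matter for overload resolution.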
Well, actually C++ mandates that all classes must occupy some space (You need to be able to generate a pointer to that class). They only need a pointer to a vtable though, if the class is polymorphic. There's no reason for a vtable at all in a monomorphic class.
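A few checks that illustrate this, as a sketch only. The type names are made up, and the results marked "typically" describe common compilers rather than guarantees:

```cpp
#include <cassert>

struct Empty {};  // no data members, no virtual functions

struct Interface {  // polymorphic, still no declared data members
    virtual ~Interface() = default;
    virtual void run() = 0;
};

struct Holder : Empty {  // derives from an empty base
    int value;
};

// sizeof never yields 0 for a complete class type.
static_assert(sizeof(Empty) >= 1, "every object occupies at least one byte");

// Two distinct objects of an empty class still have distinct addresses.
bool distinctAddresses() {
    Empty a, b;
    return &a != &b;
}

// Typically true: the polymorphic class carries a hidden vtable pointer.
bool interfaceHoldsPointer() { return sizeof(Interface) >= sizeof(void*); }

// Typically true: the empty base adds nothing to the derived class
// (the empty base optimization).
bool emptyBaseIsFree() { return sizeof(Holder) == sizeof(int); }

void checkSizes() {
    assert(distinctAddresses());
}
```

So "zero data members" and "zero bytes" are different things: the compiler adds whatever it needs, and that is not counted as a data member of the class.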
Another use of a class with no data-members is for processing data from other sources. Everything gets passed into the class at runtime through pointers or references and the class operates on the data but stores none of it.
I hadn't really thought about this until I saw it done in a UML class I took. It has its uses, but it does usually create coupled classes.
Because classes are not structures. Their purpose, contrary to popular belief, is not to hold data.
For instance, consider a validator base class that defines a virtual method which passes a string to validate, and returns a bool.
An instance of a validator may refuse strings which have capital letters in them. This is a perfect example on when you should use a class, and by the definition of what it does, there's clearly no reason to have any member variables.
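The validator described above could be sketched like this. The names Validator, NoCapitalsValidator and accepts are assumptions chosen for the example:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Base class: pure behaviour, no data members.
struct Validator {
    virtual ~Validator() = default;
    virtual bool validate(const std::string& text) const = 0;
};

// Refuses any string that contains a capital letter. Also stateless.
struct NoCapitalsValidator : Validator {
    bool validate(const std::string& text) const override {
        for (unsigned char c : text) {
            if (std::isupper(c)) return false;
        }
        return true;
    }
};

// Client code depends only on the interface.
bool accepts(const Validator& validator, const std::string& text) {
    return validator.validate(text);
}

void checkValidator() {
    NoCapitalsValidator v;
    assert(accepts(v, "hello"));
    assert(!accepts(v, "Hello"));
}
```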
Question about C++: why is the minimal number of data members in a class definition zero?
It is zero because you have various cases of classes that should have no members:
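You can implement traits classes containing only static functions, for example. These classes are the equivalent of a namespace that is also recognizable as a type. That means you can instantiate a template on the class and make sure the implementation of that template uses the functions within the class. Such a traits class carries no state. (Strictly speaking, sizeof never yields zero for a class type, so an empty class usually reports a size of 1, but it holds no data and typically takes up no space at all when used as a base class.)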
Example:
class SingleThreadedArithmetic
{
public:
    static long Increment(long i) { return ++i; }
    // other arithmetic operations implemented with no thread safety
}; // no state and no virtual members -> an empty class

class MultiThreadedArithmetic
{
public:
    // InterlockedIncrement is the Win32 API (<windows.h>); it takes a pointer to a LONG.
    static long Increment(long i) { return InterlockedIncrement(&i); }
    // other arithmetic operations implemented with thread safety in mind
}; // no state and no virtual members -> an empty class

template<class ThreadingModel> class SomeClass
{
public:
    void SomeFunction()
    {
        // some operations
        i = ThreadingModel::Increment(i);
        // some other operations
    }
private:
    long i = 0;
};

typedef SomeClass<SingleThreadedArithmetic> SomeClassST;
typedef SomeClass<MultiThreadedArithmetic> SomeClassMT;
You can define distinct class categories by implementing "tag" classes: classes that hold no interface or data, but are just used to differentiate between separate "logical" types of derived classes. The differentiation can be used in normal OOP code or in templated code.
These "tag" classes are empty as well. See the iterator tags implementation in your current STL library for an example.
I am sure there are other cases where you can use such empty classes.
We know that C++ doesn't allow templated virtual function in a class. Anyone understands why such restriction?
Short answer: Virtual functions are about not knowing who calls whom until run time, when a function is picked from an already compiled set of candidate functions. Function templates, OTOH, are about creating an arbitrary number of different functions (using types which might not even have been known when the callee was written) at compile time from the callers' sides. That just doesn't match.
Somewhat longer answer: Virtual functions are implemented using an additional indirection (the Programmer's General All-Purpose Cure), usually implemented as a table of function pointers (the so-called virtual function table, often abbreviated "vtable"). If you're calling a virtual function, the run-time system will pick the right function from the table. If there were virtual function templates, the run-time system would have to find the address of an already compiled template instance with the exact template parameters. Since the class' designer cannot provide an arbitrary number of function template instances created from an unlimited set of possible arguments, this cannot work.
How would you construct the vtable? Theoretically you could have an infinite number of versions of your templated member and the compiler wouldn't know what they might be when it creates the vtable.
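To see why the table has to be fixed up front, here is a hand-rolled imitation of the mechanism. It is a simplified sketch, not what any particular compiler emits, and Shape, VTable, rectArea, triArea and area are invented names:

```cpp
#include <cassert>

struct Shape;
using AreaFn = int (*)(const Shape*);

// One slot per virtual function. The layout is fixed when the class is compiled,
// so every caller can use the same constant offset.
struct VTable {
    AreaFn area;
};

// Each object carries a pointer to the table of its dynamic type.
struct Shape {
    const VTable* vptr;
    int a;
    int b;
};

int rectArea(const Shape* s) { return s->a * s->b; }
int triArea(const Shape* s) { return s->a * s->b / 2; }

const VTable rectVTable = { rectArea };
const VTable triVTable = { triArea };

// A "virtual call": fetch the table, read the known slot, call through it.
int area(const Shape* s) { return s->vptr->area(s); }

// A virtual member template would need one slot per instantiation, and the
// set of instantiations is unknown when VTable is laid out.

void checkDispatch() {
    Shape rect = { &rectVTable, 3, 4 };
    Shape tri = { &triVTable, 3, 4 };
    assert(area(&rect) == 12);
    assert(area(&tri) == 6);
}
```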
The other answers have already mentioned that virtual functions are usually handled in C++ by storing in the object a pointer (the vptr) to a table. This table (the vtable) contains pointers to the functions to use for the virtual members, as well as some other things.
The other part of the explanation is that templates are handled in C++ by code expansion. This allows explicit specialization.
Now, some languages mandate (Eiffel; I think this is also the case for Java and C#, but my knowledge of them is not good enough to be authoritative) or allow (Ada) a shared handling of genericity. Such languages don't have explicit specialization, but they could allow virtual template functions, put templates in libraries, and reduce code size.
You can get the effect of shared genericity by using a technique called type erasure. This is doing manually what compilers for shared genericity language are doing (well, at least some of them, depending on the language, other implementation techniques could be possible). Here is a (silly) example:
#include <string.h>
#include <iostream>

// Declared up front so that the calls inside the templates below can find them.
int getint(double d);
int getint(char const* s);

#ifdef NOT_CPP
// What one would like to write, but C++ does not allow.
class C
{
public:
    virtual template<typename T> int getAnInt(T const& v) {
        return getint(v);
    }
};
#else
class IntGetterBase
{
public:
    virtual int getTheInt() const = 0;
};

template<typename T>
class IntGetter: public IntGetterBase
{
public:
    IntGetter(T const& value) : myValue(value) {}
    virtual int getTheInt() const
    {
        return getint(myValue);
    }
private:
    T const& myValue;
};

template<typename T>
IntGetter<T> makeIntGetter(T const& value)
{
    return IntGetter<T>(value);
}

class C
{
public:
    virtual int getAnInt(IntGetterBase const& v)
    {
        return v.getTheInt();
    }
};
#endif

int getint(double d)
{
    return static_cast<int>(d);
}

int getint(char const* s)
{
    return static_cast<int>(strlen(s));
}

int main()
{
    C c;
    std::cout << c.getAnInt(makeIntGetter(3.141)) + c.getAnInt(makeIntGetter("foo")) << '\n';
    return 0;
}
I think it's so that compilers can generate vtable offsets as constants (whereas references to non-virtual functions are fixups).
When you compile a call to a non-virtual function (an instance of a function template, for example), the compiler usually just puts a note in the binary, effectively telling the linker "please replace this note with a pointer to the correct function". The static linker does something similar, and eventually the loader fills in the value once the code has been loaded into memory and its address is known. This is called a fixup, because the loader "fixes up" the code by filling in the numbers it needs. Note that to generate the fixup, the compiler doesn't need to know what other functions exist in the class, it just needs to know the munged name of the function it wants.
However with virtual functions, the compiler usually emits code saying "get the vtable pointer out of the object, add 24 to it, load a function address, and call it". In order to know that the particular virtual function you want is at offset 24, the compiler needs to know about all the virtual functions in the class, and what order they're going to appear in the vtable. As things stand, the compiler does know this, because all the virtual functions are listed right there in the class definition. But in order to generate a virtual call where there are templated virtual functions, the compiler would need to know at the point of the call, what instantiations there are of the function template. It can't possibly know this, because different compilation units might instantiate different versions of a function template. So it couldn't work out what offset to use in the vtable.
Now, I suspect that a compiler could support virtual function templates by emitting, instead of a constant vtable offset, an integer fixup. That is, a note saying "please fill in the vtable offset of the virtual function with this munged name". Then the static linker might fill in the actual value once it knows what instantiations are available (at the point where it removes duplicate template instantiations in different compilation units). But that would impose a serious burden of work on the linker to figure out vtable layouts, which currently the compiler does by itself. Templates were deliberately specified to make things easier on implementers, in the hope that they might actually appear in the wild some time before C++0x...
So, I speculate that some reasoning along these lines led the standards committee to conclude that virtual function templates, even if implementable at all, were too difficult to implement and therefore could not be included in the standard.
Note that there's a fair bit of speculation in the above even before I try to read the minds of the committee: I am not the writer of a C++ implementation, and nor do I play one on television.