Where is the reference to the current object stored in memory? - c++

I have a simple question. I know that in a compiled program, calling a function sets up a stack frame holding the arguments, space for local variables, the return address and any saved registers.
But in an object-oriented language like C++, where does the compiler store the reference to the current object? Does object->instanceMethod() store the object pointer like an argument on the call stack?
I know the question is general. Thanks for any answers.

It's implementation-defined, but in practice you will find that most (all?) C++ compilers generate code which passes the this pointer as a hidden first argument to the function, so you can access it without explicitly specifying it in the method signature.

In C++, when a member function is called the pointer to the instance on which it will operate (i.e. what will be this inside the function) is implicitly passed alongside the other function arguments/parameters. Actually, different systems use different conventions, so some number of such parameters could be packed into registers and never placed on the stack (this tends to be faster), but your conception is basically sound.
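As a rough illustration of the hidden-argument idea, the sketch below pairs a member function with a hand-written free function that takes the object pointer explicitly. The names Counter, add, counter_add and check_counter are made up for this example, and whether the pointer actually travels in a register or on the stack is up to the compiler and its calling convention.

```cpp
#include <cassert>

struct Counter {
    int value = 0;

    // Written without an explicit object parameter; `this` is supplied implicitly.
    int add(int n) {
        this->value += n;
        return this->value;
    }
};

// Hypothetical hand-written equivalent: the object pointer is an ordinary
// first parameter, which is roughly what most compilers generate for add().
int counter_add(Counter* self, int n) {
    self->value += n;
    return self->value;
}

// Sanity check that both spellings behave the same way.
void check_counter() {
    Counter a, b;
    assert(a.add(5) == counter_add(&b, 5));
    assert(a.value == b.value);
}
```

Calling a.add(5) and counter_add(&b, 5) leaves both objects in the same state, which is the sense in which this acts as an extra argument.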

Related

Is there a reason some functions don't take a void*?

Many functions accept a function pointer as an argument. atexit and call_once are excellent examples. If these higher level functions accepted a void* argument, such as atexit(&myFunction, &argumentForMyFunction), then I could easily wrap any functor I pleased by passing a function pointer and a block of data to provide statefulness.
As is, there are many cases where I wish I could register a callback with arguments, but the registration function does not allow me to pass any arguments through. atexit only accepts one argument: a function taking 0 arguments. I cannot register a function to clean up after my object, I must register a function which cleans up after all objects of a class, and force my class to maintain a list of all objects needing cleanup.
I always viewed this as an oversight; there seemed to be no valid reason why you wouldn't allow a measly 4 or 8 byte pointer to be passed along, unless you were on an extremely limited microcontroller. I always assumed they simply didn't realize how important that extra argument could be until it was too late to redefine the spec. In the case of call_once, the POSIX version accepts no arguments, but the C++11 version accepts a functor (which is virtually equivalent to passing a function and an argument, only the compiler does some of the work for you).
Is there any reason why one would choose not to allow that extra argument? Is there an advantage to accepting only "void functions with 0 arguments"?
I think atexit is just a special case, because whatever function you pass to it is supposed to be called only once. Therefore whatever state it needs to do its job can just be kept in global variables. If atexit were being designed today, it would probably take a void* in order to enable you to avoid using global variables, but that wouldn't actually give it any new functionality; it would just make the code slightly cleaner in some cases.
For many APIs, though, callbacks are allowed to take additional arguments, and not allowing them to do so would be a severe design flaw. For example, pthread_create does let you pass a void*, which makes sense because otherwise you'd need a separate function for each thread, and it would be totally impossible to write a program that spawns a variable number of threads.
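The pattern discussed here, a function pointer plus a void* that is handed back to the callback, might be sketched as follows. invoke_later is a hypothetical stand-in for a C-style registration function (it is not a real library call), and trampoline and invoke_functor are illustrative helper names.

```cpp
#include <cassert>

// Hypothetical C-style API: takes a callback plus an opaque pointer that it
// hands back to the callback untouched.
using Callback = void (*)(void* context);

void invoke_later(Callback fn, void* context) {
    // A real API would store the pair and call it later; this sketch calls at once.
    fn(context);
}

// Trampoline: recovers the functor type from the void* and calls the functor.
template <typename F>
void trampoline(void* context) {
    (*static_cast<F*>(context))();
}

// Wraps any non-const callable object so it can go through the C-style interface.
// The functor must stay alive until the callback has run.
template <typename F>
void invoke_functor(F& f) {
    invoke_later(&trampoline<F>, &f);
}

// Small self-check: a stateful lambda is called through the void* interface.
void check_invoke() {
    int hits = 0;
    auto bump = [&hits] { ++hits; };
    invoke_functor(bump);
    assert(hits == 1);
}
```

With only the function pointer and no context pointer, the lambda's captured state would have nowhere to travel, which is the limitation the question describes for atexit.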
Quite a number of the interfaces that take function pointers but lack a pass-through argument simply come from a different time. However, their signatures can't be changed without breaking existing code. It is sort of a misdesign, but that's easy to say in hindsight. The overall programming style has moved on to include limited uses of functional programming within generally non-functional programming languages. Also, at the time many of these interfaces were created, storing any extra data even on "normal" computers implied an observable extra cost: aside from the extra storage used, the extra argument also needs to be passed even when it isn't used. Sure, atexit() is hardly bound to be a performance bottleneck, seeing that it is called just once, but if you passed an extra pointer everywhere you'd surely also have one on qsort()'s comparison function.
Specifically for something like atexit() it is reasonably straightforward to use a custom global object with which function objects to be invoked upon exit are registered: just register a function with atexit() that calls all of the functions registered with said global object. Also note that atexit() is only guaranteed to support registering at least 32 functions, although implementations may support more. It seems ill-advised to use it as a registry for individual object clean-up functions rather than for a single function that calls the object clean-up functions, as other libraries may need to register functions, too.
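One possible shape for such a global registry is sketched below. The names exit_actions, run_exit_actions and at_exit are invented for this example; only std::atexit is a real library function. The registry is deliberately constructed before the handler is registered so that it is still alive when the handler runs at exit.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical global registry of clean-up actions.
std::vector<std::function<void()>>& exit_actions() {
    static std::vector<std::function<void()>> actions;
    return actions;
}

// Runs the registered actions in reverse order of registration and empties the registry.
void run_exit_actions() {
    auto& actions = exit_actions();
    while (!actions.empty()) {
        std::function<void()> action = std::move(actions.back());
        actions.pop_back();
        action();
    }
}

// Registers an action; a single atexit() slot is claimed on first use.
void at_exit(std::function<void()> action) {
    auto& actions = exit_actions();  // construct the registry before registering the handler
    static const bool installed = (std::atexit(run_exit_actions) == 0);
    assert(installed && "atexit registration failed");
    (void)installed;
    actions.push_back(std::move(action));
}
```

A caller would write something like at_exit([&log] { log.flush(); }), where log is whatever object needs cleaning up, and must make sure the captured object outlives the call at exit. However many actions are registered this way, only one of the guaranteed atexit() slots is used.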
That said, I can't imagine why atexit() is particularly useful in C++, where objects are automatically destroyed upon program termination anyway. Of course, this approach assumes that all objects are somehow held, but that's normally necessary anyway in some form or other and is typically done using appropriate RAII objects.

Use reference argument in recursive function in C++

In a recursive function in C++, one of the arguments is of reference type. I just want to know what happens during the recursive calls.
Without a reference type, I believe a new variable is created on the stack every time the function is called recursively. So with the reference, what is created on the stack each time is some kind of pointer to the address of the original variable where it was declared, right?
So by using a reference in such a scenario, I believe we can sometimes save some memory.
Yes, you've got the right idea. Note, of course, that you only save memory if the parameter type is larger than a pointer. A reference to an integer (or maybe even a double) won't save any memory on the stack.
Usually parameter values change during recursion. You can't simply share those across all levels.
Furthermore, when a function is not inlined (and recursion interferes with inlining), passing an argument by reference costs as much space as a pointer.
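To make the points above concrete, here is a small sketch with a made-up function collect_evens. The two vectors are shared by every recursion level through references, while index changes from level to level and is therefore passed by value. Whether a reference really occupies a pointer-sized stack slot depends on the compiler and its optimizations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: `input` and `evens` are passed by reference, so every
// recursion level refers to the caller's two vectors instead of copying them.
// `index` is passed by value because each level needs its own value.
void collect_evens(const std::vector<int>& input, std::size_t index,
                   std::vector<int>& evens) {
    assert(index <= input.size());  // precondition: index is within range
    if (index == input.size()) return;
    if (input[index] % 2 == 0) evens.push_back(input[index]);
    collect_evens(input, index + 1, evens);
}
```

Had the vectors been passed by value, each level would copy them, and anything appended to evens at a deeper level would be lost on return.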

pthread_create: Passing argument by value

I wonder why we cannot pass objects by value to the functions on which we create threads.
Is there a logical reason behind it?
Would it be harmful if the language allowed passing by value?
pthread is a C-style interface. To allow more flexibility than "pass an integer", it has to be a pointer. A void * is the most flexible way to pass arbitrary things in C. In C, you can of course pass a struct by value, but the struct type needs to be known to both the source and the destination function at compile time (and be the same every time, so we can't use struct X in one of our threads and struct Y in another).
In C++ we can of course use classes and templates to allow almost anything to be passed to almost any type of function.
The C++ 11 std::thread allows you to use various C++ style things to overcome the "C-ness" of pthreads (and subject to an available implementation for the target system, use threads without pthreads).
[This is not unique to pthreads. Both OS/2 and Windows thread implementations take a void * as the argument to the thread function]
POSIX threads is a C API. C does not provide language facilities like copy constructors, so it is not possible to copy an arbitrary object by value without additional information (i.e. passing in functions that are aware of the type and can do the job of allocating memory and copying the data). However, that API would be over-complicated for no good reason.
That being said, you can pass any object by value as long as its size is not greater than sizeof(void *).
Since you have tagged your question as C++: C++ does allow you to pass a function with as many arguments as you want through variadic templates. See std::thread for more details.
The argument to pthread_create is typed as a pointer, to be as flexible as possible, but that doesn't mean you can't pass an int.
Just cast it back to an int in the start_routine.
As long as the passed-by-value argument is no larger than a pointer you should be OK.
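The cast described above might look like the following sketch. start_routine has the signature pthread_create expects, but this example only shows the packing and unpacking and does not create a thread. pack_int and unpack_int are illustrative names, and the integer-to-pointer round trip is implementation-defined, although it works on common platforms where std::intptr_t is available.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Packs an int into a void* so it can be given to an API such as pthread_create.
// The pointer is only a carrier for the value and must never be dereferenced.
void* pack_int(int value) {
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(value));
}

// Recovers the int on the receiving side.
int unpack_int(void* carrier) {
    const std::intptr_t v = reinterpret_cast<std::intptr_t>(carrier);
    assert(v >= INT_MIN && v <= INT_MAX);  // the carrier must hold an int-sized value
    return static_cast<int>(v);
}

// Start routine with the signature pthread_create expects.
// It doubles the value it receives and returns it the same way, purely for illustration.
void* start_routine(void* arg) {
    const int id = unpack_int(arg);
    return pack_int(id * 2);
}
```

With a real thread, pack_int(n) would be passed as the last argument of pthread_create, and the value returned by start_routine would be collected through pthread_join.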

C++: How do I decide if to pass params by ref or by value?

With C++, how do I decide if I should pass an argument by value or by reference/pointer? (Tell me the answer for both 32 and 64 bits.) Let's take A. Is passing two 32-bit values more, less or equal work compared to passing a pointer to a 32-bit value?
B seems to me like something I should always pass by value. C I think I should pass by value, but someone told me (however, I haven't seen proof) that processors don't handle values that aren't their native bit size well, so it is more work. So if I were passing them around, would it be more work to pass by value, making by ref faster? Finally, I threw in an enum. I think enums should always be passed by value.
Note: When I say by ref I mean a const reference or pointer (can't forget the const...)
struct A { int a, b; };
struct B { int a; };
struct C { char a, b; };
enum D { a, b, c };
void fn(T a);
Now tell me the answer if I were pushing the parameters many times and the code doesn't use a tail call? (Let's say the values aren't used until 4 or so calls deep.)
Forget the stack size. You should pass by reference if you want to change it, otherwise you should pass by value.
Preventing the sort of bugs introduced by allowing functions to change your data unexpectedly is far more important than a few bytes of wasted stack space.
If stack space becomes a problem, stop using so many levels (such as replacing a recursive solution with an iterative one) or expand your stack. Four levels of recursion isn't usually that onerous, unless your structures are massive or you're operating in the embedded world.
If performance becomes a problem, find a faster algorithm :-) If that's not possible, then you can look at passing by reference, but you need to understand that it's breaking the contract between caller and callee. If you can live with that, that's okay. I generally can't :-)
The intent of the value/reference dichotomy is to control what happens to the thing you pass as a parameter at the language level, not to fiddle with the way an implementation of the language works.
I pass all parameters by reference for consistency, including builtins (of course, const is used where possible).
I did test this in performance critical domains -- worst case loss compared to builtins was marginal. Reference can be quite a bit faster, for non-builtins, and when the calls are deep (as a generalization). This was important for me as I was doing quite a bit of deep TMP, where function bodies were tiny.
You might consider breaking that convention if you're counting instructions, the hardware is register-starved (e.g. embedded), or if the function is not a good candidate for inlining.
Unfortunately, the question you ask is more complex than it appears -- the answer may vary greatly by your platform, ABI, calling conventions, register counts, etc.
A lot depends on your requirements, but best practice is to pass by reference as it reduces the memory footprint.
If you pass large objects by value, a copy is made in memory and the copy constructor is called to make it.
So it will take more machine cycles and also, if you pass by value, changes are not reflected in the original object.
So try passing them by reference.
Hope this has been helpful to you.
Regards, Ken
First, references and pointers aren't the same.
Pass by pointer
Pass parameters by pointers if any/some of these apply:
The passed element could be null.
The resource is allocated inside the called function and the caller should be responsible for freeing such a resource. Remember in this case to provide a free() function for that resource.
The value is of a variable type, for example void*: its type is determined at runtime or depends on the usage pattern (or hides an implementation, e.g. a Win32 HANDLE), such as a thread procedure argument. (Here favor C++ templates and std::function, and use pointers for this purpose only if your environment does not permit otherwise.)
Pass by reference
Pass parameters by reference if any/some of these apply:
Most of the time. (prefer passing by const reference)
If you want the modifications to the passed arguments to be visible to the caller. (unless const reference is used).
If the passed argument is never null.
If you know the type of the passed argument and you have control over the function's signature.
Pass by copy
Pass a copy if any/some of these apply:
Generally try to avoid this.
If you want to operate on a copy of the passed argument. i.e you know that the called function would create a copy anyway.
With primitive types smaller than the system's pointer size - as it makes no performance/memory difference compared to a const ref.
This is tricky - when you know that the type implements a move constructor (such as std::string in C++11). It then looks as if you're passing by copy.
Any of these three lists could go on longer, but these are, I would say, the basic rules of thumb.
Your complete question is a bit unclear to me, but I can answer when you would use passing by value or by reference.
When passing by value, you have a complete copy of the parameter into the call stack. It's like you're making a local variable in the function call initialized with whatever you passed into it.
When passing by reference, you... well, pass by reference. The main difference is that you can modify the external object.
There is the benefit of reducing memory load for large objects when passing by reference. For basic data types (32-bit or 64-bit integers, for example), the performance difference is negligible.
Generally, if you're going to work in C/C++ you should learn to use pointers. Objects passed as parameters will almost always be passed via a pointer (vs reference). One of the few places where you absolutely must use a reference is the copy constructor. You'll want to use references in the operators as well, but it's not required.
Copying objects by value is usually a bad idea - more CPU to do the constructor function; more memory for the actual object. Use const to prevent the function modifying the object. The function signature should tell the caller what might happen to the referenced object.
Things like int, char, pointers are usually passed by value.
As to the structures you outlined, passing by value will not really matter. You need to do profiling to find out, but in the grand scheme of a program you'd be better off looking elsewhere to improve performance in terms of CPU and/or memory.
I would consider whether you want value or reference semantics before you go worrying about optimizations. Generally you would pass by reference if you want the method you are calling to be able to modify the parameter. You can pass a pointer in this case, like you would in C, but idiomatic C++ tends to use references.
There is no rule that says that small types or enums should always be passed by value. There is plenty of code that passes int& parameters, because they rely on the semantics of passing by reference. Also, you should keep in mind that for any relatively small data type, you won't notice a difference in speed between passing by reference and by value.
That said, if you have a very large structure, you probably don't want to make lots of copies of it. This is where const references are handy. Do keep in mind though that const in C++ is not strictly enforced (even if it's considered bad practice, you can always const_cast it away). There is no reason to pass a const int& over an int, although there is a reason to pass a const ClassWithManyMembers& over a ClassWithManyMembers.
All of the structs that you listed I would say are fine to pass by value if you are intending them to be treated as values. Consider that if you call a function that takes one parameter of type struct Rectangle{int x, y, w, h}, this is the same as passing those 4 parameters independently, which is really not a big deal. Generally you should be more worried about the work that the copy constructor has to do - for example, passing a vector by value is probably not such a good idea, because it will have to dynamically allocate memory and iterate through a list whose size you don't know, and invoke many more copy constructors.
While you should keep all this in mind, a good general rule is: if you want reference semantics, pass by reference. Otherwise, pass intrinsics by value, and other things by const reference.
Also, C++11 introduced r-value references which complicate things even further. But that's a different topic.
These are the rules that I use:
for native types:
by value when they are input arguments
by non-const reference when they are mandatory output arguments
for structs or classes:
by const reference when they are input arguments
by non-const reference when they are output arguments
for arrays:
by const pointer when they are input arguments (const applies to the data, not the pointer here, i.e. const TYPE *)
by pointer when they are output arguments (const applies to the data, not the pointer)
I've found that there are very few times that require making an exception to the above rules. The one exception that comes to mind is for a struct or class argument that is optional, in which case a reference would not work. In that case I use a const pointer (input) or a non-const pointer (output), so that you can also pass 0.
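The rules listed above can be sketched as a set of signatures. Account and every function name here are hypothetical; the point is only the parameter-passing style of each signature, not the bodies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Account {  // hypothetical class type
    std::string owner;
    long balance = 0;
};

// Native type, input: by value.
long doubled(long amount) { return amount * 2; }

// Native type, mandatory output: by non-const reference.
void read_balance(const Account& account, long& out) { out = account.balance; }

// Class type, input: by const reference.
bool is_overdrawn(const Account& account) { return account.balance < 0; }

// Class type, output: by non-const reference.
void deposit(Account& account, long amount) { account.balance += amount; }

// Array, input: pointer to const data plus a length.
long sum(const long* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += values[i];
    return total;
}

// Array, output: pointer to non-const data.
void fill(long* values, std::size_t count, long value) {
    for (std::size_t i = 0; i < count; ++i) values[i] = value;
}

// Optional class argument: a pointer, so the caller can pass a null pointer.
long balance_or_zero(const Account* account) {
    return account ? account->balance : 0;
}
```

Reading a call site then gives a fair hint of what can change: deposit(acct, 5) may modify acct, is_overdrawn(acct) cannot, and balance_or_zero(nullptr) is a legal call.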
If you want a copy, then pass by value. If you want to change it and you want those changes to be seen outside the function, then pass by reference. If you want speed and don't want to change it, pass by const reference.

what functions are called when passing value to function

In C++, if an object of a class is passed as a parameter into a function, the copy constructor of the class will be called.
I was wondering: if the object is of non-class type, what function will be called?
Similarly in C, what function is called when passing values or address of variables into a function?
Thanks and regards!
No function will be called; the bytes composing the object will simply be copied to the correct place for the callee (be that a location in memory or a register).
The copy constructor is only called if the object is being passed by value (and is a non-POD type). This is one of the reasons that it is common practice to pass objects by reference and const reference should you not wish the object to be changed by the function.
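A small experiment along these lines can make the difference visible. Tracked is a made-up class that counts how often its copy constructor runs, and the function names are likewise invented for this sketch.

```cpp
#include <cassert>

// Hypothetical class that counts how often its copy constructor runs.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

void by_value(Tracked) {}             // parameter is copy-constructed from an lvalue argument
void by_const_ref(const Tracked&) {}  // binds to the caller's object, no copy
void int_by_value(int) {}             // non-class type: the value is copied, no function runs

// Returns how many copy-constructor calls the three calls below caused.
int copies_made_by_calls() {
    Tracked::copies = 0;
    Tracked t;
    by_value(t);
    by_const_ref(t);
    int_by_value(7);
    assert(Tracked::copies == 1);  // only the by_value call copies
    return Tracked::copies;
}
```

Only the by-value call with a class-type argument goes through the copy constructor; the reference call and the int call do not invoke any user-visible function.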
No function is called.
Since non-class types don't have methods, they are simply copied onto the stack (or into a register) to be used as-is by your function.
It depends on the implementation, but in some cases you may incur a function call if you are passing a floating-point value into a function expecting a value of integral type. (This is an implementation detail rather than part of the language, it's true, but it's no less worth taking account of because of that. And such conversions are often slow in any event, function call required or not.)