Accessing global array more efficient than passing as argument? - c++

I have a function that is called many times, and I need to pass an array of 4 or 5 elements down into 3 or 4 nested functions.
Surely it would be more efficient to make this array a global variable that every function can reach through a single address, rather than passing it down through the nested functions as an argument? The latter would require stack pushing and popping, whereas the global variable wouldn't.
(I know I can profile, but I want to understand what the theory suggests: what difference there would be in the code that gets executed.)

1- First of all, an array in C/C++ is just a contiguous area reserved in memory. When it is passed to a function it decays to a pointer to its first element, &Arr[0].
2- Passing the array as a parameter most likely consumes a register (depending on the calling convention used and the number of function parameters), while using a global variable will not consume this register. The address of the global typically still has to be formed or loaded before the array can be accessed, so the saving is small.
3- To the compiler, passing the parameter like
a) Foo(int* Arr)
b) Foo(int Arr[])
is exactly the same: a pointer is copied to a register or the stack according to the calling convention used.
The form in (b) gives the compiler no extra information. In particular, it does not promise that the arrays do not overlap; that is what C's restrict qualifier (or a compiler extension such as __restrict in C++) is for.
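To make the two variants concrete, here is a minimal sketch. The names g_samples and sum_* are made up for illustration and are not from the question; which variant is faster depends on the compiler, the target and inlining, so this shows only the shape of the code, not a measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical global buffer, visible to every function in this file.
int g_samples[5] = {1, 2, 3, 4, 5};

// Variant A: the innermost function reads the global directly.
int sum_global() {
    int total = 0;
    for (int v : g_samples) total += v;
    return total;
}

// Variant B: the array is handed down as a pointer plus a length.
int sum_inner(const int* arr, std::size_t n) {
    assert(arr != nullptr);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += arr[i];
    return total;
}
int sum_middle(const int* arr, std::size_t n) { return sum_inner(arr, n); }
int sum_outer(const int* arr, std::size_t n) { return sum_middle(arr, n); }

// The two parameter spellings declare the same function type.
static_assert(std::is_same<void(int*), void(int[])>::value,
              "int* and int[] parameters are identical");
```

In variant B each nested call forwards one pointer and one length, which usually stay in registers; in variant A the innermost function has to obtain the address of g_samples itself.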

You can pass a pointer or a reference instead of the whole array.
If it is C++, make the array a member variable.
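A rough sketch of both suggestions follows. The class Filter, its stage functions and sum4 are hypothetical names chosen for this example only.

```cpp
#include <array>
#include <cassert>

// Hypothetical class: the array lives in the object, and the nested member
// functions reach it through the implicit `this` pointer.
class Filter {
public:
    explicit Filter(const std::array<int, 4>& taps) : taps_(taps) {}
    int run(int x) const { return stage1(x); }

private:
    int stage1(int x) const { return stage2(x) + taps_[0]; }
    int stage2(int x) const { return stage3(x) + taps_[1]; }
    int stage3(int x) const { return x * taps_[2] + taps_[3]; }

    std::array<int, 4> taps_;
};

// Free-function alternative: a reference to the array keeps its size in the type.
int sum4(const int (&arr)[4]) {
    int total = 0;
    for (int v : arr) total += v;
    return total;
}

// Small usage example.
void demo_filter() {
    const std::array<int, 4> taps = {1, 2, 3, 4};
    Filter f(taps);
    assert(f.run(5) == 22);  // (5 * 3 + 4) + 2 + 1
}
```

Note that the member-variable version still passes one hidden pointer (this) to every nested call, so it is tidier rather than necessarily cheaper.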

Related

How to prevent objects declared as local variables from being allocated in the stack?

I am struggling with the following performance problem while compiling C++ for a small ARM processor using the ARM/Keil compiler.
Inside a function that does some processing I have code with the following structure:
{
MyClass temp = global_variable_input;
Operation 1 on temp;
Operation 2 on temp;
...
Operation N on temp;
global_variable_output = temp;
}
MyClass is used to model a mathematical object and the only member is a 32 bit integer (that is, the complete size of the object is 4 bytes).
All operations involve using either an overloaded operator or a method of MyClass and change the value of 'temp' as a result. Some operations are trivial and inlined (method declared inline in the class), and some are more complex and need to generate a call to the method.
Looking at the assembler code generated by the compiler for my routine, I noticed that the compiler allocates space for "temp" on the stack, and every single operation (even the inlined ones!) stores its result in that stack location, only to then continue using the value still held in the register from the last operation. For the non-inlined ones, the compiler passes a pointer to the object (this) in register r1, and in register r0 a pointer to another object created on the stack to receive the result.
The code implements a signal processing algorithm and you can imagine it as a sequence of arithmetic operations on temp, so having this additional "store" instruction with the corresponding memory access after every single operation (which might be just one single opcode) introduces a massive performance penalty in the implementation.
Ideally I would expect the compiler to complete the operations using only a register, instead of keeping a stacked version of "temp" that needs to be updated after every operation.
Another wish would be for it to pass the current value of the object to the methods simply using a register (like the ARM C calling convention specifies for normal C functions) and getting the result in the same way, instead of using pointers to memory locations.
Am I asking for too much? How can I get my ARM/Keil compiler to work in that way?
PS: The function is simple enough so it's not like the compiler needs to allocate my variable in stack because it ran out of registers. I suspect the reason why it does it is that it feels a need to have a pointer to pass to the non-inlined methods, and then believes it is necessary to keep the value in stack always up to date.
Thanks a lot!
Using a reference like
MyClass& temp = global_variable_input;
would avoid having a full copy of MyClass allocated on the stack (local storage)
Though any of
Operation 1 on temp;
Operation 2 on temp;
// ...
will affect the original global_variable_input as well.
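The difference between the copy and the reference can be sketched as follows. MyClass here is a hypothetical stand-in with a single 32-bit member, and operator+= stands in for the question's unspecified operations; whether the reference version actually avoids the stack traffic on ARM/Keil is not something this sketch can show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the question's 4-byte class.
class MyClass {
public:
    explicit MyClass(std::int32_t v) : value_(v) {}
    MyClass& operator+=(std::int32_t d) { value_ += d; return *this; }
    std::int32_t value() const { return value_; }

private:
    std::int32_t value_;
};

MyClass global_variable_input(10);
MyClass global_variable_output(0);

// Works on a private copy; the input is left untouched.
void process_with_copy() {
    MyClass temp = global_variable_input;
    temp += 1;
    temp += 2;
    global_variable_output = temp;
}

// Works through a reference; every operation also changes the input.
void process_with_reference() {
    MyClass& temp = global_variable_input;
    temp += 1;
    temp += 2;
    global_variable_output = temp;
}

// Small usage example showing the side effect of the reference version.
void demo_copy_vs_reference() {
    process_with_copy();
    assert(global_variable_input.value() == 10);
    process_with_reference();
    assert(global_variable_input.value() == 13);
}
```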
Changing the class into a struct will not help: in C++ the two differ only in default member access, and neither keyword decides whether an object is stored on the stack or the heap.

Returning static/ normal arrays in recursion/another function

A non-static array is declared in one function, returned, and stored in another non-static array in main.
Q 1> I know that when an array is passed, it is passed by pointer (not by reference), so any changes made to the array in the called function are reflected in the calling function. But when that array is returned to another function, say main, shouldn't the original array be destroyed when it goes out of its function's scope?
Static variables retain their value between function calls and their lifetime is the entire program, so this is justified for a static array, but why does it work for a non-static array?
Q 2> Now say we have a non-static array declared inside a function that undergoes recursion. Does the array get declared each time the function recurses? It doesn't go out of scope, so the array is re-declared; why doesn't that give a redeclaration error? Is it a new array, or does the array get overwritten?
Say we have a static array now; being static, it is declared only once.
Do changes made to the array in one recursion get reflected in another recursion if the array is static/non-static?
I tried it out and found that for non-static arrays changes are not reflected, but for static arrays they are, so again it basically boils down to the first question?
Q 3> Say we declared a static array and we run two tests (one important thing to mention is that we need the values from the previous recursion for the next recursion). The values stored in the array during the first test case (it being static) lead to incorrect values during the second test case (consider it as a vector: it will give incorrect output when we push elements in the second run, since the values stored in the first run are already in the vector). Can you suggest a way to get around this without removing the static array?
I know I have asked many questions at once, but I did so because they are all related, 1 and 2 especially. 3 will help me clear my doubts better. It would be very helpful if you could clear them up. Thank you.
Q 1> When a non-static array is returned to another function, say main, shouldn't the original array be destroyed when it goes out of its function's scope?
Yes, the array will be destroyed when the function it's declared in exits. If you continue trying to use it after that point, you'll get "undefined behaviour" - which means that anything could happen. It might seem to work, it might crash, or it might contain garbage data.
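One common way to sidestep the problem, sketched here under the assumption that the array size is fixed at compile time, is to return a std::array by value instead of a pointer to a local array. The function names are made up for the example.

```cpp
#include <array>
#include <cassert>

// Returning a pointer to a local array would dangle, because the array is
// destroyed when the function returns (shown only as a comment):
//   int* broken() { int local[3] = {1, 2, 3}; return local; }

// std::array is an object, so returning it hands the elements to the caller.
std::array<int, 3> make_values() {
    std::array<int, 3> local = {1, 2, 3};
    return local;
}

// Small usage example: the caller owns its own array after the call.
void demo_return_by_value() {
    std::array<int, 3> in_caller = make_values();
    assert(in_caller[0] == 1 && in_caller[2] == 3);
}
```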
Q 2> If a non-static array is declared inside a recursive function, is it a new array on each recursion, or does the array get overwritten?
It's a new array. Each call to a function gets its own copy of that function's local variables.
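A small sketch of the difference, using made-up function names: the local array starts fresh in every call, while the static array is shared by all calls.

```cpp
#include <cassert>

// Each call gets a fresh local array; writes made in deeper calls are
// invisible to this call.
int local_depth_count(int depth) {
    int scratch[1] = {0};
    if (depth > 0) {
        local_depth_count(depth - 1);
    }
    scratch[0] += 1;    // only this call's array changes
    return scratch[0];  // always 1
}

// A static array exists once; every call writes to the same storage.
int static_depth_count(int depth) {
    static int shared[1] = {0};
    if (depth > 0) {
        static_depth_count(depth - 1);
    }
    shared[0] += 1;     // accumulates across all calls
    return shared[0];
}

// Small usage example: depth 3 means four calls in total.
void demo_recursion() {
    assert(local_depth_count(3) == 1);
    assert(static_depth_count(3) == 4);
}
```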
Q 3> With a static array (or vector), values stored during the first test case are still there during the second and lead to incorrect results. Can you suggest a way around this without removing the static array?
You could make it global, then clear it between tests. That's not the prettiest solution, but it's probably the easiest.
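As a sketch of that suggestion, assuming the container is a std::vector as the question hints: g_path, collect and run_test are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical global that the recursion fills.
std::vector<int> g_path;

// Recursive step: records n, then recurses towards 0.
void collect(int n) {
    assert(n >= 0);
    if (n == 0) return;
    g_path.push_back(n);
    collect(n - 1);
}

// One test run: clear leftovers from the previous run, then recurse.
std::size_t run_test(int n) {
    g_path.clear();
    collect(n);
    return g_path.size();
}
```

Without the clear() call, the second run would start with the first run's elements still in the vector.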

c++ force array declaration without giving parameters

I'm currently trying to program an array of objects in a c++ program. However it keeps giving me errors when trying to create the arrays.
At the top of my code I have the following:
#define sensNumber 4
ros::Publisher pub_range2 [sensNumber];
this gives the error:
multisone2.ino:19:38: error: no matching function for call to ‘ros::Publisher::Publisher()’
So it's trying to call the constructor for Publisher, why? And how do I stop it?
Now I know this can also be done with vectors, but I'm trying to optimize the code, especially for read speed, so I would rather avoid vectors (yes, I know access remains linear, but accessing this array represents a significant portion of my code).
Because you're declaring an array of sensNumber ros::Publisher objects, each element is default-constructed, so the default constructor must exist and gets called.
An alternative would be to allocate an array of pointers to ros::Publisher. Better still, an array of std::unique_ptr or std::shared_ptr.
When you declare an array of C++ objects you're actually instantiating each element. This means that the default (parameterless) constructor is called for each array element. If you don't want to instantiate all elements when declaring your array, you should declare an array of pointers instead and then initialize each element whenever required.
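The options can be sketched as below. Sensor is a hypothetical class without a default constructor, used as a stand-in; it is not ros::Publisher, and std::unique_ptr is assumed to be available, which may not hold on every Arduino toolchain.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class without a default constructor.
class Sensor {
public:
    explicit Sensor(int pin) : pin_(pin) {}
    int pin() const { return pin_; }

private:
    int pin_;
};

constexpr int kSensorCount = 4;

// Sensor broken[kSensorCount];  // error: no default constructor to call

// Option 1: give every element an explicit initializer.
Sensor g_direct[kSensorCount] = {Sensor(2), Sensor(3), Sensor(4), Sensor(5)};

// Option 2: an array of owning pointers, all null until filled in.
std::unique_ptr<Sensor> g_lazy[kSensorCount];

// Creates the objects later, once the constructor arguments are known.
void init_lazy() {
    for (int i = 0; i < kSensorCount; ++i) {
        g_lazy[i].reset(new Sensor(10 + i));
    }
}

// Checked access to an element of the pointer array.
Sensor& lazy_sensor(int i) {
    assert(i >= 0 && i < kSensorCount && g_lazy[i]);
    return *g_lazy[i];
}
```

Option 1 keeps the objects inline in the array, so element access costs the same as for any plain array; option 2 adds one pointer indirection per access.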

Cost of using functions in fortran (or any other language)

Let's say I have a very big array, verybigvariable,
And I have defined a function that does some operations like this
function myfunc(var) result(res)
real:: var(:,:,:),res
...
...
...
end function myfunc
My question is that when I call this function like this
myvar=myfunc(verybigvariable)
What happens? Does it duplicate my variable, so that it takes up 2x the space in RAM during the execution of the function? If so, how can I prevent this? (In a simple program, I know, I can define the function without any parameter and make it use existing variables, but if I am programming a module, it seems I have to include the parameter in the definition.)
The Fortran language standard does not specify how arguments are passed. Typically in the interest of efficiency the compiler will not make a copy but pass the address of the argument. There will be cases in which a Fortran compiler has to make a copy. E.g., the actual argument is a non-contiguous array but the procedure expects a contiguous argument. The compiler will have to fix the mismatch by making a copy that is contiguous to pass to the procedure. If the procedure modifies that argument, the values have to be copied back to the original argument.
In Fortran it seems that parameters are passed by reference. This means that only the address of the variable is passed, and the function can then access the variable through that address.
So no, the array is not copied; only the address of the array is passed. The overhead is roughly one pointer, 32 bits on a 32-bit system or 64 bits on a 64-bit system, although for an assumed-shape argument such as var(:,:,:) compilers typically also pass a small descriptor with the bounds and strides.
I have no experience with Fortran, and this is only what I could figure out through a Google search, so if any Fortran programmers have any remarks, please feel free to edit/comment.

Will static methods decrease my overhead?

Does a class have to allocate memory for its non-static member functions every time a new instance is created?
To put a finer point on it, if I were writing a class v3d representing 3-space vectors, would I use less memory by defining
static v3d::dotProduct(v3d v1, v3d v2)
as opposed to
v3d::dotProduct(v3d v2) ?
Neither static nor non-static member functions are stored per instance. In the case of non-static member functions, the way I understand it is that they are translated into something like (probably less readable than this):
v3d_dotProduct(v3d* this, v3d v2)
And the calls to them are translated accordingly. If you want to improve performance, I would recommend using inline functions as these essentially copy the function contents to the place that you call it. I don't think this will decrease your memory usage, but it's worth using for class functions (static and non-static) which are called many times per second.
http://www.cplusplus.com/forum/articles/20600/
There is one instance of the function in memory. It has nothing to do with static or not. You don't allocate memory for member functions.
Or maybe I misunderstood. Perhaps you meant the function somehow takes up space in the object? No, it doesn't. At the object code level, membership is essentially just a name convention and a hidden 'this' parameter. If virtual, there is typically a single vtable, the same one for all instances.
However, in your examples, you appear to be passing all the v3d objects by value. This means in the static case you're making 2 object copies (one for each arg) and in the non-static case you're making only 1 object copy.
If you passed the args by reference, you could avoid making copies - except as may be required by the dot-product algorithm, whatever that is (a long time since I did any mathematics).
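A brief sketch of both points: PlainV3d and V3d are hypothetical types, and the dot product is the usual sum of component products. The size check is expected to hold on mainstream compilers rather than being spelled out by the language standard.

```cpp
#include <cassert>

// Plain data: three doubles and nothing else.
struct PlainV3d {
    double x, y, z;
};

// Same data plus one non-static and one static member function.
struct V3d {
    double x, y, z;

    // Non-static: receives a hidden `this`; the argument is a const
    // reference, so no V3d is copied.
    double dot(const V3d& other) const {
        return x * other.x + y * other.y + z * other.z;
    }

    // Static: no `this`; both operands are explicit const references.
    static double dot_static(const V3d& a, const V3d& b) {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }
};

// Member functions add nothing to the size of each object.
static_assert(sizeof(V3d) == sizeof(PlainV3d),
              "member functions are not stored per instance");

// Small usage example: both forms compute the same value.
void demo_dot() {
    const V3d a = {1.0, 2.0, 3.0};
    const V3d b = {4.0, 5.0, 6.0};
    assert(a.dot(b) == V3d::dot_static(a, b));
}
```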
In either case the function's code only has a single copy in code memory. Static functions use the same amount of code memory but pass one argument fewer when called: non-static member functions take an additional hidden argument (the this pointer), which goes in a register or on the stack depending on the calling convention. If you don't use anything in the object that would necessitate using the "this" pointer, you should declare the function static.
The amount of memory you will save is likely trivial. But if the function is called millions of times per second, a static function could see a small improvement in speed from not having to pass the additional argument.