Lua garbage collection and C userdata - c++

In my game engine I expose my Vector and Color objects to Lua, using userdata.
Now, for every Vector and Color created from within Lua scripts, even a local one, Lua's memory usage goes up a bit, and it doesn't fall until the garbage collector runs.
The garbage collector causes a small lagspike in my game.
Shouldn't the Vector and Color objects be immediately deleted if they are only used as arguments? For example like: myObject:SetPosition( Vector( 123,456 ) )
They aren't right now: the memory usage of Lua rises by about 1.5 MB each second, then the lag spike occurs and it goes back to about 50 KB.
How can I solve this problem? Is it even solvable?

In Lua 5.0 you can call lua_setgcthreshold(L, 0) to force an immediate garbage collection after you exit the function.
Edit: for 5.1 I'm seeing the following:
int lua_gc (lua_State *L, int what, int data);
Controls the garbage collector.
This function performs several tasks, according to the value of the parameter what:
* LUA_GCSTOP: stops the garbage collector.
* LUA_GCRESTART: restarts the garbage collector.
* LUA_GCCOLLECT: performs a full garbage-collection cycle.
* LUA_GCCOUNT: returns the current amount of memory (in Kbytes) in use by Lua.
* LUA_GCCOUNTB: returns the remainder of dividing the current amount of bytes of memory in use by Lua by 1024.
* LUA_GCSTEP: performs an incremental step of garbage collection. The step "size" is controlled by data (larger values mean more steps) in a non-specified way. If you want to control the step size you must experimentally tune the value of data. The function returns 1 if the step finished a garbage-collection cycle.
* LUA_GCSETPAUSE: sets data as the new value for the pause of the collector (see §2.10). The function returns the previous value of the pause.
* LUA_GCSETSTEPMUL: sets data as the new value for the step multiplier of the collector (see §2.10). The function returns the previous value of the step multiplier.

In Lua, the only way an object like userdata can be deleted is by the garbage collector. You can call the garbage collector directly, like B Mitch wrote (use lua_gc(L, LUA_GCSTEP, ...)), but there is no guarantee that exactly your temporary object will be freed.
The best way to solve this is to avoid the creation of temporary objects. If you need to pass fixed parameters to methods like SetPosition, try to modify the API so that it also accepts numeric arguments, avoiding the creation of a temporary object, like so:
myObject:SetPosition(123, 456)

Lua Gems has a nice piece about optimization for Lua programs.

Remember, Lua doesn't know until runtime whether or not you saved those objects; you could have put them in a table in the registry, for example. You shouldn't even notice the impact of collecting 1.5 MB, so there's another problem here.
Also, creating a new object for that is really wasteful. Remember that in Lua every object has to be dynamically allocated, so you're calling malloc to make a Vector object that holds two numbers? Write your function to take a pair of numeric arguments as an overload.

Related

Why does a pointer to a class take less SRAM than a "classic" variable?

I have an Arduino Micro with three time-of-flight LIDAR micro sensors soldered to it. In my code I was creating three global variables like this:
Adafruit_VL53L0X lox0 = Adafruit_VL53L0X();
Adafruit_VL53L0X lox1 = Adafruit_VL53L0X();
Adafruit_VL53L0X lox2 = Adafruit_VL53L0X();
And it took about 80% of the memory.
Now I am creating my objects like this:
Adafruit_VL53L0X *lox_array[3] = {new Adafruit_VL53L0X(), new Adafruit_VL53L0X(), new Adafruit_VL53L0X()};
And it takes 30% of my entire program's memory.
I tried to look in the Arduino documentation, but I didn't find anything that could help me.
I can understand that creating a "classic" object can fill the memory. But where is the memory located when the object is created through a pointer?
You use the same amount of memory either way. (Actually, the second way uses a tiny bit more, because the pointers need to be stored as well.)
It's just that with the first way, the memory is already allocated statically from the start and part of the data size of your program, so your compiler can tell you about it, while with the second way, the memory is allocated at runtime dynamically (on the heap), so your compiler doesn't know about it up front.
I dare say that the second method is more dangerous, because consider the following scenario: Let's assume your other code and data already uses 90% of the memory at compile-time. If you use your first method, you will fail to upload the program because it would now use something like 150%, so you already know it won't work. But if you use your second method, your program will compile and upload just fine, but then crash when trying to allocate the extra memory at runtime.
(By the way, the compiler message is a bit incomplete. It should rather say "leaving 1750 bytes for local variables and dynamically allocated objects" or something along those lines).
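The accounting described above can be reproduced on any C++ compiler. The following is only a sketch: `Sensor` is a hypothetical stand-in for a driver class with a sizeable internal buffer (it is not the Adafruit class), and the helper names are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for a driver object that carries a sizeable buffer.
struct Sensor {
    unsigned char buffer[400];
};

// Variant 1: three objects with static storage duration.
// Their bytes are part of the program's data size, so the build report counts them.
Sensor lox0, lox1, lox2;

// Variant 2: only the three pointers have static storage duration.
// The objects they point to are taken from the heap once the program runs.
Sensor* lox_array[3] = {nullptr, nullptr, nullptr};

// Allocate the objects of variant 2 at runtime (invisible to the build report).
void create_sensors() {
    for (Sensor*& s : lox_array) s = new Sensor();
}

// Release them again.
void destroy_sensors() {
    for (Sensor*& s : lox_array) { delete s; s = nullptr; }
}

// Bytes the build-time report can see for each variant.
std::size_t static_bytes_as_objects()  { return 3 * sizeof(Sensor); }
std::size_t static_bytes_as_pointers() { return sizeof(lox_array); }

// Bytes actually needed once variant 2 has allocated its objects.
// (Heap bookkeeping overhead is ignored here, so the real figure is a bit higher.)
std::size_t total_bytes_as_pointers() {
    return static_bytes_as_pointers() + 3 * sizeof(Sensor);
}
```

The pointer variant looks cheap at build time, because only `sizeof(lox_array)` is counted, yet its total at runtime is slightly larger than that of the plain objects.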
You can check it yourself using this function which allows you to estimate the amount of free memory at runtime (by comparing the top of the heap with the bottom [physically, not logically] of the stack, the latter being achieved by looking at the address of a local variable which would have been allocated at the stack at that point):
int freeRam () {
  extern int __heap_start, *__brkval;
  int v;
  return (int) &v - (__brkval == 0 ? (int) &__heap_start : (int) __brkval);
}
See also: https://playground.arduino.cc/Code/AvailableMemory/

Cross process COM Marshaler: reduce number of copies for large arrays

As simplified case: I need to transfer a VARIANT to another process over the existing COM interface. I currently use the MIDL-generated marshaller.
The actual transfer is for many values, is part of a time-critical process, and may involve large strings or SAFEARRAYs (a few MB), so the number of copies made seems relevant.
Since the receiver needs to "keep" the data beyond the function call, at least one copy needs to be made by the marshaler. All signatures I can think of involve two copies, however:
SetValue([in] VARIANT)
GetValue([out] VARIANT *) // called by receiver
In both cases, in my understanding the marshaller makes a cross-process copy that does get destroyed by the marshaller. Since I need to keep the data in the receiver, I need to make a second copy.
I considered "detaching" the data at the receiver:
SetValue([in, out] VARIANT *)
// receiver detaches value and sets to VT_EMPTY for return
But this would also destroy the source.
Q1: Is it possible to get the MIDL-generated marshaling code to do only one copy?
Q2: Would this be possible with a custom marshaller, and at what cost? (My first looks into that were extremely discouraging.)
I am pretty much bound to using SAFEARRAY and/or other VARIANT/PROPVARIANT types, and to transferring the whole array.
[edit]
Both sides use C++, the interfaces are IUnknown-based, and it needs to work cross-process on a single machine, in the same context.
You don't say so explicitly, but it seems the problem you are seeking to solve is speed issues. In any case, consider using a profiler to identify the bottleneck if you haven't already done so.
I very much doubt in this case that it is the copying which is taking the time. Rather, it is likely to be the context-switching between processes involved, as you are getting the values one at a time. This means that for each value you retrieve, you have to switch processes to the target of the call, then switch back again.
You could speed this up enormously by making your design less "chatty" when setting or getting multiple values.
Something like this:
SetMultipleValues(
[in] SAFEARRAY(BSTR)* asNames,
[in] SAFEARRAY(VARIANT)* avValues
)
GetMultipleValues(
[in] SAFEARRAY(BSTR)* asNames,
[out,retval] SAFEARRAY(VARIANT)* pavValues
)
I.e. when calling GetMultipleValues, pass in an array of 10 names, and receive an array of 10 values in the same order as the names passed in, (or VT_EMPTY if the value does not exist).
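To make the calling convention concrete, here is an in-process sketch of the same batched contract in plain C++. It deliberately leaves out COM, SAFEARRAY and VARIANT: `ValueStore` is a hypothetical class, strings stand in for VARIANT, and an empty `std::optional` plays the role of VT_EMPTY. It only illustrates "one call, many values, results in request order", not the marshaling itself.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the receiver's value store.
class ValueStore {
public:
    // One call carries many name/value pairs instead of one call per value.
    // Returns false (and stores nothing) if the two arrays differ in length.
    bool SetMultipleValues(const std::vector<std::string>& names,
                           const std::vector<std::string>& values) {
        if (names.size() != values.size()) return false;
        for (std::size_t i = 0; i < names.size(); ++i) {
            values_[names[i]] = values[i];
        }
        return true;
    }

    // Results come back in the same order as the requested names;
    // a missing name yields an empty optional (the VT_EMPTY analogue).
    std::vector<std::optional<std::string>>
    GetMultipleValues(const std::vector<std::string>& names) const {
        std::vector<std::optional<std::string>> result;
        result.reserve(names.size());
        for (const std::string& name : names) {
            auto it = values_.find(name);
            if (it == values_.end()) {
                result.push_back(std::nullopt);
            } else {
                result.push_back(it->second);
            }
        }
        return result;
    }

private:
    std::map<std::string, std::string> values_;
};
```

With a contract like this, ten values cost one round trip between the processes instead of ten.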

What do C++ arrays init to?

So I can fix this manually so it isn't an urgent question but I thought it was really strange:
Here is the entirety of my code before the weird thing that happens:
int main(int argc, char** arg) {
int memory[100];
int loadCounter = 0;
bool getInput = true;
print_memory(memory);
and then some other unrelated stuff.
The print_memory function just prints the array, which should have been initialized to all zeros, but instead the first few numbers are:
+1606636544 +32767 +1606418432 +32767 +1856227894 +1212071026 +1790564758 +813168429 +0000 +0000
(the plus and the filler zeros are just for formatting since all the numbers are supposed to be from 0-1000 once the array is filled. The rest of the list is zeros)
It also isn't memory leaking because I tried initializing a different array variable and on the first run it also gave me a ton of weird numbers. Why is this happening?
Since you asked "What do C++ arrays init to?", the answer is they init to whatever happens to be in the memory they have been allocated at the time they come into scope.
I.e. they are not initialized.
Do note that some compilers will initialize stack variables to zero in debug builds; this can lead to nasty, randomly occurring issues once you start doing release builds.
The array you are using is stack allocated:
int memory[100];
When the particular function scope exits (In this case main) or returns, the memory will be reclaimed and it will not leak. This is how stack allocated memory works. In this case you allocated 100 integers (32 bits each on my compiler) on the stack as opposed to on the heap. A heap allocation is just somewhere else in memory hopefully far far away from the stack. Anyways, heap allocated memory has a chance for leaking. Low level Plain Old Data allocated on the stack (like you wrote in your code) won't leak.
The reason you got random values in your function was probably because you didn't initialize the data in the 'memory' array of integers. In release mode the application or the C runtime (in windows at least) will not take care of initializing that memory to a known base value. So the memory that is in the array is memory left over from last time the stack was using that memory. It could be a few milli-seconds old (most likely) to a few seconds old (less likely) to a few minutes old (way less likely). Anyways, it's considered garbage memory and it's to be avoided at all costs.
The problem is we don't know what is in your function called print_memory. But if that function doesn't alter the memory in any way, then that would explain why you are getting seemingly random values. You need to initialize those values to something first before using them. I like to declare my stack based buffers like this:
int memory[100] = {0};
That's a shortcut for the compiler to fill the entire array with zeros.
It works for strings and any other basic data type too:
char MyName[100] = {0};
float NoMoney[100] = {0};
Not sure what compiler you are using, but if you are using a Microsoft compiler with Visual Studio you should be just fine.
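For reference, here is a small sketch of the usual ways to get a zeroed local buffer in standard C++. The function name is made up for illustration. One thing to keep in mind is that `= {0}` works because the elements without an explicit initializer are zeroed; `= {5}` would set only the first element to 5 and the rest to 0.

```cpp
#include <algorithm>
#include <array>
#include <iterator>

// Returns true if all four buffers below end up holding only zeros.
bool zeroed_buffers_demo() {
    int a[100] = {0};          // first element set explicitly, the rest zeroed
    int b[100] = {};           // empty braces: every element zeroed
    std::array<int, 100> c{};  // the same for std::array

    int d[100];                // indeterminate contents until written
    std::fill(std::begin(d), std::end(d), 0);  // explicit fill before any read

    auto is_zero = [](int v) { return v == 0; };
    return std::all_of(std::begin(a), std::end(a), is_zero) &&
           std::all_of(std::begin(b), std::end(b), is_zero) &&
           std::all_of(c.begin(), c.end(), is_zero) &&
           std::all_of(std::begin(d), std::end(d), is_zero);
}
```

Reading `d` before the `std::fill` call would be reading indeterminate values, which is exactly the situation in the question.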
In addition to other answers, consider this: What is an array?
In managed languages, such as Java or C#, you work with high-level abstractions. C and C++ don't provide abstractions (I mean hardware abstractions, not language abstractions like OO features). They are designed to work close to the metal; that is, the language uses the hardware directly (memory in this case) without abstractions.
That means when you declare a local variable, int a for example, what the compiler does is to say "OK, I'm going to interpret the chunk of memory [A,A + sizeof(int)] as an integer, which I call 'a'" (where A is the offset between the beginning of that chunk and the start address of the function's stack frame).
As you can see, the compiler only "assigns" memory-segments to variables. It does not do any "magic", like "creating" variables. You have to understand that your code is executed in a machine, and the machine has only a memory and a CPU. There is no magic.
So what is the value of a variable when the function execution starts? It is whatever data that chunk of memory happens to hold. Commonly, that data makes no sense from our current point of view (it could be part of the data used previously by a string, for example), so when you access that variable you get strange values. That's what we call "garbage": data written previously which makes no sense in our context.
The same applies to an array: An array is only a bigger chunk of memory, with enough space to fit all the values of the array: [A,A + (length of the array)*sizeof(type of array elements)]. So as in the variable case, the memory contains garbage.
Commonly you want to initialize an array with a set of values during its declaration. You could achieve that using an initialiser list:
int array[] = {1,2,3,4};
In that case, the compiler adds code to the function to initialize the array's chunk of memory with those values.
Sidenote: Non-POD types and static storage
The things explained above only apply to POD types such as basic types and arrays of basic types. With non-POD types like classes, the compiler adds calls to the constructors of the variables, which are designed to initialise the values (attributes) of a class instance.
In addition, even if you use POD types, if variables have static storage duration, the compiler initializes their memory with a default value (zero), because static variables are allocated at program start.
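Both points of the sidenote can be checked with a few lines. This is only an illustrative sketch; the names `g_counts`, `Registers` and the three functions are made up.

```cpp
// Static storage duration: zero-initialized before main() runs.
int g_counts[8];

// A class with a user-provided constructor: the compiler calls it for every
// instance, including locals, so the members never hold garbage.
struct Registers {
    int values[4];
    Registers() : values{1, 2, 3, 4} {}
};

// Sums the global array; it was never written, so the sum is 0.
int sum_global_array() {
    int sum = 0;
    for (int v : g_counts) sum += v;
    return sum;
}

// Function-local statics have static storage duration too, so they start at 0.
int sum_local_static_array() {
    static int calls[4];
    int sum = 0;
    for (int v : calls) sum += v;
    return sum;
}

// A local object of class type is set up by its constructor.
int sum_local_object() {
    Registers r;
    int sum = 0;
    for (int v : r.values) sum += v;
    return sum;
}
```

Only a plain local array of a basic type, declared without an initializer, is left with indeterminate contents.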
Local variables on the stack are not initialized in C/C++. C/C++ is designed to be fast, so it doesn't zero the stack on function calls.
Before main() runs, the language runtime sets up the environment. Exactly what it's doing you'd have to discover by breaking at the load module's entry point and watching the stack pointer, but at any rate your stack space on entering main is not guaranteed clean.
Anything that needs clean stack or malloc or new space gets to clean it itself. Plenty of things don't. C[++] isn't in the business of doing unnecessary things. In C++ a class object can have non-trivial constructors that run implicitly, and those guarantee the object is set up for use, but arrays and plain scalars don't have constructors; if you want an initial value you have to declare an initializer.

What can be done to optimize the amount of time it takes to leave a method and empty out the stack of local variables?

I have a method which is responsible for taking an OpenGL triangle mesh and converting it to a 3ds file. This method is called exportShape(). To perform this conversion exportShape() creates a bunch of very large vectors and hash_maps. Currently, getting from the last line of exportShape() to the next line of code from where exportShape() was called can take up to 5 minutes. I'm sure that all this time is spent emptying out the very large stack of local variables because if I move all the local vectors and hash_maps to global scope the method exits instantly as I would expect.
Why am I able to populate all these local data structures in just a few seconds whereas popping them off the stack takes minutes? How can I optimize the process of leaving my exportShape() and clearing out the stack?
Edit:
The objects which are being deleted contain only strings, doubles and ints - nothing with a custom destructor.
I pretty much solved my own problem. Running in release mode is a huge performance increase (~20x). Nevertheless, the process still hangs for a few seconds. Is there anything else that can be done?
The problem in the first instance is that you're using a debug allocator, which marks freed memory with a bit pattern (e.g. 0xdddddddd in the MSVC debug runtime) to aid in detecting accesses to freed memory. This obviously takes time as it must write over all the freed memory.
To speed things up further, you could use a scoped allocator e.g. the Boost Pool Library; see also Creating a scoped custom memory pool/allocator?
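As an illustration of the scoped-allocator idea, here is a sketch that uses the C++17 standard library's `std::pmr::monotonic_buffer_resource` instead of Boost Pool (a swapped-in technique, not what the answer names). The function name and the label strings are invented for the example. Individual deallocations against a monotonic resource are no-ops, and all memory is handed back in one go when the arena is destroyed.

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

// Builds a large temporary data set inside an arena and returns the total
// number of characters stored, so the caller has something to check.
std::size_t build_labels_in_arena(std::size_t count) {
    // Declared first, so it is destroyed last: containers below may rely on it
    // for their whole lifetime.
    std::pmr::monotonic_buffer_resource arena;

    std::size_t total_chars = 0;
    {
        std::pmr::vector<std::pmr::string> labels(&arena);
        labels.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            const std::string text = "mesh_vertex_label_number_" + std::to_string(i);
            // The vector passes its allocator on to each string it constructs,
            // so the character data lands in the arena as well.
            labels.emplace_back(text.c_str());
            total_chars += labels.back().size();
        }
    }   // element destructors run here, but freeing into the arena does nothing

    return total_chars;
}   // the arena releases its blocks here in a few large deallocations
```

Whether this helps in a given build still depends on what the upstream allocator does with the few large blocks, so it is worth measuring rather than assuming.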

ODBC Documentation Clarity

I am having trouble with a certain piece of documentation from MSDN. I am using C++ (or C, rather) to connect to an SQL Server instance via ODBC. Take a look at the code sample at the bottom of this piece of documentation.
Notice there is a function in the sample called AllocParamBuffer(). The documentation describes what it should do, but doesn't provide any further help. Could someone please give me a few pointers (no pun intended) as to how I could replicate the definition of this function for this particular case, or, better yet, show it could be done? I'm at a real roadblock, and I can't find any assistance elsewhere.
Any help would be greatly appreciated.
Thank you for your time.
You are referring to:
// Call a helper function to allocate a buffer in which to store the parameter
// value in character form. The function determines the size of the buffer from
// the SQL data type and parameter size returned by SQLDescribeParam and returns
// a pointer to the buffer and the length of the buffer.
AllocParamBuffer(DataType, ParamSize, &PtrArray[i], &BufferLenArray[i]);
All this does is allocate some memory, one presumes with malloc (given the later free calls), to store the input parameter (PtrArray[i]), then set the buffer length BufferLenArray[i] (i.e. the amount of memory allocated for PtrArray[i]).
We'd only be guessing how it calculates how much memory to allocate, since the amount required in this case will differ depending on the DataType and ParamSize returned by SQLDescribeParam. The guesswork is down to the fact that all the parameters are bound as SQL_C_CHAR and some of them might not be string columns, e.g. they could be dates.
All you need to do is malloc some memory, assign the pointer to PtrArray[i] and set the amount allocated in BufferLenArray[i].
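A minimal sketch of such a helper follows. It is a guess at the behaviour, not the MSDN implementation. The type aliases are stand-ins so the sketch compiles without the ODBC headers (real code would include sql.h and sqlext.h and use the ODBC types for these parameters), and the sizing rule, the 64-character floor and the bool return value are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>

// Stand-in types (assumptions) so the sketch builds without ODBC headers.
using DataTypeT  = short;          // SQL data type code from SQLDescribeParam
using ParamSizeT = unsigned long;  // parameter size from SQLDescribeParam
using BufferPtrT = void*;          // element type of PtrArray
using BufferLenT = long;           // element type of BufferLenArray

// Allocates a character buffer for one parameter bound as SQL_C_CHAR.
// Sizing is a conservative guess: the reported size plus room for a sign and a
// decimal point, never less than a fixed floor, plus the terminating NUL.
// Returns false if the allocation fails. The caller releases it with free().
bool AllocParamBuffer(DataTypeT /*dataType*/, ParamSizeT paramSize,
                      BufferPtrT* buffer, BufferLenT* bufferLen) {
    constexpr std::size_t kMinChars = 64;  // guessed floor for non-string types
    const std::size_t chars =
        std::max<std::size_t>(static_cast<std::size_t>(paramSize) + 2, kMinChars);
    const std::size_t bytes = chars + 1;   // +1 for the terminating NUL

    *buffer = std::malloc(bytes);
    if (*buffer == nullptr) {
        *bufferLen = 0;
        return false;
    }
    *bufferLen = static_cast<BufferLenT>(bytes);
    return true;
}
```

A more careful version would switch on the data type and size each buffer from the character representation of that type; the single rule above just errs on the large side.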