Why is calling GC.Collect speeding things up? - c++

We have a C++ library (no MFC, no ATL) that provides some core functionality to our .NET application. We use SWIG to generate a C# wrapper assembly from the C++ DLL, which exposes its classes/methods via P/Invoke. Our .NET application uses this C# assembly to access the functionality inside the C++ DLL.
The problem is related to memory leaks. In our .NET application, I have a loop that creates thousands of instances of a particular class from the C++ DLL. The loop keeps slowing down as it creates instances, but if I call GC.Collect() inside the loop (which I know is not recommended), the processing becomes faster. Why is this? Calling Dispose() on the instances does not have any effect on the speed. I was expecting the program to slow down when using GC.Collect(), but it's just the opposite.
Each class that SWIG generates has a finalizer (~destructor) that calls Dispose(). Every Dispose method has a lock(this) around the statements that free the unmanaged memory, and finally it calls GC.SuppressFinalize. We also see AccessViolationException sporadically in Release builds. Any help will be appreciated.

Some types of object can clean up after themselves (via Finalize) if they are abandoned without being disposed, but can be costly to keep around until the finalizer gets around to them; one example within the Framework is the enumerator returned by Microsoft.VisualBasic.Collection.GetEnumerator(). Each call to GetEnumerator() attaches an object wrapped by the new enumerator to various private update events managed by the collection; when the enumerator is Disposed, it unsubscribes from those events. If GetEnumerator() is called many times (e.g. many thousands or millions of times) without the enumerators being disposed and without an intervening garbage collection, the collection will get slower and slower (hundreds or thousands of times slower than normal) as the event subscription list keeps growing. Once a garbage collection occurs, however, any abandoned enumerators' Finalize methods will clean up their subscriptions, and things will start working decently again.
I know you've said that you're calling Dispose, but I have a suspicion that something is creating an IDisposable object instance and not calling Dispose on it. If IDisposable class Foo creates and owns an instance of IDisposable class Bar, but Foo doesn't Dispose that instance within its own Dispose implementation, calling Dispose on an instance of Foo won't clean up the Bar. Once the instance of Foo is abandoned, whether or not it has been Disposed, its Bar will end up being abandoned without disposal.

Are you calling GC.SuppressFinalize in your Dispose method? Anyway, there is a nice MSDN article that explains how to write GC-friendly code: Garbage Collection Basics and Performance Hints. Maybe it will be useful.

Related

How to expose a thread-safe interface that allocates resources?

I'm trying to expose a C interface for my C++ library. This notably involves functions that allow the user to create, launch, query the status of, and then release a background task.
The task is implemented within a C++ class whose members are protected from concurrent read/write via a std::mutex.
My issue comes when I expose a C interface for this background task. Basically, I have, say, the following functions (assuming task_t is an opaque struct containing the real task class):
task_t* mylib_task_create();
bool mylib_task_is_running(task_t* task);
void mylib_task_release(task_t* task);
My goal is to make any concurrent use of these functions thread-safe; however, I'm not sure exactly how to guarantee, for example, that if one client thread calls mylib_task_is_running() at the same time another thread calls mylib_task_release(), everything is still fine.
At first I thought about adding a std::mutex to the implementation of task_t, but that means the delete statement at the end of mylib_task_release() has to happen while the mutex is not held, which means it doesn't completely solve the problem.
I also thought about using some sort of reference counting, but I still end up against the same kind of issue, where the actual delete might happen right after a hypothetical retain() function is called.
I feel like there should be a (relatively) simple solution to this, but I can't quite put my finger on it. How can I make it so I don't have to force the client code to protect accesses to task_t?
If a task_t is being deleted, you should ensure that nobody else has a pointer to it.
If one thread is deleting a task_t and another is trying to acquire its mutex, it should be apparent that you should not have deleted the task_t yet.
shared_ptrs are a great help for this.
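A minimal sketch of that idea, assuming a hypothetical C++ class named Task whose members are already guarded by an internal std::mutex (the class layout and names here are illustrative, not the library's actual code):

#include <memory>
#include <mutex>

// Hypothetical task class; its members are guarded internally, as in the question.
class Task {
public:
    bool is_running() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return running_;
    }
private:
    mutable std::mutex mutex_;
    bool running_ = false;
};

// The opaque handle owns one reference to the real task.
struct task_t {
    std::shared_ptr<Task> impl;
};

extern "C" task_t* mylib_task_create() {
    return new task_t{std::make_shared<Task>()};
}

extern "C" bool mylib_task_is_running(task_t* task) {
    // The shared_ptr keeps the Task alive even if another handle/reference
    // to the same task is released concurrently.
    return task->impl->is_running();
}

extern "C" void mylib_task_release(task_t* task) {
    // Destroys this handle and drops one reference; the Task itself is
    // deleted only when the last reference goes away.
    delete task;
}

Note that this protects the lifetime of the underlying task, not of the handle itself: calling mylib_task_is_running() on the very handle another thread is passing to mylib_task_release() is still a caller error, exactly as stated above, because nobody should still hold a pointer to a task_t that is being deleted.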

Access violation when using boost::signals2 and unloading DLLs

I'm trying to figure out this problem.
Suppose you have code that uses boost::signals2 for communicating between objects. Let's call them "colorscales". The code for these colorscales is usually situated in the same DLL as the code that uses them; let's call it main.dll.
But sometimes code from other DLLs needs to use these objects and this is where the problems begin.
Basically, the application is pretty big and most of the DLLs are loaded to do some work and then they are unloaded. This is not the case with the DLL that contains the colorscale code; it's never unloaded during the application's normal runtime.
So, when one of the DLLs is loaded (let's call it tools.dll) and some code runs, it may want to use these colorscale objects and communicate with them, so I connect to the signals these objects provide.
The problem is that boost is pretty lazy and all clever: when you disconnect() slots, it doesn't actually erase the connection and the state associated with it (like the boost::bind object and such). It just sets a flag that this connection is now disconnected and cleans it up later (as of version 1.57, it actually cleans up two of these objects when you connect new slots and one of them when you invoke the signal). You probably already see where this is going.
So, when you don't need the tools any more, you disconnect these signals and then the application unloads tools.dll.
Then at a later stage, some code executes from main.dll which causes one of the colorscale signals to be invoked. boost::signals2 goes to invoke it, but before doing so it tries to clean up one disconnected slot. This is where the access violation happens: internally the connection holds a shared_state object (or something like it) that tries to clean itself up in a thread-safe way, but the code it tries to call is no longer there, because the DLL has been unloaded, so an access violation is thrown.
I've tried to fix this by invoking the signal with some dummy parameters before the DLL is unloaded, and also by connecting and then disconnecting more slots some predefined number of times (2 or 3 times more than there are slots in total); that second one was a stupid idea, because it doesn't solve the problem, it just multiplies it.
It worked, or so I thought, because now it doesn't crash instantly, but rather crashes the next time you load the same tools.dll. I still need to figure out where and why it crashes, but it's somewhere else inside boost.
So, I wanted to ask: what are my options for fixing it?
My thoughts were:
Implementing my own connection class that works in a simpler way
Providing a simpler way to communicate, like plain callbacks (see the sketch below)
Finding a workaround for boost being so lazy and smart.
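For the second option, a rough sketch of what a simpler, explicit callback mechanism owned by main.dll might look like (all names here are made up): unlike boost::signals2, disconnect() destroys the stored callback immediately, so nothing in main.dll keeps referring to code from a DLL that is about to be unloaded.

#include <functional>
#include <map>
#include <mutex>

// Lives in main.dll next to the colorscale objects.
class ColorscaleCallbacks {
public:
    using Handle = int;

    Handle connect(std::function<void()> cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        const Handle h = next_++;
        slots_[h] = std::move(cb);
        return h;
    }

    void disconnect(Handle h) {
        std::lock_guard<std::mutex> lock(mutex_);
        slots_.erase(h);   // the std::function is destroyed right here, not lazily
    }

    void emit() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& entry : slots_)
            entry.second();
    }

private:
    std::mutex mutex_;
    std::map<Handle, std::function<void()>> slots_;
    Handle next_ = 0;
};

tools.dll would disconnect all of its handles before it is unloaded; because the std::function is destroyed inside disconnect() while tools.dll is still loaded, no stale code pointers survive the unload. (Emitting while holding the lock is a simplification; a real implementation would copy the callbacks out first so slots can connect or disconnect from within a callback.)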
Well, it seems that I've found the cause of the crash after the fix.
So, basically, what happens when you use the workaround described above (calling the signal with dummy parameters multiple times) is that it replaces the _shared_state object that was created by boost code in main.dll with another _shared_state object created by boost code in tools.dll. This object holds a pointer to a reference counter (of a type derived from boost::detail::sp_counted_base).
Then tools.dll unloads and the object remains, but its virtual table points to code that is no longer there. Let's look at the virtual table of the reference counter to understand what's going on.
[0] 0x000007fed8a42fe5 tools.dll!boost::detail::sp_counted_impl_p<...>::`vector deleting destructor'(unsigned int)
[1] 0x000007fed8a4181b tools.dll!boost::detail::sp_counted_impl_p<...>::dispose(void)
[2] 0x000007fed8a4458e tools.dll!boost::detail::sp_counted_base::destroy(void)
[3] 0x000007fed8a43c42 tools.dll!boost::detail::sp_counted_impl_p<...>::get_deleter(class type_info const &)
[4] 0x000007fed8a42da6 tools.dll!boost::detail::sp_counted_impl_p<...>::get_untyped_deleter(void)
As you can see, all these methods are related to the disposal of the reference counter, so the problem doesn't arise until you try to do the same trick a second time. So the trick of disconnecting all signals to try to get rid of all the code from tools.dll doesn't work as expected, and the next time you try it, the access violation occurs.

Guice: Why must @Singleton-annotated classes be immutable/thread safe?

NOTE: This question is not about Singleton classes as described in Gamma94 (ensuring only one object ever gets instantiated.)
I read the Guice documentation about the @Singleton annotation:
Classes annotated @Singleton and @SessionScoped must be threadsafe.
Is this the case even if I don't intend to access the object from more than one thread? If so, why?
If an object is only ever accessed from a single thread, it doesn't need to be threadsafe even if it's a Guice @Singleton. Guice doesn't do any multithreading internally that could cause a non-threadsafe singleton to break... the process of building the Injector is all done on the thread that calls Guice.createInjector, and any dynamic provisioning is done on the thread that calls provider.get(). Of course, a singleton is only going to be created once and then just returned each time it's needed... when it's created depends on whether it's bound as an eager singleton (always created at startup) and whether the Injector is created in Stage.DEVELOPMENT (created only if and when needed) or Stage.PRODUCTION (created at startup).
It's very often the case that singletons can be accessed from multiple threads at the same time though (particularly in web applications), hence the warning. While many developers will understand that a singleton needs to be threadsafe in that case, others may not and I imagine it was considered worth it to warn them.

Is it safe to emit a sigc++ signal in a destructor?

My current project has a mechanism that tracks/proxies C++ objects to safely expose them to a script environment. Part of its function is to be informed when a C++ object is destroyed so it can safely clean up references to that object in the script environment.
To achieve this, I have defined the following class:
class DeleteEmitter {
public:
    virtual ~DeleteEmitter() {
        onDelete.emit();
    }
    sigc::signal<void> onDelete;
};
I then have any class that may need to be exposed to the script environment inherit from this class. When the proxy layer is invoked it connects a callback to the onDelete signal and is thus informed when the object is destroyed.
Light testing shows that this works, but in live tests I'm seeing peculiar memory corruption (read: crashes in malloc/free) in unrelated parts of the code. Running under valgrind suggests there may be a double free, or continued use of an object after it's been freed, so it's possible that there is an old bug in a class that was only exposed after DeleteEmitter was added to its inheritance hierarchy.
During the course of my investigation it has occurred to me that it might not be safe to emit a sigc++ signal during a destructor. Obviously it would be a bad thing to do if the callback tried to use the object being deleted, but I can confirm that is not what's happening here. Assuming that, does anyone know whether this is a safe thing to do? And is there a more common pattern for achieving the same result?
The C++ spec guarantees that the data members of your object will not be destroyed until your destructor returns, so the onDelete object is untouched at that point. If you're confident that the signal won't indirectly result in any reads, writes or method calls on the object(s) being destroyed (multiple objects if the DeleteEmitter is part of another object), or generate C++ exceptions, then it's "safe." Assuming, of course, that you're not in a multi-threaded environment, in which case you also have to ensure other threads aren't interfering.
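One subtlety is worth a small sketch (the class and function names below are invented for illustration, building on the DeleteEmitter class from the question): because destructors run from the most-derived class upward, by the time ~DeleteEmitter() emits the signal, the derived part of the object has already been destroyed, so the connected callback must treat the object as nothing more than an opaque key.

#include <sigc++/sigc++.h>
#include <string>

// DeleteEmitter as defined in the question above is assumed to be visible here.

// Hypothetical class exposed to the script environment.
class ScriptedThing : public DeleteEmitter {
public:
    std::string name;   // destroyed by ~ScriptedThing, i.e. before ~DeleteEmitter runs
};

void on_thing_deleted() {
    // Called from inside ~DeleteEmitter. At this point the ScriptedThing
    // members (such as 'name') are already gone, so this callback must not
    // touch the derived object; it should only update the proxy layer's own
    // bookkeeping, e.g. remove the corresponding entry from a lookup table.
}

void track(ScriptedThing& thing) {
    thing.onDelete.connect(sigc::ptr_fun(&on_thing_deleted));
}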

Static objects and singletons

I'm using boost singletons (from serialization).
For example, there are some classes which inherit from boost::serialization::singleton. Each of them has a define like this near its definition (in the header file):
#define appManager ApplicationManager::get_const_instance()
class ApplicationManager: public boost::serialization::singleton<ApplicationManager> { ... };
And I have to call some method of that class on each update (roughly every 17 ms), for example 200 times. So the code is like:
for (int i=0; i < 200; ++i)
appManager.get_some_var();
I looked at the function call stack with gprof and saw that boost's get_const_instance is called each time. Maybe the compiler will optimize this away in release mode?
My idea is to make some global variable like:
ApplicationManager &handle = ApplicationManager::get_const_instance();
And use handle, so it wouldn't call get_const_instance each time. Is that right?
Instead of using the Singleton anti-pattern, just use a global variable and be done with it. It's more honest.
The main benefit of Singleton is when you want lazy initialization, or more fine-grained control over initialization order than a global variable would allow you. It doesn't look like either of these things is a concern for you, so just use a global.
Personally, I think designs with global variables or Singletons are almost certainly broken. But to each h(is/er) own.
If you are bent on using a Singleton, the performance concern you raise is interesting, but likely not an issue as the function call overhead is probably less than 100ns. As was pointed out, you should profile. If it really concerns you a whole lot, store a local reference to the Singleton before the loop:
ApplicationManager &myAppManager = appManager;
for (int i=0; i < 200; ++i)
myAppManager.get_some_var();
BTW, using that #define in that way is a serious mistake. Almost any use of the preprocessor for anything other than conditional compilation based on compile-time flags is probably a poor use. Boost does make extensive use of the preprocessor, but mostly to get around C++ limitations. Do not emulate it.
Lastly, that function is probably doing something important. One of the jobs of a get_instance method for Singletons is to avoid having multiple threads initialize the same Singleton at the same time. With global variables this shouldn't be an issue because they should be initialized before you've started any threads.
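To illustrate that job, here is a simplified, mutex-guarded lazy initializer with made-up names (real implementations, including boost's, differ); note that it takes the lock on every access, which is exactly the kind of overhead discussed further down:

#include <mutex>

class Config {
public:
    static Config& get_instance() {
        std::lock_guard<std::mutex> lock(init_mutex_);  // serializes first-time construction
        if (!instance_)
            instance_ = new Config();                   // never deleted; lives for the program's lifetime
        return *instance_;
    }
private:
    Config() = default;
    static std::mutex init_mutex_;
    static Config* instance_;
};

std::mutex Config::init_mutex_;
Config* Config::instance_ = nullptr;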
Is it really a problem? I mean, does your application really suffer from this behaviour?
I would despise such a solution because, in effect, you are countering one of the benefits of the Singleton pattern, namely avoiding global variables. If you want to use a global variable, then don't use Singleton at all, right?
Yes, that is certainly a possible solution. I'm not entirely sure what boost is doing with its singleton behind the scenes; you can look that up yourself in the code.
The singleton pattern is just like creating a global object and accessing the global object, in most respects. There are some differences:
1) The singleton object instance is not created until it is first accessed, whereas the global object is created at program startup.
2) Because the singleton object is not created until it is first accessed, it is actually created when the program is running. Thus the singleton instance has access to other fully constructed objects in the program when the constructor is actually running.
3) Because you access the singleton through the getInstance() method (boost's get_const_instance method) there is a little bit of overhead for executing that method call.
So if you're not concerned about when the singleton is actually created, and can live with it being created at program startup, you could just go with a global variable and access that. If you really need the singleton created after the program starts up, then you need the singleton. In that case, you can grab and hold onto a reference to the object returned by get_const_instance() and use that reference.
Something that bit me in the past, though, that you should be aware of: you're actually getting a reference to an object that is owned by the singleton. You don't own that object.
1) Do not write code that would cause the destructor to execute (say, using a shared pointer on the returned reference), or write any other code that could cause the object to end up in a bad state.
2) In a multi-threaded app, take care to correctly lock fields in the object if the object may be used by more than one thread.
3) In a multi-threaded app, make sure that all threads that hold references to the object terminate before the program unloads. I've seen a case where the singleton's code resided in one DLL and a thread holding the reference lived in another DLL. When the program ended, that thread was still active. The DLL holding the singleton's code was unloaded first; the thread that was still alive tried to do something with the singleton's object and caused a crash.
Singletons have their advantages in situations where you want to control access to something at process or application scope in a more elegant way than a global variable would allow.
However, most singleton objects provided by a library will be designed to ensure some level of thread safety, and most likely access to the instance is guarded by a mutex or some other kind of critical section, which can affect performance.
In the case of a game or 3D application where performance is key, you may want to consider writing your own lightweight singleton if thread safety is not a concern, and gain some performance.
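A rough sketch of such a lightweight singleton (a plain Meyers singleton with an invented class name; since C++11 the initialization of the local static is itself thread-safe, but access to the members is not synchronized, which matches the "thread safety is not a concern" case):

class AppManager {
public:
    static AppManager& instance() {
        static AppManager instance;   // created on first use, no mutex around access
        return instance;
    }

    int get_some_var() const { return some_var_; }

private:
    AppManager() = default;
    AppManager(const AppManager&) = delete;
    AppManager& operator=(const AppManager&) = delete;

    int some_var_ = 0;
};

// Usage: grab the reference once, outside the hot loop.
// AppManager& mgr = AppManager::instance();
// for (int i = 0; i < 200; ++i) mgr.get_some_var();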