Class Loading in JVM - classloader

When do classes get loaded in the JVM? Do they get loaded at server startup, or when a class is first referenced? My assumption is that all classes get loaded when a server such as JBoss starts up, but then there is something called lazy loading.
Also, what is actually meant by loading? Does it mean the .class is in JVM memory along with all its methods and variables (instance and static), and that they are available for execution? I know that the ClassLoader locates the bytecode for a Java class that needs to be loaded, reads the bytecode, checks the references to other classes used in that particular class, and loads them as well, creating an instance of java.lang.Class for each. This makes the class available to the JVM for execution.
Are methods also loaded in the JVM along with the class? My assumption is that methods exist only in the stack memory of threads. Then what is method memory? Is it part of the heap or the stack?
Do only static methods get loaded along with the class, and not the instance methods? I know that the static block gets executed when the class gets loaded and that all the static variables get initialized.
Thanks in advance if these doubts get cleared.

These are pretty much basic questions about JVM and Google could surely help you with answers.
For some of your questions (especially for the questions about the actual loading process), you could look here, for example: http://www.securingjava.com/chapter-two/chapter-two-7.html
In short, at the beginning only the basic (and trusted) classes are loaded by the JVM's bootstrap class loader. Next, other class loaders (for example the application class loader) are created as required, and they load more classes. Before a class can be successfully loaded, its superclass and superinterfaces must be loaded; other classes it refers to may be resolved later, depending on the JVM.
A loaded class is stored in memory in various forms (this is JVM-specific), but a Class object is always exposed. Everything inside the class (methods, variables, etc.) gets loaded. This doesn't mean that the class is also compiled to native code (JIT compilation, if any, happens later, once methods are actually being executed).
Local variables of a method live on the thread's stack (primitives and object references); the objects they refer to are allocated on the heap.
Initialization of static variables and execution of static blocks happen when the class is initialized, which is after it is loaded and before any instances of it are created.

Related

Can I create a second instance of a singleton in a DLL?

I have a static library which contains singletons. I need to load a separate instance of those singletons in the same process for testing purposes.
So I have created a DLL which links the same static library, and then the main process loads that DLL.
As soon as the DLL tries to load, I get access violations when trying to access the static instance pointers in the singletons.
Some posts that I have read say that it's impossible and that I need a second process, while others say that each DLL gets its own copies of all the static variables in the static library it links, which suggests that this should work.
Is what I am trying to do possible?
Most of the time a singleton is really meant to be only one - your request is unusual.
I know that linking a static library into a DLL can result in multiple instances of static variables, because I've seen it myself. Each DLL or EXE gets its own copy of the static library via the linker, and thus its own copy of the static variables.
The access violations may come from problems with initialization order. The best way to control that is to make sure the static variables are within a function that initializes them just-in-time, rather than global variables.
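The "static variable within a function" idea can be sketched roughly as below. This is only an illustration: the Registry class and its members are hypothetical names, not anything from the question's library. The point is that the instance is constructed on the first call to instance() rather than at some unspecified moment during static initialization, and each DLL or EXE that links the code would still get its own copy.

```cpp
#include <string>

// Hypothetical singleton, used only to illustrate just-in-time initialization.
class Registry {
public:
    // The instance is a function-local static: it is constructed the first
    // time instance() is called, and since C++11 that construction is
    // thread-safe.
    static Registry& instance() {
        static Registry r;
        return r;
    }

    void set_name(const std::string& n) { name_ = n; }
    const std::string& name() const { return name_; }

    // A singleton should not be copied.
    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;

private:
    Registry() : name_("default") {}
    std::string name_;
};
```

Callers would write Registry::instance().name() instead of touching a global pointer, so no code can observe the object before it has been constructed.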

Why calling GC.Collect is speeding things up

We have a C++ library (No MFC, No ATL) that provides some core functionality to our .NET application. The C++ DLL is used by SWIG to generate a C# assembly that can be used to access its classes/methods using PInvoke. This C# assembly is used in our .NET application to use the functionality inside C++ DLL.
The problem is related to memory leaks. In our .NET application, I have a loop in my .NET code that creates thousands of instances of a particular class from the C++ DLL. The loop keeps slowing down as it creates instances but if I call GC.Collect() inside the loop (which I know is not recommended), the processing becomes faster. Why is this? Calling Dispose() on the types does not have any effect on the speed. I was expecting a decrease in program speed on using GC.Collect() but it's just the opposite.
Each class that SWIG generates has a ~destructor which calls Dispose(). Every Dispose method has a lock(this) around the statements that make calls to dispose unmanaged memory. Finally it calls GC.SuppressFinalize. We also see AccessViolationException sporadically in Release builds. Any help will be appreciated.
Some types of object can clean up after themselves (via Finalize) if they are abandoned without being disposed, but can be costly to keep around until the finalizer gets around to them; one example within the Framework is the enumerator returned by Microsoft.VisualBasic.Collection.GetEnumerator(). Each call to GetEnumerator() will attach an object wrapped by the new enumerator object to various private update events managed by the collection; when the enumerator is Disposed, it will unsubscribe its events. If GetEnumerator() is called many times (e.g. many thousands or millions) without the enumerators being disposed and without an intervening garbage collection, the collection will get slower and slower (hundreds or thousands of times slower than normal) as the event subscription list keeps growing. Once a garbage collection occurs, however, any abandoned enumerators' Finalize methods will clean up their subscriptions, and things will start working decently again.
I know you've said that you're calling Dispose, but I have a suspicion that something is creating an IDisposable object instance and not calling Dispose on it. If IDisposable class Foo creates and owns an instance of IDisposable class Bar, but Foo doesn't Dispose that instance within its own Dispose implementation, calling Dispose on an instance of Foo won't clean up the Bar. Once the instance of Foo is abandoned, whether or not it has been Disposed, its Bar will end up being abandoned without disposal.
Are you calling GC.SuppressFinalize in your Dispose method? Anyway, there is a nice MSDN article that explains how to write GC-friendly code: garbage collection basics and performance hints. Maybe it will be useful.

Should LD_PRELOAD load module or just use module to replace symbols

We have a multi-threaded C++ app compiled with g++ running on an embedded PowerPC. To test it for memory leaks in a continuous integration test we've created a heap analyzer that gets loaded with LD_PRELOAD.
We'd like to guarantee that a function in the ld_preloaded module gets called before anything else happens (including creation of static objects etc...). Even more crucially we'd like to have another function that gets called right before the process exits so the heap analyzer can output its results. The problem we see is that a vector in our application is being created at global file scope before anything happens in our ld_preloaded module. The vector grows in size within main. At shutdown the destructor function in our preloaded module is called before the vector is destroyed.
Is there any way we can code a preloaded module to run a function before anything else and after everything else? We've tried using __attribute__((constructor)) and destructor without success.
Returning to the question title, I'm beginning to suspect that ld only looks in the preloaded module when resolving symbols for subsequent module loads. It doesn't actually load the preloaded module first. Can anyone shed any light on this for us?
Originally, you would have no control over the order of constructors from different translation units. So, this extends to shared libraries as well.
However, newer versions of GCC support applying a priority parameter to the constructor attribute which should allow you some control over when your specified function will run in relation to other global constructors. The default priority when not specified is the maximum priority value. So any priority level you set below that should make your constructor run before them, and your destructor after them.
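    #include <stdio.h>

    /* Constructor and destructor functions take no arguments and return void. */
    static void initialize(void) __attribute__((constructor(101)));
    static void deinitialize(void) __attribute__((destructor(101)));

    static void initialize(void) {
        puts("initialized");
    }

    static void deinitialize(void) {
        puts("deinitialized");
    }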
101 appears to be the lowest priority number that can be specified (0 to 100 are reserved for the implementation); 65535 is the highest. Constructors with lower numbers run first, and destructors run in the opposite order.
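To see the ordering within a single module, a small sketch like the following can help. It relies on the GCC/Clang attribute extension (not standard C++), and the function and variable names are made up for illustration. The log is a plain zero-initialized array because constructors at these priorities may run before ordinary C++ global objects have been constructed.

```cpp
// GCC/Clang extension: constructor priorities. Not portable C++.

// Plain static storage is zero-initialized before any constructor runs,
// so it is safe to write to from early constructors.
static char order_log[8];
static int order_len = 0;

static void record(char c) { order_log[order_len++] = c; }

// Deliberately defined out of order: the priority number, not the position
// in the source file, is what should decide the sequence.
__attribute__((constructor)) static void default_init() { record('C'); }
__attribute__((constructor(200))) static void later_init() { record('B'); }
__attribute__((constructor(101))) static void early_init() { record('A'); }

// Hypothetical accessor so other code can inspect the recorded order.
const char* init_order() { return order_log; }
```

With GCC on an ELF target, the expectation is that early_init runs first, then later_init, then the constructor without an explicit priority. Note that this only demonstrates ordering inside one linked module; it does not by itself show how constructors in a preloaded library are ordered relative to those in the main executable.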

Where are global variables of a DLL stored in memory?

Suppose you have a VB6 app which uses a C++ DLL. They share the same memory (you can use pointers from one in the other). The DLL is declared in the VB6 app with Public Declare Function ... Lib ...
So how does this fit with the "Stack grows from one side of memory, heap from the other" philosophy? Where is the stack of the DLL? Are global DLL variables allocated when the application is started? If so, why does it only give me an error when I try to run a function from the DLL?
VB6 uses thread-local storage for module-level variables, not data segments. What this means is that public (global) variables in a module can have different values on different threads, which is not what a C/C++ developer is used to.
Global variables are stored in the Data Segment.
http://en.wikipedia.org/wiki/Data_segment
The stack is only used for local variables.
Global DLL symbols will be in the DLL image itself. If the DLL uses the symbol as a pointer to which it attaches some dynamic memory, then the memory will come from wherever the dynamic allocation comes from (typically the heap used by the CRT). We would need to see exactly what the VB declaration of the C++ import looks like and what the C++ DLL does (it could be initializing in DllMain, could be a static region in the DLL image, could require a call to some Init function, etc.).
"Stack grows from one side of memory, heap from the other" was maybe true on 8088 processors; no such thing happens on modern platforms. A stack is allocated per thread and grows in one direction (downwards on x86), true, but there could be hundreds of stacks in a process. Heap gets allocated all over the place and grows, basically, at random. And a typical process also has several heaps in it.
There is typically one stack per thread. The function in the DLL will use the stack of the current thread (the thread on which it was invoked).
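A rough way to observe this from C++ is to compare addresses seen by two threads. In the sketch below all names are hypothetical; it records the address of a local variable and of a global from the calling thread and from a freshly started thread. The global should have the same address in both, while the locals live on different stacks.

```cpp
#include <cstdint>
#include <thread>

int g_counter = 0;  // global: stored in the module's data segment

struct Addresses {
    std::uintptr_t local;   // address of a stack variable in the probing thread
    std::uintptr_t global;  // address of g_counter as seen by that thread
};

// Records where a local variable and the global live for the calling thread.
Addresses probe() {
    int local = 0;
    return { reinterpret_cast<std::uintptr_t>(&local),
             reinterpret_cast<std::uintptr_t>(&g_counter) };
}

// Runs probe() on a new thread, which has its own stack, and returns
// what that thread observed.
Addresses probe_on_new_thread() {
    Addresses result{};
    std::thread t([&result] { result = probe(); });
    t.join();
    return result;
}
```

Calling probe() on the main thread and probe_on_new_thread() afterwards, one would expect equal global addresses and different local addresses. On some toolchains the program needs to be built with -pthread.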
See Remus's answer to your other questions about memory management.

Guice: Why must @Singleton-annotated classes be immutable/thread safe?

NOTE: This question is not about Singleton classes as described in Gamma94 (ensuring only one object ever gets instantiated.)
I read the Guice documentation about the @Singleton annotation:
Classes annotated @Singleton and @SessionScoped must be threadsafe.
Is this the case even if I don't intend to access the object from more than one thread? If so, why?
If an object is only ever accessed from a single thread, it doesn't need to be threadsafe even if it's a Guice @Singleton. Guice doesn't do any multithreading internally that could cause a non-threadsafe singleton to break... the process of building the Injector is all done on the thread that calls Guice.createInjector and any dynamic provisioning is done on the thread that calls provider.get(). Of course, a singleton is only going to be created once and then just returned each time it's needed... when it's created depends on whether it's bound as an eager singleton (always created at startup) and whether the Injector is created in Stage.DEVELOPMENT (created only if and when needed) or Stage.PRODUCTION (created at startup).
It's very often the case that singletons can be accessed from multiple threads at the same time though (particularly in web applications), hence the warning. While many developers will understand that a singleton needs to be threadsafe in that case, others may not and I imagine it was considered worth it to warn them.