Profiling jetty with visualvm is super slow

I have a Wicket + Spring + Hibernate application running on Jetty. When I start CPU profiling it with VisualVM (jdk 1.7.0_9), it first stalls for several minutes while the console prints:
Profiler Agent: 250 classes cached.
Profiler Agent: 250 classes cached.
These lines are repeated around 20 times, then VisualVM says it has started instrumentation and has instrumented around 8000 methods.
After this, I click a button in my web application and the application again hangs completely for a few minutes while the console prints lines like:
Profiler Agent: Redefining 100 classes at idx 100, out of total 336
After this I get profiling results, but they are pretty useless, as almost 99.6% of the time is spent in
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run()
This makes VisualVM totally unusable. Any guesses what could be the culprit here?
I'm running Jetty 8.1.2.v20120308

I would suggest starting with CPU sampling. Once you have an idea of what is wrong, you can switch to profiling to get detailed information. Be sure to read Profiling With VisualVM, Part 1 and Profiling With VisualVM, Part 2 for more information on how to set up profiling of your Jetty server.
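To illustrate why sampling is so much lighter than instrumentation, here is a minimal Java sketch of the idea: take a few periodic thread dumps and count which method sits on top of each stack. This is only an illustration of the general technique, not how VisualVM implements its sampler; the class name `StackSampler` and its parameters are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of sampling; not VisualVM's actual implementation.
class StackSampler {

    // Takes `samples` thread dumps, `intervalMillis` apart, and counts
    // how often each method appears as the top frame of a thread.
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                StackTraceElement[] stack = info.getStackTrace();
                if (stack.length == 0) {
                    continue; // no Java frames available for this thread
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Print the ten most frequently seen top frames.
        sample(20, 50).entrySet().stream()
                .sorted((a, b) -> Integer.compare(b.getValue(), a.getValue()))
                .limit(10)
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

No classes are rewritten here, so there is nothing comparable to the "Redefining 100 classes" pauses; the trade-off is that the results are statistical rather than exact call counts.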

The answer is to narrow the scope of what classes are being instrumented.
Click on the settings option in the profiler and look at "Do not profile classes" or "profile only classes". Be sure to exclude third party libraries that you don't want to examine. For example, I was using Jython in my app and the profiler was trying to instrument thousands of classes, likely including classes dynamically generated at runtime (not good).

Related

CPU Usage gradually increases in dotnet core webservice

I have a .NET Core web service whose CPU usage slowly increases over time.
On the first day it won't go past 10%, on the second day it can reach 20%, and so on.
Using the top command on Linux, all my web services show up from time to time (probably when a request is made) and then disappear.
This specific process, after running for a while, just stays there, constantly consuming CPU even when no requests are being made.
The API still works fine; it seems that some threads just keep hanging and consuming CPU. Last time I checked, I had 5 threads that consumed 3-4% CPU and didn't die for some reason.
My guess is that in some specific scenario a thread just stays alive, consuming CPU.
The app runs on an Ubuntu machine. My first step was to create a dump file with ProcDump so I could analyze those threads and maybe find where they are hanging.
ProcDump generates a huge 21 GB file, and trying to analyze it with lldb throws an out-of-memory exception. I even tried transferring it to a Windows machine to debug with WinDbg, but that didn't help, as it couldn't open the file.
As there is no specific exception or anything, I can't really share any piece of code, since I have no idea where the issue is. I'm just hoping for some suggestions that might help me get to a solution, or at least understand where the problem is.
Thanks a lot for reading, cheers

You could try using something like JetBrains' dotMemory; they also have a fairly high-level but helpful guide: https://www.jetbrains.com/help/dotmemory/How_to_Find_a_Memory_Leak.html. It's also worth checking your startup file and double-checking that the services you've registered use the correct lifetime, i.e. not added as scoped when they should be transient, or even singleton.

So I've been at it for a while.
Eventually I found out that my problem was with HttpClient.
Probably some bad mix of a static class and creating new instances of HttpClient caused the issue I've explained above.
I solved it by using HttpClientFactory, as explained here:
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/http-requests?view=aspnetcore-2.1
Lesson learned :)

A little late, but ProcDump for Linux just added .NET Core 3 support that generates much more manageably sized core dumps. It automatically detects whether the target process is .NET Core and does the right thing (i.e., no need to specify switches).

Visual Studio - Program runs faster when profiling

I have been doing some profiling of a physics application I wrote, and I've noticed when I profile it, it runs faster and perhaps smoother than without the profiler. Note that I am NOT running the program in the debug configuration or with the debugger attached.
I measured the difference, and I found the program runs ~50% faster under the profiler. I don't consider this a duplicate because the other question doesn't make it clear whether he/she was running it with the debugger attached, and the top answer assumes that's the case (and the 20x speedup strongly indicates it would be the correct answer in most cases).
Another answer suggests a "heisenbug", but that's kind of a catch-all hypothesis (I'm still going to investigate along this line).
Is it possible that Visual Studio does something that prevents other applications from interfering with my application's compute or memory resources (in order to get a "fairer" result)?

Visual Studio's "CPU Usage" profiler appears to disregard laptop power usage settings, so if you run an application on a laptop that is trying to conserve battery power, it will run slower than if you run the profiler on it.
I discovered this when I got home from work- I noticed the speed difference had disappeared. On a hunch, I unplugged my laptop and tried the test several more times. The speed difference returned. What's more, under the profiler, the application runs at about the same speed plugged in or not.
I was not able to find any sources on this, but I'll be happy to edit them in if someone can find some.

If you use threading in your code, this can be caused by the system timer resolution in Windows.
The default Windows timer resolution is 15.6 ms.
When you are running the profiler, this is reduced to 1 ms and your program runs faster.
Check out this answer

ColdFusion scheduler threads eating CPU

I've got CF10 running on a dev box, Windows 7, 64 bit.
Periodically, every minute or so, the CPU usage for CF10 will spike up to 100% for about 20 seconds and come back down. It's pretty regular.
I've found it difficult to diagnose this issue. I've seen talk of client variables purges, logging, monitoring and all manner of things - but I've turned these all off to no avail.
With VisualVM, I've managed to track the issue down to the 'scheduler' threads. I have 5 of these in a waiting state. Periodically each will run, bumping up the CPU dramatically.
Taking a thread dump, it seems that all these threads are calling java.io.WinNTFileSystem.getBooleanAttributes - something I've seen mentioned a few times as potentially problematic.
UPDATE: Recently I've been playing with onSessionEnd on another app, and discovered that the scheduler-x threads appear to be internal to ColdFusion - my onSessionEnd tasks always seem to run in one of these threads.
Looking in the temp folder, I can see that a lot of EH Cache folders have been made which I think are to do with query caching. The apps I have running make use of this fairly extensively. I thought clearing the temp folder out might improve performance but it has had no effect.
It's worth noting that if I start the CF service without actually calling any of my apps, the problem does not occur. That might suggest the issue is with the apps themselves, however they do not cause any issue in production - only on this box.
There are no scheduled tasks set up either.
Below is an example of one of the threads causing high CPU. I'd appreciate any help in diagnosing what this thread is doing and why, as well as how to potentially stop it from using so many resources.
"scheduler-2" - Thread t#84
java.lang.Thread.State: RUNNABLE
at java.io.WinNTFileSystem.getBooleanAttributes(Native Method)
at java.io.File.isDirectory(File.java:849)
at coldfusion.watch.Watcher.accept(Watcher.java:352)
at java.io.File.listFiles(File.java:1252)
at coldfusion.watch.Watcher.getFiles(Watcher.java:386)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.getFiles(Watcher.java:397)
at coldfusion.watch.Watcher.checkWatchedDirectories(Watcher.java:166)
at coldfusion.watch.Watcher.run(Watcher.java:216)
at coldfusion.scheduling.ThreadPool.run(ThreadPool.java:211)
at coldfusion.scheduling.WorkerThread.run(WorkerThread.java:71)
My environment:
Win 7 64-bit
CF10 Update 12
JDK 1.8.0_11
The issue occurs on multiple versions of JVM - this version is currently used to make monitoring available.
My java settings:
Min heap size: 512mb
Max heap size: 1024mb
-server -XX:MaxPermSize=512m -XX:+UseParallelGC -Xbatch -Dcoldfusion.home={application.home} -Dcoldfusion.rootDir={application.home} -Dcoldfusion.libPath={application.home}/lib -Dorg.apache.coyote.USE_CUSTOM_STATUS_MSG_IN_HEADER=true -Dcoldfusion.jsafe.defaultalgo=FIPS186Random -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.port=8701 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false
I'd be lying if I said I understood what all of these settings do!
Sorry if you're one of those people that believes all CF developers should be Java app stack experts. I am not.
Any help, much appreciated. ;)

Using FusionReactor 6, I was able to solve this for us today. We were using this.javaSettings to hot load java class files. The WatchInterval from this.javaSettings uses the DirectoryWatcher at the specified watch number. In our case, I had lowered it to one second.
How I solved it: I set a breakpoint in FusionReactor and could see that it was constantly stuck scanning the directory above the one I specified in this.javaSettings. This directory has enough files and subfolders that it looks like one DirectoryWatcher was unable to finish before the next one was created. Had ColdFusion just stuck to the subfolder I specified in this.javaSettings, it would not have been a problem.
Example:
This.javaSettings = {
loadPaths = ["\externals\lib\"]
, loadColdFusionClassPath = true
, reloadOnChange = true
, watchInterval = 1
};
In the above case, lib has just 5 files. However, "externals" is loaded with stuff. In the breakpoint, it was typically looking at stuff in "externals."
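To get a feel for how much work one pass of such a watcher involves, here is a rough Java sketch that walks a directory tree in the same general way the stack trace in the question suggests (listFiles plus isDirectory, recursing into subfolders) and reports how many entries it had to touch. This is not ColdFusion's Watcher code; the class name `ScanCost` is hypothetical, and the sketch does not guard against symbolic-link loops.

```java
import java.io.File;

// Hypothetical sketch; not ColdFusion's Watcher, only the same general pattern.
class ScanCost {

    // Recursively counts the entries a full scan of `dir` has to visit.
    static long countEntries(File dir) {
        File[] children = dir.listFiles(); // null if dir is not a readable directory
        if (children == null) {
            return 0;
        }
        long count = 0;
        for (File child : children) {
            count++;
            if (child.isDirectory()) { // one file-system call per entry
                count += countEntries(child);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Pass the directory to measure as the first argument; defaults to the current one.
        File root = new File(args.length > 0 ? args[0] : ".");
        long start = System.nanoTime();
        long entries = countEntries(root);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(entries + " entries scanned in " + millis
                + " ms under " + root.getAbsolutePath());
    }
}
```

Running it once against the folder in loadPaths and once against its parent gives a quick comparison; if the parent scan takes longer than the watchInterval, overlapping scans like the ones described above are plausible.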

Do you have scheduled tasks running that use the CFFILE tag? They tend to be resource hogs. Spinning these into their own threads may help with the CPU spike.
Another thought, looking at the JVM settings:
•Min heap size: 512mb
•Max heap size: 1024mb
These establish the minimum and maximum heap memory available to the Java virtual machine.
-server -XX:MaxPermSize=512m
This is the maximum amount of memory dedicated to the Java permanent generation.
Your permanent generation limit is half the size of your maximum heap. Try bumping the maximum heap size up to 2048mb and restarting the ColdFusion service. It could go higher, depending on whether you're running a 64-bit operating system.

Is there any way to reduce nrepl (ritz-repl) startup time?

I wasn't using ritz-nrepl before, and nrepl took around 10 seconds to start, which is long but still bearable since I don't restart it that often.
When I tried out ritz-repl, it took nearly 30 seconds to boot and consumed around 1.3 GB of memory.
This makes me reluctant to use it.
I even put in an SSD hoping it would increase the speed, because I heard someone mention that he "hardly notice the lein repl startup time" using Ubuntu + SSD. But I can't tell the difference between SSD and HDD myself. I don't know if I did something wrong or if it's just a myth.

There might be ways to reduce the startup time of an nrepl server that includes ritz, but for the most part you will be stuck with at least the 10 seconds it takes to boot up the JVM on your machine. For me that is kind of an unacceptable delay when doing interactive development.
As an alternative, you can use a smarter code reloading approach with the clojure.tools.namespace library. It keeps a dependency graph in memory and reloads only those namespaces that have changed since your last refresh.
This will work out of the box for some but not all Clojure code. See the 'Preparing Your Application' section of the readme for more info on those edge cases to avoid.
Hope this helps!

Profiling a Play framework application (2.0.2) through VisualVM

I'm having some serious problems with my Play! application's performance. I already tried changing the server and the database, but the slowness persists.
Using Firebug to measure my HTTP requests, I found out that they take around 20 seconds just to start replying.
So my last hope is to use VisualVM to profile my application and find its bottlenecks. But I don't know the proper way of passing arguments like -Dcom.sun.management.jmxremote without messing with the global JAVA_OPTS variable.
Thanks again!

It looks like Metrics handles this automatically.
Add the following to your Build.scala app dependencies:
"com.yammer.metrics" % "metrics-core" % "2.1.2"
And start instrumenting your code. Then start up the application with "play run" -- VisualVM should show your JVM process and you can just connect to it directly (assuming you have the VisualVM-MBeans plugin). Check to see if you have at least 1.3.4. This is what I see when I start up:
the xsbt.boot.Boot process is Play.
More generally, this article really helps when debugging Akka based frameworks like Play.

In case someone needs to profile a Play 2.3.x app:
Put your JAVA_OPTS settings in ~/.activator/activatorconfig.txt (c.f. https://typesafe.com/activator/docs):
-Dcom.sun.management.jmxremote.port=1234
-Dcom.sun.management.jmxremote.rmi.port=1234
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=127.0.0.1
In VisualVM, add a Local JMX connection to localhost:1234
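Before pointing VisualVM at the port, it can help to check that the JMX endpoint is reachable at all. The following Java sketch builds the standard RMI-based JMX service URL for a host and port and tries to connect to it. The class name `JmxUrl` is hypothetical, and port 1234 is simply the value from the settings above; this is a quick connectivity check, not a replacement for VisualVM.

```java
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper for checking a JMX endpoint; names are illustrative only.
class JmxUrl {

    // Builds the usual service URL for a JMX agent exposed through an RMI registry.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    public static void main(String[] args) throws Exception {
        // Assumes the application was started with jmxremote.port=1234 and no authentication or SSL.
        JMXServiceURL url = new JMXServiceURL(serviceUrl("localhost", 1234));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection server = connector.getMBeanServerConnection();
            System.out.println("Connected, MBean count: " + server.getMBeanCount());
        }
    }
}
```

If the connection fails here, VisualVM will most likely fail against the same host and port too, which narrows the problem down to the JVM options or the network rather than the profiler.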
