Make JProfiler ignore `Thread.sleep()` in CPU views

In the Call Tree and Hot Spots views of JProfiler's CPU profiler, I have found that some methods are shown as hot spots which aren't really hot spots. These methods are skewing my profiling work: they dominate the CPU views and make all other CPU consumers appear insignificant.
For example, one thread performs a Thread.sleep(300_000L) (sleeping for 5 minutes) and then does some relatively minor work, inside a while(true) loop. JProfiler is configured to update the view every 5 seconds, and I have set the thread status selector to "Runnable". Every 5 seconds, when JProfiler updates the view, I would expect the total self time for the method to remain relatively small, since the thread is sleeping and not in a runnable state. Instead, I see the self time increase by about 5 seconds, which would (incorrectly) indicate that the thread was in the runnable state for the entire 5-second interval. My concern is that the tool will be useless for my CPU profiling purposes if I cannot filter out the sleeping (Waiting) state.
With some testing, I have found that when the Thread.sleep() call eventually terminates, the self time drops back to near zero and begins climbing again with the next invocation of Thread.sleep(). So it seems to me that JProfiler counts the method stats for the current invocation of Thread.sleep() as Runnable until the method actually terminates, and only then are those stats backed out.
Is this a bug in JProfiler? Is there a way to get JProfiler not to count Thread.sleep() towards the Runnable state, even for long-running invocations of Thread.sleep()?
I am using a licensed version of JProfiler 8.1.4. I have also tried an evaluation version of JProfiler 10.1.
Update:
Here is a simple test case which exhibits the problem for me. I discovered that if I move the Thread.sleep() call into a separate method, the problem goes away (see the inline comments). This is not a great workaround, because I'm profiling a large application and don't want to update every place where it calls Thread.sleep().
public class TestProfileSleep {
    public static void main(String... args) {
        new Thread(new Runnable() {
            private void sleep(long millis) throws InterruptedException {
                Thread.sleep(millis);
            }

            public void run() {
                try {
                    while (true) {
                        Thread.sleep(60_000L); // profiling this is broken
                        //sleep(60_000L); // profiling this works
                    }
                }
                catch (InterruptedException ie) {
                }
            }
        }).start();
    }
}

Related

Create threads dynamically depending on time needs of single tasks

Say I have a list of callable objects like
std::list<std::shared_ptr<Callable>> tasks;
and the task is to run them all in an infinite loop, say
void run_all(const bool& abort) {
    while (true) {
        for (const auto& ptr : tasks) {
            if (abort) return;
            (*ptr)();
        }
    }
}
This is fine as long as every "task" finishes after a short time. Now I'd like to add the requirement that whenever a task needs more time than a specific threshold, a new thread should be created so that the other tasks do not have to wait for one specific long-running task.
The simplest solution, in terms of code complexity, that I can think of at the moment would be to create a thread for each task:
void run_all(const bool& abort) {
    auto job = [&](std::shared_ptr<Callable> task) {
        while (!abort) {
            (*task)();
        }
    };
    std::list<std::thread> threads;
    for (auto& ptr : tasks) {
        threads.emplace_back(job, ptr);
    }
    for (auto& t : threads) {
        t.join();
    }
}
But this might create inappropriately many threads.
What is an appropriate way to run the tasks and create threads dynamically depending on how long a task needs to finish? Say we have some
std::chrono::milliseconds threshold;
and the goal is to run the first task and continue with the next one afterwards if the first takes no longer than threshold to finish, but to create a new thread to run the rest of the tasks in parallel if the first task does not finish before threshold. The generalized goal is:
- If no task has finished in some thread (allowing another task to begin running) during the period threshold, then a new thread should be created so that other tasks, which may potentially run in a very short time, do not have to wait.
- If there are more than 3 threads that each finish at least one task per period threshold, one of them should be joined.
- There may be tasks that themselves run ad infinitum. This should have no effect on the other tasks.
What would be an appropriate implementation satisfying these requirements, or at least doing something related, or at least a concept of an implementation?
Or is it completely fine to just create a bunch of threads? (I'm thinking about running such an application on a low-performance machine like a Raspberry Pi, with a set of 50 to 300 tasks to be handled.)
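Below is a rough concept sketch, not from the question, of one way to read the threshold rule: workers pull tasks from a shared queue, and a monitor spawns an extra worker whenever no task has completed within one threshold period. The DynamicRunner type, the use of std::function for Callable, the progress counter, and the atomic flag replacing the question's const bool& abort are all illustrative assumptions; the sketch drains the task list once instead of looping forever and omits the rule about joining surplus threads.
#include <atomic>
#include <chrono>
#include <cstdint>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

using Callable = std::function<void()>;               // stand-in for the question's Callable

struct DynamicRunner {
    std::deque<std::shared_ptr<Callable>> pending;     // tasks not yet started
    std::mutex mtx;
    std::atomic<bool> abort{false};
    std::atomic<std::uint64_t> completed{0};           // progress signal for the monitor
    std::vector<std::thread> workers;

    void worker_loop() {
        while (!abort) {
            std::shared_ptr<Callable> task;
            {
                std::lock_guard<std::mutex> lock(mtx);
                if (pending.empty()) return;            // nothing left for this worker
                task = pending.front();
                pending.pop_front();
            }
            (*task)();                                  // may block for a long time
            ++completed;
        }
    }

    void run(std::chrono::milliseconds threshold) {
        workers.emplace_back(&DynamicRunner::worker_loop, this);
        while (!abort) {
            auto before = completed.load();
            std::this_thread::sleep_for(threshold);
            {
                std::lock_guard<std::mutex> lock(mtx);
                if (pending.empty()) break;             // all tasks handed out
            }
            if (completed.load() == before)             // no progress: a worker is stuck
                workers.emplace_back(&DynamicRunner::worker_loop, this);
        }
        for (auto& t : workers) t.join();               // note: never returns if a task runs forever
    }
};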

Experiencing deadlocks when using the Hikari transactor for Doobie with ZIO

I'm using Doobie in a ZIO application, and sometimes I get deadlocks (a total freeze of the application). This can happen if I run my app on only one core, or if I reach the maximum number of parallel connections to the database.
My code looks like:
def mkTransactor(cfg: DatabaseConfig): RManaged[Blocking, Transactor[Task]] =
  ZIO.runtime[Blocking].toManaged_.flatMap { implicit rt =>
    val connectEC = rt.platform.executor.asEC
    val transactEC = rt.environment.get.blockingExecutor.asEC
    HikariTransactor
      .fromHikariConfig[Task](
        hikari(cfg),
        connectEC,
        Blocker.liftExecutionContext(transactEC)
      )
      .toManaged
  }

private def hikari(cfg: DatabaseConfig): HikariConfig = {
  val config = new com.zaxxer.hikari.HikariConfig
  config.setJdbcUrl(cfg.url)
  config.setSchema(cfg.schema)
  config.setUsername(cfg.user)
  config.setPassword(cfg.pass)
  config
}
Alternatively, I set the leak detection parameter on Hikari (config.setLeakDetectionThreshold(10000L)), and I get leak errors which are not due to the time taken to process DB queries.
There is a good explanation in the Doobie documentation about the execution contexts and the expectations for each: https://tpolecat.github.io/doobie/docs/14-Managing-Connections.html#about-transactors
According to the docs, the "execution context for awaiting connection to the database" (connectEC in the question) should be bounded.
ZIO, by default, has only two thread pools:
zio-default-async – Bounded,
zio-default-blocking – Unbounded
So it is quite natural to believe that we should use zio-default-async since it is bounded.
Unfortunately, zio-default-async assumes that its operations never, ever block. This is extremely important because it is the execution context used by the ZIO interpreter (its runtime) to run. If you block on it, you can actually block the evaluation progress of the ZIO program. This happens more often when there is only one core available.
The problem is that the execution context for awaiting DB connection is meant to block, waiting for free space in the Hikari connection pool. So we should not be using zio-default-async for this execution context.
The next question is: does it make sense to create a new thread pool and corresponding execution context just for connectEC? There is nothing forbidding you from doing so, but it is likely not necessary, for three reasons:
1. You want to avoid creating thread pools, especially since you likely have several already created by your web framework, DB connection pool, scheduler, etc. Each thread pool has its cost. Some examples are:
- more for the JVM to manage;
- more OS resources consumed;
- more switching between threads, which is expensive in terms of performance;
- a more complex application runtime to understand (complex thread dumps, etc.).
2. ZIO's thread pool ergonomics are becoming well optimized for their usage.
3. At the end of the day, you will have to manage your timeout somewhere, and the connection is not the part of the system most likely to have enough information to know how long it should wait: different interactions (i.e., in the outer parts of your app, nearer to the use points) may require different timeout/retry logic.
All that being said, we found a configuration that works very well in an application running in production:
// zio.interop.catz._ provides a `zioContextShift`
val xa = (for {
  // our transaction EC: wait for acquire/release connections, must accept blocking operations
  te <- ZIO.access[Blocking](_.get.blockingExecutor.asEC)
} yield {
  Transactor.fromDataSource[Task](datasource, te, Blocker.liftExecutionContext(te))
}).provide(ZioRuntime.environment).runNow

def transactTask[T](query: Transactor[Task] => Task[T]): Task[T] = {
  query(xa)
}
I made a drawing of how the Doobie and ZIO execution contexts map to each other: https://docs.google.com/drawings/d/1aJAkH6VFjX3ENu7gYUDK-qqOf9-AQI971EQ4sqhi2IY
UPDATE: I created a repo with 3 examples of this pattern's usage (mixed app, pure app, ZLayer app) here: https://github.com/fanf/test-zio-doobie
Any feedback is welcome.

Profile the locks in OpenJDK or any Java VM

What I want to do is count how many locks happen during the execution of a JVM application. (I know the lock count may change from run to run, but I just want to get the average number.) And I cannot change the application, since it is a benchmark.
I have tried to use the JRockit JDK, but there are two problems:
-Djrockit.lockprofiling=true does not give me the profiling information (link);
does "-Xverbose:locks" print the right information?
The platform I am using is Ubuntu Server.
Any suggestions on this would be greatly appreciated.
To do this, I've previously used AspectJ with a pointcut that detects locking and a counter, i.e.:
import java.util.concurrent.atomic.AtomicInteger;

public aspect CountLocks {
    private static AtomicInteger locks = new AtomicInteger();

    before(Object l) : lock() && args(l) { locks.incrementAndGet(); }

    after() : execution(void main(String[])) { System.out.println(locks + " locks"); }
}
But this obviously involves weaving the code, potentially changing its performance characteristics.

Perform function at certain clock time

I would like the user to input a time, e.g. 1400h, which will then cause a function to run at 1400h.
How can I do this?
Context: I have a client-server program that runs on the same computer, and I need several nodes to send messages simultaneously (that is the function mentioned above).
Edit: Ideally I do not want to use a sleep() function, as the clients will be started at different times, and it is a much neater solution to call something that causes the function to execute at 1400h.
You can use std::this_thread::sleep_until, e.g.
int main()
{
    auto fire_time = /**/;   // time_point at which to run the function
    std::thread thread([&]
    {
        std::this_thread::sleep_until(fire_time);
        fire();
    });
    thread.join();
}
You can refactor that into a helper function, which is probably what you are looking for:
template<class Func, class Clock, class Duration>
void run_at(Func&& func, const std::chrono::time_point<Clock, Duration>& sleep_time)
{
    // Bind both the callable and the time point by value, since the detached
    // thread may outlive the caller's locals.
    std::thread(std::bind([](const Func& func, const std::chrono::time_point<Clock, Duration>& when)
    {
        std::this_thread::sleep_until(when);
        func();
    }, std::move(func), sleep_time))
    .detach();
}
If the program is running the entire time, use a function such as sleep to wait the amount of time between now and 1400h. You might need to do this in a separate thread to allow the program to do other things, or replace the sleep with an event loop timeout (if the program is event-loop-based).
If the program must exit, then you must use a system facility, such as at on Unix, to arrange the program to be restarted and code to be executed at the specified time.
I believe you need some kind of task manager. That's a basic model. Spawning sleeping threads is the wrong way to do that job. A single manager will know when to run the next task. How to run a task is another question: you can create a new thread per task if you want them to be interactive, or you can serialize them and run them from within the manager thread.
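As a minimal sketch of that manager idea, assuming only standard library primitives and a single dedicated manager thread (the TaskManager name and interface are made up for illustration): tasks are kept ordered by their scheduled time, and the manager sleeps until the earliest one is due.
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>

class TaskManager {
public:
    using Clock = std::chrono::system_clock;

    // Register a task to run at the given time.
    void schedule(Clock::time_point when, std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mtx_);
        tasks_.emplace(when, std::move(task));
        cv_.notify_one();                         // wake the manager to re-check
    }

    // Run due tasks forever; call this from the single manager thread.
    void run() {
        std::unique_lock<std::mutex> lock(mtx_);
        for (;;) {
            if (tasks_.empty()) {
                cv_.wait(lock);                   // nothing scheduled yet
                continue;
            }
            auto next = tasks_.begin();
            if (cv_.wait_until(lock, next->first) == std::cv_status::timeout) {
                auto task = std::move(next->second);
                tasks_.erase(next);
                lock.unlock();
                task();                           // serialized in the manager thread
                lock.lock();
            }
            // otherwise a new (possibly earlier) task arrived; loop and re-check
        }
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::multimap<Clock::time_point, std::function<void()>> tasks_;
};
A caller would typically start run() on one std::thread and call schedule from anywhere; tasks then execute serialized in the manager thread, matching the "run from within the manager thread" option above.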

Is it possible to change the tick count value returned from GetTickCount()?

I'm trying to do some testing, and it requires the Windows system to be up and running for 15 real-time minutes before a certain action can ever occur. However, it is very time-consuming to have to wait the 15 real-time minutes.
Is there a way to change the value GetTickCount() returns so as to make it appear that the system has been running for 15 real-time minutes?
Edit: There is an app that does something close to what I want, but it doesn't quite seem to work, and I have to deal with hexadecimal values instead of straight decimal values: http://ysgyfarnog.co.uk/utilities/AdjustTickCount/
Not directly.
Why not just mock the call, or replace the chunk of code that does the time check with a strategy object?
struct Waiter
{
    virtual void Wait() = 0;
    virtual ~Waiter() {}
};

struct FifteenMinWaiter : public Waiter
{
    virtual void Wait()
    {
        // Do something that waits for 15 mins
    }
};

struct NothingWaiter : public Waiter
{
    virtual void Wait()
    {
        // Nil
    }
};
You could do something similar to mock out a call to GetTickCount, but doing this at the higher level of abstraction of whatever is doing the wait is probably better.
For debugging purposes, you can just replace all the calls to GetTickCount() with _GetTickCount(), which you can implement to return either GetTickCount() or GetTickCount() + 15 minutes, depending on whether or not you are debugging.
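A minimal sketch of that wrapper, assuming a build-time switch (here called SIMULATE_UPTIME; the function is renamed DebugGetTickCount because identifiers starting with an underscore and a capital letter are reserved in C++):
#include <windows.h>

// In "debug" builds, pretend the system has been up 15 minutes longer than it really has.
static DWORD DebugGetTickCount()
{
#ifdef SIMULATE_UPTIME
    return GetTickCount() + 15u * 60u * 1000u;   // 15 minutes in milliseconds
#else
    return GetTickCount();
#endif
}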
Why not make it one minute, confirm it works, then change it back to fifteen?
You could do something quite hideous like #define GetTickCount() MyReallyEvilReplacement().
You can use the Application Verifier provided with the Windows SDK to run your app with the "Miscellaneous > TimeRollOver" test. It will fake a tick count which starts at a time that will overflow after a short moment.
Another possibility is to hibernate / hybrid shutdown / sleep a Windows system, then boot into the BIOS and change the date and time to something you require, for example adding 30 days if you want to test unsigned tick counts. When Windows boots again, it has no way of detecting the appropriate time since the computer really started, and thinks it has been running for 30 more days. It is important to use sleep / hibernate / hybrid shutdown (the latter being the default since Windows 8), not a full shutdown, as the uptime is otherwise reset.
Yet another possibility could be to hook imports of GetTickCount to your own code and let it return arbitrary results.