Missing heartbeat of client machine - c++

My application launches hundreds of child processes sent to SGE. Few of them take a lot of memory due to which jobs are being failed.
i need some way to monitor the memory usage of clients from main process and relaunch/resubmit them to grid with higher memory requirement in case of such job failures.
i have heard something about missing heartbeat algo for such requirements but I am not much aware of them.
Can experts here please help me in finding out a nice solution for this issue? my application is a c++ application on Linux/Solaris.
Thanks
Ruchi

A solution I have used before is to have a script that captures the output from the qstat-command (using rsh in my case). I filter on my jobs and store the information I need (in my case it was CPU) in a continuously updated list. When a job aborted or was killed, it was easy to go back and look at the CPU usage. It is not 100% real-time, but good enough for me.
My language of choice was Python, as it contains easy-to-use libraries for catching output and logging in to remote machines. However, it should be easy to implement something like capturing rsh-output in C++. You can for example use popen() to pipe the output into your application. I hope this helps.

Related

CPU Usage gradually increases in dotnet core webservice

I have a .net Core web service which seems to slowly increase its cpu usage.
meaning at the first day it won't go past 10%, the second day it can go up to 20% and so on.
Using the TOP command in linux, all my webservices seems to sometime be shown there (probably when a request is made) and afterward disappear.
This specific process after running for a while just stays there constantly consuming cpu even when no request has been made.
the API still working fine, it seems like there are some threads that just keeps hanging and consuming cpu. last time I checked I had 5 threads that consumed 3-4% cpu and didn't die for some reason.
My guess is that in some specific scenario a thread just stays alive consuming cpu.
The app runs on ubuntu machine, my first step was trying to create a dump file with ProcDump so I can analyze those threads and maybe find where they are hanging.
ProcDump generates a huge 21gb file, which trying to analyze with lldb throws out of memory exception. even tried transferring it to a windows machine to debug with windbg , no help there as it couldn't open the file.
As there is no specific exception or anything I can't really share any piece of code as I have no idea where the issue is... just kind a hoping for some suggestion that might help me get to a solution or at least understand where the problem is.
Thanks a lot for reading, cheers
You could try using something like jetBrains’ DotMemory, they also have a fairly high level but helpful guide https://www.jetbrains.com/help/dotmemory/How_to_Find_a_Memory_Leak.html it also worth checking your startup file and double checking the services you’ve registered are used in the correct way ie not added as scoped when they should be transient or even a singleton etc
so iv'e been at it for a while.
Eventually found out that my problem was with HttpClient
Probably some bad mix of static class and creating new instances of HttpClient that causes the issue Iv'e explained above.
Solved it by utilizing HttpClientFactory as explained here -
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/http-requests?view=aspnetcore-2.1
Lesson learned :)
A little late but Procdump for Linux just added .NET Core 3 support that generates much more managable sized core dumps. It automatically detects if the target process is .NET Core and does the right thing (i.e., no need to specify switches).

Listen For Process Start and End

I'm new to Windows API programming. I am aware that there are ways to check if a process is already running (via enumeration). However, I was wondering if there was a way to listen for when a process starts and ends (for example, notepad.exe) and then perform some action when the starting or ending of that process has been detected. I assume that one could run a continuous enumeration and check loop for every marginal unit of time, but I was wondering if there was a cleaner solution.
Use WMI, Win32_ProcessStartTrace and Win32_ProcessStopTrace classes. Sample C# code is here.
You'll need to write the equivalent C++ code. Which, erm, isn't quite that compact. It's mostly boilerplate, the survival guide is available here.
If you can run code in kernel, check Detecting Windows NT/2K process execution.
Hans Passant has probably given you the best answer, but... It is slow and fairly heavy-weight to write in C or C++.
On versions of Windows less than or equal to Vista, you can get 95ish% coverage with a Windows WH_CBT hook, which can be set with SetWindowsHookEx.
There are a few problems:
This misses some service starts/stops which you can mitigate by keeping a list of running procs and occasionally scanning the list for changes. You do not have to keep procs in this list that have explorer.exe as a parent/grandparent process. Christian Steiber's proc handle idea is good for managing the removal of procs from the table.
It misses things executed directly by the kernel. This can be mitigated the same way as #1.
There are misbehaved apps that do not follow the hook system rules which can cause your app to miss notifications. Again, this can be mitigated by keeping a process table.
The positives are it is pretty lightweight and easy to write.
For Windows 7 and up, look at SetWinEventHook. I have not written the code to cover Win7 so I have no comments.
Process handles are actually objects that you can "Wait" for, with something like "WaitForMultipleObjects".
While it doesn't send a notification of some sort, you can do this as part of your event loop by using the MsgWaitForMultipleObjects() version of the call to combine it with your message processing.
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion
\Image File Execution Options
You can place a registry key here with your process name then add a REG_SZ named 'Debugger' and your listner application name to relay the process start notification.
Unfortunately there is no such zero-overhead aproach to recieving process exit that i know of.

How to find why my application becomes unresponsive at unaccessible customer sites

My application is deployed at customer sites, that I can not access, and has no internet connection.
There are complains that in several sites, once in a week or so, the application become unresponsive, so that the operators need to kill and restart it.
We were unable to observe it in our site.
Is there something I can do that may help me find the problem?
It is a VC2008 Win32 MFC applications.
The application is quite complex, and includes many threads, synchronization mechanisms, database access, HMI, communication channels...
Note: The custmer can send us log files.
Note: The application does not crash. It just hangs. Since I don't know what is the nature of the problem, I have no way to know programmatically that something went wrong (or do I?)
I have had great success with ADplus and WinDBG in the past. You may check it out. Especially check out the Hang mode in ADplus.
I would start with some questions - is the CPU hogged during these unresponsive times? Is there a specific process that's hogging it? (You can use PerfMon to get the answers). Depending on the answers I would probably proceed by taking a dump of the process at this stage (ProcDump by sysinternals is great for these purposes) and investigate it offline.
In similar situation on a non-windows platform we have the capability to gather system dumps. Get a thread dump of the entire system for off-site analysis. This enables us to find deadlocks quite easily. For slow problems rather than stop a single dump is not enough. Then we need a sequence of dumps, and some good luck.
Another, rather messier technique is to have enough trace, and enough fine-grained control of trace in the app. Then turn on some trace and hope to spot where the delays are happening.
My experience with finding bugs in installations on the other side of the planet shows three helpful techniques: Logging, logging, and logging.
What do those log files say your customers sent you? If they aren't detailed enough, send them a version that logs more. Use binary approximation to home in on the error.
To know where the process is hung is better to start with the stack trace at that instant.
Now since your program is installed remotely and you can't access it, you can write a monitoring program which can periodically check the stack of your program and log it. This information along with your logging mechanism will make things easier to identify and debug.
Since I am not a windows programmer, i don't know much about such tools availability in windows, however i think you need something similar to this http://www.codeproject.com/KB/threads/StackWalker.aspx

Process name change at runtime (C++)

Is it possible to change the name(the one that apears under 'processes' in Task Manager) of a process at runtime in win32? I want the program to be able to change it's own name, not other program's. Help would be appreciated, preferably in C++. And to dispel any thoughts of viruses, no this isn't a virus, yes I know what I'm doing, it's for my own use.
I would like to submit what i believe IS a valid reason for changing the process name at runtime:
I have an exe that runs continuously on a server -- though it is not a service. Several instances of this process can run on the server. The process is a scheduling system. An instance of the process runs for each line that is being scheduled, monitored and controlled. Imagine a factory with 7 lines to be scheduled. Main Assembly line, 3 sub assembly lines, and 3 machining lines.
Rather than see sched.exe 7 times in task manager, it would be more helpful to see:
sched-main
sched-sub1
sched-sub2
sched-sub3
sched-mach1
sched-mach2
sched-mach3
This would be much more helpful to the Administrator ( the user in this situation should never see task manager). If one process is hung, the Administrator can easily know which one to kill and restart.
I know you're asking for Win32, but under most *nixes, this can be accomplished by just changing argv[0]
I found code for doing that in VB. I believe it won't be too hard to convert it to C++ code.
A good book about low level stuff is Microsoft Windows Internals.
And I agree with Peter Ruderman
This is not something you should do.

How to write an unkillable process for Windows?

I'm looking for a way to write an application. I use Visual C++ 6.0.
I need to prevent the user from closing this process via task manager.
You can't do it.
Raymond Chen on why this is a bad idea.
You can make an unkillable process, but it won't be able to accomplish anything useful while it's unkillable. For example, one way to make a process unkillable is to have it make synchronous I/O requests to a driver that can never complete (for example, by deliberately writing a buggy driver). The kernel will not allow a process to terminate until the I/O requests finish.
So it's not quite true that you "can't do it" as some people are saying. But you wouldn't want to anyway.
That all depends on who shouldn't be able to kill that process. You usually have one interactively logged-on user. Running the process in that context will alow the user to kill it. It is her process so she can kill it, no surprise here.
If your user has limited privileges you can always start the process as another user. A user can't kill a process belonging to another user (except for the administrator), no surprise here as well.
You can also try to get your process running with Local System privileges where, I think not even an administrator could kill it (even though he could gain permission to do so, iirc).
In general, though, it's a terribly bad idea. Your process does not own the machine, the user does. The only unkillable process on a computer I know is the operating system and rightly so. You have to make sure that you can't hog resources (which can't be released because you're unkillable) and other malicious side-effects. Usually stuff like this isn't the domain of normal applications and they should stay away from that for a reason.
It's a Win32 FAQ for decades. See Google Groups and Und. boards for well-known methods.(hooking cs and others...)
Noobs who answer "You can't do it" know nothing to Win32 programming : you can do everything with Win32 api...
What I've learned from malware:
Create a process that spawns a dozen of itself
Each time you detect that one is missing (it was killed) spawn a dozen more.
Each one should be a unique process name so that a batch process could not easily kill all of them by name
Sequentially close and restart some of the processes to keep the pids changing which would also prevent a batch kill
Depends on the users permission. If you run the program as administrator a normal user will not have enough permissions to kill your process. If an administrator tries to kill the process he will in most cases succeed. If you really want someone not to kill you process you should take a look at windows system services and driver development. In any case, please be aware that if a user cannot kill a process he is stuck with it, even though it behaves abnormally duo to bugs! You will find a huge wealth of these kind of programs/examples on the legal! site rootkit.com. Please respect the user.
I just stumbled upon this post while trying to find a solution to my own (unintentional) unkillable process problem. Maybe my problem will be your solution.
Use jboss Web Native to install a service that will run a batch file (modify service.bat so that it invokes your own batch file)
In your own batch file, invoke a java process that performs whatever task you'd like to persist
Start the service. If you view the process in process explorer, the resulting tree will look like:
jbosssvc.exe -> cmd.exe -> java.exe
use taskkill from an administrative command prompt to kill cmd.exe. Jbosssvc.exe will terminate, and java.exe will be be an orphaned running process that (as far as I can tell) can't be killed. So far, I've tried with Taskmanager, process explorer (running as admin), and taskkill to no avail.
Disclaimer: There are very few instances where doing this is a good idea, as everyone else has said.
There's not a 100% foolproof method, but it should be possible to protect a process this way. Unfortunately, it would require more knowlegde of the Windows security system API than I have right now, but the principle is simple: Let the application run under a different (administrator) account and set the security properties of the process object to the maximum. (Denying all other users the right to close the process, thus only the special administrator account can close it.)
Set up a secondary service and make it run as a process guardian. It should have a lifeline to the protected application and when this lifeline gets cut (the application closes) then it should restart the process again. (This lifeline would be any kind of inter-process communications.)
There are still ways to kill such an unkillable process, though. But that does require knowledge that most users don't really know about, so about 85% of all users won't have a clue to stop your process.
Do keep in mind that there might be legal consequences to creating an application like this. For example, Sony created a rootkit application that installed itself automatically when people inserted a Sony music CD or game CD in their computer. This was part of their DRM solution. Unfortunately, it was quite hard to kill this application and was installed without any warnings to the users. Worse, it had a few weaknesses that would provide hackers with additional ways to get access to those systems and thus to get quite a few of them infected. Sony had to compensate quite a lot of people for damages and had to pay a large fine. (And then I won't even mention the consequences it had on their reputation.)
I would consider such an application to be legal only when you install it on your own computer. If you're planning to sell this application to others, you must tell those buyers how to kill the process, if need be. I know Symantec is doing something similar with their software, which is exactly why I don't use their software anymore. It's my computer, so I should be able to kill any process I like.
The oldest idea in the world, two processes that respawn each other?