Stopping Windows service asynchronously - c++

I am trying to control a service from within an application. Starting the service via StartService (MSDN) works fine: the service needs about 10 seconds to start, but StartService returns control to the main application immediately.
However, stopping the service via ControlService (MSDN) (AFAIK there is no StopService) blocks the main application until the service has stopped, which takes about 10 seconds.
Start: StartServiceW( handle, 0, NULL)
Stop: ControlService( handle, SERVICE_CONTROL_STOP, status )
Is there a non-blocking / asynchronous way to stop a Windows service?

I would probably look at stopping the service in a new thread. That will eliminate the blocking of your main thread.
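A minimal sketch of that idea, assuming C++11 or later. The names stop_in_background and stop_finished are hypothetical, and the callable stands in for the blocking ControlService( handle, SERVICE_CONTROL_STOP, status ) call so that the sketch stays portable:

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <utility>

// Hypothetical helper: runs a blocking "stop" operation on a worker thread
// and returns immediately. On Windows, blocking_stop would wrap the call to
// ControlService with SERVICE_CONTROL_STOP and return its success flag.
std::future<bool> stop_in_background(std::function<bool()> blocking_stop) {
    assert(blocking_stop && "a callable is required");
    // std::launch::async forces a separate thread instead of deferred execution.
    return std::async(std::launch::async, std::move(blocking_stop));
}

// Non-blocking check that the main loop can poll, e.g. from a UI timer.
bool stop_finished(const std::future<bool>& pending) {
    return pending.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

The caller has to keep the returned future alive: the destructor of a future obtained from std::async waits for the task, so discarding it would bring the blocking behaviour back. Calling get() once stop_finished() returns true yields the result without waiting.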

The SCM processes control requests in a serialized manner. If any service is busy processing a control request, ControlService() blocks until the SCM can process the new request. The documentation says as much:
The SCM processes service control notifications in a serial fashion—it
will wait for one service to complete processing a service control
notification before sending the next one. Because of this, a call to
ControlService will block for 30 seconds if any service is busy
handling a control code. If the busy service still has not returned
from its handler function when the timeout expires, ControlService
fails with ERROR_SERVICE_REQUEST_TIMEOUT.

The service is doing its cleanup in its control handler routine. That's OK for a service that will only take a fraction of a second to exit, but a service that's going to take ten seconds should definitely be setting a status of STOP_PENDING and then cleaning up asynchronously.
If this is your own service, you should correct that problem. I'd start by making sure that all of the cleanup is really necessary; for example, there's no need to free memory before stopping (unless the service is sharing a process with other services). If the cleanup really can't be made fast enough, launch a separate thread (or signal your main thread) to perform the service shutdown and set the service status to STOP_PENDING.
If this is someone else's service, the only solution is to issue the stop request from a separate thread or in a subprocess.
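For the case where the service is your own, the "report STOP_PENDING, then clean up asynchronously" pattern might look roughly like the portable sketch below. Everything in it is hypothetical: the State enum stands in for the SERVICE_* state constants, on_stop_request plays the role of the control handler, and a real service would call SetServiceStatus wherever the state changes.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Stand-ins for SERVICE_RUNNING, SERVICE_STOP_PENDING and SERVICE_STOPPED.
enum class State { Running, StopPending, Stopped };

class StoppableService {
public:
    explicit StoppableService(std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {
        assert(cleanup_ && "a cleanup routine is required");
    }

    ~StoppableService() { wait_until_stopped(); }

    // Plays the role of the handler for SERVICE_CONTROL_STOP: report
    // STOP_PENDING, hand the slow cleanup to another thread, return at once.
    // Assumed to be called from a single control thread.
    void on_stop_request() {
        State expected = State::Running;
        if (!state_.compare_exchange_strong(expected, State::StopPending)) {
            return;  // a stop is already in progress or finished
        }
        worker_ = std::thread([this] {
            cleanup_();                    // the slow part, off the handler thread
            state_.store(State::Stopped);  // real service: report SERVICE_STOPPED
        });
    }

    State state() const { return state_.load(); }

    void wait_until_stopped() {
        if (worker_.joinable()) worker_.join();
    }

private:
    std::function<void()> cleanup_;
    std::atomic<State> state_{State::Running};
    std::thread worker_;
};
```

The point of the design is that on_stop_request returns as soon as the state has been flipped, so the caller of ControlService is no longer held up for the duration of the cleanup.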

Related

Change timeout value with StartService method

I'm trying to start a service with StartService method. According to the documentation:
StartService will block for 30 seconds if any service is busy handling a control code.
How can I change this timeout value?
You cannot change the timeout. It is built into the SCM, and the documentation specifies it as 30 seconds.
UPDATE: Apparently, you can change the timeout after all. But only in the Registry, not in code. And it requires a reboot to take effect. See How do I increase windows service startup timeout on Server Fault.
It is the responsibility of each service to respond to the SCM in a timely manner. During a start request, a service needs to call StartServiceCtrlDispatcher() as soon as possible. If it needs a lengthy startup, it should start the dispatcher quickly, then enter a PENDING state and report updated status at regular intervals until it is ready.
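The "report updated status at regular intervals" part can be sketched as follows. This is a portable illustration, not SCM code: run_startup and ReportPending are hypothetical names, and the report callback stands in for a SetServiceStatus call that passes SERVICE_START_PENDING together with an incremented dwCheckPoint and a dwWaitHint.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for reporting SERVICE_START_PENDING with dwCheckPoint/dwWaitHint.
using ReportPending =
    std::function<void(std::uint32_t checkpoint, std::uint32_t wait_hint_ms)>;

// Runs the initialization steps in order and reports progress before each
// one, so the checkpoint value keeps growing while startup is under way.
// Returns false as soon as a step fails; the remaining steps are skipped.
bool run_startup(const std::vector<std::function<bool()>>& steps,
                 const ReportPending& report,
                 std::uint32_t wait_hint_ms) {
    assert(report && "a status reporter is required");
    std::uint32_t checkpoint = 0;
    for (const auto& step : steps) {
        report(++checkpoint, wait_hint_ms);
        if (!step()) {
            return false;  // real service: report SERVICE_STOPPED with an error
        }
    }
    return true;  // real service: report SERVICE_RUNNING now
}
```

Splitting a long startup into steps like this is only one way to get regular reports; a timer that bumps the checkpoint while a single long step runs would serve the same purpose.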

Does Windows ever stop services when resuming from sleep?

I'm running on Windows 8.
Occasionally, when I resume from sleep, my service gets a stop request through the SCM (call to SvcCtrlHandler with SERVICE_CONTROL_STOP). I wasn't able to trace the source of this request. Can it possibly be sent by the OS itself, in some scenario?
My two main suspicions right now:
If handling the resume event (SERVICE_CONTROL_POWEREVENT of type PBT_APMRESUMEAUTOMATIC) takes too long, the OS might stop the service (the system log contains entries referring to this specific service: A timeout was reached (30000 milliseconds) while waiting for the [...] The service did not respond to the start or control request in a timely fashion)
The OS stops the service because it has been flagged as a problematic service (the system log contains entries referring to this specific service: service did not shut down properly after receiving a preshutdown control)

Cross-platform notification that a service is running

I am looking for a cross platform method of notifying several client applications that a service/daemon has started and is able to handle incoming connections. The clients will be running all the time, whereas the service may not. Normally the service/daemon will be started automatically when the computer starts, but in some cases it may not and in those cases the clients should automatically connect when the service/daemon starts.
The basic flow of the client is to wait until they notice that the service is running, then connect. If the connection is interrupted or if they were unable to connect they just try again from the beginning.
For Windows I have a solution where the service signals a global event object when it starts, so that the clients can wait on this event. This works OK in practice, but I am pretty sure that it does not handle all potential cases (such as a crashing service or multiple instances of the service running). I don't mind if the clients "accidentally" wake up every now and then even though the service isn't running. I just want to keep the clients from busy-looping on connection attempts while still having them respond quickly when the service starts. In other words, just adding a sleep between connection attempts is not great.
Is there a cross platform method to detect whether the service is running and ready to accept connections?
Update: I'll add a bit more information on how the current mechanism works on Windows using approximate code from memory, so please excuse any typos:
Service:
SECURITY_ATTRIBUTES sa;
// Set up empty SECURITY_ATTRIBUTES so that everyone has access
// ...
// Create a manual reset event, initially set to nonsignaled
HANDLE event = ::CreateEvent(&sa, TRUE, FALSE, "Global\\unique_name");
// Signal the event - service is running and ready
::SetEvent(event);
// Handle connections, do work
// If the service dies for whatever reason, Windows deletes the event handle
// The event is deleted when the last open handle to it is closed
// So the event is signaled for at least as long as the service lives
Clients:
while (true) {
    // Set up event the same way as the service, including empty security attributes
    // ...
    HANDLE event = ::CreateEvent(&sa, TRUE, FALSE, "Global\\unique_name");
    // Wait for the service to start
    DWORD ret = ::WaitForSingleObject(event, INFINITE);
    // Close the handle to avoid keeping the event object alive
    // This isn't enough in theory, but works in real usage as the number of clients
    // will always be low
    ::CloseHandle(event);
    // Check if we woke up because the event is signaled
    if (WAIT_OBJECT_0 == ret) {
        // connect to service, do work
        // ...
    }
}
How could I achieve approximately the same on OS X and Linux?

Should a new thread be created in ServiceMain?

The MSDN says that:
"The ServiceMain function should create a global event, call the RegisterWaitForSingleObject function on this event, and exit. This will terminate the thread that is running the ServiceMain function, but will not terminate the service..."
So the question is: should a new thread be created inside the ServiceMain function to execute the service code, or can I simply set the service to the RUNNING state and use the ServiceMain thread to run the service code? If the ServiceMain thread is used to run the service code, will the SCM remain locked even if the service state is set to RUNNING?
I do not think the approach described by that statement from MSDN is the only possible way to implement a service. That would contradict the MSDN service example at http://msdn.microsoft.com/en-us/library/windows/desktop/bb540476(v=vs.85).aspx . In the example, the service waits for events in the same thread that called ServiceMain. This approach is probably better for simple services that work just fine with a single thread.
If you choose the RegisterWaitForSingleObject approach, you do not have to create threads explicitly. The MSDN page for RegisterWaitForSingleObject says: "New wait threads are created automatically when required." You do have to open the I/O channels your service is going to monitor and bind their handles to the thread pool before exiting ServiceMain.
MSDN says: "The Service Control Manager (SCM) waits until the service reports a status of SERVICE_RUNNING. It is recommended that the service reports this status as quickly as possible, as other components in the system that require interaction with SCM will be blocked during this time."
The control dispatcher creates a new thread to execute the ServiceMain function for the service. The ServiceMain function should perform the following tasks.
5. Perform the service tasks, or, if there are no pending tasks, return control to the caller. Any change in the service state warrants a call to SetServiceStatus to report new status information.
It follows from this that you can perform more complex initialization tasks inside the ServiceMain function, such as creating additional threads.
Guidance for creating Multithreaded Services.

How to kill /re-start a long running task

Is there a way to kill / re-start a long running task in AWS SWF? Sometimes some of our tasks run for a longer duration and we would like to manually kill a certain task (either via UI or programmatically) and re-start the task if possible. How to achieve this?
The console is one option for manually terminating a workflow.
You can also set timeouts on the whole workflow execution or on individual activities. These can be set when you register your activity or when you start it (defaultTaskStartToCloseTimeoutSeconds).
It's not clear what language you're using.
If you're using Java, look into Exponential Retry in the Flow Framework. It makes the SDK restart your activity if it fails.
A long-running activity is expected to heartbeat using RecordActivityTaskHeartbeat. If the activity process hangs or crashes, this produces a timeout failure after the short heartbeat interval rather than after the long task execution timeout.
The workflow code (decider) can always request activity cancellation through a RequestCancelActivityTask decision. The cancellation request is returned as output of the RecordActivityTaskHeartbeat call. The activity implementation should then cancel itself and report back to the service using the RespondActivityTaskCanceled API call.
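The heartbeat-and-cancel loop described above can be sketched in a language-neutral way. The C++ below is only an illustration and does not use the AWS SDK: run_activity and Outcome are hypothetical names, and the heartbeat callable stands in for a RecordActivityTaskHeartbeat call that returns true when cancellation has been requested.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

enum class Outcome { Completed, Canceled };

// Processes the work in chunks and heartbeats before each chunk.
// heartbeat() is assumed to return true once the decider has requested
// cancellation of this activity.
Outcome run_activity(std::size_t total_chunks,
                     const std::function<void(std::size_t)>& do_chunk,
                     const std::function<bool()>& heartbeat) {
    assert(do_chunk && heartbeat && "both callables are required");
    for (std::size_t i = 0; i < total_chunks; ++i) {
        if (heartbeat()) {
            // Real activity: clean up, then call RespondActivityTaskCanceled.
            return Outcome::Canceled;
        }
        do_chunk(i);
    }
    // Real activity: report the result as completed.
    return Outcome::Completed;
}
```

The chunk size determines how quickly a cancellation request is noticed, so it should be chosen so that each chunk finishes well within the heartbeat timeout.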
See Error Handling section of AWS Flow Framework Developer Guide for the AWS Flow Framework way of cancelling activities.
Sometimes an activity implementation cannot support heartbeating and self-cancellation. The solution is to execute another "kill" activity that terminates the first activity's execution. For example, under Unix such a kill activity could issue a "kill -9" command for the process that implements the first one.