How to determine that boost::asio::io_service has finished the task? - c++

I need to build a thread pool with scheduling priorities: all running threads have the same priority in terms of CPU time and OS priority, but when it comes time to pick the next task to execute, the one with the highest priority goes first.
I've decided to try boost::asio as it has a thread pool that looks good. I've looked over the prioritized handlers example in the asio documentation, but I don't like it because it doesn't limit the number of threads and I have to schedule the tasks manually. What I need is a fixed number of threads that take tasks from a queue, so I could create a single pool in my application and then add tasks at any time during the application's lifetime.
What would be sufficient is getting some notification from the asio::io_service when a task is finished; the handler of that notification could then find the next task with the highest priority and post it to the service.
Is that possible?
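The pattern described above (a fixed pool of worker threads draining a single priority-ordered queue) is language-agnostic. Purely as an illustration of that shape, here is a minimal sketch using java.util.concurrent rather than boost::asio; the pool size, task class, and priority scheme are invented for the example.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A runnable task carrying an explicit priority; higher values run first.
class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    private final int priority;
    private final Runnable work;

    PrioritizedTask(int priority, Runnable work) {
        this.priority = priority;
        this.work = work;
    }

    @Override
    public void run() {
        work.run();
    }

    @Override
    public int compareTo(PrioritizedTask other) {
        // Reversed so that the highest priority is dequeued first.
        return Integer.compare(other.priority, this.priority);
    }
}

public class PriorityPoolSketch {
    public static void main(String[] args) throws InterruptedException {
        // A fixed number of worker threads draining a single priority-ordered queue.
        // The ordering applies to tasks queued while all workers are busy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());

        for (int i = 0; i < 20; i++) {
            int priority = i % 5;
            pool.execute(new PrioritizedTask(priority,
                    () -> System.out.println("running task with priority " + priority)));
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

With boost::asio, the analogous shape would be a fixed number of threads calling io_service::run() plus, as the question itself suggests, a completion handler that posts the next highest-priority task back to the service.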

Related

Keep the task in the queue even after maximum number of retry limits in google task queue

I am using google task queues and I am setting task_retry_limit on the queue.
The default behavior is that the task is removed from the task queue in the following cases:
1) when the task is executed successfully or
2) when the task reaches the maximum number of retry attempts set.
In my use case, I have a problem with the second case. I want to keep the task in the task queue even after the maximum number of retries
(I don't want to retry the task after task_retry_limit but I want to keep it in the task queue so that I can run it manually later)
Is there a parameter in Queue.yaml which drives this?
I know that a workaround for this would be to set a moderate task_age_limit, but I don't want the task to keep retrying.
No, the task queues aren't presently designed to keep around tasks which reached their maximum number of retries.
I see 2 options you could try, from inside your task code when you detect it will fail on the final task retry:
create some sort of FailedTask datastore entities with all the info/parameters required to re-create and enqueue copies of the original failing tasks later on, under manual triggers
re-queue the task on a different queue, configured with an extremely long time between retries - long enough to not actually be retried until the moment you get to trigger them manually (you can do that for any task still pending, in any queue, at any time).
Somewhat related: handling failure after maximum number of retries in google app engine task queues
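If the application happens to be on the Java runtime, the second option above might look roughly like this sketch using the App Engine Task Queue API; the queue name manual-retry, the handler URL /tasks/my-task, the payload parameter, and the retry-limit constant are assumptions for the example.

```java
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Task handler that, on its last allowed attempt, parks a copy of itself on a
// separate queue instead of letting the task disappear after the retry limit.
public class MyTaskServlet extends HttpServlet {

    // Should match task_retry_limit in queue.yaml; verify the exact off-by-one
    // semantics of the retry-count header for your setup.
    private static final int RETRY_LIMIT = 5;

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) {
        int retryCount = Integer.parseInt(req.getHeader("X-AppEngine-TaskRetryCount"));
        try {
            doTheActualWork(req);
        } catch (Exception e) {
            if (retryCount >= RETRY_LIMIT) {
                // Final attempt: park a copy on a queue that is never (or very slowly)
                // retried, so it can be re-triggered manually later.
                Queue parking = QueueFactory.getQueue("manual-retry");
                parking.add(TaskOptions.Builder.withUrl("/tasks/my-task")
                        .param("payload", req.getParameter("payload")));
            }
            throw new RuntimeException(e); // let this attempt fail as usual
        }
    }

    private void doTheActualWork(HttpServletRequest req) {
        // the real work goes here
    }
}
```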

Does the Spring SqsListener wait until the last message is processed (or completed) from the current poll before the next poll of messages happens?

I have an SQS listener with a max message count of 10. When my consumer receives a batch of 10 messages they all get processed, but sometimes (depending on the message) the processing will take 5-6 hours and some will take as little as 5 minutes. I have 3 consumers (3 different JVMs) polling from the queue with a maxMessageCount of 10. Here is my issue:
If one of those 10 messages takes 5 hours to process, it seems as though the listener waits to do the next poll of 10 messages until all of the previous messages are 100% complete. Is there a way to allow it to poll a new batch of messages even though another is still being processed?
I'm guessing that I am missing something small here in how I am using the Spring Cloud library and the SqsListener annotation. Has anybody run across this before?
Also, I don't think this should matter, but the queue is AWS SQS and the JVMs are running on an ECS cluster.
If you run the task on the poller thread, the next poll won't happen until the current one completes.
You can use an ExecutorChannel or QueueChannel to hand the work off to another thread (or threads) but you risk message loss if you do that.
Your situation is rather unusual; 5 hours is a long time to process a message.
You should perhaps consider redesigning your application to persist these "long running" requests to a database or similar, instead of processing them directly from the message. Or, perhaps put them in a different queue so that they don't impact the shorter tasks.
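To make the hand-off concrete, here is a rough sketch of that idea with Spring Cloud AWS 2.x; the queue name, class name, and pool size are assumptions, and the message-loss trade-off mentioned above still applies.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.springframework.cloud.aws.messaging.listener.annotation.SqsListener;
import org.springframework.stereotype.Component;

// Hands long-running work off to a separate thread pool so the poller thread
// returns immediately and the next poll is not blocked.
@Component
public class LongRunningJobListener {

    private final ExecutorService workers = Executors.newFixedThreadPool(10);

    @SqsListener("my-queue") // queue name is an assumption
    public void onMessage(String payload) {
        // Returning from this method frees the poller thread; the message will
        // typically be deleted at that point, so a crash while the worker thread
        // is still busy can lose the message (the trade-off described above).
        workers.submit(() -> process(payload));
    }

    private void process(String payload) {
        // potentially hours of work
    }
}
```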

Create workers dynamically (ActiveMQ)

I want to create a web application where a client calls a REST web service. This returns an OK status to the client (with a link to the result) and creates a new message on an ActiveMQ queue. On the listener side of ActiveMQ there should be workers that process the messages.
I am stuck here with my concept, because I don't really know how to determine the number of workers I need. The workers only have to call web service interfaces, so no high computation power is needed for the worker itself. Most of the time the worker is waiting for results from the called web service. But one worker cannot handle all requests, so if some limit of requests in the queue is exceeded (I don't know the limit yet), another worker should work the queue.
What is the best practice for doing this job? Should I create one worker per request and destroy it when the work is done? How do I dynamically create workers based on the queue size? Is it better to keep these workers running all the time or to create them only when the queue requires it?
I think a topic/subscriber architecture is not reasonable, because only one worker should handle each request. Imagine an average of 100 requests per minute and 500 requests under high load.
My intention is to get results fast, so that no client has to wait for its answer just because resources are not being used properly...
Thank you
Why don't you figure out the maximum number of workers you could realistically support, create that many, and leave them running forever? I'd use a prefetch of either 0 or 1, to avoid piling up a bunch of messages in one worker's prefetch buffer while the others sit idle. (Prefetch=0 will pull the next message only when the current one is finished, whereas prefetch=1 will keep a single message sitting "on deck", available to be processed without needing to fetch it from the network; but it also means a consumer might be free to consume a message yet unable to, because the message is sitting in another consumer's prefetch buffer waiting for that consumer to be ready for it.) I'd use prefetch=0 as long as the time to download your messages from the broker isn't unreasonable, since it will spread the workload as evenly as possible.
Then whenever there are messages to be processed, either a worker is available to process the next message (so no delay) or all the workers are busy (so of course you're going to have to wait because you're at capacity, but as soon as a worker becomes available it will take the next message from the queue).
Also, you're right that you want queues (where a message will be consumed by only a single worker) not topics (where a message will be consumed by each worker).
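A rough sketch of such a long-lived worker with the queue prefetch set to 0 (the broker URL and queue name are assumptions):

```java
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.ActiveMQPrefetchPolicy;

// One of a fixed number of identical, long-lived workers on the same queue.
// With queue prefetch set to 0, a worker only pulls a message when it is actually
// free, so a slow message never strands work in an idle worker's prefetch buffer.
public class Worker {
    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616"); // broker URL is an assumption

        ActiveMQPrefetchPolicy prefetch = new ActiveMQPrefetchPolicy();
        prefetch.setQueuePrefetch(0); // pull the next message only when idle
        factory.setPrefetchPolicy(prefetch);

        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer =
                session.createConsumer(session.createQueue("work.requests")); // queue name is an assumption

        while (true) {
            Message message = consumer.receive(); // blocks until a message is available
            // call the downstream web service here, then loop back for the next message
        }
    }
}
```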

AWS SWF Simple Workflow - Best Way to Keep Activity Worker Scripts Running?

The maximum amount of time the pollForActivityTask method stays open polling for requests is 60 seconds. I am currently scheduling a cron job every minute to call my activity worker file so that my activity worker machine is constantly polling for jobs.
Is this the correct way to have continuous queue coverage?
The way the Java Flow SDK does it is that you create an ActivityWorker and give it a task list, domain, activity implementations, and a few other settings. You set both the setPollThreadCount and setTaskExecutorSize. The polling threads long-poll and then hand over work to the executor threads to avoid blocking further polling. You call start on the ActivityWorker to boot it up, and when you want to shut down the workers, you can call one of the shutdown methods (usually best to call shutdownAndAwaitTermination); see the sketch after the list below.
Essentially your workers are long lived and need to deal with a few factors:
New versions of Activities
Various tasklists
Scaling independently on tasklist, activity implementations, workflow workers, host sizes, etc.
Handle error cases and deal with polling
Handle shutdowns (in case of deployments and new versions)
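A minimal sketch of that setup might look like the following; the domain, task list, pool sizes, and the MyActivitiesImpl placeholder are assumptions, and the exact setter names should be verified against the Flow SDK version in use.

```java
import java.util.concurrent.TimeUnit;

import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflow;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient;
import com.amazonaws.services.simpleworkflow.flow.ActivityWorker;

// A long-lived activity worker process: started once, polls continuously, and is
// shut down explicitly on deployment rather than re-launched by cron every minute.
public class ActivityWorkerMain {

    // Placeholder only: a real implementation needs the Flow framework's
    // activity interface and annotations.
    static class MyActivitiesImpl {
    }

    public static void main(String[] args) throws Exception {
        AmazonSimpleWorkflow swf = new AmazonSimpleWorkflowClient(); // default credentials chain

        // Domain and task list are assumptions for the sketch.
        ActivityWorker worker = new ActivityWorker(swf, "my-domain", "my-task-list");
        worker.addActivitiesImplementation(new MyActivitiesImpl());

        // The tuning knobs mentioned above; verify the exact setter names against
        // the Flow SDK version in use.
        worker.setPollThreadCount(2);
        worker.setTaskExecutorThreadPoolSize(20);

        worker.start();

        // Drain and stop cleanly when the process is asked to shut down
        // (e.g. during a deployment).
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                worker.shutdownAndAwaitTermination(1, TimeUnit.MINUTES);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
    }
}
```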
I ended up using a solution where I have another script file that is called by a cron job every minute. This file checks whether an activity worker is already running in the background (if so, I assume a workflow execution is already being processed on the current server).
If no activity worker is there, then the previous long poll has completed and we launch the activity worker script again. If there is an activity worker already present, then the previous poll found a workflow execution and started processing so we refrain from launching another activity worker.

Why do agents have a pool of threads?

In the Clojure documentation I see that agents use a pool of threads to process data. But I also read (again in the documentation):
The actions of all Agents get interleaved amongst threads in a thread pool. At any point in time, at most one action for each Agent is being executed.
Why does an agent have a pool of threads, and not a single thread to process the "queue" of sent functions?
Thanks.
An agent does not 'have a pool of threads'. There are two thread pools (for send and send-off actions), to which agent actions get assigned.
This design decision is the optimal choice for CPU-bound tasks, and a best-effort approach for IO-bound tasks.
For the latter case, providing your own pool with send-via will be the optimal choice (assuming you know what you're doing).
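For illustration only, the two pools correspond roughly to the following java.util.concurrent executors (Clojure builds its own executors internally; the processors + 2 sizing for the send pool matches how recent Clojure versions size it):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Rough java.util.concurrent equivalents of the two pools backing agent actions.
public class AgentPoolsSketch {
    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();

        // `send` pool: fixed size, slightly larger than the CPU count, which is a
        // sensible default for CPU-bound actions.
        ExecutorService sendPool = Executors.newFixedThreadPool(cpus + 2);

        // `send-off` pool: unbounded/cached, so blocking IO-bound actions do not
        // starve one another, at the cost of potentially many threads.
        ExecutorService sendOffPool = Executors.newCachedThreadPool();

        // `send-via` is the escape hatch: you supply an executor you sized yourself,
        // e.g. tuned for a specific IO workload.
        ExecutorService custom = Executors.newFixedThreadPool(16);

        sendPool.shutdown();
        sendOffPool.shutdown();
        custom.shutdown();
    }
}
```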