Reusing a database record created by a Celery task (Django)

There is a task which creates a database record (R) when it runs for the first time. When the task is started a second time, it should read the database record, perform some calculations and call an external API. The first and second starts happen in a loop.
In the case of a single start of the task there are no problems, but in the case of loops (at each iteration of the loop a new task is created and starts at a certain time) there is a problem: in the task queue (which we monitor with Flower) we have a crashed task on every second iteration.
If we add time.sleep(1) at the end of the loop, sometimes the tasks work properly, but sometimes not. How can we avoid this problem? We are also afraid that tasks for a different combination of two users, started at the same time, will crash in the same way.
Is there some problem with running tasks in Celery simultaneously? Or is there something we should consider? The tasks are for scheduled payments, so they have to work rock solid.
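For reference, a minimal sketch of the setup as described; the model and helper names (PaymentRecord, perform_calculations, call_external_api) are hypothetical stand-ins, and get_or_create is just one way to express "create on the first run, read on the second":

from celery import shared_task
from myapp.models import PaymentRecord  # hypothetical model

@shared_task
def scheduled_payment(user_a_id, user_b_id):
    # first start: create the record; second start: read it and continue
    record, created = PaymentRecord.objects.get_or_create(
        user_a_id=user_a_id, user_b_id=user_b_id)
    if created:
        return  # nothing else to do on the first start
    result = perform_calculations(record)  # hypothetical helper
    call_external_api(result)              # hypothetical helper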

Related

How does a parallel multi-instance loop work in Camunda 7.16.6?

I'm using the camunda-engine 7.16.6.
I have a process with a multi-instance loop that repeats in parallel 1000 times.
This loop is executed in parallel. My assumption was that n Camunda executors would now start their work, so executor #1 executes Task 2, then Task 3, then Task 4, and executor #2 and all the others do the same. So after a short while, at least some of the 1000 instances would have finished all three tasks in the loop.
However, what I have observed so far is that Task 2 gets executed 1000 times, and only when that is finished does Task 3 get executed 1000 times, and so on.
I also noticed that Camunda itself takes a lot of time, outside of the tasks.
Is my observation correct, and is this behavior documented somewhere? Can you change that behavior?
I've run some tests and can explain the behavior:
The order of tasks and the overall time to finish are influenced by whether or not there are transaction boundaries (async after, the red bars in the screenshot).
It's described a bit here.
By setting the asyncBefore='true' attribute we introduce an additional save point at which the process state will be persisted and committed to the database. A separate job executor thread will continue the process asynchronously by using a separate database transaction. In case this transaction fails the service task will be retried and eventually marked as failed - in order to be dealt with by a human operator.
repeat 1000 times, parallel, no transaction
One Job Executor rushes through the process; the order is 1, [2,3,4|2,3,4|...], 5. Not really parallel. But this is as documented here:
The Job Executor makes sure that jobs from a single process instance are never executed concurrently.
It can be turned off if you are an expert and know what you are doing (and have understood this section).
Overall this took around 5 seconds.
repeat 1000 times, parallel, with transaction
Here, due to the transactions, there will be 1000 waiting jobs for Task 7, and each finished Task 7 creates another job for Task 8. Since jobs are executed in the order they appear in the database (see here), the order is 6, [7,7,7...8,8,8...9,9,9...], 10.
The transaction handling, which includes maintaining the variables, has a huge impact on the runtime: with transactions, parallel mode takes 06:33 minutes.
If you turn off the exclusive flag it takes around 4:30 minutes, but at the cost of thousands of OptimisticLockingExceptions.
AFAIK the recommended approach to gain true parallelism would be to move Task 7, Task 8 and Task 9 to a separate process and spawn 1000 instances of that process.
You can influence the order of execution if you tweak the job executor settings & priority (see here), but that seems to require the exclusive flag, too. If you do that, the order will be 6, [7,7,7|8,9,8,9 (in random order),...], 10.
repeat 1000 times, sequential, no transaction
The order is 11, [12,13,14|12,13,14,...], 15.
This takes only 2 seconds.
repeat 1000 times, sequential, with transaction
The order is, as expected, 16, [17,18,19|17,18,19|...], 20.
Due to the transactions this takes 02:45 minutes.
I heard from colleagues that one should use parallel mode only if it involves long-running/blocking tasks like a human task: in sequential mode there would only be one human task, and after that one is done, another would be created; in parallel mode, you have 1000 human tasks at once, which is more likely the desired behavior.
Parallel performance seems to be improved in Camunda 8.

(Django) RQ scheduler - Jobs disappearing from queue

Since my project has so many moving parts, it's probably best to explain the symptom.
I have 1 scheduler running on 1 queue. I add scheduled jobs (to be executed within seconds of the scheduling).
I keep repeating the scheduling of jobs with NO RQ worker doing anything (in fact, the worker process is completely off). In other words, the queue should just be piling up.
But all of a sudden, the queue gets chopped off (seemingly at random) and the first 70-80% of jobs just disappear.
Does this have anything to do with:
the "max length" of the queue? (but I don't recall seeing any limits)
does the scheduler automatically "discard" jobs whose start time is BEFORE the current time?
I ran my own experiment: the RQ scheduler does indeed remove jobs whose start date < now.
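A sketch of such an experiment, assuming a local Redis, the rq-scheduler package and a scheduler process (rqscheduler) running; say_hello is a placeholder function:

import time
from datetime import datetime, timedelta
from redis import Redis
from rq import Queue
from rq_scheduler import Scheduler

def say_hello():
    print('hello')

conn = Redis()
scheduler = Scheduler(queue=Queue('default', connection=conn), connection=conn)

job = scheduler.enqueue_at(datetime.utcnow() + timedelta(seconds=5), say_hello)
print(job in scheduler)  # True: the job sits in the scheduler's sorted set
time.sleep(10)           # let the start time pass while rqscheduler is running
print(job in scheduler)  # False: once due, the job is moved out of the set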

How to purge all Celery subtasks from a parent task?

In my shared_task I call several subtasks with subtask(...).apply_async(). Thus, both the parent task and the subtasks have their own task_id.
When I cancel the entire operation, I call revoke on all active tasks and it works correctly. But as soon as cores are released, the queue moves on, executing the next subtasks.
How do I programmatically clear the queue, thereby preventing the next subtasks from being executed?
You could try the following:
app.control.revoke(task_id, terminate=True)
and start the task with:
result = subtask(...).apply_async()
task_id = result.id
I'm not sure if it works with subtasks to save the task id in this way, but with normal tasks it works well, so it is worth trying.
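A minimal sketch of that pattern; parent_task, child_task and the way the ids are persisted are all hypothetical:

from celery import shared_task

@shared_task
def child_task(item):
    ...  # the actual work

@shared_task
def parent_task(items):
    results = [child_task.apply_async(args=(item,)) for item in items]
    # persist the ids (database, cache, ...) so a cancel handler can find them
    return [r.id for r in results]

def cancel_all(app, task_ids):
    for task_id in task_ids:
        # revoking a pending task makes workers skip it when it is dequeued;
        # terminate=True additionally kills tasks that are already running
        app.control.revoke(task_id, terminate=True)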

Standard way to wait for all tasks to finish before exiting

I was wondering - is there a straightforward way to wait for all tasks to finish running before exiting, without keeping track of all the ObjectIDs (and get()ing them)? The use case is when I launch remote tasks for saving output, for example, where no return result is needed. It's just extra stuff to keep track of if I have to store those futures.
Currently there is no standard way to block until all tasks have finished.
There are some workarounds that can be used.
Keep track of all of the object IDs in a list object_ids and then call ray.get(object_ids) or ray.wait(object_ids, num_returns=len(object_ids)); a sketch of this follows below.
Loop as long as some resources are being used:
import time
import ray

while (ray.global_state.cluster_resources() !=
       ray.global_state.available_resources()):
    time.sleep(1)
The above code will loop until it detects that no tasks are currently being executed. However, this is not a foolproof approach. It's possible that there could be a moment in time when no tasks are running but the scheduler is just about to start another task.
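For completeness, a sketch of the first workaround, with a hypothetical side-effect-only remote function save_output:

import ray

ray.init()

@ray.remote
def save_output(i):
    pass  # write the output somewhere; nothing useful to return

object_ids = [save_output.remote(i) for i in range(100)]
# block until every task has finished (the results are just None)
ray.get(object_ids)
# or, equivalently:
# ray.wait(object_ids, num_returns=len(object_ids))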

2 different task_group instances not running tasks in parallel

I wanted to replace the use of normal threads with the task_group class from PPL, but I ran into the following problem:
I have a class A with a task_group member,
I create 2 different instances of class A,
I start a task in the task_group of the first A instance (using run),
after a few seconds I start a task in the task_group of the second A instance.
I'm expecting the two tasks to run in parallel, but the second task waits for the first task to finish and only then starts.
This happens only in my application, where the tasks are started from a static function. I tried the same scenario in a test application and there the tasks run correctly in parallel.
After spending several hours trying to figure this out, I switched back to normal threads.
Does anyone know why the Concurrency Runtime behaves this way, or how I can avoid it?
EDIT
The problem was that it was running on a single-core CPU, and the Concurrency Runtime optimizes for throughput. I wonder if the Microsoft Parallel Patterns Library has the concept of an active object, or something along those lines, so that you can specify that the task you are about to launch is to be executed in parallel with the thread you start it from...
The response can be found here: http://social.msdn.microsoft.com/Forums/en/parallelcppnative/thread/85a84373-4c3d-4862-bff3-9a21ffe82493
For single-core machines, this is the expected "default" behavior. It can be changed.
By default, the number of tasks that can run in parallel = the number of hardware threads (number of cores). This improves the raw throughput and the efficiency of completing tasks.
However, there are a number of situations where a developer would want many tasks running in parallel, regardless of the number of cores. In this case you have two options:
Oversubscribe locally.
In your example above, you would use:
void lengthyTask()
{
    Context::Oversubscribe(true);   // tell the scheduler this thread may block
    // ... do a lengthy task (or a blocking task)
    Context::Oversubscribe(false);
}
Oversubscribe the scheduler when you start the application:
SchedulerPolicy policy(1, MaxConcurrency, GetProcessorCount() * 2);
Scheduler::SetDefaultSchedulerPolicy(policy);