Camunda created duplicate task and external task not ending - camunda

I am asking about this because the BPMN model was tested and more than 1000 transactions ran without problems. This is the only isolated case in which a process instance created a duplicate task that has no candidate group or assignee. In addition, the external task has been running repeatedly for 2 weeks without ending. The process was supposed to end at task id fd705667-0663-11ed-a39f-00155db42c25, since that task was completed. e417b047-0663-11ed-a39f-00155db42c25 is the duplicate that was created.
Where should we start investigating, and what could the root cause be?

Related

Camunda process versioning using "Process Instance Modification" migrate call activities

In our project we have a problem with Camunda process versioning.
We have read some guides and decided to use Process Instance Modification over Process Instance Migration because of the limitations of the latter approach.
As far as we can see, Process Instance Migration does not allow us to change current variables (based on their previous value and the wait point we are currently at). Sometimes we only want to change variables because we changed the delegate execution code and we know that the business model (BPMN) hasn't been changed.
So currently I am trying to develop a migration framework based on Process Instance Modification.
The first issue I have encountered is:
How do I properly migrate a process instance that is currently waiting at a wait point inside a Call Activity?
For example, I have a process:
I start it. One execution stays at the wait point before the Message 1 event. Another enters the Call Activity:
And stays there before Message 3 and Message 4.
Using Process Instance Modification, I stop the processes in the Call Activity and then start them again (changing the variables and moving to the latest BPMN model). How can I attach them to the parent process instance that called the Call Activity in the first place, so that control returns to that parent process instance and processing continues (executing Task 6)? And what if I want to migrate the parent process as well?

How to stop a compute node with SLURM?

I am using SLURM on AWS to manage jobs as part of AWS ParallelCluster. I have two questions:
When I use scancel *jobid* to cancel a job, the associated node(s) do not stop. How can I achieve that?
When starting out, I made the mistake of not making my script executable, so sbatch *script.sh* worked but the compute node was doing nothing. How could I identify such behaviour and handle it properly? Is the proper approach to stop the idle node after some time, for example, and write that to a log? How can I achieve that?
Check out this page in the docs: https://docs.aws.amazon.com/parallelcluster/latest/ug/autoscaling.html
Bottom line is that instances that have no jobs for a period of time longer than the scaledown_idletime (the default setting is 10 minutes) will get scaled down (terminated) by the cluster, automagically.
You can tweak the setting in the config file when you build your cluster, if 10 mins is too long. Just think about your workload first, because you don't want small delays between jobs to cause you a lot of churn whilst you wait for nodes to die and then get created again shortly after, hence the 10 minute thing.
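The trade-off described above can be sketched as a small rule. This is not ParallelCluster code; it is a toy model assuming the only input is how long a node has been without jobs, and the function name `should_terminate` is hypothetical.

```python
from datetime import datetime, timedelta

# Default mentioned in the answer above: 10 minutes of idle time.
SCALEDOWN_IDLETIME = timedelta(minutes=10)


def should_terminate(last_job_finished: datetime, now: datetime,
                     idle_limit: timedelta = SCALEDOWN_IDLETIME) -> bool:
    """Return True once a node has been idle for longer than the limit."""
    return now - last_job_finished > idle_limit


finished = datetime(2024, 1, 1, 12, 0)
# A 4-minute gap between jobs keeps the node alive (no churn).
print(should_terminate(finished, datetime(2024, 1, 1, 12, 4)))
# After 11 idle minutes the node would be scaled down.
print(should_terminate(finished, datetime(2024, 1, 1, 12, 11)))
```

A shorter limit frees idle nodes sooner, but any gap between jobs longer than the limit then costs a terminate-and-relaunch cycle.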

How to update MultiInstance User Task to add/delete Tasks?

We have a business scenario where we would like to have the ability to INCREASE or DELETE tasks within a multi-instance context.
I've successfully created a multi-instance User Task based on a collection, workPartnerList.
If a process is currently at the multi-instance stage of the workflow, how can I increase or decrease the number of task instances based on the count/values of workPartnerList, which can grow or shrink as a result of updates from an API call? (We need to do this before the overall task completes.)
I assume you are referring to a parallel multi-instance task.
https://docs.camunda.org/manual/latest/reference/bpmn20/tasks/task-markers/
Another way to define the number of instances is to specify the name
of a process variable which is a collection using the loopDataInputRef
child element. For each item in the collection, an instance will be
created
The creation of the instances happens at the point in time when the execution reaches the parallel multi-instance activity. The number of instances created is determined by the size of the collection at this specific point in time. (A BPMN2 process engine will not automatically keep the task instances in sync with the collection.)
To "delete" task instances you can complete or cancel them (e.g. via an attached boundary event) or use the completion condition.
A multi-instance activity ends when all instances are finished.
However, it is possible to specify an expression that is evaluated
every time one instance ends. When this expression evaluates to true,
all remaining instances are destroyed and the multi-instance activity
ends, continuing the process. Such an expression must be defined in
the completionCondition child element.
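The two behaviours quoted above (instances are fixed when the activity is entered, and a completion condition can end the activity early) can be illustrated with a toy model. This is plain Python, not the Camunda API; the class name and the 50% condition are assumptions made for the example. Camunda exposes comparable counters to completion-condition expressions as nrOfInstances and nrOfCompletedInstances.

```python
from typing import Callable, List


class ParallelMultiInstance:
    """Toy model of a parallel multi-instance activity (not the Camunda API)."""

    def __init__(self, collection: List[str],
                 completion_condition: Callable[[int, int], bool]) -> None:
        # Snapshot: one instance per item at the moment the activity is entered.
        self.active = list(collection)
        self.total = len(self.active)
        self.completed = 0
        self.completion_condition = completion_condition
        self.ended = False

    def complete(self, item: str) -> None:
        self.active.remove(item)
        self.completed += 1
        # The condition is evaluated every time one instance ends.
        if self.completion_condition(self.completed, self.total) or not self.active:
            self.active.clear()  # remaining instances are destroyed
            self.ended = True


work_partner_list = ["alice", "bob", "carol", "dave"]
activity = ParallelMultiInstance(work_partner_list,
                                 lambda done, total: done / total >= 0.5)

work_partner_list.append("erin")          # later change to the collection
instances_after_append = len(activity.active)

activity.complete("alice")                # 1 of 4 done, activity continues
activity.complete("bob")                  # 2 of 4 done, condition is met
print(instances_after_append, activity.ended, activity.active)
```

The appended item never becomes a task instance, which is why adding instances later needs one of the mechanisms discussed in the rest of this answer.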
To add additional task instances to a running process instance dynamically you can use for instance event sub processes or attach a boundary event to the task.
https://docs.camunda.org/manual/7.13/reference/bpmn20/events/message-events/#message-boundary-event
Boundary events are catching events that are attached to an activity.
This means that while the activity is running, the message boundary
event is listening for named message. When this is caught, two things
might happen, depending on the configuration of the boundary event:
Interrupting boundary event: The activity is interrupted and the sequence flow going out of the event is followed.
Non-interrupting boundary event: One token stays in the activity and an additional token is created which follows the sequence flow
going out of the event.
If you are willing to approach this on API level then the TaskService allows you to create a new task (with a user defined task id).
Example:
https://github.com/rob2universe/cam-multi-instance/blob/25f524be6a112deb1b4ae3bb4f28a35422e428e0/src/test/java/org/camunda/bpm/example/ProcessJUnitTest.java#L79
The process instance modification API would even allow you to add additional instances to the already created set of task instances - see: https://docs.camunda.org/manual/latest/user-guide/process-engine/process-instance-modification/#modify-multi-instance-activity-instances

Django + Celery with long-term scheduled tasks

I'm developing a Django app which relies heavily on Celery task scheduling, using Redis as the broker. Tasks can be scheduled far in the future as well as within a few seconds or minutes.
I've read about the Redis visibility timeout and the consequences of scheduling tasks with a timedelta greater than the visibility timeout (I'm also in the process of dealing with it in a previous project), so I'm interested in whether there's anything neater than my solution. My solution is to have another "helper" task run 5 minutes before the "main" one needs to be executed. The helper schedules the "main" task to run at the required time and stores its task id in the DB, and the "main" task then checks whether the stored task id is the one that is being run. The last part (storing the task id) is required because multiple runs of the "helper" task could spawn a lot of "main" task instances, but with this approach each will have a different task id.
I really hate how that approach sounds and how it works: if the task is scheduled to run a month from now, the "helper" and "main" tasks are executed up to a hundred times.
I also know that it's an open issue, so I'm more interested in a neat workaround than in a solution itself.
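The task-id guard described above can be sketched without Celery at all. In this minimal model a dictionary stands in for the DB table, and `helper`, `main` and the job key "report-42" are hypothetical names; a real helper would enqueue the main task with the generated id instead of just returning it.

```python
import uuid
from typing import Dict, List

expected_task_id: Dict[str, str] = {}  # stands in for the DB table
executed: List[str] = []               # task ids that did the real work


def helper(job_key: str) -> str:
    """Pick a fresh id for the main task and record it as the valid one."""
    task_id = str(uuid.uuid4())
    expected_task_id[job_key] = task_id  # a later helper run overwrites this
    return task_id


def main(job_key: str, task_id: str) -> bool:
    """Do the work only if this run is the one recorded most recently."""
    if expected_task_id.get(job_key) != task_id:
        return False                     # stale duplicate, skip it
    executed.append(task_id)
    return True


first = helper("report-42")
second = helper("report-42")             # helper was delivered a second time
print(main("report-42", first), main("report-42", second))
```

Only the main run whose id matches the stored value does any work, so redelivered helpers cost extra executions but not duplicate side effects.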
Having tested available options, in my opinion only using RabbitMQ as broker solves the whole problem.
Although it's a viable option for me, RabbitMQ lacks some of the configuration parameters available for Redis (e.g. pool size), which makes it unusable for those on hosting services that limit the number of open broker connections.

Routing an activity task to a specific worker in the SWF fleet

I have a fleet of multiple worker hosts polling for the following tasks of my SWF:
Activity 1: Perform some business logic to create a large file.
Activity 2: Wait for some time (a human approval, timer, etc.)
Activity 3: Transmit the file using some protocol (governed by input parameters of the SWF).
Activity 4: Clean-up the local-generated file.
The file generated in Step-1 needs to be used again in Step-3, and then eventually discarded at the end of the workflow.
The system would work fine if there is only 1 host polling for all tasks. However, when I have multiple workers, I cannot seem to ensure that task-1 and task-3 would end up on the same host.
I would like to avoid doing the following:
Uploading the file to a central repository (say S3) on step-1 and download it in step-3; or
Having a single activity for the task-1 and task-3.
I have the following questions:
Is it possible to control that subsequent activities be run on the same host as opposed to going to any random host in my fleet?
What are specific guidelines/best practices on re-using resources generated in different activities in a workflow?
Is it possible to control that subsequent activities be run on the
same host as opposed to going to any random host in my fleet?
Yes, absolutely. The basic idea is that SWF task lists (the queues used to deliver activity tasks) are dynamic. So each host can have its own task list, and the workflow can specify a specific task list name when calling an activity. See the fileprocessing sample, which executes the download activity on any host from the pool, then converts the file and uploads the result on the same host as the first one.
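The routing idea can be illustrated with in-memory queues. This is a toy model, not the SWF API: the task list names, the `Worker` class and the activity names are all assumptions made for the example. The first activity goes to a shared list, the worker that picks it up reports its own host-specific list, and the workflow sends the remaining activities there.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional, Tuple

SHARED_TASK_LIST = "shared"  # hypothetical name for the fleet-wide list
task_lists: Dict[str, Deque[str]] = defaultdict(deque)
log: List[Tuple[str, str]] = []  # (host, activity) in execution order


def schedule(activity: str, task_list: str) -> None:
    """Workflow side: put an activity task on the named task list."""
    task_lists[task_list].append(activity)


class Worker:
    def __init__(self, host: str) -> None:
        self.host = host
        self.own_list = f"tasks-{host}"  # host-specific task list

    def poll_once(self, task_list: str) -> Optional[str]:
        """Run one task from the list, if any, and report this host's list."""
        if not task_lists[task_list]:
            return None
        activity = task_lists[task_list].popleft()
        log.append((self.host, activity))
        return self.own_list


workers = [Worker("host-a"), Worker("host-b")]

schedule("create_file", SHARED_TASK_LIST)
# Any host may win the shared task; here host-b happens to poll first.
pinned = workers[1].poll_once(SHARED_TASK_LIST)

for step in ("transmit_file", "cleanup_file"):
    schedule(step, pinned)               # route to the host that has the file
    for worker in workers:
        worker.poll_once(worker.own_list)

print(log)
```

Because host-a only ever polls its own list and the shared one, it never sees the follow-up tasks for a file it did not create.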
What are specific guidelines/best practices on re-using resources generated in different activities in a workflow?
Caching the result in the worker process memory or on the local disk is considered a best practice. Sometimes using an external data store and fetching the data each time also makes sense.