Is there a way to start only one process instance per business key - camunda

I have been trying out the Camunda BPMN engine for a couple of days.
Using the REST API, I have managed to start a process instance and associate it with a business key. However, I realized that it is possible to start multiple process instances under the same business key. Is there a way to enforce a condition such that only one process instance is allowed per business key per process definition?
Thank you.

One way is to check synchronously, when a process is started, whether a process instance with this businessKey already exists.
Here is a related example model which only allows one instance of the definition:
https://raw.githubusercontent.com/rob2universe/process-models/master/bpmn/singleton.bpmn
The interesting part is the expression:
${historyService.createHistoricProcessInstanceQuery().processDefinitionKey(execution.getProcessDefinitionId().split(":")[0]).active().count() > 0}
You can change the filter criteria in the query to check the businessKey instead of the process definition key.
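A hedged Java sketch of the same guard (assumes a Camunda RuntimeService is available; "myProcess" and businessKey are placeholders for your own values). The helper mirrors the split(":")[0] trick used in the expression above:

```java
// The definition key is the part of the process definition id before the first
// colon, e.g. "invoice:3:a1b2c3" -> "invoice".
static String definitionKey(String processDefinitionId) {
    return processDefinitionId.split(":")[0];
}

// Hypothetical guard before starting a new instance (requires a running engine):
// long active = runtimeService.createProcessInstanceQuery()
//     .processDefinitionKey(definitionKey(execution.getProcessDefinitionId()))
//     .processInstanceBusinessKey(businessKey)
//     .active()
//     .count();
// if (active == 0) {
//     runtimeService.startProcessInstanceByKey("myProcess", businessKey);
// }
```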

We use the following pattern to solve this:
The model uses an Event Sub Process that is started by a message. Now correlate the message, like:
curl --location --request POST 'http://localhost:8080/engine-rest/message' \
--header 'Content-Type: application/json' \
--data-raw '{
  "messageName": "message",
  "businessKey": "my-unique-bk",
  "processVariables": {}
}'
The first time, this starts a new process instance. The second time, the message is correlated to the instance already running under that business key, so it starts the Event Sub Process instead.


How to update MultiInstance User Task to add/delete Tasks?

We have a business scenario where we would like to have the ability to INCREASE or DELETE tasks within a multi-instance context.
I’ve managed to successfully create a multi-instance User Task based on a collection workPartnerList.
While the process is at the multi-instance stage of the workflow, how can I increase or decrease the number of instances based on the count/values of workPartnerList, which can grow or shrink based on updates from the API call? (We need to do this prior to the overall task completion.)
I assume you are referring to a parallel multi-instance task.
https://docs.camunda.org/manual/latest/reference/bpmn20/tasks/task-markers/
Another way to define the number of instances is to specify the name of a process variable which is a collection, using the loopDataInputRef child element. For each item in the collection, an instance will be created.
The creation of the instances happens at the point in time when the execution reaches the parallel multi-instance activity. The number of instances created is determined by the size of the collection at this specific point in time. (A BPMN2 process engine will not automatically keep the task instances in sync with the collection.)
To "delete" task instance you can complete or cancel them (e.g. via an attached boundary event) or us the completion condition.
A multi-instance activity ends when all instances are finished. However, it is possible to specify an expression that is evaluated every time one instance ends. When this expression evaluates to true, all remaining instances are destroyed and the multi-instance activity ends, continuing the process. Such an expression must be defined in the completionCondition child element.
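A minimal sketch of such a completionCondition in the BPMN XML (the camunda: attributes are Camunda extension elements; workPartnerList and the 60% threshold are assumptions for illustration — nrOfInstances and nrOfCompletedInstances are loop counters provided by the engine):

```xml
<multiInstanceLoopCharacteristics camunda:collection="${workPartnerList}"
                                  camunda:elementVariable="workPartner">
  <!-- Destroy the remaining instances once 60% have completed -->
  <completionCondition>${nrOfCompletedInstances / nrOfInstances >= 0.6}</completionCondition>
</multiInstanceLoopCharacteristics>
```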
To dynamically add task instances to a running process instance you can, for instance, use event sub processes or attach a boundary event to the task.
https://docs.camunda.org/manual/7.13/reference/bpmn20/events/message-events/#message-boundary-event
Boundary events are catching events that are attached to an activity. This means that while the activity is running, the message boundary event is listening for a named message. When this is caught, two things might happen, depending on the configuration of the boundary event:
Interrupting boundary event: The activity is interrupted and the sequence flow going out of the event is followed.
Non-interrupting boundary event: One token stays in the activity and an additional token is created which follows the sequence flow going out of the event.
If you are willing to approach this at the API level, then the TaskService allows you to create a new task (with a user-defined task id).
Example:
https://github.com/rob2universe/cam-multi-instance/blob/25f524be6a112deb1b4ae3bb4f28a35422e428e0/src/test/java/org/camunda/bpm/example/ProcessJUnitTest.java#L79
The migration API would even allow you to add additional instances to the already created set of task instances - see: https://docs.camunda.org/manual/latest/user-guide/process-engine/process-instance-modification/#modify-multi-instance-activity-instances

AWS Lambda Function not running to completion

I have an AWS Lambda function which takes a ZIP code and finds locations within a radius. Each location returned is processed further by running it through a series of business processes. My test data consists of 24 locations being returned for each request. Each request has a GUID that identifies it. I'm peppering my code with console.log statements so I can follow what's happening via CloudWatch logs.
Whenever I run the Lambda function, the CloudWatch log ends before the process finishes. I subsequently make another request to Lambda with a different GUID and I see entries in the CloudWatch log for the previous request.
How come CloudWatch is not staying active and capturing log entries throughout the entire process? - I'm assuming that Lambda function ends as soon as "End" shows up in CloudWatch Log.
How come log entries from a previous request are showing up muddled in with a subsequent request?
UPDATE
The data construct containing the 24 locations comes from an async method. Each of the 24 locations is a Task. Because async is a promise of future results, is it possible the Lambda function is closing down before the async work is satisfied? This could explain why only a handful of the 24 locations are output in the CloudWatch log before the "End" is registered. There's no consistency as to how many of the 24 are processed/logged before the "End" event happens. The subsequent call seems to pick up the balance from the original call.
I figured out the issue. The problem I'm experiencing is associated with using Tasks in my code. Tasks are a promise for receiving data at some point in the future. I'm pretty certain my Lambda function was exiting before all of the tasks came to fruition. I added the following to force my Lambda to wait until all tasks completed:
...
var taskList = new List<Task>();
foreach (var range in outerRange.TrySplit(_config.HashKeyLength))
{
    var task = RunGeoQuery(geoQueryRequest, geoQueryResult, range, cts.Token);
    taskList.Add(task);
}
// Block until all location tasks have completed before the handler returns.
Task.WhenAll(taskList).Wait();
...
After adding Task.WhenAll(taskList).Wait(), my Lambda function processed all 24 locations, and this is reflected in the CloudWatch logs.
I'm not sure if there are ramifications to using this approach. Any insights are appreciated.
Thank you.

How to get the currently running activity instance of a process definition in Camunda

I am new to Camunda.
I want to cancel the currently running activity instance and start a new one to move the token state.
But I am having a hard time figuring out how to get the currently running activity instance id via the Camunda Java API.
Any thoughts? Thank you all.
Actually the question is "How to get the running activity instances", and I already found the answer elsewhere.
Here is the answer.
Just use the Java API like below:
ActivityInstance activityInstance = runtimeService.getActivityInstance(instance.getProcessInstanceId());
ActivityInstance[] activityInstances = activityInstance.getChildActivityInstances();
The activityInstances array contains the running activity instances. You can use the ids of these activity instances to cancel a running activity instance.
Had the same trouble. The call below returns a list of activity ids (whatever they are: user task, service task, etc.). If you don't have parallel active tasks, the list will contain a single activity id.
List<String> activeActivityIds = processEngine.getRuntimeService()
    .getActiveActivityIds(processInstance.getProcessInstanceId());
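Once you have an activity instance id, the token move itself can be done with the process instance modification API. A hedged sketch (assumes a running engine; "TargetActivity" is a hypothetical placeholder for an activity id from your BPMN model):

```java
// Cancel the running activity instance and move the token to another activity.
runtimeService.createProcessInstanceModification(processInstanceId)
    .cancelActivityInstance(activityInstanceId)
    .startBeforeActivity("TargetActivity")
    .execute();
```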

Throttled Queue Service

I have a function doWork(id) that I'm offloading to some worker servers using AWS SQS. This function can get called very frequently, but I'd like to throttle it so that for a given id, the work is done no more than once per second.
Is it possible with AWS / are there any services that feature this functionality?
EDIT: Some clarification.
doWork(id) does some expensive work on a record in a database. This work needs to update continuously whenever the user interacts with the record. Thus, I call doWork(id) whenever the user calls a method that edits the record. However, the user may edit the record many times very quickly (I'm building a text editor, so every character is an edit). Rather than run doWork(id) an unnecessary number of times, I'd like to throttle that work so it happens at most once per second.
Because this work is expensive, I enqueue a message in SQS and have a set of "worker" servers that dequeue tasks and run them.
My goal here is to somehow maintain the stateless horizontal scalability of my servers while throttling doWork(id). To make matters a little more complicated, I don't want to throttle the doWork function itself -- I want to throttle the work for each individual record identified by the id passed to doWork.
You could use a Redis instance on ElastiCache and configure your workers to use a distributed rate limiter for keys based on id. There are also many packages for different languages based on this kind of idea that might be ready to run on your workers.
That's interesting. You want to delay the work in case they hit another key within a given time period. If they don't hit another key in that time period, you then want to do the work. You might also want to do it after x seconds even if they continue typing (Auto Save).
The problem is that each keypress sends a message to the queue. When a worker receives the message, they have no idea whether another key has been pressed since the message was sent, and there's no way to look in the queue for other matching messages.
Amazon SQS does have the ability to delay a message, which means it will not be available for receiving for a given period, but this alone can't solve the problem because the worker doesn't know what else has happened.
Bottom line: A traditional queue is not a suitable mechanism for this use-case. You need something akin to a database/cache that can update a "last modified" timestamp each time that a key is pressed. Once that timestamp is more than x seconds old, you should queue the worker.
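The "last modified timestamp" idea above can be sketched as a per-id throttle. A minimal in-memory version in Java (the class and names are hypothetical; in production the timestamps would live in a shared store such as Redis/ElastiCache, as suggested above, so all workers see the same state):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-id throttle: accept work for an id at most once per interval.
class PerIdThrottle {
    private final long intervalMillis;
    private final Map<String, Long> lastRun = new ConcurrentHashMap<>();

    PerIdThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    // Returns true if work for this id should be enqueued at time 'now' (millis),
    // i.e. at least intervalMillis have passed since the last accepted request.
    boolean tryAcquire(String id, long now) {
        Long prev = lastRun.putIfAbsent(id, now);
        if (prev == null) {
            return true; // first request for this id
        }
        if (now - prev >= intervalMillis) {
            // Atomically claim the slot; fails if another worker got there first.
            return lastRun.replace(id, prev, now);
        }
        return false; // throttled
    }
}
```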

(AWS SWF) Is there a way to get a list of all activity workers listening on a particular tasklist?

In our beta stack, we have a single EC2 instance listening to a task list. Sometimes another developer on the team starts his own instance for testing purposes and forgets to turn it off. This creates problems for the next developer who tries to start an activity, only for it to be picked up by the previous developer's machine. Is there a way to get the hostnames of all activity workers listening to a particular task list?
It is not currently possible to get a list of pollers waiting on a task list through the SWF API. The workaround is to look at the identity field on the ActivityTaskStarted history event after the task was picked up by the wrong worker.
One way to avoid this issue is to always use a task list name that is specific to a machine or developer, so workers never collide.
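The per-developer naming can be as simple as suffixing the task list with the local hostname. A small sketch (the base name "beta-activities" is a hypothetical example):

```java
import java.net.InetAddress;

// Hypothetical helper: build a machine-specific task list name so two
// developers' workers never poll the same list by accident.
static String devTaskList(String base) throws Exception {
    return base + "-" + InetAddress.getLocalHost().getHostName();
}
```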