How can assign Camunda task automaticly? - camunda

Is there any way to assign Camunda tasks to users round robin? Or any automatic pattern?

You can use an expression with a process variable in the task assignment attribute. The value of the process variable can then be determined dynamically before the user task is reached. You can for instance call a service to get the desired assignee or use DMN. Here is an example: https://camunda.com/blog/2020/05/camunda-bpm-user-task-assignment-based-on-a-dmn-decision-table/
I would not recommend assigning work to individuals by default. It is not required in most scenarios. Instead assign the tasks to a group/role and let people take the tasks from the group worklist when they are free. Round-robin seems like a scenario where you want to ensure equal task distribution. However, in real-world scenarios people do not always have the same capacity, go on leave, etc. Then you will have to deal with monitoring and reassigning the task in these cases. The audit trail and Optimize provide transparency and can create reports of task completion volumes. One can get people to complete the appropriate amount of work items without fixed task assignment to an individual.

Related

What's the cheapest way to store an auto increment indexed list of values in AWS?

I have a DynamoDB-based web application that uses DynamoDB to store my large JSON objects and perform simple CRUD operations on them via a web API. I would like to add a new table that acts like a categorization of these values. The user should be able to select from a selection box which category the object belongs to. If a desirable category does not exist, the user should be able to create a new category specifying a name which will be available to other objects in the future.
It is critical to the application that every one of these categories be given a integer ID that increments starting the first at 1. These numbers that are auto generated will turn into reproducible serial numbers for back end reports that will not use the user-visible text name.
So I would like to have a simple API available from the web fronted that allows me to:
A) GET /category : produces { int : string, ... } of all categories mapped to an ID
B) PUSH /category : accepts string and stores the string to the next integer
Here are some ideas for how to handle this kind of project.
Store it in DynamoDB with integer indexes. This leaves has some benefits but it leaves a lot to be desired. Firstly, there's no auto incrementing ID in DynamoDB, but I could definitely get the state of the table, create a new ID, and store the result. This might have issues with consistency and race conditions but there's probably a way to achieve this safely. It might, however, be a big anti pattern to use DynamoDB this way.
Store it in DynamoDB as one object in a table with some random index. Just store the mapping as a JSON object. This really forgets the notion of tables in DynamoDB and uses it as a simple file. It might also run into some issues with race conditions.
Use AWS ElasticCache to have a Redis key value store. This might be "the right" decision but the downside is that ElasticCache is an always on DB offering where you pay per hour. For a low-traffic web site like mine I'd be paying minumum $12/mo I think and I would really like for this to be pay per access/update due to the low volume. I'm not sure there's an auto increment feature for Redis built in the way I'd need it. But it's pretty trivial to make a trasaction that gets the length of the table, adds one, and stores a new value. Race conditions are easily avoid with this solution.
Use a SQL database like AWS Aurora or MYSQL. Well this has the same upsides as Redis, but it's also more overkill than Redis is, and also it costs a lot more and it's still always on.
Run my own in memory web service or MongoDB etc... still you're paying for constant containers running. Writing my own thing is obviously silly but I'm sure there are services that match this issue perfectly but they'd all require a constant container to run.
Is there a food way to just store a simple list, or integer mapping like this that doesn't cost a constant monthly cost? Is there a better way to do this with DynamoDB?
Store the maxCounterValue as an item in DyanamoDB.
For the PUSH /category, perform the following:
Get the current maxCounterValue.
TransactWrite:
Put the category name and id into a new item with id = maxCounterValue + 1.
Update the maxCounterValue +1, add a ConditionExpression to check that maxCounterValue = :valueFromGetOperation.
If TransactWrite fails, start at 1 again, try X more times

Azure WebJobs - how to keep the state?

I have a need to implement some kind of orchestrator by a WebJob. So it needs to keep a state for some kind of internal queue.
Is there any way apart from static field and from using database to keep that state?
In general the idea is simple: I have a calculation job. It get's e.g. ProductId and is doing calculations for it. It takes some time, so when another message for same ProductId is coming I need to wait, until previous calculations will finish. But at the same time I can pick a message for another ProductId if there are no running calculations for that.
I haven't found any way to make sequential processing of messages based on a specific conditions. So end up with idea to implement a stateful orchestrator which will do the trick.
Am I doing it in a wrong way?

In Camunda , What is the correct way of optimised coding among two options?

Retrieving userTasks first and then from task, retrive the processInstanceId.
First retrieving ProcessInstances and then iterate it with a loop and find usertasks in each process instance.
Please give reason too.
If you want to build a task list you should use the task queries, as you can do the right queries and filters there easily. Process instance id's are part of the result.
But it would indeed be interesting to understand the use case, what exactly you need the process instance id for.

Optimistic locking and re-try

I'm not sure about proper design of an approach.
We use optimistic locking using long incremented version placed on every entity. Each update of such entity is executed via compare-and-swap algorithm which just succeed or fail depending on whether some other client updates entity in the meantime or not. Classic optimistic locking as e.g. hibernate do.
We also need to adopt re-trying approach. We use http based storage (etcd) and it can happen that some update request is just timeouted.
And here it's the problem. How to combine optimistic locking and re-try. Here is the specific issue I'm facing.
Let say I have an entity having version=1 and I'm trying to update it. Next version is obviously 2. My client than executes conditional update. It's successfully executed only when the version in persistence is 1 and it's atomically updated to version=2. So far, so good.
Now, let say that a response for the update request does not arrive. It's impossible to say if it succeeded or not at this moment. The only thing I can do now is to re-try the update again. In memory entity still contains version=1 intending to update value to 2.
The real problem arise now. What if the second update fails because a version in persistence is 2 and not 1?
There is two possible reasons:
first request indeed caused the update - the operation was successful but the response got lost or my client timeout, whatever. It just did not arrived but it passed
some other client performed the update concurrently on the background
Now I can't say what is true. Did my client update the entity or some other client did? Did the operation passed or failed?
Current approach we use just compares persisted entity and the entity in main memory. Either as java equal or json content equality. If they are equal, the update methods is declared as successful. I'm not satisfied with the algorithm as it's not both cheap and reasonable for me.
Another possible approach is to do not use long version but timestamp instead. Every client generates own timestamp within the update operation in the meaning that potential concurrent client would generate other in high probability. The problem for me is the probability, especially when two concurrent updates would come from same machine.
Is there any other solution?
You can fake transactions in etcd by using a two-step protocol.
Algorithm for updating:
First phase: record the update to etcd
add an "update-lock" node with a fairly small TTL. If it exists, wait until it disappears and try again.
add a watchdog to your code. You MUST abort if performing the next steps takes longer than the lock's TTL (or if you fail to refresh it).
add a "update-plan" node with [old,new] values. Its structure is up to you, but you need to ensure that the old values are copied while you hold the lock.
add a "committed-update" node. At this point you have "atomically" updated the data.
Second phase: perform the actual update
read the "planned-update" node and apply the changes it describes.
If a change fails, verify that the new value is present.
If it's not, you have a major problem. Bail out.
delete the committed-update node
delete the update-plan node
delete the update-lock node
If you want to read consistent data:
While there is no committed-update node, your data are OK.
Otherwise, wait for it to get deleted.
Whenever committed-update is present but update-lock is not, initiate recovery.
Transaction recovery, if you find an update-plan node without a lock:
Get the update-lock.
if there is no committed-update node, delete the plan and release the lock.
Otherwise, continue at "Second phase", above.
IMHO, as etcd is built upon HTTP which is inherently an unsecure protocol, it will be very hard to have a bullet proof solution.
Classical SQL databases use connected protocols, transactions and journalisation to allow users to make sure that a transaction as a whole will be either fully committed or fully rollbacked, even in worst case of power outage in the middle of the operation.
So if 2 operations depend on each other (money transfert from one bank account to the other) you can make sure that either both are ok or none, and you can simply implement in the database a journal of "operations" with their status to be able to later see if a particuliar one was passed by consulting the journal, even if you were disconnected in the middle of the commit.
But I simply cannot imagine such a solution for etcd. So unless someone else finds a better way, you are left with two options
use a classical SQL database in the backend, using etcd (or equivalent) as a simple cache
accept the weaknesses of the protocol
BTW, I do not think that timestamp in lieue of long version number will strengthen the system, because in high load, the probability that two client transaction use same timestamp increases. Maybe you could try to add a unique id (client id or just technical uuid) to your fields, and when version is n+1 just compare the UUID that increased it : if it is yours, the transaction passed if not id did not.
But the really worse problem would arise if at the moment you can read the version, it is not at n+1 but already at n+2. If UUID is yours, you are sure your transaction passed, but if it is not nobody can say.

Auto-increment on Azure Table Storage

I am currently developing an application for Azure Table Storage. In that application I have table which will have relatively few inserts (a couple of thousand/day) and the primary key of these entities will be used in another table, which will have billions of rows.
Therefore I am looking for a way to use an auto-incremented integer, instead of GUID, as primary key in the small table (since it will save lots of storage and scalability of the inserts is not really an issue).
There've been some discussions on the topic, e.g. on http://social.msdn.microsoft.com/Forums/en/windowsazure/thread/6b7d1ece-301b-44f1-85ab-eeb274349797.
However, since concurrency problems can be really hard to debug and spot, I am a bit uncomfortable with implementing this on own. My question is therefore if there is a well tested impelemntation of this?
For everyone who will find it in search, there is a better solution. Minimal time for table lock is 15 seconds - that's awful. Do not use it if you want to create a truly scalable solution. Use Etag!
Create one entity in table for ID (you can even name it as ID or whatever).
1) Read it.
2) Increment.
3) InsertOrUpdate WITH ETag specified (from the read query).
if last operation (InsertOrUpdate) succeeds, then you have a new, unique, auto-incremented ID. If it fails (exception with HttpStatusCode == 412), it means that some other client changed it. So, repeat again 1,2 and 3.
The usual time for Read+InsertOrUpdate is less than 200ms. My test utility with source on github.
See UniqueIdGenerator class by Josh Twist.
I haven't implemented this yet but am working on it ...
You could seed a queue with your next ids to use, then just pick them off the queue when you need them.
You need to keep a table to contain the value of the biggest number added to the queue. If you know you won't be using a ton of the integers, you could have a worker every so often wake up and make sure the queue still has integers in it. You could also have a used int queue the worker could check to keep an eye on usage.
You could also hook that worker up so if the queue was empty when your code needed an id (by chance) it could interupt the worker's nap to create more keys asap.
If that call failed you would need a way to (tell the worker you are going to do the work for them (lock), then do the workers work of getting the next id and unlock)
lock
get the last key created from the table
increment and save
unlock
then use the new value.
The solution I found that prevents duplicate ids and lets you autoincrement it is to
lock (lease) a blob and let that act as a logical gate.
Then read the value.
Write the incremented value
Release the lease
Use the value in your app/table
Then if your worker role were to crash during that process, then you would only have a missing ID in your store. IMHO that is better than duplicates.
Here is a code sample and more information on this approach from Steve Marx
If you really need to avoid guids, have you considered using something based on date/time and then leveraging partition keys to minimize the concurrency risk.
Your partition key could be by user, year, month, day, hour, etc and the row key could be the rest of the datetime at a small enough timespan to control concurrency.
Of course you have to ask yourself, at the price of date in Azure, if avoiding a Guid is really worth all of this extra effort (assuming a Guid will just work).