How to handle nearly concurrent events on a Camunda event gateway

I am calling an external system that is supposed to send two events, A and B, sequentially. But it is so fast that they arrive at almost the same time.
Considering "FLOW A" (shown in blue), the event-based gateway waits for Event A, and then the activity "do something" is executed before another event-based gateway accepts Event B or Event C.
The Super Fast System responds with either events A and B or events A and C at almost the same time (a difference of milliseconds). As a result, the engine discards event B or C, because the second gateway is not yet waiting when it arrives. What would be the best design?

Implement a retry + delay mechanism on the message broker / event bus or sending system, which kicks in if the message cannot be correlated.
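A minimal sketch of that retry + delay idea, assuming the sender can tell when correlation failed. The names `CorrelationError`, `deliver_with_retry` and the `correlate` callable are hypothetical stand-ins for whatever your broker or client exposes; they are not Camunda API names.

```python
import time
from typing import Callable


class CorrelationError(Exception):
    """Hypothetical error: no process instance is waiting for the message yet."""


def deliver_with_retry(
    correlate: Callable[[str], None],
    message_name: str,
    max_attempts: int = 5,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Try to correlate a message, pausing between failed attempts.

    Returns the number of attempts used. Re-raises CorrelationError
    once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            correlate(message_name)
            return attempt
        except CorrelationError:
            if attempt == max_attempts:
                raise
            # Give the engine time to reach the second gateway.
            sleep(delay_seconds)
    raise ValueError("max_attempts must be at least 1")
```

The delay only needs to cover the time the engine takes to finish "do something" and reach the second gateway, so a short fixed delay with a handful of attempts is usually a reasonable starting point.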

Related

How to tell when Lambdas complete processing of all messages in SQS

Currently I have a process where a Lambda (A) gets triggered; it has logic to find out which customers need another Lambda (B) to run for them (via a queue). For any run, Lambda A may place 3k to 4k messages on the SQS queue for Lambda B to pick up and process. As Lambda B communicates with an external API, its concurrency is set to 10 so as not to overload the API. The whole process completes in 35 to 45 minutes.
My problem is how to tell when all the processing is complete?
If you don't need timely information, you could check out the CloudWatch Metrics that SQS offers, e.g.:
ApproximateNumberOfMessagesVisible
The number of messages available for retrieval from the queue.
Reporting Criteria: A non-negative value is reported if the queue is active.
and
ApproximateNumberOfMessagesNotVisible
The number of messages that are in flight. Messages are considered to be in flight if they have been sent to a client but have not yet been deleted or have not yet reached the end of their visibility window.
Reporting Criteria: A non-negative value is reported if the queue is active.
If the sum of these two metrics hits zero, no messages are in the Queue, and processing should be done.
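As a rough sketch of that check: because both metrics are approximate, the function below requires several consecutive zero readings before declaring the run finished. The name `processing_finished` and the `required_zero_samples` threshold are assumptions for illustration; fetching the samples from CloudWatch is left out.

```python
from typing import Iterable, Tuple


def processing_finished(
    samples: Iterable[Tuple[int, int]],
    required_zero_samples: int = 3,
) -> bool:
    """Decide whether the queue looks drained.

    samples: (visible, not_visible) metric pairs, oldest first.
    Returns True only if the most recent `required_zero_samples`
    readings all sum to zero.
    """
    recent = list(samples)[-required_zero_samples:]
    if len(recent) < required_zero_samples:
        return False
    return all(visible + not_visible == 0 for visible, not_visible in recent)
```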
If you need more timely information, the producer of the messages could increment a counter item in DynamoDB with the number of messages added, and each Lambda decrements that counter once it's done. You could then add a Lambda to the DynamoDB Stream of that table with a filter and do something when the value changes to zero again. This is, however, much more complex.
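The counter idea can be sketched in memory as follows. `PendingCounter` is a hypothetical stand-in for the DynamoDB item, and the `on_zero` callback plays the role of the Lambda attached to the table's stream; in DynamoDB itself the increments and decrements would have to be atomic updates.

```python
import threading
from typing import Callable


class PendingCounter:
    """In-memory stand-in for a 'messages still pending' counter."""

    def __init__(self, on_zero: Callable[[], None]) -> None:
        self._lock = threading.Lock()
        self._pending = 0
        self._on_zero = on_zero

    def add(self, count: int) -> None:
        """Called by the producer with the number of messages it enqueued."""
        with self._lock:
            self._pending += count

    def done(self) -> None:
        """Called by a worker after it has finished one message."""
        with self._lock:
            self._pending -= 1
            finished = self._pending == 0
        if finished:
            self._on_zero()
```

Note that the producer should record the count before workers can finish any message; otherwise the counter can pass through zero while work is still being enqueued.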
A third option could be to turn the whole thing into a Step Functions state machine and use a Map state with a concurrency limit to work on the tasks. The drawback is that the length of the list it can work on is limited, as far as I know.

IoT lifecycle events handling

What is the best practice to check if AWS IoT Core thing is still offline?
Being able to query the state of an AWS IoT thing will, for many, be an essential part of their application. Luckily, AWS has a best practice on how to handle lifecycle events here: https://docs.aws.amazon.com/iot/latest/developerguide/life-cycle-events.html
It says that we should check whether the device is still offline before performing any actions.
I'm handling this on a Node.js server (listening to events), so the question is: what's the best way to handle it?
For now the plan is to create some storage (Redis?) and implement a timeout (5-10 seconds): when I receive a disconnect event, I store it, wait for the timeout, and if no other message for this device (such as Connected) arrives in the meantime, I run my logic.
Is this the right approach?
The point is not to use SQS from AWS.
And since the AWS docs say that the order of messages is not guaranteed, what is the best practice for handling that?
If your device emits a signal at periodic intervals, you can treat that as a heartbeat signal.
You can maintain a timer (x minutes, hours, etc.) and wait for the heartbeat signal from the device.
If the timer expires and you have not received the heartbeat signal, it is safe to assume that the device has gone offline. Such events are easy to model as a detector model in AWS IoT Events.
This example from AWS IoT Events does exactly that.
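The heartbeat timer described above can be sketched as a small in-memory monitor. `HeartbeatMonitor` and its methods are hypothetical names, not part of AWS IoT Events; timestamps are plain seconds supplied by the caller. Keeping only the newest timestamp per device also makes the monitor tolerant of out-of-order messages.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat per device and reports timed-out devices."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, device_id: str, timestamp: float) -> None:
        """Record a heartbeat; older, late-arriving timestamps are ignored."""
        previous = self._last_seen.get(device_id, float("-inf"))
        self._last_seen[device_id] = max(previous, timestamp)

    def offline_devices(self, now: float) -> List[str]:
        """Return the devices whose last heartbeat is older than the timeout."""
        return sorted(
            device_id
            for device_id, seen in self._last_seen.items()
            if now - seen > self.timeout_seconds
        )
```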

Which one is synchronous or asynchronous communication, and why?

I am confused about which kind of communication applies in the given scenario. I feel that every single list item could be synchronous communication.
Order service calling the shipping service to proceed with shipment.
User buying items from the User Interface (UI) service, resulting in invocation of the Order service.
User Interface (UI) service calling the catalog service to get information about all of the items that it needs to render.
All three examples would be considered asynchronous as they prompt a response due to cause and effect - call and respond. While all three of these could happen concurrently, each in and of themselves is not synchronous.
Synchronous communication happens simultaneously, like two people editing the same document online. Each editor reads and writes at the same time, but does not interrupt the other in any way.
The best example of synchronous communication is a telephone conversation. All connected parties can hear (receive) and speak (transmit) at the same time, and although humans have difficulty performing both actions simultaneously, the telephone connection itself has no trouble providing both concurrently.
Asynchronous acts like a two-way radio. You must stop transmitting in order to receive.
Synchronous = in sync
The sender waits for a response from the receiver before continuing.
Both the sender and the receiver must be active.
The sender sends data to the receiver because it requires an immediate response to continue processing.
When you execute something synchronously, you wait for it to finish before moving on to another task.
Asynchronous = out of sync
The sender does not wait for a response from the receiver.
The receiver can be inactive.
Once the receiver is active, it receives and processes the message.
The sender puts data in a message queue and does not require an immediate response to continue processing.
When you execute something asynchronously, you can move on to another task before it finishes.
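The difference can be sketched in a few lines. All names here (`ship`, `place_order_sync`, `place_order_async`, `shipping_worker`) are made up for illustration, and an in-process `queue.Queue` stands in for a real message queue.

```python
import queue


def ship(order_id: str) -> str:
    """Pretend shipping service."""
    return f"shipment created for {order_id}"


def place_order_sync(order_id: str) -> str:
    # Synchronous: the caller blocks until the shipping service answers.
    return ship(order_id)


# Stand-in for a message queue between the two services.
shipping_queue = queue.Queue()


def place_order_async(order_id: str) -> str:
    # Asynchronous: hand the work over and return without waiting for shipping.
    shipping_queue.put(order_id)
    return f"order {order_id} accepted"


def shipping_worker() -> list:
    # Runs later, whenever the receiver becomes active, and drains the queue.
    results = []
    while not shipping_queue.empty():
        results.append(ship(shipping_queue.get()))
    return results
```

In the asynchronous version the order is accepted even if no shipping worker is running at that moment; the message simply waits in the queue.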
In your case,
Catalog Service <-- UI --> Order Service --> Shipment service
1) The UI has to fetch item details from the Catalog Service (synchronous, because it needs the items immediately).
2) Once all items are selected, the UI has to invoke the Order Service (synchronous or asynchronous, depending on the user's action).
The user might add items to the shopping cart for later, add them to favourites, or place the order immediately.
3) Once all items are in the shopping cart collection, the Order Service has to invoke the Shipment Service (asynchronous).
Payment should be synchronous. You need an acknowledgement.
Assuming payment and everything else is done, it calls the shipment delivery service.
Delivery is asynchronous because it cannot be acknowledged immediately; it may take two days, for example.

Event Driven MessageBus architecture with AWS SNS: one or many message buses/lambda action functions

I am implementing a process in my AWS based hosting business with an event driven architecture on AWS SNS. This is largely a learning experience with a new architecture, programming and hosting paradigm for me.
I have considered AWS Step functions, but have decided to implement a Message Bus with AWS SNS topic(s), because I want to understand the underlying event driven programming model.
Nearly all actions are performed by lambda functions and steps are coupled via SNS and/or SQS.
I am undecided whether to implement the process with one or many SNS topics, and whether I should subscribe the core logic to the message bus(es) with one or many lambda functions.
One or many message buses
My core process currently consists of 9 events, of which 2 sets of 2 can run in parallel; the remaining 4 are sequential. Subscribing them all to the same message bus is easier to set up, but requires each lambda function to check whether the message is relevant to it, which seems like a waste of resources.
On the other hand I could have 6 message buses and be sure that a notified resource has something to do with the message.
One or many lambda functions
If all lambda functions are subscribed to the same message bus, it may be easier to package them all up with a dispatcher function in a single lambda function. It would also reduce the amount of code to upload to Lambda, although I don't have to pay for that.
On the other hand, I would lose the ability to control the timeout for each lambda function, and any changes to the order of events would then depend on the dispatcher code.
I would still have the ability to scale each part of the process, as any parts that contain repeating elements are separated by SQS queues.
You should always emit each type of message to its own topic, as this allows other services to consume these events without tightly coupling the two services.
Likewise, each worker that wants to consume messages should have its own queue with its own subscription to the topic.
Doing this allows you to add new message consumers for a given event without having to modify the upstream service. Furthermore, responsibility for each component is clear: the service producing messages to a topic owns that topic (and the message format), whereas the consumer owns its queue and event handling semantics.
Your consumer can specify a message filter when subscribing to a topic, so that it receives only the messages it cares about (see the SNS documentation on subscription filter policies).
For example, a process that sends a customer survey after the customer has received their order would subscribe its queue to the Order Status Changed event with the filter set to only receive events where the new_status field is equal to shipment-received.
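For the survey example, the filter policy could look like the dictionary below. SNS filter policies are JSON documents that map an attribute name to a list of accepted values. The `matches` function is only a local illustration of exact string matching, written for this sketch; SNS evaluates the policy itself and supports more operators than shown here.

```python
from typing import Dict, List

# Filter policy for the survey consumer's subscription:
# deliver only events whose new_status is "shipment-received".
FILTER_POLICY: Dict[str, List[str]] = {"new_status": ["shipment-received"]}


def matches(policy: Dict[str, List[str]], attributes: Dict[str, str]) -> bool:
    """Illustrative exact-match check: every policy key must be present
    in the message attributes with one of the accepted values."""
    return all(
        attributes.get(key) in allowed_values
        for key, allowed_values in policy.items()
    )
```

With this policy, a message carrying new_status = "shipped" never reaches the survey queue, so the consumer does not spend invocations discarding irrelevant events.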
The above reflects principles of Service-Oriented architecture - and there's plenty of good material out there elaborating the points above.

When to use delay queue feature of Amazon SQS?

I understand the concept of delay queue of Amazon SQS, but I wonder why it is useful.
What's the usage of SQS delay queue?
Thanks
One use case I can think of is distributed applications that have eventual consistency semantics. The system consuming the message may have a dependency, such as a correlation identifier, that must be available first, and hence may need to wait for a certain guaranteed duration before it can see the correlated data. In this case, it makes sense to delay the message for that duration.
Like you I was confused as to a use-case for delay queues, until I stumbled across one in my own work. My application needs to have an internal queue with each item waiting at least one minute between each check for completion.
So instead of having to manage a "last-checked-time" on every object, I just shove the object's ID into an SQS queue message with a delay time of 60 seconds, and my main loop then becomes a simple long-poll against the queue.
A few off the top of my head:
Emails - Let's say you have a service that sends reminder emails triggered from queue messages. You'd have to delay enqueueing the message in that case.
Race conditions - Delivery delays can be used to overcome race conditions in distributed systems. For example, a service could insert a row into a table and send a message about its availability to other services. They can't use the new entry just yet, so you have to delay publishing the SQS message.
Handling retries - Sometimes if a message fails you want to retry with exponential backoffs. This requires re-enqueuing the message with longer delays.
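A minimal sketch of how the delay for each retry could be computed. The function name `backoff_delay` and the 5-second base are assumptions; the 900-second cap reflects the SQS limit of 15 minutes for a message delay.

```python
SQS_MAX_DELAY_SECONDS = 900  # SQS accepts at most 15 minutes of delay


def backoff_delay(attempt: int, base_seconds: int = 5) -> int:
    """Delay for the given retry attempt (1-based), doubling each time
    and capped at the SQS maximum."""
    if attempt < 1:
        raise ValueError("attempt must be at least 1")
    return min(base_seconds * 2 ** (attempt - 1), SQS_MAX_DELAY_SECONDS)
```

The returned value would be passed as the delay when re-enqueuing the failed message, with the attempt number carried along in the message itself.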
I've built a suite of APIs to make queue message scheduling easy. You can call our APIs to schedule queue messages, and to cancel, edit, and check the status of such messages. Think of it as a scheduler microservice.
www.schedulerapi.com
If you are looking for a solution, let me know. I've built these schedulers before at work for delivering emails at high scale, so I have experience with similar use cases.
One use case can be:
Think of a time-critical action such as a scheduled equity trade order.
Suppose one of your subsystems fetches all the orders scheduled for the next 60 minutes and puts them in a queue (to be consumed by another subsystem).
If you send these orders directly, they become visible in the queue immediately and are processed in whatever order they are received.
Most likely, they will then not execute at the exact time (hour:minute:second) the customer wanted, and this will affect the outcome.
To solve this, the first subsystem adds a delay in seconds (the difference between the current time and the execution time), so that each message only becomes visible after that delay, at the time the user wanted.
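The delay calculation can be sketched as below. The function name `delay_until` is hypothetical. One caveat worth stating: SQS caps the delay at 900 seconds (15 minutes), so an order further out than that cannot be covered by a single delay and would need to be held back or re-enqueued; this sketch simply rejects it.

```python
from datetime import datetime, timezone

SQS_MAX_DELAY_SECONDS = 900  # SQS accepts at most 15 minutes of delay


def delay_until(execute_at: datetime, now: datetime) -> int:
    """Seconds to delay a message so it becomes visible at execute_at.

    Returns 0 if the execution time has already passed and raises
    ValueError if the gap exceeds what a single SQS delay can cover.
    """
    remaining = int((execute_at - now).total_seconds())
    if remaining > SQS_MAX_DELAY_SECONDS:
        raise ValueError("execution time is more than 15 minutes away")
    return max(remaining, 0)


# Example: an order due 90 seconds from "now" gets a 90-second delay.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
due = datetime(2024, 1, 1, 12, 1, 30, tzinfo=timezone.utc)
delay = delay_until(due, now)
```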