What is the general approach to implement event versioning in Commanded using commanded/eventstore?

Would one put this versioning information into the events' metadata or is there an already established best practice? Wouldn't want to reinvent the wheel if not necessary.

Commanded's next release has support for event upcasting. This allows you to upgrade events from any previous state and/or from their associated metadata.

Related

Amazon EventBridge/Step Function/Lambda - Possible to achieve idempotency without introducing synchronous processing?

We have a workflow (Step Function) that is triggered via an EventBridge rule, and inside that workflow is a Lambda that creates a record in a third-party vendor system. Assuming the third-party vendor API does not have a way to enforce uniqueness of a record, we are wondering if it's possible to achieve idempotency (no duplicate record creations in the vendor) without introducing synchronous processing, e.g. a FIFO queue or limiting Lambda concurrency.
As EventBridge does not guarantee exactly-once delivery but instead guarantees at-least-once delivery, we encountered an issue with duplicate records being created in the vendor system due to duplicate events being processed at the same time (~5 ms apart).
I am wondering if it's possible to prevent this without introducing a FIFO queue or limiting Lambda concurrency to 1; if not, we will proceed with that and add a check in the Lambda to see if the record already exists in the vendor system or if the record is processed/processing on our side.
My understanding is that a check on the uniqueness of the record will not suffice while concurrency exists (e.g. two Lambdas concurrently processing the same record and checking the vendor API/Dynamo/SQL will both get the same result).
I am pretty sure we have our answer/solution, but I'm just checking whether I have overlooked something.
Lambda Powertools provides an idempotency module to help with this type of scenario.
It is currently available in the Python and Java programming languages.
It handles some of the difficult edge cases and is backed by DynamoDB.
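For reference, here is a minimal Python sketch of using that module; the table name idempotency-store and the detail.requestId idempotency key are assumptions you would adapt to your own event shape.

from aws_lambda_powertools.utilities.idempotency import (
    DynamoDBPersistenceLayer,
    IdempotencyConfig,
    idempotent,
)

# Assumed DynamoDB table "idempotency-store" with a string partition key "id".
persistence_layer = DynamoDBPersistenceLayer(table_name="idempotency-store")

# Use a field from the EventBridge detail as the idempotency key, so duplicate
# deliveries of the same event map to the same stored result.
config = IdempotencyConfig(
    event_key_jmespath="detail.requestId",  # assumed field name
    expires_after_seconds=3600,
)


def create_vendor_record(detail: dict) -> str:
    # Placeholder for the actual call to the third-party vendor API.
    return "vendor-record-id"


@idempotent(config=config, persistence_store=persistence_layer)
def handler(event, context):
    # While the first invocation for a given key is still running, a duplicate
    # invocation fails fast instead of creating a second vendor record; once it
    # completes, duplicates receive the saved response instead of re-executing.
    return {"vendorRecordId": create_vendor_record(event["detail"])}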

What would be the pattern to display all existing actors?

I programmed an Akka application that realises device management. Every device is an Akka actor, and I implemented an Akka finite state machine to control the lifecycle of each device (FUNCTIONAL, BROKEN, IN_REPAIRS, RETIRED, etc.), and I persist the devices with Akka Persistence to Cassandra.
Everything works like a dream, but I have a dilemma and would like to ask what the pattern would be to deal with it in Akka.
I will have nearly 1,000,000 devices. Akka is ideal for managing those single instances, but how do I implement it if a user wants to see all devices in the system, select one, and change its state?
I can't show them from the Akka journal table; I would not be able to show anything other than the persistenceId.
So how would you handle this dilemma?
My current plan: since all events come into my system from Kafka, also consume these messages from the topic and redirect them to Solr/Elasticsearch, so I can index some metadata along with the persistenceId and a user can select a device to process via its Akka actor.
Do you have a better idea, or how would you solve this problem?
Another option is to save this information in Cassandra in another keyspace, but for some reason I don't fancy that.
Thanks for any answers.
Akka Persistence is for managing actor state so that it can be resilient to application failures (https://www.reactivemanifesto.org/). It may not be optimal for business query cases like this. I understand that your requirement is to be able to browse the actors in the system. I see a couple of options:
Option 1:
Akka supports a feature called named actors (https://doc.akka.io/docs/akka/current/general/addressing.html). In your case you have a one-to-one mapping from device to actor, so you can take advantage of the named-actors feature: during actor creation in the actor system, name every actor after its device id. Now you can browse all your device ids (as these are your use case details, you can build a searchable module using Solr/Elasticsearch as you mentioned). Browsing devices then means browsing the actors in your system, and you can use the named actor path to retrieve an actor from the system and perform actions on it.
Option 2:
You can use monitoring tools to trace/browse the actors in the application. Beyond your immediate need, they provide several other useful metrics:
https://www.lightbend.com/blog/akka-monitoring-telemetry
https://kamon.io/solutions/monitoring-for-akka/
Akka Persistence is heavily oriented to the Command-Query Responsibility Segregation style of implementing systems. There are plenty of great outlines describing this pattern if you want more depth, but the broad idea is that you divide responsibility for changing data (the intent to change data being modeled through commands) from responsibility for querying data. In some cases this responsibility carries through to separately deployed services, but it doesn't have to (the more separated, in terms of deployment/operations or development, the less coupled they are, so there's a cost/benefit tradeoff for where you want to be on the level-of-segregation spectrum).
The portion of the system which handles commands and decides how (or even if) a given command updates state is typically called the "write-side". In your application, the FSM actors modeling the state of a device and persisting changes would be the write-side, and you seem to have that part down pat.
The portion handling the queries is, correspondingly, often called the "read-side", and one key benefit is that it can use a different data model than the write-side, up to and including using a different data store (e.g. Solr/Elasticsearch).
Since you're using Akka Persistence and event-sourcing (judging from mentioning the journal table), Akka Projections provides a good opinionated wrapper for publishing events from the write-side to Kafka for another service to update a Solr/Elasticsearch read-side with. It does require (at least at this time) that your write-side tag events; with some effort you can do something similar by combining the persistenceIds and eventsByPersistenceId query streams to feed events from the write-side to Kafka without having to tag.
Note that when going down the CQRS path, you are generally committing to some level of eventual consistency between the write-side and the read-side.

AWS Lambda - Store state of a queue

I'm currently tasked with building a serverless architecture for communication between government agencies and citizens, and a main component is some form of queue that contains an object/pointer to each citizen's request, sorted by priority. The government workers can then process an element when available. As Lambda is stateless, I need to save the queue outside in some manner.
For saving state I've gathered that you can use DynamoDB or S3 Buckets and use event triggers to invoke related Lambda methods. Some also suggest using Parameter Store to save some state variables. Storing things globally has also come up, though as you can't guarantee that the Lambda doesn't terminate, it doesn't seem like a good idea.
Finally, I've also read a bit about SQS, though I have no idea if it is at all applicable to this case.
What is the best-practice / suggested approach when working with Lambda in this way? I'm leaning towards S3 Buckets, due to event triggering, and not using DynamoDB as our DB.
Storing things globally has also come up, though as you can't guarantee that the Lambda doesn't terminate, it doesn't seem like a good idea.
Correct -- this is not viable at all. Note that what you are actually referring to when you say "the Lambda" is the process inside the container... and any time your Lambda function is handling more than one invocation concurrently, you are guaranteed that they will not be running in the same container -- so "global" variables are only useful for optimization, not state. Any two concurrent invocations of the same function have two entirely different global environments.
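A small Python sketch of that distinction (illustrative only; the counter below is exactly the kind of state you should not rely on):

import boto3

# Module scope: reused by invocations that land on the same warm container,
# so it is a reasonable place for client/connection reuse (an optimization).
dynamodb = boto3.resource("dynamodb")

# NOT reliable as application state: a concurrent invocation runs in a
# different container with its own copy, and any container can be recycled.
invocation_count = 0


def handler(event, context):
    global invocation_count
    invocation_count += 1  # only counts invocations seen by *this* container
    return {"seen_by_this_container": invocation_count}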
Forgetting all about Lambda for a moment -- I am not saying don't use Lambda; I'm saying that whether or not you use Lambda isn't relevant to the rest of what is written, below -- I would suggest that parallel/concurrent actions in general are perhaps one of the most important factors that many developers tend to overlook when trying to design something like you are describing.
How you will assign work from this work "queue" is extremely important to consider. You can't just "find the next item" and display it to a worker.
You must have a way to do all of these things:
finding the next item that appears to be available
verify that it is indeed available
assign it to a specific worker
mark it as unavailable for assignment
Not only that, but you have to be able to do all of these things atomically -- as a single logical action -- and without collisions.
A naïve implementation runs the risk of assigning the same work item to two or more people, with the first assignment being blindly and silently overwritten by subsequent assignments that happen at almost the same time.
DynamoDB allows conditional updates -- update a record if and only if a certain condition is true. This is a critical piece of functionality that your solution needs to accommodate -- for example, assign work item x to user y if and only if item x is currently unassigned. A conditional update will fail, and changes nothing, if the condition is not true at the instant the update happens and therein lies the power of the feature.
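As a rough sketch of that pattern with boto3 (the table and attribute names here are made up for illustration):

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("work_items")  # hypothetical table


def assign_item(item_id: str, worker_id: str) -> bool:
    """Assign the item to the worker if and only if it is currently unassigned."""
    try:
        table.update_item(
            Key={"item_id": item_id},
            UpdateExpression="SET assigned_to = :w, item_state = :s",
            ConditionExpression="attribute_not_exists(assigned_to)",
            ExpressionAttributeValues={":w": worker_id, ":s": "IN_PROGRESS"},
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # another worker won the race; nothing was changed
        raise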
S3 does not support conditional updates, because unlike DynamoDB, S3 operates only on an eventual-consistency model in most cases. After an object in S3 is updated or deleted, there is no guarantee that the next request to S3 will return the most recent version or that S3 will not return an item that has recently been deleted. This is not a defect in S3 -- it's an optimization -- but it makes S3 unsuited to the "work queue" aspect.
Skip this consideration and you will have a system that appears to work, and works correctly much of the time... but at other times, it "mysteriously" behaves wrongly.
Of course, if your work items have accompanying documents (scanned images, PDF, etc.), it's quite correct to store them in S3... but S3 is the wrong tool for storing "state." SSM Parameter Store is the wrong tool, for the same reason -- there is no way for two actions to work cooperatively when they both need to modify the "state" at the same time.
"Event triggers" are useful, of course, but from your description, the most notable "event" is not from the data, or the creation of the work item, but rather it is when the worker says "I'm ready for my next work item." It is at that point -- triggered by the web site/application code -- when the steps above are executed to select an item and assign it to a worker. (In practice, this could be browser → API Gateway → Lambda). From your description, there may be no need for the creation of a new work item to trigger an "event," or if there is, it is not the most significant among the events.
You will need a proper database for this. DynamoDB is a candidate, as is RDS.
The queues provided by SQS are designed to decouple two parts of your application -- when two processes run at different speeds, SQS is used as a buffer, allowing X to safely store the work needing to be done and then continue with something else until Y is able to do the work. SQS queues are opaque -- you can't introspect what's in the queue, you just take the next message and are responsible for handling it. On its face, that seems to partially describe what you need, but it is not a clean match for this use case. Queues are limited in how long messages can be retained, and once a message is successfully processed, it is completely gone.
Note also that SQS is only a match to your use case with the FIFO queue feature enabled, which guarantees perfect in-order delivery and exactly-once delivery -- standard SQS queues, for performance optimization reasons, do not guarantee perfect in-order delivery and may under certain conditions deliver the same message more than once, to the same consumer or a different consumer. But the SQS FIFO queue feature does not coexist with event triggers, which require standard queues.
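For illustration, sending to a FIFO queue with boto3 looks roughly like this (the queue URL, group id, and deduplication id are assumptions):

import boto3

sqs = boto3.client("sqs")

sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/work-items.fifo",
    MessageBody='{"itemId": "42"}',
    # Messages sharing a group id are delivered strictly in order.
    MessageGroupId="citizen-requests",
    # A repeated deduplication id within the 5-minute deduplication window is
    # accepted by SQS but not delivered again.
    MessageDeduplicationId="item-42",
)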
So SQS may have a role, but you need an authoritative database to store the work and the results of the business process.
If you need to store the message, then SQS is not the best tool here, because your Lambda function would then need to process the message and finally store it somewhere, making SQS nothing but a broker.
The S3 approach gives what you need out of the box, considering you can store the files (messages) in an S3 bucket and then have one Lambda consume its event. Your Lambda would then process this event and the file would remain safe and sound on S3.
If you eventually need multiple consumers for this message, then you can send the S3 event to SNS instead and finally you could subscribe N Lambda Functions to a given SNS topic.
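A rough Python sketch of the Lambda side of that approach, assuming the message content lives in the S3 object itself and process_message stands in for your real processing step:

import boto3

s3 = boto3.client("s3")


def process_message(bucket: str, key: str, body: bytes) -> None:
    # Placeholder for whatever your application does with the message file.
    print(f"processing s3://{bucket}/{key} ({len(body)} bytes)")


def handler(event, context):
    # Shape of a direct S3 -> Lambda notification. When the event is fanned
    # out through SNS instead, each record wraps this payload as a JSON string
    # in record["Sns"]["Message"] and must be decoded first.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        process_message(bucket, key, body)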
You appear to be worrying too much about the infrastructure at this stage and not enough on the application design. The fact that it will be serverless does not change the basic functionality of the application — it will still present a UI to users, they will still choose options that must trigger some business logic and information will still be stored in a database.
The queue you describe is merely a datastore of messages that are in a particular state. The application will have some form of business logic for determining the next message to handle, which could be based on creation timestamp, priority, location, category, user (eg VIP users who get faster response), specialization of the staff member asking for the next message, etc. This is not a "queue" but rather a calculation to be performed against all 'unresolved' messages to determine the next message to assign.
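A sketch of what that calculation might look like against DynamoDB, assuming a hypothetical messages table with a global secondary index keyed by status and a priority/timestamp composite; the real selection rule (VIP users, staff specialization, etc.) would drive the key design, and the actual assignment should still be done with a conditional update:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("messages")  # hypothetical table


def next_unresolved_message():
    # Hypothetical GSI "by-status-priority": partition key message_status,
    # sort key a priority/timestamp composite, lowest value = most urgent.
    response = table.query(
        IndexName="by-status-priority",
        KeyConditionExpression=Key("message_status").eq("UNRESOLVED"),
        ScanIndexForward=True,
        Limit=1,
    )
    items = response.get("Items", [])
    return items[0] if items else None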
If you wish to go serverless, then the back-end will certainly be using Lambda and a database (eg DynamoDB or Amazon RDS). The application should store everything in the database so that data is available for the application's business logic. There is no need to use SQS since there really isn't a "queue", and Parameter Store is merely a way of sharing parameters amongst application components — it is not meant for core data storage.
Determine the application functionality first, then determine the appropriate architecture to make it happen.

AWS Event-Sourcing implementation

I'm quite a newbie in microservices and Event-Sourcing and I was trying to figure out a way to deploy a whole system on AWS.
As far as I know there are two ways to implement an Event-Driven architecture:
Using AWS Kinesis Data Stream
Using AWS SNS + SQS
So my base strategy is that every command is converted to an event which is stored in DynamoDB, and to exploit DynamoDB Streams to notify other microservices about new events. But how? Which of the two solutions above should I use?
The first one has the advantages of:
Message ordering
At-least-once delivery
But the disadvantages are quite problematic:
No built-in autoscaling (you can achieve it using triggers)
No message visibility functionality (apparently; asking to confirm that)
No topic subscription
Very strict read throughput limits: you can improve this using multiple shards, but from what I read you then need an ill-defined number of Lambdas with different invocation priorities and an ill-defined strategy to avoid duplicate processing across multiple instances of the same microservice.
The second one has the advantages of:
Is completely managed
Very high TPS
Topic subscriptions
Message visibility functionality
Drawbacks:
SQS messages have best-effort ordering; I still have no idea what that means.
It says "A standard queue makes a best effort to preserve the order of messages, but more than one copy of a message might be delivered out of order".
Does it mean that, given n copies of a message, the first copy is delivered in order while the others are delivered out of order relative to the other messages' copies? Or could "more than one" mean "all"?
A very big thanks for every kind of advice!
I'm quite a newbie in microservices and Event-Sourcing
Review Greg Young's talk Polyglot Data for more insight into what follows.
Sharing events across service boundaries has two basic approaches - a push model and a pull model. For subscribers that care about the ordering of events, a pull model is "simpler" to maintain.
The basic idea is that each subscriber tracks its own high-water mark for how many events in a stream it has processed, and queries an ordered representation of the event list to get updates.
In AWS, you would normally get this representation by querying the authoritative service for the updated event list (the implementation of which could include paging). The service might provide the list of events by querying DynamoDB directly, or by getting the most recent key from DynamoDB and then looking up cached representations of the events in S3.
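A bare-bones sketch of such a pull-based subscriber, assuming a hypothetical HTTP endpoint on the authoritative service that returns ordered events after a given sequence number:

import requests  # assumes the third-party 'requests' package is available

SERVICE_URL = "https://orders.example.internal"  # hypothetical service


def handle(event: dict) -> None:
    # Subscriber-specific processing of a single event.
    print("processing", event)


def poll_stream(stream_id: str, high_water_mark: int) -> int:
    """Fetch events newer than our high-water mark and return the new mark."""
    response = requests.get(
        f"{SERVICE_URL}/streams/{stream_id}/events",
        params={"after": high_water_mark},
        timeout=10,
    )
    response.raise_for_status()
    for event in response.json()["events"]:  # assumed ordered, oldest first
        handle(event)
        high_water_mark = event["sequence"]  # persist durably in practice
    return high_water_mark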
In this approach, the "events" that are being pushed out of the system are really just notifications, allowing the subscribers to reduce the latency between the write into Dynamo and their own read.
I would normally reach for SNS (fan-out) for broadcasting notifications. Consumers that need bookkeeping support for which notifications they have handled would use SQS. But the primary channel for communicating the ordered events is pull.
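The notification side can then be as thin as an SNS publish that carries no event data at all, only a pointer back to the stream (the topic ARN and message shape below are made up):

import boto3

sns = boto3.client("sns")

sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:order-events",
    # The notification only says *that* something changed; subscribers still
    # pull the ordered events from the authoritative service.
    Message='{"streamId": "order-123", "latestSequence": 42}',
)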
I myself haven't looked hard at Kinesis - there's some general discussion in earlier questions -- but I think Kevin Sookocheff is onto something when he writes
...if you dig a little deeper you will find that Kinesis is well suited for a very particular use case, and if your application doesn’t fit this use case, Kinesis may be a lot more trouble than it’s worth.
Kinesis’ primary use case is collecting, storing and processing real-time continuous data streams. Data streams are data that are generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of Kilobytes).
Another thing: the fact that I'm accessing data from another microservice stream is an anti-pattern, isn't it?
Well, part of the point of dividing a system into microservices is to reduce the coupling between the capabilities of the system. Accessing data across the microservice boundaries increases the coupling. So there's some tension there.
But basically if I'm using a pull model I need to read data from other microservices' streams. Is it avoidable?
If you query the service you need for the information, rather than digging it out of the stream yourself, you reduce the coupling -- much like asking a service for data rather than reaching into an RDBMS and querying the tables yourself.
If you can avoid sharing the information between services at all, then you get even less coupling.
(Naive example: order fulfillment needs to know when an order has been paid for; so it needs a correlation id when the payment is made, but it doesn't need any of the other billing details.)

How to enforce entity dependencies in SOA environment - build / download?

When establishing several modular and independent services, I am challenged with dependencies / stored relationships between entities. Consider Job Position and Employee. In my system, the Employee's Assignment is linked (URI) to the Job Position.
For our application, the Job Positions would be managed by a separate service than the Employee service, which leads to the challenge of constraints to prevent inadvertent removal of a Job Position if an employee is already matched to that position.
I've designed a custom solution leveraging a registry (which should have dependency details, etc.) and enforcing a paradigm across the inter-dependent services; however, it is complex. In an SOA environment, how could one manage these inter-dependencies?
Many thanks in advance!
In some ways your question could be rephrased as "How to enforce referential integrity in an SOA environment". Well, the answer is you can't. That's kind of a by-product of the autonomy tenet of SOA.
So almost by definition, the Job Position in the Employee service is not the same thing as the Job Position in the Job Position service. This is actually a good thing. Even though both services define Job Position, they do so from two different capabilities, and are free to develop and evolve their capability as needs arise.
So, hard constraints on the removal of data within one service boundary based on the existence of similar data inside another service boundary are just not possible (or even desirable).
This is all very well, but then how do you avoid the situation where Employees may be "matched" to a Job Position which has changed in some way, either via removal or update?
Well, services can be interested in changes to other services. And in these situations, services can become consumers of each other. It's fairly obvious the Employee capability would be interested in changes to the Job Position capability.
Events are actually a fairly well used design pattern for this scenario. If a business action results in a change the data of a service, that service can publish an event message which describes the change. Other services can become consumers of this type of event and can handle it in their own fashion. Because eventing is usually implemented with a pub-sub semantic, any service capability which so desires can subscribe to the event.
In your example, the event which could be published if a job position was deleted could be defined as (using C#):
class JobPositionRemoved
{
    public int JobPositionId { get; set; }
    public string JobPositionName { get; set; }
    ...
}
How a consumer of this event actually handles it (what action would be taken by the consumer) is another question and would depend on the capability of the consumer. As an example, your Employee service could gather a list of the Employees with this job position and flag them for review, or add them to a queue for "job position reassignment".
Your event could even include a field called int ReplacedByJobPosition which would enable consumers to automatically update any capability that depended on the removed job position.
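For illustration, a consumer handler inside the Employee service might look roughly like this (sketched in Python rather than C# for brevity; every name here is hypothetical):

def employees_with_position(job_position_id: int) -> list:
    # Placeholder: query the Employee service's own data store.
    return []


def reassign_position(employee: dict, new_position_id: int) -> None:
    print(f"reassigning {employee} to position {new_position_id}")


def flag_for_review(employee: dict, reason: str) -> None:
    print(f"flagging {employee} for review: {reason}")


def handle_job_position_removed(event: dict) -> None:
    removed_id = event["JobPositionId"]
    replacement_id = event.get("ReplacedByJobPosition")

    for employee in employees_with_position(removed_id):
        if replacement_id is not None:
            reassign_position(employee, replacement_id)
        else:
            flag_for_review(employee, reason="job position removed")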
As long as your event is delivered across a fault-tolerant transport (such as message queuing), you can be fairly confident that while you won't have referential integrity between your service capabilities, your system as a whole should become consistent eventually.
By using events in this way, you also avoid the need for a centralized registry of inter-dependencies (which sounds like a nasty idea). Each service is responsible for publishing events about changes to its own data, and dependencies are defined by services consuming events from each other.
Hope this is helpful.
EDIT
In answer to your comment: while I can see the benefit of having another service take care of the position-reassignment problem, and I don't see any massive problems with this, there are a few considerations.
One of the reasons why service boundaries and business capability boundaries are a natural fit is that when you change a business capability (eg a change in Billing procedure) it does not generally impact other business capabilities (CRM/Finance/etc). By introducing shared services you're coupled to more than one capability, your service doesn't have well defined boundaries, and as a result has a higher cost of ownership as it will need to be changed a lot.
Additionally you could argue that the consumer of a business event (eg, JobPositionRemoved) should take responsibility for the entire handling of that event.
The handling of the event may well trigger a subsequent event to be published (such as ReviewTaskCreatedForEmployeeChange) which can then be handled by another consumer (eg a workflow tool) if desired.