We have a "shared" layer that has a few resources accessed by different services in the project. There is a table storing shared information (user permission on each of the resources in the project, since it can get big so not being stored in JWT token)
Should we have a Lamba read the dynamoDB table and give other microservices access to the shared lambda only or should we give the microservices access to the table directly so that they can just use a lib method to read the permissions from the table? I am leaning towards direct DynamoDB table access since that avoids the extra hoop through a lambda.
Both approaches have advantages & disadvantages:
Direct Access to DynamoDB - Good Sides
The authors of the other Lambda functions can build at their own pace; faster teams can sprint without waiting for slower ones.
If one Lambda function misbehaves or fails, the other Lambdas remain decoupled from it and the blast radius is limited.
Direct Access to DynamoDB - Bad sides
The effort of writing similar access logic is duplicated across the different Lambda functions.
Each Lambda can implement its own logic and introduce differences between implementations. That may be intentional, but it could also mean that one developer misunderstood the requirements.
If the table's data gets poisoned by buggy code in one of the consuming Lambdas, the other Lambdas can go down with it.
It becomes hard to size the provisioned capacity; some of the Lambdas can easily become greedy with read units.
Mediating Lambda - Good Sides
Reduces the effort required to implement similar logic for different consumers
If the shared Lambda that manages the DynamoDB table also performs actions such as storing an audit trail, you can easily measure the required read and write capacity units.
Because it is decoupled from the consumers, failures can be contained within it.
Mediating Lambda - Bad Sides
This shared Lambda can easily become a single point of failure if the consuming Lambdas expect return values from it.
More communication is required between the team managing this Lambda and the consuming teams. Politics can easily creep in around this Lambda :D
If the consuming teams are developing at a much faster rate than the owner of this shared Lambda, it can easily become a blocker for them if the integration is done poorly.
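For reference, the "lib method" route the question leans towards is only a handful of lines. A rough sketch with the AWS SDK v3 DocumentClient; the table name, key schema and "actions" attribute are assumptions:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Table layout is an assumption: partition key "userId", sort key "resourceId",
// and an "actions" attribute listing the allowed operations.
export async function getPermissions(
  userId: string,
  resourceId: string
): Promise<string[]> {
  const { Item } = await ddb.send(
    new GetCommand({
      TableName: "shared-permissions",
      Key: { userId, resourceId },
    })
  );
  return (Item?.actions as string[]) ?? [];
}

The mediating-Lambda variant would hide the same lookup behind a Lambda Invoke call, trading an extra hop for a single place to change the logic.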
Related
I have been looking into AWS AppSync to create a managed GraphQL API with DynamoDB as the datastore. I know AppSync can use Apache Velocity Template Language (VTL) resolvers to fetch data from DynamoDB. However, that would mean introducing an extra language to the programming stack, so I would prefer to write the resolvers in JavaScript/Node.js.
Is there any downside of using a lambda function to fetch data from DynamoDB? What reasons are there to use VTL instead of a lambda for resolvers?
There are pros and cons to using lambda functions as your AppSync resolvers (although note you'll still need to invoke your lambdas from VTLs):
Pros
Easier to write and maintain
More powerful for marshalling and validating requests and responses
Common functionality can be more DRY than possible with VTLs (macros are not supported)
More flexible debugging and logging
Easier to test
Better tooling and linting available
If you need to support long integers in your DynamoDB table: DynamoDB number types do support longs, but AppSync resolvers only support 32-bit integers. You can get around this with a lambda, for example by serializing longs to strings before they pass through the AppSync resolver layer. See the (currently) open feature request: https://github.com/aws/aws-appsync-community/issues/21
Cons
Extra latency for every invocation
Cold starts = even more latency (although this can usually be minimised by keeping your lambdas warm if this is a problem for your use case)
Extra cost
Extra resources for each lambda, eating into the fixed 200-resource limit
If you're doing a simple vanilla DynamoDB operation it's worth giving VTLs a go. The docs from AWS are pretty good for this: https://docs.aws.amazon.com/appsync/latest/devguide/resolver-mapping-template-reference-dynamodb.html
If you're doing anything mildly complex, such as marshalling fields, looping, or generally hacky non-DRY code, then lambdas are definitely worth considering for the speed of writing and maintaining your code provided you're comfortable with the extra latency and cost.
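For what it's worth, a Lambda resolver for a simple field such as getItem(id: ID!) is only a few lines. A rough Node.js sketch, assuming a direct Lambda resolver (whose event carries the GraphQL arguments) and made-up table and attribute names:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Resolver for a hypothetical query getItem(id: ID!).
export const handler = async (event: { arguments: { id: string } }) => {
  const { Item } = await ddb.send(
    new GetCommand({ TableName: "items", Key: { id: event.arguments.id } })
  );
  if (!Item) return null;
  // Example of the marshalling flexibility mentioned above: serialize a long
  // numeric attribute to a string before it goes back through AppSync.
  return { ...Item, bigCounter: Item.bigCounter?.toString() };
};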
I'm using HTTPS-triggered Google Cloud Functions to handle client requests that perform database writes. The data is structured in a way that most parallel writes will not result in corruption.
There are a few cases, however, where I need to prevent multiple write actions from happening at once for the same item. What are the common patterns for locking access to a resource at the function level? I'm looking for some "mutex-like" functionality.
I was thinking of some external service that could grant or deny access to the resource for requesting function instances, but the connection overhead would be huge - handshake each time etc.
Added an example as requested. In this specific case, restructuring the data to keep track of updates isn't a suitable solution.
import * as admin from "firebase-admin";
function updateUserState(userId: string): Promise<void> {
  // Query the current state
  return admin
    .database()
    .ref()
    .child(`/users/${userId}/state`)
    .once("value")
    .then(snapshot => snapshot.val() || 0)
    // Perform some operation on the current state
    .then(currentState => modifyStateAsync(currentState))
    // Write the new state back
    .then(newState =>
      admin
        .database()
        .ref()
        .child(`/users/${userId}/state`)
        .set(newState)
    );
}
This is not a pattern that you want to implement in Cloud Functions. Restricting the parallelism of Cloud Functions would limit its scalability, which is counter to the way Cloud Functions works. To learn more about how Cloud Functions scales, watch this video.
If you have a database that needs to have some protection against concurrent access, you should be using the database's own transaction features. Pretty much every database that provides concurrent access to data also provides some ability to perform atomic transactions. Use these transactions, and let the serverless container scale up and down in the way it sees fit.
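Applied to the Realtime Database example above, that means replacing the read-then-write chain with transaction(), which re-runs your update function if another writer changed the value first. A minimal sketch, assuming a synchronous modifyState in place of modifyStateAsync (transaction update functions must be synchronous):

import * as admin from "firebase-admin";

// Hypothetical synchronous counterpart of modifyStateAsync.
declare function modifyState(currentState: number): number;

function updateUserState(userId: string): Promise<void> {
  const stateRef = admin.database().ref(`/users/${userId}/state`);
  // The update function receives the current value (null if none) and is
  // retried automatically by the SDK if a concurrent write lands first.
  return stateRef
    .transaction(currentState => modifyState(currentState || 0))
    .then(() => undefined);
}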
In the Google Cloud there is an elegant way to have a global distributed mutex for a critical section in a Cloud Function:
gcslock
This is a library written in Go (and hence usable from Cloud Functions written in Go) that utilises the atomicity guarantees of the Google Cloud Storage service. This approach is apparently not available on AWS because of the lack of such guarantees in the S3 service.
The tool is not applicable to every use case. Acquiring and releasing the lock are operations on the order of 10 ms, which might be too much for high-speed processing use cases.
For a typical batch process that is not time-critical, the tool provides a pretty interesting option for guaranteeing that your Cloud Function is not running concurrently over the same target resource. Just create the lock file in GCS with a name that is unique to the operation you'd like to put in the critical section, and release it once it's done (or rely on GCS object lifecycle management to clean the locks up).
Please see more considerations and pros and cons in the original tool GitHub project.
There is also apparently an implementation of the same in Python.
Here is a nice article that summarises use cases for distributed locking on GCP in particular.
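If your Cloud Functions are in Node.js rather than Go, the same trick can be sketched directly against the Storage client, assuming a recent @google-cloud/storage version that supports preconditionOpts. This is my own rough sketch, not part of gcslock; bucket and object names are made up:

import { Storage } from "@google-cloud/storage";

const storage = new Storage();
const lockFile = storage.bucket("my-lock-bucket").file("user-123.lock");

// Acquire: create the object only if it does not exist yet (generation 0).
// A concurrent creator gets HTTP 412 Precondition Failed and should back off and retry.
async function acquireLock(): Promise<boolean> {
  try {
    await lockFile.save("locked", { preconditionOpts: { ifGenerationMatch: 0 } });
    return true;
  } catch (err: any) {
    if (err.code === 412) return false; // someone else holds the lock
    throw err;
  }
}

// Release: delete the lock object (or let a GCS lifecycle rule expire stale locks).
async function releaseLock(): Promise<void> {
  await lockFile.delete({ ignoreNotFound: true });
}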
Customers (around 1000) sign up to my service and receive a customer-unique API key. They then use this key when calling an AWS Lambda function through AWS API Gateway to access data in DynamoDB.
Requirement 1: The customers get billed by the number of API calls, so I have to be able to count those. AWS only provides metrics for the total number of API calls per Lambda, so I have a few options:
1. At every API hit, increment a counter in DynamoDB.
2. At every API hit, enqueue a message in SQS, receive it in a "hit counter" Lambda and increment a counter in DynamoDB.
3. Deploy a separate Lambda for each customer and use the AWS built-in call counter.
Requirement 2: The data that the Lambda can access is unique to each customer and thus depends on the API key provided.
To enable this I also have a number of options:
1. Store the required API key together with the data that the customer has the right to access.
2. Deploy a separate Lambda for each customer and use API Gateway to protect it with a key.
3. Create a separate endpoint in API Gateway for each customer and protect it with the API key.
None of the options above seem like a good way to design the solution. Is there a canonical way of doing this? If not, which of the options above is the best? Have I missed an obvious solution due to my unfamiliarity with AWS?
I will try to break your problems down based on my experience, but Michael - Sqlbot or John Rotenstein may be able to give more authoritative answers.
Requirement 1
1) This sounds like a good approach. I don't see anything critical here.
2) This, IMHO, is the best out of the 3. It will decouple data access from the billing service, which is a great thing in a Microservices world.
3) This is not scalable. Imagine your system grows and you end up with 10K Lambda functions. Not only will you have to build a very reliable mechanism to automate this process, but you'll also need to monitor 10K different things (imagine CloudWatch logs, API Gateway, etc.), not to mention you'll have ten thousand functions with exactly the same code (client-specific parameters apart). I wouldn't even think about this one.
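As an aside on option 1: the increment itself can be a single atomic update with an ADD expression, so concurrent hits don't lose counts. A rough sketch with the AWS SDK v3 and made-up table and attribute names:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Called once per API hit; ADD is atomic, so parallel invocations never lose a count.
async function countApiHit(apiKey: string): Promise<void> {
  await ddb.send(
    new UpdateCommand({
      TableName: "api-usage",
      Key: { apiKey },
      UpdateExpression: "ADD #calls :one",
      ExpressionAttributeNames: { "#calls": "calls" },
      ExpressionAttributeValues: { ":one": 1 },
    })
  );
}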
Requirement 2
1) It could work, and it fits nicely into the DynamoDB way of doing things: store as much data as you can in a single table so you can fetch everything in one go. From what I see, you could even use the ApiKey as your partition key and, for the sake of simplicity in this answer, store the client's data as JSON in an attribute named data. Since you only need to query by the ApiKey, storing JSON in DynamoDB won't hurt (do keep in mind, however, that if you ever need to query by one of its JSON attributes, you're in trouble, since DynamoDB's query capabilities over nested data are very limited).
2) No, because of Requirement 1.3
3) No, because of the above.
If you still need to store the ApiKey in a different table so you can run separate analyses and keep finer-grained control over the client's calls, access, billing, etc., that's not a problem either; just make sure you duplicate the ApiKey in your ClientData table instead of creating a foreign key (DynamoDB doesn't support foreign keys, so you'd need to manage those constraints yourself). Duplication is just fine in a NoSQL world.
Your use case is clearly a Multi-Tenancy one, so I'd also recommend reading Multi-Tenant Storage with Amazon DynamoDB, which will give you some more insight and broaden your options a little. Multi-Tenancy is not an easy task and can give you lots of headaches if not implemented correctly. I think this is why AWS has also prepared this nice read for us :)
Happy to continue this on the comments section in case you have more info to share
Hope this helps!
I'm quite a newbie to microservices and Event Sourcing, and I'm trying to figure out a way to deploy a whole system on AWS.
As far as I know there are two ways to implement an Event-Driven architecture:
Using AWS Kinesis Data Stream
Using AWS SNS + SQS
So my base strategy is that every command is converted to an event, which is stored in DynamoDB, and DynamoDB Streams is used to notify the other microservices about the new event (a sketch of that piece follows below). But how? Which of the two solutions above should I use?
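For concreteness, here is a rough sketch of the notification piece: a Lambda triggered by the event-store table's DynamoDB Stream that fans new events out to an SNS topic (the second option); with Kinesis you would put records onto a stream instead. All names are made up:

import { DynamoDBStreamEvent } from "aws-lambda";
import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const sns = new SNSClient({});

export const handler = async (event: DynamoDBStreamEvent): Promise<void> => {
  for (const record of event.Records) {
    if (record.eventName !== "INSERT" || !record.dynamodb?.NewImage) continue;
    // Convert the DynamoDB-typed image back into a plain object.
    const storedEvent = unmarshall(record.dynamodb.NewImage as any);
    await sns.send(
      new PublishCommand({
        TopicArn: process.env.EVENTS_TOPIC_ARN,
        Message: JSON.stringify(storedEvent),
      })
    );
  }
};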
The first one (Kinesis Data Streams) has the advantages of:
Message ordering
At-least-once delivery
But the disadvantages are quite problematic:
No built-in autoscaling (you can achieve it using triggers)
No message visibility functionality (apparently; I'm asking to confirm that)
No topic subscription
Very strict limits on read transactions per shard: you can improve this by using multiple shards, but from what I read here you then need an ill-defined number of Lambdas with different invocation priorities and an ill-defined strategy to avoid duplicate processing across multiple instances of the same microservice.
The second one (SNS + SQS) has the advantages of:
Is completely managed
Very high TPS
Topic subscriptions
Message visibility functionality
Drawbacks:
SQS standard queues offer only best-effort ordering, and I still have no idea what that means.
It says "A standard queue makes a best effort to preserve the order of messages, but more than one copy of a message might be delivered out of order".
Does it mean that, given n copies of a message, the first copy is delivered in order while the others are delivered out of order relative to the other messages' copies? Or could "more than one" be "all"?
A very big thanks for every kind of advice!
I'm quite a newbie to microservices and Event Sourcing
Review Greg Young's talk Polyglot Data for more insight into what follows.
Sharing events across service boundaries has two basic approaches - a push model and a pull model. For subscribers that care about the ordering of events, a pull model is "simpler" to maintain.
The basic idea being that each subscriber tracks its own high water mark for how many events in a stream it has processed, and queries an ordered representation of the event list to get updates.
In AWS, you would normally get this representation by querying the authoritative service for the updated event list (the implementation of which could include paging). The service might provide the list of events by querying DynamoDB directly, or by getting the most recent key from DynamoDB and then looking up cached representations of the events in S3.
In this approach, the "events" that are being pushed out of the system are really just notifications, allowing the subscribers to reduce the latency between the write into Dynamo and their own read.
I would normally reach for SNS (fan-out) for broadcasting notifications. Consumers that need bookkeeping support for which notifications they have handled would use SQS. But the primary channel for communicating the ordered events is pull.
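A pull-based subscriber along those lines is not much code. The sketch below is purely illustrative: the /events endpoint, its query parameter and the event shape are hypothetical, and the high-water mark would normally be persisted rather than kept in memory:

// Requires Node 18+ for the global fetch.
interface StoredEvent {
  position: number;
  type: string;
  payload: unknown;
}

let highWaterMark = 0; // in practice, persisted per subscriber (e.g. in DynamoDB)

async function pullNewEvents(streamId: string): Promise<void> {
  // Ask the owning service for everything after the last processed position.
  const res = await fetch(
    `https://orders.example.com/streams/${streamId}/events?after=${highWaterMark}`
  );
  const events = (await res.json()) as StoredEvent[];

  for (const event of events.sort((a, b) => a.position - b.position)) {
    await handleEvent(event);        // subscriber-specific processing
    highWaterMark = event.position;  // advance only after successful handling
  }
}

async function handleEvent(event: StoredEvent): Promise<void> {
  // ... project the event into the subscriber's own read model ...
}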
I myself haven't looked hard at Kinesis - there's some general discussion in earlier questions -- but I think Kevin Sookocheff is onto something when he writes
...if you dig a little deeper you will find that Kinesis is well suited for a very particular use case, and if your application doesn’t fit this use case, Kinesis may be a lot more trouble than it’s worth.
Kinesis’ primary use case is collecting, storing and processing real-time continuous data streams. Data streams are data that are generated continuously by thousands of data sources, which typically send in the data records simultaneously, and in small sizes (order of Kilobytes).
Another thing: the fact that I'm accessing data from another microservice stream is an anti-pattern, isn't it?
Well, part of the point of dividing a system into microservices is to reduce the coupling between the capabilities of the system. Accessing data across the microservice boundaries increases the coupling. So there's some tension there.
But basically if I'm using a pull model I need to read data from other microservices' stream. Is it avoidable?
If you query the service you need for the information, rather than digging it out of the stream yourself, you reduce the coupling -- much like asking a service for data rather than reaching into an RDBMS and querying the tables yourself.
If you can avoid sharing the information between services at all, then you get even less coupling.
(Naive example: order fulfillment needs to know when an order has been paid for; so it needs a correlation id when the payment is made, but it doesn't need any of the other billing details.)
I'm trying to deploy an API suite using API Gateway, implementing the code in Java on Lambda. Is it OK to have many (related, of course) Lambdas in a single jar (which is what I'm planning to do), or is it better to create a separate jar for each Lambda I want to deploy? (That could become a mess very easily.)
This is really a matter of taste but there are a few things you have to consider.
First of all, there are limits on how big a single Lambda upload can be (50 MB at the time of writing).
Second, there is also a limit on the total size of all the code that you upload (currently 1.5 GB).
These limitations may not be a problem for your use case but are good to be aware of.
The next thing you have to consider is where you want your overhead.
Let's say you deploy a CRUD interface to a single Lambda and you pass an "action" parameter from API Gateway so that you know which operation you want to perform when you execute the Lambda function.
This adds a slight overhead to your execution, as you have to route the action to the appropriate operation. The routing is likely very fast, but it nevertheless adds CPU cycles to your function execution.
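You mention Java, but the routing shape is the same in any runtime; here is a small illustrative sketch in Node/TypeScript with made-up operation names:

// Single-Lambda "action routing": API Gateway passes an action name and one
// handler dispatches to the right operation. Operation names are made up.
type Operation = (payload: unknown) => Promise<unknown>;

const operations: Record<string, Operation> = {
  createItem: async _payload => ({ created: true /* ... */ }),
  getItem: async _payload => ({ item: null /* ... */ }),
  deleteItem: async _payload => ({ deleted: true /* ... */ }),
};

export const handler = async (event: { action: string; payload?: unknown }) => {
  const operation = operations[event.action];
  if (!operation) {
    throw new Error(`Unknown action: ${event.action}`);
  }
  // The routing itself is just a map lookup, so the CPU overhead mentioned above is tiny.
  return operation(event.payload);
};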
On the other hand, deploying the same jar to several Lambda functions will quickly get you closer to the limits I mentioned earlier, and it also adds administrative overhead in managing your Lambda functions as their number grows. They can of course be managed via CloudFormation or CLI scripts, but that still adds administrative overhead.
I wouldn't say there is a right and a wrong way to do this. Look at what you are trying to do, think about what you would need to manage the deployment and take it from there. If you get it wrong you can always start over with another approach.
Personally, I like very small service Lambdas that do internal routing and handle more than just a single operation, but are still small and focused on a specific type of task, be it CRUD for a database table or managing a select few closely related operations.
There's some nice advice on serverless.com
As polythene says, the answer is "it depends". But they've listed the pros and cons of 4 ways of going about it:
Microservices Pattern
Services Pattern
Monolithic Pattern
Graph Pattern
https://serverless.com/blog/serverless-architecture-code-patterns/