AWS Serverless application load time with the Spring framework

I am building a web application in AWS using the serverless architecture.
The purpose of the application is to expose a public API to upload files from around the world.
I use AWS API-Gateway and Lambda to execute my code and S3 as storage.
I know that it is possible and well supported (even by third parties such as the Serverless framework) to use the Java Spring framework for the code that I deploy in my Lambda function.
However, is it really recommended? Spring applications usually take 30 seconds or more to load completely, and a Lambda should run immediately.
Why is this option even supported by AWS, since it sounds like a very bad idea?

Java is one of the programming languages AWS Lambda supports. You can run a Java application there; you just have to take the warm-up time into consideration. If that fits your use case, use it. You could also use SNS and a hook to your Lambda to keep it warm while you are not receiving requests.
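As a rough illustration of the keep-warm idea, the handler can return early when it sees a warm-up ping, so those invocations stay cheap. This is a minimal sketch in Python (for brevity; the same shape applies in Java). The "warmup" key, the handle_upload helper and the event layout are assumptions made for the example, not something AWS defines, and a real SNS notification wraps its message in a Records list.

```python
# Hypothetical marker: we assume the scheduled "keep warm" message
# arrives as an event of the form {"warmup": true}.
WARMUP_KEY = "warmup"


def handle_upload(event):
    """Placeholder for the real work (e.g. storing the file in S3)."""
    return {"statusCode": 200, "body": "uploaded " + event.get("filename", "unknown")}


def handler(event, context):
    # Short-circuit keep-warm pings before doing any real work.
    if isinstance(event, dict) and event.get(WARMUP_KEY):
        return {"statusCode": 200, "body": "warm"}
    return handle_upload(event)
```

The ping only keeps an already-initialized instance alive; the first request after a deployment or a scale-out still pays the full start-up cost.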

Using Java with AWS Lambda is perfectly fine, but Lambdas are functions, not applications!
So you should avoid using a framework like Spring, because you don't need it.
The question is what you want to achieve in your function, and why you need a framework to execute such a small amount of code.
What's your use case?

Personally, I would AVOID using the Java runtime for AWS Lambda as much as possible. I understand that it's very tempting to use Java if you are looking into migrating an existing implementation to microservices. But you are always going to pay the penalty of a slow warm-up time compared to other runtimes. You may also miss out on Java compiler optimisations, as the Lambda may not be invoked often enough to trigger C1 and C2 compilation.
My preference would be to use Java for Lambda only if you are planning to write a lean implementation, meaning no Spring, Hibernate, etc.

Related

Best practices for modularizing AWS Lambda code

I have been checking some resources on the internet, and all the AWS Lambda examples are very basic. I am not sure how to modularize an application with multiple dependencies. For example, in Java we usually have a structure like this:
packages
repository
controllers
..
..
We place the code for each piece of logic inside its package, but in AWS it seems to be more like scripting that glues pieces together than the OOP I am used to. So my question is how to handle these relationships (if that applies at all), because I have seen code where all the logic is in one Lambda, and that does not seem the best way to go. For example, suppose some functionality first authenticates, authorizes, transforms, calls an external API, gets the response and then calls a final REST endpoint. How can we split this? Will it be a single Lambda with packages (directories) inside that call each other, or multiple Lambdas, each with one purpose? And will that cause a cold start for each Lambda?
I was thinking of using layers, but they seem very new and I am not sure whether the feature is production ready. They also seem to be more about reusing code that is common across the whole environment than a way to modularize our code.
Generally when you're developing Lambda functions, the function should have a single purpose (which will keep the function relatively small).
If you have multiple actions, making each one its own Lambda function improves the development and deployment experience. Having a single developer work on a function reduces the risk of breaking unrelated functionality, whilst also allowing them to deploy only the function they've worked on.
To orchestrate between Lambdas for APIs, people tend to use API Gateway (be that for your clients communicating with the Lambdas, or between the Lambdas themselves).
For shared dependencies and libraries, Lambda Layers, as you mentioned, are the correct way to go. They let you centralise the dependencies your applications share, rather than packaging each Lambda with its own copy of those dependencies.
There's an article on Best Practices for Developing on AWS Lambda that should offer additional guidance.
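For the case where the steps from the question (authenticate, authorize, transform) do stay in one function, a thin handler can delegate to small, separately testable functions that would normally live in their own modules or packages. The following is only a sketch, shown in Python and in one file for brevity; every name in it (ALLOWED_ACTIONS, the token and payload fields, the step functions) is hypothetical.

```python
# Each step would normally live in its own module/package; they are kept
# in one file here only to make the sketch self-contained.
ALLOWED_ACTIONS = {"read", "write"}  # hypothetical permission list


def authenticate(request):
    # Assumption: the caller passes a "token" field; a real check goes here.
    if not request.get("token"):
        raise PermissionError("missing token")
    return {**request, "user": "user-for-" + request["token"]}


def authorize(request):
    if request.get("action") not in ALLOWED_ACTIONS:
        raise PermissionError("action not allowed")
    return request


def transform(request):
    # Stand-in for whatever reshaping the downstream API needs.
    return {**request, "payload": str(request.get("payload", "")).strip().upper()}


PIPELINE = [authenticate, authorize, transform]


def handler(event, context):
    request = dict(event)
    try:
        for step in PIPELINE:
            request = step(request)
    except PermissionError as err:
        return {"statusCode": 403, "body": str(err)}
    return {"statusCode": 200, "body": request["payload"]}
```

Because all steps run inside one function, only one cold start is involved; splitting them into separate Lambdas trades that for independent deployment.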

Is it ok to use API instead of SDK?

I like fast code execution (which is why I switched from Python to Go) and I do not like dependencies. Amazon recommends using the SDK for simpler authentication (but in Lambda I can get the IAM credentials from environment variables) and for the retry-on-error logic built into the SDK (a few lines of code, I think). Yes, it is faster to write my code using the SDK, but what other caveats are there to using the plain HTTP API instead of the SDK? Am I too obsessed with milliseconds? Are such optimizations worth it?
Anything you do with AWS is the result of an API call, whether executed by CLI, Web console, or SDK.
The SDKs make it easier to interact with those APIs. While you may be able to come up with some minor improvements for some calls, overall you will spend a lot of time doing it to very little benefit.
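To give a sense of what "a few lines of code" for retries looks like when written by hand, here is a minimal sketch of retry with exponential backoff and jitter. It is written in Python for brevity and is not how any AWS SDK implements it; call_with_retries, its parameters, and the choice of ConnectionError as the retryable error are assumptions for the example.

```python
import random
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with exponential backoff.

    `sleep` is injectable so the function can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt; random jitter spreads out retries.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

The loop itself is short, but a hand-rolled client also has to decide which errors are retryable and to sign every request (AWS APIs use Signature Version 4), which is where most of the maintenance effort tends to go.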
I think the stated focus on performance belies real trade-offs.
Consider that someone will have to maintain your code. If you use the API, the test area is small, but AWS APIs might change or be deprecated; if you use an SDK, the next programmer will plug in a new SDK version and hope that it works, but if it doesn't, they'll be bogged down by the sheer weight of the SDK.
Likewise, imagine someone needs to do a security review of this app, or to introduce something not yet covered by SDK (let's imagine propagating accounting group from caller role to underlying storage).
I don't think there is a clear answer.
Here are my suggestions:
keep it consistent -- either API or SDK (within given app)
consider the bigger picture (how many apps do you plan to write?)
don't be afraid to switch to the other approach later
I've had to decide on something similar in the past, with Docker (much nicer APIs and SDKs/libs). Here's how it played out:
For testing, we ended up using a beta version of the Docker Python bindings: the prod version was not enough, and the bindings (your SDK) were overall pretty good and clear.
For log scraping, I used HTTP calls (your API), nominally "because performance", but in reality because the mental load of the API was comparable to that of the SDK, and because the bindings (SDK) did not support asyncio.

How to run a ktor application inside AWS lambda?

I can't find a way to run a ktor application inside an AWS Lambda.
That is, instead of starting an embedded server or using an external server as described in http://ktor.io/servers/engine.html, I just need to "execute" the pipeline.
I suppose this is more or less like the TestEngine, but I am not familiar enough with the ktor framework to be sure.
Note:
I have already found examples to run one kotlin function per lambda (the best tutorial IMHO being https://aws.amazon.com/fr/blogs/machine-learning/use-amazon-rekognition-to-build-an-end-to-end-serverless-photo-recognition-system/).
The problem is that I don't want to manage one Lambda per function (I want one microservice per Lambda, the microservice being responsible for multiple tightly coupled operations).
After digging a lot more into AWS Lambda and the serverless world in general, I've found that hosting a ktor application is not what Lambda (or, more generally, function as a service) is meant for.
That is, I wanted to use ktor to group multiple functions into a logical service and to do the "routing" inside this group.
To achieve that in the FaaS world, you must declare one HTTP endpoint for each function.
As this is very tedious to maintain manually, you can use the Serverless framework with a proper serverless.yml file.
I had this revelation when reading https://github.com/ajurasz/ascii-less-gallery, which is a perfect follow-up to the article I mentioned in my initial question.

How to build complex apps with AWS Lambda and SOA?

We currently run a Java backend which we're hoping to move away from and switch to Node running on AWS Lambda & Serverless.
Ideally, during this process we want to build out a fully service-oriented architecture.
My question: if our frontend Angular app requests the current user's ordered items, it would need to hit three services to get that information: the user service, the order service and the item service.
Does this mean we would need to make three GET requests to these services? At the moment we would have a single endpoint built for that specific request, which can then take advantage of DB joins for optimal performance.
I understand the benefits of SOA, but how do we scale when performing more complex requests such as this? Are there any good resources I can take a look at?
Looking at your question I would advise to align your priorities first: why do you want to move away from the Java backend that you're running on now? Which problems do you want to overcome?
You're combining the microservices architecture and the concept of serverless infrastructure in your question. Both can be used in conjunction, but they don't have to. A lot of companies are using microservices, even bigger enterprises like Uber (on NodeJS), but serverless infrastructures like Lambda are really just getting started. I would advise you to read up on microservices especially, e.g. here are some nice articles. You'll also find answers to your question about performance and joins.
When considering an architecture based on Lambda, keep in mind that a Lambda function cannot rely on any state surviving between invocations. This is a step further than the stateless services we usually talk about; those generally target 'client state', which no longer exists. A Lambda function, however, cannot count on anything held in memory being there on the next call, so, for example, a persistent DB connection pool cannot be relied upon. For all the downsides, there's also a lot of stuff you don't have to deal with, which can be very beneficial, especially in terms of scalability.

Api Gateway, multiple lambda in the same JAR

I'm trying to deploy an API suite using API Gateway, with the code implemented in Java on Lambda. Is it OK to have many (related, of course) Lambdas in a single jar, which is what I'm planning to do, or is it better to create a separate jar for each Lambda I want to deploy? (That would become a mess very easily.)
This is really a matter of taste but there are a few things you have to consider.
First of all there are limitations to how big a single Lambda upload can be (50MB at time of writing).
Second, there is also a limit to the total size of all code that you upload (1.5GB at time of writing).
These limitations may not be a problem for your use case but are good to be aware of.
The next thing you have to consider is where you want your overhead.
Let's say you deploy a CRUD interface to a single Lambda and you pass an "action" parameter from API Gateway so that you know which operation you want to perform when you execute the Lambda function.
This adds a slight overhead to your execution as you have to route the action to the appropriate operation. This is likely a very fast routing but nevertheless, it adds CPU cycles to your function execution.
On the other hand, deploying the same jar to several Lambda functions will quickly get you closer to the limits I mentioned earlier, and it also adds administrative overhead in managing your Lambda functions as their number grows. They can of course be managed via CloudFormation or CLI scripts, but that is still administrative overhead.
I wouldn't say there is a right and a wrong way to do this. Look at what you are trying to do, think about what you would need to manage the deployment and take it from there. If you get it wrong you can always start over with another approach.
Personally, I like very small service Lambdas that do internal routing and handle more than a single operation, but are still very small and focused on a specific type of task, be it CRUD for a database table or managing a select few very closely related operations.
There's some nice advice on serverless.com
As polythene says, the answer is "it depends". But they've listed the pros and cons of four ways of going about it:
Microservices Pattern
Services Pattern
Monolithic Pattern
Graph Pattern
https://serverless.com/blog/serverless-architecture-code-patterns/