Can I deploy a multi-class Java jar in AWS Lambda, or is a single class file always recommended for Lambda?

I have an existing Spring Boot application: not a web service, but a Kafka client app. The issue is that it is structured with the typical Processor -> Service -> DAO layering. The jar is above 50 MB, so it's not a candidate for AWS Lambda anyway. I have some doubts: can I deploy the full jar, or should I use Step Functions? All the tutorials show a single-class function. Has anyone tried this (a multi-class jar)? Also, Lambda has now introduced Docker container images, which adds more confusion: can I deploy a Docker image? It looks like it's the same under the hood.
My pick is ECS/EKS with Fargate. Basically, I am planning to get rid of the Docker image as well. But it looks like there is no way to host my existing app in Lambda other than refactoring it as a Step Function. Is that correct?

You can deploy the full fat jar with the usual multi-class hierarchy, but it is not recommended due to the cold-start issue unless you use "provisioned concurrency".
Here are my tips for you:
Keep the multi-class hierarchy, which doesn't have much impact on the jar size anyway. This keeps your code testable (a rough sketch of the layering follows these tips). Try to remove Spring if possible, and either write your own small dependency-injection mechanism or use a lightweight framework for that purpose.
Review all your dependencies and remove jars that are not needed. Our own code is usually very small; it is the dependency jars that make the deployable huge.
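As a rough illustration (sketched in Python for brevity, though the same layering applies equally to a Java jar), the handler can stay thin and delegate to layered classes packaged in the same artifact. All class and module names below are hypothetical:

```python
# handler.py: Lambda entry point that delegates to a layered structure.
# OrderService and OrderDao are hypothetical names standing in for the
# Processor -> Service -> DAO layering described in the question.

class OrderDao:
    def save(self, order: dict) -> dict:
        # Persist the order (e.g. to a database); stubbed out here.
        return {**order, "persisted": True}


class OrderService:
    def __init__(self, dao: OrderDao):
        self.dao = dao

    def process(self, order: dict) -> dict:
        # Business logic lives here, not in the handler.
        return self.dao.save(order)


# Wire the object graph once, outside the handler, so it is reused
# across warm invocations; this is the lightweight substitute for a
# dependency-injection container.
_service = OrderService(OrderDao())


def lambda_handler(event, context):
    # The handler stays a thin "processor" layer over the service.
    return _service.process(event)
```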

Where does IaC belong in web/API development projects?

The Problem
Suppose I have a simple CRUD web application. The application is containerized, developed locally, and set up with main/staging/develop branches on GitHub. CI/CD is configured with GitHub Actions. Merges to main trigger a deployment to AWS App Runner.
Generally, we need three main AWS services here: CloudFormation, ECR, and App Runner.
Where does the IaC with AWS CDK belong?
Approaches
Have a separate repository for the IaC. Run this repository once to set up the App Runner service and a dedicated ECR repository. Treat the ECR repository URI as an environment variable in the application repository. On merge to main in the application repository, GitHub Actions rebuilds and pushes the image to the ECR repo. The App Runner service detects the new image and redeploys per this documentation.
Pros: Separation of responsibilities between repositories: one for infrastructure, one for application code. The CDK only runs once. Deployments are far simpler and easier to diagnose.
Cons: Significant manual overhead. More work to change infrastructure.
AWS CDK supports declaring an App Runner service with a local Docker image per this documentation. Simply create an AWS CDK project directly in the application repository; upon merge to main, it re-runs the CDK with the new image (see the sketch after the pros and cons below).
Pros: One repository. Almost entirely automated with no manual overhead from infrastructure/DevOps team.
Cons: Developers may have to worry about IaC. Potential compute overhead with constant re-runs of CDK.
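For reference, a minimal sketch of approach 2 in Python using the App Runner CDK module might look like the following. The apprunner module is a separate alpha package, so its API may change, and the "./app" directory and resource names are assumptions:

```python
# Sketch of approach 2: App Runner service built from a local Dockerfile.
from aws_cdk import App, Stack
from aws_cdk.aws_ecr_assets import DockerImageAsset
from aws_cdk import aws_apprunner_alpha as apprunner  # experimental package

class WebServiceStack(Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        # Builds the local Dockerfile and pushes it to a CDK-managed ECR repo.
        image = DockerImageAsset(self, "AppImage", directory="./app")

        # Each `cdk deploy` that produces a new image digest redeploys the service.
        apprunner.Service(
            self, "Service",
            source=apprunner.Source.from_asset(
                asset=image,
                image_configuration=apprunner.ImageConfiguration(port=8000),
            ),
        )

app = App()
WebServiceStack(app, "WebServiceStack")
app.synth()
```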
The Actual Questions
Which approach is best for websites/APIs that might rely on multiple backend services?
Which one fits best in a development culture that relies heavily on microservices? Is there another approach I'm not thinking of? Am I asking the wrong questions?
I personally prefer approach 2 because I hate manual overhead.
I have less experience in microservices, so I was hoping some people with more industry experience could present some insight.
If this is the wrong place to ask this question or if I need to be more specific, please comment below and I'll adjust accordingly.
Have a separate repository for the IaC
The most compelling reason to do this is to decouple CI/CD for IaC from the app repo. For example, if the application codebase is not continuously delivered, and the IaC is committed to the same repo as the codebase, you end up in a scenario where the IaC is versioned along with the code. If you deploy an old version, or a branch, does that version or branch get its IaC from the same ref in the repo as the code itself? If so, you're in a position where you have to merge IaC changes across branches to make them deployable, which is a huge headache.
Most development teams as of 2022 do want to continuously deliver their code and "fail forward" rather than rolling back to old versions. In this scenario it doesn't really matter, because the main branch of the code is also the main branch of the infrastructure. But in this case, rolling back to an old version of the code inherently means rolling back to the matching version of the infrastructure, so you can't do it without looking very carefully at what infrastructure changes were made between the two versions and whether it's safe to roll them back or not.
On the other hand, if the IaC repo is separate from the code repo, the IaC can be made to accommodate multiple versions of the code. Dependencies still exist: new features in the app that require new infrastructure are inherently dependent on the infrastructure-as-code changes, and you don't have the shared repo to make sure those dependencies are deployed before the app.
It usually comes down to a question of ownership. If the infrastructure is primarily managed by a distinct group, then putting the infra in a separate repository makes a lot of sense, because commingling infrastructure changes with code changes makes it hard for these groups to operate independently. Pushing out an infrastructure change from a different repo is essentially an isolated step. Pushing out an infrastructure change from the same repo requires merging a PR into the codebase and deploying that. If the infra change is the only thing being deployed, that's pretty straightforward, but if the CI branch of the codebase is in a messy state then the infrastructure becomes undeployable because the code is undeployable. If the infrastructure is owned by the team whose job it is to also keep the code repo clean and deployable, then splitting the repos apart doesn't do much good.
Having spent the last 12 years or so doing DevOps, I'm pretty attracted to putting IaC in a separate repo for messy applications whose teams struggle with continuous delivery. That way when I want to make infrastructure changes, I can consider them in relative isolation and can deploy them to all environments regardless of which version of the code is deployed there. It really sucks to be trying to migrate database hosting, for example, if you need to work with the product team to get your IaC into each version of the code deployed into each environment. But it's not a free lunch - I still have to make sure to coordinate dependencies between the infra and the code, of course.
The smaller the service, the more the development team also handles IaC, and the more disciplined the development team's approach to CI/CD, the less it matters. If the same code goes out to dev/prod anyway, and code merge and deploy is a frequent, comfortable thing, then the IaC may as well be in the app codebase. But you have to be ready to be limited to a fail-forward, continuously integrated approach, and accept that infrastructure and code deployments are coupled at the repo layer.
Most microservice dev teams tend to own their own IaC, continuously deliver their application, and put their IaC in the same repo as their code.

Lambda Function as a Zip with Layer as a Docker Image?

My entire Lambda architecture is built around the .zip package type. I create a separate layer.zip for dependencies that are shared across functions, but it recently crossed the 250 MB limit.
I'm not keen on moving my Lambda functions to Docker container images.
But I can happily containerize my layers.
Is there any way to create/use a Docker container image as a layer and attach it to a .zip Lambda function?
Does this serve as a solution to get past the 250 MB layer limit?
Look forward to any feedback and resources! :)
Short answer: no.
Longer answer: they're two different ways of implementing your Lambda. In the "traditional" style, your code is deployed into an AWS-managed runtime that invokes your functions. In the "container" style, your container must implement that runtime.
While I think that 250 MB is an indication that Lambda is the wrong technology choice, you can work around it by downloading additional content into /tmp when your Lambda first starts.
This is easier for languages such as Python, where you can update sys.path on the fly. It's much less easy (but still doable) for languages like Java, where you'd need a "wrapper" function that then creates a classpath for the downloaded JARs.
It's also easier if the bulk of your Lambda is a static resource, like an ML model, rather than code.
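A minimal sketch of the Python variant of that pattern, with hypothetical bucket, key, and package names:

```python
# Sketch: fetch a zip of extra dependencies into /tmp on cold start and
# put it on sys.path. The bucket, key, and package names are hypothetical.
import os
import sys
import zipfile

import boto3

EXTRA_DEPS = "/tmp/deps"

def _load_extra_deps():
    # Module-level code runs once per execution environment (cold start),
    # so the download cost is paid up front, not on every invocation.
    if not os.path.isdir(EXTRA_DEPS):
        s3 = boto3.client("s3")
        s3.download_file("my-artifacts-bucket", "deps/extra-packages.zip",
                         "/tmp/extra-packages.zip")
        with zipfile.ZipFile("/tmp/extra-packages.zip") as zf:
            zf.extractall(EXTRA_DEPS)
    if EXTRA_DEPS not in sys.path:
        sys.path.insert(0, EXTRA_DEPS)

_load_extra_deps()

def lambda_handler(event, context):
    import some_big_package  # hypothetical dependency shipped in the zip
    return some_big_package.handle(event)
```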

How to use AWS CLI to create a stack from scratch?

The problem
I'm approaching AWS, and the first test project will be a website, but I'm struggling with how to approach the resources and the tools to accomplish this.
The AWS documentation is not really beginner-friendly, so to me it is like being punched in the face at your first boxing session.
First attempt
I've installed both the AWS and SAM CLI tools. What I would expect is to be able to create an empty stack at first and add resources one by one as the specifications are given/outlined. Instead, what I see is that I need to give the tool a template to create the new stack, which means I need to know how to write it beforehand, and therefore the template specification for each resource type.
Second attempt
This led me to create the stack and the related resources from the online console to get the final stack template. But then I need to test every new or updated resource locally, so I have to copy the template from the online console to my machine and run the CLI tools against it, which is obviously not the desired development flow.
What I expected
Coming from standard/classical web development, I would expect to be able to create the project locally, test the related resources locally, version it, and delegate the deployment to the pipeline.
So what?
All this made me understand that I'm "probably" missing something about how to use the AWS CLI tools and how development for an AWS-hosted application is meant to be done.
I'm not looking for a guide on specific resource types, like every single tutorial I've found online, but something at a higher level about how to handle project development on AWS, best practices and things like that; I can then dig deeper into any resource later when needed.
AWS's Cloud Development Kit ticks the boxes on your specific criteria.
Caveat: the CDK has a learning curve in line with its power and flexibility. There are much easier ways to deploy a web app on AWS, like the higher-level AWS Amplify framework, with abstractions tailored to front-end devs who want to minimise the mental energy spent on the underlying infrastructure.
Each of the squillion AWS and 3rd Party deploy tools is great for somebody. Nevertheless, looking at your explicit requirements in "What I expected", we can get close to the CDK as an objective answer:
Coming from a standard/classical web development
So you know JS/Python. With the CDK, you code infrastructure as functions and classes rather than 500 lines of YAML as with SAM. The CDK's reference implementation is in TypeScript; JS/Python are also supported. There are step-by-step AWS online workshops for these and the other supported languages.
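As an illustration, here is a small CDK stack in Python with hypothetical resource names; the infrastructure is just an ordinary class:

```python
# A hypothetical stack: a bucket and a function declared as ordinary
# Python objects instead of YAML resources.
from aws_cdk import App, Stack
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_s3 as s3

class WebsiteStack(Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        bucket = s3.Bucket(self, "AssetsBucket")

        handler = lambda_.Function(
            self, "ApiHandler",
            runtime=lambda_.Runtime.PYTHON_3_12,
            handler="app.lambda_handler",
            code=lambda_.Code.from_asset("lambda_src"),  # local source dir
        )
        # Grant methods generate the IAM policy for you.
        bucket.grant_read(handler)

app = App()
WebsiteStack(app, "WebsiteStack")
app.synth()
```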
create the project locally
Most of your work will be done locally in your language of choice, with a cdk deploy CLI command to bundle the deployment artefacts and send them up to the cloud.
test the related resources locally
The CDK has built-in testing and assertion support.
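For example, with the aws_cdk.assertions module a unit test can synthesize a stack and assert on the generated CloudFormation. This sketch reuses the hypothetical WebsiteStack from the snippet above:

```python
# Unit test sketch using aws_cdk.assertions against the hypothetical
# WebsiteStack defined earlier.
from aws_cdk import App
from aws_cdk.assertions import Template

from infra.website_stack import WebsiteStack  # hypothetical module path

def test_website_stack():
    app = App()
    stack = WebsiteStack(app, "TestStack")
    template = Template.from_stack(stack)

    # Assert on the synthesized CloudFormation, not on live resources.
    template.has_resource_properties("AWS::Lambda::Function", {
        "Runtime": "python3.12",
    })
    template.resource_count_is("AWS::S3::Bucket", 1)
```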
version it
"Deterministic deploy" is a CDK design goal. Commit your code and the generated deployment artefacts so you have change control over your infrastructure.
delegate the deployment to the pipeline
The CDK has good pipeline support: i.e. a push to the remote main branch can kick off a deploy.
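A sketch of that with CDK Pipelines, assuming a GitHub source wired up through a CodeStar connection; the repo name, branch, and connection ARN are placeholders:

```python
# CDK Pipelines sketch: a push to main re-synthesizes and deploys.
from aws_cdk import App, Stack
from aws_cdk.pipelines import CodePipeline, CodePipelineSource, ShellStep

class PipelineStack(Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        CodePipeline(
            self, "Pipeline",
            synth=ShellStep(
                "Synth",
                input=CodePipelineSource.connection(
                    "owner/repo", "main",  # placeholder repo and branch
                    connection_arn="arn:aws:codestar-connections:eu-west-1:111111111111:connection/placeholder",
                ),
                commands=[
                    "npm install -g aws-cdk",
                    "pip install -r requirements.txt",
                    "cdk synth",
                ],
            ),
        )

app = App()
PipelineStack(app, "PipelineStack")
app.synth()
```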
AWS SAM is actually a good option if you are just trying to get your feet wet with AWS. SAM is an open-source wrapper around the AWS CLI that lets you create AWS resources like Lambda in, say, ~10 lines of code vs ~100 lines if you were to use the AWS CLI directly. Yes, you'll need to learn SAM-specific things like the SAM template and the SAM CLI, but it is pretty straightforward using this doc.
Once you get the hang of it, it becomes easier to look under the hood at what SAM is doing and how, and to get into the weeds with the AWS CLI if you want to, which will then let you build custom solutions (using the AWS CLI) for complex use cases that SAM may not support. Caveat: SAM is still pretty new and has open issues that could be a blocker for advanced features/complex use cases.

Using cloud functions vs cloud run as webhook for dialogflow

I don't know much about web development and cloud computing. From what I've read, when using Cloud Functions as the webhook service for Dialogflow, you are limited to writing code in just one source file. I would like to create a really complex Dialogflow agent, so it would be handy to have an organized code structure to make development easier.
I've recently discovered Cloud Run, which seems like it can also handle webhook requests and makes it possible to develop a complex code structure.
I don't want to use Cloud Run just because it is inconvenient to write everything in one file, but on the other hand it would be strange to have a Cloud Function consisting of a single file with thousands of lines of code.
Is it possible to have multiple files in a single Cloud Function?
Is Cloud Run suitable for my problem (creating a complex Dialogflow agent)?
Is it possible to have multiple files in a single cloud function?
Yes. When you deploy to Google Cloud Functions you create a bundle with all your source files or have it pull from a source repository.
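For example, a Python Cloud Function can import from sibling modules included in the same deployment bundle; the file names and layout here are hypothetical:

```python
# main.py: the entry point Cloud Functions looks for (Python runtime).
import functions_framework

from fulfillment import intents  # sibling module deployed in the same bundle

@functions_framework.http
def webhook(request):
    body = request.get_json(silent=True) or {}
    return intents.dispatch(body)

# --- fulfillment/intents.py, a second file in the same function ---
# def dispatch(body):
#     intent = body.get("queryResult", {}).get("intent", {}).get("displayName", "")
#     return {"fulfillmentText": f"Handled intent: {intent}"}
```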
But Dialogflow only allows index.js and package.json in the Built-In Editor
For simplicity, the built-in code editor only allows you to edit those two files. But the built-in editor is mostly just meant for basic testing. If you're doing serious coding, you probably already have an environment you prefer to use to code and deploy that code.
Is Cloud Run suitable?
Certainly. The biggest thing Cloud Run will get you is complete control over your runtime environment, since you're specifying the details of that environment in addition to the code.
The biggest downside, however, is that you also have to determine the details of that environment. Cloud Functions provides an HTTPS server without you having to worry about those details, as long as the rest of the environment is suitable.
What other options do I have?
Anywhere you want! Dialogflow only requires that your webhook:
Be at a public address (i.e., one that Google can resolve and reach)
Run an HTTPS server at that address with a non-self-signed certificate
During testing, it is common to run it on your own machine via a tunnel such as ngrok, but this isn't a good idea in production. If you're already familiar with running an HTTPS server in another environment, and you wish to continue using that environment, you should be fine.
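As an illustration, here is a minimal webhook in Python with Flask; the request/response shapes assume the Dialogflow ES fulfillment format, and during testing you would put it behind a tunnel like ngrok so Dialogflow sees a public HTTPS endpoint:

```python
# Minimal Dialogflow ES-style webhook sketch in Flask.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    body = request.get_json(silent=True) or {}
    # Dialogflow ES sends the matched intent under queryResult.
    intent = body.get("queryResult", {}).get("intent", {}).get("displayName", "")
    return jsonify({"fulfillmentText": f"You triggered: {intent}"})

if __name__ == "__main__":
    app.run(port=8080)  # e.g. expose with `ngrok http 8080` while testing
```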

Best way to test and deploy AWS Lambda functions in a Step Function

Long-time Stack Overflow lurker and first-time poster.
I've started a new project using AWS Lambda and have found the learning curve particularly steep, coming from a background of developing desktop applications.
When developing desktop applications it's easy to create a test environment locally. I know it's possible to test lambda functions locally and I've been able to do this for simple cases.
The Lambda functions I'm using interact a lot with other AWS services (S3, Aurora, etc.). Also, the final solution will include around 15 Lambda functions linked via a Step Function.
I want to know if it's possible to create a test environment, separate from the live production environment, for the entire Step Function. This would allow me to perform system tests before deploying to production.
I've looked into AWS CodePipeline as a possible solution, but I'm not sure if it would allow me to create a separate test environment before deploying to production.
Any help would be greatly appreciated.
Thanks!