AWS application consistent snapshots of EC2 instances - amazon-web-services

I'm currently setting up a small Lambda to take snapshots of all the important volumes of our EC2 instances. To guarantee application consistency I need to trigger actions inside the instances: One to quiesce the application before the snapshot and one to wake it up again after the snapshot is done. So far I have no clue how to do this.
I've thought about using SNS or SQS to notify the instances about start and stop of the snapshot, but that has several problems:
I'll need to install (and develop) a custom listener inside the instances.
I'll not get feedback if the quiescing/wake-up is done.
So here's my question: How can I trigger an action inside an instance from an Lambda?
But maybe I'm approaching this from the wrong direction. Is there really no simple backup solution? I know azure has a snapshot based backup service that can do application consitent backups. Did I just miss an equivalent AWS service?
Edit 1:
Ok, it looks like the feature 'Run Command' of AWS Systems Manager is what I really need. It allows me to run scripts, Ansible playbooks and more inside an EC2 instance. When I've got a working solution I'll post the necessary steps.

You can trigger a Lambda function on demand:
Using AWS Lambda with Amazon API Gateway (On-Demand Over HTTPS)
You can invoke AWS Lambda functions over HTTPS. You can do this by
defining a custom REST API and endpoint using Amazon API Gateway, and
then mapping individual methods, such as GET and PUT, to specific
Lambda functions. Alternatively, you could add a special method named
ANY to map all supported methods (GET, POST, PATCH, DELETE) to your
Lambda function. When you send an HTTPS request to the API endpoint,
the Amazon API Gateway service invokes the corresponding Lambda
function. For more information about the ANY method, see Step 3:
Create a Simple Microservice using Lambda and API Gateway.

Related

How does AWS Lambda + AWS Websocket API work under the hood?

I know it invokes different Lambda instances for different routes (like connect, disconnect, default, etc) on the Websocket API. But what happens for different messages on the same route, does it keep the Lambda instance running for new messages until disconnect?
Let's say, I am building a login form with 2FA. I take username, password and process it, and then I want the 2FA code from client. Can I do this with a single Lambda instance?
As commenter deceze wrote:
You can never assume that a single Lambda instance will process a request.
The point of serverless is that you don not manage the servers. Amazon does. And they can and will start new instances of your Lambda, terminate existing instances etc.
So if you need "cross invocation persistence", you need to solve this in a different way. One common way is to use DynamoDB or depending on the use cases ElastiCache, S3, EFS etc.

How would I create a Minecraft EC2 server that automaticaly starts when someone tries to use it

Currently, I have a working modded Minecraft server working running on a C5 EC2 instance. The problem is that I have to manually start and stop the server which can be annoying for my friends. I was wondering if it would be possible to automate the EC2 state so that it runs as soon as a player attempts to join the sever. This would be similar to how Minecraft realms behaves which I heard Mojang was using AWS for:
https://aws.amazon.com/blogs/aws/hosting-minecraft-realms-on-aws/
I have looked up tutorials for this and this is the best I could come across:
https://github.com/trevor-laher/OnDemandMinecraft
The problem with this solution is that it requires to make a separate website to log users in and start the EC2 instance while I want the startup and shutdown to be completely automatic.
I would appreciate any guidance.
If the server is off, it would not be possible to "connect" to the server. Therefore, another mechanism is required that can be used to start the server.
Combine that with your desire to minimise costs and the only real solution is to somehow trigger an AWS Lambda function, which could start the server.
There are a few ways you could have users trigger the AWS Lambda function:
Make a call to API Gateway
Upload an object to Amazon S3
Somehow put a message in an SNS topic or an SQS queue
Trigger an Amazon CloudWatch Alarm (which calls Lambda via SNS)
...and probably other ways
When considering a method to use, you should consider security implications such as:
Whether only authorized users should be able to trigger the Lambda function, or is it okay that anybody (eg a web crawler) might trigger it.
Whether you willing to give your friends AWS credentials (not a good idea) that they could use to start the server directly, or whether it should be an indirect method.
Frankly, I would recommend the following architecture:
Create an AWS Lambda function that turns on the server
Create an API Gateway that triggers the Lambda function
Give a URL to your friends that calls the API Gateway and passes a 'secret' (effectively a password)
The API Gateway will call the Lambda function, passing the secret
The Lambda function confirms that the secret is correct and starts the Amazon EC2 instance with Minecraft installed
Here is a tutorial that shows many of these concepts: Build an API Gateway API with Lambda Integration
The purpose of the secret is to avoid the server from starting if an unauthorized person (or a bot) happens to hit the API Gateway endpoint. They will not provide the secret, so the server will not be started.
Stopping the server after a period of non-use is a different matter. The library you referenced might be able to assist with finding a way to do this. You could have a script running on the Minecraft server that monitors the game and, after a period of inactivity, simply calls the operating system to perform a Shutdown.
You could use a BungeeCord hub server that then allows user to begin a connection to the main server and spin it up via. AWS.
This would require the bungee server to be always up however, but the cost of hosting a small bungee server should be relatively cheap.
I don't think there's any way you could do this without having a small server that receives the initial request to spin up the AWS machine.

Idea and guidelines on end to end AWS solution

I want to build an end to end automated system which consists of the following steps:
Getting data from source to landing bucket AWS S3 using AWS Lambda
Running some transformation job using AWS Lambda and storing in processed bucket of AWS S3
Running Redshift copy command using AWS Lambda to push the transformed/processed data from AWS S3 to AWS Redshift
From the above points, I've completed pulling data, transforming data and running manual copy command from a Redshift using a SQL query tool.
Doubts:
I've heard AWS CloudWatch can be used to schedule/automate things but never worked on it. So, if I want to achieve the steps above in a streamlined fashion, how to go about it?
Should I use Lambda to trigger copy and insert statements? Or are there better AWS services to do the same?
Any other suggestion on other AWS Services and of the likes are most welcome.
Constraint: Want as many tasks as possible to be serverless (except for semantic layer, Redshift).
CloudWatch:
Your options here are either to use CloudWatch Alarms or Events.
With alarms, you can respond to any metric of your system (eg CPU utilization, Disk IOPS, count of Lambda invocations etc) when it crosses some threshold, and when this alarm is triggered, invoke a lambda function (or send SNS notification etc) to perform a task.
With events you can use either a cron expression or some AWS service event (eg EC2 instance state change, SNS notification etc) to then trigger another service (eg Lambda), so you could for example run some kind of clean-up operation via lambda on a regular schedule, or create a snapshot of an EBS volume when its instance is shut down.
Lambda itself is a very powerful tool, and should allow you to program a decent copy/insert function in a language you are familiar with. AWS has several GitHub repos with lots of examples too, see for example the serverless examples and many samples. There may be other services which could work for you in your specific case, but part of Lambda's power is its flexibility.

Kafka + AWS lambda

Is it possible to integrate AWS Lambda with Apache Kafka ?
I want to put a consumer in a lambda function. When a consumer receive a message the lambda function execute.
Continuing the point by Arafat. We have successfully built an infrastructure to consume from Kafka using AWS Lambdas. Here are some gotcha's:
Make sure to consistently batch and commit while reading when consuming.
If you are storing the batches to s3, make sure to clean your file descriptors.
If you are forwarding the batches to another service make sure to clean the variables. Variable caching in AWS Lambda might result in memory overflows.
A good idea is to check how much time you have left while from the context object in the Lambda and give yourself some wiggle room to do something with the buffer you populated in your consumer which might not be read to a file unless you call close().
We are using Apache Airflow for scheduling. I hear cloudwatch can do that too.
Here is AWS article on scheduled lambdas.
Given your Kafka installation will be running in a VPC, best practise is to configure your Lambda to run within the VPC as well - this will simplify the security group configuration for the EC2 instances running Kafka.
Here is the AWS blog article on configuring Lambdas to run in a VPC.
Yes it is very much possible to have a Kafka consumer in AWS Lambda function.
However note that you would not be able to invoke the lambda using some sort of notification. You will rather have to poll the Kafka topic. And the easiest way can be to use a Scheduled Lambda
If you are using managed apache kafka in AWS (MSK):
Since august 2020 you can connect AWS Managed Streaming for Kafka (MSK) as event source. Not your own installed kafka cluster but if you already uses AWS managed kafka this could be useful.
More in the announcement https://aws.amazon.com/about-aws/whats-new/2020/08/aws-lambda-now-supports-amazon-managed-streaming-for-apache-kafka-as-an-event-source/
Screenshot from AWS Console:
AWS now supports "self-hosted Apache Kafka as an event source for AWS Lambda"
When you create a new Lambda, in the "Configuration" tab, click "Add trigger", you can now select and configure your self-hosted Apache Kafka.
Feel free to read more here:
https://aws.amazon.com/blogs/compute/using-self-hosted-apache-kafka-as-an-event-source-for-aws-lambda/
https://docs.aws.amazon.com/lambda/latest/dg/kafka-smaa.html
There is a community-provided Kafka Connector for AWS Lambda. This solution would require you to run the connector somewhere such as EC2 or ECS.

How can I expose the status of an Amazon Lambda function in my web app?

I'm hoping to use Amazon Lambda to run some background tasks for my web app. These particular tasks will only need to run once for the app (not once per user), so I'd like any user to see in the UI if a task is already running, and I'd like to disable the UI that allows them to start that task again.
Does Lambda offer a way to check the status of a function to see if it is running? If not, what is the best way to persist this info to my web app? Am I taking the wrong approach here altogether?
Lambda functions are supposed to be stateless and keeping functions stateless enables AWS Lambda to rapidly launch as many copies of the function as needed to scale to the rate of incoming events. While AWS Lambda’s programming model is stateless, your code can access stateful data by calling other web services, such as Amazon S3 or Amazon DynamoDB.