Best place in AWS for a long-running subscription background service

I am currently trying to set up a system in AWS that utilises Event Sourcing & CQRS. I've got everything working on the Command side, and it stores the events in Aurora. I'm using SqlEventStore as my event store, and it has a Subscription mechanism that listens for new events and then fires a function appropriately.
So far it's all set up in Lambda, but I can't have the subscription in Lambda as Lambdas aren't always running, so my first thought was running this side in a Docker container on Fargate. From my reading, though, this seems to need to be fired by a task, rather than sit in the container on a subscription.
So my question really is: where is the best place to have a long-running process in AWS that just sits listening for things to happen, rather than responding to a prod from something like a Lambda?

So my question is really, where is best to have a long running process
in AWS that just sits listening for things to happen, rather than
responding to a prod from something like a Lambda.
I would suggest going with an ECS container on Fargate or EC2. With Fargate you do not need to manage servers; it is similar to Lambda but better suited to such a long-running process.
This seems to need to be fired by a task, rather than sit in the
container on a subscription.
No, you can run Fargate in two ways:
Running as a long-running service
Firing a task based on a CloudWatch event or a schedule (perform the task and terminate)
AWS Fargate now supports the ability to run tasks on a regular,
scheduled basis and in response to CloudWatch Events. This makes it
easier to launch and stop container services that you need to run only
at certain times.
AWS Fargate
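To illustrate the two modes, here's a rough boto3 sketch; the cluster, task definition, subnet, and security group names are placeholders for whatever you have set up:

    import boto3

    ecs = boto3.client("ecs")

    # Fargate tasks need VPC networking; these IDs are placeholders.
    network = {
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "ENABLED",
        }
    }

    # Mode 1: a long-running service. ECS keeps one copy of the task alive,
    # so the subscription loop can sit and listen indefinitely.
    ecs.create_service(
        cluster="my-cluster",
        serviceName="event-subscriber",
        taskDefinition="my-task-def",
        desiredCount=1,
        launchType="FARGATE",
        networkConfiguration=network,
    )

    # Mode 2: a one-off task. Run it on demand (or point a CloudWatch Events
    # schedule at the task definition); it performs the work and terminates.
    ecs.run_task(
        cluster="my-cluster",
        taskDefinition="my-task-def",
        launchType="FARGATE",
        networkConfiguration=network,
    )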
Where is best to have a long-running process in AWS that just sits
listening for things to happen, rather than responding to an event from something like a Lambda
If your task is supposed to run for a long time then Lambda is not for you; there is always a timeout in the case of Lambda.
If you do not want to manage servers and the process is supposed to run for a long time, then Fargate is for you; it's fine for it to sit there listening for events.

You can also explore AWS Glue Python Shell for long-running services.
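If you go that route, a Glue Python Shell job is started with a single API call; this is just a sketch, and the job name is a placeholder for a job you would have to define first:

    import boto3

    glue = boto3.client("glue")

    # Kick off a pre-defined Glue Python Shell job (name is a placeholder).
    glue.start_job_run(JobName="my-python-shell-job")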

Related

AWS ECS: Is there a way for code inside a task container to set some custom status for outside services to read?

I am building a file processing service in AWS and here is what I have now as a manager-worker architecture:
A Node.js application running in an EC2 instance, serving as a manager node; in this EC2 instance there is also a RabbitMQ service hosting a job queue
An ECS service running multiple task containers, which also run Node.js code. The code in every task container runs some custom business logic for processing a job. The task containers get their jobs from the above RabbitMQ job queue. When jobs are enqueued in the RabbitMQ queue, they are assigned to the ECS task containers and the containers start processing them.
Now, this ECS service should scale up or down. When there are no jobs in the queue (which happens very frequently), I just want to keep one worker container alive so that I can save on costs.
When there is a large number of jobs arriving at the manager and enqueue into the job queue, the manager has to figure out how to scale up.
It needs to figure out how many new worker containers to add to the ECS service. And to do this, it needs to know:
the number of task containers in the ECS service now;
the status of each container: is it currently processing a job?
This second point leads to my question: is there a way to set a custom status on the task, such that this status can be read by the application in the EC2 instance through some AWS ECS API?
As others have noted in the comments, there isn't any built-in AWS method to do this. I have two suggestions that I hope can accomplish what you want to do:
Create a Lambda function that runs on a regular interval and calls your RabbitMQ API to check the queue length. It can then use the ECS API to set the desired task count for your service. You can have as much control as you want over the thresholds and scaling strategy in your code.
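A rough sketch of what that Lambda might look like, assuming the RabbitMQ management API is reachable from the function; the URL, credentials, queue, cluster, and service names are all placeholders, and the scaling rule is just an example:

    import json
    import urllib.request

    import boto3

    ecs = boto3.client("ecs")

    def handler(event, context):
        # Ask the RabbitMQ management API how deep the job queue is.
        req = urllib.request.Request(
            "http://rabbitmq.internal:15672/api/queues/%2F/jobs",
            headers={"Authorization": "Basic <base64 user:pass>"},
        )
        with urllib.request.urlopen(req) as resp:
            queue_depth = json.load(resp)["messages"]

        # Example strategy: one worker task per 10 queued jobs,
        # always at least 1, never more than 10.
        desired = max(1, min(10, queue_depth // 10 + 1))

        ecs.update_service(
            cluster="my-cluster",
            service="worker-service",
            desiredCount=desired,
        )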
Consider using AWS Batch. The compute backend for Batch is also ECS based, so it might not be such a big change. Long-running jobs where you want to scale the processing up and down are its sweet spot. If you want, you can queue the work directly in Batch and skip Rabbit. Or, if you still need to use Rabbit, you could create a smaller job in Lambda or anywhere else that pulls the messages out and creates an AWS Batch job for each. Batch supports running on EC2 ECS clusters, but it can also use Fargate, so it could simplify your management even further.
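For the Batch route, that bridge job would roughly do something like this for each message it pulls off Rabbit; the job queue and job definition names are placeholders for resources you'd create in Batch first:

    import boto3

    batch = boto3.client("batch")

    # Submit one Batch job per unit of work taken from the queue.
    batch.submit_job(
        jobName="process-file-123",           # placeholder
        jobQueue="file-processing-queue",     # placeholder
        jobDefinition="file-processor:1",     # placeholder
        containerOverrides={
            "environment": [
                {"name": "JOB_PAYLOAD", "value": "s3://bucket/key"},  # placeholder
            ]
        },
    )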

AWS "Lambda" Needing More CPU/RAM - Trigger EC2 "Job?"

I have a daily process that needs to digest a tremendous amount of data from two external sources. It normally requires around 28 GB of RAM and a decent amount of processing power. Due to this, an AWS Lambda won't work.
In the meantime, I've been running the process on an EC2 instance. In order to save resources, I've attempted to start the instance using a CloudWatch event. Since no event exists for "StartEC2," I'm kicking off an AWS Lambda instead, which in turn starts the EC2 instance using Amazon support libraries.
All of this is extremely cumbersome, and I've been looking for a library or pattern that can do what I want. Essentially, I need to start an EC2 instance on a cron/event, deliver a unit of work to it (shell script, Java app, whatever), have it run, then shut down.
I'd love any suggestions for accomplishing this.
Look into AWS Systems Manager (SSM). You can create an Automation document that will launch the instance, run any custom scripts or tasks, and shut it down again when you're done. You can trigger the SSM Automation on a cron schedule via CloudWatch Events.
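Once the Automation document exists, starting it is a single API call, which is also what the scheduled CloudWatch Events rule ends up doing; the document name and parameter here are placeholders for whatever you define:

    import boto3

    ssm = boto3.client("ssm")

    # Start the custom Automation document that launches the instance,
    # runs the job, and shuts the instance down again.
    ssm.start_automation_execution(
        DocumentName="Custom-RunDailyDigest",                 # placeholder
        Parameters={"InstanceId": ["i-0123456789abcdef0"]},   # placeholder
    )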
You may also want to consider AWS Batch for this type of workload.

Scheduling the stopping/starting of an EC2 instance when not in use by a Beanstalk Deployment or an ECS task?

I have a Docker image containing Python code and third-party binary executables. There are only outbound network requests. The image must run hourly and each execution lasts ~3 minutes.
I can:
Use an EC2 instance and schedule hourly execution via cron
Create a CloudWatch Event/Rule to run an ECS Task Definition hourly
Set up an Elastic Beanstalk environment and schedule hourly deployment of the image
In all of these scenarios, an EC2 instance is running 24/7 and I am being charged for extended periods of no usage.
How do I accomplish scheduling the starting of an existing EC2 instance hourly and the stopping of said instance once my Docker container completes?
Here's one approach I can think of. It's very high-level, and omits some details, but conceptually it would work just fine. You'll also need to consider the Identity & Access Management (IAM) Roles used:
CloudWatch Event Rule to trigger the Step Function
AWS Step Function to trigger the Lambda function
AWS Lambda function to start up EC2 instances
EC2 instance polling the Step Functions service for Activity Tasks
Create a CloudWatch Event Rule to schedule a periodic task, using a cron expression
The Target of the CloudWatch Event Rule is an AWS Step Function
The AWS Step Function State Machine starts by triggering an AWS Lambda function, which starts the EC2 instance
The next step in the Step Functions State Machine invokes an Activity Task, representing the Docker container that needs to execute
The EC2 instance has a script running on it, which polls the Activity Task for work (a sketch of this script follows the list)
The EC2 instance executes the Docker container, waits for it to finish, and sends a completion message to the Step Functions Activity Task
The script running on the EC2 instance shuts itself down
The AWS Step Function ends
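A rough sketch of the polling script from the steps above, assuming a Step Functions Activity already exists; the activity ARN, worker name, and image are placeholders:

    import subprocess

    import boto3

    sfn = boto3.client("stepfunctions")

    ACTIVITY_ARN = "arn:aws:states:us-east-1:123456789012:activity:docker-job"  # placeholder

    # Long-poll the Activity for work; taskToken is empty if nothing arrived.
    task = sfn.get_activity_task(activityArn=ACTIVITY_ARN, workerName="worker-1")

    if task.get("taskToken"):
        try:
            # Run the container and wait for it to finish.
            subprocess.run(["docker", "run", "--rm", "my-image:latest"], check=True)
            sfn.send_task_success(taskToken=task["taskToken"], output="{}")
        except subprocess.CalledProcessError as err:
            sfn.send_task_failure(taskToken=task["taskToken"],
                                  error="ContainerFailed", cause=str(err))

    # Finally, have the instance shut itself down so it isn't billed while idle.
    subprocess.run(["sudo", "shutdown", "-h", "now"])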
Keep in mind that a potentially better option would be to spin up a new EC2 instance every hour, instead of simply starting and stopping the same instance. Although you might get better startup performance by starting an existing instance vs. launching a new one, you'll also have to spend time maintaining the EC2 instance like a pet: fixing issues if they crop up, or patching the operating system periodically. In today's world, it's a commonly accepted practice that infrastructure should be disposable. After all, you've already packaged up your application into a Docker container, so you most likely don't have overly specific expectations about which host that container actually executes on.
Another option would be to use AWS Fargate, which is designed to run Docker containers, without worrying about spinning up and managing container infrastructure.
AWS Step Functions
AWS Fargate
Blog: AWS Fargate: An Overview
Creating a CloudWatch Event Rule that triggers on a schedule

PHP AWS Elastic Beanstalk background workers

I have deployed my application using Elastic Beanstalk, since this gives me a very easy deployment flow to multiple instances at once, using "git aws.push".
I would like to add background processing support to my application. The background worker will use the same codebase and simply start up a long-lived PHP script that continuously looks for tasks to execute. What AWS service should I use to create such a worker instance?
Should I use EB for this as well, or should I try to set up a standard EC2 instance (since I don't need it to be publicly available)? I guess that's the right way of doing it, and then create a deployment flow that makes it easy to deploy both to my EC2 worker instances and to the Elastic Beanstalk app? Or is there a better way of doing this?
AWS EB now adds support for Worker environments. They're just a different kind of environment, with these two differences:
They don't have a cnamePrefix (whatever.elasticbeanstalk.com)
Instead, they've got an SQS queue bound to them
On each instance, they run a daemon called sqsd which basically polls the environment's SQS queue and forwards each message to the local HTTP server.
I believe it is worth a try.
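To make that concrete: sqsd POSTs each queue message to a local HTTP endpoint and treats a 200 response as "done, delete the message". A minimal sketch of such an endpoint in Python (the same shape works for a PHP script); the port and path must match the worker environment's HTTP settings:

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/", methods=["POST"])
    def handle_task():
        # sqsd delivers the SQS message body as the POST body.
        payload = request.get_data(as_text=True)
        process_task(payload)   # your existing task logic (placeholder)
        return "", 200          # 200 tells sqsd the message can be deleted

    def process_task(payload):
        print("processing:", payload)

    if __name__ == "__main__":
        app.run(host="127.0.0.1", port=80)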
If the worker is just polling a queue for jobs and does not require an ELB, then all you need to do is work with EC2, SQS, and probably S3. You can start EC2 instances as part of an Auto Scaling group that, for example, is configured to scale as a function of the depth of the SQS queue. When there is no work to do you can keep the minimum number of EC2 instances, but if the queue gets deep, Auto Scaling will spin up more.
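A rough boto3 sketch of wiring that up; the group name, queue name, and threshold are placeholders, and a matching scale-in policy and alarm would be set up the same way:

    import boto3

    autoscaling = boto3.client("autoscaling")
    cloudwatch = boto3.client("cloudwatch")

    # Simple scaling policy: add one instance when the alarm fires.
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName="worker-asg",         # placeholder
        PolicyName="scale-out-on-queue-depth",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=1,
        Cooldown=300,
    )

    # Fire the policy when more than 10 messages are visible in the queue.
    cloudwatch.put_metric_alarm(
        AlarmName="worker-queue-backlog",
        Namespace="AWS/SQS",
        MetricName="ApproximateNumberOfMessagesVisible",
        Dimensions=[{"Name": "QueueName", "Value": "worker-jobs"}],  # placeholder
        Statistic="Average",
        Period=300,
        EvaluationPeriods=1,
        Threshold=10,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[policy["PolicyARN"]],
    )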

Using AWS SQS for Job Queuing but minimizing "workers" uptime

I am designing my first Amazon AWS project and I could use some help with the queue processing.
This service accepts processing jobs, either via an ASP.net Web API service or a GUI web site (which just calls the API). Each job has one or more files associated with it and some rules about the type of job. I want to queue each job as it comes in, presumably using AWS SQS. The jobs will then be processed by a "worker" which is a python script with a .Net wrapper. The python script is an existing batch processor that cannot be altered/customized for AWS, hence the wrapper in .Net that manages the AWS portions and passing in the correct params to python.
The issue is that we will not have a huge number of jobs, but each job is somewhat compute intensive. One of the reasons to go to AWS was to minimize infrastructure costs. I plan on having the frontend web site (Web API + ASP.net MVC4 site) run on Elastic Beanstalk. But I would prefer not to have a dedicated worker machine always online polling for jobs, since these workers need to be somewhat "beefier" instances (for processing) and it would cost us a lot for them to mostly sit doing nothing.
Is there a way to only run the web portion on Beanstalk and then have the worker process only spin up if there are items in the queue? I realize I could have a micro "controller" instance always online polling and then have it control the compute spin-up, but even that seems like it shouldn't be needed. Can EC2 instances be started based on a non-zero SQS queue size? So basically the web API adds a job to the queue, something watches the queue and sees it's non-zero, this triggers the EC2 worker to start, it spins up and polls the queue on startup. It processes the queue until it is empty, then something triggers it to shut down.
You can use Auto Scaling in conjunction with SQS to dynamically start and stop EC2 instances. There is an AWS blog post that describes the architecture you are thinking of.