I've read the docs about Services and in particular Worker Service. I am confused about when to use the Worker Service.
Say my worker consumes messages from SNS+SQS which arrive once every three days, but once they arrive it may take 15-20 minutes to process all of them (one message takes no more than 60 seconds to process). So my worker is idle most of the time. Does it mean I need to pay for my worker service even when it is idle? Lambdas seem to be more suitable in this case, but the Worker Service does not use Lambdas.
Note: Spots do not work since the jobs are NOT fault-tolerant.
Does it mean I need to pay for my worker service even when it is idle?
Yes. ECS is charged based on the amount of CPU/RAM you have reserved. You are reserving physical resources for your container, and you will be paying for them regardless of how much use you make of them. Besides, your container won't be entirely idle: it will have to be constantly polling SQS.
If you are deploying to EC2 targets then this may not be a big deal, since you would simply be running another container on an EC2 instance you are already paying for, but if you are deploying to Fargate then it is definitely an added expense.
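To put a number on "idle for most of the time", here is a rough back-of-the-envelope calculation using only the figures from the question (a burst about every three days, up to about 20 minutes of work per burst). The function name is made up for illustration, and no AWS prices are assumed.

```python
# Figures taken from the question: a burst roughly every three days,
# taking up to about 20 minutes to work through.
MINUTES_BETWEEN_BURSTS = 3 * 24 * 60  # 4320 minutes
BUSY_MINUTES_PER_BURST = 20


def busy_fraction(busy_minutes: float, period_minutes: float) -> float:
    """Fraction of wall-clock time the worker spends doing real work."""
    return busy_minutes / period_minutes


fraction = busy_fraction(BUSY_MINUTES_PER_BURST, MINUTES_BETWEEN_BURSTS)
print(f"busy {fraction:.2%} of the time, idle {1 - fraction:.2%}")
# prints "busy 0.46% of the time, idle 99.54%"
```

So with an always-on container, well over 99% of the reserved CPU/RAM time would be paid for while doing nothing but polling.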
Lambdas seem to be more suitable in this case but the worker service does not use lambdas.
Then I suggest deploying this portion of your application as a Lambda function. Just because you are using Copilot to deploy some containers to AWS doesn't mean you should limit yourself to only using the services supported by Copilot.
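As a minimal sketch of what that Lambda function might look like, assuming the SQS queue is configured as the function's event source and the SNS subscription does not use raw message delivery (so each SQS body is a JSON envelope with a "Message" field). The process function is a hypothetical placeholder for your real work. Returning batchItemFailures only has an effect if ReportBatchItemFailures is enabled on the event source mapping.

```python
import json


def process(message: str) -> None:
    """Hypothetical placeholder for the real per-message work."""
    if not message:
        raise ValueError("empty message")


def handler(event, context):
    """Process each SQS record and report only the failed ones for retry."""
    failures = []
    for record in event.get("Records", []):
        try:
            # SNS wraps the payload in a JSON envelope with a "Message" field
            # (assumption: raw message delivery is NOT enabled).
            envelope = json.loads(record["body"])
            process(envelope["Message"])
        except Exception:
            # Only this message is returned to the queue, not the whole batch.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Since one message takes no more than 60 seconds, a small batch size should keep each invocation well inside Lambda's 15 minute limit.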
Related
I have a (golang web server) service running on AWS on an EC2 instance (no auto scaling). This service has a few cron jobs that run throughout the day, and these jobs start when the service starts.
I would like to take advantage of auto scaling in some form on AWS. Been looking at ECS and Beanstalk.
When I add auto scaling I need the cron job to only execute on one of the scaled services due to rate limits on external APIs. Right now the cron job is tightly coupled within the service and I am looking for an option that does not require moving the cron job to its own service.
How can I achieve this in a good way using AWS?
You're going to get this problem as a general issue in any scalable application where crons cannot / should not run multiple times. It's not really AWS specific. I'm not sure to what extent you want to keep things coupled or how your crons are currently run but here are a few suggestions that might work for you:
Create a "cron runner" instance with a limit to run crons on
You could create a separate ECS service which has no autoscaling and a fixed value of 1 instance. This instance would run the same copy of your code as your "normal" instances and would run crons. You would turn crons off on your "normal" instances. You might find that this can be a very small instance since it doesn't handle any web traffic.
Create a "cron trigger" instance which fires off crons remotely
Here you create one "trigger" instance which sends a request to your normal instances through an ALB. Because your ALB will route the request to 1 of the servers behind it the cron only gets run once. One watch out with this is that if your cron is long running, you may need to consider your request timeouts. You'll also have to think about retries etc but I assume you already have a process that can be adapted for that.
The above solutions can be adapted with message queues etc but the basis of both is that there is another instance of some kind which starts the cron and is separate from your normal servers. Depending on when your cron runs, you may only need to run this cron instance for a few hours per day so it can be cost efficient to do things like this.
Personally I have used both methods in a multi-tenant application and I had to go with the option of running the cron like this due to the number of tenants and the time / resource it took to run the crons for all of them at once:
Cloudwatch schedule triggers a lambda which sends a message to SQS to queue a cron for each tenant individually.
Cron servers (totally separate from the main web servers but running the same / similar code) pull messages and run the cron for each tenant individually. For crons which must only run once, they store a key in Redis so that SQS's "at least once" delivery can't cause a cron to run twice.
This can also help handle failures with retry policies and deadletter queues managed in SQS.
Ultimately you need to kick off these crons from one place. If possible, change up your crons so it doesn't matter if they run twice. It makes it easier to deal with retries and things like that.
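The "store a key in Redis" guard could look roughly like the sketch below. It assumes store is a redis-py style client (its set method accepts nx and ex), and the function name and key format are made up for illustration. Note the trade-off: if the job crashes after the key is claimed, it will not be retried until the key expires.

```python
from typing import Callable


def run_once(store, key: str, ttl_seconds: int, job: Callable[[], None]) -> bool:
    """Run job only if nobody has claimed key yet. Returns True if the job ran."""
    # SET with nx=True succeeds only for the first caller; ex= makes the key
    # expire so a stale claim does not block future runs forever.
    claimed = store.set(key, "1", nx=True, ex=ttl_seconds)
    if not claimed:
        return False  # another worker (or a duplicate delivery) got there first
    job()
    return True
```

A worker would call something like run_once(redis_client, "cron:tenant-42:2024-01-01", 3600, job) for each message it pulls, so a duplicate delivery of the same message becomes a no-op.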
I have an Elastic Beanstalk setup where I want to do two things:
Have all workers prioritize certain jobs (premium > free)
Have some workers only do specific jobs (enterprise worker does only enterprise jobs)
The workers use the SQS daemon that fetches from the queue and I'm not sure if and how to modify them.
How would you achieve this using Elastic Beanstalk?
The main EB advantage is that it is an out-of-the-box system you can set up in minutes. The disadvantage is that you have limited control over it.
What you described could be achieved on the worker environment. I think you could disable the worker daemon and handle all the message processing yourself in your app according to your criteria.
You could also create multiple queues if you want by using the Resources setup options.
However, the further you deviate from its default behavior, the more management you will have to do yourself. Eventually you may get to the point where it is simply easier to make your own environment for processing your messages outside of EB.
With SQS this is usually accomplished by having multiple queues. You could have one for Enterprise, one for Premium, and one for Free. Then have your workers check them in that order (and, depending on your application, perhaps have some workers that only check Enterprise, Premium, or Free; this may depend on how long your jobs take and what your users' expectations are).
I do not know exactly how to set this up in Elastic Beanstalk, but hopefully this is enough to get you started.
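For what it's worth, the "check the queues in priority order" idea could be sketched like this if you do the polling yourself instead of relying on the EB daemon. It assumes sqs is a boto3 SQS client (only receive_message is used); the function name and queue URLs are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def next_job(sqs, queue_urls: Sequence[str]) -> Optional[Tuple[str, dict]]:
    """Return (queue_url, message) from the first non-empty queue, else None."""
    for url in queue_urls:
        # Short poll (WaitTimeSeconds=0) so an empty high-priority queue
        # does not delay checking the lower-priority ones.
        response = sqs.receive_message(
            QueueUrl=url, MaxNumberOfMessages=1, WaitTimeSeconds=0
        )
        messages = response.get("Messages", [])
        if messages:
            return url, messages[0]
    return None
```

A general worker would pass the Enterprise, Premium and Free queue URLs in that order, while an enterprise-only worker would pass just the Enterprise URL. The caller still has to delete the message once the job has succeeded.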
I am able to successfully deploy a Django Celery worker as a Docker container in an AWS ECS service using Fargate for compute.
But my concern is that the Celery container is running 24/7. If I could run the container only when a task is assigned, I could save a lot of money given how AWS Fargate is billed.
Celery isn't really the right thing to use because it's designed to persist, but the goal should be reasonably easy to achieve.
Architecturally, you probably want to run a script on a Fargate task. The script chews through the queue and then dies. You'd trigger that task somehow:
An API call from your data receiver (e.g. Django)
A lambda function (triggered by what?)
Still some open questions... do you limit yourself to one task at a time or do you need to manage concurrent requests to the queue? Do you retry? But a plausible place to start.
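A script that "chews through the queue and then dies" might look roughly like this. This is only a sketch and it assumes the queue is SQS (the original setup uses a Celery broker, so that is a swap), with sqs being a boto3 SQS client. The function name, the handle callback and the wait time are made up for illustration.

```python
from typing import Callable


def drain_queue(sqs, queue_url: str, handle: Callable[[str], None],
                wait_seconds: int = 10) -> int:
    """Process messages until the queue looks empty, then return the count."""
    processed = 0
    while True:
        # Long polling: wait up to wait_seconds for messages before giving up.
        response = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=wait_seconds
        )
        messages = response.get("Messages", [])
        if not messages:
            return processed  # nothing left, so the task can exit
        for message in messages:
            handle(message["Body"])
            # Delete only after successful handling, so a crash leads to
            # redelivery instead of a lost message.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
            processed += 1
```

When the function returns, the script exits, the Fargate task stops, and billing for it stops too.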
A not-recommended but perhaps easier way to do it would be to run a celery worker in your Django container (e.g. using supervisor) and use Fargate's autoscaling features. You'd always have the one Django container running to receive data. If the celery worker on that container used up all of the available resources, Fargate would scale the service by adding tasks. Once the jobs were done, it'd remove the excess containers. You'd be paying the "overhead" for Django in each container, but it could cost you less than an always-on celery container and would certainly be simpler -- leverage your celery experience and avoid the extra layer of event handling.
EDIT: Another disadvantage of this version is that you need to run Redis somewhere and I've found the minimum cost for this to be relatively high.
Based on my growing AWS experience, here's what you probably should do...
Use AWS API Gateway as an always-on receiver of events/requests. You only pay for requests, the free tier includes a million per month, and the next 300M are $1 per million (pricing) so this is likely to be free.
While you have many options for responding to the request, an AWS Lambda function (which can be written in python) should have the least overhead.
If your queue will run longer than a Lambda function allows (15 minutes), you'll need to have that Lambda function delegate the processing to e.g. a Fargate task.
(Optional) If you want to use a Docker Hub container for your Fargate task, be aware that we experienced a bunch of issues with Tasks and Services failing to start due to rate limits at Docker Hub. We ended up wrapping our Fargate task in a Step Function that checked for this error specifically and retried.
(Optional) If you need to limit concurrency, this SO answer suggests having your Lambda function check for an existing execution (of a Step Function or Fargate task). I was hoping there was something native on Fargate Tasks or Step Functions but I don't see anything.
I imagine this would represent a huge operating cost savings over the always-on Fargate task and Elasticache Redis queue, but the up-front cost/hassle could exceed the savings.
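The "Lambda delegates to a Fargate task, but only if one isn't already running" step could be sketched as below. It assumes ecs is a boto3 ECS client and uses its list_tasks and run_task calls; the function name, cluster, family and subnet values are hypothetical. The check-then-start is not atomic, so two near-simultaneous invocations could still both start a task.

```python
from typing import Sequence


def start_worker_if_idle(ecs, cluster: str, family: str,
                         subnets: Sequence[str]) -> bool:
    """Start one Fargate task unless one from the same family is already active."""
    # desiredStatus="RUNNING" also covers tasks that are still starting up.
    active = ecs.list_tasks(cluster=cluster, family=family, desiredStatus="RUNNING")
    if active.get("taskArns"):
        return False  # a worker already exists and will pick up the new work
    ecs.run_task(
        cluster=cluster,
        taskDefinition=family,  # a bare family name means its latest active revision
        launchType="FARGATE",
        count=1,
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "assignPublicIp": "ENABLED",  # assumption: public subnets, no NAT
            }
        },
    )
    return True
```

The Lambda handler would call this after queueing the work, passing boto3.client("ecs") as the first argument.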
Have you thought of using AWS Lambda instead of the celery worker? You would then pay per task execution, where cost is driven by execution time and memory usage. If you have an application which is mostly idle then paying per request, skipping the idle cost, would make the most sense.
The core of my question is whether or not there are downsides to using an Amazon Machine Image + Micro Spot instances to run a task, vs using the Elastic Container Service (ECS).
Here's my situation: I have the need to run a task on demand that is triggered by a remote web hook.
There is the possibility this task can get triggered 10 times in a row, or go weeks w/o ever executing, so I definitely want a service that only runs (and bills) on demand.
My plan is to point the webhook to a Lambda function, but then the question is what to have the Lambda function do.
Though it doesn't take very long, this task requires several different runtimes (PowerShell Core, Python, PHP, Git) to get its job done, so Lambda isn't really a possibility as I'd hit the deployment package size limit. But I can use Lambda to kick off the job.
What I started doing was creating an AMI that has all the necessary runtimes and code, then using a Spot request to launch an instance, have it execute the operation via a startup script passed in via userdata, then shut itself down when it's done. I'd have to put in some rate control logic to prevent two from running at once, but that's a solvable problem.
I hesitated half way through developing this solution when I realized I could probably do this with a docker container on ECS using Fargate.
I just don't know if there is any benefit of putting in the additional development time of switching to a docker container, when I am not a docker pro and already have the AMI configured. Plus ECS/Fargate is actually more expensive than just running a micro instance.
Are there any concerns about spinning up short-lived (<5min) spot requests (t3a.micro) where there could be a dozen fired off in a single day? Are there rate limits about this? Will I get an angry email from AWS telling me to knock it off? Are there other reasons ECS is the only right answer? Something else entirely?
Your solution using spot instance and AMI is a valid one, though I've experienced slow times to get a spot instance in the past. You also incur the AMI startup time.
As mentioned in the comments, you will incur a minimum of 1 hour charge for the instance, so you should leave your instance up for the hour before terminating, in case more requests can come in the same hour.
IMHO you should build it all with lambda. By splitting the workload for each runtime into its own lambda you can make it work.
AWS supports Python and PowerShell runtimes, and you can create a custom PHP one. Chain them together with your glue of choice (SNS, SQS, direct invocation, or Step Functions) and you have the most cost effective solution. You also get the benefits of better and independent maintenance for each function/runtime.
Put the initial Lambda behind API Gateway and you will get rate limiting capability too.
I am trying to create a certain kind of networking infrastructure, and have been looking at Amazon ECS and Kubernetes. However I am not quite sure if these systems do what I am actually seeking, or if I am contorting them to something else. If I could describe my task at hand, could someone please verify if Amazon ECS or Kubernetes actually will aid me in this effort, and this is the right way to think about it?
What I am trying to do is on-demand single-task processing on an AWS instance. What I mean by this is, I have a resource heavy application which I want to run in the cloud and have process a chunk of data submitted by a user. I want to submit this data to be processed by the application, have an EC2 instance spin up, process the data, upload the results to S3, and then shut down the EC2 instance.
I have already put together a functioning solution for this using Simple Queue Service, EC2 and Lambda. But I am wondering would ECS or Kubernetes make this simpler? I have been going through the ECS documentation and it seems like it is not very concerned with starting up and shutting down instances. It seems like it wants to have an instance that is constantly running, then Docker images are fed to it as tasks to run. Can Amazon ECS be configured so if there are no tasks running it automatically shuts down all instances?
Also I am not understanding how exactly I would submit a specific chunk of data to be processed. It seems like "Tasks" as defined in Amazon ECS really correspond to a single Docker container, not so much what kind of data that Docker container will process. Is that correct? So would I still need to feed the data-to-be-processed into the instances via simple queue service, or other? Then use Lambda to poll those queues to see if they should submit tasks to ECS?
This is my naive understanding of this right now, if anyone could help me understand the things I've described better, or point me to better ways of thinking about this it would be appreciated.
This is a complex subject and many details for a good answer depend on the exact requirements of your domain / system. So the following information is based on the very high level description you gave.
A lot of the features of ECS, Kubernetes etc. are geared towards allowing a distributed application that acts as a single service and is horizontally scalable, upgradeable and maintainable. This means it helps with unifying service interfacing, load balancing, service reliability, zero-downtime maintenance, scaling the number of worker nodes up/down based on demand (or other metrics), etc.
The following describes a high level idea for a solution for your use case with kubernetes (which is a bit more versatile than AWS ECS).
So for your use case you could set up a kubernetes cluster that runs a distributed event queue, for example an Apache Pulsar cluster, as well as an application cluster that is being sent queue events for processing. Your application cluster size could scale automatically with the number of unprocessed events in the queue (custom pod autoscaler). The cluster infrastructure would be configured to scale automatically based on the number of scheduled pods (pods reserve capacity on the infrastructure).
You would have to make sure your application can run in a stateless form in a container.
The main benefit I see over your current solution would be cloud provider independence as well as some general benefits from running a containerized system: 1. Not having to worry about the exact setup of your EC2 instances in terms of operating system dependencies of your workload. 2. Being able to address the processing application as a single service. 3. Potentially increased reliability, for example in case of errors.
Regarding your exact questions:
Can Amazon ECS be configured so if there are no task running it automatically shuts down all instances?
The keyword here is autoscaling. Note that there are two levels of scaling: 1. infrastructure scaling (the number of EC2 instances) and 2. application service scaling (the number of application containers/tasks deployed). ECS infrastructure scaling works based on EC2 Auto Scaling groups. For more info see this link. For application service scaling and serverless ECS (Fargate) see this link.
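As a rough illustration of the second level (application service scaling), an ECS service can be registered with Application Auto Scaling with a minimum of zero tasks. The sketch assumes autoscaling is a boto3 "application-autoscaling" client; the function name, cluster and service names are hypothetical. A scaling policy (for example one driven by queue depth) would still have to be attached separately, and the EC2 instances themselves are scaled by their own Auto Scaling group.

```python
def allow_scale_to_zero(autoscaling, cluster: str, service: str,
                        max_tasks: int) -> str:
    """Register an ECS service as a scalable target that may drop to 0 tasks."""
    resource_id = f"service/{cluster}/{service}"
    autoscaling.register_scalable_target(
        ServiceNamespace="ecs",
        ResourceId=resource_id,
        ScalableDimension="ecs:service:DesiredCount",
        MinCapacity=0,          # lets the service run no tasks when idle
        MaxCapacity=max_tasks,  # upper bound while a backlog is being processed
    )
    return resource_id
```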
Also I am not understanding how exactly I would submit a specific chunk of data to be processed. It seems like "Tasks" as defined in Amazon ECS really correspond to a single Docker container, not so much what kind of data that Docker container will process. Is that correct?
A "Task Definition" in ECS describes how one or multiple Docker containers are deployed for a purpose and what their environment / limits should be. A task is a single running instance of a task definition; it can be run on its own or by a "Service", which keeps one or multiple tasks running. Similar concepts are Pod and Service/Deployment in Kubernetes.
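On the question of how a task learns which chunk of data to process: one option (besides having the container read from a queue) is to pass a pointer to the data when the task is started, using container overrides. The sketch below only builds the arguments for the boto3 ECS run_task call. The function name and the INPUT_S3_KEY variable are made up for illustration, and a real call would also need the launch type and network settings.

```python
def build_run_task_args(cluster: str, task_definition: str,
                        container_name: str, input_s3_key: str) -> dict:
    """Build run_task arguments that tell one task which data to process."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": 1,
        "overrides": {
            "containerOverrides": [
                {
                    # Must match a container name in the task definition.
                    "name": container_name,
                    # The application reads this variable to find its input.
                    "environment": [
                        {"name": "INPUT_S3_KEY", "value": input_s3_key}
                    ],
                }
            ]
        },
    }
```

The task definition stays generic ("how to run the processor"), and each run_task call supplies the per-job detail ("which data").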
So would I still need to feed the data-to-be-processed into the instances via simple queue service, or other? Then use Lambda to poll those queues to see if they should submit tasks to ECS?
A queue is always helpful in decoupling the service requests from processing and to make sure you don't lose requests. It is not required if your application service cluster can offer a service interface and process incoming requests directly in a reliable fashion. But if your application cluster has to scale up/down frequently that may impact its ability to reliably process.