I am running a simple Java HelloWorld program in a Docker container on AWS Batch. I have created a managed Compute Environment with the following values:
Minimum vCPUs: 0
Desired vCPUs: 0
Maximum vCPUs: 256
Instance types: optimal
On submitting the job, it executes successfully: the job is submitted to the queue, the scheduler provisions an EC2 instance (running the ECS agent container and the Java HelloWorld container specified in the job definition), and the job completes successfully with its logs in a CloudWatch log stream.
My issue is that after the job succeeds, the EC2 instance provisioned by the scheduler keeps running instead of terminating.
Please suggest if I am missing anything.
Your compute environment will terminate if it is idle near the end of an AWS billing hour.
The Compute Environment Parameters documentation for AWS Batch defines the State parameter: a compute environment in the ENABLED state can accept jobs from the queue, while a compute environment that is DISABLED and idle is scaled in toward the end of an AWS billing hour (which terminates your EC2 instances).
From the Oct 5, 2017 announcement:
AWS Batch now evaluates compute resources more frequently and immediately scales down any idle instances when there are no more runnable jobs in your job queues.
So your compute environment's instances will now be terminated promptly once they are idle.
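For completeness, here is a minimal sketch of creating such a managed compute environment from the CLI, with minvCpus=0 so Batch can scale the fleet back down once the queue is empty (the subnet, security group, account ID, and role names below are placeholders, not values from the question):

    # Managed compute environment that can scale to zero when idle
    aws batch create-compute-environment \
        --compute-environment-name helloworld-ce \
        --type MANAGED \
        --state ENABLED \
        --service-role arn:aws:iam::123456789012:role/AWSBatchServiceRole \
        --compute-resources 'type=EC2,minvCpus=0,desiredvCpus=0,maxvCpus=256,instanceTypes=optimal,instanceRole=ecsInstanceRole,subnets=subnet-0abc1234,securityGroupIds=sg-0abc1234'

With minvCpus=0, nothing holds instances up in an idle environment, so the scale-in described above can take the fleet all the way down to zero.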
My cluster sometimes gets a "burst" of information and generates a large number of Kubernetes Jobs at once; at other times I have ~0 active jobs.
I'm wondering how I can make it autoscale the number of nodes so that all these jobs are continuously processed in a reasonable time frame.
I specifically use AWS EKS and each job takes a few minutes to complete.
EKS allows you to deploy the Cluster Autoscaler, so when a new job cannot be scheduled due to a lack of available CPU/memory, an extra node will be added to the cluster.
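As a rough sketch, deploying the Cluster Autoscaler on EKS can be as simple as applying the example manifest from the kubernetes/autoscaler repository and tagging your node group's Auto Scaling group for auto-discovery (the manifest URL and tag keys below follow that repository's examples; verify them against the current docs):

    # Deploy the Cluster Autoscaler using the AWS auto-discovery example manifest
    kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

    # The autoscaler finds node groups via these tags on the Auto Scaling group:
    #   k8s.io/cluster-autoscaler/enabled        = true
    #   k8s.io/cluster-autoscaler/<cluster-name> = owned

Once deployed, pending Jobs that cannot fit on existing nodes trigger a scale-out, and nodes that sit idle are removed again after a cooldown.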
I have an ECS cluster with a service in it that is running a task I have defined. It's just a simple Flask server, as I'm learning how to use ECS. Now I'm trying to understand how to update my app and have it deploy seamlessly.
1. I start with the Flask server returning Hello, World! (rev=1).
2. I modify my app.py locally to say Hello, World! (rev=2).
3. I rebuild the Docker image and push it to ECR.
4. Since my image is still named image_name:latest, I can simply update the service and force a new deployment with:

    aws ecs update-service --force-new-deployment --cluster hello-cluster --service hello-service
My minimum healthy percent is set to 100% and my maximum is set to 200% (using rolling updates), so I'm assuming that a new EC2 instance should be set up while the old one is being shut down. What I observe (continually refreshing the ELB HTTP endpoint) is that the rev=? in the message alternates back and forth: (rev=1) then (rev=2) without fail (round robin, not random).
Then after a little bit (maybe 30 seconds?) the flipping stops and only the new message appears: Hello, World! (rev=2)
Throughout this process I've noticed that no new EC2 instances were started, so all of this must have been happening on the same instance.
What is going on here? Is this the correct way to update an application in ECS?
This is normal behavior, and it's linked to how you configured your minimum and maximum healthy percent.
A minimum healthy percent of 100% means that at every moment there must be at least 1 task running (for a service that should run 1 instance of your task). A maximum percent of 200% means that you don't allow more than 2 tasks running at the same time (again, for a service that should run 1 instance of your task). During a service update, ECS therefore first launches a new task (reaching the maximum of 200% without dropping below 100%) and, once this new task is considered healthy, removes the old one (back to 100%). This explains why both tasks run at the same time for a short period (and are load balanced).
This kind of configuration ensures maximum availability. If you want to avoid this, and can tolerate a small downtime, you can configure your minimum to 0% and your maximum to 100%.
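If you do opt for the small-downtime variant, a sketch of setting it from the CLI (cluster and service names taken from the question) might look like:

    # Stop the old task before starting the new one: at most 1 task (100%),
    # allowed to drop to 0 tasks (0%) during the deployment
    aws ecs update-service \
        --cluster hello-cluster \
        --service hello-service \
        --deployment-configuration minimumHealthyPercent=0,maximumPercent=100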
About your EC2 instances: they represent your "cluster", i.e. the hardware that your service uses to launch tasks. The process described above happens on this "fixed" hardware.
I'm currently architecting AWS ECS infrastructure.
To scale in/out automatically, I use Auto Scaling.
My system runs on AWS ECS (deployed via docker-compose).
Assume that we have 1 cluster and 1 service with 2 EC2 instances.
I defined a scaling policy via CloudWatch that fires when CPU utilization goes above 50%.
For autoscaling, we have to apply our policy to both the ECS service and the Auto Scaling group:
When the CloudWatch policy is attached to the ECS service, it automatically increases the desired task count when CPU utilization goes above 50%.
When the CloudWatch policy is attached to the Auto Scaling group, it automatically increases the EC2 instance count when CPU utilization goes above 50%.
After testing it, everything works fine.
But errors like this appear in my service event log:

    service v1 was unable to place a task because no container instance met all of its requirements. The closest matching container-instance 8bdf994d-9f73-42ec-8299-04b0c5e7fdd3 has insufficient memory available.
I think this occurs because service scaling starts before EC2 instance scaling (service scaling, which scales the task count in/out, needs an EC2 instance to place the new tasks on).
But it works fine in the end; maybe it automatically retries several times (I'm not sure).
I wonder: is this a normal configuration for AWS ECS autoscaling?
Or is there a point missing in my flow?
Thanks.
ECS can only schedule a task if a container instance is available that matches the container's CPU/memory requirements. Ensure you have this headroom available to guarantee smooth autoscaling.
The EC2 Auto Scaling group should scale out before service auto-scaling does, to ensure a container instance is available for the task scheduler.
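One way to arrange this is target tracking on both layers, with the instance-level target conservative enough that capacity tends to arrive before tasks need it. A sketch with assumed cluster/service/ASG names (my-cluster, my-service, my-ecs-asg) and illustrative thresholds:

    # 1. Register the service's desired task count as a scalable target
    aws application-autoscaling register-scalable-target \
        --service-namespace ecs \
        --scalable-dimension ecs:service:DesiredCount \
        --resource-id service/my-cluster/my-service \
        --min-capacity 2 --max-capacity 10

    # 2. Scale the task count on average service CPU (target tracking)
    aws application-autoscaling put-scaling-policy \
        --service-namespace ecs \
        --scalable-dimension ecs:service:DesiredCount \
        --resource-id service/my-cluster/my-service \
        --policy-name cpu50-tasks \
        --policy-type TargetTrackingScaling \
        --target-tracking-scaling-policy-configuration \
        'TargetValue=50.0,PredefinedMetricSpecification={PredefinedMetricType=ECSServiceAverageCPUUtilization}'

    # 3. Scale the Auto Scaling group on instance CPU with a lower target,
    #    so new instances tend to join before task placement fails
    aws autoscaling put-scaling-policy \
        --auto-scaling-group-name my-ecs-asg \
        --policy-name cpu40-instances \
        --policy-type TargetTrackingScaling \
        --target-tracking-configuration \
        'TargetValue=40.0,PredefinedMetricSpecification={PredefinedMetricType=ASGAverageCPUUtilization}'

Even with this, a short burst can still produce the "insufficient memory" event while a new instance boots; the ECS service scheduler keeps retrying placement, which is why the service recovers on its own.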
I have a batch processing system which runs for 5 hours daily at a fixed time. With AWS Batch, I could schedule the job, which creates the required EC2 instances to do the work and then terminates the instances. But with ECS, can I launch and terminate the EC2 instances automatically as per my requirement?
This automatic downscaling can be done using AWS Batch as long as you create the Compute Environment as a managed Compute Environment instead of an unmanaged one.
However, using tasks in ECS means the EC2 resources must be cleaned up and deleted manually. That can be built into part of your application or managed with CloudFormation, but scaling down those resources is ultimately your responsibility.
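With a managed compute environment (see the create-compute-environment sketch earlier), the daily run then reduces to submitting a job: Batch provisions instances for it and scales them back down afterwards. The names below are placeholders:

    # Submit the daily job to a queue backed by a managed compute environment;
    # with minvCpus=0, Batch scales the instances back to zero when it finishes
    aws batch submit-job \
        --job-name nightly-batch-run \
        --job-queue my-job-queue \
        --job-definition my-batch-job:1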
Question:
Can I reuse the EC2 resource created during the first on-demand run of the data pipeline in subsequent on-demand runs as well?
Description:
I have configured an 'on-demand' AWS Data Pipeline which needs to be activated many times during a day (say, 3 times within an hour).
(I cannot use cron or time-series style scheduling, since I have to pass different parameters to the pipeline on each execution.)
On each on-demand activation, Data Pipeline seems to create a new EC2 resource. Is this the case?
Can I reuse the EC2 resource created during the first on-demand run in subsequent runs as well?
The AWS documentation provides the following information, but it's not clear whether it applies to 'on-demand' pipelines as well:
AWS Data Pipeline allows you to maximize the efficiency of resources by supporting different schedule periods for a resource and an associated activity.
For example, consider an activity with a 20-minute schedule period. If the activity's resource were also configured for a 20-minute schedule period, AWS Data Pipeline would create three instances of the resource in an hour and consume triple the resources necessary for the task. Instead, AWS Data Pipeline lets you configure the resource with a different schedule; for example, a one-hour schedule. When paired with an activity on a 20-minute schedule, AWS Data Pipeline creates only one resource to service all three instances of the activity in an hour, thus maximizing usage of the resource.
This isn't possible with Data-Pipeline-managed resources. For this scenario, you would need to spin up the EC2 instance yourself and configure Task Runner on it:
You can install Task Runner on computational resources that you manage, such as an Amazon EC2 instance, or a physical server or workstation. Task Runner can be installed anywhere, on any compatible hardware or operating system, provided that it can communicate with the AWS Data Pipeline web service.
To connect a Task Runner that you've installed to the pipeline activities it should process, add a workerGroup field to the object, and configure Task Runner to poll for that worker group value. You do this by passing the worker group string as a parameter (for example, --workerGroup=wg-12345) when you run the Task Runner JAR file.
This way Data Pipeline will not create any resources for you, and all activities will run on the EC2 instance that you provided.
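A sketch of that invocation on your long-lived instance, following the pattern in the AWS docs (the credentials file, region, and log bucket are placeholders):

    # Run Task Runner on your own EC2 instance, polling the worker group
    # that your pipeline activities reference via their workerGroup field
    java -jar TaskRunner-1.0.jar \
        --config ~/credentials.json \
        --workerGroup=wg-12345 \
        --region=us-east-1 \
        --logUri=s3://my-bucket/task-runner-logs

Because the activities specify workerGroup instead of runsOn, every on-demand activation sends its work to this same instance rather than provisioning a fresh one.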