How can I understand `Nodes` in EKS Fargate? - amazon-web-services

I deployed a EKS cluster and a fargate profile. Then I deployed a few application to this cluster. I can see these fargate instances are launched.
when I click each of this instance, it shows me some information like os, image etc. But it doesn't tell me the CPU and memory. When I look at fargate pricing: https://aws.amazon.com/fargate/pricing/. It is calculated based on CPU and Memory.
I have used ECS and it is very clear that I need to provision CPU/Memory in service/task level. But I can't find anything in EKS.
How do I know how much resources they are consuming?

With Fargate you don`t have provision, configure or scale virtual machines to run your containers so that they become fundamental compute primitive.
This solution model is called serverless where you are being charged for only the compute resources and storage that are need to execute some piece of your code. It does not mean that there are not server involved in this, it just you don`t need to care about those.
To monitor there those you can use CloudWatch. Below documents describe how this can be achieved:
How do I troubleshoot high CPU utilization on an Amazon ECS task on
Fargate?
How can I monitor high memory utilization for Amazon ECS tasks on
Fargate?
It is worth to mention that Fargate is just a launch type for ECS (Another one is EC2). Please have a look at the diagram in this document for clear image of how those are connected. The CloudWatch metrics are collected automatically for Fargate. If you are using the AKS with Fargate you can monitor them with usage of metrics-addon or prometheus inside your kubernetes cluster.
Here's an example of monitoring Fargate with Prometheus. Notice that it scrapes the metrics from CloudWatch.

Related

AWS ECS cluster auto-scaling vs service auto-scaling

this is my first time using amazon ecs service.
I have searched online for awhile to understand auto-scaling with ecs services.
I found there are two options to auto-scale my application. But, There are some I don't understand.
First is Service auto scaling which track the cpu/memory metric from cloudWatch and increase the task number accordingly.
Second is cluster auto scaling which needs to create auto scaling resource, create capacity provider and so on. But, in Tutorial: Using cluster auto scaling, it can run the task definition without service. But it also seems increasing the task number in the end.
So what's the different and 'pros and cons' between them?
I will try to explain briefly.
A Task is a container which is running our code(from a docker image).
As Service is making sure that given desired no of tasks are maintained.
We will be running these services in ECS backed by EC2 or Fargate. Ec2 is machines managed by us. Fargate is machines managed by AWS.
Scaling:
Ultimately, We will be scaling the tasks just by setting desired no of tasks between min and max tasks, based on CPU or any other metric of individual task. This is called service auto scaling.
Fargate: Since AWS will manage necessary VMs behind the scenes, we can set any no of desired tasks we want and seamlessly scale without worrying about any infrastructure.
EC2: We can't seamlessly scale services because we need to add/remove EC2 instances behind the scenes too. We need to auto scale these instances also based on cpu or any other metrics of the Ec2 machines, which is called Cluster scaling.

Does AWS ECS Control Plane i.e Cluster cost any thing?

On looking at the EKS Pricing page, its very clear that the cluster i.e control plane as of today costs $.10/hour. Quote from - https://aws.amazon.com/eks/pricing/
You pay $0.10 per hour for each Amazon EKS cluster that you create.
But on looking at ECS pricing page - https://aws.amazon.com/ecs/pricing/, I am not able to figure out the cluster i.e control plane cost. So before creating an ECS cluster and leaving it there irrespective of usage, I want to know how I will be billed.
Please share your thoughts!!!!
Also, my understanding is, for EKS cluster, I will be charged for the cluster irrespective of the usage i.e the cluster is used for deployments/pods/services etc or just left doing nothing. Please correct if wrong.
No, ECS does not have a control plane/cluster fee. You only pay for the EC2 or Fargate resources ECS runs your containers on.
Your understanding about the EKS cluster costs is correct.
Note: There are other fees an ECS cluster can generate, such as CloudWatch Logs and Metrics fees, but that's true for all the AWS compute services, including EKS, Elastic Beanstalk, Lambda, etc.

Correct way to scailing aws ecs

Now I'm architecting AWS ECS infrastructure.
To auto scale in/out, I used auto scailing.
My system is running on AWS ECS(to deploy docker-compose)
Assume that we have 1 cluster, 1 service with 2 ec2 instance.
I defined scailing policy via CloudWatch if cpu utilization up to 50%.
To autoscailing, we have to apply our policy to ecs service and autoscailing group.
When attach cloudwatch policy to ecs service, it will automatically increase task definition count if cpu utilization up to 50%.
When attach cloudwatch policy to autoscailing group, it will automatically increase ec2 instance count if cpu utilization up to 50%.
After tested it, everything works fine.
But in my service event logs, errors appear like this.
service v1 was unable to place a task because no container instance met all of its requirements. The closest matching container-instance 8bdf994d-9f73-42ec-8299-04b0c5e7fdd3 has insufficient memory available.
I think it occured because of service scailing is start before ec2 instance scailing. (Because service scailing(scale in/out task definition) need to ec2 instance to run it)
But it works fine. Maybe it retry automatically about several times. (I'm not sure)
I wonder that, it is normal configuration on AWS ECS autoscailing?
Or, any missing point in my flow?
Thanks.
ECS can only schedule a service if a container instance is available that matches the containers cpu/memory requirements. Ensure you have this space available to guarantee smooth auto-scaling.
The ec2-asg scaling should happen before service auto-scaling to ensure container instance is available for task scheduler.

How to scale in/out EC2 instances based on ECS cluster resources availability?

I have multiple services running in my ECS cluster. Each service contains one or more tasks based on CPU utilization or a number of users.
I have deployed these containers with EC2 launch type.
Now, I want to increase/decrease the number of EC2 instances based on available resources in the cluster.
Let's say there are four ECS tasks running in two m5.large instances.
Now, if an ECS service increases the number of tasks and there aren't enough resources available in the cluster, how can I spin up an instance and add to the cluster?
And same goes for vice versa. If there is instance running with no ecs task in it, how can I destroy it automatically?
PS - I was using Fargate. Since it's cost is very high, I moved to EC2 instances.
you need to setup your ecs cluster instances in a ASG as #Nitesh says, second you need to set up a cloudwatch alert based in a key metric, with ecs is complex because you need to set up two autoscaling policies one by service another one to scale up your instances, for ec2 the metric that you could use is Cluster CPU reservation and /or Cluster memory reservation.
The scheme works like this your service increases the number of the desired container by an autoscaling rule using a key metric for your service as could be de CPU usage or the number the request in a load balancer and in consequence the Cluster CPU reservation increase this triggers the cloudwatch alert and your ASG increase the number of instaces.
Some tips scale up fast, and scale down slow this could by handle by setting up the time of the alerts
For the containers use Service Auto Scaling and Target tracking policies for more info see
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch-metrics.html#cluster_reservation
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
https://aws.amazon.com/blogs/compute/automatic-scaling-with-amazon-ecs/
I hope this help
Regards

Mesos - dynamic cluster size

Is it possible in Mesos to have dynamic cluster size - with total cluster CPU and RAM quotas set?
Mesos knows my AWS credentials and spawns new ec2 instances only if there is a new job that cannot fit into existing resources. (AWS or other cloud provider). Similar to that - when the job is finished it could kill the ec2 instance.
It can be Mesos plugin/framework or some external tool - any help appreciated.
Thanks
What we are doing is we are using Mesos monitoring tools and HTTP endpoints # http://mesos.apache.org/documentation/latest/endpoints/ to monitor the cluster.
We have our own framework that gets all the relevant information from the master and slave nodes and our algorithm uses that information to scale the cluster.
For example if the cluster CPU utilization is > 0.90 we bring up a new instance and register that slave to master.
If I understand you correctly you are looking for a solution to autoscale your Mesos cluster?
What some people will do on AWS for example is to create an autoscaling group allowing them to scale up and down the number of agents/slave nodes depending on their needs.
Note that the trigger when to scale up/down are usually application dependent (e.g., could be ok for one app to be at a 100% utilization while for others 80% should already trigger a scale-up action).
For an example of using the AWS auto scaling groups you could have a look at Mesosphere DCOS Community edition (note as mentioned above you will still have to write the trigger code for scaling your scaling group).
AFAIK, the Mesos can not autoscaling itself; it need someone to start Mesos Agent for the cluster. One option is to build a script and be managed by Marathon, this script is to start/stop agents after comparing your pending tasks in the framework and Mesos cluster.