AWS comparison between nodegroup and managed nodegroup

I use eksctl to create an EKS cluster on AWS.
After writing a YAML configuration file that defines the EKS cluster, following the docs, I run the command eksctl create cluster -f k8s-dev/k8s-dev.yaml to create the cluster, and the log shows these lines:
2021-12-15 16:23:55 [ℹ] will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
2021-12-15 16:23:55 [ℹ] will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
What is the difference between a nodegroup and a managed nodegroup?
I have read the official AWS docs about managed nodegroups, but I still cannot see clearly which reasons would lead me to choose a nodegroup or a managed nodegroup.
Which would you use when you need to create an EKS cluster?

eksctl only gives you the option of choosing nodeGroups or managedNodeGroups (docs: https://eksctl.io/usage/container-runtime/#managed-nodes) but does not describe the difference. However, I think the following document will give you the information you need.
It describes the feature differences between EKS managed node groups, self-managed nodes, and AWS Fargate:
https://docs.aws.amazon.com/eks/latest/userguide/eks-compute.html
Choose the one that matches your purpose; if I were you, I would choose a managed nodegroup.
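For illustration, here is a minimal sketch of what the two options look like in an eksctl config file, written as a shell snippet that creates the file and then the cluster; the file path matches the question, while the cluster name, region, instance type and sizes are assumptions:

# Minimal sketch; the file path matches the question, everything else is assumed
cat > k8s-dev/k8s-dev.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: k8s-dev
  region: us-west-2

# "nodeGroups" = self-managed nodes: eksctl creates the Auto Scaling group in a
# CloudFormation nodegroup stack, and you own node updates and lifecycle.
nodeGroups:
  - name: ng-self-managed
    instanceType: t3.medium
    desiredCapacity: 2

# "managedNodeGroups" = EKS managed node groups: AWS provisions the nodes and
# handles updates and graceful draining on your behalf.
managedNodeGroups:
  - name: mng-managed
    instanceType: t3.medium
    desiredCapacity: 2
EOF

eksctl create cluster -f k8s-dev/k8s-dev.yaml

A config with only a nodeGroups section produces the "1 nodegroup stack(s) ... 0 managed nodegroup stack(s)" log lines from the question; with only managedNodeGroups the counts are reversed.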

Related

Unable to get nodegroup info using eksctl

Total noob here, and I have a runaway EKS cluster adding up $$ on AWS.
I'm having a tough time scaling down my cluster and am not sure what to do. I'm following the recommendations in How to stop AWS EKS Worker Instances, referenced below.
If I run:
"eksctl get cluster", I get the following:
NAME REGION EKSCTL CREATED
my-cluster us-west-2 True
unique-outfit-1636757727 us-west-2 True
I then try the next line "eksctl get nodegroup --cluster my-cluster" and get:
2021-11-15 15:31:14 [ℹ] eksctl version 0.73.0
2021-11-15 15:31:14 [ℹ] using region us-west-2
Error: No nodegroups found
I'm desperate to scale down the cluster, but I'm stuck at the above command.
It seems everything installed and is running as intended, but the management part is failing! What am I doing wrong? Thanks in advance!
Reference --
eksctl get cluster
eksctl get nodegroup --cluster CLUSTERNAME
eksctl scale nodegroup --cluster CLUSTERNAME --name NODEGROUPNAME --nodes NEWSIZE
To completely scale down the nodes to zero use this (max=0 threw errors):
eksctl scale nodegroup --cluster CLUSTERNAME --name NODEGROUPNAME --nodes 0 --nodes-max 1 --nodes-min 0
You don't have a managed node group, therefore eksctl does not return any nodegroup results. The same applies to the aws eks CLI.
...scaling down my cluster...
You can log on to the console, go to EC2 -> Auto Scaling Groups, locate the node group's Auto Scaling group and scale it by updating the "Group details". Depending on how your cluster was created, you can look for the tag kubernetes.io/cluster/<your cluster name> to find the correct group.
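If you prefer the CLI, here is a rough sketch of the same idea; the tag key uses the cluster name from the question, and <asg-name> is a placeholder you fill in from the first command's output:

# Find the Auto Scaling group carrying the cluster tag
aws autoscaling describe-auto-scaling-groups \
  --query "AutoScalingGroups[?Tags[?Key=='kubernetes.io/cluster/my-cluster']].AutoScalingGroupName" \
  --output text

# Scale that group down to zero worker nodes
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name <asg-name> \
  --min-size 0 --max-size 0 --desired-capacity 0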

Deleting EKS Cluster with eksctl not working properly, requires manual deletion of resources such as ManagedNodeGroups

I'm running a cluster on EKS, which I deployed following the tutorial with the command eksctl create cluster --name prod --version 1.17 --region eu-west-1 --nodegroup-name standard-workers --node-type t3.medium --nodes 3 --nodes-min 1 --nodes-max 4 --ssh-access --ssh-public-key public-key.pub --managed.
Once I'm done with my tests (mainly installing and then uninstalling helm charts) and I have a clean cluster with no jobs running, I try to delete it with eksctl delete cluster --name prod, which produces these errors:
[ℹ] eksctl version 0.25.0
[ℹ] using region eu-west-1
[ℹ] deleting EKS cluster "test"
[ℹ] deleted 0 Fargate profile(s)
[✔] kubeconfig has been updated
[ℹ] cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress
[ℹ] 2 sequential tasks: { delete nodegroup "standard-workers", delete cluster control plane "test" [async] }
[ℹ] will delete stack "eksctl-test-nodegroup-standard-workers"
[ℹ] waiting for stack "eksctl-test-nodegroup-standard-workers" to get deleted
[✖] unexpected status "DELETE_FAILED" while waiting for CloudFormation stack "eksctl-test-nodegroup-standard-workers"
[ℹ] fetching stack events in attempt to troubleshoot the root cause of the failure
[✖] AWS::CloudFormation::Stack/eksctl-test-nodegroup-standard-workers: DELETE_FAILED – "The following resource(s) failed to delete: [ManagedNodeGroup]. "
[✖] AWS::EKS::Nodegroup/ManagedNodeGroup: DELETE_FAILED – "Nodegroup standard-workers failed to stabilize: [{Code: Ec2SecurityGroupDeletionFailure,Message: DependencyViolation - resource has a dependent object,ResourceIds: [[REDACTED]]}]"
[ℹ] 1 error(s) occurred while deleting cluster with nodegroup(s)
[✖] waiting for CloudFormation stack "eksctl-test-nodegroup-standard-workers": ResourceNotReady: failed waiting for successful resource state
To fix this I had to manually delete the AWS VPC resources and then the managed node group, and then delete everything again.
I tried again with the steps above (creating and deleting with the commands provided in the official getting started documentation), but I get the same errors upon deletion.
It seems extremely weird that I have to manually delete resources when doing something like this. Is there a fix for this problem, am I doing something wrong, or is this standard procedure?
All commands are run through the official eksctl CLI, and I'm following the official eksctl deployment guide.
If you try to delete the security group that the node group's EC2 instances are attached to, you will find the root cause.
Usually it will say there is a network interface still attached.
So the solution is to delete that linked network interface manually; the node group will then delete without any error.
If you are using managed node groups and public subnets, be sure that you update your subnet settings to map public IPs on launch before April 22 (originally April 20). You can follow the progress of updates to managed node groups on the AWS containers roadmap on GitHub.
If you want to learn more about networking configuration and IP assignment for EKS clusters, check the AWS blog post on cluster networking for worker nodes.
Also you can try:
Go to EC2 > Network Interfaces
Sort by VPC, find the interfaces assigned to your VPC
The interface to delete should be the only one that is "available", it should also be the only one assigned to the problematic remote access SG. If more than one interface matches this description, delete them all.
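A rough CLI equivalent of those console steps, with the VPC and ENI IDs as placeholders:

# List interfaces in the cluster VPC that are detached ("available")
aws ec2 describe-network-interfaces \
  --filters Name=vpc-id,Values=<vpc-id> Name=status,Values=available \
  --query 'NetworkInterfaces[].[NetworkInterfaceId,Description]' --output table

# Delete the orphaned interface(s), then retry eksctl delete cluster
aws ec2 delete-network-interface --network-interface-id <eni-id>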
Take a look: eks-managed-node-groups, eksctl-node-group.
Have you tried running the eksctl delete cluster command with the --wait flag?
Without that flag it reports that the cluster has been deleted while deletion activities are still going on in the background.
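For example, with the cluster name and region from the question:

eksctl delete cluster --name prod --region eu-west-1 --wait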

How to stop AWS EKS Worker Instances

I wonder if it would be possible to temporarily stop the worker VM instances so they are not running at night when I am not working on cluster development. So far the only way I am aware of to "stop" the instances from running is to delete the cluster itself, which I don't want to do. Any suggestions are highly appreciated.
P.S. Edited later
The cluster was created following steps outlined in this guide.
I'm just learning myself but this might help. If you have eksctl installed, you can use it from the command line to scale your cluster. I scale mine down to the min size when I'm not using it:
eksctl get cluster
eksctl get nodegroup --cluster CLUSTERNAME
eksctl scale nodegroup --cluster CLUSTERNAME --name NODEGROUPNAME --nodes NEWSIZE
To completely scale down the nodes to zero use this (max=0 threw errors):
eksctl scale nodegroup --cluster CLUSTERNAME --name NODEGROUPNAME --nodes 0 --nodes-max 1 --nodes-min 0
Go to the EC2 dashboard for your node group's instances, open Auto Scaling Groups from the navigation panel, select your group with the checkbox, click Edit, and change the Desired, Min and Max capacities to 0.
Edit the Auto Scaling group and set the instance counts to 0.
This will shut down all worker nodes.
You can then use AWS Systems Manager Automation to schedule a repetitive action, through automation documents, that stops and starts the nodes at given periods of time.
You can't stop the master nodes, as they are managed by AWS.
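As one possible way to automate that schedule without SSM, Auto Scaling scheduled actions can scale the node group's Auto Scaling group down in the evening and back up in the morning; the group name and the sizes are placeholders, and the recurrence fields are cron expressions in UTC:

# Scale to zero every weekday evening at 19:00 UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name <nodegroup-asg-name> \
  --scheduled-action-name scale-down-evening \
  --recurrence "0 19 * * 1-5" \
  --min-size 0 --max-size 0 --desired-capacity 0

# Scale back up every weekday morning at 07:00 UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name <nodegroup-asg-name> \
  --scheduled-action-name scale-up-morning \
  --recurrence "0 7 * * 1-5" \
  --min-size 1 --max-size 4 --desired-capacity 3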
Take a look at kube-downscaler, which can be deployed to the cluster to scale deployments in and out based on the time of day.
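If I remember kube-downscaler's conventions correctly, it is driven by annotations such as downscaler/uptime (check the project README for the exact names); for example, to keep a hypothetical deployment up only during working hours:

# Deployment name is hypothetical; the downscaler scales it to zero outside this window
kubectl annotate deployment my-app 'downscaler/uptime=Mon-Fri 07:00-19:00 UTC'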
More cost-reduction techniques are covered in this blog.

AWS - ECS: List clusters and their Amazon EC2 instances

Is there a way to list
all your ECS clusters
the EC2 instance(s) comprising each cluster?
The AWS CLI does not seem to support such an option.
I am trying to create an inventory of these resources, and I want the above info recorded (ECS clusters plus the number and type of instances in each).
Do you have the latest AWS CLI installed, so that the ecs subcommand is available?
How to list the available clusters - it returns a list of cluster ARNs:
aws ecs list-clusters
How to get the container instances of a cluster - it returns a list of container instance ARNs in the cluster:
aws ecs list-container-instances --cluster FOOBAR
Finally, how to get the EC2 instance ID(s) of the container instance(s):
aws ecs describe-container-instances --cluster FOOBAR --container-instances FOOBAR_CLUSTER_CONTAINER_INSTANCES_ARNS
The last command describes the given container instance(s); from its output you can pick out the ec2InstanceId field to find the EC2 instance ID(s).
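Putting the three commands together, a minimal inventory sketch could look like this; it assumes the default region and profile, loops over every cluster, and prints each cluster's EC2 instance IDs and types:

for cluster in $(aws ecs list-clusters --query 'clusterArns[]' --output text); do
  echo "Cluster: $cluster"
  # Container instance ARNs registered in this cluster (may be empty)
  instances=$(aws ecs list-container-instances --cluster "$cluster" \
    --query 'containerInstanceArns[]' --output text)
  [ -z "$instances" ] && continue
  # Map container instances to their underlying EC2 instance IDs
  ids=$(aws ecs describe-container-instances --cluster "$cluster" \
    --container-instances $instances \
    --query 'containerInstances[].ec2InstanceId' --output text)
  # Show instance ID and type for the inventory
  aws ec2 describe-instances --instance-ids $ids \
    --query 'Reservations[].Instances[].[InstanceId,InstanceType]' --output table
done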

In which scenario should we consider creating a manual VPC (and subnets) for KOPS?

We are trying to create a KOPS cluster; however, we need to deploy our database server separately (outside of the KOPS cluster). We thought of creating the infrastructure (VPC, subnets, etc.) with CloudFormation and creating the database server (EC2) manually,
then deploying the application servers through the KOPS cluster.
My question is: is it recommended to create the VPC (and subnets) manually for KOPS?
Kops can create the AWS resources itself, and it can also output Terraform so you can manage your infrastructure as code. It's usually best to let Kops create and manage those resources unless you can't and have to deploy into an existing VPC. Outputting to Terraform also gives you the option to add an RDS cluster in the Terraform code and have it live alongside your Kops cluster so it's all managed together.
Here is how you would do that and keep your state files in S3.
$ kops create cluster \
--name=kubernetes.mydomain.com \
--state=s3://mycompany.kubernetes \
--dns-zone=kubernetes.mydomain.com \
[... your other options ...]
--out=. \
--target=terraform
Then you would add your RDS cluster to the Terraform code, run terraform plan, and then terraform apply.
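The Terraform side is then the usual workflow, run in the directory that kops wrote with --out=. (a minimal sketch):

terraform init    # download providers and set up the working directory
terraform plan    # review the resources kops generated plus your RDS additions
terraform apply   # create the cluster infrastructure and the database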