Kubernetes multi-master cluster on AWS

We have created a single-master, three-worker-node cluster on AWS using Terraform, user-data YAML files, and CoreOS AMIs. The cluster works as expected, but we now need to scale the masters up from one to three for redundancy. My question is: other than using etcd clustering and/or the information provided at http://kubernetes.io/docs/admin/high-availability/, do we have any options to deploy a new multi-master cluster or scale the existing one up to multiple masters? Let me know if more details are required to answer this question.

The kops project can set up a high-availability master for you when creating a cluster.
Pass the following when you create the cluster (replacing the zones with whatever is relevant to you):
--master-zones=us-east-1b,us-east-1c,us-east-1d
Additionally, it can export Terraform files if you want to continue to use Terraform.
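For example, a minimal sketch of the full command (the cluster name, state store, and zones are placeholders you would replace with your own):

# Create a cluster with three masters spread across three AZs, and
# emit Terraform files instead of applying the changes directly.
kops create cluster \
  --name=k8s.example.com \
  --state=s3://my-kops-state-store \
  --zones=us-east-1b,us-east-1c,us-east-1d \
  --master-zones=us-east-1b,us-east-1c,us-east-1d \
  --node-count=3 \
  --target=terraform \
  --out=./kops-terraform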

Related

Dataproc cluster underlying VMs using default service account

I created a Dataproc cluster using a service account via a Terraform script. The cluster has 1 master and 2 workers, so three Compute Engine instances were created as part of this cluster creation. My question is:
Why do these VMs have the default service account? Shouldn't they use the same service account that I used to create the Dataproc cluster?
Edit: removed one question as suggested in a comment (the topic became too broad).
You can specify the service account used by the cluster VMs when you create the cluster. If you are sure they still use the default service account, it might be a mistake in the Terraform script; you can test with gcloud, without Terraform, to confirm, as in the sketch below.
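A quick test with gcloud might look like this (the region, cluster, and service account names are placeholders; in Terraform the equivalent setting is the service_account field of gce_cluster_config on the google_dataproc_cluster resource):

# Create a Dataproc cluster whose VMs run as a specific service
# account instead of the Compute Engine default one.
gcloud dataproc clusters create test-cluster \
  --region=us-central1 \
  --service-account=my-dataproc-sa@my-project.iam.gserviceaccount.com

Note that the service account typically needs the Dataproc Worker role (roles/dataproc.worker) for the cluster to come up.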

What exactly does EKS do if CloudFormation is needed?

What does AWS' Elastic Kubernetes Service (EKS) do, exactly, if so much configuration is needed in CloudFormation, which is (yet) another AWS service?
I followed the AWS EKS Getting Started guide in the docs (https://docs.aws.amazon.com/eks/latest/userguide/eks-ug.pdf), where it seems CloudFormation knowledge is heavily required to run EKS.
Am I mistaken about something?
So in addition to learning the Kubernetes .yaml manifest definitions, to run k8s on EKS, AWS expects you to learn their CloudFormation .yaml configuration manifests as well (which are all PascalCase as opposed to k8s' camelCase, I might add)?
I understand that EKS does some management of the latest version of k8s and the control plane, and is "secure by default", but other than that?
Why wouldn't I just run k8s on AWS using kops then, and deal with the slightly outdated k8s versions?
Or am I supposed to do EKS + CloudFormation + kops at which point GKE looks like a really tempting alternative?
Update:
At this point I'm really thinking EKS is just a thin wrapper over CloudFormation, after researching EKS in detail and seeing how reliant it is on CloudFormation manifests.
Likely a business response to the alarming popularity of k8s and GKE in general, with no substance to back the service.
Hopefully this helps save the time of anyone evaluating the half-baked service that is EKS.
To run Kubernetes on AWS you have basically two options:
using kops, which creates the master nodes + worker nodes under the hood, on plain EC2 machines
EKS + a CloudFormation workers stack (you can also use Terraform as an alternative to deploy the workers, or eksctl, which creates both the EKS cluster and the workers; I recommend following this workshop)
EKS alone provides only the master nodes of a Kubernetes cluster, in a highly available setup. You still need to add the worker nodes, where your containers will run.
I tried both kops and EKS + workers, and I ended up using EKS because I found it easier to set up and maintain, and more fault-tolerant.
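For reference, a minimal eksctl sketch (the name, region, and node count are placeholders):

# Create an EKS control plane plus a group of worker nodes;
# eksctl generates and applies the CloudFormation stacks for you.
eksctl create cluster \
  --name=my-cluster \
  --region=us-east-1 \
  --nodes=3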
I felt the same difficulties earlier, and no article gave me, at a glance, the requirements for everything that needs to be done. A lot of people just recommend using eksctl, which in my opinion creates a bloated and hard-to-manage pile of CloudFormation.
Basically, EKS is just a wrapper around Kubernetes; there are some points of integration between Kubernetes and AWS that still need to be handled manually.
I've written an article that I hope can help you understand all the processes that need to be in place.
EKS is the managed control plane for Kubernetes, while CloudFormation is an infrastructure templating service.
Instead of EKS, you can run and manage the control plane (master nodes) yourself on top of EC2 machines if you want to optimize for cost. With EKS you pay for the underlying infrastructure (EC2 + networking...) plus a managed service fee (the EKS price).
CloudFormation provides a nice interface to template and automate your infrastructure. You may use Terraform in place of CloudFormation.
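To make the division of labor concrete, here is a rough sketch of creating just the control plane with the AWS CLI (the role ARN, subnets, and security group are placeholders); the worker nodes and most of the networking remain your job, whether via CloudFormation, Terraform, or something else:

# Create only the managed Kubernetes control plane; this call
# provisions no worker nodes.
aws eks create-cluster \
  --name my-cluster \
  --role-arn arn:aws:iam::123456789012:role/eks-service-role \
  --resources-vpc-config subnetIds=subnet-aaaa,subnet-bbbb,securityGroupIds=sg-cccc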

AWS ECS SDK: register a new container instance (EC2) in an ECS cluster using the SDK

I've run into a problem while using the AWS SDK. Currently I am using the SDK for Go, but solutions in other languages are welcome too!
I have an ECS cluster created via the SDK.
Now I need to add EC2 container instances to this cluster. My problem is that I can't use the Amazon ECS agent's config to specify the cluster name:
#!/bin/bash
echo ECS_CLUSTER=your_cluster_name >> /etc/ecs/ecs.config
or anything like that; I can only use the SDK.
I found a method called RegisterContainerInstance.
But it has a note:
This action is only used by the Amazon ECS agent, and it is not intended for use outside of the agent.
It doesn't look like a working solution.
I need to understand how (if it's possible at all) to create a working ECS cluster using the SDK only.
UPDATE:
My main goal is to start a specified number of servers from my Docker image.
While investigating this task, I found that I need to:
create an ECS cluster
assign the needed number of EC2 instances to it
create a task with my Docker image
run it on the cluster manually or as a service
So I:
Created a new cluster via the CreateCluster method, with the name "test-cluster".
Created a new task via RegisterTaskDefinition.
Created a new EC2 instance with the ecsInstanceRole role and the ECS-optimized AMI that is correct for my region.
And this is where the problems started.
Actual result: all new EC2 instances were attached to the "default" cluster (AWS created it and attached the instances to it).
If I were using the ECS agent I could specify the cluster name via the ECS_CLUSTER config env var, but I am developing a tool that uses only the SDK (without any ability to configure the ECS agent by hand).
RegisterTaskDefinition gives me no way to specify a cluster, so my question is: how can I assign a new EC2 instance to a specific cluster?
When I tried to just start my task via the RunTask method (hoping that AWS would somehow create instances for me), I received an error:
InvalidParameterException: No Container Instances were found in your cluster.
I actually can't sort out which question you are asking. Do you need to add containers to the cluster, or add instances to the cluster? Those are very different.
Add instances to the cluster
This is not done with the ECS API; it is done with the EC2 API, by creating EC2 instances with the correct ecsInstanceRole. See the Launching an Amazon ECS Container Instance documentation for more information, and the sketch below.
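The key detail is that the ECS agent (already baked into the ECS-optimized AMI) reads /etc/ecs/ecs.config at boot, so you can point an instance at a specific cluster by passing that same snippet as user data when launching the instance, via RunInstances in any SDK. A rough CLI equivalent, with placeholder AMI and instance values:

# Launch a container instance that registers itself with
# "test-cluster" instead of "default"; the AMI must be ECS-optimized.
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --count 1 \
  --instance-type t2.micro \
  --iam-instance-profile Name=ecsInstanceRole \
  --user-data '#!/bin/bash
echo ECS_CLUSTER=test-cluster >> /etc/ecs/ecs.config'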
Add containers to the cluster
This is done by defining a task definition, then running those tasks manually or as services. See the Amazon ECS Task Definitions documentation for more information; a minimal example follows.
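A sketch of running two copies of a registered task definition on a named cluster (the names are placeholders):

# Run 2 instances of the task on the cluster's registered
# container instances.
aws ecs run-task \
  --cluster test-cluster \
  --task-definition my-task:1 \
  --count 2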

Autoscaling a running Hadoop cluster setup on AWS EC2

My goal is to understand how I can auto-scale a Hadoop cluster on AWS EC2.
I am exploring AWS offerings from an elastic-scaling perspective, for both Hadoop as a service (EMR) and Hadoop on EC2.
For EMR, I gathered that using CloudWatch, performance metrics can be monitored and the user alerted once they reach a set threshold; the cluster can then be scaled up or down depending on its utilization state.
This approach would require some custom implementation to automate the steps (correct me if I am missing anything here).
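For illustration, the custom piece might start with a CloudWatch alarm on an EMR metric such as YARNMemoryAvailablePercentage, wired to whatever automation resizes the instance group (the cluster ID and SNS topic below are placeholders):

# Alarm when available YARN memory stays below 15% for 10 minutes;
# the alarm action can then trigger the scaling automation.
aws cloudwatch put-metric-alarm \
  --alarm-name emr-low-yarn-memory \
  --namespace AWS/ElasticMapReduce \
  --metric-name YARNMemoryAvailablePercentage \
  --dimensions Name=JobFlowId,Value=j-1ABCDEFGHIJKL \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 15 \
  --comparison-operator LessThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:scale-out-topic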
For Hadoop on EC2, I came across the Auto Scaling option, which can add or remove instances according to configured scaling policies.
But I am not clear on how a newly added node would get bootstrapped into the cluster automatically. How would YARN know that it can spawn a new container on this newly added node?
Does auto scaling work for a master-slave kind of setup as well, or is it limited to web applications?
There is also Qubole, which offers services to manage Hadoop on AWS... should that be used to automatically manage scaling of the cluster?

Possibility of taking snapshot of AWS EMR cluster or namenode

I am new to AWS services and am trying out some use cases. I want to create EMR clusters on demand with some predefined configurations and applications/scripts installed. I was planning to create a snapshot of an existing EMR cluster, or at least of the namenode, and then use it whenever I want to create other clusters. But after some Google searching, I couldn't find any way to capture a snapshot of an EMR cluster. Is it possible to create a snapshot, or is there any other way to address this use case?
Appreciate any kind of help.
Thanks
It is not possible to create a snapshot of an EMR cluster node, and you cannot use a custom AMI when running a cluster. However, you can install software on the cluster nodes at cluster creation time using custom bootstrap actions. You can create your own bootstrap scripts and use them every time you launch a new cluster. This way you can achieve functionality similar to what you are seeking; a minimal example follows.
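A sketch of launching a cluster with a custom bootstrap action (the bucket, script, key pair, and sizing are placeholders):

# Launch an EMR cluster that runs a custom install script on
# every node before the cluster applications start.
aws emr create-cluster \
  --name "my-preconfigured-cluster" \
  --release-label emr-5.20.0 \
  --applications Name=Hadoop \
  --instance-type m4.large \
  --instance-count 3 \
  --use-default-roles \
  --ec2-attributes KeyName=my-key \
  --bootstrap-actions Path=s3://my-bucket/install-my-apps.sh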
For more information on using bootstrap actions on EMR, please visit: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-bootstrap.html#bootstrapCustom
Let us know if you need any further assistance.