How to open master node of aws ec2 in ssh - amazon-web-services

I have created a Kubernetes cluster using kops and kubectl from the main EC2 instance, and the master and child nodes were created automatically.
kops create cluster \
--state=${KOPS_STATE_STORE} \
--node-count=2 \
--master-size=t2.medium \
--node-size=t2.medium \
--zones=ap-south-1a,ap-south-1b \
--name=${KOPS_CLUSTER_NAME} \
--dns private \
--master-count 1
I am able to connect to the kops EC2 instance (main) from Git Bash or directly, but I am not able to open the master instance either way.
ssh -i key1.pem ec2-user@kops-ip   # works for the kops instance
While connecting to the master node I get:
There was a problem setting up the instance connection
Log in failed. If this instance has just started up, try again in a minute or two.
My questions are:
1. How do I open (SSH into) the master EC2 instance?
2. Do I need to install the Kubernetes dashboard on the master or on the kops instance (which I currently have)?
AWS Instances:
kops(ec2-user)
master-ap-south-1a.masters.a.com
nodes.a.com
nodes.a.com
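On the SSH question: a kops-provisioned master accepts the SSH public key that kops was given at cluster creation (~/.ssh/id_rsa.pub by default, or whatever --ssh-public-key pointed to), and the login user is set by the AMI rather than being ec2-user (admin on the default Debian image, ubuntu on Ubuntu images). A minimal sketch, with the master's public IP as a placeholder:
ssh -i ~/.ssh/id_rsa admin@<master-public-ip>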

Related

How to recover Kubernetes cluster with KOPS?

We were trying to upgrade the kops version of the Kubernetes cluster. We followed the steps below:
Download the latest kops version, 1.24 (the old version is 1.20)
Make the template changes required for 1.24
Set the environment variables
export KUBECONFIG="<<Kubeconfig file>>"
export AWS_PROFILE="<< AWS PROFILE NAME >>"
export AWS_DEFAULT_REGION="<< AWS Region >>"
export KOPS_STATE_STORE="<< AWS S3 Bucket Name >>"
export NAME="<< KOPS Cluster Name >>"
kops get $NAME -o yaml > existing-cluster.yaml
kops toolbox template --template templates/tm-eck-mixed-instances.yaml --values values_files/values-us-east-1.yaml --snippets snippets --output cluster.yaml --name $NAME
kops replace -f cluster.yaml
kops update cluster --name $NAME
kops rolling-update cluster --name $NAME --instance-group=master-us-east-1a --yes --cloudonly
Once the master was rolled, I noticed that it had not joined the cluster.
After a few rounds of troubleshooting, I found the below error in the API server.
I0926 09:54:41.220817 1 flags.go:59] FLAG: --vmodule=""
I0926 09:54:41.223834 1 dynamic_serving_content.go:111] Loaded a new cert/key pair for "serving-cert::/srv/kubernetes/kube-controller-manager/server.crt::/srv/kubernetes/kube-controller-manager/server.key"
unable to load configmap based request-header-client-ca-file: Get "https://127.0.0.1/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 127.0.0.1:443: connect: connection refused
I tried to resolve this issue and couldn't find a way, so I decided to roll back using a backup. These are the steps I followed:
kops replace -f cluster.yaml
kops update cluster --name $NAME
kops rolling-update cluster --name $NAME --instance-group=master-us-east-1a --yes --cloudonly
Still, I'm getting the same error on the master node.
Does anyone know how I can restore the cluster using kops?
After a few rounds of troubleshooting, I found that whenever we deploy a new version using kops, it creates a new version of the launch template in AWS. I manually changed the launch template version used by the Auto Scaling group of every node group; the cluster then rolled back to the previous state and started working properly. I then reran the upgrade process after adding the missing configuration to the kops template file.
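For reference, a rough sketch of that manual rollback with the AWS CLI; the Auto Scaling group name follows the usual kops naming convention and the launch template ID and version are placeholders:
# find the launch template currently attached to the master's Auto Scaling group
aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names "master-us-east-1a.masters.${NAME}"
# pin the group back to the previous launch template version
aws autoscaling update-auto-scaling-group \
--auto-scaling-group-name "master-us-east-1a.masters.${NAME}" \
--launch-template "LaunchTemplateId=lt-0123456789abcdef0,Version=1"
New instances launched by the group (or by a subsequent rolling update) then come up from the pinned version.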

Docker service update downtime

I have an AWS EC2 instance running a Docker service.
The service has just one container, and when I update the container (changing the image), I get about a minute of downtime.
This is my docker service create command:
docker service create \
--name service-$IMAGE_NAME \
--publish 80:80 \
--env ENVIRONMENT=$(cat /etc/service_environment) \
--env-file=/etc/.env \
--replicas=1 \
--update-failure-action rollback \
--update-order start-first \
$ECR_IMAGE
Here is the update command:
#pull image from private ECR repository
docker pull $IMAGE
docker service update \
--force \
--image $IMAGE:latest \
--update-failure-action rollback \
--update-order start-first \
service-$IMAGE_NAME
Why does this happen? What's wrong?
Thank you
Changing the image means the existing container is stopped and a new one is started; that is why the service is down for a while, in your case about a minute, which is the time the new container takes to come up. You can make this seamless by using two EC2 instances behind a load balancer: update one instance while traffic points at the other, then update the second instance once the first one has been updated successfully.
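A rough sketch of that flip with the AWS CLI, assuming the two instances sit behind an ALB/NLB target group; $TARGET_GROUP_ARN, $INSTANCE_A and $INSTANCE_B are placeholders:
# take instance A out of rotation while it is updated
aws elbv2 deregister-targets --target-group-arn $TARGET_GROUP_ARN --targets Id=$INSTANCE_A
# run the docker service update on instance A, then put it back
aws elbv2 register-targets --target-group-arn $TARGET_GROUP_ARN --targets Id=$INSTANCE_A
# repeat the same two steps for instance B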

How do I configure Kubernetes node labels when creating a cluster with kops?

When creating a Kubernetes cluster on AWS with kops version 1.6.2, how do I configure Kubernetes labels for nodes? My specific scenario is that I need to set the label beta.kubernetes.io/fluentd-ds-ready as true, because otherwise Fluentd pods won't be scheduled.
My current kops command for creating a cluster looks as follows:
kops --state s3://example.com create cluster \
--zones eu-central-1a,eu-central-1b,eu-central-1c \
--master-zones eu-central-1a,eu-central-1b,eu-central-1c \
--topology private --networking flannel --master-size m4.large \
--node-size m4.large --node-count 2 --bastion --cloud aws \
--ssh-public-key id_rsa.pub --authorization RBAC --yes \
production.example.com
How do I also configure kops to set the label beta.kubernetes.io/fluentd-ds-ready=true on created Kubernetes nodes?
From https://github.com/kubernetes/kops/issues/742
You can create a yaml file with the cluster definition and the labelling there. On an existing cluster you can then do a rolling update.
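As a sketch of that approach with the cluster from the question: kops instance groups have a nodeLabels field, so the label can be added to the instance group spec and rolled out. The instance group name nodes is the kops default and may differ in your cluster:
kops --state s3://example.com edit ig nodes --name production.example.com
# add under spec:
#   nodeLabels:
#     beta.kubernetes.io/fluentd-ds-ready: "true"
kops --state s3://example.com update cluster production.example.com --yes
kops --state s3://example.com rolling-update cluster production.example.com --yes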

AWS Aurora: how to restore a db cluster snapshot via aws cli?

It's pretty easy via the console, but I need to do the same from the CLI.
First I created a db snapshot:
aws rds create-db-cluster-snapshot \
--db-cluster-snapshot-identifier $SNAPSHOT_ID \
--db-cluster-identifier $CLUSTER
$CLUSTER contains only one writer instance.
I did not use the create-db-snapshot method because it threw an error:
A client error (InvalidParameterValue) occurred when calling the CreateDBSnapshot operation: The specified instance is a member of a cluster and a snapshot cannot be created directly. Please use the CreateDBClusterSnapshot API instead.
It works:
aws rds create-db-cluster-snapshot \
--db-cluster-snapshot-identifier $SNAPSHOT_ID \
--db-cluster-identifier $CLUSTER
{
    "DBClusterSnapshot": {
        "Engine": "aurora",
        "SnapshotCreateTime": "2016-12-08T11:48:07.534Z",
        ...
    }
}
So, I wanted to restore a new Aurora cluster from the snapshot, then I tried:
aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier from-snap2 \
--db-snapshot-identifier snap2
A client error (DBSnapshotNotFound) occurred when calling the RestoreDBInstanceFromDBSnapshot operation: DBSnapshot not found: snap2
So I tried to restore with:
aws rds restore-db-cluster-from-snapshot \
--db-cluster-identifier from-snap2 \
--snapshot-identifier snap2 \
--engine aurora \
--vpc-security-group-ids $PREPROD_SG \
--db-subnet-group-name my-db-subnet-group
It works...
{
    "DBCluster": {
        ...
        "EngineVersion": "5.6.10a",
        "DBClusterIdentifier": "from-snap2",
        ...
        "DBClusterMembers": [],
        ...
    }
}
But why does the cluster not contain any Aurora instance?
Where is the mistake?
This is very counterintuitive. If you restore a cluster from a snapshot, but there are no member instances in the cluster, what operation has actually succeeded? It seems as if all this does is create some kind of logical entity, maybe the backing store, but no instances.
Strange. But, the API documentation does show the cluster members as an empty set in the example response.
<DBClusterMembers/>
So it seems you create a cluster, as you did, then you apparently create instances in the cluster, as explained in an AWS Forum post:
aws rds create-db-instance --db-instance-identifier my-instance --db-instance-class db.r3.large --engine aurora --db-subnet-group-name default-vpc-xxxxxx --db-cluster-identifier my-instance-cluster
https://forums.aws.amazon.com/thread.jspa?messageID=688727
Apparently the console encapsulates multiple API requests behind the same action.
Response from AWS Support:
This is a known issue when using the API calls and our engineers are working on it. Even if the cluster is visible on the AWS Console after creation via the CLI, it will not create any instance automatically in your Aurora cluster. In this case, you will need to create a DB instance and associate it with your newly restored cluster. When performing this action in the AWS Console a new instance is automatically created for the cluster, but the action from the CLI uses separate API calls.
The following documentation provides detailed information on how to create a DB instance:
http://docs.aws.amazon.com/cli/latest/reference/rds/create-db-instance.html
You can describe your clusters using the AWS Console or using the CLI:
http://docs.aws.amazon.com/cli/latest/reference/rds/describe-db-clusters.html
Here is a command line example that creates the instance and associate it to a fictional cluster:
aws rds create-db-instance --engine aurora --db-cluster-identifier yourauroraclusteridentifier --db-instance-class db.t2.medium --db-instance-identifier yourinstanceidentifier
In my case, --db-cluster-identifier is the cluster created from the cluster snapshot.
If you create a snapshot with aws rds create-db-cluster-snapshot, then you can't restore it with aws rds restore-db-instance-from-db-snapshot: the first creates a cluster snapshot and the second restores an instance (DB) snapshot, which are different types.
From your question it looks like your restore is correct; maybe you need to specify --database-name. You could also try the restore with only the required parameters, i.e. no VPC security group or DB subnet group.
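Putting the thread together, a minimal end-to-end sketch for restoring the cluster snapshot and then giving the new cluster a writer instance; the instance identifier and class here are placeholders:
aws rds restore-db-cluster-from-snapshot \
--db-cluster-identifier from-snap2 \
--snapshot-identifier snap2 \
--engine aurora
aws rds create-db-instance \
--db-instance-identifier from-snap2-writer \
--db-instance-class db.r3.large \
--engine aurora \
--db-cluster-identifier from-snap2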

EC2 Instance creation fails due to VPC Issues

So I am following this link, Autoscale based on SQS queue size, to create an Auto Scaling group for my instances. I have read many articles about the problem I am getting, and many people hit the same issue, but theirs occurs when they try to use "t1.micro". I am using the "c4.xlarge" instance type and I already have a VPC defined for my image. Why am I still getting this error:
Launching a new EC2 instance. Status Reason: The specified instance
type can only be used in a VPC. A subnet ID or network interface ID is
required to carry out the request. Launching EC2 instance failed.
Does anybody have a solution for this?
You need to include VPC information in your scripts or init:
http://docs.aws.amazon.com/autoscaling/latest/userguide/asg-in-vpc.html
Not sure which SDK you are using, but with whichever SDK you have chosen, you need to specify the VPC subnets where the instances are launched.
When using the AWS CLI to create an ASG, you specify them with --vpc-zone-identifier.
Please check the link to documentation below:
http://docs.aws.amazon.com/cli/latest/reference/autoscaling/create-auto-scaling-group.html
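For example, a minimal sketch of creating the Auto Scaling group with subnets specified; the group name, launch configuration name, and subnet IDs are placeholders:
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name sqs-worker-asg \
--launch-configuration-name sqs-worker-lc \
--min-size 1 --max-size 4 --desired-capacity 1 \
--vpc-zone-identifier "subnet-0aaa1111,subnet-0bbb2222"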
Make sure you are defining the subnet ID in the CLI command.
Although the service is different, the AWS CLI generally follows the same syntax, so adjust this to any resource.
aws emr create-cluster \
--name "Test cluster" \
--release-label emr-4.2.0 \
--applications Name=Hadoop Name=Hive Name=Pig \
--use-default-roles \
--ec2-attributes KeyName=myKey,SubnetId=subnet-77XXXX03 \
--instance-type m4.large \
--instance-count 3