EKS Cluster in a private subnet - unhealthy nodes in the kubernetes cluster - amazon-web-services

I'm trying to create an EKS cluster in a private subnet and I'm having issues getting it working: I get the error "unhealthy nodes in the kubernetes cluster". I wonder if it's due to the security groups or something else like VPC endpoints?
When I use a NAT gateway setup it works fine, but I don't want to use a NAT gateway anymore.
One thing I'm not sure about: should the EKS cluster subnet_ids be only private subnets?
In the config below I'm using both public and private subnets.
resource "aws_eks_cluster" "main" {
name = var.eks_cluster_name
role_arn = aws_iam_role.eks_cluster.arn
vpc_config {
subnet_ids = concat(var.public_subnet_ids, var.private_subnet_ids)
security_group_ids = [aws_security_group.eks_cluster.id, aws_security_group.eks_nodes.id, aws_security_group.external_access.id]
endpoint_private_access = true
endpoint_public_access = false
}
# Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling.
# Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups.
depends_on = [
"aws_iam_role_policy_attachment.aws_eks_cluster_policy",
"aws_iam_role_policy_attachment.aws_eks_service_policy"
]
}

Since you don't have a NAT gateway/instance, your nodes can't connect to the internet, so they fail because they can't "communicate with the control plane and other AWS services" (from here).
Thus, you can use VPC endpoints to enable communication with the control plane and those services. To see a properly set up VPC with private subnets for EKS, you can check the AWS-provided VPC template for EKS (from here).
From the template, the VPC endpoints in us-east-1:
com.amazonaws.us-east-1.ec2
com.amazonaws.us-east-1.ecr.api
com.amazonaws.us-east-1.s3
com.amazonaws.us-east-1.logs
com.amazonaws.us-east-1.ecr.dkr
com.amazonaws.us-east-1.sts
Please note that all these endpoints, except S3, are not free. So you have to consider whether running a cheap NAT instance or gateway would be cheaper or more expensive than maintaining these endpoints.
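For reference, here is a minimal Terraform sketch of what provisioning those endpoints can look like. The VPC, security group and route table references, and the variable names, are assumptions; adjust them to your own setup:

locals {
  # Interface endpoints the worker nodes typically need in a fully private subnet.
  eks_interface_services = ["ec2", "ecr.api", "ecr.dkr", "logs", "sts"]
}

resource "aws_vpc_endpoint" "eks_interface" {
  for_each            = toset(local.eks_interface_services)
  vpc_id              = var.vpc_id # assumed variable
  service_name        = "com.amazonaws.us-east-1.${each.value}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [aws_security_group.vpc_endpoints.id] # must allow 443 from the nodes
  private_dns_enabled = true
}

# S3 is a Gateway endpoint attached to the private route tables (no hourly charge).
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.us-east-1.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids # assumed variable
}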

Related

AWS Lambda-RabbitMQ event mapping using VPC endpoints

TL;DR
Trying to create a Lambda trigger on an Amazon MQ (RabbitMQ) queue using private subnets and VPC endpoints does not work.
POC Goal
I'm doing this POC:
An Amazon MQ (RabbitMQ) broker in a private subnet and a Lambda triggered by incoming messages to the queue.
Disclaimer
Everything I state here is what I'm learning as I go; any correction will be appreciated.
On networking
Since Amazon MQ is an AWS-managed service, it runs in its own network. So, when we ask AWS to place the broker in a subnet, a network interface is created for the broker in that subnet, giving the broker access and reachability within the subnet.
Something similar goes for Lambda: a network interface gives the Lambda access to the subnet. But to invoke this Lambda, since the invoking endpoints live outside our subnet, we need to create a VPC endpoint that exposes the Lambda service endpoints inside the subnet.
The other option is to give the broker public access (creating public NAT gateways) so it can reach the public Lambda endpoints.
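For illustration, the Lambda interface endpoint follows the same pattern as the STS/Secrets Manager endpoints shown further down; a minimal sketch reusing the same (assumed) module and security group names:

resource "aws_vpc_endpoint" "lambda_endpoint" {
  vpc_id              = module.red.vpc_id
  service_name        = "com.amazonaws.${var.region}.lambda"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [module.red.private_subnets[0]]
  security_group_ids  = [aws_security_group.sg-endpoint.id]
  private_dns_enabled = true
}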
The problem
Simply put, it doesn't work with the VPC endpoints option (it does with the public NAT gateways).
Here is the code I'm using: https://gitlab.com/templates14/terraform-templates/-/tree/master/lambda_rabbitmq_trigger
If you want to test just change the AWS account here:
# here using an AWS profile of my own, change it
provider "aws" {
  region  = "us-east-1"
  profile = "myown-terraform"
}
Analysis
As far as I can tell, the broker and the Lambda have their network interfaces in the same subnet, the security groups are OK (they allow the needed traffic), and the VPC endpoint is created. But the event source mapping (aka the trigger, created manually or with Terraform) never manages to complete its configuration.
As @jarmod mentioned (thanks for this), I missed the VPC endpoints for STS and Secrets Manager.
Basically, the solution was OK, but this had to be added:
resource "aws_vpc_endpoint" "sts_endpoint" {
vpc_id = module.red.vpc_id
service_name = "com.amazonaws.${ var.region }.sts"
vpc_endpoint_type = "Interface"
subnet_ids = [module.red.private_subnets[0]]
security_group_ids = [ aws_security_group.sg-endpoint.id ]
private_dns_enabled = true
}
resource "aws_vpc_endpoint" "secretsmanager_endpoint" {
vpc_id = module.red.vpc_id
service_name = "com.amazonaws.${ var.region }.secretsmanager"
vpc_endpoint_type = "Interface"
subnet_ids = [module.red.private_subnets[0]]
security_group_ids = [ aws_security_group.sg-endpoint.id ]
private_dns_enabled = true
}
This is the final diagram:
Here's the code if you want to play with it: https://gitlab.com/templates14/terraform-templates/-/tree/master/lambda_rabbitmq_trigger
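For context, the trigger itself is an aws_lambda_event_source_mapping. Roughly, a sketch for a RabbitMQ source looks like this; the broker, function, queue and secret names are assumptions:

resource "aws_lambda_event_source_mapping" "rabbitmq_trigger" {
  event_source_arn = aws_mq_broker.broker.arn         # assumed broker resource
  function_name    = aws_lambda_function.consumer.arn # assumed Lambda resource
  queues           = ["my-queue"]                     # assumed queue name
  batch_size       = 1

  # Broker credentials come from a Secrets Manager secret, which is why the
  # SecretsManager (and STS) endpoints are needed for the mapping to activate.
  source_access_configuration {
    type = "BASIC_AUTH"
    uri  = aws_secretsmanager_secret.broker_credentials.arn # assumed secret
  }

  source_access_configuration {
    type = "VIRTUAL_HOST"
    uri  = "/"
  }
}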

Terraform on GCP: How to whitelist my cluster nodes for RDS

I have a Kubernetes cluster and an RDS instance configured in Terraform, and now I want to whitelist the node IPs for the RDS. Is there a way to somehow access the node-pool instances from the cluster config? What I basically want for the RDS config is something like:
ip_configuration {
  dynamic "authorized_networks" {
    for_each = google_container_cluster.data_lake.network
    iterator = node
    content {
      name  = node.network.ip
      value = node.network.ip
    }
  }
}
But from what I see there seems to be no way to get a list of the nodes / their IPs.
I tried:
ip_configuration {
  authorized_networks {
    value = google_container_cluster.my_cluster.cluster_ipv4_cidr
  }
}
which resulted in "Non-routable or private authorized network (10.80.0.0/14)..., invalid", so it looks like this only works with public IPs. Or do I have to set up a separate VPC for that?
I would suggest you first set up a NAT gateway in front of GKE so that you can manage all your outgoing traffic through a single egress point.
You can use this Terraform module to create and set up the NAT gateway: https://registry.terraform.io/modules/GoogleCloudPlatform/nat-gateway/google/latest/examples/gke-nat-gateway
Module source code: https://github.com/GoogleCloudPlatform/terraform-google-nat-gateway/tree/v1.2.3/examples/gke-nat-gateway
With a NAT gateway, all your nodes' traffic goes out through a single IP, and you can whitelist that single IP in RDS.
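As a managed alternative to the instance-based NAT module above, Cloud NAT with a reserved static address gives the same single egress IP. A minimal sketch; the region and resource names are assumptions, and the cluster reference matches the one in the question:

resource "google_compute_address" "nat_ip" {
  name   = "gke-nat-ip"
  region = "us-central1" # assumed region
}

resource "google_compute_router" "router" {
  name    = "gke-router"
  region  = "us-central1"
  network = google_container_cluster.data_lake.network
}

resource "google_compute_router_nat" "nat" {
  name                               = "gke-nat"
  router                             = google_compute_router.router.name
  region                             = google_compute_router.router.region
  nat_ip_allocate_option             = "MANUAL_ONLY"
  nat_ips                            = [google_compute_address.nat_ip.self_link]
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}

The address in google_compute_address.nat_ip is then the single IP to whitelist on the RDS side.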
Since RDS is an AWS service, VPC peering is not possible; if you were using GCP Cloud SQL, that would also be an option.

VPC Endpoint: Specific Services Not Available in Availability Zone

When I attempt to create a VPC Endpoint for the com.amazonaws.us-east-1.lambda (lambda service), the "us-east-1a" Availability Zone is not an option. However, when I choose a different service, like "com.amazonaws.us-east-1.rds", I can choose a subnet in the "us-east-1a" Availability Zone.
I am creating VPC endpoints via CloudFormation template, but also confirmed this occurs when creating via the UI.
I have been reviewing AWS documentation and also previous questions, but I cannot determine why this is occurring and how to fix this so we can select the subnets in that AZ for that VPC endpoint. Any guidance is appreciated.
Screenshot of attempting to create VPC endpoint for lambda with us-east-1a not allowed:
Screenshot of attempting to create VPC endpoint for another service:
You can run the following CLI command to check, for a given service, which Availability Zones are available for creating a VPC endpoint.
aws ec2 describe-vpc-endpoint-services --service-names SERVICE-NAME
Example for Lambda:
aws ec2 describe-vpc-endpoint-services --service-names com.amazonaws.us-east-1.lambda
{
    "ServiceDetails": [
        {
            "ServiceName": "com.amazonaws.us-east-1.lambda",
            "AvailabilityZones": [
                "us-east-1a",
                "us-east-1b",
                "us-east-1c"
            ]
            ....
}
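If you only need the AZ list, the standard --query option trims the output down:

aws ec2 describe-vpc-endpoint-services \
  --service-names com.amazonaws.us-east-1.lambda \
  --query 'ServiceDetails[0].AvailabilityZones'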
Why can’t I select an Availability Zone for my Amazon VPC interface endpoint?
https://aws.amazon.com/premiumsupport/knowledge-center/interface-endpoint-availability-zone/

Terraform, EKS and a aurora-mysql serverless RDS - subnets in same AZ

I started with Terraform a while back, and I've been working on an AWS dev environment where I need to put up EKS and an aurora-mysql serverless RDS, and get them to talk to one another.
I used the excellent examples here:
https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/managed_node_groups
and here:
https://github.com/terraform-aws-modules/terraform-aws-rds-aurora/tree/master/examples/serverless (this is actually set to put up an aurora-mysql serverless DB, not postgres as advertised, but mysql is what I'm looking for, so cheers).
So far so good; the serverless example uses the default VPC and that's fine for playing around. But I want to either:
1. Create the RDS in the same VPC as the EKS to simplify networking:
Towards that end, I added the contents of ....terraform-aws-rds-aurora/examples/serverless/main.tf to ....terraform-aws-eks/examples/managed_node_groups/main.tf, copied the tf files from ....terraform-aws-rds-aurora into a folder, and set it like so:
module "aurora" {
source = "../../modules/aurora"
and replaced:
data.aws_vpc.default.id
with
module.vpc.vpc_id
and I got:
Error: error creating RDS cluster: InvalidParameterValue: Aurora Serverless doesn't support DB subnet groups with subnets in the same Availability Zone. Choose a DB subnet group with subnets in different Availability Zones.
status code: 400, request id: 7d2e359f-6609-4dde-b63e-11a16d1efaf2
on ../../modules/aurora/main.tf line 33, in resource "aws_rds_cluster" "this":
33: resource "aws_rds_cluster" "this" {
Fair is fair; I read some more and realized that I might prefer a different VPC for EKS and RDS so that each has redundancy across all AZs in us-west-2. So now I tried:
2. Creating a new VPC for RDS:
I went back to ..../terraform-aws-rds-aurora/tree/master/examples/serverless/main.tf, and set:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 2.6"
name = "${var.env}-mysql-vpc"
cidr = "172.16.0.0/16"
azs = data.aws_availability_zones.available.names
private_subnets = ["172.16.7.0/24", "172.16.8.0/24", "172.16.9.0/24"]
public_subnets = ["172.16.10.0/24", "172.16.11.0/24", "172.16.12.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
}
data "aws_vpc" "created" {
id = module.vpc.vpc_id
}
data "aws_subnet_ids" "all" {
vpc_id = data.aws_vpc.created.id
}
and got the same message!
I'm stumped. I don't want to use the default VPC for RDS, and eventually I'll want to edit the VPC for security/configurations.
My questions are:
Is it possible and practical for both EKS and RDS to live together in the same VPC?
Seeing that the example runs fine with the default VPC, what am I missing with the VPC creation for RDS?
Can Terraform create an "empty" VPC and the aurora module will then create subnets in it? Or is there a simple way for me to then create the missing subnets (while specifying the AZ for each) and the rest of the VPC requirements for serverless?
I realize that this falls between AWS and Terraform, but will appreciate your help.
Thanks to @mokugo-devops's comments I was able to create a new VPC where each subnet had a different AZ. But as it turns out, EKS and Aurora Serverless can live in the same VPC: I just needed to take only the public subnets (which terraform-aws-modules/vpc/aws creates in different AZs) for serverless, and have the module "aurora" read them like so:
module "aurora" {
source = "../../modules/aurora"
name = "aurora-serverless"
engine = "aurora"
engine_mode = "serverless"
replica_scale_enabled = false
replica_count = 0
backtrack_window = 10 # ignored in serverless
subnets = module.vpc.public_subnets
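For what it's worth, the private subnets created by the same VPC module also land in different AZs, so under that assumption the following one-line swap works too and keeps the database off the public subnets:

  subnets = module.vpc.private_subnets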

EKS cluster - api endpoint access - public/private

I have provisioned an EKS cluster on AWS with public access to the API endpoint. While doing so, I configured the security group with ingress only from a specific IP, but I could still run kubectl get svc against the cluster when accessing it from another IP.
I want IP-restricted access to the EKS cluster.
ref - Terraform - Master cluster SG
If public access is enabled, does it mean that anyone who has the cluster name can deploy anything?
When you create a new cluster, Amazon EKS creates an endpoint for the managed Kubernetes API server that you use to communicate with your cluster (using Kubernetes management tools such as kubectl as you have done).
By default, this API server endpoint is public to the internet, and access to the API server is secured using a combination of AWS Identity and Access Management (IAM) and native Kubernetes Role Based Access Control (RBAC).
So the public access does not mean that anyone who has the cluster name can deploy anything. You can read more about that in the Amazon EKS Cluster Endpoint Access Control AWS documentation.
If you want to provision EKS with Terraform and manage the network topology, that happens through the VPC (Virtual Private Cloud). You can check this VPC Terraform module to get all the proper settings.
Hope it helps.
As well as Claire Bellivier's answer about how EKS clusters are protected via authentication using IAM and RBAC, you can now also configure your EKS cluster to be accessible only from private networks such as the VPC the cluster resides in or any peered VPCs.
This has been added in the (as yet unreleased) 2.3.0 version of the AWS provider and can be configured as part of the vpc_config block of the aws_eks_cluster resource:
resource "aws_eks_cluster" "example" {
name = %[2]q
role_arn = "${aws_iam_role.example.arn}"
vpc_config {
endpoint_private_access = true
endpoint_public_access = false
subnet_ids = [
"${aws_subnet.example.*.id[0]}",
"${aws_subnet.example.*.id[1]}",
]
}
}
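If you would rather keep the public endpoint but restrict it by IP (as the question asks), more recent versions of the AWS provider also expose public_access_cidrs in the same vpc_config block. A minimal sketch with a placeholder CIDR:

resource "aws_eks_cluster" "example" {
  name     = "example"
  role_arn = aws_iam_role.example.arn

  vpc_config {
    endpoint_private_access = true
    endpoint_public_access  = true
    # Only this CIDR (placeholder) may reach the public API endpoint.
    public_access_cidrs = ["203.0.113.0/24"]
    subnet_ids          = aws_subnet.example.*.id
  }
}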