AWS Lambda-RabbitMQ event mapping using VPC endpoints - amazon-web-services

TL;DR
Creating a Lambda trigger on an Amazon MQ (RabbitMQ) queue using private subnets and VPC endpoints does not work.
POC Goal
I'm doing this POC:
An Amazon MQ (RabbitMQ) broker in a private subnet and a Lambda function triggered by incoming messages to the queue.
Disclaimer
Everything I state here is what I'm learning as I go; any correction will be appreciated.
On networking
Since Amazon MQ is an AWS-managed service, it runs in its own network. So, when we ask AWS to place the broker in a subnet, a network interface is created for the broker in that subnet, giving the broker access and reachability inside the subnet.
Something similar goes for Lambda: the network interface gives the function access to the subnet. But to invoke this Lambda, since the invoking endpoints live outside our subnet, we need to create a VPC endpoint that exposes the Lambda endpoints inside the subnet.
The other option is to give the broker public access (by creating public NAT gateways) so the broker can reach the public Lambda endpoints.
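For concreteness, a Lambda interface endpoint in Terraform looks roughly like the following sketch (the VPC module, subnet and security group names mirror the ones used in the fix further down and are assumptions about your own layout; the security group must allow HTTPS/443):
resource "aws_vpc_endpoint" "lambda_endpoint" {
  vpc_id              = module.red.vpc_id
  service_name        = "com.amazonaws.${var.region}.lambda"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = [module.red.private_subnets[0]]
  security_group_ids  = [aws_security_group.sg-endpoint.id]
  private_dns_enabled = true
}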
The problem
Simply put, it doesn't work with the VPC endpoint option (it does with the public NAT gateways).
Here is the code I'm using: https://gitlab.com/templates14/terraform-templates/-/tree/master/lambda_rabbitmq_trigger
If you want to test just change the AWS account here:
# here using an AWS profile of my own, change it
provider "aws" {
  region  = "us-east-1"
  profile = "myown-terraform"
}
Analysis
As far as I can tell, the broker and the Lambda function have their network interfaces in the same subnet, the security groups are OK (they allow the needed traffic), and the VPC endpoint is created. But the event source mapping (a.k.a. the trigger, whether created manually or with Terraform) can never complete its configuration.

As #jarmod mentioned (thanks for that), I had missed the VPC endpoints for STS and Secrets Manager.
Basically, the solution was OK, but this had to be added:
resource "aws_vpc_endpoint" "sts_endpoint" {
vpc_id = module.red.vpc_id
service_name = "com.amazonaws.${ var.region }.sts"
vpc_endpoint_type = "Interface"
subnet_ids = [module.red.private_subnets[0]]
security_group_ids = [ aws_security_group.sg-endpoint.id ]
private_dns_enabled = true
}
resource "aws_vpc_endpoint" "secretsmanager_endpoint" {
vpc_id = module.red.vpc_id
service_name = "com.amazonaws.${ var.region }.secretsmanager"
vpc_endpoint_type = "Interface"
subnet_ids = [module.red.private_subnets[0]]
security_group_ids = [ aws_security_group.sg-endpoint.id ]
private_dns_enabled = true
}
This is the final diagram:
Here's the code if you want to play with it: https://gitlab.com/templates14/terraform-templates/-/tree/master/lambda_rabbitmq_trigger

Related

VPC Endpoint: Specific Services Not Available in Availability Zone

When I attempt to create a VPC Endpoint for the com.amazonaws.us-east-1.lambda (lambda service), the "us-east-1a" Availability Zone is not an option. However, when I choose a different service, like "com.amazonaws.us-east-1.rds", I can choose a subnet in the "us-east-1a" Availability Zone.
I am creating VPC endpoints via CloudFormation template, but also confirmed this occurs when creating via the UI.
I have been reviewing AWS documentation and also previous questions, but I cannot determine why this is occurring and how to fix this so we can select the subnets in that AZ for that VPC endpoint. Any guidance is appreciated.
Screenshot of attempting to create VPC endpoint for lambda with us-east-1a not allowed:
Screenshot of attempting to create VPC endpoint for another service:
You can run the following CLI command to check, for a given service, which Availability Zones are available for creating a VPC endpoint.
aws ec2 describe-vpc-endpoint-services --service-names SERVICE-NAME
Example for Lambda:
aws ec2 describe-vpc-endpoint-services --service-names com.amazonaws.us-east-1.lambda
{
    "ServiceDetails": [
        {
            "ServiceName": "com.amazonaws.us-east-1.lambda",
            "AvailabilityZones": [
                "us-east-1a",
                "us-east-1b",
                "us-east-1c"
            ],
            ...
        }
    ]
}
Why can’t I select an Availability Zone for my Amazon VPC interface endpoint?
https://aws.amazon.com/premiumsupport/knowledge-center/interface-endpoint-availability-zone/
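If you prefer to keep this check inside Terraform, the same information is exposed by the aws_vpc_endpoint_service data source; a minimal sketch for the Lambda service used above:
data "aws_vpc_endpoint_service" "lambda" {
  service_name = "com.amazonaws.us-east-1.lambda"
}

output "lambda_endpoint_azs" {
  # Availability Zones in which an interface endpoint for Lambda can be placed
  value = data.aws_vpc_endpoint_service.lambda.availability_zones
}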

AWS Systems Manager - Instance not showing

Could anyone help me investigate an issue with an EC2 instance profile? I have created an EC2 instance and attached an IAM role to it.
But when I check the instance, I see: No roles attached to instance profile: xxx-instance-profile.
Any idea where I should look? When I check that instance profile (role), I have this in the trust relationship:
Trusted entities: the identity provider(s) ec2.amazonaws.com
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
I have attached one permission policy: AmazonSSMManagedInstanceCore.
When I go to my instance, I see that no roles are attached. And in Systems Manager -> Session Manager, I don't see my instances.
I have no clue what I'm doing wrong :(
Any suggestions?
I am not sure what you mean by an issue with the EC2 instance profile. Instance profiles are permission sets that you grant to an EC2 instance: you define a policy containing the required permissions, attach that policy to a role, and attach the role to the EC2 instance. Because the role will be used by a service, it must have a trust relationship with that service.
Which Systems Manager service do you want to use? You can create your own custom policy with specific services and restrictions to specific AWS instances. Or you can use the managed policies.
Here are some examples of various policies.
Let's suppose you want a role attached to an EC2 instance so that you can remotely login to that instance using Systems Manager Session Manager.
Let's assume the instance is in a VPC that has a route to the internet, either directly via the Internet Gateway or via a NAT Gateway to the Internet Gateway.
In this case, your instance has a route to the AWS public service endpoint for Systems Manager Session Manager. The instance must have the Systems Manager (SSM) agent installed. This agent is pre-installed on Amazon Linux 2, Amazon Linux, and Ubuntu 16.04, 18.04 and 20.04.
Assuming the agent is installed and there is a route to the service, your instance, as you mentioned, needs rights via IAM to access the service. This is done by granting a role to the EC2 instance.
To do this, go to IAM - https://console.aws.amazon.com/iam/.
Select Roles from the navigation panel and create a new role.
Select the type of trusted entity as AWS Service.
Choose the EC2 option under Common use cases.
Press Next: Permissions.
Here you can create a custom policy if you want; I suggest using a managed policy.
Select an existing managed policy by searching for AmazonEC2RoleforSSM. There are other SSM managed policies, but AmazonEC2RoleforSSM is specific to the management of EC2.
Select it and press Next: Tags.
Press Next: Review.
Give it a name - my-ec2-ssm-role.
Now we have a role for the EC2 instance; next we need to add that role to the instance.
Go to EC2 - https://console.aws.amazon.com/ec2
Select your instance.
From the menu on the top right, select Actions, Security, Modify IAM role.
Select the role you just created, my-ec2-ssm-role.
Press Save.
Now that the role is linked, go to Systems Manager Session Manager: https://console.aws.amazon.com/systems-manager/session-manager
Press Start session.
Your instance should be visible; you can select it and press Start session.
If your instance is not visible, it could be that you do not have a route to the AWS service endpoints, for example because the EC2 instance is not in a public subnet or does not have a route to the internet. In this case you need to add 3 VPC endpoints to your subnet. These endpoints are:
com.amazonaws.[region].ssm
com.amazonaws.[region].ssmmessages
com.amazonaws.[region].ec2messages
You can read how to set it up here.
After attaching the AmazonSSMManagedInstanceCore policy to an existing EC2 role, I had to reboot the EC2 instance before it showed up in Systems Manager. Thanks to #Jason who mentioned this in a comment.
You can run the AWSSupport-TroubleshootManagedInstance runbook to check what is missing in your instance's configuration.
If you need to make any change after troubleshooting, like adding an IAM role, make sure to restart the SSM agent on the EC2 instance in order to make it visible in the registered managed instances.
Answering "Systems Manager -> Session Manager, I don't see my instances" --Do you see your managed instances in Fleet Manager? One reason why Instances are not visible to the Systems manager is if the instance has no ssm agent installed. Eg: Ubuntu comes with ssm pre-installed but RHEL does not have ssm pre-installed. Check this out : https://aws.amazon.com/premiumsupport/knowledge-center/systems-manager-ec2-instance-not-appear/
Systems manager immediately showed my ubuntu instances, for RHEL instances I had to manually install ssm agent. https://docs.aws.amazon.com/systems-manager/latest/userguide/agent-install-rhel.html
This might be the reason why you cant see instances in session manager as well.
I had the same issue with all of my EC2 instances not showing up in Session Manager, even though they had the correct security/networking set up. It turns out I had to go to Systems Manager -> Session Manager -> Preferences and enable KMS encryption.
There are a few scenarios in which SSM can be deployed and break. All of this assumes you have the proper role attached to the VM.
resource "aws_iam_role" "ssm" {
  name = "myssm"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}

resource "aws_iam_role_policy_attachment" "ssm" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  role       = aws_iam_role.ssm.id
}

resource "aws_iam_instance_profile" "ssm" {
  name = "myssm"
  role = aws_iam_role.ssm.id
}

resource "aws_instance" "private" {
  ami                    = data.aws_ami.amzn2.id
  instance_type          = "t3.micro"
  subnet_id              = aws_subnet.private.id
  vpc_security_group_ids = [aws_security_group.test-ssm.id]
  iam_instance_profile   = aws_iam_instance_profile.ssm.name

  tags = {
    Name = "session-manager-private"
  }
}
public subnet with no public ip (internet access)
In this scenario, even though you have a VM in a public subnet with outbound access via the IGW, there's no public IP on the VM, so SSM will not work.
public subnet with public ip (internet access)
In this scenario, same as the previous one, the only difference being that I've added a public IP to the VM, and SSM kicks into life.
private subnet with public ip (internet access)
In this scenario, even though the VM is in a private subnet, it has outbound internet access via a public NAT gateway, which in turn has outbound access via the internet gateway. The NAT gateway pictured here has a public IP. When a NAT gateway with a public IP sits in front of a private subnet, those VMs use that public IP for outbound internet traffic, so SSM works.
Well, I wonder what's going to happen here? If you guessed absolutely nothing, you'd be right. There's no public IP, no route out of any kind, and no way in. It's a completely isolated tenant. So how do you get SSM working in this scenario? Next diagram.
private subnet with no public ip (no internet access)
In this instance, you need to add VPC endpoints - unsurprisingly, to the VPC - and then associate them with the private subnet you want to connect into. Endpoints are created at the VPC level and then "associated". The SSM endpoints are of type "Interface", so an ENI is created in that subnet for each endpoint, and a private DNS zone is set up so that the VM sends traffic to the local SSM ENIs and not to the AWS fabric globally.
Here's some Terraform to do it; the security group allows 443.
locals {
  endpoints = toset([
    "com.amazonaws.eu-west-2.ssm",
    "com.amazonaws.eu-west-2.ssmmessages",
    "com.amazonaws.eu-west-2.ec2messages"
  ])
}

resource "aws_vpc_endpoint" "endpoints" {
  for_each          = local.endpoints
  vpc_id            = data.aws_vpc.main.id
  service_name      = each.key
  vpc_endpoint_type = "Interface"

  security_group_ids = [
    aws_security_group.test-ssm.id
  ]

  private_dns_enabled = true
}

resource "aws_vpc_endpoint_subnet_association" "association" {
  for_each        = local.endpoints
  vpc_endpoint_id = aws_vpc_endpoint.endpoints[each.key].id
  subnet_id       = aws_subnet.completely-private.id
}
In my case, it took about 30 minutes for the EC2 instance to appear in Fleet Manager. I had an existing EC2 instance without any attached IAM service role. I created a new IAM role with the AmazonSSMManagedInstanceCore and AmazonEC2RoleforSSM permission policies, and in about 30 minutes my EC2 instance popped up in Fleet Manager. BTW, Windows EC2 instances also come with the SSM Agent preinstalled.

EKS Cluster in a private subnet - unhealthy nodes in the kubernetes cluster

I'm trying to create an EKS cluster in a private subnet, and I'm having issues getting it working. I get the error "unhealthy nodes in the kubernetes cluster". I wonder if it's due to security groups or some other issue like VPC endpoints?
When I use a NAT gateway setup it works fine, but I don't want to use a NAT gateway anymore.
One thing I'm not sure about: should the EKS cluster subnet_ids be only private subnets?
In the below config I'm using both public and private subnets.
resource "aws_eks_cluster" "main" {
name = var.eks_cluster_name
role_arn = aws_iam_role.eks_cluster.arn
vpc_config {
subnet_ids = concat(var.public_subnet_ids, var.private_subnet_ids)
security_group_ids = [aws_security_group.eks_cluster.id, aws_security_group.eks_nodes.id, aws_security_group.external_access.id]
endpoint_private_access = true
endpoint_public_access = false
}
# Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling.
# Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups.
depends_on = [
"aws_iam_role_policy_attachment.aws_eks_cluster_policy",
"aws_iam_role_policy_attachment.aws_eks_service_policy"
]
}
Since you don't have a NAT gateway/instance, your nodes can't connect to the internet and fail because they can't "communicate with the control plane and other AWS services" (from here).
Thus, you can use VPC endpoints to enable communication with the control plane and those services. To see a properly set up VPC with private subnets for EKS, you can check the AWS-provided VPC template for EKS (from here).
From the template, the VPC endpoints in us-east-1:
com.amazonaws.us-east-1.ec2
com.amazonaws.us-east-1.ecr.api
com.amazonaws.us-east-1.s3
com.amazonaws.us-east-1.logs
com.amazonaws.us-east-1.ecr.dkr
com.amazonaws.us-east-1.sts
Please note that all these endpoints, except S3, are not free. So you have to consider whether running cheap NAT instances or a NAT gateway would be cheaper or more expensive than maintaining these endpoints.
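If you go the endpoint route, a minimal Terraform sketch of the interface endpoints could look like this (the VPC, subnet and security group references are placeholders for your own resources; S3 is normally added as a Gateway endpoint on the private route tables rather than an interface endpoint):
locals {
  eks_endpoints = toset([
    "com.amazonaws.us-east-1.ec2",
    "com.amazonaws.us-east-1.ecr.api",
    "com.amazonaws.us-east-1.ecr.dkr",
    "com.amazonaws.us-east-1.logs",
    "com.amazonaws.us-east-1.sts",
  ])
}

resource "aws_vpc_endpoint" "eks_private" {
  for_each            = local.eks_endpoints
  vpc_id              = var.vpc_id               # placeholder
  service_name        = each.key
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids   # placeholder
  security_group_ids  = [var.endpoint_sg_id]     # placeholder; must allow 443 from the nodes
  private_dns_enabled = true
}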

Terraform, EKS and an aurora-mysql serverless RDS - subnets in same AZ

I started with Terraform a while back, and I've been working on an AWS dev environment where I need to put up EKS and an aurora-mysql serverless RDS cluster and get them to talk to one another.
I used the excellent examples here:
https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/managed_node_groups
and here:
https://github.com/terraform-aws-modules/terraform-aws-rds-aurora/tree/master/examples/serverless (this one actually sets up an aurora-mysql serverless DB, not postgres as advertised, but mysql is what I'm looking for, so cheers).
So far so good. The serverless example uses the default VPC and that's fine for playing around. But I want to either:
1. Create the RDS in the same VPC as the EKS cluster to simplify networking:
Towards that end, I added the contents of ....terraform-aws-rds-aurora/examples/serverless/main.tf to ....terraform-aws-eks/examples/managed_node_groups/main.tf, copied the tf files from ....terraform-aws-rds-aurora into a folder, and set it like so:
module "aurora" {
source = "../../modules/aurora"
and replaced:
data.aws_vpc.default.id
with
module.vpc.vpc_id
and I got:
Error: error creating RDS cluster: InvalidParameterValue: Aurora Serverless doesn't support DB subnet groups with subnets in the same Availability Zone. Choose a DB subnet group with subnets in different Availability Zones.
status code: 400, request id: 7d2e359f-6609-4dde-b63e-11a16d1efaf2
on ../../modules/aurora/main.tf line 33, in resource "aws_rds_cluster" "this":
33: resource "aws_rds_cluster" "this" {
Fair is fair: I read up some more and realized that I might actually prefer separate VPCs for EKS and RDS, so that each has redundancy over all AZs in us-west-2. So now I tried:
2. Creating a new VPC for RDS:
I went back to ..../terraform-aws-rds-aurora/tree/master/examples/serverless/main.tf and set:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 2.6"
name = "${var.env}-mysql-vpc"
cidr = "172.16.0.0/16"
azs = data.aws_availability_zones.available.names
private_subnets = ["172.16.7.0/24", "172.16.8.0/24", "172.16.9.0/24"]
public_subnets = ["172.16.10.0/24", "172.16.11.0/24", "172.16.12.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
}
data "aws_vpc" "created" {
id = module.vpc.vpc_id
}
data "aws_subnet_ids" "all" {
vpc_id = data.aws_vpc.created.id
}
and got the same message!
I'm stumped. I don't want to use the default VPC for RDS, and eventually I'll want to edit the VPC for security/configurations.
My questions are:
Is it possible and practical for both EKS and RDS to live together in the same VPC?
Seeing that the example runs fine with the default VPC, what am I missing with the VPC creation for RDS?
Can Terraform create an "empty" VPC and have the aurora module then create subnets in it? Or is there a simple way for me to create the missing subnets (while specifying the AZ for each) and the rest of the VPC requirements for serverless?
I realize that this falls between AWS and Terraform, but I will appreciate your help.
Thanks to #mokugo-devops' comments I was able to create a new VPC where each subnet had a different AZ. But as it turns out, EKS and Aurora Serverless can live in the same VPC: I just needed to pass serverless only the public subnets (which terraform-aws-modules/vpc/aws creates in different AZs) and have the "aurora" module read them, like so:
module "aurora" {
source = "../../modules/aurora"
name = "aurora-serverless"
engine = "aurora"
engine_mode = "serverless"
replica_scale_enabled = false
replica_count = 0
backtrack_window = 10 # ignored in serverless
subnets = module.vpc.public_subnets
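For context, the original error comes from the DB subnet group: Aurora Serverless requires the group to span at least two Availability Zones. If you ever build the group yourself instead of letting the module do it, a minimal sketch would be (subnet references assume the VPC module above):
resource "aws_db_subnet_group" "aurora" {
  name = "aurora-serverless"
  # One subnet per AZ, as created by terraform-aws-modules/vpc/aws
  subnet_ids = module.vpc.public_subnets
}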

EKS cluster - api endpoint access - public/private

I have provisioned an EKS cluster on AWS with public access to the API endpoint. While doing so, I configured the SG with ingress only from a specific IP, but I could still run kubectl get svc against the cluster when accessing it from another IP.
I want to have IP-restricted access to the EKS cluster.
ref - Terraform - Master cluster SG
If public access is enabled, does it mean that anyone who has the cluster name can deploy anything?
When you create a new cluster, Amazon EKS creates an endpoint for the managed Kubernetes API server that you use to communicate with your cluster (using Kubernetes management tools such as kubectl as you have done).
By default, this API server endpoint is public to the internet, and access to the API server is secured using a combination of AWS Identity and Access Management (IAM) and native Kubernetes Role Based Access Control (RBAC).
So the public access does not mean that anyone who has the cluster name can deploy anything. You can read more about that in the Amazon EKS Cluster Endpoint Access Control AWS documentation.
If you want to provision EKS with Terraform and manage the network topology, that is done through the VPC (Virtual Private Cloud). You can check this VPC Terraform module to get all the proper settings.
Hope it'll help.
As well as Claire Bellivier's answer about how EKS clusters are protected via authentication using IAM and RBAC, you can now also configure your EKS cluster to be accessible only from private networks such as the VPC the cluster resides in or any peered VPCs.
This was added in the (as yet unreleased) 2.3.0 version of the AWS provider and is configured as part of the vpc_config block of the aws_eks_cluster resource:
resource "aws_eks_cluster" "example" {
name = %[2]q
role_arn = "${aws_iam_role.example.arn}"
vpc_config {
endpoint_private_access = true
endpoint_public_access = false
subnet_ids = [
"${aws_subnet.example.*.id[0]}",
"${aws_subnet.example.*.id[1]}",
]
}
}
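If you want to keep the public endpoint but restrict which source IPs can reach it, more recent provider versions also accept public_access_cidrs in the same block; a sketch with a placeholder CIDR:
resource "aws_eks_cluster" "restricted" {
  name     = "example"
  role_arn = aws_iam_role.example.arn

  vpc_config {
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["203.0.113.0/24"] # placeholder: your office/VPN range
    subnet_ids              = aws_subnet.example[*].id
  }
}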