Terraform ec2 - Permission denied (publickey) - amazon-web-services

I'm trying to learn Terraform.
I want to install some software on an EC2 instance and connect to it via SSH.
I have created a new SSH key pair for this.
When I try ssh -i ssh-keys/id_rsa_aws ubuntu@52.47.123.18 I get the error
Permission denied (publickey).
Here is a sample of my .tf script.
resource "aws_instance" "airflow" {
  ami                    = "ami-0d3f551818b21ed81"
  instance_type          = "t3a.xlarge"
  key_name               = "admin"
  vpc_security_group_ids = [aws_security_group.ssh-group.id]

  tags = {
    "Name" = "airflow"
  }

  subnet_id = aws_subnet.ec2_subnet.id
}

resource "aws_key_pair" "admin" {
  key_name   = "admin"
  public_key = "ssh-rsa ........" # I cat my key.pub for this
}
EDIT
Thanks to @GrzegorzOledzki I can see that the issue comes from my subnet setup. Here are the files.
gateway.tf
resource "aws_internet_gateway" "my_gateway" {
  vpc_id = aws_vpc.my_vpc.id

  tags = {
    Name = "my-gateway"
  }
}
network.tf
resource "aws_vpc" "my_vpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "my-vpc"
  }
}

resource "aws_eip" "airflow_ip" {
  instance = aws_instance.airflow.id
  vpc      = true
}
security_group.tf
resource "aws_security_group" "ssh-group" {
  name   = "ssh-group"
  vpc_id = aws_vpc.my_vpc.id

  ingress {
    # TLS (change to whatever ports you need)
    from_port = 22
    to_port   = 22
    protocol  = "tcp"
    # Please restrict your ingress to only necessary IPs and ports.
    # Opening to 0.0.0.0/0 can lead to security vulnerabilities.
    cidr_blocks = ["my.ip.from.home/32"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
subnet.tf
resource "aws_subnet" "ec2_subnet" {
  cidr_block        = cidrsubnet(aws_vpc.my_vpc.cidr_block, 3, 1)
  vpc_id            = aws_vpc.my_vpc.id
  availability_zone = "eu-west-3c"
}

resource "aws_route_table" "my_route_table" {
  vpc_id = aws_vpc.my_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.my_gateway.id
  }

  tags = {
    Name = "my_route_table"
  }
}

resource "aws_route_table_association" "subnet_association" {
  route_table_id = aws_route_table.my_route_table.id
  subnet_id      = aws_subnet.ec2_subnet.id
}
EDIT 2
I destroyed everything, rebuilt it, and now it works. I'm not sure exactly why, but I had created my EC2 instance before creating and linking the custom VPC, subnet and security group. It seems something (what?) went wrong and Terraform couldn't reassign everything to my instance.

Currently that .pem key has read permission only for the owner (chmod 400 *.pem). In this case, you need to enable read permission for others as well (chmod 404 / chmod o=r *.pem).

I ran into the same issue, and for me too the fix was simply to terraform destroy and then rebuild the infrastructure.
The root cause of my issue was that I had first created and set the key pair for the EC2 instance, but I never output or saved the private key at all. After creating the EC2 instance with that key pair, I added output/save functionality for the generated SSH keys, but what got saved was not the key pair assigned to the EC2 instance; it was just a newly generated one. Once I generated the key pair for the newly created instance and saved the private key output, I could SSH into the instance successfully.
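A minimal sketch of that pattern, generating the key pair with the tls provider and saving the private key to disk so the same key is used for SSH (the resource and file names here are only illustrative):
resource "tls_private_key" "ssh" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "generated" {
  key_name   = "generated-key"
  public_key = tls_private_key.ssh.public_key_openssh
}

# Write the matching private key to disk so it can be used with `ssh -i`
resource "local_file" "private_key" {
  content         = tls_private_key.ssh.private_key_pem
  filename        = "${path.module}/id_rsa_generated"
  file_permission = "0400"
}

output "private_key_pem" {
  value     = tls_private_key.ssh.private_key_pem
  sensitive = true
}
The instance then needs key_name = aws_key_pair.generated.key_name so it is created with exactly this key pair.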

Related

How to launch multiple AWS EC2 instances from a single VPC using Terraform?

Is it possible to launch multiple EC2 instances from Terraform using a single VPC? I'm building something that requires multiple instances to be launched in the same region, and I'm doing all this using Terraform. But there's a limit in AWS: only 5 VPCs are allowed per region. What I've been doing until now is creating a separate VPC in Terraform each time I need to launch an instance. Below is the code for reference:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region     = "us-east-2"
  access_key = "XXXXXXXXXXXXXXXXX"
  secret_key = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
}

# 1. Create vpc
resource "aws_vpc" "prod-vpc" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "production"
  }
}

# 2. Create Internet Gateway
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.prod-vpc.id
}

# 3. Create Custom Route Table
resource "aws_route_table" "prod-route-table" {
  vpc_id = aws_vpc.prod-vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }

  route {
    ipv6_cidr_block = "::/0"
    gateway_id      = aws_internet_gateway.gw.id
  }

  tags = {
    Name = "Prod"
  }
}

# 4. Create a Subnet
resource "aws_subnet" "subnet-1" {
  vpc_id            = aws_vpc.prod-vpc.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-2a"

  tags = {
    Name = "prod-subnet"
  }
}

# 5. Associate subnet with Route Table
resource "aws_route_table_association" "a" {
  subnet_id      = aws_subnet.subnet-1.id
  route_table_id = aws_route_table.prod-route-table.id
}

# 6. Create Security Group to allow port 22,80,443
resource "aws_security_group" "allow_web" {
  name        = "allow_web_traffic"
  description = "Allow Web inbound traffic"
  vpc_id      = aws_vpc.prod-vpc.id

  ingress {
    description = "HTTPS"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "self"
    from_port   = 8000
    to_port     = 8000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "allow_web"
  }
}

# 7. Create a network interface with an ip in the subnet that was created in step 4
resource "aws_network_interface" "web-server-nic" {
  subnet_id       = aws_subnet.subnet-1.id
  private_ips     = ["10.0.1.50"]
  security_groups = [aws_security_group.allow_web.id]
}

# 8. Assign an elastic IP to the network interface created in step 7
resource "aws_eip" "one" {
  vpc                       = true
  network_interface         = aws_network_interface.web-server-nic.id
  associate_with_private_ip = "10.0.1.50"
  depends_on                = [aws_internet_gateway.gw]
}

output "server_public_ip" {
  value = aws_eip.one.public_ip
}

# 9. Create Ubuntu server and install/enable apache2
resource "aws_instance" "web-server-instance" {
  ami               = var.AMI_ID
  instance_type     = "g4dn.xlarge"
  availability_zone = "us-east-2a"
  key_name          = "us-east-2"

  network_interface {
    device_index         = 0
    network_interface_id = aws_network_interface.web-server-nic.id
  }

  root_block_device {
    volume_size = "200"
  }

  iam_instance_profile = aws_iam_instance_profile.training_profile.name
  depends_on           = [aws_eip.one]

  user_data = <<-EOF
    #!/bin/bash
    python3 /home/ubuntu/setting_instance.py
  EOF

  tags = {
    Name = var.INSTANCE_NAME
  }
}
The only downside to this code is that it creates a separate VPC every time I create an instance. I read in a Stack Overflow post that we can import an existing VPC using the terraform import command. Along with the VPC, I had to import the Internet Gateway and Route Table as well (it was throwing an error otherwise). But then I wasn't able to access the instance using SSH, and the commands in the user_data part didn't execute either (setting_instance.py just sends a Firebase notification once the instance starts; that's its only purpose).
Beyond the VPC, I'd also like to know whether I can reuse the other resources to the fullest extent possible.
I'm new to Terraform and AWS. Any suggestions on the above code are welcome.
EDIT: Instances are created one at a time as needed, i.e., whenever a new instance is required I run this code. In the current scenario, if there are already 5 instances running in a region, I won't be able to use this code to create a 6th instance in the same region when the demand arises.
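(As an alternative to terraform import, an existing VPC and subnet can also be looked up with data sources and then referenced by new resources. A rough sketch, with the tag values assumed from the code above:)
# Look up the already-existing VPC and subnet by their Name tags
data "aws_vpc" "existing" {
  tags = {
    Name = "production"
  }
}

data "aws_subnet" "existing" {
  vpc_id = data.aws_vpc.existing.id
  tags = {
    Name = "prod-subnet"
  }
}

# New instances (or their network interfaces) can then use
# data.aws_subnet.existing.id instead of creating a new subnet.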
If, as you say, they would be exactly the same, the easiest way would be to use count, which indicates how many instances you want to have. For that you can introduce a new variable:
variable "number_of_instance" {
  default = 1
}
and then
resource "aws_instance" "web-server-instance" {
  count             = var.number_of_instance
  ami               = var.AMI_ID
  instance_type     = "g4dn.xlarge"
  availability_zone = "us-east-2a"
  key_name          = "us-east-2"

  network_interface {
    device_index         = 0
    network_interface_id = aws_network_interface.web-server-nic.id
  }

  root_block_device {
    volume_size = "200"
  }

  iam_instance_profile = aws_iam_instance_profile.training_profile.name
  depends_on           = [aws_eip.one]

  user_data = <<-EOF
    #!/bin/bash
    python3 /home/ubuntu/setting_instance.py
  EOF

  tags = {
    Name = var.INSTANCE_NAME
  }
}
All of this must be managed by the same state file, not fully separate state files, as otherwise you will again end up with duplicates of the VPC. You then only change number_of_instance to the value you want. For a more resilient solution, you would have to use an autoscaling group for the instances, as sketched below.
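A rough sketch of that autoscaling-group approach, reusing the names from the question (the min/max sizes here are placeholders):
resource "aws_launch_template" "web" {
  name_prefix   = "web-server-"
  image_id      = var.AMI_ID
  instance_type = "g4dn.xlarge"
  key_name      = "us-east-2"

  # Same bootstrap script as the standalone instance
  user_data = base64encode(<<-EOF
    #!/bin/bash
    python3 /home/ubuntu/setting_instance.py
  EOF
  )
}

resource "aws_autoscaling_group" "web" {
  desired_capacity = var.number_of_instance
  min_size         = 1
  max_size         = 10

  # Instances get addresses from the subnet instead of the fixed ENI/EIP
  vpc_zone_identifier = [aws_subnet.subnet-1.id]

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}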

I can't ssh into my newly created EC2 instance and can't figure it out for the life of me

I really can't figure out for the life of me why I'm unable to SSH into my newly created EC2 instance.
Here is some of my Terraform code where I create the EC2 instance and its security groups.
This is my EC2 code
resource "aws_key_pair" "AzureDevOps" {
  key_name   = var.infra_env
  public_key = var.public_ssh_key
}

# Create network interface for EC2 instance and assign security groups
resource "aws_network_interface" "vm_nic_1" {
  subnet_id   = var.subnet_id
  private_ips = ["10.0.0.100"]

  tags = {
    Name = "${var.infra_env}-nic-1"
  }

  security_groups = [
    var.ssh_id
  ]
}

# Add elastic IP address for public connectivity
resource "aws_eip" "vm_eip_1" {
  vpc                       = true
  instance                  = aws_instance.virtualmachine_1.id
  associate_with_private_ip = "10.0.0.100"
  depends_on                = [var.gw_1]

  tags = {
    Name = "${var.infra_env}-eip-1"
  }
}

# Deploy virtual machine using Ubuntu ami
resource "aws_instance" "virtualmachine_1" {
  ami           = var.ami
  instance_type = var.instance_type
  key_name      = aws_key_pair.AzureDevOps.id

  # retrieve the Administrator password
  get_password_data = true

  connection {
    type     = "ssh"
    port     = 22
    password = rsadecrypt(self.password_data, file("id_rsa"))
    https    = true
    insecure = true
    timeout  = "10m"
  }

  network_interface {
    network_interface_id = aws_network_interface.vm_nic_1.id
    device_index         = 0
  }

  user_data = file("./scripts/install-cwagent.ps1")

  tags = {
    Name = "${var.infra_env}-vm-1"
  }
}
Here is the code for my security group
resource "aws_security_group" "ssh" {
  name        = "allow_ssh"
  description = "Allow access to the instance via ssh"
  vpc_id      = var.vpc_id

  ingress {
    description = "Access the instance via ssh"
    from_port   = 22
    to_port     = 22
    protocol    = "TCP"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.infra_env}-allow-ssh"
  }
}
If I need to provide any more code or information I can. It's my first time trying to do this and it's frustrating trying to figure it out. I'm trying to use PuTTY as well, and I'm not sure whether I just don't know how to use it correctly or whether something is wrong with my EC2 configuration.
I used my public SSH key from my computer for the variable in my aws_key_pair resource. I saved my key pair as a .ppk file for PuTTY, and on my AWS console when I go to "connect" it says to use ubuntu@10.0.0.100 for my host name in PuTTY. I did that, but when I click OK and it tries to connect I get a network error: connection timed out.
I used my public ssh key
You need to use your private key, not the public one.
use ubuntu@10.0.0.100
10.0.0.100 is a private IP address. To be able to connect to your instance over the internet you need to use the public IP address.
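For example, a Terraform output can expose the elastic IP so it is clear which public address to point PuTTY or ssh at (the output name here is arbitrary):
output "vm_public_ip" {
  # Public address attached to the instance via the elastic IP
  value = aws_eip.vm_eip_1.public_ip
}
Then connect as ubuntu@<that public IP> with the private key loaded in PuTTY (or with ssh -i path/to/private_key), not the 10.0.0.100 private address.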

Terraform config isn't using output from other file for already created resource, instead tries to recreate it and fails (security group id)

In terraform/aws/global/vpc/security_groups.tf I have the code below to create my bastion security group, along with the outputs.tf file (also below). In terraform/aws/layers/bastion/main.tf (code also below) I reference that security group, because I need its ID to create my EC2 instance. The issue is that rather than getting the ID from the security group already created by the /vpc/security_groups.tf config, it tries to create the whole security group again, and the run obviously fails because it already exists. How can I change my code to get the ID of the existing SG? I don't want to create my SG in the same config file as my instance, because some of my security groups are shared between different resources. I am using Terraform Cloud and the VPC has its own workspace, so I assume this could actually be an issue with the states being different. Is there a workaround for this?
terraform/aws/global/vpc/security_groups.tf
provider "aws" {
  region = "eu-west-1"
}

resource "aws_security_group" "bastion" {
  name        = "Bastion_Terraform"
  description = "Bastion SSH access Terraform"
  vpc_id      = "vpc-12345"

  ingress {
    description = "Bastion SSH"
    from_port   = ##
    to_port     = ##
    protocol    = "##"
    cidr_blocks = ["1.2.3.4/56"]
  }

  ingress {
    description = "Bastion SSH"
    from_port   = ##
    to_port     = ##
    protocol    = "##"
    cidr_blocks = ["1.2.3.4/0"]
  }

  egress {
    description     = "Access to "
    from_port       = ##
    to_port         = ##
    protocol        = "tcp"
    security_groups = ["sg-12345"]
  }

  egress {
    description     = "Access to ##"
    from_port       = ##
    to_port         = ##
    protocol        = "tcp"
    security_groups = ["sg-12345"]
  }

  tags = {
    Name = "Bastion Terraform"
  }
}
terraform/aws/global/vpc/outputs.tf
output "bastion-sg" {
  value = aws_security_group.bastion.id
}
terraform/aws/layers/bastion/main.tf
provider "aws" {
  region = var.region
}

module "vpc" {
  source = "../../global/vpc"
}

module "ec2-instance" {
  source                 = "terraform-aws-modules/ec2-instance/aws"
  name                   = "bastion"
  instance_count         = 1
  ami                    = var.image_id
  instance_type          = var.instance_type
  vpc_security_group_ids = ["${module.vpc.bastion-sg}"]
  subnet_id              = var.subnet
  iam_instance_profile   = var.iam_role

  tags = {
    Layer = "Bastion"
  }
}
When you have a child module block like this in a TF module:
module "ec2-instance" {
  source                 = "terraform-aws-modules/ec2-instance/aws"
  name                   = "bastion"
  instance_count         = 1
  ami                    = var.image_id
  instance_type          = var.instance_type
  vpc_security_group_ids = ["${module.vpc.bastion-sg}"]
  subnet_id              = var.subnet
  iam_instance_profile   = var.iam_role

  tags = {
    Layer = "Bastion"
  }
}
It doesn't just reference that child module; it instantiates a completely new instance of it, unique to the parent module and its state. Think of this not as an assignment or a pointer but as the construction of a whole new instance of the module (using the module as a template), with all of its resources created again.
You will need to either directly reference the outputs of the child module in the parent module that contains its module block, or use a terraform_remote_state data source or a Terragrunt dependency to load the outputs from the state file (see the sketch below).
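For example, a rough sketch of the terraform_remote_state approach against a Terraform Cloud workspace (the organization and workspace names are placeholders):
data "terraform_remote_state" "vpc" {
  backend = "remote"

  config = {
    organization = "my-org"   # placeholder
    workspaces = {
      name = "vpc"            # placeholder: the workspace that owns the SG
    }
  }
}

module "ec2-instance" {
  source         = "terraform-aws-modules/ec2-instance/aws"
  name           = "bastion"
  instance_count = 1
  ami            = var.image_id
  instance_type  = var.instance_type

  # Read the SG id from the VPC workspace's outputs instead of re-creating it
  vpc_security_group_ids = [data.terraform_remote_state.vpc.outputs.bastion-sg]
  subnet_id              = var.subnet
  iam_instance_profile   = var.iam_role

  tags = {
    Layer = "Bastion"
  }
}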

Why can I not ping my EC2 instance when I've set up the VPC and EC2 via Terraform?

I have a setup via Terraform which includes a VPC, a public subnet, and an EC2 instance with a security group. I am trying to ping the EC2 instance but get timeouts.
A few things I've tried to ensure:
the EC2 instance is in the subnet, and the subnet is routed to the internet via the gateway
the EC2 has a security group allowing all traffic both ways
the EC2 has an elastic IP
The VPC has an ACL that is attached to the subnet and allows all traffic both ways
I'm not sure what I missed here.
My .tf file looks like this (edited to reflect the latest changes):
resource "aws_vpc" "foobar" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_internet_gateway" "foobar_gateway" {
  vpc_id = aws_vpc.foobar.id
}

/*
  Public subnet
*/
resource "aws_subnet" "foobar_subnet" {
  vpc_id     = aws_vpc.foobar.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_route_table" "foobar_routetable" {
  vpc_id = aws_vpc.foobar.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.foobar_gateway.id
  }
}

resource "aws_route_table_association" "foobar_routetable_assoc" {
  subnet_id      = aws_subnet.foobar_subnet.id
  route_table_id = aws_route_table.foobar_routetable.id
}

/*
  Web
*/
resource "aws_security_group" "web" {
  name   = "vpc_web"
  vpc_id = aws_vpc.foobar.id

  ingress {
    protocol    = -1
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    protocol    = -1
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_network_acl" "main" {
  vpc_id     = aws_vpc.foobar.id
  subnet_ids = [aws_subnet.foobar_subnet.id]

  egress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  ingress {
    protocol   = -1
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }
}

resource "aws_instance" "web-1" {
  ami                         = "ami-0323c3dd2da7fb37d"
  instance_type               = "t2.micro"
  subnet_id                   = aws_subnet.foobar_subnet.id
  associate_public_ip_address = true
}

resource "aws_eip" "web-1" {
  instance = aws_instance.web-1.id
  vpc      = true
}
Why are you adding the self parameter in your security group rule? The Terraform docs state that "If true, the security group itself will be added as a source to this ingress rule", which basically means that only that security group can access the instance. Please remove that and try again.
EDIT: see comments below for steps that fixed the problem
Allowing all the traffic through the security group would not enable ping to the instance. You need to add a specific security group rule allowing inbound ICMP (Echo Request) to enable ping requests; see the sketch below.
Remember that AWS has made this rule separate to ensure that you know what you are doing. Being able to ping the instance from anywhere in the world leaves your instance vulnerable to people trying to find instances by scanning IP address ranges.
Hence, it is advisable to change this rule carefully.
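For example, a rule along these lines inside the existing aws_security_group "web" resource would allow ping (only a sketch; restrict the CIDR as noted above):
  ingress {
    description = "Allow ping (ICMP echo request)"
    protocol    = "icmp"
    from_port   = 8              # ICMP type 8 = echo request
    to_port     = 0              # ICMP code 0
    cidr_blocks = ["0.0.0.0/0"]  # ideally restrict this to known IPs
  }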

Terraform - DB and security group are in different VPCs

What I am trying to achieve:
Create an RDS Aurora cluster and place it in the same VPC as the EC2 instances that I start, so they can communicate.
I'm trying to create an SG named "RDS_DB_SG" and make it part of the VPC I'm creating in the process.
I also create an SG named "BE_SG" and make it part of the same VPC.
I'm doing this so I can get access between the two (RDS and the BE server).
What I did so far:
Created a .tf config and started everything up.
What I got:
It starts OK if I don't include the RDS cluster inside the RDS SG; the RDS then ends up in its own VPC.
When I include the RDS in the SG I want for it, the RDS cluster can't start and gets an error.
The error I got:
"The DB instance and EC2 security group are in different VPCs. The DB instance is in vpc-5a***63c and the EC2 security group is in vpc-0e5391*****273b3d"
Workaround for now:
I started the infrastructure without specifying a VPC for the RDS. It created its own default VPC.
I then created manual VPC peering between the VPC that was created for the EC2 instances and the VPC that was created for the RDS.
But I want them to be in the same VPC so I won't have to create the VPC peering manually.
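For reference, the manual peering step could itself be expressed in Terraform, roughly like this (the peer VPC ID and CIDR below are placeholders for the default VPC the RDS cluster ended up in):
resource "aws_vpc_peering_connection" "rds_peering" {
  vpc_id      = aws_vpc.vpc.id   # the VPC holding the EC2 instances
  peer_vpc_id = "vpc-xxxxxxxx"   # placeholder: the VPC the RDS cluster used
  auto_accept = true
}

# Route from the EC2 VPC towards the RDS VPC's CIDR (placeholder CIDR);
# a return route is also needed in the RDS VPC's route table.
resource "aws_route" "to_rds_vpc" {
  route_table_id            = aws_vpc.vpc.main_route_table_id
  destination_cidr_block    = "172.31.0.0/16"
  vpc_peering_connection_id = aws_vpc_peering_connection.rds_peering.id
}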
My .tf code:
variable "vpc_cidr" {
  description = "CIDR for the VPC"
  default     = "10.0.0.0/16"
}

resource "aws_vpc" "vpc" {
  cidr_block = "${var.vpc_cidr}"

  tags = {
    Name = "${var.env}_vpc"
  }
}

resource "aws_subnet" "vpc_subnet" {
  vpc_id            = "${aws_vpc.vpc.id}"
  cidr_block        = "${var.vpc_cidr}"
  availability_zone = "eu-west-1a"

  tags = {
    Name = "${var.env}_vpc"
  }
}

resource "aws_db_subnet_group" "subnet_group" {
  name       = "${var.env}-subnet-group"
  subnet_ids = ["${aws_subnet.vpc_subnet.id}"]
}

resource "aws_security_group" "RDS_DB_SG" {
  name   = "${var.env}-rds-sg"
  vpc_id = "${aws_vpc.vpc.id}"

  ingress {
    from_port       = 3396
    to_port         = 3396
    protocol        = "tcp"
    security_groups = ["${aws_security_group.BE_SG.id}"]
  }
}

resource "aws_security_group" "BE_SG" {
  name   = "${var.env}_BE_SG"
  vpc_id = "${aws_vpc.vpc.id}"

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "BE" {
  ami                         = "ami-*********************"
  instance_type               = "t2.large"
  associate_public_ip_address = true
  key_name                    = "**********"

  tags = {
    Name    = "WEB-${var.env}"
    Porpuse = "Launched by Terraform"
    ENV     = "${var.env}"
  }

  subnet_id              = "${aws_subnet.vpc_subnet.id}"
  vpc_security_group_ids = ["${aws_security_group.BE_SG.id}", "${aws_security_group.ssh.id}"]
}

resource "aws_rds_cluster" "rds-cluster" {
  cluster_identifier      = "${var.env}-cluster"
  database_name           = "${var.env}-rds"
  master_username         = "${var.env}"
  master_password         = "PASSWORD"
  backup_retention_period = 5
  vpc_security_group_ids  = ["${aws_security_group.RDS_DB_SG.id}"]
}

resource "aws_rds_cluster_instance" "rds-instance" {
  count                   = 1
  cluster_identifier      = "${aws_rds_cluster.rds-cluster.id}"
  instance_class          = "db.r4.large"
  engine_version          = "5.7.12"
  engine                  = "aurora-mysql"
  preferred_backup_window = "04:00-22:00"
}
Any suggestions on how to achieve my first goal?