When running the file below with Terraform, I get the following error:
Resource 'aws_instance.nodes-opt-us-k8s' not found for variable
'aws_instance.nodes-opt.us1-k8s.id'.
Do I need to include the provisioner twice because my 'count' variable is creating two instances? When I just include one for the 'count' variable, I get an error that my Ansible playbook needs playbook files to run, which makes sense because it is empty until I figure this error out.
I am in the early stages with Terraform and Linux, so pardon my ignorance.
#-----------------------------Kubernetes Master & Worker Node Server Creations----------------------------
#-----key pair for Workernodes-----
resource "aws_key_pair" "k8s-node_auth" {
key_name = "${var.key_name2}"
public_key = "${file(var.public_key_path2)}"
}
#-----Workernodes-----
resource "aws_instance" "nodes-opt-us1-k8s" {
instance_type = "${var.k8s-node_instance_type}"
ami = "${var.k8s-node_ami}"
count = "${var.NodeCount}"
tags {
Name = "nodes-opt-us1-k8s"
}
key_name = "${aws_key_pair.k8s-node_auth.id}"
vpc_security_group_ids = ["${aws_security_group.opt-us1-k8s_sg.id}"]
subnet_id = "${aws_subnet.opt-us1-k8s.id}"
#-----Link Terraform worker nodes to Ansible playbooks-----
provisioner "local-exec" {
command = <<EOD
cat <<EOF >> workers
[workers]
${self.public_ip}
EOF
EOD
}
provisioner "local-exec" {
command = "aws ec2 wait instance-status-ok --instance-ids ${aws_instance.nodes-opt-us1-k8s.id} --profile Terraform && ansible-playbook -i workers Kubernetes-Nodes.yml"
}
}
Terraform 0.12.26 resolved a similar issue for me (when using multiple file provisioners to deploy multiple VMs to Azure).
Hope this helps you: https://github.com/hashicorp/terraform/issues/22006
When using a provisioner and referring to the resource the provisioner is attached to, you need to use the self keyword, as you've already spotted with what you are writing to the file. Because the resource uses count, the provisioners run once for each created instance and self refers to that particular instance, so you don't need to duplicate them.
So in your case you want to use the following provisioner blocks:
...
provisioner "local-exec" {
command = <<EOD
cat <<EOF >> workers
[workers]
${self.public_ip}
EOF
EOD
}
provisioner "local-exec" {
command = "aws ec2 wait instance-status-ok --instance-ids ${self.id} --profile Terraform && ansible-playbook -i workers Kubernetes-Nodes.yml"
}
I'm not sure what the best way to do this is, but I want to deploy EFS and an ASG + launch template with Terraform. I'd like my userdata script (in my launch template) to run commands to mount EFS.
For example:
sudo mount -t efs -o tls fs-0b28edbb9efe91c25:/ efs
My issue is that I need my userdata script to receive my EFS ID, but this can't just happen on my initial deploy; I also need it to happen whenever I perform a rolling update. I want to be able to change the AMI ID in my launch template, which performs a rolling update when I run terraform apply, and my EFS ID needs to be in my userdata script so it can run the command to mount EFS.
Is there a way to have a Terraform output get permanently added to my userdata script? What are other alternatives for making this happen? Would it involve CloudFormation or other AWS services?
main.tf
resource "aws_vpc" "mtc_vpc" {
cidr_block = "10.123.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "dev"
}
}
resource "aws_launch_template" "foobar" {
name_prefix = "LTTest"
image_id = "ami-017c001a88dd93847"
instance_type = "t2.micro"
update_default_version = true
key_name = "lttest"
user_data = base64encode(templatefile("${path.module}/userdata.sh", {efs_id = aws_efs_file_system.foo.id}))
iam_instance_profile {
name = aws_iam_instance_profile.test_profile.name
}
vpc_security_group_ids = [aws_security_group.mtc_sg.id]
}
resource "aws_autoscaling_group" "bar" {
desired_capacity = 2
max_size = 2
min_size = 2
vpc_zone_identifier = [
aws_subnet.mtc_public_subnet1.id
]
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
}
}
launch_template {
id = "${aws_launch_template.foobar.id}"
version = aws_launch_template.foobar.latest_version
}
}
resource "aws_efs_file_system" "foo" {
creation_token = "jira-efs"
}
resource "aws_efs_mount_target" "alpha" {
file_system_id = aws_efs_file_system.foo.id
subnet_id = aws_subnet.mtc_public_subnet1.id
security_groups = [aws_security_group.mtc_sg.id]
}
Update:
User-data Script:
#!/usr/bin/env bash
sudo yum install -y amazon-efs-utils
sudo yum install -y git
cd /home/ec2-user
mkdir efs
sudo mount -t efs -o tls ${efs_id}:/ efs
There are a few ways to do this. A couple that come to mind are:
Provide the EFS ID to the user data script using the templatefile() function.
Give your EC2 instance permissions (via IAM) to use the EFS API to search for the ID.
The first option is probably the most practical.
First, define your EFS filesystem (and associated aws_efs_mount_target and aws_efs_access_point resources, but I'll omit those here):
resource "aws_efs_file_system" "efs" {}
Now you can define the user data with the templatefile() function:
resource "aws_launch_template" "foo" {
# ... all the attributes ...
user_data = base64encode(templatefile("${path.module}/user-data.sh.tpl", {
efs_id = aws_efs_file_system.efs.id # Use dns_name or id here
}))
}
The contents of user-data.sh.tpl can have all your set up steps, including the filesystem mount:
sudo mount -t efs -o tls ${efs_id}:/ efs
When Terraform renders the user data in the launch template, it will substitute the variable.
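For completeness, here is a hedged sketch of the second option, where the instance looks up the filesystem ID itself at boot instead of having Terraform template it in. The resource name, the reuse of the jira-efs creation token, and the assumption that the instance profile allows elasticfilesystem:DescribeFileSystems (and that the AMI can resolve a region for the AWS CLI) are illustrative rather than a definitive implementation:

resource "aws_launch_template" "foobar_lookup" {
  # ... same attributes as the launch template above ...

  # Sketch only: the instance resolves the EFS ID itself, so a rolling update
  # needs no re-rendered user data script.
  user_data = base64encode(<<-EOT
    #!/usr/bin/env bash
    yum install -y amazon-efs-utils
    # Hypothetical permission required on the instance profile:
    # elasticfilesystem:DescribeFileSystems. Assumes the CLI can resolve a
    # region (e.g. AWS_DEFAULT_REGION or the CLI's IMDS fallback).
    efs_id=$(aws efs describe-file-systems --creation-token jira-efs --query 'FileSystems[0].FileSystemId' --output text)
    mkdir -p /home/ec2-user/efs
    mount -t efs -o tls "$efs_id":/ /home/ec2-user/efs
  EOT
  )
}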
I am setting up several servers in AWS, using Terraform to deploy them and Ansible to configure them (the configuration is quite complex). I would like to accomplish all of this from Terraform, but I can't seem to get the ProxyCommand to execute correctly (I believe due to the use of mixed quotes). I need to use the ProxyCommand as the commands must be proxied through a bastion host. First I provision the bastion:
resource "aws_instance" "bastion" {
ami = var.ubuntu2004
instance_type = "t3.small"
associate_public_ip_address = true
subnet_id = aws_subnet.some_subnet.id
vpc_security_group_ids = [aws_security_group.temp.id]
key_name = "key"
tags = {
Name = "bastion"
}
}
and then I deploy another server, which I would like to configure with Ansible using Terraform's 'local-exec' provisioner:
resource "aws_instance" "server1" {
ami = var.ubuntu2004
instance_type = "t3.small"
subnet_id = aws_subnet.some_other_subnet.id
vpc_security_group_ids = [aws_security_group.other_temp.id]
key_name = "key"
tags = {
Name = "server1"
}
provisioner "local-exec" {
command = "sleep 120; ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -u ubuntu --private-key ~/.ssh/id_rsa --ssh-common-args='-o ProxyCommand='ssh -W %h:%p ubuntu#${aws_instance.bastion.public_ip}'' -i ${self.private_ip} main.yml"
}
}
I have confirmed I can get all of this working if I just have Terraform provision the infrastructure and then manually run Ansible with the ProxyCommand, but it fails if I try to use local-exec, seemingly because I have to nest multiple single quotes, which breaks the command. I'm not sure if the bastion variable is referenced correctly either. It's probably a simple fix, but does anyone know how to fix this, or maybe accomplish it in an easier way? Thanks
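Not an authoritative answer, but one common way to sidestep the nested single quotes is to switch the inner pair to escaped double quotes, which Ansible then splits back out when it invokes ssh. A sketch of the provisioner inside aws_instance.server1, keeping the sleep and key path from the question (the trailing comma after the IP makes Ansible treat it as an inline inventory list):

  provisioner "local-exec" {
    # Escaped double quotes keep the ProxyCommand value intact inside the
    # single-quoted --ssh-common-args value.
    command = "sleep 120; ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -u ubuntu --private-key ~/.ssh/id_rsa --ssh-common-args='-o ProxyCommand=\"ssh -W %h:%p ubuntu@${aws_instance.bastion.public_ip}\"' -i ${self.private_ip}, main.yml"
  }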
I am trying to log in to the EC2 instance that Terraform will create with the following code:
resource "aws_instance" "sess1" {
ami = "ami-c58c1dd3"
instance_type = "t2.micro"
key_name = "logon"
connection {
host= self.public_ip
user = "ec2-user"
private_key = file("/logon.pem")
}
provisioner "remote-exec" {
inline = [
"sudo yum install nginx -y",
"sudo service nginx start"
]
}
}
But this gives me an error:
PS C:\Users\Amritvir Singh\Documents\GitHub\AWS-Scribble\Terraform> terraform apply
provider.aws.region
The region where AWS operations will take place. Examples
are us-east-1, us-west-2, etc.
Enter a value: us-east-1
Error: Invalid function argument
on Session1.tf line 13, in resource "aws_instance" "sess1":
13: private_key = file("/logon.pem")
Invalid value for "path" parameter: no file exists at logon.pem; this function
works only with files that are distributed as part of the configuration source
code, so if this file will be created by a resource in this configuration you
must instead obtain this result from an attribute of that resource.
How do I safely pass the key from a resource to the provisioner at runtime without logging into the console?
Have you tried using the full path? That is especially beneficial if you are using modules. For example:
private_key = file("${path.module}/logon.pem")
Or I think even this will work:
private_key = file("./logon.pem")
I believe your existing code is looking for the file at the root of your filesystem.
The connection block should be inside the provisioner block:
resource "aws_instance" "sess1" {
ami = "ami-c58c1dd3"
instance_type = "t2.micro"
key_name = "logon"
provisioner "remote-exec" {
connection {
host= self.public_ip
user = "ec2-user"
private_key = file("/logon.pem")
}
inline = [
"sudo yum install nginx -y",
"sudo service nginx start"
]
}
}
The above assumes that everything else is correct, e.g. the key file exists and the security groups allow an SSH connection.
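If the goal really is to pass a key from a resource to the provisioner at runtime (which is what the last paragraph of the error message hints at), a hedged sketch using the tls provider could look like the following; the resource names are illustrative, and note that the generated private key is stored in the Terraform state:

resource "tls_private_key" "logon" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "logon" {
  key_name   = "logon"
  public_key = tls_private_key.logon.public_key_openssh
}

resource "aws_instance" "sess1" {
  ami           = "ami-c58c1dd3"
  instance_type = "t2.micro"
  key_name      = aws_key_pair.logon.key_name

  provisioner "remote-exec" {
    connection {
      host = self.public_ip
      user = "ec2-user"
      # The key comes straight from the resource attribute, not from a file on disk.
      private_key = tls_private_key.logon.private_key_pem
    }

    inline = [
      "sudo yum install nginx -y",
      "sudo service nginx start"
    ]
  }
}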
I am running Terraform code to create multiple EC2 instances. Is there a way to set the hostname of each instance based on a tag and a domain name? Currently I log in and run hostnamectl set-hostname ..
Here is the tf script I use to create the instances.
resource "aws_instance" "RR-TEMP-V-DB" {
ami = var.linux_ami[var.region]
availability_zone = var.availability_zone
instance_type = var.temp_instance_type
key_name = var.linux_key_name
vpc_security_group_ids = [var.vpc_security_group_ids[var.region]]
subnet_id = var.db_subnet_id
count = var.temp_count
tags = {
Name = "RR-TEMP-V-DB-${format("%02d", count.index + 1)}"
Environment = var.env_tag
}
}
Thanks
We accomplish this as part of user data; it looks similar to:
instance_name=$(aws ec2 describe-instances --instance-id $(curl -s http://169.254.169.254/latest/meta-data/instance-id) --query "Reservations[*].Instances[*].Tags[?Key=='Name'].Value" --region $(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed -e "s/.$//") --output text)
sudo hostnamectl set-hostname --static $instance_name
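If it helps, here is a minimal sketch of wiring that into the Terraform resource from the question. The script filename and the instance profile are assumptions (the profile needs ec2:DescribeInstances for the tag lookup to succeed), not a definitive implementation:

resource "aws_instance" "RR-TEMP-V-DB" {
  # ... existing arguments from the question ...
  count = var.temp_count

  # set-hostname.sh holds the describe-instances / hostnamectl snippet above.
  user_data = file("${path.module}/set-hostname.sh")

  # Hypothetical profile granting ec2:DescribeInstances so the tag lookup works.
  iam_instance_profile = aws_iam_instance_profile.describe_tags.name
}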
You can accomplish that by running it as user data, as @duhaas suggested, or by using Terraform's remote-exec provisioner. Here is Terraform's provisioner documentation; as you will see, the recommended way is to set user data when the instance is provisioned:
https://www.terraform.io/docs/provisioners/
for more details on remote-exec:
https://www.terraform.io/docs/provisioners/remote-exec.html
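As a rough illustration of the remote-exec route (a sketch only; the example.com domain, SSH user, key path, and network reachability are all assumptions about your environment):

resource "aws_instance" "RR-TEMP-V-DB" {
  # ... same arguments as in the question ...
  count = var.temp_count

  connection {
    host        = self.private_ip
    user        = "ec2-user"            # assumed user for the Linux AMI
    private_key = file("~/.ssh/id_rsa") # assumed key path
  }

  provisioner "remote-exec" {
    inline = [
      "sudo hostnamectl set-hostname --static RR-TEMP-V-DB-${format("%02d", count.index + 1)}.example.com"
    ]
  }
}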
I have some Terraform code with an aws_instance and a null_resource:
resource "aws_instance" "example" {
ami = data.aws_ami.server.id
instance_type = "t2.medium"
key_name = aws_key_pair.deployer.key_name
tags = {
name = "example"
}
vpc_security_group_ids = [aws_security_group.main.id]
}
resource "null_resource" "example" {
provisioner "local-exec" {
command = "ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -T 300 -i ${aws_instance.example.public_dns}, --user centos --private-key files/id_rsa playbook.yml"
}
}
It kind of works, but sometimes there is a bug (probably when the instance is in a pending state). When I rerun Terraform, it works as expected.
Question: How can I run local-exec only when the instance is running and accepting an SSH connection?
The null_resource is currently only going to wait until the aws_instance resource has completed which in turn only waits until the AWS API returns that it is in the Running state. There's a long gap from there to the instance starting the OS and then being able to accept SSH connections before your local-exec provisioner can connect.
One way to handle this is to use the remote-exec provisioner on the instance first as that has the ability to wait for the instance to be ready. Changing your existing code to handle this would look like this:
resource "aws_instance" "example" {
ami = data.aws_ami.server.id
instance_type = "t2.medium"
key_name = aws_key_pair.deployer.key_name
tags = {
name = "example"
}
vpc_security_group_ids = [aws_security_group.main.id]
}
resource "null_resource" "example" {
provisioner "remote-exec" {
connection {
host = aws_instance.example.public_dns
user = "centos"
file = file("files/id_rsa")
}
inline = ["echo 'connected!'"]
}
provisioner "local-exec" {
command = "ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -T 300 -i ${aws_instance.example.public_dns}, --user centos --private-key files/id_rsa playbook.yml"
}
}
This will first attempt to connect to the instance's public DNS address as the centos user with the files/id_rsa private key. Once it is connected it will then run echo 'connected!' as a simple command before moving on to your existing local-exec provisioner that runs Ansible against the instance.
Note that just being able to connect over SSH may not actually be enough for you to then provision the instance. If your Ansible script tries to interact with your package manager, you may find that it is locked by the instance's user data script still running. If this is the case, you will need to remotely execute a script that waits for cloud-init to complete first. An example script looks like this:
#!/bin/bash
while [ ! -f /var/lib/cloud/instance/boot-finished ]; do
echo -e "\033[1;36mWaiting for cloud-init..."
sleep 1
done
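For example, the wait loop above could be saved to a hypothetical files/wait-for-cloud-init.sh and run from an extra remote-exec provisioner in the null_resource, reusing the same connection details (a sketch, not a definitive implementation):

  provisioner "remote-exec" {
    connection {
      host        = aws_instance.example.public_dns
      user        = "centos"
      private_key = file("files/id_rsa")
    }

    # Uploads the wait script to the instance and runs it before Ansible takes over.
    script = "${path.module}/files/wait-for-cloud-init.sh"
  }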
There is an Ansible-specific solution for this problem. Add this code to your playbook (there is also a pre_tasks clause if you use roles):
- name: will wait till reachable
  hosts: all
  gather_facts: no  # important
  tasks:
    - name: Wait for system to become reachable
      wait_for_connection:

    - name: Gather facts for the first time
      setup:
For cases where instances are not externally exposed (about 90% of the time in most of my projects) and the SSM agent is installed on the target instance (newer AWS AMIs come pre-loaded with it), you can leverage SSM to probe the instance. Here's some sample code:
instanceId=$1

echo "Waiting for instance to bootstrap ..."
tries=0
responseCode=1

while [[ $responseCode != 0 && $tries -le 10 ]]
do
  echo "Try # $tries"
  cmdId=$(aws ssm send-command --document-name AWS-RunShellScript --instance-ids $instanceId --parameters commands="cat /tmp/job-done.txt # or some other validation logic" --query Command.CommandId --output text)
  sleep 5
  responseCode=$(aws ssm get-command-invocation --command-id $cmdId --instance-id $instanceId --query ResponseCode --output text)
  echo "ResponseCode: $responseCode"

  if [ $responseCode != 0 ]; then
    echo "Sleeping ..."
    sleep 60
  fi

  (( tries++ ))
done

echo "Wait time over. ResponseCode: $responseCode"
Assuming you have the AWS CLI installed locally, you can have this null_resource as a prerequisite before you act on the instance. In my case, I was building an AMI.
resource "null_resource" "wait_for_instance" {
depends_on = [
aws_instance.my_instance
]
triggers = {
always_run = "${timestamp()}"
}
provisioner "local-exec" {
command = "${path.module}/scripts/check-instance-state.sh ${aws_instance.my_instance.id}"
}
}