Packer unable to communicate with AWS Instance - amazon-web-services

I am just playing with packer and I created a simple template like so:
{
  "variables": {
    "aws_access_key": "{{env `AWS_ACCESS_KEY`}}",
    "aws_secret_key": "{{env `AWS_SECRET_KEY`}}"
  },
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "us-east-1",
      "vpc_id": "MY_DEFAULT_VPC_ID",
      "subnet_id": "MY_PUBLIC_SUBNET_ID",
      "source_ami": "ami-a025aeb6",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "packer-example {{timestamp | clean_resource_name}}"
    }
  ]
}
When I run packer build initial_ami.json I get the following error:
amazon-ebs: output will be in this color.
==> amazon-ebs: Prevalidating any provided VPC information
==> amazon-ebs: Prevalidating AMI Name: packer-example 1628354042
amazon-ebs: Found Image ID: ami-a025aeb6
==> amazon-ebs: Creating temporary keypair: packer_***********
==> amazon-ebs: Creating temporary security group for this instance: packer_****
==> amazon-ebs: Authorizing access to port 22 from [0.0.0.0/0] in the temporary security groups...
==> amazon-ebs: Launching a source AWS instance...
==> amazon-ebs: Adding tags to source instance
amazon-ebs: Adding tag: "Name": "Packer Builder"
amazon-ebs: Instance ID: i-******
==> amazon-ebs: Waiting for instance (i-*****) to become ready...
==> amazon-ebs: Using SSH communicator to connect: 172.**.*.**
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Timeout waiting for SSH.
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' errored after 6 minutes 51 seconds: Timeout waiting for SSH.
==> Wait completed after 6 minutes 51 seconds
==> Some builds didn't complete successfully and had errors:
--> amazon-ebs: Timeout waiting for SSH.
==> Builds finished but no artifacts were created.
So everything is fine until it tries to connect to the SSH port of the instance. It seems to be using a 172.*.*.* (private) IP, so I don't think it will be able to reach the instance. My questions are:
Is this issue caused by the fact that Packer is creating the instance without a public IP?
If so, how do I force Packer to create an instance with a public IP and then use that public IP to connect to the SSH service?

I would suggest not using a public IP while building the Packer image. Instead, set ssh_interface to private_ip so the instance can be reached from within the VPC itself, for example when the build runs as part of your CI/CD process. Otherwise you will pay a great deal in data transfer costs if you build images frequently.
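A minimal sketch of that suggestion, with credentials omitted and placeholder VPC/subnet IDs, would be a builder like:
{
  "type": "amazon-ebs",
  "region": "us-east-1",
  "vpc_id": "MY_VPC_ID",
  "subnet_id": "MY_PRIVATE_SUBNET_ID",
  "source_ami": "ami-a025aeb6",
  "instance_type": "t2.micro",
  "ssh_username": "ubuntu",
  "ssh_interface": "private_ip",
  "ami_name": "packer-example {{timestamp | clean_resource_name}}"
}
With ssh_interface set to private_ip, the machine running packer build must itself be able to reach that subnet, e.g. a CI runner inside the same VPC or one connected via VPN/peering.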

OK, I figured it out. All we need to do is set:
"associate_public_ip_address": true
in the builders section of the template file.

Related

How to solve amazon-ebs: Timeout waiting for SSH? Status Blackhole

I run my packer build
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' errored after 6 minutes 23 seconds: Timeout waiting for SSH.
and
==> Some builds didn't complete successfully and had errors:
--> amazon-ebs: Timeout waiting for SSH.
I checked the VPC used in the packer build example.
How to solve this issue?
I removed the default VPC and created a new one with a subnet, then added these two lines to the JSON:
"vpc_id": "vpc-0bb5a477b899e995d",
"subnet_id": "subnet-00934a52461387401",
I got the same error again:
Build 'amazon-ebs' errored after 6 minutes 26 seconds: Timeout waiting for SSH.
I checked the route table again

Ansible - mount Amazon EFS filesystem on Windows Server 2019 using NFS: Network Error 53

I'm trying to mount an AWS EFS filesystem on Windows Server 2019, using NFS, and configuring it with Ansible.
I was already able to mount the same AWS EFS filesystem on a Linux instance in the same Region, VPC and Availability Zone, which makes me think that the AWS EFS part is OK.
This is what I have to configure NFS on the Windows instance:
---
- name: Ensure NFS is installed.
  win_feature:
    name: "{{ nfs_package }}"
    state: present

- name: Add registry key AnonymousGID
  win_regedit:
    path: HKLM:\SOFTWARE\Microsoft\ClientForNFS\CurrentVersion\Default
    name: AnonymousGID
    value: 0
    type: dword

- name: Add registry key AnonymousUID
  win_regedit:
    path: HKLM:\SOFTWARE\Microsoft\ClientForNFS\CurrentVersion\Default
    name: AnonymousUID
    value: 0
    type: dword

- name: Copy BAT file
  win_copy:
    src: nfs_mount_script.bat
    dest: C:\nfs_mount_script.bat

- name: Create scheduled task which will mount the network drive
  win_scheduled_task:
    name: nfs_mount
    description: Map NFS share so that it is visible for Ansible tasks
    actions:
      - path: C:\nfs_mount_script.bat
    triggers:
      - type: boot
    username: SYSTEM
    run_level: highest

- name: Mount an NFS volume
  win_command: C:\nfs_mount_script.bat
This is nfs_mount_script.bat:
mount -o anon fs-0123456789abcdef.efs.eu-central-1.amazonaws.com:/ J:
This is the error in my console output:
amazon-ebs: TASK [foo.jenkins-node.windows : Ensure NFS is installed.] *******************
amazon-ebs: Friday 28 May 2021 21:18:10 +0200 (0:00:00.023) 0:00:56.326 ************
amazon-ebs: changed: [default]
amazon-ebs:
amazon-ebs: TASK [foo.jenkins-node.windows : Add registry key AnonymousGID] **************
amazon-ebs: Friday 28 May 2021 21:19:23 +0200 (0:01:12.874) 0:02:09.201 ************
amazon-ebs: changed: [default]
amazon-ebs:
amazon-ebs: TASK [foo.jenkins-node.windows : Add registry key AnonymousUID] **************
amazon-ebs: Friday 28 May 2021 21:19:25 +0200 (0:00:01.963) 0:02:11.164 ************
amazon-ebs: ok: [default]
amazon-ebs:
amazon-ebs: TASK [foo.jenkins-node.windows : Copy BAT file] ******************************
amazon-ebs: Friday 28 May 2021 21:19:27 +0200 (0:00:01.913) 0:02:13.077 ************
amazon-ebs: changed: [default]
amazon-ebs:
amazon-ebs: TASK [foo.jenkins-node.windows : Create scheduled task which will mount the network drive] ***
amazon-ebs: Friday 28 May 2021 21:19:31 +0200 (0:00:03.667) 0:02:16.745 ************
amazon-ebs: changed: [default]
amazon-ebs:
amazon-ebs: TASK [foo.jenkins-node.windows : Mount an NFS volume] ************************
amazon-ebs: Friday 28 May 2021 21:19:33 +0200 (0:00:02.482) 0:02:19.227 ************
amazon-ebs: fatal: [default]: FAILED! => changed=true
amazon-ebs: cmd: C:\nfs_mount_script.bat
amazon-ebs: delta: '0:00:47.121981'
amazon-ebs: end: '2021-05-28 07:20:22.253220'
amazon-ebs: msg: non-zero return code
amazon-ebs: rc: 1
amazon-ebs: start: '2021-05-28 07:19:35.131239'
amazon-ebs: stderr: ''
amazon-ebs: stderr_lines: <omitted>
amazon-ebs: stdout: |2-
amazon-ebs:
amazon-ebs: C:\Users\Administrator>mount -o anon fs-0123456789abcdef.efs.eu-central-1.amazonaws.com:/ J:
amazon-ebs: Network Error - 53
amazon-ebs:
amazon-ebs: Type 'NET HELPMSG 53' for more information.
amazon-ebs: stdout_lines: <omitted>
Already tried:
Googling NET HELPMSG 53 - not very helpful or I wouldn't ask here.
Replacing mount -o anon fs-0123456789abcdef.efs.eu-central-1.amazonaws.com:/ J: with mount -o anon \\fs-03614eb713a56f8c2.efs.eu-central-1.amazonaws.com\ J: - neither of the two works.
For reference, this is the corresponding Ansible code on a Linux (Ubuntu) instance, where it does work:
---
- name: Ensure NFS is installed.
  package:
    name: "{{ nfs_package }}"
    state: present

- name: Create a mountable directory if it does not exist
  file:
    path: "{{ efs_mount_dir }}"
    state: directory
    owner: "{{ jenkins_user }}"
    group: "{{ jenkins_user }}"
    mode: '0775'

- name: Mount an NFS volume
  mount:
    name: "{{ efs_mount_dir }}"
    src: "{{ efs_file_system_id }}.efs.{{ aws_region }}.amazonaws.com:/"
    fstype: nfs4
    opts: nfsvers=4.1
    state: mounted
What are the magic Ansible incantations that I need to copy/paste into my YAML file so that the Windows Server will mount the EFS filesystem?
The Microsoft-supplied NFS client in Windows Server 2022 (and below) only supports NFSv3. EFS requires NFSv4.0 or NFSv4.1, so the MS client is not going to work. (Note that the Windows NFS server does use NFSv4.)
If you want a commercially supported client, OpenText sells a client that works (it does require a little registry work).
https://www.opentext.com/products-and-solutions/products/specialty-technologies/connectivity/nfs-client-nfs-solo
Other options are free but dated and take more effort/maintenance on your side:
http://citi.umich.edu/projects/nfsv4/windows/
https://github.com/contentfree/ms-nfs41-client
Amazon EFS is not supported on Windows instances.
https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/AmazonEFS.html

Can't get SSH connections through AWS Session Manager working

I have an EC2 instance in a private subnet into which I want to copy files.
Instead of an S3 bucket I want to use secure file copy through Session Manager, as documented here and announced here.
The running EC2 instance has an instance profile attached that contains the policy AmazonEC2RoleforSSM. On my local machine (macOS 10.14.5) the AWS CLI (aws-cli/1.16.195) and the Session Manager Plugin (1.1.26.0) are installed, and .ssh/config is configured accordingly.
I can log into the instance with Session Manager on the web AWS Console.
I can log into the instance using the CLI with aws ssm start-session --target i-XXX.
I can't log into the instance using SSH. I've tried 2 different OpenSSH client versions:
OpenSSH_7.9p1:
When I run ssh ec2-user@i-XXX it hangs indefinitely. However, I can see a connected session in the Session Manager. When I SIGTERM the process I get the following output and the session is terminated:
Command '['session-manager-plugin', '{"SessionId": "XXX", "TokenValue": "XXX", "StreamUrl": "wss://ssmmessages.eu-central-1.amazonaws.com/v1/data-channel/XXX?role=publish_subscribe", "ResponseMetadata": {"RetryAttempts": 0, "HTTPStatusCode": 200, "RequestId": "XXX", "HTTPHeaders": {"x-amzn-requestid": "XXX", "date": "Wed, 07 Aug 2019 08:47:23 GMT", "content-length": "579", "content-type": "application/x-amz-json-1.1"}}}', 'eu-central-1', 'StartSession', u'cc', '{"DocumentName": "AWS-StartSSHSession", "Target": "i-XXX", "Parameters": {"portNumber": ["22"]}}', u'https://ssm.eu-central-1.amazonaws.com']' returned non-zero exit status -13
OpenSSH_8.0p1:
When I run ssh ec2-user@i-XXX I get the following error and need to manually terminate the session in the Session Manager:
kex_exchange_identification: banner line contains invalid characters
I just got an answer from AWS Support and it is working for me now. There was a bug in one of the components below.
Ensure at least the following versions and it should work:
local:
aws cli: aws-cli/1.16.213 Python/3.7.2 Darwin/18.7.0 botocore/1.12.203 (check with aws --version)
session-manager-plugin: 1.1.26.0 (check with session-manager-plugin --version)
target EC2 instance:
amazon-ssm-agent: 2.3.687.0 (for Amazon Linux, check with yum info amazon-ssm-agent | grep "^Version")
I've also created a neat SSH ProxyCommand script that temporarily adds your public SSH key to the target instance for the duration of the connection.
AWS SSM SSH ProxyCommand -> https://gist.github.com/qoomon/fcf2c85194c55aee34b78ddcaa9e83a1
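For reference, the plain ~/.ssh/config stanza that the AWS Session Manager documentation suggests (without the key-handling logic from the gist above) looks like this:
# Route ssh connections to EC2 instance IDs through Session Manager
host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"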

How to use fluentd log driver on Elastic Beanstalk Multicontainer docker

I tried to use the fluentd log driver with the following Dockerrun.aws.json,
{
  "AWSEBDockerrunVersion": 2,
  "containerDefinitions": [
    {
      "name": "apache",
      "image": "php:5.6-apache",
      "essential": true,
      "memory": 128,
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 80
        }
      ],
      "logConfiguration": {
        "logDriver": "fluentd",
        "options": {
          "fluentd-address": "127.0.0.1:24224"
        }
      }
    }
  ]
}
but the following error occurred.
ERROR: Encountered error starting new ECS task: {
  "failures": [
    {
      "reason": "ATTRIBUTE",
      "arn": "arn:aws:ecs:ap-northeast-1:000000000000:container-instance/00000000-0000-0000-0000-000000000000"
    }
  ],
  "tasks": []
}
ERROR: Failed to start ECS task after retrying 2 times.
ERROR: [Instance: i-00000000] Command failed on instance. Return code: 1 Output: beanstalk/hooks/appdeploy/enact/03start-task.sh failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
What should I configure?
It seems that you can also accomplish this with a .ebextensions/01-fluentd.config file in your application environment directory, with the following content:
files:
  "/home/ec2-user/setup-available-log-dirvers.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/sh
      set -e
      if ! grep fluentd /etc/ecs/ecs.config &> /dev/null
      then
        echo 'ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","syslog","fluentd"]' >> /etc/ecs/ecs.config
      fi

container_commands:
  01-configure-fluentd:
    command: /home/ec2-user/setup-available-log-dirvers.sh
Now deploy a new application version (without the fluentd configuration yet), rebuild your environment, and then add the fluentd configuration:
logConfiguration:
  logDriver: fluentd
  options:
    fluentd-address: localhost:24224
    fluentd-tag: docker.myapp
Finally, deploy the updated app; everything should work now.
I have resolved the problem myself.
First, I prepared a custom AMI with the following user data:
#cloud-config
repo_releasever: 2015.09
repo_upgrade: none
runcmd:
  - echo 'ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","syslog","fluentd"]' >> /etc/ecs/ecs.config
Second, I set the ID of that custom AMI in my environment's EC2 settings. Finally, I deployed my application to Elastic Beanstalk. After this, the fluentd log driver in my environment works normally.
To use the fluentd log driver in Elastic Beanstalk Multicontainer Docker, the ECS_AVAILABLE_LOGGING_DRIVERS variable must be defined in /etc/ecs/ecs.config. Elastic Beanstalk Multicontainer Docker uses ECS under the hood, so the related settings are described in the ECS documentation.
Please read the logConfiguration section in the following documentation:
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html
I have already added a comment to the accepted answer; here is the complete ebextensions file that I used to make it work for me:
files:
  "/home/ec2-user/setup-available-log-dirvers.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/sh
      set -e
      if ! grep fluentd /etc/ecs/ecs.config &> /dev/null
      then
        echo 'ECS_AVAILABLE_LOGGING_DRIVERS=["json-file","syslog","fluentd"]' >> /etc/ecs/ecs.config
      fi

container_commands:
  00-configure-fluentd:
    command: /home/ec2-user/setup-available-log-dirvers.sh
  01-stop-ecs:
    command: stop ecs
  02-stop-ecs:
    command: start ecs
We are just restarting ECS after setting the logging drivers.

Having trouble creating a basic AWS AMI with Packer.io. SSH Timeout

I'm trying to follow these instructions to build a basic AWS image using Packer.io. But it is not working for me.
Here is my Template file:
{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": ""
  },
  "builders": [{
    "type": "amazon-ebs",
    "access_key": "{{user `aws_access_key`}}",
    "secret_key": "{{user `aws_secret_key`}}",
    "region": "us-east-1",
    "source_ami": "ami-146e2a7c",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "packer-example {{timestamp}}",

    # The following 2 lines don't appear in the tutorial.
    # But I had to add them because it said this source AMI
    # must be launched inside a VPC.
    "vpc_id": "vpc-98765432",
    "subnet_id": "subnet-12345678"
  }]
}
You will notice that I had to deviate from the instructions by adding the two lines at the bottom (for VPC and subnets). This is because I kept getting the following error:
==> amazon-ebs: Error launching source instance: The specified instance type
can only be used in a VPC. A subnet ID or network interface
ID is required to carry out the request.
(VPCResourceNotSpecified)
That VPC and subnet are temporary ones that I had to create manually. But why should I have to do that? Why doesn't Packer create those and then delete them, like I see it does for the temporary security group and key pair?
Furthermore, even after I add those two lines, it fails to create the AMI because it gets an SSH timeout. Why? I am having no trouble manually SSHing to other instances in this VPC. The temporary Packer instance has InstanceState=Running, StatusChecks=2/2 and a security group that allows SSH from all over the world.
See the debug output of the packer command below:
$ packer build -debug -var 'aws_access_key=MY_ACCESS_KEY' -var 'aws_secret_key=MY_SECRET_KEY' packer_config_basic.json
Debug mode enabled. Builds will not be parallelized.
amazon-ebs output will be in this color.
==> amazon-ebs: Inspecting the source AMI...
==> amazon-ebs: Pausing after run of step 'StepSourceAMIInfo'. Press enter to continue.
==> amazon-ebs: Creating temporary keypair: packer 99999999-8888-7777-6666-555555555555
amazon-ebs: Saving key for debug purposes: ec2_amazon-ebs.pem
==> amazon-ebs: Pausing after run of step 'StepKeyPair'. Press enter to continue.
==> amazon-ebs: Creating temporary security group for this instance...
==> amazon-ebs: Authorizing SSH access on the temporary security group...
==> amazon-ebs: Pausing after run of step 'StepSecurityGroup'. Press enter to continue.
==> amazon-ebs: Launching a source AWS instance...
amazon-ebs: Instance ID: i-12345678
==> amazon-ebs: Waiting for instance (i-12345678) to become ready...
amazon-ebs: Private IP: 10.0.2.204
==> amazon-ebs: Pausing after run of step 'StepRunSourceInstance'. Press enter to continue.
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Timeout waiting for SSH.
==> amazon-ebs: Pausing before cleanup of step 'StepRunSourceInstance'. Press enter to continue.
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Pausing before cleanup of step 'StepSecurityGroup'. Press enter to continue.
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Pausing before cleanup of step 'StepKeyPair'. Press enter to continue.
==> amazon-ebs: Deleting temporary keypair...
==> amazon-ebs: Pausing before cleanup of step 'StepSourceAMIInfo'. Press enter to continue.
Build 'amazon-ebs' errored: Timeout waiting for SSH.
==> Some builds didn't complete successfully and had errors:
--> amazon-ebs: Timeout waiting for SSH.
==> Builds finished but no artifacts were created.
You're using the t2.micro instance type, which can only run in a VPC (see T2 Instances).
Since you are in a VPC, by default all traffic is behind the firewall, so you'll need to set up a security group that allows your IP to access the SSH port on that instance.
An easier way is to use the m3.medium instance type: a bit more expensive, but it runs everything quicker and you don't need to set up a VPC or security groups at all.
Make sure that: 1) an internet gateway (active, not blackhole) is attached to the VPC where you are launching the instance, and 2) the route table has a route to the current (not an old) internet gateway.
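A quick way to check both points, assuming the AWS CLI is configured (replace vpc-xxxxxxxx with the VPC the builder uses):
# Every route's State should be "active", and the 0.0.0.0/0 route should point
# at the internet gateway that is currently attached to the VPC.
aws ec2 describe-route-tables \
  --filters "Name=vpc-id,Values=vpc-xxxxxxxx" \
  --query "RouteTables[].Routes[]"

# Confirm which internet gateway is attached to the VPC.
aws ec2 describe-internet-gateways \
  --filters "Name=attachment.vpc-id,Values=vpc-xxxxxxxx"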