Having trouble creating a basic AWS AMI with Packer.io: SSH timeout

I'm trying to follow these instructions to build a basic AWS image using Packer.io, but it is not working for me.
Here is my template file:
{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": ""
  },
  "builders": [{
    "type": "amazon-ebs",
    "access_key": "{{user `aws_access_key`}}",
    "secret_key": "{{user `aws_secret_key`}}",
    "region": "us-east-1",
    "source_ami": "ami-146e2a7c",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "packer-example {{timestamp}}",
    "vpc_id": "vpc-98765432",
    "subnet_id": "subnet-12345678"
  }]
}
You will notice that I had to deviate from the instructions by adding the two lines at the bottom (vpc_id and subnet_id). They don't appear in the tutorial, but I had to add them because I kept getting the following error:
==> amazon-ebs: Error launching source instance: The specified instance type
can only be used in a VPC. A subnet ID or network interface
ID is required to carry out the request.
(VPCResourceNotSpecified)
That VPC and subnet are temporary ones that I had to create manually. But why should I have to do that? Why doesn't Packer create and then delete them, just as I see it create a temporary security group and key pair?
Furthermore, even after I add those two lines, it fails to create the AMI because it hits an SSH timeout. Why? I have no trouble manually SSHing to other instances in this VPC. The temporary Packer instance has InstanceState=Running, StatusChecks=2/2, and a security group that allows SSH from anywhere.
See the debug output of the packer command below:
$ packer build -debug -var 'aws_access_key=MY_ACCESS_KEY' -var 'aws_secret_key=MY_SECRET_KEY' packer_config_basic.json
Debug mode enabled. Builds will not be parallelized.
amazon-ebs output will be in this color.
==> amazon-ebs: Inspecting the source AMI...
==> amazon-ebs: Pausing after run of step 'StepSourceAMIInfo'. Press enter to continue.
==> amazon-ebs: Creating temporary keypair: packer 99999999-8888-7777-6666-555555555555
amazon-ebs: Saving key for debug purposes: ec2_amazon-ebs.pem
==> amazon-ebs: Pausing after run of step 'StepKeyPair'. Press enter to continue.
==> amazon-ebs: Creating temporary security group for this instance...
==> amazon-ebs: Authorizing SSH access on the temporary security group...
==> amazon-ebs: Pausing after run of step 'StepSecurityGroup'. Press enter to continue.
==> amazon-ebs: Launching a source AWS instance...
amazon-ebs: Instance ID: i-12345678
==> amazon-ebs: Waiting for instance (i-12345678) to become ready...
amazon-ebs: Private IP: 10.0.2.204
==> amazon-ebs: Pausing after run of step 'StepRunSourceInstance'. Press enter to continue.
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Timeout waiting for SSH.
==> amazon-ebs: Pausing before cleanup of step 'StepRunSourceInstance'. Press enter to continue.
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Pausing before cleanup of step 'StepSecurityGroup'. Press enter to continue.
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Pausing before cleanup of step 'StepKeyPair'. Press enter to continue.
==> amazon-ebs: Deleting temporary keypair...
==> amazon-ebs: Pausing before cleanup of step 'StepSourceAMIInfo'. Press enter to continue.
Build 'amazon-ebs' errored: Timeout waiting for SSH.
==> Some builds didn't complete successfully and had errors:
--> amazon-ebs: Timeout waiting for SSH.
==> Builds finished but no artifacts were created.

You're using the t2.micro instance type, which can only run inside a VPC (see T2 Instances).
Since you are in a VPC, all traffic is behind the firewall by default, so you'll need to set up a security group that allows your IP to reach the SSH port on that instance.
An easier way is to use the m3.medium instance type: it is a bit more expensive, but it runs everything faster and it can launch outside a VPC, so you don't need to set up the VPC/security groups at all.
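If you go that route, the change is minimal; a sketch of the builder with the instance type swapped and the VPC lines dropped (same placeholder values as the question):

    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "us-east-1",
      "source_ami": "ami-146e2a7c",
      "instance_type": "m3.medium",
      "ssh_username": "ubuntu",
      "ami_name": "packer-example {{timestamp}}"
    }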

Make sure that: 1) an internet gateway (active, not blackhole) is attached to the VPC where the instance is being launched, and 2) the route table for the subnet actually has a route to that internet gateway (the current one, not an old one).
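A quick way to verify both from the CLI (using the question's placeholder VPC ID; substitute your own):

    aws ec2 describe-internet-gateways \
        --filters "Name=attachment.vpc-id,Values=vpc-98765432"
    aws ec2 describe-route-tables \
        --filters "Name=vpc-id,Values=vpc-98765432" \
        --query "RouteTables[].Routes"

The first call should return an attached gateway; the second should show a 0.0.0.0/0 route whose state is active, not blackhole.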

Related

How to solve amazon-ebs: Timeout waiting for SSH? Status Blackhole

I ran my packer build and it ended with:
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' errored after 6 minutes 23 seconds: Timeout waiting for SSH.
and
==> Some builds didn't complete successfully and had errors:
--> amazon-ebs: Timeout waiting for SSH.
I checked my VPC (the one used in the packer build example).
How do I solve this issue?
I removed the default VPC and created a new one with a subnet, then added two lines to the JSON:
    "vpc_id": "vpc-0bb5a477b899e995d",
    "subnet_id": "subnet-00934a52461387401",
I got the same error again:
Build 'amazon-ebs' errored after 6 minutes 26 seconds: Timeout waiting for SSH.
I checked the route table again.
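A blackhole status in the route table means the 0.0.0.0/0 route still points at a gateway that no longer exists (for example, the internet gateway of the deleted default VPC). One possible fix, sketched with hypothetical IDs, is to replace the stale route with one targeting the current internet gateway:

    # delete the stale blackhole route and re-create it
    # against the current IGW (IDs are hypothetical)
    aws ec2 delete-route --route-table-id rtb-0123456789abcdef0 \
        --destination-cidr-block 0.0.0.0/0
    aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
        --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0123456789abcdef0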

Packer unable to communicate with AWS Instance

I am just playing with packer and I created a simple template like so:
{
  "variables": {
    "aws_access_key": "{{env `AWS_ACCESS_KEY`}}",
    "aws_secret_key": "{{env `AWS_SECRET_KEY`}}"
  },
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "us-east-1",
      "vpc_id": "MY_DEFAULT_VPC_ID",
      "subnet_id": "MY_PUBLIC_SUBNET_ID",
      "source_ami": "ami-a025aeb6",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "packer-example {{timestamp | clean_resource_name}}"
    }
  ]
}
When I run packer build initial_ami.json I get the following error:
amazon-ebs: output will be in this color.
==> amazon-ebs: Prevalidating any provided VPC information
==> amazon-ebs: Prevalidating AMI Name: packer-example 1628354042
amazon-ebs: Found Image ID: ami-a025aeb6
==> amazon-ebs: Creating temporary keypair: packer_***********
==> amazon-ebs: Creating temporary security group for this instance: packer_****
==> amazon-ebs: Authorizing access to port 22 from [0.0.0.0/0] in the temporary security groups...
==> amazon-ebs: Launching a source AWS instance...
==> amazon-ebs: Adding tags to source instance
amazon-ebs: Adding tag: "Name": "Packer Builder"
amazon-ebs: Instance ID: i-******
==> amazon-ebs: Waiting for instance (i-*****) to become ready...
==> amazon-ebs: Using SSH communicator to connect: 172.**.*.**
==> amazon-ebs: Waiting for SSH to become available...
==> amazon-ebs: Timeout waiting for SSH.
==> amazon-ebs: Terminating the source AWS instance...
==> amazon-ebs: Cleaning up any extra volumes...
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary security group...
==> amazon-ebs: Deleting temporary keypair...
Build 'amazon-ebs' errored after 6 minutes 51 seconds: Timeout waiting for SSH.
==> Wait completed after 6 minutes 51 seconds
==> Some builds didn't complete successfully and had errors:
--> amazon-ebs: Timeout waiting for SSH.
==> Builds finished but no artifacts were created.
So everything is fine until it tries to connect to the SSH port of the instance. It seems to be using the 172.**.*.** (private) IP, so I don't think it can reach the instance. My questions are:
Is this issue caused by the fact that Packer is creating an instance without a public IP?
If so, how do I force Packer to create an instance with a public IP and then use that public IP to connect to the SSH service?
I would suggest not using a public IP while building the Packer image; instead, set ssh_interface to private_ip so the instance can be reached from within the VPC itself, e.g. when the build runs as part of your CI/CD process. Otherwise you would be charged a lot for data transfer if you build frequently. See the sketch below.
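A minimal sketch of that option, added to the builder from the question (ssh_interface is an amazon-ebs builder option; the surrounding values are the question's placeholders):

    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "vpc_id": "MY_DEFAULT_VPC_ID",
      "subnet_id": "MY_PUBLIC_SUBNET_ID",
      "source_ami": "ami-a025aeb6",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ssh_interface": "private_ip",
      "ami_name": "packer-example {{timestamp | clean_resource_name}}"
    }

This only works from a machine that already has network access to the VPC (for example, a CI runner inside it or a VPN connection).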
OK, I figured it out. All we need to do is set "associate_public_ip_address": true in the builders section of the template file.

How to launch workers on AWS EC2 instance when running locally?

I have put together a distributed setup at my university using the Distributed package that comes with Julia for running some intensive simulations. I usually launch workers on local machines through ssh using addprocs.
I have launched a c5.24xlarge EC2 instance. The aws_key.pem file exists, and I have done
chmod 400 aws_key.pem
I am able to ssh into the instance just fine.
I am trying to add workers with the following code
workervec2 = [("ubuntu@ec2-xxxx:22", 24)]
addprocs(workervec2; sshflags="-i aws_key.pem",
         tunnel=true, exename="/home/ubuntu/julia-1.0.4/bin/julia",
         dir="/home/ubuntu/simulator")
I am trying to add additional workers on my Amazon EC2 instance, but it fails with the following error:
Warning: Identity file aws_key.pem not accessible: No such file or directory.
ubuntu@ec2-xxxx: Permission denied (publickey).
ERROR: LoadError: Unable to read host:port string from worker. Launch command exited with error?
The warning appears even when launching workers on the local machines, but there the launch goes through. Launching on my EC2 instance, however, fails with the error above, even though I am able to ssh in from the terminal. What is going wrong?
Adding the ssh key from my local machine to the EC2 instance did the trick.
Then workers can be added as usual:
workervec2 = [("ubuntu@ec2-xxxx:22", 24)]
addprocs(workervec2; sshflags="-i ~/.ssh/id_rsa.pub",
         tunnel=true, exename="/home/ubuntu/julia-1.0.4/bin/julia",
         dir="/home/ubuntu/simulator")

Can't get SSH connections through AWS Session Manager working

I have an EC2 instance in a private subnet into which I want to copy files.
Instead of an S3 bucket, I want to use secure file copy through Session Manager, as documented here and announced here.
The running EC2 instance has an instance profile attached that contains the AmazonEC2RoleforSSM policy. On my local machine (macOS 10.14.5), the AWS CLI (aws-cli/1.16.195) and the Session Manager Plugin (1.1.26.0) are installed, and .ssh/config is configured accordingly.
I can log into the instance with Session Manager on the web AWS Console.
I can log into the instance using the CLI with aws ssm start-session --target i-XXX.
I can't log into the instance using SSH. I've tried 2 different OpenSSH client versions:
OpenSSH_7.9p1:
When I run ssh ec2-user@i-XXX it hangs indefinitely. However, I can see a connected session in the Session Manager. When I SIGTERM the process, I get the following output and the session is terminated:
Command '['session-manager-plugin', '{"SessionId": "XXX", "TokenValue": "XXX", "StreamUrl": "wss://ssmmessages.eu-central-1.amazonaws.com/v1/data-channel/XXX?role=publish_subscribe", "ResponseMetadata": {"RetryAttempts": 0, "HTTPStatusCode": 200, "RequestId": "XXX", "HTTPHeaders": {"x-amzn-requestid": "XXX", "date": "Wed, 07 Aug 2019 08:47:23 GMT", "content-length": "579", "content-type": "application/x-amz-json-1.1"}}}', 'eu-central-1', 'StartSession', u'cc', '{"DocumentName": "AWS-StartSSHSession", "Target": "i-XXX", "Parameters": {"portNumber": ["22"]}}', u'https://ssm.eu-central-1.amazonaws.com']' returned non-zero exit status -13
OpenSSH_8.0p1:
When I run ssh ec2-user@i-XXX I get the following error and have to manually terminate the session in the Session Manager:
kex_exchange_identification: banner line contains invalid characters
I just got an answer from AWS Support, and it is working for me now. There was a bug in one of the following components. Ensure at least the following versions and it should work:
local:
aws cli: aws-cli/1.16.213 Python/3.7.2 Darwin/18.7.0 botocore/1.12.203 (check with aws --version)
session-manager-plugin: 1.1.26.0 (check with session-manager-plugin --version)
target EC2 instance:
amazon-ssm-agent: 2.3.687.0 (on Amazon Linux, check with yum info amazon-ssm-agent | grep "^Version")
I've also created a neat SSH ProxyCommand script that temporarily adds your public ssh key to the target instance for the duration of the connection:
AWS SSM SSH ProxyCommand -> https://gist.github.com/qoomon/fcf2c85194c55aee34b78ddcaa9e83a1
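For reference, the basic .ssh/config entry from the AWS Session Manager documentation (which the "configured accordingly" above presumably follows) is:

    # route ssh to instance IDs through the Session Manager plugin
    Host i-* mi-*
        ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"

With that in place, ssh ec2-user@i-XXX is tunneled over SSM instead of a direct TCP connection.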

Why does my website stop loading on an AWS EC2 instance randomly once in a while?

I am running a t2.micro EC2 instance in us-west-2a and the instance's state is all green.
When I access my website, it stops loading once in a while. Even if I reboot the instance, the website still doesn't load. When I stop the instance and then relaunch it, it shows "1/2 status checks failed".
ALARM TYPE: awsec2-i-20aaa52c-High-Network-Out
I also faced the same type of issue.
EC2 instances were failing their instance status checks after a stop/start. I was able to look at the system logs available to support and could confirm that the system was having a kernel panic and was unable to boot from the root volume.
So I launched a new temporary EC2 instance and attached the EBS root volume of each affected instance to it. There we modified the grub configuration file so it would load a previous kernel.
The following commands:
1. Mount the EBS volume as a secondary volume under /mnt: sudo mount /dev/xvdf1 /mnt
2. Back up the grub.cfg file: sudo cp /mnt/boot/grub2/grub.cfg grub.cfg_backup
3. Edit the grub.cfg file: sudo vim /mnt/boot/grub2/grub.cfg
4. Comment out (with #) all the lines of the first menu entry, the one that loads the new kernel.
Then we attached the original EBS volumes back to the original EC2 instances, and they were able to boot successfully.
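The detach/reattach around those steps can also be done from the CLI; a sketch with hypothetical IDs (the root device name may differ per AMI):

    # on the rescue instance: unmount the repaired volume
    sudo umount /mnt
    # detach it and attach it back to the original instance as the root device
    aws ec2 detach-volume --volume-id vol-0123456789abcdef0
    aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
        --instance-id i-20aaa52c --device /dev/xvda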
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/TroubleshootingInstances.html#FilesystemKernel