AWS EMR provisioning fails when I use custom AMI - amazon-web-services

Problem:
I have an EMR cluster (along with a number of other resources) defined in a cloudformation template. I use the AWS rest api to provision my stack. It works, I can provision the stack successfully.
Then I made one change: I specified a custom AMI for my EMR cluster.
Now stack creation fails because EMR provisioning fails. The only information I can find is an error on the console: null: Error provisioning instances.. Digging into each instance, I see that the master node failed with error Status: Terminated. Last state change reason: Time out occurred during bootstrap
I have s3 logging configured for my EMR cluster, but there are no logs in the s3 bucket.
Details:
I updated my cloudformation script like so:
my_stack.cfn.yaml:
rMyEmrCluster:
  Type: AWS::EMR::Cluster
  ...
  Properties:
    ...
    CustomAmiId: "ami-xxxxxx" # <-- I added this
Custom AMI details:
I am adding a custom AMI because I need to encrypt the root EBS volume on all of my nodes. (This is required per documentation)
The steps I took to create my custom AMI:
I launched the base AMI that is used by AWS for EMR nodes: emr 5.7.0-ami-roller-27 hvm ebs (ID: ami-8a5cb8f3)
I created an image from my running instance
I created a copy of this image, with EBS root volume encryption enabled. I use the default encryption key. (I must create my own base image from a running instance, because you are not allowed to create an encrypted copy from an AMI you don't own)
I wonder if this might be a permissions issue, or perhaps my AMI is misconfigured in some way. But it would be prudent for me to find some logs first, to figure out exactly what is going wrong with node provisioning.
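The AMI-creation steps listed above can be sketched with boto3. This is a minimal sketch under stated assumptions, not the asker's actual procedure: the function name build_encrypted_ami and the image names are hypothetical, and ec2 is assumed to be a boto3 EC2 client (boto3.client("ec2")) for the given region. Leaving out KmsKeyId makes the copy use the account's default EBS encryption key, as in the last step.

```python
def build_encrypted_ami(ec2, instance_id, region,
                        owned_name="emr-base-owned",
                        encrypted_name="emr-base-encrypted"):
    """Image a running instance, then make an encrypted copy of that image.

    `ec2` is assumed to be a boto3 EC2 client for `region`.
    The two image names are hypothetical placeholders.
    Returns the ID of the encrypted AMI.
    """
    # Create an image that this account owns from the running instance.
    owned_id = ec2.create_image(InstanceId=instance_id, Name=owned_name)["ImageId"]

    # The image must reach the 'available' state before it can be copied.
    ec2.get_waiter("image_available").wait(ImageIds=[owned_id])

    # Copy it with encryption enabled. No KmsKeyId is passed, so the
    # default EBS encryption key is used.
    copied = ec2.copy_image(
        SourceImageId=owned_id,
        SourceRegion=region,
        Name=encrypted_name,
        Encrypted=True,
    )
    return copied["ImageId"]
```

The returned ID is what would go into CustomAmiId in the template.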

I feel stupid. I accidentally used a completely unrelated AMI (a Red Hat 7 image) as the base image, instead of the AMI that EMR uses for its nodes by default: emr 5.7.0-ami-roller-27 hvm ebs (ami-8a5cb8f3)
I'll leave this question and answer up in case someone else makes the same mistake.
Make sure you create your custom AMI from the correct base AMI: emr 5.7.0-ami-roller-27 hvm ebs (ami-8a5cb8f3)

You mention that you created your custom AMI based on an EMR AMI. However, according to the documentation you linked, you should actually base your AMI on "the most recent EBS-backed Amazon Linux AMI". Your custom AMI does not need to be based on an EMR AMI, and indeed I suppose that doing so could cause some problems (though I have not tried it myself).

Related

packer to bake AMI from shared AMI and Share with other AWS Account

I am trying to create an AMI from an AMI that was shared from another account. Since I do not have access to the snapshot, I cannot create or rename the AMI, so I opted to use Packer to bake a new AMI with the custom name I need.
The shared AMI is encrypted, so the newly created AMI is encrypted with the default AWS key, and because of this I cannot share the AMI with other accounts.
(error msg: ==> amazon-ebs.instance: Error modify AMI attributes: InvalidParameter: Snapshots encrypted with the AWS Managed CMK can't be shared. Specify another snapshot)
I need some advice on how to address this issue.
P.S. I need to create a new AMI with a custom name from the shared AMI so that I can share the same AMI across AWS accounts.
I am also open to alternate approaches.
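One possible approach, given that the error says snapshots encrypted with the AWS managed key cannot be shared, is to re-encrypt the AMI under a customer managed KMS key while copying it and then grant launch permission to the other accounts. The sketch below assumes ec2 is a boto3 EC2 client; the function name copy_and_share_ami and all argument values are hypothetical. The target accounts would also need permission to use the KMS key, which this sketch does not set up. Packer's amazon-ebs builder has encrypt_boot, kms_key_id and ami_users options that may achieve the same thing in a single build.

```python
def copy_and_share_ami(ec2, source_image_id, region, new_name,
                       kms_key_id, account_ids):
    """Copy an AMI under a customer managed KMS key and share the copy.

    `ec2` is assumed to be a boto3 EC2 client for `region`.
    `kms_key_id` must be a customer managed key, not the AWS managed default.
    Returns the ID of the new, shared AMI.
    """
    # Re-encrypt during the copy by naming the customer managed key.
    new_id = ec2.copy_image(
        SourceImageId=source_image_id,
        SourceRegion=region,
        Name=new_name,
        Encrypted=True,
        KmsKeyId=kms_key_id,
    )["ImageId"]

    # Launch permissions can only be changed once the copy is available.
    ec2.get_waiter("image_available").wait(ImageIds=[new_id])

    # Grant launch permission to each target account.
    ec2.modify_image_attribute(
        ImageId=new_id,
        LaunchPermission={"Add": [{"UserId": acct} for acct in account_ids]},
    )
    return new_id
```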

'm3.xlarge' is not supported in AWS Data Pipeline

I am new to AWS and am trying to run an AWS Data Pipeline that loads data from DynamoDB to S3, but I am getting the error below. Please help.
Unable to create resource for #EmrClusterForBackup_2020-05-01T14:18:47 due to: Instance type 'm3.xlarge' is not supported. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 3bd57023-95e4-4d0a-a810-e7ba9cdc3712)
I faced the same problem when my DynamoDB table and S3 bucket were in the us-east-2 region and the pipeline was in us-east-1, because I was not allowed to create a pipeline in us-east-2.
Once I created the DynamoDB table and S3 bucket in us-east-1, with the pipeline in the same region, it worked well even with the m3.xlarge instance type.
It is generally better to use latest-generation instances. They are technologically more advanced and sometimes even cheaper.
So there is little reason to start on older generations. They remain available for backward compatibility, for people who already have infrastructure on those machines.
I think this should help you. AWS will force you to use m3 if you use DynamoDBDataNode or resizeClusterBeforeRunning:
https://aws.amazon.com/premiumsupport/knowledge-center/datapipeline-override-instance-type/?nc1=h_ls
I faced the same error but just changing from m3.xlarge to m4.xlarge didn't solve the problem. The DynamoDB table I was trying to export was in eu-west-2 but at the time of writing Data Pipeline is not available in eu-west-2. I found I had to edit the pipeline to change the following:
Instance type from m3.xlarge to m4.xlarge
Release Label from emr-5.23.0 to emr-5.24.0 (not strictly necessary for export but required for import) [1]
Hardcode the region to eu-west-2
So the end result was:
[1] From: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-prereq.html
On-Demand Capacity works only with EMR 5.24.0 or later
DynamoDB tables configured for On-Demand Capacity are supported only when using Amazon EMR release version 5.24.0 or later. When you use a template to create a pipeline for DynamoDB, choose Edit in Architect and then choose Resources to configure the Amazon EMR cluster that AWS Data Pipeline provisions. For Release label, choose emr-5.24.0 or later.
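The three changes listed in the last answer (instance type, release label, hardcoded region) could be applied to a pipeline definition roughly as follows. This is a sketch, not the answerer's actual result, which is not shown in the thread. It assumes the definition is a list of pipeline objects as found in a pipeline definition file, with the cluster described by an object of type EmrCluster carrying the masterInstanceType, coreInstanceType, releaseLabel and region fields. The function name patch_emr_cluster is hypothetical.

```python
import copy


def patch_emr_cluster(pipeline_objects, instance_type="m4.xlarge",
                      release_label="emr-5.24.0", region="eu-west-2"):
    """Return a copy of the pipeline objects with EmrCluster settings overridden.

    Objects of any other type are returned unchanged.
    """
    patched = copy.deepcopy(pipeline_objects)
    for obj in patched:
        if obj.get("type") == "EmrCluster":
            obj["masterInstanceType"] = instance_type
            obj["coreInstanceType"] = instance_type
            obj["releaseLabel"] = release_label
            # Hardcode the region instead of relying on a template parameter.
            obj["region"] = region
    return patched
```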

Can the EC2 instance be created with the latest Image-id of the region when cloudformation template execute

I have a template in which I create an Ubuntu EC2 instance from an image ID that is mapped per region in the template. Is there any way to have the latest Ubuntu image ID selected for the region at template execution time? A sample template would be helpful.
There are a few ways you can achieve this:
A) You can use the Mappings section of the template to specify an AMI for each region. You would then use Fn::FindInMap to retrieve the value of the AMI according to the evaluation of the pseudo parameter AWS::Region.
See:
Mappings - AWS CloudFormation
Fn::FindInMap - AWS CloudFormation
Pseudo Parameters Reference - AWS CloudFormation
B) You can use a Lambda-backed custom resource to retrieve the latest Ubuntu AMI during stack creation. There is a getting started guide for this that you can use as a starting point.
See: Walkthrough: Looking Up Amazon Machine Image IDs - AWS CloudFormation
C) If you can migrate to an Amazon Linux AMI (an RPM-based distribution similar to RHEL), you can reference public Systems Manager parameters for the latest AMI ID for that region. I have an example template on GitHub you can use as a reference.
See: CloudFormationExamples/highlyavailable-asg-lamp-server-alb at master · smith-b/CloudFormationExamples
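Options A and C can be illustrated by generating the relevant template fragments. This is a minimal sketch: the function names, the logical IDs (RegionMap, Instance, LatestAmiId) and the instance type are hypothetical, and any AMI IDs passed in are the caller's own. The default parameter path is the public Systems Manager parameter for the latest Amazon Linux 2 AMI.

```python
import json


def ami_by_mapping_template(region_to_ami, instance_type="t3.micro"):
    """Option A: look up the AMI for the current region with Fn::FindInMap."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Mappings": {
            "RegionMap": {region: {"AMI": ami} for region, ami in region_to_ami.items()}
        },
        "Resources": {
            "Instance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": instance_type,
                    "ImageId": {
                        "Fn::FindInMap": ["RegionMap", {"Ref": "AWS::Region"}, "AMI"]
                    },
                },
            }
        },
    }


def ami_by_ssm_template(
    parameter_path="/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2",
    instance_type="t3.micro",
):
    """Option C: resolve the AMI from a public SSM parameter at deploy time."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "LatestAmiId": {
                "Type": "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>",
                "Default": parameter_path,
            }
        },
        "Resources": {
            "Instance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": instance_type,
                    "ImageId": {"Ref": "LatestAmiId"},
                },
            }
        },
    }


if __name__ == "__main__":
    # Placeholder AMI IDs; replace with real ones for each region.
    print(json.dumps(ami_by_mapping_template({"us-east-1": "ami-xxxxxx"}), indent=2))
```

With option A the mapping still has to be updated by hand when new images are released. Only options B and C resolve the latest image automatically.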

Cannot create AWS EMR with autoscaling via cloudformation

I am working on an EMR template with autoscaling.
While a static EMR setup with an instance group works fine, I cannot attach
AWS::ApplicationAutoScaling::ScalableTarget
As a troubleshooting step, I split my template into two separate ones. In the first I create a normal EMR cluster (which works). In the second I have a ScalableTarget definition, which fails to attach with the error:
11:29:34 UTC+0100 CREATE_FAILED AWS::ApplicationAutoScaling::ScalableTarget AutoscalingTarget EMR instance group doesn't exist: Failed to find Cluster XXXXXXX
Funny thing is that this cluster DOES exist.
I also had a look at the IAM roles, but everything seems to be OK there...
Can anyone advise on this?
Has anyone gotten autoscaling for an instance group to work via CloudFormation?
I have already tried this and raised a request with AWS. This autoscaling feature is not yet available through CloudFormation. For now I use CloudFormation to create the custom EMR security group, S3 resources, and so on, and I emit the command-line command (aws emr create-cluster......) in the stack's Outputs tab. After the stack finishes, I query that output and run it to launch the cluster.
Actually, autoscaling can be enabled at cluster launch time by using --auto-scaling-role. If we use CloudFormation for EMR, the autoscaling feature is not available because it launches the cluster without "--auto-scaling-role".
I hope this is useful...
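For anyone launching the cluster from a script rather than the CLI, the equivalent of --auto-scaling-role in boto3 is the AutoScalingRole argument of run_job_flow. The sketch below only builds the arguments. The function name emr_launch_args and the cluster sizing are hypothetical, and the role names are the EMR default roles, assumed to already exist in the account.

```python
def emr_launch_args(name, release_label, instance_type, core_count,
                    auto_scaling_role="EMR_AutoScaling_DefaultRole"):
    """Build keyword arguments for the boto3 EMR client's run_job_flow call.

    Passing AutoScalingRole at launch is what allows scaling policies
    to be attached to the cluster's instance groups later.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": instance_type, "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": instance_type, "InstanceCount": core_count},
            ],
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default EMR roles; assumed to exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
        "AutoScalingRole": auto_scaling_role,
    }
```

A call would then look like boto3.client("emr").run_job_flow(**emr_launch_args("my-cluster", "emr-5.7.0", "m4.xlarge", 2)), where the cluster name and sizes are placeholders.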

AWS AMI deprecation (API: ec2:RunInstances Not authorized for images)

So I've been using an AWS AMI in my CloudFormation template.
It seems they create new images every month and deprecate the old ones two weeks or so after the new one is released. This creates many problems:
Old template stacks become broken.
Templates need to be updated.
Am I missing something?
E.G.
I'm staring at
API: ec2:RunInstances Not authorized for images: [ami-1523bd2f]
error in my CloudFormation events.
Looking it up, that's the 02.12 image ID:
http://thecloudmarket.com/image/ami-1523bd2f--windows-server-2012-rtm-english-64bit-sql-2012-sp1-web-2014-02-12
Whereas now there's a new image ID:
http://thecloudmarket.com/image/ami-e976efd3--windows-server-2012-rtm-english-64bit-sql-2012-sp1-web-2014-03-12
You are correct indeed. Windows AMIs are deprecated when a new version is released (see http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/Basics_WinAMI.html).
There is no "point and click" solution as of today. The documentation says: "AWS updates the AWS Windows AMIs several times a year. Updating involves deprecating the previous AMI and replacing it with a new AMI and AMI ID. To find an AMI after it's been updated, use the name instead of the ID. The basic structure of the AMI name is usually the same, with a new date added to the end. You can use a query or script to search for an AMI by name, confirm that you've found the correct AMI, and then launch your instance."
One possible solution might be to develop a CloudFormation Custom Resource that would check for AMI availability before launching an EC2 instance.
See this documentation about CFN Custom Resources : http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/crpg-walkthrough.html
And this talk from re:Invent : https://www.youtube.com/watch?v=ZhGMaw67Yu0#t=945 (and this sample code for AMI lookup)
You also have the option to create your own custom AMI based on an Amazon-provided one, even if you do not modify anything. Your custom AMI will be an exact copy of the one provided by Amazon but will stay available after the Amazon AMI is deprecated.
Netflix has open sourced tools to help manage AMIs; have a look at Aminator.
Linux AMIs are deprecated years after release (2003.11 is still available today!) but Windows AMIs are deprecated as soon as a patched version is available. This is for security reasons.
This PowerShell script works for my purposes; we use the Windows 2012 base image:
$imageId = "xxxxxxx"
# If the pinned AMI can no longer be found, fall back to a current Amazon image.
if ( (Get-EC2Image -ImageIds $imageId) -eq $null ) {
    # Restrict the search to Windows images owned by Amazon.
    $f1 = New-Object Amazon.EC2.Model.Filter ; $f1.Name="owner-alias";$f1.Value="amazon"
    $f2 = New-Object Amazon.EC2.Model.Filter ; $f2.Name="platform";$f2.Value="windows"
    # Take the first image whose name starts with the base-image prefix.
    $img = Get-EC2Image -Filters $f1,$f2 | ? {$_.Name.StartsWith("Windows_Server-2012-RTM-English-64Bit-Base")} | Select-Object -First 1
    $imageId = $img.ImageId
}
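The script above takes the first image whose name matches. A variant of the same idea that explicitly picks the newest match could look like the sketch below. It assumes images is the Images list returned by a boto3 describe_images call, where each entry has Name and CreationDate fields. The function name newest_image is hypothetical. CreationDate is an ISO 8601 string, so comparing the strings orders the images chronologically.

```python
def newest_image(images, name_prefix):
    """Return the most recently created image whose name starts with name_prefix.

    Returns None when nothing matches.
    """
    matches = [img for img in images if img.get("Name", "").startswith(name_prefix)]
    if not matches:
        return None
    # ISO 8601 timestamps sort lexicographically in chronological order.
    return max(matches, key=lambda img: img["CreationDate"])
```

The input could come from ec2.describe_images(Owners=["amazon"], Filters=[{"Name": "platform", "Values": ["windows"]}])["Images"], with ec2 being a boto3 EC2 client.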
I recently ran into the same error. I had built a custom AMI in one account and was trying to run an EC2 instance from it in another account.
The issue for me was that the AMI did not have the permissions needed for my user in the other account to launch it.
To fix it, I logged into the other account and added the required permissions to the AMI:
aws ec2 modify-image-attribute --image-id youramiid --launch-permission "Add=[{UserId=youruserid}]"
More information at this documentation page.
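To check whether an account already has launch permission before changing anything, the AMI owner can inspect the image's launchPermission attribute. The sketch below assumes launch_permissions is the LaunchPermissions list returned by a boto3 describe_image_attribute call. The function name account_can_launch is hypothetical.

```python
def account_can_launch(launch_permissions, account_id):
    """Return True if the AMI is public or explicitly shared with account_id."""
    for perm in launch_permissions:
        # A {"Group": "all"} entry means the image is public.
        if perm.get("Group") == "all" or perm.get("UserId") == account_id:
            return True
    return False
```

The list could be fetched, from the owning account, with ec2.describe_image_attribute(ImageId="youramiid", Attribute="launchPermission")["LaunchPermissions"], with ec2 being a boto3 EC2 client.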
If you are following training material and copied its code, make sure to replace the AMI ID with one of the AMIs visible under your account, and do the same for other values. Values cut and pasted from training code may no longer be available.