Cannot create AWS EMR with autoscaling via CloudFormation

I am working on an EMR template with autoscaling.
While a static EMR setup with an instance group works fine, I cannot attach an
AWS::ApplicationAutoScaling::ScalableTarget
to it. To troubleshoot, I've split my template into two separate ones. In the first I create a normal EMR cluster (which works fine). In the second I have the ScalableTarget definition, which fails to attach with this error:
11:29:34 UTC+0100 CREATE_FAILED AWS::ApplicationAutoScaling::ScalableTarget AutoscalingTarget EMR instance group doesn't exist: Failed to find Cluster XXXXXXX
The funny thing is that this cluster DOES exist.
I also had a look at IAM roles but everything seems to be ok there...
Can anyone advise on this matter?
Has anyone gotten an autoscaling instance group to work via CloudFormation?
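For reference, this is the boto3 equivalent of the target I am trying to register (the cluster id, instance group id, and role ARN are placeholders; Application Auto Scaling expects the ResourceId in the form instancegroup/<clusterId>/<instanceGroupId>):

import boto3

aas = boto3.client("application-autoscaling")

aas.register_scalable_target(
    ServiceNamespace="elasticmapreduce",
    # ResourceId format: instancegroup/<clusterId>/<instanceGroupId>
    ResourceId="instancegroup/j-XXXXXXX/ig-XXXXXXX",
    ScalableDimension="elasticmapreduce:instancegroup:InstanceCount",
    MinCapacity=1,
    MaxCapacity=10,
    RoleARN="arn:aws:iam::123456789012:role/EMR_AutoScaling_DefaultRole",
)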

I have already tried this and raised a request with AWS: this autoscaling feature is not yet available using CloudFormation. For now I use CloudFormation for the custom EMR security group creation, S3, etc., and in the Outputs tab I add the command-line command (aws emr create-cluster ......). After getting the output, I query the result to launch the cluster.
In fact, autoscaling can be enabled at cluster launch time by using --auto-scaling-role. If we use CloudFormation for EMR, the autoscaling feature is not available because CloudFormation launches the cluster without "--auto-scaling-role".
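A minimal sketch of the same idea from Python, assuming the standard default EMR roles already exist in the account; the CLI flag --auto-scaling-role corresponds to the AutoScalingRole parameter of boto3's run_job_flow (the cluster name, region, and instance types below are placeholders):

import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="my-autoscaling-cluster",
    ReleaseLabel="emr-5.29.0",
    ServiceRole="EMR_DefaultRole",
    JobFlowRole="EMR_EC2_DefaultRole",
    # This is what the CLI's --auto-scaling-role sets; CloudFormation
    # launches the cluster without it.
    AutoScalingRole="EMR_AutoScaling_DefaultRole",
    Instances={
        "InstanceGroups": [
            {"Name": "Master", "InstanceRole": "MASTER",
             "InstanceType": "m4.xlarge", "InstanceCount": 1},
            # An AutoScalingPolicy (Constraints plus Rules) can be
            # attached to the core group here.
            {"Name": "Core", "InstanceRole": "CORE",
             "InstanceType": "m4.xlarge", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
)
print(response["JobFlowId"])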
I hope this can be useful...

Related

The results of `aws eks list-nodegroups` and `eksctl get nodegroups` are inconsistent

eksctl get nodegroups --cluster=cluster-name --profile=dev
aws eks list-nodegroups --cluster-name=cluster-name --profile=dev
The first command returns the correct result.
The second returns an empty result, as follows:
{
"nodegroups": []
}
I used these two commands to get the nodegroups of the cluster, but found that the results were inconsistent.
Both commands use the same configuration file, ~/.aws/config.
I double-checked the cluster name: both commands detect the cluster correctly, but the second cannot find any nodegroup.
Thanks in advance
According to eksctl documentation:
Listing nodegroups
To list the details about a nodegroup or all of the nodegroups, use:
eksctl get nodegroup --cluster=<clusterName> [--name=<nodegroupName>]
Nodegroup immutability
By design, nodegroups are immutable. This means that if you need to
change something (other than scaling) like the AMI or the instance
type of a nodegroup, you would need to create a new nodegroup with the
desired changes, move the load and delete the old one. Check
Deleting and
draining.
And for list-nodegroups, from the AWS documentation:
Lists the Amazon EKS managed node groups associated with the specified cluster in your AWS account in the specified Region. Self-managed node groups are not listed.
As you can see, the commands differ: self-managed node groups are not listed by the second command. If eksctl created your nodegroups as unmanaged (i.e. self-managed) nodegroups, that would explain the empty list.
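A quick way to confirm this from Python, using the same boto3 call that backs aws eks list-nodegroups (the cluster name and region are placeholders): only managed node groups are returned.

import boto3

eks = boto3.client("eks", region_name="us-east-1")

managed = eks.list_nodegroups(clusterName="cluster-name")["nodegroups"]
print(managed)  # [] if every node group is self-managed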

AWS codedeploy blue green deployment

I have set up CodePipeline for end-to-end automatic deployment of revisions to EC2 instances using a CloudFormation template; the CodeDeploy deployment group is of the blue/green type.
But I don't understand how to keep the deployment group in sync with the newly created (green) Auto Scaling group.
Do I have to add a new Lambda invoke action to the pipeline, after a successful deployment, to update the deployment group with the new Auto Scaling group name?
Unfortunately, CloudFormation does not support blue/green deployments for the EC2 platform:
For blue/green deployments, AWS CloudFormation supports deployments on Lambda compute platforms only.
Support for ECS is very new.
To create a blue/green deployment group for the EC2 platform, you would have to create a custom resource in CloudFormation.
The custom resource would be backed by a Lambda function, and in that Lambda function you would use create_deployment_group to define the blue/green details for your EC2 instances. As part of this process, you can choose how to deal with the Auto Scaling group, e.g.:
"greenFleetProvisioningOption": {
"action": "COPY_AUTO_SCALING_GROUP"
}
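A sketch of what that Lambda might call, assuming the CodeDeploy application, service role, blue Auto Scaling group, and target group (all placeholder names below) already exist:

import boto3

cd = boto3.client("codedeploy")

cd.create_deployment_group(
    applicationName="my-app",
    deploymentGroupName="my-blue-green-dg",
    serviceRoleArn="arn:aws:iam::123456789012:role/CodeDeployServiceRole",
    autoScalingGroups=["my-blue-asg"],  # the current (blue) fleet
    deploymentStyle={
        "deploymentType": "BLUE_GREEN",
        "deploymentOption": "WITH_TRAFFIC_CONTROL",
    },
    blueGreenDeploymentConfiguration={
        # CodeDeploy copies the blue ASG to provision each green fleet,
        # so there is no green ASG name to track manually.
        "greenFleetProvisioningOption": {"action": "COPY_AUTO_SCALING_GROUP"},
        "deploymentReadyOption": {"actionOnTimeout": "CONTINUE_DEPLOYMENT"},
        "terminateBlueInstancesOnDeploymentSuccess": {
            "action": "TERMINATE",
            "terminationWaitTimeInMinutes": 5,
        },
    },
    loadBalancerInfo={"targetGroupInfoList": [{"name": "my-target-group"}]},
)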
For creating the custom resource itself, crhelper by AWS is very useful.
Hope this helps and hope Blue/Green for EC2 will be supported by CloudFormation soon.

CloudFormation is not propagating stack-level tags for EMR

As per the AWS CloudFormation documentation,
CloudFormation automatically applies stack-level tags to resources:
aws:cloudformation:logical-id
aws:cloudformation:stack-id
aws:cloudformation:stack-name
I can see those tags for resources like EC2, S3, etc.
But when it comes to EMR, I can't see them. I need the aws:cloudformation:stack-id tag value so that I can later identify the stack id without any hassle.
Isn't this supported for EMR?
If not, what could be a workaround? I need to add the CloudFormation stack id so I can easily identify the stack for other uses.
Note: aws cloudformation describe-stack-resources --physical-resource-id j-XXXXXXXXXXX is not an option for getting the stack id, because I don't have sufficient IAM permissions.
How I'm creating the EMR cluster: I have a Lambda which invokes CloudFormation using boto3, which then creates the cluster.
I checked this on my EMR cluster and in CloudFormation. You are correct: the tags are nowhere to be seen.
This could be an oversight on AWS's part, as the documentation explicitly states that only EBS volumes don't receive such tags:
All stack-level tags, including automatically created tags, are propagated to resources that AWS CloudFormation supports. Currently, tags are not propagated to Amazon EBS volumes that are created from block device mappings.
The only workaround I can think of is to create such tags "manually", e.g. using custom resources. Or, since you are already using a Lambda, do it in your Lambda after the EMR cluster is created.
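A minimal sketch of that Lambda step, assuming the function already knows the stack id and the new cluster id. Note that tag keys beginning with aws: are reserved by AWS, so the real aws:cloudformation:stack-id key cannot be set manually; a custom key has to stand in for it:

import boto3

emr = boto3.client("emr")

def tag_cluster(cluster_id, stack_id):
    # cluster_id is e.g. "j-XXXXXXXXXXX"; the "aws:" tag prefix is
    # reserved, hence the custom key below.
    emr.add_tags(
        ResourceId=cluster_id,
        Tags=[{"Key": "cloudformation:stack-id", "Value": stack_id}],
    )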

'm3.xlarge' is not supported in AWS Data Pipeline

I am new to AWS and am trying to run an AWS Data Pipeline that loads data from DynamoDB to S3, but I am getting the error below. Please help.
Unable to create resource for #EmrClusterForBackup_2020-05-01T14:18:47 due to: Instance type 'm3.xlarge' is not supported. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 3bd57023-95e4-4d0a-a810-e7ba9cdc3712)
I was facing the same problem when my DynamoDB table and S3 bucket were created in the us-east-2 region and the pipeline in us-east-1, as I was not allowed to create a pipeline in us-east-2.
But once I created the DynamoDB table and S3 bucket in us-east-1, and then the pipeline in the same region as well, it worked fine even with the m3.xlarge instance type.
It is always good to use latest-generation instances. They are technologically more advanced and sometimes even cheaper.
So there is no reason to start on older generations; they exist only to provide backward compatibility for people who already have infrastructure on those machines.
I think this should help you: AWS will force you to use m3 if you use DynamoDBDataNode or resizeClusterBeforeRunning.
https://aws.amazon.com/premiumsupport/knowledge-center/datapipeline-override-instance-type/?nc1=h_ls
I faced the same error, but just changing from m3.xlarge to m4.xlarge didn't solve the problem. The DynamoDB table I was trying to export was in eu-west-2, but at the time of writing Data Pipeline is not available in eu-west-2. I found I had to edit the pipeline to change the following (a sketch of applying these overrides programmatically follows the quote below):
Instance type: from m3.xlarge to m4.xlarge
Release label: from emr-5.23.0 to emr-5.24.0 (not strictly necessary for export, but required for import [1])
Region: hardcoded to eu-west-2
So the end result was a pipeline using m4.xlarge instances, release label emr-5.24.0, and the region hardcoded to eu-west-2.
[1] From: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-importexport-ddb-prereq.html
On-Demand Capacity works only with EMR 5.24.0 or later
DynamoDB tables configured for On-Demand Capacity are supported only when using Amazon EMR release version 5.24.0 or later. When you use a template to create a pipeline for DynamoDB, choose Edit in Architect and then choose Resources to configure the Amazon EMR cluster that AWS Data Pipeline provisions. For Release label, choose emr-5.24.0 or later.
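If you prefer to apply those edits programmatically rather than in Architect, a sketch along these lines could work; the pipeline id is a placeholder, the region hosting the pipeline is an assumption (eu-west-1 here, since eu-west-2 is unavailable), and note that put_pipeline_definition replaces the whole definition, so the remaining pipeline objects must be supplied as well:

import boto3

dp = boto3.client("datapipeline", region_name="eu-west-1")

dp.put_pipeline_definition(
    pipelineId="df-XXXXXXXXXXXX",
    pipelineObjects=[
        {
            "id": "EmrClusterForBackup",
            "name": "EmrClusterForBackup",
            "fields": [
                {"key": "type", "stringValue": "EmrCluster"},
                {"key": "coreInstanceType", "stringValue": "m4.xlarge"},
                {"key": "masterInstanceType", "stringValue": "m4.xlarge"},
                {"key": "releaseLabel", "stringValue": "emr-5.24.0"},
                {"key": "region", "stringValue": "eu-west-2"},
            ],
        },
        # ... every other object in the pipeline definition goes here too.
    ],
)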

AWS EMR provisioning fails when I use custom AMI

Problem:
I have an EMR cluster (along with a number of other resources) defined in a CloudFormation template. I use the AWS REST API to provision my stack, and it works: I can provision the stack successfully.
Then I made one change: I specified a custom AMI for my EMR cluster, and now stack creation fails because EMR provisioning fails. The only information I can find is an error on the console: null: Error provisioning instances. Digging into each instance, I see that the master node failed with Status: Terminated. Last state change reason: Time out occurred during bootstrap
I have S3 logging configured for my EMR cluster, but there are no logs in the S3 bucket.
Details:
I updated my CloudFormation template like so:
my_stack.cfn.yaml:
rMyEmrCluster:
  Type: AWS::EMR::Cluster
  ...
  Properties:
    ...
    CustomAmiId: "ami-xxxxxx" # <-- I added this
Custom AMI details:
I am adding a custom AMI because I need to encrypt the root EBS volume on all of my nodes (this is required, per the documentation).
The steps I took to create my custom AMI (a sketch of step 3 follows the list):
1. I launched the base AMI that AWS uses for EMR nodes: emr 5.7.0-ami-roller-27 hvm ebs (ID: ami-8a5cb8f3)
2. I created an image from my running instance
3. I created a copy of that image with EBS root-volume encryption enabled, using the default encryption key. (I had to create my own base image from a running instance, because you are not allowed to create an encrypted copy of an AMI you don't own.)
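The sketch of step 3 in Python, with placeholder image ids and region; omitting KmsKeyId makes the copy use the default EBS encryption key:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

copy = ec2.copy_image(
    SourceImageId="ami-yyyyyyyy",  # the unencrypted image from step 2
    SourceRegion="us-east-1",
    Name="emr-custom-ami-encrypted",
    Encrypted=True,  # no KmsKeyId given, so the default key is used
)
print(copy["ImageId"])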
I wonder if this might be a permissions issue, or perhaps my AMI is misconfigured in some way. But it would be prudent for me to find some logs first, to figure out exactly what is going wrong with node provisioning.
I feel stupid. I accidentally used a completely unrelated AMI (a Red Hat 7 image) as the base image, instead of the AMI that EMR uses for its nodes by default: emr 5.7.0-ami-roller-27 hvm ebs (ami-8a5cb8f3)
I'll leave this question and answer up in case someone else makes the same mistake.
Make sure you create your custom AMI from the correct base AMI: emr 5.7.0-ami-roller-27 hvm ebs (ami-8a5cb8f3)
You mention that you created your custom AMI based on an EMR AMI. However, according to the documentation you linked, you should actually base your AMI on "the most recent EBS-backed Amazon Linux AMI". Your custom AMI does not need to be based on an EMR AMI, and indeed I suspect that doing so could cause problems (though I have not tried it myself).