I use terraform to create an HBase cluster in AWS.
When I use these settings a cluster is provisioned successfully:
resource "aws_emr_cluster" "hbase" {
name = "hbase"
release_label = "emr-6.3.1"
applications = ["HBase"]
termination_protection = false
keep_job_flow_alive_when_no_steps = true
ec2_attributes {
key_name = <removed>
subnet_id = <removed>
instance_profile = aws_iam_instance_profile.emr_profile.arn
}
master_instance_group {
instance_type = "m1.medium"
instance_count = "1"
}
core_instance_group {
instance_type = "m1.medium"
instance_count = 4
ebs_config {
size = "20"
type = "gp2"
volumes_per_instance = 1
}
}
ebs_root_volume_size = 10
As soon as I increase the number of master nodes to three, the cluster creation fails with the error message:
Error: Error waiting for EMR Cluster state to be “WAITING” or “RUNNING”: TERMINATING: BOOTSTRAP_FAILURE: On the master instance (i-), application provisioning timed out
I checked the documentation for aws_emr_cluster, but could not find anything to set a timeout.
I also checked the timeout settings for IAM roles, but the default there is one hour, which would be more than sufficient.
https://docs.aws.amazon.com/en_en/IAM/latest/UserGuide/id_roles_use.html
I get the above-mentioned error message every time cluster creation takes longer than about 16 minutes (16 minutes and 20 seconds according to the Terraform output).
I have also created an AWS MSK resource in the same project which took longer than 17 minutes; that finished successfully without complaining, so it does not seem to be a global timeout.
Any ideas would be much appreciated.
Btw:
terraform version
Terraform v1.1.2
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v3.60.0
Best,
Denny
The issue has now been resolved. To keep costs down for this (test) setup I had chosen the instance type "m1.medium"; it turned out this was the problem.
Using a bigger instance type solved it.
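For reference, a sketch of what the instance groups might look like with a larger type. m5.xlarge is just an assumption here; the point is a current-generation type with more memory than m1.medium, so that application provisioning on the three master nodes completes before EMR gives up:

  master_instance_group {
    instance_type  = "m5.xlarge"
    instance_count = 3
  }

  core_instance_group {
    instance_type  = "m5.xlarge"
    instance_count = 4

    ebs_config {
      size                 = "20"
      type                 = "gp2"
      volumes_per_instance = 1
    }
  }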
Related
I'm new to Terraform and I'm working on an infrastructure setup for deploying Docker containers. I've based my ECS cluster on Infrablocks/ECS-Cluster and my base networking on Infrablocks/Base-Network. I've opted to use these due to time constraints on the project.
The problem I'm having is that the two EC2 container instances created by the Infrablocks/ECS-Cluster module are not associated with the ECS cluster that Infrablocks builds, and I've had zero luck determining why. This is blocking my task definitions from running containers in the ECS cluster, because there are no associated EC2 instances. I've provided my two dependent module configurations below.
Thank you in advance for any help you can provide!
My Terraform thus far:
module "base_network" {
source = "infrablocks/base-networking/aws"
version = "2.3.0"
vpc_cidr = "10.0.0.0/16"
region = "us-east-1"
availability_zones = ["us-east-1a", "us-east-1b"]
component = "dev-base-network"
deployment_identifier = "development"
include_route53_zone_association = "true"
private_zone_id = module.route53.private_zone_id
include_nat_gateway = "true"}
module "ecs_cluster" {
source = "infrablocks/ecs-cluster/aws"
version = "2.2.0"
region = "us-east-1"
vpc_id = module.base_network.vpc_id
subnet_ids = module.base_network.public_subnet_ids
associate_public_ip_addresses = "yes"
component = "dev"
deployment_identifier = "devx"
cluster_name = "services"
cluster_instance_ssh_public_key_path = "~/.ssh/id_rsa.pub"
cluster_instance_type = "t2.small"
cluster_minimum_size = 2
cluster_maximum_size = 10
cluster_desired_capacity = 2 }
You'd have to troubleshoot the instance to see why it isn't joining the cluster. On your EC2 instances (I haven't checked, but I would hope the infrablocks ecs-cluster module uses an AMI with the ECS agent installed), you can look at /var/log/ecs/ecs-agent.log.
If the networking configuration is sound, my first guess would be to check the ECS configuration file. If your module is working properly, it should have populated the config with the cluster name. See here for more on that.
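For illustration only (the infrablocks module manages its own launch configuration, so this is an assumption about the general mechanism, not its internals): container instances normally join a specific cluster because user data writes the cluster name into /etc/ecs/ecs.config before the ECS agent starts, roughly like this:

resource "aws_launch_configuration" "ecs_instances" {
  # Hypothetical example; names and AMI are placeholders.
  image_id      = "ami-00000000" # assumed ECS-optimized AMI with the ECS agent installed
  instance_type = "t2.small"

  # The ECS agent reads /etc/ecs/ecs.config on boot; without ECS_CLUSTER it
  # registers with the "default" cluster instead of the "services" cluster.
  user_data = <<-EOF
    #!/bin/bash
    echo "ECS_CLUSTER=services" >> /etc/ecs/ecs.config
  EOF
}

If the file on your instances is empty or names a different cluster, that would explain why the instances never show up under the cluster the module created.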
(I would have commented instead of answered but this account doesn't have enough rep :shrug:)
I am trying to spin up an ECS cluster with Terraform, but cannot make the EC2 instances register as container instances in the cluster.
I first tried the verified module from Terraform, but this seems outdated (ecs-instance-profile has the wrong path).
Then I tried another module from anrim, but still no container instances. Here is the script I used:
provider "aws" {
region = "us-east-1"
}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "2.21.0"
name = "ecs-alb-single-svc"
cidr = "10.10.10.0/24"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.10.10.0/27", "10.10.10.32/27", "10.10.10.64/27"]
public_subnets = ["10.10.10.96/27", "10.10.10.128/27", "10.10.10.160/27"]
tags = {
Owner = "user"
Environment = "me"
}
}
module "ecs_cluster" {
source = "../../modules/cluster"
name = "ecs-alb-single-svc"
vpc_id = module.vpc.vpc_id
vpc_subnets = module.vpc.private_subnets
tags = {
Owner = "user"
Environment = "me"
}
}
I then created a new ECS cluster (from the AWS console) on the same VPC and carefully compared the differences in resources. I managed to find some small differences, fixed them, and tried again. But still no container instances!
A fork of the module is available here.
Can you see instances being created in the Auto Scaling group? If so, I'd suggest SSHing to one of them (either directly or using a bastion host, e.g. see this module) and checking the ECS agent logs. In my experience those problems are usually related to IAM policies, and that's pretty visible in the logs, but YMMV.
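If the agent logs do point at permissions, the instance profile is the usual culprit. As a rough sketch of what a container instance typically needs (the resource names here are made up, and the module you forked should already create an equivalent):

resource "aws_iam_role" "ecs_instance" {
  # Hypothetical name for illustration.
  name = "ecs-instance-role"

  # Allow EC2 instances to assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Grants the ECS agent permission to register the instance with the cluster.
resource "aws_iam_role_policy_attachment" "ecs_instance" {
  role       = aws_iam_role.ecs_instance.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

resource "aws_iam_instance_profile" "ecs_instance" {
  name = "ecs-instance-profile"
  role = aws_iam_role.ecs_instance.name
}

If the instance profile attached to the launch configuration is missing the managed policy (or points at the wrong role), the agent fails to register and you'll see the corresponding access-denied errors in ecs-agent.log.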
I have created a launch configuration which uses the AWS CoreOS AMI as the image. This has been attached to an AWS Auto Scaling Group. All of the above was done via Terraform. But when the Auto Scaling Group tries to create the instance, it fails with the following error.
StatusMessage: "In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=ryg425ue2hwnsok9ccfastg4. Launching EC2 instance failed."
It seems like I have to subscribe to use this CoreOS AMI image, but when I create an instance in the Auto Scaling console, I just select the CoreOS image from the Marketplace and continue with the other instance configuration. How do I achieve this in Terraform? Should I subscribe to the AWS CoreOS AMI beforehand, or is there a way to bypass this in Terraform?
All the related files and the error trace are given below.
launch-configuration.tf File
resource "aws_launch_configuration" "tomcat-webapps-all" {
name = "tomcat-webapps-all"
image_id = "ami-028e043d0e518a84a"
instance_type = "t2.micro"
key_name = "rnf-sec"
security_groups = ["${aws_security_group.allow-multi-tomcat-webapp-traffic.id}"]
user_data = "${data.ignition_config.webapps.rendered}"
}
auto-scale-group.tf File
resource "aws_autoscaling_group" "tomcat-webapps-all-asg" {
name = "tomcat-webapps-all-asg"
depends_on = ["aws_launch_configuration.tomcat-webapps-all"]
vpc_zone_identifier = ["${aws_default_subnet.default-az1.id}", "${aws_default_subnet.default-az2.id}", "${aws_default_subnet.default-az3.id}"]
max_size = 1
min_size = 0
health_check_grace_period = 300
health_check_type = "EC2"
desired_capacity = 1
force_delete = true
launch_configuration = "${aws_launch_configuration.tomcat-webapps-all.id}"
target_group_arns = ["${aws_lb_target_group.newdasboard-lb-tg.arn}", "${aws_lb_target_group.signup-lb-tg.arn}"]
}
Error Trace
Error: Error applying plan:
1 error(s) occurred:
* aws_autoscaling_group.tomcat-webapps-all-asg: 1 error(s) occurred:
* aws_autoscaling_group.tomcat-webapps-all-asg: "tomcat-webapps-all-asg": Waiting up to 10m0s: Need at least 1 healthy instances in ASG, have 0. Most recent activity: {
ActivityId: "9455ab55-426a-c888-ac95-2d45c78d445a",
AutoScalingGroupName: "tomcat-webapps-all-asg",
Cause: "At 2019-05-20T12:56:29Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1.",
Description: "Launching a new EC2 instance. Status Reason: In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=ryg425ue2hwnsok9ccfastg4. Launching EC2 instance failed.",
Details: "{\"Subnet ID\":\"subnet-c650458f\",\"Availability Zone\":\"ap-southeast-1a\"}",
EndTime: 2019-05-20 12:56:30 +0000 UTC,
Progress: 100,
StartTime: 2019-05-20 12:56:30.642 +0000 UTC,
StatusCode: "Failed",
StatusMessage: "In order to use this AWS Marketplace product you need to accept terms and subscribe. To do so please visit https://aws.amazon.com/marketplace/pp?sku=ryg425ue2hwnsok9ccfastg4. Launching EC2 instance failed."
}
If you log into the console and accept the EULA terms once, this error will go away when you apply it via Terraform.
If I were you, I'd log in, go through the whole process to launch an instance with this AMI, terminate it, then apply the Terraform.
If somebody else is having the same issue, I was able to solve it by logging into my EC2 console with the root user and subscribing on the AWS CoreOS product page on the AWS Marketplace.
After that everything worked as expected. The error message includes a web URL to the CoreOS product page on the AWS Marketplace; it's just a matter of clicking the Continue to Subscribe button.
If the above steps didn't work, refer to this answer - https://stackoverflow.com/a/56222898/4334340
I am exploring AWS EC2 autoscaling with target tracking and custom metrics. From the documentation I understand that when a particular target is hit, an autoscaling event is triggered that scales the EC2 instances in or out.
I followed the instructions provided in the Terraform docs for aws_autoscaling_policy and it is working, but it scales in and out just one instance at a time.
Now, for my use case, I want to scale in and out two instances. Is there a way to do this with a target tracking scaling policy?
Any help, much appreciated.
Following is a working policy written in terraform for target tracking with custom metrics.
resource "aws_autoscaling_policy" "target-tracking-autoscale" {
name = "target-traclking-policy"
autoscaling_group_name = "target-tracking-asg"
policy_type = "TargetTrackingScaling"
target_tracking_configuration {
customized_metric_specification {
metric_dimension {
name = "asg"
value = "custom-value"
}
metric_name = "CUSTOM_METRIC"
namespace = "CUSTOM-METRIC/NAMESPACE"
statistic = "Average"
}
target_value = "2"
}
}
Regards.
Update 1
I have tried adding step_adjustment, but this parameter is only supported for step scaling. Terraform throws the following error:
Error: Error applying plan:
1 error(s) occurred:
* module.pt-wowza.aws_autoscaling_policy.target-tracking-autoscale: 1 error(s) occurred:
* aws_autoscaling_policy.target-tracking-autoscale: step_adjustment is only supported for policy type StepScaling
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
I ran into the same problem and was able to fix it by adding the policy_type attribute to the scaling policy resource:
policy_type = "TargetTrackingScaling"
policy_type - (Optional) The policy type, either "SimpleScaling", "StepScaling", "TargetTrackingScaling", or "PredictiveScaling". If this value isn't provided, AWS will default to "SimpleScaling."
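As the error in Update 1 says, step_adjustment only works with the StepScaling policy type. If the actual requirement is to add or remove a fixed two instances per scaling event, a step scaling policy triggered by a CloudWatch alarm is the type that supports that; a rough sketch, with placeholder names that are not from the original setup:

resource "aws_autoscaling_policy" "scale-out-by-two" {
  name                   = "scale-out-by-two" # hypothetical name
  autoscaling_group_name = "target-tracking-asg"
  policy_type            = "StepScaling"
  adjustment_type        = "ChangeInCapacity"

  # Adds two instances whenever the associated CloudWatch alarm breaches.
  step_adjustment {
    scaling_adjustment          = 2
    metric_interval_lower_bound = 0
  }
}

With target tracking, by contrast, AWS itself decides how many instances to add or remove to keep the metric near target_value, so the increment cannot be fixed at two.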
I am using Terraform to successfully spin up some Elastic Beanstalk apps (Single Docker configuration) and enable auto-scaling as part of the app / environment creation.
This works fine in most regions I’ve tried, but when I try to spin it up in London (eu-west-2) I get an error:
Error: Error applying plan:
1 error(s) occurred:
* aws_elastic_beanstalk_environment.my-service-env: 1 error(s) occurred:
* aws_elastic_beanstalk_environment.my-service-env: Error waiting for Elastic Beanstalk Environment (e-mt7f3i5bmq) to become ready: 2 error(s) occurred:
* 2018-06-11 19:31:29.28 +0000 UTC (e-mt7f3i5bmq) : Environment must have instance profile associated with it.
* 2018-06-11 19:31:29.39 +0000 UTC (e-mt7f3i5bmq) : Failed to launch environment.
I have found that if I manually attach the aws-elasticbeanstalk-ec2-role as the IamInstanceProfile it works fine - but this relies on the role having been automatically created previously...
Is there something about the eu-west-2 region which would mean the Beanstalk apps don’t get created with the instance profile as they do in other regions?
What am I missing?
Thanks for your help!
For others stuck on this issue, I found a solution by adding the instance profile directly as a setting. This instance profile doesn't get automatically added like it does when creating an Elastic Beanstalk environment through the console. Below is the full Beanstalk environment resource definition:
resource "aws_elastic_beanstalk_environment" "beanstalkenvironment" {
name = "dev-example"
application = aws_elastic_beanstalk_application.beanstalkapp.name
solution_stack_name = "64bit Amazon Linux 2018.03 v2.14.1 running Docker 18.09.9-ce"
version_label = aws_elastic_beanstalk_application_version.beanstalkapplicationversion.name
setting {
namespace = "aws:autoscaling:launchconfiguration"
name = "IamInstanceProfile"
value = "aws-elasticbeanstalk-ec2-role"
}
setting {
namespace = "aws:autoscaling:launchconfiguration"
name = "InstanceType"
value = "t2.micro"
}
tags = {
Name = "test"
Environment = "test"
}
}
The exact setting used to fix this error was:
setting {
  namespace = "aws:autoscaling:launchconfiguration"
  name      = "IamInstanceProfile"
  value     = "aws-elasticbeanstalk-ec2-role"
}
To find the value "aws-elasticbeanstalk-ec2-role" that's required, I checked an existing Elastic Beanstalk environment that was created through the console. Under the environment's configuration there is a security section; the role name needed is listed as "IAM instance profile". Hopefully this helps others who get stuck on this issue.
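If you'd rather not rely on the console having created aws-elasticbeanstalk-ec2-role beforehand (the dependency noted in the question), the role and instance profile can also be managed in Terraform. A minimal sketch, assuming the AWS-managed web-tier policy is sufficient for the app:

resource "aws_iam_role" "eb_ec2" {
  name = "aws-elasticbeanstalk-ec2-role"

  # Allow EC2 instances launched by the environment to assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "eb_web_tier" {
  role       = aws_iam_role.eb_ec2.name
  policy_arn = "arn:aws:iam::aws:policy/AWSElasticBeanstalkWebTier"
}

resource "aws_iam_instance_profile" "eb_ec2" {
  name = "aws-elasticbeanstalk-ec2-role"
  role = aws_iam_role.eb_ec2.name
}

The IamInstanceProfile setting in the environment resource can then reference aws_iam_instance_profile.eb_ec2.name instead of a hard-coded string, so the environment works in a fresh region without any prior console activity.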