terraform with private cloud implementation of ec2 - amazon-web-services

Update:
We have a private cloud hosted in our datacenter which is a stripped-down version of AWS. We have exposed the EC2 APIs so that users can create VMs using the awscli.
I am trying to create VMs using Terraform, and for initial tests I created a .tf file as below:
provider "aws" {
  access_key                  = "<key>"
  secret_key                  = "<key>"
  region                      = "us-west-1"
  skip_credentials_validation = true

  endpoints {
    ec2 = "https://awsserver/services/api/aws/ec2"
  }
}

resource "aws_instance" "Automation" {
  ami           = "ami-100011201"
  instance_type = "c3.xlarge"
  subnet_id     = "subnet1:1"
}
This is the error message after running terraform plan:
Error: Error running plan: 1 error(s) occurred:
* provider.aws: AWS account ID not previously found and failed retrieving via all available methods. See https://www.terraform.io/docs/providers/aws/index.html#skip_requesting_account_id for workaround and implications. Errors: 2 errors occurred:
* error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
status code: 403, request id: 58f9d498-6259-11e9-b146-95598aa219b5
* failed getting account information via iam:ListRoles: InvalidClientTokenId: The security token included in the request is invalid.
status code: 403, request id: c10f8a06-58b4-4d0c-956a-5c8c684664ea
We haven't implemented STS, and the query always goes to the public AWS cloud instead of the private cloud API server.
What am I missing?

This worked for me to create a VM:
provider "aws" {
  access_key                  = "<key>"
  secret_key                  = "<key>"
  region                      = "us-west-1"
  skip_credentials_validation = true
  skip_requesting_account_id  = true
  skip_metadata_api_check     = true

  endpoints {
    ec2 = "https://awsserver/services/api/aws/ec2"
  }
}

resource "aws_instance" "Automation" {
  ami           = "ami-100011201"
  instance_type = "c3.xlarge"
  subnet_id     = "subnet1:1"
}
It creates a VM; however, the command errors out with:
aws_instance.Automation: Still creating... (1h22m4s elapsed)
aws_instance.Automation: Still creating... (1h22m14s elapsed)
aws_instance.Automation: Still creating... (1h22m24s elapsed)
Error: Error applying plan:
1 error(s) occurred:
* aws_instance.Automation: 1 error(s) occurred:
* aws_instance.Automation: Error waiting for instance (i-101149362) to become ready: timeout while waiting for state to become 'running' (last state: 'pending', timeout: 10m0s)
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
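If the private cloud backend is simply slower than the provider's default 10-minute wait, one option (a sketch; it assumes your Terraform/provider version supports the timeouts block on aws_instance) is to lengthen the create timeout:

```hcl
resource "aws_instance" "Automation" {
  ami           = "ami-100011201"
  instance_type = "c3.xlarge"
  subnet_id     = "subnet1:1"

  # Assumption: the backend does eventually report 'running'; this only
  # extends the wait from the default 10m to 30m.
  timeouts {
    create = "30m"
  }
}
```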

Related

Terraform Launch Type Fargate for windows container Error:- You do not have authorization to access the specified platform

Description
Terraform: for launch type Fargate with a Windows container, I get the below error after running terraform apply:
error creating app-name service: error waiting for ECS service (app-name) creation: AccessDeniedException: You do not have authorization to access the specified platform.
Terraform CLI and AWS provider versions used:
User-Agent: APN/1.0 HashiCorp/1.0 Terraform/0.12.31 (+https://www.terraform.io) terraform-provider-aws/3.70.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.42.23 (go1.16; linux; amd64)
Affected Resource(s): aws_ecs_service
Terraform Configuration Files
resource "aws_ecs_task_definition" "app_task" {
  family                   = "${var.tags["environment"]}-app"
  container_definitions    = data.template_file.app_task_definition.rendered
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  task_role_arn            = aws_iam_role.ecs_role.arn
  execution_role_arn       = aws_iam_role.ecs_role.arn
  memory                   = var.fargate_memory
  cpu                      = var.fargate_cpu

  runtime_platform {
    operating_system_family = "WINDOWS_SERVER_2019_CORE"
    cpu_architecture        = "X86_64"
  }

  depends_on = [null_resource.confd_cluster_values]
}

resource "aws_ecs_service" "app" {
  name                               = "${var.tags["environment"]}-app"
  cluster                            = data.terraform_remote_state.fargate_cluster.outputs.cluster.id
  task_definition                    = aws_ecs_task_definition.app_task.arn
  desired_count                      = var.ecs_app_desired_count
  health_check_grace_period_seconds  = 2147483647
  deployment_minimum_healthy_percent = 0
  deployment_maximum_percent         = 100
  launch_type                        = "FARGATE"
  enable_execute_command             = true

  network_configuration {
    security_groups = [data.terraform_remote_state.fargate_cluster.outputs.cluster_security_group]
    subnets         = data.aws_subnet_ids.private.ids
  }

  load_balancer {
    target_group_arn = aws_alb_target_group.app.arn
    container_name   = var.alb_target_container_name
    container_port   = 8097
  }

  lifecycle {
    ignore_changes = [desired_count]
  }

  depends_on = [aws_ecs_task_definition.app_task]
}
Debug Output
-----------------------------------------------------: timestamp=2022-01-01T16:30:06.055+0530
2022-01-01T16:30:06.055+0530 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/01 16:30:06 [DEBUG] [aws-sdk-go] {"__type":"AccessDeniedException","message":"You do not have authorization to access the specified platform."}: timestamp=2022-01-01T16:30:06.055+0530
2022-01-01T16:30:06.055+0530 [INFO] plugin.terraform-provider-aws_v3.70.0_x5: 2022/01/01 16:30:06 [DEBUG] [aws-sdk-go] DEBUG: Validate Response ecs/CreateService failed, attempt 0/25, error AccessDeniedException: You do not have authorization to access the specified platform.: timestamp=2022-01-01T16:30:06.055+0530
The issue is not in your Terraform code but in the IAM permissions you use to run it. You have to verify your permissions. You may also be limited at the AWS Organizations level if your account is part of a group of accounts.
After reading https://aws.amazon.com/blogs/containers/running-windows-containers-with-amazon-ecs-on-aws-fargate/ I learned that the Amazon ECS Exec feature is unsupported on Fargate for Windows tasks, which is why the error occurred.
Removing the setting below from aws_ecs_service resolved the issue.
enable_execute_command = true
It would be helpful if Terraform could show users an appropriate message saying this feature is not available for Windows, instead of throwing the error "You do not have authorization to access the specified platform."
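For reference, a minimal sketch of the relevant part of the service with ECS Exec left disabled (the remaining arguments are unchanged from the original configuration):

```hcl
resource "aws_ecs_service" "app" {
  name            = "${var.tags["environment"]}-app"
  cluster         = data.terraform_remote_state.fargate_cluster.outputs.cluster.id
  task_definition = aws_ecs_task_definition.app_task.arn
  launch_type     = "FARGATE"
  # enable_execute_command is omitted (defaults to false): ECS Exec is not
  # supported for Windows tasks on Fargate, and enabling it triggers the
  # AccessDeniedException above.

  # ...remaining arguments as in the original configuration...
}
```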

Terraform EC2 (Root Block device Encryption error failing to reach target state)

This error has had me stuck for over 10 days.
While creating an EC2 instance in Terraform, the instance won't reach the target state and says:
Error: Error waiting for instance (i-*************) to become ready: Failed to reach target state. Reason: Client.InternalError: Client error on launch
We also have encryption of new EBS volumes enabled in our EC2 dashboard.
My basic EC2 code looks like this:
resource "aws_instance" "web" {
  ami               = "ami-"
  instance_type     = "t2.micro"
  availability_zone = "ap-south-1a"

  root_block_device {
    volume_size           = "10"
    volume_type           = "gp2"
    delete_on_termination = true
    encrypted             = true
    kms_key_arn           = "arn:aws:kms:*************"
  }
}
I've just been doing this myself.
I'm using a KMS CMK and pass the ARN via kms_key_id:
root_block_device {
  kms_key_id = "arn:aws:kms:*************"
}
You may also find that you need to allow the user account in KMS for that CMK, which is usually why you see the error:
Error waiting for instance (i-*************) to become ready...
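To illustrate allowing the account in KMS, here is a sketch of a CMK whose key policy grants the instance-launching principal the actions EBS needs; the account ID and user name are hypothetical placeholders, not taken from the question:

```hcl
resource "aws_kms_key" "ebs" {
  description = "CMK for encrypted EBS root volumes"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "EnableRootAccount"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::111122223333:root" } # hypothetical account
        Action    = "kms:*"
        Resource  = "*"
      },
      {
        # Without these grants, EBS cannot use the key and the instance
        # fails launch with Client.InternalError, never reaching 'running'.
        Sid       = "AllowEbsUseOfTheKey"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::111122223333:user/terraform" } # hypothetical user
        Action = [
          "kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*",
          "kms:GenerateDataKey*", "kms:DescribeKey", "kms:CreateGrant"
        ]
        Resource = "*"
      }
    ]
  })
}
```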

Unable to find subscription for given ARN

I am testing my AWS terraform configuration with LocalStack. The final goal is to make a queue listen to my topic.
I am running LocalStack with the following command:
docker run --rm -it -p 4566:4566 localstack/localstack
After running the command terraform destroy I get the error message:
aws_sns_topic_subscription.subscription: Destroying... [id=arn:aws:sns:us-east-1:000000000000:topic:a0d47652-3ae4-46df-9b63-3cb6e154cfcd]
╷
│ Error: error waiting for SNS topic subscription (arn:aws:sns:us-east-1:000000000000:topic:a0d47652-3ae4-46df-9b63-3cb6e154cfcd) deletion: InvalidParameter: Unable to find subscription for given ARN
│ status code: 400, request id: 2168e636
│
│
╵
I have run the code against the real AWS without a problem.
Here is the code for the Terraform file:
terraform {
  required_version = ">= 0.12.26"
}

provider "aws" {
  region                      = "us-east-1"
  s3_force_path_style         = true
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  endpoints {
    sns = "http://localhost:4566"
    sqs = "http://localhost:4566"
  }
}

resource "aws_sqs_queue" "queue" {
  name = "queue"
}

resource "aws_sns_topic" "topic" {
  name = "topic"
}

resource "aws_sns_topic_subscription" "subscription" {
  endpoint  = aws_sqs_queue.queue.arn
  protocol  = "sqs"
  topic_arn = aws_sns_topic.topic.arn
}
Sadly, this is an issue with AWS itself; you have to create a support ticket. See https://stackoverflow.com/a/64568018/6085193:
"When you delete a topic, subscriptions to the topic will not be "deleted" immediately, but become orphans. SNS will periodically clean these orphans, usually every 10 hours, but not guaranteed. If you create a new topic with the same topic name before these orphans are cleared up, the new topic will not inherit these orphans. So, no worry about them"
This has since been fixed in LocalStack: https://github.com/localstack/localstack/issues/4022
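As an aside: for messages to actually flow from the topic to the queue against real AWS, the queue usually also needs a policy allowing SNS to deliver to it. A minimal sketch using the resource names from the configuration above:

```hcl
# Grants only the SNS service, and only on behalf of this topic,
# permission to send messages into the queue.
resource "aws_sqs_queue_policy" "allow_sns" {
  queue_url = aws_sqs_queue.queue.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "sns.amazonaws.com" }
      Action    = "sqs:SendMessage"
      Resource  = aws_sqs_queue.queue.arn
      Condition = {
        ArnEquals = { "aws:SourceArn" = aws_sns_topic.topic.arn }
      }
    }]
  })
}
```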

How to create EC2 instance on LocalStack with terraform?

I am trying to run an EC2 instance on LocalStack using Terraform.
After 50 minutes of trying to create the instance, I got this response from terraform apply:
Error: error getting EC2 Instance (i-cf4da152ddf3500e1) Credit Specifications: SerializationError: failed to unmarshal error message
status code: 500, request id:
caused by: UnmarshalError: failed to unmarshal error message
caused by: expected element type <Response> but have <title>

on main.tf line 34, in resource "aws_instance" "example":
34: resource "aws_instance" "example" {
For LocalStack and Terraform v0.12.18 I use this configuration:
provider "aws" {
  access_key                  = "mock_access_key"
  secret_key                  = "mock_secret_key"
  region                      = "us-east-1"
  s3_force_path_style         = true
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  endpoints {
    apigateway     = "http://localhost:4567"
    cloudformation = "http://localhost:4581"
    cloudwatch     = "http://localhost:4582"
    dynamodb       = "http://localhost:4569"
    es             = "http://localhost:4578"
    firehose       = "http://localhost:4573"
    iam            = "http://localhost:4593"
    kinesis        = "http://localhost:4568"
    lambda         = "http://localhost:4574"
    route53        = "http://localhost:4580"
    redshift       = "http://localhost:4577"
    s3             = "http://localhost:4572"
    secretsmanager = "http://localhost:4584"
    ses            = "http://localhost:4579"
    sns            = "http://localhost:4575"
    sqs            = "http://localhost:4576"
    ssm            = "http://localhost:4583"
    stepfunctions  = "http://localhost:4585"
    sts            = "http://localhost:4592"
    ec2            = "http://localhost:4597"
  }
}

resource "aws_instance" "example" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"
}
I run LocalStack with docker-compose up directly from the newest GitHub sources (https://github.com/localstack/localstack).
From the logs I have seen that the EC2-related endpoint was set up.
I would appreciate any advice that would help me to run EC2 on LocalStack.
It works fine with the below Docker image of LocalStack:
docker run -it -p 4500-4600:4500-4600 -p 8080:8080 --expose 4572 localstack/localstack:0.11.1
resource "aws_instance" "web" {
  ami           = "ami-0d57c0143330e1fa7"
  instance_type = "t2.micro"

  tags = {
    Name = "HelloWorld"
  }
}

provider "aws" {
  region                      = "us-east-1"
  s3_force_path_style         = true
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  endpoints {
    apigateway     = "http://localhost:4567"
    cloudformation = "http://localhost:4581"
    cloudwatch     = "http://localhost:4582"
    dynamodb       = "http://localhost:4569"
    es             = "http://localhost:4578"
    firehose       = "http://localhost:4573"
    iam            = "http://localhost:4593"
    kinesis        = "http://localhost:4568"
    lambda         = "http://localhost:4574"
    route53        = "http://localhost:4580"
    redshift       = "http://localhost:4577"
    s3             = "http://localhost:4572"
    secretsmanager = "http://localhost:4584"
    ses            = "http://localhost:4579"
    sns            = "http://localhost:4575"
    sqs            = "http://localhost:4576"
    ssm            = "http://localhost:4583"
    stepfunctions  = "http://localhost:4585"
    sts            = "http://localhost:4592"
    ec2            = "http://localhost:4597"
  }
}
terraform apply
aws_instance.web: Destroying... [id=i-099392def6b574255]
aws_instance.web: Still destroying... [id=i-099392def6b574255, 10s elapsed]
aws_instance.web: Destruction complete after 10s
aws_instance.web: Creating...
aws_instance.web: Still creating... [10s elapsed]
aws_instance.web: Creation complete after 12s [id=i-9c942d138970d44a4]
Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
Note: it's a dummy instance, so it won't be available for SSH and the like; however, it is suitable for testing the terraform apply/destroy use case on EC2.
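Note also that newer LocalStack versions expose every service on the single edge port 4566, so a provider block along the lines of the following sketch (assuming a recent LocalStack and dummy credentials) replaces the long per-service endpoint list:

```hcl
provider "aws" {
  access_key                  = "test"
  secret_key                  = "test"
  region                      = "us-east-1"
  s3_force_path_style         = true
  skip_credentials_validation = true
  skip_metadata_api_check     = true
  skip_requesting_account_id  = true

  # Recent LocalStack serves all APIs on one edge port.
  endpoints {
    ec2 = "http://localhost:4566"
    s3  = "http://localhost:4566"
  }
}
```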

Terraform: errors with aws_appautoscaling_target and aws_appautoscaling_policy due to IAM DynamoDBAutoscaleRole

Short question:
Why don't aws_appautoscaling_policy and aws_appautoscaling_target see the recently created DynamoDBAutoscaleRole (perhaps due to IAM eventual consistency)?
Terraform version: v0.10.7
Long description and question:
The Terraform configuration has the following resources (in the creation order described here): the DynamoDBAutoscaleRole, a DynamoDB table, and read and write auto scaling resources for the table (aws_appautoscaling_policy and aws_appautoscaling_target).
Sometimes this script runs fine (with terraform apply), but in most cases aws_appautoscaling_target and aws_appautoscaling_policy fail with errors like the following:
"aws_appautoscaling_policy.Read: Failed to create scaling policy: Error putting scaling policy: FailedResourceAccessException: Unable to retrieve capacity for resource: table/Preferences, scalable dimension: dynamodb:table:ReadCapacityUnits. Reason: The security token included in the request is invalid. status code: 400"
"aws_appautoscaling_target.Read: Error creating application autoscaling target: ValidationException: Validation failed for resource: table/Preferences, scalable dimension: dynamodb:table:ReadCapacityUnits. Reason: The security token included in the request is invalid. status code: 400"
If I re-run terraform apply, sometimes multiple times, it eventually works; terraform plan always runs successfully.
If I create only the DynamoDBAutoscaleRole (in a separate .tf file) and, after several minutes, create the table, aws_appautoscaling_policy, and aws_appautoscaling_target, the configuration always runs successfully.
So my guess is that aws_appautoscaling_policy and aws_appautoscaling_target don't see the recently created role, perhaps due to IAM eventual consistency. Is that assumption correct? Is there any workaround to wait until the role is definitely visible to aws_appautoscaling_policy and aws_appautoscaling_target?
resource "aws_appautoscaling_target" "Read" {
  max_capacity       = 100
  min_capacity       = 5
  resource_id        = "table/Preferences"
  role_arn           = "arn:aws:iam::XXX:role/service-role/DynamoDBAutoscaleRole"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"

  depends_on = ["aws_dynamodb_table.Preferences-table", "aws_iam_role.DynamoDBAutoscale", "aws_iam_role_policy.DynamoDBAutoscale"]
}

resource "aws_appautoscaling_policy" "Read" {
  name               = "${aws_appautoscaling_target.Read.id}"
  service_namespace  = "dynamodb"
  policy_type        = "TargetTrackingScaling"
  resource_id        = "table/Preferences"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }

    scale_in_cooldown  = 60
    scale_out_cooldown = 60
    target_value       = 70
  }

  depends_on = ["aws_appautoscaling_target.Read"]
}
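A common workaround for this kind of IAM eventual-consistency race (a sketch in 0.10-era syntax, using a null_resource with a local-exec sleep; the 60-second delay is an arbitrary choice, not a guarantee) is to insert an explicit pause between the role creation and the autoscaling resources:

```hcl
# Hypothetical delay resource: gives IAM time to propagate the new role
# before anything references it.
resource "null_resource" "wait_for_role" {
  depends_on = ["aws_iam_role.DynamoDBAutoscale", "aws_iam_role_policy.DynamoDBAutoscale"]

  provisioner "local-exec" {
    command = "sleep 60"
  }
}

resource "aws_appautoscaling_target" "Read" {
  # ...arguments as above...
  depends_on = ["aws_dynamodb_table.Preferences-table", "null_resource.wait_for_role"]
}
```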