Rebuild/recreate terraform from remote state - amazon-web-services

A contractor built an application's AWS infrastructure on his local laptop, never committed his code, then left (wiping the notebook's HD). But he did create the infrastructure with Terraform and stored the remote state in an S3 bucket, s3://analytics-nonprod/analytics-dev.tfstate.
This state file includes all of the VPC, subnets, igw, nacl, ec2, ecs, sqs, sns, lambda, firehose, kinesis, redshift, neptune, glue connections, glue jobs, alb, route53, s3, etc. for the application.
I am able to run CloudFormer to generate CloudFormation for the entire infrastructure, and I also tried to import the infrastructure using terraformer, but terraformer does not include the Neptune and Lambda components.
What is the best way/process to recreate a somewhat usable terraform just from the remote state?
Should I generate some generic:
resource "aws_glue_connection" "dev" {}
and run "terraform import aws_glue_connection.dev"
then run "terraform show"
for each resource?

Terraform doesn't have a mechanism specifically for turning existing state into configuration, and indeed doing so would be lossy in the general case because the Terraform configuration likely contained expressions connecting resources to one another that are not captured in the state snapshots.
However, you might be able to get a starting point (possibly not 100% valid, but hopefully better than nothing at all) by configuring Terraform just enough to find the remote state you have access to, running terraform init to make Terraform read it, and then running terraform show to see the information from the state in a human-oriented format that is designed to resemble (but not necessarily exactly match) the configuration language.
For example, you could write a backend configuration like this:
terraform {
  backend "s3" {
    bucket = "analytics-nonprod"
    key    = "analytics-dev.tfstate"
    # region is also needed here unless AWS_REGION or AWS_DEFAULT_REGION is set
  }
}
If you run terraform init with appropriate AWS credentials available then Terraform should read that state snapshot, install the providers that the resource instances within it belong to, and then leave you in a situation where you can run Terraform commands against that existing state. As long as you don't take any actions that modify the state, you should be able to inspect it with commands like terraform show.
You could then copy the terraform show output into another file in your new Terraform codebase as a starting point. The output is aimed at human consumption and is not necessarily all parsable by Terraform itself, but the output style is similar enough to the configuration language that hopefully it won't take too much effort to massage it into a usable shape.
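As a rough illustration of the massaging involved: terraform show prints computed, read-only attributes alongside the configurable ones, and the read-only ones have to be removed before Terraform will accept the block. The resource name and values below are hypothetical, not taken from the state in the question.

```hcl
# Hypothetical block after cleaning up `terraform show` output for an SQS queue.
# Read-only attributes such as `arn`, `id` and `url` appear in the output but
# are computed by the provider, so they are deleted here rather than copied.
resource "aws_sqs_queue" "events" {
  name                       = "analytics-dev-events"
  delay_seconds              = 0
  visibility_timeout_seconds = 30
}
```

The resource type and name have to stay exactly as they appear in the output, otherwise Terraform will not associate the block with the object in the state. Running terraform validate after each batch of edits should point out any remaining attributes that cannot be set.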
One important detail to watch out for is the handling of Terraform modules. If the configuration that produced this state contained any module "foo" blocks then in your terraform show output you will see some things like this:
# module.foo.aws_instance.bar
resource "aws_instance" "bar" {
  # ...
}
In order to replicate the configuration for that, it is not sufficient to paste the entire output into one file. Instead, any resource block that has a comment above it indicating that it belongs to a module will need to be placed in a configuration file belonging to that module, or else Terraform will not understand that block as relating to the object it can see in the state.
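A minimal sketch of that layout, shown as two files in one listing. The ./modules/foo path is an assumption made for illustration; only the module name "foo" comes from the terraform show output.

```hcl
# --- main.tf (root module) ---
# The module call must use the same name that appears in the state ("foo").
# The source path is a hypothetical choice for the rebuilt codebase.
module "foo" {
  source = "./modules/foo"
}

# --- modules/foo/main.tf (a separate file inside the module directory) ---
# Blocks commented as module.foo.* in the terraform show output go here,
# not in the root module.
resource "aws_instance" "bar" {
  # ...
}
```

With this split, the address Terraform computes for the block is module.foo.aws_instance.bar, which is what the state already contains.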
I'd strongly suggest taking a backup copy of the state object you have before you begin, and you should be very careful not to apply any plans while you're in this odd state of having only a backend configuration, because Terraform might (if it's able to pick up enough provider configuration from the execution environment) plan to destroy all of the objects in the state in order to match the configuration.

Related

Amazon Cloud - Target Groups removed

We have provisioned Amazon resources (EC2, load balancers, target groups, ...) using Terragrunt. When we re-apply the EC2 instances script, it removes the target groups associated with the load balancer.
This is due to the dependencies we create in the target group scripts, but we would like to understand the best practices for implementing loosely coupled Terraform/Terragrunt scripts. I mean, when we re-apply the .hcl file it shouldn't impact the other related resources.
Please suggest.
Terraform and Terragrunt know what to destroy by referencing the state file (local or remote). When you run terraform apply or terragrunt apply inside a folder, Terraform looks at what is in AWS, what is in the tfstate file, and what your scripts ask for; it diffs all three, works out the delta, and decides what to do. An important thing to know is that Terraform is directory-specific: with the default local backend, it creates a state file in whichever directory you run it in. There is also the concept of remote state, using S3 along with DynamoDB for locking, so that multiple developers can share the state without stepping on each other's toes.
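One common way to decouple directories, sketched here in plain Terraform rather than Terragrunt's own dependency blocks, is for the EC2 directory to keep its own remote state and only read the target group's ARN from the other directory's state. The bucket, key, table and output names are all hypothetical, and the sketch assumes the target group configuration declares an output called target_group_arn.

```hcl
# EC2 directory: its own state, separate from the target groups' state.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # hypothetical bucket
    key            = "ec2/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks" # hypothetical lock table
  }
}

# Read-only view of the outputs stored in the target groups' state.
data "terraform_remote_state" "target_groups" {
  backend = "s3"
  config = {
    bucket = "example-terraform-state"
    key    = "target-groups/terraform.tfstate"
    region = "us-east-1"
  }
}

variable "instance_id" {
  type        = string
  description = "ID of the instance to register (placeholder for illustration)"
}

# Only the attachment lives here; the target group itself is owned elsewhere.
resource "aws_lb_target_group_attachment" "app" {
  target_group_arn = data.terraform_remote_state.target_groups.outputs.target_group_arn
  target_id        = var.instance_id
  port             = 80
}
```

Because the target group is not declared in this directory, applying here cannot plan to destroy it; at most the attachment changes.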

What's the best practice to use created resources in Terraform?

I will start a new Terraform project on AWS. The VPC is already created and I want to know the best way to integrate it into my code. Do I have to create it again, and Terraform will detect it and not override it? Or do I have to use a data source for that? Or is there another way, like terraform import?
I also want to be able to deploy the entire infrastructure in another region or another account in the future.
Thanks.
When it comes to integrating with existing objects, you first have to decide between two options: you can either import these objects into Terraform and use Terraform to manage them moving forward, or you can leave them managed by whatever existing system and use them in Terraform by reference.
If you wish to use Terraform to manage these existing objects, you must first write a configuration for the object as if Terraform were going to create it itself:
resource "aws_vpc" "example" {
  # fill in here all the same settings that the existing object already has
  cidr_block = "10.0.0.0/16"
}
# Can then use that vpc's id in other resources using:
# aws_vpc.example.id
But then rather than running terraform apply immediately, you can first run terraform import to instruct Terraform to associate this resource block with the existing VPC using its id assigned by AWS:
terraform import aws_vpc.example vpc-abcd1234
If you then run terraform plan you should see that no changes are required, because Terraform detected that the configuration matches the existing object. If Terraform does propose some changes, you can either accept them by running terraform apply or continue to update the configuration until it matches the existing object.
Once you have done this, Terraform will consider itself the owner of the VPC and will thus plan to update it or destroy it on future runs if the configuration suggests it should do so. If any other system was previously managing this VPC, it's important to stop it doing so or else this other system is likely to conflict with Terraform.
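On Terraform 1.5 or later (an assumption about your version), the same association can also be declared in configuration with an import block instead of being run as a one-off command. A minimal sketch reusing the example address and id from above:

```hcl
# Declarative equivalent of `terraform import aws_vpc.example vpc-abcd1234`.
# Requires Terraform 1.5 or later.
import {
  to = aws_vpc.example
  id = "vpc-abcd1234"
}

resource "aws_vpc" "example" {
  # Must still describe the settings the existing VPC already has.
  cidr_block = "10.0.0.0/16"
}
```

With this form the import shows up in terraform plan alongside any proposed changes and is carried out by terraform apply, so it can be reviewed before the state is touched.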
If you'd prefer to keep whatever existing system is managing the VPC, you can also use the Data Sources feature to look up the existing VPC without putting Terraform in charge of it.
In this case, you might use the aws_vpc data source, which can look up VPCs by various attributes. A common choice is to look up a VPC by its tags, assuming your environment has a predictable tagging scheme that allows you to describe the single VPC you are looking for:
data "aws_vpc" "example" {
  tags = {
    Name = "example-VPC-name"
  }
}
# Can then use that vpc's id in other resources using:
# data.aws_vpc.example.id
In some cases users will introduce additional indirection to find the VPC some other way than by querying the AWS VPC APIs directly. That is a more advanced configuration and the options here are quite broad, but for example if you are using SSM Parameter Store you could place the VPC into a parameter store parameter and retrieve it using the aws_ssm_parameter data source.
If the existing system managing the VPC is CloudFormation, you could also use aws_cloudformation_export or aws_cloudformation_stack to retrieve the information from the CloudFormation API.
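A hedged sketch of both kinds of indirection follows. The parameter name and export name are hypothetical; they would be whatever the system that owns the VPC actually publishes.

```hcl
# Option 1: the VPC id was written to an SSM Parameter Store parameter.
data "aws_ssm_parameter" "vpc_id" {
  name = "/network/vpc-id" # hypothetical parameter name
}

# Option 2: the VPC id is exported by a CloudFormation stack.
data "aws_cloudformation_export" "vpc_id" {
  name = "network-VpcId" # hypothetical export name
}

# Either value can then be used wherever a VPC id is expected.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.aws_ssm_parameter.vpc_id.value
  # or: vpc_id = data.aws_cloudformation_export.vpc_id.value
}
```

In practice you would keep only one of the two data sources. Note that the aws_ssm_parameter value is treated as sensitive by Terraform, so it is hidden in plan output.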
If you are happy to manage it via terraform moving forward then you can import existing resources into your terraform state. Here is the usage page for it https://www.terraform.io/docs/import/usage.html
You will have to define a resource block inside of your configuration for the vpc first. You could do something like:
resource "aws_vpc" "existing" {
  cidr_block = "172.16.0.0/16"
  tags = {
    Name = "prod"
  }
}
and then on the cli run the command
terraform import aws_vpc.existing <vpc-id>
Make sure you run a terraform plan afterwards, because terraform may try to make changes to it. You kind of have to reverse engineer it a bit, by adding all the necessary configuration to the aws_vpc resource. Once it is aligned, terraform will not attempt to change it. You can then re-use this to deploy to other accounts and regions.
As you suggested, you could use a data source for the vpc. This can be useful if you want to manage it outside of terraform, instead of having the potential to destroy the vpc if it is run by an inexperienced user.
Some customers I've worked with prefer to manage resources like vpcs/subnets (and other core infrastructure) in separate terraform scripts that only senior engineers have access to. This can avoid the disaster scenarios where people destroy the underlying infrastructure by accident.
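For a VPC that Terraform does manage, a complementary guard against accidental destruction is the prevent_destroy lifecycle setting. This is an addition to the advice above rather than something the answer itself proposes; the block reuses the example VPC from earlier.

```hcl
resource "aws_vpc" "existing" {
  cidr_block = "172.16.0.0/16"
  tags = {
    Name = "prod"
  }

  lifecycle {
    # Any plan that would destroy this VPC fails with an error instead.
    prevent_destroy = true
  }
}
```

This only protects the resource while the block stays in the configuration; if the whole resource block is deleted, the setting goes with it.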
I personally prefer managing all my terraform code in a git repository that is then deployed using a CI/CD tool, even if it's just myself working on it. Some people may not see the value in spending the time creating the pipeline though and may stick with running it locally.
This post has some great recommendations on running terraform in an automated environment https://learn.hashicorp.com/terraform/development/running-terraform-in-automation

how to update terraform state with manual change done on resources

I had provisioned some resources on AWS, including EC2 instances, but after that we attached some extra security groups to these instances. Terraform now detects this and says it will roll the change back to match the configuration file.
Let's say I had the code below, which attaches an SG to my EC2 instance:
vpc_security_group_ids = ["sg-xxxx"]
But now my problem is: how can I update the terraform.tfstate file so that Terraform does not detach the manually attached security groups?
I can solve it as below:
I would refresh the Terraform state file with terraform refresh, which will update the state file.
Then I have to update my Terraform configuration file manually with the security group IDs that were attached manually.
But that is only feasible for a small setup. What if we have a complex scenario? Is there any other mechanism in Terraform that would detect the drift and update the configuration?
Thanks!!
There is no way Terraform will update your source code when detecting a drift on AWS.
The process you mention is right:
Report manual changes done in AWS into the Terraform code
Do a terraform plan. It will refresh the state and show you if there is still a difference
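If the manually attached security groups are meant to stay outside Terraform, one alternative to copying them into the code is to tell Terraform to ignore that attribute. This is a sketch of a different approach from the one described above; the AMI and instance type are placeholders.

```hcl
resource "aws_instance" "app" {
  ami                    = "ami-xxxx" # placeholder
  instance_type          = "t3.micro" # placeholder
  vpc_security_group_ids = ["sg-xxxx"]

  lifecycle {
    # Differences in this attribute no longer produce a planned change.
    ignore_changes = [vpc_security_group_ids]
  }
}
```

The trade-off is that later edits to vpc_security_group_ids in the code are ignored as well, so this suits attributes that are deliberately managed by hand.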
You can use terraform import with the id to import the remote changes to your terraform state file. Later use terraform plan to check if the change is reflected in the code.
This can be achieved by updating terraform state file manually but it is not best practice to update this file manually.
Also, if you are updating your AWS resources (created by Terraform) manually or outside terraform code then it defeats the whole purpose of Infrastructure as Code.
If you are looking to manage complex infrastructure on AWS using Terraform then it is very good to follow best practices and one of them is all changes should be done via code.
Hope this helps.
terraform import <resource>.<resource_name> [unique_id_from_aws]
You may need to temporarily comment out any provider/resource that relies on the output of the manually created resource.
After running the above, un-comment the dependencies and run terraform refresh.
The accepted answer is technically not correct.
As per my testing:
Terraform refresh will update the state file with current live configuration
Terraform plan will only internally update with the live configuration and compare to the code, but not actually update the state file
Terraform apply will update the state file to current live configuration, even if it says no changes to apply (use case = manual change then update TF code to reflect change and now want to update state file)

What AWS Resources Does Terraform Know About

Recently, we had issues with tfstate being deleted on S3.
As a result, there are a number of EC2 instances still running (duplicates if you will)
Is there a way to query Terraform and list which EC2 instances (and other resources) Terraform has under its control? I want to delete the duplicate AWS resources without messing up Terraform state.
Depending on whether you care about availability you could just delete everything and let Terraform recreate it all.
Or you could use terraform state list and then iterate through that with terraform state show (eg. terraform state list | xargs terraform state show) to show everything.
terraform import is for importing stuff that exists back in to your state which doesn't sound like what you want because it sounds like you've already recreated some things so have duplicates. If you had caught the loss of the resources from your state file before Terraform recreated it (for example by seeing an unexpected creation in the plan and seeing that the resource already existed in the AWS console) then you could have used that to import the resources back into the state file so that Terraform would then show an empty plan for these resources.
In the future, make sure you use state locking to prevent this from happening again!

Terraform initial state file creation

Is there a way to create the Terraform state file from existing infrastructure? For example, an AWS account comes with some services already in place (for example, the default VPC).
But Terraform seems to know only about the resources it creates. So:
What is the best way to migrate existing AWS infrastructure to Terraform code?
Is it possible to add a resource manually and modify the state file manually (any bad effects)?
Update
Terraform 0.7.0 supports importing a single resource.
For relatively small things I've had some success in manually mangling a state file to add stubbed resources that I then proceeded to Terraform over the top (particularly with pre-existing VPCs and subnets and then using Terraform to apply security groups etc).
For anything more complicated there is an unofficial tool called terraforming which I've heard is pretty good at generating Terraform state files but also merging with pre-existing state files. I've not given it a go but it might be worth looking into.
Update
Since Terraform 0.7, Terraform now has first class support for importing pre-existing resources using the import command line tool.
As of 0.7.4 this will import the pre-existing resource into the state file but not generate any configuration for the resource. Of course, if you then attempt a plan (or an apply), Terraform will show that it wants to destroy this orphaned resource. Before running the apply you would then need to create configuration to match the resource; any future plans (and applies) should then show no changes and happily keep the imported resource.
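For the default VPC mentioned in the question specifically, the AWS provider also has a dedicated aws_default_vpc resource. As far as I know, when a default VPC already exists Terraform adopts it into management rather than creating a new one, so no separate import step is needed. A minimal sketch, with a hypothetical tag value:

```hcl
# Adopts the region's existing default VPC instead of creating a VPC.
resource "aws_default_vpc" "default" {
  tags = {
    Name = "Default VPC" # hypothetical tag
  }
}
```

Other pre-existing resources do not have an equivalent and still go through the import flow described above.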
Use Terraforming (https://github.com/dtan4/terraforming). To date it can generate most of the *.tfstate and *.tf files, except for VPC peering.