Terraform initial state file creation - amazon-web-services

Is there a way to create the terraform state file, from the existing infrastructure. For example, an AWS account comes with a some services already in place ( for ex: default VPC).
But terraform, seems to know only the resources, it creates. So,
What is the best way to migrate an existing AWS Infrastructure to Terraform code
Is it possible to add a resource manually and modify the state file manually (bad effects ?)
Update
Terraform 0.7.0 supports importing single resource.

For relatively small things I've had some success in manually mangling a state file to add stubbed resources that I then proceeded to Terraform over the top (particularly with pre-existing VPCs and subnets and then using Terraform to apply security groups etc).
For anything more complicated there is an unofficial tool called terraforming which I've heard is pretty good at generating Terraform state files but also merging with pre-existing state files. I've not given it a go but it might be worth looking into.
Update
Since Terraform 0.7, Terraform now has first class support for importing pre-existing resources using the import command line tool.
As of 0.7.4 this will import the pre-existing resource into the state file but not generate any configuration for the resource. Of course if then attempt a plan (or an apply) Terraform will show that it wants to destroy this orphaned resource. Before running the apply you would then need to create the configuration to match the resource and then any future plans (and applys) should show no changes to the resource and happily keep the imported resource.

Use Terraforming https://github.com/dtan4/terraforming , To date it can generate most of the *.tfstate and *.tf file except for vpc peering.

Related

easy way to copy/move already created AWS stuffs from current VPC to another?

Could you please suggest easy way to copy/move already created AWS stuffs from current VPC to another ?
There is no "easy" way but you can use different tools to import existing resources and deploy them to different VPCs.
There is a free tool called Former2 (https://github.com/iann0036/former2) which can be used to scan existing resources and produce outputs. These outputs can be used to deploy new resources. I tested this tool and it seems to be quite intuitive to use to gather information about existing resources and produce outputs in different template languages (Cloudformation, Terraform, CDK, Stroposphere, Pulumi).
Terraform can be used to import existing resources to the current state and after that copy resources to configuration. Future versions of Terraform will be able to also update the configuration. To use terraform you must know every resource you want to import and use import command with their ids. Terraform does not support nested imports so importing vpc does not add subnets or other resources to the state but they have to be imported separately.
I think CloudFormation has also import feature but specific resource template must be written beforehand and submitted during the import. This is not an easy or fast way to copy resources but should work and as an end product there should a template which can be used to deploy resources to another VPCs.

How can I sync Terraform state with existing resources?

I have a Terraform script that sets up 30 resources in all, including roles, policies and ECS infrastructure.
The state file was destroyed when switching git branches - I should've backed it up, but now I do not have a state file referencing the already existing resources.
I can't use terraform apply because the names already exist, and I can't clear the existing resources with terraform destroy because the state file now thinks the resources don't exist.
Is there a way I could sync with existing resources? I am aware of terraform import but to my knowledge that only works with one resource at a time.
One of the functions of Terraform "state" is to allow Terraform to recognize which subset of objects in your account it's responsible for managing, and so unfortunately there is no built-in mechanism in Terraform to automatically recover that information: terraform import is the way to explicitly tell Terraform that it should begin managing an existing object.
However, there are some external tools that make use of additional knowledge about particular cloud platforms in order to provide a higher-level interface to importing. One example is terraformer, which allows you to import a set of objects matching a query, as long as the objects you want to import are in the subset of types it supports.

Rebuild/recreate terraform from remote state

A contractor built an application's AWS infrastructure on his local laptop, never commited his code then left (wiping the notebook's HD). But he did create the infrastructure with Terraform and stored the remote state in an s3 bucket, s3://analytics-nonprod/analytics-dev.tfstate.
This state file includes all of the VPC, subnets, igw, nacl, ec2, ecs, sqs, sns, lambda, firehose, kinesis, redshift, neptune, glue connections, glue jobs, alb, route53, s3, etc. for the application.
I am able to run Cloudformer to generate cloudformation for the entire infrastructure, and also tried to import the infrastructure using terraformer but terraformer does not include neptune and lambda components.
What is the best way/process to recreate a somewhat usable terraform just from the remote state?
Should I generate some generic :
resource "aws_glue_connection" "dev" { }
and run "terraform import aws_glue_connection.dev"
then run "terraform show"
for each resource?
Terraform doesn't have a mechanism specifically for turning existing state into configuration, and indeed doing so would be lossy in the general case because the Terraform configuration likely contained expressions connecting resources to one another that are not captured in the state snapshots.
However, you might be able to get a starting point -- possibly not 100% valid but hopefully a better starting point than nothing at all -- by configuring Terraform just enough to find the remote state you have access to, running terraform init to make Terraform read it, and then run terraform show to see the information from the state in a human-oriented way that is designed to resemble (but not necessarily exactly match) the configuration language.
For example, you could write a backend configuration like this:
terraform {
backend "s3" {
bucket = "analytics-nonprod"
key = "analytics-dev.tfstate"
}
}
If you run terraform init with appropriate AWS credentials available then Terraform should read that state snapshot, install the providers that the resource instances within it belong to, and then leave you in a situation where you can run Terraform commands against that existing state. As long as you don't take any actions that modify the state, you should be able to inspect it with commands like terraform show.
You could then copy the terraform show output into another file in your new Terraform codebase as a starting point. The output is aimed at human consumption and is not necessarily all parsable by Terraform itself, but the output style is similar enough to the configuration language that hopefully it won't take too much effort to massage it into a usable shape.
One important detail to watch out for is the handling of Terraform modules. If the configuration that produced this state contained any module "foo" blocks then in your terraform show output you will see some things like this:
# module.foo.aws_instance.bar
resource "aws_instance" "bar" {
# ...
}
In order to replicate the configuration for that, it is not sufficient to paste the entire output into one file. Instead, any resource block that has a comment above it indicating that it belongs to a module will need to be placed in a configuration file belonging to that module, or else Terraform will not understand that block as relating to the object it can see in the state.
I'd strongly suggest taking a backup copy of the state object you have before you begin, and you should be very careful not to apply any plans while you're in this odd state of having only a backend configuration, because Terraform might (if it's able to pick up enough provider configuration from the execution environment) plan to destroy all of the objects in the state in order to match the configuration.

What's the best practice to use created resources in Terraform?

I will start a new Terraform project on AWS. The VPC is already created and i want to know what's the best way to integrate it in my code. Do i have to create it again and Terraform will detect it and will not override it ? Or do i have to use Data source for that ? Or is there other best way like Terraform Import ?
I want also to be able in the future to deploy the entire infrastructure in other Region or other Account.
Thanks.
When it comes to integrating with existing objects, you first have to decide between two options: you can either import these objects into Terraform and use Terraform to manage them moving forward, or you can leave them managed by whatever existing system and use them in Terraform by reference.
If you wish to use Terraform to manage these existing objects, you must first write a configuration for the object as if Terraform were going to create it itself:
resource "aws_vpc" "example" {
# fill in here all the same settings that the existing object already has
cidr_block = "10.0.0.0/16"
}
# Can then use that vpc's id in other resources using:
# aws_vpc.example.id
But then rather than running terraform apply immediately, you can first run terraform import to instruct Terraform to associate this resource block with the existing VPC using its id assigned by AWS:
terraform import aws_vpc.example vpc-abcd1234
If you then run terraform plan you should see that no changes are required, because Terraform detected that the configuration matches the existing object. If Terraform does propose some changes, you can either accept them by running terraform apply or continue to update the configuration until it matches the existing object.
Once you have done this, Terraform will consider itself the owner of the VPC and will thus plan to update it or destroy it on future runs if the configuration suggests it should do so. If any other system was previously managing this VPC, it's important to stop it doing so or else this other system is likely to conflict with Terraform.
If you'd prefer to keep whatever existing system is managing the VPC, you can also use the Data Sources feature to look up the existing VPC without putting Terraform in charge of it.
In this case, you might use the aws_vpc data source, which can look up VPCs by various attributes. A common choice is to look up a VPC by its tags, assuming your environment has a predictable tagging scheme that allows you to describe the single VPC you are looking for:
data "aws_vpc" "example" {
tags = {
Name = "example-VPC-name"
}
}
# Can then use that vpc's id in other resources using:
# data.aws_vpc.example.id
In some cases users will introduce additional indirection to find the VPC some other way than by querying the AWS VPC APIs directly. That is a more advanced configuration and the options here are quite broad, but for example if you are using SSM Parameter Store you could place the VPC into a parameter store parameter and retrieve it using the aws_ssm_parameter data source.
If the existing system managing the VPC is CloudFormation, you could also use aws_cloudformation_export or aws_cloudformation_stack to retrieve the information from the CloudFormation API.
If you are happy to manage it via terraform moving forward then you can import existing resources into your terraform state. Here is the usage page for it https://www.terraform.io/docs/import/usage.html
You will have to define a resource block inside of your configuration for the vpc first. You could do something like:
resource "aws_vpc" "existing" {
cidr_block = "172.16.0.0/16"
tags = {
Name = "prod"
}
}
and then on the cli run the command
terraform import aws_vpc.existing <vpc-id>
Make sure you run a terraform plan afterwards, because terraform may try to make changes to it. You kind of have to reverse engineer it a bit, by adding all the necessary configuration to the aws_vpc resource. Once it is aligned, terraform will not attempt to change it. You can then re-use this to deploy to other accounts and regions.
As you suggested, you could use a data source for the vpc. This can be useful if you want to manage it outside of terraform, instead of having the potential to destroy the vpc if it is run by an inexperienced user.
Some customers I've worked with prefer to manage resources like vpcs/subnets (and other core infrastructure) in separate terraform scripts that only senior engineers have access to. This can avoid the disaster scenarios where people destroy the underlying infrastructure by accident.
I personally prefer managing all my terraform code in a git repository that is then deployed using a CI/CD tool, even if it's just myself working on it. Some people may not see the value in spending the time creating the pipeline though and may stick with running it locally.
This post has some great recommendations on running terraform in an an automated environment https://learn.hashicorp.com/terraform/development/running-terraform-in-automation

how to update terraform state with manual change done on resources

i had provisioned some resources over AWS which includes EC2 instance as well,but then after that we had attached some extra security groups to these instances which now been detected by terraform and it say's it'll rollback it as per the configuration file.
Let's say i had below code which attaches SG to my EC2
vpc_security_group_ids = ["sg-xxxx"]
but now my problem is how can i update the terraform.tfstate file so that it should not detach manually attached security groups :
I can solve it as below:
i would refresh terraform state file with terraform refresh which will update the state file.
then i have to update my terraform configuration file manually with security group id's that were attached manually
but that possible for a small kind of setup what if we have a complex scenario, so do we have any other mechanism in terraform which would detect the drift and update it
THanks !!
There is no way Terraform will update your source code when detecting a drift on AWS.
The process you mention is right:
Report manual changes done in AWS into the Terraform code
Do a terraform plan. It will refresh the state and show you if there is still a difference
You can use terraform import with the id to import the remote changes to your terraform state file. Later use terraform plan to check if the change is reflected in the code.
This can be achieved by updating terraform state file manually but it is not best practice to update this file manually.
Also, if you are updating your AWS resources (created by Terraform) manually or outside terraform code then it defeats the whole purpose of Infrastructure as Code.
If you are looking to manage complex infrastructure on AWS using Terraform then it is very good to follow best practices and one of them is all changes should be done via code.
Hope this helps.
terraform import <resource>.<resource_name> [unique_id_from_aws]
You may need to temporarily comment out any provider/resource that relies on the output of the manually created resource.
After running the above, un-comment the dependencies and run terraform refresh.
The accepted answer is technically not correct.
As per my testing:
Terraform refresh will update the state file with current live configuration
Terraform plan will only internally update with the live configuration and compare to the code, but not actually update the state file
Terraform apply will update the state file to current live configuration, even if it says no changes to apply (use case = manual change then update TF code to reflect change and now want to update state file)