Terraform Multiple State Files Best Practice Examples - amazon-web-services

I am trying to build out our AWS environments using Terraform but am hitting some issues scaling. I have a repository of just modules that I want to use repeatedly when building my environments and a second repository just to handle the actual implementations of those modules.
I am aware of HashiCorp's GitHub page that has an example, but there each environment is a single state file. I want to split environments out but then have multiple state files within each environment, because when the state files get big, applying small updates takes way too long.
In every example I've seen where multiple state files are used, the Terraform files are extremely un-DRY, which is not ideal.
I would prefer to be able to have different variable values between environments but have the same configuration.
Has anyone ever done anything like this? Am I missing something? I'm a bit frustrated because no Terraform example I find is ever at scale, which makes it hard for a n00b such as myself to start down the right path. Any help or suggestions are very much appreciated!

The idea of environment unfortunately tends to mean different things to different people and organizations.
To some, it's simply creating multiple copies of some infrastructure -- possibly only temporary, or possibly long-lived -- to allow for testing and experimentation in one without affecting another (probably production) environment.
For others, it's a first-class construct in a deployment architecture, with the environment serving as a container into which other applications and infrastructure are deployed. In this case, there are often multiple separate Terraform configurations that each have a set of resources in each environment, sharing data to create a larger system from smaller parts.
Terraform has a feature called State Environments that serves the first of these use-cases by allowing multiple named states to exist concurrently for a given configuration, and allowing the user to switch between them using the terraform env commands to focus change operations on a particular state.
The State Environments feature alone is not sufficient for the second use-case, since it only deals with multiple states in a single configuration. However, it can be used in conjunction with other Terraform features, making use of the ${terraform.env} interpolation value to deal with differences, to allow multiple state environments within a single configuration to interact with a corresponding set of state environments within another configuration.
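As a rough sketch of how that combination can look (the bucket, key, and resource names here are invented for illustration; in current Terraform releases the feature is called workspaces, so ${terraform.env} corresponds to terraform.workspace):

    # Configuration 1 ("network"): names its resources after the selected workspace.
    resource "aws_vpc" "main" {
      cidr_block = "10.0.0.0/16"

      tags = {
        Name = "network-${terraform.workspace}"
      }
    }

    output "vpc_id" {
      value = aws_vpc.main.id
    }

    # Configuration 2 ("app"): reads the matching workspace's state from configuration 1.
    data "terraform_remote_state" "network" {
      backend   = "s3"
      workspace = terraform.workspace

      config = {
        bucket = "example-terraform-state"   # hypothetical state bucket
        key    = "network.tfstate"
        region = "us-east-1"
      }
    }

Each configuration is applied once per workspace, and the data source always reads the state that belongs to the same workspace.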
One "at scale" approach (relatively-speaking) is described in my series of articles Terraform Environment+Application Pattern, which describes a generalization of a successful deployment architecture with many separate applications deployed together to form an environment.
In that pattern, the environments themselves (which serve as the "container" for applications, as described above) are each created with a separate Terraform configuration, allowing each to differ in the details of how it is configured, but they each expose data in a standard way to allow multiple applications -- each using the State Environments feature -- to be deployed once for each environment using the same configuration.
This compromise leads to some duplication between the environment configurations -- which can be mitigated by using Terraform modules to share patterns between them -- but these then serve as a foundation to allow other configurations to be generalized and deployed multiple times without such duplication.
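For illustration only (the module path, variables, and outputs here are assumptions), an environment configuration under this pattern tends to reduce to a thin wrapper plus a set of standardized outputs:

    # environments/production/main.tf -- one configuration per environment
    module "base_network" {
      source     = "../../modules/base-network"   # hypothetical shared module
      name       = "production"
      cidr_block = "10.1.0.0/16"                  # value that differs per environment
    }

    # Standardized outputs that the per-application configurations can consume.
    output "vpc_id" {
      value = module.base_network.vpc_id
    }
    output "subnet_ids" {
      value = module.base_network.subnet_ids
    }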

Related

How to manage multiple Environments within one project (GCP/AWS)

I am building a lab utility to deploy my team's development environments (testing / stress etc).
At present, the pipeline is as follows:
Trigger the pipeline via an HTTP request whose args contain the distribution, web server, and web server version; these are passed as ARGs to multi-stage Dockerfiles.
Docker Buildx builds the container (if it doesn't exist in ECR).
Pipeline job pushes that container to ECR (if it doesn't already exist).
Terraform deploys the container using Fargate, and sets up VPCs and an ALB to handle ingress externally.
FQDN / TLS is then provisioned on ...com
Previously, when I've made tools like this that create environments, environments were managed and deleted solely at the project level, since each environment had its own project (best practice for isolation and billing-tracking purposes). However, given my company's organisational security constraints, I am limited to only one project in which I must create all the resources.
This means I have to find a way to manage/deploy 30 (the max) environments in one project without it being a bit of a clustered duck.
More or less, I am looking for a way that allows me to keep track of environments and tear them down (autonomously) along with their associated resources, keyed on some identifier; most likely these environments can be separated by resource tags/groups.
It appears CDKTF / Pulumi look like a neat way of achieving some form of "high-level" structure, but I am struggling to find a way to use them to do what I want. If anyone can recommend an approach, it'd be appreciated.
I have not tried anything yet, mainly because this is something that requires planning before I start work on it (don't like reaching dead ends, ha).
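One hedged sketch of how this could look with plain Terraform (the workspace-per-environment layout and tag names below are assumptions, not something prescribed by the question): key every deployment on a workspace and stamp that identifier onto all resources, so a single environment can be selected and destroyed as a unit.

    # Hypothetical: one workspace per lab environment (e.g. "stress-07").
    provider "aws" {
      region = "eu-west-1"   # assumed region

      default_tags {
        tags = {
          Environment = terraform.workspace
          ManagedBy   = "lab-utility"
        }
      }
    }

    resource "aws_ecs_cluster" "lab" {
      name = "lab-${terraform.workspace}"
    }

    # Tear-down of one environment:
    #   terraform workspace select stress-07 && terraform destroy

CDKTF and Pulumi can layer a higher-level structure over the same idea, but the workspace (or stack) name as the identifier is the core of it.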

Managing AWS Parameter Store variables with a data migrations tool

We are increasingly using AWS Parameter Store for managing configuration.
One issue we have is managing which variables need to be set when releases occur to different environments (staging, dev, prod etc). There is always a lot of work to configure the different environments, and it is easy to overlook required settings when we release microservices.
It seems what is needed is something like a database migration tool, similar to Flyway or Liquibase, but I haven't found any products available, and it is unclear to me how secrets would be managed with such a system.
What are people doing to manage pushing new configuration into Parameter Store when new application code is deployed?
Do you know AWS AppConfig? It’s a different way of managing configuration and I’m not sure if this fits your requirements but it might be worth a look.
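If you already manage infrastructure with Terraform, another hedged option is to declare the parameters next to the service that needs them, so configuration changes ship with releases; the parameter names and variables below are assumptions:

    # Sketch: per-environment parameters for one microservice, with the
    # environment and any secrets supplied as input variables at release time.
    variable "environment" {
      type = string   # e.g. "dev", "staging", "prod"
    }

    variable "db_password" {
      type      = string
      sensitive = true
    }

    resource "aws_ssm_parameter" "db_host" {
      name  = "/myservice/${var.environment}/db_host"   # hypothetical naming scheme
      type  = "String"
      value = "db.${var.environment}.internal"
    }

    resource "aws_ssm_parameter" "db_password" {
      name  = "/myservice/${var.environment}/db_password"
      type  = "SecureString"
      value = var.db_password
    }

A plan against the target environment then shows exactly which parameters a release will add or change, which covers much of what a Flyway-style migration would give you.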

How to swap between projects with terraform

Hey, I started migrating infrastructure to Terraform and came across a few questions that are hard for me to answer.
How do I easily swap between different projects, assuming I have the same resources in a few projects separated by environment? Do I store it all in one tfstate, or do I have multiple ones? Is it stored in one bucket, a few buckets, or somewhere else entirely?
Can you create a new project with some random number at the end and automatically deploy resources to it?
If you can create a new project and deploy to it, how do you enable the APIs Terraform needs to work, like iam.googleapis.com etc.?
Here are some pieces of an answer to your questions.
If you use only one Terraform configuration, you have only one tfstate. The downside is that whenever you want to update one project you have to take into account the dependencies across all projects (and you risk breaking other projects), and the files get bigger and harder to maintain. I recommend having one Terraform configuration per project, and one state per project. If you have common patterns (IP naming, VM settings, ...) you can create modules to import into the configuration of each project.
For your second and third questions: yes, you can create a project and then deploy to it, but I don't recommend it, for separation of concerns. Use one Terraform configuration to manage your resource organisation (projects, folders, ...) and another one to manage your infrastructure.
A good way to think about it: build once, maintain a lot! The build phase is not the hardest part; having something maintainable, easy to read, and concise is hard!
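As a minimal sketch of the one-state-per-project layout described above (the bucket name, prefix, and module are made up):

    # projects/billing-app/backend.tf -- one directory, one configuration, and
    # one state per project; all states live in a single GCS bucket under
    # different prefixes (separate buckets per project work just as well).
    terraform {
      backend "gcs" {
        bucket = "example-terraform-states"   # assumed bucket name
        prefix = "billing-app"
      }
    }

    # Shared patterns (IP naming, VM settings, ...) pulled in as a module.
    module "common_network" {
      source     = "../../modules/common-network"   # hypothetical module path
      project_id = "billing-app-prod"
    }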

Different S3 buckets for different environments vs. one bucket with different objects for each environment

I am migrating my applications to start using S3, but I am at the point where I am not really sure whether the best practice for different environments is to use multiple buckets, or to create one bucket with separate prefixes for each environment.
Like this:
- my-document-alpha
- my-document-beta
- my-document-gamma
- my-document
Or:
- my-document containing:
  - alpha/
  - beta/
  - gamma/
  - prd/
In my opinion different buckets are better, so you can have one for each purpose without polluting your production bucket. It will also help with access control if you have multiple developers working on the project.
It is better to have multiple buckets. It is easier to maintain and you can have different settings for each env: production can have encryption at rest, backups enabled, different lifecycle rules, etc.
Having one bucket can lead to unwanted downtime. Say there is a bug in the code that writes data in the wrong format or to the wrong path; you don't want the test code/env to affect the production code/env.
It's always better to have environments as isolated as possible.
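If the buckets are managed with Terraform, the one-bucket-per-environment approach is also easy to express; the names and the versioning example below are illustrative assumptions:

    # Sketch: one bucket per environment, with stricter settings only where needed.
    variable "environments" {
      type    = set(string)
      default = ["alpha", "beta", "gamma", "prd"]   # assumed environment names
    }

    resource "aws_s3_bucket" "documents" {
      for_each = var.environments
      bucket   = "my-document-${each.key}"          # hypothetical naming
    }

    # Example of a per-environment difference: versioning only on production.
    resource "aws_s3_bucket_versioning" "documents" {
      for_each = toset([for env in var.environments : env if env == "prd"])
      bucket   = aws_s3_bucket.documents[each.key].id

      versioning_configuration {
        status = "Enabled"
      }
    }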

Build AWS infrastructure using Terraform in an order that you specify

Recently I came across a situation where I am building AWS infrastructure using Terraform to set up a clustered environment for some applications. The thing is, when I apply the Terraform scripts it builds all the necessary modules, spins up multiple instances altogether, and then finishes. This may be how it is meant to work, and there is nothing to blame anyway; Terraform works great for building such infra.
While trying to set up this infra to deploy an application in a clustered way, I am using a configuration management tool. As the EC2 instances are built, the CM scripts get invoked and the instances are configured accordingly. The problem comes when there is a dependency between the modules.
Consider a scenario where two components (A & B) are part of an Auto Scaling group and two components (C & D) are normal EC2 instances. If I wish to build A first and then C, since the C instance has a dependency on A which has to be fully configured first (or vice versa), how can I control the order? How can Terraform help me achieve this?
Can someone please help me achieve this?
Thanks in advance.
The other answer is correct in the literal sense, but overall this is something to avoid. Build your CM code so that it will keep re-trying to converge until it succeeds. With Chef in particular, you can use the chef-client cookbook to deploy a service which runs Chef converges automatically at a given interval (30 minutes by default but you might want to make that shorter). Running things in the "right" order sounds nice, but when dealing with byzantine failures you'll thank your past self for ensuring reliable convergence no matter the order.
You can use the depends_on parameter. Resources can be made explicitly dependent on other resources. Terraform will in turn only build a resource once the resources it depends on have been provisioned successfully.
https://www.terraform.io/intro/getting-started/dependencies.html
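For illustration, a minimal example of the parameter in use (the resource names and AMI are placeholders):

    # Instance "c" is not created until instance "a" has been created successfully,
    # including any provisioners that run the CM scripts on "a".
    resource "aws_instance" "a" {
      ami           = "ami-12345678"   # placeholder AMI
      instance_type = "t3.micro"
    }

    resource "aws_instance" "c" {
      ami           = "ami-12345678"
      instance_type = "t3.micro"

      depends_on = [aws_instance.a]
    }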
The question is broad in nature and the other answers are right in their own right. What I would like to add is that using modules to determine the order of logical sub-projects works well too.
In Terraform you can force procedural order with depends_on at the resource level, but you cannot use it for modules. For modules, however, you can use the output of one module as input to another, which helps you manage procedural order at the module level.
So, in your case, I would put A & B in one module, C & D into another, and use output variables from one as inputs to the other to control the order.
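A rough sketch of that idea (the module paths and output name are hypothetical): passing an output of the first module as an input to the second is what creates the ordering.

    module "autoscale_components" {              # contains A & B
      source = "./modules/autoscale-components"
    }

    module "standalone_components" {             # contains C & D
      source = "./modules/standalone-components"

      # Using an output of the A/B module as an input means Terraform must
      # finish that module before building this one.
      a_endpoint = module.autoscale_components.a_endpoint
    }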