I am currently using Terraform to deploy a PHP app to AWS.
This PHP app is deployed as a Service using AWS ECS.
I have multiple clients using this app, and each client receives their own copy of the system with their own configuration as their own service - a white label if you will.
Now, having done a bit of research on Terraform I've modularized my code and created the following file structure:
+---my-application
|       shared.tf
|       iam_policies.tf
|       iam_roles.tf
|       variables.tf
|       web-apps.tf
|
+---modules
|   +---role
|   |       main.tf
|   |       outputs.tf
|   |       variables.tf
|   |
|   \---webapp
|           main.tf
|           variables.tf
|
\---templates
        web_definition.tpl.json
My problem lies in the web-apps.tf file which I use as the "glue" for all of the webapp modules:
module "client_bob" {
source = "modules/webapp"
...
}
module "client_alice" {
source = "modules/webapp"
...
}
... Over 30 more client module blocks ...
Needless to say, this is not a good setup.
It is not scalable and also creates huge .tfstate files.
Once, when attempting to use Consul as a backend, I got an error saying I had reached the size limit allowed for a Consul KV value.
What is the correct way to approach this situation?
I looked at all of the questions in the Similar Questions section when writing this one, and they all revolve around the idea of using multiple .tfstate files, but I don't quite understand how that would solve my problem. Any help would be greatly appreciated!
I have done similar projects with Terragrunt; take a look.
It was made for exactly this kind of requirement.
The open-source repository is https://github.com/gruntwork-io/terragrunt
Terragrunt is a thin wrapper for Terraform that provides extra tools for working with multiple Terraform modules. https://www.gruntwork.io
In your case, you can easily manage a separate tfstate file for each client.
I also recommend managing the IAM roles, policies, and any other per-client resources the same way; do not mix them between clients.
For example, the structure would become the following
(I guess you will manage different environments for each client, right?):
├── bob
│   ├── prod
│   │   └── app
│   │       └── terraform.tfvars
│   └── nonprod
│       └── app
│           └── terraform.tfvars
├── alice
│   ├── prod
│   │   └── app
│   │       └── terraform.tfvars
│   └── nonprod
│       └── app
│           └── terraform.tfvars
...
Later, once you have mastered the terragrunt apply-all command, deployments become simpler and easier.
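With current Terragrunt versions the per-client configuration lives in a terragrunt.hcl file (older versions used a terragrunt block inside terraform.tfvars, as in the tree above). As a rough sketch of what one client's file might contain (the module repository URL and input names here are assumptions, not taken from your setup):

# bob/prod/app/terragrunt.hcl (hypothetical example)
terraform {
  # Reuse the same webapp module for every client
  source = "git::git@github.com:your-org/infrastructure-modules.git//webapp?ref=v0.1.0"
}

include {
  # Inherit the remote state configuration from a parent terragrunt.hcl
  path = find_in_parent_folders()
}

inputs = {
  client_name = "bob"
  environment = "prod"
}

Because each app folder gets its own remote state through the include block, every client ends up with its own small tfstate file instead of one huge one.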
Quick start
https://github.com/gruntwork-io/terragrunt-infrastructure-modules-example
https://github.com/gruntwork-io/terragrunt-infrastructure-live-example
So I ran this:
terraform state replace-provider terraform-mars/credstash granular-oss/credstash
and this was the output
Terraform will perform the following actions:
~ Updating provider:
- registry.terraform.io/terraform-mars/credstash
+ registry.terraform.io/granular-oss/credstash
Changing 1 resources:
module.operations.data.credstash_secret.key_name
Do you want to make these changes?
Only 'yes' will be accepted to continue.
Enter a value: yes
Successfully replaced provider for 1 resources.
then I checked it with
terraform providers
Providers required by configuration:
.
├── provider[registry.terraform.io/hashicorp/archive] ~> 2.2.0
├── provider[registry.terraform.io/vancluever/acme] ~> 2.5.3
├── provider[registry.terraform.io/hashicorp/aws] ~> 4.13.0
├── provider[registry.terraform.io/hashicorp/dns] ~> 3.2.3
├── provider[registry.terraform.io/hashicorp/local] ~> 2.2.3
├── provider[registry.terraform.io/hashicorp/cloudinit] ~> 2.2.0
├── provider[registry.terraform.io/granular-oss/credstash] ~> 0.6.1
├── provider[registry.terraform.io/hashicorp/external] ~> 2.2.2
├── provider[registry.terraform.io/hashicorp/null] ~> 3.1.1
├── provider[registry.terraform.io/hashicorp/tls] ~> 3.4.0
├── module.account
│ ├── provider[registry.terraform.io/hashicorp/aws]
│ └── module.static
└── module.operations
├── provider[registry.terraform.io/hashicorp/local]
├── provider[registry.terraform.io/hashicorp/aws]
├── provider[registry.terraform.io/terraform-mars/credstash]
It still uses the old provider for some reason, and I don't understand why.
I also ran terraform init, but the old provider still shows up there as well.
When I run terraform plan, it gives me this error:
Error: NoCredentialProviders: no valid providers in chain. Deprecated.
│ For verbose messaging see aws.Config.CredentialsChainVerboseErrors
│
│ with module.operations.data.credstash_secret.key_name,
│ on ../modules/stacks/operations/bastion.tf line 1, in data "credstash_secret" "bastion_pubkey":
│ 1: data "credstash_secret" "key_name" {
The part of the terraform providers output included in the question describes the provider requirements declared in the configuration. This includes both explicit provider requirements and some automatically-detected requirements that Terraform infers, for backward compatibility, from modules that were written for Terraform v0.12 and earlier.
The terraform state replace-provider command instead replaces references to providers inside the current Terraform state. The Terraform state remembers which provider most recently managed each resource so that e.g. Terraform knows which provider to use to destroy the object if you subsequently remove it from the configuration.
When you use terraform state replace-provider you'll typically need to first update the configuration of each of your modules to refer to the new provider instead of the old and to make sure each of your resources is associated (either implicitly or explicitly) with the intended provider. You can then use terraform state replace-provider to force the state to change to match, and thereby avoid the need to install the old provider in terraform init.
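For example, the module that still declares the old requirement (module.operations here) would need its provider requirement pointed at the new source; a rough sketch, with the version constraint taken from the providers output above:

terraform {
  required_providers {
    credstash = {
      # previously: source = "terraform-mars/credstash"
      source  = "granular-oss/credstash"
      version = "~> 0.6.1"
    }
  }
}

After updating the configuration, run terraform init again; the terraform-mars/credstash entry should then disappear from the terraform providers output, since that listing reflects the configuration rather than the state.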
I'm quite new to the Amplify function world, and I've been struggling to deploy my Golang function, which is connected to a DynamoDB stream. I am able to run my Lambda successfully by manually uploading a .zip that I created myself after building the binary with GOARCH=amd64 GOOS=linux go build src/index.go (I develop on a Mac), but when I use the Amplify CLI tools I am not able to deploy the function.
This is the folder structure of my function myfunction:
+ myfunction
├── amplify.state
├── custom-policies.json
├── dist
│ └── latest-build.zip
├── function-parameters.json
├── go.mod
├── go.sum
├── parameters.json
├── src
│ ├── event.json
│ └── index.go
└── tinkusercreate-cloudformation-template.json
The problem is that I can't use the amplify function build command, since it looks like it creates a .zip file containing my source file index.go (not the binary), so regardless of the handler I set, the Lambda does not seem able to run from that source. I get errors like
fork/exec /var/task/index.go: exec format error: PathError null or
fork/exec /var/task/index: no such file or directory: PathError null
depending on the handler I set.
Is there a way to make the Amplify function build work for a Golang Lambda? I would like to be able to run amplify function build myfunction successfully, so that I can deliver a working deployment to my target environment with amplify push.
My infrastructure is composed of a Host Project and several Service Projects that use its Shared VPC.
I have refactored the .tf files of my infrastructure as follows:
├── env
│   ├── dev
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   └── variables.tf
│   ├── pre
│   └── pro
├── host
│   ├── main.tf
│   ├── outputs.tf
│   ├── terraform.tfvars
│   └── variables.tf
└── modules
    ├── compute
    ├── network
    └── projects
The order of creation of the infrastructure is:
terraform apply in /host
terraform apply in /env/dev (for instance)
In the main.tf of the host directory I have created the VPC and enabled Shared VPC hosting:
# Creation of the hosted network
resource "google_compute_network" "shared_network" {
name = var.network_name
auto_create_subnetworks = false
project = google_compute_shared_vpc_host_project.host_project.project
mtu = "1460"
}
# Enable shared VPC hosting in the host project.
resource "google_compute_shared_vpc_host_project" "host_project" {
project = google_project.host_project.project_id
depends_on = [google_project_service.host_project]
}
The issue comes when I have to refer to the Shared VPC network in the Service Projects.
In the main.tf from env/dev/ I have set the following:
resource "google_compute_shared_vpc_service_project" "service_project_1" {
host_project = google_project.host_project.project_id
service_project = google_project.service_project_1.project_id
depends_on = [
google_compute_shared_vpc_host_project.host_project,
google_project_service.service_project_1,
]
}
QUESTION
How do I refer to the Host Project ID from another directory in the Service Project?
What I have tried so far
I have thought of using Output Values and Data Sources:
In host/outputs.tf, I declared the Project ID as an output:
output "project_id" {
value = google_project.host_project.project_id
}
But then I end up not knowing how to consume this output in my env/dev/main.tf.
I have also thought about Data Sources, fetching the Host Project ID from within env/dev/main.tf. But in order to fetch it, I would need its name (which defeats the purpose of providing it programmatically if I have to hardcode it).
What should I try next? What am I missing?
The files under the env/dev folder can't see anything above them, only any referenced modules.
You could refactor the host folder into a module to allow access to its outputs, but that adds the risk that the host will be destroyed whenever you destroy a dev environment.
I would try running terraform output -raw project_id after creating the host and piping the result to a text file or an environment variable, then using that as the input for a new "host_project" or similar variable in the env/dev deployment.
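For instance (the variable name and wiring below are only an illustration, not something from the original question), env/dev could accept the host project ID as a plain input variable:

# env/dev/variables.tf (hypothetical)
variable "host_project_id" {
  description = "Project ID of the Shared VPC host project, taken from the host deployment's output"
  type        = string
}

# env/dev/main.tf
resource "google_compute_shared_vpc_service_project" "service_project_1" {
  host_project    = var.host_project_id
  service_project = google_project.service_project_1.project_id
}

You could then populate it when running the dev deployment, for example with export TF_VAR_host_project_id="$(terraform -chdir=../../host output -raw project_id)" before terraform apply.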
I have an S3 bucket that is structured like this:
root/
├── year=2020/
│   └── month=01
│       ├── day=01
│       │   ├── file1.log
│       │   ├── ...
│       │   └── file8.log
│       ├── day=...
│       └── day=31
│           ├── file1.log
│           ├── ...
│           └── file8.log
└── year=2019/
    ├── ...
Each day has 8 files with identical names across days: there is a file1.log in every 'day' folder. I crawled this bucket using a custom classifier.
Expected behavior: Glue will create one single table with year, month, and day as partition fields, and several other fields that I described in my custom classifier. I then can use the table in my Job scripts.
Actual behavior:
1) Glue created one table that fulfilled my expectations. However, when I tried to access it in Job scripts, the table was devoid of columns.
2) Glue created one table for every 'day' partition, plus 8 more tables, one for each file<number>.log name.
I have tried excluding **_SUCCESS and **crc, as people suggested on this other question: AWS Glue Crawler adding tables for every partition? However, it doesn't seem to work. I have also checked the 'Create a single schema for each S3 path' option in the crawler's settings. It still doesn't work.
What am I missing?
You should have one folder at the bucket root (e.g. customers) and, inside it, the partition sub-folders. If the partition folders sit directly at the S3 bucket level, the crawler will not create a single table.
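For example (bucket and folder names made up), a layout that the crawler can pick up as a single partitioned table looks like this, with the crawler pointed at s3://my-bucket/customers/ rather than at the bucket root:

s3://my-bucket/customers/
├── year=2020/
│   └── month=01
│       └── day=01
│           ├── file1.log
│           └── ...
└── year=2019/
    └── ...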
I am developing a serverless data pipeline on AWS. Compared to the Serverless framework, Terraform has better support for services like Glue.
The good thing about Serverless is that you can pass the --stage argument when deploying, which creates an isolated stack on AWS. When developing new features for our data pipeline, I can deploy my current state of the code like this:
serverless deploy --stage my-new-feature
this allows me to do an isolated integration test on the AWS account I share with my colleagues. Is this possible using Terraform?
Have you had a look at workspaces? https://www.terraform.io/docs/state/workspaces.html
Terraform manages resources by way of state.
If a resource already exists in the state file and Terraform doesn't detect any drift between the configuration, the state, and the provider (e.g. something was changed in the AWS console or by another tool), then it will show that there are no changes. If it does detect some form of drift, then a plan will show you what changes it needs to make to push the existing state of things in the provider to what is defined in the Terraform code.
Separating state between different environments
If you want to have multiple environments or even other resources that are separate from each other and not managed by the same Terraform action (such as a plan, apply or destroy) then you want to separate these into different state files.
One way to do this is to separate your Terraform code by environment and use a state file matching the directory structure of your code base. A simple example might look something like this:
terraform/
├── production
│ ├── main.tf -> ../stacks/main.tf
│ └── terraform.tfvars
├── stacks
│ └── main.tf
└── staging
├── main.tf -> ../stacks/main.tf
└── terraform.tfvars
stacks/main.tf
variable "environment" {}
resource "aws_lambda_function" "foo" {
function_name = "foo-${var.environment}"
# ...
}
production/terraform.tfvars
environment = "production"
staging/terraform.tfvars
environment = "staging"
This uses symlinks so that staging and production are kept in line in code with the only changes being introduced by the terraform.tfvars file. In this case it changes the Lambda function's name to include the environment.
This is what I generally recommend for static environments as it's much clearer from looking at the code/directory structure which environments exist.
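Each environment directory also needs its own backend configuration so that the state files really are kept separate; a minimal sketch assuming an S3 backend (bucket name, key and region are placeholders):

# production/backend.tf (hypothetical)
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "production/terraform.tfstate"
    region = "eu-west-1"
  }
}

The staging directory would use the same block with a staging/terraform.tfstate key, so a plan or apply in one directory can never touch the other environment's state.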
Dynamic environments
However, if you have more dynamic environments, such as per feature branch, then it's not going to work to hard code the environment name directly in your terraform.tfvars file.
In this case I would recommend something like this:
terraform/
├── production
│ ├── main.tf -> ../stacks/main.tf
│ └── terraform.tfvars
├── review
│ ├── main.tf -> ../stacks/main.tf
│ └── terraform.tfvars
├── stacks
│ └── main.tf
└── staging
├── main.tf -> ../stacks/main.tf
└── terraform.tfvars
This works the same way, but I would omit the environment variable from the review structure so that it's set interactively or via CI environment variables (e.g. export TF_VAR_environment=${TRAVIS_BRANCH} when running in Travis CI; adapt this to whatever CI system you use).
Keeping state separate between review environments on different branches
This only gets you halfway there, though, because if you were just using the default workspace, another person running Terraform on a different branch would see that Terraform wants to destroy or update resources that were already created.
Workspaces provide an option for separating state in a more dynamic way and also allows you to interpolate the workspace name into Terraform code:
resource "aws_instance" "example" {
tags {
Name = "web - ${terraform.workspace}"
}
# ... other arguments
}
Instead the review environments will need to create or use a dynamic workspace that is scoped for that branch only. You can do this by running the following command:
terraform workspace new [NAME]
If the workspace already exists then you should instead use the following command:
terraform workspace select [NAME]
In CI you can use the same environment variables as before to automatically use the branch name as your workspace name.