I'm trying to understand why my data from my gcs backend is saying it does not have any outputs.
I have a module called DB which creates a postgres database.
I have a file called outputs.tf, where I have
terraform {
backend "gcs" {
bucket = "projectgun-terraform-state"
prefix = "db-workspaces"
}
}
I am using a workspace I called a1.
I run terraform apply and voilà, it works: the DB is created.
Furthermore, when I go into GCS, I can find my bucket and my key. My workspace name is a1 and the prefix is "db-workspaces", so my remote state is saved in #{my-bucket}/db-workspaces/a1.tfstate.
When I go to that key in my bucket I see a bunch of JSON that looks like this
If I go into my db module and run terraform state pull, the output looks the same. Everything checks out.
But when I go to my other module, I try to access the outputs from GCS, and I can't.
I am using module a1.
data "terraform_remote_state" "db" {
backend = "gcs"
config = {
bucket = "projectgun-terraform-state"
prefix = "db-workspaces"
}
}
When I try to access this data via outputs, I see
79: db_user = data.terraform_remote_state.db.outputs.user
│ ├────────────────
│ │ data.terraform_remote_state.db.outputs is object with no attributes
│
│ This object does not have an attribute named "user".
What am I doing wrong? Is there a better way to debug my issue? How could I be sure what key terraform is looking at when it's attempting to pull the data?
Specifically
data.terraform_remote_state.db.outputs is object with no attributes
Can I debug data.terraform_remote_state? How can I inspect what's going on here? There are very clearly outputs when I look at the remote state, so I suspect it's reading the wrong key, but I don't know where to look.
I found a github issue that summarizes the issue I was having and a solution.
https://github.com/hashicorp/terraform/issues/24935
data "terraform_remote_state" "network" {
backend = "gcs"
workspace = terraform.workspace
config = {
bucket = "tf-state"
prefix = "base-layer/network/"
}
}
This does not seem to be a documented fix. Thank you to @HebertCL for the answer!
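Applied to the configuration from the question, the fix might look like the sketch below. It assumes the DB configuration was applied in the a1 workspace and declares a root-level output named user (both taken from the question). When workspace is omitted, terraform_remote_state reads the default workspace, which would explain the empty outputs object.

```hcl
# Sketch only: bucket, prefix and workspace name are copied from the question.
data "terraform_remote_state" "db" {
  backend = "gcs"

  # Without this argument the "default" workspace is read, i.e. a different
  # state object than db-workspaces/a1.tfstate.
  workspace = "a1"

  config = {
    bucket = "projectgun-terraform-state"
    prefix = "db-workspaces"
  }
}

# Assumes the DB configuration declares `output "user"` in its root module.
output "db_user" {
  value = data.terraform_remote_state.db.outputs.user
}
```

If the consuming configuration uses the same workspace names as the DB configuration, workspace = terraform.workspace (as in the GitHub issue) avoids hard-coding the name.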
I want to separate data sources from the main code and keep them in a separate file, similar to local.tf or variables.tf. However, I can't find any reference to this in the docs.
use case
I am trying to set up access logging for an S3 bucket. The target bucket is not managed by Terraform, so I want to make sure that it exists before using it, via a data source.
resource "aws_s3_bucket" "artifact" {
bucket = "jatin-123"
}
data "aws_s3_bucket" "selected" {
bucket = "bucket.test.com"
}
resource "aws_s3_bucket_logging" "artifacts_server_access_logs" {
for_each = local.env
bucket = data.aws_s3_bucket.selected.id
target_bucket = local.s3_artifact_access_logs_bucket_name
target_prefix = "${aws_s3_bucket.artifact[each.key].id}/"
}
Yes, you can have data sources in whatever file you want.
Terraform basically does not care about the file composition and their names and just lumps all .tf files in the same directory into one big blob.
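As a rough illustration, the data source can live in its own file while a resource in another file refers to it. The file names in the comments are arbitrary, and this sketch assumes the looked-up bucket is the one that receives the logs, which may differ from what the question intends.

```hcl
# data.tf (hypothetical file name; any *.tf file in the same directory works)
data "aws_s3_bucket" "selected" {
  bucket = "bucket.test.com"
}

# main.tf
resource "aws_s3_bucket" "artifact" {
  bucket = "jatin-123"
}

resource "aws_s3_bucket_logging" "artifact_access_logs" {
  bucket = aws_s3_bucket.artifact.id

  # Reference to the data source declared in data.tf. Planning fails if the
  # bucket does not exist, which gives the "make sure it exists" check.
  target_bucket = data.aws_s3_bucket.selected.id
  target_prefix = "${aws_s3_bucket.artifact.id}/"
}
```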
Yes, of course you can. For organization purposes, you should use different files. With a simple project it's easy to read or troubleshoot your code in a single file, but once you start deploying more infrastructure it becomes a nightmare. So my advice is to start even your "small" projects by splitting the code across different files.
Here is my suggestion for you, regarding your example:
base.auto.tfvars
Here you can put variables that are used throughout the project.
E.g.: region = "us-east-1"
project = "web-appliance"
s3.auto.tfvars
Variables that you will use in your s3 bucket
s3.tf
The code for S3 creation
datasource.tf
Here you will put all the datasources that you need in your project.
provider.tf
The configuration for your provider(s). In your example, aws provider
versions.tf
The versions of your providers
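A minimal sketch of how a few of these files could fit together is shown below. The provider version constraint and the variable declarations are assumptions for illustration, not something the layout above prescribes.

```hcl
# versions.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin whatever you actually use
    }
  }
}

# provider.tf
variable "region" {
  type = string
}

variable "project" {
  type = string
}

provider "aws" {
  region = var.region
}

# base.auto.tfvars (loaded automatically because of the .auto.tfvars suffix)
# region  = "us-east-1"
# project = "web-appliance"
```

The tfvars lines are commented out here only because they belong in a separate file, not in a .tf file.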
I have two workspaces (like dev and prd) and I have to create single resource to use on all of them.
My example is to create AWS ECR repository:
resource "aws_ecr_repository" "example" {
name = "example"
}
I applied it on the prd workspace, but after switching to the dev workspace, Terraform wants to create the same repository, even though it already exists.
After some consideration I used count to create it only on prd, like this:
resource "aws_ecr_repository" "example" {
count = local.stage == "prd" ? 1 : 0
name = "example"
}
and on the prd workspace I use it like this:
aws_ecr_repository.example[0].repository_url
but the problem is how to use it on the dev workspace.
What is the better way to solve this?
Since I'm not able to add a comment (I do not have enough rep), I'm adding this as an answer.
As Jens mentioned, it is best to avoid this approach.
But you can import a remote state with something like this:
data "terraform_remote_state" "my_remote_state" {
backend = "local" # could also be a remote state like s3
config = {
key = "project-key"
}
workspace = "prd"
}
In your prd workspace you have to define an output for your repo:
output "ecr_repo_url" {
value = aws_ecr_repository.example[0].repository_url
}
In your dev workspace, you can access the value with:
data.terraform_remote_state.my_remote_state.outputs.ecr_repo_url
In some cases this may be useful, but be aware of what Jens said: if you destroy your prod environment, you can't apply or change your dev environment!
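A different technique from the remote-state approach above is to look the repository up through the AWS API with the aws_ecr_repository data source in every workspace that does not own it. The sketch below assumes the workspaces are literally named prd and dev and uses terraform.workspace instead of the question's local.stage. It needs Terraform 0.15 or later for one().

```hcl
locals {
  # Assumption: the workspace that owns the repository is named "prd".
  is_prd = terraform.workspace == "prd"
}

# Created only in the prd workspace.
resource "aws_ecr_repository" "example" {
  count = local.is_prd ? 1 : 0
  name  = "example"
}

# Looked up (not managed) in every other workspace.
data "aws_ecr_repository" "example" {
  count = local.is_prd ? 0 : 1
  name  = "example"
}

locals {
  # Exactly one of the two lists has an element, so one() returns its URL.
  ecr_repo_url = one(concat(
    aws_ecr_repository.example[*].repository_url,
    data.aws_ecr_repository.example[*].repository_url,
  ))
}

output "ecr_repo_url" {
  value = local.ecr_repo_url
}
```

The same caveat applies as with remote state: the prd workspace has to be applied first, and dev stops planning cleanly if the repository is destroyed.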
I need to upload a folder to an S3 bucket. The upload works the first time I apply, but I have two problems:
The uploaded version outputs as null. I would expect some version_id like 1, 2, 3.
When I run terraform apply again, it says Apply complete! Resources: 0 added, 0 changed, 0 destroyed. I would expect it to upload every time I run terraform apply and create a new version.
What am I doing wrong? Here is my Terraform config:
resource "aws_s3_bucket" "my_bucket" {
bucket = "my_bucket_name"
versioning {
enabled = true
}
}
resource "aws_s3_bucket_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = "my_files.zip"
}
output "my_bucket_file_version" {
value = "${aws_s3_bucket_object.file_upload.version_id}"
}
Terraform only makes changes to the remote objects when it detects a difference between the configuration and the remote object attributes. In the configuration as you've written it so far, the configuration includes only the filename. It includes nothing about the content of the file, so Terraform can't react to the file changing.
To make subsequent changes, there are a few options:
You could use a different local filename for each new version.
You could use a different remote object path for each new version.
You can use the object etag to let Terraform recognize when the content has changed, regardless of the local filename or object path.
The last of these seems closest to what you want in this case. To do that, add the etag argument and set it to be an MD5 hash of the file:
resource "aws_s3_bucket_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = "${path.module}/my_files.zip"
etag = "${filemd5("${path.module}/my_files.zip")}"
}
With that extra argument in place, Terraform will detect when the MD5 hash of the file on disk is different than that stored remotely in S3 and will plan to update the object accordingly.
(I'm not sure what's going on with version_id. It should work as long as versioning is enabled on the bucket.)
The preferred solution is now to use the source_hash property. Note that aws_s3_bucket_object has been replaced by aws_s3_object.
locals {
object_source = "${path.module}/my_files.zip"
}
resource "aws_s3_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = local.object_source
source_hash = filemd5(local.object_source)
}
Note that etag can have issues when encryption is used.
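Since the question mentions uploading a folder, the same idea can be extended to one object per file with fileset and for_each. This is a sketch under assumptions: the directory name my_files is hypothetical, the bucket resource is the one from the question, and AWS provider v4 or later is assumed (aws_s3_object, aws_s3_bucket_versioning).

```hcl
locals {
  upload_dir = "${path.module}/my_files" # hypothetical folder name
}

# Provider v4+ configures versioning in a separate resource.
resource "aws_s3_bucket_versioning" "my_bucket" {
  bucket = aws_s3_bucket.my_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_object" "folder" {
  # One object per file found under the folder, recursively.
  for_each = fileset(local.upload_dir, "**")

  # Referencing the bucket resource (instead of a literal name) also makes
  # Terraform create the bucket before the objects.
  bucket      = aws_s3_bucket.my_bucket.id
  key         = each.value
  source      = "${local.upload_dir}/${each.value}"
  source_hash = filemd5("${local.upload_dir}/${each.value}")

  # Upload only after versioning is on, so a version ID is assigned.
  depends_on = [aws_s3_bucket_versioning.my_bucket]
}

output "file_versions" {
  value = { for key, obj in aws_s3_object.folder : key => obj.version_id }
}
```

Note that S3 version IDs are opaque strings rather than counters like 1, 2, 3, and a new version is only created when a file's content hash changes.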
You shouldn't be using Terraform to do this. Terraform is meant to orchestrate and provision your infrastructure and its configuration, not files. As a result, Terraform is not aware of changes to your files: unless you change their names, it will not update the state.
It is also better to use local-exec for this. Something like:
resource "aws_s3_bucket" "my-bucket" {
# ...
provisioner "local-exec" {
# A provisioner must refer to its own resource via self, not by name
command = "aws s3 cp path_to_my_file s3://${self.id}/"
}
}
I have used Terragrunt to orchestrate the creation of a non-default AWS VPC.
I've got S3/DynamoDB state mgmt, and the VPC code is a module. I have the 'VPC environment' terraform.tfvars code checked into a second repo as per the terragrunt README.md.
I created a second module which will eventually create hosts in this VPC but for now just aims to output its ID. I have created a separate 'hosts environment' / terraform.tfvars for the instantiation of this module.
(1) I run terragrunt apply in the VPC environment directory - VPC created
(2) I run terragrunt apply a second time in the hosts environment directory - output directive doesn't work (no error, but incorrect, see below).
This is a precursor to one day running a terragrunt apply-all in the parent directory of the VPC/hosts environment directories; my reading of the docs suggest using a terraform_remote_state data source to expose the VPC ID, so I specified access like this in the data.tf file of the hosts module:
data "terraform_remote_state" "vpc" {
backend = "s3"
config {
bucket = "myBucket"
key = "keyToMy/vpcEnvironment.tfstate"
region = "stateRegion"
}
}
Then, in the hosts module outputs.tf, I specified an output to check assignment:
output "mon_vpc" {
value = "${data.terraform_remote_state.vpc.id}"
}
When I run (2) above, it exits with:
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
mon_vpc = 2018-06-02 23:14:42.958848954 +0000 UTC
Questions:
I've gone wrong somewhere in setting up the code so that the hosts environment acquires the VPC ID from the already-existing VPC (Terraform state file). Any advice on what to change here would be appreciated.
It looks like I've managed to acquire the date the VPC was created rather than its ID, which given the code is perplexing. Does anyone know why?
I'm not using community modules - all hand rolled.
EDIT: In response to Brandon Miller, here is a bit more. In my VPC module, I have an outputs.tf containing among other outputs:
output "aws_vpc.mv.id-op" {
value = "${aws_vpc.mv.id}"
}
and the vpc.tf contains
resource "aws_vpc" "mv" {
cidr_block = "${var.vpcCidr}"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "mv-vpc-${var.aws_region}"
}
}
As this cfg results in a vpc being created, and as most of the parameters are <computed>, I assumed state would contain sufficient data for other modules to refer to by consulting state (I assumed at first that terraform used the AWS API for this under the bonnet, rather than consulting a different state key).
EDIT 2: Read all of @brendan-miller's answer and following comments first.
Use of periods causes a problem as it confuses terraform (see Brendan's answer for the specification format below):
Error: output 'mon_vpc': unknown resource 'data.aws_vpc.mv-ds' referenced in variable data.aws_vpc.mv-ds.vpc.id
You named your output aws_vpc.mv.id-op but when you retrieve it you are retrieving just id. You could try
data.terraform_remote_state.vpc.aws_vpc.mv.id
but I'm not sure whether Terraform will complain about the additional periods. However, the format should always be
data.terraform_remote_state.<name of the remote state module>.<name of the output>
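A sketch of that format with a period-free output name follows. The name vpc_id is hypothetical, and the syntax is for Terraform 0.12 or later, where config takes an equals sign and outputs are read through the outputs attribute. The question uses the older 0.11 syntax, where the output is referenced directly as in the format above. The timestamp in the question is most likely the data source's own id attribute rather than anything belonging to the VPC.

```hcl
# In the VPC configuration's outputs.tf: no periods in the output name.
output "vpc_id" {
  value = aws_vpc.mv.id
}

# In the hosts configuration (bucket, key and region are the placeholders
# from the question).
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config = {
    bucket = "myBucket"
    key    = "keyToMy/vpcEnvironment.tfstate"
    region = "stateRegion"
  }
}

output "mon_vpc" {
  # 0.12+ form; in 0.11 this would be data.terraform_remote_state.vpc.vpc_id
  value = data.terraform_remote_state.vpc.outputs.vpc_id
}
```

Only outputs declared in the root module of the VPC configuration are visible this way.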
You mentioned wanting to be able to get this info with the AWS API. That is also possible by using the aws_vpc data source. Their example uses id, but you can also use any tag you used on your vpc.
Like this:
data "aws_vpc" "default" {
filter {
name = "tag:Name"
values = ["example-vpc-name"]
}
}
Then you can use this for the id
${data.aws_vpc.default.id}
In addition this retrieves all tags set, for example:
${data.aws_vpc.default.tags.Name}
And the cidr block
${data.aws_vpc.default.cidr_block}
As well as some other info. This can be very useful for storing and retrieving things about your VPC.
I tried creating a datasource using boto for machine learning but ended up with an error.
Here's my code :
import boto
bucketname = 'mybucket'
filename = 'myfile.csv'
schema = 'myfile.csv.schema'
conn = boto.connect_s3()
datasource = 'my_datasource'
ml = boto.connect_machinelearning()
#create a data source
ds = ml.create_data_source_from_s3(
data_source_id = datasource,
data_spec ={
'DataLocationS3':'s3://'+bucketname+'/'+filename,
'DataSchemaLocationS3':'s3://'+bucketname+'/'+schema},
data_source_name=None,
compute_statistics = True)
print(ml.get_data_source(datasource, verbose=None))
I get this error as a result of get_data_source call:
Could not access 's3://mybucket/myfile.csv'. Either there is no file at that location, or the file is empty, or you have not granted us read permission.
I have checked and I have FULL_CONTROL as my permissions. The bucket, file and schema all are present and are non-empty.
How do I solve this?
You may have FULL_CONTROL over that S3 resource but in order for this to work you have to grant the Machine Learning service the appropriate access to that S3 resource.
I know links to answers are frowned upon, but in this case I think it's best to link to the definitive documentation from the Machine Learning service, since the actual steps are complicated and could change in the future.