Terraform - Upload file to S3 on every apply

I need to upload a folder to an S3 bucket. When I apply for the first time, it uploads just fine, but I have two problems here:
The uploaded version outputs as null. I would expect some version_id like 1, 2, 3.
When running terraform apply again, it says Apply complete! Resources: 0 added, 0 changed, 0 destroyed. I would expect it to upload every time I run terraform apply and create a new version.
What am I doing wrong? Here is my Terraform config:
resource "aws_s3_bucket" "my_bucket" {
bucket = "my_bucket_name"
versioning {
enabled = true
}
}
resource "aws_s3_bucket_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = "my_files.zip"
}
output "my_bucket_file_version" {
value = "${aws_s3_bucket_object.file_upload.version_id}"
}

Terraform only makes changes to the remote objects when it detects a difference between the configuration and the remote object attributes. As you've written the configuration so far, it includes only the filename; it says nothing about the content of the file, so Terraform can't react to the file changing.
To make subsequent changes, there are a few options:
You could use a different local filename for each new version.
You could use a different remote object path for each new version.
You could use the object etag to let Terraform recognize when the content has changed, regardless of the local filename or object path.
The last of these seems closest to what you want in this case. To do that, add the etag argument and set it to an MD5 hash of the file:
resource "aws_s3_bucket_object" "file_upload" {
bucket = "my_bucket"
key = "my_bucket_key"
source = "${path.module}/my_files.zip"
etag = "${filemd5("${path.module}/my_files.zip")}"
}
With that extra argument in place, Terraform will detect when the MD5 hash of the file on disk is different than that stored remotely in S3 and will plan to update the object accordingly.
(I'm not sure what's going on with version_id. It should work as long as versioning is enabled on the bucket.)

The preferred solution is now to use the source_hash property. Note that aws_s3_bucket_object has been replaced by aws_s3_object.
locals {
  object_source = "${path.module}/my_files.zip"
}

resource "aws_s3_object" "file_upload" {
  bucket      = "my_bucket"
  key         = "my_bucket_key"
  source      = local.object_source
  source_hash = filemd5(local.object_source)
}
Note that etag can have issues when encryption is used.
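Regarding the null version_id: with AWS provider v4+, the inline versioning block on aws_s3_bucket is deprecated in favour of a separate aws_s3_bucket_versioning resource. A minimal sketch of the whole setup, reusing the names from the question and assuming provider v4+:
resource "aws_s3_bucket" "my_bucket" {
  bucket = "my_bucket_name"
}

# In AWS provider v4+, versioning is configured with its own resource
# instead of a versioning block inside aws_s3_bucket.
resource "aws_s3_bucket_versioning" "my_bucket" {
  bucket = aws_s3_bucket.my_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_object" "file_upload" {
  bucket      = aws_s3_bucket.my_bucket.id
  key         = "my_bucket_key"
  source      = "${path.module}/my_files.zip"
  source_hash = filemd5("${path.module}/my_files.zip")

  # Make sure versioning is already enabled when the object is first written,
  # otherwise the initial version_id will be null.
  depends_on = [aws_s3_bucket_versioning.my_bucket]
}

output "my_bucket_file_version" {
  value = aws_s3_object.file_upload.version_id
}
Also note that S3 version IDs are opaque strings, not incrementing numbers like 1, 2, 3.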

You shouldn't be using Terraform to do this. Terraform is meant to orchestrate and provision your infrastructure and its configuration, not to manage files. That said, Terraform is not aware of changes to your files; unless you change their names, Terraform will not update the state.
If you still want to do it from Terraform, it is better to use a local-exec provisioner. Something like:
resource "aws_s3_bucket" "my-bucket" {
# ...
provisioner "local-exec" {
command = "aws s3 cp path_to_my_file ${aws_s3_bucket.my-bucket.id}"
}
}
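Note that a provisioner attached to the bucket only runs when the bucket itself is created. As a minimal sketch of a workaround (file path and key are assumptions, reusing the my-bucket resource above), you can hang the local-exec off a null_resource whose trigger is the file's hash, so the copy re-runs whenever the file changes:
resource "null_resource" "upload" {
  triggers = {
    # Re-run the provisioner whenever the file content changes.
    file_hash = filemd5("${path.module}/my_files.zip")
  }

  provisioner "local-exec" {
    command = "aws s3 cp ${path.module}/my_files.zip s3://${aws_s3_bucket.my-bucket.id}/my_files.zip"
  }
}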

Related

GCP: Output CSV file using local-exec in terraform

I am trying to create a reporting output file that lists all the buckets of different GCP projects.
The challenge is that it neither prints the output to stdout nor creates the file.
I can confirm that one bucket exists in the project for testing purposes.
Here is the code:
main.tf
data "google_client_config" "default" {}
resource "null_resource" "list_all_buckets" {
triggers = {
filename = "${path.module}/storage_output.csv"
}
provisioner "local-exec" {
command ="bash list_buckets.sh ${data.google_client_config.default.access_token} '*<gcp project name in single quote>*' >> storage_ouput.csv"
working_dir = "${path.module}"
}
}
data "local_file" "test" {
filename = "${null_resource.list_all_buckets.triggers.filename}"
}
output "result" {
value = "${data.local_file.test.content}"
}
list_buckets.sh
#!/bin/bash
readonly TOKEN="$1"
readonly PROJECT="$2"
readonly URL="https://storage.googleapis.com/storage/v1/b?project=${2}"
LIST_BUCKETS="$(curl -X GET -H "Authorization: Bearer "${TOKEN} ${URL})"
Please feel free to ask any questions.
I am expecting a storage_ouput.csv file to be created in the existing module directory.
Output Truncated
You can apply this plan to save these new output values to the Terraform
state, without changing any real infrastructure.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
result = ""
Provisioners in Terraform only run at creation or destruction time (creation by default), which means your script is only called the first time you run the plan. That run should create a storage_ouput.csv file (note the typo, compared to the path in your triggers, storage_output.csv).
For testing purposes only I recreated the following:
resource "null_resource" "list_all_buckets" {
triggers = {
filename = "${path.module}/storage_output.csv"
}
provisioner "local-exec" {
command = "date >> ${path.module}/storage_output.csv"
working_dir = path.module
}
}
data "local_file" "test" {
filename = null_resource.list_all_buckets.triggers.filename
}
output "result" {
value = data.local_file.test.content
}
which runs the date command and appends its output to storage_output.csv. Running this for the first time creates the CSV file in the current directory, but running it again does nothing, since the null_resource has already been created.
To force recreation I would need to taint it, or run:
terraform apply -replace="null_resource.list_all_buckets"
which forces recreation of the null_resource and therefore re-executes the script in your provisioner.
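If you want the provisioner to run on every apply, a common workaround is a trigger that changes on every run. A minimal sketch:
resource "null_resource" "list_all_buckets" {
  triggers = {
    # timestamp() changes on every run, so Terraform replaces the resource
    # and re-runs the provisioner on each apply.
    always_run = timestamp()
  }

  provisioner "local-exec" {
    command     = "date >> ${path.module}/storage_output.csv"
    working_dir = path.module
  }
}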
As HashiCorp mentions in their documentation:
Use provisioners as a last resort. There are better alternatives for most situations.
You may want to have a look at Google Storage's data sources like:
google_storage_bucket
Or, if you created these buckets with Terraform, you can list your resources with terraform show or terraform state list. And if they were created in different workspaces, you could pull them using terraform_remote_state.
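For example, a minimal sketch of reading a single existing bucket through the data source (the bucket name here is an assumption):
data "google_storage_bucket" "existing" {
  name = "my-existing-bucket" # hypothetical bucket name
}

output "bucket_self_link" {
  value = data.google_storage_bucket.existing.self_link
}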

Terraform: can we create a "data source" in a separate file, like locals or variables?

I want to separate the data sources from the main code and put them in a separate file, similar to locals.tf or variables.tf; however, even the docs have no reference for this.
Use case:
I am trying to set up access logging for an S3 bucket. The target bucket is not managed by Terraform, so I want to make sure it exists before using it, via a data source.
resource "aws_s3_bucket" "artifact" {
bucket = "jatin-123"
}
data "aws_s3_bucket" "selected" {
bucket = "bucket.test.com"
}
resource "aws_s3_bucket_logging" "artifacts_server_access_logs" {
for_each = local.env
bucket = data.aws_s3_bucket.selected.id
target_bucket = local.s3_artifact_access_logs_bucket_name
target_prefix = "${aws_s3_bucket.artifact[each.key].id}/"
}
Yes, you can have data sources in whatever file you want.
Terraform basically does not care about file composition or file names; it just lumps all .tf files in the same directory together into one big blob.
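For instance, a minimal sketch: the data source from your example moved into its own data.tf in the same directory, with no other changes required:
# data.tf
data "aws_s3_bucket" "selected" {
  bucket = "bucket.test.com"
}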
Yes, of course you can. For organization purposes, you SHOULD use different files. When you have a simple project it's easy to check your code or even troubleshoot within a single file, but once you start to deploy more infrastructure it will become a nightmare. So my advice is to start even your "small" projects by splitting them across different files.
Here is my suggestion for you, regarding your example:
base.auto.tfvars
Here you can put variables that will be used across the whole project, e.g.:
region = "us-east-1"
project = "web-appliance"
s3.auto.tfvars
Variables that you will use for your S3 bucket
s3.tf
The code for the S3 bucket creation
datasource.tf
Here you will put all the data sources that you need in your project
provider.tf
The configuration for your provider(s); in your example, the aws provider
versions.tf
The required versions of Terraform and your providers
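As a minimal sketch, a versions.tf for this example might look like the following (the version constraints are assumptions):
terraform {
  required_version = ">= 1.3.0" # assumed minimum Terraform version

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed provider version
    }
  }
}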

Is it possible to update the source code of a GCP Cloud Function in Terraform?

I use Terraform to manage the resources of a Google Cloud Function. While the initial deployment of the cloud function worked, further deployments with changed source code (the source archive sourcecode.zip) were not redeployed when I ran terraform apply after updating the source archive.
The storage bucket object gets updated, but this does not trigger an update/redeployment of the cloud function resource.
Is this an error of the provider?
Is there a way to redeploy a function in terraform when the code changes?
The simplified source code I am using:
resource "google_storage_bucket" "cloud_function_source_bucket" {
name = "${local.project}-function-bucket"
location = local.region
uniform_bucket_level_access = true
}
resource "google_storage_bucket_object" "function_source_archive" {
name = "sourcecode.zip"
bucket = google_storage_bucket.cloud_function_source_bucket.name
source = "./../../../sourcecode.zip"
}
resource "google_cloudfunctions_function" "test_function" {
name = "test_func"
runtime = "python39"
region = local.region
project = local.project
available_memory_mb = 256
source_archive_bucket = google_storage_bucket.cloud_function_source_bucket.name
source_archive_object = google_storage_bucket_object.function_source_archive.name
trigger_http = true
entry_point = "trigger_endpoint"
service_account_email = google_service_account.function_service_account.email
vpc_connector = "projects/${local.project}/locations/${local.region}/connectors/serverless-main"
vpc_connector_egress_settings = "ALL_TRAFFIC"
ingress_settings = "ALLOW_ALL"
}
You can append an MD5 or SHA256 checksum of the zip's content to the bucket object's name. That will trigger recreation of the cloud function whenever the source code changes:
${data.archive_file.function_src.output_md5}
data "archive_file" "function_src" {
type = "zip"
source_dir = "SOURCECODE_PATH/sourcecode"
output_path = "./SAVING/PATH/sourcecode.zip"
}
resource "google_storage_bucket_object" "function_source_archive" {
name = "sourcecode.${data.archive_file.function_src.output_md5}.zip"
bucket = google_storage_bucket.cloud_function_source_bucket.name
source = data.archive_file.function_src.output_path
}
You can read more about terraform archive here - terraform archive_file
You might consider that as a defect. Personally, I am not so sure about it.
Terraform follows a certain logic when an apply command is executed.
The question to think about is: how would Terraform know that the source code of the cloud function has changed and that the function needs to be redeployed? Terraform does not "read" the cloud function's source code or compare it with the previous version. It only reads the Terraform configuration files, and if nothing has changed in those files (compared to the state file and the resources existing in the GCP project), there is nothing to redeploy.
Therefore, something has to change, for example the name of the archive file. In that case Terraform sees that the cloud function has to be redeployed (because the state file has the old name of the archive object), and the cloud function is redeployed.
An example of that approach, with a more detailed explanation, was provided some time ago; don't worry about whether the question there works, just read the answer.

Using a GitHub Release .zip file for a Lambda Function

I am trying to use Terraform to spin up a Lambda function that uses source code from a GitHub release package. The location of the package is:
https://github.com/DataDog/datadog-serverless-functions/releases
This will allow me to manually create the AWS Datadog Forwarder without using their CloudFormation template (we want to control as much of the process as possible).
I'm not entirely sure how to pull down that zip file for the Lambda function to use:
resource "aws_lambda_function" "test_lambda" {
filename = "lambda_function_payload.zip"
function_name = "datadog-forwarder"
role = aws_iam_role.datadog_forwarder_role.arn
source_code_hash = filebase64sha256("lambda_function_payload.zip")
runtime = "python3.7"
environment {
variables = {
DD_API_KEY_SECRET_ARN = aws_secretsmanager_secret_version.dd_api_key.arn
#This stops the Forwarder from generating enhanced metrics itself, but it will still forward custom metrics from other lambdas.
DD_ENHANCED_METRICS = false
DD_S3_BUCKET_NAME = aws_s3_bucket.datadog_forwarder.name
}
}
}
I know that the source_code_hash file name will change and the filename of the lambda function will change as well. Any help would be appreciated.
There is no built-in functionality in Terraform to download files from the internet, but you could do it relatively easily with an external data source. For that you would create a bash script that uses curl to download the zip and then opens it up, inspects it, or does any other processing you need. The data source would also return values that you can use for the creation of your function.
An alternative is to use a null_resource with a local-exec provisioner to curl the zip file, but local-exec is less versatile than the external data source.
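A minimal sketch of the null_resource variant (the release version, asset URL pattern, and handler name below are assumptions; check the actual asset name on the releases page):
resource "null_resource" "download_forwarder" {
  triggers = {
    # Bump this to re-download when you want a newer release (assumed version).
    forwarder_version = "3.50.0"
  }

  provisioner "local-exec" {
    # Assumed asset URL pattern; verify it against the GitHub release assets.
    command = "curl -sSL -o ${path.module}/lambda_function_payload.zip https://github.com/DataDog/datadog-serverless-functions/releases/download/aws-dd-forwarder-${self.triggers.forwarder_version}/aws-dd-forwarder-${self.triggers.forwarder_version}.zip"
  }
}

resource "aws_lambda_function" "test_lambda" {
  filename      = "${path.module}/lambda_function_payload.zip"
  function_name = "datadog-forwarder"
  role          = aws_iam_role.datadog_forwarder_role.arn
  handler       = "datadog_lambda.handler.handler" # assumed handler name
  runtime       = "python3.7"

  depends_on = [null_resource.download_forwarder]
}
Note that without a source_code_hash, Terraform will not detect changes to the zip's content on later applies; you can add it once the file exists locally.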
There is a way to specify a zip file for an AWS Lambda. Check out the example configuration in https://github.com/hashicorp/terraform-provider-aws/blob/main/examples/lambda.
It uses a data source of type archive_file
data "archive_file" "zip" {
type = "zip"
source_file = "hello_lambda.py"
output_path = "hello_lambda.zip"
}
to set the filename and source_code_hash for the aws_lambda_function resource:
resource "aws_lambda_function" "lambda" {
function_name = "hello_lambda"
filename = data.archive_file.zip.output_path
source_code_hash = data.archive_file.zip.output_base64sha256
.....
}
See the example files for complete details.
The Terraform AWS provider is calling the CreateFunction API ( https://docs.aws.amazon.com/lambda/latest/dg/API_CreateFunction.html), which allows you to specify a zip file.

Can depends_on in terraform be set to a file path?

I am trying to break down my main.tf file. I have set up AWS Config via Terraform, created the configuration recorder, and set the delivery channel to an S3 bucket created in the same main.tf file. Now, for the AWS Config rules, I have created a separate file, config-rule.tf. As is known, every aws_config_config_rule we create has a depends_on clause in which we reference the dependent resource, which in this case is aws_config_configuration_recorder. So my question is: can I write the depends_on clause as something like:
resource "aws_config_config_rule" "s3_bucket_server_side_encryption_enabled" {
name = "s3_bucket_server_side_encryption_enabled"
source {
owner = "AWS"
source_identifier = "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
}
depends_on = ["${file("aws-config-setup.tf")}"]
}
considering that I move my AWS Config setup from my main.tf file to a new file called aws-config-setup.tf?
If I'm reading your question correctly, you shouldn't need to make any changes for this to work, assuming you didn't move the code into its own module (a separate directory).
When Terraform executes in a particular directory, it takes all files into account, basically treating them all as one Terraform file.
So, in general, if you had a main.tf that looks like the following
resource "some_resource" "resource_1" {
# ...
}
resource "some_resource" "resource_2" {
# ...
depends_on = [some_resource.resource_1]
}
and you decided to split these out into the following files
file1.tf
resource "some_resource" "resource_1" {
  # ...
}
file2.tf
resource "some_resource" "resource_2" {
  # ...
  depends_on = [some_resource.resource_1]
}
then, if Terraform is run in that directory, it will evaluate the multi-file scenario exactly the same as the single main.tf scenario.
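Applied to your example, a minimal sketch of config-rule.tf (the recorder's resource name is an assumption) would reference the recorder resource directly rather than the file it lives in:
# config-rule.tf
resource "aws_config_config_rule" "s3_bucket_server_side_encryption_enabled" {
  name = "s3_bucket_server_side_encryption_enabled"

  source {
    owner             = "AWS"
    source_identifier = "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
  }

  # Reference the resource defined in aws-config-setup.tf, not the file itself.
  depends_on = [aws_config_configuration_recorder.recorder]
}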