I have multiple files under some root directory, let’s call it module/data/.
I need to upload this directory to the corresponding S3 bucket. All this works as expected with:
resource "aws_s3_bucket_object" "k8s-state" {
for_each = fileset("${path.module}/data", "**/*")
bucket = aws_s3_bucket.kops.bucket
key = each.value
source = "${path.module}/data/${each.value}"
etag = filemd5("${path.module}/data/${each.value}")
}
The only thing left is that I need to loop over all files recursively and replace markers (for example !S3!) with values from the Terraform module's variables.
Similar to this, but across all files in directories/subdirectories:
replace(file("${path.module}/launchconfigs/file"), "#S3", aws_s3_bucket.kops.bucket)
So the question in one sentence: how to loop over files and replace parts of them with variables from terraform?
An option could be using templates; the code would look like:
provider "aws" {
region = "us-west-1"
}
resource "aws_s3_bucket" "sample_bucket2222" {
bucket = "my-tf-test-bucket2222"
acl = "private"
}
resource "aws_s3_bucket_object" "k8s-state" {
for_each = fileset("${path.module}/data", "**/*")
bucket = aws_s3_bucket.sample_bucket2222.bucket
key = each.value
content = data.template_file.data[each.value].rendered
etag = filemd5("${path.module}/data/${each.value}")
}
data "template_file" "data" {
for_each = fileset("${path.module}/data", "**/*")
template = "${file("${path.module}/data/${each.value}")}"
vars = {
bucket_id = aws_s3_bucket.sample_bucket2222.id
bucket_arn = aws_s3_bucket.sample_bucket2222.arn
}
}
Instead of source, you can see I'm using content to consume the template_file; that is the only difference between that resource and yours.
In your files, the variables can be consumed like:
Hello ${bucket_id}
I have all my test code here:
https://github.com/heldersepu/hs-scripts/tree/master/TerraForm/regional
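On Terraform 0.12 and later, the built-in templatefile function can render the files directly, avoiding the now-deprecated template_file data source. A sketch assuming the same data/ layout and bucket names as above:

```hcl
resource "aws_s3_bucket_object" "k8s-state" {
  for_each = fileset("${path.module}/data", "**/*")

  bucket = aws_s3_bucket.sample_bucket2222.bucket
  key    = each.value

  # Render each file, substituting ${bucket_id} and ${bucket_arn}
  # markers with the real bucket attributes.
  content = templatefile("${path.module}/data/${each.value}", {
    bucket_id  = aws_s3_bucket.sample_bucket2222.id
    bucket_arn = aws_s3_bucket.sample_bucket2222.arn
  })
}
```

Since the object is created from rendered content rather than a file on disk, the filemd5-based etag is dropped here; Terraform detects changes through the content attribute itself.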
Related
I am very new to terraform. My requirement is to upload objects to existing s3 buckets. I want to upload one or more objects from my source to one or more buckets utilizing only one resource. Using count and count.index I can create different numbers of resources. However, doing so will prevent me from using fileset which helps to recursively upload all the contents in the folder.
The basic code looks like this. This is for multiple file uploads to a single bucket, but I would like to modify it for multiple uploads to different buckets:
variable "source_file_path"{
type = list(string)
description = "Path from where objects are to be uploaded"
}
variable "bucket_name"{
type = list(string)
description = "Name or ARN of the bucket to put the file in"
}
variable "data_folder"{
type = list(string)
description = "Object path inside the bucket"
}
resource "aws_s3_bucket_object" "upload_object"{
for_each = fileset(var.source_file_path, "*")
bucket = var.bucket_name
key = "${var.data_folder}${each.value}"
source = "${var.source_file_path}${each.value}"
}
I have created a vars.tfvars file with following values;
source_file_path = ["source1","source2"]
bucket_name = ["bucket1","bucket2"]
data_folder = ["path1","path2"]
So, what I need is, terraform to be able to upload all the files from the source1 to bucket1 s3 bucket by creating path1 inside the bucket. And similarly for source2, bucket2, and path2.
Is this something that can be done in terraform?
From your problem description it sounds like a more intuitive data structure to describe what you want to create would be a map of objects where the keys are bucket names and the values describe the settings for that bucket:
variable "buckets" {
type = map(object({
source_file_path = string
key_prefix = string
}))
}
When defining the buckets in your .tfvars file this will now appear as a single definition with a complex type:
buckets = {
bucket1 = {
source_file_path = "source1"
key_prefix = "path1"
}
bucket2 = {
source_file_path = "source2"
key_prefix = "path2"
}
}
This data structure has one element for each bucket, so it is suitable to use directly as the for_each for a resource describing the buckets:
resource "aws_s3_bucket" "example" {
for_each = var.buckets
bucket = each.key
# ...
}
There is a pre-existing official module hashicorp/dir/template which already encapsulates the work of finding files under a directory prefix, assigning each one a Content-Type based on its filename suffix, and optionally rendering templates. (You can ignore the template feature if you don't need it, by making your directory only contain static files.)
We need one instance of that module per bucket, because each bucket will have its own directory and thus its own set of files, and so we can use for_each chaining to tell Terraform that each instance of this module is related to one bucket:
module "bucket_files" {
for_each = aws_s3_bucket.example

source = "hashicorp/dir/template"
base_dir = var.buckets[each.key].source_file_path
}
The module documentation shows how to map the result of the module to S3 bucket objects, but that example is for only a single instance of the module. In your case we need an extra step to turn this into a single collection of files across all buckets, which we can do using flatten:
locals {
bucket_files_flat = flatten([
for bucket_name, files_module in module.bucket_files : [
for file_key, file in files_module.files : {
bucket_name = bucket_name
local_key = file_key
remote_key = "${var.buckets[bucket_name].key_prefix}${file_key}"
source_path = file.source_path
content = file.content
content_type = file.content_type
etag = file.digests.md5
}
]
])
}
resource "aws_s3_bucket_object" "example" {
for_each = {
for bf in local.bucket_files_flat :
"s3://${bf.bucket_name}/${bf.remote_key}" => bf
}
# Now the rest of this is basically the same as
# the hashicorp/dir/template S3 example, but using
# the local.bucket_files_flat structure instead
# of the module result directly.
bucket = each.value.bucket_name
key = each.value.remote_key
content_type = each.value.content_type
# The template_files module guarantees that only one of these two attributes
# will be set for each file, depending on whether it is an in-memory template
# rendering result or a static file on disk.
source = each.value.source_path
content = each.value.content
# Unless the bucket has encryption enabled, the ETag of each object is an
# MD5 hash of that object.
etag = each.value.etag
}
Terraform needs a unique tracking key for each instance of aws_s3_bucket_object.example, and so I just arbitrarily decided to use the s3:// URI convention here, since I expect that's familiar to folks accustomed to working with S3. This means that the resource block will declare instances with addresses like this:
aws_s3_bucket_object.example["s3://bucket1/path1example.txt"]
aws_s3_bucket_object.example["s3://bucket2/path2other_example.txt"]
Because these objects are uniquely identified by their final location in S3, Terraform will understand changes to the files as updates in-place, but any changes to the location as removing an existing object and adding a new one at the same time.
(I replicated the fact that your example just concatenated the path prefix with the filename without any intermediate separator, and so that's why it appears as path1example.txt above and not path1/example.txt. If you want the slash in there, you can add it to the expression which defined remote_key inside local.bucket_files_flat.)
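For example, with the separator added, the expression inside local.bucket_files_flat would become (bucket_name being the variable bound by the outer for expression):

```hcl
remote_key = "${var.buckets[bucket_name].key_prefix}/${file_key}"
```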
I have built the following terraform code:
data "archive_file" "lambda_dependencies_bundle" {
depends_on = [
null_resource.lambda_dependencies
]
output_path = "${local.function_build_folder_path}/build/${local.function_s3_object_key}.zip"
excludes = ["${local.function_build_folder_path}/build/*"]
source_dir = local.function_build_folder_path
type = "zip"
}
resource "aws_s3_bucket" "lambda_dependencies_bucket" {
bucket = local.function_s3_bucket
acl = "private"
}
resource "aws_s3_bucket_object" "lambda_dependencies_upload" {
bucket = aws_s3_bucket.lambda_dependencies_bucket.id
key = "${local.function_s3_object_key}.zip"
source = data.archive_file.lambda_dependencies_bundle.output_path
}
The null_resource.lambda_dependencies is triggered by a file change and just builds all of my code to local.function_build_folder_path.
Every time the null_resource changes, the archive_file.lambda_dependencies_bundle rebuilds (correct behavior!).
But contrary to expectations, the aws_s3_bucket_object.lambda_dependencies_upload is not triggered by the rebuild of the archive_file.
How can I achieve a reupload of my archive_file on a rebuild?
I would add etag, which triggers updates when the value changes:
resource "aws_s3_bucket_object" "lambda_dependencies_upload" {
bucket = aws_s3_bucket.lambda_dependencies_bucket.id
key = "${local.function_s3_object_key}.zip"
source = data.archive_file.lambda_dependencies_bundle.output_path
etag = data.archive_file.lambda_dependencies_bundle.output_md5
}
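One caveat: if the bucket uses SSE-KMS encryption, the stored ETag is no longer a plain MD5 of the object, so the etag comparison may not detect changes reliably. A sketch of an alternative, assuming AWS provider v4 or later, where the resource is renamed aws_s3_object and offers a source_hash argument for exactly this case:

```hcl
resource "aws_s3_object" "lambda_dependencies_upload" {
  bucket = aws_s3_bucket.lambda_dependencies_bucket.id
  key    = "${local.function_s3_object_key}.zip"
  source = data.archive_file.lambda_dependencies_bundle.output_path

  # Unlike etag, source_hash works regardless of the bucket's
  # encryption settings; any change to the archive changes the hash.
  source_hash = data.archive_file.lambda_dependencies_bundle.output_base64sha256
}
```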
I have this main.tf file:
provider "google" {
project = var.projNumber
region = var.regName
zone = var.zoneName
}
resource "google_storage_bucket" "bucket_for_python_application" {
name = "python_bucket_exam"
location = var.regName
force_destroy = true
}
resource "google_storage_bucket_object" "file-hello-py" {
name = "src/hello.py"
source = "app-files/src/hello.py"
bucket = "python_bucket_exam"
}
resource "google_storage_bucket_object" "file-main-py" {
name = "main.py"
source = "app-files/main.py"
bucket = "python_bucket_exam"
}
When executed the first time it worked fine, but after terraform destroy followed by terraform plan -> terraform apply again, I noticed that Terraform tries to create the objects before actually creating the bucket.
Of course it can't create an object inside something that doesn't exist. Why is that?
You have to create a dependency between your objects and your bucket (see code below). Otherwise, Terraform won't know that it has to create the bucket first, and then the objects. This is related to how Terraform stores resources in a directed graph.
resource "google_storage_bucket_object" "file-hello-py" {
name = "src/hello.py"
source = "app-files/src/hello.py"
bucket = google_storage_bucket.bucket_for_python_application.name
}
resource "google_storage_bucket_object" "file-main-py" {
name = "main.py"
source = "app-files/main.py"
bucket = google_storage_bucket.bucket_for_python_application.name
}
By doing this, you declare an implicit order: bucket, then objects. This is equivalent to using depends_on in your google_storage_bucket_objects, but in that particular case I recommend using a reference to your bucket in your objects, rather than an explicit depends_on.
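For comparison, a sketch of the explicit depends_on form, which produces the same ordering but hides the relationship from anyone reading the attribute values:

```hcl
resource "google_storage_bucket_object" "file-main-py" {
  name   = "main.py"
  source = "app-files/main.py"
  bucket = "python_bucket_exam"

  # Explicit ordering; the implicit reference via
  # google_storage_bucket.bucket_for_python_application.name is preferred.
  depends_on = [google_storage_bucket.bucket_for_python_application]
}
```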
I have some Terraform code like this:
resource "aws_s3_bucket_object" "file1" {
key = "someobject1"
bucket = "${aws_s3_bucket.examplebucket.id}"
source = "./src/index.php"
}
resource "aws_s3_bucket_object" "file2" {
key = "someobject2"
bucket = "${aws_s3_bucket.examplebucket.id}"
source = "./src/main.php"
}
# same code here, 10 files more
# ...
Is there a simpler way to do this?
Terraform supports loops via the count meta parameter on resources and data sources.
So, for a slightly simpler example, if you wanted to loop over a well known list of files you could do something like the following:
locals {
files = [
"index.php",
"main.php",
]
}
resource "aws_s3_bucket_object" "files" {
count = "${length(local.files)}"
key = "${local.files[count.index]}"
bucket = "${aws_s3_bucket.examplebucket.id}"
source = "./src/${local.files[count.index]}"
}
Unfortunately Terraform's AWS provider doesn't have support for the equivalent of aws s3 sync or aws s3 cp --recursive although there is an issue tracking the feature request.
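Since Terraform 0.12.8, the fileset function combined with for_each can approximate a recursive upload. A sketch, assuming the files live under ./src:

```hcl
resource "aws_s3_bucket_object" "files" {
  # One instance per file found recursively under src/.
  for_each = fileset("${path.module}/src", "**/*")

  bucket = aws_s3_bucket.examplebucket.id
  key    = each.value
  source = "${path.module}/src/${each.value}"

  # Re-upload whenever the file content changes.
  etag = filemd5("${path.module}/src/${each.value}")
}
```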
How do I create multiple folders inside an existing bucket using Terraform?
Example: bucket/folder1/folder2
resource "aws_s3_bucket_object" "folder1" {
bucket = "${aws_s3_bucket.b.id}"
acl = "private"
key = "Folder1/"
source = "/dev/null"
}
While Nate's answer is correct, it would lead to a lot of code duplication. A better solution in my opinion is to work with a list and loop over it.
Create a variable (variable.tf file) that contains a list of possible folders:
variable "s3_folders" {
type = "list"
description = "The list of S3 folders to create"
default = ["folder1", "folder2", "folder3"]
}
Then alter the piece of code you already have:
resource "aws_s3_bucket_object" "folders" {
count = "${length(var.s3_folders)}"
bucket = "${aws_s3_bucket.b.id}"
acl = "private"
key = "${var.s3_folders[count.index]}/"
source = "/dev/null"
}
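On Terraform 0.12 and later, a for_each over a set is an alternative worth considering: each folder is then tracked by name rather than by list position, so reordering the variable doesn't recreate objects. A sketch under that assumption:

```hcl
resource "aws_s3_bucket_object" "folders" {
  # toset() converts the list variable into a set for for_each.
  for_each = toset(var.s3_folders)

  bucket = aws_s3_bucket.b.id
  acl    = "private"
  key    = "${each.value}/"
  source = "/dev/null"
}
```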
Apply the same logic as you did to create the first directory.
resource "aws_s3_bucket_object" "folder1" {
bucket = "${aws_s3_bucket.b.id}"
acl = "private"
key = "Folder1/Folder2/"
source = "/dev/null"
}
There are no tips for Windows users, but this should work for you.
Slightly easier than using an empty file as the "source":
resource "aws_s3_bucket_object" "output_subdir" {
bucket = "${aws_s3_bucket.file_bucket.id}"
key = "output/"
content_type = "application/x-directory"
}
resource "aws_s3_bucket_object" "input_subdir" {
bucket = "${aws_s3_bucket.file_bucket.id}"
key = "input/"
content_type = "application/x-directory"
}