I need to load data from a non-public S3 bucket. Given JSON like the example below, I want to be able to loop over its lists within Terraform.
Example:
{
  "info": [
    "10.0.0.0/24",
    "10.1.1.0/24",
    "10.2.2.0/24"
  ]
}
I can retrieve the JSON fine using the following:
data "aws_s3_bucket_object" "config" {
bucket = "our-bucket"
key = "global.json"
}
What I cannot do is use this as a map or list within Terraform so that I can actually work with the data. Any ideas?
After a good deal of trial and error I figured out a solution. Note that for this to work, the JSON source apparently needs to be flat, that is, no nested structures such as lists or maps.
{
  "foo1": "my foo1",
  "foo2": "my foo2",
  "foo3": "my foo3"
}
data "aws_s3_bucket_object" "config-json" {
bucket = "my-bucket"
key = "foo.json"
}
data "external" "config-map" {
program = ["echo", "${data.aws_s3_bucket_object.config-json.body}"]
}
output "foo" {
value = ["${values(data.external.config-map.result)}"]
}
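On Terraform 0.12 and later, a simpler route that also handles the nested list from the original example is to decode the object body directly with jsondecode. A minimal sketch, assuming the "info" JSON above and a hypothetical aws_vpc.example resource (the object still needs a text/* or application/json Content-Type for body to be populated):

data "aws_s3_bucket_object" "config" {
  bucket = "our-bucket"
  key    = "global.json"
}

locals {
  # Decode the raw JSON body into a Terraform object
  config = jsondecode(data.aws_s3_bucket_object.config.body)
}

# Hypothetical use of the "info" list: one subnet per CIDR block
resource "aws_subnet" "example" {
  count      = length(local.config.info)
  vpc_id     = aws_vpc.example.id # assumes an aws_vpc.example resource exists
  cidr_block = local.config.info[count.index]
}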
Related
How do I get AWS configuration parameters stored in JSON format on S3 into Terraform scripts? I want to use those parameters in other resources.
I just want to externalise all the variable parameters in the script.
For example, we have the data source aws_ssm_parameter to get AWS SSM parameters:
data "aws_ssm_parameter" "foo" {
  name = "foo"
}
Similarly, how can we get AWS app configuration values in Terraform scripts?
From my understanding, you need to read an S3 object's value and use it in Terraform. A data source is used because it is an external resource that we are referencing. I would use it like this:
data "aws_s3_object" "obj" {
bucket = "foo"
key = "foo.json"
}
output "s3_json_value" {
value = data.aws_s3_object.obj.body
}
To parse the JSON you can use jsondecode:
locals {
  a_variable = jsondecode(data.aws_s3_object.obj.body)
}

output "Username" {
  value = local.a_variable.name
}
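The decoded value can then be fed into other resources. A hypothetical example, assuming the JSON contains a name field:

# Hypothetical: write the decoded value back out as an SSM parameter
resource "aws_ssm_parameter" "username" {
  name  = "/app/username"
  type  = "String"
  value = local.a_variable.name
}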
Is it possible to set up template_file with a YAML file stored in an S3 bucket?
Is there any other way to attach an external file to API Gateway (as with a Lambda function, which can be built from a file stored on S3)?
Update:
I tried to combine the api_gateway resource with s3_bucket_object as a data source, but Terraform apparently does not see it; it reports that there are no changes.
data "aws_s3_bucket_object" "open_api" {
bucket = aws_s3_bucket.lambda_functions_bucket.bucket
key = "openapi-${var.current_api_version}.yaml"
}
resource "aws_api_gateway_rest_api" "default" {
name = "main-gateway"
body = data.aws_s3_bucket_object.open_api.body
endpoint_configuration {
types = ["REGIONAL"]
}
}
I also tried to achieve it by using template_file:
data "aws_s3_bucket_object" "open_api" {
bucket = aws_s3_bucket.lambda_functions_bucket.bucket
key = "openapi-${var.current_api_version}.yaml"
}
data "template_file" "open_api" {
template = data.aws_s3_bucket_object.open_api.body
vars = {
lambda_invocation_arn_user_post = aws_lambda_function.user_post.invoke_arn
lambda_invocation_arn_webhook_post = aws_lambda_function.webhook_post.invoke_arn
}
}
resource "aws_api_gateway_rest_api" "default" {
name = "main-gateway"
body = data.template_file.open_api.rendered
endpoint_configuration {
types = ["REGIONAL"]
}
}
but the result is the same.
For a REST API Gateway, you can try combining the body parameter of aws_api_gateway_rest_api with aws_s3_bucket_object as a data source.
For an HTTP API Gateway, you can try combining the body parameter of aws_apigatewayv2_api with aws_s3_bucket_object as a data source.
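For illustration, an untested sketch of the HTTP API variant, reusing the same S3 object data source as above:

resource "aws_apigatewayv2_api" "default" {
  name          = "main-gateway-http"
  protocol_type = "HTTP"

  # Import the OpenAPI definition straight from the S3 object's body
  body = data.aws_s3_bucket_object.open_api.body
}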
Edit:
From the Terraform docs for aws_s3_bucket_object: the content of an object (the body field) is available only for objects that have a human-readable Content-Type (text/* and application/json). The appropriate Content-Type header for YAML files seems to be unsettled, but in this case an application/* Content-Type for the YAML file would result in Terraform ignoring its contents.
The issue was with the YAML file; it looks like Terraform does not support it. It is necessary to use the JSON format.
data "aws_s3_bucket_object" "open_api" {
bucket = aws_s3_bucket.lambda_functions_bucket.bucket
key = "openapi-${var.current_api_version}.json"
}
data "template_file" "open_api" {
template = data.aws_s3_bucket_object.open_api.body
vars = {
lambda_invocation_arn_user_post = aws_lambda_function.user_post.invoke_arn
lambda_invocation_arn_webhook_post = aws_lambda_function.webhook_post.invoke_arn
}
}
resource "aws_api_gateway_rest_api" "default" {
name = "main-gateway"
body = data.template_file.open_api.rendered
endpoint_configuration {
types = ["REGIONAL"]
}
}
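Related to the Content-Type note above: if the spec file is also uploaded through Terraform, explicitly setting a readable Content-Type should keep the body attribute populated. A sketch, assuming the spec file sits next to the configuration:

resource "aws_s3_bucket_object" "open_api_upload" {
  bucket = aws_s3_bucket.lambda_functions_bucket.bucket
  key    = "openapi-${var.current_api_version}.json"
  source = "openapi-${var.current_api_version}.json"

  # body is only exposed for text/* and application/json objects
  content_type = "application/json"
}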
I am enabling AWS Macie 2 using Terraform and I am defining a default classification job as follows:
resource "aws_macie2_account" "member" {}
resource "aws_macie2_classification_job" "member" {
job_type = "ONE_TIME"
name = "S3 PHI Discovery default"
s3_job_definition {
bucket_definitions {
account_id = var.account_id
buckets = ["S3 BUCKET NAME 1", "S3 BUCKET NAME 2"]
}
}
depends_on = [aws_macie2_account.member]
}
AWS Macie needs a list of S3 buckets to analyze. I am wondering if there is a way to select all buckets in an account, using a wildcard or some other method. Our production accounts contain hundreds of S3 buckets and hard-coding each value in the s3_job_definition is not feasible.
Any ideas?
The Terraform AWS provider does not support a data source for listing S3 buckets at this time, unfortunately. For things like this (data sources that Terraform doesn't support), the common approach is to use the AWS CLI through an external data source.
These are modules that I like to use for CLI/shell commands: one that works as a data source (re-runs each time) and one that works as a resource (re-runs only on resource recreation or on a change to a trigger).
Using the data source version, it would look something like:
module "list_buckets" {
source = "Invicton-Labs/shell-data/external"
version = "0.1.6"
// Since the command is the same on both Unix and Windows, it's ok to just
// specify one and not use the `command_windows` input arg
command_unix = "aws s3api list-buckets --output json"
// You want Terraform to fail if it can't get the list of buckets for some reason
fail_on_error = true
// Specify your AWS credentials as environment variables
environment = {
AWS_PROFILE = "myprofilename"
// Alternatively, although not recommended:
// AWS_ACCESS_KEY_ID = "..."
// AWS_SECRET_ACCESS_KEY = "..."
}
}
output "buckets" {
// We specified JSON format for the output, so decode it to get a list
value = jsondecode(module.list_buckets.stdout).Buckets
}
Running terraform apply then gives:
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
buckets = [
  {
    "CreationDate" = "2021-07-15T18:10:20+00:00"
    "Name" = "bucket-foo"
  },
  {
    "CreationDate" = "2021-07-15T18:11:10+00:00"
    "Name" = "bucket-bar"
  },
]
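If you would rather avoid the extra module, here is a minimal sketch using Terraform's built-in external data source with the same AWS CLI call. It assumes bash and the aws CLI are available where Terraform runs; the JMESPath query flattens the bucket names into a single comma-separated string because the external data source only returns a flat map of strings:

data "external" "bucket_names" {
  program = [
    "bash", "-c",
    "aws s3api list-buckets --query '{names: join(`,`, Buckets[].Name)}' --output json"
  ]
}

locals {
  # Turn the comma-separated string back into a list of bucket names
  all_bucket_names = split(",", data.external.bucket_names.result.names)
}

resource "aws_macie2_classification_job" "member" {
  job_type = "ONE_TIME"
  name     = "S3 PHI Discovery default"

  s3_job_definition {
    bucket_definitions {
      account_id = var.account_id
      buckets    = local.all_bucket_names
    }
  }

  depends_on = [aws_macie2_account.member]
}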
Using Terraform, I am trying to create an AWS SSM Document Package for Chrome so I can install it on various EC2 instances I have. I define these steps via Terraform:
1. Upload a ZIP containing the Chrome installer plus install and uninstall PowerShell scripts.
2. Add that ZIP to an SSM package.
However, when I execute terraform apply I receive the following error...
Error updating SSM document: InvalidParameterValueException:
AttachmentSource not provided in the input request.
status code: 400, request id: 8d89da70-64de-4edb-95cd-b5f52207794c
The contents of my main.tf are as follows:
# 1. Add the package zip to S3
resource "aws_s3_bucket_object" "windows_chrome_executable" {
  bucket = "mybucket"
  key    = "ssm_document_packages/GoogleChromeStandaloneEnterprise64.msi.zip"
  source = "./software-packages/GoogleChromeStandaloneEnterprise64.msi.zip"
  etag   = filemd5("./software-packages/GoogleChromeStandaloneEnterprise64.msi.zip")
}
# 2. Create the AWS SSM Document Package using the zip.
resource "aws_ssm_document" "ssm_document_package_windows_chrome" {
  name          = "windows_chrome"
  document_type = "Package"

  attachments_source {
    key    = "SourceUrl"
    values = ["/path/to/mybucket"]
  }

  content = <<DOC
{
  "schemaVersion": "2.0",
  "version": "1.0.0",
  "packages": {
    "windows": {
      "_any": {
        "x86_64": {
          "file": "GoogleChromeStandaloneEnterprise64.msi.zip"
        }
      }
    }
  },
  "files": {
    "GoogleChromeStandaloneEnterprise64.msi.zip": {
      "checksums": {
        "sha256": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }
  }
}
DOC
}
If I change the file from a ZIP to a vanilla MSI I do not receive the error message; however, when I navigate to the package in the AWS console it tells me that the install.ps1 and uninstall.ps1 files are missing (since obviously they weren't included).
Has anyone experienced the above error and do you know how to resolve it? Or does anyone have reference to a detailed example of how to do this?
Thank you.
I ran into this same problem; to fix it, I added a trailing slash to the SourceUrl value parameter:
attachments_source {
  key    = "SourceUrl"
  values = ["/path/to/mybucket/"]
}
My best guess is that it appends the filename from the package spec to the value provided in attachments_source, so it needs the trailing slash to build a valid path to the actual file.
This is the way it should be set up for an attachment in S3:
attachments_source {
  key    = "S3FileUrl"
  values = ["s3://packer-bucket/packer_1.7.0_linux_amd64.zip"]
  name   = "packer_1.7.0_linux_amd64.zip"
}
I realized that in the above example there was no way Terraform could identify a dependency between the two resources, i.e. that the S3 object needs to be created before the aws_ssm_document. Thus, I added the following explicit dependency inside the aws_ssm_document:
depends_on = [
  aws_s3_bucket_object.windows_chrome_executable
]
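An untested sketch combining the two points above: building the S3FileUrl from the uploaded object's own attributes, which also gives Terraform the dependency implicitly rather than through depends_on (reusing the bucket and key from the question):

attachments_source {
  key    = "S3FileUrl"
  # Referencing the uploaded object's attributes creates an implicit dependency
  values = ["s3://${aws_s3_bucket_object.windows_chrome_executable.bucket}/${aws_s3_bucket_object.windows_chrome_executable.key}"]
  name   = "GoogleChromeStandaloneEnterprise64.msi.zip"
}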
Amazon Athena reads data from input Amazon S3 buckets using the IAM credentials of the user who submitted the query; query results are stored in a separate S3 bucket.
Here is the script from the HashiCorp site: https://www.terraform.io/docs/providers/aws/r/athena_database.html
resource "aws_s3_bucket" "hoge" {
bucket = "hoge"
}
resource "aws_athena_database" "hoge" {
name = "database_name"
bucket = "${aws_s3_bucket.hoge.bucket}"
}
Where it says
bucket - (Required) Name of s3 bucket to save the results of the query execution.
How can I specify the input S3 bucket in the Terraform script?
You would use the storage_descriptor argument in the aws_glue_catalog_table resource:
https://www.terraform.io/docs/providers/aws/r/glue_catalog_table.html#parquet-table-for-athena
Here is an example of creating a table using CSV file(s):
resource "aws_glue_catalog_table" "aws_glue_catalog_table" {
name = "your_table_name"
database_name = "${aws_athena_database.your_athena_database.name}"
table_type = "EXTERNAL_TABLE"
parameters = {
EXTERNAL = "TRUE"
}
storage_descriptor {
location = "s3://<your-s3-bucket>/your/file/location/"
input_format = "org.apache.hadoop.mapred.TextInputFormat"
output_format = "org.apache.hadoop.mapred.TextInputFormat"
ser_de_info {
name = "my-serde"
serialization_library = "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
parameters = {
"field.delim" = ","
"skip.header.line.count" = "1"
}
}
columns {
name = "column1"
type = "string"
}
columns {
name = "column2"
type = "string"
}
}
}
The input S3 bucket is specified in each table you create in the database; as such, there is no global definition for it.
As of today, the AWS API doesn't have much provision for Athena management; as such, neither does the aws CLI, and nor does Terraform. There's no 'proper' way to create a table via these means.
In theory, you could create a named query to create your table, and then execute that query (for which there is API functionality, but not yet Terraform). It seems a bit messy to me, but it would probably work if/when TF gets the StartQuery functionality. The asynchronous nature of Athena makes it tricky to know when that table has actually been created though, and so I can imagine TF won't fully support table creation directly.
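For illustration, a hedged sketch of what that named-query approach might look like; the table DDL, column names, and input bucket are hypothetical, and the query would still have to be executed outside Terraform:

resource "aws_athena_named_query" "create_input_table" {
  name     = "create-input-table"
  database = aws_athena_database.hoge.name

  # Hypothetical DDL pointing Athena at the input bucket
  query = <<EOT
CREATE EXTERNAL TABLE IF NOT EXISTS my_input_table (
  column1 string,
  column2 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://my-input-bucket/your/file/location/'
TBLPROPERTIES ('skip.header.line.count' = '1')
EOT
}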
TF code that covers the currently available functionality is here: https://github.com/terraform-providers/terraform-provider-aws/tree/master/aws
API doco for Athena functions is here: https://docs.aws.amazon.com/athena/latest/APIReference/API_Operations.html