Trying to use the dbt redshift.unload_table macro, getting a syntax error

I want to UNLOAD a Redshift table into S3.
The query below has the same format as dbt Labs shows on their website, and I'm sure the S3 path, key, and secret are correct. However, when I run it, it fails with a syntax error at line 13.
{{
  config(
    {
      "materialized": "table",
      "post-hook": [
        "{{ redshift.unload_table(
          this.dbt_test_ods,
          this.user_performance,
          s3_path = 's3://XXXX/user_performance/',
          aws_key = ‘XXXX',
          aws_secret = ‘XXXX',
          format = 'parquet',
          parallel = False
        ) }}"
      ]
    }
  )
}}

Related

PynamoDB TransactWrite NoCredentialsError

I'm trying to perform a TransactWrite using PynamoDB,
but I'm getting a botocore.exceptions.NoCredentialsError: Unable to locate credentials error.
I am sending the request to DynamoDB from my local PC. Since MFA is applied to the locally stored AWS profile, I obtained temporary credentials through STS and applied them.
It seems the credentials set in the table's Meta class are not being picked up. Has anyone had this kind of experience?
Below is the code that attempts the TransactWrite and the code that defines the table.
from pynamodb.connection import Connection
from pynamodb.transactions import TransactWrite
from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute, NumberAttribute

connection = Connection(region="ap-northeast-2")

with TransactWrite(connection=connection) as transaction:
    transaction.update(
        item,
        actions=[
            Table.one.set(1),
            Table.type.set("test")
        ]
    )
    transaction.save(
        Table("1234", "test"),
    )

class Table(Model):
    class Meta:
        # credentials obtained from STS (MFA)
        aws_access_key_id = credentials["AccessKeyId"]
        aws_secret_access_key = credentials["SecretAccessKey"]
        aws_session_token = credentials["SessionToken"]
        table_name = "test-dev-table"
        region = "ap-northeast-2"

    id = UnicodeAttribute(hash_key=True)
    ts = NumberAttribute(range_key=True)
    one = NumberAttribute(null=True)
    type = UnicodeAttribute(null=True)
Thank you for your help.
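One thing worth checking (a guess, not a confirmed fix): the standalone Connection passed to TransactWrite appears to resolve credentials through botocore's default chain rather than through the model's Meta, so exporting the STS credentials where that chain can find them is a minimal sketch of a workaround. The credentials dict below is the one from the question:

import os
from pynamodb.connection import Connection

# Assumption: `credentials` is the dict returned by STS, as in the question.
# Environment variables are part of botocore's default credential chain,
# so the standalone Connection should pick them up as well.
os.environ["AWS_ACCESS_KEY_ID"] = credentials["AccessKeyId"]
os.environ["AWS_SECRET_ACCESS_KEY"] = credentials["SecretAccessKey"]
os.environ["AWS_SESSION_TOKEN"] = credentials["SessionToken"]

connection = Connection(region="ap-northeast-2")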

Terraform, AWS Redshift, and Secrets Manager

I am trying to deploy Redshift with a password generated in AWS Secrets Manager.
The secret works only when I connect with a SQL client.
I wrote a Python script that fetches the secret from Secrets Manager, connects to Redshift, and runs some operations, but it fails:

import awswrangler as wr

# Connect to Redshift using the secret
print("Connecting to Redshift...")
con = wr.redshift.connect(secret_id=redshift_credential_secret, timeout=10)
print("Successfully connected to Redshift.")

redshift_connector.error.InterfaceError: ('communication error', gaierror(-2, 'Name or service not known'))

For testing I created a secret manually in Secrets Manager, chose the secret type "Redshift credentials", referenced it in my script, and it worked. But the secret I created with Terraform does not work.
It seems an ordinary key/value secret does not work with a Redshift cluster when you fetch it from code; the secret type in Secrets Manager has to be changed, but there is no option in Terraform to choose the secret type.
Is there any other way to deploy this solution?
Here is my code:

# First, create a randomly generated password to use in the secret.
resource "random_password" "password" {
  length           = 16
  special          = true
  override_special = "!#$%&=+?"
}

# Create an AWS secret for Redshift
resource "aws_secretsmanager_secret" "redshiftcred" {
  name                    = "redshift"
  recovery_window_in_days = 0
}

# Create the AWS secret version for Redshift
resource "aws_secretsmanager_secret_version" "redshiftcred" {
  secret_id = aws_secretsmanager_secret.redshiftcred.id
  secret_string = jsonencode({
    engine              = "redshift"
    host                = aws_redshift_cluster.redshift_cluster.endpoint
    username            = aws_redshift_cluster.redshift_cluster.master_username
    password            = aws_redshift_cluster.redshift_cluster.master_password
    port                = "5439"
    dbClusterIdentifier = aws_redshift_cluster.redshift_cluster.cluster_identifier
  })

  depends_on = [
    aws_secretsmanager_secret.redshiftcred
  ]
}

resource "aws_redshift_cluster" "redshift_cluster" {
  cluster_identifier        = "tf-redshift-cluster"
  database_name             = lookup(var.redshift_details, "redshift_database_name")
  master_username           = "admin"
  master_password           = random_password.password.result
  node_type                 = lookup(var.redshift_details, "redshift_node_type")
  cluster_type              = lookup(var.redshift_details, "redshift_cluster_type")
  number_of_nodes           = lookup(var.redshift_details, "number_of_redshift_nodes")
  iam_roles                 = [aws_iam_role.redshift_role.arn]
  skip_final_snapshot       = true
  publicly_accessible       = true
  cluster_subnet_group_name = aws_redshift_subnet_group.redshift_subnet_group.id
  vpc_security_group_ids    = [aws_security_group.redshift.id]

  depends_on = [
    aws_iam_role.redshift_role
  ]
}
Unfortunately, Terraform still does not support AWS::SecretsManager::SecretTargetAttachment, which CloudFormation does (with AWS::Redshift::Cluster as a supported target type).
For more information, you can check the open issue that has been tracking this since 2019.
You can work around this by using Terraform to create a CloudFormation stack resource (aws_cloudformation_stack) that declares the attachment.
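As a side illustration (not part of the original answer): if the secret stays the plain JSON secret created by Terraform, a script can read it with boto3 and pass its fields to redshift_connector directly instead of relying on wr.redshift.connect(secret_id=...). A minimal sketch, assuming the secret name and keys from the Terraform code above; the database name is a placeholder:

import json
import boto3
import redshift_connector

# Read the plain JSON secret created by Terraform (name taken from the code above).
secret = boto3.client("secretsmanager").get_secret_value(SecretId="redshift")
creds = json.loads(secret["SecretString"])

# aws_redshift_cluster's endpoint attribute typically comes as "host:port",
# so split off the port before using it as a hostname.
host = creds["host"].split(":")[0]

con = redshift_connector.connect(
    host=host,
    database="dev",  # placeholder: use your redshift_database_name
    user=creds["username"],
    password=creds["password"],
    port=int(creds["port"]),
)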

Variables in Terragrunt: AWS region selection

I'm trying to create an AWS environment using Terragrunt. I worked in the us-east-2 region and now I want to work in eu-central-1.
In my code I have a single variable that represents the region. I changed it to eu-central-1, but when I execute "terragrunt run-all plan" I still see my old environment in the outputs.
I deleted the tfstate and all other local files created by Terragrunt. I also deleted the S3 bucket and DynamoDB table in AWS. Where does Terragrunt store information about the region? How can I use the new region?
terraform {
  source = "/home/bohdan/Dev_ops/terragrunt_vpc/modules//vpc"

  extra_arguments "custom_vars" {
    commands = get_terraform_commands_that_need_vars()
  }
}

locals {
  remote_state_bucket_prefix = "tfstate"
  environment                = "dev"
  app_name                   = "demo3"
  aws_account                = "873432059572"
  aws_region                 = "eu-central-1"
  image_tag                  = "v1"
}

inputs = {
  remote_state_bucket = format("%s-%s-%s-%s", local.remote_state_bucket_prefix, local.app_name, local.environment, local.aws_region)
  environment         = local.environment
  app_name            = local.app_name
  aws_account         = local.aws_account
  aws_region          = local.aws_region
  image_tag           = local.image_tag
}

remote_state {
  backend = "s3"

  config = {
    encrypt = true
    bucket  = format("%s-%s-%s-%s", local.remote_state_bucket_prefix, local.app_name, local.environment, local.aws_region)
    key     = format("%s/terraform.tfstate", path_relative_to_include())
    region  = local.aws_region
    # dynamodb_table = format("tflock-%s-%s-%s", local.environment, local.app_name, local.aws_region)
    dynamodb_table = "my-lock-table"
  }
}
The problem was in source = "/home/bohdan/Dev_ops/terragrunt_vpc/modules//vpc":
I was referring to an incorrect module with a hardcoded region.

Is it possible to implement BigQuery Data Transfer Service for Cloud Storage transfers using Terraform?

This is more a question about whether this is possible at all, because there is no GCP documentation on it.
I am using BigQuery DTS to move my CSVs from a GCS bucket to a BQ table. I have tried it manually and it works, but I need some automation behind it and want to implement it using Terraform.
I have already checked this link, but it doesn't help exactly:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/bigquery_data_transfer_config
Any help would be appreciated. Thanks!
I think the issue is that the documentation does not list the params. Here is an example; compare it with the API:
https://cloud.google.com/bigquery-transfer/docs/cloud-storage-transfer#bq
resource "google_bigquery_data_transfer_config" "sample" {
display_name = "sample"
location = "asia-northeast1"
data_source_id = "google_cloud_storage"
schedule = "every day 10:00"
destination_dataset_id = "target_dataset"
params = {
data_path_template = "gs://target_bucket/*"
destination_table_name_template = "target_table"
file_format = "CSV"
write_disposition = "MIRROR"
max_bad_records = 0
ignore_unknown_values = "false"
field_delimiter = ","
skip_leading_rows = "1"
allow_quoted_newlines = "false"
allow_jagged_rows = "false"
delete_source_files = "false"
}
}

Creating a CloudWatch metric from Athena query results

My Requirement
I want to create a CloudWatch metric from Athena query results.
Example
I want to create a metric like user_count for each day.
In Athena, I would write a SQL query like this:
select date, count(distinct user) as count from users_table group by 1
In the Athena editor I can see the result, but I want to see these results as a metric in CloudWatch.
CloudWatch metric name ==> user_count
Dimensions ==> Date, count
If I had this CloudWatch metric and its dimensions, I could easily create a monitoring dashboard and send alerts.
Can anyone suggest a way to do this?
You can use CloudWatch custom widgets; see "Run Amazon Athena queries" in the samples.
It's somewhat involved, but you can use a Lambda for this. In a nutshell:
Set up your query in Athena and make sure it works using the Athena console.
Create a Lambda that:
Runs your Athena query
Pulls the query results from S3
Parses the query results
Sends the query results to CloudWatch as a metric
Use EventBridge to run your Lambda on a recurring basis (a scheduling sketch follows the Lambda example below)
Here's an example Lambda function in Python that does step #2. Note that the Lambda function will need IAM permissions to run queries in Athena, read the results from S3, and then put a metric into CloudWatch.
import time
import boto3

query = 'select count(*) from mytable'
DATABASE = 'default'
bucket = 'BUCKET_NAME'
path = 'yourpath'

def lambda_handler(event, context):
    # Run the query in Athena
    client = boto3.client('athena')
    output = "s3://{}/{}".format(bucket, path)

    # Execution
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': DATABASE
        },
        ResultConfiguration={
            'OutputLocation': output,
        }
    )

    # The S3 file name uses the QueryExecutionId,
    # so grab it here so we can pull the S3 file.
    qeid = response["QueryExecutionId"]

    # Occasionally Athena hasn't written the file before the Lambda
    # tries to pull it out of S3, so pause a few seconds.
    # Note: you are charged for the time the Lambda is running.
    # A more elegant but more complicated solution would try to get the
    # file first and then sleep.
    time.sleep(3)

    # Get the query result from S3.
    s3 = boto3.client('s3')
    objectkey = path + "/" + qeid + ".csv"

    # Load the object as a file
    file_content = s3.get_object(
        Bucket=bucket,
        Key=objectkey)["Body"].read()

    # Split the file on newlines
    lines = file_content.decode().splitlines()

    # Get the second line in the file
    count = lines[1]

    # Remove double quotes
    count = count.replace("\"", "")

    # Convert string to int since CloudWatch wants a numeric value
    count = int(count)

    # Post the query result as a CloudWatch metric
    cloudwatch = boto3.client('cloudwatch')
    response = cloudwatch.put_metric_data(
        MetricData=[
            {
                'MetricName': 'MyMetric',
                'Dimensions': [
                    {
                        'Name': 'DIM1',
                        'Value': 'dim1'
                    },
                ],
                'Unit': 'None',
                'Value': count
            },
        ],
        Namespace='MyMetricNS'
    )
    return response
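For step #3 (the recurring schedule), the rule and target can be created in the console, or roughly like this with boto3; the rule name, schedule expression, and Lambda ARN below are placeholders, not values from the answer above:

import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

# Placeholder values -- substitute your own.
rule_name = 'athena-metric-daily'
function_arn = 'arn:aws:lambda:us-east-1:123456789012:function:athena-to-cloudwatch'

# Create (or update) a scheduled rule that fires once a day.
rule = events.put_rule(Name=rule_name, ScheduleExpression='rate(1 day)', State='ENABLED')

# Allow EventBridge to invoke the Lambda.
lambda_client.add_permission(
    FunctionName=function_arn,
    StatementId='allow-eventbridge-invoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn'],
)

# Point the rule at the Lambda.
events.put_targets(
    Rule=rule_name,
    Targets=[{'Id': 'athena-metric-lambda', 'Arn': function_arn}],
)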