Adding a GSI to a DynamoDB table with Terraform

It seems that Terraform does not allow modifying an existing DynamoDB table.
How would I go about adding a GSI to an existing table with Terraform? I was previously doing it in Python (with boto3's update_table), but I am now trying to do it in Terraform. I don't want to lose data or have to do it manually.
This is the error I keep getting:
Error: error creating DynamoDB Table: ResourceInUseException: Table already exists
And my code:
# Create a DynamoDB table with GSIs.
resource "aws_dynamodb_table" "table" {
  name         = var.function_name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "PK" # partition key
  range_key    = "SK" # sort key
  # Partition Key.
  attribute {
    name = "PK"
    type = "S"
  }
  # Sort Key (datetime in UTC).
  attribute {
    name = "SK"
    type = "S"
  }
  # Date (no time).
  attribute {
    name = "date"
    type = "S"
  }
  # Define a GSI.
  global_secondary_index {
    name            = "date-index"
    hash_key        = "date" # partition key
    range_key       = "SK"
    projection_type = "ALL"
  }
}

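You first have to import the existing table into Terraform's state, for example with terraform import aws_dynamodb_table.table <table name>. Until you do, Terraform does not know about the table and tries to create it, which is what produces the ResourceInUseException. Once the table is imported, you can edit it using Terraform.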
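For comparison, here is a minimal sketch of the request that the boto3 update_table call mentioned in the question would send to add the same index. It assumes the attribute and index names from the configuration above; the helper name build_gsi_create_request and the table name "my-table" are hypothetical. The block only builds the parameters, so it runs without AWS credentials.

```python
def build_gsi_create_request(table_name, index_name, hash_key, range_key):
    """Build UpdateTable parameters that add one GSI to an on-demand table."""
    return {
        "TableName": table_name,
        # Every key attribute of the new index must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
            {"AttributeName": range_key, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": hash_key, "KeyType": "HASH"},
                        {"AttributeName": range_key, "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


# "my-table" is a placeholder for the real table name.
request = build_gsi_create_request("my-table", "date-index", "date", "SK")

# With AWS credentials configured, the request could be sent with:
#   import boto3
#   boto3.client("dynamodb").update_table(**request)
```

No ProvisionedThroughput is included for the index because the table in the question uses PAY_PER_REQUEST billing.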

Related

Issues when I try to configure an AWS Athena Iceberg table using Terraform

I have the following Terraform configuration to create an AWS Athena Iceberg table:
resource "aws_glue_catalog_table" "my_table" {
  name          = "my_table"
  database_name = "my_db"
  table_type    = "TABLE"
  parameters = {
    "table_type" = "iceberg"
  }
  storage_descriptor {
    location      = "s3://my_data/my_table"
    input_format  = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
    output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
    ser_de_info {
      name                  = "my-table-stream"
      serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
      parameters = {
        "serialization.format" = 1
      }
    }
    columns {
      name = "some_date"
      type = "date"
    }
    columns {
      name = "some_name"
      type = "string"
    }
    columns {
      name = "some_count"
      type = "bigint"
    }
  }
}
When that is applied, the table is created as expected. However, when I run select * from my_table in the Athena Query Editor, I get the following error:
GENERIC_USER_ERROR: Detected Iceberg type table without metadata
location. Please make sure an Iceberg-enabled compute engine such as
Athena or EMR Spark is used to create the table, or the table is
created by using the Iceberg open source library. Setting table_type
parameter in Glue metastore to create an Iceberg table is not
supported.
How do I properly configure an Athena Iceberg table with Terraform?

Parse schema of a dynamic dataframe in AWS Glue

I have a dynamic frame in AWS Glue, which I created using the following code:
val rawDynamicDataFrame = glueContext.getCatalogSource(
  database = rawDBName,
  tableName = rawTableName,
  redshiftTmpDir = "",
  transformationContext = "rawDynamicDataFrame"
).getDynamicFrame()
In order to get the schema of the above dynamic frame, I used the below piece of code:
val x = rawDynamicDataFrame.schema
Now x is of type com.amazonaws.services.glue.schema.Schema. How can I parse the schema object?
To check whether a field exists in the schema, use containsField(fieldPath):
if (rawDynamicDataFrame.schema.containsField("app_name")) {
  // do something
}
In Python (PySpark), you may be able to get a list of field names with field_names = [field.name for field in self.rawDynamicDataFrame.schema().fields].

Filtering by non-key field in DynamoDB (aws-cli)

I'm trying to query a number field in DynamoDB through aws-cli.
It forces me to set the key (userId) to a specific value, although I want to retrieve all the users where queriedField equals 0. This is the syntax:
aws dynamodb query \
  --table-name TableName \
  --key-condition-expression "userId = :userid" \
  --filter-expression "mapAttr.queriedField = :num" \
  --expression-attribute-values '{ ":userid": { "S": "<AccountID>" }, ":num" : { "N": "0" }}'
To run this query without specifying the key, you have to scan the whole table with your filter expression.
However, if this is still in development/design, consider making the number field a top-level attribute. That allows you to create a GSI with the number field as its hash key and to project the userId attribute into the GSI. Alternatively, you can use Global Secondary Index Write Sharding for Selective Table Queries.

AWS dynamoDB query by date

I am trying to query values in DynamoDB, but I still get an error.
The date is an ISO-8601 string stored in createdAt, which is the sort key.
My params:
{
  TableName: 'Pool',
  ExpressionAttributeValues: { ':oin': 'lol', ':from': '2017-12-16T20:26:02.594Z' },
  KeyConditionExpression: 'oin = :oin',
  ConditionExpression: 'createdAt >= :from',
  ProjectionExpression: 'createdAt, h10m, h30m, h1h, h24h, accepted, stale, dupl, oth',
  ScanIndexForward: false
}
I tried GE with the same result.
I generate the date with this code in Node.js:
var date = new Date();
date.setHours(date.getHours()-24);
var dateiso = date.toISOString();
I get the following error:
ValidationException: Value provided in ExpressionAttributeValues unused in expressions: keys: {:from}
Any idea how to fix the ConditionExpression? Thank you.
The error message indicates that the :from value in ExpressionAttributeValues is not used by any expression in the request.
The ConditionExpression that you specified can only be used with write operations (PutItem, UpdateItem and DeleteItem), not with Query. Since createdAt is the sort key, specify the condition on it in the KeyConditionExpression along with oin.
For example:
KeyConditionExpression: 'oin = :oin AND createdAt >= :from'
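As a rough illustration of the corrected request, the sketch below builds the Query parameters in the shape expected by the boto3 low-level client, with both key conditions in KeyConditionExpression. The helper names to_js_iso and build_query_params are hypothetical; the timestamp is formatted to match what Date.toISOString() produces in the question, on the assumption that the stored createdAt values use that exact format.

```python
from datetime import datetime, timedelta, timezone


def to_js_iso(dt):
    """Format an aware datetime like JavaScript's Date.toISOString()."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_query_params(oin, now, hours_back=24):
    """Build Query parameters for items of `oin` created in the last `hours_back` hours."""
    start = to_js_iso(now - timedelta(hours=hours_back))
    return {
        "TableName": "Pool",
        # Partition key equality plus a range condition on the sort key.
        "KeyConditionExpression": "oin = :oin AND createdAt >= :from",
        "ExpressionAttributeValues": {":oin": {"S": oin}, ":from": {"S": start}},
        "ScanIndexForward": False,
    }


# A fixed "now" keeps the example deterministic.
fixed_now = datetime(2017, 12, 17, 20, 26, 2, 594000, tzinfo=timezone.utc)
params = build_query_params("lol", fixed_now)
```

Comparing ISO-8601 strings with >= works here because timestamps in the same format and time zone sort lexicographically in chronological order.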

DynamoDB Can I query just the values of a GSI Index when it is a GSI with sort Key

I have a DynamoDB Table "Music". On this it has a GSI with partition key "Category" and sort key "UserRating".
For example, I can easily query for songs where "Category" = "Rap" and "UserRating" = 1.
What I would like to do is query and just get back all the "Categories". As this is a GSI and "Category" is its partition key, I have heard you can do it, but I am not sure how.
Is it possible, or will I have to create a separate GSI on "Category" without the sort key?
Thanks for your help.
When you don't want to filter by key, you need to scan the index. The solution below scans the index to get the Category of every item (not the distinct categories).
The Java code below gets all the Category values from the GSI. Replace the secondary index name accordingly.
// Uses the AWS SDK for Java v1 document API (com.amazonaws.services.dynamodbv2.document).
List<String> categoryList = new ArrayList<>();
DynamoDB dynamoDB = new DynamoDB(dynamoDBClient);
Table table = dynamoDB.getTable("Music");
Index index = table.getIndex("Secondary Index Name");
// Return only the Category attribute of each item in the index.
ScanSpec scanSpec = new ScanSpec().withSelect(Select.SPECIFIC_ATTRIBUTES).withAttributesToGet("Category");
ItemCollection<ScanOutcome> items = index.scan(scanSpec);
// Iterating over the collection fetches further pages as needed.
Iterator<Item> itemIterator = items.iterator();
while (itemIterator.hasNext()) {
    categoryList.add(itemIterator.next().getString("Category"));
}
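If only the distinct categories are needed, the duplicates have to be removed on the client side, since a scan returns one entry per item. A minimal sketch of that idea follows, written against the boto3 low-level client interface but taking the client as a parameter so it can run with a stand-in; distinct_categories, FakeClient and the index name "Category-UserRating-index" are hypothetical.

```python
def distinct_categories(client, table_name, index_name):
    """Scan a GSI and return the sorted set of Category values found in it."""
    params = {
        "TableName": table_name,
        "IndexName": index_name,
        # A placeholder name avoids any clash with DynamoDB reserved words.
        "ProjectionExpression": "#c",
        "ExpressionAttributeNames": {"#c": "Category"},
    }
    seen = set()
    while True:
        page = client.scan(**params)
        for item in page.get("Items", []):
            seen.add(item["Category"]["S"])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return sorted(seen)
        params["ExclusiveStartKey"] = last_key


class FakeClient:
    """Stand-in for boto3.client("dynamodb") that returns two canned pages."""

    def __init__(self):
        self.calls = 0
        self.pages = [
            {
                "Items": [{"Category": {"S": "Rap"}}, {"Category": {"S": "Pop"}}],
                "LastEvaluatedKey": {"Category": {"S": "Pop"}},
            },
            {"Items": [{"Category": {"S": "Rap"}}]},
        ]

    def scan(self, **kwargs):
        page = self.pages[self.calls]
        self.calls += 1
        return page


fake = FakeClient()
categories = distinct_categories(fake, "Music", "Category-UserRating-index")
```

This still reads every item in the index, so the cost grows with the index size rather than with the number of categories.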