Querying DynamoDb with Global Secondary Index Error - amazon-web-services

I'm new to DynamoDb and the intricacies of querying it - I understand (hopefully correctly) that I need to either have a partition Key or Global Secondary Index (GSI) in order to query against that value in the table.
I know I can use Appsync to query on a GSI by setting up a resolver - and this works. However, I have a setup using the Java AWS CDK (I'm writing in Kotlin) where I'm using Appsync and routing my queries into lambda Resolvers (so that once this works, I can do more complicated things later).
The Crux of the issue is that when I setup a Lambda to resolve my query, I end up with this Error msg: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Query condition missed key schema element: testName returned from the Lambda.
I think these should be the key snippets..
My DynamoDbBean..
#DynamoDbBean
data class Test(
#get:DynamoDbPartitionKey var id: String = "",
#get:DynamoDbSecondaryPartitionKey(indexNames = ["testNameIndex"])
var testName: String = "",
)
Using the CDK I Created GSI on
testTable.addGlobalSecondaryIndex(
GlobalSecondaryIndexProps.builder()
.indexName("testNameIndex")
.partitionKey(
Attribute.builder()
.name("testName")
.type(AttributeType.STRING)
.build()
)
.projectionType(ProjectionType.ALL)
.build())
Then, within my Lambda I am trying to query my DynamoDb table, using a fixed value here testName = A.
My Sample data in the Test table would be like so..
{
"id" : "SomeUUID",
"testName" : "A"
}
private var client: AmazonDynamoDB = AmazonDynamoDBClientBuilder.standard().build()
private var dynamoDB: DynamoDB = DynamoDB(client)
Lambda Resolver Snippets...
val table: Table = dynamoDB.getTable(TABLE_NAME)
val index: Index = table.getIndex("testNameIndex")
...
QuerySpec().withKeyConditionExpression("testNameIndex = :testName")
.withValueMap(ValueMap().withString(":testName", "A"))
val iterator: Iterator<Item> = index.query(querySpec).iterator()
while (iterator.hasNext()) {
logger.info(iterator.next().toJSONPretty())
}
This is what results in this Error msg: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: Query condition missed key schema element: testName
Am I on the wrong lines here? I know there is some mixing of libs between the 'enhanced' Dynamo sdk and the dynamodbv2 sdk - so if there is a better way of doing this query, I'd love to know!
Thanks!

Your QuerySpec's withKeyConditionExpression is initialized wrong. You need to init it like.
not testNameIndex it should be testName
QuerySpec().withKeyConditionExpression("testName = :testName")
.withValueMap(ValueMap().withString(":testName", "A"))

Related

Dynamically Insert/Update Item in DynamoDB With Python Lambda using event['body']

I am working on a lambda function that gets called from API Gateway and updates information in dynamoDB. I have half of this working really dynamically, and im a little stuck on updating. Here is what im working with:
dynamoDB table with a partition key of guild_id
My dummy json code im using:
{
"guild_id": "126",
"guild_name": "Posted Guild",
"guild_premium": "true",
"guild_prefix": "z!"
}
Finally the lambda code:
import json
import boto3
def lambda_handler(event, context):
client = boto3.resource("dynamodb")
table = client.Table("guildtable")
itemData = json.loads(event['body'])
guild = table.get_item(Key={'guild_id':itemData['guild_id']})
#If Guild Exists, update
if 'Item' in guild:
table.update_item(Key=itemData)
responseObject = {}
responseObject['statusCode'] = 200
responseObject['headers'] = {}
responseObject['headers']['Content-Type'] = 'application/json'
responseObject['body'] = json.dumps('Updated Guild!')
return responseObject
#New Guild, Insert Guild
table.put_item(Item=itemData)
responseObject = {}
responseObject['statusCode'] = 200
responseObject['headers'] = {}
responseObject['headers']['Content-Type'] = 'application/json'
responseObject['body'] = json.dumps('Inserted Guild!')
return responseObject
The insert part is working wonderfully, How would I accomplish a similar approach with update item? Im wanting this to be as dynamic as possible so I can throw any json code (within reason) at it and it stores it in the database. I am wanting my update method to take into account adding fields down the road and handling those
I get the follow error:
Lambda execution failed with status 200 due to customer function error: An error occurred (ValidationException) when calling the UpdateItem operation: The provided key element does not match the schema.
A "The provided key element does not match the schema" error means something is wrong with Key (= primary key). Your schema's primary key is guild_id: string. Non-key attributes belong in the AttributeUpdate parameter. See the docs.
Your itemdata appears to include non-key attributes. Also ensure guild_id is a string "123" and not a number type 123.
goodKey={"guild_id": "123"}
table.update_item(Key=goodKey, UpdateExpression="SET ...")
The docs have a full update_item example.

AWS AppSync RDS: $util.rds.toJSONObject() Nested Objects

I am using Amazon RDS with AppSync. I've created a resolver that join two tables to get One-to-One association between them and returns columns from both tables. What I would like to do is to be able to put nest some columns under a key in the resulting parsed JSON object evaluated using $util.rds.toJSONObject().
Here's the schema:
type Parent {
col1: String
col2: String
child: Child
}
type Child {
col3: String
col4: String
}
Here's the resolver:
{
"version": "2018-05-29",
"statements": [
"SELECT parent.*, child.col3 AS `child.col3`, child.col4 AS `child.col4` FROM parent LEFT JOIN child ON parent.col1 = child.col3"
]
}
I tried naming the resulting column with dot-syntax but, $util.rds.toJSONObject() doesn't put col3 and col4 under child key. The reason it should is because otherwise, Apollo won't be able to cache and parse the entity.
Note: Dot-syntax is not documented anywhere. Usually, some ORMs use dot-syntax technique to convert SQL rows to proper nested JSON objects.
The #Aaron_H comment and answer were useful for me, but the response mapping template provided in the answer didn't work for me.
I managed to get a working response mapping template for my case which is similar for the case in question. In images below you will find info for query -> message(id: ID) { ... } (one message and the associated user will be returned):
SQL request to user table;
SQL request to message table;
SQL JOIN tables request for message id=1;
GraphQL Schema;
Request and response templates;
AWS AppSync query.
https://github.com/xai1983kbu/apollo-server/blob/pulumi_appsync_2/bff_pulumi/graphql/resolvers/Query.message.js
Next example for query messages
https://github.com/xai1983kbu/apollo-server/blob/pulumi_appsync_2/bff_pulumi/graphql/resolvers/Query.messages.js
Assuming your resolver is expecting to return a list of Parent types, ie. [Parent!]!, you can write your response mapping template logic like this:
#if($ctx.error)
$util.error($ctx.error.message, $ctx.error.type)
#end
#set($output = $utils.rds.toJsonObject($ctx.result)[0])
## Make sure to handle instances where fields are null
## or don't exist according to your business logic
#foreach( $item in $output )
#set($item.child = {
"col3": $item.get("child.col3"),
"col4": $item.get("child.col4")
})
#end
$util.toJson($output)

Google Cloud Datastore query values in array

I have entities that look like that:
{
name: "Max",
nicknames: [
"bestuser"
]
}
how can I query for nicknames to get the name?
I have created the following index,
indexes:
- kind: users
properties:
- name: name
- name: nicknames
I use the node.js client library to query the nickname,
db.createQuery('default','users').filter('nicknames', '=', 'bestuser')
the response is only an empty array.
Is there a way to do that?
You need to actually fetch the query from datastore, not just create the query. I'm not familiar with the nodejs library, but this is the code given on the Google Cloud website:
datastore.runQuery(query).then(results => {
// Task entities found.
const tasks = results[0];
console.log('Tasks:');
tasks.forEach(task => console.log(task));
});
where query would be
const query = db.createQuery('default','users').filter('nicknames', '=', 'bestuser')
Check the documentation at https://cloud.google.com/datastore/docs/concepts/queries#datastore-datastore-run-query-nodejs
The first point to notice is that you don't need to create an index to this kind of search. No inequalities, no orders and no projections, so it is unnecessary.
As Reuben mentioned, you've created the query but you didn't run it.
ds.runQuery(query, (err, entities, info) => {
if (err) {
reject(err);
} else {
response.resultStatus = info.moreResults;
response.cursor = info.moreResults == TNoMoreResults? null: info.endCursor;
resolve(entities);
};
});
In my case, the response structure was made to collect information on the cursor (if there is more data than I've queried because I've limited the query size using limit) but you don't need to anything more than the resolve(entities)
If you are using the default namespace you need to remove it from your query. Your query needs to be like this:
const query = db.createQuery('users').filter('nicknames', '=', 'bestuser')
I read the entire plob as a string to get the bytes of a binary file here. I imagine you simply parse the Json per your requirement

DynamoDB: getting table description null

I need to have a query on DynamoDB.
Currently I made so far this code:
AWSCredentials creds = new DefaultAWSCredentialsProviderChain().getCredentials();
AmazonDynamoDBClient client = new AmazonDynamoDBClient(creds);
client.withRegion(Regions.US_WEST_2);
DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient(creds));
Table table = dynamoDB.getTable("dev");
QuerySpec spec = new QuerySpec().withKeyConditionExpression("tableKey = :none.json");
ItemCollection<QueryOutcome> items = table.query(spec);
System.out.println(table);
The returned value of table is: {dev: null}, which means the that teh description is null.
It's important to say that while i'm using AWS CLI with this command: aws dynamodb list-tables i'm getting a result of all the tables so if i'm also making the same operation over my code dynamoDB.listTables() is retrieving empty list.
Is there something that I'm doing wrong?
Do I need to define some more credentials before using DDB API ?
I was getting the same problem and landed here looking for a solution. As mentioned in javadoc of getDesciption
Returns the table description; or null if the table description has
not yet been described via {#link #describe()}. No network call.
Initially description is set to null. After the first call to describe(), which makes a network call, description gets set and getDescription can be used after that.

Copying one table to another in DynamoDB

What's the best way to identically copy one table over to a new one in DynamoDB?
(I'm not worried about atomicity).
Create a backup(backups option) and restore the table with a new table name. That would get all the data into the new table.
Note: Takes considerable amount of time depending on the table size
I just used the python script, dynamodb-copy-table, making sure my credentials were in some environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY), and it worked flawlessly. It even created the destination table for me.
python dynamodb-copy-table.py src_table dst_table
The default region is us-west-2, change it with the AWS_DEFAULT_REGION env variable.
AWS Pipeline provides a template which can be used for this purpose: "CrossRegion DynamoDB Copy"
See: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-crossregion-ddb-create.html
The result is a simple pipeline that looks like:
Although it's called CrossRegion you can easily use it for the same region as long the destination table name is different (Remember that table names are unique per account and region)
You can use Scan to read the data and save it to the new table.
On the AWS forums a guy from the AWS team posted another approach using EMR: How Do I Duplicate a Table?
Here's one solution to copy all items from one table to another, just using shell scripting, the AWS CLI and jq. Will work OK for smallish tables.
# exit on error
set -eo pipefail
# tables
TABLE_FROM=<table>
TABLE_TO=<table>
# read
aws dynamodb scan \
--table-name "$TABLE_FROM" \
--output json \
| jq "{ \"$TABLE_TO\": [ .Items[] | { PutRequest: { Item: . } } ] }" \
> "$TABLE_TO-payload.json"
# write
aws dynamodb batch-write-item --request-items file://"$TABLE_TO-payload.json"
# clean up
rm "$TABLE_TO-payload.json"
If you both tables to be identical, you'd want to delete all items in TABLE_TO first.
DynamoDB now supports importing from S3.
https://aws.amazon.com/blogs/database/amazon-dynamodb-can-now-import-amazon-s3-data-into-a-new-table/
So, probably in almost all use cases, the easiest and cheapest way to replicate a table is
Use "Export to S3" feature to dump entire table into S3. Since this uses backup to generate the dump, table's throughput is not affected, and is very fast as well. You need to have backups (PITR) enabled. See https://aws.amazon.com/blogs/aws/new-export-amazon-dynamodb-table-data-to-data-lake-amazon-s3/
Use "Import from S3" to import the dump created in step 1. This automatically requires you to create a new table.
Use this node js module : copy-dynamodb-table
This is a little script I made to copy the contents of one table to another.
It's based on the AWS-SDK v3. Not sure how well it would scale to big tables but as a quick and dirty solution it does the job.
It gets your AWS credentials from a profile in ~/.aws/credentials change default to the name of the profile you want to use.
Other than that it takes two args one for the source table and one for destination
const { fromIni } = require("#aws-sdk/credential-providers");
const { DynamoDBClient, ScanCommand, PutItemCommand } = require("#aws-sdk/client-dynamodb");
const ddbClient = new DynamoDBClient({
credentials: fromIni({profile: "default"}),
region: "eu-west-1",
});
const args = process.argv.slice(2);
console.log(args)
async function main() {
const { Items } = await ddbClient.send(
new ScanCommand({
TableName: args[0],
})
);
console.log("Successfully scanned table")
console.log("Copying", Items.length, "Items")
const putPromises = [];
Items.forEach((item) => {
putPromises.push(
ddbClient.send(
new PutItemCommand({
TableName: args[1],
Item: item,
})
)
);
});
await Promise.all(putPromises);
console.log("Successfully copied table")
}
main();
Usage
node copy-table.js <source_table_name> <destination_table_name>
Python + boto3 🚀
The script is idempotent as far as you maintain the same Keys.
import boto3
def migrate(source, target):
dynamo_client = boto3.client('dynamodb', region_name='us-east-1')
dynamo_target_client = boto3.client('dynamodb', region_name='us-west-2')
dynamo_paginator = dynamo_client.get_paginator('scan')
dynamo_response = dynamo_paginator.paginate(
TableName=source,
Select='ALL_ATTRIBUTES',
ReturnConsumedCapacity='NONE',
ConsistentRead=True
)
for page in dynamo_response:
for item in page['Items']:
dynamo_target_client.put_item(
TableName=target,
Item=item
)
if __name__ == '__main__':
migrate('awesome-v1', 'awesome-v2')
On November 29th, 2017 Global Tables was introduced. This may be useful depending on your use case, which may not be the same as the original question. Here are a few snippets from the blog post:
Global Tables – You can now create tables that are automatically replicated across two or more AWS Regions, with full support for multi-master writes, with a couple of clicks. This gives you the ability to build fast, massively scaled applications for a global user base without having to manage the replication process.
...
You do not need to make any changes to your existing code. You simply send write requests and eventually consistent read requests to a DynamoDB endpoint in any of the designated Regions (writes that are associated with strongly consistent reads should share a common endpoint). Behind the scenes, DynamoDB implements multi-master writes and ensures that the last write to a particular item prevails. When you use Global Tables, each item will include a timestamp attribute representing the time of the most recent write. Updates are propagated to other Regions asynchronously via DynamoDB Streams and are typically complete within one second (you can track this using the new ReplicationLatency and PendingReplicationCount metrics).
Another option is to download the table as a .csv file and upload it with the following snippet of code.
This also eliminates the need for providing your AWS credentials to a packages such as the one #ezzat suggests.
Create a new folder and add the following two files and your exported table
Edit uploadToDynamoDB.js and add the filename of the exported table and your table name
Run npm install in the folder
Run node uploadToDynamodb.js
File: Package.json
{
"name": "uploadtodynamodb",
"version": "1.0.0",
"description": "",
"main": "uploadToDynamoDB.js",
"author": "",
"license": "ISC",
"dependencies": {
"async": "^3.1.1",
"aws-sdk": "^2.624.0",
"csv-parse": "^4.8.5",
"fs": "0.0.1-security",
"lodash": "^4.17.15",
"uuid": "^3.4.0"
}
}
File: uploadToDynamoDB.js
var fs = require('fs');
var parse = require('csv-parse');
var async = require('async');
var _ = require('lodash')
var AWS = require('aws-sdk');
// If your table is in another region, make sure to update this
AWS.config.update({ region: "eu-central-1" });
var ddb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
var csv_filename = "./TABLE_CSV_EXPORT_FILENAME.csv";
var tableName = "TABLENAME"
function prepareData(data_chunk) {
const items = data_chunk.map(obj => {
const keys = Object.keys(obj)
let attr = Object.values(obj)
attr = attr.map(a => {
let newAttr;
// Can we make this an integer
if (isNaN(Number(a))) {
newAttr = { "S": a }
} else {
newAttr = { "N": a }
}
return newAttr
})
let item = _.zipObject(keys, attr)
return {
PutRequest: {
Item: item
}
}
})
var params = {
RequestItems: {
[tableName]: items
}
};
return params
}
rs = fs.createReadStream(csv_filename);
parser = parse({
columns : true,
delimiter : ','
}, function(err, data) {
var split_arrays = [], size = 25;
while (data.length > 0) {
split_arrays.push(data.splice(0, size));
}
data_imported = false;
chunk_no = 1;
async.each(split_arrays, function(item_data, callback) {
const params = prepareData(item_data)
ddb.batchWriteItem(
params,
function (err, data) {
if (err) {
console.log("Error", err);
} else {
console.log("Success", data);
}
});
}, function() {
// run after loops
console.log('all data imported....');
});
});
rs.pipe(parser);
It's been a very long time since the question was posted and AWS has been continuously improvising features. At the time of writing this answer, we have the option to export the Table to S3 bucket then use the import feature to import this data from S3 into a new table which automatically will re-create a new table with the data. Plese refer this blog for more idea on export & import
Best part is that you get to change the name, PK or SK.
Note: You have to enable PITR (might incur additional costs). Always best to refer documents.
Here is another simple python util script for this: ddb_table_copy.py. I use it often.
usage: ddb_table_copy.py [-h] [--dest-table DEST_TABLE] [--dest-file DEST_FILE] source_table
Copy all DynamoDB items from SOURCE_TABLE to either DEST_TABLE or DEST_FILE. Useful for migrating data during a stack teardown/re-creation.
positional arguments:
source_table Name of source table in DynamoDB.
optional arguments:
-h, --help show this help message and exit
--dest-table DEST_TABLE
Name of destination table in DynamoDB.
--dest-file DEST_FILE
2) a valid file path string to save the items to, e.g. 'items.json'.