I want to add an item only if it does not already exist, and I am not sure how to do it. Currently I am adding successfully without checking the condition (it adds regardless of whether the item exists). The code is:
const params = {
  TableName: MY_TABLE,
  Item: myItem
};

documentClient.put(params).promise()
  .catch(err => { Logger.error(`DB Error: put in table failed: ${err}`); });
What do I need to add in order to make the code check whether the item exists and, if it does, just return?
Note: I do not want to use the database mapper; I want the code written using the AWS.DynamoDB class.
DynamoDB supports conditional writes, allowing you to define a check that must succeed for the item to be inserted or updated.
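A minimal sketch of such a conditional put with the DocumentClient, matching the question's code; the partition-key name `id` is an assumption, not something from the question:

```javascript
// Build a conditional PutItem request. The partition-key name ('id' here)
// is an assumption; use your table's actual partition key.
function buildConditionalPut(tableName, item, pkName) {
  return {
    TableName: tableName,
    Item: item,
    // attribute_not_exists on the partition key makes the put fail with
    // ConditionalCheckFailedException when an item with this key exists.
    ConditionExpression: `attribute_not_exists(${pkName})`
  };
}

// Usage with the question's DocumentClient:
// documentClient.put(buildConditionalPut(MY_TABLE, myItem, 'id')).promise()
//   .catch(err => {
//     if (err.code === 'ConditionalCheckFailedException') return; // item exists: just return
//     Logger.error(`DB Error: put in table failed: ${err}`);
//   });
```

Catching `ConditionalCheckFailedException` separately lets you treat "already exists" as a non-error, which is exactly the "just return" behaviour asked for.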
DynamoDB is not an SQL database (hopefully you know this...) and does not offer the full set of ACID guarantees. One of the things you can't do with it is an atomic "check and write" - all you can do is fetch the item, see if it exists, and then put the item. However, if another process writes the item to the table between your "get" and your "put", you won't know anything about it.
If you absolutely need this kind of behaviour, DynamoDB is not the right solution.
Related
I want to update the Sort Key of the following DynamoDB item from FILE#789 to FILE#790. From DynamoDB docs as well as some StackOverflow answers, the right way to do this is to delete the existing item (DeleteItem) and recreate it (PutItem) with the updated primary key, performed with TransactWriteItems.
Item: {
"PK": {
"S": "USER#123"
},
"SK": {
"S": "FILE#789"
},
"fileName": {
"S": "Construction"
}
}
DeleteItem is straightforward, since I know the composite Primary Key's values. But to re-create the item, I need the most current values of the item's attributes and then perform a PutItem. Reading the item separately, and then performing a DeleteItem and PutItem within a transaction does not guarantee that the most current values of the item's attributes are used to re-create the item. What is the recommended way to handle an update of an item's Primary Key in this scenario?
I'd use a variation of Optimistic Locking here. I've written a blog about the concept a while ago if you want to get into details, but here's the basic summary:
Each item gets a version counter that's incremented whenever you update it. When you want to update an item, you first read it and store the current version number. Then you perform your update locally and increment the version number. Now you need to write it back to the table, and this is when you add a condition: the current version of the item in the table must be the same as it was when you originally read the item. If that's not the case, the write should fail, because somebody else modified the item in the meantime.
Applied to your use case that means:
1) Read the item and keep track of its version.
2) Update the item locally with the new key and an incremented version.
3) Prepare your transaction: the DeleteItem should be conditional, with a check that the version number is identical to the one read in step 1; the PutItem can be unconditional.
4) Use TransactWriteItems to store the data.
This way the transaction will fail if the original item was updated while you were changing it. If it fails, you start again at step 1.
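The steps above can be sketched as follows; the `version` attribute is an assumption of the optimistic-locking scheme (it is not in the question's item), while `PK`/`SK` match the question's item shape:

```javascript
// Build a TransactWriteItems request that deletes the old item (only if
// its version is unchanged since we read it) and puts the re-keyed item.
function buildKeyChangeTransaction(tableName, oldItem, newSortKey) {
  const newItem = { ...oldItem, SK: newSortKey, version: oldItem.version + 1 };
  return {
    TransactItems: [
      {
        Delete: {
          TableName: tableName,
          Key: { PK: oldItem.PK, SK: oldItem.SK },
          // Fails the whole transaction if someone modified the item
          // after we read it.
          ConditionExpression: 'version = :expected',
          ExpressionAttributeValues: { ':expected': oldItem.version }
        }
      },
      { Put: { TableName: tableName, Item: newItem } }
    ]
  };
}

// const params = buildKeyChangeTransaction('MyTable',
//   { PK: 'USER#123', SK: 'FILE#789', fileName: 'Construction', version: 3 },
//   'FILE#790');
// await documentClient.transactWrite(params).promise();
```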
I want to put an order from my Lex bot into DynamoDB; however, the PutItem operation overwrites each time (if the customer name is already in the table).
I know from the documentation that it will do this if the primary key is the same.
My goal is to have each order put into the database so they will be easily searchable in the future.
I have attached some screenshots below. Any help is appreciated
https://imgur.com/a/mLpEkOi
import boto3

def putDynam(orderNum, table_custName, slotKey, slotVal):
    dynamodb = boto3.resource('dynamodb')  # a resource object, not a low-level client
    table = dynamodb.Table('blah')
    item = {'Customer': table_custName, 'OrderNumber': orderNum[0],
            'Bun Type': slotVal[5], 'CheeseDecision': slotVal[1],
            'Cheese Type': slotVal[0], 'Pickles': slotVal[4],
            'SauceDecision': slotVal[3], 'Sauce Type': slotVal[2]}
    table.put_item(Item=item)
The primary key is used for identifying each item in the table. There can only be 1 record with a specific primary key (primary keys are unique).
Customer name is not a good primary key, because it's not unique.
In this case you could have an order with some generated Id (orderNumber in your example?), that could be the primary key, and Customer (preferably CustomerId) as a property.
Or you could have a composite primary key made up of CustomerId and OrderId.
If you want to query orders by customer, you could use an index if it's not in the primary key.
I recommend you read up on how DynamoDB works first. You can start with this data modelling tutorial from AWS.
So, basically, the customer name has to be unique, since it's your primary key. You can't have two rows with the same primary key. A way around this could be to have an incremental value that serves as the id, and each insert would simply use i+1 as its id.
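If you do go with an incremental id, the usual way to generate one in DynamoDB is the documented atomic-counter pattern with UpdateItem; here is a sketch (the table name `Counters` and attribute names are made up for illustration):

```javascript
// Build an UpdateItem request that atomically increments a counter item
// and returns the new value, usable as the next order id.
// 'Counters', 'counterName', and 'currentValue' are illustrative names.
function buildCounterIncrement() {
  return {
    TableName: 'Counters',
    Key: { counterName: 'orderCounter' },
    // if_not_exists initializes the counter to 0 on first use.
    UpdateExpression: 'SET currentValue = if_not_exists(currentValue, :zero) + :one',
    ExpressionAttributeValues: { ':zero': 0, ':one': 1 },
    ReturnValues: 'UPDATED_NEW'
  };
}

// const res = await docClient.update(buildCounterIncrement()).promise();
// const newOrderId = res.Attributes.currentValue;
```

Note that a generated UUID per order is often simpler and avoids making the counter item a hot key.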
You can see this stack overflow question for more information: https://stackoverflow.com/a/12460690/11593346
Per https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.put_item, you can:
"perform a conditional put operation (add a new item if one with the specified primary key doesn't exist)"
Note
To prevent a new item from replacing an existing item, use a conditional expression that contains the attribute_not_exists function with the name of the attribute being used as the partition key for the table. Since every record must contain that attribute, the attribute_not_exists function will only succeed if no matching item exists.
Also see DynamoDB: updateItem only if it already exists
If you really need to know whether the item exists or not so you can trigger your exception logic, then run a query first to see if the item already exists and don't even call put_item. You can also explore whether using a combination of ConditionExpression and one of the ReturnValues options (for put_item or update_item) may return enough data for you to know if an item existed.
I want to check whether the id (primary key) of a new item is unique before adding it to DynamoDB.
What would be the best option, both performance- and cost-wise?
Possible options to check the uniqueness of the primary key:
1) Get (if an empty result returns, there is no matching data, which means the key is unique)
2) Scan (obviously the worst idea for both performance and cost)
3) Query
Another thought: if there were a way to forcibly reject the incoming request in DynamoDB settings (discard the request or return an error message), the logic could be much simpler.
In a normal RDB, if we try to add a new item with an existing primary key, the database returns an error message without changing the original data stored in the database.
In DynamoDB, however, whether we Put or Update an item with an existing primary key, it just silently overwrites the original data stored in the database.
Any ideas?
As you mentioned, DynamoDB will update an item with the primary key you provide if it already exists. The article below shows how you can make a conditional put request, which will fail when trying to insert an item that already exists (based on the primary key).
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/API_PutItem.html
To prevent a new item from replacing an existing item, use a conditional expression that contains the attribute_not_exists function with the name of the attribute being used as the partition key for the table. Since every record must contain that attribute, the attribute_not_exists function will only succeed if no matching item exists.
When using a ConditionExpression in a DynamoDB request, the entire request fails if the condition is not met. I am using conditional updates, and the fact that ConditionalCheckFailedException doesn't contain any information about which condition failed is giving me a hard time.
For example consider this scenario: There's an item in a table like this:
{
  state: 'ONGOING',
  foo: 'FOO',
  bar: 'BAR'
}
I then want to update this item, changing both foo and state:
ExpressionAttributeValues: {
  ':STATE_FINISHED': 'FINISHED',
  ':FOO': 'NEW FOO'
},
UpdateExpression: 'SET state = :STATE_FINISHED, foo = :FOO',
However, my application has a logical transition order of states, and to prevent concurrency issues where two requests concurrently modify an item and causing an inconsistent state, I add a condition to make sure only valid transitions of state are accepted:
ExpressionAttributeValues: {
  ':STATE_ONGOING': 'ONGOING'
},
ConditionExpression: 'state = :STATE_ONGOING'
This e.g. prevents two concurrent requests from modifying state into FINISHED and CANCELLED at the same time.
This is all fine when there's only one condition; if the request fails I know it was because an invalid state transition and I can choose whether to just fail the request, or to make a new request that only modifies FOO, whatever makes sense in my application. But if I have multiple conditions in one request, it seems impossible to find out which particular condition failed, which means I need to fail the entire request or divide it into multiple separate requests, updating one conditional value at a time. This can however raise new concurrency issues.
Has anyone found a decent solution to a similar problem?
Ideally what I'd want is to be able to make a UpdateExpression that modifies a certain attribute conditionally, otherwise ignoring it, or by using a custom function that returns the new value based on the old value and the suggested updated value, similar to an SQL UPDATE with an embedded SELECT .. CASE .... Is anything like this possible?
Or, is it at least possible to get more information out of a ConditionalCheckFailedException (such as which particular condition failed)?
As you mentioned, DynamoDB doesn't provide a granular error message when there are multiple fields in the ConditionExpression; I am not addressing that part of the question in my answer.
I would like to address the second part i.e. returning the old/new value.
The ReturnValues parameter can be used to get the desired values based on your requirement. You can set one of these values to get the required values.
New value - should already be available
Old value - To get old value, you can either use UPDATED_OLD or ALL_OLD
ReturnValues: NONE | ALL_OLD | UPDATED_OLD | ALL_NEW | UPDATED_NEW,
ALL_OLD - Returns all of the attributes of the item, as they appeared before the UpdateItem operation.
UPDATED_OLD - Returns only the updated attributes, as they appeared before the UpdateItem operation.
ALL_NEW - Returns all of the attributes of the item, as they appear after the UpdateItem operation.
UPDATED_NEW - Returns only the updated attributes, as they appear after the UpdateItem operation.
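For example, a conditional update that also returns the pre-update values might be parameterized like this (a sketch using the attribute names from the question; note `state` is a DynamoDB reserved word, hence the name placeholder):

```javascript
// Conditional update: transitions state ONGOING -> FINISHED and sets foo,
// returning the old values of the updated attributes on success.
function buildFinishUpdate(tableName, key) {
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: 'SET #s = :finished, foo = :foo',
    ConditionExpression: '#s = :ongoing',
    // 'state' is a reserved word in DynamoDB expressions, so alias it.
    ExpressionAttributeNames: { '#s': 'state' },
    ExpressionAttributeValues: {
      ':finished': 'FINISHED',
      ':ongoing': 'ONGOING',
      ':foo': 'NEW FOO'
    },
    ReturnValues: 'UPDATED_OLD'
  };
}

// const res = await documentClient.update(buildFinishUpdate('MyTable', { id: '1' })).promise();
// res.Attributes now holds the pre-update values of state and foo.
```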
I'm using DynamoDB and I need to update a specific attribute on multiple records. Writing my requirement in pseudo-language I would like to do an update that says "update table persons set relationshipStatus = 'married' where personKey IN (key1, key2, key3, ...)" (assuming that personKey is the KEY in my DynamoDB table).
In other words, I want to do an update with an IN-clause, or I suppose one could call it a batch update. I have found this link that asks explicitly if an operation like a batch update exists and the answer there is that it does not. It does not mention IN-clauses, however. The documentation shows that IN-clauses are supported in ConditionalExpressions (100 values can be supplied at a time). However, I am not sure if such an IN-clause is suitable for my situation because I still need to supply a mandatory KEY attribute (which expects a single value it seems - I might be wrong) and I am worried that it will do a full table scan for each update.
So my question is: how do I achieve an update on multiple DynamoDB records at the same time? At the moment it almost looks like I will have to call an update statement for each Key one-by-one and that just feels really wrong...
As you noted, DynamoDB does not support a batch update operation. You would need to query for, and obtain the keys for all the records you want to update. Then loop through that list, updating each item one at a time.
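A sketch of that loop with the DocumentClient; the table and attribute names are taken from the question's pseudo-SQL, not from a real schema:

```javascript
// Build one UpdateItem request per key; the caller loops over the keys
// obtained from a prior query.
function buildStatusUpdate(tableName, personKey, newStatus) {
  return {
    TableName: tableName,
    Key: { personKey: personKey },
    UpdateExpression: 'SET relationshipStatus = :s',
    ExpressionAttributeValues: { ':s': newStatus }
  };
}

// for (const key of keysFromQuery) {
//   await documentClient.update(buildStatusUpdate('persons', key, 'married')).promise();
// }
```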
You can use the TransactWriteItems action to update multiple records in a DynamoDB table.
The official documentation is available here; you can also see a TransactWriteItems JavaScript/Node.js example here.
I don't know if it has changed since that answer was given, but it's possible now.
See the docs:
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
I have used it like this in JavaScript (mapping the new blocks to an array of objects with the wanted structure):
let params = { RequestItems: {} };
const tableName = 'Blocks';
// Note: BatchWriteItem put requests only accept an Item; per-request
// condition expressions are not supported in batchWrite.
params.RequestItems[tableName] = _.map(newBlocks, block => {
  return {
    PutRequest: {
      Item: {
        'org_id': orgId,
        'block_id': block.block_id,
        'block_text': block.block_text
      }
    }
  };
});

docClient.batchWrite(params, function(err, data) {
  // ... and do stuff with the result
});
You can even mix puts and deletes
And if you're using dynogels (you can't mix them, due to dynogels' support), what you can do for updating is use create, because behind the scenes it casts to the batchWrite function as puts:
var item1 = {email: 'foo1@example.com', name: 'Foo 1', age: 10};
var item2 = {email: 'foo2@example.com', name: 'Foo 2', age: 20};
var item3 = {email: 'foo3@example.com', name: 'Foo 3', age: 30};
Account.create([item1, item2, item3], function (err, accounts) {
  console.log('created 3 accounts in DynamoDB', accounts);
});
Note this from DynamoDB limitations (from the docs):
The BatchWriteItem operation puts or deletes multiple items in one or more tables. A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests. Individual items to be written can be as large as 400 KB.
If I remember correctly, dynogels chunks the requests into chunks of 25 before sending them off, then collects them in one promise and returns (though I'm not 100% certain of this); otherwise a wrapper function would be pretty simple to assemble.
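Such a wrapper boils down to a chunking helper like this sketch, which splits an array of put/delete requests into batches of at most 25 (the BatchWriteItem limit quoted above):

```javascript
// Split an array of write requests into batches of at most `size` items,
// matching BatchWriteItem's 25-request-per-call limit.
function chunk(requests, size = 25) {
  const chunks = [];
  for (let i = 0; i < requests.length; i += size) {
    chunks.push(requests.slice(i, i + size));
  }
  return chunks;
}

// const batches = chunk(putRequests).map(batch =>
//   docClient.batchWrite({ RequestItems: { Blocks: batch } }).promise());
// await Promise.all(batches);
```

In real code you would also retry any UnprocessedItems returned by each batchWrite call.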
DynamoDB is not designed as a relational DB with native transaction support. It is better to design the schema to avoid the multiple-update situation in the first place; or, if that is not practical in your case, keep in mind that you may improve it when restructuring the design.
The only way to update multiple items at the same time is the TransactWriteItems operation provided by DynamoDB. But it comes with limitations (at most 25 items, for example), so you should probably enforce some limits in your application as well. In spite of being costly (the implementation involves a consensus algorithm), it is still much faster than a simple loop, and it gives you the ACID properties, which is probably what you need most. Think of a situation using a loop: if one of the updates fails, how do you deal with the failure? Is it possible to roll back all changes without causing a race condition? Are the updates idempotent? It really depends on the nature of your application, of course. Be careful.
Another option is to use a thread pool to do the network I/O, which can definitely save a lot of time, but it has the same failure-and-rollback issues to think about.