put-item: Creates a new item, or replaces an old item with a new item
update-item: Edits an existing item's attributes, or adds a new item to the table if it does not already exist.
When I used update-item with a new partition key which did not exist in the table, it created the item. The same thing happened with put-item.
So what is the difference between put-item and update-item?
Thanks.
The difference is subtle and it has to do with the scenario when the item already exists in the table.
PutItem will always act as if the item did not exist in the table at all, recreating it entirely with the contents of the new item.
UpdateItem, on the other hand, does not recreate or replace an item that already exists. Instead, it updates the attributes of the existing item based on the contents of the request. The behavior can be configured to merge attributes into the existing item or remove attributes from it.
I hope this makes sense but think of PutItem as “I don’t care what’s there, make it look like what I’m telling you” vs. UpdateItem which is more like “modify the item, if it exists, to add/remove attributes”
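To make the distinction concrete, here is a toy in-memory model of the two behaviors. It is only a sketch: a plain dict stands in for the table, the function names mirror the operations but are not the boto3 API, and key attributes are left out of the stored items for brevity.

```python
def put_item(table, key, item):
    """Model of PutItem: whatever was stored under key is replaced wholesale."""
    table[key] = dict(item)


def update_item(table, key, set_attrs=None, remove_attrs=()):
    """Model of UpdateItem: create the item if missing, otherwise edit it in place."""
    current = table.setdefault(key, {})
    current.update(set_attrs or {})
    for name in remove_attrs:
        current.pop(name, None)


# Hypothetical starting data.
table = {"order-1": {"status": "new", "note": "gift wrap"}}

# Update touches only the named attribute; "note" survives.
update_item(table, "order-1", set_attrs={"status": "paid"})
after_update = dict(table["order-1"])

# Put replaces the whole item; "note" is gone.
put_item(table, "order-1", {"status": "shipped"})
after_put = dict(table["order-1"])
```

For a key that does not exist yet, both functions end up creating the item, which matches what the question observed.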
As stated in this question, I've assumed that you can't have something like updated date as the sort key of a table, because if you update you will create a duplicate record.
Further, I've always assumed that the same thing applied to a GSI using updated date. But in my scenario I have the updated date as a sort key on a GSI, and no new records are created when I update the original item.
To recap, the attributes and key schema are:
Attributes:
Id
MySortKey
MyComputedField
UpdatedDate
Table:
PartitionKey: Id
SortKey: MySortKey
GSI:
PartitionKey: MyComputedField
SortKey: UpdatedDate
My question is, am I indirectly affecting the performance of the index by doing this? Or are there any other issues caused by this pattern that I'm not aware of?
Global Secondary Indexes are separate tables under the hood and changed items from the primary table are replicated to it.
As you observed correctly, you can use a changing attribute as the sort key in a GSI without that resulting in duplicates once you write to the base table.
Note that there is no guarantee of uniqueness in the GSI, i.e. you can have more than one item with the same key attributes.
In addition to that you can only do eventually consistent reads from GSIs.
GSIs also have their own read and write capacity units that you need to provision. If you change items in the base table in a way that has to be replicated to the index, the operation will also consume write capacity units on the GSI.
Reads are separate from that.
The RCUs on the GSI remain unaffected by the writes to the table.
But if you often change items, you may see some inconsistencies for a very brief period of time (that's why only eventually consistent reads are possible).
That means you can use this pattern if you can live with the side effects I mentioned.
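As an illustration, reading from such an index with boto3 might look like the sketch below. The index name is hypothetical (the question does not give one), and the newest-first ordering assumes UpdatedDate is stored in a form that sorts chronologically, such as ISO 8601 strings.

```python
def build_gsi_query(computed_value, limit=10,
                    index_name="MyComputedField-UpdatedDate-index"):
    """Build query parameters for the GSI; index_name is a made-up placeholder."""
    return {
        "IndexName": index_name,
        "KeyConditionExpression": "MyComputedField = :v",
        "ExpressionAttributeValues": {":v": computed_value},
        # Descending sort key order, i.e. most recently updated first.
        "ScanIndexForward": False,
        "Limit": limit,
        # ConsistentRead is deliberately not set: GSI reads are eventually consistent.
    }


def latest_by_computed_field(table, computed_value, limit=10):
    """table is expected to be a boto3 DynamoDB Table resource."""
    response = table.query(**build_gsi_query(computed_value, limit))
    return response.get("Items", [])
```

Because the read is eventually consistent, an item updated a moment ago may briefly show up with its old UpdatedDate.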
I want to put an order from my Lex bot into DynamoDB. However, the PutItem operation overwrites the existing item each time (if the customer name is already in the table).
I know from the documentation that it will do this if the primary key is the same.
My goal is to have each order put into the database so they will be easily searchable in the future.
I have attached some screenshots below. Any help is appreciated
https://imgur.com/a/mLpEkOi
import boto3

def putDynam(orderNum, table_custName, slotKey, slotVal):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('blah')
    # Build the order item; 'Customer' is the table's primary key in this setup.
    item = {'Customer': table_custName, 'OrderNumber': orderNum[0], 'Bun Type': slotVal[5], 'CheeseDecision': slotVal[1], 'Cheese Type': slotVal[0], 'Pickles': slotVal[4], 'SauceDecision': slotVal[3], 'Sauce Type': slotVal[2]}
    # put_item replaces any existing item that has the same primary key.
    action = table.put_item(Item=item)
The primary key is used for identifying each item in the table. There can only be 1 record with a specific primary key (primary keys are unique).
Customer name is not a good primary key, because it's not unique.
In this case you could have an order with some generated Id (orderNumber in your example?), that could be the primary key, and Customer (preferably CustomerId) as a property.
Or you could have a composite primary key made up of CustomerId and OrderId.
If you want to query orders by customer, you could use an index if it's not in the primary key.
I recommend you read up on how DynamoDB works first. You can start with this data modelling tutorial from AWS.
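A rough sketch of the composite-key idea, assuming the table is recreated with CustomerId as the partition key and OrderId as the sort key. Both attribute names, and the use of a random UUID as the order ID, are illustrative choices rather than something taken from the question.

```python
import uuid


def build_order_item(customer_id, order_fields):
    """Return an item whose (CustomerId, OrderId) pair is unique per order."""
    item = dict(order_fields)
    item["CustomerId"] = customer_id
    # A random UUID avoids collisions without having to coordinate a counter.
    item["OrderId"] = uuid.uuid4().hex
    return item


def save_order(table, customer_id, order_fields):
    """table is expected to be a boto3 DynamoDB Table resource."""
    item = build_order_item(customer_id, order_fields)
    table.put_item(Item=item)
    return item["OrderId"]
```

With this layout, two orders from the same customer get different primary keys, so neither overwrites the other, and all orders for one customer can be fetched with a single query on CustomerId.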
So, basically, the customer name has to be unique, since it's your primary key. You can't have two rows with the same primary key. One way around this would be to use an incremental value as the id, so that each insert simply gets i+1 as its id.
You can see this stack overflow question for more information: https://stackoverflow.com/a/12460690/11593346
Per https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Client.put_item, you can:
"perform a conditional put operation (add a new item if one with the specified primary key doesn't exist)"
Note
To prevent a new item from replacing an existing item, use a conditional expression that contains the attribute_not_exists function with the name of the attribute being used as the partition key for the table. Since every record must contain that attribute, the attribute_not_exists function will only succeed if no matching item exists.
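A minimal sketch of such a conditional put with the boto3 Table resource. The helper names are made up for illustration, and partition_key has to be the name of your table's actual partition key attribute.

```python
def conditional_put_kwargs(item, partition_key):
    """Parameters for a put_item call that refuses to overwrite an existing item."""
    return {
        "Item": item,
        "ConditionExpression": "attribute_not_exists(#pk)",
        # The placeholder avoids clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#pk": partition_key},
    }


def put_if_absent(table, item, partition_key):
    """Return True if the item was written, False if one already existed.

    table is expected to be a boto3 DynamoDB Table resource.
    """
    try:
        table.put_item(**conditional_put_kwargs(item, partition_key))
        return True
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        return False
```

When the condition fails, the existing item is left untouched, so the caller can decide whether to report an error or retry with a different key.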
Also see DynamoDB: updateItem only if it already exists
If you really need to know whether the item exists or not so you can trigger your exception logic, then run a query first to see if the item already exists and don't even call put_item. You can also explore whether using a combination of ConditionExpression and one of the ReturnValues options (for put_item or update_item) may return enough data for you to know if an item existed.
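For the ReturnValues route, one possible sketch is below. It relies on put_item including an Attributes entry in the response only when ReturnValues="ALL_OLD" is set and an existing item was replaced. Note that, unlike a conditional put, this still overwrites the item; it only tells you afterwards that it did. The function name is illustrative.

```python
def put_and_report(table, item):
    """Write item and report whether an older item was overwritten.

    table is expected to be a boto3 DynamoDB Table resource.
    Returns (existed, previous_item); previous_item is None for a fresh insert.
    """
    response = table.put_item(Item=item, ReturnValues="ALL_OLD")
    previous = response.get("Attributes")
    return previous is not None, previous
```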
I want to delete items from my DynamoDB table, but based on a filter expression. So if the "city" attribute is "London", the item should be deleted.
The solutions I found all require specifying the exact key, but in this case the primary key is just a random number, so it would be hard to get them all.
Thank you for your help in advance.
The only way to delete items from a DynamoDB table is by specifying the key. If you want to delete based on an expression, you first have to scan or query the items which satisfy that expression and then delete all items returned from that scan or query by their key.
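The scan-then-delete approach could be sketched like this with the boto3 Table resource. The function name and parameters are illustrative; key_names has to list your table's primary key attribute(s). Keep in mind that a scan reads the whole table, so this costs read capacity in proportion to the table size, not to the number of matches.

```python
def delete_matching(table, key_names, attr_name, attr_value):
    """Scan for items where attr_name == attr_value, then delete them by key.

    table is expected to be a boto3 DynamoDB Table resource.
    Returns the number of delete requests issued.
    """
    scan_kwargs = {
        "FilterExpression": "#attr = :val",
        "ExpressionAttributeNames": {"#attr": attr_name},
        "ExpressionAttributeValues": {":val": attr_value},
    }
    keys = []
    while True:
        page = table.scan(**scan_kwargs)
        for item in page.get("Items", []):
            keys.append({name: item[name] for name in key_names})
        # Scan results are paginated; continue until there is no next page.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        scan_kwargs["ExclusiveStartKey"] = last_key

    # batch_writer groups the deletes into batched write requests.
    with table.batch_writer() as batch:
        for key in keys:
            batch.delete_item(Key=key)
    return len(keys)
```

For the example in the question, the call would be something like delete_matching(table, ["id"], "city", "London"), assuming the key attribute is named id.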
I want to check whether the id (primary key) of a new item is unique before adding it to DynamoDB.
What would be the best option in terms of both performance and cost?
Possible options for checking the uniqueness of the primary key:
1) Get (if an empty result is returned, there is no matching data, which also means the key is unique)
2) Scan (obviously the worst idea for both performance and cost)
3) Query
++ My other thought: if there were a way to make DynamoDB itself reject the incoming request (discard it or return an error message), the logic could be much simpler.
In a typical RDB, if we try to add a new item with an existing primary key, the database returns an error message without changing the original data stored in the database.
However, in DynamoDB, whether we Put or Update an item with an existing primary key, it silently changes the original data stored in the database.
Any ideas?
As you mentioned, DynamoDB will update an item with the primary key you provide if it already exists. The article below shows you how you can make a conditional PUT request which will fail upon trying to insert an item that already exists (based on the primary key).
http://docs.amazonwebservices.com/amazondynamodb/latest/developerguide/API_PutItem.html
To prevent a new item from replacing an existing item, use a conditional expression that contains the attribute_not_exists function with the name of the attribute being used as the partition key for the table. Since every record must contain that attribute, the attribute_not_exists function will only succeed if no matching item exists.
I know that this can be done with a full table scan and by inspecting all records for the presence of attributes. Is there a less painful way?
No, there isn't. This is one of the trade-offs of DynamoDB.
If there were a way to do this, then storing a new item with a new attribute would have to update something else, somewhere else, that remembered all of the attributes that were present in the table.
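For completeness, the full-scan approach mentioned in the question could be sketched as below, assuming the goal is to find out which attribute names occur anywhere in the table. The function name is illustrative, and the cost grows with the size of the table because every item is read.

```python
def collect_attribute_names(table):
    """Scan every item and return the set of attribute names seen.

    table is expected to be a boto3 DynamoDB Table resource.
    """
    names = set()
    scan_kwargs = {}
    while True:
        page = table.scan(**scan_kwargs)
        for item in page.get("Items", []):
            names.update(item.keys())
        # Follow pagination until the scan has covered the whole table.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break
        scan_kwargs["ExclusiveStartKey"] = last_key
    return names
```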