Update DynamoDB item with optional operation - amazon-web-services

Is the following operation implementable in DynamoDB:
On providing the key,
SET status as "active".
If the attribute additional_info starts_with "xyz"
Then REMOVE additional_info
Else do nothing to additional_info
Basically, an update where we override value of status regardless of additional_info. And optionally unset additional_info as an attribute if a condition is satisfied?

It cannot be done in one call. It also can't be done by wrapping the two calls in a single transaction, because a transaction can't include more than one operation on the same item.
If you can just make two calls, that's the way to go. If you need it to be somewhat transactional, you'll need to control things in your app. The details depend on your requirements.
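The two-call approach could look roughly like the sketch below. It assumes an AWS SDK v2 style DocumentClient is passed in as `client`; the function names, table name and key shape are hypothetical. The first call always sets status, and the second removes additional_info only when it begins with the prefix, treating a failed condition as "do nothing".

```javascript
// Call 1: always set status. "status" is a DynamoDB reserved word,
// so it is aliased through ExpressionAttributeNames.
function buildSetStatusParams(tableName, key) {
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: 'SET #status = :active',
    ExpressionAttributeNames: { '#status': 'status' },
    ExpressionAttributeValues: { ':active': 'active' }
  };
}

// Call 2: remove additional_info only when it begins with the given prefix.
function buildRemoveInfoParams(tableName, key, prefix) {
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: 'REMOVE additional_info',
    ConditionExpression: 'begins_with(additional_info, :prefix)',
    ExpressionAttributeValues: { ':prefix': prefix }
  };
}

// Runs both calls in order; `client` is assumed to expose update(params).promise().
async function activateItem(client, tableName, key) {
  await client.update(buildSetStatusParams(tableName, key)).promise();
  try {
    await client.update(buildRemoveInfoParams(tableName, key, 'xyz')).promise();
    return { removedInfo: true };
  } catch (err) {
    // A failed condition only means additional_info did not match; leave it alone.
    if (err.code === 'ConditionalCheckFailedException') {
      return { removedInfo: false };
    }
    throw err;
  }
}
```

The two calls are not atomic, so another writer can change the item between them; whether that matters depends on your requirements.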

Related

Condition check and Put on different tables in one DDB network call

Here are my tables:
Table1
Id (String, composite PK partition key)
IdTwo (String, composite PK sort key)
Table2
IdTwo (String, simple PK partition key)
Timestamp (Number)
I want to PutItem in Table1 only if IdTwo does not exist in Table2 or the item in Table2 with the same IdTwo has Timestamp less than the current time (can be given as outside input).
The simple approach I know would work is:
GetItem on Table2 with ConsistentRead=true. If the item exists and its Timestamp >= current time, exit early.
PutItem on Table1.
However, this is two network calls to DDB. I'd prefer optimizing it, like using TransactWriteItems which is one network call. Is it possible for my use case?
If you want to share code, I'd prefer Go, but any language is fine.
First off, the operation you're looking for is TransactWriteItems - https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_TransactWriteItems.html
This is the API operation that lets you do atomic, transactional conditional writes. There are two parts to your question. I'm not sure they can be done together, but they might not need to be.
The first part, insert in table1 if condition is met in table2 is simple enough—you add the item you want in table1 in the Put section of the API call, and phrase the existence check for table2 in the ConditionCheck section.
You can't do multiple checks right now, so the check to see if the timestamp is lower than current time is another separate operation, also in the ConditionCheck. You can't combine them together or do just one because of your rules.
I'd suggest doing a bit of optimistic concurrency here. Try the TransactWriteItems with the second ConditionCheck, where the write will succeed only if the timestamp is less than current time. This is what should happen in most cases. If the transaction fails, now you need to check if it failed because the timestamp was lower or because the item doesn't yet exist.
If it doesn't yet exist, then do a TransactWriteItems where you populate the timestamp with a ConditionCheck to make sure it doesn't exist (another thread might have written it in the meantime) and then retry the first operation.
You basically want to keep retrying the first operation (write with a condition check to make sure the timestamp is lower) until it succeeds or fails for a good reason. If it fails because the data is uninitialized, initialize it, taking race conditions into account, and then try again.
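As a rough sketch of the first transaction described above (not a complete retry loop), the following builds a TransactWriteItems request with a ConditionCheck on Table2 and a Put on Table1. It is written in JavaScript to match the other snippets on this page rather than Go. Table and attribute names come from the question; the function names are hypothetical, and `client` is assumed to be an AWS SDK v2 DocumentClient.

```javascript
// Builds a transaction: put into Table1 only if the Table2 item's
// Timestamp is lower than `now`.
function buildGuardedPutParams(id, idTwo, now) {
  return {
    TransactItems: [
      {
        ConditionCheck: {
          TableName: 'Table2',
          Key: { IdTwo: idTwo },
          // "Timestamp" is a DynamoDB reserved word, so it is aliased.
          ConditionExpression: '#ts < :now',
          ExpressionAttributeNames: { '#ts': 'Timestamp' },
          ExpressionAttributeValues: { ':now': now }
        }
      },
      {
        Put: {
          TableName: 'Table1',
          Item: { Id: id, IdTwo: idTwo }
        }
      }
    ]
  };
}

// Returns true when the transaction committed, false when it was cancelled
// (for example because the condition check did not pass).
async function putIfTimestampLower(client, id, idTwo, now) {
  try {
    await client.transactWrite(buildGuardedPutParams(id, idTwo, now)).promise();
    return true;
  } catch (err) {
    if (err.code === 'TransactionCanceledException') {
      return false;
    }
    throw err;
  }
}
```

When this returns false, the caller still has to find out whether the Table2 item is missing or its timestamp is too high, as the answer describes. Condition expressions also accept OR, so a single check such as `attribute_not_exists(IdTwo) OR #ts < :now` may cover both cases; that variant is not what the answer describes and is untested here.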

Put in table if item does not exist in DynamoDB

I want to add an item only if it does not exist. I am not sure how to do it. Currently I am adding successfully without checking the condition (it adds regardless if the item exists). The code is:
const params = {
  TableName: MY_TABLE,
  Item: myItem
};
documentClient.put(params).promise()
  .catch(err => { Logger.error(`DB Error: put in table failed: ${err}`) });
What do I need to add in order to make the code check if the item exists and if it does, just return?
Note: I do not want to use the database mapper. I want the code to be written using the AWS.DynamoDB class.
DynamoDB supports Conditional Writes, which let you define a check that must succeed for the item to be inserted or updated.
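A minimal sketch of such a conditional write, assuming the same SDK v2 style `documentClient` as in the question. The helper names are hypothetical, and `partitionKeyName` stands for whatever your table's partition key attribute is called.

```javascript
// Builds put params that only succeed when no item with the same key exists.
function buildPutIfAbsentParams(tableName, item, partitionKeyName) {
  return {
    TableName: tableName,
    Item: item,
    // The condition is evaluated against the existing item with the same key;
    // if there is none, the key attribute "does not exist" and the put proceeds.
    ConditionExpression: 'attribute_not_exists(#pk)',
    ExpressionAttributeNames: { '#pk': partitionKeyName }
  };
}

// Returns true if the item was written, false if it already existed.
async function putIfAbsent(documentClient, tableName, item, partitionKeyName) {
  try {
    const params = buildPutIfAbsentParams(tableName, item, partitionKeyName);
    await documentClient.put(params).promise();
    return true;
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') {
      return false; // item already exists, just return
    }
    throw err;
  }
}
```

Because the check and the write happen in one request, there is no window between a separate "get" and "put".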
DynamoDB is not an SQL database (hopefully you know this...) and does not offer the full set of ACID guarantees. One of the things you can't do with it is an atomic "check and write" - all you can do is fetch the item, see if it exists and then put the item. However, if some other process writes the item to the table between your "get" and your "write", you won't know anything about it.
If you absolutely need this kind of behaviour, DynamoDB is not the right solution.

Graceful handling of ConditionalCheckFailedException with DynamoDB

When using ConditionExpression in a DynamoDB request, and the condition is not met, the entire request will fail. I am using conditional updates, and the fact that ConditionalCheckFailedException doesn't contain any information about which condition failed is giving me a hard time.
For example consider this scenario: There's an item in a table like this:
{
  state: 'ONGOING',
  foo: 'FOO',
  bar: 'BAR'
}
I then want to update this item, changing both foo and state:
ExpressionAttributeValues: {
  ':STATE_FINISHED': 'FINISHED',
  ':FOO': 'NEW FOO'
},
UpdateExpression: 'SET state=:STATE_FINISHED, foo=:FOO',
However, my application has a logical transition order of states, and to prevent concurrency issues where two requests concurrently modify an item and causing an inconsistent state, I add a condition to make sure only valid transitions of state are accepted:
ExpressionAttributeValues: {
  ':STATE_ONGOING': 'ONGOING'
},
ConditionExpression: 'state = :STATE_ONGOING'
This e.g. prevents two concurrent requests from modifying state into FINISHED and CANCELLED at the same time.
This is all fine when there's only one condition; if the request fails I know it was because an invalid state transition and I can choose whether to just fail the request, or to make a new request that only modifies FOO, whatever makes sense in my application. But if I have multiple conditions in one request, it seems impossible to find out which particular condition failed, which means I need to fail the entire request or divide it into multiple separate requests, updating one conditional value at a time. This can however raise new concurrency issues.
Has anyone found a decent solution to a similar problem?
Ideally what I'd want is to be able to make a UpdateExpression that modifies a certain attribute conditionally, otherwise ignoring it, or by using a custom function that returns the new value based on the old value and the suggested updated value, similar to an SQL UPDATE with an embedded SELECT .. CASE .... Is anything like this possible?
Or, is it at least possible to get more information out of a ConditionalCheckFailedException (such as which particular condition failed)?
As you mentioned, DynamoDB doesn't provide a granular error message when a ConditionExpression contains multiple conditions. I am not addressing that part of the question in my answer.
I would like to address the second part i.e. returning the old/new value.
The ReturnValues parameter can be used to get the values you need. Set it to one of the following:
ReturnValues: NONE | ALL_OLD | UPDATED_OLD | ALL_NEW | UPDATED_NEW
New value - should already be available
Old value - use either UPDATED_OLD or ALL_OLD
ALL_OLD - Returns all of the attributes of the item, as they appeared before the UpdateItem operation.
UPDATED_OLD - Returns only the updated attributes, as they appeared before the UpdateItem operation.
ALL_NEW - Returns all of the attributes of the item, as they appear after the UpdateItem operation.
UPDATED_NEW - Returns only the updated attributes, as they appear after the UpdateItem operation.
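To illustrate, here is a sketch of the question's state-transition update with ReturnValues set to UPDATED_OLD. The function names, table name and key are hypothetical. `state` is aliased because it is a DynamoDB reserved word, and the response shape (`Attributes`) is the one documented for UpdateItem.

```javascript
// Builds an update that moves state from ONGOING to FINISHED, sets foo,
// and asks DynamoDB to return the previous values of the updated attributes.
function buildFinishParams(tableName, key, newFoo) {
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: 'SET #state = :finished, foo = :foo',
    ConditionExpression: '#state = :ongoing',
    ExpressionAttributeNames: { '#state': 'state' },
    ExpressionAttributeValues: {
      ':finished': 'FINISHED',
      ':ongoing': 'ONGOING',
      ':foo': newFoo
    },
    ReturnValues: 'UPDATED_OLD'
  };
}

// Pulls the old values out of an UpdateItem response; Attributes is absent
// when ReturnValues is NONE, so fall back to an empty object.
function previousValues(response) {
  return response.Attributes || {};
}
```

With a DocumentClient this would be called as `previousValues(await client.update(buildFinishParams('MyTable', key, 'NEW FOO')).promise())`. ReturnValues only applies when the update succeeds. The UpdateItem API reference also documents a ReturnValuesOnConditionCheckFailure parameter for getting the item back when the condition fails; whether and how your SDK version exposes it is worth checking.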

Can I check if value was changed in prePersist()?

I want to automatically set one of the entity fields if it was not manually set. Is there a way to check this? The field has a default value, so I cannot simply compare the value. I was wondering whether Doctrine tracks if a value has changed and whether I can access that information.
Also, is prePersist in Doctrine 2 the equivalent of preInsert in Doctrine 1? How can I make sure I only run code on create?
Thanks
The prePersist event (docs about prePersist) is triggered when you call Doctrine\ORM\EntityManager#persist on an entity.
If you need to check for changes to an entity, I suggest using the onFlush event (docs about onFlush). There you can obtain any changes you have applied to the entity using the Doctrine\ORM\UnitOfWork API. Change tracking on an entity starts after calling Doctrine\ORM\EntityManager#persist.

How to handle expired items?

My site allows users to post things on the site with an expiry date. Once the item has expired, it will no longer be displayed in the listings. Posts can also be closed, canceled, or completed. I think it would be nicest just to be able to check for one attribute or status ("is active") rather than having to check for [is not expired, is not completed, is not closed, is not canceled]. Handling the rest of those is easy because I can just have one "status" field which is essentially an enum, but AFAIK, it's impossible to set the status to "expired" as soon as that time occurs. How do people typically handle this?
Edit: I'm not asking how to write a query to find expired items; I'm asking how I can find the "active" (unexpired items that meet a few other boolean conditions) without having to use a big nasty query every time I want to find them.
I think this could be managed with a cron job and a custom Django management command; it's just an idea.
Make the item have birth and death (type:date) columns and a status column (completed, removed, to be expired...).
Update/fill the death column when you want to logically end the lifecycle of an item (for whatever reason: expiry, completed, ...). Update the status column accordingly.
Querying for active items (in pseudo-SQL):
select * from mytable where birth <= todays_date and (death is null or todays_date <= death)
It sounds like the expiration date/time is the field that you need to actually store and base decisions on. IsActive sounds like something you would calculate on the fly based on the expiration date and possibly other fields (even though it's a pain).
IsActive as a field would probably work better if it wasn't a product of some other information like the expiration date, but was valid on its own, such as if a user manually set the status to "active" or "not active".
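Computing "is active" on the fly can be kept in one small helper so the check is written only once. The sketch below is in JavaScript for consistency with the other snippets on this page; the field names `status` and `expiresAt` and the status value 'open' are assumptions, not something from the question.

```javascript
// Returns true when the post is still open and has not passed its expiry time.
// `item.expiresAt` is assumed to be a Date, or null/undefined for "never expires".
function isActive(item, now = new Date()) {
  const notExpired = item.expiresAt == null || now < item.expiresAt;
  return item.status === 'open' && notExpired;
}
```

For example, `isActive({ status: 'open', expiresAt: new Date('2030-01-01') }, new Date('2029-06-01'))` evaluates to true, while the same item with status 'closed' evaluates to false.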