Waiting for a table to be completely deleted - amazon-web-services

I have a table that has to be refreshed daily from an external source. All the recommendations I read say to delete the whole table and re-create it instead of deleting all the items.
I tried the suggested method, but the deleteTable function returns success even though the table is still in the "Table is being deleted" state, as seen in the DynamoDB console. Sometimes this takes more than a minute.
What is the proper way of deleting and re-creating a table? Should I just keep trying createTable until the already exists error goes away?
I am using Node.js.
(The table is a list of some 5,000+ bus stops. The source doesn't say how often the data changes, nor does it give any indication that something has changed. I've found a small number of changes once every few weeks.)

If you are using boto3 (Python), there is a waiter called TableNotExists:
Polls DynamoDB.Client.describe_table() every 20 seconds until a successful state is reached. An error is returned after 25 failed checks.
Or, you could just do that polling yourself.
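For example, a minimal boto3 sketch of the delete/re-create cycle (the table name and key schema here are assumptions for illustration):
import boto3

client = boto3.client("dynamodb")

client.delete_table(TableName="bus_stops")
# Block until describe_table reports the table is gone
# (the waiter polls every 20 seconds, up to 25 attempts).
client.get_waiter("table_not_exists").wait(TableName="bus_stops")

client.create_table(
    TableName="bus_stops",
    AttributeDefinitions=[{"AttributeName": "stopId", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "stopId", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)
# Wait until the new table is ACTIVE before writing to it.
client.get_waiter("table_exists").wait(TableName="bus_stops")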

I would suggest changing the table name each day, using the current date as part of the table name. Then you can create the new table and start populating it without having to wait for the delete of the previous day's table to complete.
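A rough boto3 sketch of that idea (the table-name scheme and the create_and_load helper are assumptions for illustration):
import boto3
from datetime import date, timedelta

client = boto3.client("dynamodb")
today_table = f"bus_stops_{date.today():%Y%m%d}"                        # e.g. bus_stops_20250301
yesterday_table = f"bus_stops_{date.today() - timedelta(days=1):%Y%m%d}"

create_and_load(today_table)  # hypothetical helper: create the table and import the feed
# Point readers at today_table, then drop yesterday's table; there is no need
# to wait for the delete to finish because nothing reads that table any more.
client.delete_table(TableName=yesterday_table)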

If the createTable call fails with a Table already exists exception, the error object also contains a numeric retryDelay property.
I can't find documentation for retryDelay, but it appears to be a duration in seconds.
I use the Table already exists exception as a signal that the table has not been completely deleted yet, and when it occurs I back off for the period given in retryDelay before trying again. After a few iterations, the table is created successfully.
Sometimes the value of retryDelay is more than 20.
This approach has worked without issues for me every time.
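retryDelay is specific to error objects from the AWS SDK for JavaScript. For comparison, a rough boto3 analogue of the same back-off-and-retry loop (boto3 raises ResourceInUseException and gives no delay hint, so the back-off value is a guess; table name and schema are assumptions):
import time
import boto3

client = boto3.client("dynamodb")

while True:
    try:
        client.create_table(
            TableName="bus_stops",
            AttributeDefinitions=[{"AttributeName": "stopId", "AttributeType": "S"}],
            KeySchema=[{"AttributeName": "stopId", "KeyType": "HASH"}],
            BillingMode="PAY_PER_REQUEST",
        )
        break
    except client.exceptions.ResourceInUseException:
        # The old table is still being deleted; wait and try again.
        time.sleep(20)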

Related

Trying to find how a BigQuery table was deleted by searching the audit log

A BigQuery table was accidentally deleted. Fortunately, we sink all our BQ audit logs into a dataset.
But I'm seeing some unexpected results. I wasn't seeing any delete operations for the table, so I broadened the scope of the query and found no operations at all for the table in the last 90 days.
I want to confirm my query is doing what I think it does. If this returns nothing does it really mean this table has not been touched in the last 90 days?
WHERE DATE(timestamp) > timestamp_add(current_datetime, interval -90 day) AND
resource.labels.project_id = "myproject" AND
resource.type='bigquery_resource' AND
protopayload_auditlog.resourceName LIKE '%MyTable%'
LIMIT 10
I should add that if I swap MyTable for another table in the above query, I do get results, so I don't think it's a syntax issue.
Thinking about this more: could it be that the table was truncated in a way that was not considered an "admin" operation?
We sink the following logs into the dataset I'm searching:
cloudaudit_googleapis_com_activity
cloudaudit_googleapis_data_access
cloudaudit_googleapis_system_event
Syntax looks OK. I recommend trying larger intervals to confirm the query behaves as expected. Assuming the table is not empty, increasing the number of days should eventually return something.
I was right: it was a different operation that removed the table. I found it in the system event log; the table was removed due to an InternalTableExpired event.
SELECT
resource.labels.project_id,
protopayload_auditlog.resourceName,
protopayload_auditlog.methodName,
protopayload_auditlog.authenticationInfo.principalEmail,
protopayload_auditlog.requestMetadata.callerIp
FROM
`bombora-bi-prod.BomboraAuditLogs.cloudaudit_googleapis_com_system_event`
WHERE
protopayload_auditlog.resourceName LIKE '%datasets/MyDataset/tables/MyTable%'
LIMIT 100

Work around DynamoDb Transaction Conflict Exception

My model represents users with unique names. To achieve that, I store the user and its name as two separate items using TransactWriteItems. The approximate structure looks like this:
PK | data
--------------------------------
userId#<userId> | {user data}
userName#<userName> | {userId: <userId>}
Data arrives at a Lambda from a Kinesis stream. If one Lambda invocation processes an "insert" event and another invocation comes in at about the same time (the difference can be as little as 5 milliseconds), the "update" event fails with a TransactionConflictException: Transaction is ongoing for the item error.
Should I just re-try to run update again in a second or so? I couldn't really find a resolution strategy.
That implies you’re getting data about the same user in quick succession and both writes are hitting the same items. One succeeds while the other exceptions out.
Is it always duplicate data? If you’re sure it is, then you can ignore the second write. It would be a no-op.
Is it different data? Then you've got to decide how to handle that conflict. You'll have one dataset in the database and a different dataset live in your code. That's a business logic question, not a database question.
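For reference, a rough boto3 sketch of the two-item write described in the question (the table and attribute names are assumptions for illustration):
import boto3

client = boto3.client("dynamodb")

def create_user(user_id, user_name, data):
    # Write the user item and the name-reservation item atomically; the
    # condition expressions make the transaction fail if either key already exists.
    client.transact_write_items(
        TransactItems=[
            {
                "Put": {
                    "TableName": "users",
                    "Item": {"PK": {"S": f"userId#{user_id}"}, "data": {"S": data}},
                    "ConditionExpression": "attribute_not_exists(PK)",
                }
            },
            {
                "Put": {
                    "TableName": "users",
                    "Item": {"PK": {"S": f"userName#{user_name}"}, "userId": {"S": user_id}},
                    "ConditionExpression": "attribute_not_exists(PK)",
                }
            },
        ]
    )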

Bigtable's (Golang) admin client is erroring with "A DropRowRange operation is already ongoing"

When a user deletes a resource, I'd like to iterate through a few relevant rowRanges in Bigtable, and delete them. I have 3+ calls to admin.DropRowRange(ctx, table, rowKeyPrefix). Some of the calls are applied to the same table; each call is applied to a different rowRange.
The majority of the time this has worked. However, now I've received an error "a DropRowRange operation is already ongoing."
I haven't found this error documented anywhere, and I haven't found anyone else report it.
Are there constraints for how frequently I can call the function? Are the constraints general or for a given table or for a given rowRange? Are there any recommended workarounds?
It feels like the error could be due to the same operation getting retried twice, and possibly the DropRowRange operation isn't idempotent. Is this it?
The DropRowRange operation is used to permanently drop/delete a row range from a specified table. At the moment it can only be called once at a time, which is the limit you're hitting.
We are, however, working on allowing this method to be called in parallel, but there is no ETA for this yet. You can follow this public issue to get updates on the progress.

DynamoDB Optimistic Locking Behavior during Save Action

Scenario: We have a DynamoDB table supporting optimistic locking with a version number. Two concurrent threads are trying to save two different entries with the same primary key value to that table.
Question: Will ConditionalCheckFailedException be thrown for the latter save action?
Yes, the second thread that tries to insert the same data will throw a ConditionalCheckFailedException.
com.amazonaws.services.dynamodbv2.model.ConditionalCheckFailedException
As soon as the item is saved in the database, subsequent updates must have a version matching the value in the DynamoDB table (i.e. the server-side value).
save — For a new item, the DynamoDBMapper assigns an initial version number 1. If you retrieve an item, update one or more of its properties, and attempt to save the changes, the save operation succeeds only if the version number on the client side and the server side match. The DynamoDBMapper increments the version number automatically.
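Under the hood the mapper turns this into a conditional write. A rough boto3 sketch of the same idea (the table and attribute names are assumptions for illustration):
import boto3

client = boto3.client("dynamodb")

def save_new(key, data):
    # First save: fails with ConditionalCheckFailedException if the key already exists.
    client.put_item(
        TableName="users",
        Item={"PK": {"S": key}, "version": {"N": "1"}, "data": {"S": data}},
        ConditionExpression="attribute_not_exists(PK)",
    )

def save_update(key, data, expected_version):
    # Update: fails with ConditionalCheckFailedException if another writer
    # has already bumped the version since this client read the item.
    client.put_item(
        TableName="users",
        Item={"PK": {"S": key}, "version": {"N": str(expected_version + 1)}, "data": {"S": data}},
        ConditionExpression="version = :v",
        ExpressionAttributeValues={":v": {"N": str(expected_version)}},
    )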
We had a similar use case in the past, but in our case multiple threads were first reading from DynamoDB and then trying to update the values.
So the version may already have changed by the time a thread reads and then tries to update the document, and if you don't re-read the latest value from DynamoDB the intermediate update will be lost (known as the lost-update problem; see the AWS docs for more info).
I'm not sure whether this is your use case, but if you simply have two threads trying to update the value and one of them carries a different version by the time its request reaches DynamoDB, it will get a ConditionalCheckFailedException.
More info about this error can be found here http://grepcode.com/file/repo1.maven.org/maven2/com.michelboudreau/alternator/0.10.0/com/amazonaws/services/dynamodb/model/ConditionalCheckFailedException.java

Database polling, prevent duplicate fetches

I have a system in which a central MSSQL database keeps a queue of jobs to be done in a table.
Because the processing requirements are not that high and requests are not particularly frequent (probably once every few seconds at most), we decided to have the applications that use the queue simply query the database whenever a job is needed; there is no message queue service at this time.
A single fetch is performed by having the client application run a stored procedure, which performs the query(ies) involved and returns a job ID. The client application then fetches the job information by querying by ID and sets the job as handled.
Performance is fine; the only snag we have hit is that, because the client application has to query for the details and perform a check before the job is marked as handled, on very rare occasions (once every few thousand jobs) two clients pick up the same job.
As a way of solving this problem, I was considering having the initial stored procedure "tag" the record it pulls with the current date and time. When querying for records, the stored procedure would only pull records where this tag is a certain amount of time, say 5 seconds, in the past. That way, if the stored procedure runs twice within 5 seconds, the second instance will not pick up the same job.
Can anyone foresee any problems with fixing the problem this way or offer an alternative solution?
Use a UNIQUEIDENTIFIER field as your marker. When the stored procedure runs, lock the row you're reading and update the field with a NEWID(). You can mark your polling statement using something like WITH(READPAST) if you're worried about deadlocking issues.
The reason to use a GUID here is to have a unique identifier that will serve to mark a batch. Your NEWID() call is guaranteed to give you a unique value, which will be used to prevent you from accidentally picking up the same data twice. GETDATE() wouldn't work here because you could end up having two calls that resolve to the same time; BIT wouldn't work because it wouldn't uniquely mark off batches for picking up or reporting.
For example,
declare @ReadID uniqueidentifier
declare @BatchSize int = 20; -- make this a parameter to your procedure
set @ReadID = NEWID();

-- Claim a batch: stamp up to @BatchSize unread rows with this batch's GUID.
UPDATE tbl WITH (ROWLOCK)
SET HasBeenRead = @ReadID -- your UNIQUEIDENTIFIER field
FROM (
    SELECT TOP (@BatchSize) Id
    FROM tbl WITH (UPDLOCK, ROWLOCK, READPAST)
    WHERE HasBeenRead IS NULL
    ORDER BY [Id]
) AS t1
WHERE tbl.Id = t1.Id

-- Return the rows claimed by this batch.
SELECT Id, OtherCol, OtherCol2
FROM tbl WITH (UPDLOCK, ROWLOCK, READPAST)
WHERE HasBeenRead = @ReadID
And then you can use a polling statement like
SELECT COUNT(*) FROM tbl WITH(READPAST) WHERE HasBeenRead IS NULL
Adapted from here: https://msdn.microsoft.com/en-us/library/cc507804%28v=bts.10%29.aspx