Use DynamoDBVersionAttribute when creating a new DynamoDB Table in Java

I'm trying to add a DynamoDBVersionAttribute to incorporate optimistic locking when accessing/updating items in a DynamoDB table. However, I'm unable to figure out how exactly to add the version attribute.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBMapper.OptimisticLocking.html seems to state that using it as an annotation in the class that creates the table is the way to go. However, our codebase is creating new tables in a format similar to this:
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);

List<AttributeDefinition> attributeDefinitions = new ArrayList<AttributeDefinition>();
attributeDefinitions.add(new AttributeDefinition().withAttributeName("Id").withAttributeType("N"));

List<KeySchemaElement> keySchema = new ArrayList<KeySchemaElement>();
keySchema.add(new KeySchemaElement().withAttributeName("Id").withKeyType(KeyType.HASH));

CreateTableRequest request = new CreateTableRequest()
    .withTableName(tableName)
    .withKeySchema(keySchema)
    .withAttributeDefinitions(attributeDefinitions)
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits(5L)
        .withWriteCapacityUnits(6L));

Table table = dynamoDB.createTable(request);
I'm not able to find out how to add the version attribute through Java code like the above. It isn't an attribute definition, so I'm unsure where it goes. Any guidance as to where I can add this version attribute in the CreateTableRequest?

As far as I'm aware, the @DynamoDBVersionAttribute annotation for optimistic locking is only available for tables modeled for use with DynamoDBMapper. Using DynamoDBMapper is not a terrible approach, since it effectively gives you an ORM for CRUD operations on DynamoDB items. Note that the version attribute never appears in the CreateTableRequest at all: outside of the key schema, DynamoDB is schemaless, so the version number is just an ordinary numeric attribute that gets written and checked at request time.
But if your existing codebase can't make use of DynamoDBMapper, your next best bet is probably conditional writes: increment a version number only if it is equal to the value you expect (i.e. roll your own optimistic locking). Unfortunately, you would need to add the increment and the condition to every write you want to be optimistically locked.
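A minimal sketch of such a hand-rolled conditional write with the low-level client might look like the following (the table name, key, and the payload/version attribute names are illustrative, not from the original post):

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ConditionalCheckFailedException;
import com.amazonaws.services.dynamodbv2.model.UpdateItemRequest;
import java.util.Map;

public class ManualOptimisticLock {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();

        long expectedVersion = 3L; // the version we saw when we last read the item

        UpdateItemRequest update = new UpdateItemRequest()
            .withTableName("my-table")
            .withKey(Map.of("Id", new AttributeValue().withN("42")))
            // #v aliases the attribute name in case it collides with a reserved word
            .withUpdateExpression("SET payload = :p, #v = :next")
            .withConditionExpression("#v = :expected")
            .withExpressionAttributeNames(Map.of("#v", "version"))
            .withExpressionAttributeValues(Map.of(
                ":p", new AttributeValue().withS("new data"),
                ":next", new AttributeValue().withN(Long.toString(expectedVersion + 1)),
                ":expected", new AttributeValue().withN(Long.toString(expectedVersion))));

        try {
            client.updateItem(update);
        } catch (ConditionalCheckFailedException e) {
            // Someone else bumped the version since our read: re-read and retry.
        }
    }
}

The ConditionExpression makes DynamoDB reject the update atomically on the server side, which is essentially what @DynamoDBVersionAttribute does under the hood.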

Your code just creates a table. In order to use DynamoDBMapper to access that table, you need to create a class that represents it. For example, if your table is called Users, you would create a class called Users and use annotations to link it to the table.
You can keep your table-creation code, but you need to write the mapped class; you then do all of your loading, saving, and querying through DynamoDBMapper.
Once you have created the class, just give it a field called version and put the annotation on it; DynamoDBMapper will take care of the rest. A sketch follows.
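A minimal sketch of such a mapped class, assuming a Users table with a numeric Id hash key (the Name attribute is made up for illustration):

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBVersionAttribute;

@DynamoDBTable(tableName = "Users")
public class Users {

    private Long id;
    private String name;
    private Long version;

    @DynamoDBHashKey(attributeName = "Id")
    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    @DynamoDBAttribute(attributeName = "Name")
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    // DynamoDBMapper sets this on the first save and increments it on
    // every subsequent save; a stale value makes the write fail with
    // ConditionalCheckFailedException.
    @DynamoDBVersionAttribute
    public Long getVersion() { return version; }
    public void setVersion(Long version) { this.version = version; }
}

With that in place, new DynamoDBMapper(client).save(user) handles the version bookkeeping automatically.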

Related

Can we alter an AWS QLDB table?

Suppose I have created a table like this.
CREATE TABLE Vehicle
and inserted some documents into this table:
INSERT INTO Vehicle
<< {
    'VIN'  : '1N4AL11D75C109151',
    'Type' : 'Sedan'
} >>
My requirement is to change the table name from Vehicle to VehicleCar, and to change the field name 'VIN' to 'VID'.
How can I do that?
QLDB doesn't currently offer an ALTER TABLE capability. You'd have to DROP the table and re-create it. This counts against your table limits, so don't do it too often.
QLDB is schema-less, so you can change your field names and/or the structure of your documents anytime you want to, simply by writing new revisions to your documents in the new format. The journal will still contain the old revisions, however. If your application has any functionality that uses the history() function to access old revisions, then it needs to be able to gracefully handle variations in the document format.
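For example, migrating a single document from VIN to VID might look something like this sketch with the QLDB driver for Java (the ledger name is an assumption, and I'm assuming QLDB's UPDATE ... SET and FROM ... REMOVE PartiQL forms here; both statements run in one transaction, so the rename is atomic for that document):

import com.amazon.ion.IonSystem;
import com.amazon.ion.system.IonSystemBuilder;
import software.amazon.awssdk.services.qldbsession.QldbSessionClient;
import software.amazon.qldb.QldbDriver;

public class RenameVinField {
    public static void main(String[] args) {
        QldbDriver driver = QldbDriver.builder()
            .ledger("vehicle-registration") // hypothetical ledger name
            .sessionClientBuilder(QldbSessionClient.builder())
            .build();

        IonSystem ion = IonSystemBuilder.standard().build();
        String vin = "1N4AL11D75C109151";

        driver.execute(txn -> {
            // Copy the old field into the new one for this document...
            txn.execute("UPDATE Vehicle AS v SET v.VID = v.VIN WHERE v.VIN = ?",
                        ion.newString(vin));
            // ...then drop the old field from the same document.
            txn.execute("FROM Vehicle AS v WHERE v.VID = ? REMOVE v.VIN",
                        ion.newString(vin));
        });
    }
}

Note that the WHERE clause targets a single document by an indexed field, which matters because of the scanning caveat below.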
It is important to note that QLDB is not optimized for scanning large volumes of data; it's optimized for targeted queries against an index using an equality operator. A query like "SELECT * FROM table" scans the entire table. This is an anti-pattern for QLDB and will not perform well as your ledger grows. So if you change your document format, running a SELECT * and updating every document to the new format may be more work than you realize. First, that SELECT * scan may time out, or it may be aborted with an optimistic concurrency control (OCC) exception because another process inserted a document into the table. Second, you'd have to work in batches of 40 documents at a time because of the limit on the number of documents in a transaction.
All of this is to say that making your application resilient to schema changes is a good idea. :-)

DynamoDB: 1 big table or multiple small tables?

I'm currently facing some questions regarding my database design. I'm developing an API which lets users do the following:
Create an Account (1 User owns 1 Account)
Create a Profile (1 Account owns 1-n Profiles)
Let a Profile upload 2 types of Items (1 Profile owns 0-n Items; the Items differ in type and purpose)
Calling the API methods triggers AWS Lambda to perform the requested operations in the DynamoDB tables.
My current plan uses a separate table for each of these entities. It should be possible to query Items by specifying a time frame and the Profile ID. But I think my design completely defeats the purpose of DynamoDB: the AWS documentation says that a well-designed application typically requires only one table.
What would be a good way to realise this architecture in one table?
Are there any drawbacks on using the current design?
What would you specify as the partition key, sort key, and secondary indexes in both the current design and a one-table approach?
I’m going to give this answer assuming that you need to be able to do the following queries.
Given an Account, find all profiles
Given a Profile, find all Items
Given a Profile and a specific ItemType, find all Items
Given an Item, find the owning Profile
Given a Profile, find the owning account
One of the beauties of DynamoDB (and also a bane, perhaps) is that it is mostly schema-less. You need to have the mandatory Primary Key attributes for every item in the table, but all of the other attributes can be anything you like. In order to have a DynamoDB design with only one table, you usually need to get used to the idea of having mixed types of objects in the same table.
That being said, here’s a possible schema for your use case. My suggestion assumes that you are using something like UUIDs for your identifiers.
The partition key is a field that is simply called pkey (or whatever you want). We’ll also call the sort key skey (but again, it doesn’t really matter). Now, for an Account, the value of pkey is Account-{{uuid}} and the value of skey would be the same. For a Profile, the pkey value is also Account-{{uuid}}, but the skey value is Profile-{{uuid}}. Finally, for an Item, the pkey is Profile-{{uuid}} and the skey is Item-{{type}}-{{uuid}}. For all of the attributes of an item, don’t worry about it, just use whatever attributes you want to use.
Since the “parent” object is always the partition key, you can get any of the “child” objects simply by querying for the ID of the parent. For example, your key condition expression to get all the ‘ItemType2’s for a Profile would be
pkey = "Profile-{{uuid}}" AND begins_with(skey, "Item-Type2")
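As a concrete (hypothetical) example, that query in low-level Java might look like this; the table name AppData and the UUID are invented for the sketch:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.QueryRequest;
import com.amazonaws.services.dynamodbv2.model.QueryResult;
import java.util.Map;

public class AdjacencyListQuery {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();

        String profileUuid = "123e4567-e89b-12d3-a456-426614174000"; // example id

        QueryRequest query = new QueryRequest()
            .withTableName("AppData") // hypothetical single-table name
            .withKeyConditionExpression("pkey = :p AND begins_with(skey, :prefix)")
            .withExpressionAttributeValues(Map.of(
                ":p", new AttributeValue("Profile-" + profileUuid),
                ":prefix", new AttributeValue("Item-Type2")));

        // Every matching item is an ItemType2 belonging to this Profile.
        QueryResult result = client.query(query);
        result.getItems().forEach(System.out::println);
    }
}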
In this schema, your GSI has the same keys as the table, but reversed. You can query the GSI for ‘Item-{{type}}-{{uuid}}’ to get the owning Profile, and similarly for ‘Profile-{{uuid}}’ to get the owning Account.
What I have illustrated here is the adjacency list pattern. DynamoDB also has an article describing how to use composite sort keys for hierarchical data, which would also be suitable for your data, and depending on your expected queries, it might be more suitable than using the adjacency list.
You don’t have to put everything in a single table. Yes, DynamoDB recommends it, but it is far more important to make sure that your application is correct and maintainable. If having multiple tables means it’s easier to write a defect free application, then use multiple tables.

Using requests or straight DynamoDB object to retrieve items from table?

I'm currently working with the DynamoDB AWS API and just found two different ways of doing apparently the same thing, and I was wondering whether there is any performance difference or other benefit to using one instead of the other.
My current scenario is restricted to just one table, so I will only be manipulating that one without going any further.
Is there any benefit to using this (which I think is simpler in my scenario than forming a request every single time I want to check whether a key exists, and which lets me keep a fixed Table object to reuse whenever I need it)...
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("my-table");
Item item = table.getItem("Artist", "sample1", "SongTitle", "sample2"); // hash key, then range key
... with this one?
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();

HashMap<String, AttributeValue> key = new HashMap<String, AttributeValue>();
key.put("Artist", new AttributeValue().withS("sample1"));
key.put("SongTitle", new AttributeValue().withS("sample2"));

GetItemRequest request = new GetItemRequest()
    .withTableName("my-table")
    .withKey(key);
GetItemResult result = client.getItem(request);
I've taken the code from the actual examples on the AWS website:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.SDKs.Interfaces.LowLevel.html
https://docs.aws.amazon.com/en_en/amazondynamodb/latest/developerguide/JavaDocumentAPIWorkingWithTables.html#JavaDocumentAPIListTables
Table is more or less a lightweight wrapper around the DynamoDB client object. (You can see this by looking at its source code on GitHub.) You can use whichever one you think makes your code the most readable and maintainable.

DynamoDB Concurrency Issue

I'm building a system in which many DynamoDB (NoSQL) tables all contain data, and data in one table references data in another.
Multiple processes access the same item in a table at the same time. I want to ensure that each process works with up-to-date data and that the processes aren't updating the item at the exact same time with different data.
I would love some suggestions on this as I am stuck right now and don't know what to do. Thanks in advance!
Optimistic locking is a strategy to ensure that the client-side item that you are updating (or deleting) is the same as the item in Amazon DynamoDB. If you use this strategy, your database writes are protected from being overwritten by the writes of others, and vice versa.
With optimistic locking, each item has an attribute that acts as a version number. If you retrieve an item from a table, the application records the version number of that item. You can update the item, but only if the version number on the server side has not changed. If there is a version mismatch, it means that someone else has modified the item before you did. The update attempt fails, because you have a stale version of the item. If this happens, you simply try again by retrieving the item and then trying to update it. Optimistic locking prevents you from accidentally overwriting changes that were made by others. It also prevents others from accidentally overwriting your changes.
To support optimistic locking, the AWS SDK for Java provides the @DynamoDBVersionAttribute annotation. In the mapping class for your table, you designate one property to store the version number, and mark it using this annotation. When you save an object, the corresponding item in the DynamoDB table will have an attribute that stores the version number. The DynamoDBMapper assigns a version number when you first save the object, and it automatically increments the version number each time you update the item. Your update or delete requests succeed only if the client-side object version matches the corresponding version number of the item in the DynamoDB table.
ConditionalCheckFailedException is thrown if:
You use optimistic locking with @DynamoDBVersionAttribute and the version value on the server differs from the value on the client side.
You specify your own conditional constraints while saving data by using DynamoDBMapper with DynamoDBSaveExpression, and those constraints fail.
Note
DynamoDB global tables use a “last writer wins” reconciliation between concurrent updates: the most recent write propagates to all regions. If you use global tables, the locking strategy therefore does not work as expected.
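Tying this together, a save with DynamoDBMapper plus a simple retry on a version conflict might look like the sketch below; the Users class and its key are the hypothetical ones from the annotation example earlier on this page:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
import com.amazonaws.services.dynamodbv2.model.ConditionalCheckFailedException;

public class OptimisticRetry {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
        DynamoDBMapper mapper = new DynamoDBMapper(client);

        for (int attempt = 0; attempt < 3; attempt++) {
            // Re-read on every attempt so we always start from the latest version.
            Users user = mapper.load(Users.class, 42L); // hypothetical key
            user.setName("updated name");
            try {
                mapper.save(user); // fails if the stored version changed since load
                break;
            } catch (ConditionalCheckFailedException e) {
                // Another writer won the race; loop and try again with fresh data.
            }
        }
    }
}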

Django and Oracle nested table support

Can Django support Oracle nested tables, varrays, or collections in some manner? I'm asking just for completeness, as our project is reworking its data model and attempting to move away from an EAV organization, but I don't like creating a bucket-load of dependent supporting tables for each main entity.
e.g. (not the proper Oracle syntax, but it gets the idea across):
Events
    eventid
    report_id
    result_tuple(result_type_id, result_value)
    anomaly_tuple(anomaly_type_id, anomaly_value)
    contributing_factors_tuple(cf_type_id, cf_value)
    etc.
where there can be multiple rows of the tuples for one eventid.
Each of these tuples could, of course, exist as a separate table, but this seems more concise. If it's something Django can't do, or something I can't easily modify the model classes to do, then perhaps just having Django create the extra tables is the way to go.
--edit--
I note that django-hstore is doing something very similar to what I want to do, but using PostgreSQL's hstore capability. Maybe I can branch off of that for an Oracle nested table implementation. I dunno... I'm pretty new to Python and Django, so my reach may exceed my grasp in this case.
Querying a nested table gives you a cursor to traverse the tuples, one member of which is yet another cursor, so you can get at the rows of the nested table. A sketch of this is below.
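Django's ORM won't map a nested-table column for you, but the cursor-within-a-cursor behavior at the driver level can be illustrated with plain JDBC (the Events table and its result_tuple column come from the question's sketch; the connection details are placeholders, and Java is used here only to show the access pattern):

import java.sql.Array;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Struct;

public class NestedTableRead {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "pass"); // placeholders
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT eventid, result_tuple FROM Events");
             ResultSet rs = ps.executeQuery()) {

            while (rs.next()) {                  // outer cursor: one row per event
                long eventId = rs.getLong(1);
                Array nested = rs.getArray(2);   // the nested-table column
                try (ResultSet inner = nested.getResultSet()) { // inner cursor
                    while (inner.next()) {
                        // Column 1 is the element index, column 2 the element itself,
                        // which for an object type arrives as a java.sql.Struct.
                        Struct tuple = (Struct) inner.getObject(2);
                        Object[] attrs = tuple.getAttributes(); // (result_type_id, result_value)
                        System.out.println(eventId + " -> " + java.util.Arrays.toString(attrs));
                    }
                }
            }
        }
    }
}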