Data Model in DynamoDB - amazon-web-services

When building a DynamoDB table with AWS Mobile Hub, there is at some point an option to download the data model for the table. As far as I know, this option does not appear if you are not using Mobile Hub. So the question is: is there a way to get the data model for a table when not using Mobile Hub?

Just to clarify, DynamoDB doesn't have a full data model like an RDBMS. However, it does have the partition (hash) key, the sort (range) key (if defined), and all the index details.
You can get this information using the DescribeTable API, which returns its output in JSON format. See the DescribeTable documentation for more information.
Please note that non-key attributes are not included in the data model. This is a basic concept of NoSQL databases and the source of their flexibility compared to an RDBMS.
The item structure (non-key attributes) need not be defined when creating the table. In fact, DynamoDB doesn't allow you to define non-key attributes when creating the table.
The non-key attributes in one item need not be the same as those in another item.
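As a rough illustration of what DescribeTable gives you, here is a minimal sketch that reduces the "Table" part of the response to the keys and indexes. The helper names `summarize_table` and `describe` are made up for this example; `describe` assumes boto3 is installed and AWS credentials and a region are configured.

```python
def summarize_table(table):
    """Reduce the "Table" dict of a DescribeTable response to keys and indexes."""
    # AttributeDefinitions only lists attributes that are used in a key schema.
    types = {a["AttributeName"]: a["AttributeType"]
             for a in table.get("AttributeDefinitions", [])}

    def key_summary(key_schema):
        # KeyType is "HASH" (partition key) or "RANGE" (sort key).
        return {k["KeyType"]: {"name": k["AttributeName"],
                               "type": types.get(k["AttributeName"])}
                for k in key_schema}

    indexes = {}
    for kind in ("GlobalSecondaryIndexes", "LocalSecondaryIndexes"):
        for index in table.get(kind, []):
            indexes[index["IndexName"]] = key_summary(index["KeySchema"])

    return {
        "table": table["TableName"],
        "keys": key_summary(table["KeySchema"]),
        "indexes": indexes,
    }


def describe(table_name):
    # Assumption: boto3 is installed and credentials/region are configured.
    import boto3
    response = boto3.client("dynamodb").describe_table(TableName=table_name)
    return summarize_table(response["Table"])
```

Non-key attributes never show up in this output, for the reasons given above; to see them you would have to read actual items.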

Related

Update only one column in GCP datastore table

I want to update only one column in a GCP Datastore table.
For example: the table has columns id, name, descriptions, price, data.
I receive data to update only descriptions. I want to update only the descriptions column without reading the other data (I want to avoid a read before write).
Is it possible to update only one column in Datastore without first reading the data from Datastore?
If not, what other database in GCP allows this?
Cloud Datastore is a document database which stores entities, and there are no fixed columns or schema. Instead, each entity can have a different set of properties, which are similar to columns in a traditional relational database. Check this document for more information.
You cannot update specific properties of an entity. As this documentation says, you have to update the entire entity: to change a specific property, you need to retrieve the entire entity, modify the desired property, and then write the entire entity back to the database.
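The read-modify-write described above could look roughly like the sketch below. The function name `update_property` is made up for this example, and `client` is assumed to behave like `google.cloud.datastore.Client` (its `transaction()`, `get()` and `put()` methods).

```python
def update_property(client, key, name, value):
    """Read-modify-write of one property; the whole entity is written back."""
    # Assumption: `client` behaves like google.cloud.datastore.Client.
    with client.transaction():
        entity = client.get(key)      # the read cannot be skipped
        if entity is None:
            raise KeyError(f"no entity for key {key!r}")
        entity[name] = value          # change only the one property
        client.put(entity)            # write the full entity back
    return entity


# Hypothetical usage (kind "Product" and id 1234 are placeholders):
# from google.cloud import datastore
# client = datastore.Client()
# update_property(client, client.key("Product", 1234), "descriptions", "new text")
```

Doing the read and the write inside one transaction keeps a concurrent writer from being silently overwritten between the two steps.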

Should DynamoDB use a single-table design instead of a multiple-table design when the entities are not related

Let’s assume there are mainly 3 tables for the current database.
Pkey = partition key
Admin
-id(Pkey), username, email, createdAt,UpdatedAt
Banner
-id(Pkey), isActive, createdAt, caption
News
-id(Pkey), createdAt, isActive, title, message
None of the above tables is related to any other table, and more tables will be required in the future (I think most of those won't be related to other tables either).
According to the AWS documentation:
You should maintain as few tables as possible in a DynamoDB application.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html
So I was considering the need to combine these 3 tables into a single table.
Should I start to use a single table from now on, or keep using multiple tables for the database?
If using a single table, how should I design the table schema?
DynamoDB is a NoSQL database, hence you design your schema specifically to make the most common and important queries as fast and as inexpensive as possible. Your data structures are tailored to the specific requirements of your business use cases.
When designing a data model for your DynamoDB Table, you should start from the access patterns of your data that would in turn inform the relation (or lack thereof) among them.
Two interesting resources that would help you get started are From SQL to NoSQL and NoSQL Design for DynamoDB, both part of the AWS Developer Documentation of DynamoDB.
In your specific example, based on the questions you're trying to answer (i.e. use case & access patterns), you could either work with only the Partition Key or more likely, benefit from the usage of composite Sort Keys / Sort Key overloading as described in Best Practices for Using Sort Keys to Organize Data.
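To make the idea of generic, overloaded keys more concrete, here is a minimal sketch of how the three entity types from the question could share one table. Everything about the layout is an assumption made for illustration: the key names `PK`, `SK`, `GSI1PK`, `GSI1SK`, the `TYPE#id` key format, and the access pattern "list all items of one type ordered by creation time". The right design depends on your real access patterns.

```python
def make_item(entity_type, entity_id, created_at, **attributes):
    """Build one item for a hypothetical shared table with generic keys."""
    prefix = entity_type.upper()
    return {
        # Generic key names let every entity type share one key schema.
        "PK": f"{prefix}#{entity_id}",
        "SK": f"{prefix}#{entity_id}",
        # Hypothetical index keys: list one entity type ordered by creation time.
        "GSI1PK": prefix,
        "GSI1SK": created_at,
        "entityType": entity_type,
        "createdAt": created_at,
        **attributes,
    }


items = [
    make_item("Admin", "a1", "2023-01-05T10:00:00Z",
              username="root", email="root@example.com"),
    make_item("Banner", "b1", "2023-02-01T09:30:00Z",
              isActive=True, caption="Sale"),
    make_item("News", "n1", "2023-02-03T12:00:00Z",
              isActive=True, title="Hello", message="First post"),
]
```

Note that putting every item of one type under a single index partition key can concentrate traffic on one partition, so this shape only suits small or low-traffic entity types.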
Update, add example table design to get you started:

Automatically generate data documentation in the Redshift cluster

I am trying to automatically generate data documentation in the Redshift cluster for all the maintained data products, but I am having trouble doing so.
Is there a way to fetch/store metadata about tables/columns in redshift directly?
Is there also some automatic way to determine what are the unique keys in a Redshift table?
For example an ideal solution would be to have:
Table location (cluster, schema, etc.)
Table description (what is the table for)
Each column's description (what is each column for, data type, is it a key column, if so what type, etc.)
Column's distribution (min, max, median, mode, etc.)
Columns which together form a unique entry in the table
I fully understand that getting the descriptions automatically is pretty much impossible, but I couldn't find a way to store the descriptions in redshift directly, instead I'd have to use 3rd party solutions or generally a documentation outside of the SQL scripts, which I'm not a big fan of, due to the way the data products are built right now. Thus having a way to store each table's/column's description in redshift would be greatly appreciated.
Amazon Redshift has the ability to store a COMMENT on:
TABLE
COLUMN
CONSTRAINT
DATABASE
VIEW
You can use these comments to store descriptions. It might need a bit of table joining to access.
See: COMMENT - Amazon Redshift

Relational DB to Single Dynamo DB Table

I've read a lot of AWS documentation and watched re:Invent videos. They say that almost any set of relational tables can be stored in a single DynamoDB table (except in a few scenarios).
I have the schema below in a relational DB that I want to convert to a single DynamoDB table, but I'm scratching my head over how it should look.
My use cases are:
Get all attributes by product/item number
Get a specific attribute for a product/item number
Get all items/products by an attribute name and attribute value (e.g. get me all the items where size is 45)
Get attribute information by attribute name (e.g. get me details about the Color attribute)
Your use-cases are a better fit with a relational database rather than a NoSQL database.
A NoSQL database is excellent for storing and retrieving data based on a primary key. For example, "store record #12", or "retrieve record #12". The item that is stored is in JSON format and can contain a lot of information. DynamoDB can also provide predictable performance for such requests, making it ideal for speed-critical applications (eg retrieving user profiles in a popular web application).
However, NoSQL is not ideal if you wish to search for data such as "Get me all the Items where size is 45". You can achieve some of this by adding additional indexes, but it can become complex and is not as flexible as a relational database.
Yes, you can "store" relational tables in a NoSQL database, but you can't access them in the way you desire.
Your examples and your diagram would be better suited for a relational database. I would recommend Amazon RDS for MySQL or PostgreSQL.
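For completeness, the "additional indexes" route mentioned above could be sketched as follows. This only builds the arguments for a DynamoDB Query against a secondary index; the function name, the index, and the attribute names `attributeName` (index partition key) and `attributeValue` (index sort key) are all hypothetical, and the table would have to store one item per product/attribute pair for it to work.

```python
def items_by_attribute_query(table_name, index_name, attr_name, attr_value):
    """Build arguments for a DynamoDB Query against a secondary index."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        # Placeholders (#n, :n, ...) keep attribute names and values out of
        # the expression text itself.
        "KeyConditionExpression": "#n = :n AND #v = :v",
        "ExpressionAttributeNames": {"#n": "attributeName", "#v": "attributeValue"},
        "ExpressionAttributeValues": {
            ":n": {"S": attr_name},
            ":v": {"S": attr_value},
        },
    }


# Hypothetical usage, assuming boto3 and an existing table and index:
# import boto3
# params = items_by_attribute_query("ProductAttributes", "byAttribute", "size", "45")
# response = boto3.client("dynamodb").query(**params)
```

Each new search pattern needs its own index and item layout like this one, which is the complexity the answer warns about.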

Should I use a secondary index or separate ID lookup table in DynamoDB?

I'm migrating a database from MongoDB to DynamoDB and trying to understand best practices, especially around local secondary indexes and sort keys.
My application pulls in html data from the web, and loads the data into several tables/collections. At the time of extraction it gives each item an extracted_id, unique to the website it's pulled from. Before loading the items, it gives each item a UUID as its primary/partition key.
Problem: In order to avoid assigning different uuids to the same extracted_id I query the db to check if the entity has a preexisting entity_uuid.
Current Solution: Currently in mongodb, I have two sets of tables/collections. One for storing all items, and one for storing an entity's extracted_id(as key) / entity_uuid (as value) lookup table.
Better Solution?: As I move to DynamoDB, would it be better to create only one table with extracted_id as a local secondary index, so as not to store duplicate data? I'm unsure, as the docs say to use indexes sparingly. I don't use the extracted_id for anything other than providing items with their uuid for a given site.
Hopefully this makes sense, I'm new to AWS / DynamoDB and would appreciate any tips / better solutions to the ones mentioned.
Why not just make extracted_id the partition key of your new DynamoDB table and use a ConditionExpression attribute_not_exists(extracted_id) to prevent your application from writing duplicate entries?
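A minimal sketch of that suggestion, assuming a table whose partition key is `extracted_id` and a boto3 DynamoDB client passed in as `client`. The function name `get_or_create_uuid` and the `entity_uuid` attribute layout are assumptions for this example. Since extracted_id is only unique per website, the key value would probably need to include the site as well.

```python
import uuid


def get_or_create_uuid(client, table_name, extracted_id):
    """Return the UUID stored for extracted_id, creating it on first sight."""
    new_uuid = str(uuid.uuid4())
    try:
        # The write only succeeds if no item with this key exists yet.
        client.put_item(
            TableName=table_name,
            Item={
                "extracted_id": {"S": extracted_id},
                "entity_uuid": {"S": new_uuid},
            },
            ConditionExpression="attribute_not_exists(extracted_id)",
        )
        return new_uuid
    except client.exceptions.ConditionalCheckFailedException:
        # Someone already registered this extracted_id: read its UUID back.
        response = client.get_item(
            TableName=table_name,
            Key={"extracted_id": {"S": extracted_id}},
            ConsistentRead=True,
        )
        return response["Item"]["entity_uuid"]["S"]


# Hypothetical usage:
# import boto3
# get_or_create_uuid(boto3.client("dynamodb"), "entities", "example.com:42")
```

Because the condition is checked by DynamoDB as part of the write, two workers racing on the same extracted_id cannot both create a UUID, and no separate lookup table is needed.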