AWS DynamoDB Multi-Tenant Table Schema

Thanks in advance!
I want to create a SaaS solution on AWS for multiple tenants, each with multiple users.
Each user (for example Admin, Manager, Supervisor) has to upload data about the users in their department (e.g. Name, Surname, Email, Phone). Together, these user attributes are identified by a HashKey.
In short, I have to store the information of all users across multiple departments of multiple companies, including each user's HashKey.
How can this be done using DynamoDB? Can someone help me create a table schema?
The most common query pattern is: a tenant provides a HashKey and wants to fetch all of the user information for it, or only part of it by providing the HashKey and a list of fields.
Regards,

You can use a single table and "separate" each tenant's data using a DynamoDB index.
Create an index keyed on the tenant identifier (hash, id, or whatever you use) and query it to fetch that tenant's data when needed.
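A minimal sketch of that idea, assuming a hypothetical table named TenantUsers with a global secondary index TenantIndex whose partition key is tenantId and whose sort key is userHash (none of these names come from the question). The function only builds the keyword arguments for a DynamoDB Query request, so it runs without AWS credentials; with boto3 the result could be passed to the low-level client as client.query(**params).

```python
from typing import Any, Dict, Optional, Sequence

# Hypothetical names, chosen for illustration only.
TABLE_NAME = "TenantUsers"
TENANT_INDEX = "TenantIndex"


def build_user_query(tenant_id: str, user_hash: str,
                     fields: Optional[Sequence[str]] = None) -> Dict[str, Any]:
    """Build Query parameters for one user of one tenant.

    The tenant id is always part of the key condition, so the query
    cannot return items that belong to another tenant.
    """
    params: Dict[str, Any] = {
        "TableName": TABLE_NAME,
        "IndexName": TENANT_INDEX,
        "KeyConditionExpression": "tenantId = :t AND userHash = :h",
        "ExpressionAttributeValues": {
            ":t": {"S": tenant_id},
            ":h": {"S": user_hash},
        },
    }
    if fields:
        # Alias every attribute name so reserved words such as "Name" are safe.
        names = {f"#f{i}": field for i, field in enumerate(fields)}
        params["ProjectionExpression"] = ", ".join(names)
        params["ExpressionAttributeNames"] = names
    return params
```

Calling build_user_query with only a tenant id and a HashKey fetches the whole item; passing a list of fields covers the "part of it" case. Note that a projection on an index can only return attributes that were projected into that index.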

Related

How Do I Safely Partition DynamoDB Database to Protect User Data?

ex.
Let's say I'm trying to create an eCommerce platform with multiple sellers and I create a Table called Orders. The partition key will be storeID and the sort key will be orderNumber.
Stores can call API.get('Orders', {storeID}), which will return all the items with the partition key of storeID.
My application uses Amazon Cognito, and each user is assigned a username that is a uuid. My question is: can I use the uuid as the storeID in my DynamoDB table? The key assumption is that attackers won't be able to guess the uuid.
You should always validate access to resources server-side.
Not being able to guess a uuid isn't a safe assumption. Since you are using Amazon Cognito, there should be a way in your server code to get the logged-in user (the uuid). When you are making a query to DynamoDB, you shouldn't rely on a uuid passed by an HTTP query (client-side), but use instead the uuid value of the logged-in user.
The uuid could be used in a Global Secondary Index, so that you can quickly query the orders of a user.
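A minimal sketch of this advice, assuming the server has already validated the Cognito token and exposes its claims as a dictionary in which the sub claim holds the user's uuid (how the token is validated is outside this sketch). The function name and the shape of request_params are hypothetical.

```python
from typing import Any, Dict, Mapping


def orders_query_for_caller(verified_claims: Mapping[str, Any],
                            request_params: Mapping[str, Any]) -> Dict[str, Any]:
    """Build Query parameters for the caller's own orders.

    `verified_claims` is assumed to come from a token the server has
    already validated. Any storeID sent by the client in
    `request_params` is ignored on purpose.
    """
    store_id = verified_claims.get("sub")
    if not store_id:
        raise PermissionError("request is not authenticated")
    return {
        "TableName": "Orders",
        "KeyConditionExpression": "storeID = :s",
        "ExpressionAttributeValues": {":s": {"S": str(store_id)}},
    }
```

Because the partition key value is taken from the verified identity, guessing another store's uuid gives an attacker nothing to pass in.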
You can also use the UUID to store that same user's data in DynamoDB. When fetching, make sure each API reads only what it actually needs; for example, a store API should request only orders and not users. To tell these items apart, add an attribute called "entity_type" whose value is one of:
order
user
store
other resources
Then, when fetching, add entity_type as one of the conditions to filter out the unwanted items.
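As a rough sketch, the entity_type condition could be added as a filter on a Query. The table name AppData and the partition key ownerId are assumptions made for illustration; only entity_type comes from the answer above.

```python
from typing import Any, Dict

# "Other resources" would be added here as the application grows.
ALLOWED_ENTITY_TYPES = {"order", "user", "store"}


def query_by_entity_type(owner_id: str, entity_type: str) -> Dict[str, Any]:
    """Build Query parameters that return only one kind of item."""
    if entity_type not in ALLOWED_ENTITY_TYPES:
        raise ValueError(f"unknown entity_type: {entity_type}")
    return {
        "TableName": "AppData",  # hypothetical table name
        "KeyConditionExpression": "ownerId = :o",
        "FilterExpression": "entity_type = :e",
        "ExpressionAttributeValues": {
            ":o": {"S": owner_id},
            ":e": {"S": entity_type},
        },
    }
```

A filter expression is applied after the items are read, so the filtered-out items still consume read capacity. Putting the entity type into the sort key is a common alternative when that cost matters.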

Updating records in many to many relationship - single table pattern

In a single-table design on DynamoDB, when you have a many-to-many relationship, how would you make an update?
Let's say there is a user-account table with userId and accountId as the partition key and sort key, and a GSI with the keys reversed. A user can have many accounts and an account can have many users. What happens when the information on an account is updated?
Would you loop through all of that account's records and update each one individually?
For example, if a user updates an account's information, then all users linked to that account need the updated information. In SQL this would be easy because the accounts would be linked via a foreign key, but in NoSQL each row contains all the information about the account. What would be the best approach to keep the updated account data consistent across all users?
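To make the question concrete, here is a small sketch of the looping approach it describes, not a recommendation. The links argument stands in for the result of querying the reverse GSI by accountId, and the table name UserAccount and the attribute accountName are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def build_account_fanout_updates(links: Iterable[Dict[str, str]],
                                 account_id: str,
                                 new_name: str) -> List[Dict[str, Any]]:
    """Return one UpdateItem parameter set per user linked to the account."""
    updates = []
    for link in links:
        if link["accountId"] != account_id:
            continue
        updates.append({
            "TableName": "UserAccount",  # hypothetical table name
            "Key": {
                "userId": {"S": link["userId"]},
                "accountId": {"S": link["accountId"]},
            },
            "UpdateExpression": "SET accountName = :n",
            "ExpressionAttributeValues": {":n": {"S": new_name}},
        })
    return updates
```

The number of writes grows with the number of users linked to the account, which is the cost the question is asking about.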

How can I implement "hierarchical" permissions between DynamoDB objects in AWS AppSync with a GraphQL API?

I am building a project using AppSync and GraphQL to enable Restaurants to track orders. There are four DynamoDB tables (one for each of the following entities): Restaurants, Staff, Tables and Orders. Each Restaurant can have many members of Staff, who are each allocated to one or more Tables. Each Table can have many orders, but an order can only belong to one table (see the System Design diagram for a visualisation of these relationships).
Problem
My issue is that I need very fine-grained hierarchical access control, with 3 main concerns:
1. Staff belonging to one Restaurant must not be able to Create, Read, Update or Delete any entities belonging to other Restaurants.
2. All staff in a Restaurant can view all tables in the Restaurant. However, they can only view orders belonging to a table if they are allocated to that table (i.e. a StaffTableJoin object connecting that particular Staff member to that table exists) OR they are a Restaurant admin (see part 3).
3. A member of Staff who is a Restaurant Admin can view all orders belonging to any table in the restaurant.
A Cognito user is created for each member of staff, and their permissions should be assigned based on the relationships between entities in my DynamoDB tables.
Solutions Considered
I have visited the Authorization and Authentication page in the AWS docs to explore options for restricting permissions. So far, I have considered using COGNITO_USER_POOLS and AWS_LAMBDA authorization.
For the approach using COGNITO_USER_POOLS, I would create a Cognito User Group for each Restaurant. When new members of staff register, they are assigned to their restaurant's user group. I would then add a groupsCanAccess field to each entity in each table. My resolvers would check that the requesting user belongs to a group which is allowed to access each resource. However, this would only address concern 1, as all staff in a restaurant would then have the same permissions to access their restaurant's resources.
For the approach using AWS_LAMBDA, I am not too sure how this would work, but I considered creating an authorization Lambda which checks which restaurant the requesting user belongs to. For instance, if the user was requesting an Order, I would need to check which table the order belongs to, then check if a StaffTableJoin exists (connecting the requesting user to the table). This approach seems very difficult (maybe impossible).
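The check described for the Lambda approach can be written down as plain logic. This is a minimal in-memory sketch, not AppSync code: the Staff and Order classes are hypothetical, the join records are modelled as a set of (staff_id, table_id) pairs, and each order is assumed to carry its restaurant id directly so that no extra table lookup is needed.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Staff:
    staff_id: str
    restaurant_id: str
    is_admin: bool = False


@dataclass(frozen=True)
class Order:
    order_id: str
    table_id: str
    restaurant_id: str  # assumed to be stored on the order


def can_view_order(staff: Staff, order: Order,
                   staff_table_joins: Set[Tuple[str, str]]) -> bool:
    """Apply the three concerns in order of strictness."""
    # Concern 1: never cross restaurant boundaries.
    if staff.restaurant_id != order.restaurant_id:
        return False
    # Concern 3: restaurant admins see every order in their restaurant.
    if staff.is_admin:
        return True
    # Concern 2: other staff need a join record for the order's table.
    return (staff.staff_id, order.table_id) in staff_table_joins
```

In a real authorizer the join set would be replaced by a lookup against the join table, keyed on the staff member and the table.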
Any advice that could be offered is much appreciated, as I have been struggling with this for a long time. It seems like a common use case, where permissions are needed based on an object hierarchy. Thanks in advance :)

setting foreign key and primary key in gcp firestore

I am new to GCP and NoSQL.
Is it possible to have primary and foreign keys in GCP Firestore?
Example: I have two tables, STUDENT and DEPARTMENT.
The tables look like this:
Department-table
dept-id(primary key)
deptname
Student-table
dept-id(foreign key)
student-id
student name
Can anybody please help me design this in GCP Firestore?
To a database, a key is just like any UUID or random ID, and it can be shared and used between users, teams, admins, and businesses of all kinds. What matters is how the data is associated. Since Firestore is a NoSQL database, there are no direct relational references, so one key cannot be tied to another without secondary lookups.
In the same way you would define a user profile by an ID, you can create an empty document with a random ID to serve as the ID of a team or, in this case, the department. You can also use string combinations if you have a team and a sub-team: as long as you have access to the team/department ID at the point of the database request, you can use a regex to do a string comparison.
Example: request.resource.data.name.matches('/^' + departmentID)
To make a foreign key work with Security Rules or within the client, you must get the document that contains the data by its key. The key should be the name of the document in question to streamline the request, because you cannot perform queries or loop through data within Security Rules.
For a great read on this subject, I highly suggest this article:
https://medium.com/firebase-developers/a-list-of-firebase-firestore-security-rules-for-your-project-fe46cfaf8b2a
But my suggestion is to use a key that represents the department directly, rather than using an additional resource to hold a foreign key and having to manage it.
Firestore does not support referential integrity.
This means you can use any names for fields (subject to rules and conventions), but the semantics and any additional functionality must be maintained by you rather than by the system.
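A small sketch of what "maintained by you" can look like. Plain dictionaries stand in for two Firestore collections keyed by document ID, so this is not Firestore client code. The field names follow the tables in the question, while the function names are made up for illustration.

```python
from typing import Dict

# In-memory stand-ins for two collections, keyed by document ID.
departments: Dict[str, dict] = {"cs": {"deptname": "Computer Science"}}
students: Dict[str, dict] = {}


def add_student(student_id: str, name: str, dept_id: str) -> None:
    """Store a student whose dept_id field plays the role of a foreign key.

    The existence check is done by the application, because the database
    will not reject a dept_id that points at nothing.
    """
    if dept_id not in departments:
        raise KeyError(f"unknown department: {dept_id}")
    students[student_id] = {"student_name": name, "dept_id": dept_id}


def department_of(student_id: str) -> dict:
    """Follow the reference by hand: a second lookup replaces the SQL join."""
    return departments[students[student_id]["dept_id"]]
```

Using the department ID as the document ID, as suggested above, is what makes the second lookup a direct read rather than a query.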

DynamoDB table/index schema design for querying multi-valued attributes

I'm building a DynamoDB app that will eventually serve a large number (millions) of users. Currently the app's item schema is simple:
{
  userId: "08074c7e0c0a4453b3c723685021d0b6", // partition key
  email: "foo@foo.com",
  ... other attributes ...
}
When a new user signs up, or if a user wants to find another user by email address, we'll need to look up users by email instead of by userId. With the current schema that's easy: just use a global secondary index with email as the Partition Key.
But we want to enable multiple email addresses per user, and the DynamoDB Query operation doesn't support a List-typed KeyConditionExpression. So I'm weighing several options to avoid an expensive Scan operation every time a user signs up or wants to find another user by email address.
Below is what I'm planning to change to enable additional emails per user. Is this a good approach? Is there a better option?
Add a sort key column (e.g. itemTypeAndIndex) to allow multiple items per userId.
{
  userId: "08074c7e0c0a4453b3c723685021d0b6", // partition key
  itemTypeAndIndex: "main", // sort key
  email: "foo@foo.com",
  ... other attributes ...
}
If the user adds a second, third, etc. email, then add a new item for each email, like this:
{
  userId: "08074c7e0c0a4453b3c723685021d0b6", // partition key
  itemTypeAndIndex: "Email-2", // sort key
  email: "bar@bar.com"
  // no more attributes
}
The same global secondary index (with email as the Partition Key) can still be used to find both primary and non-primary email addresses.
If a user wants to change their primary email address, we'd swap the email values in the "primary" and "non-primary" items. (Now that DynamoDB supports transactions, doing this will be safer than before!)
If we need to delete a user, we'd have to delete all the items for that userId. If we need to merge two users then we'd have to merge all items for that userId.
The same approach (new items with same userId but different sort keys) could be used for other 1-user-has-many-values data that needs to be Query-able
Is this a good way to do it? Is there a better way?
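The proposed layout can be sketched in memory to check that it behaves as intended. This is only a model of the item shapes described above: a Python list stands in for the table, a linear search stands in for a Query on the email GSI, and the function names are made up.

```python
from typing import Dict, List, Optional


def email_items(user_id: str, emails: List[str]) -> List[Dict[str, str]]:
    """Build one item per email: the first is "main", the rest "Email-2", ..."""
    items = []
    for position, email in enumerate(emails, start=1):
        sort_key = "main" if position == 1 else f"Email-{position}"
        items.append({
            "userId": user_id,             # partition key
            "itemTypeAndIndex": sort_key,  # sort key
            "email": email,                # partition key of the GSI
        })
    return items


def find_user_by_email(table: List[Dict[str, str]], email: str) -> Optional[str]:
    """Stand-in for a Query on the GSI that has email as its partition key."""
    for item in table:
        if item.get("email") == email:
            return item["userId"]
    return None
```

Every email, primary or not, resolves to the same userId, which is the property the plan relies on. A GSI does not enforce uniqueness, so preventing two users from registering the same email would still need a separate check.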
Justin, for searching on attributes I would strongly advise against using DynamoDB. I am not saying you can't achieve this, but I see a few problems that will eventually come up if you go this route.
Using a sort key per email will create duplicate records for the same user, i.e. if a user has registered 5 emails, that means 5 records in your table with the same schema and attributes except for the email attribute.
What if a new use case comes up in the future where you also want to search for a user by some other attribute (for example cell phone number, assuming a user may have more than one)?
DynamoDB limits the number of secondary indexes you can create for a table: 5 local secondary indexes and, by default, 20 global secondary indexes.
Thus, as the search criteria grow, this solution can easily become a bottleneck, and your system may not scale well.
To the best of my knowledge, I can suggest a few options, using a combination of databases, that you may choose from based on your requirements and budget.
Option 1. DynamoDB as a primary store and AWS Elasticsearch as secondary storage [Preferred]
Store the user records in a DynamoDB table (let's call it UserTable) as and when a user registers.
Enable DynamoDB Streams on UserTable.
Build an AWS Lambda function that reads from the table's stream and persists the records in AWS Elasticsearch.
Now, in your application, use DynamoDB for fetching user records by id. For all other search criteria (like searching on emailId, phone number, zip code, location, etc.) fetch the records from AWS Elasticsearch. AWS Elasticsearch indexes all the attributes of your record by default, so you can search on any field with millisecond-level latency.
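The Lambda step above could look roughly like the following. This is a sketch under stated assumptions: the stream is configured to include the new image of each item, the table's partition key is userId, and the function returns a list of index/delete actions instead of calling Elasticsearch, so the part that talks to the search cluster is left out. The action format is made up for illustration.

```python
from decimal import Decimal
from typing import Any, Dict, List


def from_dynamodb_json(value: Dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed value (e.g. {"S": "x"}) to a plain value."""
    type_tag, raw = next(iter(value.items()))
    if type_tag in ("S", "BOOL"):
        return raw
    if type_tag == "N":
        return Decimal(raw)  # numbers arrive as strings
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_dynamodb_json(v) for v in raw]
    if type_tag == "M":
        return {k: from_dynamodb_json(v) for k, v in raw.items()}
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def handler(event: Dict[str, Any], context: Any = None) -> List[Dict[str, Any]]:
    """Turn stream records into index/delete actions for the search store."""
    actions = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        user_id = from_dynamodb_json(data["Keys"]["userId"])
        if record["eventName"] == "REMOVE":
            actions.append({"op": "delete", "id": user_id})
        else:  # INSERT or MODIFY
            doc = {k: from_dynamodb_json(v) for k, v in data["NewImage"].items()}
            actions.append({"op": "index", "id": user_id, "doc": doc})
    return actions
```

Using the userId as the document id in the search store makes repeated deliveries of the same record harmless, since re-indexing simply overwrites the earlier copy.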
Option 2. Use AWS Aurora [Less preferred solution]
If your application has a relational use-case where data are related, you may consider this option. Just to call out, Aurora is a SQL database.
Since this is a relational storage, you can opt for organizing the records in multiple tables and join them based on the primary key of those tables.
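For comparison, the relational layout might look like this. The sketch uses Python's built-in sqlite3 module as a stand-in for Aurora, purely so that it runs anywhere; the table and column names are assumptions, and Aurora itself would be reached through a MySQL or PostgreSQL driver instead.

```python
import sqlite3
from typing import Optional, Tuple

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id TEXT PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE user_emails (
    email   TEXT PRIMARY KEY,  -- one owner per email address
    user_id TEXT NOT NULL REFERENCES users(user_id)
);
""")
conn.execute("INSERT INTO users VALUES (?, ?)", ("u1", "Example User"))
conn.executemany(
    "INSERT INTO user_emails VALUES (?, ?)",
    [("a@example.com", "u1"), ("b@example.com", "u1")],
)


def lookup_user(email: str) -> Optional[Tuple[str, str]]:
    """Find the user who owns an email by joining the two tables."""
    return conn.execute(
        "SELECT u.user_id, u.name "
        "FROM users u JOIN user_emails e ON e.user_id = u.user_id "
        "WHERE e.email = ?",
        (email,),
    ).fetchone()
```

Adding another multi-valued attribute, such as phone numbers, would mean one more table of the same shape rather than another index on the main table.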
I suggest the first option because:
DynamoDB will provide you durable, highly available, low latency primary storage for your application.
AWS Elasticsearch will act as secondary storage, which is also durable, scalable and low latency storage.
With AWS Elasticsearch, you can run any search query on your data. You can also do analytics on it. The Kibana UI is provided out of the box, and you can use it to plot analytical data on a dashboard (for example how user growth is trending, how many users belong to a specific location, or the user distribution by city/state/country).
With DynamoDB Streams and AWS Lambda, you will be syncing these two databases in near real-time [within a few milliseconds].
Your application will be scalable, and the search feature can be further enhanced to filter on multi-level attributes. [One such example: search all users who belong to a given city.]
Having said that, now I will leave this up to you to decide. 😊