DynamoDB restrict access to user data only - amazon-web-services

How do I restrict a DynamoDB user's access to only the data owned by them?
I came across Using IAM Policy Conditions for Fine-Grained Access Control:
"Condition": {
"ForAllValues:StringEquals": {
"dynamodb:LeadingKeys": [
"${www.amazon.com:user_id}"
],
"dynamodb:Attributes": [
"UserId",
"GameTitle",
"Wins",
"Losses",
"TopScore",
"TopScoreDateTime"
]
}
}
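For reference, the linked docs embed that condition in a policy statement scoped to the table. A minimal sketch of what that looks like (the table ARN is a placeholder; the actions and the dynamodb:Select condition follow the docs' GameScores example):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowAccessToOwnItemsOnly",
"Effect": "Allow",
"Action": [
"dynamodb:GetItem",
"dynamodb:Query",
"dynamodb:PutItem",
"dynamodb:UpdateItem"
],
"Resource": "arn:aws:dynamodb:REGION:ACCOUNT_ID:table/GameScores",
"Condition": {
"ForAllValues:StringEquals": {
"dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"],
"dynamodb:Attributes": ["UserId", "GameTitle", "Wins", "Losses", "TopScore", "TopScoreDateTime"]
},
"StringEqualsIfExists": {
"dynamodb:Select": "SPECIFIC_ATTRIBUTES"
}
}
}
]
}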
The problem with the above condition is that my partition key is not the userId; it's something else.
Here is what my DB looks like:
Hash : "Sales" # the partition key is just the plain text "Sales"
Range : Date # the sort key is a date
Attributes : Sale : [ # an array of maps
{
name : abc,
userId : idabc,
some-other : stuff
},
{
name : xyz,
userId : idxyz,
some-other : stuff
}
]
Any idea how to restrict access based on Sale[x].userId? Or is there a better design for handling this?
I use the Date range to query 90% of the data; a typical query is sketched below.
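A minimal sketch of that query with the AWS SDK for JavaScript DocumentClient (the table name 'MyAppTable' and the key attribute names 'Hash' and 'Date' are assumptions based on the layout above):
// Sketch only: table and attribute names are illustrative.
var docClient = new AWS.DynamoDB.DocumentClient(); // SDK already loaded/configured in the browser

var params = {
  TableName: 'MyAppTable',
  KeyConditionExpression: '#hk = :hk AND #d BETWEEN :from AND :to',
  ExpressionAttributeNames: { '#hk': 'Hash', '#d': 'Date' },
  ExpressionAttributeValues: {
    ':hk': 'Sales',
    ':from': '2019-01-01',
    ':to': '2019-03-31'
  }
};

docClient.query(params, function (err, data) {
  if (err) console.error(err);
  else console.log(data.Items); // each item carries the Sale array of maps
});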
The other option is to use a different table for each logical entity, like sales, expenses, payroll, etc., but I don't want to create separate tables, and I guess it defeats the purpose of NoSQL.
FYI, I am using the JavaScript SDK to access DynamoDB from the browser.
The app has 3 different user types:
customer (access to its own data)
merchant (access to its own data and to its customers' data)
admin (access to all the data)
I think for this I have to create 3 different user pools; correct me if I am wrong.
But I can't restrict access to a user's own data with this design, and if I use userId as the partition key instead, querying for merchants becomes difficult.
Any suggestion on how to handle this DB design?
Thanks

Related

Creating different User Types in AWS Amplify

I am planning to use AWS Amplify as a backend for a mobile application. The app consists of two user types (UserTypeA, UserTypeB). They have some common data points and some unique ones too.
UserTypeA(id, email, firstName, lastName, profilePicture, someUniquePropertyForUserTypeA)
UserTypeB(id, email, firstName, lastName, profilePicture, someUniquePropertyForUserTypeB)
What would be a scalable approach to achieve this? I am also using AWS Amplify authentication, so I can save the common data as custom attributes offered by Cognito, but then how would I save the unique properties for the two user types? Will this approach scale?
This is a social app and is heavily reliant on other users' profile data as well (which will be queried most of the time).
Check out the patterns recommended by AppSync (the GraphQL service behind Amplify when you add a GraphQL API). They are described in detail here: https://docs.aws.amazon.com/appsync/latest/devguide/security-authorization-use-cases.html
The main idea is to define groups in your Cognito user pool and then use those groups in the resolvers. For example:
## This checks if the user is part of the Admin group and makes the call
#foreach($group in $context.identity.claims.get("cognito:groups"))
#if($group == "Admin")
#set($inCognitoGroup = true)
#end
#end
#if($inCognitoGroup)
{
"version" : "2017-02-28",
"operation" : "UpdateItem",
"key" : {
"id" : $util.dynamodb.toDynamoDBJson($ctx.args.id)
},
"attributeValues" : {
"owner" : $util.dynamodb.toDynamoDBJson($context.identity.username)
#foreach( $entry in $context.arguments.entrySet() )
,"${entry.key}" : $util.dynamodb.toDynamoDBJson($entry.value)
#end
}
}
#else
$utils.unauthorized()
#end
Or by using the @aws_auth directives on the GraphQL schema, such as:
type Query {
posts: [Post!]!
@aws_auth(cognito_groups: ["Bloggers", "Readers"])
}
type Mutation {
addPost(id: ID!, title: String!): Post!
@aws_auth(cognito_groups: ["Bloggers"])
}
...
This is a database design problem. To solve it, you can create a relation that holds the common attributes, that is, User with attributes (ID, email, firstName, lastName, profilePicture).
After that, create sub-class relations, UserTypeA and UserTypeB.
These relations each have a unique ID and a foreign key relation with the parent (User). How? The first major relation is 'User'. The two sub-class relations are 'UserTypeA' and 'UserTypeB'.
The 'User' relation has an attribute 'ID'.
So the two sub-classes have an attribute 'User_ID', which is a foreign key to 'User'.'ID'.
Now just auto-generate another ID column for UserTypeA and UserTypeB.
This way, you have a central table with a unique ID for all users, and a unique ID in each of the sub-class relations, which together with User_ID forms a composite key.
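In GraphQL schema terms, a rough sketch of that shape (plain SDL; the field and type names are illustrative, not Amplify-specific directives):
type User {
  id: ID!
  email: String!
  firstName: String
  lastName: String
  profilePicture: String
}

type UserTypeA {
  id: ID!
  userID: ID! # foreign key back to User.id
  someUniquePropertyForUserTypeA: String
}

type UserTypeB {
  id: ID!
  userID: ID! # foreign key back to User.id
  someUniquePropertyForUserTypeB: String
}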

Using custom attribute values to restrict access to DynamoDB rows

I need to restrict a user's access to only the rows keyed by a unique ID, which is used as the primary key in DynamoDB.
This unique ID is present in the JWT from the OIDC provider but the issue is that IAM doesn't allow custom attributes to be used to restrict access to Leading Keys.
Does anyone know if it is possible to do this via some sort of workaround like mapping the custom attribute to a value that can be used in an IAM policy?
I need to do something like the following:
"ForAllValues:StringEquals": {
"dynamodb:LeadingKeys": [
"${cognito-identity.amazonaws.com:my_unique_id}"
],
"dynamodb:Attributes": [
"UserId",
"GameTitle",
"Wins",
"Losses",
"TopScore",
"TopScoreDateTime"
]
},

AppSync GraphQL mutation server logic in resolvers

I am having issues finding good sources for / figuring out how to correctly add server-side validation to my AppSync GraphQL mutations.
In essence I used AWS dashboard to define my AppSync schema, hence had DynamoDB tables created for me, plus some basic resolvers set up for the data.
Now I need to achieve the following:
I have a player who has an inventory and gold.
The player calls a purchaseItem mutation with an item_id.
Once this mutation is called I need to perform some checks in the resolver, i.e. check that the item_id exists in the 'Items' table of the associated DynamoDB, check that the player has enough gold (again, in the 'Players' table), and if so, write to the Players table, adding the item to their inventory and subtracting the gold.
I believe the most efficient way to achieve this, with the least cost and latency, is to use the "Apache Velocity" templating language (VTL) for AppSync?
It would be great to see an example of this showing how to query / write to DynamoDB, handle errors, and resolve the mutation correctly.
For writing to DynamoDB with VTL, you can start with the PutItem tutorial template. My request template looks like this:
{
"version" : "2017-02-28",
"operation" : "PutItem",
"key" : {
"noteId" : { "S" : "${context.arguments.noteId}" },
"userId" : { "S" : "${context.identity.sub}" }
},
"attributeValues" : {
"title" : { "S" : "${context.arguments.title}" },
"content": { "S" : "${context.arguments.content}" }
}
}
For query:
{ "version" : "2017-02-28",
"operation" : "Query",
"query" : {
## Provide a query expression. **
"expression": "userId = :userId",
"expressionValues" : {
":userId" : {
"S" : "${context.identity.sub}"
}
}
},
## Add 'limit' and 'nextToken' arguments to this field in your schema to implement pagination. **
"limit": #if(${context.arguments.limit}) ${context.arguments.limit} #else 20 #end,
"nextToken": #if(${context.arguments.nextToken}) "${context.arguments.nextToken}" #else null #end
}
This is based on the Paginated Query template.
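The 'limit' and 'nextToken' arguments mentioned in the template comment would correspond to a schema along these lines (type and field names here are illustrative):
type Note {
  noteId: ID!
  userId: String
  title: String
  content: String
}

type NoteConnection {
  items: [Note]
  nextToken: String
}

type Query {
  listNotes(limit: Int, nextToken: String): NoteConnection
}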
What you want to look at is Pipeline Resolvers:
https://docs.aws.amazon.com/appsync/latest/devguide/pipeline-resolvers.html
Yes, this requires VTL (Velocity Template Language).
Pipeline resolvers allow you to perform reads, writes, validation, and anything else you'd like using VTL. You basically chain the output of one template into the input of the next and perform the required processing.
Here's a Medium post showing how to do it:
https://medium.com/@dabit3/intro-to-aws-appsync-pipeline-functions-3df87ceddac1
In other words, what you can do is:
Have one function that queries the database, then pipeline the result to another function that validates it and performs the write if the checks succeed, or fails the request otherwise.
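A rough sketch of what the second function's request mapping template could look like (the field names, $ctx.stash.itemPrice, and the assumption that the previous function fetched the Player item are all illustrative; adding the item to the inventory is omitted for brevity):
## Hypothetical request template for the second pipeline function.
## Assumes the previous function did a GetItem on the Players table and that
## an earlier step stashed the item's price in $ctx.stash.itemPrice.
#if( $ctx.prev.result.gold < $ctx.stash.itemPrice )
  $util.error("Not enough gold", "ValidationError")
#end
#set( $newGold = $ctx.prev.result.gold - $ctx.stash.itemPrice )
{
  "version" : "2017-02-28",
  "operation" : "UpdateItem",
  "key" : {
    "id" : $util.dynamodb.toDynamoDBJson($ctx.args.playerId)
  },
  "update" : {
    "expression" : "SET gold = :newGold",
    "expressionValues" : {
      ":newGold" : $util.dynamodb.toDynamoDBJson($newGold)
    }
  }
}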

AWS: Transforming data from DynamoDB before it's sent to Cloudsearch

I'm trying to set up AWS' Cloudsearch with a DynamoDB table. My data structure is something like this:
{
"name": "John Smith",
"phone": "0123 456 789"
"business": {
"name": "Johnny's Cool Co",
"id": "12345",
"type": "contractor",
"suburb": "Sydney"
},
"profession": {
"name": "Plumber",
"id": "20"
},
"email": "johnsmith#gmail.com",
"id": "354684354-4b32-53e3-8949846-211384",
}
Importing this data from DynamoDB -> CloudSearch is a breeze; however, I want to be able to index on some of these nested object parameters (like business.name, profession.name, etc.).
Cloudsearch is pulling in some of the nested objects like suburb, but it seems like it's impossible for it to differentiate between the name in the root of the object and the name within the business and profession objects.
Questions:
How do I make these nested parameters searchable? Can I index on business.name or something?
If #1 is not possible, can I somehow send my data through a transforming function before it gets to Cloudsearch? This way I could flatten all of my objects and give the fields unique names like businessName and professionName
EDIT:
My solution at the moment is to have a separate DynamoDB table which replicates our users table, but stores it in a CloudSearch-friendly format. However, I don't like this solution at all so any other ideas are totally welcome!
You can use DynamoDB Streams and write a Lambda function that captures changes and adds documents to CloudSearch, flattening them at that point, instead of keeping an additional DynamoDB table.
For example, within my Lambda function I keep the list of nested fields (within a "body" parent in this case) and just flatten them under their field name; in the case of duplicate sub-field names you can prepend the parent name to create a new field, such as "body-name", as the key.
... misc. setup (imports such as requests, the CloudSearch document endpoint 'url', and a SigV4 'awsauth' object) ...
headers = { "Content-Type": "application/json" }
indexed_fields = ['app', 'name', 'activity']  # fields to flatten

def handler(event, context):  # Lambda handler called at each stream update
    document = {}  # document to be uploaded to CloudSearch
    document['id'] = ...  # your uid, likely from the DynamoDB update record
    document['type'] = 'add'
    all_fields = {}
    # flatten / pull out the info you want indexed
    for record in event['Records']:
        body = record['dynamodb']['NewImage']['body']['M']
        for key in indexed_fields:
            all_fields[key] = body[key]['S']
    document['fields'] = all_fields
    # post the update to the CloudSearch endpoint
    r = requests.post(url, auth=awsauth, json=document, headers=headers)
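One thing to double-check: the CloudSearch documents/batch endpoint expects a JSON array of operations, so the document may need to be wrapped in a list before posting, roughly:
[
  {
    "type": "add",
    "id": "354684354-4b32-53e3-8949846-211384",
    "fields": { "name": "...", "app": "...", "activity": "..." }
  }
]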

Map different Sort Key responses to Appsync Schema values

So here is my schema:
type Model {
PartitionKey: ID!
Name: String
Version: Int
FBX: String
# ms since epoch
CreatedAt: AWSTimestamp
Description: String
Tags: [String]
}
type Query {
getAllModels(count: Int, nextToken: String): PaginatedModels!
}
type PaginatedModels {
models: [Model!]!
nextToken: String
}
I would like to call 'getAllModels' and have all of its data, and all of its tags, filled in.
But here is the thing: the tags are stored via sort keys, like so:
PartitionKey | SortKey
Model-0 | Model-0
Model-0 | Tag-Tree
Model-0 | Tag-Building
Is it possible to transform the 'Tag' sort keys into the Tags: [String] array in the schema via a DynamoDB resolver? Or must I do something extra fancy through a lambda? Or is there a smarter way to do this?
To clarify, are you storing objects like this in DynamoDB:
{ PartitionKey (HASH), Tag (SortKey), Name, Version, FBX, CreatedAt, Description }
and using a DynamoDB Query operation to fetch all rows for a given HashKey.
Query #PartitionKey = :PartitionKey
and getting back a list of objects, some of which have a different "Tag" value and one of which is "Model-0" (i.e. the same value as the partition key), and I assume that record contains all the other values for the record. E.g.:
[
{ PartitionKey, Tag: 'ValueOfPartitionKey', Name, Version, FBX, CreatedAt, ... },
{ PartitionKey, Tag: 'Tag-Tree' },
{ PartitionKey: Tag: 'Tag-Building' }
]
You can definitely write resolver logic without too much hassle that reduces the list of model objects into a single object with a list of "Tags". Let's start with a single item and see how to implement a getModel(id: ID!): Model query:
First define the request mapping template that will get all rows for a partition key:
{
"version" : "2017-02-28",
"operation" : "Query",
"query" : {
"expression": "#PartitionKey = :id",
"expressionValues" : {
":id" : {
"S" : "${ctx.args.id}"
}
},
"expressionNames": {
"#PartitionKey": "PartitionKey" # whatever the table hash key is
}
},
## The limit will have to be sufficiently large to get all rows for a key
"limit": $util.defaultIfNull(${ctx.args.limit}, 100)
}
Then to return a single model object that reduces "Tag" to "Tags" you can use this response mapping template:
#set($tags = [])
#set($result = {})
#foreach( $item in $ctx.result.items )
#if($item.PartitionKey == $item.Tag)
#set($result = $item)
#else
$util.qr($tags.add($item.Tag))
#end
#end
$util.qr($result.put("Tags", $tags))
$util.toJson($result)
This will return a response like this:
{
"PartitionKey": "...",
"Name": "...",
"Tags": ["Tag-Tree", "Tag-Building"],
}
Fundamentally I see no problem with this, but its effectiveness depends on your query patterns. Extending this to the getAll use case is doable but will require a few changes and most likely a really inefficient Scan operation, because the table will be sparse of actual model information since many records are effectively just tags. You can alleviate this with GSIs pretty easily, but more GSIs means more $.
As an alternative approach, you can store your tags in a separate "Tags" table. This way you only store model information in the Model table and tag information in the Tag table, and you leverage GraphQL to perform the join for you. In this approach, have Query.getAllModels perform a "Scan" (or Query) on the Model table and then have a Model.Tags resolver that performs a Query against the Tag table (HK: ModelPartitionKey, SK: Tag). You could then get all tags for a model and later create a GSI to get all models for a tag. You do need to consider that the nested Model.Tags query will now get called once per model, but Query operations are fast and I've seen this work well in practice.
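A rough sketch of such a Model.Tags field resolver against a hypothetical Tags table (the key names ModelPartitionKey and Tag are assumptions):
Request mapping template:
{
  "version" : "2017-02-28",
  "operation" : "Query",
  "query" : {
    "expression" : "ModelPartitionKey = :model",
    "expressionValues" : {
      ":model" : $util.dynamodb.toDynamoDBJson($ctx.source.PartitionKey)
    }
  }
}
Response mapping template:
#set( $tags = [] )
#foreach( $item in $ctx.result.items )
  $util.qr($tags.add($item.Tag))
#end
$util.toJson($tags)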
Hope this helps :)