AWS PII data encryption for users - amazon-web-services

I have sensitive data attached to users that they need to access (stored in AWS DynamoDB). However, to comply with requirements, neither I nor other devs should be able to decrypt it.
Is there a recommended way of handling this problem (for example, using the user's password as a decryption key)?
I am not looking for data anonymisation or masking as users still need to access the data.
My infrastructure is on AWS using Cognito as authentication method, lambda functions, API gateway and DynamoDB.

Related

Suggestion: Integrating Amazon Cognito with AWS DynamoDB

I've built an application that uses Amazon Cognito to handle user sign-in and sign-up. Currently, the application supports three different subscriptions (Free, Basic, Premium). If a user signs in with a Basic subscription, I want to give them least-privilege access to DynamoDB to download only the parts of the application required to run the service.
How can I connect DynamoDB with Cognito directly?
I am not sure what the best approach is for this scenario.
(Please note: this is not a mobile-based application, so please do not suggest AWS Amplify or related services.)
When I was first learning about Cognito, I had made the same set of assumptions you are currently making. I knew that User Pools could act as my application's user directory, and Identity Pools would magically unlock all my authorization needs. I was mistaken :)
At the risk of oversimplifying, AWS Cognito exists to answer two questions:
Who are you? (authentication)
What can you do? (authorization)
Cognito addresses these concerns with two distinct offerings: User Pools (authentication) and Identity Pools (authorization).
At a high level, User Pools let you handle user registration, authentication, and account recovery, and they support authentication with third-party identity providers like Facebook, Google, etc. Sounds like you might have this part figured out.
Cognito Identity Pools, on the other hand, provide a way to authorize users to use various AWS services. You can think of them as a vending machine for handing out AWS credentials. For example, if you needed to give your users access to upload a file to an S3 bucket or to invoke an endpoint in API Gateway, you could do so with an Identity Pool. You can even allow item-level access to DynamoDB based on an Amazon Cognito ID. However, this might not work the way you expect, since your application users are probably not connecting directly to DynamoDB.
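To make the item-level access idea concrete, here is a minimal sketch of the kind of IAM policy that could be attached to an Identity Pool's authenticated role. It assumes the table's partition key holds the Cognito identity ID; the table ARN and the list of actions are placeholders chosen for illustration, not something from the question.

```python
import json

# Hypothetical table ARN, used only for illustration.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/UserData"


def build_item_level_policy(table_arn: str) -> dict:
    """Build an IAM policy that only allows access to items whose
    partition key equals the caller's Cognito identity ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed set of actions; trim to what the client really needs.
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                ],
                "Resource": [table_arn],
                "Condition": {
                    "ForAllValues:StringEquals": {
                        # IAM substitutes this policy variable at request time.
                        "dynamodb:LeadingKeys": [
                            "${cognito-identity.amazonaws.com:sub}"
                        ]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_item_level_policy(TABLE_ARN), indent=2))
```

A policy like this only takes effect when the client itself calls DynamoDB with credentials issued by the Identity Pool.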
In most web/mobile applications, users are not connecting directly to DynamoDB. Instead, they are interacting with a web/mobile app that communicates to the back-end of your application via an API. That API would then communicate with DynamoDB. If your stack is in AWS, the path may look something like this:
Client (web/mobile app) <-> API Gateway <-> Lambda <-> DynamoDB
In this architecture, your users would authenticate via Cognito. Cognito would then authorize the user to make calls to API Gateway. API Gateway would execute your lambda, which would then interact with DynamoDB. The "user" of DynamoDB in this example is your Lambda, not the user of your application.
That last bit is important, so I'll repeat it: Unless your users are directly connecting to DynamoDB (not recommended), they are not the "user" operating on DynamoDB. Therefore, restricting DynamoDB access based on a user's Cognito ID is not going to be an option for you.
So, what can you do? Your application needs to provide the business logic around what effect your users can have on DynamoDB. Perhaps free users have read-only access to a specific partition, while premium users can modify the same partition. That logic has to be handled directly by you.
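As a rough sketch of what that business logic might look like inside the Lambda: the handler below reads the subscription tier from the Cognito claims that API Gateway passes along and decides whether the request may read or write. It assumes a REST API with a Cognito user pool authorizer and Lambda proxy integration; the `custom:subscription` attribute name and the tier-to-permission mapping are hypothetical.

```python
# Hypothetical mapping from subscription tier to permitted operations.
TIER_PERMISSIONS = {
    "free": {"read"},
    "basic": {"read"},
    "premium": {"read", "write"},
}

WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def get_tier(event: dict) -> str:
    """Read the tier from the authorizer claims, defaulting to 'free'."""
    claims = (
        event.get("requestContext", {})
        .get("authorizer", {})
        .get("claims", {})
    )
    # 'custom:subscription' is an assumed custom attribute name.
    return claims.get("custom:subscription", "free")


def is_allowed(event: dict, operation: str) -> bool:
    """Return True if the caller's tier permits the operation."""
    return operation in TIER_PERMISSIONS.get(get_tier(event), set())


def handler(event, context):
    operation = "write" if event.get("httpMethod") in WRITE_METHODS else "read"
    if not is_allowed(event, operation):
        return {"statusCode": 403, "body": "Not allowed for this subscription"}
    # The DynamoDB read or write would happen here, using the Lambda's own role.
    return {"statusCode": 200, "body": "ok"}
```

The Lambda's IAM role still needs DynamoDB permissions for everything any tier can do; the tier check is what narrows it per user.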
I know you said you weren't looking for Amplify suggestions since your application is not mobile-based. However, Amplify offers SDKs that aren't specific to mobile development. The folks at Serverless have made a fantastic tutorial on building a full-stack serverless web app, which includes a very readable chapter on serverless auth with Cognito. They use Amplify in a web app to integrate with Cognito, S3, and API Gateway. If that's something you are trying to do, I'd recommend checking it out.

Can I create a Cognito-compatible JWT that doesn't reference a specific user?

We have a set of APIs in AWS secured by Cognito JWTs. Authentication of users based on their Cognito IdTokens and authorisation based on Cognito custom properties works great.
Some of our API endpoints are also called by 'service' processes: scheduled processes or maintenance tasks which are run from Lambda functions and are not associated with any 'real' user. We need these processes to be able to generate some sort of authentication/authorisation token which can be validated by the API.
I've considered:
Creating a real 'admin@example.com' Cognito user, storing the password in Secrets Manager, and authenticating as that user in the service functions. Not ideal IMO because it requires managing the existence of a data object as part of the infrastructure, and might also be an availability pinch point.
Generating a different secret token for the service functions, and special-casing its validation in the APIs. Not ideal because now we have to manually maintain compatibility with the Cognito JWT signature, plus it's reinventing a security wheel. Also, we either have to share a secret between the service functions and the APIs, or manage an RSA keypair.
What I'd really like to do is have some way for service functions to sign a JWT based on their own IAM credentials, which can be validated in the API by a call to STS. But I don't want to pass a 'real' set of STS credentials from the service function (which has permissions to do other things beyond just invoke the API).
In short: how can I best 'mix' validating Cognito users and Lambda functions in the same API?

Kinesis Data Firehose set with a web page

I have a web page (PHP) running on-premises that is accessed from different countries. I would like to capture some data and store it somewhere. My team can handle the data and the file format internally, but we would like to leverage AWS and store it in S3. We realized that we need an intermediate layer to avoid using the AWS credentials that S3 requires.
Since this page is on the internet and is consumed by users through the web, we certainly don't want to embed any credentials in the site. So Kinesis Data Firehose, acting as the consumer, could receive the data sent by our page and then store it in S3 internally.
Question
I see that there is an SDK for Kinesis, but it requires AWS credentials. What we really need is a kind of endpoint to which we send the data we produce, with AWS handling the rest. I don't understand why I need to set up AWS credentials to use the SDK. Does that mean our website will load and live with our credentials? This approach doesn't feel secure to me. I appreciate any comments.
You can use API Gateway Kinesis Proxy to avoid using credentials or even aws-sdk in your webpages.
https://docs.aws.amazon.com/apigateway/latest/developerguide/integrating-api-with-aws-services-kinesis.html
This way you don't need to expose any credentials, and you control permissions with a role.
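From the caller's side, such a proxy is just an HTTPS endpoint. The sketch below builds a plain POST request with no AWS credentials or SDK involved. The endpoint URL, resource path, and payload shape are hypothetical; the real ones depend on how the API Gateway integration and its mapping template are configured.

```python
import json
import urllib.request

# Hypothetical API Gateway stage URL and resource path.
ENDPOINT = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/records"


def build_record_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build an unauthenticated JSON POST for the API Gateway proxy."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_record_request(ENDPOINT, {"country": "DE", "page": "/home"})
    # Sending is left commented out because the endpoint above is made up:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
    print(request.get_method(), request.full_url)
```

The role attached to the API Gateway integration is what holds the permission to write to the stream, so nothing secret has to ship with the page.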
If you are worried about security and your users are authenticated, you can use a custom authorizer to authorize requests to the URL.
https://docs.aws.amazon.com/apigateway/latest/developerguide/use-custom-authorizer.html
If it is public facing, then just the above integration should work.
Hope it helps.

Using S3 for saving images from mobile application

I am creating a backend service which will receive requests from an Android application to create service requests. These service requests will contain details about the service items and also some images related to the request. We want to store the images in S3 directly from the Android application and pass the key of the saved image to the backend service through an API call.
The problem with this approach is the authorization of the mobile application to access the shared bucket.
If we save the access key of the shared bucket in the application, this code can be decompiled and the secret will be compromised.
Another option is to create an API on the backend service which will give back the authorization key to the mobile application before it needs to put the image to S3. In this way we can also rotate the secrets periodically.
Which of these approaches is better in terms of security? Is there any other approach that I am missing? Using S3 to save files sounds like a standard practice, so there must be an established solution for this particular scenario.
You don't need to invent an API to do this - AWS provides its STS service for just this use case.
http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html
To request temporary security credentials, you can use the AWS STS API actions.
To call the APIs, you can use one of the AWS SDKs, which are available
for a variety of programming languages and environments, including
Java, .NET, Python, Ruby, Android, and iOS. The SDKs take care of
tasks such as cryptographically signing your requests, retrying
requests if necessary, and handling error responses. You can also use
the AWS STS Query API, which is described in the AWS Security Token
Service API Reference. Finally, two command line tools support the AWS
STS commands: the AWS Command Line Interface, and the AWS Tools for
Windows PowerShell.
The AWS STS API actions return temporary security credentials that
consist of an access key and a session token. The access key consists
of an access key ID and a secret key. Users (or an application that
the user runs) can use these credentials to access your resources.
When the credentials are created, they are associated with an IAM
access control policy that limits what the user can do when using the
credentials. For more information, see Using Temporary Security
Credentials to Request Access to AWS Resources.

How to handle users data in an aws-based serverless stack

I'm a first-timer with AWS and I'm a bit lost.
I would like to have a serverless stack using Cognito to handle authentication, DynamoDB, Lambda and CloudFront for exposing REST services.
I don't know exactly how to handle user data. For example, I would like to store the user's email and physical address. I've seen that you can store these directly in Cognito; however, I would like to perform custom validation when these attributes are set or updated.
Can I do that easily with a trigger, letting the user have write access to their own data?
Or should I restrict write access to these attributes and expose a REST service that updates them from a Lambda?
I've also seen someone using a users table in DynamoDB to store some data. What is the advantage compared to using the identity pool directly?
Thanks,
You can easily store this kind of data (email, address) in Cognito user pools and validate it using a PreSignUp Lambda trigger.
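A minimal sketch of such a trigger, assuming the standard pre sign-up event shape in which attributes arrive under `request.userAttributes`. The validation rules themselves (a loose email pattern and a minimum address length) are placeholders for whatever custom checks the application needs.

```python
import re

# Deliberately loose email check, for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical minimum length for a plausible address.
MIN_ADDRESS_LENGTH = 5


def pre_sign_up_handler(event, context):
    """Reject the sign-up by raising if an attribute fails validation."""
    attributes = event.get("request", {}).get("userAttributes", {})
    email = attributes.get("email", "")
    address = attributes.get("address", "")

    if not EMAIL_PATTERN.match(email):
        raise ValueError("Invalid email address")
    if len(address.strip()) < MIN_ADDRESS_LENGTH:
        raise ValueError("Address is too short")

    # Cognito expects the (possibly modified) event to be returned.
    return event
```

Raising an exception makes Cognito fail the sign-up and return the error to the client. This trigger runs at sign-up, so later attribute updates would need a separate check.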
The advantage of using DynamoDB to store user data is that you will almost certainly hit a RequestLimitExceeded exception using Cognito as a primary data store. If you contact AWS support and explain what you are doing, they will up the Cognito API limit on your account - but that only temporarily solves the problem. Since Amazon doesn't publish what will trigger a RequestLimitExceeded error, you will eventually hit it again if your traffic increases.
Every time I have tried to use Cognito as the only source of user data I have run into this problem. So I end up storing user data in Dynamo or RDS.
If you don't have a lot of traffic, or if you aren't going to be querying the Cognito API often, then it might work for you.