I need to encrypt the sensitive fields in a BigQuery table, but loading is done through Dataflow. I have thought of three different approaches:
1. Encrypt the whole table with a customer-managed key (CMEK) and create three views for the different classifications. Give users a service account to access the views, grant that service account the Decrypter role in KMS, and give the Dataflow service account the Encrypter role to load the table. (Problem: there is no view-level access control, so the views have to be maintained in separate datasets, which makes our job more difficult.)
2. Encrypt the fields with a KMS API call in Dataflow while loading, and define a UDF in BigQuery to decrypt the column data at runtime using a service account. For example, the ID fields are encrypted via the API call in Dataflow, and a UDF defined in BigQuery decrypts them; only users with access to the key in KMS can decrypt the data, otherwise the query throws an exception. This way we keep a single table open to all users, but only authorized users can actually see the sensitive values. (Problem: the continuous API calls at runtime exhaust our quota, and cost is another concern.)
3. Maintain separate tables in separate datasets: (a) encrypted tables with the sensitive fields, and (b) non-encrypted tables with the non-sensitive fields. (Problem: maintenance, keeping the data in sync, and joining at runtime in BigQuery.)
Those are my approaches and my use case. Can anyone help me decide which one to use, and why it is better than the others?
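For the second approach, the Dataflow-side step amounts to a small transform that encrypts selected fields before rows are written to BigQuery. Below is a minimal sketch, not the pipeline from the question: `encrypt_sensitive_fields` and the stand-in cipher are hypothetical, and in a real pipeline `encrypt_fn` would call the Cloud KMS encrypt API from inside a Beam `DoFn`.

```python
import base64

def encrypt_sensitive_fields(row, sensitive_fields, encrypt_fn):
    """Return a copy of `row` with the named fields encrypted.

    `encrypt_fn` (bytes -> bytes) stands in for a Cloud KMS encrypt call;
    inside a Dataflow pipeline this would run in a DoFn's process() method.
    """
    out = dict(row)
    for field in sensitive_fields:
        if out.get(field) is not None:
            ciphertext = encrypt_fn(str(out[field]).encode("utf-8"))
            # Store as base64 text so the column stays a STRING in BigQuery.
            out[field] = base64.b64encode(ciphertext).decode("ascii")
    return out

# Demo with a trivial reversible stand-in cipher (NOT secure, illustration only):
fake_encrypt = lambda b: bytes(x ^ 0x5A for x in b)
row = {"id": "12345", "name": "alice", "country": "IN"}
encrypted = encrypt_sensitive_fields(row, ["id", "name"], fake_encrypt)
```

Non-sensitive columns pass through untouched, so a single table can hold both classifications, which is what makes the UDF-decryption option attractive despite its quota cost.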
Related
Does Google Cloud SQL support column-level encryption?
I know that it is possible for BigQuery tables but not sure about Cloud SQL!
It's not an out-of-the-box feature in Cloud SQL; you have to do it manually when you read and write the data. You can use Cloud KMS for that.
With BigQuery, keep in mind that you also need to keep the keyset in BigQuery, and only IAM permissions control whether or not someone can access it.
In the end, all the data is encrypted at rest anyway, but I'm sure your use case is about a specific column, not the whole database.
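A minimal sketch of that manual approach for Cloud SQL, assuming the application encrypts on write and decrypts on read. `EncryptedColumn` is a hypothetical helper; the two callables stand in for Cloud KMS encrypt/decrypt calls, and the fakes used in the test are reversible placeholders, not real cryptography.

```python
class EncryptedColumn:
    """Application-side encrypt-on-write / decrypt-on-read for one column.

    `kms_encrypt` / `kms_decrypt` (both bytes -> bytes) stand in for
    Cloud KMS CryptoKey encrypt/decrypt calls; Cloud SQL itself only
    ever stores the ciphertext.
    """

    def __init__(self, kms_encrypt, kms_decrypt):
        self._encrypt = kms_encrypt
        self._decrypt = kms_decrypt

    def to_db(self, plaintext: str) -> bytes:
        # Bind the result as a query parameter for a BYTEA/VARBINARY column.
        return self._encrypt(plaintext.encode("utf-8"))

    def from_db(self, stored: bytes) -> str:
        return self._decrypt(stored).decode("utf-8")
```

The important property is that encryption happens in the application layer, so the database driver, logs, and backups only ever see ciphertext.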
My needs - Read/Write Encrypted Data
I have a software application that can build webpages and emails and I need to encrypt content stored in the database (PII plus user generated content by financial/healthcare institutions).
I would like to use AWS KMS / AWS Secrets Manager to manage the encryption key, so the key is secured and the keys that gate it are automatically rotated and managed by AWS. I'll use this key to encrypt/decrypt data. Two-way encryption is required, so a one-way hash won't do.
My Question
It seems there are two options, and I'm not sure which is preferred and which is the proper way to use AWS for this:
Option 1 - Encrypt DB Access (Not preferred)
I could store all PII and encrypted data in a separate RDS DB and simply gate API access, with encryption provided by AWS Secrets Manager and KMS. This kinda stinks because the encrypted data relates to tables in the main DB, so hosting it elsewhere is cumbersome to maintain.
Option 2 - Encrypt the data on a field level (!Preferred!)
I would prefer to store encrypted data in the DB directly. For instance, a table may have 7 unencrypted columns, but the content column contains encrypted data. I then need to figure out a way to encrypt/decrypt this securely, which means I need some sort of key. Storing it directly in PHP seems like a bad idea, so could I use AWS KMS / Secrets Manager to do this?
Plan A -
I store a key in Secrets Manager (encrypted with KMS), so when the application wants to encrypt/decrypt content, it uses the required IAM user to get the key from AWS Secrets Manager, encrypts/decrypts the content, and then removes the key from memory.
Plan B -
I use KMS directly (no Secrets Manager) and pass content to KMS, which encrypts/decrypts it on the fly, never needing to expose or send the key it uses to perform these actions.
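One caveat with Plan B: the KMS Encrypt API caps plaintext at 4,096 bytes, so for webpage or email content the usual pattern is envelope encryption via GenerateDataKey — KMS mints a per-item data key, the data key encrypts the content locally, and only the KMS-wrapped copy of the data key is stored. Below is a sketch of that flow; `FakeKMS` is a hypothetical in-memory stand-in for the boto3 KMS client, and the XOR step is a stand-in for AES-GCM (illustration only, not real cryptography).

```python
import os

class FakeKMS:
    """In-memory stand-in for boto3's KMS client (hypothetical, demo only)."""

    def __init__(self):
        self._master = os.urandom(32)  # plays the role of the KMS master key

    def generate_data_key(self, KeyId, KeySpec):
        data_key = os.urandom(32)
        wrapped = bytes(a ^ b for a, b in zip(data_key, self._master))
        return {"Plaintext": data_key, "CiphertextBlob": wrapped}

    def decrypt(self, CiphertextBlob):
        data_key = bytes(a ^ b for a, b in zip(CiphertextBlob, self._master))
        return {"Plaintext": data_key}

def envelope_encrypt(kms, key_id, plaintext: bytes):
    # KMS mints a fresh data key; only the *wrapped* copy is stored next to
    # the ciphertext, so the master key never leaves KMS.
    resp = kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    data_key, wrapped = resp["Plaintext"], resp["CiphertextBlob"]
    ciphertext = bytes(b ^ data_key[i % len(data_key)] for i, b in enumerate(plaintext))
    return {"ciphertext": ciphertext, "wrapped_key": wrapped}

def envelope_decrypt(kms, record):
    # Unwrap the data key via KMS, then decrypt the content locally.
    data_key = kms.decrypt(CiphertextBlob=record["wrapped_key"])["Plaintext"]
    return bytes(b ^ data_key[i % len(data_key)] for i, b in enumerate(record["ciphertext"]))
```

With real AWS this maps to `boto3.client("kms")` with `generate_data_key` and `decrypt`, and the XOR step should be AES-GCM; the AWS Encryption SDK implements exactly this envelope pattern for you.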
Thanks again for taking the time
How do I use athena workgroups to restrict access of a user to a particular database?
For example, I have a user "readonly" who should not be able to run SELECT queries on the default database. Is this possible?
The way to restrict users from querying tables is to use IAM permissions. The permissions model in Athena is unfortunately more complicated than in an isolated data warehouse or RDBMS, since Athena is a part of a larger ecosystem that also includes S3 and Glue.
There is no specific permission for running SELECT. You can prevent users from running queries at all by controlling whether they are allowed to perform the athena:StartQueryExecution action, but you can't control what kind of queries they run.
Instead you need to think in terms of access to data, and access to the catalog.
To restrict reading, you restrict the user's access to the data on S3. Even if a user is allowed to run a SELECT query, they will get an error if they don't have permission to perform s3:ListBucket and s3:GetObject on the objects under the table's prefix.
You can also restrict a user's access to the catalog objects, i.e. the databases and tables – but that does not restrict their access to the data itself; think of it more as a restriction on creating, updating, and dropping databases and tables. Even if there were a way to restrict which databases and tables a user can see in the catalog, a user with permission to read the data could read it directly from S3, skipping Athena entirely.
You can find the documentation on how to control access to catalog objects here: https://docs.aws.amazon.com/athena/latest/ug/fine-grained-access-to-glue-resources.html
Workgroups in Athena can't be used to control access to data, nor to the catalog.
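As an illustration of the data-level restriction described above, an IAM policy along these lines grants read access only to the objects backing the permitted database's tables (bucket name, prefix, and Sid are hypothetical placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadOnPermittedTablesOnly",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-data-lake",
        "arn:aws:s3:::example-data-lake/permitted_db/*"
      ]
    }
  ]
}
```

In practice you would also scope s3:ListBucket with an s3:prefix condition so the user cannot list objects under other prefixes in the same bucket.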
Is there any way of controlling access to DynamoDB data based on data in other tables? By way of comparison, Firebase Realtime Database rules have access to a snapshot of the entire database when being evaluated, so rules like this are possible:
".write": "root.child('allow_writes').val() === true"
But all my reading of the AWS permissions structure hasn't given me any clue how to achieve the same thing. There are variables that can be tested based on the current authenticated user, and some variables based on the current request, but no way I can see of referencing other data within the database.
AWS doesn't support this case; your only option would be to put the access control in your application.
You can control table-, item-, or attribute-level data access in DynamoDB using IAM policy variables. Frustratingly, AWS doesn't even seem to publish a list of the available policy variables. Typically it boils down to using the Cognito sub or AWS userid, which the majority of people don't want to use as partition keys in their tables.
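For reference, the documented shape of that item-level restriction uses the dynamodb:LeadingKeys condition key together with a Cognito policy variable; the table name, region, and account ID below are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/UserActivities",
      "Condition": {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
        }
      }
    }
  ]
}
```

This only works when the table's partition key is the Cognito identity sub, which is exactly the constraint described above.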
I am new to the AWS platform. I am trying to build a backend for a mobile app using AWS Lambda, API Gateway, and DynamoDB, with Facebook authentication through AWS Cognito.
A user is able to log in to the app, and data should be saved in a table with UserID (which I get from Cognito), data1, data2, data3. This data belongs only to that user; let's say those are the user's activities.
When he logs in to the app next time, he should be able to see all the data he entered before.
I was looking for an example of this, and I found this link about fine-grained access control, where access to the table is restricted per user via the partition key.
https://aws.amazon.com/blogs/mobile/dynamodb-on-mobile-part-5-fine-grained-access-control/
That doesn't sound right. In a regular RDBMS-centered app, the application connects to the database using a specific user in a connection string, and user-specific data is returned using a query constructed on the fly with "username = user_id".
Is this above link talking about something different?
I am confused.
Thanks for your time!!
I believe the article you linked is discussing allowing an app to access DynamoDB directly, by calling the AWS API directly instead of going through a backend application layer. It is using variables in the IAM policy to only allow a user to execute queries against the table that contain their ID as the primary key.
In your case the AWS Lambda function is your backend application layer. You could simply assign an IAM role to the Lambda function that allows it to query all records in the DynamoDB table, and build queries in the Lambda function using the UserID as the query key.
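A minimal sketch of that Lambda-side pattern (function and attribute names are hypothetical; `table` is a boto3 DynamoDB Table resource, and the handler would take `user_id` from the verified Cognito identity, never from client input):

```python
def get_user_items(table, user_id):
    """Return only the calling user's rows.

    UserID is assumed to be the table's partition key, so the query is
    scoped to a single user's partition.
    """
    resp = table.query(
        KeyConditionExpression="UserID = :uid",
        ExpressionAttributeValues={":uid": user_id},
    )
    return resp.get("Items", [])
```

Because the Lambda's IAM role can read the whole table, the per-user scoping lives in this query, which is why deriving `user_id` from the Cognito token rather than from the request body is essential.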