How to control access to AWS Secrets Manager

Suppose I am an AWS Superuser who has all AWS permissions.
I have configured AWS Glue, including a connection to a database that uses a username and password.
I have stored the username and password in AWS Secrets Manager; the Glue ETL job script will connect to the database using this information and then run the ETL job.
ETL data engineers do not have superuser permissions, but they know how to write the details of the ETL job script. The script needs to retrieve the secret first (a minimal sketch of that retrieval is shown after the options below), which means engineers can write code to print out the password… and we have a lot of data engineers…
My question is: what is the right strategy to control access to the password stored in Secrets Manager?
1) Should we allow ETL data engineers to deploy the script to Glue and run it? Then they can see the password. Or
2) Should we only allow them to write the ETL script, and have a superuser deploy it to Glue after reviewing the code? Or
3) Is there a way to separate the ETL job script code from the get_password code?
Note: I know how to use IAM and tags to control access to Secrets Manager, but my question is different.
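For context, the retrieval described above is usually only a few lines of boto3, which is why anyone who can edit and run the script can also log the password. A minimal sketch, where the secret name and region are placeholders rather than values from the question:

```python
# Minimal sketch of the secret retrieval a Glue ETL script typically performs.
# The secret name and region below are placeholders for illustration only.
import json
import boto3

def get_db_credentials(secret_name="prod/etl/db-credentials", region="us-east-1"):
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])  # e.g. {"username": ..., "password": ...}

creds = get_db_credentials()
# Anyone who can edit and run this script could also add: print(creds["password"])
```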

Related

Superset with Athena: set workgroup based on user role

I'm working at a company that used to use AWS Athena and QuickSight to run SQL queries and create dashboards, but now we have to use Apache Superset to do this.
While all users were using the AWS console, I could use CloudTrail logs to send managers reports of data consumption based on Athena workgroups, or even identify users that ran heavy queries. But since we migrated to Superset, I have lost the trace of user and workgroup, because all queries are run using the same Athena connection...
Is it possible to pass the Athena workgroup based on user role (and maybe the username too) through the connector?
I tried to find something in the Superset docs, but didn't find anything :(
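Not an answer from the thread, but one place to experiment is Superset's DB_CONNECTION_MUTATOR hook in superset_config.py, which lets you adjust the connection per user before the engine is created. The sketch below is a hedged illustration only: it assumes the hook signature used in older Superset releases, assumes the Athena SQLAlchemy driver (PyAthena) honours a work_group connect argument, and uses a made-up role-to-workgroup mapping.

```python
# superset_config.py -- hedged sketch, not a verified solution.
def DB_CONNECTION_MUTATOR(uri, params, username, security_manager, source):
    user = security_manager.find_user(username=username) if username else None

    # Hypothetical mapping from Superset role names to Athena workgroups.
    workgroup = "primary"
    if user and any(role.name == "HeavyQueries" for role in user.roles):
        workgroup = "heavy-queries-wg"

    # Assumes the driver accepts `work_group` via connect_args.
    connect_args = params.setdefault("connect_args", {})
    connect_args["work_group"] = workgroup
    return uri, params
```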

AWS programmatic credential use in automation scripts

Right now we create scripts that run through the CLI to automate tasks or fetch things from AWS.
For this we use an AWS access key / secret access key / session token.
These keys and tokens are valid for 1 hour, so if we use them an hour later, the script will fail.
But it is also not practical to fetch the temporary credentials, update the script, and run it by hand every time.
So what is the best possible solution in this situation? What should I do so that I can get updated credentials and run the script using them automatically? Or is there any other alternative so that we can still run scripts from our local machines using Boto with AWS credentials?
Any help is appreciated.
Bhavesh
I'm assuming that your script runs outside of AWS; otherwise you would simply configure your compute (EC2, Lambda, etc.) to automatically assume an IAM role.
If you have persistent IAM User credentials that allow you to assume the relevant role, then use those (a sketch of that pattern is shown below).
If you don't then take a look at the new IAM Roles Anywhere feature.
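As a rough illustration of the "persistent credentials that assume a role" pattern, a boto3 sketch along these lines could refresh the temporary credentials on each run. The role ARN and account ID are placeholders.

```python
# Hedged sketch: long-lived IAM User credentials (e.g. from ~/.aws/credentials)
# are used only to call sts:AssumeRole, so each run starts with fresh temporary
# credentials instead of hard-coded ones that expire after an hour.
import boto3

def session_for_role(role_arn, session_name="automation-script"):
    sts = boto3.client("sts")  # picks up the persistent IAM User profile
    creds = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)["Credentials"]
    return boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

# Placeholder ARN for illustration only.
session = session_for_role("arn:aws:iam::123456789012:role/AutomationRole")
s3 = session.client("s3")
```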

AWS Educate Starter Account obtain credentials in Python with boto3

I have an AWS Educate Starter Account, and I want to be able to automatically generate my credentials (aws_access_key_id, aws_secret_access_key, aws_session_token) from my code.
Currently, the way I do it is:
1) Log in with my university email and password at labs.vocareum.com
2) Click on Account Details and copy and paste the credentials into ~/.aws/credentials for the AWS CLI
3) In my Python code, use boto3 to interact with S3
But I would like to do everything in my Python script, without logging in and copying the credentials every time, since they are temporary credentials (they expire every hour).
The type of account doesn't allow me to create an IAM User either.
This is a similar question, but it is 2 years old and doesn't have an answer on how to do it without logging in.
Is there any way to do it?
labs.vocareum manages the keys for your AWS Educate account. It is not possible to do this, since you will be locked out before you can reschedule your session. Unfortunately, there is currently no alternative to the current method.

I want to regularly push files to AWS S3 from an on-prem server

I generate some files (on the order of KBs in size) periodically on an on-prem server. I want to push them to S3 as soon as they are generated. How do I go about managing access?
I'm using the Python boto3 package to do so. How do I manage access? Do I create a new IAM role? If so, how do I specify the permissions?
There are a few requirements here:
You need something to 'trigger' the upload. This could be a cron job / scheduled task, or it could be triggered specifically by whatever process generates the files.
You could then use the AWS Command-Line Interface (CLI) to upload the files (either aws s3 cp or aws s3 sync), or you could write your own program as you suggested (a minimal sketch is shown at the end of this answer).
The AWS CLI or your own program will require AWS credentials. The recommended method would be:
Create a User in IAM and make note of the credentials (Access Key, Secret Key)
Assign minimal permissions to the User so that it has enough permission to perform the function (eg s3:PutObject for the given bucket). Do not assign s3:* on every bucket! (Very bad for security.)
On the on-premises computer, run aws configure (part of the AWS CLI) and enter the credentials that were provided when the IAM User was created
This will place the credentials in a .aws/credentials file on the computer. Never put credentials in your programming code! (Again, bad for security.)
If you are asking how to create an IAM User, please see: Creating an IAM User in Your AWS Account - AWS Identity and Access Management
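For the "write your own program" option, a minimal boto3 sketch might look like the following. The bucket, prefix, and directory names are placeholders; the credentials come from the ~/.aws/credentials file written by aws configure.

```python
# Minimal sketch, assuming `aws configure` has already stored the IAM User's
# credentials and the User only has s3:PutObject on this bucket.
from pathlib import Path
import boto3

s3 = boto3.client("s3")  # credentials are read from ~/.aws/credentials

def push_files(local_dir, bucket, prefix):
    for path in Path(local_dir).iterdir():
        if path.is_file():
            s3.upload_file(str(path), bucket, f"{prefix}/{path.name}")

# Placeholder directory and bucket names for illustration only.
push_files("/var/data/outgoing", "my-example-bucket", "incoming")
```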

Sending credentials to Google Dataflow jobs

What is the right way to pass credentials to Dataflow jobs?
Some of my Dataflow jobs need credentials to make REST calls and fetch/post processed data.
I am currently using environment variables to pass the credentials to the JVM, read them into a Serializable object and pass them on to the DoFn implementation's constructor. I am not sure this is the right approach as any class which is Serializable should not contain sensitive information.
Another way I thought of is to store the credentials in GCS and retrieve them using a service account key file, but I was wondering why my job should have to perform this extra task of reading credentials from GCS.
Google Cloud Dataflow does not have native support for passing or storing secured secrets. However, you can use Cloud KMS and/or GCS, as you propose, to read a secret at runtime using your Dataflow service account credentials.
If you read the credential at runtime from a DoFn, you can use the DoFn.Setup lifecycle API to read the value once and cache it for the lifetime of the DoFn.
You can learn about various options for secret management in Google Cloud here: Secret management with Cloud KMS.
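For illustration, here is what the "read once in setup and cache" pattern could look like in the Beam Python SDK (the question uses the Java SDK; the bucket and object names are placeholders, and fetching the secret from GCS follows the approach proposed in the question).

```python
# Hedged sketch using the Beam Python SDK: only non-sensitive configuration is
# serialized with the DoFn; the secret itself is fetched once per worker in setup().
import apache_beam as beam
from google.cloud import storage

class CallRestApiFn(beam.DoFn):
    def __init__(self, secret_bucket, secret_object):
        self._secret_bucket = secret_bucket   # placeholder config, not the secret
        self._secret_object = secret_object
        self._api_key = None

    def setup(self):
        # Runs once per DoFn instance on the worker, using the Dataflow
        # service account's credentials, and caches the value in memory.
        client = storage.Client()
        blob = client.bucket(self._secret_bucket).blob(self._secret_object)
        self._api_key = blob.download_as_bytes().decode("utf-8").strip()

    def process(self, element):
        # Use self._api_key to authenticate the REST call here (omitted).
        yield element
```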