I am deploying a Python Flask application with Elastic Beanstalk. I have a config file /.ebextensions/01.config where among other things I set some environment variables - some of which should be secret.
The file looks something like this:
packages:
  yum:
    gcc: []
    git: []
    postgresql93-devel: []

option_settings:
  "aws:elasticbeanstalk:application:environment":
    SECRET_KEY: "sensitive"
    MAIL_USERNAME: "sensitive"
    MAIL_PASSWORD: "sensitive"
    SQLALCHEMY_DATABASE_URI: "sensitive"
  "aws:elasticbeanstalk:container:python:staticfiles":
    "/static/": "app/static/"
What are the best practices for keeping certain values secret? Currently the .ebextensions folder is under source control and I like this because it is shared with everyone, but at the same time I do not want to keep sensitive values under source control.
Is there a way to specify some environment variables through the EB CLI tool when deploying (e.g. eb deploy -config ...)? Or how is this use case covered by the AWS deployment tools?
The AWS documentation recommends storing sensitive information in S3 because environment variables may be exposed in various ways:
Providing connection information to your application with environment properties is a good way to keep passwords out of your code, but it's not a perfect solution. Environment properties are discoverable in the Environment Management Console, and can be viewed by any user that has permission to describe configuration settings on your environment. Depending on the platform, environment properties may also appear in instance logs.
The example below is from the documentation, to which you should refer for full details. In short, you need to:
Upload the file to S3 with minimal permissions, possibly encrypted.
Grant read access to the instance profile role used by the instances in your Elastic Beanstalk Auto Scaling group. The policy would look like:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "database",
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::my-secret-bucket-123456789012/beanstalk-database.json"
      ]
    }
  ]
}
Add a file with a name like s3-connection-info-file.config to /.ebextensions in your application bundle root with these contents:
Resources:
  AWSEBAutoScalingGroup:
    Metadata:
      AWS::CloudFormation::Authentication:
        S3Auth:
          type: "s3"
          buckets: ["my-secret-bucket-123456789012"]
          roleName: "aws-elasticbeanstalk-ec2-role"

files:
  "/tmp/beanstalk-database.json":
    mode: "000644"
    owner: root
    group: root
    authentication: "S3Auth"
    source: https://s3-us-west-2.amazonaws.com/my-secret-bucket-123456789012/beanstalk-database.json
Then update your application code to extract the values from the file /tmp/beanstalk-database.json (or wherever you decide to put it in your actual config.)
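For a Flask app like the one in the question, that extraction step might look roughly like the sketch below (the key names inside the JSON file are assumptions; use whatever structure you actually uploaded to S3):

import json

from flask import Flask

app = Flask(__name__)

# Load the secrets that the .ebextensions "files" entry downloaded from S3.
with open("/tmp/beanstalk-database.json") as f:
    secrets = json.load(f)

# Hypothetical key names -- match them to the structure of your own file.
app.config["SECRET_KEY"] = secrets["SECRET_KEY"]
app.config["SQLALCHEMY_DATABASE_URI"] = secrets["SQLALCHEMY_DATABASE_URI"]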
This question already has an answer, but I want to contribute an alternative solution. Instead of keeping secrets in environment variables (which then have to be managed and stored somewhere out of version control, and which you need to remember to set at deployment), I put all my secrets in an encrypted S3 bucket that is only accessible to the role the EB environment runs as. I then fetch the secrets at startup. This completely decouples deployment from configuration, and you never have to fiddle with secrets on the command line again.
If needed (for example if secrets are needed during app setup, such as keys to repositories where code is fetched) you can also use an .ebextensions config file with an S3Auth directive to easily copy the contents of said S3 bucket to your local instance; otherwise just use the AWS SDK to fetch all secrets from the app at startup.
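If you go the SDK route, a minimal boto3 sketch of fetching secrets at startup might look like this (the object key is hypothetical, and the bucket name reuses the example from the answer above):

import json

import boto3

def load_secrets(bucket="my-secret-bucket-123456789012", key="secrets.json"):
    # Credentials come from the instance profile role, so nothing is stored
    # in environment variables or in the repository.
    s3 = boto3.client("s3")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)

secrets = load_secrets()
# e.g. app.config["MAIL_PASSWORD"] = secrets["MAIL_PASSWORD"]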
EDIT: As of April 2018, AWS offers a dedicated managed service for secrets management: AWS Secrets Manager. It offers convenient, secure storage of secrets in string or JSON format, plus versioning, stages, rotation and more. It also eliminates some of the KMS and IAM configuration, for a quicker setup. I see no real reason to use any other AWS service for storing static sensitive data such as private keys and passwords.
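Fetching such a secret with boto3 might look like the sketch below (the secret name, region, and JSON key are assumptions; the secret is assumed to be stored as a JSON string):

import json

import boto3

# Hypothetical secret name and region; create the secret in Secrets Manager first.
client = boto3.client("secretsmanager", region_name="us-west-2")
response = client.get_secret_value(SecretId="my-app/production")
secret = json.loads(response["SecretString"])
database_uri = secret["SQLALCHEMY_DATABASE_URI"]  # hypothetical key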
You should be able to specify sensitive values as environment variables from the EB web console: Your EB app -> Your EB environment -> Configuration -> Software Configuration -> Environment Properties.
Alternatively, you can make use of this: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/eb3-setenv.html
EDIT: While this was the accepted answer in 2015, this is no longer how you should handle it. You can now use AWS Secrets Manager for this purpose.
I have been using a separate shell script, something like ./deploy_production.sh, to set environment-specific variables. In the shell script, you can run "eb setenv NAME1=VAR1 NAME2=VAR2 ..." to set the environment variables.
This file doesn't need to go into the git repo.
Some of the other answers are mentioning that there might be a better way with Parameter Store / Secrets Manager.
I described how I did this with AWS Systems Manager Parameter Store (which also gives you an interface to Secrets Manager) in this answer: https://stackoverflow.com/a/59910941/159178. Basically, you give the IAM role of your Beanstalk instances access to the relevant parameter and then load it from your application code at startup.
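For reference, loading a parameter at startup with boto3 might look like this (the parameter name is hypothetical; WithDecryption is needed for SecureString values, which is what you want for secrets):

import boto3

ssm = boto3.client("ssm")

# Hypothetical parameter name; the instance role needs ssm:GetParameter on it.
response = ssm.get_parameter(Name="/my-app/production/DB_PASSWORD", WithDecryption=True)
db_password = response["Parameter"]["Value"]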
I'd like to be able to use GitHub Actions to be able to deploy resources with AWS, but without using a hard-coded user.
I know that it's possible to create an IAM user with fixed credentials, and that those can be exported to GitHub Secrets, but this means that if the key ever leaks I have a large problem on my hands, and rotating such keys is challenging if it's forgotten.
Is there any way that I can enable a password-less authentication flow for deploying code to AWS?
Yes, it is possible now that GitHub has released its OpenID Connect provider for use with GitHub Actions. You can configure the OpenID Connect provider as an identity provider in AWS, and then use that as an access point to any role(s) that you wish to enable. You can then configure the action to use the credentials acquired for the duration of the job; when the job is complete, the credentials are automatically revoked.
To set this up in AWS, you need to create an OpenID Connect provider using the instructions at AWS or using a Terraform file similar to the following:
resource "aws_iam_openid_connect_provider" "github" {
url = "https://token.actions.githubusercontent.com"
client_id_list = [
// original value "sigstore",
"sts.amazonaws.com", // Used by aws-actions/configure-aws-credentials
]
thumbprint_list = [
// original value "a031c46782e6e6c662c2c87c76da9aa62ccabd8e",
"6938fd4d98bab03faadb97b34396831e3780aea1",
]
}
The client ID list is the 'audience' which is used to access this content -- you can vary that, provided that you vary it in all the right places. The thumbprint is a hash of the OpenID Connect provider's certificate, and 6938...aea1 is the current one used by GitHub Actions -- you can calculate/verify the value by following AWS' instructions. The thumbprint_list can hold up to 5 values, so newer thumbprints can be appended ahead of time as they become available, while you continue to use older ones.
If you're interested in where this magic value came from, you can find out at How can I calculate the thumbprint of an OpenID Connect server?
Once you have enabled the identity provider, you can use it to create one or more custom roles (replacing the account and repository placeholders with your own):
data "aws_caller_identity" "current" {}
resource "aws_iam_role" "github_alblue" {
name = "GitHubAlBlue"
assume_role_policy = jsonencode({
Version = "2012-10-17",
Statement = [{
Action = "sts:AssumeRoleWithWebIdentity"
Effect = "Allow"
Principal = {
Federated = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/token.actions.githubusercontent.com"
}
Condition = {
StringLike = {
"token.actions.githubusercontent.com:aud" : ["sts.amazonaws.com" ],
"token.actions.githubusercontent.com:sub" : "repo:alblue/*"
}
}
}]
})
}
You can create as many different roles as you need, and even split them up by audience (e.g. 'production', 'dev'). Provided that the OpenID Connect provider's audience is trusted by the account, you're good to go. (You can use this to ensure that the OpenID Connect provider in a dev account doesn't trust the roles in a production account and vice versa.) You can have, for example, a read-only role for performing terraform validate and then another role for terraform apply.
The subject is passed from GitHub, but looks like:
repo:<organization>/<repository>:ref:refs/heads/<branch>
There may be different formats that come out later. You could have an action/role specifically for PRs if you use :ref:refs/pulls/* for example, and have another role for :ref:refs/heads/production/*.
The final step is getting your GitHub Actions configured to use the token that comes back from the AWS/OpenID Connect:
Standard Way
jobs:
  terraform-validate:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Configure AWS credentials from Test account
        uses: aws-actions/configure-aws-credentials@master
        with:
          role-to-assume: arn:aws:iam::<accountid>:role/GitHubAlBlue
          aws-region: us-east-1
      - name: Display Identity
        run: aws sts get-caller-identity
Manual Way
What's actually happening under the covers is something like this:
jobs:
  terraform-validate:
    runs-on: ubuntu-latest
    env:
      AWS_WEB_IDENTITY_TOKEN_FILE: .git/aws.web.identity.token.file
      AWS_DEFAULT_REGION: eu-west-2
      AWS_ROLE_ARN: arn:aws:iam::<accountid>:role/GitHubAlBlue
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Configure AWS
        run: |
          sleep 3 # Need to have a delay to acquire this
          curl -H "Authorization: bearer $ACTIONS_ID_TOKEN_REQUEST_TOKEN" \
            "$ACTIONS_ID_TOKEN_REQUEST_URL&audience=sts.amazonaws.com" \
            | jq -r '.value' > $AWS_WEB_IDENTITY_TOKEN_FILE
          aws sts get-caller-identity
You need to ensure that your AWS_ROLE_ARN matches the role defined in your AWS account, and that the audience matches what the OpenID Connect provider and the role's trust policy accept.
Essentially, there's a race condition between the job starting and the token becoming valid, which doesn't happen until after GitHub has confirmed the job has started; if the size of the AWS_WEB_IDENTITY_TOKEN_FILE is less than 10 characters, it's probably an error, and sleeping/spinning will get you the value shortly afterwards.
The name of the AWS_WEB_IDENTITY_TOKEN_FILE doesn't really matter, so long as it's consistent. If you're using docker containers, then storing it in e.g. /tmp will mean that it's not available in any running containers. If you put it under .git in the workspace, then not only will git ignore it (if you're doing any hash calculations) but it will also be present in any other docker run actions that you do later on.
You might want to configure your role so that the validity period is limited; once you have the web identity token it's valid until the end of the job, but the token requested has a lifetime of 15 minutes, so it's possible for a longer-running job to expose that.
It's likely that GitHub will have a blog post on how to configure/use this in the near future. The above information was inspired by https://awsteele.com/blog/2021/09/15/aws-federation-comes-to-github-actions.html, who has some examples in CloudFormation templates if that's your preferred thing.
Update: GitHub (accidentally) changed their thumbprint and the example above has been updated; see the GitHub issue linked below for more information. The new thumbprint is 6938fd4d98bab03faadb97b34396831e3780aea1, but it's possible to have multiple thumbprints in the IAM OpenID Connect provider.
GitHub recently updated their certificate chain, and the thumbprint has changed from the one mentioned above (a031c46782e6e6c662c2c87c76da9aa62ccabd8e) to 6938fd4d98bab03faadb97b34396831e3780aea1.
Github Issue: https://github.com/aws-actions/configure-aws-credentials/issues/357
For some reason I keep getting this error with AlBlue's answer:
"message":"Can't issue ID_TOKEN for audience 'githubactions'."
Deploying the stack provided in this post does work, however.
I am having issues deploying my Docker images to AWS ECR as part of a Terraform deployment, and I am trying to think through the best long-term strategy.
At the moment I have a Terraform remote backend in S3 and DynamoDB in what I'll call my "master" account. I then have dev/test etc. environments in separate accounts. The Terraform deployment is currently run from my local machine (a Mac) and uses the "master" account and its credentials, which in turn assume a role in the target deployment account to create the resources, as per:
provider "aws" { // tell terraform which SDK it needs to load
alias = "target"
region = var.region
assume_role {
role_arn = "arn:aws:iam::${var.deployment_account}:role/${var.provider_env_deployment_role_name}"
}
}
I am creating a number of ECS services with Fargate deployments. The container images are built in separate repos by GitHub Actions and saved as GitHub Packages. These package names and versions are deployed after the creation of the ECR repository and the service (maybe that's not ideal, thinking about it), and this is where the problems arise.
The process is to pull the image from GitHub Packages, retag it and upload it to ECR using multiple executions of a null_resource local-exec. This works fine standalone but has problems as part of the Terraform run. I think the reason is that the other resources use the above provider to get permissions, but since null_resource does not accept a provider, it cannot get permissions this way. So I have been passing the AWS credential values into the shell. I'm not convinced this is really secure, but that's currently moot as it isn't working either. I get this error:
Error saving credentials: error storing credentials - err: exit status 1, out: `error storing credentials - err: exit status 1, out: `The specified item already exists in the keychain.``
Part of me thinks this is the wrong approach, and that as I migrate to deploying via a GitHub Action I can separate the infrastructure deployment via Terraform from what is really the application deployment, and just use GitHub Secrets to set the credential values and then run the script.
Alternatively, maybe the keychain issue just goes away and my process will work fine? Securely??
That's fine for this scenario but it isn't really a generic approach for all my use cases.
I am shortly going to start deploying multiple AWS Lambda functions with Docker containers. I haven't done it before, but it looks like the process is going to be: create the ECR repository, deploy the container, deploy the Lambda function. This really implies that the container deployment should be integral to the Terraform deployment, which loops back to my issue with the local-exec??
I found Actions to deploy to ECR which would imply splitting the deployments into multiple files but that seems inelegant and potentially brittle.
Maybe there is a simple solution, but given where I am trying to go with this, what is my best approach?
I know this isn't a complete answer, but you should be pulling your AWS creds from environment variables. I don't really understand whether you need credentials for different accounts, but if you do, then swap them during the course of your action. See https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html. Terraform should pick these up and automatically use them for AWS access.
Instead of those hard-coded access key / secret access key pairs, I'd suggest making use of GitHub and AWS's ability to assume a role through temporary credentials with OIDC: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
You'd likely only define one initial role that you'd authenticate into and from there assume into the other accounts you're deploying into.
These assume-role credentials are only good for an hour and don't carry the operational overhead of having to rotate them.
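If part of the deployment is scripted in Python, chaining from the initial OIDC-federated role into a target account's role might look roughly like this (the role ARN and session name are hypothetical):

import boto3

# The initial credentials come from the role the GitHub Action assumed via OIDC.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/DeployToDevAccount",  # hypothetical
    RoleSessionName="github-actions-deploy",
    DurationSeconds=3600,  # the one-hour lifetime mentioned above
)["Credentials"]

# Clients built with these temporary credentials operate in the target account.
ecr = boto3.client(
    "ecr",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)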
As suggested by Kevin Buchs' answer...
My primary issue was related to deploying from a Mac and the use of the keychain. As this was not on the critical path, I went around it and set up a GitHub Action.
The action loaded environment variables from GitHub Secrets for my "master" AWS account credentials:
AWS_ACCESS_KEY_ID: ${{ secrets.NK_AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.NK_AWS_SECRET_ACCESS_KEY }}
I also loaded the target account's credentials into environment variables in the same way, but with the prefix TF_VAR_:
TF_VAR_DEVELOP_AWS_ACCESS_KEY_ID: ${{ secrets.DEVELOP_AWS_ACCESS_KEY_ID }}
TF_VAR_DEVELOP_AWS_SECRET_ACCESS_KEY: ${{ secrets.DEVELOP_AWS_SECRET_ACCESS_KEY }}
I then declared Terraform variables, which are automatically populated from the environment variables:
variable "DEVELOP_AWS_ACCESS_KEY_ID" {
description = "access key for the dev account"
type = string
}
variable "DEVELOP_AWS_SECRET_ACCESS_KEY" {
description = "secret access key for the dev account"
type = string
}
And then I run a shell script via a local-exec provisioner:
resource "null_resource" "image-upload-to-importcsv-ecr" {
provisioner "local-exec" {
command = "./ecr-push.sh ${var.DEVELOP_AWS_ACCESS_KEY_ID} ${var.DEVELOP_AWS_SECRET_ACCESS_KEY} "
}
}
Within the script I can then use these arguments to set the credentials, e.g.:
AWS_ACCESS=$1
AWS_SECRET=$2
.....
export AWS_ACCESS_KEY_ID=${AWS_ACCESS}
export AWS_SECRET_ACCESS_KEY=${AWS_SECRET}
and the script now has credentials to do whatever.
I have an ECS service which requires AWS credentials. I use ECR to store Docker images, and a Jenkins instance visible only to VPN connections to build them.
I see 2 possibilities to provide AWS credentials to the service
Store them as Jenkins secret and insert into the docker image during build
Make them a part of the environment when creating ECS Task definition
What is more secure? Are there other possibilities?
First of all, you should not use static AWS credentials while working inside AWS; you should assign an IAM role to the task definition or service instead of passing credentials into the docker build or the task definition.
With IAM roles for Amazon ECS tasks, you can specify an IAM role that can be used by the containers in a task. Applications must sign their AWS API requests with AWS credentials, and this feature provides a strategy for managing credentials for your applications to use, similar to the way that Amazon EC2 instance profiles provide credentials to EC2 instances.
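In practice this means your application code never handles keys at all. A minimal Python sketch (assuming your app uses boto3) of what that looks like:

import boto3

# No access keys are configured anywhere: because the task definition has an
# IAM task role attached, boto3 automatically retrieves temporary credentials
# from the credentials endpoint that ECS exposes inside the container.
s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])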
Sometimes, though, the underlying application is not designed in a way that can use a role. In that case I recommend storing the values as environment variables in the task definition; but then, where should the values of those environment variables come from?
Task definitions support two methods of dealing with environment variables:
Plain text as a direct value
The valueFrom attribute of the ECS task definition
The following is a snippet of a task definition showing the format when referencing a Systems Manager Parameter Store parameter.
{
  "containerDefinitions": [{
    "secrets": [{
      "name": "environment_variable_name",
      "valueFrom": "arn:aws:ssm:region:aws_account_id:parameter/parameter_name"
    }]
  }]
}
This is the most secure method and the one recommended by the AWS documentation, so it is preferable to plain-text environment variables in the task definition or in the Dockerfile.
You can read more here and systems-manager-parameter-store.
But to use this, you must give the task's execution role permission to access Systems Manager Parameter Store.
I recently moved my app to Elastic Beanstalk. I am running Symfony3, and there is a mandatory parameters.yml file that has to be populated with environment variables.
I'd like to wget the parameters.yml from a private S3 bucket, limiting access to my instances only.
I know I can set the environment variables directly on the environment, but I have some very, very sensitive values there, and environment variables get leaked into my logging system, which is very bad.
I also have multiple environments (such as workers) using the same environment variables, and copy-pasting them is quite annoying.
So I am wondering if it's possible to have the app wget it on deploy. I know how to do that, but I can't seem to configure the S3 bucket to only allow access from my instances.
Yep, that can definitely be done; there are different ways of doing it depending on what approach you want to take. I would suggest using .ebextensions to create an IAM role -> grant that role access to your bucket -> after the package is unzipped on the instance -> copy the object from S3 using the instance role.
Create a custom IAM role using the AWS console or .ebextensions custom resources, and grant that role access to the objects in your bucket.
Related read
Using the above-mentioned .ebextensions, set aws:autoscaling:launchconfiguration in option_settings to specify the instance profile you created before.
Again, using .ebextensions, use the container_commands option to run the aws s3 cp command.
I'm writing an application which I want to run as an AWS Lambda function while also adhering to the Twelve-Factor App guidelines, in particular Part III. Config, which requires the use of environment variables for configuration.
However, I cannot find a way to set environment variables for AWS Lambda instances. Can anyone point me in the right direction?
If it isn't possible to use environment variables, can you recommend a way to use environment variables for local development and have them transformed into a valid configuration system that can be accessed by the application code in AWS?
Thanks.
As of November 18, 2016, AWS Lambda supports environment variables.
Environment variables can be specified both using AWS console and AWS CLI. This is how you would create a Lambda with an LD_LIBRARY_PATH environment variable using AWS CLI:
aws lambda create-function \
  --region us-east-1 \
  --function-name myTestFunction \
  --zip-file fileb://path/package.zip \
  --role role-arn \
  --environment Variables={LD_LIBRARY_PATH=/usr/bin/test/lib64} \
  --handler index.handler \
  --runtime nodejs4.3 \
  --profile default
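Inside the function, reading such a variable is the usual environment lookup. The CLI example above targets the Node.js runtime; as a sketch, a handler on the Python runtime would look like this:

import os

def handler(event, context):
    # Values configured as Lambda environment variables show up in os.environ.
    lib_path = os.environ.get("LD_LIBRARY_PATH", "")
    return {"LD_LIBRARY_PATH": lib_path}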
Perhaps the 'custom environment variables' feature of node-lambda would address your concerns:
https://www.npmjs.com/package/node-lambda
https://github.com/motdotla/node-lambda
"AWS Lambda doesn't let you set environment variables for your function, but in many cases you will need to configure your function with secure values that you don't want to check into version control, for example a DB connection string or encryption key. Use the sample deploy.env file in combination with the --configFile flag to set values which will be prepended to your compiled Lambda function as process.env environment variables before it gets uploaded to S3."
There is no way to configure env variables for Lambda execution, since each invocation is disjoint and no state information is stored. However, there are ways to achieve what you want.
AWS credentials: you can avoid storing those in env variables. Instead, grant the privileges to your LambdaExec role; in fact, AWS recommends using roles instead of AWS credentials.
Database details: one suggestion is to store them in a well-known file in a private bucket. Lambda can download that file when it is invoked and read the contents, which can contain database details and other information. Since the bucket is private, others cannot access the file. The LambdaExec role needs IAM privileges to access the private bucket.
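A rough Python sketch of that pattern (the bucket name, key, and JSON fields are hypothetical; the LambdaExec role is assumed to have s3:GetObject on the object):

import json

import boto3

CONFIG_BUCKET = "my-private-config-bucket"   # hypothetical
CONFIG_KEY = "lambda/database.json"          # hypothetical

_config = None  # cached so warm invocations skip the S3 round trip

def _load_config():
    global _config
    if _config is None:
        s3 = boto3.client("s3")
        body = s3.get_object(Bucket=CONFIG_BUCKET, Key=CONFIG_KEY)["Body"].read()
        _config = json.loads(body)
    return _config

def handler(event, context):
    db = _load_config()
    # ... connect to the database using db["host"], db["user"], db["password"], etc.
    return {"db_host": db.get("host")}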
AWS just added support for configuration of Lambda functions via environment parameters.
Take a look here
We also had this requirement for our Lambda function, and we "solved" it by generating an env file on our CI platform (in our case CircleCI). This file gets included in the archive that is deployed to Lambda.
Now in your code you can include this file and use the variables.
The script that I use to generate a JSON file from CircleCI environment variables is:
cat >dist/env.json <<EOL
{
  "CLIENT_ID": "$CLIENT_ID",
  "CLIENT_SECRET": "$CLIENT_SECRET",
  "SLACK_VERIFICATION_TOKEN": "$SLACK_VERIFICATION_TOKEN",
  "BRANCH": "$CIRCLE_BRANCH"
}
EOL
I like this approach because this way you don't have to include environment specific variables in your repository.
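For illustration, consuming that file could look like the sketch below (a Python-runtime example under the assumption that env.json ends up next to the handler in the deployed bundle; the original answer doesn't show the consuming code):

import json
import os

# Read the CI-generated file once at import time and merge it into os.environ,
# so the rest of the code can treat the values like ordinary env variables.
with open(os.path.join(os.path.dirname(__file__), "env.json")) as f:
    os.environ.update(json.load(f))

def handler(event, context):
    return {"branch": os.environ.get("BRANCH")}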
I know it has been a while, but I didn't see a solution that works from the AWS Lambda console.
STEPS:
In your AWS Lambda Function Code, look for "Environment variables", and click on "Edit";
For the "Key", type "LD_LIBRARY_PATH";
For the "Value", type "/opt/python/lib".
Look at this screenshot for the details.
Step #3 assumes that you are using Python as your runtime environment, and also that your uploaded layer has its "lib" folder in the following structure:
python/lib
This solution works for the error:
/lib/x86_64-linux-gnu/libz.so.1: version 'ZLIB_1.2.9' not found
assuming the correct library file is put in the "lib" folder and the environment variable is set as above.
PS: If you are unsure about the path in step #3, just look for the error in your console, and you will be able to see where your layer's "lib" folder is at runtime.