Airflow BashOperator - Use a different role than its pod role - amazon-web-services

I've tried to run the following commands as part of a bash script that runs in a BashOperator:
aws s3 ls s3://bucket
aws s3 cp ... ...
The script runs successfully; however, the aws cli commands return an error showing that the aws cli does not run with the needed permissions (the ones defined in the airflow-worker-node role).
Investigating the error:
I've upgraded awscli in the Docker image running the pod to version 2.4.9 (I've understood that old versions of awscli don't support access to S3 based on permissions granted by an AWS role).
I've investigated the pod running my bash script using the BashOperator:
Using k9s and the D (describe) command:
I saw that the ARN_ROLE is defined correctly.
Using k9s and the s (shell) command:
I saw that the pod's environment variables are correct.
The aws cli worked with the needed permissions and could access S3 as needed.
aws sts get-caller-identity reported the right role (airflow-worker-node).
Running the above commands as part of the bash script executed by the BashOperator gave me different results:
Running env showed a limited set of environment variables.
The aws cli returned a permission-related error.
aws sts get-caller-identity reported the EKS node role (eks-worker-node).
How can I grant aws cli in my BashOperator bash-script the needed permissions?

Reviewing the BashOperator source code, I've noticed the following code:
https://github.com/apache/airflow/blob/main/airflow/operators/bash.py
def get_env(self, context):
    """Builds the set of environment variables to be exposed for the bash command"""
    system_env = os.environ.copy()
    env = self.env
    if env is None:
        env = system_env
    else:
        if self.append_env:
            system_env.update(env)
            env = system_env
And the following documentation:
:param env: If env is not None, it must be a dict that defines the
    environment variables for the new process; these are used instead
    of inheriting the current process environment, which is the default
    behavior. (templated)
:type env: dict
:param append_env: If False (default), uses only the environment variables passed in the
    env parameter and does not inherit the current process environment. If True, inherits
    the environment variables from the current process, and the environment variables
    passed by the user will either update the existing inherited variables or be appended
    to them.
:type append_env: bool
If the BashOperator's env parameter is None, it copies the environment variables of the parent process.
In my case, I provided some environment variables, so it did not copy the parent process's environment variables into the child process, which caused the child process (the one running the bash command) to use the default role of the EKS node (eks-worker-node).
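To illustrate the difference (a minimal sketch in plain Python, not Airflow-specific; in an EKS pod the lost variables would likely include the web-identity ones that IAM Roles for Service Accounts injects, e.g. AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE):
import os
import subprocess

# Child inherits the parent's full environment (role variables included):
subprocess.run(["env"])

# Child sees ONLY what is passed explicitly; the pod's role variables are gone:
subprocess.run(["env"], env={"dag_input": "some-conf"})

# Child sees the inherited environment plus the extra variable:
subprocess.run(["env"], env={**os.environ, "dag_input": "some-conf"})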
The simple solution is to set the following flag in BashOperator(): append_env=True, which will append all existing environment variables to the ones I added manually.
I've figured out that in the version I'm running (2.0.1) it isn't supported (it is supported in later versions).
As a temporary solution, I've added **os.environ to the BashOperator env parameter:
return BashOperator(
    task_id="copy_data_from_mcd_s3",
    env={
        "dag_input": "{{ dag_run.conf }}",
        ......
        **os.environ,
    },
    # append_env=True,  # should be supported in 2.2.0
    bash_command="utils/my_script.sh",
    dag=dag,
    retries=1,
)
Which solved the problem.
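For reference, on Airflow 2.2.0 and later the same task should be expressible with append_env instead of spreading os.environ. A sketch, reusing the example above:
return BashOperator(
    task_id="copy_data_from_mcd_s3",
    env={"dag_input": "{{ dag_run.conf }}"},
    append_env=True,  # inherit the worker's environment, including the pod's role variables
    bash_command="utils/my_script.sh",
    dag=dag,
    retries=1,
)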

Related

Cloudwatch agent not using environment variable credentials on Windows

I'm trying to configure an AMI using a script that installs the unified CloudWatch agent on both AWS and on-premises Windows machines by using static IAM credentials for both of them. As part of the script, I set the credentials statically (as a test) using
$Env:AWS_ACCESS_KEY_ID="myaccesskey"
$Env:AWS_SECRET_ACCESS_KEY="mysecretkey"
$Env:AWS_DEFAULT_REGION="us-east-1"
Once I have the AMI, I create a machine and connect to it, and then verify the credentials are there by running aws configure list
Name          Value                   Type    Location
----          -----                   ----    --------
profile       <not set>               None    None
access_key    ****************C6IF    env
secret_key    ****************SCnC    env
region        us-east-1               env     ['AWS_REGION', 'AWS_DEFAULT_REGION']
But when I start the agent, I get the following error in the logs.
2022-12-26T17:51:49Z I! First time setting retention for log group test-cloudwatch-agent, update map to avoid setting twice
2022-12-26T17:51:49Z E! Failed to get credential from session: NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, .
EC2RoleRequestError: no EC2 instance role found
caused by: EC2MetadataError: failed to make EC2Metadata request
I'm using the Administrator user for both the installation of the agent and then when RDPing into the machine. Is there anything I'm missing?
I've already tried adding the credentials to the .aws/credentials file and modifying the common-config.toml file to use a profile. That way it works but in my case I just want to use the environment variables.
EDIT: I tested adding the credentials in the userdata script and modified a bit how they are created, and now it seems to work.
$env:aws_access_key_id = "myaccesskeyid"
$env:aws_secret_access_key = "mysecretaccesskey"
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:aws_access_key_id,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:aws_secret_access_key,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION','us-east-1',[System.EnvironmentVariableTarget]::Machine)
Now the problem is that I'm trying to start the agent at the end of the userdata script with the command from the documentation, but it does nothing (I see the command in the agent logs but there is no error). If I RDP into the machine and launch the same command in PowerShell it works fine. The command is:
& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m onPrem -s -c file:"C:\ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json"
I was finally able to make it work, but I'm not sure why it didn't before. I was using
$env:aws_access_key_id = "accesskeyid"
$env:aws_secret_access_key = "secretkeyid"
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:aws_access_key_id,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:aws_secret_access_key,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION','us-east-1',[System.EnvironmentVariableTarget]::Machine)
to set the variables but then the agent was failing to initialize. I had to add
$env:aws_default_region = "us-east-1"
so it was able to run. I couldn't find the issue before because on Windows Server 2022 I don't get the logs from the execution. I had to try using Windows Server 2019 to actually see the error when launching the agent.
I still don't know why the environment variables I set in the machine scope worked once logged into the machine but not when using them as part of the userdata script.

"Unable to determine aws-region" when running on-premises Cloudwatch agent

I'm trying to configure the AWS Cloudwatch agent to run on vanilla Ubuntu 18.04, outside of AWS. Every time I run it, I get this error:
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m onPremise -c "file:/path/to/cloudwatch/cloudwatch.json" -s
/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:/path/to/cloudwatch/cloudwatch.json --mode onPrem --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
Got Home directory: /root
I! Set home dir Linux: /root
Unable to determine aws-region.
Please make sure the credentials and region set correctly on your hosts.
Refer to http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
Fail to fetch the config!
Running the program under strace -f reveals that it is trying to read /root/.aws/credentials and then exiting. Per the guide, here are the contents of /root/.aws/credentials:
[AmazonCloudWatchAgent]
aws_access_key_id = key
aws_secret_access_key = secret
region = us-west-2
If I run aws configure get region, it is able to retrieve the region correctly. However, the Cloudwatch Agent is unable to read it. Here's the contents of common-config.toml (which also gets read, per strace).
## Configuration for shared credential.
## Default credential strategy will be used if it is absent here:
## Instance role is used for EC2 case by default.
## AmazonCloudWatchAgent profile is used for onPremise case by default.
[credentials]
shared_credential_profile = "AmazonCloudWatchAgent"
shared_credential_file = "/root/.aws/credentials"
## Configuration for proxy.
## System-wide environment-variable will be read if it is absent here.
## i.e. HTTP_PROXY/http_proxy; HTTPS_PROXY/https_proxy; NO_PROXY/no_proxy
## Note: system-wide environment-variable is not accessible when using ssm run-command.
## Absent in both here and environment-variable means no proxy will be used.
# [proxy]
# http_proxy = "{http_url}"
# https_proxy = "{https_url}"
# no_proxy = "{domain}"
Here are other things I have tried:
enclosing region (and all values) in the configuration in double quotes, per https://forums.aws.amazon.com/thread.jspa?threadID=291589. This did not make a difference.
adding /home/myuser/.aws/config, /home/myuser/.aws/credentials, and /root/.aws/config and populating them with the appropriate values. Per strace these files are not being read.
searching for the source code for the CloudWatch Agent (it is not open source)
setting AWS_REGION=us-west-2 explicitly in the program environment (same error)
changing [AmazonCloudWatchAgent] to [profile AmazonCloudWatchAgent] everywhere and all permutations of the above (no difference)
adding a [default] section in all config files (makes no difference)
invoking the config-downloader program directly, setting AWS_REGION etc. (same error)
becoming a non-root user and then invoking the program using sudo instead of invoking the program as the root user without sudo.
I get the same error no matter what I try. I installed the CloudWatch agent by downloading the "latest" deb on March 23, 2020, per these instructions. https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/download-cloudwatch-agent-commandline.html
The aws config defaults to C:\Users\Administrator instead of the user you installed the CloudWatch Agent as, so you may need to move the .aws folder to the CloudWatch user. Or, more straightforwardly:
aws configure --profile AmazonCloudWatchAgent
as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#install-CloudWatch-Agent-iam_user-first
You can also specify the region using common-config.toml as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#CloudWatch-Agent-profile-instance-first
On a server running Windows Server, this file is in the C:\ProgramData\Amazon\AmazonCloudWatchAgent directory. The default common-config.toml is as follows:
# This common-config is used to configure items used for both ssm and cloudwatch access
## Configuration for shared credential.
## Default credential strategy will be used if it is absent here:
## Instance role is used for EC2 case by default.
## AmazonCloudWatchAgent profile is used for onPremise case by default.
# [credentials]
# shared_credential_profile = "{profile_name}"
# shared_credential_file= "{file_name}"
## Configuration for proxy.
## System-wide environment-variable will be read if it is absent here.
## i.e. HTTP_PROXY/http_proxy; HTTPS_PROXY/https_proxy; NO_PROXY/no_proxy
## Note: system-wide environment-variable is not accessible when using ssm run-command.
## Absent in both here and environment-variable means no proxy will be used.
# [proxy]
# http_proxy = "{http_url}"
# https_proxy = "{https_url}"
# no_proxy = "{domain}"
You can also update the common-config.toml with a new location if needed.
I was using an incorrect "secret" with an invalid character that caused the INI file parser to break. The CloudWatch agent incorrectly reported this as a "missing region," when a parse error or "invalid secret" error would have been more accurate.
You should create a new file named config in the same folder as credentials, and add the region there:
[default]
region = your-region
You have to uncomment the # [credentials] in the /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml config file as well
Set the AWS_REGION environment variable.
On Linux, macOS, or Unix, use:
export AWS_REGION=your_aws_region

Inject GitLab CI Variables into Terraform Variables

I have a set of Terraform files, and in particular one variables.tf file which holds my variables like the AWS access key, AWS secret key, etc. I now want to automate the resource creation on AWS using GitLab CI/CD.
My plan is the following:
Write a .gitlab-ci.yml file
Have the terraform calls in the .gitlab-ci.yml file
I know that I can have secret environment variables in GitLab, but I'm not sure how I can push those variables into my Terraform variables.tf file, which currently looks like this:
# AWS Config
variable "aws_access_key" {
  default = "YOUR_ADMIN_ACCESS_KEY"
}
variable "aws_secret_key" {
  default = "YOUR_ADMIN_SECRET_KEY"
}
variable "aws_region" {
  default = "us-west-2"
}
In my .gitlab-ci.yml, I have access to the secrets like this:
- 'AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}'
- 'AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}'
- 'AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}'
How can I pipe it to my Terraform scripts? Any ideas? I would need to read the secrets from GitLab's environment and pass it on to the Terraform scripts!
Which executor are you using for your GitLab runners?
You don't necessarily need to use the Docker executor but can use a runner installed on a bare-metal machine or in a VM.
If you install the gettext package on the respective machine/VM as well you can use the same method as I described in Referencing gitlab secrets in Terraform for the Docker executor.
Another possibility could be that you set
job:
  stage: ...
  variables:
    TF_VAR_SECRET1: ${GITLAB_SECRET}
or
job:
  stage: ...
  script:
    - export TF_VAR_SECRET1=${GITLAB_SECRET}
in your CI job configuration and interpolate these. Please see Getting an Environment Variable in Terraform configuration? as well
Bear in mind that Terraform requires a TF_VAR_ prefix on environment variables. So you actually need something like this in .gitlab-ci.yml:
- 'TF_VAR_AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}'
- 'TF_VAR_AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}'
- 'TF_VAR_AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}'
Which also means you could just set the variable in the pipeline with that prefix as well and not need this extra mapping step.
I see you actually did discover this per your comment. I'm still posting this answer since I missed your comment the first time, and it would have saved me an hour of work.

Environment variables with AWS SSM Run Command

I am using AWS SSM Run Command with the AWS-RunShellScript document to run a script on an Amazon Linux 1 instance. Part of the script includes using an environment variable. When I run the script myself, everything is fine. But when I run the script with SSM, it can't see the environment variable.
This variable needs to be passed to a Python script. I had originally been trying os.environ['VARIABLE'] to no effect.
I know that AWS SSM uses root privileges and so I have put a line exporting the variable in the root ~/.bashrc file, yet it still can not see the variable. The root user can see it when I run it myself.
Is it not possible for AWS SSM to use environment variables, or am I not exporting it correctly? If it is not possible, I'll try using AWS KMS instead to store my variable.
~/.bashrc
export VARIABLE="VALUE"
script.sh
"$VARIABLE"
Security is important, hence why I don't want to just store the variable in the script.
SSM does not open an actual SSH session, so passing environment variables won't work. It's essentially a daemon running on the box that takes your requests and processes them. It's a very basic product: it doesn't support any of the standard features that come with SSH, such as SCP, port forwarding, tunneling, passing of env variables, etc. An alternative way of passing a value you need to a script would be to store it in AWS Systems Manager Parameter Store, and have your script pull the variable from the store.
You'll need to update your instance role permissions to have access to ssm:GetParameters for the script you run to access the value stored.
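For example, a minimal sketch of pulling the value from Parameter Store in Python (the parameter name /my-app/secret-value is hypothetical; assumes boto3 is available and the instance role allows reading that parameter):
import boto3

# Read a (SecureString) parameter from SSM Parameter Store.
ssm = boto3.client("ssm")
response = ssm.get_parameter(Name="/my-app/secret-value", WithDecryption=True)
secret_value = response["Parameter"]["Value"]
print(secret_value)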
My solution to this problem:
set -o allexport; source /etc/environment; set +o allexport
-o allexport enables all variables in /etc/environment to be exported. +o allexport disables this feature.
For more information see the Set builtin documentation
I have tested this solution by using the AWS CLI command aws ssm send-command:
"commands": [
"set -o allexport; source /etc/environment; set +o allexport",
"echo $TEST_VAR > /home/ec2-user/app.log"
]
I am running a bash script in my SSM command document, so I just source the profile/script to have the env variables ready to be used by the subsequent commands. For example:
"runCommand": [
"#!/bin/bash",
". /tmp/setEnv.sh",
"echo \"myVar: $myVar, myVar2: $myVar2\""
]
You can refer to Can a shell script set environment variables of the calling shell? for sourcing your env variables. For Python, you will have to parse your source profile/script; see Emulating Bash 'source' in Python.
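A minimal sketch of that approach in Python, assuming a plain KEY=VALUE file such as /etc/environment (no shell expansion is performed; the path and TEST_VAR are just examples):
import os

def load_env_file(path="/etc/environment"):
    # Load KEY=VALUE pairs into os.environ; skips comments and blank lines,
    # and does not evaluate any shell syntax.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"').strip("'")

load_env_file()
print(os.environ.get("TEST_VAR"))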

AWS credentials in Dockerfile

I require files to be downloaded from AWS S3 during the container build, however I've been unable to provide the AWS credentials to the build process without actually hardcoding them in the Dockerfile. I get the error:
docker fatal error: Unable to locate credentials
despite previously having executed:
aws configure
Moreover, I was not able to use --build-arg for this purpose.
My question: is it possible to have these credentials in build time without hardcoding them in the Dockerfile and if so how?
Thank you for your attention.
Using the --build-arg flag is the correct way to do it (if you don't mind that the values can be seen by everyone using docker history); however, you must use the ARG directive, not the ENV directive, to specify them in your Dockerfile.
Here is an example Dockerfile that I have used with AWS credentials. It takes in the aws credentials as build arguments, including a default argument for the AWS_REGION build argument. It then performs a basic aws action, in this case logging into ecr.
# <base-image> is an image I have that has `aws` installed
FROM <base-image>:latest
ARG AWS_ACCESS_KEY_ID
ARG AWS_SECRET_ACCESS_KEY
ARG AWS_REGION=us-west-2
RUN aws ecr get-login --no-include-email | bash
CMD ["npm", "start"]
You then build the image with the following command:
docker build -t testing --build-arg AWS_ACCESS_KEY_ID=<Your ID Here> \
--build-arg AWS_SECRET_ACCESS_KEY=<Your Key Here> .
Please be aware that the values of the --build-arg arguments can be seen by anyone with access to the image later on using docker history.
It is possible to hide the values from docker history. To achieve this you must use a multi-stage build, which makes the history visible only from the second FROM on.
Based on Jack's snippet example:
FROM <base-image>:latest AS first
ARG AWS_ACCESS_KEY_ID
ARG AWS_SECRET_ACCESS_KEY
ARG AWS_REGION=us-west-2
[do something]
FROM <base-image>:latest
COPY --from=first /dir/file_from_first /dir/file
This is a way to hide all the layers created during the first FROM.