"Unable to determine aws-region" when running on-premises Cloudwatch agent - amazon-web-services

I'm trying to configure the AWS CloudWatch agent to run on vanilla Ubuntu 18.04, outside of AWS. Every time I run it, I get this error:
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m onPremise -c "file:/path/to/cloudwatch/cloudwatch.json" -s
/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:/path/to/cloudwatch/cloudwatch.json --mode onPrem --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
Got Home directory: /root
I! Set home dir Linux: /root
Unable to determine aws-region.
Please make sure the credentials and region set correctly on your hosts.
Refer to http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
Fail to fetch the config!
Running the program under strace -f reveals that it is trying to read /root/.aws/credentials and then exiting. Per the guide, here are the contents of /root/.aws/credentials:
[AmazonCloudWatchAgent]
aws_access_key_id = key
aws_secret_access_key = secret
region = us-west-2
If I run aws configure get region, it retrieves the region correctly. However, the CloudWatch agent is unable to read it. Here are the contents of common-config.toml (which also gets read, per strace):
## Configuration for shared credential.
## Default credential strategy will be used if it is absent here:
## Instance role is used for EC2 case by default.
## AmazonCloudWatchAgent profile is used for onPremise case by default.
[credentials]
shared_credential_profile = "AmazonCloudWatchAgent"
shared_credential_file = "/root/.aws/credentials"
## Configuration for proxy.
## System-wide environment-variable will be read if it is absent here.
## i.e. HTTP_PROXY/http_proxy; HTTPS_PROXY/https_proxy; NO_PROXY/no_proxy
## Note: system-wide environment-variable is not accessible when using ssm run-command.
## Absent in both here and environment-variable means no proxy will be used.
# [proxy]
# http_proxy = "{http_url}"
# https_proxy = "{https_url}"
# no_proxy = "{domain}"
Here are other things I have tried:
enclosing region (and all values) in the configuration in double quotes, per https://forums.aws.amazon.com/thread.jspa?threadID=291589. This did not make a difference.
adding /home/myuser/.aws/config, /home/myuser/.aws/credentials, and /root/.aws/config and populating them with the appropriate values. Per strace these files are not being read.
searching for the source code for the CloudWatch Agent (it is not open source)
setting AWS_REGION=us-west-2 explicitly in the program environment (same error)
changing [AmazonCloudWatchAgent] to [profile AmazonCloudWatchAgent] everywhere and all permutations of the above (no difference)
adding a [default] section in all config files (makes no difference)
invoking the config-downloader program directly, setting AWS_REGION etc. (same error)
becoming a non-root user and then invoking the program using sudo instead of invoking the program as the root user without sudo.
I get the same error no matter what I try. I installed the CloudWatch agent by downloading the "latest" deb on March 23, 2020, per these instructions. https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/download-cloudwatch-agent-commandline.html

The AWS config location defaults to C:\Users\Administrator rather than the home directory of the user you installed the CloudWatch agent as, so you may need to move the .aws folder to the CloudWatch agent's user. Or, more straightforwardly:
aws configure --profile AmazonCloudWatchAgent
as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#install-CloudWatch-Agent-iam_user-first
You can also specify the region using common-config.toml as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#CloudWatch-Agent-profile-instance-first
On a server running Windows Server, this file is in the C:\ProgramData\Amazon\AmazonCloudWatchAgent directory. The default common-config.toml is as follows:
# This common-config is used to configure items used for both ssm and cloudwatch access
## Configuration for shared credential.
## Default credential strategy will be used if it is absent here:
## Instance role is used for EC2 case by default.
## AmazonCloudWatchAgent profile is used for onPremise case by default.
# [credentials]
# shared_credential_profile = "{profile_name}"
# shared_credential_file= "{file_name}"
## Configuration for proxy.
## System-wide environment-variable will be read if it is absent here.
## i.e. HTTP_PROXY/http_proxy; HTTPS_PROXY/https_proxy; NO_PROXY/no_proxy
## Note: system-wide environment-variable is not accessible when using ssm run-command.
## Absent in both here and environment-variable means no proxy will be used.
# [proxy]
# http_proxy = "{http_url}"
# https_proxy = "{https_url}"
# no_proxy = "{domain}"
You can also update the common-config.toml with a new location if needed.

I was using an incorrect "secret" with an invalid character that caused the INI file parser to break. The CloudWatch agent incorrectly reported this as a "missing region," when a parse error or "invalid secret" error would have been more accurate.
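One way to catch this kind of problem before starting the agent is to parse the credentials file yourself. The sketch below uses Python's standard configparser, which is not the parser the agent uses, so it can only flag problems that a generic INI parser would also reject. The function name check_profile and the list of required keys are illustrative assumptions, not part of the agent.

```python
import configparser


def check_profile(text, profile="AmazonCloudWatchAgent"):
    """Return a list of problems found in INI-formatted credentials text."""
    # interpolation=None so a '%' in a secret is not treated as a format marker
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return [f"parse error: {exc}"]
    if not parser.has_section(profile):
        return [f"profile [{profile}] not found"]
    # Assumed set of keys, taken from the credentials file shown in the question
    required = ("aws_access_key_id", "aws_secret_access_key", "region")
    return [f"missing key: {key}" for key in required
            if not parser.get(profile, key, fallback="").strip()]


if __name__ == "__main__":
    sample = """[AmazonCloudWatchAgent]
aws_access_key_id = key
aws_secret_access_key = secret
region = us-west-2
"""
    print(check_profile(sample) or "looks OK")
```

An empty list means the generic parser found nothing wrong; it does not prove that the agent's own parser will accept the file.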

Create a new file named config in the same folder as credentials,
and add the region there:
[default]
region = your-region
see more here

You have to uncomment the # [credentials] in the /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml config file as well

Set the AWS_REGION environment variable.
On Linux, macOS, or Unix, use:
export AWS_REGION=your_aws_region

Related

Cloudwatch agent not using environment variable credentials on Windows

I'm trying to configure an AMI using a script that installs the unified CloudWatch agent on both AWS and on-premises Windows machines, using static IAM credentials for both. As part of the script, I set the credentials statically (as a test) using:
$Env:AWS_ACCESS_KEY_ID="myaccesskey"
$Env:AWS_SECRET_ACCESS_KEY="mysecretkey"
$Env:AWS_DEFAULT_REGION="us-east-1"
Once I have the AMI, I create a machine, connect to it, and verify that the credentials are there by running aws configure list:
Name Value Type Location
---- ----- ---- --------
profile <not set> None None
access_key ****************C6IF env
secret_key ****************SCnC env
region us-east-1 env ['AWS_REGION', 'AWS_DEFAULT_REGION']
But when I start the agent, I get the following error in the logs.
2022-12-26T17:51:49Z I! First time setting retention for log group test-cloudwatch-agent, update map to avoid setting twice
2022-12-26T17:51:49Z E! Failed to get credential from session: NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, .
EC2RoleRequestError: no EC2 instance role found
caused by: EC2MetadataError: failed to make EC2Metadata request
I'm using the Administrator user both to install the agent and to RDP into the machine. Is there anything I'm missing?
I've already tried adding the credentials to the .aws/credentials file and modifying the common-config.toml file to use a profile. That works, but in my case I just want to use the environment variables.
EDIT: I tried setting the credentials in the userdata script, slightly changing how they are created, and now it seems to work.
$env:aws_access_key_id = "myaccesskeyid"
$env:aws_secret_access_key = "mysecretaccesskey"
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:aws_access_key_id,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:aws_secret_access_key,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION','us-east-1',[System.EnvironmentVariableTarget]::Machine)
Now the problem is that I'm trying to start the agent at the end of the userdata script with the command from the documentation but it does nothing (I see in the agent logs the command but there is no error). If I RDP into the machine and launch the same command in Powershell it works fine. The command is:
& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m onPrem -s -c file:"C:\ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json"
I finally got it working, but I'm not sure why it didn't work before. I was using
$env:aws_access_key_id = "accesskeyid"
$env:aws_secret_access_key = "secretkeyid"
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:aws_access_key_id,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:aws_secret_access_key,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION','us-east-1',[System.EnvironmentVariableTarget]::Machine)
to set the variables but then the agent was failing to initialize. I had to add
$env:aws_default_region = "us-east-1"
so that it was able to run. I couldn't find the issue earlier because on Windows Server 2022 I don't get the logs from the execution. I had to switch to Windows Server 2019 to actually see the error when launching the agent.
I still don't know why the environment variables I set in the machine scope worked once I logged into the machine but not when used as part of the userdata script.
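A likely explanation, offered as an assumption rather than something confirmed in this thread: SetEnvironmentVariable with the Machine target stores the value for processes that start later, while the script that is already running keeps its own environment block, and anything it launches inherits from that block. That would be why the agent only saw the region once $env:aws_default_region was set in the script itself. The Python sketch below illustrates the inheritance rule in a platform-neutral way; DEMO_REGION is a made-up variable name.

```python
import os
import subprocess
import sys

# Command that makes a child Python process print one environment variable.
CHILD = [sys.executable, "-c",
         "import os; print(os.environ.get('DEMO_REGION', '<unset>'))"]


def child_sees():
    """Start a child process and return what it sees for DEMO_REGION."""
    result = subprocess.run(CHILD, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    os.environ.pop("DEMO_REGION", None)
    print(child_sees())   # the child has nothing to inherit yet
    os.environ["DEMO_REGION"] = "us-east-1"
    print(child_sees())   # the child inherits the parent's current value
```

The child only ever sees what is in the parent's environment at launch time, so a value stored elsewhere for future sessions does not reach it.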

VM Manager - OS Policy Assignment for a Windows VM in GCP

I am trying to create a couple of os policy assignments to configure - run some scripts with PowerShell - and install some security agents on a Windows VM (Windows Server 2022), by using the VM Manager. I am following the official Google documentation to setup the os policies. The VM Manager is already enabled, nevertheless I have difficulties creating the appropriate .yaml file which is required for the policy assignment since I haven't found any detailed examples.
Related topics I have found:
Google documentation offers a very simple example of installing an .msi file - Example OS policies.
An example of a fixed policy assignment in Terraform registry - google_os_config_os_policy_assignment, from where I managed to better comprehend the required structure for the .yaml file even though it is in a .json format.
Few examples provided at GCP GitHub repository (OSPolicyAssignments).
OS Policy resources in JSON representation - REST Resource, from where you can navigate to sample cases based on the selected resource.
But it is still not very clear how to create the desired .yaml file (e.g. to copy some files, or to run a PowerShell script that performs an installation or an authentication). According to the Google documentation, pkg, repository, exec, and file are the supported resource types.
Are there any more detailed examples I could use to understand what is needed? Have you already tried something similar?
Update: Adding an additional source.
You need to follow these steps:
Ensure that the OS Config agent is installed in your VM by running the below command in PowerShell:
PowerShell Get-Service google_osconfig_agent
you should see an output like this:
Status Name DisplayName
------ ---- -----------
Running google_osconfig... Google OSConfig Agent
if the agent is not installed, refer to this tutorial.
Set the metadata values to enable OSConfig agent with Cloud Shell command:
gcloud compute instances add-metadata $YOUR_VM_NAME \
--metadata=enable-osconfig=TRUE
Generate an OS policy and OS policy assignment yaml file. As an example, I am generating an OS policy that installs a msi file retrieved from a GCS bucket, and an OS policy assignment to run it in all Windows VMs:
# An OS policy assignment to install a Windows MSI downloaded from a Google Cloud Storage bucket
# on all VMs running Windows Server OS.
osPolicies:
  - id: install-msi-policy
    mode: ENFORCEMENT
    resourceGroups:
      - resources:
          - id: install-msi
            pkg:
              desiredState: INSTALLED
              msi:
                source:
                  gcs:
                    bucket: <your_bucket_name>
                    object: chrome.msi
                    generation: 1656698823636455
instanceFilter:
  inventories:
    - osShortName: windows
rollout:
  disruptionBudget:
    fixed: 10
  minWaitDuration: 300s
Note: Every file has its own generation number, you can get it with the command gsutil stat gs://<your_bucket_name>/<your_file_name>.
Apply the policies created in the previous step using Cloud Shell command:
gcloud compute os-config os-policy-assignments create $POLICY_NAME --location=$YOUR_ZONE --file=/<your-file-path>/<your_file_name.yaml> --async
Refer to the Examples of OS policy assignments for more scenarios, and check out this example of a PowerShell script.
Below you can find the .yaml file that worked in my case. It copies a file and executes a PowerShell command in order to configure and deploy a sample agent (TrendMicro). Again, this is specifically for a Windows VM.
.yaml file:
id: trendmicro-windows-policy
mode: ENFORCEMENT
resourceGroups:
  - resources:
      - id: copy-exe-file
        file:
          path: C:/Program Files/TrendMicro_Windows.ps1
          state: CONTENTS_MATCH
          permissions: '755'
          file:
            gcs:
              bucket: [your_bucket_name]
              generation: [your_generation_number]
              object: Windows/TrendMicro/TrendMicro_Windows.ps1
      - id: validate-running
        exec:
          validate:
            interpreter: POWERSHELL
            script: |
              $service = Get-Service -Name 'ds_agent'
              if ($service.Status -eq 'Running') {exit 100} else {exit 101}
          enforce:
            interpreter: POWERSHELL
            script: |
              Start-Process PowerShell -ArgumentList '-ExecutionPolicy Unrestricted','-File "C:\Program Files\TrendMicro_Windows.ps1"' -Verb RunAs
To elaborate a bit more, this .yaml file:
copy-exe-file: Copies the installation script from GCS to the specified location on the VM. The generation number can be found under "VERSION HISTORY" when you select the object in GCS.
validate-running: This resource has two steps. The validate step checks whether the agent is up and running on the VM. If it is not, the enforce step runs and executes the "TrendMicro_Windows.ps1" file with PowerShell. This .ps1 file downloads, configures and installs the agent. Note 1: The command is executed as Administrator and the full path of the file is specified. Note 2: Start-Process pwsh can be used instead of Start-Process PowerShell; this was vital in one of my cases.
A PowerShell command can be run directly in the enforce step. Nonetheless, I found it much easier to put it in a .ps1 file first and then just run that file, since the .yaml file comes with some restrictions.
PS: Passing osconfig-log-level with the value debug as a key-value pair in metadata, either directly on a VM or applied to all of them (Compute Engine > Settings > Metadata > EDIT > ADD ITEM), provides some additional information and may help you deal with errors.

Airflow BashOperator - Use a different role than its pod role

I tried to run the following commands as part of a bash script executed by a BashOperator:
aws s3 ls s3://bucket
aws s3 cp ... ...
The script runs successfully, but the aws cli commands return an error showing that aws cli doesn't run with the needed permissions (as defined in the airflow-worker-node role).
Investigating the error:
I upgraded awscli in the Docker image running the pod to version 2.4.9 (as I understand it, old versions of awscli don't support access to S3 based on permissions granted by an AWS role).
I investigated the pod running my bash script via the BashOperator:
Using k9s and the D (describe) command:
I saw that ARN_ROLE is defined correctly.
Using k9s and the s (shell) command:
I saw that the pod environment variables are correct.
aws cli worked with the needed permissions and could access S3 as needed.
aws sts get-caller-identity reported the right role (airflow-worker-node).
Running the same commands as part of the bash script executed in the BashOperator gave different results:
Running env showed a limited set of environment variables.
aws cli returned a permission-related error.
aws sts get-caller-identity reported the EKS role (eks-worker-node).
How can I grant aws cli in my BashOperator bash script the needed permissions?
Reviewing the BashOperator source code, I've noticed the following code:
https://github.com/apache/airflow/blob/main/airflow/operators/bash.py
def get_env(self, context):
    """Builds the set of environment variables to be exposed for the bash command"""
    system_env = os.environ.copy()
    env = self.env
    if env is None:
        env = system_env
    else:
        if self.append_env:
            system_env.update(env)
            env = system_env
And the following documentation:
:param env: If env is not None, it must be a dict that defines the
environment variables for the new process; these are used instead
of inheriting the current process environment, which is the default
behavior. (templated)
:type env: dict
:param append_env: If False(default) uses the environment variables passed in env params
and does not inherit the current process environment. If True, inherits the environment variables
from current passes and then environment variable passed by the user will either update the existing
inherited environment variables or the new variables gets appended to it
:type append_env: bool
If the BashOperator's env parameter is None, it copies the environment variables of the parent process.
In my case, I provided some env variables, so it didn't copy the parent process's environment variables into the child process. This caused the child process (the BashOperator process) to use the default role of eks-worker-node.
The simple solution is to set append_env=True in BashOperator(), which combines all existing env variables with the ones I added manually.
However, I found that the version I'm running (2.0.1) doesn't support it (it is supported in later versions).
As a temporary solution I added **os.environ to the BashOperator env parameter:
return BashOperator(
    task_id="copy_data_from_mcd_s3",
    env={
        "dag_input": "{{ dag_run.conf }}",
        ......
        **os.environ,
    },
    # append_env=True,- should be supported in 2.2.0
    bash_command="utils/my_script.sh",
    dag=dag,
    retries=1,
)
This solved the problem.
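The merging rule in the get_env snippet quoted above can be reproduced without Airflow. The following is a standalone sketch of that logic, not the operator itself; build_env is a hypothetical helper and the sample variables are illustrative.

```python
import os


def build_env(custom, append_env=False, system_env=None):
    """Mimic the env selection in the get_env snippet quoted above."""
    system_env = dict(os.environ if system_env is None else system_env)
    if custom is None:
        return system_env           # nothing passed: inherit everything
    if append_env:
        system_env.update(custom)   # inherit, then let custom values win
        return system_env
    return dict(custom)             # custom only: inherited variables are dropped


if __name__ == "__main__":
    # Illustrative values standing in for what the pod would provide
    inherited = {"AWS_ROLE_ARN": "arn:example", "PATH": "/usr/bin"}
    print(build_env({"dag_input": "x"}, system_env=inherited))
    print(build_env({"dag_input": "x"}, append_env=True, system_env=inherited))
```

One difference worth noting: in a dict literal the later entry wins, so placing **os.environ after the custom keys lets an inherited variable override a custom one with the same name, whereas the append_env branch gives priority to the custom values.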

Setting environment variable for a Compute Engine VM

I need to set an environment variable within my virtual machine on Google Compute Engine. The variable I need to set is called "GOOGLE_APPLICATION_CREDENTIALS" and, according to the Google documentation, I need to set its value to the path of a JSON file. I have two questions:
1: Can I set this variable within the Google Compute Engine interface on GCP?
2: Can I use System.Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", Resources.googlecredentials.credentials);? Whenever I try and set this variable on my local machine I use this technique, but I set the value to the path of the file (local directory). However, because I am now using a virtual machine, I was wondering, can I set the environment variable to the actual contents of a resource file? Advantageously, this allows me to embed the credentials into the actual app itself.
Cheers
Store your credentials in a file temporarily
$HOME/example/g-credentials.json
{
"foo": "bar"
}
Then upload it to your GCE project's metadata as a string
gcloud compute project-info add-metadata \
--metadata-from-file g-credentials=$HOME/example/g-credentials.json
You can view your GCE project's metadata in the cloud console by searching for metadata, or you can view it using gcloud
gcloud compute project-info describe
Then set the env var and load the config in your VM's startup script
$HOME/example/startup.txt
#! /bin/bash
# gce project metadata key where the config json is stored as a string
meta_key=g-credentials
env_key=GOOGLE_APPLICATION_CREDENTIALS
config_file=/opt/g-credentials.json
env_file=/etc/profile
# command to set env variable
temp_cmd="export $env_key=$config_file"
# command to write $temp_cmd to file if $temp_cmd doesnt exist w/in it
perm_cmd="grep -q -F '$temp_cmd' $env_file || echo '$temp_cmd' >> $env_file"
# set the env var only for the duration of this script.
# can delete this if you don't start processes at the end of
# this script that utilize the env var.
eval $temp_cmd
# set the env var permanently for any SUBSEQUENT shell logins
eval $perm_cmd
# load the config from the projects metadata
config=`curl -f http://metadata.google.internal/computeMetadata/v1/project/attributes/$meta_key -H "Metadata-Flavor: Google" 2>/dev/null`
# write it to file (quoted so whitespace in the JSON is preserved)
echo "$config" > "$config_file"
# start other processes below ...
example instance
gcloud compute instances create vm-1 \
--metadata-from-file startup-script=$HOME/example/startup.txt \
--zone=us-west1-a
you could also edit the user's profile:
nano ~/.bashrc
or even system-wide with /etc/profile, /etc/bash.bashrc, or /etc/environment
and then add:
export GOOGLE_APPLICATION_CREDENTIALS=...
Custom Metadata can also be used, which is rather GCE specific.
Yes, you could set it within your RDP/SSH session.
No, you should set the path in the variable according to the documentation, alternatively, there are code examples that gather the service account path in a variable to use the credentials within your applications.
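If the credentials really are embedded in the application as a string, one option is to write them to a private file at startup and point the variable at that file. The sketch below is in Python rather than the C# of the question and only illustrates the idea; use_embedded_credentials and the sample JSON are hypothetical.

```python
import json
import os
import tempfile


def use_embedded_credentials(credentials_json):
    """Write credential JSON to a private temp file and export its path."""
    json.loads(credentials_json)  # fail early if the embedded text is not JSON
    # mkstemp creates the file readable and writable only by the current user
    fd, path = tempfile.mkstemp(prefix="g-credentials-", suffix=".json")
    with os.fdopen(fd, "w") as handle:
        handle.write(credentials_json)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path


if __name__ == "__main__":
    print(use_embedded_credentials('{"foo": "bar"}'))
```

The variable set this way is visible to the current process and anything it starts afterwards, which is usually enough for client libraries loaded in the same application.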

A sane way to set up CloudWatch logs (awslogs-agent)

tl;dr The configuration of cloudwatch agent is #$%^. Any straightforward way?
I wanted one place to store the logs, so I used Amazon CloudWatch Logs Agent. At first it seemed like I'd just add a Resource saying something like "create a log group, then a log stream and send this file, thank you" - all declarative and neat, but...
According to this doc I had to setup JSON configuration that created a BASH script that downloaded a Python script that set up the service that used a generated config in yet-another-language somewhere else.
I'd think logging is something frequently used, so there must be a declarative configuration way, not this 4-language crazy combo. Am I missing something, or is ops world so painful?
Thanks for ideas!
"Agent" is just an aws-cli plugin and a bunch of scripts. You can install the plugin with pip install awscli-cwlogs on most systems (assuming you already installed awscli itself). NOTE: I think Amazon Linux is not "most systems" and might require a different approach.
Then you'll need two configs: awscli config with the following content (also add credentials if needed and replace us-east-1 with your region):
[plugins]
cwlogs = cwlogs
[default]
region = us-east-1
and logging config with something like this (adjust to your needs according to the docs):
[general]
state_file = push-state
[logstream-cfn-init.log]
datetime_format = %Y-%m-%d %H:%M:%S,%f
file = /var/log/cfn-init.log
file_fingerprint_lines = 1-3
multi_line_start_pattern = {datetime_format}
log_group_name = ec2-logs
log_stream_name = {hostname}-{instance_id}/cfn-init.log
initial_position = start_of_file
encoding = utf_8
buffer_duration = 5000
After that, to start the daemon automatically, you can create a systemd unit like this (change the config paths to where you actually put them):
[Unit]
Description=CloudWatch logging daemon
[Service]
ExecStart=/usr/local/bin/aws logs push --config-file /etc/aws/cwlogs
Environment=AWS_CONFIG_FILE=/etc/aws/config
Restart=always
Type=simple
[Install]
WantedBy=multi-user.target
After that you can systemctl enable and systemctl start as usual. That assumes your instance is running a distribution that uses systemd (most do nowadays; if yours doesn't, consult your distribution's documentation to learn how to run daemons).
The official setup script also adds a config for logrotate. I skipped that part because it wasn't required in my case, but if your logs are rotated you might want to handle it. Consult the setup script and the logrotate documentation for details (essentially you just need to restart the daemon whenever files are rotated).
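The datetime_format value in the logging config above uses strptime-style directives, so a quick way to check it against your own log lines is to parse a sample timestamp. This is a sketch in Python, assuming the agent interprets these directives the same way Python's strptime does; parse_prefix is a hypothetical helper and the sample line is made up.

```python
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # value from the logging config above


def parse_prefix(line, fmt=DATETIME_FORMAT, width=23):
    """Parse the first `width` characters of a log line using `fmt`."""
    return datetime.strptime(line[:width], fmt)


if __name__ == "__main__":
    sample = "2020-03-23 12:34:56,789 [INFO] example message"  # made-up line
    print(parse_prefix(sample))
```

If strptime raises ValueError for a real line from your log, the format string most likely needs adjusting before the agent can extract timestamps from it.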
You've linked documentation specific to CloudFormation, so much of the complexity is probably associated with that context.
Here's the stand-alone documentation for the CloudWatch Logs Agent:
Quick Start
Agent Reference
If you're on Amazon Linux, you can install the 'awslogs' system package via yum. Once that's done, you can enable the logs plugin for the AWS CLI by making sure you have the following section in the CLI's config file:
[plugins]
cwlogs = cwlogs
E.g., the system package should create a file under /etc/awslogs/awscli.conf . You can use that file by setting the...
AWS_CONFIG_FILE=/etc/awslogs/awscli.conf
...environment variable.
Once that's all done, you can:
$ aws logs push help
and
$ cat /path/to/some/file | aws logs push [options]
The agent also comes with helpers to keep various log files in sync.