CloudWatch agent not using environment variable credentials on Windows - amazon-web-services

I'm trying to configure an AMI with a script that installs the unified CloudWatch agent on both AWS and on-premises Windows machines, using static IAM credentials for both. As part of the script, I set the credentials statically (as a test) using
$Env:AWS_ACCESS_KEY_ID="myaccesskey"
$Env:AWS_SECRET_ACCESS_KEY="mysecretkey"
$Env:AWS_DEFAULT_REGION="us-east-1"
Once I have the AMI, I create a machine and connect to it, and then verify the credentials are there by running aws configure list
Name Value Type Location
---- ----- ---- --------
profile <not set> None None
access_key ****************C6IF env
secret_key ****************SCnC env
region us-east-1 env ['AWS_REGION', 'AWS_DEFAULT_REGION']
But when I start the agent, I get the following error in the logs.
2022-12-26T17:51:49Z I! First time setting retention for log group test-cloudwatch-agent, update map to avoid setting twice
2022-12-26T17:51:49Z E! Failed to get credential from session: NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, .
EC2RoleRequestError: no EC2 instance role found
caused by: EC2MetadataError: failed to make EC2Metadata request
I'm using the Administrator user both for installing the agent and when RDPing into the machine. Is there anything I'm missing?
I've already tried adding the credentials to the .aws/credentials file and modifying the common-config.toml file to use a profile. That way it works, but in my case I want to use only the environment variables.
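For reference, a sketch of the profile-based setup that did work (the profile name and file paths shown here are just examples; the format matches the agent's common-config.toml and the standard credentials file):
# C:\Users\Administrator\.aws\credentials
[AmazonCloudWatchAgent]
aws_access_key_id = myaccesskey
aws_secret_access_key = mysecretkey
region = us-east-1
# C:\ProgramData\Amazon\AmazonCloudWatchAgent\common-config.toml
[credentials]
  shared_credential_profile = "AmazonCloudWatchAgent"
  shared_credential_file = "C:\\Users\\Administrator\\.aws\\credentials"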
EDIT: I tested adding the credentials in the userdata script and modified a bit how they are created, and now it seems to work.
$env:aws_access_key_id = "myaccesskeyid"
$env:aws_secret_access_key = "mysecretaccesskey"
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:aws_access_key_id,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:aws_secret_access_key,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION','us-east-1',[System.EnvironmentVariableTarget]::Machine)
Now the problem is that I'm trying to start the agent at the end of the userdata script with the command from the documentation, but it does nothing (I can see the command in the agent logs, but there is no error). If I RDP into the machine and run the same command in PowerShell, it works fine. The command is:
& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m onPrem -s -c file:"C:\ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json"

I was finally able to make it work, but I'm not sure why it didn't before. I was using
$env:aws_access_key_id = "accesskeyid"
$env:aws_secret_access_key = "secretkeyid"
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:aws_access_key_id,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:aws_secret_access_key,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION','us-east-1',[System.EnvironmentVariableTarget]::Machine)
to set the variables, but the agent was failing to initialize. I had to add
$env:aws_default_region = "us-east-1"
so it was able to run. I couldn't find the issue before because on Windows Server 2022 I don't get the logs from the userdata execution. I had to try Windows Server 2019 to actually see the error when launching the agent.
I still don't know why the environment variables I set at machine scope worked once I was logged into the machine but not when they were used as part of the userdata script.
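For reference, a consolidated sketch of the userdata snippet described above (placeholder keys; the $env: lines make the values visible to the current userdata process, while SetEnvironmentVariable persists them at machine scope for later sessions):
# set credentials for the current process so the agent started below can see them
$env:AWS_ACCESS_KEY_ID = "myaccesskeyid"
$env:AWS_SECRET_ACCESS_KEY = "mysecretaccesskey"
$env:AWS_DEFAULT_REGION = "us-east-1"
# persist them at machine scope for future sessions
[System.Environment]::SetEnvironmentVariable('AWS_ACCESS_KEY_ID',$env:AWS_ACCESS_KEY_ID,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_SECRET_ACCESS_KEY',$env:AWS_SECRET_ACCESS_KEY,[System.EnvironmentVariableTarget]::Machine)
[System.Environment]::SetEnvironmentVariable('AWS_DEFAULT_REGION',$env:AWS_DEFAULT_REGION,[System.EnvironmentVariableTarget]::Machine)
# start the agent with the on-premises configuration
& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m onPrem -s -c file:"C:\ProgramData\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent.json"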

Related

How to make aws cli quiet

When I am using aws cli commands it adds debug data to its output.
Is there a way to make it quiet?
Here is my use case:
# get deployed version
COMMAND="git describe --tags"
aws ecs execute-command --cluster="${CLUSTER}" --task="${TASK}" --container="${SERVICE}" --command="${COMMAND}" --interactive > VERSION
The issue is that instead of the expected contents of the VERSION file (just the version number):
0.0.67
I get something like this:
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.
Starting session with SessionId: ecs-execute-command-123456789abcdefgh
0.0.67
Exiting session with sessionId: ecs-execute-command-123456789abcdefgh.
How can I get rid of the debug data?
I already tried adding a --quiet parameter (the parameter does not exist) and redirecting error output; neither helped.
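One possible workaround is to filter the Session Manager wrapper lines out of the captured output after the fact; a minimal sketch, assuming those wrapper lines always start with the phrases shown above and everything else is the command's own output:
# get deployed version, stripping the Session Manager banner lines and blank lines
COMMAND="git describe --tags"
aws ecs execute-command --cluster="${CLUSTER}" --task="${TASK}" --container="${SERVICE}" --command="${COMMAND}" --interactive \
  | grep -v -e '^The Session Manager plugin' -e '^Starting session with SessionId' -e '^Exiting session with sessionId' \
  | sed '/^$/d' > VERSION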

AWS EB docker-compose deployment from private registry access forbidden

I'm trying to get docker-compose deployment to AWS Elastic Beanstalk working, in which the docker images are pulled from a private registry hosted by GitLab.
The strange thing is that the initial deployment works perfectly: it pulls the image from the private registry, starts the containers using docker-compose, and the webpage (served by Django) is accessible through the host.
Deploying a new version using the same docker-compose and the same docker image will result in an error while pulling the docker image:
2021/03/16 09:28:34.957094 [ERROR] An error occurred during execution of command [app-deploy] - [Run Docker Container]. Stop running the command. Error: failed to run docker containers: Command /bin/sh -c docker-compose up -d failed with error exit status 1. Stderr:Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating network "current_default" with the default driver
Pulling redis (redis:alpine)...
Pulling mysql (mysql:5.7)...
Pulling project.dockertest(registry.gitlab.com/company/spikes/dockertest:latest)...
Get https://registry.gitlab.com/v2/company/spikes/dockertest/manifests/latest: denied: access forbidden
2021/03/16 09:28:34.957104 [INFO] Executing cleanup logic
Setup
AWS Elastic Beanstalk 64bit Amazon Linux 2/3.2
GitLab registry credentials are stored in an S3 bucket, with the filename .dockercfg and the following content:
{
  "auths": {
    "registry.gitlab.com": {
      "auth": "base64 encoded username:personal_access_token"
    }
  },
  "HttpHeaders": {
    "User-Agent": "Docker-Client/18.03.1-ce (linux)"
  }
}
The repository contains a v3 Dockerrun.aws.json file to refer to the credential file in S3:
{
  "AWSEBDockerrunVersion": "3",
  "Authentication": {
    "bucket": "gitlab-dockercfg",
    "key": ".dockercfg"
  }
}
Reproduce
Set up a docker-compose.yml that uses a service with a private Docker image (which can be pulled with the credentials set up in the dockercfg within S3).
Create a new application that uses the Docker platform.
eb init testapplication --platform=docker --region=eu-west-1
Note: the region must be the same as that of the S3 bucket containing the dockercfg.
Initial deployment (this will succeed)
eb create testapplication-test --branch_default --cname testapplication-test --elb-type=application --instance-types=t2.micro --min-instances=1 --max-instances=4
The initial deployment shows that the image is available and can be started:
2021/03/16 08:58:07.533988 [INFO] save docker tag command: docker tag 5812dfe24a4f redis:alpine
2021/03/16 08:58:07.533993 [INFO] save docker tag command: docker tag f8fcde8b9ae2 mysql:5.7
2021/03/16 08:58:07.533998 [INFO] save docker tag command: docker tag 1dd9b65d6a9f registry.gitlab.com/company/spikes/dockertest:latest
2021/03/16 08:58:07.534010 [INFO] Running command /bin/sh -c docker rm `docker ps -aq`
Without changing anything in the local repository or the remote Docker image on the private registry, let's do a redeployment, which will trigger the error:
eb deploy testapplication-test
This will fail with the following output:
...
2021-03-16 10:02:28 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2021-03-16 10:02:29 ERROR Unsuccessful command execution on instance id(s) 'i-0dc445d118ac14b80'. Aborting the operation.
2021-03-16 10:02:29 ERROR Failed to deploy application.
ERROR: ServiceError - Failed to deploy application.
And logs of the instance show (/var/log/eb-engine.log):
Pulling redis (redis:alpine)...
Pulling mysql (mysql:5.7)...
Pulling project.dockertest (registry.gitlab.com/company/spikes/dockertest:latest)...
Get https://registry.gitlab.com/v2/company/spikes/dockertest/manifests/latest: denied: access forbidden
2021/03/16 10:02:25.902479 [INFO] Executing cleanup logic
Steps I've tried to debug or solve the issue
Rename dockercfg to .dockercfg on S3 (mentioned somewhere on the internet as a possible solution).
Use the 'old' Docker config format instead of the one generated by Docker 1.7+. Later on I figured out that Amazon Linux 2 instances are compatible with the new format together with Dockerrun v3.
Having an incorrectly formatted dockercfg on S3 causes a deployment error about the malformed file (so the deployment actually does read the dockercfg from S3).
Documentation
https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/single-container-docker-configuration.html
I'm out of debugging options, and I have no idea where to look further. Perhaps someone can see what is going wrong here?
First of all, the issue described above is a bug confirmed by Amazon. To get the deployment working on our side, we contacted Amazon support.
They have a fix in place which should be released this month, so keep an eye on the changelog of the Elastic Beanstalk platform: https://docs.aws.amazon.com/elasticbeanstalk/latest/relnotes/relnotes.html
Although the upcoming release should have the fix, there is a workaround available to get the docker-compose deployment working.
Elastic Beanstalk allows hooks to be executed during the deployment, which can be used to fetch the .dockercfg from an S3 bucket and authenticate against the private registry.
To do so, create the following file and directories from the root of the project:
File location: .platform/hooks/predeploy/docker_login
#!/bin/bash
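# Fetch the registry credentials from S3 so the Docker daemon can authenticate against the private registry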
aws s3 cp s3://{{bucket_name_to_use}}/.dockercfg ~/.docker/config.json
Important: Add execution rights to this file (for example: chmod +x .platform/hooks/predeploy/docker_login)
To support instance configuration changes, please symlink the hooks directory to confighooks:
ln -s .platform/hooks/ .platform/confighooks/
Updating the configuration requires the .dockercfg credentials to be fetched too.
This should enable continuous deployments to the same EB instance without the authentication errors, because the hook is executed before the Docker images are pulled.
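Putting the pieces together, the project then contains roughly the following layout (the confighooks entry is the symlink created above):
.platform/
    hooks/
        predeploy/
            docker_login        (fetches .dockercfg from S3; must be executable)
    confighooks/ -> hooks/      (symlink so configuration deployments run the same hook)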
Some background:
The Docker daemon reads credentials from ~/.docker/config.json by default on traditional Linux systems. On the initial deploy this file exists on the Elastic Beanstalk instance; on the next deployment it is removed. Unfortunately, the .dockercfg is not re-fetched on that next deployment, therefore the Docker daemon does not have the correct credentials to authenticate with.
I was dealing with the same errors while trying to pull images from a privately hosted GitLab instance. I was able to resolve them by including, alongside the auth field of the .dockercfg file, the email address associated with the generated token.
The following file format worked for me:
"registry.gitlab.com" {
"auth": "base64 encoded username:personal_access_token",
"email": "email for personal access token"
}
In my case I used a Project Access Token, which has an e-mail address associated with it once it is created.
The Elastic Beanstalk documentation for the authentication file indicates that this is the required file format, though the Docker versions for which it says this format is required are almost certainly outdated, since we are running Docker ^19.

Errors when applying AWS eb commands

https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_nodejs_express.html
I'm trying to follow these steps to deploy an example Express application for the first time. After installing the Elastic Beanstalk Command Line Interface (EB CLI), I can run eb commands in the Command Prompt (using Windows 10). After initializing a Git repository, I should use commands to configure an EB CLI repository.
These commands are run in the directory of an ExpressJS project:
First I enter the command eb init --platform Node.js --region us-east-2, which results in the message (in a separate window): Application AWS2 has been created.
Next I enter the command eb create --sample node-express-env, which results in the error message ERROR: InvalidParameterValueError - Environment node-express-env already exists.
Then when I enter the command eb open, the message says ERROR: This branch does not have a default environment. You must either specify an environment by typing "eb open my-env-name" or set a default environment by typing "eb use my-env-name".
Then when I enter eb open node-express-env, there's another message ERROR: NotFoundError - Environment "node-express-env" not Found. which contradicts the message from step 2.
Make sure that you configured the CLI to use the same region in which your environment was created.
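A minimal sketch of how to check and fix that, assuming the environment was originally created in us-east-2:
# re-point the EB CLI at the region that actually holds the environment
eb init --platform Node.js --region us-east-2
# confirm the environment is now visible
eb list
# make it the default environment for this branch
eb use node-express-env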

"Unable to determine aws-region" when running on-premises Cloudwatch agent

I'm trying to configure the AWS CloudWatch agent to run on vanilla Ubuntu 18.04, outside of AWS. Every time I run it, I get this error:
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m onPremise -c "file:/path/to/cloudwatch/cloudwatch.json" -s
/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:/path/to/cloudwatch/cloudwatch.json --mode onPrem --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
Got Home directory: /root
I! Set home dir Linux: /root
Unable to determine aws-region.
Please make sure the credentials and region set correctly on your hosts.
Refer to http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
Fail to fetch the config!
Running the program under strace -f reveals that it is trying to read /root/.aws/credentials and then exiting. Per the guide, here are the contents of /root/.aws/credentials:
[AmazonCloudWatchAgent]
aws_access_key_id = key
aws_secret_access_key = secret
region = us-west-2
If I run aws configure get region, it is able to retrieve the region correctly. However, the CloudWatch agent is unable to read it. Here are the contents of common-config.toml (which also gets read, per strace):
## Configuration for shared credential.
## Default credential strategy will be used if it is absent here:
## Instance role is used for EC2 case by default.
## AmazonCloudWatchAgent profile is used for onPremise case by default.
[credentials]
shared_credential_profile = "AmazonCloudWatchAgent"
shared_credential_file = "/root/.aws/credentials"
## Configuration for proxy.
## System-wide environment-variable will be read if it is absent here.
## i.e. HTTP_PROXY/http_proxy; HTTPS_PROXY/https_proxy; NO_PROXY/no_proxy
## Note: system-wide environment-variable is not accessible when using ssm run-command.
## Absent in both here and environment-variable means no proxy will be used.
# [proxy]
# http_proxy = "{http_url}"
# https_proxy = "{https_url}"
# no_proxy = "{domain}"
Here are other things I have tried:
enclosing region (and all values) in the configuration in double quotes, per https://forums.aws.amazon.com/thread.jspa?threadID=291589. This did not make a difference.
adding /home/myuser/.aws/config, /home/myuser/.aws/credentials, and /root/.aws/config and populating them with the appropriate values. Per strace these files are not being read.
searching for the source code for the CloudWatch Agent (it is not open source)
setting AWS_REGION=us-west-2 explicitly in the program environment (same error)
changing [AmazonCloudWatchAgent] to [profile AmazonCloudWatchAgent] everywhere and all permutations of the above (no difference)
adding a [default] section in all config files (makes no difference)
invoking the config-downloader program directly, setting AWS_REGION etc. (same error)
becoming a non-root user and then invoking the program using sudo instead of invoking the program as the root user without sudo.
I get the same error no matter what I try. I installed the CloudWatch agent by downloading the "latest" deb on March 23, 2020, per these instructions. https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/download-cloudwatch-agent-commandline.html
The AWS config defaults to C:\Users\Administrator instead of the user you installed the CloudWatch agent as, so you may need to move the .aws folder to the CloudWatch user. Or, more straightforwardly:
aws configure --profile AmazonCloudWatchAgent
as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#install-CloudWatch-Agent-iam_user-first
You can also specify the region using common-config.toml as described here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#CloudWatch-Agent-profile-instance-first
On a server running Windows Server, this file is in the C:\ProgramData\Amazon\AmazonCloudWatchAgent directory. The default common-config.toml is as follows:
# This common-config is used to configure items used for both ssm and cloudwatch access
## Configuration for shared credential.
## Default credential strategy will be used if it is absent here:
## Instance role is used for EC2 case by default.
## AmazonCloudWatchAgent profile is used for onPremise case by default.
# [credentials]
# shared_credential_profile = "{profile_name}"
# shared_credential_file= "{file_name}"
## Configuration for proxy.
## System-wide environment-variable will be read if it is absent here.
## i.e. HTTP_PROXY/http_proxy; HTTPS_PROXY/https_proxy; NO_PROXY/no_proxy
## Note: system-wide environment-variable is not accessible when using ssm run-command.
## Absent in both here and environment-variable means no proxy will be used.
# [proxy]
# http_proxy = "{http_url}"
# https_proxy = "{https_url}"
# no_proxy = "{domain}"
You can also update the common-config.toml with a new location if needed.
I was using an incorrect "secret" with an invalid character that caused the INI file parser to break. The CloudWatch agent incorrectly reported this as a "missing region," when a parse error or "invalid secret" error would have been more accurate.
You should create a new file named config in the same folder as credentials and add the region there:
[default]
region = your-region
You also have to uncomment the # [credentials] section in the /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml config file.
Set the AWS_REGION environment variable.
On Linux, macOS, or Unix, use:
export AWS_REGION=your_aws_region

AWS cloudwatch terminal output logs

I'm currently doing my internship, and we were tasked with setting up a hawkbit service on AWS ECS.
Hawkbit is used for software update roll-outs. We have hit two bumps that we're currently stuck on.
First, if we run the Docker image on our local server, the hawkbit service starts automatically via a shell script, by running the following command in our Dockerfile: CMD ["/hawkbit.sh"]
If we run the image in a cluster on ECS, the service doesn't start automatically.
Secondly, when hawkbit is running it writes its output to the terminal. I can redirect this output into a log file; however, I'm not able to see the log in CloudWatch.
I used the following to create the file and redirect the output into it:
2>&1 > /var/log/hawkbit/hawkbit
and I've edited the awslogs.conf file as follows:
[/var/log/hawkbit/hawkbit]
file = /var/log/hawkbit/hawkbit.*
log_group_name = /var/log/hawkbit/hawkbit
log_stream_name = {cluster}/{container_instance_id}
datetime_format = %Y-%m-%dT%H:%M:%SZ
Any ideas would be much appreciated.
Things to check regarding the awslogs agent:
ensure that the service is running
check the /var/log/awslogs.log file for errors
make sure the instance has a role attached with permissions sufficient for the agent to work; read about the required permissions in the AWS documentation (a minimal policy sketch follows below)
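A minimal sketch of the kind of instance-role policy the agent needs to ship logs (these are the standard CloudWatch Logs actions; tighten the Resource as appropriate):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}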