Creating an SSM Composite Document that pulls Parameters from Parameter Store

The task is simple: whenever an EC2 instance is launched with a given tag key:value, I want it to install a specific piece of software. Whenever an EC2 instance is launched with a different tag key:value, I want it to install a different piece of software.
I understand that I can create 2 different associations in State Manager that use the AWS-RunRemoteScript run command document to install software based on the tags, but the goal is to have 1 composite document that can do this.
Any help / guidance would be appreciated!

You can achieve that using SSM Automation documents - https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-branchdocs.html
However, you will probably need to do something like this:
In State Manager, use AWS-RunDocument.
This document should execute an SSM Automation document (your composite document).
Your composite document should look like this:
I didn't validate this template, and I assume it won't work without a few days of debugging.
schemaVersion: '0.3'
parameters:
  InstanceId:
    type: String
mainSteps:
  - name: DescribeEc2
    action: 'aws:executeScript'
    inputs:
      Runtime: python3.7
      Handler: script_handler
      Script: |
        import boto3

        def script_handler(events, context):
            # describe_instances expects a list of instance IDs
            ec2_instance = boto3.client('ec2').describe_instances(
                InstanceIds=[events["instance_id"]],
            )["Reservations"][0]["Instances"][0]
            # Treat this as an example:
            # here you should parse the instance tags and decide what
            # software you want to install on the provided instance
            return {"to_be_installed": "result"}
      InputPayload:
        instance_id: '{{ InstanceId }}'
    outputs:
      - Name: result
        Selector: "$.to_be_installed"
        Type: String
  - name: WhatToInstall
    action: aws:branch
    inputs:
      Choices:
        - NextStep: InstallSoft1
          Variable: "{{DescribeEc2.result}}"
          StringEquals: soft_1
        - NextStep: InstallSoft2
          Variable: "{{DescribeEc2.result}}"
          StringEquals: soft_2
  - name: InstallSoft1
    action: aws:runCommand
    inputs:
      DocumentName: AWS-RunShellScript
      InstanceIds:
        - '{{ InstanceId }}'
      Parameters:
        commands:
          ...
  - name: InstallSoft2
    action: aws:runCommand
    inputs:
      DocumentName: AWS-RunShellScript
      InstanceIds:
        - '{{ InstanceId }}'
      Parameters:
        commands:
          ...
To be honest, you will run into a lot of trouble with such a solution (IAM and SSM-specific issues), so I recommend using EventBridge -> Lambda function (which decides which document/automation should be run) -> SSM RunCommand/RunDocument (invoked directly from the Lambda function).
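To illustrate that recommended flow, here is a minimal Lambda sketch. The tag key "Software", the tag values, and the document names are hypothetical placeholders; the EventBridge trigger is assumed to be the EC2 instance state-change ("running") event.

```python
# Sketch of the EventBridge -> Lambda -> SSM flow. Tag key, tag values,
# and document names below are hypothetical examples.

def choose_document(tags):
    """Map an instance's tags to the SSM document that should run on it."""
    software = tags.get("Software")          # hypothetical tag key
    return {
        "soft_1": "Install-Soft1",           # hypothetical document names
        "soft_2": "Install-Soft2",
    }.get(software)

def lambda_handler(event, context):
    # The EC2 "running" state-change event carries the instance id
    instance_id = event["detail"]["instance-id"]

    import boto3                             # imported here so the pure
    ec2 = boto3.client("ec2")                # decision logic stays testable
    reservations = ec2.describe_instances(InstanceIds=[instance_id])
    instance = reservations["Reservations"][0]["Instances"][0]
    tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}

    document = choose_document(tags)
    if document is None:
        return {"skipped": instance_id}

    boto3.client("ssm").send_command(
        InstanceIds=[instance_id],
        DocumentName=document,
    )
    return {"instance": instance_id, "document": document}
```

Keeping the tag-to-document mapping in a small pure function makes the branching logic easy to test without touching AWS.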

Related

Want to send CloudFormation output through an email

I am writing a CloudFormation template in which I am running two PowerShell scripts. I want to fetch the output of both scripts and send it to an email address that is given as a CloudFormation parameter.
Here is the code:
AWSTemplateFormatVersion: '2010-09-09'
Description: Test Document
Resources:
  TestUpgradeDocument:
    Type: AWS::SSM::Document
    Properties:
      DocumentType: Command
      Name: "Test Upgrade"
      Content:
        schemaVersion: '2.2'
        description: "Test Upgrade"
        parameters:
          Emails:
            type: String
            description: |-
              enter the email address to send the overall output
        mainSteps:
          - action: "aws:runPowerShellScript"
            name: "DriverUpgrade"
            precondition:
              StringEquals: ["platformType", "Windows"]
            inputs:
              runCommand:
                []
              timeoutSeconds: 3600
              onFailure: Continue
              maxAttempts: 1
            isCritical: False
            nextStep: SecondDriverUpgrade
          - action: "aws:runPowerShellScript"
            name: "SecondDriverUpgrade"
            precondition:
              StringEquals: ["platformType", "Windows"]
            inputs:
              runCommand:
                []
              timeoutSeconds: 3600
              onFailure: Continue
            isCritical: False
Could you use the AWS CLI command 'ses send-email' inside the PowerShell script, or as part of the Run Command parameters?
Certainly you have to configure the SES service beforehand, either in the CloudFormation template or in the console.
You could retrieve the email address from Parameter Store using the AWS CLI command ssm get-parameters.
To store the output of the scripts you could use a local file, Parameter Store, or, for example, a DynamoDB table.
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ses/send-email.html
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ssm/get-parameters.html
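The same flow can be sketched with boto3 instead of the raw CLI: read the recipient address from Parameter Store, then mail the collected script outputs via SES. The parameter name "/upgrade/notification-email" and the sender address are hypothetical, and the scripts' outputs are assumed to have been collected as plain strings.

```python
# Sketch only: hypothetical parameter name and sender address; SES must
# already be configured (verified identities) for send_email to succeed.

def build_report(outputs):
    """Combine the per-script outputs into one email body."""
    sections = [f"--- {name} ---\n{text}" for name, text in outputs.items()]
    return "\n\n".join(sections)

def send_report(outputs, sender="ops@example.com"):
    import boto3                                   # kept inside the function
    ssm = boto3.client("ssm")                      # so build_report is testable
    email = ssm.get_parameter(
        Name="/upgrade/notification-email"         # hypothetical parameter
    )["Parameter"]["Value"]

    boto3.client("ses").send_email(
        Source=sender,
        Destination={"ToAddresses": [email]},
        Message={
            "Subject": {"Data": "Driver upgrade report"},
            "Body": {"Text": {"Data": build_report(outputs)}},
        },
    )
```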

Catch event when an SSM-agent becomes active

I want to trigger a Lambda whenever a new EC2 instance is registered in SSM's Fleet Manager (meaning the instance can be connected to using SSM), however I can't find what pattern to use in EventBridge.
Within EventBridge, I tried the following pattern I found in the docs (so far it looks like the closest thing to my goal):
{
  "source": ["aws.ssm"],
  "detail-type": ["Inventory Resource State Change"]
}
However when I create a new EC2 and wait for its SSM agent to become active, it still doesn't trigger the above pattern.
Any idea how to catch this kind of event?
I think you have to go through the CloudTrail API call events.
Please find below a CloudFormation template I used in the past that was working. Note that it only provides the SSM resources; you need to add your own SQS queue as well (see SQS.ARN). I've used an association targeting the tag registration set to enabled, so that if you have a Lambda function connected, it can set the tag to false and the instance won't go through the same process again when it reconnects.
AWSTemplateFormatVersion: "2010-09-09"
Description: >
  SSM Registration event

# Description of the resources to be created.
Resources:
  RegistrationDocument:
    Type: AWS::SSM::Document
    Properties:
      DocumentType: Command
      Content:
        schemaVersion: "2.2"
        description: >
          A document run by registered instances that gathers their software inventory
          and automatically updates their AWS SSM Agent to the latest version.
        mainSteps:
          - name: GatherSoftware
            action: aws:softwareInventory
          - name: Sleep
            action: aws:runShellScript
            inputs:
              runCommand:
                - sleep 20 || true
          - name: UpdateAgent
            action: aws:updateSsmAgent
            inputs:
              agentName: amazon-ssm-agent
              source: https://s3.{Region}.amazonaws.com/amazon-ssm-{Region}/ssm-agent-manifest.json
              allowDowngrade: "false"

  RegistrationDocumentAssociation:
    Type: AWS::SSM::Association
    Properties:
      AssociationName: !Sub registration-association-${AWS::StackName}
      Name: !Ref RegistrationDocument
      Targets:
        - Key: tag:registration
          Values:
            - enabled

  RegistrationEventRule:
    Type: AWS::Events::Rule
    Properties:
      Description: >
        Events Rule that monitors registration of AWS SSM instances
        and logs them to an SQS queue.
      EventPattern:
        source:
          - aws.ssm
        detail-type:
          - AWS API Call via CloudTrail
        detail:
          eventName:
            - UpdateInstanceAssociationStatus
          requestParameters:
            associationId:
              - !Ref RegistrationDocumentAssociation
          executionResult:
            status:
              - Success
      State: ENABLED
      Targets:
        - Arn: SQS.ARN
          Id: SqsRegistrationSubscription
          SqsParameters:
            MessageGroupId: registration.events
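A Lambda consuming those SQS messages could then flip the instance's registration tag to false, as described above, so the instance doesn't go through the registration flow again. A minimal sketch, assuming the standard CloudTrail-via-EventBridge record shape (the instanceId field in requestParameters is an assumption about the UpdateInstanceAssociationStatus call):

```python
# Sketch: consume the SQS registration events and set registration=false
# on the instance. Event field names assume the CloudTrail-via-EventBridge
# record format delivered through SQS.
import json

def extract_instance_id(record_body):
    """Pull the instance id out of a CloudTrail UpdateInstanceAssociationStatus event."""
    detail = json.loads(record_body)["detail"]
    return detail["requestParameters"]["instanceId"]

def lambda_handler(event, context):
    import boto3                                  # inside the handler so the
    ec2 = boto3.client("ec2")                     # parser above stays testable
    for record in event["Records"]:               # SQS delivers a batch
        instance_id = extract_instance_id(record["body"])
        ec2.create_tags(
            Resources=[instance_id],
            Tags=[{"Key": "registration", "Value": "false"}],
        )
```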

Cloudcustodian - filter by tag name for on/off hours

I have the following policy:
policies:
  - name: stop-after-hours
    resource: ec2
    filters:
      - tag:Schedule: "OfficeHours"
    actions:
      - stop
    mode:
      type: periodic
      schedule: "rate(10 minutes)"
      role: arn:aws:iam::XXXXXX:role/LambdaRoleCloudCustodian
Which correctly identified my EC2 tagged with "Schedule: OfficeHours":
$> custodian run --dry-run -s out shutdown-out-of-office.yml
custodian.policy:INFO policy:stop-after-hours-cologne resource:ec2 region:eu-central-1 count:1 time:0.00
However, when I want to set the offhour:
policies:
  - name: stop-after-hours
    resource: ec2
    filters:
      - tag:Schedule: "OfficeHours"
      - type: offhour
        offhour: 11
    actions:
      - stop
    mode:
      type: periodic
      schedule: "rate(10 minutes)"
      role: arn:aws:iam::XXXXXX:role/LambdaRoleCloudCustodian
The instance is not identified anymore.
2022-07-05 12:01:04,541: custodian.policy:INFO policy:stop-after-hours-cologne resource:ec2 region:eu-central-1 count:0 time:0.78
I also tried
- type: value
  key: tag:Schedule
  value: OfficeHours
which doesn't work.
Any idea on how I can filter on tag name AND value here?
So, after fiddling around for quite some time, I finally found the solution.
Here's the complete policy:
# Stop instances tagged with "Schedule: OfficeHours" at offhour
- name: stop-after-hours
  resource: ec2
  filters:
    - tag:Schedule: OfficeHours
    - State.Name: running
    - type: offhour
      tag: Schedule
      weekends: true
      default_tz: cet
      offhour: 10
  actions:
    - stop
  mode:
    type: periodic
    schedule: "rate(10 minutes)"
    role: arn:aws:iam::XXXXXXXXX:role/LambdaRoleCloudCustodian
Some things to keep in mind:
Here, the offhour filter has a tag attribute whose value is Schedule. This tells Cloud Custodian to look for any instance that has the tag Schedule, whatever its value. If you do not specify this, you need to tag your instances with the default offhour tag, which is maid_offhours.
I also have tag:Schedule: OfficeHours, which filters out instances based on the Schedule tag's value.
If you want to test your policy with a dry run, you must test within the configured hour. So, if my offhour is set to 10, the dry run will only be able to fetch the resource if it is run between 10:00am and 10:59am.
I hope this helps some people; I find the Cloud Custodian documentation quite difficult to understand.

GCP Deployment Manager "ResourceErrorCode":"400" while database user creation

I am experimenting with Deployment Manager, and each time I try to deploy a SQL instance with a DB on it and 2 users, some of the tasks fail. Most of the time it is the users:
conf.yaml:
resources:
- name: mycloudsql
  type: gcp-types/sqladmin-v1beta4:instances
  properties:
    name: mycloudsql-01
    backendType: SECOND_GEN
    instanceType: CLOUD_SQL_INSTANCE
    databaseVersion: MYSQL_5_7
    region: europe-west6
    settings:
      tier: db-f1-micro
      locationPreference:
        zone: europe-west6-a
      activationPolicy: ALWAYS
      dataDiskSizeGb: 10
- name: mydjangodb
  type: gcp-types/sqladmin-v1beta4:databases
  properties:
    name: django-db-01
    instance: $(ref.mycloudsql.name)
    charset: utf8
- name: sqlroot
  type: gcp-types/sqladmin-v1beta4:users
  properties:
    name: root
    host: "%"
    instance: $(ref.mycloudsql.name)
    password: root
- name: sqluser
  type: gcp-types/sqladmin-v1beta4:users
  properties:
    name: user
    instance: $(ref.mycloudsql.name)
    password: user
Error:
PS C:\Users\user\Desktop\Python\GCP> gcloud --project=sound-catalyst-263911 deployment-manager deployments create dm-sql-test-11 --config conf.yaml
The fingerprint of the deployment is TZ_wYom9Q64Hno6X0bpv9g==
Waiting for create [operation-1589869946223-5a5fa71623bc9-1912fcb9-bc59aafc]...failed.
ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1589869946223-5a5fa71623bc9-1912fcb9-bc59aafc]: errors:
- code: RESOURCE_ERROR
location: /deployments/dm-sql-test-11/resources/sqluser
message: '{"ResourceType":"gcp-types/sqladmin-v1beta4:users","ResourceErrorCode":"400","ResourceErrorMessage":{"code":400,"message":"Precondition
check failed.","status":"FAILED_PRECONDITION","statusMessage":"Bad Request","requestPath":"https://www.googleapis.com/sql/v1beta4/projects/sound-catalyst-263911/instances/mycloudsql-01/users","httpMethod":"POST"}}'
- code: RESOURCE_ERROR
location: /deployments/dm-sql-test-11/resources/sqlroot
message: '{"ResourceType":"gcp-types/sqladmin-v1beta4:users","ResourceErrorCode":"400","ResourceErrorMessage":{"code":400,"message":"Precondition
check failed.","status":"FAILED_PRECONDITION","statusMessage":"Bad Request","requestPath":"https://www.googleapis.com/sql/v1beta4/projects/sound-catalyst-263911/instances/mycloudsql-01/users","httpMethod":"POST"}}'
It doesn't say what that failing precondition is, or am I missing something?
It seems the creation of the database is not completed by the time Deployment Manager starts to create the users, despite the reference notation being used in the YAML code to take care of dependencies. That is why you receive the "FAILED_PRECONDITION" error.
As a workaround you can split the deployment into two parts:
Create the Cloud SQL instance and the database;
Create the users.
This does not look elegant, but it works.
Alternatively, you can consider using Terraform. Fortunately, the Cloud Shell instance comes with Terraform pre-installed. There is sample Terraform code for Cloud SQL out there, for example this one:
CloudSQL deployment with Terraform

Use a Stackdriver resource group's ID in a GCP Deployment Manager configuration

I'm trying to create a Stackdriver alert policy with a Deployment Manager configuration. The same configuration first creates a resource group and a notification channel and then a policy based on those:
resources:
- name: test-group
  type: gcp-types/monitoring-v3:projects.groups
  properties:
    displayName: A test group
    filter: >-
      resource.metadata.cloud_account="aproject-id" AND
      resource.type="gce_instance" AND
      resource.metadata.tag."managed"="yes"
- name: test-email-notification
  type: gcp-types/monitoring-v3:projects.notificationChannels
  properties:
    displayName: A test email channel
    type: email
    labels:
      email_address: incidents@example.com
- name: test-alert-policy
  type: gcp-types/monitoring-v3:projects.alertPolicies
  properties:
    enabled: true
    displayName: A test alert policy
    documentation:
      mimeType: text/markdown
      content: "Test incident"
    notificationChannels:
    - $(ref.test-email-notification.name)
    combiner: OR
    conditions:
    - conditionAbsent:
        aggregations:
        - alignmentPeriod: 60s
          perSeriesAligner: ALIGN_RATE
        duration: 300s
        filter: metric.type="compute.googleapis.com/instance/uptime" group.id="$(ref.test-group.id)"
        trigger:
          count: 1
      displayName: The instance is down
The policy's only condition has a filter based on the resource group, i.e. only the members of the group could trigger this alert.
I'm trying to use a reference to the group's ID, but it doesn't work: "The reference 'id' is invalid, reason: The field 'id' does not exists on the reference schema."
Also, when I try to use $(ref.test-group.selfLink), I get "The reference 'selfLink' is invalid, reason: The field 'selfLink' does not exists on the reference schema."
I could get the group's name (e.g. "projects/aproject-id/groups/3691870619975147604") but the filters only accept group IDs (e.g. only the "3691870619975147604" part):
'{"ResourceType":"gcp-types/monitoring-v3:projects.alertPolicies","ResourceErrorCode":"400","ResourceErrorMessage":{"code":400,"message":"Field
alert_policy.conditions[0].condition_absent.filter had an invalid value of \"metric.type=\"compute.googleapis.com/instance/uptime\"
group.id=\"projects/aproject-id/groups/3691870619975147604\"\":
must specify a restriction on \"resource.type\" in the filter; see \"https://cloud.google.com/monitoring/api/resources\"
for a list of available resource types.","status":"INVALID_ARGUMENT","statusMessage":"Bad
Request","requestPath":"https://monitoring.googleapis.com/v3/projects/aproject-id/alertPolicies","httpMethod":"POST"}}'
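As an aside, since the filter only accepts the bare numeric ID, one could derive it from the full group name outside of plain Deployment Manager references (e.g. in a Python or Jinja template helper). A minimal sketch; the helper names are hypothetical:

```python
# The group "name" returned by the Monitoring API looks like
# "projects/<project-id>/groups/<numeric-id>"; the alert-policy filter
# wants only the trailing numeric id.
def group_id_from_name(group_name):
    """Strip the "projects/.../groups/" prefix, keeping the numeric id."""
    return group_name.rsplit("/", 1)[-1]

def uptime_filter(group_name):
    """Build the condition filter string for the given group."""
    return (
        'metric.type="compute.googleapis.com/instance/uptime" '
        f'group.id="{group_id_from_name(group_name)}"'
    )
```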
Try replacing your alert policy with the following:
- name: test-alert-policy
  type: gcp-types/monitoring-v3:projects.alertPolicies
  properties:
    enabled: true
    displayName: A test alert policy
    documentation:
      mimeType: text/markdown
      content: "Test incident"
    notificationChannels:
    - $(ref.test-email-notification.name)
    combiner: OR
    conditions:
    - conditionAbsent:
        aggregations:
        - alignmentPeriod: 60s
          perSeriesAligner: ALIGN_RATE
        duration: 300s
        filter: metric.type="compute.googleapis.com/instance/uptime" $(ref.test-group.filter)
        trigger:
          count: 1
      displayName: The instance is down
  metadata:
    dependsOn:
    - test-group
This adds 1) an explicit dependency on test-group via a dependsOn clause, and 2) $(ref.test-group.filter) to the metric filter so that the policy, while not strictly linked to test-group, ends up covering exactly the same resources as test-group.
As Deployment Manager resources are created in parallel, it's necessary to use dependsOn to ensure test-group is instantiated before attempting to create test-alert-policy; apparently Deployment Manager isn't quite smart enough to infer this just from the references.