I am currently working on Laravel/PHP projects that I manage on AWS.
I deploy several instances with the CloudFormation service, so I have a lot of log groups.
If an error occurs in one of these log groups, I have to search for the error logs manually.
So, my need is this:
when an API or PHP error message lands in one of my log groups, like this:
[2021-11-24T13:03:48.075879+00:00] technical.ERROR: TYPE: Trzproject\Trzutils\Exceptions\InvalidJWTException
MESSAGE: Provided JWT has since expired, as defined by the "exp" claim
FILE: /var/task/vendor/trzproject/trzcore/src/Trzutils/JWT/JWTService.php LINE: 64 TRACE: stack trace disabled ____________________________________________________________________
I want to be alerted with a message (Slack, mail, whatever) that tells me which log group the error is in.
I can't create a Logs Insights query because I have a lot of clients, and CloudWatch Logs Insights does not allow querying a very large number of log groups at once.
Thank you in advance for your advice.
Edit: something like this: https://theithollow.com/2017/12/11/use-amazon-cloudwatch-logs-metric-filters-send-alerts/
but without creating X alarms for X log groups.
(sorry for my English)
I finally did this, and it solves my need.
I created a Lambda that is triggered by CloudWatch Logs. I subscribed the Lambda to each log group I want to monitor, together with a filter pattern so that only certain messages trigger it (in a CloudWatch Logs filter pattern, each ?term acts as an OR, so ?ERROR ?WARN ?5xx matches events containing any of those terms).
serverless.yml
functions:
persistLogMessage:
handler: lambda/PersistLogMessage.php
timeout: 899 # 14min 59s
events:
- cloudwatchLog:
logGroup: 'MyLogsGroup01'
filter: '?ERROR ?WARN ?5xx'
- cloudwatchLog:
logGroup: 'MyLogsGroup02'
filter: '?ERROR ?WARN ?5xx'
...
layers:
- arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:layer:php-73:1
role: PersistLogMessageRole
...
resources:
Conditions:
Resources:
#
PersistLogMessageRole:
Type: AWS::IAM::Role
Properties:
RoleName: ${opt:stage}-${opt:client}-PersistLogMessageRole
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service:
- lambda.amazonaws.com
Action: sts:AssumeRole
Policies:
- PolicyName: PersistLogMessagePolicy
PolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: "Allow"
Action:
- logs:CreateLogGroup
- logs:CreateLogStream
- logs:PutLogEvents
- logs:PutRetentionPolicy
- logs:DescribeLogStreams
- logs:DescribeLogGroups
Resource: "*"
- Effect: Allow
Action:
- sns:Publish
Resource:
- "arn:aws:sns:eu-west-3:<account_id>:SnsTopic"
My Lambda:
public function __invoke(array $events): void
{
    // CloudWatch Logs delivers the payload base64-encoded and gzip-compressed
    // under awslogs.data
    $data = $events['awslogs']['data'];
    $this->logger->info('events', $events);
    $dataDecoded = base64_decode($data);
    $logMessage = zlib_decode($dataDecoded);
    /** @var stdClass $stdClassLogMessage */
    $stdClassLogMessage = json_decode($logMessage);
    // dump($stdClassLogMessage); // local debugging only
    // Note: a single invocation can carry several log events; only the first
    // one is published here
    $params = [
        'Message' => $stdClassLogMessage->logEvents[0]->message,
        'TopicArn' => 'arn:aws:sns:eu-west-3:<account_id>:SnsTopic'
    ];
    $this->snsClient->publish($params);
}
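Since the original goal was to know which log group the error comes from, it is worth noting that the decoded payload also carries logGroup and logStream fields. As a small sketch building on the function above, the SNS message could be prefixed with the source log group:

$params = [
    // The decoded CloudWatch Logs payload includes the source log group
    'Message' => sprintf(
        '[%s] %s',
        $stdClassLogMessage->logGroup,
        $stdClassLogMessage->logEvents[0]->message
    ),
    'TopicArn' => 'arn:aws:sns:eu-west-3:<account_id>:SnsTopic'
];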
Related
I'm trying to set up an Amazon MSK cluster and connect to it from a Lambda function. The Lambda function will be a producer of messages, not a consumer.
I am using the Serverless Framework to provision everything, and in my serverless.yml I have added the following, which seems to be working fine.
MSK:
Type: AWS::MSK::Cluster
Properties:
ClusterName: kafkaOne
KafkaVersion: 2.2.1
NumberOfBrokerNodes: 3
BrokerNodeGroupInfo:
InstanceType: kafka.t3.small
ClientSubnets:
- Ref: PrivateSubnet1
- Ref: PrivateSubnet2
- Ref: PrivateSubnet3
But when trying to connect to this cluster to actually send messages, I am unsure how to get the connection string. I presume it should be the ZookeeperConnectString?
I'm new to Kafka/MSK, so maybe I am not seeing something obvious.
Any advice much appreciated. Cheers.
I don't know what kind of code base you are using, so I will add my code, which I wrote in Go.
In essence, you should connect to an MSK cluster the same way you would connect to a standalone Kafka instance: we use the broker addresses for connecting to (or rather, writing to) the MSK cluster.
I'm using the segmentio/kafka-go library. My function for sending an event to the MSK cluster looks like this:
// Add event
func addEvent(ctx context.Context, requestBody RequestBodyType) (bool, error) {
	// Prepare dialer
	dialer := &kafka.Dialer{
		Timeout:   2 * time.Second,
		DualStack: true,
	}
	brokers := []string{os.Getenv("KAFKA_BROKER_1"), os.Getenv("KAFKA_BROKER_2"), os.Getenv("KAFKA_BROKER_3"), os.Getenv("KAFKA_BROKER_4")}
	// Prepare writer config
	kafkaConfig := kafka.WriterConfig{
		Brokers:  brokers,
		Topic:    os.Getenv("KAFKA_TOPIC"),
		Balancer: &kafka.Hash{},
		Dialer:   dialer,
	}
	// Prepare writer and make sure it is closed when we are done
	w := kafka.NewWriter(kafkaConfig)
	defer w.Close()
	// Convert struct to json string
	event, err := json.Marshal(requestBody)
	if err != nil {
		fmt.Println("Convert struct to json for writing to KAFKA failed")
		panic(err)
	}
	// Write message
	writeError := w.WriteMessages(ctx,
		kafka.Message{
			Key:   []byte(requestBody.Event),
			Value: []byte(event),
		},
	)
	if writeError != nil {
		fmt.Println("ERROR WRITING EVENT TO KAFKA")
		panic("could not write message " + writeError.Error())
	}
	return true, nil
}
My serverless.yml
The code above (addEvent) belongs to functions -> postEvent in serverless.yml. If you are consuming from Kafka, then you should check functions -> processEvent. Consuming events is fairly simple, but setting everything up for producing to Kafka is crazy. We have probably been working on this for a month and a half and are still figuring out how everything should be set up. Sadly, serverless does not do everything for you, so you will have to "click through" some things manually in AWS, but we compared it to other frameworks and serverless is still the best right now.
provider:
name: aws
runtime: go1.x
stage: dev
profile: ${env:AWS_PROFILE}
region: ${env:REGION}
apiName: my-app-${sls:stage}
lambdaHashingVersion: 20201221
environment:
ENV: ${env:ENV}
KAFKA_TOPIC: ${env:KAFKA_TOPIC}
KAFKA_BROKER_1: ${env:KAFKA_BROKER_1}
KAFKA_BROKER_2: ${env:KAFKA_BROKER_2}
KAFKA_BROKER_3: ${env:KAFKA_BROKER_3}
KAFKA_BROKER_4: ${env:KAFKA_BROKER_4}
KAFKA_ARN: ${env:KAFKA_ARN}
ACCESS_CONTROL_ORIGINS: ${env:ACCESS_CONTROL_ORIGINS}
ACCESS_CONTROL_HEADERS: ${env:ACCESS_CONTROL_HEADERS}
ACCESS_CONTROL_METHODS: ${env:ACCESS_CONTROL_METHODS}
BATCH_SIZE: ${env:BATCH_SIZE}
SLACK_API_TOKEN: ${env:SLACK_API_TOKEN}
SLACK_CHANNEL_ID: ${env:SLACK_CHANNEL_ID}
httpApi:
cors: true
apiGateway:
resourcePolicy:
- Effect: Allow
Action: '*'
Resource: '*'
Principal: '*'
vpc:
securityGroupIds:
- sg-*********
subnetIds:
- subnet-******
- subnet-*******
functions:
postEvent:
handler: bin/postEvent
package:
patterns:
- bin/postEvent
events:
- http:
path: event
method: post
cors:
origin: ${env:ACCESS_CONTROL_ORIGINS}
headers:
- Content-Type
- Content-Length
- Accept-Encoding
- Origin
- Referer
- Authorization
- X-CSRF-Token
- X-Amz-Date
- X-Api-Key
- X-Amz-Security-Token
- X-Amz-User-Agent
allowCredentials: false
methods:
- OPTIONS
- POST
processEvent:
handler: bin/processEvent
package:
patterns:
- bin/processEvent
events:
- msk:
arn: ${env:KAFKA_ARN}
topic: ${env:KAFKA_TOPIC}
batchSize: ${env:BATCH_SIZE}
startingPosition: LATEST
resources:
Resources:
GatewayResponseDefault4XX:
Type: 'AWS::ApiGateway::GatewayResponse'
Properties:
ResponseParameters:
gatewayresponse.header.Access-Control-Allow-Origin: "'*'"
gatewayresponse.header.Access-Control-Allow-Headers: "'*'"
ResponseType: DEFAULT_4XX
RestApiId:
Ref: 'ApiGatewayRestApi'
myDefaultRole:
Type: AWS::IAM::Role
Properties:
Path: /
RoleName: my-app-dev-eu-serverless-lambdaRole-${sls:stage} # required if you want to use 'serverless deploy --function' later on
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service:
- lambda.amazonaws.com
Action: sts:AssumeRole
# note that these rights are needed if you want your function to be able to communicate with resources within your vpc
ManagedPolicyArns:
- arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
- arn:aws:iam::aws:policy/service-role/AWSLambdaMSKExecutionRole
I must warn you that we spent a lot of time figuring out how to properly set up the VPC and other networking/permission stuff. My colleague will write a blog post once he arrives back from vacation. :) I hope this helps you somehow. Best of luck ;)
UPDATE
If you are using JavaScript, then you would connect to Kafka similarly to this:
const { Kafka } = require('kafkajs')
const kafka = new Kafka({
clientId: 'order-app',
brokers: [
'broker1:port',
'broker2:port',
],
  ssl: true, // or false, depending on whether your listener uses TLS
})
The connection string, which is called the broker bootstrap string, can be found by making an API call like aws kafka get-bootstrap-brokers --cluster-arn ClusterArn
See example here: https://docs.aws.amazon.com/msk/latest/developerguide/msk-get-bootstrap-brokers.html
Also here is a step by step walk through on how produce/consume data: https://docs.aws.amazon.com/msk/latest/developerguide/produce-consume.html
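For illustration, the call looks like this; the broker strings in the response (which field names you get depends on the cluster's encryption settings) are what goes into the brokers list above. The hostnames here are made up:

$ aws kafka get-bootstrap-brokers --cluster-arn ClusterArn
{
    "BootstrapBrokerString": "b-1.kafkaone.abc123.c2.kafka.eu-west-1.amazonaws.com:9092,b-2.kafkaone.abc123.c2.kafka.eu-west-1.amazonaws.com:9092",
    "BootstrapBrokerStringTls": "b-1.kafkaone.abc123.c2.kafka.eu-west-1.amazonaws.com:9094,b-2.kafkaone.abc123.c2.kafka.eu-west-1.amazonaws.com:9094"
}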
I currently have a Cognito user pool configured to trigger a pre-sign-up Lambda. Right now I am setting up the staging environment, and I have the exact same setup on dev (which works). I know it is the same because I am creating both envs from the same Terraform files.
I have already associated the invoke permissions with the Lambda function, which is very often the cause of this error message. Everything looks the same in both environments, except that I get "PreSignUp invocation failed due to configuration" when I try to sign up a new user from my new staging environment.
I have tried to remove and re-associate the trigger manually from the console; still, it doesn't work.
I have compared every possible setting I can think of, including the "App client" configs. They are really the same.
I tried editing the Lambda code in order to "force" it to update.
Could it be AWS taking too long to invalidate the permissions cache? So far I can only believe this is a bug in AWS...
Any ideas!?
There appears to be a race condition with permissions not being attached on the first deployment.
I was able to reproduce this with cloudformation.
Deploying a stack with the same config twice appears to "fix" the permissions issue.
I actually added a 10-second delay on the permissions attachment and it solved my first deployment issue...
I hope this helps others who run into this issue. 😃
# Hack to fix Cloudformation bug
# AWS::Lambda::Permission will not attach correctly on first deployment unless "delay" is used
# DependsOn & every other thing did not work... ¯\_(ツ)_/¯
CustomResourceDelay:
Type: Custom::Delay
DependsOn:
- PostConfirmationLambdaFunction
- CustomMessageLambdaFunction
- CognitoUserPool
Properties:
ServiceToken: !GetAtt CustomResourceDelayFunction.Arn
SecondsToWait: 10
CustomResourceDelayFunctionRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement: [{ "Effect":"Allow","Principal":{"Service":["lambda.amazonaws.com"]},"Action":["sts:AssumeRole"] }]
Policies:
- PolicyName: !Sub "${AWS::StackName}-delay-lambda-logs"
PolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Action: [ logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents ]
Resource: !Sub arn:${AWS::Partition}:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${AWS::StackName}*:*
CustomResourceDelayFunction:
Type: AWS::Lambda::Function
Properties:
Handler: index.handler
Description: Wait for N seconds custom resource for stack debounce
Timeout: 120
Role: !GetAtt CustomResourceDelayFunctionRole.Arn
Runtime: nodejs12.x
Code:
ZipFile: |
const { send, SUCCESS } = require('cfn-response')
exports.handler = (event, context, callback) => {
if (event.RequestType !== 'Create') {
return send(event, context, SUCCESS)
}
const timeout = (event.ResourceProperties.SecondsToWait || 10) * 1000
setTimeout(() => send(event, context, SUCCESS), timeout)
}
# ------------------------- Roles & Permissions for cognito resources ---------------------------
CognitoTriggerPostConfirmationInvokePermission:
Type: AWS::Lambda::Permission
## CustomResourceDelay needed to properly attach the permission
DependsOn: [ CustomResourceDelay ]
Properties:
Action: lambda:InvokeFunction
FunctionName: !GetAtt PostConfirmationLambdaFunction.Arn
Principal: cognito-idp.amazonaws.com
SourceArn: !GetAtt CognitoUserPool.Arn
In my situation the problem was caused by the execution permissions of the Lambda function: while there was a role configured, that role was empty due to some unrelated changes.
Making sure the role actually had permissions to do the logging and all the other things that the function was trying to do made things work again for me.
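As a minimal sketch of what such a role should at least contain (the role name is illustrative, and the managed policy only covers CloudWatch Logs; add further policies for whatever else the trigger actually does):

PreSignUpFunctionRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - lambda.amazonaws.com
          Action: sts:AssumeRole
    # Basic execution permissions (CloudWatch Logs)
    ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole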
When creating ECS infrastructure we describe our Task Definitions with CloudFormation. We want to be able to dynamically pass environment variables as a parameter to the template. According to the docs, Environment has a KeyValuePair type, but CloudFormation parameters do not have this type.
We can not hardcode Environment variables to the template, because this template is used as a nested stack so environment variables will be dynamically passed inside it.
The only possible way I see so far is to pass all arguments as a CommaDelimitedList, and then somehow parse and map it using CloudFormation functions. I can Fn::Split every entry into a key and a value, but how do I dynamically build an array of KeyValuePairs in CloudFormation?
Or maybe there is an easier way, and I'm missing something? Thanks in advance for any ideas.
I know it's late and you have already found a workaround. However, the following is the closest I came to solving this. It is still not completely dynamic, as the expected parameters have to be defined as placeholders, so the maximum number of environment variables expected has to be known.
The answer is based on this blog. All credits to the author.
Parameters:
EnvVar1:
Type: String
Description: A possible environment variable to be passed on to the container definition.
Should be a key-value pair combined with a ':'. E.g. 'envkey:envval'
Default: ''
EnvVar2:
Type: String
Description: A possible environment variable to be passed on to the container definition.
Should be a key-value pair combined with a ':'. E.g. 'envkey:envval'
Default: ''
EnvVar3:
Type: String
Description: A possible environment variable to be passed on to the container definition.
Should be a key-value pair combined with a ':'. E.g. 'envkey:envval'
Default: ''
Conditions:
Env1Exist: !Not [ !Equals [!Ref EnvVar1, '']]
Env2Exist: !Not [ !Equals [!Ref EnvVar2, '']]
Env3Exist: !Not [ !Equals [!Ref EnvVar3, '']]
Resources:
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      # ... other task definition and container properties omitted ...
      ContainerDefinitions:
        - Environment:
            - !If
              - Env1Exist
              - Name: !Select [0, !Split [":", !Ref EnvVar1]]
                Value: !Select [1, !Split [":", !Ref EnvVar1]]
              - !Ref "AWS::NoValue"
            - !If
              - Env2Exist
              - Name: !Select [0, !Split [":", !Ref EnvVar2]]
                Value: !Select [1, !Split [":", !Ref EnvVar2]]
              - !Ref "AWS::NoValue"
            - !If
              - Env3Exist
              - Name: !Select [0, !Split [":", !Ref EnvVar3]]
                Value: !Select [1, !Split [":", !Ref EnvVar3]]
              - !Ref "AWS::NoValue"
You may want to consider using the EC2 Parameter Store to create secured key/value pairs, which is supported in CloudFormation, and can be integrated with ECS environments.
AWS Systems Manager Parameter Store
AWS Systems Manager Parameter Store provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, and license codes as parameter values. You can store values as plain text or encrypted data. You can then reference values by using the unique name that you specified when you created the parameter. Highly scalable, available, and durable, Parameter Store is backed by the AWS Cloud. Parameter Store is offered at no additional charge.
While Parameter Store has great security features for storing application secrets, it can also be used to store nonsensitive application strings such as public keys, environment settings, license codes, etc.
And it is supported directly by CloudFormation, allowing you to easily capture, store and manage application configuration strings which can be accessed by ECS.
This template allows you to provide the Parameter Store key values at stack creation time via the console or CLI:
Description: Simple SSM parameter example
Parameters:
pSMTPServer:
Description: SMTP Server URL eg [email-smtp.us-east-1.amazonaws.com]:587
Type: String
NoEcho: false
SMTPServer:
Type: AWS::SSM::Parameter
Properties:
Name: my-smtp-server
Type: String
Value: !Ref pSMTPServer
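An ECS task definition can then consume such a parameter directly through the Secrets field of a container definition, provided the task execution role is allowed to read the parameter. A sketch (the container name and region/account are made up):

TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    # ...
    ContainerDefinitions:
      - Name: web
        # SMTP_SERVER becomes an environment variable inside the container
        Secrets:
          - Name: SMTP_SERVER
            ValueFrom: arn:aws:ssm:us-east-1:123456789012:parameter/my-smtp-server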
Any AWS runtime environment (EC2, ECS, Lambda) can easily and securely retrieve the values. On the console side, there is a great parameter manager interface that maintains parameter version history. It's integrated with IAM, so permissions are controlled with standard IAM policy syntax:
{
"Action": [
"ssm:GetParameterHistory",
"ssm:GetParameter",
"ssm:GetParameters",
"ssm:GetParametersByPath"
],
"Resource": [
"arn:aws:ssm:us-west-2:555513456471:parameter/smtp-server"
],
"Effect": "Allow"
},
{
"Action": [
"kms:Decrypt"
],
"Resource": [
"arn:aws:kms:us-west-2:555513456471:key/36235f94-19b5-4649-84e0-978f52242aa0a"
],
"Effect": "Allow"
}
Finally, this blog article shows a technique to read the parameters into a Docker container at runtime. They suggest a secure way to handle environment variables in Docker with AWS Parameter Store. For reference, I am including their Dockerfile here:
FROM grafana/grafana:master
RUN curl -L -o /bin/aws-env https://github.com/Droplr/aws-env/raw/master/bin/aws-env-linux-amd64 && \
chmod +x /bin/aws-env
ENTRYPOINT ["/bin/bash", "-c", "eval $(/bin/aws-env) && /run.sh"]
With that invocation, each of the parameters is available as an environment variable in the container. Your app may or may not need a wrapper to read the parameters from the environment variables.
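For a rough idea of what that entrypoint does (the values below are made up; aws-env fetches the parameters under its configured path and prints shell exports, which the eval then applies):

$ /bin/aws-env
export SMTP_SERVER=email-smtp.us-east-1.amazonaws.com:587
export DB_PASSWORD=s3cr3t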
I was facing the same problem: I needed to create a Lambda resource with environment variables.
We decided to fix an initial set of environment variables, with the key names also decided in advance.
So I added four parameters and used Ref for the values, while keeping the key names fixed.
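A sketch of that approach (the parameter and key names here are made up):

Parameters:
  pDbHost:
    Type: String
  pLogLevel:
    Type: String
Resources:
  MyFunction:
    Type: AWS::Lambda::Function
    Properties:
      # ... handler, runtime, role, code ...
      Environment:
        Variables:
          DB_HOST: !Ref pDbHost        # fixed key, value from parameter
          LOG_LEVEL: !Ref pLogLevel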
There is another way too, which may sound like overkill, but it lets you pass whatever env you wish to the function, with no need to "predefine" how many env variables there are. The only restriction in the sample below is that you cannot use :::: or |||| inside the value of a key. Keys can't contain such symbols anyway, per the AWS docs.
Game plan:
Make an inline CF Lambda function whose code accepts all the env as a string in any format you wish, using whatever code you like inside that function (I use JS with the Node.js runtime); since it's your code, parse the incoming string however you wish and use aws-sdk to update the target function. Call the function once inside the CF template.
In this sample you pass in env as a string like this:
key1::::value1||||key2::::value2. If you need to use :::: or |||| in your values, of course change the code to use some other divider.
I'm not a big fan of running a Lambda for such a task, yet I want the option of passing virtually any env to the CF template, and this works.
LambdaToSetEnvRole:
Type: AWS::IAM::Role
Properties:
AssumeRolePolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Principal:
Service:
- lambda.amazonaws.com
Action:
- 'sts:AssumeRole'
ManagedPolicyArns:
- arn:aws:iam::aws:policy/CloudWatchLambdaInsightsExecutionRolePolicy
Policies:
- PolicyName: cloudwatch-logs
PolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Action:
- 'logs:CreateLogGroup'
- 'logs:CreateLogStream'
- 'logs:PutLogEvents'
Resource:
- !Sub "arn:aws:logs:*:${AWS::AccountId}:log-group:*:*"
- !Sub "arn:aws:logs:*:${AWS::AccountId}:log-group:/aws/lambda-insights:*"
- PolicyName: trigger-lambda-by-cloud-events
PolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Action:
- 'lambda:UpdateFunctionConfiguration'
Resource:
- !GetAtt OriginalLambda.Arn
Tags:
- { Key: managed-by, Value: !Ref AWS::StackId }
LambdaToSetEnv:
Type: AWS::Lambda::Function
DeletionPolicy: Delete
Properties:
Code:
ZipFile: |
const response = require('cfn-response');
const aws = require('aws-sdk');
exports.handler = (event, context) => {
console.log(JSON.stringify({event, context}));
try {
if (event.RequestType === "Delete") {
response.send(event, context, response.SUCCESS, {RequestType: event.RequestType});
} else {
const client = new aws.Lambda({apiVersion: '2015-03-31'});
const Variables = {
"All": process.env.FunctionEnvVariables,
};
console.log('process.env.FunctionEnvVariables: ', process.env.FunctionEnvVariables);
if(process.env.FunctionEnvVariables){
process.env.FunctionEnvVariables.split('||||').forEach((pair) => {
if(pair && pair.trim() !== ''){
Variables[pair.split('::::')[0]] = pair.split('::::')[1];
}
})
}
const result = client.updateFunctionConfiguration({ FunctionName: process.env.LambdaToUpdateArn, Environment: { Variables } }, function (error, data){
console.log('data: ', data);
console.log('error: ', error);
if(error){
console.error(error);
response.send(event, context, response.FAILED, {});
} else {
response.send(event, context, response.SUCCESS, {});
}
});
}
} catch (e) {
response.send(event, context, response.FAILED, e.stack);
}
}
Role: !GetAtt LambdaToSetEnvRole.Arn
Handler: index.handler
Runtime: nodejs14.x
Timeout: '300'
Environment:
Variables:
LambdaToUpdateArn: !GetAtt OriginalLambda.Arn
FunctionEnvVariables: !Ref FunctionEnvVariables
LambdaCall:
DependsOn:
- OriginalLambda
- LambdaToSetEnv
Type: Custom::LambdaCallout
Properties:
ServiceToken: !GetAtt LambdaToSetEnv.Arn
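Note that the template above references a FunctionEnvVariables parameter which must also be declared, e.g.:

Parameters:
  FunctionEnvVariables:
    Type: String
    Default: ''
    Description: Env vars encoded as key1::::value1||||key2::::value2

It can then be supplied at deploy time, for example:

aws cloudformation deploy --template-file template.yml --stack-name my-stack \
  --parameter-overrides FunctionEnvVariables='key1::::value1||||key2::::value2'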
I want to grant access to a group of users to perform certain operations on certain Lambda functions. My Lambdas are already tagged properly to allow this, for instance: "department:hr". Can I tie this together with IAM?
I have seen documentation on condition keys that allow comparison of ResourceTag/${TagKey} to a value, but these do not seem to be available in the visual editor (which unfortunately I depend on) for Lambda functions.
I want something like this:
"Effect": "Allow",
"Action": [
"lambda:ListFunctions",
"lambda:ListVersionsByFunction",
"lambda:GetLayerVersion",
"lambda:GetEventSourceMapping",
"lambda:GetFunction",
"lambda:ListAliases",
"lambda:GetAccountSettings",
"lambda:GetFunctionConfiguration",
"lambda:GetLayerVersionPolicy",
"lambda:ListTags",
"lambda:ListEventSourceMappings",
"lambda:ListLayerVersions",
"lambda:ListLayers",
"lambda:GetAlias",
"lambda:GetPolicy"
],
"Resource": "*"
"Condition": {
"StringEquals": {
"lambda:ResourceTag/department": "hr"
}
I can't build this in the visual editor and I get syntax errors when I attempt it myself.
I don't believe that lambda:ResourceTag/${TagKey} is a context condition available for any lambda actions (REF: https://docs.aws.amazon.com/IAM/latest/UserGuide/list_awslambda.html).
With that said, incorrect use of context keys typically fails silently. Could you include the full statement? For example, in the above snippet, there is a missing } for the condition.
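For reference, with the braces balanced (and the action list abbreviated), the statement from the question would look like this; whether the condition key is actually honored is a separate question, per the link above:

{
  "Effect": "Allow",
  "Action": [
    "lambda:GetFunction",
    "lambda:ListTags"
  ],
  "Resource": "*",
  "Condition": {
    "StringEquals": {
      "lambda:ResourceTag/department": "hr"
    }
  }
}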
If your IAM users are tagged with department:hr and they assume the below IAM role via the console, they should be able to access the Lambda functions that have been tagged with department:hr.
HRDepartmentLambdaFunctionsAccessRole:
Type: AWS::IAM::Role
Properties:
RoleName: "HRDepartmentLambdaFunctionsAccessRole"
AssumeRolePolicyDocument:
# Allow users in account X to perform operations on lambda functions
Statement:
- Effect: Allow
Principal:
AWS:
- "AWS_ACCOUNT_NUMBER"
Action:
- sts:AssumeRole
Condition:
StringEquals:
aws:PrincipalTag/department:
- hr
Path: /
Policies:
- PolicyName: AllowAccessToLambdaFunctionsInHRDepartment
PolicyDocument:
Version: '2012-10-17'
Statement:
- Effect: Allow
Action:
- lambda:ListFunctions
- lambda:ListVersionsByFunction
- lambda:GetLayerVersion
- lambda:GetEventSourceMapping
- lambda:GetFunction
- lambda:ListAliases
- lambda:GetAccountSettings
- lambda:GetFunctionConfiguration
- lambda:GetLayerVersionPolicy
- lambda:ListTags
- lambda:ListEventSourceMappings
- lambda:ListLayerVersions
- lambda:ListLayers
- lambda:GetAlias
- lambda:GetPolicy
Resource: '*'
Condition:
StringEquals:
lambda:ResourceTag/department: 'hr'
Ref: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html
I have a multi account structure in AWS, where I have a master and child accounts. I am following this guide in order to propagate tags from the child instances to the master account, once they have been activated and I can manage the instances in the master account (systems manager).
So far it all works to the point where the lambda in the master account has all of the tags it needs. However, it is unable to add the tags to the managed instances in systems manager. Not sure why the role still can't access the tags, given the permissions...
This is the error I get:
[ERROR] 2019-03-29T09:14:02.419Z a00a68ba-9904-4199-bcae-cad75f6f5232 An error occurred (ValidationException) when calling the AddTagsToResource operation: Caller is an end user and not allowed to mutate system tags instanceId: mi-0d3bfce27d073c0f2
This is the lambda function with the attached role:
AWSTemplateFormatVersion: '2010-09-09'
Description: Management function that copies tags
Resources:
rSSMTagManagerRole:
Type: "AWS::IAM::Role"
Properties:
RoleName: Automation-SSMTagManagerRole
AssumeRolePolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: "Allow"
Principal:
Service:
- "lambda.amazonaws.com"
Action:
- "sts:AssumeRole"
Path: "/aws/"
Policies:
- PolicyName: "CopyInstanceTagsToSSMPolicy"
PolicyDocument:
Version: "2012-10-17"
Statement:
- Effect: "Allow"
Action:
- ssm:AddTagsToResource
- logs:CreateLogGroup
- logs:CreateLogStream
- logs:PutLogEvents
- tag:*
Resource: "*"
fnSSMTagManager:
Type: AWS::Lambda::Function
Properties:
FunctionName: Automation-SSM-Tag-Manager
Handler: index.lambda_handler
Role: !GetAtt [rSSMTagManagerRole, Arn]
Description: >
Copies tags from the list of instances in the event
context to the specified managed instances.
Code:
ZipFile: |+
import boto3
import json
import logging
#setup simple logging for INFO
logger = logging.getLogger()
logger.setLevel( logging.WARN )
client = boto3.client( 'ssm' )
def lambda_handler( event, context ):
"""Copies tags from the list of instances in the event
context to the specified managed instances.
"""
for instance in event[ "instances" ]:
addTags( instance[ "instanceId" ], instance[ "tags" ] )
def addTags( resourceid, tags ):
logger.info( "Configuring " + resourceid + " with " + str(tags) )
try:
response = client.add_tags_to_resource(
ResourceType='ManagedInstance',
ResourceId=resourceid,
Tags=tags
)
logger.info( response )
return response
except Exception as e:
errorMessage = str(e) + "instanceId: " + resourceid
logger.error( errorMessage )
return errorMessage
Runtime: python3.6
Timeout: '90'
Using the same guide, I faced the exact same error. It turned out that the instances in the child account had too many (10+) tags, which caused the Tag Manager to give this error. Modifying the tag collector Lambda function to propagate only specific tags instead of all tags cleared the error.
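A sketch of that kind of filtering in the tag collector (the allow-list is illustrative; note also that keys with the reserved aws: prefix can never be written by callers, which is what the "system tags" in the error message refers to):

ALLOWED_TAG_KEYS = {"Name", "Environment", "Owner"}  # illustrative allow-list

def filter_tags(tags):
    """Keep only the tags to propagate; drop reserved aws:* system tags."""
    return [
        t for t in tags
        if t["Key"] in ALLOWED_TAG_KEYS
        and not t["Key"].lower().startswith("aws:")
    ]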