PutLifecycleHook operation: Unable to publish test message to notification target (FIFO) - amazon-web-services

There are many documents that explain how to resolve this error. I have checked many of them and tried their suggestions; however, following them has not resolved the issue for me.
The error I get is:
An error occurred (ValidationError) when calling the PutLifecycleHook operation: Unable to publish test message to notification target arn:aws:sqs:xxxxx:XXXXX:kubeeventsqueue.fifo using IAM role arn:aws:iam::XXXXXXXXX:role/kubeautoscaling. Please check your target and role configuration and try to put lifecycle hook again.
The command I am using is:
aws autoscaling put-lifecycle-hook --lifecycle-hook-name terminate --auto-scaling-group-name mygroupname --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING --role-arn arn:aws:iam::XXXXXX:role/kubeautoscaling --notification-target-arn arn:aws:sqs:xxxxx:XXXXXXX:kubeeventsqueue.fifo
Note that I have replaced the actual IDs with XXXXX above.
The role concerned (arn:aws:iam::XXXXXX:role/kubeautoscaling) has a trust relationship with autoscaling.amazonaws.com and has the "AutoScalingNotificationAccessRole" policy attached to it.
While testing, I also tried adding an "Allow everybody" permission for all SQS actions (SQS:*). (I removed it after testing, though.)
I have also tried creating the SQS queue first and then configuring --notification-target-arn, without any success.
Any help on this would be very helpful.

It appears that you are using an Amazon SQS FIFO (first-in-first-out) queue.
From Configuring Notifications for Amazon EC2 Auto Scaling Lifecycle Hooks - Receive Notification Using Amazon SQS:
FIFO queues are not compatible with lifecycle hooks.
I don't know whether this is the cause of your current error, but it would prohibit your desired configuration from working.
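If the FIFO queue is the cause, the fix is simply to target a standard queue. A minimal sketch in Python with boto3, reusing the names from the question (adjust to taste):

import boto3

sqs = boto3.client("sqs")
autoscaling = boto3.client("autoscaling")

# Lifecycle hooks require a standard (non-FIFO) queue, so create one
# without the .fifo suffix and without the FifoQueue attribute.
queue_url = sqs.create_queue(QueueName="kubeeventsqueue")["QueueUrl"]
queue_arn = sqs.get_queue_attributes(
    QueueUrl=queue_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Same parameters as the CLI command in the question, but pointing at
# the standard queue's ARN instead of the .fifo ARN.
autoscaling.put_lifecycle_hook(
    LifecycleHookName="terminate",
    AutoScalingGroupName="mygroupname",
    LifecycleTransition="autoscaling:EC2_INSTANCE_TERMINATING",
    RoleARN="arn:aws:iam::XXXXXX:role/kubeautoscaling",
    NotificationTargetARN=queue_arn,
)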

Yes, FIFO queues are definitely not supported by lifecycle hooks. I wasted a lot of time dorking with permissions and queue configuration only to finally find that FIFO is not supported. It would be nice if this were much more prominent in the documentation, because 1) it is not at all obvious or intuitive, and 2) the error message received suggests it's permissions or something. How about explicitly stating "FIFO queues are not supported" instead of "Failed to send test message..."? RIDICULOUS!

Related

AWS Eventbridge Notifications Does Not Work Using SNS topic

I want to receive notifications from AWS EventBridge when there's a scheduled event for my Amazon Elastic Compute Cloud (Amazon EC2) instance.
I created an EventBridge rule and set the target to an already working SNS topic. The SNS topic has a working Lambda function subscribed to it, which is already used for other "CloudWatch to Slack" alarms. The EventBridge rule pattern is as follows:
{
  "source": ["aws.health"],
  "detail-type": ["AWS Health Event"],
  "detail": {
    "service": ["EC2"],
    "eventTypeCategory": ["scheduledChange"]
  }
}
I already received an EC2 scheduled maintenance (reboot) notification by e-mail from AWS, but the EventBridge rule I created did not trigger for it and did not send any notification to the Slack channel.
I am unsure whether I am missing something in the configuration. I am setting this up for the first time, and there is no way to simply test it with fake input. It is supposed to work even if there is just a single scheduled event showing under the top bell icon (as shown in the screenshot above), correct?
In order to find the root cause of this issue, I suggest taking a look at the CloudWatch usage metrics for SNS. SNS reports the following metrics, which might be useful to you: NumberOfMessagesPublished, NumberOfNotificationsDelivered, and NumberOfNotificationsFailed. If these metrics are reported and have a value different from 0, it means that SNS is receiving events from EventBridge and the problem is somewhere else.
If you are using a Lambda to send messages to Slack, you should take a look at its logs in CloudWatch to see whether the Lambda executed successfully. You might want to check out the setup for Lambda recommended by AWS: (link)
For further debugging, you may want to check out the test-event-pattern CLI command; a sketch follows.
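Contrary to the "no way to test it with fake input" worry above, the pattern can be checked against a synthetic event. A rough sketch with boto3; the sample event body is an assumption modeled on the AWS Health event shape, not a captured real event:

import json
import boto3

events = boto3.client("events")

# The pattern from the question, verbatim.
pattern = {
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"],
    "detail": {
        "service": ["EC2"],
        "eventTypeCategory": ["scheduledChange"],
    },
}

# A made-up but plausibly shaped AWS Health event to test against.
sample_event = {
    "id": "7bf73129-1428-4cd3-a780-95db273d1602",
    "detail-type": "AWS Health Event",
    "source": "aws.health",
    "account": "123456789012",
    "time": "2023-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": ["i-0123456789abcdef0"],
    "detail": {"service": "EC2", "eventTypeCategory": "scheduledChange"},
}

matches = events.test_event_pattern(
    EventPattern=json.dumps(pattern),
    Event=json.dumps(sample_event),
)["Result"]
print("pattern matches sample event:", matches)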
It is supposed to work even if there is just a single scheduled event showing under the top bell icon (as shown in the screenshot above), correct?
Yeah, it is supposed to work even if there already is an event.
I'm having a similar issue with an EventBridge rule built with CloudFormation. I had to manually go into the EventBridge rule via the AWS console, go to the trigger, and select the SNS topic again. It now works. It took me a while to figure out. Can you confirm that this fix worked for you? I'm not sure how else to fix this...

What is the Terraform resource for this AWS console item?

I am looking to add notifications to a build pipeline I am deploying in AWS via Terraform. I cannot seem to locate the resource which creates the status notifications in CodeBuild. Can someone let me know which resource this is?
You haven't mentioned what sort of notification you are looking to create, so I won't be able to provide sample code; however, as per the AWS docs here, you can detect state changes in CodePipeline using CloudWatch Events.
You can find the Terraform reference for CloudWatch Event Rules here, and you can follow the docs to create a resource that monitors CodePipeline for state changes using CloudWatch Events rules. A sketch of the underlying rule follows.
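Terraform aside, it may help to see the rule that aws_cloudwatch_event_rule ultimately creates. A hedged sketch with boto3, where the rule name, states, and topic ARN are placeholders; "CodePipeline Pipeline Execution State Change" is the detail-type CodePipeline emits:

import json
import boto3

events = boto3.client("events")

# Match CodePipeline execution state changes.
events.put_rule(
    Name="codepipeline-state-change",
    EventPattern=json.dumps({
        "source": ["aws.codepipeline"],
        "detail-type": ["CodePipeline Pipeline Execution State Change"],
        "detail": {"state": ["FAILED", "SUCCEEDED"]},
    }),
)

# Send matching events to an (assumed) existing SNS topic.
events.put_targets(
    Rule="codepipeline-state-change",
    Targets=[{
        "Id": "sns-notify",
        "Arn": "arn:aws:sns:us-east-1:123456789012:pipeline-notifications",
    }],
)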

An error occurred (InvalidParameterException) when calling the PutSubscriptionFilter operation

I am trying to put CloudWatch Logs into Kinesis Data Firehose.
I followed this guide:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html#FirehoseExample
Got this error
An error occurred (InvalidParameterException) when calling the PutSubscriptionFilter operation: Could not deliver test message to specified Firehose stream. Check if the given Firehose stream is in ACTIVE state.
aws logs put-subscription-filter --log-group-name "xxxx" --filter-name "xxx" --filter-pattern "{$.httpMethod = GET}" --destination-arn "arn:aws:firehose:us-east-1:12345567:deliverystream/xxxxx" --role-arn "arn:aws:iam::12344566:role/xxxxx"
You need to update the trust policy of your IAM role so that it allows the logs.amazonaws.com service principal to assume it; otherwise, CloudWatch Logs won't be able to assume your role to publish events to your Kinesis stream. (Obviously you also need to double-check the permissions on your role to make sure it has permission to read from your log group and write to your Kinesis stream.)
It would be nice if they added this to the error message to help point people in the right direction... For reference, a sketch of the trust-policy update follows.
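Updating the trust policy programmatically might look like this; a minimal sketch with boto3, where the role name is a placeholder and the service principal is the important part:

import json
import boto3

iam = boto3.client("iam")

# Allow CloudWatch Logs to assume the role passed to --role-arn.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "logs.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.update_assume_role_policy(
    RoleName="my-cwl-to-firehose-role",  # placeholder name
    PolicyDocument=json.dumps(trust_policy),
)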
The most likely cause of this error is a permissions issue, i.e. something wrong in the definition of the IAM role you passed to --role-arn. You may want to double-check that the role and its permissions were set up properly as described in the doc.
I was getting a similar error when subscribing to a CloudWatch log group and publishing to a Kinesis stream. CDK was not defining a dependency needed for the SubscriptionFilter to be created after the policy that would allow the filtered events to be published to Kinesis. This is reported in this GitHub CDK issue:
https://github.com/aws/aws-cdk/issues/21827
I ended up using the workaround implemented by GitHub user AlexStasko: https://github.com/AlexStasko/aws-subscription-filter-issue/blob/main/lib/app-stack.ts
If your Firehose stream is in ACTIVE status and you can send records to it, then the remaining issue is only the policy.
I got a similar issue when following the tutorial. The confusing part is that the Kinesis and Firehose variants are easy to mix up. You need to recheck your ~/PermissionsForCWL.json, in particular this part:
....
"Action": ["firehose:*"],    // easy to confuse with kinesis:* like I did
"Resource": ["arn:aws:firehose:region:123456789012:*"]
....
When I did the tutorial you mentioned, it was defaulting to a different region, so I had to pass --region with my region. It wasn't until I did all the steps in the correct region that it worked.
For me, I think this issue was occurring due to the time it takes for the IAM data plane to settle after new roles are created via regional IAM endpoints, for regions that are geographically far away from us-east-1.
I have a custom Lambda CloudFormation resource that auto-subscribes all existing and future log groups to a Firehose via a subscription filter. The IAM role gets deployed for CloudWatch Logs, then very quickly the Lambda function tries to subscribe the log groups, and on occasion this error would happen.
I added a time.sleep(30) to my code (this code only runs once at stack creation, so it's not going to hurt anything to wait 30 seconds). A retry-based sketch of the same idea follows.
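A slightly more robust variant of that workaround is to retry the call with exponential backoff instead of a fixed sleep. A rough sketch; every name and ARN below is a placeholder:

import time
import boto3
from botocore.exceptions import ClientError

logs = boto3.client("logs")

# Retry PutSubscriptionFilter while IAM propagation settles.
for attempt in range(6):
    try:
        logs.put_subscription_filter(
            logGroupName="/my/log/group",
            filterName="to-firehose",
            filterPattern="{$.httpMethod = GET}",
            destinationArn="arn:aws:firehose:us-east-1:123456789012:deliverystream/my-stream",
            roleArn="arn:aws:iam::123456789012:role/my-cwl-to-firehose-role",
        )
        break
    except ClientError as err:
        if err.response["Error"]["Code"] != "InvalidParameterException":
            raise
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ... up to 32s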

send notification alert when AWS Lambda function has an error

I have an AWS Lambda function running some process in my infrastructure. The Lambda is triggered every 8 hours using a CloudWatch rule. I am trying to raise a notification if any error happens in the Lambda process. I tried to use SES, but that service is not available in that region.
I would like to hear any suggestions for this problem:
How do I set up notifications when an error occurs in my Lambda functions?
I am looking for suggestions. I haven't found this question asked for my exact task. I would appreciate any official documentation, but either way, any help is welcome.
Some suggestions:
Dead Letter Queues:
If your error causes failed invocations, you can use a Lambda dead-letter queue to send the failed event to an SNS topic or an SQS queue. If you send it to an SNS topic, you can subscribe to the topic directly (for example via email) to get notified any time a message is published to it. A sketch follows.
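A minimal sketch of that wiring in boto3, assuming the function already exists (function name and email address are placeholders; the function's execution role also needs sns:Publish on the topic):

import boto3

sns = boto3.client("sns")
lam = boto3.client("lambda")

topic_arn = sns.create_topic(Name="lambda-failures")["TopicArn"]

# Route failed async invocations to the SNS topic as a dead-letter queue.
lam.update_function_configuration(
    FunctionName="my-scheduled-function",  # placeholder
    DeadLetterConfig={"TargetArn": topic_arn},
)

# Get an email any time a failed event lands on the topic.
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="me@example.com")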
Multi-region SES:
If you're really set on using SES directly, SES clients can be instantiated with an explicit region provided -- as long as your Lambda's execution role has the appropriate permissions, you can send email through SES in a different region. Here's the documentation for instantiating the JS SES client; a Python equivalent is sketched below.
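The same idea in Python with boto3 (region, addresses, and message content are placeholders; the Source address must be a verified SES identity):

import boto3

# Point the SES client at a region where SES is available, regardless
# of the region the Lambda itself runs in.
ses = boto3.client("ses", region_name="us-east-1")

ses.send_email(
    Source="alerts@example.com",
    Destination={"ToAddresses": ["me@example.com"]},
    Message={
        "Subject": {"Data": "Lambda error"},
        "Body": {"Text": {"Data": "Something went wrong in the 8-hour job."}},
    },
)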
CloudWatch Logs:
If your error does not cause the invocation to fail, another option is using a CloudWatch Logs metric filter to aggregate failures and potentially alarm on them. If you're using Node.js, you can simply log via console.log(), console.error(), etc., and it will be written to CloudWatch Logs. More details here.
You can subscribe an SNS topic to the resulting CloudWatch alarm and notify yourself in the same way as with the DLQ. A sketch of the filter and alarm follows.
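A rough sketch of the metric-filter-plus-alarm approach (log group name, metric names, threshold, and topic ARN are all placeholders):

import boto3

logs = boto3.client("logs")
cloudwatch = boto3.client("cloudwatch")

# Count log lines containing "ERROR" as a custom metric.
logs.put_metric_filter(
    logGroupName="/aws/lambda/my-scheduled-function",
    filterName="error-count",
    filterPattern="ERROR",
    metricTransformations=[{
        "metricName": "LambdaErrors",
        "metricNamespace": "MyApp",
        "metricValue": "1",
    }],
)

# Alarm (and notify the SNS topic) whenever any error shows up.
cloudwatch.put_metric_alarm(
    AlarmName="my-function-errors",
    Namespace="MyApp",
    MetricName="LambdaErrors",
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:lambda-failures"],
)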
As you gain experience with the errors and learn how to process the common ones, you could also subscribe another Lambda to the SNS topic from the DLQ/CloudWatch Logs examples to process them as they happen.

How to find out which machine keeps reading and deleting messages from an SQS queue

I have an SQS queue and several machines that read from it. Although I have shut all of them down, something keeps reading and removing messages from the queue.
Is there any way to find the IP addresses of the machines that read messages from an SQS queue?
Thanks
No, you cannot get the IP or EC2 instance that is performing these actions, but there are some steps you can take to attempt to narrow down what is consuming messages.
CloudTrail only logs the following actions for SQS:
AddPermission
CreateQueue
DeleteQueue
PurgeQueue
RemovePermission
SetQueueAttributes
This means that consuming messages is not logged, so CloudTrail cannot answer this question for you.
What you can do is use the IAM console to try to isolate which user or role is accessing the SQS service. This will not narrow it down to an individual queue, but it is better than nothing. You can look under the Access Advisor tab to check whether each user/role is using SQS.
If that is not enough to narrow it down, then you will probably have to resort to adding a policy to the SQS queue to start blocking users/roles from getting messages from that specific queue. This will be a game of guess and check. Alternatively, you could lock the queue down so that only a specific user or role can read from it, as sketched below.
If you are using cross-account access for this queue, the above steps will not be as useful, as you will not have the same level of visibility. Also, if the same role or user is shared by lots of different servers or applications, this approach will not work either. If that is the case, this would be a good time to start applying least privilege, as it can help with these types of problems.
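As a concrete example of the lock-down option, a queue policy like the following (all ARNs are placeholders) denies ReceiveMessage to everyone except one known role; whatever then starts failing with AccessDenied is your mystery consumer:

import json
import boto3

sqs = boto3.client("sqs")

queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Principal": "*",
        "Action": "sqs:ReceiveMessage",
        "Resource": "arn:aws:sqs:us-east-1:123456789012:my-queue",
        # Deny everyone except the one role that should be reading.
        "Condition": {
            "ArnNotEquals": {
                "aws:PrincipalArn": "arn:aws:iam::123456789012:role/known-consumer"
            }
        },
    }],
}

sqs.set_queue_attributes(
    QueueUrl=queue_url,
    Attributes={"Policy": json.dumps(policy)},
)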
You could write a program in your message consumption application to run some of the queries here https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html and then log, for example, the instance ID to somewhere you know you'll be able to pick it up; see the sketch below.
This is redundant, however, if you don't know where to deploy this code (because you don't know which machine/group is consuming the messages).
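For example, a consumer could log its own instance ID alongside each receive, using the instance metadata endpoint (IMDSv1 shown for brevity; with IMDSv2 enforced you must first PUT to /latest/api/token and pass the token as a header):

import urllib.request

# Fetch this machine's instance ID from the EC2 metadata service.
with urllib.request.urlopen(
    "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
) as resp:
    instance_id = resp.read().decode()

print(f"consuming from queue on instance {instance_id}")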
I would suggest manually creating a message in the SQS management console (Queue Actions > Send a message), crafted so that you know your consumer application will throw an error (i.e. badly formatted or whatever). Then search the relevant logs for the error; this will hopefully allow you to detect which machine threw it.
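Sending the poison message from code rather than the console is just as easy (queue URL is a placeholder):

import boto3

sqs = boto3.client("sqs")

# Deliberately malformed JSON so whichever consumer picks it up throws
# a parse error you can then hunt for in its logs.
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",
    MessageBody='{"this is": "not valid json',
)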