I'm seeing occasional latency spikes (up to 7 seconds) in a production environment when publishing messages to SNS from EC2. According to Datadog APM, the bottleneck is the call to SocketInputStream.read(byte[], int, int, int), which happens somewhere deep in the publish call. Looking at the SDK metrics in CloudWatch, APICallDuration and ServiceCallDuration correlate with the numbers in Datadog (> 6 seconds).
I'm using an r6a.2xlarge EC2 instance which runs in the same region as SNS (us-east-1).
I'm also using the SNS Java sync client, which I create as follows:
private val metricsPublisher = CloudWatchMetricPublisher
  .builder()
  .cloudWatchClient(
    CloudWatchAsyncClient
      .builder()
      .httpClientBuilder(
        NettyNioAsyncHttpClient
          .builder()
      )
      .build()
  )
  .build()

protected val client: SnsClient = SnsClient.builder
  .endpointOverride(URI.create(endpoint))
  .region(region)
  .overrideConfiguration(
    ClientOverrideConfiguration
      .builder()
      .addMetricPublisher(metricsPublisher)
      .build()
  )
  .build()
Any suggestions on possible fixes/mitigations?
Related
Here's the scenario: after AWS Backup completes a backup of an Aurora RDS cluster, it sends an event message to an SNS topic. Finally, I use a Lambda function to get the cluster information.
AWS Backup job completed -> Event message -> SNS topic -> trigger Lambda
The problem is that I cannot find any information related to the cluster ID (I tried with EventBridge). I've also tried RDS Event Subscriptions, but they don't send any information when I perform an AWS Backup.
Please let me know if you have any other ideas. Thanks in advance.
I've created a crawler that pulls messages from SQS when new objects are added to S3, but when it runs, the message "The number of unique events received is 0 for the target" is printed and the expected table isn't created. When I remove S3 events from the crawler settings, the tables are created successfully.
Execution logs:
BENCHMARK : Running Start Crawl for Crawler [crawler_name]
INFO : The crawl is running by consuming Amazon S3 events.
INFO : The number of messages in the SQS queue arn:aws:sqs:[myqueue] is 17
INFO : The number of messages in the SQS queue arn:aws:sqs:[myqueue-dlq] is 0
INFO : The number of unique events received is 0 for the target s3://[mybucket]/[myfolder]
BENCHMARK : Crawler has finished running and is in state READY
Are you using Amazon S3 event notifications or Amazon S3 bucket notifications to send notifications to Amazon SQS?
I faced the same issue. In my case it was caused by using S3 event notifications (via the Amazon EventBridge service) instead of plain old Amazon S3 bucket notifications for forwarding S3 notification messages to Amazon SQS. After switching to Amazon S3 bucket notifications the issue was resolved. The message format for Amazon S3 bucket notifications and Amazon S3 event notifications is different, and it seems the AWS Glue crawler does not process/recognize messages received via Amazon S3 event notifications. Hope this helps.
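For reference, here's a minimal sketch of wiring up a plain S3 bucket notification to SQS with boto3 (the bucket name and queue ARN are hypothetical placeholders, and the queue's access policy must already allow s3.amazonaws.com to send messages):

import boto3

s3 = boto3.client('s3')

# Hypothetical bucket and queue; replace with your own values.
s3.put_bucket_notification_configuration(
    Bucket='mybucket',
    NotificationConfiguration={
        'QueueConfigurations': [
            {
                'QueueArn': 'arn:aws:sqs:us-east-1:123456789012:myqueue',
                'Events': ['s3:ObjectCreated:*']
            }
        ]
    }
)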
I want an email notification if a specific Windows service on EC2 enters the stopped state.
I configured CloudWatch and am able to receive logs for all Windows services.
I created a Lambda function to notify me when any service enters the stopped state, but the problem is that I receive an alert only when I click on the test function.
I am receiving CloudWatch logs like this:
03:43:02 [System] [INFORMATION] [7036] [Service Control Manager] [mydomain.com] [The Background Intelligent Transfer Service service entered the running state.]
03:43:02 [System] [INFORMATION] [7040] [Service Control Manager] [mydomain.com] [The start type of the Background Intelligent Transfer Service service was changed from demand start to auto start.]
03:43:02 [System] [INFORMATION] [7036] [Service Control Manager] [mydomain.com] [The WinHTTP Web Proxy Auto-Discovery Service service entered the running state.]
03:45:02 [System] [INFORMATION] [7040] [Service Control Manager] [mydomain.com] [The start type of the Background Intelligent Transfer Service service was changed from auto start to demand start.]
This is my lambda function:
import boto3
import time

client = boto3.client('logs')
sns = boto3.client('sns')
instance_name = "Development"
a1 = int(round(time.time() * 1000))

def lambda_handler(event, context):
    response = client.get_log_events(
        logGroupName = 'Eadev',
        logStreamName = 'i-01fe1z56y790cq',
        startTime = a1,
        startFromHead = False
    )
    event01 = '[System] [INFORMATION] [7036] [Service Control Manager] [mydomain.com] [The DebtManager-Host service entered the stopped state.]'
    event02 = '[System] [INFORMATION] [7036] [Service Control Manager] [mydomain.com] [The DebtManager-Controller service entered the stopped state.]'
    for i in response['events']:
        if event01 == i['message']:
            print(event01)
            sns.publish(TargetArn = "arn:aws:sns:us-east-1:3913948:testsns", Message = instance_name + " " + event01)
        if event02 == i['message']:
            print(event02)
            sns.publish(TargetArn = "arn:aws:sns:us-east-1:3913948:testsns", Message = instance_name + " " + event02)
I expected an email notification whenever a service stops, but I only receive an alert when I click Test in the Lambda console.
It appears that your desired situation is:
The Amazon CloudWatch agent on the Windows instance sends log data to Amazon CloudWatch Logs
Send a notification when a particular entry is detected in the log file
Rather than triggering a Lambda function for every log message, you can use CloudWatch Logs Filter Metrics to trigger a CloudWatch Alarm:
Collecting Metrics and Logs from Amazon EC2 Instances and On-Premises Servers with the CloudWatch Agent
Searching and Filtering Log Data to detect the desired messages by Creating Metric Filters
This pushes metrics into Amazon CloudWatch Metrics
You can then create a traditional Amazon CloudWatch Alarm on the metric and have it trigger when a certain number of such messages are received
A CloudWatch Alarm can send a notification to an Amazon SNS topic
For an end-to-end example, see: Use Amazon CloudWatch Logs Metric Filters to Send Alerts - The IT Hollow
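As a rough sketch of that metric-filter approach with boto3 (the log group name, filter pattern, thresholds, and topic ARN are hypothetical placeholders, not a definitive configuration):

import boto3

logs = boto3.client('logs')
cloudwatch = boto3.client('cloudwatch')

# Count log events that mention a service entering the stopped state.
logs.put_metric_filter(
    logGroupName='Eadev',
    filterName='ServiceStoppedFilter',
    filterPattern='"entered the stopped state"',
    metricTransformations=[{
        'metricName': 'ServiceStoppedCount',
        'metricNamespace': 'Custom/WindowsServices',
        'metricValue': '1',
        'defaultValue': 0
    }]
)

# Alarm whenever at least one matching log event arrives in a 1-minute period.
cloudwatch.put_metric_alarm(
    AlarmName='ServiceStoppedAlarm',
    MetricName='ServiceStoppedCount',
    Namespace='Custom/WindowsServices',
    Statistic='Sum',
    Period=60,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator='GreaterThanOrEqualToThreshold',
    TreatMissingData='notBreaching',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:testsns']  # hypothetical topic ARN
)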
Alternatively, you can use an AWS Lambda function:
Collect Metrics and Logs from Amazon EC2 Instances and On-Premises Servers with the CloudWatch Agent
Use Real-time Processing of Log Data with Subscriptions
It can accept a subscription filter to identify the records of interest
It can then trigger an AWS Lambda function, which you can program to do whatever you wish (eg send a message to an Amazon SNS topic)
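And a minimal sketch of the subscription-filter variant with boto3 (again with hypothetical names; CloudWatch Logs must also be granted permission to invoke the Lambda function, e.g. via lambda add-permission):

import boto3

logs = boto3.client('logs')

# Only log events matching the pattern are delivered to the Lambda function
# (hypothetical function ARN below).
logs.put_subscription_filter(
    logGroupName='Eadev',
    filterName='ServiceStoppedSubscription',
    filterPattern='"entered the stopped state"',
    destinationArn='arn:aws:lambda:us-east-1:123456789012:function:notify-service-stopped'
)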
I have an AWS Lambda function which publishes data to an AWS IoT topic A and waits for the result, which will be published to a different topic B.
I was wondering how to get this data from topic B into the already running Lambda when the thing publishes it.
I was not able to find any equivalent of get_thing_shadow for a particular topic: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iot-data.html#id4
E.g.:
Lambda1 -> IoT topic A -> Thing
Lambda1 waiting
Thing -> IoT topic B
Lambda1 reads from topic B, updates (say) a DB, and dies.
I was wondering how this can be done.
For various reasons we are unable to use the IoT Shadow anymore.
Current architecture:
Lambda1 -> IoT Shadow Desired -> Thing
Lambda1 -> waits for 5 sec
Lambda1 -> reads IoT Shadow Reported -> success or failure
If failure: Lambda1 -> resets IoT Shadow Desired to old state -> exits
It is not possible to configure IoT to send the new message to the "already running" Lambda. It will always trigger a new invocation of the Lambda function. Isn't the previous state already in the IoT Shadow Update Failed message? Can't you just use that data in the new invocation to do whatever DB updates or whatever else you need?
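In other words, a topic rule on topic B (e.g. SELECT * FROM 'topicB') can invoke a second Lambda whose event is the published payload, and that invocation can finish the work. A rough sketch, with entirely hypothetical table and payload field names:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('device-state')  # hypothetical table name

def lambda_handler(event, context):
    # With an IoT topic rule action targeting this function, `event` is the
    # JSON payload the thing published to topic B (field names are assumptions).
    thing_name = event['thingName']
    if event.get('status') == 'success':
        table.put_item(Item={'thingName': thing_name, 'state': event['reported']})
    else:
        # On failure, roll back to the previous state carried in the payload.
        table.put_item(Item={'thingName': thing_name, 'state': event['previous']})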
The AWS SDK used in Lambda (e.g. boto3 for Python) does not support subscribing to a topic.
It only supports publishing to a topic.
If you want to subscribe to a topic, you must use the device SDK
(ref. https://docs.aws.amazon.com/iot/latest/developerguide/iot-sdks.html ).
You can then publish and subscribe with the device SDK in Lambda.
If you don't want to use the device SDK, you have to use Redis or DynamoDB, like below:
device publishes a response message -> an AWS IoT rule triggers some action (e.g. writes it to a DB) -> the Lambda polls the DB.
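A rough sketch of that polling pattern with boto3 and DynamoDB (the table name, key, and timeout are hypothetical; the IoT rule is assumed to write each response from topic B into this table):

import time
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('iot-responses')  # hypothetical table populated by the IoT rule

def wait_for_response(request_id, timeout_seconds=5):
    # Poll DynamoDB until the thing's response (written by the IoT rule) appears.
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        item = table.get_item(Key={'requestId': request_id}).get('Item')
        if item:
            return item   # success: the thing published to topic B
        time.sleep(0.5)
    return None           # timed out: treat as failure and roll back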
We are trying to set up Amazon SQS between two AWS applications. Management wants to track the cost associated with all Amazon resources. Is it possible to tag Amazon Simple Queue Service resources?
This feature is now supported on SQS: https://aws.amazon.com/blogs/aws/introducing-cost-allocation-tags-for-amazon-sqs/
Tagging for SQS is not yet supported. Perhaps you can manually calculate the cost using the standard pricing formula, with a few assumptions about usage (e.g. the number of SQS requests made).
In my opinion, you can enable cost-allocation tagging for the AWS resources that support it, and account for the remaining ones as miscellaneous charges, which would include SQS.
First 1 million Amazon SQS Requests per month are free
$0.50 per 1 million Amazon SQS Requests per month thereafter ($0.00000050 per SQS Request)
A single request can have from 1 to 10 messages, up to a maximum total payload of 256KB.
Each 64KB ‘chunk’ of payload is billed as 1 request. For example, a single API call with a 256KB payload will be billed as four requests.
Reference : http://aws.amazon.com/sqs/pricing/
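As a quick sanity check of that billing rule (a sketch of the arithmetic, not an official calculator):

import math

def billed_requests(payload_kb):
    # Each 64 KB chunk of payload counts as one billable request.
    return math.ceil(payload_kb / 64)

print(billed_requests(256))  # 4, matching the example above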
Please use the following AWS CLI command to tag your SQS queue:
aws sqs tag-queue --queue-url <queue-url> --tags "Key1=Value1,Key2=Value2"
I have created a boto3 script to tag an SQS queue. Please pass your queue name to tag_sqs_queue():
import boto3

client = boto3.client('sqs')

def get_queue_url(queuename):
    # Resolve the queue URL from the queue name.
    get_url = client.get_queue_url(
        QueueName=queuename
    )
    return get_url['QueueUrl']

def tag_sqs_queue(queuename):
    # Apply cost-allocation tags to the queue.
    response = client.tag_queue(
        QueueUrl=get_queue_url(queuename),
        Tags={
            'Environment': 'Production',
            'Owner': 'Joe Biden'
        }
    )

tag_sqs_queue('<queuename>')
The script first gets the queue URL and then applies the tags.