I have been trying to implement the section 'Detailed table of client versions connections to WorkSpaces' (Step 6) as mentioned in the guide.
However, on running the mentioned query, I get an empty table rather than a populated one.
Apart from the query mentioned in the doc, I also tried changing it a bit, as suggested by CloudWatch:
stats count()
| fields @timestamp, @message
| filter source = "aws.workspaces"
| fields account
| fields region, detail.clientPlatform, detail.clientVersion, detail.workspaceId
| display region, detail.clientPlatform, detail.clientVersion, detail.workspaceId
However, it still gives no result; the log table attached below is what I see instead.
I am trying to create some alerting policies in GCP for my application hosted in a Kubernetes cluster.
We have a Cloud Load Balancer serving the traffic, and I can see the HTTP status codes like 2XX, 5XX, etc.
I need to create alerting policies based on the error percentage, i.e. ((NumberOfFailures/Total) * 100), rather than on an absolute value, so that an alert is triggered if my error percentage goes above, say, 50%.
I couldn't find anything in the Google documentation; it just tells you to use a counter, which is an absolute value. I am looking for something like: if the failure rate goes beyond 50% in a rolling window of 15 minutes, then trigger the alert.
Is it even possible to do that natively in GCP?
Yes, I think this is possible with MQL. I have recently created something similar to your use case.
fetch api
| metric 'serviceruntime.googleapis.com/api/request_count'
| filter
(resource.service == 'my-service.com')
| group_by 10m, [value_request_count_aggregate: aggregate(value.request_count)]
| every 10m
| { group_by [metric.response_code_class],
[response_code_count_aggregate: aggregate(value_request_count_aggregate)]
| filter (metric.response_code_class == '5xx')
; group_by [],
[value_request_count_aggregate_aggregate:
aggregate(value_request_count_aggregate)] }
| join
| value [response_code_ratio: val(0) / val(1)]
| condition gt(val(), 0.1)
In this example, I am using the request count for a service, my-service.com. I aggregate the request count over the last 10 minutes for responses with a 5xx response code, and separately aggregate the request count over the same period for all response codes. In the last two lines, I compute the ratio of the number of 5xx responses to the number of all responses. Finally, I create a boolean value that is true when the ratio is above 0.1 and that I can use to trigger an alert.
I hope this gives you a rough idea of how you can create your own alerting policy based on percentages.
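If you want to create the alerting policy programmatically rather than through the console, an MQL condition like the one above can be attached to an alert policy via the Cloud Monitoring API. This is not part of the original answer, just a rough Python sketch assuming the google-cloud-monitoring client; the project ID and display names are placeholders:

from google.cloud import monitoring_v3

# The MQL query from the answer above, ending in a condition.
mql = """fetch api
| metric 'serviceruntime.googleapis.com/api/request_count'
| filter (resource.service == 'my-service.com')
| group_by 10m, [value_request_count_aggregate: aggregate(value.request_count)]
| every 10m
| { group_by [metric.response_code_class],
      [response_code_count_aggregate: aggregate(value_request_count_aggregate)]
    | filter (metric.response_code_class == '5xx')
  ; group_by [],
      [value_request_count_aggregate_aggregate:
         aggregate(value_request_count_aggregate)] }
| join
| value [response_code_ratio: val(0) / val(1)]
| condition gt(val(), 0.1)"""

client = monitoring_v3.AlertPolicyServiceClient()
policy = monitoring_v3.AlertPolicy(
    display_name="5xx error ratio above 10%",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="5xx ratio > 0.1",
            condition_monitoring_query_language=(
                monitoring_v3.AlertPolicy.Condition.MonitoringQueryLanguageCondition(
                    query=mql,
                    duration={"seconds": 600},  # condition must hold for 10 minutes
                )
            ),
        )
    ],
)

# "my-project" is a placeholder project ID.
created = client.create_alert_policy(name="projects/my-project", alert_policy=policy)
print(created.name)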
Lambda obviously tracks executions, since you can see data points in the Lambda Monitoring tab.
Lambda also saves the logs in log groups; however, execution environments appear to be reused when invocations happen within a short interval (say, 5 minutes between them), so the output from multiple executions gets written to the same log stream.
This makes logs a lot harder to follow, especially due to other limitations (the CloudWatch web console is super slow and cumbersome to navigate, and aws logs get-log-events has a 1 MB / 10,000-message limit per call, which makes it awkward to use).
Is there some way to only get Lambda log entries for a specific Lambda execution?
You can filter by the RequestId. Most loggers will include this in the log, and it is automatically included in the START, END, and REPORT entries.
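For illustration, here is a minimal Python handler sketch (not from the original answer) that writes the request ID into every log line so it can be matched against the START/END/REPORT entries:

import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    # context.aws_request_id is the same RequestId that appears in the
    # START/END/REPORT lines, so logging it makes per-invocation filtering easy.
    logger.info("processing event, requestId=%s", context.aws_request_id)
    return {"statusCode": 200}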
My current approach is to use CloudWatch Logs Insights to query for the specific logs that I'm looking for. Here is the sample query:
fields @timestamp, @message
| filter @requestId = '5a89df1a-bd71-43dd-b8dd-a2989ab615b1'
| sort @timestamp
| limit 10000
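If the console is too slow, the same Logs Insights query can also be run programmatically. Below is a minimal boto3 sketch; the log group name is a placeholder, and the request ID is the example value from the query above:

import time
import boto3

logs = boto3.client("logs")  # assumes credentials and region are already configured

# Start the Insights query over the last hour of a (placeholder) log group.
query_id = logs.start_query(
    logGroupName="/aws/lambda/my-function",
    startTime=int(time.time()) - 3600,  # epoch seconds
    endTime=int(time.time()),
    queryString=(
        "fields @timestamp, @message "
        "| filter @requestId = '5a89df1a-bd71-43dd-b8dd-a2989ab615b1' "
        "| sort @timestamp | limit 10000"
    ),
)["queryId"]

# Poll until the query finishes, then print each matched row.
while True:
    result = logs.get_query_results(queryId=query_id)
    if result["status"] not in ("Scheduled", "Running"):
        break
    time.sleep(1)

for row in result.get("results", []):
    print({f["field"]: f["value"] for f in row})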
I need to query and count the failure logs of my app from different locations to test the performance of the APIs.
fields @message
| filter (@message like /Failure/ and @message like /AWE/)
| stats count() as failure by bin(1d) as AWE
Right now I filter the logs with the expressions 'Failure' and 'AWE' (one of the locations) and get the results below.
#   AWE                              failure
1   2020-12-30T08:00:00.000+08:00    6
I want to continue filtering failure logs for the other locations on the same widget, but I can't.
So I want to know whether I am missing something or whether it is just not possible to do what I want.
I have two log groups generated by two different Lambdas. When I subscribe one log group to my Elasticsearch service, it works. However, when I add the other log group, I get the following error in the log generated by CloudWatch:
"responseBody": "{\"took\":5,\"errors\":true,\"items\":[{\"index\":{\"_index\":\"cwl-2018.03.01\",\"_type\":\"/aws/lambda/lambda-1\",\"_id\":\"33894733850010958003644005072668130559385092091818016768\",\"status\":400,\"error\":
{\"type\":\"illegal_argument_exception\",\"reason\":\"Rejecting mapping update to [cwl-2018.03.01] as the final mapping would have more than 1 type: [/aws/lambda/lambda-1, /aws/lambda/lambda-2]\"}}}]}"
How can I resolve this, still have both log groups in my Elasticsearch service, and visualize all the logs?
Thank you.
The problem is that ElasticSearch 6.0.0 made a change that allows indices to only contain a single mapping type. (https://www.elastic.co/guide/en/elasticsearch/reference/6.0/removal-of-types.html) I assume you are running an ElasticSearch service instance that is using version 6.0.
The default Lambda JS file, if created through the AWS console, sets the index type to the log group name. An example of the JS file is in this gist (https://gist.github.com/iMilnb/27726a5004c0d4dc3dba3de01c65c575).
Line 86: action.index._type = payload.logGroup;
I personally have a modified version of that script in use and changed that line to be:
action.index._type = 'cwl';
I have logs from various different log groups streaming through to the same ElasticSearch instance. It makes sense to have them all be the same type, since they are all CloudWatch logs, rather than having the type be the log group name. The name is also set in the @log_group field, so queries can use that for filtering.
In my case, I did the following:
Deploy the modified Lambda.
Reindex today's index (cwl-2018.03.07, for example) to change the type for old documents from <log group name> to cwl.
Entries from different log groups will now coexist.
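For the reindex step, something along these lines might work against an ES 6.x endpoint (the domain URL and index names are placeholders; this is a sketch, not a tested migration):

import requests

# Placeholder Elasticsearch endpoint.
ES = "https://my-es-domain.example.com"

# Copy today's documents into a new index whose documents all use the single
# 'cwl' type, so entries from different log groups can coexist under ES 6.x.
resp = requests.post(
    ES + "/_reindex",
    json={
        "source": {"index": "cwl-2018.03.07"},
        "dest": {"index": "cwl-2018.03.07-single-type", "type": "cwl"},
    },
    headers={"Content-Type": "application/json"},
)
print(resp.status_code, resp.json())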
You can also modify the generated Lambda code as below to make it work with multiple CloudWatch log groups. If the Lambda function creates a different ES index for each log group, then we can avoid this problem. So, you need to find the Lambda function LogsToElasticsearch_<AWS-ES-DOMAIN-NAME>, then the function transform(payload), and finally change the index-name construction as shown below.
// index name format: cwl-YYYY.MM.DD
// var indexName = [
//     'cwl-' + timestamp.getUTCFullYear(),              // year
//     ('0' + (timestamp.getUTCMonth() + 1)).slice(-2),  // month
//     ('0' + timestamp.getUTCDate()).slice(-2)          // day
// ].join('.');
var indexName = [
    'cwl-' + payload.logGroup.toLowerCase().split('/').join('-') + '-' + timestamp.getUTCFullYear(), // log group + year
    ('0' + (timestamp.getUTCMonth() + 1)).slice(-2),  // month
    ('0' + timestamp.getUTCDate()).slice(-2)          // day
].join('.');
Is it possible to forward all the CloudWatch log groups to a single index in ES? Like having one index, "rds-logs-*", to stream logs from all my available RDS instances.
For example: error logs, slow-query logs, general logs, etc., of all RDS instances would be pushed under the same index (rds-logs-*).
I tried the above-mentioned code change, but it pushes only the last log group that I had configured.
From AWS: by default, only one log group can stream log data into the ElasticSearch service. Attempting to stream two log groups at the same time will result in the log data of one log group overriding the log data of the other.
I wanted to check if there is a workaround for this.
Problem
Dependency on AWS Services status
If you depend on Amazon AWS services to operate, you need to keep a close eye on the status of those services. Amazon uses the website http://status.aws.amazon.com/, which provides links to RSS feeds for specific services in specific regions.
Potential Errors
Our service uses S3, CloudFront, and other services to operate. We'd like to be informed of any service that might go down during hours of operation, and automate what we should do in case something goes wrong.
Splunk Logging
We use Splunk for logging across all of our services.
Requirement
For instance, if errors occur in the application while writing to S3, we'd like to know whether that was caused by a potential outage in AWS.
How to monitor the Status RSS feed in Splunk?
Is there an HTTP client for that? A background service?
Solution
You can use the Syndication Input app to collect the RSS feed data from the AWS Status page.
Create a query that fetches the RSS items that report errors, which are stored in Splunk indexes under the syndication sourcetype.
Create an alert based on the query, with a since field so that we can adjust the alerts over time.
How
Ask your Splunk team to install the app "Syndication Input" on the environments you need.
After that, just collect each of the RSS feeds needed and add them under Settings -> Data Inputs -> Syndication Feed. Take all the URLs from the Amazon Status RSS feeds and use them as Splunk data inputs, filling out the form with a suitable interval:
http://status.aws.amazon.com/rss/cloudfront.rss
http://status.aws.amazon.com/rss/s3-us-standard.rss
http://status.aws.amazon.com/rss/s3-us-west-1.rss
http://status.aws.amazon.com/rss/s3-us-west-2.rss
When you are finished, the configured feeds appear in the Syndication app as data inputs.
Use the search to find the errors when they occur, adjusting the "since" date so that you can create an alert on the results. I set a date in the past just for display purposes.
since should be the day from which you start monitoring AWS. This lets the query return any new event when Amazon publishes new errors, captured from the text Informational message:.
With since set appropriately, the query should not return anything until a new item is published.
Because the token RESOLVED is appended to feed items for resolved issues, we exclude those from the alerts.
sourcetype=syndication "Informational message:" NOT "RESOLVED"
| eval since=strptime("2010-08-01", "%Y-%m-%d")
| eval date=strptime(published_parsed, "%Y-%m-%dT%H:%M:%SZ")
| rex field=summary_detail_base "rss\/(?<aws_object>.*).rss$"
| where date > since
| table aws_object, published_parsed, id, title, summary
| sort -published_parsed
Create an alert with the query, for instance one that sends an email.
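As a rough sketch of that step using the Splunk SDK for Python (connection details, recipient, and schedule are placeholders, and the exact alert attributes may need adjusting for your environment):

import splunklib.client as client  # pip install splunk-sdk

# Placeholder connection details.
service = client.connect(host="splunk.example.com", port=8089,
                         username="admin", password="changeme")

search = (
    'sourcetype=syndication "Informational message:" NOT "RESOLVED" '
    '| eval since=strptime("2010-08-01", "%Y-%m-%d") '
    '| eval date=strptime(published_parsed, "%Y-%m-%dT%H:%M:%SZ") '
    '| where date > since'
)

# Scheduled saved search that fires an email whenever a new matching event appears.
service.saved_searches.create(
    "AWS status feed errors",
    search,
    **{
        "is_scheduled": "1",
        "cron_schedule": "*/15 * * * *",
        "alert_type": "number of events",
        "alert_comparator": "greater than",
        "alert_threshold": "0",
        "actions": "email",
        "action.email.to": "oncall@example.com",
    }
)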