data platform monitoring tools options - amazon-web-services

We have aws based data pipelines having different components like kinesis lambda firehose s3 dynamodb and emr for spark jobs.
We need to implement monitoring system across all these components mostly monitoring the processing time taken at each point and if any bottlenecks.
Can anybody please guide if they have implemented such monitoring system. I am more interested in building the prometheus and grafana based system

You can use a Cloudwatch exporter to take the metrics to a Prometheus server.
There are 2 main Cloudwatch exporters:
Cloudwatch exporter: The 'official' one, written in Java. The main handicap is that it does not implement GetMetricsData API call, so it can provoke API throttling if there are many resources in the namespace.
Yet another Cloudwatch Exporter: This one is written in go and implements GetMetricsData, which allows to make up to 500 metrics requests in a single API call.
In both the configuration is similar and in the repositories there are configuration files available as examples.
AWS has a doc about the convenience of using GetMetricsData instead of GetMetricStatistics.

For your AWS services you can use CloudWatch and CloudTrail for monitoring. For Apache Spark you can also use the Spark Web UI.

Related

Is there an AWS service for visualizing triggers between other AWS services

For example say I build a workflow that uses 10 lambda functions that trigger each other and are triggered by a dynamodb table and an S3 bucket.
Is there any AWS tool that tracks how these triggers are tying together so I can easily visualize the whole workflow I’ve created?
Bang on, few months ago, I too was in a similar situation for my distributed architecture running on AWS.
So far, I have found the following options as possibilities. I'm still figuring out which is more suitable. But, hope this information helps you.
1. AWS-Native option :: Engineer your Lambda code to trigger Cloudwatch custom-metrics for any important events from within the code. Later, you may use Cloudwatch dashboard to visualize them.
2. Non-AWS options :: There are several of them, but all of them require you to engineer your code with their respective libraries / packages to transmit the needed information. Some of them support ASYNC invocations, so it shouldn't keep your master lambdas in the waiting state for log tracing.
IOPipe
Epsagon
3. Mix of AWS & Non-AWS :: This is a more traditional approach to our problem. You log events to Cloudwatch Logs (like how Lambda does it out of the box), "ingest" these logs into popular log management and analysis SaaS tooling to make sense between these logs via "pattern-matching" and other proprietary techniques.
Splunk Cloud
Datadog
All the best! Keep me posted how it is going.
cheers,
ram
If you use CloudFormation you can visualize the resource relations with CloudFormation Designer. However, if you don't have the resources in a CloudFormation stack, you can create one from all the existing resources.

AWS: Is it possible to monitor an external service?

With CloudWatch you can monitor applications running on AWS. Is it also possible to monitor an external service?
For example, I have a REST API and I want to get notified once that API is not accessible anymore. Does AWS offer you a monitoring tool for that purpose?
Not Cloudwatch just by itself, but you can use a combination of Cloudwatch and Lambdas to do what you're asking. You can use cloudwatch events to run lambdas on a schedule, something like once every 5 mins.
CloudwatchEvents -> HealthCheck Lambda -> Cloudwatch Custom Metrics
Your lambda can then ping the API you're monitoring the health of, and either send its status to cloudwatch as a custom metric; or potentially if your lambda throws an error when the API fails, the lambda error metric which is already in cloudwatch becomes your API failure metric
Once the metric exists in cloudwatch, either as a custom metric or the lambda metric by proxy, you're able to do usual cloudwatch things like alarms and notifications.
Now there is a simple way to monitor external resources - CloudWatch Synthetics. Just create a canary to regularly monitor a website, API or even validate a multi-step UI flow.
Read more in the docs: CloudWatch > Using Synthetic Monitoring
Amazon CloudWatch supports custom metrics generated by your applications and services that you do not run on AWS. In this way, CloudWatch can be an integrated storage and aggregation point, allowing you to monitor all of the metrics that you collect, and track on a single platform.
There might be more than one way to reach your goal by using the AWS CLI, an API/SDK, or the CloudWatch collectd plugin etc. I'd recommend you take a look at these links for more details: link-1, link-2, link-3, link-4

Designing AWS logging solution

I am looking for a forensic logging solution for 120 applications on EC2. The solution must perform real-time, support replay it messages, must persist the logs.
Which services I should be using for this purpose as the services look very similar to me. Athena, Kinesis, SQS, Elastic Search, EMR?
There are lots of products which can help you with this, such as datadog, stackdriver, etc. If you want to stay within the AWS ecosystem, look at Cloudwatch Logs.

Cloudwatch data logs to create a custom dashboard

So I am working on creating my own dashboard for AWS instances, I am trying to determine is there any way to get AWS cloud watch metrics log so that I can get the data and plot it in a graph.
I have been working with AWS CLI but wasn't able to get a perfect way to resolve my query.
I just need the metrics like
CPU utilization vs time
Disk Utilization vs time
Network Out/In vs time
etc
AWS CloudWatch API can do this for you, the two actions you may need:
GetMetricStatistics: get time-series data for one or more statistics of a given MetricName.
CLI reference: http://docs.aws.amazon.com/AmazonCloudWatch/latest/cli/cli-mon-get-stats.html
API docs: http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricStatistics.html
ListMetrics: lists the names, namespaces, and dimensions of the metrics associated with your AWS account. You can filter metrics by using any combination of metric name, namespace, or dimensions.
CLI reference:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/cli/cli-mon-list-metrics.html
API docs: http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html
Apart from the CLI there are also a bunch of SDKs for different languages (Java, .NET, Ruby, JavaScript etc.) that you can use to call AWS APIs. You can find these in the official AWS github repo: https://github.com/aws

AWS Cloudwatch monitoring for S3

Amazon Cloudwatch provides some very useful metrics for monitoring my EC2s, load balancers, elasticache and RDS databases, etc and allows me to set alarms for a whole range of criteria; but is there any way to configure it to monitor my S3s as well? Or are there any other monitoring tools (besides simply enabling logging) that will help me monitor the numbers of POST/GET requests and data volumes for my S3 resources? And to provide alarms for thresholds of activity or increased datastorage?
AWS S3 is a managed storage service. The only metrics available in AWS CloudWatch for S3 are NumberOfObjects and BucketSizeBytes. In order to understand your S3 usage better you need to do some extra work.
I have recently written an AWS Lambda function to do exactly what you ask for and it's available here:
https://github.com/maginetv/s3logs-cloudwatch
It works by parsing S3 Server side log files and aggregates/exports metrics to AWS Cloudwatch (CloudWatch allows you to publish custom metrics).
Example graphs that you will get in AWS CloudWatch after deploying this function on your AWS account are:
RestGetObject_RequestCount
RestPutObject_RequestCount
RestHeadObject_RequestCount
BatchDeleteObject_RequestCount
RestPostMultiObjectDelete_RequestCount
RestGetObject_HTTP_2XX_RequestCount
RestGetObject_HTTP_4XX_RequestCount
RestGetObject_HTTP_5XX_RequestCount
+ many others
Since metrics are exported to CloudWatch, you can easily set up alarms for them as well.
CloudFormation template is included in GitHub repo and you can deploy this function very quickly to gain visibility into your S3 bucket usage.
EDIT 2016-12-10:
In November 2016 AWS has added extra S3 request metrics in CloudWatch that can be enabled when needed. This includes metrics like AllRequests, GetRequests, PutRequests, DeleteRequests, HeadRequests etc. See Monitoring Metrics with Amazon CloudWatch documentation for more details about this feature.
I was also unable to find any way to do this with CloudWatch. This question from April 2012 was answered by Derek#AWS as not having S3 support in CloudWatch. https://forums.aws.amazon.com/message.jspa?messageID=338089
The only thing I could think of would be to import the S3 access logs to a log service (like Splunk). Then create a custom cloud watch metric where you post the data that you parse from the logs. But then you have to filter out the polling of the access logs and…
And while you were at it, you could just create the alarms in Splunk instead of in S3.
If your use case is to simply alert when you are using it too much, you could set up an account billing alert for your S3 usage.
I think this might depend on where you are looking to track the access from. I.e. if you are trying to measure/watch usage of S3 objects from outside http/https requests then Anthony's suggestion if enabling S3 logging and then importing into splunk (or redshift) for analysis might work. You can also watch billing status on requests every day.
If trying to guage usage from within your own applications, there are some AWS SDK cloudwatch metrics:
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/metrics/package-summary.html
and
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/metrics/S3ServiceMetric.html
S3 is a managed service, meaning that you don't need to take action based on system events in order to keep it up and running (as long as you can afford to pay for the service's usage). The spirit of CloudWatch is to help with monitoring services that require you to take action in order to keep them running.
For example, EC2 instances (which you manage yourself) typically need monitoring to alert when they're overloaded or when they're underused or else when they crash; at some point action needs to be taken in order to spin up new instances to scale out, spin down unused instances to scale back in, or reboot instances that have crashed. CloudWatch is meant to help you do the job of managing these resources more effectively.
To enable Request and Data transfer metrics in your bucket you can run the below command. Be aware that these are paid metrics.
aws s3api put-bucket-metrics-configuration \
--bucket YOUR-BUCKET-NAME \
--metrics-configuration Id=EntireBucket
--id EntireBucket
This tutorial describes how to do it in AWS Console with point and click interface.