I need to create an AWS Managed Grafana dashboard that displays active New Relic alerts. I created a data source for New Relic, then created a new panel and selected the New Relic data source. I tried writing NRQL queries such as "SELECT * FROM Alerts" and "SELECT alertName, conditionName, startTime FROM MetricAlertActive", but they don't work.
What is the correct NRQL query for displaying active New Relic alerts?
New Relic incidents (which are triggered from alert conditions) live under a native eventType called "NrAiIncident". To fetch only active alerts (those that do not have a close event), this NRQL should work:
SELECT * FROM (SELECT uniqueCount(event) as 'total', latest(event) as 'state', latest(priority) as 'priority', latest(timestamp) as 'openTime' FROM NrAiIncident where event in ('open','close') facet title, policyName, conditionName limit max) where total=1 and state='open' limit max
I'm trying to export every item in a DynamoDB table to S3. I found this tutorial https://aws.amazon.com/blogs/big-data/how-to-export-an-amazon-dynamodb-table-to-amazon-s3-using-aws-step-functions-and-aws-glue/ and followed the example. Basically,
# Read the DynamoDB table into a DynamicFrame
table = glueContext.create_dynamic_frame.from_options(
    connection_type="dynamodb",
    connection_options={
        "dynamodb.input.tableName": table_name,
        "dynamodb.throughput.read.percent": read_percentage,
        "dynamodb.splits": splits
    }
)

# Write the DynamicFrame out to S3 in the requested format
glueContext.write_dynamic_frame.from_options(
    frame=table,
    connection_type="s3",
    connection_options={
        "path": output_path
    },
    format=output_format,
    transformation_ctx="datasink"
)
I tested it on a tiny table in a nonprod environment and it works fine. But my DynamoDB table in production is over 400 GB, with 200 million items. I suppose it'll take a while, but I have no idea how long to expect: hours, or even days? Is there any way to show progress, for example a count of how many items have been processed? I don't want to blindly start this job and wait.
One way would be to enable continuous logging for your AWS Glue Job to monitor its progress.
Another way would be to trigger a Lambda function whenever a file has been stored in S3, using Amazon S3 event notifications.
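For the S3 event notification approach, here is a minimal sketch of such a Lambda function (the bucket and prefix names are illustrative, not from your setup): it fires on each object created under the output prefix, re-counts what the Glue job has written so far, and logs a running total you can watch in CloudWatch Logs.
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/prefix matching the Glue job's output_path
BUCKET = "my-export-bucket"
PREFIX = "dynamodb-export/"

def lambda_handler(event, context):
    # The triggering event itself isn't needed; we simply re-count the output objects
    paginator = s3.get_paginator("list_objects_v2")
    objects = 0
    total_bytes = 0
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            objects += 1
            total_bytes += obj["Size"]
    print(f"Export progress: {objects} files, {total_bytes} bytes written so far")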
Did you try the custom waiter class from the AWS docs?
For instance, a custom waiter for a Glue job could look something like this:
# CustomWaiter and WaitState come from the custom waiter example in the AWS docs
class JobCompleteWaiter(CustomWaiter):
    def __init__(self, client):
        super().__init__(
            "JobComplete",
            "get_job_run",
            "JobRun.JobRunState",
            {"SUCCEEDED": WaitState.SUCCEEDED, "FAILED": WaitState.FAILED},
            client,
            max_tries=100,
        )

    def wait(self, JobName, RunId):
        self._wait(JobName=JobName, RunId=RunId)
According to the boto3 docs, a job run can be in one of the following states: STARTING, RUNNING, STOPPING, STOPPED, SUCCEEDED, FAILED, or TIMEOUT.
So I chose to check whether the state was SUCCEEDED or FAILED.
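A minimal usage sketch, assuming the waiter class above and a hypothetical job name:
import boto3

glue = boto3.client("glue")

# Start the job and block until it reaches SUCCEEDED or FAILED
run = glue.start_job_run(JobName="export-dynamodb-to-s3")
waiter = JobCompleteWaiter(glue)
waiter.wait(JobName="export-dynamodb-to-s3", RunId=run["JobRunId"])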
I'm referring to the Google Cloud ML Engine API, in particular this method:
https://developers.google.com/apis-explorer/#search/ml/m/ml/v1/ml.projects.jobs.list
What is the filter syntax one should use for the 'Job ID Contains' filter as per the UI?
I'm able to use the State filter as:
state:"RUNNING" OR state:"FAILED"
but none of the following work:
job id contains:"mytext"
'job id contains':"mytext"
'JobIDContains':"mytext"
Turns out it is super simple:
jobId:mytext
Here are some details on filters:
https://cloud.google.com/sdk/gcloud/reference/topic/filters where the above is an example of
key : simple-pattern
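If you are calling the API programmatically, a sketch along these lines should work (this assumes the google-api-python-client library and a placeholder project ID; the jobId filter string is the part confirmed above, the rest is my assumption about wiring it up):
from googleapiclient import discovery

ml = discovery.build("ml", "v1")

# jobId:mytext uses the same key:simple-pattern syntax as the state filter
request = ml.projects().jobs().list(
    parent="projects/my-project",
    filter='jobId:mytext AND (state:"RUNNING" OR state:"FAILED")',
)
response = request.execute()
for job in response.get("jobs", []):
    print(job["jobId"], job["state"])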
My organization is planning a maintenance window for the next 5 hours. During that time, I do not want CloudWatch to trigger alarms and send notifications.
Earlier, when I had to disable 4 alarms, I wrote the following code in an AWS Lambda function. This worked fine.
import boto3

client = boto3.client('cloudwatch')

def lambda_handler(event, context):
    response = client.disable_alarm_actions(
        AlarmNames=[
            'CRITICAL - StatusCheckFailed for Instance 456',
            'CRITICAL - StatusCheckFailed for Instance 345',
            'CRITICAL - StatusCheckFailed for Instance 234',
            'CRITICAL - StatusCheckFailed for Instance 123'
        ]
    )
But now I have been asked to disable all of the alarms, which number 361, so listing all of those names by hand would take a lot of time.
What should I do now?
Use describe_alarms() to obtain a list of them, then iterate through and disable them:
import boto3

client = boto3.client('cloudwatch')

response = client.describe_alarms()
names = [alarm['AlarmName'] for alarm in response['MetricAlarms']]

disable_response = client.disable_alarm_actions(AlarmNames=names)
You might want some logic around the Alarm Name to only disable particular alarms.
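One caveat, as a hedged note: with 361 alarms a single describe_alarms() call may not return everything (the API is paginated), and disable_alarm_actions accepts a limited number of names per call (100, if I remember the limit correctly). A sketch that handles both might look like this:
import boto3

client = boto3.client('cloudwatch')

# Collect every alarm name across all pages
names = []
paginator = client.get_paginator('describe_alarms')
for page in paginator.paginate():
    names.extend(alarm['AlarmName'] for alarm in page['MetricAlarms'])

# Disable the alarm actions in batches to stay under the per-call limit
for i in range(0, len(names), 100):
    client.disable_alarm_actions(AlarmNames=names[i:i + 100])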
If you do not have the specific alarm ARNs, you can use the logic in the previous answer. If you have a specific list of ARNs that you want to disable, you can fetch their names like this (see the usage example after the function):
client = boto3.client('cloudwatch')

def get_alarm_names(alarm_arns):
    # Return the names of the alarms whose ARNs appear in alarm_arns
    names = []
    response = client.describe_alarms()
    for alarm in response['MetricAlarms']:
        if alarm['AlarmArn'] in alarm_arns:
            names.append(alarm['AlarmName'])
    return names
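The result can then be passed straight to disable_alarm_actions, for example (the ARN below is illustrative):
maintenance_arns = [
    'arn:aws:cloudwatch:us-east-1:123456789012:alarm:CRITICAL - StatusCheckFailed for Instance 456',
]
client.disable_alarm_actions(AlarmNames=get_alarm_names(maintenance_arns))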
Here's a full tutorial: https://medium.com/geekculture/terraform-structure-for-enabling-disabling-alarms-in-batches-5c4f165a8db7
I've looked over the documentation for Google's Pub/Sub, and also tried looking in Google Cloud Monitoring, but couldn't find any way of figuring out the queue size of my topics.
Since I plan on using Pub/Sub for analytics, it's important for me to monitor the queue count so I can scale the subscriber count up or down.
What am I missing?
The metric you want to look at is "undelivered messages." You should be able to set up alerts or charts that monitor this metric in Google Cloud Monitoring under the "Pub/Sub Subscription" resource type. The number of messages that have not yet been acknowledged by subscribers, i.e., queue size, is a per-subscription metric as opposed to a per-topic metric. For info on the metric, see pubsub.googleapis.com/subscription/num_undelivered_messages in the GCP Metrics List (and others for all of the Pub/Sub metrics available).
This might help if you're looking into a programmatic way to achieve this:
from google.cloud import monitoring_v3
from google.cloud.monitoring_v3 import query

project = "my-project"

client = monitoring_v3.MetricServiceClient()
result = query.Query(
    client,
    project,
    'pubsub.googleapis.com/subscription/num_undelivered_messages',
    minutes=60).as_dataframe()

print(result['pubsub_subscription'][project]['subscription_name'][0])
The answer to your question is "no": there is no built-in Pub/Sub feature that shows these counts. The way you have to do it is via monitoring in Stackdriver (it took me some time to find that out too).
In practical terms, do the following, step by step:
Navigate from GCloud Admin Console to: Monitoring
This opens a new window with separate Stackdriver console
Navigate in Stackdriver: Dashboards > Create Dashboard
Click the Add Chart button top-right of dashboard screen
In the input box, type num_undelivered_messages and then SAVE
Updated version based on @steeve's answer (without the pandas dependency).
Please note that you have to specify end_time instead of relying on the default utcnow().
import datetime

from google.cloud import monitoring_v3
from google.cloud.monitoring_v3 import query

project = 'my-project'
sub_name = 'my-sub'

client = monitoring_v3.MetricServiceClient()
result = query.Query(
    client,
    project,
    'pubsub.googleapis.com/subscription/num_undelivered_messages',
    end_time=datetime.datetime.now(),
    minutes=1,
).select_resources(subscription_id=sub_name)

for content in result:
    print(content.points[0].value.int64_value)
Here is a Java version:
package com.example.monitoring;

import static com.google.cloud.monitoring.v3.MetricServiceClient.create;
import static com.google.monitoring.v3.ListTimeSeriesRequest.newBuilder;
import static com.google.monitoring.v3.ProjectName.of;
import static com.google.protobuf.util.Timestamps.fromMillis;
import static java.lang.System.currentTimeMillis;

import com.google.monitoring.v3.ListTimeSeriesRequest;
import com.google.monitoring.v3.TimeInterval;

public class ReadMessagesFromGcp {

    public static void main(String... args) throws Exception {
        String projectId = "put here";

        // Look at the last two minutes of data points
        var interval = TimeInterval.newBuilder()
                .setStartTime(fromMillis(currentTimeMillis() - (120 * 1000)))
                .setEndTime(fromMillis(currentTimeMillis()))
                .build();

        var request = newBuilder().setName(of(projectId).toString())
                .setFilter("metric.type=\"pubsub.googleapis.com/subscription/num_undelivered_messages\"")
                .setInterval(interval)
                .setView(ListTimeSeriesRequest.TimeSeriesView.FULL)
                .build();

        var response = create().listTimeSeries(request);

        for (var subscriptionData : response.iterateAll()) {
            var subscription = subscriptionData.getResource().getLabelsMap().get("subscription_id");
            var numberOfMessages = subscriptionData.getPointsList().get(0).getValue().getInt64Value();

            if (numberOfMessages > 0) {
                System.out.println(subscription + " has " + numberOfMessages + " messages");
            }
        }
    }
}
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-monitoring</artifactId>
    <version>3.3.2</version>
</dependency>
<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java-util</artifactId>
    <version>4.0.0-rc-2</version>
</dependency>
Output:
queue-1 has 36 messages
queue-2 has 4 messages
queue-3 has 3 messages
There is a way to count all messages published to a topic using custom metrics.
In my case I am publishing messages to a Pub/Sub topic via a Cloud Composer (Airflow) DAG that runs a Python script.
The Python script logs information about the DAG run.
logging.info(
    f"Total events in file {counter-1}, total successfully published {counter - error_counter -1}, total errors publishing {error_counter}. Events sent to topic: {TOPIC_PATH} from filename: {source_blob_name}.",
    {
        "metric": "<some_name>",
        "type": "completed_file",
        "topic": EVENT_TOPIC,
        "filename": source_blob_name,
        "total_events_in_file": counter - 1,
        "failed_published_messages": error_counter,
        "successful_published_messages": counter - error_counter - 1,
    },
)
I then have a Distribution custom metric which filters on resource_type, resource_label, jsonPayload.metric and jsonPayload.type. The metric also has its Field Name set to jsonPayload.successful_published_messages.
Custom metric filter:
resource.type=cloud_composer_environment AND resource.labels.environment_name={env_name} AND jsonPayload.metric=<some_name> AND jsonPayload.type=completed_file
That custom metric is then used in a Dashboard with the MQL setting of
fetch cloud_composer_environment
| metric
'logging.googleapis.com/user/my_custom_metric'
| group_by 1d, [value_pubsub_aggregate: aggregate(value.pubsub)]
| every 1d
| group_by [],
[value_pubsub_aggregate_sum: sum(value_pubsub_aggregate)]
To get there, I first set up a chart with resource type: Cloud Composer environment, metric: my custom metric, processing step: no preprocessing step, alignment function: SUM, period: 1 day, and the "How do you want it grouped" group-by function set to mean.
Ideally you would just select sum for the group-by function, but it errors, which is why you then need to switch to MQL and manually enter sum instead of mean.
This will count your published messages for up to 24 months, which is the retention period Google sets for custom metrics.
I am using the Django app djStripe to integrate Stripe into my Django app, allowing users to subscribe to plans and pay with Stripe.
I want to have a zero-dollar plan that still creates a Stripe customer account, so that in the future users can simply change their subscription from the zero-dollar plan to a paid plan, at which point they will be asked for their credit card info.
This is acceptable in Stripe, and according to Stripe a zero-dollar subscription does not ask for a credit card, though it does create a customer. However, djStripe does ask for a credit card with a zero-dollar plan.
The djStripe readthedocs hints at custom plans being the solution, but I need help to determine
a) whether that is indeed the way, and
b) if yes to a), how to implement it.
I've set up the plan in my app's settings.py as follows:
DJSTRIPE_PLANS = {
    "starter": {
        "stripe_plan_id": "starter",
        "name": "Starter",
        "description": "Starter subscription.",
        "statement_descriptor": "Starter co",
        "price": 0,  # $0
        "currency": "usd",
        "interval": "year",
        "trial_period_days": 0,
        "team_size": 2,
        "image_count": 1000
    }
}
I haven't customized any part of the standard djStripe subscription process.
First of all, add a trial period to the plan: without a trial period, Stripe tries to charge the customer immediately, and for that it requires credit card info. Then subscribe to the customer.subscription.trial_will_end webhook; Stripe sends it three days before the trial expires, and on that event you can extend the trial for the customer.
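A minimal sketch of that webhook handler, assuming djStripe's webhook handler decorator and the stripe Python library (the 30-day extension is an illustrative choice, not something prescribed by Stripe or djStripe):
import time

import stripe
from djstripe import webhooks

@webhooks.handler("customer.subscription.trial_will_end")
def keep_zero_dollar_plan_on_trial(event, **kwargs):
    # Pull the subscription id out of the Stripe event payload
    subscription_id = event.data["object"]["id"]

    # Push the trial end date out again so the $0 plan never asks for a card
    stripe.Subscription.modify(
        subscription_id,
        trial_end=int(time.time()) + 30 * 24 * 60 * 60,  # 30 more days (illustrative)
    )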