How to create a Slack notification channel in Google Cloud Platform with Terraform

I'm trying to create a Slack notification channel in GCP with Terraform. I am able to create a channel with the code below, but it's missing the "Team" and "Owner" attributes.
resource "google_monitoring_notification_channel" "default" {
display_name = "Test Slack Channel"
type = "slack"
enabled = "true"
labels = {
"channel_name" = "#testing"
"auth_token" = "<my_slack_app_token>"
}
}
The first channel in the screenshot below was created via the GUI and works fine. The second channel was created via Terraform and is unable to send notifications.
The Terraform Registry documentation does not mention these attributes. I have tried defining them in labels right after channel_name:
labels = {
  "channel_name" = "#testing"
  "team"         = "<my_team>"
  "owner"        = "google_cloud_monitoring"
  "auth_token"   = "<my_slack_app_token>"
}
I got the following error:
Error creating NotificationChannel: googleapi: Error 400: Field "notification_channel.labels['owner']" is not allowed; labels must conform to the channel type's descriptor; permissible label keys for "slack" are: {"auth_token", "channel_name"}
Apparently, only channel_name and auth_token are valid labels.
What am I missing?

Slack needs the sensitive_labels block for tokens. There is an example in the docs:
resource "google_monitoring_notification_channel" "default" {
display_name = "Test Slack Channel"
type = "slack"
labels = {
"channel_name" = "#foobar"
}
sensitive_labels {
auth_token = "...."
}
}
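Applied to the configuration in the question, a minimal sketch (keeping the placeholder token as-is) would look like:
resource "google_monitoring_notification_channel" "default" {
  display_name = "Test Slack Channel"
  type         = "slack"
  enabled      = true
  labels = {
    "channel_name" = "#testing"
  }
  # the token moves out of labels and into sensitive_labels
  sensitive_labels {
    auth_token = "<my_slack_app_token>"
  }
}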

Related

Add environment-based multiple notification channels to a GCP Alert Policy with Terraform lookup

I'm trying to add multiple notification channels to a GCP alert policy with Terraform.
My issue is that I need to add different notification channels based on the production environment where they are deployed.
As long as I keep a single notification channel per environment, I can easily deploy in the following way.
Here is my variables.tf file:
locals {
  notification_channel = {
    DEV = "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]"
    PRD = "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]"
  }
}
Here is my main.tf file:
resource "google_monitoring_alert_policy" "alert_policy" {
display_name = "My Alert Policy"
combiner = "OR"
conditions {
display_name = "test condition"
condition_threshold {
filter = "metric.type=\"compute.googleapis.com/instance/disk/write_bytes_count\" AND resource.type=\"gce_instance\""
duration = "60s"
comparison = "COMPARISON_GT"
aggregations {
alignment_period = "60s"
per_series_aligner = "ALIGN_RATE"
}
}
}
user_labels = {
foo = "bar"
}
notification_channels = [lookup(local.notification_channel, terraform.workspace)]
}
My issue here happens when I try to map multiple notification channels instead of one per environment.
Something like:
locals {
  notification_channel = {
    DEV = ["projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]", "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID]"]
    ...
  }
}
However, if I try it this way, Terraform tells me: Inappropriate value for attribute "notification_channels": element 0: string.
Here's the relevant documentation:
Terraform lookup function
GCP Alert Policy
Could you help?
If I understood your question correctly, you only need to remove the square brackets:
notification_channels = lookup(local.notification_channel, terraform.workspace)
Since each value of the local variable notification_channel is already a list, you only need lookup to fetch the list for the workspace you are currently in.
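Putting it together, a minimal sketch (the channel IDs are placeholders) looks like this:
locals {
  notification_channel = {
    DEV = [
      "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID_1]",
      "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID_2]",
    ]
    PRD = [
      "projects/[PROJECT_ID]/notificationChannels/[CHANNEL_ID_3]",
    ]
  }
}

resource "google_monitoring_alert_policy" "alert_policy" {
  display_name = "My Alert Policy"
  combiner     = "OR"
  # ... conditions as in the question ...
  # lookup returns the whole list for the active workspace, so no extra brackets
  notification_channels = lookup(local.notification_channel, terraform.workspace)
}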

GCP Alerting Policy to Alert on KMS Key Deletion Using Terraform

I am trying to alert on KMS key deletions using Terraform.
I have a log based metric, a policy and a notification channel to PagerDuty.
This all works; however, after the alert triggers it soon clears, and there seems to be nothing I can do to stop this.
Here is my code:
resource "google_logging_metric" "logging_metric" {
name = "kms-key-pending-deletion"
description = "Logging metric used to alert on scheduled deletions of KMS keys"
filter = "resource.type=cloudkms_cryptokeyversion AND protoPayload.methodName=DestroyCryptoKeyVersion"
metric_descriptor {
metric_kind = "DELTA"
value_type = "INT64"
unit = "1"
display_name = "kms-key-pending-deletion-metric-descriptor"
}
}
resource "google_monitoring_notification_channel" "pagerduty_alerts" {
display_name = "pagerduty-notification-channel"
type = "pagerduty"
sensitive_labels {
service_key = var.token
}
}
resource "google_monitoring_alert_policy" "kms_key_deletion_alert_policy" {
display_name = "kms-key-deletion-alert-policy"
combiner = "OR"
notification_channels = [google_monitoring_notification_channel.pagerduty_alerts.name]
conditions {
display_name = "kms-key-deletion-alert-policy-conditions"
condition_threshold {
comparison = "COMPARISON_GT"
duration = "300s"
filter = "metric.type=\"logging.googleapis.com/user/kms-key-pending-deletion\" AND resource.type=\"global\""
threshold_value = "0"
}
}
documentation {
content = "Runbook: https://blah"
}
}
In the GCP GUI I can disable the option "Notify on incident closure" in the policy and it stops the alert from clearing.
However, I cannot set this via Terraform.
I have tried setting alert_strategy.auto_close to null and "0s", but this did not work:
alert_strategy {
  auto_close = "0s"
  # auto_close = null
}
How do I keep the alert active and stop it from clearing when building the policy in terraform?
Am I using the correct resource type? Should I instead somehow be using cloudkms.cryptoKey.state values that are in "DESTROY_SCHEDULED"?
For others wanting to find the answer to this:
The ability to keep an alert open and prevent it from automatically closing is missing from the API.
The issue is tracked here: https://issuetracker.google.com/issues/151052441?pli=1
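Until that issue is resolved, one partial mitigation (an assumption on my part, not a real fix) is to stretch auto_close as far as your workflow tolerates, which delays the auto-closure rather than preventing it:
alert_strategy {
  # assumption: seven days is an acceptable upper bound here;
  # this only delays auto-closure, it cannot disable it
  auto_close = "604800s"
}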

Cloud Asset Organization feed for deleted/created resource

I am creating an asset feed for deleted and created resources. The code below, from the linked docs, shows the condition only for when resources are created, but I also want a feed for when resources are deleted. Reference link - https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloud_asset_organization_feed
I want to receive notifications ONLY for create and delete, not for UPDATE.
resource "google_cloud_asset_organization_feed" "organization_feed" {
billing_project = "my-project-name"
org_id = "123456789"
feed_id = "network-updates"
content_type = "RESOURCE"
asset_types = [
"compute.googleapis.com/Subnetwork",
"compute.googleapis.com/Network",
]
feed_output_config {
pubsub_destination {
topic = google_pubsub_topic.feed_output.id
}
}
condition {
expression = <<-EOT
!temporal_asset.deleted &&
temporal_asset.prior_asset_state == google.cloud.asset.v1.TemporalAsset.PriorAssetState.DOES_NOT_EXIST
EOT
title = "created"
description = "Send notifications on creation events"
}
}
To create a feed for deleted assets, change the condition to:
condition {
  expression  = "temporal_asset.deleted"
  title       = "deleted"
  description = "Send notifications on deletion events"
}
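Since the question asks for notifications on create and delete only (no updates), a single feed could combine both conditions with ||; this is my assumption based on the two expressions above, not something from the linked docs:
condition {
  expression = <<-EOT
    temporal_asset.deleted ||
    temporal_asset.prior_asset_state == google.cloud.asset.v1.TemporalAsset.PriorAssetState.DOES_NOT_EXIST
  EOT
  title       = "created-or-deleted"
  description = "Send notifications on creation and deletion events only"
}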
Monitoring asset changes with conditions
TemporalAsset

How to create an alert policy for unknown custom metric in GCP

Given the following alert policy in GCP (created with Terraform):
resource "google_monitoring_alert_policy" "latency_alert_policy" {
display_name = "Latency of 95th percentile more than 1 second"
combiner = "OR"
conditions {
display_name = "Latency of 95th percentile more than 1 second"
condition_threshold {
filter = "metric.type=\"custom.googleapis.com/http/server/requests/p95\" resource.type=\"k8s_pod\""
threshold_value = 1000
duration = "60s"
comparison = "COMPARISON_GT"
aggregations {
alignment_period = "60s"
per_series_aligner= "ALIGN_NEXT_OLDER"
cross_series_reducer= "REDUCE_MAX"
group_by_fields = [
"metric.label.\"uri\"",
"metric.label.\"method\"",
"metric.label.\"status\"",
"metadata.user_labels.\"app.kubernetes.io/name\"",
"metadata.user_labels.\"app.kubernetes.io/component\""
]
}
trigger {
count = 1
percent = 0
}
}
}
}
I get the following error (this is part of a Terraform project that also creates the cluster):
Error creating AlertPolicy: googleapi: Error 404: The metric referenced by the provided filter is unknown. Check the metric name and labels.
Now, this is a custom metric (published by a Spring Boot app with Micrometer), so the metric does not exist at the time the infrastructure is created. Does GCP have to know about a metric before an alert can be created for it? That would mean a Spring Boot app has to be deployed on the cluster and sending metrics before this policy can be created.
Am I missing something... (like this should not be done in Terraform as part of the infrastructure)?
Interesting question. The reason for the 404 error is that the metric descriptor was not found; there seems to be a prerequisite that the descriptor exist first. I would create the metric descriptor first (you can use this as a reference), then move on to creating the alerting policy.
This way you may be able to avoid the ordering problem. Please comment if it makes sense, and if you make it work like this, share it.
For reference (according to the Terraform docs, this can then be referenced from the alert policy):
resource "google_monitoring_metric_descriptor" "p95_latency" {
description = ""
display_name = ""
type = "custom.googleapis.com/http/server/requests/p95"
metric_kind = "GAUGE"
value_type = "DOUBLE"
labels {
key = "status"
}
labels {
key = "uri"
}
labels {
key = "exception"
}
labels {
key = "method"
}
labels {
key = "outcome"
}
}
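If both resources live in the same Terraform project, making the ordering explicit should avoid the 404. A minimal sketch (the depends_on wiring is my suggestion, not from the original answer):
resource "google_monitoring_alert_policy" "latency_alert_policy" {
  # ... display_name, combiner, and conditions exactly as in the question ...

  # create the metric descriptor before the alert policy that filters on it
  depends_on = [google_monitoring_metric_descriptor.p95_latency]
}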

Having trouble with Terraform and AWS Storage Gateway disks

I am using Terraform with AWS and have been able to create an AWS Storage Gateway file gateway using the aws_storagegateway_gateway resource.
The gateway is created and its status is "online", but there is no cache disk added yet in the console, which is normal as that has to be done after the gateway is created. The VM does have a disk, it is available to add in the console, and adding it there works perfectly.
However, I am trying to add the disk with Terraform once the gateway is created and cannot seem to get the code to work, or quite possibly don't understand how to do it.
I am trying to use the aws_storagegateway_cache resource, but I get an error on disk_id and do not know how to get it returned from the gateway creation.
Might someone have a working example of how to add the cache disk with Terraform once the gateway is created, or know how to get the disk_id so I can add it?
Here is my code:
provider "aws" {
access_key = "${var.access-key}"
secret_key = "${var.secret-key}"
token = "${var.token}"
region = "${var.region}"
}
resource "aws_storagegateway_gateway" "hmsgw" {
gateway_ip_address = "${var.gateway-ip-address}"
gateway_name = "${var.gateway-name}"
gateway_timezone = "${var.gateway-timezone}"
gateway_type = "${var.gateway-type}"
smb_active_directory_settings {
domain_name = "${var.domain-name}"
username = "${var.username}"
password = "${var.password}"
}
}
resource "aws_storagegateway_cache" "sgwdisk" {
disk_id = "SCSI"
gateway_arn = "${aws_storagegateway_gateway.hmsgw.arn}"
}
output "gatewayid" {
value = "${aws_storagegateway_gateway.hmsgw.arn}"
}
The error I get is:
aws_storagegateway_cache.sgwdisk: error adding Storage Gateway cache: InvalidGatewayRequestException: The specified disk does not exist.
status code: 400, request id: fda602fd-a47e-11e8-a1f4-b383e2e2e2f6
I have attempted to hard-code the disk_id as above and to use a variable. With the variable I don't know if the value is returned or even exists, so that could be the issue; I'm new to this.
Before creating the aws_storagegateway_cache resource, use a data source to get the disk ID. I am using the script below and it works fine.
variable "upload_disk_path" {
default = "/dev/sdb"
}
data "aws_storagegateway_local_disk" "upload_disk" {
disk_path = "${var.upload_disk_path}"
gateway_arn = "${aws_storagegateway_gateway.this.arn}"
}
resource "aws_storagegateway_upload_buffer" "stg_upload_buffer" {
disk_id = "${data.aws_storagegateway_local_disk.upload_disk.disk_id}"
gateway_arn = "${aws_storagegateway_gateway.this.arn}"
}
If you are using two disks (one for the upload buffer and one for the cache), use the same code but set the default value of cache_disk_path to "/dev/sdc", as sketched below.
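A minimal sketch of that cache-disk variant (the variable and resource names here are my own placeholders):
variable "cache_disk_path" {
  default = "/dev/sdc"
}

data "aws_storagegateway_local_disk" "cache_disk" {
  disk_path   = "${var.cache_disk_path}"
  gateway_arn = "${aws_storagegateway_gateway.this.arn}"
}

resource "aws_storagegateway_cache" "stg_cache" {
  disk_id     = "${data.aws_storagegateway_local_disk.cache_disk.disk_id}"
  gateway_arn = "${aws_storagegateway_gateway.this.arn}"
}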
If you use the AWS CLI to run aws storagegateway list-local-disks --gateway-arn [your gateway's arn] --region [gateway's region], you'll get data returned that includes the disk ID.
Then, in your example code, replace "SCSI" with "${gateway_arn}:[diskID from command above]" and your cache volume will be created.
One thing I've noticed, though: when I've done this and then tried to apply the same Terraform code again, in some cases even with a targeted deploy of a specific resource, it wants to re-deploy the cache volume, because Terraform detects the disk ID changing to a value of "1". Passing "1" in as the value, however, does not seem to work; a lifecycle workaround is sketched below.
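If that perpetual diff bites, one workaround (my assumption, not verified against every provider version; var.cache_disk_id is a placeholder for whatever disk ID you obtained) is to tell Terraform to ignore the server-side rewrite of disk_id:
resource "aws_storagegateway_cache" "sgwdisk" {
  # disk_id from the CLI output or a data source, as described above
  disk_id     = "${var.cache_disk_id}"
  gateway_arn = "${aws_storagegateway_gateway.hmsgw.arn}"

  lifecycle {
    # the service rewrites disk_id after creation; ignoring it keeps
    # repeat applies from trying to re-create the cache volume
    ignore_changes = ["disk_id"]
  }
}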
This would also work:
variable "disk_path" {
default = "/dev/sdb"
}
provider "aws" {
alias = "primary"
access_key = var.access-key
secret_key = var.secret-key
token = var.token
region = var.region
}
resource "aws_storagegateway_gateway" "hmsgw" {
gateway_ip_address = var.gateway-ip-address
gateway_name = var.gateway-name
gateway_timezone = var.gateway-timezone
gateway_type = var.gateway-type
smb_active_directory_settings {
domain_name = var.domain-name
username = var.username
password = var.password
}
}
data "aws_storagegateway_local_disk" "sgw_disk" {
disk_path = var.disk_path
gateway_arn = aws_storagegateway_gateway.hmsgw.arn
provider = aws.primary
}
resource "aws_storagegateway_cache" "sgw_cache" {
disk_id = data.aws_storagegateway_local_disk.sgw_disk.id
gateway_arn = aws_storagegateway_gateway.hmsgw.arn
provider = aws.primary
}