Prometheus Alertmanager not printing the {{ $labels.instance }} value - amazon-web-services

We have multiple AWS accounts, network access is configured between two of them, and service discovery is working with node-exporter. My Prometheus configuration has several rules for the Docker containers, and I have now added a new rule, similar to the existing ones, to check whether the same container has been launched in another AWS account by mistake; the rule is below. For the existing rules, {{ $labels.instance }} is printed in the alert emails, but not for the new rule I have just written.
Scrape config for labels:
- job_name: 'aws-conatiners'
  scheme: http
  ec2_sd_configs:
    - region: {{region}}
      port: 8181
  relabel_configs:
    - source_labels: [__meta_ec2_tag_Name]
      target_label: instance
The new rule which I have created to check if more than one container is running:
# Alert to check if more than one instance is running for backendapi service
- alert: multiple_instances_are_running
  expr: sum(container_last_seen{name=~"backendapi"}) > 1
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "More than one Instance (instance {{ $labels.instance }}) is running"
    description: "More than one Instance (instance {{ $labels.instance }}) is running for 5 minutes."
Can someone please help me get the instance name printed in the alert emails?
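For what it's worth, sum() without a by clause aggregates away every label, including instance, which is why the template has nothing to print for this rule. One possible sketch (an option, not necessarily the exact rule you want) is to fire the alert per running container, but only while the total count exceeds one, so that each firing series keeps its own instance label:
# Sketch only: alert on each backendapi container, but only while more than one
# is running anywhere; every firing series keeps its instance label, so
# {{ $labels.instance }} is populated in the annotations.
- alert: multiple_instances_are_running
  expr: |
    container_last_seen{name=~"backendapi"}
      and on() (count(container_last_seen{name=~"backendapi"}) > 1)
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "More than one Instance (instance {{ $labels.instance }}) is running"
    description: "More than one Instance (instance {{ $labels.instance }}) is running for 5 minutes."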

Related

Cloud SQL creation with Deployment Manager - "Precondition check failed." error

I'm using the gcp-types/sqladmin-v1beta4:instances resource type to create a Cloud SQL instance with Deployment Manager, and I'm getting the error below:
{
  "ResourceType": "gcp-types/sqladmin-v1beta4:instances",
  "ResourceErrorCode": "400",
  "ResourceErrorMessage": {
    "code": 400,
    "message": "Precondition check failed.",
    "status": "FAILED_PRECONDITION",
    "statusMessage": "Bad Request",
    "requestPath": "https://www.googleapis.com/sql/v1beta4/projects/[PROJECT_NAME]/instances",
    "httpMethod": "POST"
  }
}
Here's the configuration inside the JINJA file:
{% set deployment_name = env['deployment'] %}
{% set INSTANCE_NAME = deployment_name + '-instance' %}
resources:
- name: {{ INSTANCE_NAME }}
  type: gcp-types/sqladmin-v1beta4:instances
  properties:
    region: us-central1
    rootPassword: root
    settings:
      tier: db-n1-standard-1
      backupConfiguration:
        binaryLogEnabled: true
        enabled: true
- name: demand_ml_db
  type: gcp-types/sqladmin-v1beta4:databases
  properties:
    name: demand_ml_db
    instance: $(ref.{{ INSTANCE_NAME }}.name)
    charset: utf8
The FAILED_PRECONDITION error, while not very descriptive, tends to be thrown when you attempt to deploy over a Cloud SQL instance that was recently deleted; the instance you selected for deletion is not cleaned up instantly. There's an Issue Tracker thread regarding this here.
I was able to verify this on my end as well. The deployment using the JINJA file you've specified worked fine at first, but when I deleted it and re-deployed, I received the same error.
The simplest approach is to use a different deployment (or instance) name.
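If you do need to redeploy under the same deployment name, one workaround along those lines is to make the instance name unique per attempt. A minimal sketch, assuming a hypothetical nameSuffix template property that the caller sets to a fresh value on each redeploy:
{% set deployment_name = env['deployment'] %}
{# nameSuffix is a hypothetical property supplied by the caller; a fresh value
   per redeploy keeps the name from colliding with a recently deleted instance #}
{% set INSTANCE_NAME = deployment_name + '-instance-' + properties['nameSuffix'] %}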

NFS connection timing out on EKS

I have an NFS Helm chart. It is one of the charts for an application that has 5 more sub-charts. Two of the charts share storage, for which I am using NFS. On GCP, when I provide the NFS service name in the PV, it works.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: {{ include "nfs.name" . }}
spec:
  capacity:
    storage: {{ .Values.persistence.nfsVolumes.size }}
  accessModes:
    - {{ .Values.persistence.nfsVolumes.accessModes }}
  mountOptions:
    - nfsvers=4.1
  nfs:
    server: nfs.default.svc.cluster.local # nfs is the svc from {{ include "nfs.name" . }}
    path: "/opt/shared-shibboleth-idp"
But the same doesn't work on AWS EKS: the error there is a connection timeout, so the volume can't be mounted.
When I change the server to
server: a4eab2d4aef2311e9a2880227e884517-1524131093.us-west-2.elb.amazonaws.com
I still get a connection timeout.
All the mounts are fine, since this works well on GCP.
What am I doing wrong?

How to add load balancer to aws ecs service with ansible

I want to add a load balancer to an ECS service with the Ansible ecs_service module, so I am using the following code:
- name: create ECS service on VPC network
  ecs_service:
    state: present
    name: console-test-service
    cluster: new_cluster
    desired_count: 0
    network_configuration:
      subnets:
        - subnet-abcd1234
      security_groups:
        - sg-aaaa1111
        - my_security_group
Now I want to add a load balancer with the load_balancers parameter. However, it requires a list of load balancers. How can I pass the list of load balancer names that I want to use?
For example:
load_balancers:
  - name_of_my_load_balancer
returns the following error:
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter loadBalancers[0], value: name_of_my_load_balancer, type: , valid types: 
It needs a dictionary that includes the target group ARN, the container name, and the container port.
- name: create ECS service on VPC network
  ecs_service:
    state: present
    name: console-test-service
    cluster: new_cluster
    desired_count: 0
    load_balancers:
      - targetGroupArn: arn:aws:elasticloadbalancing:eu-west-1:453157221:targetgroup/tg/16331647320e8a42
        containerName: laravel
        containerPort: 80
    network_configuration:
      subnets:
        - subnet-abcd1234
      security_groups:
        - sg-aaaa1111
        - my_security_group
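If you only have the target group's name rather than its ARN, one way to look it up is with the AWS CLI from a command task. A rough sketch (the name tg is simply taken from the ARN above):
- name: Look up the target group ARN by name (sketch)
  command: >
    aws elbv2 describe-target-groups
    --names tg
    --query TargetGroups[0].TargetGroupArn
    --output text
  register: tg_arn
# The ARN is then available as {{ tg_arn.stdout }} for targetGroupArn.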

Ansible: Add running EC2 instances to Auto-scaling group

I am working on an Ansible project in which I would like to add an existing EC2 instance, found by its Name tag, to my Auto Scaling group. I was able to do it with an AMI or by terminating the old instances, but I am simply looking for a way to attach the instances to the Auto Scaling group like in the web management console, where I just right-click the instance, select settings, and attach it to the Auto Scaling group. The code below is all in one file.
Find EC2 instances:
- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - ec2_remote_facts:
        region: eu-central-1
        filters:
          "tag:Name": Ubuntu_From_AMI
      register: ec2found
    - name: Add found instances to group
      add_host: hostname="{{ item.public_ip_address }}" groups=ec2instances
      with_items: "{{ ec2found.instances }}"
Here is how I am adding the auto-scaling group:
- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - name: Add auto-scaling groups.
      ec2_asg:
        name: magento_scaling_group
        load_balancers: 'LB_NAME'
        availability_zones: [ 'eu-central-1a', 'eu-central-1b', 'eu-central-1c' ]
        launch_config_name: "{{ lc.name }}"
        min_size: 0
        max_size: 5
        desired_capacity: 0
        vpc_zone_identifier: [ 'subnet-e712ad8c', 'subnet-e12e8dac', 'subnet-28e91a55' ]
        tags:
          - environment: production
            propagate_at_launch: no
Is it possible? Thank you.
Based on the current list of modules, it appears there is no such functionality. You'll need to create a new module, or just cheat and use the AWS CLI in a normal command: task. If you go the route of creating a new module, please do consider submitting it as a PR to the Ansible project so others will benefit from your work.
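For example, a rough sketch of the "cheat" approach, reusing the instances registered by ec2_remote_facts above and the group name from the ec2_asg task (assuming each item exposes the instance ID as item.id, the same way it exposes public_ip_address):
- name: Attach the found instances to the auto-scaling group via the AWS CLI
  command: >
    aws autoscaling attach-instances
    --region eu-central-1
    --auto-scaling-group-name magento_scaling_group
    --instance-ids {{ item.id }}
  with_items: "{{ ec2found.instances }}"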

How to spin up multiple AWS instances and assign a given range of IP addresses using Ansible?

The objective is to spin up multiple instances, which can be achieved using count, but I have been given a specific range of private IP addresses and I want to assign them to the instances.
Below is my current playbook:
---
- name: Provision an EC2 Instance
  hosts: local
  connection: local
  gather_facts: False
  tags: provisioning
  # Necessary Variables for creating/provisioning the EC2 Instance
  vars:
    instance_type: t2.micro
    security_group: default  # Change the security group name here
    image: ami-a9d276c9      # Change the AMI, from which you want to launch the server
    region: us-west-2        # Change the Region
    keypair: ansible         # Change the keypair name
    ip_addresses:
      - 172.31.1.117/32
      - 172.31.1.118/32
    count: 2
  tasks:
    - name: Launch the new EC2 Instance
      local_action: ec2
        group={{ security_group }}
        instance_type={{ instance_type }}
        image={{ image }}
        wait=true
        region={{ region }}
        keypair={{ keypair }}
        count={{ count }}
        vpc_subnet_id=subnet-xxxxxxx
        # private_ip={{private_ip}}
      with_items: ip_addresses
      register: ec2
    - name: Wait for SSH to come up
      local_action: wait_for
        host={{ item.public_ip }}
        port=22
        state=started
      with_items: ec2.instances
    - name: Add tag to Instance(s)
      local_action: ec2_tag resource={{ item.id }} region={{ region }} state=present
      with_items: ec2.instances
      args:
        tags:
          Name: ansible
    - name: Update system
      apt: update_cache=yes
    - name: Install Git
      apt:
        name: git
        state: present
    - name: Install Python2.7
      apt:
        name: python=2.7
        state: present
    - name: Install Java
      apt:
        name: openjdk-8-jdk
        state: present
Although this brings up the instances, it does not assign the intended IP addresses, and I'm getting the following warning:
PLAY [Provision an EC2 Instance] ***********************************************
TASK [Launch the new EC2 Instance] *********************************************
changed: [localhost -> localhost] => (item=172.31.1.117/32)
changed: [localhost -> localhost] => (item=172.31.1.118/32)
[DEPRECATION WARNING]: Skipping task due to undefined attribute, in the future this will be a fatal error.. This feature will be removed in a future release. Deprecation warnings can be
disabled by setting deprecation_warnings=False in ansible.cfg.
Please suggest the best possible way to achieve this.
You are giving count=2, so 2 instances will be launched.
Your IP addresses are wrong: you are giving CIDR blocks instead of IP addresses.
You are not using the IP addresses anywhere in your code when launching the instances.
How to fix?
ip_addresses:
  - 172.31.1.117
  - 172.31.1.118
Don't specify count in the ec2 module.
Loop through the list of IP addresses (there are 2 of them).
Make sure you use the IP by referencing {{ item }}.
Like this:
private_ip={{ item }}
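Putting those points together, a minimal sketch of the corrected launch task (keeping the placeholder subnet ID from the question; note that registering inside a loop puts the results under ec2.results rather than ec2.instances):
- name: Launch the new EC2 Instances with fixed private IPs
  local_action: ec2
    group={{ security_group }}
    instance_type={{ instance_type }}
    image={{ image }}
    wait=true
    region={{ region }}
    keypair={{ keypair }}
    vpc_subnet_id=subnet-xxxxxxx
    private_ip={{ item }}
  with_items: "{{ ip_addresses }}"
  register: ec2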