How to get the list of instances for AWS EMR?

Why is the list for EC2 different from the EMR list?
Why aren't all EC2 instance types available for EMR? How can I get this EMR-specific list?

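As a programmatic solution, you are looking for something like this (using Python and boto3):
    import boto3
    client = boto3.client('emr')
    # list_instances() needs the ID of an existing cluster and returns a dict
    # whose 'Instances' key holds the list; cluster_id is a placeholder here
    response = client.list_instances(ClusterId=cluster_id)
    for instance in response['Instances']:
        print("Instance[%s] %s" % (instance['Id'], instance['InstanceType']))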

This is what I use, although I'm not 100% sure it's accurate, because I couldn't find documentation to support some of my choices (the -BoxUsage filter, etc.).
It's worth looking through the responses from AWS to figure out what values the different fields in the pricing client responses can take.
Use the following to get the list of responses:
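    import boto3

    default_profile = boto3.session.Session(profile_name='default')
    # The pricing API is only served from a few regions; us-east-1 is one of them
    pricing_client = default_profile.client('pricing', region_name='us-east-1')
    service_name = 'ElasticMapReduce'
    # aws_region_name must be set beforehand; the 'location' field is matched
    # against the region's descriptive name, e.g. 'US East (N. Virginia)'
    product_filters = [
        {'Type': 'TERM_MATCH', 'Field': 'location', 'Value': aws_region_name}
    ]
    num_prices = 100
    response = pricing_client.get_products(
        ServiceCode=service_name, Filters=product_filters, MaxResults=num_prices)
    response_list = [response]
    while 'NextToken' in response:
        # re-query to get next page
        response = pricing_client.get_products(
            ServiceCode=service_name, Filters=product_filters,
            MaxResults=num_prices, NextToken=response['NextToken'])
        response_list.append(response)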
Once you've gotten the list of responses, you can then filter out the actual instance info:
    import json
    from decimal import Decimal

    emr_prices = {}
    for response in response_list:
        for price_info_str in response['PriceList']:
            price_obj = json.loads(price_info_str)
            attributes = price_obj['product']['attributes']
            # Skip pricing info that doesn't specify an (EC2) instance type
            if 'instanceType' not in attributes:
                continue
            inst_type = attributes['instanceType']
            # AFAIK, only usagetype attributes that contain the string '-BoxUsage' carry the prices we would use (empirical research)
            # Other examples of values are <REGION-CODE>-M3BoxUsage, <REGION-CODE>-M5BoxUsage, <REGION-CODE>-M7BoxUsage (no clue what those mean)
            if '-BoxUsage' not in attributes['usagetype']:
                continue
            if 'OnDemand' not in price_obj['terms']:
                continue
            on_demand_info = price_obj['terms']['OnDemand']
            # Take the first (and normally only) term and price dimension
            price_dim = list(list(on_demand_info.values())[0]['priceDimensions'].values())[0]
            emr_price = Decimal(price_dim['pricePerUnit']['USD'])
            emr_prices[inst_type] = emr_price
Realistically, it's straightforward enough to figure this out from the boto3 docs, in particular the get_products documentation.
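To make the filtering logic above easier to try out, here is a minimal sketch that runs it against a hand-written sample instead of a live pricing response. The sample entries (SKU-like keys, usage type, and price) are made up for illustration and only mimic the general shape of a `PriceList` item; the function name `extract_emr_prices` is likewise hypothetical.

```python
import json
from decimal import Decimal


def extract_emr_prices(response_list):
    """Map instance type -> on-demand price (USD) from get_products pages."""
    prices = {}
    for response in response_list:
        for price_info_str in response["PriceList"]:
            price_obj = json.loads(price_info_str)
            attributes = price_obj["product"]["attributes"]
            # Skip entries that are not tied to an instance type
            if "instanceType" not in attributes:
                continue
            # Same empirical '-BoxUsage' rule as in the answer above
            if "-BoxUsage" not in attributes.get("usagetype", ""):
                continue
            on_demand = price_obj.get("terms", {}).get("OnDemand")
            if not on_demand:
                continue
            first_term = next(iter(on_demand.values()))
            first_dim = next(iter(first_term["priceDimensions"].values()))
            prices[attributes["instanceType"]] = Decimal(first_dim["pricePerUnit"]["USD"])
    return prices


# Hypothetical sample data, NOT real AWS prices or SKUs
sample_entry = {
    "product": {"attributes": {"instanceType": "m5.xlarge",
                               "usagetype": "USE1-BoxUsage:m5.xlarge"}},
    "terms": {"OnDemand": {"EXAMPLESKU.TERM": {"priceDimensions": {
        "EXAMPLESKU.TERM.DIM": {"pricePerUnit": {"USD": "0.0480000000"}}}}}},
}
no_instance_entry = {"product": {"attributes": {"usagetype": "USE1-Other"}}, "terms": {}}
sample_responses = [{"PriceList": [json.dumps(sample_entry), json.dumps(no_instance_entry)]}]

emr_prices = extract_emr_prices(sample_responses)
print(sorted(emr_prices))  # the keys are the EMR-capable instance types found
```

The keys of the resulting dict are what the question asks for: the instance types that the pricing data lists for EMR in the chosen location.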


How to use Filters with boto3 VPC endpoint services?

I need to get a VPC endpoint service ID from a Python script, but I don't understand how to use boto3 Filters to narrow the results by VPC ID or subnet.
How do I use Filters?
This is the relevant part of the boto3 documentation:
> (dict) --
> A filter name and value pair that is used to return a more specific list of results from a describe operation. Filters can be used to match a set of resources by specific criteria, such as tags, attributes, or IDs. The filters supported by a describe operation are documented with the describe operation.
> Name (string) --
> The name of the filter. Filter names are case-sensitive.
> Values (list) --
> The filter values. Filter values are case-sensitive.
> (string) --
The easiest method would be to call it with no filters, and observe what comes back:
    import boto3
    ec2_client = boto3.client('ec2', region_name='ap-southeast-2')
    response = ec2_client.describe_vpc_endpoint_services()
    for service in response['ServiceDetails']:
        # Each entry is a dict; print the ID and name of every service
        print(service['ServiceId'], service['ServiceName'])
You can then either filter the results within your Python code, or use the Filters capability of the Describe command.
Feel free to print(response) to see the data that comes back.
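Filtering within your Python code could look roughly like the sketch below. It works on a plain dict shaped like the `describe_vpc_endpoint_services()` response, so it runs without AWS access; the service IDs in the sample and the helper name `find_services` are made up for illustration.

```python
def find_services(response, name_fragment):
    """Return (ServiceId, ServiceName) pairs whose name contains name_fragment."""
    return [
        (svc.get("ServiceId"), svc["ServiceName"])
        for svc in response.get("ServiceDetails", [])
        if name_fragment in svc["ServiceName"]
    ]


# Hypothetical stand-in for ec2_client.describe_vpc_endpoint_services()
sample_response = {
    "ServiceDetails": [
        {"ServiceId": "vpce-svc-aaaa1111", "ServiceName": "com.amazonaws.ap-southeast-2.s3"},
        {"ServiceId": "vpce-svc-bbbb2222", "ServiceName": "com.amazonaws.ap-southeast-2.ec2"},
    ]
}

matches = find_services(sample_response, ".s3")
print(matches)
```

With a real client you would pass the actual response in place of `sample_response`.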
It depends on what you want to filter the results by. In my case, I use the code below to filter for a specific vpc-endpoint-id.
    import boto3
    vpc_client = boto3.client('ec2')
    vpcEndpointId = "vpce-###"
    vpcEndpointDetails = vpc_client.describe_vpc_endpoints(
        Filters=[
            {
                'Name': 'vpc-endpoint-id',
                'Values': [vpcEndpointId]
            }
        ]
    )
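Since every filter is just a `Name`/`Values` pair, a small helper can build the list from a dict. This is only a sketch: `build_filters` is a hypothetical helper, the IDs are placeholders, and the filter names used (`vpc-id`, `service-name`) are the ones I believe `describe_vpc_endpoints` accepts, so check them against the documentation of the describe call you use.

```python
def build_filters(criteria):
    """Turn {'filter-name': value or [values]} into boto3's Filters list."""
    filters = []
    for name, values in criteria.items():
        # Accept a single string as shorthand for a one-element list
        if isinstance(values, str):
            values = [values]
        filters.append({"Name": name, "Values": list(values)})
    return filters


filters = build_filters({
    "vpc-id": "vpc-0123456789abcdef0",                    # placeholder VPC ID
    "service-name": ["com.amazonaws.ap-southeast-2.s3"],  # example service name
})
print(filters)
# Usage against AWS (not run here):
#   boto3.client('ec2').describe_vpc_endpoints(Filters=filters)
```

Multiple filters are combined, so only endpoints matching all of them are returned, while several values inside one filter act as alternatives.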

Trying to disable all the CloudWatch alarms in one shot

My organization is planning a maintenance window for the next 5 hours. During that time, I do not want CloudWatch to trigger alarms and send notifications.
Earlier, when I had to disable 4 alarms, I wrote the following code in AWS Lambda. This worked fine.
    import boto3
    client = boto3.client('cloudwatch')
    def lambda_handler(event, context):
        response = client.disable_alarm_actions(
            AlarmNames=[
                'CRITICAL - StatusCheckFailed for Instance 456',
                'CRITICAL - StatusCheckFailed for Instance 345',
                'CRITICAL - StatusCheckFailed for Instance 234',
                'CRITICAL - StatusCheckFailed for Instance 123'
            ]
        )
But now I have been asked to disable all the alarms, of which there are 361. Listing all those names by hand would take a lot of time.
What should I do now?
Use describe_alarms() to obtain a list of them, then pass their names to disable_alarm_actions():
    import boto3
    client = boto3.client('cloudwatch')
    response = client.describe_alarms()
    # AlarmNames expects a flat list of strings
    names = [alarm['AlarmName'] for alarm in response['MetricAlarms']]
    disable_response = client.disable_alarm_actions(AlarmNames=names)
You might want some logic around the Alarm Name to only disable particular alarms.
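With 361 alarms, a single describe_alarms() call will probably not return everything, and as far as I know disable_alarm_actions() accepts at most 100 names per call. The sketch below, under those assumptions, pages through all alarms and disables them in batches. The helper names are hypothetical, and the client is passed in so the logic can be tried with a stub.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def disable_all_alarm_actions(client, batch_size=100):
    """Disable actions for every metric alarm the client can see.

    Returns the list of alarm names that were processed.
    """
    names = []
    # The paginator follows NextToken for us until all pages are read
    paginator = client.get_paginator("describe_alarms")
    for page in paginator.paginate():
        names.extend(alarm["AlarmName"] for alarm in page["MetricAlarms"])
    # Send the names in batches to stay within the per-call limit
    for batch in chunked(names, batch_size):
        client.disable_alarm_actions(AlarmNames=batch)
    return names

# Usage against AWS (not run here):
#   import boto3
#   disable_all_alarm_actions(boto3.client('cloudwatch'))
```

After the maintenance window, the same loop with enable_alarm_actions() in place of disable_alarm_actions() would turn the actions back on.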
If you do not have the specific alarm ARNs, you can use the logic in the previous answer. If you have a specific list of ARNs that you want to disable, you can fetch the matching names like this:
    def get_alarm_names(alarm_arns):
        names = []
        response = client.describe_alarms()
        for i in response['MetricAlarms']:
            if i['AlarmArn'] in alarm_arns:
                names.append(i['AlarmName'])
        return names

Setting .authorize_egress() with protocol set to all

I am trying to execute the following code:
    def createSecurityGroup(self, securitygroupname):
        conn = boto3.resource('ec2')
        response = conn.create_security_group(GroupName=securitygroupname, Description = 'test')
        return response
    VPC_NAT_SecurityObject = createSecurityGroup("mysecurity_group")
    response_egress_all = VPC_NAT_SecurityObject.authorize_egress(
        IpPermissions=[{'IpProtocol': '-1'}])
and I get the exception below:
    An error occurred (InvalidParameterValue) when calling the AuthorizeSecurityGroupEgress operation: Only Amazon VPC security groups may be used with this operation.
I tried several different combinations but was not able to set the protocol to all. I used '-1' as explained in the boto3 documentation. Can somebody please suggest how to get this done?
1. boto3.resource("ec2") is a high-level wrapper around the client class. Without a VPC, the security group is not created as a VPC security group, which is what the error complains about. To attach the group to a specific VPC, instantiate a Vpc object explicitly with boto3.resource("ec2").Vpc and the VPC ID, e.g.
    import boto3
    ec2_resource = boto3.resource("ec2")
    myvpc = ec2_resource.Vpc("vpc-xxxxxxxx")
    response = myvpc.create_security_group(
        GroupName = securitygroupname,
        Description = 'test')
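2. Sometimes it is more straightforward to use boto3.client("ec2"). If you check the boto3 EC2 client's create_security_group, you will see that it accepts a VpcId parameter:
    response = client.create_security_group(
        GroupName=securitygroupname,
        Description='test',
        VpcId='vpc-xxxxxxxx'
    )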
If you use an automation script or template to rebuild the VPC, e.g. salt-cloud, you need to give the VPC a Name tag so that your boto3 script can look it up automatically. This also saves you the hassle when AWS migrates resource IDs from the short 8-character format to the longer 17-character one.
Another option is CloudFormation, which lets you put everything in a template, with variables, to recreate the VPC stack.
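Once the group lives in a VPC, the '-1' protocol should be accepted by authorize_egress. A rule normally also names a destination, for example through `IpRanges`. The sketch below builds such a permission; the helper names and the CIDR are illustrative assumptions. As far as I know, a new VPC security group already carries a default rule allowing all outbound traffic to 0.0.0.0/0, so re-adding exactly that rule would be rejected as a duplicate, which is why the example uses a narrower range.

```python
def all_traffic_permission(cidr="0.0.0.0/0"):
    """Build an IpPermissions entry matching every protocol and port for cidr."""
    return {"IpProtocol": "-1", "IpRanges": [{"CidrIp": cidr}]}


def allow_all_egress(security_group, cidr="10.0.0.0/16"):
    """Authorize all outbound traffic to cidr on a boto3 SecurityGroup resource.

    The default CIDR is a hypothetical private range, not a recommendation.
    """
    return security_group.authorize_egress(
        IpPermissions=[all_traffic_permission(cidr)]
    )


print(all_traffic_permission("10.0.0.0/16"))
# Usage against AWS (not run here), with myvpc as in the answer above:
#   sg = myvpc.create_security_group(GroupName='mysecurity_group', Description='test')
#   allow_all_egress(sg)
```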

AWS boto v2.32.0 - List tags for an ASG

I am trying to use boto v2.32.0 to list the tags on a particular ASG.
Something simple like any of the following attempts is obviously not working (especially with the lack of a filter system):
    import boto.ec2.autoscale

    asg = boto.ec2.autoscale.connect_to_region('ap-southeast-2')
    tags = asg.get_all_tags('asgname')
    print tags

    asg = boto.ec2.autoscale.connect_to_region('ap-southeast-2')
    group = asg.get_all_groups(names='asgname')
    tags = asg.get_all_tags(group)
    print tags

    asg = boto.ec2.autoscale.connect_to_region('ap-southeast-2')
    group = asg.get_all_groups(names='asgname')
    tags = group.get_all_tags()
    print tags
Without specifying an 'asgname', it's not returning every ASG. Despite what the documentation says about returning a token to see the next page, it doesn't seem to be implemented correctly, especially when you have a large number of ASGs and tags per ASG.
Trying something like this has basically shown me that the token system appears to be broken. It does not loop through all ASGs and tags before it returns "None":
    asg = boto.ec2.autoscale.connect_to_region('ap-southeast-2')
    nt = None
    while True:
        tags = asg.get_all_tags(next_token=nt)
        for t in tags:
            if t.key == "MyTag":
                print t.resource_id
                print t.value
        # Stop once there is no further page
        if tags.next_token is None:
            break
        nt = str(tags.next_token)
Has anyone managed to achieve this?
This functionality is available in AWS using the AutoScaling DescribeTags API call, but unfortunately boto does not completely implement this call.
You should be able to pass a Filter with that API call to only get the tags for a specific ASG, but if you have a look at the boto source code for get_all_tags() (v2.32.1), the filter is not implemented:
:type filters: dict
:param filters: The value of the filter type used
to identify the tags to be returned. NOT IMPLEMENTED YET.
(quote from the source code mentioned above).
I eventually answered my own question by creating a workaround using the AWS CLI. Since there has been no activity on this question since the day I asked it, I am posting this workaround as a solution.
    import os
    import json
    ## bash command (asgname holds the name of the Auto Scaling group)
    awscli = "/usr/local/bin/aws autoscaling describe-tags --filters Name=auto-scaling-group,Values=" + str(asgname)
    output = str()
    # run it
    cmd = os.popen(awscli, "r")
    while 1:
        # get tag lines
        lines = cmd.readline()
        if not lines: break
        output += lines
    # json.loads to manipulate
    tags = json.loads(output.replace('\n', ''))
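If moving from boto v2 to boto3 is an option, the same DescribeTags filter can be used without shelling out. This is only a sketch, assuming boto3's Auto Scaling client and its `describe_tags` paginator; `get_asg_tags` is a hypothetical helper, and the client is passed in so the function can be exercised with a stub.

```python
def get_asg_tags(client, asg_name):
    """Return {key: value} for all tags on the named Auto Scaling group."""
    tags = {}
    paginator = client.get_paginator("describe_tags")
    # Same filter as in the CLI workaround: Name=auto-scaling-group,Values=<name>
    pages = paginator.paginate(
        Filters=[{"Name": "auto-scaling-group", "Values": [asg_name]}]
    )
    for page in pages:
        for tag in page["Tags"]:
            tags[tag["Key"]] = tag["Value"]
    return tags

# Usage against AWS (not run here):
#   import boto3
#   client = boto3.client('autoscaling', region_name='ap-southeast-2')
#   print(get_asg_tags(client, 'asgname'))
```

The paginator takes care of the next-token handling that caused trouble in the boto v2 loop.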

Configuring munin server for use with AWS autoscaling?

I am planning to use AWS autoscaling groups for my webservers. As a monitoring solution I am using munin at the moment. In the configuration file on the munin master server, you have to give IP addresses or host names for every host you want to monitor.
Now with autoscaling the number of instances will change frequently, and writing static information in the munin config does not seem to fit well in this environment. I could probably query all the server addresses I want to monitor and then write the munin master configuration file, but this does not seem like a good approach to me.
What is the preferred way of using munin in such an environment? Does someone use munin with autoscaling?
In general I would like to keep using munin and not switch to another monitoring solution because I wrote quite a lot of specific plugins that I rely on. However, if you have another monitoring solution that would let me keep my plugins, I am also open to that.
We used munin as an alternative monitoring system a year ago, and I will tell you one thing: I don't like it at all.
We also had some automation for auto scaling in Nagios, but that is an ugly way to monitor a large number of AWS instances too, because Nagios starts to lag or crash once it monitors more than a certain number of instances.
If you have more than 150-200 instances to monitor, I suggest using a commercial service such as StackDriver or one of its alternatives.
I stumbled across this old topic because I was looking for a solution to the same problem. Finally I found a way that works for me, which I would like to share with you. The tl;dr summary:
- use the AWS Python API to get all instances in the same VPC the munin master is in
- test whether munin port 4949 is open on the instances found, to detect munin nodes
- create munin.conf from a munin.base.conf (without nodes) and append entries for all the nodes found
- run the script on the munin master every 5 minutes via cron
Finally, here is my Python script which does all the magic:
    #!/usr/bin/python
    import boto3
    import requests
    import argparse
    import shutil
    import socket

    socketTimeout = 2
    ec2 = boto3.client('ec2')

    def getVpcId():
        # Look up this instance's ID, then ask EC2 which VPC it belongs to.
        # NOTE: supply the EC2 instance metadata URL that returns the instance ID
        response = requests.get('')
        instance_id = response.text
        response = ec2.describe_instances(
            Filters=[
                {
                    'Name' : 'instance-id',
                    'Values' : [ instance_id ]
                }
            ]
        )
        return response['Reservations'][0]['Instances'][0]['VpcId']

    def findNodes(tag):
        # Collect all instances in our VPC that carry the given tag key
        result = []
        vpcId = getVpcId()
        response = ec2.describe_instances(
            Filters=[
                {
                    'Name' : 'tag-key',
                    'Values' : [ tag ]
                },
                {
                    'Name' : 'vpc-id',
                    'Values' : [ vpcId ]
                }
            ]
        )
        for reservation in response['Reservations']:
            for instance in reservation['Instances']:
                result.append(instance)
        return result

    def getInstanceTag(instance, tagName):
        for tag in instance['Tags']:
            if tag['Key'] == tagName:
                return tag['Value']
        return None

    def isMuninNode(host):
        # A host counts as a munin node if port 4949 accepts a connection
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(socketTimeout)
        try:
            s.connect((host, 4949))
            return True
        except Exception as e:
            return False
        finally:
            s.close()

    def appendNodesToConfig(nodes, target, tag):
        with open(target, "a") as file:
            for node in nodes:
                hostname = getInstanceTag(node, tag)
                if hostname is None:
                    continue
                # Drop the trailing dot of a fully qualified DNS name
                if hostname.endswith('.'):
                    hostname = hostname[:-1]
                if isMuninNode(hostname):
                    file.write('[' + hostname + ']\n')
                    file.write('\taddress ' + hostname + '\n')
                    file.write('\tuse_node_name yes\n\n')

    parser = argparse.ArgumentParser("")
    parser.add_argument("baseconfig", help="base munin config to append nodes to")
    parser.add_argument("target", help="target munin config")
    args = parser.parse_args()
    base = args.baseconfig
    target = args.target
    shutil.copyfile(base, target)
    nodes = findNodes('CNAME')
    appendNodesToConfig(nodes, target, 'CNAME')
For the API calls to work, you have to set up AWS API credentials or assign an IAM role with the required permissions (ec2:DescribeInstances as a bare minimum) to your munin master instance (which is my preferred method).
Some final implementation notes:
I have a tag named CNAME assigned to all my AWS instances which holds the internal DNS host name. Therefore I filter for this tag and use the value as the node name and address for the munin configuration. You probably have to change this for your setup.
Another option would be to assign a specific tag to all the instances you want to monitor with munin. You could then filter for this tag and probably also skip the check for the open munin port.
Hope this is of some help.
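To check the generated configuration without touching AWS or the network, the part that writes the node entries can be tried on its own. The sketch below renders the same three-line sections as the script above from a plain list of host names. `render_munin_nodes` is a hypothetical helper and the host names are placeholders.

```python
def render_munin_nodes(hostnames):
    """Render munin.conf node sections for the given host names."""
    sections = []
    for hostname in hostnames:
        # Skip instances without a usable host name tag
        if not hostname:
            continue
        # Remove the trailing dot(s) of a fully qualified DNS name
        hostname = hostname.rstrip(".")
        sections.append(
            "[%s]\n\taddress %s\n\tuse_node_name yes\n\n" % (hostname, hostname)
        )
    return "".join(sections)


config_text = render_munin_nodes(
    ["web1.internal.example.", None, "web2.internal.example"]
)
print(config_text)
```

Appending this text to a copy of munin.base.conf gives the same result as the file writes in the script, minus the port check.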