I would like to get the usage cost report of each instance in my AWS account for a period of time.
I'm able to get linked_account_id and service in the output, but I need instance_id as well. Please help.
import argparse
import boto3
import datetime

cd = boto3.client('ce', 'ap-south-1')

results = []
token = None
while True:
    if token:
        kwargs = {'NextPageToken': token}
    else:
        kwargs = {}
    data = cd.get_cost_and_usage(
        TimePeriod={'Start': '2019-01-01', 'End': '2019-06-30'},
        Granularity='MONTHLY',
        Metrics=['BlendedCost', 'UnblendedCost'],
        GroupBy=[
            {'Type': 'DIMENSION', 'Key': 'LINKED_ACCOUNT'},
            {'Type': 'DIMENSION', 'Key': 'SERVICE'}
        ], **kwargs)
    results += data['ResultsByTime']
    token = data.get('NextPageToken')
    if not token:
        break
print('\t'.join(['Start_date', 'End_date', 'LinkedAccount', 'Service', 'blended_cost', 'unblended_cost', 'Unit', 'Estimated']))
for result_by_time in results:
    for group in result_by_time['Groups']:
        blended_cost = group['Metrics']['BlendedCost']['Amount']
        unblended_cost = group['Metrics']['UnblendedCost']['Amount']
        unit = group['Metrics']['UnblendedCost']['Unit']
        print(result_by_time['TimePeriod']['Start'], '\t',
              result_by_time['TimePeriod']['End'], '\t',
              '\t'.join(group['Keys']), '\t',
              blended_cost, '\t',
              unblended_cost, '\t',
              unit, '\t',
              result_by_time['Estimated'])
As far as I know, Cost Explorer can't break usage down per instance. There is a feature, Cost and Usage Reports, which delivers a detailed billing report as dump files. In those files you can see the instance ID.
It can also be connected to Amazon Athena. Once you set that up, you can query the files directly in Athena.
Here is my Presto example.
select
    lineitem_resourceid,
    sum(lineitem_unblendedcost) as unblended_cost,
    sum(lineitem_blendedcost) as blended_cost
from
    <table>
where
    lineitem_productcode = 'AmazonEC2' and
    product_operation like 'RunInstances%'
group by
    lineitem_resourceid
The result is
lineitem_resourceid     unblended_cost    blended_cost
i-*****************     279.424           279.424
i-*****************     139.948           139.948
i-********              68.198            68.198
i-*****************     3.848             3.848
i-*****************     0.013             0.013
The resourceid column contains the instance ID. The cost is summed over all usage in the month. For other types of product_operation, it will contain different resource IDs.
You can add an individual tag to all instances (e.g. Id) and then group by that tag:
GroupBy=[
{
'Type': 'TAG',
'Key': 'Id'
},
],
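For context, here is a minimal sketch of how that GroupBy could be plugged into a full get_cost_and_usage call. The tag key Id and the date range are assumptions, and the tag must be activated as a cost allocation tag before Cost Explorer will return it:
import boto3

ce = boto3.client('ce')

# Group monthly unblended cost by the (assumed) cost allocation tag "Id".
response = ce.get_cost_and_usage(
    TimePeriod={'Start': '2019-01-01', 'End': '2019-06-30'},
    Granularity='MONTHLY',
    Metrics=['UnblendedCost'],
    GroupBy=[
        {'Type': 'TAG', 'Key': 'Id'},
    ],
)

for result in response['ResultsByTime']:
    for group in result['Groups']:
        # Keys look like "Id$<tag value>"; untagged usage shows up as "Id$".
        print(group['Keys'][0], group['Metrics']['UnblendedCost']['Amount'])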
Related
I have a Python script to get the cost of each service for the last month, but I want the cost for each resource using the tag applied to that resource. For example, I have the cost for the RDS service, and I'm using two database instances, so I want to get a separate cost for each of the two database instances. I have different tags for the two database instances.
TAGS:
First database instance --> Key: Name Value:rds1
Second database instance --> Key: Name Value:rds2
My output should show the tag of the resource and its cost, for example:
rds1 - $15
rds2 - $10
Can anyone help me achieve this?
I have attached the output I got for the cost based on the service:
(Screenshot: output for cost based on service)
You can find similar work here: Boto3 get_cost_and_usage filter Getting zero cost based on tag.
Make sure that the tags you are listing are correct.
On top of this, you can see all the tags that have been created in your account, whether user-defined or AWS-managed, here: https://us-east-1.console.aws.amazon.com/billing/home#/tags.
With the boto3 Cost Explorer client, you can use the list_cost_allocation_tags function to get a list of cost allocation tags.
import boto3

start_date = '2022-07-01'
end_date = '2022-08-30'

client = boto3.client('ce')

tags_response = None
try:
    tags_response = client.list_cost_allocation_tags(
        Status='Inactive',  # 'Active'|'Inactive'
        # TagKeys=[
        #     'Key',
        # ],
        Type='UserDefined',  # 'AWSGenerated'|'UserDefined'
        # NextToken='string',
        # MaxResults=100,
    )
except Exception as e:
    print(e)

cost_allocation_tags = tags_response['CostAllocationTags']
print(cost_allocation_tags)

print("-"*5 + ' Input TagValues with comma separation ' + "-"*5)
for cost_allocation_tag in cost_allocation_tags:
    tag_key = cost_allocation_tag['TagKey']
    tag_type = cost_allocation_tag['Type']
    tag_status = cost_allocation_tag['Status']

    tag_values = str(input(
        f'TagKey: {tag_key}, Type: {tag_type}, Status: {tag_status} -> '))
    if tag_values == "":
        continue

    tag_values_parsed = tag_values.strip().split(',')
    if tag_values_parsed == []:
        continue

    cost_usage_response = None
    try:
        cost_usage_response = client.get_cost_and_usage(
            TimePeriod={
                'Start': start_date,
                'End': end_date
            },
            Metrics=['AmortizedCost'],
            Granularity='MONTHLY',  # 'DAILY'|'MONTHLY'|'HOURLY'
            Filter={
                'Tags': {
                    'Key': tag_key,
                    'Values': tag_values_parsed,
                    'MatchOptions': [
                        'EQUALS'  # 'EQUALS'|'ABSENT'|'STARTS_WITH'|'ENDS_WITH'|'CONTAINS'|'CASE_SENSITIVE'|'CASE_INSENSITIVE'
                    ]
                },
            },
            # GroupBy=[
            #     {
            #         'Type': 'SERVICE',  # 'DIMENSION'|'TAG'|'COST_CATEGORY'  # AZ, INSTANCE_TYPE, LEGAL_ENTITY_NAME, INVOICING_ENTITY, LINKED_ACCOUNT, OPERATION, PLATFORM, PURCHASE_TYPE, SERVICE, TENANCY, RECORD_TYPE, and USAGE_TYPE
            #         'Key': 'string',
            #     },
            # ],
        )
        print(cost_usage_response)
    except Exception as e:
        print(e)
The below-mentioned code was created for exporting all the findings from Security Hub to an S3 bucket using a Lambda function. The filters are set to export only the CIS AWS Foundations Benchmark findings. There are more than 20 accounts added as members in Security Hub. The issue I'm facing is that even though I'm using the NextToken configuration, the output doesn't have information about all the accounts; instead, it just displays one account's data at random.
Can somebody look into the code and let me know what could be the issue, please?
import boto3
import json
from botocore.exceptions import ClientError
import time
import glob

client = boto3.client('securityhub')
s3 = boto3.resource('s3')
storedata = {}

_filter = {
    'GeneratorId': [
        {
            'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
            'Comparison': 'PREFIX'
        }
    ],
}

def lambda_handler(event, context):
    response = client.get_findings(
        Filters={
            'GeneratorId': [
                {
                    'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
                    'Comparison': 'PREFIX'
                },
            ],
        },
    )
    results = response["Findings"]
    while "NextToken" in response:
        response = client.get_findings(Filters=_filter, NextToken=response["NextToken"])
        results.extend(response["Findings"])
        storedata = json.dumps(response)
        print(storedata)
    save_file = open("/tmp/SecurityHub-Findings.json", "w")
    save_file.write(storedata)
    save_file.close()
    for name in glob.glob("/tmp/*"):
        s3.meta.client.upload_file(name, "xxxxx-security-hubfindings", name)
I'm now also getting a TooManyRequestsException error.
The problem is in this code that paginates the security findings results:
while "NextToken" in response:
response = client.get_findings(Filters=_filter,NextToken=response["NextToken"])
results.extend(response["Findings"])
storedata = json.dumps(response)
print(storedata)
The value of storedata after the while loop has completed is the last page of security findings, rather than the aggregate of the security findings.
However, you're already aggregating the security findings in results, so you can use that:
save_file = open("/tmp/SecurityHub-Findings.json", "w")
save_file.write(json.dumps(results))
save_file.close()
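If it helps, here is a minimal sketch of the same aggregation using boto3's built-in paginator for get_findings, which handles the NextToken bookkeeping for you. The filter and the bucket name are taken from the question and are placeholders:
import json
import boto3

client = boto3.client('securityhub')
s3 = boto3.resource('s3')

_filter = {
    'GeneratorId': [
        {
            'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
            'Comparison': 'PREFIX'
        }
    ],
}

def lambda_handler(event, context):
    results = []
    # The paginator follows NextToken internally until all pages are consumed.
    paginator = client.get_paginator('get_findings')
    for page in paginator.paginate(Filters=_filter):
        results.extend(page['Findings'])

    with open("/tmp/SecurityHub-Findings.json", "w") as f:
        json.dump(results, f)

    # "xxxxx-security-hubfindings" is the placeholder bucket name from the question.
    s3.meta.client.upload_file("/tmp/SecurityHub-Findings.json",
                               "xxxxx-security-hubfindings",
                               "SecurityHub-Findings.json")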
My Requirement
I want to create a CloudWatch-Metric from Athena query results.
Example
I want to create a metric like user_count for each day.
In Athena, I will write an SQL query like this
select date,count(distinct user) as count from users_table group by 1
In the Athena editor I can see the result, but I want to see these results as a metric in Cloudwatch.
CloudWatch-Metric-Name ==> user_count
Dimensions ==> Date,count
If I have this CloudWatch metric and these dimensions, I can easily create a monitoring dashboard and send alerts.
Can anyone suggest a way to do this?
You can use CloudWatch custom widgets, see "Run Amazon Athena queries" in Samples.
It's somewhat involved, but you can use a Lambda for this. In a nutshell:
1. Setup your query in Athena and make sure it works using the Athena console.
2. Create a Lambda that:
   - Runs your Athena query
   - Pulls the query results from S3
   - Parses the query results
   - Sends the query results to CloudWatch as a metric
3. Use EventBridge to run your Lambda on a recurring basis.
Here's an example Lambda function in Python that does step #2. Note that the Lambda function will need IAM permissions to run queries in Athena, read the results from S3, and put a metric into CloudWatch.
import time
import boto3

query = 'select count(*) from mytable'
DATABASE = 'default'
bucket = 'BUCKET_NAME'
path = 'yourpath'

def lambda_handler(event, context):
    # Run query in Athena
    client = boto3.client('athena')
    output = "s3://{}/{}".format(bucket, path)

    # Execution
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': DATABASE
        },
        ResultConfiguration={
            'OutputLocation': output,
        }
    )

    # The S3 file name uses the QueryExecutionId, so
    # grab it here so we can pull the S3 file.
    qeid = response["QueryExecutionId"]

    # Occasionally Athena hasn't written the file
    # before the lambda tries to pull it out of S3, so pause a few seconds.
    # Note: you are charged for the time the lambda is running.
    # A more elegant but more complicated solution would try to get the
    # file first, then sleep.
    time.sleep(3)

    # Get the query result from S3.
    s3 = boto3.client('s3')
    objectkey = path + "/" + qeid + ".csv"

    # Load the object as a file
    file_content = s3.get_object(
        Bucket=bucket,
        Key=objectkey)["Body"].read()

    # Split the file on newlines
    lines = file_content.decode().splitlines()

    # Get the second line in the file (the first line is the CSV header)
    count = lines[1]

    # Remove double quotes
    count = count.replace("\"", "")

    # Convert string to int since CloudWatch wants a numeric value
    count = int(count)

    # Post query results as a CloudWatch metric
    cloudwatch = boto3.client('cloudwatch')
    response = cloudwatch.put_metric_data(
        MetricData=[
            {
                'MetricName': 'MyMetric',
                'Dimensions': [
                    {
                        'Name': 'DIM1',
                        'Value': 'dim1'
                    },
                ],
                'Unit': 'None',
                'Value': count
            },
        ],
        Namespace='MyMetricNS'
    )
    return response
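For step #3 (running the Lambda on a schedule), one option is an EventBridge rule with a rate or cron expression. A minimal sketch, assuming the rule name, function name, and ARN are placeholders and the Lambda already allows events.amazonaws.com to invoke it:
import boto3

events = boto3.client('events')

# Create (or update) a rule that fires once a day.
events.put_rule(
    Name='athena-metric-daily',          # assumed rule name
    ScheduleExpression='rate(1 day)',
    State='ENABLED',
)

# Point the rule at the Lambda function.
events.put_targets(
    Rule='athena-metric-daily',
    Targets=[
        {
            'Id': 'athena-metric-lambda',
            # assumed function ARN
            'Arn': 'arn:aws:lambda:us-east-1:123456789012:function:athena-metric-lambda',
        },
    ],
)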
data = cd.get_cost_and_usage(
    TimePeriod={'Start': input("Enter Start Date in format yyyy-mm-dd:\n"), 'End': input("Enter End Date in format yyyy-mm-dd:\n")},
    Granularity='MONTHLY',
    Metrics=[input('Choose any of the following metrics: AMORTIZED_COST, UNBLENDED_COST, BLENDED_COST, USAGE_QUANTITY, NET_AMORTIZED_COST\n')],
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'LINKED_ACCOUNT'}, {'Type': 'DIMENSION', 'Key': input("Enter any of the following Dimensions - SERVICE, INSTANCE_TYPE, USAGE_TYPE, RECORD_TYPE:\n")}], **kwargs)

for info in data['ResultsByTime']:
    for group in info['Groups']:
        data = (group['Keys'][0], info['TimePeriod']['Start'], group['Metrics'][]
                ['Amount'], group['Keys'][1])
        print(*data, sep=", ", file=f, flush=True)
In the part of the script below, the [] needs to receive the same value that was entered for Metrics. How do I make it so the user doesn't have to input the same thing twice, i.e. whatever they put as Metrics is used here as well?
data = (group['Keys'][0], info['TimePeriod']['Start'], group['Metrics'][]
        ['Amount'], group['Keys'][1])
I think the most logical way would be to extract the metric input along with the other inputs, and then re-use it.
For example:
metric_chosen = input('Choose any of the following metrics: AMORTIZED_COST, UNBLENDED_COST, BLENDED_COST, USAGE_QUANTITY, NET_AMORTIZED_COST\n')
date_start = input("Enter Start Date in format yyyy-mm-dd:\n")
date_end = input("Enter End Date in format yyyy-mm-dd:\n")
dimension_chosen = input("Enter any of the following Dimensions - SERVICE, INSTANCE_TYPE, USAGE_TYPE, RECORD_TYPE:\n")
data = cd.get_cost_and_usage(
    TimePeriod={
        'Start': date_start,
        'End': date_end
    },
    Granularity='MONTHLY',
    Metrics=[metric_chosen],
    GroupBy=[
        {'Type': 'DIMENSION',
         'Key': 'LINKED_ACCOUNT'
        },
        {'Type': 'DIMENSION',
         'Key': dimension_chosen
        }], **kwargs)

for info in data['ResultsByTime']:
    for group in info['Groups']:
        data = (group['Keys'][0],
                info['TimePeriod']['Start'],
                group['Metrics'][metric_chosen]['Amount'],
                group['Keys'][1])
        print(*data, sep=", ", file=f, flush=True)
Greetings,
My code works as long as it's the first spot request of the day. If I terminate the instance and make another spot request, it just gives me back my old request.
Is it something with my code or with AWS? Is there a workaround?
I have tried cloning my AMI and then using the cloned AMI, changing the price, or changing the number of instances in the spec,
but it is still not working.
#!/home/makayo/.virtualenvs/boto3/bin/python
"""
http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.Client.describe_spot_instance_requests
"""
import boto3
import time
myid=
s = boto3.Session()
ec2 = s.resource('ec2')
client = boto3.client('ec2')
images = list(ec2.images.filter(Owners=[myid]))
def getdate(datestr):
    ix = datestr.replace('T', ' ')
    ix = ix[0:len(ix)-5]
    idx = time.strptime(ix, '%Y-%m-%d %H:%M:%S')
    return(idx)
zz=sorted(images, key=lambda images: getdate(images.creation_date))
#last_ami
myAmi=zz[len(zz)-1]
#earliest
#myAmi=latestAmi=zz[0]
"""
[{u'DeviceName': '/dev/sda1', u'Ebs': {u'DeleteOnTermination': True, u'Encrypted': False, u'SnapshotId': 'snap-d8de3adb', u'VolumeSize': 50, u'VolumeType': 'gp2'}}]
"""
#myimageId='ami-42870a55'
myimageId=myAmi.id
print myimageId
mysubnetId=
myinstanceType='c4.4xlarge'
mykeyName='spot-coursera'
# make sure to adjust this, but don't do multiple in a loop as it can fail!!!
mycount=2
# make sure to adjust this, but don't do multiple in a loop as it can fail!!!
myprice='5.0'
mytype='one-time'
myipAddr=
myallocId=''
mysecurityGroups=['']
#mydisksize=70
mygroupId=
#mygroupId=
myzone='us-east-1a'
myvpcId='vpc-503dba37'
#latestAmi.block_device_mappings[0]['Ebs']['VolumeSize']=mydisksize
#diskSpec=latestAmi.block_device_mappings[0]['Ebs']['VolumeSize']
response2 = client.request_spot_instances(
DryRun=False,
SpotPrice=myprice,
ClientToken='string',
InstanceCount=1,
Type='one-time',
LaunchSpecification={
'ImageId': myimageId,
'KeyName': mykeyName,
'SubnetId':mysubnetId,
#'SecurityGroups': mysecurityGroups,
'InstanceType': myinstanceType,
'Placement': {
'AvailabilityZone': myzone,
}
}
)
#print(response2)
myrequestId=response2['SpotInstanceRequests'][0]['SpotInstanceRequestId']
import time
XX=True
while XX:
    response3 = client.describe_spot_instance_requests(
        #DryRun=True,
        SpotInstanceRequestIds=[
            myrequestId,
        ]
        #Filters=[
        #    {
        #        'Name': 'string',
        #        'Values': [
        #            'string',
        #        ]
        #    },
        #]
    )
    #print(response3)
    request_status=response3['SpotInstanceRequests'][0]['Status']['Code']
    if(request_status=='fulfilled'):
        print myrequestId,request_status
        XX=False
    elif ('pending' in request_status):
        print myrequestId,request_status
        time.sleep(5)
    else:
        XX=False
        print myrequestId,request_status
"""
instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
while( len(list(instances))==0):
instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
for instance in instances:
print(instance.id, instance.instance_type);
response = instance.modify_attribute(Groups=[mygroupId]);
print(response);
This is wrong:
ClientToken='string',
Or, at least, it's wrong most of the time, as you should now realize.
The purpose of the token is to ensure that EC2 does not process the same request twice, due to retries, bugs, or a multitude of other reasons.
It doesn't matter (within reason -- 64 characters, max, ASCII, case-sensitive) what you send here, but you need to send something different with each unique request.
A client token is a unique, case-sensitive string of up to 64 ASCII characters. It is included in the response when you describe the instance. A client token is valid for at least 24 hours after the termination of the instance. You should not reuse a client token in another call later on.
http://docs.aws.amazon.com/AWSEC2/latest/APIReference/Run_Instance_Idempotency.html
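For illustration, a minimal sketch of generating a fresh, unique client token for each spot request using Python's uuid module; the AMI ID, key name, and other launch parameters are placeholders:
import uuid
import boto3

client = boto3.client('ec2')

# A new UUID per call keeps the token unique and well under the 64-character limit,
# so EC2 treats every invocation as a distinct request.
response = client.request_spot_instances(
    SpotPrice='5.0',
    ClientToken=str(uuid.uuid4()),
    InstanceCount=1,
    Type='one-time',
    LaunchSpecification={
        'ImageId': 'ami-xxxxxxxx',        # placeholder
        'KeyName': 'spot-coursera',
        'InstanceType': 'c4.4xlarge',
        'Placement': {'AvailabilityZone': 'us-east-1a'},
    },
)
print(response['SpotInstanceRequests'][0]['SpotInstanceRequestId'])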