I would like to use Boto3 to generate a list of EC2 instances along with their state changes (pending, running, shutting-down, terminated, etc.) between two date/times. My understanding is that the Config Service maintains histories of EC2 instances even if the instance no longer exists. I have taken a look at this document; however, I am having difficulty understanding which functions to use to accomplish the task at hand.
Thank you
Assuming you have already configured AWS Config to record EC2 instance resources, this approach will suit your needs.
1) Get the list of EC2 instances using the list_discovered_resources API. Ensure includeDeletedResources is set to True if you want deleted resources included in the response.
import boto3

client = boto3.client('config')

# List all EC2 instances known to AWS Config, including deleted ones
response = client.list_discovered_resources(
    resourceType='AWS::EC2::Instance',
    limit=100,
    includeDeletedResources=True
)
Parse the response and store the resource-id.
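Continuing from the response above, a minimal sketch (the list name resource_ids is just this example's choice) that collects the IDs and follows nextToken when more pages exist:

resource_ids = [r['resourceId'] for r in response['resourceIdentifiers']]

# Keep following nextToken until Config has returned every matching resource
while response.get('nextToken'):
    response = client.list_discovered_resources(
        resourceType='AWS::EC2::Instance',
        limit=100,
        includeDeletedResources=True,
        nextToken=response['nextToken']
    )
    resource_ids.extend(r['resourceId'] for r in response['resourceIdentifiers'])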
2) Pass each resource_id to the get_resource_config_history API.
from datetime import datetime

response = client.get_resource_config_history(
    resourceType='AWS::EC2::Instance',
    resourceId='i-0123af12345be162h5',   # enter your EC2 instance id here
    laterTime=datetime(2018, 1, 7),      # end date; defaults to the current time
    earlierTime=datetime(2018, 1, 1),    # start date
    chronologicalOrder='Reverse',        # or 'Forward'
    limit=100
)
You can parse the response to see the state changes the EC2 instance went through during that time period.
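As a rough sketch (assuming, as it does for AWS::EC2::Instance items, that each configuration item's configuration field is a JSON string containing a state object), you could extract the captured state like this:

import json

for item in response['configurationItems']:
    capture_time = item['configurationItemCaptureTime']
    state = None
    if item.get('configuration'):
        # e.g. {"state": {"code": 16, "name": "running"}, ...}
        state = json.loads(item['configuration']).get('state', {}).get('name')
    print(item['resourceId'], capture_time, state)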
If I call the boto3 EC2 client's describe_instances function with no MaxResults parameter, will it return all instances in the initial call? There is a parameter that allows one to specify MaxResults, but it is not required. If I don't specify this MaxResults parameter, will the response contain all instances or will it still chunk them into groups using the NextToken of the response?
The documentation says
"Describes the specified instances or all of AWS account's
instances...If you do not specify instance IDs, Amazon EC2 returns
information for all relevant instances."
But it is not clear whether I still need to expect that things could be returned in chunks if my account has a lot of instances. The MaxResults parameter can be set to "between 5 and 1000," which implies 1000 may be the default MaxResults.
If you do not specify MaxResults, then the server-side API will limit the response to either a maximum number of results/items (for example 1000) or a maximum size of response payload (e.g. 256 MB). Which it does is not typically documented, and potentially varies from API call to API call and from service to service.
If NextToken is present in the response and is not NULL, then you should re-issue the API call, with the NextToken, to get the next 'page' of results. Rinse and repeat until you have all results.
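A minimal sketch of that loop with boto3 (no MaxResults set, just following NextToken until it disappears):

import boto3

ec2 = boto3.client('ec2')

instances = []
kwargs = {}
while True:
    response = ec2.describe_instances(**kwargs)
    for reservation in response['Reservations']:
        instances.extend(reservation['Instances'])
    next_token = response.get('NextToken')
    if not next_token:
        break
    kwargs['NextToken'] = next_token

print(len(instances))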
If you know you have only a handful of EC2 instances (say, fewer than 100), most programmers won't bother checking the response's NextToken. They probably should, but they don't.
Note that the above relates to the boto3 Client interface. You can also use the describe-instances paginator.
If you are purely interested in EC2 instances within a given VPC, then you can use the VPC's instances collection. This is part of boto3's Resource-level interface. The instances are lazily-loaded and you don't need to paginate or mess with next tokens. See differences between Client and Resource.
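For example, a short sketch of the Resource-level approach (the VPC ID here is a placeholder):

import boto3

ec2 = boto3.resource('ec2')
vpc = ec2.Vpc('vpc-0123456789abcdef0')  # placeholder VPC ID

# The collection is lazily loaded and paginated for you
for instance in vpc.instances.all():
    print(instance.id, instance.state['Name'])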
Modified
Let us assume that we call describe_instances() without setting MaxResults.
The response will then contain the list of instances, and it may or may not include a NextToken. If NextToken is present, the response contains only part of all instances. If NextToken is absent, the response contains all instances.
Not setting MaxResults does not mean that the response will show all instances.
Original
Once you receive a response from describe_instances() without a NextToken, the result contains all instances, even if you didn't set MaxResults. You only need to check the response of describe_instances().
Or use the paginator to get all results without handling NextToken yourself. Here is my sample code for snapshots.
import boto3

# Use an explicit session with the region this example runs in
session = boto3.session.Session(region_name='ap-northeast-2')
ec2 = session.client('ec2')

# The paginator transparently follows NextToken across pages
page_iterator = ec2.get_paginator('describe_snapshots').paginate()

for page in page_iterator:
    for snapshot in page['Snapshots']:
        print(snapshot['SnapshotId'], snapshot['StartTime'])
This will print every snapshot ID and its start time.
Check the below two options for calling describe instances:
1) A direct API call such as describe_instances, which takes a NextToken argument you can pass as the starting point for your next query. If you have a small number of instances, a single call may return them all, and in that case you won't see a NextToken value.
2) Using a paginator (see the Command Reference). Once you have a paginator.paginate() object, you can iterate over it with a for loop and it will return all instances. This way you don't have to worry about MaxItems or NextToken.
A simple example that illustrates how to use a paginator is shown below. I would recommend using paginators whenever possible.
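A minimal sketch with boto3 (printing the instance ID and state is just this example's choice):

import boto3

ec2 = boto3.client('ec2')
paginator = ec2.get_paginator('describe_instances')

# The paginator handles NextToken internally
for page in paginator.paginate():
    for reservation in page['Reservations']:
        for instance in reservation['Instances']:
            print(instance['InstanceId'], instance['State']['Name'])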
I have two log groups generated by two different Lambda functions. When I subscribe one log group to my Elasticsearch service, it works. However, when I add the other log group, I get the following error in the log generated by CloudWatch:
"responseBody": "{\"took\":5,\"errors\":true,\"items\":[{\"index\":{\"_index\":\"cwl-2018.03.01\",\"_type\":\"/aws/lambda/lambda-1\",\"_id\":\"33894733850010958003644005072668130559385092091818016768\",\"status\":400,\"error\":
{\"type\":\"illegal_argument_exception\",\"reason\":\"Rejecting mapping update to [cwl-2018.03.01] as the final mapping would have more than 1 type: [/aws/lambda/lambda-1, /aws/lambda/lambda-2]\"}}}]}"
How can I resolve this and still have both log groups in my Elasticsearch service, and visualize all the logs?
Thank you.
The problem is that ElasticSearch 6.0.0 made a change that allows indices to only contain a single mapping type. (https://www.elastic.co/guide/en/elasticsearch/reference/6.0/removal-of-types.html) I assume you are running an ElasticSearch service instance that is using version 6.0.
The default Lambda JS file, if created through the AWS console, sets the index type to the log group name. An example of the JS file is in this gist (https://gist.github.com/iMilnb/27726a5004c0d4dc3dba3de01c65c575)
Line 86: action.index._type = payload.logGroup;
I personally have a modified version of that script in use and changed that line to be:
action.index._type = 'cwl';
I have logs from various different log groups streaming through to the same ElasticSearch instance. It makes sense to have them all be the same type since they are all CloudWatch logs versus having the type be the log group name. The name is also set in the #log_group field so queries can use that for filtering.
In my case, I did the following:
1) Deploy the modified Lambda.
2) Reindex today's index (cwl-2018.03.07, for example) to change the type for old documents from <log group name> to cwl.
Entries from different log groups will now coexist.
You can also modify the generated Lambda code as below to make it work with multiple CloudWatch log groups. If the Lambda function creates a different ES index for each log group arriving through the same subscription, this problem is avoided. So, find the Lambda function LogsToElasticsearch_<AWS-ES-DOMAIN-NAME>, then the function transform(payload), and change the index-name construction as below.
// index name format: cwl-YYYY.MM.DD
// var indexName = [
//     'cwl-' + timestamp.getUTCFullYear(),              // year
//     ('0' + (timestamp.getUTCMonth() + 1)).slice(-2),  // month
//     ('0' + timestamp.getUTCDate()).slice(-2)          // day
// ].join('.');

var indexName = [
    'cwl-' + payload.logGroup.toLowerCase().split('/').join('-') + '-' + timestamp.getUTCFullYear(), // log group + year
    ('0' + (timestamp.getUTCMonth() + 1)).slice(-2),  // month
    ('0' + timestamp.getUTCDate()).slice(-2)          // day
].join('.');
Is it possible to forward all the CloudWatch log groups to a single index in ES? Like having one index "rds-logs-*" to stream logs from all my available RDS instances.
For example: error logs, slow-query logs, general logs, etc., of all RDS instances would be pushed under the same index (rds-logs-*)?
I tried the above-mentioned code change, but it pushes only the last log group that I had configured.
From AWS: by default, only one log group can stream log data into the Elasticsearch service. Attempting to stream two log groups at the same time will result in the log data of one log group overriding the log data of the other.
Wanted to check if we have a work-around for the same.
I have shared a bunch of AMIs from an AWS account to another.
I used this EC2conn1.modify_image_attribute(AMI_id, operation='add', attribute='launchPermission', user_ids=[second_aws_account_id]) to do it.
But, by only adding launch permission for the 2nd account, I can launch an instance but I cannot copy the shared AMI to another region [in the 2nd account].
When I tick the "create volume" checkbox in the UI of the 1st account, I can copy the shared AMI from the 2nd account.
I can modify the launch permissions using the modify_image_attribute function from boto.
The documentation says attribute (string) – The attribute you wish to change, but I understand that it can only change the launch permissions and add an account.
Yet get_image_attribute has 3 options; valid choices are: launchPermission, productCodes, blockDeviceMapping.
So, is there a way to change this programmatically from the API along with the launch permissions, or has it not been implemented yet?
The console uses the API, so there's almost nothing you can do in the console that you can't do using the API.
Remember that an AMI is just a configuration entity -- basic launch configuration, linked to (not containing) one or more backing snapshots, which are technically separate entities.
The console is almost certainly making an additional API request to the ModifySnapshotAttribute API when it offers to optionally "add Create Volume permissions to the following associated snapshot."
See also http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-modifying-snapshot-permissions.html
Presumably, copying a snapshot to another region relies on the same "Create Volume" permission (indeed, you'll see that a copied snapshot has a fake source volume ID, presumably an artifact of the copying process).
Based on the accepted answer, this is the code I wrote for anyone interested.
# Add copy permission to the image's snapshot

# Find the snapshot of the specific AMI
image_object = EC2conn.get_image(AMI_id)

# Grab the block device mapping dynamically
ami_devices = []
for key in image_object.block_device_mapping.iterkeys():
    # print key  # debug
    ami_devices.append(key)
# print ami_devices  # debug

for ami_device in ami_devices:
    snap_id = image_object.block_device_mapping[ami_device].snapshot_id
    # Add permission
    EC2conn.modify_snapshot_attribute(snap_id, attribute='createVolumePermission', operation='add', user_ids=second_aws_account_id)
    print "{0} [{1}] Permission added to snapshot".format(AMI_name, snap_id)
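The snippet above uses the older boto library; for anyone on boto3, a roughly equivalent sketch (the AMI ID and account ID below are placeholders) would be:

import boto3

ec2 = boto3.client('ec2')
ami_id = 'ami-0123456789abcdef0'      # placeholder
second_account_id = '123456789012'    # placeholder

# Share the AMI itself (launch permission)
ec2.modify_image_attribute(
    ImageId=ami_id,
    LaunchPermission={'Add': [{'UserId': second_account_id}]}
)

# Share each backing snapshot (create-volume permission), so the other
# account can also copy the AMI to another region
image = ec2.describe_images(ImageIds=[ami_id])['Images'][0]
for mapping in image.get('BlockDeviceMappings', []):
    snapshot_id = mapping.get('Ebs', {}).get('SnapshotId')
    if snapshot_id:
        ec2.modify_snapshot_attribute(
            SnapshotId=snapshot_id,
            Attribute='createVolumePermission',
            OperationType='add',
            UserIds=[second_account_id]
        )
        print('Permission added to snapshot', snapshot_id)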
I run a service where the users can publicly upload and download files to our site, using Amazon S3. Last month we had a problem where a user uploaded a file that was downloaded like crazy, resulting in 170 TB of bandwidth and a huge bill.
From talking to Amazon and searching on StackOverflow, the way to ensure this doesn't happen again is to download the S3 logs, parse them, and take action from there.
We could build such a script, but I guess there must be some open-source or third-party script or service for this?
What about:
1) Create a CloudFront distribution for downloads.
2) Set up a CloudWatch alarm that is triggered when the distribution's BytesDownloaded metric exceeds your chosen monthly limit.
3) Add a notification (sent to an SNS topic you create) that is triggered when the alarm fires.
4) Add a Lambda function that is triggered by SNS when a notification is sent to that topic (the SNS topic should also have your email subscribed, of course, so you receive an email with the alarm).
5) In the Lambda function, write code that uses the AWS SDK to update the CloudFront distribution and set its Enabled value to false (see the sketch after this list).
(You could also create a notification that fires when the state of the alarm changes back to OK and trigger a Lambda function that re-enables the distribution.)
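A minimal sketch of that Lambda function with boto3 (the DISTRIBUTION_ID environment variable is an assumption of this example, not something SNS provides):

import os
import boto3

cloudfront = boto3.client('cloudfront')

def handler(event, context):
    # DISTRIBUTION_ID is an assumed environment variable set on the function
    distribution_id = os.environ['DISTRIBUTION_ID']

    # Fetch the current config together with its ETag, flip Enabled off,
    # and push the config back
    response = cloudfront.get_distribution_config(Id=distribution_id)
    config = response['DistributionConfig']
    if config['Enabled']:
        config['Enabled'] = False
        cloudfront.update_distribution(
            Id=distribution_id,
            IfMatch=response['ETag'],
            DistributionConfig=config
        )
    return {'disabled': distribution_id}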
My solution to this, and problems like this, is to have billing alerts on my account. I know roughly how much I should spend each month and set up alerts accordingly: roughly, I divided that amount by 4 (weeks) and set a series of billing alerts at 1/4, 1/2, 3/4 and 1x my estimated spend.
This is not a technical solution to stop the downloads, but at least someone will get notified and they can take action before it gets out of control.
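If you want to script those alarms rather than click them together, here is a sketch with boto3 (the monthly estimate and SNS topic ARN are placeholders; billing metrics are only published in us-east-1 and require "Receive Billing Alerts" to be enabled on the account):

import boto3

# Billing metrics only exist in us-east-1
cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

estimated_monthly_spend = 400.0  # placeholder, in USD
sns_topic_arn = 'arn:aws:sns:us-east-1:123456789012:billing-alerts'  # placeholder

for fraction in (0.25, 0.5, 0.75, 1.0):
    cloudwatch.put_metric_alarm(
        AlarmName='billing-%d-percent' % int(fraction * 100),
        Namespace='AWS/Billing',
        MetricName='EstimatedCharges',
        Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
        Statistic='Maximum',
        Period=21600,            # 6 hours; EstimatedCharges updates a few times a day
        EvaluationPeriods=1,
        Threshold=estimated_monthly_spend * fraction,
        ComparisonOperator='GreaterThanOrEqualToThreshold',
        AlarmActions=[sns_topic_arn]
    )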
Your best approach is to distribute your S3 content using Amazon CloudFront, put AWS Web Application Firewall (WAF) in front of it, and implement IP blocking.
So if an IP hits your CloudFront distribution more than, say, 5 times, AWS WAF will block that IP.
Here is the detailed guide.
https://blogs.aws.amazon.com/security/post/Tx1ZTM4DT0HRH0K/How-to-Configure-Rate-Based-Blacklisting-with-AWS-WAF-and-AWS-Lambda
We had a similar kind of requirement long ago.
We used CloudTrail logs to figure out all the activities being performed on our AWS Account.
Hope the script for downloading and filtering CloudTrail logs helps you out. (The following script only figures out launched instance IDs, owner, and event name; please modify it according to your needs.)
import boto3
import gzip
import os
import json

client = boto3.client('s3')
bucketname = "mybucketname"
list_bucket_objects = client.list_objects(Bucket=bucketname)
download_path = '/home/ec2-user/cloudtrail/'

# DOWNLOADING: download log files from S3
for object in list_bucket_objects['Contents']:
    print object['Key']
    object_name = object['Key'].split('/')
    if len(object_name) == 8:
        print "Downloading --->%s" % object_name[7]
        client.download_file(bucketname, object['Key'], download_path + object_name[7])

# UNZIPPING: unzip the files into one folder
file_path = '/home/ec2-user/cloudtrail/'
new_file_path = '/home/ec2-user/cloudtrail/logs/'

# Create log directory
if not os.path.exists(new_file_path):
    os.mkdir(new_file_path)

files = os.listdir(file_path)
for file in files:
    boolean = os.path.isfile(file_path + file)
    if boolean == True:
        f = gzip.GzipFile(file_path + file, 'rb')
        s = f.read()
        f.close()
        split_file = file.split('.')
        log_path = new_file_path + split_file[0]
        print log_path
        out = open(log_path, 'wb')
        out.write(s)
        out.close()

        # PARSING AND FILTERING: parse the unzipped log as JSON, filter it,
        # and append matching entries to result.txt
        fin = open(log_path).read()
        content = json.loads(fin)
        for i in range(0, len(content['Records'])):
            event = content['Records'][i]['eventName']
            if 'userName' in content['Records'][i]['userIdentity']:
                user = content['Records'][i]['userIdentity']['userName']
            if 'responseElements' in content['Records'][i]:
                res_ele = content['Records'][i]['responseElements']
                if res_ele:
                    if 'instancesSet' in content['Records'][i]['responseElements']:
                        if 'items' in content['Records'][i]['responseElements']['instancesSet']:
                            instance_id = content['Records'][i]['responseElements']['instancesSet']['items'][0]['instanceId']
                            if (event == "RunInstances" and instance_id != ""):
                                open('result.txt', 'ab').write(event + ": :" + user + ": :" + instance_id + "\n")

# result.txt is stored in your current working directory.
I'm accessing EC2 with the aws-sdk for Ruby. I have an array of instances from describe_instances().
This provides me with the state of the instances and even a state transition reason. But how can I get a time for the state transition?
Edit
So I have:
client = Aws::EC2::Client.new
resp = client.describe_instances(filters: filters)
and I would need
resp.reservations[0].instances[0].state_transition_time #=> Time
similar to
resp.reservations[0].instances[0].state_transition_reason #=> String
This information is not available via the Amazon EC2 API at this time. The aws-sdk gem returns all of the information available from the DescribeInstances operation as documented here: http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html
The State Transition Reason is not always populated with a date and time and may not even be populated at all per the documentation. I have not found any hints in the documentation that specify the conditions in which you DO get a date/time, but in my experience, the date/time are present in the State Transition Reason for between 30 and 90 days. After that, the reason seems to persist, but the date is dropped from the string.
All of the documentation that I can find is listed here:
Attribute Definition
EC2 API - Ruby
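Since the API does not expose a transition time directly, one workaround (sketched here in Python with boto3, though the same response field exists in the Ruby SDK) is to pull the timestamp that is sometimes embedded in the state transition reason, e.g. "User initiated (2018-03-01 12:34:56 GMT)"; that string format is an observation, not a documented contract:

import re
import boto3

ec2 = boto3.client('ec2')

# The "(YYYY-MM-DD HH:MM:SS GMT)" suffix is an observed format in
# StateTransitionReason, not something AWS guarantees
timestamp_pattern = re.compile(r'\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) GMT\)')

for reservation in ec2.describe_instances()['Reservations']:
    for instance in reservation['Instances']:
        reason = instance.get('StateTransitionReason', '')
        match = timestamp_pattern.search(reason)
        when = match.group(1) if match else None
        print(instance['InstanceId'], instance['State']['Name'], when)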