How to change storage class of existing key via boto3

When using Amazon S3, I need to change the storage class of an existing key from STANDARD to STANDARD_IA.
The change_storage_class method from boto doesn't exist in boto3.
What is the equivalent in boto3?

From the Amazon S3 documentation:
You can also change the storage class of an object that is already stored in Amazon S3 by copying it to the same key name in the same bucket. To do that, you use the following request headers in a PUT Object copy request:
x-amz-metadata-directive set to COPY
x-amz-storage-class set to STANDARD, STANDARD_IA, or REDUCED_REDUNDANCY
In terms of boto3 code, this looks like:
import boto3
s3 = boto3.client('s3')
copy_source = {
    'Bucket': 'mybucket',
    'Key': 'mykey'
}
# Copy the object onto itself, changing only the storage class
s3.copy(
    copy_source, 'mybucket', 'mykey',
    ExtraArgs={
        'StorageClass': 'STANDARD_IA',
        'MetadataDirective': 'COPY'
    }
)
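If the copy-in-place call is going to run over many keys, it may help to wrap it in a small helper that skips objects already in the target class. The following is only a sketch: change_storage_class is a hypothetical name, the client is passed in so the function can be exercised without AWS access, and it relies on head_object leaving out StorageClass for objects stored as STANDARD.

```python
def change_storage_class(s3_client, bucket, key, storage_class="STANDARD_IA"):
    """Rewrite an object in place under a new storage class.

    Returns True if a copy was issued, False if the object was already
    in the requested class. (Hypothetical helper, not part of boto3.)
    """
    head = s3_client.head_object(Bucket=bucket, Key=key)
    # head_object omits StorageClass for objects stored as STANDARD.
    current = head.get("StorageClass", "STANDARD")
    if current == storage_class:
        return False
    # Same bucket and key as source and destination: only the class changes.
    s3_client.copy(
        {"Bucket": bucket, "Key": key},
        bucket,
        key,
        ExtraArgs={"StorageClass": storage_class, "MetadataDirective": "COPY"},
    )
    return True
```

With a real client this would be called as change_storage_class(boto3.client('s3'), 'mybucket', 'mykey').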

Related

Boto3: copy objects within bucket

Is it possible to copy/duplicate objects within one prefix to another prefix in the same s3 bucket?
You can use copy_object() to copy an object in Amazon S3 to another prefix, another bucket and even another Region. The copying takes place entirely within S3, without needing to download/upload the object.
For example, to copy an object in mybucket from folder1/foo.txt to folder2/foo.txt, you could use:
import boto3
s3_client = boto3.client('s3')
response = s3_client.copy_object(
    CopySource='/mybucket/folder1/foo.txt',  # /Bucket-name/path/filename
    Bucket='mybucket',                       # Destination bucket
    Key='folder2/foo.txt'                    # Destination path/filename
)
An alternative using boto3 resource instead of client:
bucket = boto3.resource("s3").Bucket(my_bucket_name)
copy_source = {"Bucket": my_bucket_name, "Key": my_old_key}
bucket.copy(copy_source, my_new_key)
Where my_bucket_name, my_old_key and my_new_key are user defined variables.
Depending on the setup, additional arguments might be needed to instantiate a boto3 resource. A more complete instantiation call would be:
boto3.resource(
    "s3",
    endpoint_url=my_endpoint_url,
    aws_access_key_id=my_aws_access_key_id,  # Do not expose me in source code!
    aws_secret_access_key=my_aws_secret_access_key,  # Do not expose me in source code!
)
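Both answers above copy a single object, while the question asks about all objects under a prefix. One possible way to extend them is to list the keys under the old prefix and copy each one. This is a sketch only: copy_prefix is a hypothetical helper, the client is passed in as a parameter, and it assumes the prefixes are given with their trailing '/'.

```python
def copy_prefix(s3_client, bucket, old_prefix, new_prefix):
    """Copy every object under old_prefix to the same relative path under new_prefix.

    Returns the list of destination keys. (Hypothetical helper.)
    """
    copied = []
    # A paginator is used because a single list call returns a limited page of keys.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=old_prefix):
        # "Contents" is absent when a page has no matching keys.
        for obj in page.get("Contents", []):
            old_key = obj["Key"]
            new_key = new_prefix + old_key[len(old_prefix):]
            s3_client.copy({"Bucket": bucket, "Key": old_key}, bucket, new_key)
            copied.append(new_key)
    return copied
```

For the example above this would be copy_prefix(boto3.client('s3'), 'mybucket', 'folder1/', 'folder2/').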

AWS lambda with dynamic trigger for S3 buckets

I have an S3 bucket named "archive_A". I have created a Lambda function (Python) that is triggered whenever an object is created or permanently deleted in that bucket; it collects the object's metadata and inserts it into DynamoDB.
For archive_A, I added the two triggers to my Lambda function manually via the GUI, one for "creation" and another for "permanently delete".
import boto3
from uuid import uuid4
def lambda_handler(event, context):
    s3 = boto3.client("s3")
    dynamodb = boto3.resource('dynamodb')
    for record in event['Records']:
        bucket_name = record['s3']['bucket']['name']
        object_key = record['s3']['object']['key']
        # Fall back to -1 when the event carries no size
        size = record['s3']['object'].get('size', -1)
        event_name = record['eventName']
        event_time = record['eventTime']
        dynamoTable = dynamodb.Table('S3metadata')
        dynamoTable.put_item(
            Item={'Resource_id': str(uuid4()), 'Bucket': bucket_name, 'Object': object_key, 'Size': size, 'Event': event_name, 'EventTime': event_time})
In the future there could be more S3 buckets, such as archive_B, archive_C, etc. In that case I would have to keep adding triggers manually for each bucket, which is a bit cumbersome.
Is there a dynamic way to add triggers to the Lambda function for S3 buckets named "archive_*", so that any future bucket such as "archive_G" gets its triggers added automatically?
Please suggest. I am quite new to AWS, so an example would be easier to follow.
There is no built-in way to automatically add triggers for new buckets.
You could probably create an Amazon EventBridge rule that triggers on CreateBucket and calls an AWS Lambda function with details of the new bucket.
That Lambda function could then programmatically add a trigger on your existing Lambda function.
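The second Lambda function described above might look roughly like the sketch below. Several things here are assumptions rather than anything stated in the answer: the function names are hypothetical, the event is assumed to be a CloudTrail-recorded CreateBucket call delivered by EventBridge (with the bucket name under detail.requestParameters.bucketName), and the existing metadata function is assumed to already have a resource policy that lets S3 invoke it.

```python
import fnmatch


def build_notification_config(function_arn):
    """Notification settings matching the two manual triggers from the question."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Object creation and permanent deletion.
                "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:Delete"],
            }
        ]
    }


def on_bucket_created(event, s3_client, function_arn, pattern="archive_*"):
    """Attach the trigger if the new bucket's name matches the pattern.

    Returns True when a notification configuration was written.
    """
    # Assumed event shape: CloudTrail CreateBucket event via EventBridge.
    bucket = event["detail"]["requestParameters"]["bucketName"]
    if not fnmatch.fnmatchcase(bucket, pattern):
        return False
    # This call replaces any existing notification configuration on the
    # bucket, which is acceptable here because the bucket was just created.
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(function_arn),
    )
    return True
```

Inside a real handler, on_bucket_created would be called with boto3.client('s3') and the ARN of the existing metadata function.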

How to replicate objects arriving in Amazon S3 via an AWS Lambda function

I want to replicate a few prefixes into another bucket.
I do not want to use sync or the replication service; I want to do it with Lambda only. I have an ongoing migration, so data arrives every 20 minutes, and no transformation is required.
Can someone help me use AWS Lambda to sync and copy the objects?
You can configure a Trigger for an AWS Lambda function so that it is invoked when an object is added to the Amazon S3 bucket. The function will receive information about the object that triggered the function, including its Bucket and Key.
You should code the Lambda function to call CopyObject() to copy that object to the desired destination bucket. Here's an example in Python:
import boto3
import urllib.parse
def lambda_handler(event, context):
    TARGET_BUCKET = 'my-target-bucket'
    # Get the bucket and object key from the Event
    bucket = event['Records'][0]['s3']['bucket']['name']
    # Keys arrive URL-encoded in the event, so decode them first
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    # Copy object
    s3_client = boto3.client('s3')
    s3_client.copy_object(
        Bucket=TARGET_BUCKET,
        Key=key,
        CopySource={'Bucket': bucket, 'Key': key}
    )
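The example above reads only the first record and copies every key. Since the question asks about a few specific prefixes, a variation could loop over all records and skip keys outside those prefixes. This is a sketch under assumptions: replicate_records is a hypothetical helper, the prefix values are placeholders, and the client is passed in rather than created inside the function.

```python
import urllib.parse


def replicate_records(event, s3_client, target_bucket, prefixes):
    """Copy each object named in an S3 event whose key starts with one of prefixes.

    Returns the list of keys that were copied. (Hypothetical helper.)
    """
    copied = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the event ('+' stands for a space).
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if not key.startswith(tuple(prefixes)):
            continue  # Outside the prefixes we want to replicate
        s3_client.copy_object(
            Bucket=target_bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
        )
        copied.append(key)
    return copied
```

A handler could then call replicate_records(event, boto3.client('s3'), 'my-target-bucket', ['data/', 'images/']), where the two prefixes are made-up examples.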

Generate presigned s3 URL of latest object in the bucket using boto3

I have an S3 bucket with multiple folders. Using Python boto3, how can I generate a presigned URL for the latest object in whichever folder a user asks for?
You can do something like:
import boto3
from botocore.client import Config
import requests
bucket = 'bucket-name'
folder = ''  # '' searches the whole bucket; for a folder use e.g. 'myfolder/' (keep the trailing '/')
s3 = boto3.client('s3', config=Config(signature_version='s3v4'))
objs = s3.list_objects(Bucket=bucket, Prefix=folder)['Contents']
latest = max(objs, key=lambda x: x['LastModified'])
print(latest)
print("Generating pre-signed url...")
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': bucket,
        'Key': latest['Key']
    }
)
print(url)
response = requests.get(url)
print(response.url)
This gives the most recently modified file in the whole bucket; you can adjust the logic and the prefix value as needed.
If you are running in a Kubernetes pod, a VM, or similar, you can pass the values in through environment variables, or use a Python dict to store the latest key if required.
If it's a small bucket then recursively list the bucket, with prefix as needed. Sort the results by timestamp, and create the pre-signed URL for the latest.
If it's a very large bucket, this will be very inefficient and you should consider other ways to store the key of the latest file. For example: trigger a Lambda function whenever an object is uploaded and write that object's key into a LATEST item in DynamoDB (or other persistent store).
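For the small-bucket approach, one thing to watch is that a single list call returns only one page of keys, so a folder with many objects needs pagination before picking the newest. A possible sketch follows; latest_key and presign_latest are hypothetical names, the client is passed in, and the one-hour expiry is just an illustrative default.

```python
def latest_key(s3_client, bucket, prefix=""):
    """Return the key with the newest LastModified under prefix, or None if empty."""
    latest = None
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no matching keys.
        for obj in page.get("Contents", []):
            if latest is None or obj["LastModified"] > latest["LastModified"]:
                latest = obj
    return latest["Key"] if latest else None


def presign_latest(s3_client, bucket, prefix="", expires_in=3600):
    """Presigned GET URL for the newest object under prefix, or None if empty."""
    key = latest_key(s3_client, bucket, prefix)
    if key is None:
        return None
    return s3_client.generate_presigned_url(
        ClientMethod="get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # Lifetime of the URL in seconds
    )
```

Calling presign_latest(s3, 'bucket-name', 'myfolder/') once per requested folder would cover the per-folder case from the question.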

Boto3 get only S3 buckets of specific region

The following code sadly lists all buckets of all regions and not only from "eu-west-1" as specified. How can I change that?
import boto3
s3 = boto3.client("s3", region_name="eu-west-1")
for bucket in s3.list_buckets()["Buckets"]:
    bucket_name = bucket["Name"]
    print(bucket["Name"])
The call s3 = boto3.client("s3", region_name="eu-west-1") connects to the S3 API endpoint in eu-west-1. It doesn't limit the listing to eu-west-1 buckets. One solution is to query each bucket's location and filter:
s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    if s3.get_bucket_location(Bucket=bucket['Name'])['LocationConstraint'] == 'eu-west-1':
        print(bucket["Name"])
If you need a one liner using Python's list comprehension:
region_buckets = [bucket["Name"] for bucket in s3.list_buckets()["Buckets"] if s3.get_bucket_location(Bucket=bucket['Name'])['LocationConstraint'] == 'eu-west-1']
print(region_buckets)
The solution above does not work for buckets in us-east-1, because their 'LocationConstraint' is null. Here is another solution:
s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    if s3.head_bucket(Bucket=bucket['Name'])['ResponseMetadata']['HTTPHeaders']['x-amz-bucket-region'] == 'us-east-1':
        print(bucket["Name"])
The SDK method:
s3.head_bucket(Bucket=[INSERT_BUCKET_NAME_HERE])['ResponseMetadata']['HTTPHeaders']['x-amz-bucket-region']
... should always give you the bucket region. Thanks to sd65 for the tip: https://github.com/boto/boto3/issues/292
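If you would rather stay with get_bucket_location, another option is to normalize its return value before comparing. The sketch below uses hypothetical helper names and takes the client as a parameter. It maps a null LocationConstraint to us-east-1 and, on the assumption that the legacy value 'EU' can still be returned for older eu-west-1 buckets, maps that to eu-west-1.

```python
def bucket_region(s3_client, bucket_name):
    """Return a bucket's region name, normalizing get_bucket_location's quirks."""
    location = s3_client.get_bucket_location(Bucket=bucket_name)["LocationConstraint"]
    if location is None:
        return "us-east-1"  # Null LocationConstraint means us-east-1
    if location == "EU":
        return "eu-west-1"  # Assumed legacy alias for eu-west-1
    return location


def buckets_in_region(s3_client, region):
    """Names of all buckets whose normalized region equals region."""
    names = [b["Name"] for b in s3_client.list_buckets()["Buckets"]]
    return [name for name in names if bucket_region(s3_client, name) == region]
```

For the original question this would be buckets_in_region(boto3.client('s3'), 'eu-west-1'). Note that it makes one extra API call per bucket.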