How to programmatically add files to AWS S3 Bucket? - amazon-web-services

I am trying to write a Lambda function that captures an image of my PC's webcam feed every time a trigger occurs. I want to programmatically add the images to an S3 bucket without overwriting them by reusing the same key (like "image.jpg"). What's the best way to increment the filename every time the function is called (e.g. image1.jpg, image2.jpg, etc.)? Note: I am using Boto3 to upload to S3 buckets.

You can store a current counter in DynamoDB or in Systems Manager Parameter Store.
Or just use a timestamp with enough resolution and be done with it.
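If you go the counter route, a minimal sketch of an atomic counter using DynamoDB's update_item might look like this (the table name, key attribute, and bucket name are placeholders, not something from your setup):

import boto3

dynamodb = boto3.client('dynamodb')
s3 = boto3.client('s3')

def next_image_key():
    # Atomically increment the counter and return the new value
    response = dynamodb.update_item(
        TableName='image-counter',              # hypothetical table
        Key={'id': {'S': 'webcam-counter'}},    # assumes a string partition key named 'id'
        UpdateExpression='ADD n :one',
        ExpressionAttributeValues={':one': {'N': '1'}},
        ReturnValues='UPDATED_NEW'
    )
    count = response['Attributes']['n']['N']
    return f'image{count}.jpg'

# s3.upload_file('/tmp/image.jpg', 'my-bucket', next_image_key())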

An easy way to do this is to add the date and time to the name of the image when uploading to S3; that way you will always have a new key name in your bucket. The code below will do the job.
import boto3
import datetime

# Embed the current timestamp in the key so every upload gets a unique name
now = str(datetime.datetime.now())
key = 'group1' + now + '.jpeg'

s3 = boto3.resource('s3')
s3.meta.client.upload_file('local/file/group1.jpeg', 'bucket_name', key)

Related

When uploading a file into aws s3 with boto3 is it possible to get the s3 object url as a return value?

I am uploading an image file to AWS S3 using the boto3 library. I noticed that the end of the S3 object URL does not match the given Key. Is it possible to get the S3 object URL as a return value from the boto3 upload_file function?
example:
import boto3

s3 = boto3.client('s3')
file_location = ...
bucket = ...
folder = ...
filename = ...

url = s3.upload_file(
    Filename=file_location,
    Bucket=bucket,
    Key=f'{folder}/{filename}',
)
I read in the docs that it might be possible with a callback function, but I could not get it working with boto3.
If not, what is the simplest way to get the uploaded object's URL?
Using the AWS SDK, you can get a URL for an object in an Amazon S3 bucket. I am not sure there is a Python example for this use case; however, you can get an idea of how to perform this task by looking at the Java example:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/javav2/example_code/s3/src/main/java/com/example/s3/GetObjectUrl.java
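In Python you can get something similar yourself: upload_file returns None, but since you already know the bucket and key you can build the URL, or ask boto3 for a pre-signed one. A rough sketch (bucket and key are placeholders):

import boto3
import urllib.parse

s3 = boto3.client('s3')
bucket = 'my-bucket'           # placeholder
key = 'folder/filename.jpg'    # placeholder

s3.upload_file(Filename='local.jpg', Bucket=bucket, Key=key)

# Option 1: build the object URL yourself; quoting the key percent-encodes
# characters such as '#' the same way AWS does in the console
region = s3.meta.region_name
url = f'https://{bucket}.s3.{region}.amazonaws.com/{urllib.parse.quote(key)}'

# Option 2: ask boto3 for a time-limited pre-signed URL
presigned = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': bucket, 'Key': key},
    ExpiresIn=3600
)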
Okay, my problem was that the filenames I was using to build the file Key contained a hash symbol ('#'), which has a specific meaning in a URL. AWS was automatically encoding the hash as %23, which created a mismatch between the Key and the URL. I have now changed the naming convention so the filenames contain no hash, and the problem no longer occurs.

AWS Lambda create folder in S3 bucket

I have a Lambda that runs when files are uploaded to bucket S3-A and moves those files to another bucket, S3-B. The challenge is that I need to create a folder inside the S3-B bucket named after the upload date of the files and move the files into that folder. Any help or ideas are greatly appreciated. It might sound confusing, so feel free to ask questions. Thank you!
Here's a Lambda function that can be triggered by an Amazon S3 Event and moves the object to another bucket:
import json
import urllib.parse
from datetime import date
import boto3

DEST_BUCKET = 'bucket-b'

def lambda_handler(event, context):
    s3_client = boto3.client('s3')

    # Bucket and key of the object that triggered the event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Prefix the key with today's date so the object lands in a per-day "folder"
    dest_key = str(date.today()) + '/' + key

    s3_client.copy_object(
        Bucket=DEST_BUCKET,
        Key=dest_key,
        CopySource=f'{bucket}/{key}'
    )
The only thing to consider is timezones. The Lambda function runs in UTC and you might be expecting a slightly different date in your timezone, so you might need to adjust the time accordingly.
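For example, a small sketch of computing the date in a specific timezone instead of UTC (assuming Python 3.9+ for zoneinfo; the timezone and key are just examples):

from datetime import datetime
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Build the date prefix using a local timezone instead of the Lambda's UTC clock
local_date = datetime.now(ZoneInfo('Australia/Sydney')).date()
dest_key = str(local_date) + '/' + 'example.jpg'   # placeholder key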
Just to clear up some confusion: in S3 there is no such thing as a folder. What you see in the console is really a ListObjects call using a prefix, and the prefix is what appears as the folder hierarchy.
To illustrate, an object might have a key (a piece of metadata that defines its name) of folder/subfolder/file.txt; in the console you are actually browsing with a prefix of folder/subfolder/*. This makes sense if you think of S3 as a key-value store, where the value is the object itself.
For this reason you can write a key under a prefix that has never existed before without creating any other hierarchical features.
In your Lambda function, you would otherwise need to download the files locally and then upload them under their new object key (remembering to delete the old object). Some SDKs have a function that performs these steps for you (such as Boto3 with the copy function), as sketched below.
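A minimal sketch of that "move" with boto3 (the bucket and key names are placeholders):

import boto3

s3 = boto3.client('s3')

src_bucket, src_key = 'bucket-a', 'incoming/file.txt'        # placeholders
dest_bucket, dest_key = 'bucket-b', '2024-01-01/file.txt'    # placeholders

# Server-side copy to the new key, then delete the original to complete the "move"
s3.copy(
    CopySource={'Bucket': src_bucket, 'Key': src_key},
    Bucket=dest_bucket,
    Key=dest_key
)
s3.delete_object(Bucket=src_bucket, Key=src_key)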

Dynamically resizing images using AWS S3/Lambda

So I have a web app that stores images in a single bucket, following this principle: a folder named after the user ID, with pictures named user ID plus some random characters inside the respective user ID folder.
I have already implemented a Python script that takes an uploaded image from a single bucket (the root folder, or any folder I specify) and outputs a resized copy to another bucket/folder I specify. I'm just wondering if it's possible to do this in real time in my situation (I don't even need to export the resized pics to another bucket; they can stay in the same folder the original was uploaded to). This is part of the script I'm using right now. Any help is appreciated.
import uuid
import boto3
from PIL import Image

s3_client = boto3.client('s3')

def resize_image(image_path, resized_path):
    with Image.open(image_path) as image:
        image.thumbnail((128, 128))
        image.save(resized_path)

def handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        # Lambda only allows writes under /tmp
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
        upload_path = '/tmp/resized-{}'.format(key)
        s3_client.download_file(bucket, key, download_path)
        resize_image(download_path, upload_path)
        s3_client.upload_file(upload_path, '{}-resized'.format(bucket), key)
Ah! It looks like you grabbed the sample code from the Lambda documentation: Tutorial: Using AWS Lambda with Amazon S3 - AWS Lambda
You can configure an Amazon S3 Event to trigger the AWS Lambda function whenever a new object is added to the S3 bucket. In fact, that is how the tutorial operates. This is effectively "real-time" because it triggers as soon as an object is uploaded. (Just configure the prefixes so it doesn't trigger an infinite loop.)
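If you want to wire the trigger up programmatically rather than in the console, a sketch with a prefix filter might look like this (the bucket name, function ARN, and prefix are placeholders; writing the resized output under a different prefix is what avoids retriggering the function):

import boto3

s3 = boto3.client('s3')

# Note: this call replaces the bucket's existing notification configuration
s3.put_bucket_notification_configuration(
    Bucket='my-image-bucket',   # placeholder
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:resize',  # placeholder
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {
                'Key': {
                    'FilterRules': [
                        # Only trigger on originals, not on the resized output
                        {'Name': 'prefix', 'Value': 'uploads/'}
                    ]
                }
            }
        }]
    }
)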
An alternative to resizing the images yourself is to use a service that can resize on-the-fly, such as:
Cloudinary
Imgix

AWS S3: Notification for files in particular folder

In our S3 buckets we have a folder where incoming files are placed. One of our systems then picks them up and processes them.
I want to know how many files in this folder are older than a certain period and then send a notification to the corresponding team.
I.e. if a file arrived in the S3 bucket today and is still there after 3 hours, I want to be notified.
I am thinking of using the boto Python library to iterate through all the objects inside the S3 bucket at a scheduled interval to check which files are in the folder, and then send a notification. However, this polling solution doesn't seem good.
I am thinking of an event-based solution. I know S3 has events which I can subscribe to using either a queue or a Lambda. However, I don't want to take any action as soon as a file is available; I just want to check which files are older than some time and send an email notification.
Can we achieve this using an event-based solution?
We are expecting around 1,000 files per hour. Once a file is processed it is moved to a different folder; however, if something goes wrong it will remain there. So in a day I am not expecting more than 10,000 files in one bucket. Note that I have multiple buckets.
Iterating through S3 files to do that kind of filtering is not a good idea. It can get very slow when you have more than a thousand files in there. I would suggest using a database to store those records.
You can have a DynamoDB table with two attributes: file name and upload date. Or, if budget is a problem, you can even keep a sqlite3 file in the bucket and fetch it whenever you need to query or add data to it. I did this using Lambda, and it works just fine. Just don't forget to upload the file again when new records are inserted.
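For the DynamoDB route, a minimal sketch of recording each incoming file from the S3 event (the table name and attribute names are just placeholders):

import boto3
from datetime import datetime, timezone

table = boto3.resource('dynamodb').Table('incoming-files')   # hypothetical table

def lambda_handler(event, context):
    for record in event['Records']:
        table.put_item(Item={
            'file_name': record['s3']['object']['key'],
            'upload_date': datetime.now(timezone.utc).isoformat()
        })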
You could create an Amazon CloudWatch Events rule that triggers an AWS Lambda function at a desired time interval (e.g. every 5 minutes or once an hour).
The AWS Lambda function could list the desired folder looking for files older than a desired time period. It would be something like this:
import boto3
from datetime import datetime, timedelta, timezone

s3_client = boto3.client('s3')

paginator = s3_client.get_paginator('list_objects_v2')
page_iterator = paginator.paginate(
    Bucket='my-bucket',
    Prefix='to-be-processed/'
)

for page in page_iterator:
    for object in page.get('Contents', []):
        if object['LastModified'] < datetime.now(tz=timezone.utc) - timedelta(hours=3):
            # Print the name of any object older than the given age
            print(object['Key'])
You could then have it notify somebody. The easiest way would be to send a message to an Amazon SNS topic, and then people can subscribe to that topic via SMS or email to receive a notification.
Note that the above code is quite simple: it will report the same files on every run, not just the files that have newly crossed the age threshold since the previous notification.
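A sketch of that notification step with SNS, assuming a topic already exists (the topic ARN and key list are placeholders):

import boto3

sns = boto3.client('sns')

old_keys = ['to-be-processed/a.csv', 'to-be-processed/b.csv']   # collected by the loop above

sns.publish(
    TopicArn='arn:aws:sns:us-east-1:123456789012:stale-files',  # placeholder
    Subject='Files waiting longer than 3 hours',
    Message='\n'.join(old_keys)
)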

S3 bucket script to add timestamp in filename on upload

I'm looking for a way to add a timestamp to the filename of every file that is uploaded to an S3 bucket, on the Amazon side. There is, of course, the option of doing this client-side before the upload, but I don't think that is as nice and clean as having some script run in the bucket itself every time a new file is uploaded. I didn't find anything in the docs, though.
There is no capability within Amazon S3 to change the Key (filename) of a file based upon upload time.
Given that your desire is to avoid name conflicts, some choices are:
Use a unique GUID or a timestamp to name the file when uploading (see the sketch below). This will avoid naming conflicts.
Upload the file to Bucket A, then use a Lambda function triggered on ObjectCreated to copy the object to Bucket B with a unique name based on the timestamp.
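For the first option, a small sketch of naming the upload with a GUID (bucket, prefix, and file names are placeholders):

import uuid
import boto3

s3 = boto3.client('s3')

# A random UUID makes key collisions practically impossible
key = 'uploads/{}.jpg'.format(uuid.uuid4())
s3.upload_file('local/image.jpg', 'my-bucket', key)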
You could try a Lambda function handling the ObjectCreated event; see this tutorial.
I'm not sure that works, though.