Share JPEG file stored on S3 via URL instead of downloading - amazon-web-services

I recently completed this AWS tutorial on creating a thumbnail generator with Lambda and S3: https://docs.aws.amazon.com/lambda/latest/dg/with-s3-tutorial.html . Basically, I upload an image file to my '-source' bucket, and Lambda generates a thumbnail and uploads it to my '-thumbnail' bucket.
Everything works as expected. However, I want to use the S3 object URL from the '-thumbnail' bucket so that I can load the image in a small app I'm building. The issue is that the URL doesn't display the image in the browser but downloads the file instead. This causes my app to error out.
I did some research and learned that I had to change the Content-Type to image/jpeg and make the object public using an ACL. This works for all of my other buckets except the one that holds the thumbnails. I have recreated this bucket several times, copied the settings from my existing buckets, and compared its settings with all the other buckets; they appear to be the same.
Has anyone run into this type of issue before, or is there something I might be missing?
Here is the code I'm using to generate the thumbnail.
import boto3
from boto3.dynamodb.conditions import Key, Attr
import os
import sys
import uuid
import urllib.parse
from urllib.parse import unquote_plus
from PIL.Image import core as _imaging
import PIL.Image
s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['DB_TABLE_NAME'])
def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    recordId = key
    tmpkey = key.replace('/', '')
    download_path = '/tmp/{}{}'.format(uuid.uuid4(), tmpkey)
    upload_path = '/tmp/resized-{}'.format(tmpkey)
    try:
        s3.download_file(bucket, key, download_path)
        resize_image(download_path, upload_path)
        bucket = bucket.replace('source', 'thumbnail')
        s3.upload_file(upload_path, bucket, key)
        print(f"Thumbnail created and uploaded to {bucket} successfully.")
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
    else:
        s3.put_object_acl(ACL='public-read',
                          Bucket=bucket,
                          Key=key)
        # create image url to add to dynamo
        url = f"https://postreader-thumbnail.s3.us-west-2.amazonaws.com/{key}"
        print(url)
        # create record id to update the appropriate record in the 'Posts' table
        recordId = key.replace('.jpeg', '')
        # add the image_url column along with the image url as the value
        table.update_item(
            Key={'id': recordId},
            UpdateExpression=
                "SET #statusAtt = :statusValue, #img_urlAtt = :img_urlValue",
            ExpressionAttributeValues=
                {':statusValue': 'UPDATED', ':img_urlValue': url},
            ExpressionAttributeNames=
                {'#statusAtt': 'status', '#img_urlAtt': 'img_url'},
        )
def resize_image(image_path, resized_path):
    with PIL.Image.open(image_path) as image:
        # change to standard/hard-coded size
        image.thumbnail(tuple(x / 2 for x in image.size))
        image.save(resized_path)

This can happen if the object is stored with a Content-Type of binary/octet-stream, which is what S3 assigns when no content type is supplied on upload. You can modify your script as shown below to set the content type explicitly when uploading:
s3.upload_file(upload_path, bucket, key, ExtraArgs={'ContentType': 'image/jpeg'})
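If the function ever handles more than JPEG files, the content type could be guessed from the key instead of being hard-coded. The following is only a sketch: build_extra_args is a hypothetical helper name, and it relies on Python's standard mimetypes module rather than anything S3-specific.

```python
import mimetypes


def build_extra_args(key, default="binary/octet-stream"):
    """Return an ExtraArgs dict for upload_file with a guessed ContentType.

    build_extra_args is a hypothetical helper, not part of boto3.
    """
    # guess_type looks only at the file extension, not at the file contents.
    content_type, _ = mimetypes.guess_type(key)
    return {"ContentType": content_type or default}


# Usage with the names from the question (s3, upload_path, bucket, key):
# s3.upload_file(upload_path, bucket, key, ExtraArgs=build_extra_args(key))
```

Keys without a recognizable extension fall back to the default, so those objects would still download rather than display.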

After more troubleshooting, the issue was apparently related to the bucket's name. I created a new bucket with a different name from the one I had used previously, and after doing so I was able to upload and share images without issue.
I edited my code so that the Lambda uploads to the new bucket, and I am now able to share the image via URL without it downloading.

Related

Lambda task timeout but no application log

I have a Python Lambda function that is triggered by S3 uploads to a specific folder. The function processes the uploaded file and writes the output to another folder in the same S3 bucket.
The issue is that when I do a bulk upload using the AWS console, some files do not get processed. I ended up setting up a dead-letter queue to catch these invocations. The message in the queue contains a request ID, which I tried to find in the Lambda logs.
These are the logs for the request ID:
Now the odd part is that the first line after the imports in the Python code is print('Loading function'), yet it does not show up in the Lambda log.
I have added the Python code here. It should still print "Processing file name: " + key, which is inside the handler, right?
import urllib.parse
from datetime import datetime
import boto3
from constants import CONTENT_TYPE, XML_EXTENSION, VALIDATING
from xml_process import *
from s3Integration import download_file
print('Loading function')
s3 = boto3.client('s3')
def lambda_handler(event, context):
    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    print("Processing file name: " + key)
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        xml_content = response["Body"].read()
        content_type = response["ContentType"]
        tree = ET.fromstring(xml_content)
        key_file_name = key.split("/")[1]
        # Creating a temporary copy by downloading file to get the namespaces
        temp_file_name = "/tmp/" + key_file_name
        download_file(key, temp_file_name)
        namespaces = {node[0]: node[1] for _, node in ET.iterparse(temp_file_name, events=['start-ns'])}
        for name, value in namespaces.items():
            ET.register_namespace(name, value)
        # Preparing path for file processing
        processed_file = key_file_name.split(".")[0] + "_processed." + key_file_name.split(".")[1]
        print(processed_file, "processed")
        db_record = XMLMapping(file_path=key,
                               processed_file_path=processed_file,
                               uploaded_by="lambda",
                               status=VALIDATING, uploaded_date=datetime.now(), is_active=True)
        session.add(db_record)
        session.commit()
        if key_file_name.split(".")[1] == XML_EXTENSION:
            if content_type in CONTENT_TYPE:
                xml_parse(tree, db_record, processed_file, True)
            else:
                print("Content Type is not valid. Provided value: ", content_type)
        else:
            print("File extension is not valid. Provided extension: ", key_file_name.split(".")[1])
        return "success"
    except Exception as e:
        print(e)
        raise e
I don't think it's a permission issue, as other files uploaded in the same batch were processed successfully.
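To make it easier to match a dead-letter-queue message against the logs, one option is to print the request ID and the remaining execution time as the very first statement of the handler. This is only a diagnostic sketch, not a fix: log_invocation_start is a hypothetical helper name, while aws_request_id and get_remaining_time_in_millis() are the attributes the Lambda Python runtime exposes on the context object.

```python
def log_invocation_start(event, context):
    """Print one line per invocation that can be matched against a DLQ message.

    log_invocation_start is a hypothetical helper, not part of any AWS library.
    """
    # Collect every object key in the event so no record goes unlogged.
    keys = [r["s3"]["object"]["key"] for r in event.get("Records", [])]
    line = "request_id={} remaining_ms={} keys={}".format(
        context.aws_request_id,
        context.get_remaining_time_in_millis(),
        keys,
    )
    print(line)
    return line


# Usage: call it as the first statement of the handler.
# def lambda_handler(event, context):
#     log_invocation_start(event, context)
#     ...
```

If this line is also missing for the failed request IDs, the invocation most likely never reached the handler code at all.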

Copying S3 objects from one account to other using Lambda python

I'm using boto3 to copy files from an S3 bucket in one account to a bucket in another. I need functionality similar to aws s3 sync. Please see my code below. My company has decided to 'PULL' from the other S3 bucket (the source account). Please don't suggest replication, S3 Batch, an S3-triggered Lambda, etc. We have gone through all of these options, and my management does not want to do any configuration on the source side. Can you please review this code and let me know whether it will work for thousands of objects? The source bucket has nearly 10,000 objects. We will create this Lambda function in the destination account and create a CloudWatch event to trigger it once a day.
I am checking the ETag so that modified files are copied across when this function is triggered.
Edit: I simplified my code just to see whether pagination works. It works if I don't add client.copy(). If I add this line in the for loop, then after reading 3 or 4 objects it throws "errorMessage": "2021-08-07T15:29:07.827Z 82757747-7b72-4f29-ae9f-22e95f969d6c Task timed out after 3.00 seconds". Please advise. Note that the 'test/' folder in my source bucket has around 1,100 objects.
import boto3  # missing from the original snippet; needed for boto3.client below
import os
import logging
import botocore
logger = logging.getLogger()
logger.setLevel(os.getenv('debug_level', 'INFO'))
client = boto3.client('s3')
def handler(event, context):
    main(event, logger)
def main(event, logger):
    try:
        SOURCE_BUCKET = os.environ.get('SRC_BUCKET')
        DEST_BUCKET = os.environ.get('DST_BUCKET')
        REGION = os.environ.get('REGION')
        prefix = 'test/'
        # Create a reusable Paginator
        paginator = client.get_paginator('list_objects_v2')
        print('after paginator')
        # Create a PageIterator from the Paginator
        page_iterator = paginator.paginate(Bucket=SOURCE_BUCKET, Prefix=prefix)
        print('after page iterator')
        index = 0
        for page in page_iterator:
            # 'Contents' is absent when a page has no objects
            for obj in page.get('Contents', []):
                index += 1
                print("I am looking for {} in the source bucket".format(obj['ETag']))
                copy_source = {'Bucket': SOURCE_BUCKET, 'Key': obj['Key']}
                client.copy(copy_source, DEST_BUCKET, obj['Key'])
        logger.info("number of objects copied {}:".format(index))
    except botocore.exceptions.ClientError as e:
        raise
This version works fine if I increase the Lambda timeout to 15 minutes and the memory to 512 MB. It checks whether the source object already exists in the destination before copying.
import boto3
import os
import logging
import botocore
from botocore.client import Config
logger = logging.getLogger()
logger.setLevel(os.getenv('debug_level', 'INFO'))
config = Config(connect_timeout=5, retries={'max_attempts': 0})
client = boto3.client('s3', config=config)
#client = boto3.client('s3')
def handler(event, context):
    main(event, logger)
def main(event, logger):
    try:
        DEST_BUCKET = os.environ.get('DST_BUCKET')
        SOURCE_BUCKET = os.environ.get('SRC_BUCKET')
        REGION = os.environ.get('REGION')
        prefix = ''
        # Create a reusable Paginator
        paginator = client.get_paginator('list_objects_v2')
        print('after paginator')
        # Create a PageIterator from the Paginator
        page_iterator_src = paginator.paginate(Bucket=SOURCE_BUCKET, Prefix=prefix)
        page_iterator_dest = paginator.paginate(Bucket=DEST_BUCKET, Prefix=prefix)
        print('after page iterator')
        index = 0
        for page_source in page_iterator_src:
            for obj_src in page_source['Contents']:
                flag = "FALSE"
                for page_dest in page_iterator_dest:
                    for obj_dest in page_dest['Contents']:
                        # checks if source ETag already exists in destination
                        if obj_src['ETag'] in obj_dest['ETag']:
                            flag = "TRUE"
                            break
                    if flag == "TRUE":
                        break
                if flag != "TRUE":
                    index += 1
                    client.copy_object(Bucket=DEST_BUCKET, CopySource={'Bucket': SOURCE_BUCKET, 'Key': obj_src['Key']}, Key=obj_src['Key'],)
                    print("source ETag {} and destination ETag {}".format(obj_src['ETag'], obj_dest['ETag']))
                    print("source Key {} and destination Key {}".format(obj_src['Key'], obj_dest['Key']))
        print("Number of objects copied{}".format(index))
        logger.info("number of objects copied {}:".format(index))
    except botocore.exceptions.ClientError as e:
        raise
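The nested loops above walk the destination listing again for every source object, which is probably why the run needs the longer timeout. One possible alternative, sketched here under the same ETag-only matching rule, is to list the destination once, keep its ETags in a set, and then make a single pass over the source. The names list_objects and keys_to_copy are hypothetical helpers; the client is passed in so that the comparison logic can be checked without AWS access.

```python
def list_objects(client, bucket, prefix=""):
    """Yield every object under prefix, following list_objects_v2 pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no objects.
        yield from page.get("Contents", [])


def keys_to_copy(source_objects, dest_objects):
    """Return keys of source objects whose ETag is not present in the destination."""
    dest_etags = {obj["ETag"] for obj in dest_objects}
    return [obj["Key"] for obj in source_objects if obj["ETag"] not in dest_etags]


# Usage with the names from the question (client, SOURCE_BUCKET, DEST_BUCKET):
# pending = keys_to_copy(list_objects(client, SOURCE_BUCKET),
#                        list_objects(client, DEST_BUCKET))
# for key in pending:
#     client.copy_object(Bucket=DEST_BUCKET, Key=key,
#                        CopySource={"Bucket": SOURCE_BUCKET, "Key": key})
```

As in the original code, an object counts as already copied whenever any destination object has the same ETag, regardless of its key. Comparing (Key, ETag) pairs instead would be a stricter variation.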

AWS Lambda : read image from S3 upload event

I am using Lambda to read image files when they are uploaded to S3, through an S3 trigger. The following is my code:
import json
import numpy as np
import face_recognition as fr
def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        print(bucket, key)
This correctly prints the bucket name and key. However, how do I read the image so that I can run the face-recognition module on it? Can I generate the ARN for each uploaded image and use it to read the image?
You can read the image from S3 directly:
import boto3
s3 = boto3.client('s3')
resp = s3.get_object(Bucket=bucket, Key=key)
image_bytes = resp['Body'].read()
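Putting this together with the handler from the question could look roughly like the sketch below. The helper names iter_s3_objects and read_image_stream are hypothetical. Keys in S3 event notifications are URL-encoded, so the sketch decodes them with unquote_plus before calling get_object, and the S3 client is passed in as a parameter so the helpers can be exercised without AWS access.

```python
import io
from urllib.parse import unquote_plus


def iter_s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event, URL-decoding each key."""
    for record in event.get("Records", []):
        s3_info = record["s3"]
        yield s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])


def read_image_stream(s3_client, bucket, key):
    """Download one object and wrap its bytes in a seekable file-like object."""
    resp = s3_client.get_object(Bucket=bucket, Key=key)
    return io.BytesIO(resp["Body"].read())


# Usage inside the handler, with s3 = boto3.client('s3'):
# for bucket, key in iter_s3_objects(event):
#     stream = read_image_stream(s3, bucket, key)
#     # Assumption: the image loader accepts a file-like object.
```

Whether the stream can be handed straight to the face-recognition loader depends on that library accepting file objects, which is an assumption here and worth checking against the installed version. No ARN is needed for the read, only the bucket name and key.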

Uploading an image to a boto bucket

I am trying to upload an image that I retrieve from a Django form to Amazon S3. But every time I save, it gets saved in first_part/second_part/third_part/amazon-sw/(required image) instead of first_part/second_part/third_part.
I use the tinys3 library. I tried boto but found it a little complex to use, so I used tinys3 instead. Please help me out.
access_key = aws_details.AWS_ACCESS_KEY_ID
secret_key = aws_details.AWS_SECRET_ACCESS_KEY
bucket_name = "s3-ap-southeast-1.amazonaws.com/first_part/second_part/third_part/"
myfile = request.FILES['image'] # getting the image from html view
fs = FileSystemStorage()
fs.save('demo_blah_blah.png', myfile) # saving the image
conn = tinys3.Connection(access_key, secret_key, tls=True, endpoint='s3-ap-southeast-1.amazonaws.com') # connecting to the bucket
f = open('demo_blah_blah.png', 'rb')
conn.upload('test_pic10000.png', f, bucket_name) # uploading to boto using tinys3 library

Lambda get image from s3

I am trying to get an image from my S3 bucket and return it for use in my API gateway.
Permissions are set correctly.
import boto3
s3 = boto3.resource('s3')
def handler(event, context):
    image = s3.meta.client.download_file('mybucket', 'email-sig/1.png', '/tmp/1.png')
    return image
However, I am getting a null return and cannot figure out how to get the image. Is this the correct approach, and why is it not returning my image?
You are downloading the image file to /tmp/1.png. What you are returning is the return value of download_file(), which is None (null). What data type does your API Gateway expect?
I have images in an S3 bucket and needed to return them from an API.
The approach is to fetch the image, encode it to base64, and return the base64 string.
The caller then decodes that base64 string to get the image back.
To check the result, take the base64 string the code returns, search for 'base64 to image' in your browser,
and paste the string into one of those tools; you will get your S3 bucket image.
The following code may help someone.
import boto3
import base64
def lambda_handler(event, context):
    user_download_img = 'Name Of Your Image in S3'
    print('user_download_img ==> ', user_download_img)
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('Your-Bucket-Name')
    obj = bucket.Object(key=user_download_img)  # pass your image name as the key
    response = obj.get()  # get the object
    img = response['Body'].read()  # read the image bytes
    print(type(img))  # bytes
    # Encode the image to base64 and decode the result to a plain str.
    # This gives the same string as stripping "b'" and "'" from str(bytes).
    encoded_image = base64.b64encode(img).decode('utf-8')
    print('encoded_image ========================>', encoded_image)
    return {
        'status': 'True',
        'statusCode': 200,
        'message': 'Downloaded profile image',
        'encoded_image': encoded_image  # base64 of the image stored in your S3 bucket
    }
Now go to API gateway and create your API.
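If the API uses Lambda proxy integration, another option is to return the base64 string as the response body and flag it with isBase64Encoded, so that API Gateway can hand the client the decoded image instead of JSON. This is a sketch under two assumptions: proxy integration is enabled, and the image content type is registered as a binary media type on the API. build_image_response is a hypothetical helper name.

```python
import base64


def build_image_response(image_bytes, content_type="image/png"):
    """Build a Lambda proxy integration response carrying a binary image.

    build_image_response is a hypothetical helper; the dict keys follow the
    Lambda proxy integration response format.
    """
    return {
        "statusCode": 200,
        "headers": {"Content-Type": content_type},
        # The body must be a str, so the base64 bytes are decoded to text.
        "body": base64.b64encode(image_bytes).decode("utf-8"),
        "isBase64Encoded": True,
    }


# Usage inside the handler above, where img holds the bytes read from S3:
# return build_image_response(img, content_type="image/png")
```

Without the binary media type setting, the client would most likely receive the base64 text rather than the image itself.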