I have an S3 bucket with multiple folders. How can I generate an S3 presigned URL for the latest object in each folder requested by a user, using Python boto3?
You can do something like this:
import boto3
from botocore.client import Config
import requests

bucket = 'bucket-name'
folder = ''  # add your folder prefix here, e.g. 'my-folder/' (keep the trailing '/')

s3 = boto3.client('s3', config=Config(signature_version='s3v4'))

# list_objects_v2 returns up to 1000 keys per call, which is fine for small prefixes
objs = s3.list_objects_v2(Bucket=bucket, Prefix=folder)['Contents']
latest = max(objs, key=lambda x: x['LastModified'])
print(latest)

print("Generating pre-signed url...")
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': bucket,
        'Key': latest['Key']
    }
)
print(url)

response = requests.get(url)
print(response.url)
This will give you the most recently modified file across the whole bucket; update the logic and the prefix value as needed.
If you are running in a Kubernetes pod, a VM, or anywhere else, you can pass the prefix in via environment variables, or keep a Python dict of the latest key per folder if required.
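If the user asks for several folders, a sketch along these lines generates one URL per folder (the requested_folders list is hypothetical user input, and a paginator is used so prefixes with more than 1000 objects are still handled):

import boto3
from botocore.client import Config

s3 = boto3.client('s3', config=Config(signature_version='s3v4'))
bucket = 'bucket-name'
requested_folders = ['folder-a/', 'folder-b/']  # hypothetical user input

def latest_presigned_url(bucket, prefix, expires=3600):
    """Return a presigned GET URL for the most recently modified object under prefix."""
    paginator = s3.get_paginator('list_objects_v2')
    latest = None
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            if latest is None or obj['LastModified'] > latest['LastModified']:
                latest = obj
    if latest is None:
        return None  # no objects under this prefix
    return s3.generate_presigned_url(
        ClientMethod='get_object',
        Params={'Bucket': bucket, 'Key': latest['Key']},
        ExpiresIn=expires,
    )

for folder in requested_folders:
    print(folder, latest_presigned_url(bucket, folder))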
If it's a small bucket then recursively list the bucket, with prefix as needed. Sort the results by timestamp, and create the pre-signed URL for the latest.
If it's a very large bucket, this will be very inefficient and you should consider other ways to store the key of the latest file. For example: trigger a Lambda function whenever an object is uploaded and write that object's key into a LATEST item in DynamoDB (or other persistent store).
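As a rough sketch of that approach (the table name latest-object-pointer and its pk attribute are assumptions, not anything from the question), the triggered Lambda could look like this:

import boto3
import urllib.parse

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('latest-object-pointer')  # hypothetical table name

def lambda_handler(event, context):
    # Record the key of each newly created object as the "latest" for its top-level folder
    for record in event['Records']:
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        folder = key.split('/')[0] if '/' in key else ''
        table.put_item(Item={
            'pk': f'LATEST#{folder}',
            'key': key,
            'last_modified': record['eventTime'],
        })

Generating the presigned URL then becomes a single get_item on that table followed by generate_presigned_url, with no listing at all.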
I have a Lambda function that scans for text and is triggered by an S3 bucket. I get this error when trying to upload a photo directly into the S3 bucket using the browser:
Unable to get object metadata from S3. Check object key, region, and/or access permissions
However, if I hardcode the key (e.g., image01.jpg) which is in my bucket, there are no errors.
import json
import boto3

def lambda_handler(event, context):
    # Get bucket and file name
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    location = key[:17]

    s3Client = boto3.client('s3')
    client = boto3.client('rekognition', region_name='us-east-1')
    response = client.detect_text(Image={'S3Object':
                                         {'Bucket': 'myarrowbucket', 'Name': key}})
    detectedText = response['TextDetections']
I am confused, as it was working a few weeks ago, but now I am getting that error.
ANSWER
I have seen this question answered many times and tried every solution; the one that worked for me was the 'key' name. I was getting the metadata error when the filename contained special characters (e.g. - or _), but when I changed the names of the uploaded files it worked. Hope this answer helps someone.
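For what it's worth, keys containing spaces or special characters arrive URL-encoded in the S3 event payload, so passing the raw event key to Rekognition can point at an object that doesn't exist. A hedged sketch of decoding the key first, rather than renaming the files:

import urllib.parse
import boto3

rekognition = boto3.client('rekognition', region_name='us-east-1')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    # The key arrives URL-encoded in the event (spaces become '+', etc.), so decode it
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    response = rekognition.detect_text(
        Image={'S3Object': {'Bucket': bucket, 'Name': key}}
    )
    detectedText = response['TextDetections']
    return detectedText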
I have a pipeline that moves approximately 1 TB of data, all CSV files. In this pipeline there are hundreds of files with different names. They have a date component, which is automatically partitioned. My question is how to use the CDK to automatically create subfolders based on the name of the file. In other words, the data comes in as a broad category, but our data scientists need it at one more level of detail.
It appears that your requirement is to move incoming objects into folders based on information in their filename (Key).
This could be done by adding a trigger on the Amazon S3 bucket that triggers an AWS Lambda function when a new object is created.
Here is some code from Moving file based on filename with Amazon S3:
import boto3
import urllib.parse

def lambda_handler(event, context):
    # Get the bucket and object key from the Event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])

    # Only copy objects that were uploaded to the bucket root (to avoid an infinite loop)
    if '/' not in key:
        # Determine destination directory based on Key
        directory = key  # Your logic goes here to extract the directory name

        # Copy object
        s3_client = boto3.client('s3')
        s3_client.copy_object(
            Bucket=bucket,
            Key=f"{directory}/{key}",
            CopySource={'Bucket': bucket, 'Key': key}
        )

        # Delete source object
        s3_client.delete_object(
            Bucket=bucket,
            Key=key
        )
You would need to modify the code that determines the name of the destination directory based on the key of the new object.
It also assumes that new objects arrive at the top level (root) of the bucket and are then moved into sub-directories. If, instead, new objects arrive under a given path (e.g. incoming/), then set the S3 trigger to operate only on that path and remove the if '/' not in key logic.
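For example, if the filenames followed a hypothetical category_date.csv pattern, the destination-directory logic above could be as simple as:

# Hypothetical pattern: 'sales_2023-01-01.csv' -> 'sales'
directory = key.split('_')[0]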
import boto3
import json

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = "cloud-translate-output"
    key = "key value"
    try:
        data = s3.get_object(Bucket=bucket, Key=key)
        json_data = data["Body"].read()
        return {
            "response_code": 200,
            "data": str(json_data)
        }
    except Exception as e:
        print(e)
        raise e
I'm making an iOS app with Xcode.
I want to use AWS to bring data from S3 into the app, in the order app → API Gateway → Lambda → S3. If I upload this data to bucket number 1 in S3, CloudFormation will translate the uploaded text file and automatically save it to bucket number 2, and I then want to import the text file stored in bucket number 2 back into the app through Lambda, without using a key value. Is there a way to use only the name of the bucket?
If I upload this data to bucket number 1 in S3, CloudFormation will translate the uploaded text file and automatically save it to bucket number 2
Sadly, this is not how CloudFormation works. It can't automatically read or translate files from buckets, or upload them to new buckets.
I would stick with a lambda function. It is more suited to such tasks.
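For the second part (reading the translated file from bucket number 2 without knowing its key), one hedged sketch is a Lambda behind API Gateway that lists the bucket and returns the most recently modified object; the bucket name is taken from your code above, everything else is an assumption:

import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = 'cloud-translate-output'  # bucket number 2
    # List the bucket and pick the most recently modified object
    objects = s3.list_objects_v2(Bucket=bucket).get('Contents', [])
    if not objects:
        return {'response_code': 404, 'data': 'bucket is empty'}
    latest = max(objects, key=lambda o: o['LastModified'])
    body = s3.get_object(Bucket=bucket, Key=latest['Key'])['Body'].read()
    return {
        'response_code': 200,
        'key': latest['Key'],
        'data': body.decode('utf-8'),
    }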
I want to write a downloaded file to Firebase Storage using AWS Lambda.
I have already written a dynamic link for Firebase Storage.
Can someone give me a hint how to do that? I already have this working for S3; now I want to store the file in Firebase Storage instead.
I only need to store the index.html in Firebase Storage under the name google-list/index.html.
def lambda_handler(event, context):
    url = 'https://www.google.com/index.html'  # put your url here
    bucket = 'google-list'  # your s3 bucket
    key = 'index.html'  # your path

    # write to s3 -- want to replace this with Firebase Storage
    # s3 = boto3.client('s3')
    # http = urllib3.PoolManager()
    # s3.upload_fileobj(http.request('GET', url, preload_content=False), bucket, key, ExtraArgs={'ACL': 'public-read'})
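A rough sketch of the Firebase Storage side, assuming the firebase-admin package is bundled with the Lambda deployment package, a service-account file named serviceAccount.json is shipped alongside it, and your-project.appspot.com is a placeholder for your Firebase Storage bucket:

import urllib3
import firebase_admin
from firebase_admin import credentials, storage

# Assumptions: serviceAccount.json is packaged with the function and
# 'your-project.appspot.com' is your Firebase Storage bucket name.
cred = credentials.Certificate('serviceAccount.json')
firebase_admin.initialize_app(cred, {'storageBucket': 'your-project.appspot.com'})

def lambda_handler(event, context):
    url = 'https://www.google.com/index.html'
    http = urllib3.PoolManager()
    data = http.request('GET', url).data

    # Firebase Storage is Google Cloud Storage underneath, so the blob API applies
    blob = storage.bucket().blob('google-list/index.html')
    blob.upload_from_string(data, content_type='text/html')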
Is there a way to upload a file to AWS S3 with tags (not add tags to an existing file/object in S3)? I need the file to appear in S3 with my tags, i.e. in a single API call.
I need this because I use a Lambda function (that uses these S3 object tags) triggered by S3 object creation.
You can pass the Tagging attribute in the put operation.
Here's an example using Boto3:
import boto3

client = boto3.client('s3')
client.put_object(
    Bucket='bucket',
    Key='key',
    Body='bytes',
    Tagging='Key1=Value1'
)
As per the docs, the Tagging attribute must be encoded as URL query parameters:
Tagging — (String) The tag-set for the object. The tag-set must be encoded as URL Query parameters. (For example, "Key1=Value1")
EDIT: I only noticed the boto3 tag after a while, so I edited my answer to match boto3's way of doing it accordingly.
The Tagging directive is now supported by boto3. You can do the following to add tags if you are using upload_file():
import boto3
from urllib import parse

s3 = boto3.client("s3")

tags = {"key1": "value1", "key2": "value2"}

s3.upload_file(
    "file_path",
    "bucket",
    "key",
    ExtraArgs={"Tagging": parse.urlencode(tags)},
)
If you're uploading a file using client.upload_file() or other methods that take an ExtraArgs parameter, you specify the tags differently, or you can add them in a separate request after the upload. You can add metadata as follows, but this is not the same thing. For an explanation of the difference, see this SO question:
import boto3

client = boto3.client('s3')
client.upload_file(
    Filename=path_to_your_file,
    Bucket='bucket',
    Key='key',
    ExtraArgs={"Metadata": {"mykey": "myvalue"}}
)
There's an example of this in the S3 docs, but be aware that metadata is not exactly the same thing as tags, though it can function similarly:
s3.upload_file(
    "tmp.txt", "bucket-name", "key-name",
    ExtraArgs={"Metadata": {"mykey": "myvalue"}}
)
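If you do end up adding tags in a separate request after the upload (as the older answer above describes), the call for that is put_object_tagging; a minimal sketch:

import boto3

s3 = boto3.client('s3')
s3.put_object_tagging(
    Bucket='bucket',
    Key='key',
    Tagging={'TagSet': [
        {'Key': 'Key1', 'Value': 'Value1'},
        {'Key': 'Key2', 'Value': 'Value2'},
    ]},
)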