I am trying to run a Python script that is present in the AWS Lambda /tmp directory. The script requires some extra dependencies, such as boto3, to run. When AWS Lambda runs the file it gives the following error:
ModuleNotFoundError: No module named 'boto3'
However, when I run this file directly as a Lambda function, it runs without any import errors.
The Lambda code that executes the script in the /tmp directory:
import json
import os
import urllib.parse
import boto3
s3 = boto3.client('s3')
def lambda_handler(event, context):
    records = [x for x in event.get('Records', []) if x.get('eventName') == 'ObjectCreated:Put']
    sorted_events = sorted(records, key=lambda e: e.get('eventTime'))
    latest_event = sorted_events[-1] if sorted_events else {}
    info = latest_event.get('s3', {})
    file_key = info.get('object', {}).get('key')
    bucket_name = info.get('bucket', {}).get('name')

    s3 = boto3.resource('s3')
    BUCKET_NAME = bucket_name
    keys = [file_key]
    for KEY in keys:
        local_file_name = '/tmp/' + KEY
        s3.Bucket(BUCKET_NAME).download_file(KEY, local_file_name)

    print("Running Incoming File !! ")
    os.system('python ' + local_file_name)
The /tmp code that is trying to get some data from S3 using boto3 :
import sys
import boto3
import json
def main():
    session = boto3.Session(
        aws_access_key_id='##',
        aws_secret_access_key='##',
        region_name='##')
    s3 = session.resource('s3')
    # get a handle on the bucket that holds your file
    bucket = s3.Bucket('##')
    # get a handle on the object you want (i.e. your file)
    obj = bucket.Object(key='8.json')
    # get the object
    response = obj.get()
    # read the contents of the file
    lines = response['Body'].read().decode()
    data = json.loads(lines)
    transactions = data['dataset']['fields']
    print(str(len(transactions)))
    return str(len(transactions))
main()
So boto3 is imported in both scripts, but the import only succeeds when the Lambda code itself runs; the script in /tmp can't import boto3.
What can be the reason, and how can I resolve it?
A Python process started with os.system does not pick up the Lambda runtime's module search path by default, so packages bundled with the runtime (such as boto3 under /var/runtime) are not importable from it:
os.system('python ' + local_file_name)
Rewrite it like this:
os.system('PYTHONPATH=/var/runtime python ' + local_file_name)
To find out the complete PYTHONPATH that the current Lambda runtime is using, add the following to the first script (the one executed by Lambda):
import sys
print(sys.path)
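An alternative, sketched here as a minimal example rather than the answer's exact approach, is to forward the parent's entire sys.path to the child process instead of hard-coding /var/runtime; the helper name run_child is illustrative:

import os
import subprocess
import sys

def run_child(script_path):
    # Expose the parent's module search path to the child process so that
    # boto3 and the other packages bundled with the Lambda runtime resolve.
    env = os.environ.copy()
    env['PYTHONPATH'] = os.pathsep.join(p for p in sys.path if p)
    # sys.executable is the interpreter the Lambda runtime itself is running.
    return subprocess.run([sys.executable, script_path], env=env, check=False)

# e.g. run_child('/tmp/' + file_key) inside lambda_handler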
Related
I have a file containing URLs in my S3 bucket. I would like to use a Python Lambda function to upload the files behind those URLs to the S3 bucket.
For example, the file uploaded to S3 contains:
http://...
http://...
Each line corresponds to a file to be uploaded to S3.
Here is the code:
import json
import urllib.parse
import boto3
import requests
import os
from gzip import GzipFile
from io import TextIOWrapper

print('Loading functions')

s3 = boto3.client('s3')

def get_file_seqs(response):
    try:
        size = response['ResponseMetadata']['HTTPHeaders']['content-length']
        print("[+] Size retrieved")
        return size
    except:
        print("[-] Size can not be retrieved")

def lambda_handler(event, context):
    # Defining bucket objects
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

    # get file from s3
    print('[+] Getting file from S3 bucket')
    response = s3.get_object(Bucket=bucket, Key=key)

    try:
        # checking file size
        print('[+] Checking file size')
        file_size = get_file_seqs(response)
        if file_size == 0:
            print('File size is equal to 0')
            return False
        else:
            # create new directories
            print('[+] Creating new directories')
            bucket_name = "triggersnextflow"
            directories = ['backups/sample/', 'backups/control/']
            # loop to create new dirs
            for dirs in directories:
                s3.put_object(Bucket=bucket_name, Key=dirs, Body='')
            # NOW I WOULD LIKE TO DOWNLOAD THE FILES FROM THE URLS INSIDE S3 OBJECT
            # return true
            return True
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
Download an S3 object to a file:
import boto3
s3 = boto3.resource('s3')
s3.meta.client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
You will find a great resource of information here:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.download_file
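To cover the step flagged in the question (fetching the files behind the URLs stored in the S3 object and writing them back to S3), here is a minimal sketch; the helper name, the target bucket/prefix parameters, and the one-URL-per-line assumption are mine, not from the original post:

import os
import urllib.parse

import boto3
import requests

s3 = boto3.client('s3')

def upload_urls_from_object(source_bucket, source_key, target_bucket, target_prefix='backups/sample/'):
    # Read the S3 object that contains one URL per line.
    body = s3.get_object(Bucket=source_bucket, Key=source_key)['Body'].read().decode('utf-8')
    urls = [line.strip() for line in body.splitlines() if line.strip()]
    for url in urls:
        # Fetch each remote file and store it in S3 under the target prefix.
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        file_name = os.path.basename(urllib.parse.urlparse(url).path) or 'downloaded_file'
        s3.put_object(Bucket=target_bucket,
                      Key=target_prefix + file_name,
                      Body=response.content)

Note that requests is not part of the Lambda Python runtime, so it has to be shipped in the deployment package or provided through a layer.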
I need to deploy convertapi on an AWS Lambda.
If I try `import convertapi` in Python, it fails because the library is not installed in the Lambda environment.
In AWS Lambda, we use a local folder or a layer ARN to deploy libraries.
Is there an available ARN for convertapi like in https://github.com/keithrozario/Klayers/blob/master/deployments/python3.7/arns/eu-west-3.csv ?
If not, which folder should I copy into my Lambda package to be able to do `import convertapi`?
This is an example in Python without using the ConvertAPI library.
The `requests` library is required to run this example.
It can be installed using
> pip install requests
or if you are using Python 3:
> pip3 install requests
import requests
import os.path
import sys

file_path = './test.docx'
secret = 'Your secret can be found at https://www.convertapi.com/a'

if not os.path.isfile(file_path):
    sys.exit('File not found: ' + file_path)

url = 'https://v2.convertapi.com/convert/docx/to/pdf?secret=' + secret
files = {'file': open(file_path, 'rb')}
headers = {'Accept': 'application/octet-stream'}

response = requests.post(url, files=files, headers=headers)

if response.status_code != 200:
    sys.exit(response.text)

with open('result.pdf', 'wb') as output_file:
    output_file.write(response.content)
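If this needs to run inside a Lambda function, a minimal sketch might look like the following; the event shape, the bucket/key handling, and the CONVERTAPI_SECRET environment variable are assumptions, and requests still has to be bundled with the deployment package or provided through a layer:

import os

import boto3
import requests

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = event['bucket']   # assumed event shape
    key = event['key']         # e.g. 'incoming/test.docx'
    secret = os.environ['CONVERTAPI_SECRET']

    # Download the source document into Lambda's writable /tmp directory.
    local_docx = '/tmp/' + os.path.basename(key)
    s3.download_file(bucket, key, local_docx)

    # Call the ConvertAPI HTTP endpoint directly, as in the example above.
    url = 'https://v2.convertapi.com/convert/docx/to/pdf?secret=' + secret
    with open(local_docx, 'rb') as f:
        response = requests.post(url, files={'file': f},
                                 headers={'Accept': 'application/octet-stream'})
    if response.status_code != 200:
        raise RuntimeError(response.text)

    # Upload the converted PDF next to the original object.
    pdf_key = os.path.splitext(key)[0] + '.pdf'
    s3.put_object(Bucket=bucket, Key=pdf_key, Body=response.content)
    return {'pdf_key': pdf_key}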
I know that I can upload single files like this:
bucket_name = "my-bucket-name"
bucket = client.get_bucket(bucket_name)
blob_name = "myfile.txt"
blob = bucket.blob(blob_name)
blob.upload_from_filename(blob_name)
How can I do the same with a folder? Is there something like blob.upload_from_foldername?
I tried the same code, replacing myfile.txt with myfoldername, but it did not work:
FileNotFoundError: [Errno 2] No such file or directory: 'myfoldername'
I assume something is wrong with the path, but I am not sure what. I am executing the code from Untitled.ipynb; it works with myfile.txt but not with myfoldername.
I do not want to use a command line function.
You can't upload an empty folder or directory to Google Cloud Storage, but you can create an empty folder in Cloud Storage using the client:
from google.cloud import storage

def create_newfolder(bucket_name, destination_folder_name):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_folder_name)
    blob.upload_from_string('')

    print('Created {} .'.format(destination_folder_name))
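A hypothetical call, where the trailing slash is what makes the empty object show up as a folder in the console (the names are placeholders):

create_newfolder('my-bucket-name', 'myfoldername/')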
And if you are trying to upload a whole directory, you can use the code below:
import glob
import os
from google.cloud import storage

client = storage.Client()

def upload_from_directory(directory_path: str, destination_bucket_name: str, destination_blob_name: str):
    rel_paths = glob.glob(directory_path + '/**', recursive=True)
    bucket = client.get_bucket(destination_bucket_name)
    for local_file in rel_paths:
        remote_path = f'{destination_blob_name}/{"/".join(local_file.split(os.sep)[1:])}'
        if os.path.isfile(local_file):
            blob = bucket.blob(remote_path)
            blob.upload_from_filename(local_file)
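A hypothetical invocation, with placeholder paths and names:

upload_from_directory('./myfoldername', 'my-bucket-name', 'myfoldername')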
I followed the https://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html guide by AWS up to the virtual env part.
My zip file structure looks like this:
bin
numpy
numpy-1.15.2.dist-info
myscript.py
I get an error when I upload the zip file to AWS Lambda. The error says:
{
    "errorMessage": "Unable to import module 'testingUpload'"
}
All my script file contains is:
import numpy

def lambda_handler(event, context):
    print("This is the test package")
When I upload the zip file without `import numpy`, it works fine:
def lambda_handler(event, context):
    print("This is the test package")
I am using boto3 in AWS Lambda to fetch an object from S3 located in the Frankfurt region.
Signature v4 is necessary; otherwise the following error is returned:
"errorMessage": "An error occurred (InvalidRequest) when calling
the GetObject operation: The authorization mechanism you have
provided is not supported. Please use AWS4-HMAC-SHA256."
I found the ways to configure signature_version described at http://boto3.readthedocs.org/en/latest/guide/configuration.html, but since I am using AWS Lambda, I do not have access to the underlying configuration profiles.
The code of my AWS Lambda function:
from __future__ import print_function

import boto3

def lambda_handler(event, context):
    input_file_bucket = event["Records"][0]["s3"]["bucket"]["name"]
    input_file_key = event["Records"][0]["s3"]["object"]["key"]
    input_file_name = input_file_bucket + "/" + input_file_key

    s3 = boto3.resource("s3")
    obj = s3.Object(bucket_name=input_file_bucket, key=input_file_key)
    response = obj.get()

    return event  # echo first key values
Is it possible to configure signature_version within this code, for example by using a Session? Or is there any workaround for this?
Instead of using the default session, try using a custom session and Config from boto3.session:
import boto3
import boto3.session
session = boto3.session.Session(region_name='eu-central-1')
s3client = session.client('s3', config=boto3.session.Config(signature_version='s3v4'))
s3client.get_object(Bucket='<Bkt-Name>', Key='S3-Object-Key')
I tried the session approach, but I had issues. This method worked better for me; your mileage may vary:
s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
You will need to import Config from botocore.client in order to make this work. See below for a functional method to test a bucket (list objects). This assumes you are running it from an environment where your authentication is managed, such as Amazon EC2 or Lambda with an IAM Role:
import boto3
from botocore.client import Config
from botocore.exceptions import ClientError

def test_bucket(bucket):
    print('testing bucket: ' + bucket)
    try:
        s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
        b = s3.Bucket(bucket)
        objects = b.objects.all()
        for obj in objects:
            print(obj.key)
        print('bucket test SUCCESS')
    except ClientError as e:
        print('Client Error')
        print(e)
        print('bucket test FAIL')
To test it, simply call the method with a bucket name. Your role will have to grant proper permissions.
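For example, with a placeholder bucket name:

test_bucket('my-example-bucket')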
Using a resource worked for me.
from botocore.client import Config
import boto3

def get_presigned_url(key, expTime):
    # Wrapper function added for completeness; the name is illustrative and
    # AIRFLOW_BUCKET is defined elsewhere in the original code.
    s3 = boto3.resource("s3", config=Config(signature_version="s3v4"))
    return s3.meta.client.generate_presigned_url(
        "get_object", Params={"Bucket": AIRFLOW_BUCKET, "Key": key}, ExpiresIn=expTime
    )
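A hypothetical call, assuming AIRFLOW_BUCKET is defined (for example as a module-level constant) and using the illustrative wrapper above:

url = get_presigned_url('reports/report.pdf', 3600)
print(url)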