ConvertAPI in AWS Lambda?

I need to deploy convertapi on AWS Lambda.
If I try `import convertapi` in Python, it fails because the package is not available in the Lambda runtime.
In AWS Lambda, libraries are deployed either from a local folder bundled with the function or via a layer ARN.
Is there an available ARN for convertapi, like the ones listed in https://github.com/keithrozario/Klayers/blob/master/deployments/python3.7/arns/eu-west-3.csv ?
If not, which folder should I copy/paste into my Lambda package to be able to do `import convertapi`?

This is an example in Python without using the ConvertAPI library.
The `requests` library is required to run this example.
It can be installed using
> pip install requests
or if you are using Python 3:
> pip3 install requests
import requests
import os.path
import sys

file_path = './test.docx'
secret = 'Your secret can be found at https://www.convertapi.com/a'

if not os.path.isfile(file_path):
    sys.exit('File not found: ' + file_path)

url = 'https://v2.convertapi.com/convert/docx/to/pdf?secret=' + secret
files = {'file': open(file_path, 'rb')}
headers = {'Accept': 'application/octet-stream'}

response = requests.post(url, files=files, headers=headers)

if response.status_code != 200:
    sys.exit(response.text)

output_file = open('result.pdf', 'wb')
output_file.write(response.content)
output_file.close()
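Since the goal is to run this inside a Lambda function, here is a minimal sketch of wrapping the same requests call in a handler. The event shape (bucket/key fields), the CONVERTAPI_SECRET environment variable, and the output key are assumptions, not part of the original example; `requests` (or the convertapi package) still has to be bundled with the deployment package or provided via a layer, which is exactly what the question is about.

import os
import boto3
import requests

s3 = boto3.client('s3')
SECRET = os.environ.get('CONVERTAPI_SECRET')  # assumed Lambda environment variable

def lambda_handler(event, context):
    # Assumed event shape: {"bucket": "...", "key": "input.docx"} -- adjust to your trigger.
    bucket = event['bucket']
    key = event['key']
    local_path = '/tmp/' + os.path.basename(key)
    s3.download_file(bucket, key, local_path)

    url = 'https://v2.convertapi.com/convert/docx/to/pdf?secret=' + SECRET
    with open(local_path, 'rb') as f:
        response = requests.post(url, files={'file': f},
                                 headers={'Accept': 'application/octet-stream'})
    if response.status_code != 200:
        raise RuntimeError(response.text)

    result_path = '/tmp/result.pdf'
    with open(result_path, 'wb') as out:
        out.write(response.content)

    # Write the converted file back to the same bucket next to the source object.
    s3.upload_file(result_path, bucket, key.rsplit('.', 1)[0] + '.pdf')
    return {'converted': key}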

Related

AWS Lambda /tmp python script import module error

I am trying to run a Python script that is present in the AWS Lambda /tmp directory. The script requires some extra dependencies, like boto3, to run. When AWS Lambda runs the file it gives the following error:
ModuleNotFoundError: No module named 'boto3'
However, when I run this code directly as a Lambda function, it runs without any import errors.
The Lambda code that executes the script in the /tmp directory:
import json
import os
import urllib.parse
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    records = [x for x in event.get('Records', []) if x.get('eventName') == 'ObjectCreated:Put']
    sorted_events = sorted(records, key=lambda e: e.get('eventTime'))
    latest_event = sorted_events[-1] if sorted_events else {}
    info = latest_event.get('s3', {})
    file_key = info.get('object', {}).get('key')
    bucket_name = info.get('bucket', {}).get('name')

    s3 = boto3.resource('s3')
    BUCKET_NAME = bucket_name
    keys = [file_key]
    for KEY in keys:
        local_file_name = '/tmp/' + KEY
        s3.Bucket(BUCKET_NAME).download_file(KEY, local_file_name)
        print("Running Incoming File !! ")
        os.system('python ' + local_file_name)
The /tmp script that tries to get some data from S3 using boto3:
import sys
import boto3
import json

def main():
    session = boto3.Session(
        aws_access_key_id='##',
        aws_secret_access_key='##',
        region_name='##')
    s3 = session.resource('s3')
    # get a handle on the bucket that holds your file
    bucket = s3.Bucket('##')
    # get a handle on the object you want (i.e. your file)
    obj = bucket.Object(key='8.json')
    # get the object
    response = obj.get()
    # read the contents of the file
    lines = response['Body'].read().decode()
    data = json.loads(lines)
    transactions = data['dataset']['fields']
    print(str(len(transactions)))
    return str(len(transactions))

main()
So boto3 is imported in both scripts, but the import only succeeds when the Lambda handler itself runs; the /tmp script can't import boto3.
What can be the reason, and how can I resolve it?
Executing another Python process does not copy Lambda's PYTHONPATH by default:
os.system('python ' + local_file_name)
Rewrite it like this:
os.system('PYTHONPATH=/var/runtime python ' + local_file_name)
To find out the complete PYTHONPATH the current Lambda runtime is using, add the following to the first script (the one executed by Lambda):
import sys
print(sys.path)
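An alternative sketch (not from the original answer) is to launch the script with subprocess and pass an explicit environment, which avoids shell quoting issues; the /var/runtime path is the same assumption as above:

import os
import subprocess
import sys

def run_downloaded_script(local_file_name):
    # Copy the parent environment and prepend Lambda's runtime path so the
    # child process can import boto3 and the other bundled libraries.
    env = dict(os.environ)
    env['PYTHONPATH'] = '/var/runtime:' + env.get('PYTHONPATH', '')
    subprocess.check_call([sys.executable, local_file_name], env=env)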

How to serve image from gcs using python 2.7 standard app engine?

The following code is an almost verbatim copy of the sample code from Google to serve a file from Google Cloud Storage via the Python 2.7 App Engine Standard Environment. When serving locally with the command:
dev_appserver.py --default_gcs_bucket_name darianhickman-201423.appspot.com
import cloudstorage as gcs
import webapp2

class LogoPage(webapp2.RequestHandler):
    def get(self):
        bucket_name = "darianhickman-201423.appspot.com"
        self.response.headers['Content-Type'] = 'image/jpeg'
        self.response.headers['Message'] = "LogoPage"
        gcs_file = gcs.open("/" + bucket_name + '/logo.jpg')
        contents = gcs_file.read()
        gcs_file.close()
        self.response.write(contents)

app = webapp2.WSGIApplication([('/logo.jpg', LogoPage),
                               ('/logo2.jpg', LogoPage)],
                              debug=True)
The empty body message I see on the console is:
NotFoundError: Expect status [200] from Google Storage. But got status 404.
Path: '/darianhickman-201423.appspot.com/logo.jpg'.
Request headers: None.
Response headers: {'date': 'Sun, 30 Dec 2018 18:54:54 GMT', 'connection': 'close', 'server': 'Development/2.0'}.
Body: ''.
Extra info: None.
Again, this is almost identical to the read logic documented at
https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/read-write-to-cloud-storage
If you serve it locally using dev_appserver.py, it runs a local emulation of Cloud Storage and does not connect to the actual Google Cloud Storage.
Try writing a file and then reading it. You’ll see that it will succeed.
Here is a sample:
import os
import cloudstorage as gcs
from google.appengine.api import app_identity
import webapp2

class MainPage(webapp2.RequestHandler):
    def get(self):
        bucket_name = os.environ.get('BUCKET_NAME',
                                     app_identity.get_default_gcs_bucket_name())
        self.response.headers['Content-Type'] = 'text/plain'
        filename = "/" + bucket_name + "/testfile"

        # Create file
        gcs_file = gcs.open(filename,
                            'w',
                            content_type='text/plain')
        gcs_file.write('Hello world\n')
        gcs_file.close()

        # Read file and display content
        gcs_file = gcs.open(filename)
        contents = gcs_file.read()
        gcs_file.close()
        self.response.write(contents)

app = webapp2.WSGIApplication(
    [('/', MainPage)], debug=True)
Run it with dev_appserver.py --default_gcs_bucket_name a-local-bucket.
If you deploy your application on Google App Engine then it will work (assuming you have a file called logo.jpg uploaded) because it connects to Google Cloud Storage. I tested it with minor changes:
import os
import cloudstorage as gcs
from google.appengine.api import app_identity
import webapp2

class LogoPage(webapp2.RequestHandler):
    def get(self):
        bucket_name = os.environ.get('BUCKET_NAME',
                                     app_identity.get_default_gcs_bucket_name())
        # or you can use bucket_name = "<your-bucket-name>"
        self.response.headers['Content-Type'] = 'image/jpeg'
        self.response.headers['Message'] = "LogoPage"
        gcs_file = gcs.open("/" + bucket_name + '/logo.jpg')
        contents = gcs_file.read()
        gcs_file.close()
        self.response.write(contents)

app = webapp2.WSGIApplication(
    [('/', LogoPage)], debug=True)
Also, it's worth mentioning that the documentation for "Using the client library with the development app server" seems to be outdated; it states that:
There is no local emulation of Cloud Storage, all requests to read and
write files must be sent over the Internet to an actual Cloud Storage
bucket.
The team responsible for the documentation has already been informed about this issue.

Generate Signed URL in S3 using boto3

In Boto, I used to generate a signed URL using the below function.
import boto
conn = boto.connect_s3()
bucket = conn.get_bucket(bucket_name, validate=True)
key = bucket.get_key(key)
signed_url = key.generate_url(expires_in=3600)
How do I do the exact same thing in boto3?
I searched through boto3 GitHub codebase but could not find a single reference to generate_url.
Has the function name changed?
From Generating Presigned URLs:
import boto3
import requests
from botocore import client

# Get the service client.
s3 = boto3.client('s3', config=client.Config(signature_version='s3v4'))

# Generate the URL to get 'key-name' from 'bucket-name'
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': 'bucket-name',
        'Key': 'key-name'
    },
    ExpiresIn=3600  # one hour in seconds, increase if needed
)

# Use the URL to perform the GET operation. You can use any method you like
# to send the GET, but we will use requests here to keep things simple.
response = requests.get(url)
Function reference: generate_presigned_url()
I get the error InvalidRequest: The authorization mechanism you have provided is not supported when trying to access the generated URL in a normal browser – Aseem Apr 30 '19 at 5:22
As there isn't much info, I am assuming you are hitting a signature version issue; if not, maybe it will help someone else! :P
For this you can import Config from botocore:
from botocore.client import Config
and then create the client using this config, providing the signature version as 's3v4':
s3 = boto3.client('s3', config=Config(signature_version='s3v4'))
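Putting the comment and the fix together, a minimal sketch of generating a browser-usable presigned URL with Signature Version 4 might look like the following; the bucket name, key name, and region are placeholders:

import boto3
from botocore.client import Config

# Force Signature Version 4 so the presigned URL is accepted in regions
# (and browsers) that reject the legacy signing mechanism.
s3 = boto3.client('s3',
                  config=Config(signature_version='s3v4'),
                  region_name='eu-central-1')  # use your bucket's region

url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={'Bucket': 'bucket-name', 'Key': 'key-name'},
    ExpiresIn=3600,
)
print(url)  # can be opened directly in a browser while it is valid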

Using python to update a file on google drive

I have the following script to upload a file to Google Drive, using Python 2.7. As it is now it will upload a new copy of the file, but I want the existing file updated/overwritten. I can't find help in the Google Drive API references and guides for Python. Any suggestions?
from __future__ import print_function
import os
from apiclient.discovery import build
from httplib2 import Http
from oauth2client import file, client, tools

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

# Gain access to Google Drive
SCOPES = 'https://www.googleapis.com/auth/drive.file'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store, flags) \
        if flags else tools.run(flow, store)
DRIVE = build('drive', 'v3', http=creds.authorize(Http()))

# The file that is being uploaded
FILES = (
    ('all-gm-keys.txt', 'application/vnd.google-apps.document'),  # in Google Doc format
)

# Where the file ends up on Google Drive
for filename, mimeType in FILES:
    folder_id = '0B6V-MONTYPYTHONROCKS-lTcXc'  # Not the real folder id
    metadata = {'name': filename, 'parents': [folder_id]}
    if mimeType:
        metadata['mimeType'] = mimeType
    res = DRIVE.files().create(body=metadata, media_body=filename).execute()
    if res:
        print('Uploaded "%s" (%s)' % (filename, res['mimeType']))
I think that you are looking for the update method. Here is a link to the documentation; there is an example of overwriting the file in Python.
I think that using the official Google client API instead of raw HTTP requests should make your task easier.
from apiclient import errors
from apiclient.http import MediaFileUpload
# ...

def update_file(service, file_id, new_title, new_description, new_mime_type,
                new_filename, new_revision):
    """Update an existing file's metadata and content.

    Args:
        service: Drive API service instance.
        file_id: ID of the file to update.
        new_title: New title for the file.
        new_description: New description for the file.
        new_mime_type: New MIME type for the file.
        new_filename: Filename of the new content to upload.
        new_revision: Whether or not to create a new revision for this file.
    Returns:
        Updated file metadata if successful, None otherwise.
    """
    try:
        # First retrieve the file from the API.
        file = service.files().get(fileId=file_id).execute()

        # File's new metadata.
        file['title'] = new_title
        file['description'] = new_description
        file['mimeType'] = new_mime_type

        # File's new content.
        media_body = MediaFileUpload(
            new_filename, mimetype=new_mime_type, resumable=True)

        # Send the request to the API.
        updated_file = service.files().update(
            fileId=file_id,
            body=file,
            newRevision=new_revision,
            media_body=media_body).execute()
        return updated_file
    except errors.HttpError, error:
        print 'An error occurred: %s' % error
        return None
Link to the example: https://developers.google.com/drive/api/v2/reference/files/update#examples
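Note that the example above targets the Drive v2 API (and Python 2), while the script in the question builds a v3 service. A rough v3 sketch of the same overwrite, assuming the file already exists and you know its file_id, might look like this; the helper name is made up for illustration:

from apiclient.http import MediaFileUpload

def overwrite_file_v3(drive, file_id, local_path, mime_type=None):
    # In v3 the metadata field is 'name' (not 'title') and there is no
    # newRevision parameter; update() simply replaces the file's content.
    media = MediaFileUpload(local_path, mimetype=mime_type, resumable=True)
    return drive.files().update(fileId=file_id, media_body=media).execute()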

How to configure authorization mechanism inline with boto3

I am using boto3 in AWS Lambda to fetch an object from S3 located in the Frankfurt region.
Signature Version 4 is necessary; otherwise the following error is returned:
"errorMessage": "An error occurred (InvalidRequest) when calling
the GetObject operation: The authorization mechanism you have
provided is not supported. Please use AWS4-HMAC-SHA256."
I found ways to configure signature_version at http://boto3.readthedocs.org/en/latest/guide/configuration.html,
but since I am using AWS Lambda, I do not have access to the underlying configuration profiles.
The code of my AWS Lambda function:
from __future__ import print_function
import boto3

def lambda_handler(event, context):
    input_file_bucket = event["Records"][0]["s3"]["bucket"]["name"]
    input_file_key = event["Records"][0]["s3"]["object"]["key"]
    input_file_name = input_file_bucket + "/" + input_file_key

    s3 = boto3.resource("s3")
    obj = s3.Object(bucket_name=input_file_bucket, key=input_file_key)
    response = obj.get()
    return event  # echo first key values
Is it possible to configure signature_version within this code, by using a Session for example? Or is there any workaround for this?
Instead of using the default session, try using a custom session and Config from boto3.session:
import boto3
import boto3.session
session = boto3.session.Session(region_name='eu-central-1')
s3client = session.client('s3', config=boto3.session.Config(signature_version='s3v4'))
s3client.get_object(Bucket='<Bkt-Name>', Key='S3-Object-Key')
I tried the session approach, but I had issues. This method worked better for me; your mileage may vary:
s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
You will need to import Config from botocore.client in order to make this work. See below for a functional method to test a bucket (list objects). This assumes you are running it from an environment where your authentication is managed, such as Amazon EC2 or Lambda with an IAM role:
import boto3
from botocore.client import Config
from botocore.exceptions import ClientError

def test_bucket(bucket):
    print 'testing bucket: ' + bucket
    try:
        s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
        b = s3.Bucket(bucket)
        objects = b.objects.all()
        for obj in objects:
            print obj.key
        print 'bucket test SUCCESS'
    except ClientError as e:
        print 'Client Error'
        print e
        print 'bucket test FAIL'
To test it, simply call the method with a bucket name. Your role will have to grant proper permissions.
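For example, a hypothetical invocation (the bucket name here is just a placeholder) would be:

test_bucket('my-example-bucket')  # prints each key, then SUCCESS or FAIL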
Using a resource worked for me.
from botocore.client import Config
import boto3

def presigned_url(key, expTime):
    # AIRFLOW_BUCKET is assumed to be defined elsewhere, as in the original snippet;
    # the call is wrapped in a function here so the return statement is valid.
    s3 = boto3.resource("s3", config=Config(signature_version="s3v4"))
    return s3.meta.client.generate_presigned_url(
        "get_object", Params={"Bucket": AIRFLOW_BUCKET, "Key": key}, ExpiresIn=expTime
    )