AWS lambda extensions not sharing environment variables - amazon-web-services

The following is a secrets-wrapper.py lambda extension:
#!/usr/bin/env python3
import boto3
import json
import os

LAMBDA_WRAPPER_NAME = "secrets-wrapper"


def debug_print(message):
    print(message, flush=True)


def error_print(message):
    print(message, flush=True)


def load_secrets():
    if 'APP_PARAMS_PATH' not in os.environ:
        error_print(
            f"{LAMBDA_WRAPPER_NAME} unable to load secrets, APP_PARAMS_PATH not found")
        return
    app_config_path = os.environ['APP_PARAMS_PATH']
    if 'APP_ENV' in os.environ:
        env = os.environ['APP_ENV']
        full_config_path = '/' + env + '/' + app_config_path
    else:
        full_config_path = '/' + app_config_path
    debug_print(
        f"{LAMBDA_WRAPPER_NAME} Loading secrets for path {full_config_path}")
    try:
        client = boto3.client('ssm')
        # Get all parameters for this app
        param_details = client.get_parameters_by_path(
            Path=full_config_path,
            Recursive=False,
            WithDecryption=True
        )
        # Loop through the returned parameters and export them as environment variables
        if 'Parameters' in param_details:
            for param in param_details.get('Parameters'):
                debug_print(f"Found param: {param}")
                param_name = param.get('Name').split("/")[-1]
                debug_print(f"Name: {param_name}")
                param_value = param.get('Value')
                debug_print(f"Value: {param_value}")
                os.environ[param_name] = param_value
    except Exception:
        debug_print(f"{LAMBDA_WRAPPER_NAME} could not load from SSM")
        raise Exception(f"{LAMBDA_WRAPPER_NAME} could not load from SSM")


def main():
    load_secrets()
    # Start the function
    args = os.sys.argv[1:]
    os.system(" ".join(args))


if __name__ == "__main__":
    main()
and this is the function:
import json
import os


def lambda_handler(event, context):
    # TODO implement
    print(f"ENV foo check: {os.environ.get('foo')}", flush=True)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
When setting environment variables on the Lambda function configuration, e.g.:
aws lambda update-function-configuration --function-name my-function-example \
    --environment "Variables={APP_NAME=myApp}"
within the Lambda extension they can be fetched without a problem:
env = os.environ['APP_ENV']  # dev
However, when an environment variable is set within the extension with os.environ[param_name] = param_value and then read back in the invoked function with os.environ.get('foo'), it comes back as None, as if it had never been set. If I print the variable anywhere inside the extension, it is indeed set.
The questions are:
Is the environment strictly one-directional?
If not, how can I share environment variables from the extension to the runtime function?

Related

How to authenticate requests made to AWS AppSync in Python?

I have a website with an AWS Amplify backend. For a post-payment function, I am creating a lambda function to update the database. I am trying to query certain fields with the help of AppSync and then run a mutation. This is my function code:
import json
import boto3
import os
import decimal
import requests
from requests_aws4auth import AWS4Auth


def lambda_handler(event, context):
    dynamoDB = boto3.resource('dynamodb', region_name='ap-northeast-1')
    # load event data (hidden)
    userid = sentData.get("userid")
    slots = sentData.get("slots")
    url = os.environ.get("AWS_GRAPHQL_API_ENDPOINT")
    api_key = os.environ.get("AWS_GRAPHQL_API_KEY")
    session = requests.Session()
    query = """
        query MyQuery {
            getUserPlan(id: "9ddf437a-55b1-445d-8ae6-254c77493c30") {
                traits
                traitCount
            }
        }
    """
    credentials = boto3.session.Session().get_credentials()
    session.auth = AWS4Auth(
        credentials.access_key,
        credentials.secret_key,
        'ap-northeast-1',
        'appsync',
        session_token=credentials.token
    )
    # response = session.request(
    #     url=url,
    #     method="POST",
    #     json={"query": query},
    #     headers={"Authorization": api_key},
    # )
    # response = requests.post(
    #     url=url,
    #     json={"query": query},
    #     headers={"x-api-key": api_key}
    # )
    response = session.request(
        url=url,
        method="POST",
        json={"query": query},
    )
    print(response.json())
    return {
        "statusCode": 200,
    }
I get the following error when I execute the function:
{'data': {'getUserPlan': None}, 'errors': [{'path': ['getUserPlan'], 'data': None, 'errorType': 'Unauthorized', 'errorInfo': None, 'locations': [{'line': 3, 'column': 9, 'sourceName': None}], 'message': 'Not Authorized to access getUserPlan on type UserPlan'}]}
I have referred to this and this. I have tried their solutions, but they haven't worked for me. I have confirmed that all the environment variables are working properly, and I even added the local aws-cli IAM user to the custom-roles.json file for admin privileges in Amplify. When I was trying with the API key, I made sure that it hadn't expired as well.
I figured out how to fix it. I had to create a function through the Amplify CLI, give it access to the API, push the function, and then add the name of its role to adminRoleNames in custom-roles.json.
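For illustration, a minimal sketch of what that custom-roles.json entry might look like (the role name here is a placeholder for the Lambda function's execution role):
{
  "adminRoleNames": ["my-post-payment-function-role"]
}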

Lambda task timeout but no application log

I have a python lambda that is triggered by S3 uploads to a specific folder. The lambda function processes the uploaded file and outputs it to another folder in the same S3 bucket.
The issue is that when I do a bulk upload using the AWS console, some files do not get processed. I ended up setting up a dead letter queue to catch these invocations. While inspecting a message in the queue, there is a request ID which I tried to find in the lambda logs.
These are the logs for the request ID:
Now the odd part is that in the python code, the first line after the imports is print('Loading function'), which does not show up in the lambda log.
I've added the python code here. It should still print the "Processing file name: " + key line, which is inside the handler, right?
import urllib.parse
from datetime import datetime
import boto3
from constants import CONTENT_TYPE, XML_EXTENSION, VALIDATING
from xml_process import *
from s3Integration import download_file

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    print("Processing file name: " + key)
    try:
        response = s3.get_object(Bucket=bucket, Key=key)
        xml_content = response["Body"].read()
        content_type = response["ContentType"]
        tree = ET.fromstring(xml_content)
        key_file_name = key.split("/")[1]
        # Creating a temporary copy by downloading file to get the namespaces
        temp_file_name = "/tmp/" + key_file_name
        download_file(key, temp_file_name)
        namespaces = {node[0]: node[1] for _, node in ET.iterparse(temp_file_name, events=['start-ns'])}
        for name, value in namespaces.items():
            ET.register_namespace(name, value)
        # Preparing path for file processing
        processed_file = key_file_name.split(".")[0] + "_processed." + key_file_name.split(".")[1]
        print(processed_file, "processed")
        db_record = XMLMapping(file_path=key,
                               processed_file_path=processed_file,
                               uploaded_by="lambda",
                               status=VALIDATING, uploaded_date=datetime.now(), is_active=True)
        session.add(db_record)
        session.commit()
        if key_file_name.split(".")[1] == XML_EXTENSION:
            if content_type in CONTENT_TYPE:
                xml_parse(tree, db_record, processed_file, True)
            else:
                print("Content Type is not valid. Provided value: ", content_type)
        else:
            print("File extension is not valid. Provided extension: ", key_file_name.split(".")[1])
        return "success"
    except Exception as e:
        print(e)
        raise e
I don't think it's a permission issue, as other files uploaded in the same batch were processed successfully.

Botocore Stubber - Unable to locate credentials

I'm working on unit tests for my lambda which is getting some files from S3, processing them and loading data from them to DynamoDB. I created botocore stubbers that are used during tests, but I got botocore.exceptions.NoCredentialsError: Unable to locate credentials
My lambda handler code:
import boto3

s3_client = boto3.client('s3')
ddb_client = boto3.resource('dynamodb', region_name='eu-west-1')


def lambda_handler(event, context):
    for record in event['Records']:
        s3_event = record.get('s3')
        bucket = s3_event.get('bucket', {}).get('name', '')
        file_key = s3_event.get('object', {}).get('key', '')
        file = s3_client.get_object(Bucket=bucket, Key=file_key)
and the tests file:
import unittest
from unittest.mock import ANY

import boto3
import botocore.session
from botocore.stub import Stubber

import s3_to_ddb_handler


class TestLambda(unittest.TestCase):
    def setUp(self) -> None:
        self.session = botocore.session.get_session()
        # S3 Stubber Set Up
        self.s3_client = self.session.create_client('s3', region_name='eu-west-1')
        self.s3_stubber = Stubber(self.s3_client)
        # DDB Stubber Set Up
        self.ddb_resource = boto3.resource('dynamodb', region_name='eu-west-1')
        self.ddb_stubber = Stubber(self.ddb_resource.meta.client)

    def test_s3_to_ddb_handler(self) -> None:
        event = {}
        with self.s3_stubber:
            with self.ddb_stubber:
                response = s3_to_ddb_handler.s3_to_ddb_handler(event, ANY)
The issue seems to be that an actual call to AWS resources is made, which shouldn't be the case; the stubber should be used instead. How can I force that?
You need to call .activate() on your Stubber instances: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/stubber.html#botocore.stub.Stubber.activate
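For reference, a minimal sketch of explicit activation, assuming the code under test uses this same client object (rather than a client it creates itself), with placeholder bucket/key values and dummy credentials:
import io

import boto3
from botocore.stub import Stubber

# Dummy credentials so the request signer never looks for real ones;
# the stubber intercepts the call before anything is sent to AWS.
s3_client = boto3.client(
    's3',
    region_name='eu-west-1',
    aws_access_key_id='testing',
    aws_secret_access_key='testing',
)

stubber = Stubber(s3_client)
# Queue the response the stubbed client should return for GetObject,
# together with the parameters the call is expected to be made with.
stubber.add_response(
    'get_object',
    {'Body': io.BytesIO(b'stubbed file content')},
    {'Bucket': 'my-bucket', 'Key': 'my-key'},
)

stubber.activate()  # entering "with stubber:" has the same effect
try:
    result = s3_client.get_object(Bucket='my-bucket', Key='my-key')
    print(result['Body'].read())
finally:
    stubber.deactivate()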

Copying S3 objects from one account to other using Lambda python

I'm using boto3 to copy files from an S3 bucket in one account to another. I need functionality similar to aws s3 sync. Please see my code. My company has decided to 'PULL' from the other S3 bucket (the source account). Please don't suggest replication, S3 Batch, S3-triggered Lambda, etc. We have gone through all these options and my management does not want any configuration on the source side. Can you please review this code and let me know if it will work for thousands of objects? The source bucket has nearly 10000 objects. We will create this lambda function in the destination account and create a CloudWatch event to trigger the lambda once a day.
I am checking ETag so that modified files will be copied across when this function is triggered.
Edit: I simplified my code just to check that pagination works. It works if I don't add client.copy(). If I add that line in the for loop, then after reading 3-4 objects it throws "errorMessage": "2021-08-07T15:29:07.827Z 82757747-7b72-4f29-ae9f-22e95f969d6c Task timed out after 3.00 seconds". Please advise. Please note that the 'test/' folder in my source bucket has around 1100 objects.
import boto3
import os
import logging
import botocore

logger = logging.getLogger()
logger.setLevel(os.getenv('debug_level', 'INFO'))
client = boto3.client('s3')


def handler(event, context):
    main(event, logger)


def main(event, logger):
    try:
        SOURCE_BUCKET = os.environ.get('SRC_BUCKET')
        DEST_BUCKET = os.environ.get('DST_BUCKET')
        REGION = os.environ.get('REGION')
        prefix = 'test/'
        # Create a reusable Paginator
        paginator = client.get_paginator('list_objects_v2')
        print('after paginator')
        # Create a PageIterator from the Paginator
        page_iterator = paginator.paginate(Bucket=SOURCE_BUCKET, Prefix=prefix)
        print('after page iterator')
        index = 0
        for page in page_iterator:
            for obj in page['Contents']:
                index += 1
                print("I am looking for {} in the source bucket".format(obj['ETag']))
                copy_source = {'Bucket': SOURCE_BUCKET, 'Key': obj['Key']}
                client.copy(copy_source, DEST_BUCKET, obj['Key'])
        logger.info("number of objects copied {}:".format(index))
    except botocore.exceptions.ClientError as e:
        raise
This version is working fine if I increase the Lambda timeout to 15 minutes and the memory to 512 MB. It checks whether the source object already exists in the destination before copying.
import boto3
import os
import logging
import botocore
from botocore.client import Config

logger = logging.getLogger()
logger.setLevel(os.getenv('debug_level', 'INFO'))
config = Config(connect_timeout=5, retries={'max_attempts': 0})
client = boto3.client('s3', config=config)
# client = boto3.client('s3')


def handler(event, context):
    main(event, logger)


def main(event, logger):
    try:
        DEST_BUCKET = os.environ.get('DST_BUCKET')
        SOURCE_BUCKET = os.environ.get('SRC_BUCKET')
        REGION = os.environ.get('REGION')
        prefix = ''
        # Create a reusable Paginator
        paginator = client.get_paginator('list_objects_v2')
        print('after paginator')
        # Create a PageIterator from the Paginator
        page_iterator_src = paginator.paginate(Bucket=SOURCE_BUCKET, Prefix=prefix)
        page_iterator_dest = paginator.paginate(Bucket=DEST_BUCKET, Prefix=prefix)
        print('after page iterator')
        index = 0
        for page_source in page_iterator_src:
            for obj_src in page_source['Contents']:
                flag = "FALSE"
                for page_dest in page_iterator_dest:
                    for obj_dest in page_dest['Contents']:
                        # checks if source ETag already exists in destination
                        if obj_src['ETag'] in obj_dest['ETag']:
                            flag = "TRUE"
                            break
                    if flag == "TRUE":
                        break
                if flag != "TRUE":
                    index += 1
                    client.copy_object(Bucket=DEST_BUCKET,
                                       CopySource={'Bucket': SOURCE_BUCKET, 'Key': obj_src['Key']},
                                       Key=obj_src['Key'])
                    print("source ETag {} and destination ETag {}".format(obj_src['ETag'], obj_dest['ETag']))
                    print("source Key {} and destination Key {}".format(obj_src['Key'], obj_dest['Key']))
        print("Number of objects copied {}".format(index))
        logger.info("number of objects copied {}:".format(index))
    except botocore.exceptions.ClientError as e:
        raise
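For reference, the timeout and memory increase described above can be applied with a CLI call along these lines (the function name is a placeholder):
aws lambda update-function-configuration --function-name s3-copy-function \
    --timeout 900 --memory-size 512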

AWS Lambda error: No module named 'StringIO'

I'm trying to use AWS Lambda for mass email sending; the code we use is from the link below:
https://aws.amazon.com/cn/premiumsupport/knowledge-center/mass-email-ses-lambda/
from __future__ import print_function

import StringIO
import csv
import json
import os
import urllib
import zlib
from time import strftime, gmtime
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

import boto3
import botocore
import concurrent.futures

__author__ = 'Said Ali Samed'
__date__ = '10/04/2016'
__version__ = '1.0'

# Get Lambda environment variables
region = os.environ['us-east-1']
max_threads = os.environ['10']
text_message_file = os.environ['email_body.txt']
html_message_file = os.environ['email_body.html']

# Initialize clients
s3 = boto3.client('s3', region_name=region)
ses = boto3.client('ses', region_name=region)
send_errors = []
mime_message_text = ''
mime_message_html = ''


def current_time():
    return strftime("%Y-%m-%d %H:%M:%S UTC", gmtime())


def mime_email(subject, from_address, to_address, text_message=None, html_message=None):
    msg = MIMEMultipart('alternative')
    msg['Subject'] = subject
    msg['From'] = from_address
    msg['To'] = to_address
    if text_message:
        msg.attach(MIMEText(text_message, 'plain'))
    if html_message:
        msg.attach(MIMEText(html_message, 'html'))
    return msg.as_string()


def send_mail(from_address, to_address, message):
    global send_errors
    try:
        response = ses.send_raw_email(
            Source=from_address,
            Destinations=[
                to_address,
            ],
            RawMessage={
                'Data': message
            }
        )
        if not isinstance(response, dict):  # log failed requests only
            send_errors.append('%s, %s, %s' % (current_time(), to_address, response))
    except botocore.exceptions.ClientError as e:
        send_errors.append('%s, %s, %s, %s' %
                           (current_time(),
                            to_address,
                            ', '.join("%s=%r" % (k, v) for (k, v) in e.response['ResponseMetadata'].iteritems()),
                            e.message))


def lambda_handler(event, context):
    global send_errors
    global mime_message_text
    global mime_message_html
    try:
        # Read the uploaded csv file from the bucket into python dictionary list
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')
        response = s3.get_object(Bucket=bucket, Key=key)
        body = zlib.decompress(response['Body'].read(), 16 + zlib.MAX_WBITS)
        reader = csv.DictReader(StringIO.StringIO(body),
                                fieldnames=['from_address', 'to_address', 'subject', 'message'])
        # Read the message files
        try:
            response = s3.get_object(Bucket=bucket, Key=text_message_file)
            mime_message_text = response['Body'].read()
        except:
            mime_message_text = None
            print('Failed to read text message file. Did you upload %s?' % text_message_file)
        try:
            response = s3.get_object(Bucket=bucket, Key=html_message_file)
            mime_message_html = response['Body'].read()
        except:
            mime_message_html = None
            print('Failed to read html message file. Did you upload %s?' % html_message_file)
        if not mime_message_text and not mime_message_html:
            raise ValueError('Cannot continue without a text or html message file.')
        # Send in parallel using several threads
        e = concurrent.futures.ThreadPoolExecutor(max_workers=max_threads)
        for row in reader:
            from_address = row['from_address'].strip()
            to_address = row['to_address'].strip()
            subject = row['subject'].strip()
            message = mime_email(subject, from_address, to_address, mime_message_text, mime_message_html)
            e.submit(send_mail, from_address, to_address, message)
        e.shutdown()
    except Exception as e:
        print(e.message + ' Aborting...')
        raise e
    print('Send email complete.')
    # Remove the uploaded csv file
    try:
        response = s3.delete_object(Bucket=bucket, Key=key)
        if 'ResponseMetadata' in response.keys() and response['ResponseMetadata']['HTTPStatusCode'] == 204:
            print('Removed s3://%s/%s' % (bucket, key))
    except Exception as e:
        print(e)
    # Upload errors if any to S3
    if len(send_errors) > 0:
        try:
            result_data = '\n'.join(send_errors)
            logfile_key = key.replace('.csv.gz', '') + '_error.log'
            response = s3.put_object(Bucket=bucket, Key=logfile_key, Body=result_data)
            if 'ResponseMetadata' in response.keys() and response['ResponseMetadata']['HTTPStatusCode'] == 200:
                print('Send email errors saved in s3://%s/%s' % (bucket, logfile_key))
        except Exception as e:
            print(e)
            raise e
    # Reset publish error log
    send_errors = []


if __name__ == "__main__":
    json_content = json.loads(open('event.json', 'r').read())
    lambda_handler(json_content, None)
But there is a problem: when I choose Python 2.7, the error is
module initialization error 'us-east-1'
and when I choose Python 3.6, the error is
Unable to import module 'lambda_function': No module named 'StringIO'
Can anyone tell me what the problem is?
As of Python 3, the StringIO module is gone. Instead, import the io module and use io.StringIO.
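For example, a minimal sketch of the Python 3 equivalent (note that in Python 3 the decompressed S3 body is bytes and would also need to be decoded to str before wrapping it):
import csv
from io import StringIO

body = "from@example.com,to@example.com,Hello,Hi there"
reader = csv.DictReader(StringIO(body),
                        fieldnames=['from_address', 'to_address', 'subject', 'message'])
for row in reader:
    print(row['subject'])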
The problem with the Python 2.7 version is presumably that the following statement is failing:
region = os.environ['us-east-1']
This will result in a KeyError if us-east-1 is not an available environment variable. Instead use AWS_REGION or AWS_DEFAULT_REGION. See the full list of Lambda environment variables.
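A minimal sketch of reading the region from a runtime-provided variable instead:
import os

# AWS_REGION is set automatically by the Lambda runtime
region = os.environ.get('AWS_REGION', 'us-east-1')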
Please set the environment variables as described in step 4 of the article:
"Configure Lambda environment variables appropriate to your usage scenario. For example, the following variables would be valid for a given use case:
REGION=us-east-1, MAX_THREADS=10, TEXT_MESSAGE_FILE=email_body.txt, HTML_MESSAGE_FILE=email_body.html."
What was done (as per the code provided in the question) was to replace the names of the environment variables with their values, which means Python is looking for, e.g., a 'us-east-1' environment variable that isn't there...
This is the original code
# Get Lambda environment variables
region = os.environ['REGION']
max_threads = os.environ['MAX_THREADS']
text_message_file = os.environ['TEXT_MESSAGE_FILE']
html_message_file = os.environ['HTML_MESSAGE_FILE']
You can also hard-code the values, like below:
# Get Lambda environment variables
region = 'us-east-1'
max_threads = '10'
text_message_file = 'email_body.txt'
html_message_file = 'email_body.html'
but I'd suggest setting the environment variables instead (and using the version of the script provided by the article author). When it comes to setting environment variables in Lambda, see this article :)
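For example, setting those variables via the CLI might look like this (the function name is a placeholder):
aws lambda update-function-configuration --function-name mass-email-sender \
    --environment "Variables={REGION=us-east-1,MAX_THREADS=10,TEXT_MESSAGE_FILE=email_body.txt,HTML_MESSAGE_FILE=email_body.html}"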