How to mock multiple dynamodb tables using moto - amazon-web-services

I have a function create which uses 3 DynamoDB tables. How do I mock three DynamoDB tables?
def create():
    # This function uses a DynamoDB table "x".
    # It then calls the my_table() function.
    ...

def my_table():
    # This function uses two DynamoDB tables, "y" and "z".
    # It returns a value which is used in the create() function.
    ...
My test file has the following code:
@patch.dict(os.environ, {"DYNAMODB_TABLE": "x",
                         'second_TABLE': "y",
                         'Third_TABLE': "z"
                         })
def test_create():
    dynamodb_test()
    event = {}  # my event values
    result = create(event)
    assert result == 200

def dynamodb_test():
    with mock_dynamodb2():
        dynamodb = boto3.client('dynamodb', region_name='us-east-1')
        dynamodb.create_table(
            TableName=os.environ["DYNAMODB_TABLE"],
            KeySchema=[
                {
                    'AttributeName': 'id',
                    'KeyType': 'HASH'
                }
            ],
            AttributeDefinitions=[
                {
                    'AttributeName': 'id',
                    'AttributeType': 'S'
                }
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 1,
                'WriteCapacityUnits': 1
            }
        )
        yield dynamodb
Whenever I test the test_create() function using pytest, I get:
botocore.exceptions.ClientError: An error occurred (ExpiredTokenException) when
calling the Scan operation: The security token included in the request is expired
I think it's trying to access the actual AWS DynamoDB, but I want it to use the mocked DynamoDB. How can I achieve this?

Moto only works when two conditions are met:
The logic to be tested is executed inside a Moto-context
The Moto-context is started before any boto3-clients (or resources) are created
The Moto context in your example, with mock_dynamodb2(), is localized to the dynamodb_test function. After the function finishes, the mock is no longer active, and boto3 will indeed try to access AWS itself.
Solution
The following test-function would satisfy both criteria:
@patch.dict(os.environ, {"DYNAMODB_TABLE": "x",
                         'second_TABLE': "y",
                         'Third_TABLE': "z"
                         })
# Initialize the mock here, so that it is effective for the entire test duration
@mock_dynamodb2
def test_create():
    dynamodb_test()
    event = {}  # my event values
    # Ensure that any boto3 clients/resources created in the logic are initialized while the mock is active
    from ... import create
    result = create(event)
    assert result == 200

def dynamodb_test():
    # There is no need to start the mock context again here, so create the table immediately
    dynamodb = boto3.client('dynamodb', region_name='us-east-1')
    dynamodb.create_table(...)
The test code you provided does not create tables y and z; if the logic expects them to exist, you'd have to create them as well, of course (just like table x is created in dynamodb_test), for example:
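A minimal sketch of a dynamodb_test that creates all three tables, assuming y and z use the same single-attribute key schema as x (adjust the key schemas to whatever the real tables use):

def dynamodb_test():
    dynamodb = boto3.client('dynamodb', region_name='us-east-1')
    # Create x, y and z from their environment variables; the shared schema is an assumption.
    for env_var in ("DYNAMODB_TABLE", "second_TABLE", "Third_TABLE"):
        dynamodb.create_table(
            TableName=os.environ[env_var],
            KeySchema=[{'AttributeName': 'id', 'KeyType': 'HASH'}],
            AttributeDefinitions=[{'AttributeName': 'id', 'AttributeType': 'S'}],
            ProvisionedThroughput={'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1}
        )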
Documentation for the import quirk can be found here: http://docs.getmoto.org/en/latest/docs/getting_started.html#recommended-usage

I believe this post is almost identical to yours. You could try this or utilize some of the other existing tools like localstack or dynamodb-local. The Python client for localstack for example: https://github.com/localstack/localstack-python-client
EDIT:
I see from your title that you want to use moto, but I don't see you importing any of the moto modules in your code. See the last snippet in this page and replace s3 with either dynamodb or dynamodb2 (whichever you are using), along these lines:
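A minimal sketch of that pattern adapted to DynamoDB; the table name and schema here are only illustrative:

import boto3
from moto import mock_dynamodb2

@mock_dynamodb2
def test_my_table():
    # Every boto3 call inside this function hits moto's in-memory backend, not AWS.
    client = boto3.client('dynamodb', region_name='us-east-1')
    client.create_table(
        TableName='my-table',  # illustrative name
        KeySchema=[{'AttributeName': 'id', 'KeyType': 'HASH'}],
        AttributeDefinitions=[{'AttributeName': 'id', 'AttributeType': 'S'}],
        ProvisionedThroughput={'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1}
    )
    assert 'my-table' in client.list_tables()['TableNames']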

Related

Dynamically Insert/Update Item in DynamoDB With Python Lambda using event['body']

I am working on a Lambda function that gets called from API Gateway and updates information in DynamoDB. I have half of this working really dynamically, and I'm a little stuck on updating. Here is what I'm working with:
A DynamoDB table with a partition key of guild_id.
The dummy JSON I'm using:
{
    "guild_id": "126",
    "guild_name": "Posted Guild",
    "guild_premium": "true",
    "guild_prefix": "z!"
}
Finally the lambda code:
import json
import boto3

def lambda_handler(event, context):
    client = boto3.resource("dynamodb")
    table = client.Table("guildtable")
    itemData = json.loads(event['body'])
    guild = table.get_item(Key={'guild_id': itemData['guild_id']})
    # If Guild Exists, update
    if 'Item' in guild:
        table.update_item(Key=itemData)
        responseObject = {}
        responseObject['statusCode'] = 200
        responseObject['headers'] = {}
        responseObject['headers']['Content-Type'] = 'application/json'
        responseObject['body'] = json.dumps('Updated Guild!')
        return responseObject
    # New Guild, Insert Guild
    table.put_item(Item=itemData)
    responseObject = {}
    responseObject['statusCode'] = 200
    responseObject['headers'] = {}
    responseObject['headers']['Content-Type'] = 'application/json'
    responseObject['body'] = json.dumps('Inserted Guild!')
    return responseObject
The insert part is working wonderfully. How would I accomplish a similar approach with update_item? I want this to be as dynamic as possible so I can throw any JSON (within reason) at it and have it stored in the database. I also want my update method to handle fields that get added down the road.
I get the following error:
Lambda execution failed with status 200 due to customer function error: An error occurred (ValidationException) when calling the UpdateItem operation: The provided key element does not match the schema.
A "The provided key element does not match the schema" error means something is wrong with Key (= primary key). Your schema's primary key is guild_id: string. Non-key attributes belong in the AttributeUpdate parameter. See the docs.
Your itemdata appears to include non-key attributes. Also ensure guild_id is a string "123" and not a number type 123.
goodKey={"guild_id": "123"}
table.update_item(Key=goodKey, UpdateExpression="SET ...")
The docs have a full update_item example.
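A hedged sketch of a more dynamic update, building the expression from whatever non-key attributes arrive in the payload; the helper name is made up for illustration, and it assumes at least one non-key attribute is present:

def update_guild(table, item_data):
    # Split the payload into the key and the attributes to update.
    key = {'guild_id': item_data['guild_id']}
    attrs = {k: v for k, v in item_data.items() if k != 'guild_id'}

    # Build "SET #k0 = :v0, #k1 = :v1, ..." with placeholders, which also
    # avoids clashes with DynamoDB reserved words.
    names = {'#k{}'.format(i): k for i, k in enumerate(attrs)}
    values = {':v{}'.format(i): v for i, v in enumerate(attrs.values())}
    expression = 'SET ' + ', '.join('#k{0} = :v{0}'.format(i) for i in range(len(attrs)))

    return table.update_item(
        Key=key,
        UpdateExpression=expression,
        ExpressionAttributeNames=names,
        ExpressionAttributeValues=values,
    )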

Error when calling Lambda UDF from Redshift

When calling a Python Lambda UDF from my Redshift stored procedure, I am getting the following error. Any idea what could be wrong?
ERROR: Invalid External Function Response Detail:
-----------------------------------------------
error: Invalid External Function Response code:
8001 context: Extra rows in external function response query: 0
location: exfunc_data.cpp:330 process: padbmaster [pid=8842]
-----------------------------------------------
My Python Lambda UDF looks as follows.
def lambda_handler(event, context):
    # ...
    result = DoJob()
    # ...
    ret = dict()
    ret['results'] = result
    ret_json = json.dumps(ret)
    return ret_json
The above Lambda function is associated with an external function in Redshift named send_email_lambda. The permissions and invocation work without any issues. I am calling the Lambda function as follows:
select send_email_lambda('sebder@company.com',
                         'recipient1@company.com',
                         'sample body',
                         'sample subject');
Edit:
As requested, adding the event payload passed from Redshift to Lambda.
{
    "user": "awsuser",
    "cluster": "arn:aws:redshift:us-central-1:dummy:cluster:redshift-test-cluster",
    "database": "sample",
    "external_function": "lambda_send_email",
    "query_id": 178044,
    "request_id": "20211b87-26c8-6d6a-a256-1a8568287feb",
    "arguments": [
        [
            "sender@company.com",
            "user1@company.com,user2@company.com",
            "<html><h1>Hello Therer</h1><p>A sample email from redshift. Take care and stay safe</p></html>",
            "Redshift email lambda UDF",
            "None",
            "None",
            "text/html"
        ]
    ],
    "num_records": 1
}
It looks like a UDF can be passed multiple rows of data, so it could receive a request to send multiple emails. The code needs to loop through the top-level arguments array and extract the values from the inner array for each row.
It then needs to return a results array that is the same length as the input array.
For your code, put the single result in an array with one entry and return the dictionary containing that array, for example:
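A minimal sketch of that shape; send_email is a hypothetical helper, and the success/num_records/results keys follow the Redshift Lambda UDF response format:

import json

def lambda_handler(event, context):
    results = []
    # One result per input row, in the same order as event["arguments"].
    for args in event.get("arguments", []):
        sender, recipients, body, subject = args[0], args[1], args[2], args[3]
        results.append(send_email(sender, recipients, body, subject))  # hypothetical helper
    return json.dumps({
        "success": True,
        "num_records": len(results),
        "results": results,
    })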

How to see progress when using Glue to export DynamoDB table

I'm trying to export every item in a DynamoDB table to S3. I found this tutorial https://aws.amazon.com/blogs/big-data/how-to-export-an-amazon-dynamodb-table-to-amazon-s3-using-aws-step-functions-and-aws-glue/ and followed the example. Basically,
table = glueContext.create_dynamic_frame.from_options(
    "dynamodb",
    connection_options={
        "dynamodb.input.tableName": table_name,
        "dynamodb.throughput.read.percent": read_percentage,
        "dynamodb.splits": splits
    }
)

glueContext.write_dynamic_frame.from_options(
    frame=table,
    connection_type="s3",
    connection_options={
        "path": output_path
    },
    format=output_format,
    transformation_ctx="datasink"
)
I tested it on a tiny table in a nonprod environment and it works fine. But my DynamoDB table in production is over 400 GB, 200 million items. I suppose it'll take a while, but I have no idea how long to expect: hours, or even days? Is there any way to show progress, for example a count of how many items have been processed? I don't want to blindly start this job and wait.
One way would be to enable continuous logging for your AWS Glue Job to monitor its progress.
Another way would be to trigger a Lambda function whenever a file has been stored in S3, using Amazon S3 event notifications.
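A rough sketch of that second idea, assuming an S3 event notification is configured on the export prefix and wired to this Lambda; it just logs each object the Glue job writes, as a coarse progress signal:

import urllib.parse

def lambda_handler(event, context):
    # Each record describes one object the Glue job has finished writing.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        size = record["s3"]["object"].get("size", 0)
        print("Glue export wrote s3://{}/{} ({} bytes)".format(bucket, key, size))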
Did you try the custom waiter class from the AWS docs?
For instance, a custom waiter for a Glue job should look something like this:
class JobCompleteWaiter(CustomWaiter):
    def __init__(self, client):
        super().__init__(
            "JobComplete",
            "get_job_run",
            "JobRun.JobRunState",
            {"SUCCEEDED": WaitState.SUCCEEDED, "FAILED": WaitState.FAILED},
            client,
            max_tries=100,
        )

    def wait(self, JobName, RunId):
        self._wait(JobName=JobName, RunId=RunId)
According to the boto3 docs, you should expect one of these possible states for a job run: 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT'.
So I chose to check whether it was SUCCEEDED or FAILED.
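A hedged usage sketch; CustomWaiter and WaitState come from the AWS code-example helpers referenced above, and the job name is a placeholder:

import boto3

glue = boto3.client("glue")

# Start the export job and block until it reaches SUCCEEDED or FAILED.
run = glue.start_job_run(JobName="dynamodb-export-job")  # placeholder job name
waiter = JobCompleteWaiter(glue)
waiter.wait(JobName="dynamodb-export-job", RunId=run["JobRunId"])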

Mocking DynamoDB using moto + serverless

I am trying to write tests for a serverless application using the AWS serverless framework. I am facing a weird issue. Whenever I try to mock S3 or DynamoDB using moto, it does not work. Instead of mocking, the boto3 call actually goes to my AWS account and tries to do things there.
This is not desirable behaviour. Could you please help?
Sample Code:
import datetime
import boto3
import uuid
import os
from moto import mock_dynamodb2
from unittest import mock, TestCase
from JobEngine.job_engine import check_duplicate

class TestJobEngine(TestCase):
    @mock.patch.dict(os.environ, {'IN_QUEUE_URL': 'mytemp'})
    @mock.patch('JobEngine.job_engine.logger')
    @mock_dynamodb2
    def test_check_duplicate(self, mock_logger):
        id = 'ABCD123'
        db = boto3.resource('dynamodb', 'us-east-1')
        table = db.create_table(
            TableName='my_table',
            KeySchema=[
                {
                    'AttributeName': 'id',
                    'KeyType': 'HASH'
                }
            ],
            AttributeDefinitions=[
                {
                    'AttributeName': 'id',
                    'AttributeType': 'S'
                }
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 1,
                'WriteCapacityUnits': 1
            }
        )
        table.meta.client.get_waiter('table_exists').wait(TableName='my_table')
        table.put_item(
            Item={
                'id': {'S': id},
                ... other data ...
            }
        )
        res = check_duplicate(id)
        self.assertTrue(mock_logger.info.called)
        self.assertEqual(res, True, 'True')
Please see the above code. I am trying to insert a record into the table and then call a function that verifies whether the specified id is already present in the table. Here, I get a "table already exists" error when I run this code.
If I disable the network, I get an error:
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://dynamodb.us-east-1.amazonaws.com/"
I fail to understand why there is an attempt to connect to AWS if we are trying to mock.
I did some digging and have finally managed to solve this.
See https://github.com/spulec/moto/issues/1793
This issue was due to some incompatibilities between boto and moto. It turns out that everything works fine when botocore is downgraded to 1.10.84.

Using different settings on a aws lambda function coded in python

I'm using a Lambda function, coded in Python, as the backend of an AWS API Gateway method.
The API is complete, but now I have a new problem: the API should be deployed to multiple environments (production, test, etc.), and each one should use a different configuration for the backend. Let's say that I have this handler:
import settings
import boto3

def dummy_handler(event, context):
    logger.info('got event{}'.format(event))
    utils = Utils(event["stage"])
    response = utils.put_ticket_on_dynamodb(event["item"])
    return json.dumps(response)

class Utils:
    def __init__(self, stage):
        self.stage = stage

    def put_ticket_on_dynamodb(self, item):
        # Write record to DynamoDB
        try:
            dynamodb = boto3.resource('dynamodb')
            table = dynamodb.Table(settings.TABLE_NAME)
            table.put_item(Item=item)
        except Exception as e:
            logger.error("Fail to put item on DynamoDB: {0}".format(str(e)))
            raise
        logger.info("Item successfully written to DynamoDB")
        return item
Now, in order to use a different TABLE_NAME on each stage, I replaced the settings.py file with a module with this structure:
settings/
    __init__.py
    _base.py
    _servers.py
    development.py
    production.py
    testing.py
Following this answer here.
But I have no idea how to use it in my solution, considering that stage (passed as a parameter to the Utils class) will match the settings filename in the settings module. What should I change in my Utils class to make it work?
An alternative way of handling this use case is to use API Gateway's stage variables and pass the settings which vary by stage as parameters to your Lambda function.
Stage variables are name-value pairs associated with a specific API deployment stage and act like environment variables for use in your API setup and mapping templates. For example, you can configure an API method in each stage to connect to a different backend endpoint by setting different endpoint values in your stage variables.
Here is a blog post on using stage variables.
Here is the full documentation on using stage variables.
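A hedged sketch of the Lambda side, assuming a Lambda proxy integration (which forwards the deployment stage's variables under event["stageVariables"]) and a stage variable named TABLE_NAME:

import json
import boto3

def dummy_handler(event, context):
    # The stage variable name "TABLE_NAME" is an assumption; configure it per stage in API Gateway.
    stage_vars = event.get("stageVariables") or {}
    table = boto3.resource("dynamodb").Table(stage_vars.get("TABLE_NAME", "defaultTable"))
    table.put_item(Item=json.loads(event["body"]))
    return {"statusCode": 200, "body": json.dumps("ok")}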
I finally used a different approach here. Instead of a Python module for the settings, I used a single settings script with a dictionary containing the configuration for each environment. I would still like to use a separate settings script per environment, but so far I can't find how (one possible way is sketched after the code below).
So, now my settings file looks like this:
COUNTRY_CODE = 'CL'
TIMEZONE = "America/Santiago"
LOCALE = "es_CL"
DEFAULT_PAGE_SIZE = 20

ENV = {
    'production': {
        'TABLE_NAME': "dynamodbTable",
        'BUCKET_NAME': "sssBucketName"
    },
    'testing': {
        'TABLE_NAME': "dynamodbTableTest",
        'BUCKET_NAME': "sssBucketNameTest"
    },
    'test-invoke-stage': {
        'TABLE_NAME': "dynamodbTableTest",
        'BUCKET_NAME': "sssBucketNameTest"
    }
}
And my code:
def put_ticket_on_dynamodb(self, item):
    # Write record to DynamoDB
    try:
        dynamodb = boto3.resource('dynamodb')
        table = dynamodb.Table(settings.ENV[self.stage]["TABLE_NAME"])
        table.put_item(Item=item)
    except Exception as e:
        logger.error("Fail to put item on DynamoDB: {0}".format(str(e)))
        raise
    logger.info("Item successfully written to DynamoDB")
    return item
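For the per-stage settings files mentioned above, one hedged option is to load the module named after the stage at runtime with importlib; the module names assume the settings/ layout shown in the question, and the fallback stage is an assumption:

import importlib

def load_settings(stage):
    # Maps a stage name like "production" to settings/production.py.
    try:
        return importlib.import_module("settings.{}".format(stage))
    except ImportError:
        # Assumed fallback for unknown stages such as test-invoke-stage.
        return importlib.import_module("settings.testing")

# Usage inside Utils.put_ticket_on_dynamodb:
#     stage_settings = load_settings(self.stage)
#     table = dynamodb.Table(stage_settings.TABLE_NAME)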