aws boto3 client Stubber help stubbing unit tests - python-2.7

I'm trying to write some unit tests for AWS RDS. Currently, the start/stop RDS API calls have not yet been implemented in moto. I tried just mocking out boto3 but ran into all sorts of weird issues. I did some googling and found http://botocore.readthedocs.io/en/latest/reference/stubber.html
So I have tried to implement the example for RDS, but the code appears to behave like a normal client even though I have stubbed it. I'm not sure what's going on, or whether I am stubbing correctly.
import boto3
from botocore.stub import Stubber

from LambdaRdsStartStop.lambda_function import lambda_handler
from LambdaRdsStartStop.lambda_function import AWS_REGION

def tests_turn_db_on_when_cw_event_matches_tag_value(self, mock_boto):
    client = boto3.client('rds', AWS_REGION)
    stubber = Stubber(client)
    response = {u'DBInstances': [some copy pasted real data here], extra_info_about_call: extra_info}
    stubber.add_response('describe_db_instances', response, {})

    with stubber:
        r = client.describe_db_instances()
        lambda_handler({u'AutoStart': u'10:00:00+10:00/mon'}, 'context')
So the mocking WORKS for the first line inside the stubber, and the value of r is returned as my stubbed data. But when I step into my lambda_handler method inside lambda_function.py and use the client there, it behaves like a normal unstubbed client:
lambda_function.py
def lambda_handler(event, context):
    rds_client = boto3.client('rds', region_name=AWS_REGION)
    rds_instances = rds_client.describe_db_instances()
error output:
File "D:\dev\projects\virtual_envs\rds_sloth\lib\site-packages\botocore\auth.py", line 340, in add_auth
raise NoCredentialsError
NoCredentialsError: Unable to locate credentials

You will need to patch boto3 in the module where it is called by the routine under test. Also, Stubber responses are consumed on each call, so you will need another add_response for each stubbed call, as below:
def tests_turn_db_on_when_cw_event_matches_tag_value(self, mock_boto):
    client = boto3.client('rds', AWS_REGION)
    stubber = Stubber(client)
    # response data below should match the AWS documentation, otherwise
    # botocore's own validation and error handling will raise further errors
    response = {u'DBInstances': [{'DBInstanceIdentifier': 'rds_response1'}, {'DBInstanceIdentifier': 'rds_response2'}]}
    stubber.add_response('describe_db_instances', response, {})
    stubber.add_response('describe_db_instances', response, {})

    # patch boto3 in the module where lambda_handler looks it up
    with mock.patch('LambdaRdsStartStop.lambda_function.boto3') as mock_boto3:
        with stubber:
            r = client.describe_db_instances()  # first add_response consumed here
            mock_boto3.client.return_value = client
            response = lambda_handler({u'AutoStart': u'10:00:00+10:00/mon'}, 'context')  # second add_response consumed here
            # assert r == response
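A usage note: each queued response is consumed by exactly one call, and botocore's Stubber can verify that the test used everything it queued via its assert_no_pending_responses method:

    with stubber:
        r = client.describe_db_instances()
        response = lambda_handler({u'AutoStart': u'10:00:00+10:00/mon'}, 'context')
        stubber.assert_no_pending_responses()  # raises if any queued response went unused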

Related

Pytest on Flask based API - test by calling the remote API

New to using pytest on APIs. From my understanding, testing creates another instance of Flask. Additionally, the tutorials I have seen suggest creating a separate DB table instance to add, fetch and remove data for test purposes. However, I simply plan to use the remote API URL as the host and make the calls against that.
Now, I set up my conftest like this, where the flag --testenv indicates which host below to make the GET/POST calls against:
import pytest
import subprocess

def pytest_addoption(parser):
    """Add option to pass --testenv=api_server to pytest cli command"""
    parser.addoption(
        "--testenv", action="store", default="exodemo", help="my option: type1 or type2"
    )

@pytest.fixture(scope="module")
def testenv(request):
    return request.config.getoption("--testenv")

@pytest.fixture(scope="module")
def testurl(testenv):
    if testenv == 'api_server':
        return 'http://api_url:5000/'
    else:
        return 'http://localhost:5000'
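For reference, nothing in the test file below actually consumes this testurl fixture; a test that really exercised the remote host would need to take the fixture and make a network request itself. A minimal sketch, assuming the requests library and the same /topology/nodes route:

import requests

def test_nodes_remote(testurl):
    # sends a real HTTP request to whichever host --testenv selected
    res = requests.get(testurl.rstrip('/') + '/topology/nodes')
    assert res.status_code == 200

By contrast, Flask's app.test_client() calls the application in-process and never uses the URL from conftest.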
And my test file is written like this:
import json

import pytest
from app import app
from flask import request

def test_nodes():
    t_client = app.test_client()
    truth = [
        {
            *body*
        }
    ]
    res = t_client.get('/topology/nodes')
    print(res)
    assert res.status_code == 200
    assert truth == json.loads(res.get_data())
I run the code using this:
python3 -m pytest --testenv api_server
What I expect is that the test file would simply make a call to the remote API with the creds, fetch the data regardless of how it gets pulled in the remote code, and bring it here for assertion. However, I am getting a 400 BAD REQUEST error:
assert 400 == 200
E + where 400 = <WrapperTestResponse streamed [400 BAD REQUEST]>.status_code
single_test.py:97: AssertionError
--------------------- Captured stdout call ----------------------
{"timestamp": "2022-07-28 22:11:14,032", "level": "ERROR", "func": "connect_to_mysql_db", "line": 23, "message": "Error connecting to the mysql database (2003, \"Can't connect to MySQL server on 'mysql' ([Errno -3] Temporary failure in name resolution)\")"}
<WrapperTestResponse streamed [400 BAD REQUEST]>
Does this mean that the test is still trying to look up the database locally? I am also unable to figure out which host the test requests are actually being sent to, so I am kind of stuck here. Looking to get some help here.
Thanks.

How to run BigQuery after Dataflow job completed successfully

I am trying to run a query in BigQuery right after a Dataflow job completes successfully. I have defined 3 different functions in main.py.
The first one runs the Dataflow job. The second one checks the Dataflow job's status. And the last one runs the query in BigQuery.
The trouble is that the second function checks the Dataflow job status multiple times over a period of time, and after the Dataflow job completes successfully, it does not stop checking the status.
The function deployment then fails due to a 'function load attempt timed out' error.
from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials
import os
import re
import config
from google.cloud import bigquery
import time

global flag

def trigger_job(gcs_path, body):
    credentials = GoogleCredentials.get_application_default()
    service = build('dataflow', 'v1b3', credentials=credentials, cache_discovery=False)
    request = service.projects().templates().launch(projectId=config.project_id, gcsPath=gcs_path, body=body)
    response = request.execute()

def get_job_status(location, flag):
    credentials = GoogleCredentials.get_application_default()
    dataflow = build('dataflow', 'v1b3', credentials=credentials, cache_discovery=False)
    result = dataflow.projects().jobs().list(projectId=config.project_id, location=location).execute()
    for job in result['jobs']:
        if re.findall(r'' + re.escape(config.job_name) + '', job['name']):
            while flag == 0:
                if job['currentState'] != "JOB_STATE_DONE":
                    print('NOT DONE')
                else:
                    flag = 1
                    print('DONE')
                    break

def bq(sql):
    client = bigquery.Client()
    query_job = client.query(sql, location='US')

# module-level code: this runs at import time, during function deployment
gcs_path = config.gcs_path
body = config.body
trigger_job(gcs_path, body)
flag = 0
location = 'us-central1'
get_job_status(location, flag)
sql = """CREATE OR REPLACE TABLE 'table' AS SELECT * FROM 'table'"""
bq(sql)
Cloud Function timeout is set to 540 seconds but deployment fails in 3-4 minutes.
Any help is much appreciated.
It appears from the code snippet provided that your HTTP-triggered Cloud Function is not returning an HTTP response.
All HTTP-based Cloud Functions must return an HTTP response for proper termination. From the Google documentation, Ensure HTTP functions send an HTTP response (emphasis mine):
If your function is HTTP-triggered, remember to send an HTTP response, as shown below. Failing to do so can result in your function executing until timeout. If this occurs, you will be charged for the entire timeout time. Timeouts may also cause unpredictable behavior or cold starts on subsequent invocations, resulting in unpredictable behavior or additional latency.
Thus, you must have a function in your main.py that returns some sort of value, ideally one that can be coerced into a Flask HTTP response.
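A minimal sketch of what that could look like here (assuming an HTTP trigger and a hypothetical entry point named main; the Cloud Functions runtime passes in a flask.Request):

def main(request):
    # moving the module-level calls inside the entry point keeps them
    # out of the import/deployment path
    trigger_job(config.gcs_path, config.body)
    get_job_status('us-central1', 0)
    bq(sql)
    # returning a value lets the function terminate instead of running to timeout
    return ('Dataflow job checked and BigQuery query submitted', 200)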

How can I mock ECS with moto?

I want to create a mock ECS cluster, but it seems not to work properly. Although something is mocked (I don't get a credentials error), it seems not to "save" the cluster.
How can I create a mock cluster with moto?
MVCE
foo.py
import boto3

def print_clusters():
    client = boto3.client("ecs")
    print(client.list_clusters())
    return client.list_clusters()["clusterArns"]
test_foo.py
import boto3
import pytest
from moto import mock_ecs

import foo

@pytest.fixture
def ecs_cluster():
    with mock_ecs():
        client = boto3.client("ecs", region_name="us-east-1")
        response = client.create_cluster(clusterName="test_ecs_cluster")
        yield client

def test_foo(ecs_cluster):
    assert foo.print_clusters() == ["test_ecs_cluster"]
What happens
$ pytest test_foo.py
Test session starts (platform: linux, Python 3.8.1, pytest 5.3.5, pytest-sugar 0.9.2)
rootdir: /home/math/GitHub
plugins: black-0.3.8, mock-2.0.0, cov-2.8.1, mccabe-1.0, flake8-1.0.4, env-0.6.2, sugar-0.9.2, mypy-0.5.0
collecting ...
―――――――――――――――――――――――――――――――――― test_foo ――――――――――――――――――――――――――――――――――
ecs_cluster = <botocore.client.ECS object at 0x7fe9b0c73580>
def test_foo(ecs_cluster):
> assert foo.print_clusters() == ["test_ecs_cluster"]
E AssertionError: assert [] == ['test_ecs_cluster']
E Right contains one more item: 'test_ecs_cluster'
E Use -v to get the full diff
test_foo.py:19: AssertionError
---------------------- Captured stdout call ----------------------
{'clusterArns': [], 'ResponseMetadata': {'HTTPStatusCode': 200, 'HTTPHeaders': {'server': 'amazon.com'}, 'RetryAttempts': 0}}
test_foo.py ⨯
What I expected
I expected the list of cluster ARNs to have one element (not the one in the assert statement, but an ARN). But the list is empty.
When creating a cluster, you're using a mocked ECS client.
When listing the clusters, you're creating a new ECS client outside the scope of moto.
In other words, you're creating a cluster in memory, but then asking AWS itself for a list of clusters.
You could rewrite the method in foo.py to take the mocked ECS client:
def print_clusters(client):
    print(client.list_clusters())
    return client.list_clusters()["clusterArns"]

def test_foo(ecs_cluster):
    assert foo.print_clusters(ecs_cluster) == ["test_ecs_cluster"]
Note that the test above will still fail as written: list_clusters()["clusterArns"] returns full cluster ARNs, not bare cluster names, so the assertion should compare against the full ARN:

def test_foo(ecs_cluster):
    assert foo.print_clusters(ecs_cluster) == ['arn:aws:ecs:us-east-1:123456789012:cluster/test_ecs_cluster']

Explanation: foo.print_clusters(ecs_cluster) returns the list of cluster ARNs, i.e. ['arn:aws:ecs:us-east-1:123456789012:cluster/test_ecs_cluster']. Testing for equality with ["test_ecs_cluster"] is therefore wrong; compare against the full ARN instead.

Internal Server Error when querying endpoint

I have a very simple Lambda function that I created in AWS. Please see below.
import json

print('Loading function')

def lambda_handler(event, context):
    # 1. Parse out query string params
    userChestSize = event['userChestSize']
    print('userChestSize= ' + userChestSize)

    # 2. Construct the body of the response object
    transactionResponse = {}
    transactionResponse['userChestSize'] = userChestSize
    transactionResponse['message'] = 'Hello from Lambda'

    # 3. Construct http response object
    responseObject = {}
    responseObject['statusCode'] = 200
    responseObject['headers'] = {}
    responseObject['headers']['Content-Type'] = 'application/json'
    responseObject['body'] = json.dumps(transactionResponse)

    # 4. Return the response object
    return responseObject
Then I created a simple API with a GET method. It generated an endpoint link for me to test my Lambda. So when I use my link https://abcdefgh.execute-api.us-east-2.amazonaws.com/TestStage?userChestSize=30
I get
{"message": "Internal server error"}
The CloudWatch log has the following error:
'userChestSize': KeyError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 7, in lambda_handler
userChestSize = event['userChestSize']
KeyError: 'userChestSize'
What am I doing wrong? I followed the basic instructions to create the Lambda and the API Gateway.
event['userChestSize'] does not exist. I suggest logging the entire event object so you can see what is actually in the event.
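For what it's worth, with API Gateway's Lambda proxy integration the query string parameters arrive nested under event['queryStringParameters'] rather than at the top level, so the first step of the handler might look like this (a sketch, assuming proxy integration):

def lambda_handler(event, context):
    print(event)  # log the whole event to see its actual shape
    # with proxy integration, query params live under 'queryStringParameters';
    # the value can be None when no query string is sent
    params = event.get('queryStringParameters') or {}
    userChestSize = params.get('userChestSize')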

Lambda Python request athena error OutputLocation

I'm working with AWS Lambda and I would like to run a simple query in Athena and store the result in S3.
My code:
import boto3

def lambda_handler(event, context):
    query_1 = "SELECT * FROM test_athena_laurent.stage limit 5;"
    database = "test_athena_laurent"
    s3_output = "s3://athena-laurent-result/lambda/"

    client = boto3.client('athena')
    response = client.start_query_execution(
        QueryString=query_1,
        ClientRequestToken='string',
        QueryExecutionContext={
            'Database': database
        },
        ResultConfiguration={
            'OutputLocation': 's3://athena-laurent-result/lambda/'
        }
    )
    return response
It works in Spyder (Python 2.7), but in AWS I get this error:
Parameter validation failed:
Invalid length for parameter ClientRequestToken, value: 6, valid range: 32-inf: ParamValidationError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 18, in lambda_handler
'OutputLocation': 's3://athena-laurent-result/lambda/'
I think it doesn't understand my path, and I don't know why.
Thanks
ClientRequestToken (string) --
A unique case-sensitive string used to ensure the request to create the query is idempotent (executes only once). If another StartQueryExecution request is received, the same response is returned and another query is not created. If a parameter has changed, for example, the QueryString , an error is returned. [Boto3 Docs]
This field is autopopulated if not provided.
If you are providing a string value for ClientRequestToken, ensure it is within length limits from 32 to 128.
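For illustration, if you did want to supply your own token, a generated UUID string (36 characters) would satisfy that range; a sketch of one option:

import uuid

token = str(uuid.uuid4())  # 36 characters, within the 32-128 limit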
Per @Tomalak's point, ClientRequestToken is a string. However, per the documentation I just linked, you don't need it anyway when using the SDK:
This token is listed as not required because AWS SDKs (for example the AWS SDK for Java) auto-generate the token for users. If you are not using the AWS SDK or the AWS CLI, you must provide this token or the action will fail.
So, I would refactor as such:
import boto3

def lambda_handler(event, context):
    query_1 = "SELECT * FROM some_database.some_table limit 5;"
    database = "some_database"
    s3_output = "s3://some_bucket/some_tag/"

    client = boto3.client('athena')
    response = client.start_query_execution(
        QueryString=query_1,
        QueryExecutionContext={
            'Database': database
        },
        ResultConfiguration={
            'OutputLocation': s3_output
        }
    )
    return response
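As a quick usage check (building on the sketch above), start_query_execution returns a QueryExecutionId that could be polled inside the handler with get_query_execution to confirm the query's state:

    # hypothetical follow-up inside lambda_handler, after start_query_execution:
    status = client.get_query_execution(QueryExecutionId=response['QueryExecutionId'])
    print(status['QueryExecution']['Status']['State'])  # e.g. QUEUED, RUNNING, SUCCEEDED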