According to a recent update, the AWS medical comprehend service should now be returning snomed categories along with other medical terms.
I am running this in a Python 3.9 Lambda:
import json
def lambda_handler(event, context):
clinical_note = "Patient X was diagnosed with insomnia."
import boto3
cm_client = boto3.client("comprehendmedical")
response = cm_client.infer_snomedct(Text=clinical_note)
print (response)
I get the following response:
{
"errorMessage": "'ComprehendMedical' object has no attribute 'infer_snomedct'",
"errorType": "AttributeError",
"requestId": "560f1d3c-800a-46b6-a674-e0c3c3cc719f",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 7, in lambda_handler\n response = cm_client.infer_snomedct(Text=clinical_note)\n",
" File \"/var/runtime/botocore/client.py\", line 643, in __getattr__\n raise AttributeError(\n"
]
}
So either I am missing something (probably something obvious) or maybe the method is not actually available yet? Anything to set me on the right path would be welcome.
This is most likely due to the default Boto3 version being out of date on my lambda.
Related
I am new to AWS and following a tutorial to read data from my dynamo db to dialogdlow to use as a response, I have been following a tutorial and I get a Webhook call failed. Error:
UNAVAILABLE, State: URL_UNREACHABLE, Reason: UNREACHABLE_5xx, HTTP
status code: 500.
Someone suggested it due to lambda not working(the function is linked to an API gateway I posted the webhook on in Dialogflow), when I ran the function I got the Response
{
"errorMessage": "'queryResult'",
"errorType": "KeyError",
"requestId": "d51caf49-1cf2-4428-b42a-d8dbe9cdb4f8",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 18, in lambda_handler\n distance=event['queryResult']['parameters']['distance']\n"
]
}
Error and my code are attached below, someone said it has to with how I am accessing keys in my python dictionary but I fail to understand
import boto3
from boto3.dynamodb.conditions import Key
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('places')
def findPlaces(distance):
response=table.get_item(Key={'distance': distance})
if 'Item' in response:
return{'fulfillmentText':response['Item']['place']}
else:
return{'fulfillmentText':'No Places within given distance'}
def lambda_handler(event, context):
distance=event['queryResult']['parameters']['distance']
return findPlaces(int(distance))
I also need help with how I would store data from Dialogflow responses back to dynamo DB, thank you
I have an audio file in S3.
I don't know the language of the audio file. So I need to use IdentifyLanguage for start_transcription_job().
LanguageCode will be blank since I don't know the language of the audio file.
Envirionment
Using
Python 3.8 runtime,
boto3 version 1.16.5 ,
botocore version: 1.19.5,
no Lambda Layer.
Here is my code for the Transcribe job:
mediaFileUri = 's3://'+ bucket_name+'/'+prefixKey
transcribe_client = boto3.client('transcribe')
response = transcribe_client.start_transcription_job(
TranscriptionJobName="abc",
IdentifyLanguage=True,
Media={
'MediaFileUri':mediaFileUri
},
)
Then I get this error:
{
"errorMessage": "Parameter validation failed:\nMissing required parameter in input: \"LanguageCode\"\nUnknown parameter in input: \"IdentifyLanguage\", must be one of: TranscriptionJobName, LanguageCode, MediaSampleRateHertz, MediaFormat, Media, OutputBucketName, OutputEncryptionKMSKeyId, Settings, ModelSettings, JobExecutionSettings, ContentRedaction",
"errorType": "ParamValidationError",
"stackTrace": [
" File \"/var/task/app.py\", line 27, in TranscribeSoundToWordHandler\n response = response = transcribe_client.start_transcription_job(\n",
" File \"/var/runtime/botocore/client.py\", line 316, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/var/runtime/botocore/client.py\", line 607, in _make_api_call\n request_dict = self._convert_to_request_dict(\n",
" File \"/var/runtime/botocore/client.py\", line 655, in _convert_to_request_dict\n request_dict = self._serializer.serialize_to_request(\n",
" File \"/var/runtime/botocore/validate.py\", line 297, in serialize_to_request\n raise ParamValidationError(report=report.generate_report())\n"
]
}
With this error, means that I must specify the LanguageCode and IdentifyLanguage is an invalid parameter.
100% sure the audio file exist in S3. But without LanguageCode it don't work, and IdentifyLanguage parameter is unknown parameter
I using SAM application to test locally using this command:
sam local invoke MyHandler -e lambda\TheDirectory\event.json
And I cdk deploy, and check in Aws Lambda Console as well, tested it the same events.json, but still getting the same error
This I think is Lambda Execution environment, I didn't use any Lambda Layer.
I look at this docs from Aws Transcribe:
https://docs.aws.amazon.com/transcribe/latest/dg/API_StartTranscriptionJob.html
and this docs of boto3:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.start_transcription_job
Clearly state that LanguageCode is not required and IdentifyLanguage is a valid parameter.
So what I missing out? Any idea on this? What should I do?
Update:
I keep searching and asked couple person online, I think I should build the function container 1st to let SAM package the boto3 into the container.
So what I do is, cdk synth a template file:
cdk synth --no-staging > template.yaml
Then:
sam build --use-container
sam local invoke MyHandler78A95900 -e lambda\TheDirectory\event.json
But still, I get the same error, but post the stack trace as well
[ERROR] ParamValidationError: Parameter validation failed:
Missing required parameter in input: "LanguageCode"
Unknown parameter in input: "IdentifyLanguage", must be one of: TranscriptionJobName, LanguageCode, MediaSampleRateHertz, MediaFormat, Media, OutputBucketName, OutputEncryptionKMSKeyId, Settings, JobExecutionSettings, ContentRedaction
Traceback (most recent call last):
File "/var/task/app.py", line 27, in TranscribeSoundToWordHandler
response = response = transcribe_client.start_transcription_job(
File "/var/runtime/botocore/client.py", line 316, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/var/runtime/botocore/client.py", line 607, in _make_api_call
request_dict = self._convert_to_request_dict(
File "/var/runtime/botocore/client.py", line 655, in _convert_to_request_dict
request_dict = self._serializer.serialize_to_request(
File "/var/runtime/botocore/validate.py", line 297, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
Really no clue what I doing wrong here. I also report a github issue here, but seem like cant reproduce the issue.
Main Question/Problem:
Unable to start_transription_job
without LanguageCode
with IdentifyLanguage=True
What possible reason cause this, and how can I solve this problem(Dont know the languange of the audio file, I want to identify language of audio file without given the LanguageCode) ?
Check whether you are using the latest boto3 version.
boto3.__version__
'1.16.5'
I tried it and it works.
import boto3
transcribe = boto3.client('transcribe')
response = transcribe.start_transcription_job(TranscriptionJobName='Test-20201-27',IdentifyLanguage=True,Media={'MediaFileUri':'s3://BucketName/DemoData/Object.mp4'})
print(response)
{
"TranscriptionJob": {
"TranscriptionJobName": "Test-20201-27",
"TranscriptionJobStatus": "IN_PROGRESS",
"Media": {
"MediaFileUri": "s3://BucketName/DemoData/Object.mp4"
},
"StartTime": "datetime.datetime(2020, 10, 27, 15, 41, 2, 599000, tzinfo=tzlocal())",
"CreationTime": "datetime.datetime(2020, 10, 27, 15, 41, 2, 565000, tzinfo=tzlocal())",
"IdentifyLanguage": "True"
},
"ResponseMetadata": {
"RequestId": "9e4f94a4-20e4-4ca0-9c6e-e21a8934084b",
"HTTPStatusCode": 200,
"HTTPHeaders": {
"content-type": "application/x-amz-json-1.1",
"date": "Tue, 27 Oct 2020 14:41:02 GMT",
"x-amzn-requestid": "9e4f94a4-20e4-4ca0-9c6e-e21a8934084b",
"content-length": "268",
"connection": "keep-alive"
},
"RetryAttempts": 0
}
}
End up I notice this is because my packaged lambda function isn’t being uploaded for some reason. Here is how I solved it after getting help from couple of people.
First modify CDK stack which define my lambda function like this:
from aws_cdk import (
aws_lambda as lambda_,
core
)
from aws_cdk.aws_lambda_python import PythonFunction
class MyCdkStack(core.Stack):
def __init__(self, scope: core.Construct, id: str, **kwargs) -> None:
super().__init__(scope, id, **kwargs)
# define lambda
my_lambda = PythonFunction(
self, 'MyHandler',
entry='lambda/MyHandler',
index='app.py',
runtime=lambda_.Runtime.PYTHON_3_8,
handler='MyHandler',
timeout=core.Duration.seconds(10)
)
This will use aws-lambda-python module ,it will handle installing all required modules into the docker.
Next, cdk synth a template file
cdk synth --no-staging > template.yaml
At this point, it will bundling all the stuff inside entry path which define in PythonFunction and install all the necessary dependencies defined in requirements.txt inside that entry path.
Next, build the docker container
$ sam build --use-container
Make sure template.yaml file in root directory. This will build a docker container, and the artifact will build inside .aws-sam/build directory in my root directory.
Last step, invoke the function using sam:
sam local invoke MyHandler78A95900 -e path\to\event.json
Now finally successfully call start_transcription_job as stated in my question above without any error.
In Conclusion:
At the very beginning I only pip install boto3, this only will
install the boto3 in my local system.
Then, I sam local invoke without build the container 1st by sam build --use-container
Lastly, I have sam build at last, but in that point, I didn't
bundle what defined inside requirements.txt into the
.aws-sam/build, therefore need to use aws-lambda-python
module as stated above.
I try to use the custom entity recognition I just trained on Amazon Web Service (AWS).
The training worked so far:
However, if i try to recognize my entities with AWS Lambda (code below) with the given ARN-Endpoint I get the following error (even tho AWS should use the latest Version of the botocore/boto3 framework "EntpointArn" is not available (Docs)):
Response:
{
"errorMessage": "Parameter validation failed:\nUnknown parameter in input: \"EndpointArn\", must be one of: Text, LanguageCode",
"errorType": "ParamValidationError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 21, in lambda_handler\n entities = client.detect_entities(\n",
" File \"/var/runtime/botocore/client.py\", line 316, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/var/runtime/botocore/client.py\", line 607, in _make_api_call\n request_dict = self._convert_to_request_dict(\n",
" File \"/var/runtime/botocore/client.py\", line 655, in _convert_to_request_dict\n request_dict = self._serializer.serialize_to_request(\n",
" File \"/var/runtime/botocore/validate.py\", line 297, in serialize_to_request\n raise ParamValidationError(report=report.generate_report())\n"
]
}
I fixed this error with the first 4 lines in my code:
#---The hack I found on stackoverflow----
import sys
from pip._internal import main
main(['install', '-I', '-q', 'boto3', '--target', '/tmp/', '--no-cache-dir', '--disable-pip-version-check'])
sys.path.insert(0,'/tmp/')
#----------------------------------------
import json
import boto3
client = boto3.client('comprehend', region_name='us-east-1')
text = "Thats my nice text with different entities!"
entities = client.detect_entities(
Text = text,
LanguageCode = "de", #If you specify an endpoint, Amazon Comprehend uses the language of your custom model, and it ignores any language code that you provide in your request.
EndpointArn = "arn:aws:comprehend:us-east-1:215057830319:entity-recognizer/MyFirstRecognizer"
)
However, I still get one more error I cannot fix:
Response:
{
"errorMessage": "An error occurred (ValidationException) when calling the DetectEntities operation: 1 validation error detected: Value 'arn:aws:comprehend:us-east-1:XXXXXXXXXXXX:entity-recognizer/MyFirstRecognizer' at 'endpointArn' failed to satisfy constraint: Member must satisfy regular expression pattern: arn:aws(-[^:]+)?:comprehend:[a-zA-Z0-9-]*:[0-9]{12}:entity-recognizer-endpoint/[a-zA-Z0-9](-*[a-zA-Z0-9])*",
"errorType": "ClientError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 25, in lambda_handler\n entities = client.detect_entities(\n",
" File \"/tmp/botocore/client.py\", line 316, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/tmp/botocore/client.py\", line 635, in _make_api_call\n raise error_class(parsed_response, operation_name)\n"
]
}
This error also occurs if I use the NodeJS framework with the given endpoint. The funny thing I should mention is that every ARN-Endpoint I found (in tutorials) looks exactly like mine and do not match with the regex-pattern returned as error.
I'm not quite sure if I do something wrong here or if it is a bug on the AWS-Cloud (or SDK).. Maybe somebody can reproduce this error and/or find a solution (or even a hack) for this problem
Cheers
Endpoint ARN are different AWS resource as compared to Model ARN. Model ARN refers to a custom model while endpoint hosts that model. In your code, your code you are passing in the modelARN instead of endpointARN which is causing the error to be raised.
You can differentiate between the two ARNs on the basis of the prefix.
endpoint arn - arn:aws:comprehend:us-east-1:XXXXXXXXXXXX:entity-recognizer-endpoint/xxxxxxxxxx
model arn - arn:aws:comprehend:us-east-1:XXXXXXXXXXXX:entity-recognizer/MyFirstRecognizer
You can read more about Comprehend custom endpoints and its pricing on the documentation page.
https://docs.aws.amazon.com/comprehend/latest/dg/detecting-cer-real-time.html
I'm working with AWS Lambda and I would like to make a simple query in athena and store my data in an s3.
My code :
import boto3
def lambda_handler(event, context):
query_1 = "SELECT * FROM test_athena_laurent.stage limit 5;"
database = "test_athena_laurent"
s3_output = "s3://athena-laurent-result/lambda/"
client = boto3.client('athena')
response = client.start_query_execution(
QueryString=query_1,
ClientRequestToken='string',
QueryExecutionContext={
'Database': database
},
ResultConfiguration={
'OutputLocation': 's3://athena-laurent-result/lambda/'
}
)
return response
It works on spyder 2.7 but in AWS I have this error :
Parameter validation failed:
Invalid length for parameter ClientRequestToken, value: 6, valid range: 32-inf: ParamValidationError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 18, in lambda_handler
'OutputLocation': 's3://athena-laurent-result/lambda/'
I think that It doesn't understand my path and I don't know why.
Thanks
ClientRequestToken (string) --
A unique case-sensitive string used to ensure the request to create the query is idempotent (executes only once). If another StartQueryExecution request is received, the same response is returned and another query is not created. If a parameter has changed, for example, the QueryString , an error is returned. [Boto3 Docs]
This field is autopopulated if not provided.
If you are providing a string value for ClientRequestToken, ensure it is within length limits from 32 to 128.
Per #Tomalak's point ClientRequestToken is a string. However, per the documentation I just linked, you don't need it anyway when using the SDK.
This token is listed as not required because AWS SDKs (for example the AWS SDK for Java) auto-generate the token for users. If you are not using the AWS SDK or the AWS CLI, you must provide this token or the action will fail.
So, I would refactor as such:
import boto3
def lambda_handler(event, context):
query_1 = "SELECT * FROM some_database.some_table limit 5;"
database = "some_database"
s3_output = "s3://some_bucket/some_tag/"
client = boto3.client('athena')
response = client.start_query_execution(QueryString = query_1,
QueryExecutionContext={
'Database': database
},
ResultConfiguration={
'OutputLocation': s3_output
}
)
return response
I have a Lambda function, which is a basic Python GET call to an API. It works fine locally, however when I upload to Lambda (along with the requests library) it will not return the JSON response from the API call. I simply want it to return the entire JSON object to the caller. Am I doing something fundamentally wrong here - I stumbled across a couple of articles saying that returning JSON from a Lambda Python function is not supported.
Here is the code:
import requests
import json
url = "http://url/api/projects/"
headers = {
'content-type': "application/json",
'x-octopus-apikey': "redacted",
'cache-control': "no-cache"
}
def lambda_handler(event, context):
response = requests.request("GET", url, headers=headers)
return response
My package contains the requests library and dist, and the json library (I don't think it needs this though). Error message returned is:
{
"stackTrace": [
[
"/usr/lib64/python2.7/json/__init__.py",
251,
"dumps",
"sort_keys=sort_keys, **kw).encode(obj)"
],
[
"/usr/lib64/python2.7/json/encoder.py",
207,
"encode",
"chunks = self.iterencode(o, _one_shot=True)"
],
[
"/usr/lib64/python2.7/json/encoder.py",
270,
"iterencode",
"return _iterencode(o, 0)"
],
[
"/var/runtime/awslambda/bootstrap.py",
41,
"decimal_serializer",
"raise TypeError(repr(o) + \" is not JSON serializable\")"
]
],
"errorType": "TypeError",
"errorMessage": "<Response [200]> is not JSON serializable"
}
I've resolved this - the problem with my Python code is that it was trying to return the entire response, rather than simply the JSON body (the code for my local version prints 'response.text'). In addition, I have ensured that the response is JSON formatted (rather than raw text). Updated code:
import requests
import json
url = "http://url/api/projects/"
headers = {
'content-type': "application/json",
'x-octopus-apikey': "redacted",
'cache-control': "no-cache"
}
def lambda_handler(event, context):
response = requests.request("GET", url, headers=headers)
try:
output = response.json()
except ValueError:
output = response.text
return output
I was also getting same error and for a while I was able to solve this by changing the response code in Lambda python3.6:
Change : response['Body'].read() to response['Body'].read().decode()
This way you will get a JSON though in my case I got / everywhere which I removed later.