How to change failure message for Alexa? - amazon-web-services

I want to change Alexa's default failure message: "Sorry, I'm having trouble accessing your {} skill right now."

You cannot change that prompt, but you can write your code to avoid triggering it as much as possible. The error occurs when Alexa cannot get a valid response from your skill endpoint. There can be several reasons for that:
1. Your endpoint is giving an invalid response
This can be caused by errors or exceptions in your endpoint code. Make sure errors and exceptions don't occur and, if they do, that there is code to catch them and return a valid response to Alexa with an error message of your choice.
2. Your endpoint availability
Make sure that your endpoint is available at all times. This is pretty much guaranteed if you are using Lambda endpoints. But if you are using your own hosted web service as the endpoint, then you must take every measure to keep it available for Alexa to communicate with.
3. Your endpoint response time
Make sure that your endpoint returns its response within the time period Alexa expects (about 8 seconds, as far as I know). Also, if you are using Lambda functions, make sure you have configured them with a reasonable timeout to avoid timeout errors.
If you cover the exception/error/availability scenarios well then you can avoid the default error message as much as possible.
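As a minimal sketch of the "catch everything and still answer" idea for a Python Lambda endpoint: the handler below wraps the skill logic in a try/except and always returns a well-formed response body. The names build_speech_response and handle_request, and the fallback wording, are hypothetical and only for illustration; a real skill would more likely use the ASK SDK's own error handlers.

```python
def build_speech_response(text, end_session=True):
    """Build a minimal custom-skill response body with plain-text speech."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handle_request(event):
    """Hypothetical skill logic; it raises on anything it does not understand."""
    request_type = event["request"]["type"]
    if request_type == "LaunchRequest":
        return build_speech_response("Welcome!", end_session=False)
    raise ValueError(f"unsupported request type: {request_type}")


def lambda_handler(event, context):
    """Entry point: never let an exception escape to Alexa."""
    try:
        return handle_request(event)
    except Exception as exc:
        # Log the failure for later debugging, then answer with our own message
        # so Alexa does not fall back to its default failure prompt.
        print(f"skill error: {exc!r}")
        return build_speech_response(
            "Sorry, something went wrong. Please try again later."
        )
```

Note that this only covers the first cause (invalid responses); it does not help if the endpoint is unreachable or too slow.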

Related

Intermittent Internal Server Error - StatusCode 500 on API Gateway calling Lambda

I have a REST API in AWS API Gateway that invokes a Python Lambda function and returns some result.
Most of the time this workflow works fine, meaning that the Lambda function is executed and passes the result back to the API, which in turn returns a 200 OK response.
However, occasionally I get a 500 error code from the API and the Lambda does not even seem to be executed. The response.reason says "Internal Server Error" and no additional information is given.
There is no difference between the failing requests and the successful ones to the API in terms of the method or parameters format.
One more comment is that the API has the cache setting enabled.
I've seen similar posts and some of the answers mention the format of the JSON object returned by the Lambda function, others point to IAM permissions issues, but none of those seem to be the cause here. In fact, as this post's title says, this is intermittent behavior: most of the time it works fine, but occasionally I get this error.
Any hint would be highly appreciated.
I have the same problem and in my case I had to enable Log full requests/responses data together with INFO logs on the API Gateway stage to see the following logs:
(xxx) Endpoint response body before transformations:
{
"Type": "Service",
"message": "INFO: Lambda is initializing your function. It will be ready to invoke shortly."
}
In my case the issue was that the Lambda was in the Inactive state, which happens if a function remains idle for several weeks.
I have the same problem and I suspect a timeout maybe due to lambda reaching its memory limit.
I have raised the memory limit a notch (128 -> 512) and increased the timeout to 10s (default is 3), and now I'm able to see the timeout in action.
I still have the problem for the moment but now I'll be able to investigate.
I hope that this helps you.
I see this with an HTTP API integration. It's intermittent, and it appears to improve when adding provisioned concurrency to the Lambda. For example, on a Lambda that has between 4 and 10 concurrent instances, but usually hovers in the 4 to 8 range, purchasing 5 or 6 provisioned concurrent instances helped reduce, possibly eliminate, these 500 errors.
I am still monitoring to see whether they are gone for good. The frequency of these errors has gone down drastically with the provisioned instances.

AWS lambda execution fails only first time I run it with 'customer function error'

I trigger a lambda function via API gateway and everything works perfectly with the one exception that the very first time I trigger it on a given day it fails.
Strangely, the lambda function logs don't show any errors. I get my usual START log statement and then the request and context of the trigger, then after 5s, it ends unexpectedly.
When I look into the API gateway logs this is the error it returns:
Lambda execution failed with status 200 due to customer function error: 2018-12-10T11:00:31.208Z cc233168-fc9n-11fc-a05a-577bb4sd2b2ccc Task timed out after 5.01 seconds.
Has anyone encountered a similar problem? What is customer function error and how may I resolve this?
Without knowing much about the code you are using, I would call this a cold start. A cold start happens on the first request after your function has not been called for a long time. Notice that the error message says "Task timed out after 5.01 seconds", which means the function hit its configured timeout; you can increase the timeout.
Alternatively, you could reduce the impact of cold starts by shortening them:
by authoring your Lambda functions in a language that doesn’t incur a high cold start time — i.e. Node.js, Python, or Go
choose a higher memory setting for functions on the critical path of handling user requests (i.e. anything that the user would have to wait for a response from, including intermediate APIs)
optimizing your function’s dependencies, and package size
You can also keep the function warm by scheduling a CloudWatch Events rule (a cron job) that pings your API at a regular interval.
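A rough sketch of the keep-warm idea, assuming the scheduled rule invokes the Lambda directly rather than going through the API: the handler recognises the scheduled event and returns early, so the ping keeps a container alive without running the real logic. The helper name is_warmup_event and both return values are made up for illustration.

```python
def is_warmup_event(event):
    """Return True for a scheduled event delivered by a CloudWatch Events rule."""
    return (
        isinstance(event, dict)
        and event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )


def lambda_handler(event, context):
    if is_warmup_event(event):
        # Keep-warm ping: skip the real work and return immediately.
        return {"warmed": True}
    # Placeholder for the real request handling.
    return {"statusCode": 200, "body": "real work done"}
```

A ping like this only keeps one container warm; it does not help when several requests arrive at once.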
Adding to Yash's answer:
I've only seen Lambda execution failed with status 200 in API Gateway execution logs, though in case it can manifest in other ways: ensure you have execution logging enabled for the endpoint. If you didn't already have it enabled you'll need to wait for the problem to manifest again.
You can verify it's a cold start problem as follows:
In the log entry with the error, grab the timestamp of the event and the @logStream value; the latter is a long alphanumeric string like a4f8115980dc83a511eeedc493a78741
Open the log group for that endpoint's execution log -> find the log stream with the identifier you just grabbed
Narrow the date/time range to a window around the time where the event occurred
If you chose a narrow window and it's a cold start problem, I would expect the offending request to be the first one in the list. Click the "There are older events to load. Load more." link at the top of the list.
You should now see a gap of time between the last request received and the offending request.
In my case the error says connection reset by peer which leads me to think it's behaving as though a virtual machine were put to sleep then awoken in the sense that it believes TCP connections it previously had open are still valid.
In the short term the solution we're going with is to implement a retry strategy.
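For what it's worth, a retry strategy on the calling side could look roughly like the sketch below. It assumes the failure surfaces as a ConnectionError (ConnectionResetError is a subclass) and uses exponential backoff with a little jitter; call_with_retries and its parameters are hypothetical names, not part of any AWS SDK.

```python
import random
import time


def call_with_retries(func, attempts=3, base_delay=0.5,
                      retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait base_delay, 2*base_delay, 4*base_delay, ... plus random jitter.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            sleep(delay)
```

Retrying is only safe if the request is idempotent, since the first attempt may have partly succeeded before the connection was reset.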
Besides the cold-start problem, there's another potential aspect of this problem: your API Gateway access log format.
Do the following:
Find the access log entries that correspond to the offending request in the execution log.
Is the HTTP status == 502?
502s in the API Gateway access log usually (always?) indicate the Lambda responded with malformed JSON.
The most obvious reason for it returning malformed JSON is a bug in your code. One of the less obvious reasons: a mistake in the access log format.
If you suspect that's the case, look for the following:
Quoted fields that shouldn't be, e.g. $context.error.messageString
Un-quoted fields that should be. A common idiom is to leave numeric fields un-quoted because it makes insights queries like this work: | filter #status >= 500. As convenient as that is, if the field isn't guaranteed to produce a numeric result then the JSON response will be malformed.
Trailing commas in {} bodies
See the documentation for many of the context variables, though keep one thing in mind: the context variables that are available differ between the API Gateway endpoint types (Lambda, WebSocket, etc.).
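One way to catch these format mistakes before deploying is to render the access log format with sample values and check that the result parses as JSON. The sketch below is only an approximation under stated assumptions: it assumes a missing value is logged as "-", it treats $context.error.messageString as the only variable that arrives already quoted, and the helper names are made up.

```python
import json
import re

# Matches access-log variables such as $context.status.
CONTEXT_VAR = re.compile(r"\$context\.[A-Za-z0-9_.]+")

# Variables assumed to be emitted already wrapped in double quotes.
PRE_QUOTED = {"$context.error.messageString"}


def render(template, value):
    """Replace every $context variable in the template with a sample value."""
    def substitute(match):
        name = match.group(0)
        return json.dumps(value) if name in PRE_QUOTED else value
    return CONTEXT_VAR.sub(substitute, template)


def format_is_valid_json(template, samples=("200", "-")):
    """Return True if the template renders to valid JSON for every sample value."""
    for sample in samples:
        try:
            json.loads(render(template, sample))
        except ValueError:
            return False
    return True
```

For example, a format containing "status": $context.integrationStatus passes with the sample "200" but fails with "-", which is the un-quoted numeric field problem described above.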

Alexa sent multiple request to AWS Lambda

I'm building an Alexa skill that sends a request to my web server; the web server then does some processing and uploads a file to Amazon S3.
While the web server is processing, the skill polls Amazon S3 every 10 seconds until the file is there. The response is based on the file content.
Unfortunately, the web server process takes more than 1 minute, which means the skill must wait more than 1 minute for the file before it can respond.
For now, I use a progressive response with async/await in my code, and the skill does keep waiting for the file on S3.
But I found that the skill automatically sends a second request to Lambda after 50 seconds. That means that for the same skill, I have two Lambda functions running at the same time.
The result is that, 50 seconds after the first progressive response, I hear another progressive response, which belongs to the second request.
And nothing else happens until the end.
I know it is bad to make the skill wait this long, but I still want to find a workable way to do it if the skill needs to.
There are some points I want to figure out.
Is there any way to prevent the skill from sending the second request to Lambda?
Is there another way I can try to accomplish the goal?
Thanks
Eventually, I found that the second invocation of Lambda does not come from Alexa but from AWS Lambda itself. Refer to the following article:
https://cloudonaut.io/your-lambda-function-might-execute-twice-deal-with-it/
So you have to deal with this kind of situation in your Lambda code. One thing you can use is that both invocations have the same request ID. You can tell whether this is the first execution by checking your storage for the request ID, which you store on the first execution.
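A minimal sketch of that request-ID check in a Python handler is shown here. The in-memory InMemoryStore is only a stand-in so the example runs on its own; it is an assumption for illustration, and a real function would need durable shared storage (for example a database table with a conditional write), because two invocations may land in different containers.

```python
class InMemoryStore:
    """Stand-in for durable storage: remembers which request IDs were seen."""

    def __init__(self):
        self._seen = set()

    def claim(self, request_id):
        """Return True the first time an ID is claimed, False afterwards."""
        if request_id in self._seen:
            return False
        self._seen.add(request_id)
        return True


STORE = InMemoryStore()


def lambda_handler(event, context):
    # aws_request_id is the invocation's request ID on the Lambda context object.
    request_id = context.aws_request_id
    if not STORE.claim(request_id):
        # Same request ID seen before: skip the side effects.
        return {"status": "duplicate", "requestId": request_id}
    # Placeholder for the real work (call the web server, read from S3, ...).
    return {"status": "processed", "requestId": request_id}
```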
Besides, I also found that once the Alexa skill waits for more than 1 minute, it crashes and speaks an error (tested on an Amazon Echo). The AWS Lambda log shows nothing different from a normal execution, meaning the log looks fine but the actual result is not.
Hope this helps someone who is also struggling with this problem.

Workaround aws apigateway timeout with lambda - asynchronous processing

I have a serverless backend running on Lambda. The runtime usually varies between 40 and 250s, which is over the API Gateway maximum allowed runtime (29s). As such I think my only option is to resort to asynchronous processing. I get the idea behind it, but help online seems sparse and I'd like to know if there are any best practices out there. Or what would be the simplest way for me to get around this timeout problem, using asynchronous processing or otherwise?
It really depends on your use case. But probably an asynchronous approach is best fitted for this scenario given that it's not usually a good idea from the calling side of your API to wait 250 seconds to get the reply back (probably that's why the 29s limitation on API Gateway).
Asynchronous simply means that you will be replying back from Lambda saying that you received the request and you are going to work on it but it will be available only later.
Then, you will be changing the logic on the client side, too, to check back after some time or perform some checks in a loop until the requested resource is ready.
Depending on what work needs to be done you could create an S3 bucket on the fly and reply back to the client with an S3 presigned URL. Then your worker will upload their results to the S3 bucket and the client will poll that bucket for the results until they are present.
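On the client side, the "check back until ready" loop could be sketched like this. The fetch callable is a placeholder for whatever actually checks the result (for instance an HTTP GET against the presigned URL that returns None while the object is missing); poll_until_ready and its parameters are hypothetical names.

```python
import time


def poll_until_ready(fetch, interval=2.0, timeout=300.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until it returns something other than None."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError("result was not ready before the timeout")
        # Not ready yet: wait before asking again.
        sleep(interval)
```

With a runtime of 40-250s, an interval of a few seconds and a timeout comfortably above 250s seem like reasonable starting values.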

Amazon SNS CreatePlatformApplication returns error when reusing platform applications

I had code that was working that would create a new platform application for every message that went out. I thought that was wasteful so I tried to change the code to use list_platform_applications to get available applications and reuse the one that has the proper name (part of the PlatformApplicationArn).
This will work for several messages in a row when suddenly I'll get this error from CreatePlatformApplication:
{"Error":{"Code":"InvalidParameter","Message":"Invalid parameter: This endpoint is already registered with a different token.","Type":"Sender"},"RequestId":"06bd3443-598e-5c06-9f5c-7f84349ea067"}
That doesn't even make sense. I'm creating an endpoint. I didn't pass one in. Is it really complaining about the endpoint it's returning?
According to the Amazon documentation:
"The CreatePlatformEndpoint action is idempotent, so if the requester
already owns an endpoint with the same device token and attributes,
that endpoint's ARN is returned without creating a new endpoint."
So it seems to me, if there's an appropriate one it will be returned. Otherwise, create a brand new fresh one.
Am I missing something?
Oh darn. I think I found the reason for this behavior. After facing this issue, I made sure that each token was only uploaded once to AWS SNS. When testing this, I realized that nevertheless I ended up with multiple endpoints with the same token - huh???
It turned out that these duplicated tokens resulted from outdated tokens being uploaded to AWS SNS. After creating an endpoint using an outdated token, SNS would automagically revive the endpoint by updating it with the current device token (which afaik is delivered back from GCM as a canonical ID once you try to send push messages to outdated tokens).
So e.g. uploading these (made-up) tokens and custom data
APA9...YFDw, {original_token: APA9...YFDw}
APA9...XaSd, {original_token: APA9...XaSd} <-- Assume this token is outdated
APA9...sVQa, {original_token: APA9...sVQa}
might result in something like this - i.e. different endpoints with identical tokens:
APA9...YFDw, {original_token: APA9...YFDw}, arn:aws:sns:eu-west-1:4711:endpoint/GCM/myapp/daf64...5c204
APA9...YFDw, {original_token: APA9...XaSd}, arn:aws:sns:eu-west-1:4711:endpoint/GCM/myapp/a980f...e3c82 <-- Duplicate token!
APA9...sVQa, {original_token: APA9...sVQa}, arn:aws:sns:eu-west-1:4711:endpoint/GCM/myapp/14777...7d9ff
This scenario in turn seems to lead to the above error on subsequent attempts to create endpoints using outdated tokens. On the one hand, it seems correct that subsequent requests fail. On the other hand, I have the gut feeling that the duplication of tokens that is taking place is wrong, or at least difficult to handle. Maybe once SNS discovers that a token is outdated and needs to be changed, it could first check whether another endpoint with the same token already exists...
I will research on this a bit more and see if I can find a way to handle this properly.
Cheers
Had the same issue, with the device reporting one token (outdated according to GCM) and the SNS retrieving/storing another.
We solved it by clearing the app cache on the device and reopening the app (which in our case, re-registered the device on the gcm service), generating the same token (not outdated) that SNS was attempting to push to.