What are best practices for API error response handling?

What are best practices for API error response handling? - django

It is best practice to handle such API errors by try and catch or the API response suppose to be like Google, Facebook and Microsoft API call
Google API call example
Facebook API call example
Microsoft API call example

As a general rule, there isn't such a thing as a standard API, so there also isn't a best practice as such either. If you are dealing with multiple APIs within your app, you'll end up having at least a handful of variations in what you check for and how you adapt.
Depending on how terminal the failure is, and where it happens in their processing stack, the HTTP status may be set, and you may also get an HTML, JSON or XML body with more detail (no matter what you thought you might get).
APIs also fail randomly with transient errors, so for your code to work reliably, you probably need a retry loop somewhere.
They also throttle, so some kind of detect/backoff/retry handler would help (details vary per API, as ever).
Psuedocode:
retry loop {
request
check connection (network errors)
check HTTP status code
check body
parse body if valid and extract errors
if terminal failure exit (authentication/authorisation etc)
if throttling backoff
}

Related

How to change failure message for Alexa?

I want to change the default failure message in Alexa, Sorry, I'm having trouble accessing your {} skill right now.

You cannot change that prompt but you can code to avoid that as much as possible. The error happens when Alexa is not able to get a valid response from your skill endpoint. There can be multiple reasons to that as mentioned here
1. Your endpoint is giving an invalid response
This can be due to the errors/exceptions happening in your endpoint code. You can make sure that error/exceptions don't occur and if they occur, thre is code to catch them and provide a valid response back to Alexa, with an error message of your choice.
2. Your endpoint availability
Make sure that your endpoints are available all the time if you have configured them as an endpoint. This is pretty much guaranteed if you are using Lambda endpoints. But if you are your own hosted web service endpoint, then you must put in all the measures to keep it available for Alexa to communicate with it.
3. Your endpoint response time
Make sure that your endpoint gives back the response within the time period that Alexa expects it to get(guess its 10 seconds). Also make sure if you are using Lambda functions, you have configured them with reasonable execution time to avoid timeout errors.
If you cover the exception/error/availability scenarios well then you can avoid the default error message as much as possible.

AWS lambda execution fails only first time I run it with 'customer function error'

I trigger a lambda function via API gateway and everything works perfectly with the one exception that the very first time I trigger it on a given day it fails.
Strangely, the lambda function logs don't show any errors. I get my usual START log statement and then the request and context of the trigger, then after 5s, it ends unexpectedly.
When I look into the API gateway logs this is the error it returns:
Lambda execution failed with status 200 due to customer function error: 2018-12-10T11:00:31.208Z cc233168-fc9n-11fc-a05a-577bb4sd2b2ccc Task timed out after 5.01 seconds.
Has anyone encountered a similar problem? What is customer function error and how may I resolve this?

without knowing much of the background code you are using, i would termed this a Cold Start. Cold start happens for the first request where your function has not be called for a very long time. If you notice error message says "Time Out after 5.01 seconds. which is default set. you can increase a time out.
Alternatively, you could consider reducing the impact of cold starts by reducing the length of cold starts reference :
by authoring your Lambda functions in a language that doesn’t incur a high cold start time — i.e. Node.js, Python, or Go
choose a higher memory setting for functions on the critical path of handling user requests (i.e. anything that the user would have to wait for a response from, including intermediate APIs)
optimizing your function’s dependencies, and package size
You can also explore by putting a cron job through Cloud Watch after every specific interval to call your API through PING

Adding to Yash's answer:
I've only seen Lambda execution failed with status 200 in API Gateway execution logs, though in case it can manifest in other ways: ensure you have execution logging enabled for the endpoint. If you didn't already have it enabled you'll need to wait for the problem to manifest again.
You can verify it's a cold start problem as follows:
In the log entry with the error grab the #logStream value and the timestamp for the event; it'll be a long string of alphanumerics like a4f8115980dc83a511eeedc493a78741
Open the log group for that endpoint's execution log -> find the log stream with the identifier you just grabbed
Narrow the date/time range to a window around the time where the event occurred
If you chose a narrow window and if it's a cold start problem: I would expect the offending request to be the first one in the list. Click the There are older events to load. Load more. at the top of the list.
You should now see a gap of time between the last request received and the offending request.
In my case the error says connection reset by peer which leads me to think it's behaving as though a virtual machine were put to sleep then awoken in the sense that it believes TCP connections it previously had open are still valid.
In the short term the solution we're going with is to implement a retry strategy.
Besides the cold-start problem, there's another potential aspect of this problem: your API Gateway access log format.
Do the following:
Find the access log entries that correspond to the offending request in the execution log.
Is the HTTP status == 502?
502s in the API Gateway access log usually (always?) indicate the Lambda responded with malformed JSON.
The most obvious reason for it returning malformed JSON is a bug in your code. One of the less obvious reasons: a mistake in the access log format.
If you suspect that's the case, look for the following:
Quoted fields that shouldn't be; eg $context.error.messageString
Un-quoted fields that should be. A common idiom is to leave numeric fields un-quoted because it makes insights queries like this work: | filter #status >= 500. As convenient as that is, if the field isn't guaranteed to produce a numeric result then the JSON response will be malformed.
Trailing commas in {} bodies
Here's the documentation for many of the the context variables, though one thing to keep in mind: the context variables that are available differ between the different API Gateway endpoint types (lambda, websocket, etc).

RESTful way to trigger server-side events

I have a situation where I need my API to have a call for triggering a service-side event, no information (besides authentication) is needed from the client, and nothing needs to be returned by the server. Since this doesn't fit well into the standard CRUD/Resource interaction, should I take this as an indicator that I'm doing something wrong, or is there a RESTful design pattern to deal with these conditions?

Your client can just:
POST /trigger
To which the server would respond with a 202 Accepted.
That way your request can still contain the appropriate authentication headers, and the API can be extended in the future if you need the client to supply an entity, or need to return a response with information about how to query the event status.
There's nothing "non-RESTful" about what you're trying to do here; REST principles don't have to correlate to CRUD operations on resources.
The spec for 202 says:
The entity returned with this response SHOULD include an indication of
the request's current status and either a pointer to a status monitor
or some estimate of when the user can expect the request to be
fulfilled.
You aren't obliged to send anything in the response, given the "SHOULD" in the definition.

REST defines the nature of the communication between the client and server. In this case, I think the issues is there is no information to transfer.
Is there any reason the client needs to initiate this at all? I'd say your server-side event should be entirely self-contained within the server. Perhaps kick it off periodically with a cron call?

API Design: How should distinct classes of errors be handled from an asynchronous XMLHTTP call?

I have a legacy VB6 application that needs to make asynchronous calls to a web service. The web service provides a search method allows end-users to query a central database and view the results from within the application. I'm using the MSXML2.XMLHTTP to make the requests, and have written a SearchWebService class that encapsulates the web service call and code to handle the response asychronously.
Currently, the SearchWebService raises one of two events to the caller: SearchCompleted and SearchFailed. A SearchCompleted event is raised that contains the search results in a parameter to the event if the call completes successfully. A SearchFailed is raised when any type of failure is detected, which can be anything from an improperly-formatted URL (this is possible because the URL is user-configurable), to low-level network errors such as "Host not found", to HTTP errors such as internal server errors. It returns a error message string to the end-user (which is extracted from the web service response body, if present, or from the HTTP status code text if the response has no body, or translated from the network error code if a network error occurs).
Because of various security requirements, the calling application does not access the web service directly, but instead accesses it through a proxy web server running at the customer site, which in turn accesses the actual web service through via a VPN. However, the SearchWebService doesn't know that the calling application is accessing the web service through a proxy: it's just given a URL and told to make the request. The existence of the proxy is a application-level requirement.
The problem is that from an end-user perspective, it's important that the calling application be able to distinguish between low-level network errors versus HTTP errors from the web service, and to distinguish proxy errors from remote web server errors. For example, the application needs to know if a request failed because the proxy server is down, or because the remote web service that the proxy is accessing is down. An application-specific message needs to be presented to the end-user in each case, such as "Search web service proxy server appears to be down. The proxy server may need to be restarted" versus "The proxy is currently running but the remote web server appears to be unavailable. Please contact (name of person in charge of the remote web server)." I could handle this directly in the SearchWebService class, but it seems wrong to generate these application-specific error messages from such a generic class (and the class might be used in environments that don't require a proxy, where the error messages would no longer make sense).
This distinction is important for troubleshooting: a proxy server problem can usually be resolved by the customer, but a remote web server error has to handled by a third party.
I was thinking one way to handle this would be to have the SearchWebService class detect different types of errors and raise different events in each case. For example, instead of a single SearchFailed event, I could have a NetworkError event for low-level network errors (which would indicate a problem accessing the proxy server), a ConfigurationError event for invalid properties on the SearchWebService class (such as passing an improperly-formatted URL), and a ServiceError for errors that occur on the remote web server (implying that the proxy is working properly but the remote server returned an error).
Now that I think about it, there is also an additional error scenario: it could be possible that the proxy server is running properly, but the remote web server is down, or the proxy server has been misconfigured.
Is the approach of using multiple error events to classify different classes of error a reasonable solution to this problem? For the last scenario (the proxy is running but the remote server cannot be reached), I'm guessing I may have to set up the proxy to return a specific HTTP error code so that client can detect this situation (i.e. something more specific than a 500 response).
Originally I kept the single SearchFailed event and simply added an additional errorCode parameter to the event, but that got messy quickly, especially in cases where there wasn't a logical error code to use (such as if the VB6 raises a "real" error, i.e. if the XMLHTTP class isn't registered).

I think that some ideas I've used with Java exceptions may apply here.
Having a large number of different Exceptions gets pretty messy, yet we need to give enough detail to the user so we don't want to lose information.
Hence I have a small number of specific Exceptions, which I guess would correspond to your Events:
InvalidRequestEvent: Used when the user specifies bad information
TransientErrorEvent: used when there's infrastructure issues when a retry might work.
I tend to work in environments where we have clusters of servers so if a user request hits a dying server then if he resubmits he'll probably get a good one, hence from his perspective a simple retry often works. However sometimes the error is with a service such as the Network or Database and in which case the user needs diagnostic information to report to the helpdesk. Hence we need to decide on the extra information to put into the exception. This is (if I understand you correctly) your question.
In the case of InvalidRequestException we would bet giving some information about the problems with the input. It could be on the lines of "Mismatched parenthese" or "Unknown column CUTSOMER in table ORDER". In the case of TransientErrorException it could be "Proxy server is down".
Now depending upon your exact requirments you may not actually choose to put that text in the Exception, but rather an error number which the presentation layer converts to a locale-specific string (English, French ...).
So either Exception might contain something like this (sorry for that Java syntax, but I hope the idea is clear):
BaseException {
String ErrorText; // the error text itself
// OR if you want to allow for internationaliation
int ErrorCode; // my application specific code, corresponds to text held by the UI
String[] params; // specific parameters to be substitued in the error text
// CUTSOMER and ORDER in my example above
int SystemErrorCode; // If you have an underlying error code it goes here
String SystemErrorText; // any further diagnoistic you might need to give to
// the user so that they can report the problem to the
// help desk.
// OR instead of the text (this is something I've seen done)
int SystemErrorTag; // A unique id for this particular error problem.
// This server systems will label their message in the
// server logs. Users just tell the help desk this number
// they don't need to read detailed server error text.
}

What are REST API error handling best practices? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
I'm looking for guidance on good practices when it comes to return errors from a REST API. I'm working on a new API so I can take it any direction right now. My content type is XML at the moment, but I plan to support JSON in future.
I am now adding some error cases, like for instance a client attempts to add a new resource but has exceeded his storage quota. I am already handling certain error cases with HTTP status codes (401 for authentication, 403 for authorization and 404 for plain bad request URIs). I looked over the blessed HTTP error codes but none of the 400-417 range seems right to report application specific errors. So at first I was tempted to return my application error with 200 OK and a specific XML payload (ie. Pay us more and you'll get the storage you need!) but I stopped to think about it and it seems to soapy (/shrug in horror). Besides it feels like I'm splitting the error responses into distinct cases, as some are http status code driven and other are content driven.
So what is the industry recommendations? Good practices (please explain why!) and also, from a client pov, what kind of error handling in the REST API makes life easier for the client code?

A great resource to pick the correct HTTP error code for your API:
http://www.codetinkerer.com/2015/12/04/choosing-an-http-status-code.html
An excerpt from the article:
Where to start:
2XX/3XX:
4XX:
5XX:

So at first I was tempted to return my application error with 200 OK and a specific XML payload (ie. Pay us more and you'll get the storage you need!) but I stopped to think about it and it seems to soapy (/shrug in horror).
I wouldn't return a 200 unless there really was nothing wrong with the request. From RFC2616, 200 means "the request has succeeded."
If the client's storage quota has been exceeded (for whatever reason), I'd return a 403 (Forbidden):
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
This tells the client that the request was OK, but that it failed (something a 200 doesn't do). This also gives you the opportunity to explain the problem (and its solution) in the response body.
What other specific error conditions did you have in mind?

The main choice is do you want to treat the HTTP status code as part of your REST API or not.
Both ways work fine. I agree that, strictly speaking, one of the ideas of REST is that you should use the HTTP Status code as a part of your API (return 200 or 201 for a successful operation and a 4xx or 5xx depending on various error cases.) However, there are no REST police. You can do what you want. I have seen far more egregious non-REST APIs being called "RESTful."
At this point (August, 2015) I do recommend that you use the HTTP Status code as part of your API. It is now much easier to see the return code when using frameworks than it was in the past. In particular, it is now easier to see the non-200 return case and the body of non-200 responses than it was in the past.
The HTTP Status code is part of your api
You will need to carefully pick 4xx codes that fit your error conditions. You can include a rest, xml, or plaintext message as the payload that includes a sub-code and a descriptive comment.
The clients will need to use a software framework that enables them to get at the HTTP-level status code. Usually do-able, not always straight-forward.
The clients will have to distinguish between HTTP status codes that indicate a communications error and your own status codes that indicate an application-level issue.
The HTTP Status code is NOT part of your api
The HTTP status code will always be 200 if your app received the request and then responded (both success and error cases)
ALL of your responses should include "envelope" or "header" information. Typically something like:
envelope_ver: 1.0
status: # use any codes you like. Reserve a code for success.
msg: "ok" # A human string that reflects the code. Useful for debugging.
data: ... # The data of the response, if any.
This method can be easier for clients since the status for the response is always in the same place (no sub-codes needed), no limits on the codes, no need to fetch the HTTP-level status-code.
Here's a post with a similar idea: http://yuiblog.com/blog/2008/10/15/datatable-260-part-one/
Main issues:
Be sure to include version numbers so you can later change the semantics of the api if needed.
Document...

Remember there are more status codes than those defined in the HTTP/1.1 RFCs, the IANA registry is at http://www.iana.org/assignments/http-status-codes. For the case you mentioned status code 507 sounds right.

As others have pointed, having a response entity in an error code is perfectly allowable.
Do remember that 5xx errors are server-side, aka the client cannot change anything to its request to make the request pass. If the client's quota is exceeded, that's definitly not a server error, so 5xx should be avoided.

I know this is extremely late to the party, but now, in year 2013, we have a few media types to cover error handling in a common distributed (RESTful) fashion. See "vnd.error", application/vnd.error+json (https://github.com/blongden/vnd.error) and "Problem Details for HTTP APIs", application/problem+json (https://datatracker.ietf.org/doc/html/draft-nottingham-http-problem-05).

There are two sorts of errors. Application errors and HTTP errors. The HTTP errors are just to let your AJAX handler know that things went fine and should not be used for anything else.
5xx Server Error
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
506 Variant Also Negotiates (RFC 2295 )
507 Insufficient Storage (WebDAV) (RFC 4918 )
509 Bandwidth Limit Exceeded (Apache bw/limited extension)
510 Not Extended (RFC 2774 )
2xx Success
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information (since HTTP/1.1)
204 No Content
205 Reset Content
206 Partial Content
207 Multi-Status (WebDAV)
However, how you design your application errors is really up to you. Stack Overflow for example sends out an object with response, data and message properties. The response I believe contains true or false to indicate if the operation was successful (usually for write operations). The data contains the payload (usually for read operations) and the message contains any additional metadata or useful messages (such as error messages when the response is false).

Agreed. The basic philosophy of REST is to use the web infrastructure. The HTTP Status codes are the messaging framework that allows parties to communicate with each other without increasing the HTTP payload. They are already established universal codes conveying the status of response, and therefore, to be truly RESTful, the applications must use this framework to communicate the response status.
Sending an error response in a HTTP 200 envelope is misleading, and forces the client (api consumer) to parse the message, most likely in a non-standard, or proprietary way. This is also not efficient - you will force your clients to parse the HTTP payload every single time to understand the "real" response status. This increases processing, adds latency, and creates an environment for the client to make mistakes.

Modeling your api on existing 'best practices' might be the way to go.
For example, here is how Twitter handles error codes
https://developer.twitter.com/en/docs/basics/response-codes

Please stick to the semantics of protocol. Use 2xx for successful responses and 4xx , 5xx for error responses - be it your business exceptions or other. Had using 2xx for any response been the intended use case in the protocol, they would not have other status codes in the first place.

Don't forget the 5xx errors as well for application errors.
In this case what about 409 (Conflict)? This assumes that the user can fix the problem by deleting stored resources.
Otherwise 507 (not entirely standard) may also work. I wouldn't use 200 unless you use 200 for errors in general.

If the client quota is exceeded it is a server error, avoid 5xx in this instance.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js