REST: Mapping application errors to HTTP Status codes - web-services

Is it to be considered good practice to reuse RFC HTTP Status codes like this, or should we be making up new ones that map exactly to our specific error reasons?
We're designing a web service API around a couple of legacy applications.
In addition to JSON/XML data structures in the Response Body, we aim to return HTTP Status Codes that make sense to web caches and developers.
But how do you go about mapping different classes of errors onto appropriate HTTP Status codes? Everyone on the team agrees on the following:
GET /package/1234 returns 404 Not Found if 1234 doesn't exist
GET /package/1234/next_checkpoint returns 400 Bad Request if "next_checkpoint" and 1234 are valid to ask for but next_checkpoint here doesn't make sense...
and so on... but, in some cases, things need to be more specific than just "400" - for example:
POST /dispatch/?for_package=1234 returns 412 Precondition Failed if /dispatch and package 1234 both exist, BUT 1234 isn't ready for dispatch just yet.
(Edit: Status codes in HTTP/1.1 and Status codes in WebDAV ext.)

RESTful use of HTTP means that you must keep the API uniform. This means that you cannot add domain specific methods (ala GET_STOCK_QUOTE) but it also means that you cannot add domain specific error codes (ala 499 Product Out Of Stock).
In fact, the HTTP client error codes are a good design check because if you design your resource semantics properly, the HTTP error code meanings will correctly express any errors. If you feel you need additional error codes, your resource design is likely wrong.
Jan

422 Unprocessable Entity is a useful error code for scenarios like this. For additional information, see the question "What HTTP response code for REST service on PUT method when domain rules invalid".

GET /package/1234/next_checkpoint returns 400 Bad Request if "next_checkpoint" and 1234 are valid to ask for but next_checkpoint here doesn't make sense...
This is the wrong way to think about that URI.
URIs are opaque, so observing that parts of it are 'valid' and others are not doesn't make any sense from a client perspective. Therefore you should 'just' return a 404 to the client, since the resource "package/1234/next_checkpoint" doesn't exist.

You should use 4xx series responses that best match your request when the client makes a mistake, though be careful to not use ones that are meant for specific headers or conditions. I tend to return a human-readable status message and either a plain-text version of the error as the response body or a structured error message, depending on application context.
Update: Upon further reading of the RFC, "Precondition Failed" is meant for the conditional headers, such as "If-None-Match". I'd give a general 400 message for that instead.
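A minimal sketch of that advice in Python, using only the standard library: each application error maps onto an existing 4xx code, and the specifics travel in a structured body. The exception names (PackageNotFound, NotReadyForDispatch), the to_response helper, and the body layout are hypothetical and not taken from any framework.

```python
import json
from http import HTTPStatus


class AppError(Exception):
    """Base class for hypothetical application errors."""
    status = HTTPStatus.BAD_REQUEST
    code = "bad_request"


class PackageNotFound(AppError):
    status = HTTPStatus.NOT_FOUND
    code = "package_not_found"


class NotReadyForDispatch(AppError):
    # Not 412: that code is tied to conditional request headers.
    status = HTTPStatus.BAD_REQUEST
    code = "not_ready_for_dispatch"


def to_response(error):
    """Turn an application error into (status line, JSON body)."""
    status_line = f"{error.status.value} {error.status.phrase}"
    body = json.dumps({"error": error.code, "message": str(error)})
    return status_line, body


status_line, body = to_response(
    NotReadyForDispatch("Package 1234 is not ready for dispatch yet.")
)
print(status_line)
print(body)
```

The status code stays generic enough for caches and clients, while the "error" field gives developers the application-specific reason.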

Actually, you shouldn't do this at all. Your use of 404 Not Found is correct, but 400 Bad Request is being used improperly. A 400 Bad Request according to the RFC is used solely when the HTTP protocol is malformed. In your case, the request is syntactically correct, it is just an unexpected argument. You should return a 500 Server Error and then include an error code in your REST result.

Related

Zero-length URL Segments

Using the latest versions of Flask and Flask-RESTful, I have some very basic routes defined as such:
def build_uri_rules(uri_map):
    # Register each resource class under its URI.
    # items() works on both Python 2 and 3; iteritems() is Python 2 only.
    for cls, uri in uri_map.items():
        api.add_resource(cls, uri)

uris = {
    SampleController: '/samples/<string:hash_or_id>',
    SampleFamilyController: '/samples/<string:hash_or_id>/family',
}
build_uri_rules(uris)
This works for URIs requested 'properly', but what if the /samples/ endpoint is hit without a parameter, or the sample family endpoint is hit with an empty sample id? Currently, this results in a 404 error. This works well enough, but I believe the proper thing here would be to throw a 400 error, as they found a proper URL but their data is improperly structured. Is there a way that I can force this behavior?
As a side note:
Looking through the Werkzeug docs, I see that werkzeug.routing allows a minimum length for certain URL parameters, but I also see that it's got a minimum of 1. Admittedly, I've not looked into why this is the case, but would this be the right tree to bark up? Or should I rather simply create a global 404 handler that checks for the length of the parameter and raises the proper error from there?
Thanks!
EDITED: For code correctness.
I would say that hitting /samples/ or /samples/family (or even /samples//family) should result in a 404 as there is nothing at that endpoint.
If, however, you want to do otherwise, the simplest way to handle it would be to create a 404 handler for just /samples/ and /samples/family that returns a note with more information about what the consumers of your API are most likely doing wrong.
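uris = {
    # A dict cannot hold the same key twice, so the explanatory
    # controller lists both of its URLs in one entry.
    Explanitory400Controller: ('/samples/', '/samples/family'),
    SampleController: ('/samples/<string:hash_or_id>',),
    SampleFamilyController: ('/samples/<string:hash_or_id>/family',),
}
for cls, urls in uris.items():
    # add_resource accepts several URLs for one resource.
    api.add_resource(cls, *urls)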

Create single and multiple resources using restful HTTP

In my API server I have this route defined:
POST /categories
To create one category you do:
POST /categories {"name": "Books"}
I thought that if you want to create multiple categories, then you could do:
POST /categories [{"name": "Books"}, {"name": "Games"}]
I just wanna confirm that this is a good practice for Restful HTTP API.
Or should one have a
POST /bulk
for allowing them to do whatever operations at once (Creating, Reading, Updating and Deleting)?
In true REST, you should probably POST this in multiple separate calls. The reason is that each one will result in a new representation. How would you expect to get that back otherwise?
Each post should return the resultant resource location:
POST -> New Resource Location
POST -> New Resource Location
...
However, if you need a bulk, then create a bulk. Be dogmatic where possible, but if not, pragmatism gets the job done. If you get too hung up on dogmatism, then you never get anything done.
Here is a similar question
Here is one that suggests HTTP Pipelining to make this more efficient
There's nothing particularly wrong with having a bulk operation that you POST to, to activate (it'll be non-idempotent so POST is the right verb) but there are some caveats:
You're making multiple resources, so you need to respond with multiple URLs. This means you can't use the redirect pattern: you'll have to send a list of URLs back in some form.
You have a problem in that bulk operations are often not very discoverable. Discoverability is one of the most important things about RESTfulness, as it means that someone can come along and figure out how to write a client without lots of help from the server author.
Dealing with partial failures when you've got bulk operations remains problematic. It's a problem with any other paradigm too (I've watched people tie themselves in knots over this when working with extensions to SOAP) so it isn't a surprise, but unless you can guarantee that all the creations will work, you're going to have to work out what happens when you make one resource and fail to make the second. (Also, if the bulk request wanted a third one done, would you go on and try that?)
The simplest approach is just to support one create per request; that's a much easier pattern to get right and is better understood all round.
There's nothing wrong with creating multiple resources at once with POST (just don't try it with PUT). It's not "un-REST-ful", especially if you create a representation for the bulk operation itself. I suggest you create an index resource at the same time you create the individual resources, and return a "303 See Other" to it. That index representation would then contain links to all of the created resources (and possibly error information if any of them failed).
POST /categories/uploads/
[{"name": "Books"}, {"name": "Games"}]
303 See Other
Location: /categories/uploads/321/
(actually, now that I think about it, 201 might be better than 303)
GET /categories/uploads/321/
200 OK
Content-Type: application/json
[{"name": "Books", "link": "/categories/Books/"},
{"name": "Games", "error": "The 'Games' category already exists."}]
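The index idea can be sketched in a few lines of Python. This is only an illustration under assumptions: create_categories and the in-memory set of existing names are hypothetical, and the 201 status follows the afterthought above rather than the original 303.

```python
def create_categories(existing, payload):
    """Create every category in payload and build the index representation.

    existing is a set of category names that already exist (a stand-in
    for real storage). Returns (status, index).
    """
    index = []
    for item in payload:
        name = item["name"]
        if name in existing:
            # A partial failure is reported per item instead of failing the batch.
            index.append({"name": name,
                          "error": f"The '{name}' category already exists."})
        else:
            existing.add(name)
            index.append({"name": name, "link": f"/categories/{name}/"})
    return 201, index


existing = {"Games"}
status, index = create_categories(
    existing, [{"name": "Books"}, {"name": "Games"}]
)
for entry in index:
    print(entry)
```

Each entry carries either a link to the created resource or an error, so the client can see which parts of the batch succeeded.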
In your case I would also go the /bulk resource way. But the pattern I would suggest is the following and from my understanding the most natural: Work with the 202 Accepted status code.
The idea of a bulk request is that the server should not be forced to answer immediately, as this would mean the client needs to wait until its bulk request has completed.
Here is the pattern:
POST /bulk [{"name": "Books"}, {"name": "Games"}]
202 Accepted | Location: /bulk/processing/status/resourceId
GET /bulk/processing/status/resourceId
entry = "REST in peace" | completed | 0 errors | /categories/category/resourceId
entry = "Walking dead" | processing | 0 errors ->
So, the client POSTs the bulk information to the server. The server just accepts them with a 202 which gives no guarantee about the processing state at the time of response.
But the server also provides the link to a status resource. Here the client can have a look at each of the created resources and its processing state. When processing has finished, the client can access the resource via the given link.
Error cases can be identified by the client, and erroneous data might be resent by a PUT on the completed resource.
Finally, a piece of advice I usually follow is: whenever you hit something in your design that cannot be mapped onto an HTTP feature, it is probably because of a missing resource.
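A rough Python sketch of the 202 Accepted pattern, with an in-memory dictionary standing in for real job storage. The names submit_bulk, complete_entry, get_status and JOBS are hypothetical, and complete_entry stands for whatever background worker actually creates the resources.

```python
import itertools

_job_ids = itertools.count(1)
JOBS = {}  # job id -> list of per-entry status records


def submit_bulk(entries):
    """Accept the batch without processing it; return (status, headers)."""
    job_id = next(_job_ids)
    JOBS[job_id] = [
        {"entry": e["name"], "state": "processing", "errors": 0, "link": None}
        for e in entries
    ]
    return 202, {"Location": f"/bulk/processing/status/{job_id}"}


def complete_entry(job_id, position, resource_id):
    """Called by a (hypothetical) worker once one entry has been created."""
    record = JOBS[job_id][position]
    record["state"] = "completed"
    record["link"] = f"/categories/category/{resource_id}"


def get_status(job_id):
    """Serve the status resource the Location header points at."""
    if job_id not in JOBS:
        return 404, None
    return 200, JOBS[job_id]


status, headers = submit_bulk([{"name": "REST in peace"},
                               {"name": "Walking dead"}])
complete_entry(1, 0, resource_id=17)
print(status, headers["Location"])
print(get_status(1))
```

The POST returns immediately, and the client polls the status resource until every entry is either completed or reports errors.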
Actually, this is still a hot topic today. To simplify things, I usually say that there is always a better-suited scenario for each practice.
E.g.:
1. If you are receiving likes on a post, you don't need bulk, because there is only one like per comment.
2. If you are receiving favorite comments, bulk can fit well: someone reviews the comments they read, ticks the check boxes for all of their favorites, and sends them at once.
Again, this is based on my experience working with RESTful APIs. Currently, for the sake of multitasking and other things, my colleague and I find ourselves doing bulk all the time in most of the MIS (Management Information System) projects we do. This is because modern web and mobile apps can do a lot of work and send the final results to the back-end, so the back-end has little to do as long as the data received doesn't violate the business logic.

How to properly send 406 status code?

I'm developing a RESTful API service which initially will only accept and respond in JSON format. I want to follow standards, so if the requester's Accept header asks for something other than JSON, I want to respond with a 406 HTTP status code to inform the requester that I cannot output data in another format.
According to W3 I "SHOULD include an entity containing a list of available entity characteristics and location(s) from which the user or user agent can choose the one most appropriate" in my response.
How do I do that, because the above explanation doesn't tell me much. What is the mentioned entity?
Any ideas/suggestions?
EDIT
Initially I thought that maybe it could be a comma-separated list in the Content-Type header, but after rethinking, maybe I should do the same thing browsers do and use the Accept header? This makes much more sense actually, but I cannot find any information to support this.
Three issues here:
First, the note from RFC 2616 is meant to address URI schemes where responses of different types are made available at various URIs, such as "/path/to/thing.xml" vs "/path/to/thing.json". That's not always a popular choice, but if you can do that, do so and include hyperlinks to each one in the "entity"; that is, in the body of the response. Since the RFC doesn't mandate a Content-Type or processing model for such links, you're on your own regarding how to return them, but HTML with <a> tags is common and useful.
If you don't want to expose multiple types at separate URIs, but just want to expose one type at the original URI, then it's perfectly fine to respond with 406 and an entity that simply says which types the resource can emit.
Second, note that most web browsers send */* in the Accept header (with a low quality value), which should match any Content-Type. In addition, the spec says "...if no Accept header field is present, then it is assumed that the client accepts all media types." So the cases where you should be raising 406 are rare.
Third, don't emit a Content-Type response header that is anything other than the Content-Type of the response entity. It should not be used to list acceptable types. You should also not emit a response header named 'Accept'; the 'Accept' header is for requests only; see http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1
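The three points can be put together in a small Python sketch for a JSON-only resource. Everything named here (AVAILABLE, quality_for, respond, the body layout) is hypothetical, and the Accept parsing is simplified: it reads only the q parameter and treats a blank header like a missing one.

```python
import json

AVAILABLE = ["application/json"]  # the only type this resource can emit


def quality_for(media_type, accept_header):
    """q-value the client gives media_type; the most specific range wins."""
    if accept_header is None or not accept_header.strip():
        return 1.0  # no Accept header: the client accepts all media types
    type_, _, _subtype = media_type.partition("/")
    best = (-1, 0.0)  # (specificity, q)
    for part in accept_header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_range == "*/*":
            specificity = 0
        elif media_range == f"{type_}/*":
            specificity = 1
        elif media_range == media_type:
            specificity = 2
        else:
            continue  # this range does not cover media_type
        if specificity > best[0]:
            best = (specificity, q)
    return best[1]


def respond(accept_header):
    """Return (status, headers, body) for the JSON-only resource."""
    if quality_for("application/json", accept_header) > 0:
        return 200, {"Content-Type": "application/json"}, json.dumps({"ok": True})
    # The entity lists what is available; Content-Type describes the entity itself.
    body = json.dumps({"error": "not_acceptable", "available": AVAILABLE})
    return 406, {"Content-Type": "application/json"}, body


print(respond("text/xml"))
print(respond("text/html, */*;q=0.1"))
```

Note that the 406 body is itself JSON, and that no Accept header appears in the response.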

Should the status field in the output of a REST-style API call be a numeric code or a string?

I have a REST-ish API that returns JSON. One of the output fields is status, which indicates if the call succeeded. I gather that it's somewhat standard to use numeric status codes such as
200 success
400 syntax error
401 authentication error
402 general error
404 user not found
408 timed out
500 fatal error
501 not yet implemented
Are there any reasons (other than to transmit slightly fewer bytes) to use numeric codes rather than more descriptive symbols as the status, like
success
syntax_error
auth_error
general_error
user_not_found_error
time_out_error
fatal_error
not_yet_implemented_error
Pointing to modern precedents in well-respected APIs would be helpful.
You can make them what you want, as long as they are standard (across your app) and documented for the benefit of those using the webservice.
The appeal of using HTTP status codes such as 404 is that a true REST webservice is based on HTTP transport, so it makes sense to use recognised standards.
That said, you may find that the HTTP/1.1 status codes are not an exact fit - such as your '404 user not found' equivalent to '404 not found' - I think as long as the meaning is there then fine.
If you have to create completely new codes, it's worth sticking to the HTTP groups - 2xx for acceptable results, 3xx for required changes, 4xx for client-side errors and 5xx for server-side errors.
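As a quick check on the grouping, Python's standard http.HTTPStatus enum can report both the class of a code and its registered meaning. The helper names status_class and registered_phrase are made up for this sketch. One thing it shows is that 402 from the list in the question is registered as "Payment Required", not as a general error.

```python
from http import HTTPStatus

CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Group a status code by its first digit."""
    return CLASSES.get(code // 100, "unknown")


def registered_phrase(code):
    """Standard reason phrase, or None if Python's enum does not know the code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return None


for code in (200, 402, 404, 499):
    print(code, status_class(code), registered_phrase(code))
```

A custom code such as 499 still falls into the client error group, but generic clients have no registered meaning to fall back on beyond that group.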

Best practice for handling HTTP HEAD request with Django on App Engine

I'm receiving HEAD requests in my application, and wondering on the best way to handle them. Options are:
1. Convert them to GETs, process GET normally, then:
   strip the body (though I'm not sure how - response.content = '' doesn't seem to do it).
   It seems App Engine auto-strips the body, giving a warning "Dropping unexpected body in response to HEAD request".
   It seems this is clean, and can be written nicely using decorators or middleware.
2. Handle each HEAD request specially:
   This means I could avoid a DataStore access in some (many?) cases.
   There is a risk, apparently, that middleware which sets the Content-Length header will be prevented from doing so by this approach.
Anything else? Which should I do? Does using App Engine make a difference here? Are there subtle details; if so, is there appropriate middleware to use? To convert to GET, is request.method = "GET" sufficient (it seems to work)?
Did you intend for your application to handle HEAD requests, or are these coming from some anonymous source? You certainly aren't obligated to honor a HEAD request. You can just return with a status code of 405 (Method not allowed) and provide the Allow header with GET or whatever you mean to handle.
I don't think that manually setting request.method to GET is meaningful; in all probability, you are just returning a response that is larger than what the requester wanted. They just wanted to see the headers for the response. If you don't want to handle the HEAD, do the 405 and Allow header approach.
Generally, a client sends a HEAD request because they are trying to be smart about not handling a full response if they don't need to. They are checking to see if the Content-Length has changed since the last time that they saw the response, or they want to see the Last-Modified or Expires header.
It is certainly well-behaved for your application to gracefully handle HEAD requests, but you don't have to.
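Both options can be sketched in framework-free Python. This is not Django or App Engine code: handle_get, HANDLERS and dispatch are hypothetical names, and responses are modelled as plain (status, headers, body) tuples.

```python
def handle_get(path):
    """Hypothetical GET handler returning (status, headers, body bytes)."""
    body = b'{"id": 1234}'
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return 200, headers, body


HANDLERS = {"GET": handle_get}


def dispatch(method, path, support_head=True):
    """Route a request, answering HEAD either gracefully or with 405."""
    if method == "HEAD" and support_head and "GET" in HANDLERS:
        # Run the GET handler, keep its headers, and drop the body.
        status, headers, _body = HANDLERS["GET"](path)
        return status, headers, b""
    handler = HANDLERS.get(method)
    if handler is None:
        # Refuse the method and tell the client what is allowed.
        return 405, {"Allow": ", ".join(sorted(HANDLERS))}, b""
    return handler(path)


print(dispatch("HEAD", "/package/1234"))
print(dispatch("HEAD", "/package/1234", support_head=False))
```

In the graceful case the Content-Length still describes the body a GET would have returned, which is what a client comparing headers is looking for.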