REST Architecture - How the Url of a complex method would look like? - web-services

I have the following Url which returns me the list of resources:
http://example.com/resources/
I also implemented a method which returns a specific resource (in this case, the resource 142).
http://example.com/resources/142
I would like to add a method which is outside the typical HTTP method: List, Create, Retrieve, Replace, Update. What is the pattern to follow? In my specific case, I need to check the availability of resource. How would the Url look like (http://example.com/resources/checkavailability/142)?
I though about simply using the GET method and retrieve that information as part of the object returned. However, some of my colleagues argue that this would not be efficient (the data to transfer would be much bigger than just returning true/false).
Thanks for the help!

There is no need for a resource to check the availability of another resource, and there is no need for a GET request, a HEAD request should be enough, this is the same as a GET request but without transferring the body. You can then look at the return codes, and via those determine if the resource is available. This is assuming you have properly implemented return codes.

Restful over HTTP gives you uniform interface, you often don't need to encode the actions inside your URL
Regarding your mentioned /checkavailability using GET returning payload inefficiency is a valid reason, so use HEAD (it only gives you back the response headers).
request:
HEAD /resources/123
response status:
404 Not Found: equals to /checkavailability == false
200 OK: equals to /checkavailability == true
Other suggestions uniform interface replacements:
/resources/list : GET /resources
/resources/replace/123: PUT /resources/123
/resources/update/123: PUT /resources/123
/resources/create: POST /resources

Related

REST API for data processing and method chaining

I apologize in advance if the quality of the question is bad. I am still beginning to learn the concepts of REST API. I am trying to implement a scalable REST API for data processing. Here is what I could think of so far.
Consider some numerical data that can be retrieved using a GET call:
GET http://my.api/data/123/
Users can apply a sequence of arithmetic operations such as add and multiply. A non-RESTful way to do that is:
GET http://my.api/data/123?add=10&multiply=5
Assupmtions:
The original data in the DB is not changed. Only an altered version of it is returned to the user.
The data is large in size (say a large multi-dimensional array), so we can't afford to return the whole data with every opertation call. Instead, we want to apply operations as a batch and return the final modified data in the end.
There are 2 RESTful ways I am currently conisdering:
1. Model arithmetic operations as subresources of data.
If we consider add and multiply as subresources of data as here. In this case, we can use:
GET http://my.api/data/123/add/10/
which would be safe and idempotent, given that the original data is never changed. However, we need to chain multiple operations. Can we do that?
GET http://my.api/data/123/add/10/multiply/5/
Where multiply is creating a subresource of add/10/ which itself is a subresource of data/123
Pros:
Statelessness: The sever doesn't keep any information about the modified data.
Easy access to modified data: It is just a simple GET call.
Cons:
Chaining: I don't know if it can be easily implemented.
Long URIs: with each operation applied, the URI gets longer and longer.
2. Create an editable data object:
In this case, a user creates an editable version of the original data:
POST http://my.api/data/123/
will return
201 Created
Location: http://my.api/data/123/edit/{uniqueid}
Users can then PATCH this editable data
PATCH http://my.api/data/123/edit/{uniqueid}
{add:10, multiply:5}
And finally, GET the edited data
GET http://my.api/data/123/edit/{uniqueid}
Pros:
Clean URIs.
Cons:
The server has to save the state of edited data.
Editing is no long idempotent.
Getting edited data requires users to make at least 3 calls.
Is there a cleaner, more semantic way to implement data processing RESTfully?
Edit:
If you are wondering what is the real world problem behind this, I am dealing with digital signal processing.
As a simple example, you can think of applying visual filters to images. Following this example, a RESTful web service can do:
GET http://my.api/image/123/blur/5px/rotate/90deg/?size=small&format=png
A couple of things worth reviewing in your question.
REST based API’s are resource based
So looking at your first example, trying to chain transformation properties into the URL path following a resource identifier..
GET http://my.api/data/123/add/10/multiply/5/
..does not fit well (as well as being complicated to implement dynamically, as you already guessed)
Statelessness
The idea of statelessness in REST is built around a single HTTP call containing enough information to process the request and provide a result without going back to the client for more information. Storing the result of an HTTP call on the server is not state, it’s cache.
Now, given that a REST based API is probably not the best fit for your usage, if you do still want to use it here are your options:
1. Use the Querystring with a common URL operation
You could use the Querystring but simplify the resource path to accept all transformations upon a single URI. Given your examples and reluctance to store transformed results this is probably your best option.
GET http://my.api/data/123/transform?add=10&multiply=5
2. Use POST non-RESTfully
You could use POST requests, and leverage the HTTP body to send in the transformation parameters. This will ensure that you don’t ever run out of space on the query string if you ever decide to do a lot of processing and it will also keep your communication tidier. This isn’t considered RESTful if the POST returns the image data.
3. Use POST RESTfully
Finally, if you decide that you do want to cache things, your POST can in fact store the transformed object (note that REST doesn’t dictate how this is stored, in memory or DB etc.) which can be re-fetched by Id using a GET.
Option A
POSTing to the URI creates a subordinate resource.
POST http://my.api/data/123
{add:10, multiply:5}
returns
201 Created
Location: http://my.api/data/123/edit/{uniqueid}
then GET the edited data
GET http://my.api/data/123/edit/{uniqueid}
Option B
Remove the resource identifier from the URL to make it clear that you're creating a new item, not changing the existing one. The resulting URL is also at the same level as the original one since it's assumed it's the same type of result.
POST http://my.api/data
{original: 123, add:10, multiply:5}
returns
201 Created
Location: http://my.api/data/{uniqueid}
then GET the edited data
GET http://my.api/data/{uniqueid}
There are multiple ways this can be done. In the end it should be clean, regardless of what label you want to give it (REST non-REST). REST is not a protocol with an RFC, so don't worry too much about whteher you pass information as URL paths or URL params. The underlying webservice should be able to get you the data regarless of how it is passed. For example Java Jersey will give you your params no matter if they are param or URL path, its just an annotation difference.
Going back to your specific problem I think that the resource in this REST type call is not so much the data that is being used to do the numerical operations on but the actual response. In that case, a POST where the data ID and the operations are fields might suffice.
POST http://my.api/operations/
{
"dataId": "123",
"operations": [
{
"type": "add",
"value": 10
},
{
"type": "multiply",
"value": 5
}
]
}
The response would have to point to the location of where the result can be retrieved, as you have pointed out. The result, referenced by the location (and ID) in the response, is essentially an immutable object. So that is in fact the resource being created by the POST, not the data used to calculate that result. Its just a different way of viewing it.
EDIT: In response to your comment about not wanting to store the outcome of the operations, then you can use a callback to transmit the results of the operation to the caller. You can easily add the a field in the JSON input for the host or URL of the callback. If the callback URL is present, then you can POST to that URL with the results of the operation.
{
"dataId": "123",
"operations": [
{
"type": "add",
"value": 10
},
{
"type": "multiply",
"value": 5
}
],
"callBack": "<HOST or URL>"
}
Please don't view this as me answering my own question, but rather as a constribution to the discussion.
I have given a lot of thought into this. The main problem with the currently suggested architectures is scalability, since the server creates copies of data each time it is operated on.
The only way to avoid this is to model operations and data separately. So, similar to Jose's answer, we create a resource:
POST http://my.api/operations/
{add:10, multiply:5}
Note here, I didn't specify the data at all. The created resource represents a series of operations only. The POST returns:
201 Created
Location: http://my.api/operations/{uniqueid}
The next step is to apply the operations on the data:
GET http://my.api/data/123/operations/{uniqueid}
This seprate modeling approach have several advantages:
Data is not replicated each time applies a different set of operations.
Users create only operations resources, and since their size is tiny, we don't have to worry about scalability.
Users create a new resource only when they need a new set of operations.Going to the image example: if I am designing a greyscale website, and I want all images to be converted to greyscale, I can do
POST http://my.api/operations/
{greyscale: "50%"}
And then apply this operation on all my images by:
GET http://my.api/image/{image_id}/operations/{geyscale_id}
As long as I don't want to change the operation set, I can use GET only.
Common operations can be created and stored on the server, so users don't have to create them. For example:
GET http://my.api/image/{image_id}/operations/flip
Where operations/flip is already an available operation set.
Easily, applying the same set of operations to different data, and vice versa.
GET http://my.api/data/{id1},{id2}/operations/{some_operation}
Enables you to compare two datasets that are processed similarly. Alternatively:
GET http://my.api/data/{id1}/operations/{some_operation},{another_operation}
Allows you to see how different processing procedures affects the result.
I wouldn't try to describe your math function using the URI or request body. We have a more or less standard language to describe math, so you could use some kind of template.
GET http://my.api/data/123?transform="5*(data+10)"
POST http://my.api/data/123 {"transform": "5*({data}+10)"}
You need a code on client side, which can build these kind of templates and another code in the server side, which can verify, parse, etc... the templates built by the client.

Zero-length URL Segments

Using the latest versions of Flask and Flask-RESTful, I have some very basic routes defined as such:
def build_uri_rules(uri_map):
for cls, uri in uri_map.iteritems():
api.add_resource(cls, uri)
uris = {
SampleController: '/samples/<string:hash_or_id>',
SampleFamilyController: '/samples/<string:hash_or_id>/family',
}
build_uri_rules(uris)
This works for uris requested 'properly', but what if the /samples/ endpoint is hit without a parameter, or the sample*family endpoint is hit with an empty sample id? Currently, this results in a 404 error. This works well enough, but I believe the proper thing here would be to throw a 400 error, as they found a proper URL but their data is improperly structured. Is there a way that I can force this behavior?
As a side note:
Looking through the Werkzeug docs, I see that werkzeug.routing allows a minimum length for certain url parameters, but I also see that it's got a minimum of 1. Admittedly, I've not look for why this is the case, but would this be the right tree to bark up? or should I rather simply create a global 404 handler that checks for the length of the parameter and raise the proper error from there?
Thanks!
EDITED: For code correctness.
I would say that hitting /samples/ or /samples/family (or even /samples//family) should result in a 404 as there is nothing at that endpoint.
If, however, you want to do otherwise, the simplest way to handle it would be create a 404 handler for just /samples/ and /samples/family that returns a note with more information about what the consumers of your API are most likely doing wrong.
uris = {
Explanitory400Controller: '/samples/',
SampleController: '/samples/<string:hash_or_id>',
Explanitory400Controller: '/samples/family',
SampleFamilyController: '/samples/<string:hash_or_id>/family',
}

REST API Design : Is it ok to change the resource identifier during a PUT call?

I'm curious to learn more about RESTful design patterns around the PUT call. Specifically, am I violating norms by changing the resource ID as part of a PUT call?
Consider the following...
POST /api/event/ { ... } - returns the resource ID (eventid) of the new event in the body
GET /api/event/eventid
PUT /api/event/eventid - returns the (possibly new) resource ID depending on request body
GET /api/event/eventid - fails if the original eventid was used in the URI
The endpoints for GET and PUT can quickly access the resource if the eventid represents internal resources (like a database record). If the PUT results in the server moving the underlying resource, the ID can change.
Am I violating norms when I do this?
REST is not a strict specification, but more a set of guidelines and best practices that can be followed to build web-services that are easy to understand and work with. So there's nothing that prevents you from changing a resource IDs during a PUT.
That being said, doing so is IMO a bad practice. One of the ideas behind REST is that each resource can be referenced using a URI. In your case this URI is the concatenation of the path and (I assume) an internal ID. This URI could be used by other "systems" and stored as references. If you change the ID of a resource on a PUT, you change the URI and all references to that resource will be broken (404).
If you feel the need to change the ID that is part of the URI, you may not have picked the right property for it. Consider something else that would be immutable (e.g.: tag your resource with a UUID and use it rather than an internal DB ID).
Not addressing your question full on, but this makes me worry:
returns the resource ID (eventid) of the new event in the body
You aren't returning an integer id, and then letting the client construct urls from this, are you? A proper REST application should give url's to resources, not ids.
As for your question - PUT means something like "Create a new resource at this location". You could conceivably reply with a redirect and a Location header, but it's a bit of a strange thing to do. Besides, the semantics of PUT dictates that you send the entire entity with the request, which is probably not what you want in this scenario. Maybe it would be more fitting to use POST in this situation? (E.g. POST on /api/event/1234
I think it's ok; PUT is still idempotent (repeated calls will not lead to other modifications).
Just: I would ensure that the old ID is not reused, and have the api return 301 codes for calls to old ID (in case other clients had links to the resource).
Maybe the initial PUT that modifies the ID should return a 303 code that point to the new resource location, I'm not sure here.

Return 205 "Reset Content" for RESTful partial update?

Is 205 - Reset Content the appropriate response code to return when a partial update has been applied to a resource and the updated resource document isn't returned in the response?
Updated:
I'm creating a web service that will support partial updates to the URL of the resource. One option is to return a 200 along with an updated resource document. It seems like another option may be to return 205 - Reset Content without a body containing an updated representation indicating that the client should initiate a GET of the resource. It seems that this would be consistent with returning 201 w/ a Location header and no body for resource creation (which is what I'm currently doing for resource creation).
There's no reason not to simply use status code 200.
I still do not find that your question explains what problem you are trying to solve but here is my guess and my answer:
Q: You want to have a resource that the caller can update by sending PUT /myResource but for some reason, you do not want the whole resource to be updated, only to some extent(?).
A1: I have not heard about partial updates as a pattern. You might want to consider splitting your resource into many smaller resources. That might make things easier to develop and for the user of the API to understand.
A2: 201 ("Created") as you mention in a comment should not be used since you are doing an update.
A3: 205 ("Reset Content") does actually not sound totally wrong, since you want the client to invalidate its data but the question is still - why? Also, if only part of the client's data was updated on the server, you do NOT want the client to drop the whole representation.
A4: If the client representation becomes invalid after the PUT, I would say that returning 303 ("See Other") and setting the Location header to the URI of the data is the right way to go, even if the URI has not changed.

Should the paramaters provided in a web service call be included in the response

Iv not got much experience with creating web services, however, I do spend a lot of time interfacing with them.
I wondered if there was a best practice that stated weather or not parameter that are provided in the request should be included in the response.
E.g.
Request:
a.com/getStuff?key=123
(JSON) Response:
{"key":"123",
"value":"abc"}
or..
(JSON) Response:
{"value":"abc"}
I much prefer the more verbose first option because it dos not enforce coupling between the request and the response. i.e. the response dosn't care what the request was, so you do not need to pass state around.
Is there a best practice?
If you are referencing a record in a database, or some other entity that is uniquely identified by an integer, GUID, or specifically-formatted string value, you should ALWAYS return that unique ID with the response, particularly if you are planning to allow the user to update that entity or reference it in a subsequent operation for creating related data or searching for related data.
If you are returning a derived value that may be a composite of many records' values, or of environmentally specific data (such as "How much free disk space is on my server?"), then the supplied parameters wouldn't mean anything in the response, and therefore shouldn't be returned.
Your point on coupling request-response is right on the money. If you are doing multiple simultaneous asynchronous calls, then the key value is very useful when handling the responses.
Referring to your example: I think id should always be part of the resource representation in your case JSON). The representation should be as self-explaining and self-referrable as possible. On top of an id-attribute/field I like also to use a link field:
{
"id":123,
"link":{
"href":"http://api.com/item/123",
"rel":"self"
},
otherData...
}
If your example GET /getStuff?key=123 is more a search (the parameter looks a bit like that) then it good to present the user a "summary" of your search:
{
"items":[{
item1...
},
{
item2...
}
],
"submitted-params":{
"key":"123",
"other-param":"paramValue"
}
}