REST Array manipulation best practice - web-services

I have full access to foo resource via REST:
{
"name": "foo",
"tags": [
"tag01",
"tag02",
"tag03"
]
}
I would like to delete tag01 in tags array.
Usually I would GET /foo and PUT /foo back without tag01.
In this case this object is small, so this is ok.
But let's assume it's much bigger. In that case I don't want to download and upload all of this data. After some Google research I found HTTP PATCH. It looks like exactly what I need.
My request in PATCH way is now
PATCH /foo/tags?op={add|delete}
To delete I would use:
PATCH /foo/tags?op=delete
With this data:
{
"value": "tag01"
}
There are now two things that I don't like:
the query field op - are there default names described in an RFC or something like that?
the member value in the request data - this is also a freely chosen name
It doesn't look correct to me.
Is there some other way to manipulate arrays via REST?
Are there some name conventions to do it in PATCH way?

The payload of a PATCH should contain "instructions describing how a resource currently residing on the origin server should be modified to produce a new version". All information should be passed in the payload and not in query-params.
For instance you could send:
PATCH /foo
[
{
"op": "remove",
"path": "/tags/0"
}
]
Path /tags/0 points to the first element of the array. The remaining elements should be shifted to the left.
See the JSON Patch specification (RFC 6902) for more details.
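To make the effect of that patch concrete, here is a minimal Python sketch of how a server might apply a "remove" operation. It is an illustration only: the helper name apply_remove is made up for this example, it handles nothing but "remove", and a real service would more likely use an existing JSON Patch library.

```python
import copy


def apply_remove(document, path):
    """Return a copy of `document` with the value at JSON Pointer `path` removed."""
    # Split "/tags/0" into ["tags", "0"] and undo JSON Pointer escaping.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    result = copy.deepcopy(document)
    parent = result
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]
    last = tokens[-1]
    if isinstance(parent, list):
        del parent[int(last)]  # the remaining elements shift to the left
    else:
        del parent[last]
    return result


foo = {"name": "foo", "tags": ["tag01", "tag02", "tag03"]}
patch = [{"op": "remove", "path": "/tags/0"}]

patched = foo
for operation in patch:
    if operation["op"] != "remove":
        raise ValueError("this sketch only handles 'remove'")
    patched = apply_remove(patched, operation["path"])

print(patched["tags"])  # ['tag02', 'tag03']
```

Note that the client only sends the small patch document, never the full resource.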

Is there some other way to manipulate arrays via REST?
Yes, and you should use one, because your current approach is not correct. In REST you map your URLs to resources (not operations), and you manipulate resources using HTTP methods and by sending representations. Having an op:remove in a URL or in a representation is wrong.
Are there some name conventions to do it in PATCH way?
No, there are no REST naming conventions. The URI structure does not matter to REST clients, because they follow hyperlinks with semantic annotations.
If you need an op:remove or similar somewhere, then it indicates that your URI - resource mapping is not good. Probably you have to define a new resource or rethink the resource structure.
I would describe what you want as a bulk create and bulk delete. You can model these cases with something like:
POST /collection [{},{},...] -> 201
DELETE /collection?filter="..." -> 204
In order to delete something from a collection you need a resource identifier URI. In this case this can contain the tag name or the index in the array (if it is ordered).
/foo/tags/tag01
/foo/tags/0
It is up to you, but I would use the tag name.
After that it is pretty simple:
POST /foo/tags ["a","b","c"]
DELETE /foo/tags?name="a,b,c"
So PATCH is not the method you are looking for, because you are creating and removing resources and not replacing them.
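As a rough sketch of the tags-as-subresources idea, the handlers could look something like this in Python. Everything here is assumed for illustration: the in-memory resources dict stands in for a real backend, the function names are invented, and the name query parameter is taken as a plain comma-separated list without the surrounding quotes.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory store standing in for the real backend.
resources = {"foo": {"name": "foo", "tags": ["tag01", "tag02", "tag03"]}}


def post_tags(resource_id, new_tags):
    """POST /{resource_id}/tags with a JSON array body: add tags not yet present."""
    tags = resources[resource_id]["tags"]
    for tag in new_tags:
        if tag not in tags:
            tags.append(tag)
    return 201


def delete_tags(url):
    """DELETE /{resource_id}/tags/{tag} or DELETE /{resource_id}/tags?name=a,b,c"""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    resource_id = segments[0]
    if len(segments) == 3:
        names = [segments[2]]  # single tag identified by its own URI
    else:
        names = parse_qs(parts.query)["name"][0].split(",")  # bulk delete via filter
    tags = resources[resource_id]["tags"]
    resources[resource_id]["tags"] = [t for t in tags if t not in names]
    return 204


delete_tags("/foo/tags/tag01")
post_tags("foo", ["a", "b", "c"])
delete_tags("/foo/tags?name=a,b,c")
print(resources["foo"]["tags"])  # ['tag02', 'tag03']
```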

Related

REST - post to get data. How else can this be done?

According to my understandings, you should not post to get data.
For example, I'm on a project and we are posting to get data.
For example, this following.
{
"zipCOde":"85022",
"city":"PHOENIX",
"country":"US",
"products":[
{
"sku":"abc-21",
"qty":2
},
{
"sku":"def-13",
"qty":2
}
]
}
Does it make sense to post? How could this be done without posting? There could be 1 or more products.
Actually there is a SEARCH method in HTTP, but sadly it is for webdav. https://msdn.microsoft.com/en-us/library/aa143053(v=exchg.65).aspx So if you want to send a request body with the request, then you can try with that.
POSTing is okay if you have a complex search. "Complex" is relative; to me it means that your query combines different logical operators.
The current one is not that complex, and you can put the non-hierarchical components into the query string of the URI. An example with additional line breaks:
GET /products/?
zipCOde=85022&
city=PHOENIX&
country=US&
filters[0]['sku']=abc-21&
filters[0]['qty']=2&
filters[1]['sku']=def-13&
filters[1]['qty']=2
You can choose a different serialization format and encode it as URI component if you want.
GET /products/?filter={"zipCOde":"85022","city":"PHOENIX","country":"US","products":[{"sku":"abc-21","qty":2},{"sku":"def-13","qty":2}]}
One potential option is to JSON.serialize your object and send it as a query string parameter on the GET.
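Here is a small Python sketch of that JSON-in-the-query-string variant, covering both the client side (building the URL) and the server side (reading it back). The parameter name filter follows the example above; the rest is just standard-library plumbing and not a prescribed design.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

search = {
    "zipCOde": "85022",
    "city": "PHOENIX",
    "country": "US",
    "products": [
        {"sku": "abc-21", "qty": 2},
        {"sku": "def-13", "qty": 2},
    ],
}

# Client side: serialize to compact JSON and percent-encode it as one parameter.
query = urlencode({"filter": json.dumps(search, separators=(",", ":"))})
url = "/products/?" + query

# Server side: decode the parameter and parse the JSON again.
received = json.loads(parse_qs(urlsplit(url).query)["filter"][0])
print(received == search)  # True
```

Keep in mind that servers and proxies limit URL length in practice, so this only works while the search stays reasonably small.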

REST API for data processing and method chaining

I apologize in advance if the quality of the question is bad. I am still beginning to learn the concepts of REST API. I am trying to implement a scalable REST API for data processing. Here is what I could think of so far.
Consider some numerical data that can be retrieved using a GET call:
GET http://my.api/data/123/
Users can apply a sequence of arithmetic operations such as add and multiply. A non-RESTful way to do that is:
GET http://my.api/data/123?add=10&multiply=5
Assumptions:
The original data in the DB is not changed. Only an altered version of it is returned to the user.
The data is large in size (say a large multi-dimensional array), so we can't afford to return the whole data with every operation call. Instead, we want to apply operations as a batch and return the final modified data in the end.
There are 2 RESTful ways I am currently considering:
1. Model arithmetic operations as subresources of data.
Suppose we consider add and multiply as subresources of data. In this case, we can use:
GET http://my.api/data/123/add/10/
which would be safe and idempotent, given that the original data is never changed. However, we need to chain multiple operations. Can we do that?
GET http://my.api/data/123/add/10/multiply/5/
Where multiply is creating a subresource of add/10/ which itself is a subresource of data/123
Pros:
Statelessness: The server doesn't keep any information about the modified data.
Easy access to modified data: It is just a simple GET call.
Cons:
Chaining: I don't know if it can be easily implemented.
Long URIs: with each operation applied, the URI gets longer and longer.
2. Create an editable data object:
In this case, a user creates an editable version of the original data:
POST http://my.api/data/123/
will return
201 Created
Location: http://my.api/data/123/edit/{uniqueid}
Users can then PATCH this editable data
PATCH http://my.api/data/123/edit/{uniqueid}
{add:10, multiply:5}
And finally, GET the edited data
GET http://my.api/data/123/edit/{uniqueid}
Pros:
Clean URIs.
Cons:
The server has to save the state of edited data.
Editing is no longer idempotent.
Getting edited data requires users to make at least 3 calls.
Is there a cleaner, more semantic way to implement data processing RESTfully?
Edit:
If you are wondering what is the real world problem behind this, I am dealing with digital signal processing.
As a simple example, you can think of applying visual filters to images. Following this example, a RESTful web service can do:
GET http://my.api/image/123/blur/5px/rotate/90deg/?size=small&format=png
A couple of things worth reviewing in your question.
REST-based APIs are resource based
So looking at your first example, trying to chain transformation properties into the URL path following a resource identifier..
GET http://my.api/data/123/add/10/multiply/5/
..does not fit well (as well as being complicated to implement dynamically, as you already guessed)
Statelessness
The idea of statelessness in REST is built around a single HTTP call containing enough information to process the request and provide a result without going back to the client for more information. Storing the result of an HTTP call on the server is not state, it’s cache.
Now, given that a REST based API is probably not the best fit for your usage, if you do still want to use it here are your options:
1. Use the Querystring with a common URL operation
You could use the Querystring but simplify the resource path to accept all transformations upon a single URI. Given your examples and reluctance to store transformed results this is probably your best option.
GET http://my.api/data/123/transform?add=10&multiply=5
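For what it's worth, here is a minimal Python sketch of how a server could handle such a transform URL. It assumes the operations are applied in the order they appear in the query string and that the data is a flat list of numbers. The OPERATIONS table and the transform function are names made up for this example.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical operation table: each entry maps a list of numbers to a new list.
OPERATIONS = {
    "add": lambda values, n: [v + n for v in values],
    "multiply": lambda values, n: [v * n for v in values],
}


def transform(url, values):
    """Apply the query-string operations in order without touching `values`."""
    # parse_qsl keeps the parameters in the order they appear in the URL.
    for name, raw in parse_qsl(urlsplit(url).query):
        if name not in OPERATIONS:
            raise ValueError("unknown operation: " + name)
        values = OPERATIONS[name](values, float(raw))
    return values


original = [1, 2, 3]
result = transform("http://my.api/data/123/transform?add=10&multiply=5", original)
print(result)  # [55.0, 60.0, 65.0]
```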
2. Use POST non-RESTfully
You could use POST requests, and leverage the HTTP body to send in the transformation parameters. This will ensure that you don’t ever run out of space on the query string if you ever decide to do a lot of processing and it will also keep your communication tidier. This isn’t considered RESTful if the POST returns the image data.
3. Use POST RESTfully
Finally, if you decide that you do want to cache things, your POST can in fact store the transformed object (note that REST doesn’t dictate how this is stored, in memory or DB etc.) which can be re-fetched by Id using a GET.
Option A
POSTing to the URI creates a subordinate resource.
POST http://my.api/data/123
{add:10, multiply:5}
returns
201 Created
Location: http://my.api/data/123/edit/{uniqueid}
then GET the edited data
GET http://my.api/data/123/edit/{uniqueid}
Option B
Remove the resource identifier from the URL to make it clear that you're creating a new item, not changing the existing one. The resulting URL is also at the same level as the original one since it's assumed it's the same type of result.
POST http://my.api/data
{original: 123, add:10, multiply:5}
returns
201 Created
Location: http://my.api/data/{uniqueid}
then GET the edited data
GET http://my.api/data/{uniqueid}
There are multiple ways this can be done. In the end it should be clean, regardless of what label you want to give it (REST or non-REST). REST is not a protocol with an RFC, so don't worry too much about whether you pass information as URL paths or URL params. The underlying web service should be able to get you the data regardless of how it is passed. For example, Java Jersey will give you your params no matter whether they are query params or URL path segments; it's just an annotation difference.
Going back to your specific problem I think that the resource in this REST type call is not so much the data that is being used to do the numerical operations on but the actual response. In that case, a POST where the data ID and the operations are fields might suffice.
POST http://my.api/operations/
{
"dataId": "123",
"operations": [
{
"type": "add",
"value": 10
},
{
"type": "multiply",
"value": 5
}
]
}
The response would have to point to the location where the result can be retrieved, as you have pointed out. The result, referenced by the location (and ID) in the response, is essentially an immutable object. So that is in fact the resource being created by the POST, not the data used to calculate that result. It's just a different way of viewing it.
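A rough Python sketch of this "the result is the resource" view could look like the following. The in-memory datasets and results dicts, the function names, and the /operations/{id} location format are all assumptions made for the example.

```python
import uuid

datasets = {"123": [1, 2, 3]}  # hypothetical source data, never modified
results = {}                   # hypothetical result store: id -> computed values


def post_operations(body):
    """POST /operations/ : compute a result and store it as a new resource."""
    values = list(datasets[body["dataId"]])  # work on a copy
    for op in body["operations"]:
        if op["type"] == "add":
            values = [v + op["value"] for v in values]
        elif op["type"] == "multiply":
            values = [v * op["value"] for v in values]
        else:
            raise ValueError("unknown operation type: " + op["type"])
    result_id = str(uuid.uuid4())
    results[result_id] = tuple(values)  # a tuple, since the result is immutable
    return 201, {"Location": "/operations/" + result_id}


def get_result(location):
    """GET on the returned Location."""
    return results[location.rsplit("/", 1)[1]]


status, headers = post_operations({
    "dataId": "123",
    "operations": [
        {"type": "add", "value": 10},
        {"type": "multiply", "value": 5},
    ],
})
print(status, get_result(headers["Location"]))  # 201 (55, 60, 65)
```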
EDIT: In response to your comment about not wanting to store the outcome of the operations: you can use a callback to transmit the results of the operation to the caller. You can easily add a field in the JSON input for the host or URL of the callback. If the callback URL is present, then you can POST to that URL with the results of the operation.
{
"dataId": "123",
"operations": [
{
"type": "add",
"value": 10
},
{
"type": "multiply",
"value": 5
}
],
"callBack": "<HOST or URL>"
}
Please don't view this as me answering my own question, but rather as a contribution to the discussion.
I have given a lot of thought to this. The main problem with the currently suggested architectures is scalability, since the server creates copies of data each time it is operated on.
The only way to avoid this is to model operations and data separately. So, similar to Jose's answer, we create a resource:
POST http://my.api/operations/
{add:10, multiply:5}
Note here, I didn't specify the data at all. The created resource represents a series of operations only. The POST returns:
201 Created
Location: http://my.api/operations/{uniqueid}
The next step is to apply the operations on the data:
GET http://my.api/data/123/operations/{uniqueid}
This separate modeling approach has several advantages:
Data is not replicated each time a user applies a different set of operations.
Users create only operations resources, and since their size is tiny, we don't have to worry about scalability.
Users create a new resource only when they need a new set of operations. Going to the image example: if I am designing a greyscale website, and I want all images to be converted to greyscale, I can do
POST http://my.api/operations/
{greyscale: "50%"}
And then apply this operation on all my images by:
GET http://my.api/image/{image_id}/operations/{greyscale_id}
As long as I don't want to change the operation set, I can use GET only.
Common operations can be created and stored on the server, so users don't have to create them. For example:
GET http://my.api/image/{image_id}/operations/flip
Where operations/flip is already an available operation set.
The same set of operations can easily be applied to different data, and vice versa.
GET http://my.api/data/{id1},{id2}/operations/{some_operation}
Enables you to compare two datasets that are processed similarly. Alternatively:
GET http://my.api/data/{id1}/operations/{some_operation},{another_operation}
Allows you to see how different processing procedures affect the result.
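To illustrate the separation, here is a minimal Python sketch in which operation sets are stored once and applied at GET time, so no processed copy of the data is ever saved. The in-memory dicts and function names are invented for the example. One deliberate difference from the notation above: the operation set is stored as an ordered list of (name, value) pairs rather than {add:10, multiply:5}, to make the order of application explicit.

```python
import uuid

datasets = {"123": [1, 2, 3], "456": [10, 20]}  # hypothetical stored data
operation_sets = {}                              # id -> ordered list of steps


def post_operation_set(steps):
    """POST /operations/ : store a small, reusable list of (name, value) steps."""
    set_id = str(uuid.uuid4())
    operation_sets[set_id] = list(steps)
    return set_id


def get_processed(data_id, set_id):
    """GET /data/{data_id}/operations/{set_id} : compute on the fly, store nothing."""
    values = datasets[data_id]
    for name, n in operation_sets[set_id]:
        if name == "add":
            values = [v + n for v in values]
        elif name == "multiply":
            values = [v * n for v in values]
        else:
            raise ValueError("unknown operation: " + name)
    return values


set_id = post_operation_set([("add", 10), ("multiply", 5)])
print(get_processed("123", set_id))  # [55, 60, 65]
print(get_processed("456", set_id))  # [100, 150]
```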
I wouldn't try to describe your math function using the URI or request body. We have a more or less standard language to describe math, so you could use some kind of template.
GET http://my.api/data/123?transform="5*(data+10)"
POST http://my.api/data/123 {"transform": "5*({data}+10)"}
You need code on the client side that can build these kinds of templates, and code on the server side that can verify, parse, and otherwise process the templates built by the client.
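On the server side, the verify-and-parse step might be sketched in Python like this. It is only one possible approach: it parses the template with the standard ast module and accepts nothing but numbers, the placeholder data, and the four basic arithmetic operators, so arbitrary code in the template is rejected. The evaluate function and the ALLOWED table are made-up names.

```python
import ast
import operator

# Only these binary operators are accepted in a template.
ALLOWED = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(template, data):
    """Evaluate a template such as "5*({data}+10)" for a single number `data`."""
    tree = ast.parse(template.replace("{data}", "data"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED:
            return ALLOWED[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id == "data":
            return data
        raise ValueError("unsupported expression in template")

    return walk(tree)


print(evaluate("5*({data}+10)", 2))  # 60
```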

REST API Design : Is it ok to change the resource identifier during a PUT call?

I'm curious to learn more about RESTful design patterns around the PUT call. Specifically, am I violating norms by changing the resource ID as part of a PUT call?
Consider the following...
POST /api/event/ { ... } - returns the resource ID (eventid) of the new event in the body
GET /api/event/eventid
PUT /api/event/eventid - returns the (possibly new) resource ID depending on request body
GET /api/event/eventid - fails if the original eventid was used in the URI
The endpoints for GET and PUT can quickly access the resource if the eventid represents internal resources (like a database record). If the PUT results in the server moving the underlying resource, the ID can change.
Am I violating norms when I do this?
REST is not a strict specification, but more a set of guidelines and best practices that can be followed to build web services that are easy to understand and work with. So there's nothing that prevents you from changing a resource's ID during a PUT.
That being said, doing so is IMO a bad practice. One of the ideas behind REST is that each resource can be referenced using a URI. In your case this URI is the concatenation of the path and (I assume) an internal ID. This URI could be used by other "systems" and stored as references. If you change the ID of a resource on a PUT, you change the URI and all references to that resource will be broken (404).
If you feel the need to change the ID that is part of the URI, you may not have picked the right property for it. Consider something else that would be immutable (e.g.: tag your resource with a UUID and use it rather than an internal DB ID).
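A minimal Python sketch of that idea: the URI carries a UUID that never changes, and a lookup table maps it to wherever the record currently lives internally. The dicts and function names are hypothetical, standing in for a real database.

```python
import uuid

records = {}              # internal key -> event data (may move around)
internal_key_by_id = {}   # public UUID -> current internal key


def create_event(internal_key, data):
    """POST /api/event/ : returns the URI of the new event."""
    public_id = str(uuid.uuid4())
    records[internal_key] = data
    internal_key_by_id[public_id] = internal_key
    return "/api/event/" + public_id


def move_event(uri, new_internal_key):
    """The server relocates the underlying record; the URI stays the same."""
    public_id = uri.rsplit("/", 1)[1]
    old_key = internal_key_by_id[public_id]
    records[new_internal_key] = records.pop(old_key)
    internal_key_by_id[public_id] = new_internal_key


def get_event(uri):
    """GET /api/event/{uuid}"""
    public_id = uri.rsplit("/", 1)[1]
    return records[internal_key_by_id[public_id]]


event_uri = create_event("shard-a/17", {"title": "Launch"})
move_event(event_uri, "shard-b/4")
print(get_event(event_uri))  # {'title': 'Launch'}
```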
Not addressing your question full on, but this makes me worry:
returns the resource ID (eventid) of the new event in the body
You aren't returning an integer id, and then letting the client construct URLs from this, are you? A proper REST application should give URLs to resources, not ids.
As for your question - PUT means something like "Create a new resource at this location". You could conceivably reply with a redirect and a Location header, but it's a bit of a strange thing to do. Besides, the semantics of PUT dictates that you send the entire entity with the request, which is probably not what you want in this scenario. Maybe it would be more fitting to use POST in this situation? (E.g. POST on /api/event/1234)
I think it's ok; PUT is still idempotent (repeated calls will not lead to other modifications).
Just: I would ensure that the old ID is not reused, and have the api return 301 codes for calls to old ID (in case other clients had links to the resource).
Maybe the initial PUT that modifies the ID should return a 303 code that points to the new resource location, I'm not sure here.
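The "redirect from the old ID" part could be sketched in Python roughly as follows. The IDs, the two dicts, and the (status, headers, body) return shape are assumptions for the example, not part of any framework.

```python
current = {"evt-2": {"title": "Launch"}}  # hypothetical live resources
moved = {"evt-1": "evt-2"}                # retired ID -> new ID; retired IDs are never reused


def get_event(event_id):
    """GET /api/event/{event_id} : returns (status, headers, body)."""
    if event_id in current:
        return 200, {}, current[event_id]
    if event_id in moved:
        # Permanent redirect so that clients holding old links can update them.
        return 301, {"Location": "/api/event/" + moved[event_id]}, None
    return 404, {}, None


print(get_event("evt-1"))  # (301, {'Location': '/api/event/evt-2'}, None)
print(get_event("evt-2"))  # (200, {}, {'title': 'Launch'})
```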

Should the parameters provided in a web service call be included in the response

I haven't got much experience with creating web services; however, I do spend a lot of time interfacing with them.
I wondered if there was a best practice that stated whether or not parameters that are provided in the request should be included in the response.
E.g.
Request:
a.com/getStuff?key=123
(JSON) Response:
{"key":"123",
"value":"abc"}
or..
(JSON) Response:
{"value":"abc"}
I much prefer the more verbose first option because it does not enforce coupling between the request and the response, i.e. the response doesn't care what the request was, so you do not need to pass state around.
Is there a best practice?
If you are referencing a record in a database, or some other entity that is uniquely identified by an integer, GUID, or specifically-formatted string value, you should ALWAYS return that unique ID with the response, particularly if you are planning to allow the user to update that entity or reference it in a subsequent operation for creating related data or searching for related data.
If you are returning a derived value that may be a composite of many records' values, or of environmentally specific data (such as "How much free disk space is on my server?"), then the supplied parameters wouldn't mean anything in the response, and therefore shouldn't be returned.
Your point on coupling request-response is right on the money. If you are doing multiple simultaneous asynchronous calls, then the key value is very useful when handling the responses.
Referring to your example: I think id should always be part of the resource representation (in your case JSON). The representation should be as self-explanatory and self-referential as possible. On top of an id attribute/field I also like to use a link field:
{
"id":123,
"link":{
"href":"http://api.com/item/123",
"rel":"self"
},
otherData...
}
If your example GET /getStuff?key=123 is more of a search (the parameter looks a bit like that), then it is good to present the user with a "summary" of the search:
{
"items":[{
item1...
},
{
item2...
}
],
"submitted-params":{
"key":"123",
"other-param":"paramValue"
}
}
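Both response shapes can be put together with a couple of small helpers. This Python sketch assumes the base URL http://api.com from the example above; the function names are made up.

```python
def item_representation(item_id, other_data, base_url="http://api.com"):
    """Build a self-describing item: its id, a self link, then the payload."""
    representation = {
        "id": item_id,
        "link": {"href": f"{base_url}/item/{item_id}", "rel": "self"},
    }
    representation.update(other_data)
    return representation


def search_response(items, submitted_params):
    """Wrap search hits together with an echo of the parameters that were sent."""
    return {"items": items, "submitted-params": dict(submitted_params)}


item = item_representation(123, {"value": "abc"})
response = search_response([item], {"key": "123"})
print(response["items"][0]["link"]["href"])  # http://api.com/item/123
```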

How to solve two REST problems: the interface document; loss of privacy in descriptive URLs

Coming from a lot of frustrating times with WSDL/SOAP, I very much like the REST paradigm, but am trying to solve two basic problems in our application before moving over to REST. The first problem relates to the lack of an interface document. I think I finally see how to handle this situation: one can query one's way down from a top-level "/resources" resource using various GET, HEAD, and OPTIONS requests to find the one needed resource in the correct hypermedia format. Is this the idea? If so, the client need only be provided with a top-level resource URI: http://www.mywebservicesite.com/mywebservice/resources. The client will then have to do some searching and possibly keep track of what it is discovering, so that it can use the URIs again efficiently in the future to do GETs, POSTs, PUTs, and DELETEs. Are there any thoughts on what should happen here?
The other problem is that we cannot use descriptive URLs like /resources/../customer/Madonna/phonenumber. We do have an implementation of opaque URLs we use in the context of a session, and I'm wondering how opaque URLs might be applied to REST. The general problem is how to keep domain-specific details out of URLs, and still benefit from what REST has to offer.
The other problem is that we cannot use descriptive URLs like /resources/../customer/Madonna/phonenumber.
I think you've misunderstood the point of opaque URIs. The notion of opaque URIs is with respect to clients: a client shall not decipher a URI to guess anything of semantic meaning from it. So a service may well have URIs like /resources/.../customer/Madonna/phonenumber, and that's quite a good idea. The URIs should be treated as opaque by clients: they should not infer from the URI that it represents Madonna's phone number, and that Madonna is a customer of some sort. That knowledge can only be obtained by looking inside the resource itself, or perhaps by remembering where the URI was discovered.
Edit:
A consequence of this is that navigation should happen by links, not by deconstructing the URI. So if you see /resources/customer/Madonna/phonenumber (and it actually represents Customer Madonna's phone number) you should have links in that resource to point to the Madonna resource: e.g.
{
"phone_number" : "01-234-56",
"customer_URI": "/resources/customer/Madonna"
}
That's the only way to navigate from a phone number resource to a customer resource. An important aspect is that the server implementation might or might not have domain-specific information in the URI. The Madonna record might just as well live somewhere else: /resources/customers/byid/81496237. This is why clients should treat URIs as opaque.
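From the client's point of view, following the link rather than dissecting the URI might look like this in Python. The host example.com is a placeholder and the function name is invented; the field name customer_URI comes from the example above.

```python
from urllib.parse import urljoin


def customer_uri(phone_resource_uri, representation):
    """Resolve the link found in the representation; never parse the URI itself."""
    return urljoin(phone_resource_uri, representation["customer_URI"])


phone_uri = "http://example.com/resources/customer/Madonna/phonenumber"

# Works with a descriptive link ...
descriptive = {"phone_number": "01-234-56", "customer_URI": "/resources/customer/Madonna"}
print(customer_uri(phone_uri, descriptive))
# http://example.com/resources/customer/Madonna

# ... and keeps working, unchanged, if the server moves the record.
relocated = {"phone_number": "01-234-56", "customer_URI": "/resources/customers/byid/81496237"}
print(customer_uri(phone_uri, relocated))
# http://example.com/resources/customers/byid/81496237
```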
Edit 2:
Another question you have (in the comments) is then how a client, with the required no knowledge of the server's URIs is supposed to be able to find anything. Clients have the following possibilities to find resources:
Provide a search interface. This could be done by providing an OpenSearch description document, which tells clients how to search for items. An OpenSearch template can include several variables, and several endpoints, depending on what you're looking for. So if you have a "customer ID" that's unique, you could have the following template: /customers/byid/{proprietary:customerid}. The customerid element needs to be documented somewhere, inside the proprietary namespace. A client can then know how to use such a template.
Provide a custom form. This implies making a custom media type in which you explicitly define how (based on an instance of the document) a URI to a customer can be forged. <customers template="/customers/byid/{id}"/>. The documentation (for the media type) would have to state that the template attribute must be interpreted as a relative URI after the string substitution "{id}" to an actual customer ID.
Provide links to all resources. Some resources aren't innumerable, so you can simply make a link to each and every one of them, optionally including identifying information along with the links. This could also be done in a custom media type: <customer id="12345" href="/customer/byid/12345"/>.
It should be noted that #1 and #2 are two ways of saying the same thing: Clients are allowed to create URIs if they
haven't got the URI structure a priori
a media type exists for which the documentation states that URIs should be created
This is much the same way as a web browser has no idea of any URI structure on the web, except for the rules laid out in the definition of HTML forms, to add a ? and then all the query parameters separated by &.
In theory, if you have a customer with id 12345, then you could actually dispense with the href, since you could plug the customer id 12345 into #1 or #2. It's more common to actually provide real links between resources, rather than always relying on lookup or search techniques.
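The template substitution described for the custom-form case can be sketched in a few lines of Python. This assumes the simple "{id}" string substitution rule from the hypothetical media type above (not a full URI template implementation), and example.com is a placeholder host.

```python
from urllib.parse import quote, urljoin


def expand(template, customer_id, document_uri):
    """Substitute "{id}" and resolve the result relative to the document's URI."""
    # Percent-encode the ID so that it cannot introduce extra path segments.
    path = template.replace("{id}", quote(str(customer_id), safe=""))
    return urljoin(document_uri, path)


# The template value is read from the document, not hard-coded in the client.
template = "/customers/byid/{id}"
print(expand(template, 12345, "http://example.com/customers"))
# http://example.com/customers/byid/12345
```

The client still knows nothing about the URI structure in advance; it only knows the substitution rule documented by the media type.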
I haven't really used web RPC systems (WSDL/SOAP), but I think the 'interface document' is there mostly to allow client libraries to create the service API, right? If so, REST shouldn't need it, because the verbs are already defined and don't really need to be documented again.
AFAIUI, the REST way is to document the structure of each resource (usually encoded in XML or JSON). In that document, you'll also have to document the relationship between those resources. In my case, a resource is often a container of other resources (sometimes more than one type), therefore the structure doc specifies what field holds a list of URLs pointing to the contained resources. Ideally, only one unique resource will need a single, fixed (documented) URL. Everything else follows from there.
The URL 'style' is meaningless to the client, since it shouldn't 'construct' a URL. Every URL it needs should already be constructed in a resource field. That lets you change the URL structure without changing the client (that has saved me tons of time). Your URLs can be as opaque or as descriptive as you like. (Personally, I don't like text keys or slugs; my keys are all BIGINTs or UUIDs.)
I am currently building a REST "agent" that addresses the first part of your question. The agent offers a temporary bookmarking service. The client code that is interacting with the agent can request that an URL be bookmarked using some identifier. If the client code needs to retrieve that representation again, it simply asks the agent for the url that corresponds to the saved bookmark and then navigates to that bookmark. Currently those bookmarks are not persisted so they only last for the lifetime of the client application, but I have found it a useful mechanism for accessing commonly used resources. E.g. The root representation provides a login link. I bookmark that link and if the client ever receives a 401 then I can redirect to the "login" bookmark.
To address an issue you mentioned in a comment, the agent also has the ability to store retrieved representations in a dictionary. If it becomes necessary to aggregate and manipulate multiple representations at the same time, then I can simply request that the agent store the current representation in a dictionary under a key and then continue navigating to the next resource. Once the client has accumulated all the necessary representations, it can do what it needs to do.