Alternative approach to sending a lot of parameter data on GET - web-services

I am creating a REST-API and am encountering a problem where the client needs to get a calculation based on a lot of different parameters.
This GET operation might not be a part of any Save or Update operations (before the GET or after), and can happen in a stateless manner.
Due to this the GET URL can be very long (and even exceed the maximum allowed by the browser).
I have looked in other posts here in SO and elsewhere and it is discouraged to use a body in GET requests. But whats most important about all these posts is that none of them give an alternative to this problem they just state that something is flawed in the design ETC...ETC...
Well nothing is wrong with the design here. its a stateless calculation built on a lot of parameters.
I would like some alternatives. Thank you.

nothing is wrong with the design here
There is. From Wiki:
An important concept in REST is the existence of resources (sources of specific information), each of which is referenced with a global identifier (e.g., a URI in HTTP).
Your calculation parameters have nothing to do with the underlying resource identified by the URL you make the request to. You're not requesting an existing resource (as that's what GET is for, depending on how you're willing to interpret REST), but some calculations will be done based on some input. This is a Remote Procedure Call, not a REST call.
You can change your approach by modeling a Calculation, so you send a POST /Calculations/ request with all your parameters.
There's no requirement for a POST call to change server state (i.e. store the results):
httpbis-draft, POST (which is somewhat better worded and updated than RFC 2616):
The POST method requests that the target resource process the
representation enclosed in the request according to the resource's
own specific semantics.
POST is used for (among others): providing a block of data, such as the fields entered into an HTML
form, to a data-handling process;
So you can just return the calculation results along with a 200, or you can store them and return a 200, 201 or 204, containing or pointing to the calculation results, so you can retrieve them later, using GET /Calculations/$id.

As far as I can tell, the only solution you have left is to break the rules of REST and use a POST request. POST can have an arbitrary number of arguments, but it's meant for a "modification" operation in REST.
Like everything in software, the rules are there to help you avoid mistakes. But if the rules prevent you from solving your problems, you need to modify them a little bit (or modify them for a well defined part of your code).
Just make sure that everyone understands how you changed the rules, where the new rules apply (and where they don't). Otherwise, the next guy will "fix" your "broken" code with his simple test cases.

So you want a safe HTTP method that accepts a payload. Have a look at http://greenbytes.de/tech/webdav/draft-ietf-httpbis-method-registrations-14.html - both SEARCH and REPORT are theoretical candidates, if you can live with the WebDAV baggage they come with.
An alternative would be to start work on either generalizing these, or defining something new (but don't forget that definitions of new HTTP methods need IETF review, see http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p2-semantics-25.html#considerations.for.new.methods.

Related

DELETE operation in GET rest service

I got a query while developing rest service.
As per the REST design, GET is to read , PUT or POST are to create or update based on scenario , DELETE is to delete the resources.
But technically, Can't we perform a create or delete operation in GET call.
i.e. It is up to client way of calling by using specified URL pattern and required response type to hit the exact method in the service class of REST application. But why can't we perform a delete or create of some data in the GET service.
so my question is the DELETE or CREATE technically not possible in GET service or is it a rule to adhere to REST principles.
so my question is the DELETE or CREATE technically not possible in GET service or is it a rule to adhere to REST principles.
The latter. It is only a convention to use the DELETE HTTP method for delete operations. However using the GET HTTP method for delete operations is a bad idea. Below is a quote from "RESTful Java with JAX-RS 2.0, 2nd Edition" that explains why:
It is crucial that we do not assign
functionality to an HTTP method that supersedes the specification-defined boundaries
of that method. For example, an HTTP GET on a particular resource should be readonly.
It should not change the state of the resource it is invoking on. Intermediate services
like a proxy-cache, a CDN (Akamai), or your browser rely on you to follow the semantics
of HTTP strictly so that they can perform built-in tasks like caching effectively. If you
do not follow the definition of each HTTP method strictly, clients and administration
tools cannot make assumptions about your services, and your system becomes more
complex
so my question is the DELETE or CREATE technically not possible in GET
service or is it a rule to adhere to REST principles?
REST uses standards aka. uniform interface constraint. One of these standards is the HTTP standards which defines the HTTP methods. According to the HTTP standard the GET is a safe method:
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested.
According to the RFC 2119:
SHOULD NOT - This phrase, or the phrase "NOT RECOMMENDED" mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.
For example write can be a side effect by GET, if you want to increase the visitor count by each request.
How the server software (API) is constructed and what 'rules' are applied is somewhat 'arbitrary'. Developers and their product managers could enforce 'rules' such as 'thou shalt not code or support DELETE operations through the GET operation', but in practice, that is not necessarily the main reason POST is chosen over GET. As others have mentioned, there may be assumptions based on the HTTP protocol that other vendors may rely on, but that is a rather complex and not necessarily relevant reasoning. For instance, your application may be built to connect directly to a server application, and another vendor's rules may not apply.
In a simpler example, on the world wide web and due to compliance and other factors, query string has a limited byte length. Because of this, operations that require a lot of data, such as a few very long encrypted data strings that might be needed for a DELETE operation in a database, GET may not be able to pass enough data, so POST may be the only viable option.
Custom built applications using a CuRL library might extend to include other RESTful operations with their intended functionality, but that would be for the benefit of the server API. Coding more operations on the client-side doesn't necessarily make things 'easier', 'faster', or necessarily 'more secure' from the client perspective, but doing so could help manage resources (a bit) on the server side and help maintain compatibility with third party software and appliances.

Why perform transformations in middleware?

A remote system sends a message via middleware (MQ) to my application.
In middleware a transformation (using xslt) is applied to this message. It is just reformatted and there is no enrichment nor validation. My system is the only consumer of this transformed message and the xslt is maintained by my team.
The original author of all of this has long gone and I am wondering why he thought it was a good idea to do the transformation in middleware rather than in my app. I can't see the value in moving this to middleware, it makes it less visible and less simple to maintain.
Also I would have thought that the xslt would be maintained by the message producer not the consumer.
Are there any guidelines for this sort of architecture? Has he done the right thing here?
It is a bad idea to modify a message body in the middleware. This negatively affects the maintainability and performance.
The only reason of doing this is trying to connect two incompatible endpoints without modifying them. This would require the transformation of the source content to be understood by the destination endpoint.
The motivation to delegate middleware to perform transformation could be a political one (endpoints are maintained by different teams, management is reluctant to touch the endpoint code, etc.).
If you are trying to create an application architecture where there is a need to serve data to different users in different formats, and perhaps receive data in different formats (think weather reports, or sports news), then creating a hub capable of doing the transformations between many different formats makes excellent sense. (Whether you call that "middleware" is up to you.) Perhaps your predecessor had this kind of architecture in mind, but it never grew big or complex enough to justify the design.
From a architectural point of view, it's a good idea to provide consumers of messages or content that is in a humanly readable format, e.g. xslt, unless there is a significant performance gain in using a binary format.
In the humanly readable format case, one simply has to look at the message to verify that it is correct. In the binary case, one would have to develop a utility to tranform binary message into a humanly readable form. Different implementers of such a utility may not always interpret the binary form as intended and it may turn into a finger pointing exercise as to who or what is correct.
Also, if one is looking at what's in the queue, it is easier to make sense of it if the messages are in a humanly readable format.
It doesn't hurt to start with humanly readable format and get the app working first. Then profile the app and see if in the big picture the transformation routines are significant sources of delay. If yes, then go to a binary format.
It would have been preferable to have the original message producer provide messages in xslt format, but they must have had good reasons for doing what they did when they did it. E.g potentially other consumers, xslt didn't exist then, resource constraints, etc.
Read about the adaptor design pattern and you will understand the intent of the current system architecture.

Naming convention for asynchronous operations

I have a web service that triggers some long operations on the server via asynchronous methods. Each operation has 3 methods:
One of them start the operation and immediately returns a ticket number.
One of them is called continuously from the client; it receives the ticket number and returns a boolean value, saying whether the operation is done.
The last one of them is called only after the operation is finished; it receives the ticket number and returns the result of the operation.
I'm not sure how to call this methods. I think about calling the methods something like this:
OperationName_Start
OperationName_IsReady
OperationName_GetResult
but I'm afraid I could be reinventing the wheel. Is there any well known naming convention for this usage pattern?
Its a shame you couldn't get an answer the first time around. I would strongly suggest reading this SOA Principles Link to gain a better understanding of the principals and importance of web service naming.
It's key to remember to name from the perspective of the consumer and maximise consumability and re-useability. Its also useful to remember that when you invoke a service, your are performing and action on an object i.e. you should have a verb and a noun. Also remember that web services are very very similar to functions in an object oriented language, so its helpful to think of what the code of the function of the web service would look like.
It's helpful to consider scenarios like:
What would happen if i changed the system the web service was calling?
Is there another scenario that could call this service?
How granular should the service be? What are the performance impacts of these decisions?
Without knowing the business context of what you are trying to achieve, we'll assume a basic example of submitting and electronic payment.
In this scenario, you may have:
ElectronicPayment_SendPayment (Note: the use of 'Send' keeping the business context, not the technical context; send could be via email, post, webservice. From your 'Start' example, what are you starting; here, the intent of the service is apparent)
ElectronicPayment_CheckStatus (Note: this is from the consumer's perspective. It's likely that checking processing status is a generic service, that could be seperated into something along the lines of CheckProcessingStatus (TicketNumber tN)
ElectronicPayment_RetrieveReciept (Note: Request/Response pattern with semantic link. Business Context of receipt and payment maintained)
Naming is highly contextual, and the above is not perfect, but hopefully it helps yourself and others who stumble upon this.
From the lack of answers, I assume that apparently there isn't a widely used standard on this issue, so it's OK to roll my own.
I chose to stick with:
OperationName_Start
OperationName_IsReady
OperationName_GetResult

Different REST resource content based on user viewing privileges

I want to provide different answers to the same question for different users, based on the access rights. I read this question:
Excluding private data in RESTful response
But I don't agree with the accepted answer, which states that you should provide both /people.xml and /unauthenticated/people.xml, since my understanding of REST is that a particular resource should live in a particular location, not several depending on how much of its information you're interested in.
The system I'm designing is even more complicated than that one. Let's say that a user has created a number of circles of friends, and assigned different access rights to them. For example, my "acquaintances" circle might have access to my birthday, and my "professional" circle might have access to my employment history, but not the other way around. In order to apply the answer from the question I mentioned, I need to have a way of getting all of the user's circles (which I might want to keep secret for security reasons), and then go through /circles/a/users/42, /circles/b/users/42, /circles/c/users/42 and so on, and then merge the results to display the maximum amount of information available. Obviously there's not necessarily a single circle that gets all the information that any of the other circles get. I believe this is tricky enough (note that I probably need to do this with several kinds of objects and that future versions might require a different procedure), but what if I want to impose security restrictions on a particular user despite the fact that he's also in some of my circles? Can that problem even be solved? Even if I refuse to respond to any of the above-mentioned queries and come up with a new one that could give me an answer, it'd still reveal the fact that this specific user is treated differently due to individual access restrictions.
What am I missing here? Is it even possible for me to develop a RESTful web service?
If the conclusion is that the behavior is not RESTful, would this still constitute as a situation where it'd be morally okay to break the REST contract? If so, what are the negative implications? Do I risk proxy caching issues for example?
According to Fielding's dissertation (it really is a great writing):
A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
In other words, if you have a resource that is defined as "the requesting user's assigned projects" and representations thereof accessible by a URI of /projects, you do not violate any constraints of REST by returning one list of projects (i.e., representation) for user A and another (representation) for user B when they GET that same URI. In this way, the interface is uniform/consistent.
In addition to this, REST only prescribes that an explicit caching instruction be included with the response, whether that is 'cache for this long' or 'do not cache at all':
Cache constraints require that the data within a response to a request be implicitly or explicitly labeled as cacheable or non-cacheable.
How you choose to manage that is up to you.
Keeping that in mind,
You should feel comfortable returning a representation of a resource that varies depending on the user requesting a representation of a particular resource, as long as you are not violating the constraints of a uniform interface -- don't use a single resource identifier to return representations of different resources.
If it helps, consider that the server responds with varying representations of a resource as well -- XML or JSON, French or English, etc. The credentials sent with the request are just another factor the server is able to use in determining which representation to to send in response. That's what the header section is there for.
I agree that the other solution doesn't seem right. It makes the URL structure complicated and more difficult to find the resource. However, if you did REST properly, it shouldn't matter what the URL for the resource is as the server controls it (and is free to relocate it as it sees fit). If your client is really "REST", it would discover the resources it needed through prior negotiation with the server. So the path truly would not matter on the client. I don't like it because its confusing to use - not because of some violation of REST principles.
But that probably doesn't answer your question -
What you didn't mention is your security setup - presumably you are a passing a session token with the request as part of the request header. So your back-end processing should have the ability to tie it to particular set of security constraints. From there, you form the list with whatever business logic you need and return a limited resource based on the user's security tied to the session.
For the algorithm itself, one usually implements a least or most restrictive type algorithm that merges the allowable data into a response (very similar to java realms or Microsoft's user security model).
If the data is structured differently for the restricted/non-restricted case, you could create two different representations of the data and return which ever one the user was authorized to see. The client should be asking for the accepted mime response types anyway, and it would just provide different answers based on the session security in the request header. Alternatively, you could provide optional elements with the representations and fill out the appropriate one base on authorization (although this is a little hack-ee in my opinion).

Is RPC disguised as REST a bad idea?

Our whole system is being designed around REST and are now considering how processes which are quite clearly RPC in intent can be mapped to RESTful resources without using verbs in the URL. Our remote procedure call is used to rebuild our search index when a content listing has been modified elsewhere.
What we are thinking about doing is this:
POST /index_updates
<indexUpdate><contentId>123</contentId></indexUpdate>
Nothing wrong with that in itself, but the smell is this resource which has been created does not return the URL of the newly created resource e.g. /index_updates/1234 which we can then access with a GET.
The indexing engine we are using does have a log mechanism, so in theory we could return a URL to a index_update resource so as to allow a GET to retrieve the resource, but to be honest we're not interested in the resource as this is nothing more than an RPC in disguise.
So my question is whether RESTfulness is expressed in structure or intent. I feel the structure of what I have outlined is restful, but the intent is not.
Does anyone have an comments or advice?
Thanks,
Chris
Use the right tool for the job. In this case, it definitely seems like the right tool is a pure remote procedure call, and there's no reason to pretend it's REST.
One reason you might return a new resource identifier from your POST /index_updates call is to monitor the status of the operation.
POST /index_updates
<contentId>123</contentId>
201 Created
Location: /index_updates/a9283b734e
GET /index_jobs/a9283b734e
<index_update><percent_complete>89</percent_complete></index_update>
This is obviously a subjective field, but GET PUT POST DELETE is a rich enough vocabulary to describe anything. And when I go to non-English-speaking Asian countries I just point and they know what I mean since I don't speak the language... but it's hard to really get into a nice conversation with someone...
It's not a bad idea to disguise RPC as REST, since that's the whole exercise. Personally, I think SOAP has been bashed and hated while in fact it has many strengths (and with HTTP compression, HTTP/SSL, and cookies, many more strengths)... and your app is really exposing methods for the client to call. Why would you want to translate that to REST? I've never been convinced. SOAP lets you use a language that we know and love, that of the programming interface.
But to answer your question, is it a bad idea to disguise RPC as REST? No. Disguising RPC as REST and translating to the four basic operations is what the thing is about. Whether you think that's cool or not is a different story.