Is this web-service scenario an instance where REST is not appropriate?

I'm designing a web service that parses a large document (150-200k) and returns some analytical data. The contents of the document are sensitive, and currently not persisted by the backend.
With a stateless REST web service, where all requests are idempotent, this would require every request to include the large document payload, which seems less than ideal.
Would a stateful alternative be a more appropriate design for this scenario, where a session is established after the initial document is POSTed? The client could then make further requests to endpoints which would provide differing analytical results, using the document in memory?

You can think of it as a REST interface tacked onto a document storage service.
The document is stored temporarily. Perhaps it stays for 10 minutes or until released by the owner. The doc storage service returns a token allowing access to the document. But the token expires with the document timeout.
Then you only need REST services to ask questions about the document. Each call needs to include the token but can be repeated indefinitely and still get the same response.
You may want to cache certain information about each document. That's a performance issue.
You might want to consider how to encrypt the token in such a way that it can't be copied off the "wire" and used by a "bad guy(TM)".
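To make the shape of this concrete, here is a minimal sketch of such a temporary document store (Python with Flask, purely for illustration; the route names, the 10-minute timeout, the "word count" question, and the in-memory dict are all assumptions, not a prescription):

import time
import uuid

from flask import Flask, abort, jsonify, request

app = Flask(__name__)

# token -> (document bytes, expiry timestamp). In-memory only; nothing is persisted.
DOCUMENTS = {}
TTL_SECONDS = 600  # assumed 10-minute lifetime

@app.route("/documents", methods=["POST"])
def store_document():
    token = uuid.uuid4().hex
    DOCUMENTS[token] = (request.get_data(), time.time() + TTL_SECONDS)
    return jsonify({"token": token, "expires_in": TTL_SECONDS}), 201

def _get_document(token):
    doc, expires_at = DOCUMENTS.get(token, (None, 0))
    if doc is None or time.time() > expires_at:
        DOCUMENTS.pop(token, None)
        abort(404)  # token unknown or expired
    return doc

@app.route("/documents/<token>/word-count", methods=["GET"])
def word_count(token):
    # One example "question"; repeated calls return the same answer for the same document.
    doc = _get_document(token)
    return jsonify({"words": len(doc.split())})

@app.route("/documents/<token>", methods=["DELETE"])
def release(token):
    DOCUMENTS.pop(token, None)
    return "", 204

The DELETE route corresponds to "released by the owner"; a real service would also sweep expired entries and use a shared store rather than a process-local dict.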

Related

Authorization of microservices in monolith application

I have a django application that puts a task in a queue. Another service is used to read that queue and process some files. At the end I need to save the processed files in the database managed by the django application.
I do not want to give the microservice access directly to the database, since I want the responsibility only to be to process the files.
So I wanted to post the changes back to Django using an HTTP request. The problem is that I do not have any authorization in place for that call, even though I know that requests from this type of machine should be accepted.
For the Django application I use JWT as an authorization token. What is the best way to approach this type of problem? Maybe just send a token along with the queue message? But how would I create such a token? It's not certain when the task will actually be executed.
When you really think about it, there is no need for your internal services to authenticate themselves if they are in the same network.
In that case - You can put Django behind an API gateway (don't write your own, find an open source highly rated project). Then you can control via this gateway which end point is allowed by which traffic source. Then you can easily control end points that are specifically for internal services and which end points need authentication by an external entity.
If they aren't in the same network (which means they are separated by the great gulf of the cloudy net), then the usual way two machines communicate is with an API key. In that case you can configure your services with symmetric keys or a private/public pair; it doesn't really matter. Machines can be trusted with secret keys. Why would you need to send the token in the queue? If the service is allowed to post results to Django, it's allowed to do so for all requests, so it needs to be configured with an API key that tells your API it is allowed to post processed files.
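As a rough illustration of that second option (the endpoint URL, header name, and environment variable are assumptions, not a prescription), the worker presents a shared API key on every callback and the Django side compares it in constant time:

import hmac
import os

import requests
from django.http import HttpResponseForbidden

API_KEY = os.environ["PROCESSOR_API_KEY"]  # assumed env var, shared out-of-band

# Worker side: report the processed file back to the Django app.
def post_results(file_id, payload):
    resp = requests.post(
        f"https://django.internal/api/processed-files/{file_id}/",  # assumed URL
        json=payload,
        headers={"Authorization": f"Api-Key {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()

# Django side (sketch): reject callbacks that do not carry the expected key.
def require_api_key(view):
    def wrapper(request, *args, **kwargs):
        supplied = request.headers.get("Authorization", "")
        expected = f"Api-Key {API_KEY}"
        if not hmac.compare_digest(supplied, expected):
            return HttpResponseForbidden()
        return view(request, *args, **kwargs)
    return wrapper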

How to handle authentication using akka-streams and akka-http for client apps

I'm writing an app using Akka Streams and Akka Http which needs to connect to an authenticated web service (which returns an authentication token) and then needs to regularly query the service and potentially perform other actions with it in response to the query (download files etc.). The authentication token times out after a certain amount of time and so will need to be refreshed.
How should I handle the authentication token? It needs to be passed to different Flows in the graph (everywhere I'm querying the service), and when the authentication token becomes invalid I need to request a new one.
One idea would be to do the authentication request outside the stream then pass in the token when materialising the stream so that each flow gets the token as a parameter during materialisation. Then when the token eventually times out the stream will fail and I tear it down and make a new one. I think this would work, but it seems a little clumsy and I'd like to know if there's a way to work entirely with the stream-based world.
One thought I had was that the authentication token could be zipped with the other data flowing through the stream and passed along to each Flow element that needed it. Then if the token fails at some point, the stream somehow requests a new one with some kind of feedback flow or recovery mechanism. But I don't know if this is possible or how to implement it.
Is there a third approach I haven't thought of, or something I've missed in Akka streams or Akka HTTP?
akka-http is built on akka-streams, so you are already covered on that front. For user session management using akka-http, take a look at akka-http-session. You might also want to read through this excellent post.
You might also take a look at some sample code I recently uploaded, which does not use akka-http-session - available here. Hope some of these materials help.
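For what it's worth, the "authenticate outside, refresh on expiry" idea from the question can be sketched independently of Akka; the following outline (Python rather than Scala, purely for illustration; the endpoint URLs and response field names are invented) shows the refresh-and-retry-on-401 shape that a stream stage or a plain client would implement:

import requests

AUTH_URL = "https://service.example.com/auth"    # assumed
QUERY_URL = "https://service.example.com/query"  # assumed

class TokenClient:
    def __init__(self, username, password):
        self._credentials = (username, password)
        self._token = None

    def _refresh(self):
        user, password = self._credentials
        resp = requests.post(AUTH_URL, json={"user": user, "password": password})
        resp.raise_for_status()
        self._token = resp.json()["token"]  # assumed response shape

    def query(self, params):
        if self._token is None:
            self._refresh()
        headers = {"Authorization": f"Bearer {self._token}"}
        resp = requests.get(QUERY_URL, params=params, headers=headers)
        if resp.status_code == 401:  # token expired: refresh once and retry
            self._refresh()
            headers = {"Authorization": f"Bearer {self._token}"}
            resp = requests.get(QUERY_URL, params=params, headers=headers)
        resp.raise_for_status()
        return resp.json()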

Improvements on cookie based session management

"Instead of using cookies for authorization, server operators might
wish to consider entangling designation and authorization by treating
URLs as capabilities. Instead of storing secrets in cookies, this
approach stores secrets in URLs, requiring the remote entity to
supply the secret itself. Although this approach is not a panacea,
judicious application of these principles can lead to more robust
security." A. Barth
https://www.rfc-editor.org/rfc/rfc6265
What is meant by storing secrets in URLs? How would this be done in practice?
One technique that I believe fits this description is requiring clients to request URLs that are signed with HMAC. Amazon Web Services offers this technique for some operations, and I have seen it implemented in internal APIs of web companies as well. It would be possible to sign URLs server side with this or a similar technique and deliver them securely to the client (over HTTPS) embedded in HTML or in responses to XMLHttpRequests against an API.
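A minimal sketch of the idea (Python; the parameter names, secret handling, and expiry window are assumptions, and this is not the actual AWS signing algorithm) looks like this:

import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"server-side-secret"  # never shipped to the client

def sign_url(path, ttl=300):
    # Embed an expiry and an HMAC over path+expiry; the URL itself is the capability.
    expires = int(time.time()) + ttl
    message = f"{path}?expires={expires}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'signature': signature})}"

def verify(path, expires, signature):
    # Recompute the signature server side and check it plus the expiry.
    message = f"{path}?expires={expires}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature) and int(expires) > time.time()

Calling sign_url("/reports/42.csv") produces a URL that is accepted only while the signature and expiry check out; anyone holding the URL can fetch the resource, which is exactly the "URL as capability" idea from the quote.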
As a drop-in alternative to session cookies, I'm not sure what advantage such a technique would offer. However, in some situations it is convenient, or even the best way to solve a problem. For example, I've used similar techniques when:
Cross Domain
You need to give the browser access to a URL that is on another domain, so cookies are not useful, and you have the capability to sign a URL server side to give access, either on a redirect or with a long enough expiration that the browser has time to load the URL.
Examples: Downloading files from S3. Progressive playback of video from CloudFront.
Closed Source Limitations
You can't control what the browser or other client is sending, aside from the URL, because you are working with a closed source plugin of some kind and can't change its behavior. Again you sign the URL server side so that all the client has to do is GET the URL.
Examples: Loading video captioning and/or sprite files via WEBVTT, into a closed-source Flash video player. Sending a payload along with a federated single sign-on callback URL, when you need to ensure that the payload can't be changed in transit.
Credential-less Task Worker
You are sending a URL to something other than a browser, and that something needs to access the resource at that URL, and on top of that you don't want to give it actual credentials.
Example: You are running a queue consumer or task-based worker daemon or maybe an AWS Lambda function, which needs to download a file, process it, and send an email. Simply pre-sign all the URLs it will use, with a reasonable expiration, so that it can perform all the requests it needs to without any additional credentials.
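With AWS specifically, pre-signing is a one-liner with boto3 (the bucket and key names here are placeholders); the worker only ever receives the resulting URL, never the credentials:

import boto3

s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "uploads/input-file.csv"},  # placeholders
    ExpiresIn=900,  # 15 minutes; adjust to the expected job duration
)
# `url` can be handed to the queue consumer / Lambda, which just performs a GET.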

If REST is stateless, then how does it handle multiple requests from a client?

I read many articles and blogs, including Wikipedia, and came to know that REST is stateless. But please explain to me, in simple language, how REST handles multiple requests from a client.
Thanks.
I assume that your question is about multiple calls that depend on the sequence of prior calls, not independent ones. In other words, you would like to know about calls with a conversational state.
When a REST system needs to preserve conversational state between calls, it does so by transferring the additional information to the client. Each call from the client carries the conversational state received in the previous calls, enabling the server to stay stateless.
Because of the stateless architecture, each request is handled with no server-side information of previous session data.
To create the illusion of state, the client application stores the session specific data and attaches it on the HTTP requests when necessary. Take the following example...
1) The server requires authentication.
2) After authentication, the key is sent to the server with each subsequent HTTP request.
(The original answer illustrated these two steps with diagrams taken from http://www.codeproject.com/Articles/149738/Basic-Authentication-on-a-WCF-REST-Service.)
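In practice the client-side half of this looks roughly like the following (Python; the /login and /orders endpoints, the response shape, and the Bearer scheme are assumptions for illustration):

import requests

BASE = "https://api.example.com"

# 1. Authenticate once; the server returns a key/token instead of keeping a session.
resp = requests.post(f"{BASE}/login", json={"user": "joe", "password": "secret"})
resp.raise_for_status()
token = resp.json()["token"]  # assumed response shape

# 2. The client stores the token and attaches it to every subsequent request,
#    so the server can authenticate each call without remembering anything.
orders = requests.get(f"{BASE}/orders",
                      headers={"Authorization": f"Bearer {token}"}).json()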

Securing REST API without reinventing the wheel

When designing REST API is it common to authenticate a user first?
The typical use case I am looking for is:
User wants to get data. Sure cool we like to share! Get a public API key and read away!
User wants to store/update data... woah wait up! who are you, can you do this?
I would like to build it once and allow say a web-app, an android application or an iPhone application to use it.
A REST API appears to be a logical choice with requirements like this
To illustrate my question I'll use a simple example.
I have an item in a database, which has a rating attribute (integer 1 to 5).
If I understand REST correctly I would implement a GET request using the language of my choice that returns csv, xml or json like this:
http://example.com/product/getrating/{id}/
Say we pick JSON we return:
{
  "id": "1",
  "name": "widget1",
  "attributes": { "rating": { "type": "int", "value": 4 } }
}
This is fine for public facing APIs. I get that part.
Where I have tons of question is how do I combine this with a security model? I'm used to web-app security where I have a session state identifying my user at all time so I can control what they can do no matter what they decide to send me. As I understand it this isn't RESTful so would be a bad solution in this case.
I'll try to use another example using the same item/rating.
If user "JOE" wants to add a rating to an item
This could be done using:
http://example.com/product/addrating/{id}/{givenRating}/
At this point I want to store the data saying that "JOE" gave product {id} a rating of {givenRating}.
Question: How do I know the request came from "JOE" and not "BOB"?
Furthermore, what if it were more sensitive data, like a user's phone number?
What I've got so far is:
1) Use the built-in feature of HTTP to authenticate at every request, either plain HTTP or HTTPS.
This means that every request now take the form of:
https://joe:joepassword@example.com/product/addrating/{id}/{givenRating}/
2) Use an approach like Amazon's S3 with private and public key: http://www.thebuzzmedia.com/designing-a-secure-rest-api-without-oauth-authentication/
3) Use a cookie anyway and break the stateless part of REST.
The second approach appears better to me, but I am left wondering do I really have to re-invent this whole thing? Hashing, storing, generating the keys, etc all by myself?
This sounds a lot like using sessions in a typical web application and then rewriting the entire stack yourself, which to me usually means "You're doing it wrong", especially when dealing with security.
EDIT: I guess I should have mentioned OAuth as well.
Edit 5 years later
Use OAuth2!
Previous version
No, there is absolutely no need to use a cookie. It's not half as secure as HTTP Digest, OAuth or Amazon's AWS (which is not hard to copy).
The way you should look at a cookie is that it's an authentication token as much as Basic/Digest/OAuth/whichever would be, but less appropriate.
However, I don't feel using a cookie goes against RESTful principles per se, as long as the contents of the session cookie does not influence the contents of the resource you're returning from the server.
Cookies are evil, stop using them.
Don't worry about being "RESTful", worry about security. Here's how I do it:
Step 1: User hits authentication service with credentials.
Step 2: If credentials check out, return a fingerprint, session id, etc..., and pop them into shared memory for quick retrieval later or use a database if you don't mind adding a few milliseconds to your web service turnaround time.
Step 3: Add an entry point call to the top of every web service script that validates the fingerprint and session id for every web service request.
Step 4: If the fingerprint and session id aren't valid or have timed out, redirect to authentication.
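As a rough sketch of steps 3 and 4 (Python/Flask only for illustration; the header names and the in-memory store are assumptions, and a 401 stands in for the redirect since this is a web service), the per-request validation hook might look like this:

import time

from flask import Flask, abort, request

app = Flask(__name__)

# session_id -> (fingerprint, expiry); in production this would be shared memory or Redis.
SESSIONS = {}

@app.before_request
def validate_session():
    if request.path == "/auth":  # the authentication entry point itself stays open
        return
    session_id = request.headers.get("X-Session-Id", "")
    fingerprint = request.headers.get("X-Fingerprint", "")
    stored = SESSIONS.get(session_id)
    if stored is None or stored[0] != fingerprint or stored[1] < time.time():
        abort(401)  # invalid or timed out: the client must re-authenticate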
READ THIS:
RESTful Authentication
Edit 3 years later
I completely agree with Evert, use OAuth2 with HTTPS, and don't reinvent the wheel! :-)
For simpler REST APIs - not meant for 3rd-party clients - JSON Web Tokens can be a good fit as well.
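A minimal example with the PyJWT library (the claim names and secret are placeholders) shows the core of it: the server issues a signed, self-contained token and later verifies it without any server-side session storage:

import datetime

import jwt  # PyJWT

SECRET = "server-side-secret"  # placeholder

# Issue: after the user logs in, hand back a signed token with an expiry claim.
token = jwt.encode(
    {"sub": "joe", "exp": datetime.datetime.utcnow() + datetime.timedelta(hours=1)},
    SECRET,
    algorithm="HS256",
)

# Verify: on every request, decode the token; an expired or tampered token raises.
claims = jwt.decode(token, SECRET, algorithms=["HS256"])
print(claims["sub"])  # -> "joe"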
Previous version
Use a cookie anyway and break the stateless part of REST.
Don't use sessions; with sessions your REST service won't scale well... There are 2 states here: application state (or client state, or session state) and resource state. Application state contains the session data and is maintained by the REST client. Resource state contains the resource properties and relations and is maintained by the REST service. You can decide very easily whether a particular variable is part of the application state or the resource state: if the amount of data increases with the number of active sessions, then it belongs to the application state. So, for example, the identity of the user behind the current session belongs to the application state, but the list of users or the user permissions belongs to the resource state.
So the REST client should store the identification factors and send them with every request. Don't confuse the REST client with the HTTP client; they are not the same. A REST client can be on the server side too if it uses curl, or it can, for example, create a server-side HTTP-only cookie which it shares with the REST service via CORS. The only thing that matters is that the REST service has to authenticate every request, so you have to send the credentials (username, password) with every request.
If you write a client-side REST client, then this can be done with SSL + HTTP auth. In that case you can create a credentials -> (identity, permissions) cache on the server to make authentication faster. Be aware that if you clear that cache and the users send the same request, they will get the same response; it will just take a bit longer. Compare this with sessions: if you clear the session store, the users will get a 401 Unauthorized response...
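The client side of "SSL + HTTP auth" is just sending the credentials on every call, and the server-side cache can be as simple as memoizing the credential check. A sketch (the URL, the stand-in user store, and the cache policy are assumptions):

import hashlib
from functools import lru_cache

import requests

# Client: credentials accompany every request over HTTPS; no session is created.
resp = requests.get("https://api.example.com/users/42", auth=("joe", "joepassword"))

# Server (sketch): cache credentials -> identity so repeated requests don't pay
# the full password-verification cost each time. Clearing this cache only slows
# the next request down; it does not log anyone out.
USERS = {"joe": hashlib.sha256(b"joepassword").hexdigest()}  # stand-in user store

@lru_cache(maxsize=1024)
def authenticate(username, password):
    stored = USERS.get(username)
    supplied = hashlib.sha256(password.encode()).hexdigest()
    return username if stored == supplied else None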
If you write a server-side REST client and you send the identification factors to the REST service via curl, then you have 2 choices: you can use HTTP auth as well, or you can use a session manager in your REST client, but not in the REST service.
If somebody untrusted writes your REST client, then you have to write an application to authenticate the users and give them the ability to decide whether or not they want to grant permissions to different clients. OAuth is an existing solution for that. OAuth 1 is more secure, OAuth 2 is less secure but simpler, and I guess there are several other solutions for this problem... You don't have to reinvent this. There are complete authentication and authorization solutions using OAuth, for example the WSO2 Identity Server.
Cookies are not necessarily bad. You can use them in a RESTful way as long as they hold client state and the service holds resource state only. For example, you can store the cart or the preferred pagination settings in cookies...