Implementing an asynchronous web API - web-services

We are developing a web API which processes potentially very large amounts of user-submitted content, which means that calls to our endpoints might not return immediate results. We are therefore looking at implementing an asynchronous/non-blocking API. Currently our plan is to have the user submit their content via:
POST /v1/foo
The JSON response body contains a unique request ID (a UUID), which the user then submits as a parameter in subsequent polling GETs on the same endpoint:
GET /v1/foo?request_id=<some-uuid>
If the job is finished the result is returned as JSON, otherwise a status update is returned (again JSON).
(Unless they fail, both of the above calls simply return a "200 OK" response.)
Is this a reasonable way of implementing an asynchronous API? If not, what is the 'right' (and RESTful) way of doing this? The model described here recommends creating a temporary status-update resource and then a final result resource, but that seems unnecessarily complicated to me.

Actually, the way described in the blog post you mentioned is the 'right' RESTful way of processing asynchronous operations. I've implemented an API that handles large file uploads and conversion, and it works this way. In my opinion this is not overcomplicated, and it is definitely better than delaying the response to the client or similar workarounds.
One additional note: if a task has failed, I would still return 200 OK together with a representation of the task resource and the information that the resource creation has failed.
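To make the shape of this concrete, here is a minimal sketch of the pattern from the question using Flask and an in-memory job store; the /v1/foo paths come from the question, while process_content and the jobs dictionary are made-up placeholders, not a production design:

import threading
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
jobs = {}  # request_id -> {"status": ..., "result": ...}

def process_content(request_id, payload):
    # Stand-in for the real long-running work on the submitted content.
    jobs[request_id]["status"] = "processing"
    result = {"bytes_received": len(payload)}  # pretend result
    jobs[request_id].update(status="done", result=result)

@app.route("/v1/foo", methods=["POST"])
def submit():
    request_id = str(uuid.uuid4())
    jobs[request_id] = {"status": "queued", "result": None}
    # Kick the work off outside the request/response cycle.
    threading.Thread(target=process_content,
                     args=(request_id, request.get_data())).start()
    return jsonify({"request_id": request_id})  # 200 OK

@app.route("/v1/foo", methods=["GET"])
def poll():
    job = jobs.get(request.args.get("request_id"))
    if job is None:
        return jsonify({"error": "unknown request_id"}), 404
    if job["status"] == "done":
        return jsonify(job["result"])            # finished: return the result
    return jsonify({"status": job["status"]})    # otherwise: a status update

A real service would hand the work to a proper task queue and persist job state somewhere durable, but the request and response shapes are exactly the ones described above.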

Related

What are best practices for API error response handling?

Is it best practice to handle such API errors with try/catch, or should the API response be structured the way the Google, Facebook, and Microsoft APIs structure theirs?
(Example error responses from Google, Facebook, and Microsoft API calls were included here.)
As a general rule, there isn't such a thing as a standard API, so there also isn't a best practice as such either. If you are dealing with multiple APIs within your app, you'll end up having at least a handful of variations in what you check for and how you adapt.
Depending on how terminal the failure is, and where it happens in their processing stack, the HTTP status may be set, and you may also get an HTML, JSON or XML body with more detail (no matter what you thought you might get).
APIs also fail randomly with transient errors, so for your code to work reliably, you probably need a retry loop somewhere.
They also throttle, so some kind of detect/backoff/retry handler would help (details vary per API, as ever).
Pseudocode:
retry loop {
    request
    check connection (network errors)
    check HTTP status code
    check body
    parse body if valid and extract errors
    if terminal failure exit (authentication/authorisation etc.)
    if throttling backoff
}
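As a rough Python rendering of that loop, using the requests library (the status codes checked, the backoff schedule, and the error handling are illustrative assumptions, not a recipe for any particular API):

import time
import requests

def call_with_retries(url, payload, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(url, json=payload, timeout=30)
        except requests.exceptions.RequestException:
            # Network/connection error: transient, so back off and retry.
            time.sleep(2 ** attempt)
            continue

        if resp.status_code in (401, 403):
            # Terminal failure (authentication/authorisation): don't retry.
            raise RuntimeError("auth failure: %s" % resp.text)

        if resp.status_code == 429:
            # Throttled: back off and retry (a real handler would honour Retry-After).
            time.sleep(2 ** attempt)
            continue

        try:
            body = resp.json()
        except ValueError:
            # Body wasn't valid JSON even though we expected it to be.
            time.sleep(2 ** attempt)
            continue

        if resp.ok:
            return body

        # Some other error: log whatever detail the body carried and retry.
        time.sleep(2 ** attempt)

    raise RuntimeError("gave up after %d attempts" % max_attempts)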

Empty body in API Gateway response

I have been trying to get the response from API Gateway, but after countless tries and going through several online answers, I still wasn't able to solve my issue.
When I test my POST method for the API, it gives me the proper response in the Lambda test and the API Gateway method test, but when I call it from my React app, it doesn't return the same output.
My lambda snippet:
const response = {
    statusCode: 200,
    body: JSON.stringify({ payload: { "key": "value" } })
};
return response;
But the response I am getting with the fetch API in my React app is not the same; a screenshot of that response was attached here.
I am new to AWS and would appreciate it if someone could point me in the right direction.
The fetch API allows you to receive responses as a ReadableStream, which is what the screenshot shows you are receiving. The resource linked here should be helpful in working out how to properly handle the response.
There are also many other commonly used libraries, like axios, that are primarily promise/callback driven, so you won't have to worry about streams much unless you want to. You should be able to get fetch working with promises too, but I've never done it myself.
In general, streams are really useful when you have a large amount of data and receiving it all at once in a giant chunk would be really slow, cause timeouts, etc.

How to update progress bar while making a Django Rest api request?

My Django REST app accepts a request to scrape multiple pages for prices and compare them (which takes about 5 seconds), then returns a list of the prices from each page as a JSON object.
I want to keep the user updated on the current operation; for example, if I scrape 3 pages I want to update the interface like this:
Searching 1/3
Searching 2/3
Searching 3/3
How can I do this?
I am using Angular 2 for my front end but this shouldn't make a big difference as it's a backend issue.
This isn't the only way, but this is how I do this in Django.
Things you'll need
Asynchronous worker processes
This allows you to do work outside the context of the request-response cycle. The most common are either django-rq or Celery. I'd recommend django-rq for its simplicity, especially if all you're implementing is a progress indicator.
Caching layer (optional)
While you could use the database for persistence in this case, a temporary key-value cache makes more sense here because the progress information is ephemeral. The Memcached backend is built into Django, but I'd recommend switching to Redis: it's more fully featured, very fast, and since it sits behind Django's caching abstraction it doesn't add complexity. (It's also a requirement for the django-rq worker processes above; a settings sketch follows below.)
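As a rough idea of the wiring, the relevant settings might look something like this, assuming the django-redis and django-rq packages and a Redis server on localhost (hosts, ports and database numbers are placeholders):

# settings.py (sketch)

# Cache backend used for the progress values (django-redis package).
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
    }
}

# Queue configuration for django-rq's worker processes.
RQ_QUEUES = {
    "default": {
        "HOST": "127.0.0.1",
        "PORT": 6379,
        "DB": 0,
    }
}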
Implementation
Overview
Basically, we're going to send a request to the server to start the async worker, and poll a different progress-indicator endpoint which gives the current status of that worker's progress until it's finished (or failed).
Server side
Refactor the function you'd like to track the progress of into an async task function (using the @job decorator in the case of django-rq).
The initial POST endpoint should first generate a random unique ID to identify the request (possibly with uuid). Then pass the POST data along with this unique ID to the async function (in django-rq this would look something like function_name.delay(payload, unique_id)). Since this is an async call, the interpreter does not wait for the task to finish and moves on immediately. Return an HttpResponse with a JSON payload that includes the unique ID.
Back in the async function, we need to record the progress using the cache. At the very top of the function, add cache.set(unique_id, 0) to show that there is zero progress so far. Using your own math, move this value closer to 1 as the work approaches 100% completion. If for some reason the operation fails, you can set this to -1.
Create a new endpoint to be polled by the browser to check the progress. It looks for a unique_id query parameter and uses cache.get(unique_id) to look up the progress. Return a JSON object with the progress amount. (A stripped-down sketch of these pieces follows below.)
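Put together, a minimal sketch of those three pieces might look roughly like the following; the URL wiring is omitted, and compare_prices, scrape_price and the view names are placeholders invented for the example:

import uuid

from django.core.cache import cache
from django.http import JsonResponse
from django_rq import job

def scrape_price(url):
    # Placeholder for the real scraping logic.
    return {"url": url, "price": 0}

@job
def compare_prices(urls, unique_id):
    cache.set(unique_id, 0)                      # zero progress so far
    results = []
    try:
        for i, url in enumerate(urls, start=1):
            results.append(scrape_price(url))
            cache.set(unique_id, i / len(urls))  # fraction complete, ends at 1
        cache.set(unique_id + ":result", results)
    except Exception:
        cache.set(unique_id, -1)                 # signal failure to the poller

def start_comparison(request):
    urls = request.POST.getlist("urls")
    unique_id = str(uuid.uuid4())
    compare_prices.delay(urls, unique_id)        # runs in the RQ worker
    return JsonResponse({"unique_id": unique_id})

def progress(request):
    unique_id = request.GET["unique_id"]
    return JsonResponse({"progress": cache.get(unique_id)})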
Client side
After sending the POST request for the action and receiving a response, that response should include the unique_id. Immediately start polling the progress endpoint at a regular interval, setting the unique_id as a query parameter. The interval could be something like 1 second using setInterval(), with logic to prevent sending a new request if there is still a pending request.
When the progress received equals 1 (or -1 for failures), you know the process is finished and you can stop polling.
That's it! It's a bit of work just to get progress indicators, but once you've done it once it's much easier to re-use the pattern in other projects.
Another way to do this, which I have not explored, is via WebSockets / Django Channels. That way polling is not required, and the server simply pushes messages to the client directly.

Auditing Jetty Client requests and responses

I have a requirement to count Jetty transactions and measure the time it took to process the request and get back the response, using JMX for our monitoring system.
I am using Jetty 8.1.7 and I can't seem to find a proper way to do this. I basically need to identify when a request is sent (due to Jetty's async approach this is triggered from thread A) and when the response is complete (as onResponseComplete() is invoked on another thread).
I usually use a ThreadLocal for this kind of state in other areas where I need similar functionality, but obviously that won't work here.
Any ideas how to overcome this?
To use jetty's async requests you basically have to subclass ContentExchange and override its methods. So you can add an extra field to it which would contain a timestamp of when the request was sent, and use it later in your onResponseComplete() method to measure the processing time. If you need to know the time when your request was actually sent to the server instead of when it was created you can override the onRequestCommitted() and onRequestComplete() methods.

RESTful way to trigger server-side events

I have a situation where I need my API to have a call for triggering a server-side event; no information (besides authentication) is needed from the client, and nothing needs to be returned by the server. Since this doesn't fit well into the standard CRUD/resource interaction, should I take this as an indicator that I'm doing something wrong, or is there a RESTful design pattern to deal with these conditions?
Your client can just:
POST /trigger
To which the server would respond with a 202 Accepted.
That way your request can still contain the appropriate authentication headers, and the API can be extended in the future if you need the client to supply an entity, or need to return a response with information about how to query the event status.
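As a tiny illustration of the client side, using Python's requests library (the URL and the token are placeholders):

import requests

# Fire the server-side event; no request body is needed, only auth.
resp = requests.post(
    "https://api.example.com/trigger",
    headers={"Authorization": "Bearer <token>"},
)

# 202 Accepted means the server has accepted the work for processing.
if resp.status_code == 202:
    print("event triggered")
else:
    resp.raise_for_status()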
There's nothing "non-RESTful" about what you're trying to do here; REST principles don't have to correlate to CRUD operations on resources.
The spec for 202 says:
The entity returned with this response SHOULD include an indication of the request's current status and either a pointer to a status monitor or some estimate of when the user can expect the request to be fulfilled.
You aren't obliged to send anything in the response, given the "SHOULD" in the definition.
REST defines the nature of the communication between the client and server. In this case, I think the issue is that there is no information to transfer.
Is there any reason the client needs to initiate this at all? I'd say your server-side event should be entirely self-contained within the server. Perhaps kick it off periodically with a cron call?