Business logic and restful API design - web-services

Let's assume we have a simple API allowing clients to fetch a list of items of a specific type:
GET /items/foo
GET /items/bar
GET /items/blah
A response is a list of items of the requested type; each entry has a unique ID.
The client will usually display these items in table/grid/etc.
Now we must implement a pinning feature in the client, so another API allows pinning/unpinning items based on their ID and their type. I was discussing with my colleagues how to inform the client about which items are pinned.
An option was to have another API GET /pinning/{type} to return the list of all the pinned items of a specified type.
Another solution was to use a similar API GET /pinning/{type} to return the list of the IDs of all the pinned items. Let the client sort it out.
The first solution was accepted. Their argument was that the backend is responsible for business logic and that the client shouldn't be involved in business logic so the client should just display data it receives from the server. This argument didn't sell it for me. I'm thinking the server should in this case provide the data that allows the client to perform additional presentation logic.
Which solution is better? Or what other solutions are possible?

If the server only returned item IDs at GET /pinning/{type}, the client would have to repeatedly call something like GET /items/{itemId} to obtain data it can display in the UI, right? That in turn would just increase the load on the server. If the ID alone is enough, you can probably get away with the proposed solution. Since both the client and the server seem to be under the same umbrella (as in, your company is also the API consumer), you have enough information to make a decision.
Even if it were a Public API with lots of clients I would still go down the route of returning items instead of just itemIds - probably in a paged manner, for performance reasons.
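For the case where the ID alone is enough, because the client already holds the list from GET /items/{type}, the ID-only variant amounts to a local join. The following is only a sketch in Python under that assumption; the `mark_pinned` helper and the payload shapes are hypothetical and not taken from the question.

```python
from typing import TypedDict


class Item(TypedDict):
    id: int
    name: str


def mark_pinned(items: list[Item], pinned_ids: set[int]) -> list[dict]:
    """Return copies of the items with a 'pinned' flag, pinned items first."""
    flagged = [{**item, "pinned": item["id"] in pinned_ids} for item in items]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(flagged, key=lambda item: not item["pinned"])


# Hypothetical payloads: GET /items/foo and an ID-only GET /pinning/foo.
items: list[Item] = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "gamma"},
]
pinned_ids = {3}

rows = mark_pinned(items, pinned_ids)
```

If the client does not already have the items (for example a dedicated "pinned" view), this join is not possible without extra requests, which is the situation the answer above warns about.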

Related

What is the best practice to write an API for an action that affects multiple tables?

Consider the example use case as below.
You need to invite a Company as your connection. The sub-actions that need to happen in this situation are:
A Company needs to be created by adding an entry to the Company table.
A User account needs to be created for the staff member to login by creating an entry in the User table.
A Staff object is created to ensure that the User has access to the Company by creating an entry in the Staff table.
The invited company is related to the inviting company, so a relation similar to friendship is created to connect the two companies by creating an entry in the Connection table.
An Invitation object is created to store the information as to who invited whom onto the system, with other information like invitation time, invite message etc. For this, an entry is created in the Invitation table.
An email needs to be sent to the user to accept the invitation and join by setting a password.
As you can see, entries are to be made in 5 Tables.
Is it a good practice to do all this in a single API call?
If not, what are the other options?
How do I maintain data integrity if it is to be split into multiple APIs?
If the actions need to be atomic, then it's definitely best to do this in a single API call. Otherwise, you run the risk of someone not completing all the tasks required and leaving the resources in a potentially conflicting state.
That said, you're not updating a single resource, so this isn't a good fit for a single RESTful resource creation call (e.g., POST /companyInvitations) -- as all these other things being created and stitched together might lead to quite a bit of confusion.
If the action you're doing is "inviting a Company", then one option is to use Google's "custom method" syntax (POST /resources/1234:action) as defined in AIP-136. In this case, you might do POST /companies/1234:invite which says "I want to invite Company #1234 to be my connection".
Under the hood, this might atomically upsert (create if resources don't already exist) all the right things that you've listed out.
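What "atomically" could mean under the hood is sketched below in Python with the standard-library `sqlite3` module. The schema, the column names and the `invite_company` function are assumptions made for illustration; the question does not specify them.

```python
import sqlite3

# Hypothetical schema: one table per step listed in the question.
SCHEMA = """
CREATE TABLE company    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE user       (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
CREATE TABLE staff      (user_id INTEGER NOT NULL, company_id INTEGER NOT NULL);
CREATE TABLE connection (id INTEGER PRIMARY KEY, from_company_id INTEGER NOT NULL,
                         to_company_id INTEGER NOT NULL);
CREATE TABLE invitation (id INTEGER PRIMARY KEY, connection_id INTEGER NOT NULL, message TEXT);
"""


def invite_company(conn, inviter_id, name, email, message):
    """Create all five rows in one transaction; return the invitation id."""
    # "with conn" commits on success and rolls back if any statement raises.
    with conn:
        company_id = conn.execute(
            "INSERT INTO company (name) VALUES (?)", (name,)
        ).lastrowid
        user_id = conn.execute(
            "INSERT INTO user (email) VALUES (?)", (email,)
        ).lastrowid
        conn.execute(
            "INSERT INTO staff (user_id, company_id) VALUES (?, ?)",
            (user_id, company_id),
        )
        connection_id = conn.execute(
            "INSERT INTO connection (from_company_id, to_company_id) VALUES (?, ?)",
            (inviter_id, company_id),
        ).lastrowid
        invitation_id = conn.execute(
            "INSERT INTO invitation (connection_id, message) VALUES (?, ?)",
            (connection_id, message),
        ).lastrowid
    # Sending the email belongs after the commit: it cannot be rolled back.
    return invitation_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO company (name) VALUES ('Inviter Ltd')")
conn.commit()

first = invite_company(conn, 1, "Acme", "staff@acme.example", "Join us")
try:
    # The duplicate email violates UNIQUE, so the whole invite is rolled back,
    # including the company row that was already inserted.
    invite_company(conn, 1, "Other", "staff@acme.example", "Again")
except sqlite3.IntegrityError:
    pass
company_count = conn.execute("SELECT COUNT(*) FROM company").fetchone()[0]
```

The failed second call leaves no half-created company behind, which is the property that is hard to guarantee when the same steps are spread over several API calls.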
Something to consider when designing an API call that triggers multiple actions is how long those downstream actions take. Leaving the API call blocked while things are processing in the background isn't the best idea in the world.
You could consider (depending on your use case) taking in the API request, immediately responding with a 200 status (202 Accepted is the more precise choice for work that has not finished yet), and dropping the request onto an internal queue for processing. When your background service picks up the request it can update whatever needs to be updated and manage the transactions appropriately etc. This also caters for horizontal scaling scenarios where lots of "worker" services can be deployed to process the requests.
As part of this you could consider adding another "status" endpoint where requests can be made to find out how things are going. To avoid lots of polling status requests you could also take in callback details as part of the original api call which then gets called when the background processing is complete. Or you could do both!
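The accept-then-process flow with a status endpoint might look roughly like this. It is an in-process sketch using Python's standard `queue` and `threading` modules; a real deployment would more likely use an external message queue, and the handler names and the `/jobs/{id}` path are invented for the example.

```python
import queue
import threading
import uuid

jobs = queue.Queue()          # stands in for the internal queue
status: dict[str, str] = {}   # stands in for a shared status store


def accept_request(payload: dict) -> tuple[int, dict]:
    """Handler for the original call: enqueue and answer immediately."""
    job_id = str(uuid.uuid4())
    status[job_id] = "queued"
    jobs.put((job_id, payload))
    # 202 Accepted: the request was received but is not processed yet.
    return 202, {"job_id": job_id, "status_url": f"/jobs/{job_id}"}


def get_status(job_id: str) -> tuple[int, dict]:
    """Handler for the hypothetical status endpoint."""
    if job_id not in status:
        return 404, {}
    return 200, {"status": status[job_id]}


def worker() -> None:
    """Background worker; several of these could run in parallel."""
    while True:
        job_id, payload = jobs.get()
        status[job_id] = "processing"
        try:
            # ... the real work (database transaction, email) would go here ...
            status[job_id] = "done"
        except Exception:
            status[job_id] = "failed"
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

code, body = accept_request({"company": "Acme"})
jobs.join()  # demo only: wait until the worker has finished the job
final = get_status(body["job_id"])
```

A callback variant would store a caller-supplied URL with the job and have the worker call it where the sketch sets the status to "done".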

Access list data as a group

We have a company program designed to help us get control over data. It has a feature to group all the applications of one Client. If I want to take a look at them, I click on the Client and I see a list of all applications made for that Client. Take a look at the picture below:
I was wondering if Microsoft Access can do the same? If yes where should I start looking?
I searched the internet and found no solution.
That is built in, and it is called a Subdatasheet. If you have relationships properly set between Clients and Orders, for instance, then when you open the Clients table you will see a small "+" that lets you view the Orders of the current client. You may have to set the Subdatasheet Name property of the Clients table to "Orders" in this case.
If you want to work with forms, you can build a continuous form for Clients, then one for Orders, then insert the Orders subform in the Footer of the Clients form. Access might tell you that you can't do this; just ignore it, it works.
In Access that would simply be a continuous form with a filter. Typically opened from a list of clients, setting a filter for the applications of the selected client.
Unless I'm misunderstanding the question.

Best Practices to update multiple records with a single server request

I have a User model which hasMany phones. The UI for the user allows adding, deleting, and updating phones on a single form.
When the user submits the form, all changes to the phone list are sent to the server in a single request.
I have extended the App.UserSerializer with custom serializeHasMany to include all the phone details in the single request.
The real problem is to sync the store state after the request is complete.
Basically I need to solve these two problems:
Remove deleted records from the store. I could not find any method that just removes a record from a store.
Update new records with the IDs generated by the server. (Or just remove the new records from the store and the hasMany array, since the response creates duplicates for the added records.)
Are there any best practices or workarounds for this kind of scenario?
Thank you.
I think the best practice for now is just sticking to regular REST. In your case this will mean a few extra requests (really though, how many phones can a user have?), but it will spare you a lot of effort in handling things manually.
Ember may support bulk updates in the future (https://github.com/emberjs/data/blob/master/TRANSITION.md, "We plan to support batch saving with a single HTTP request through a dedicated API in the future.")

SOA/Web Service Pagination

In SOA we should not be building or holding state (or designing dependencies) between client and server. This is understood. But what patterns can be followed in the case that a client wants to consume a real-time service that may return an open ended number of 'rows'?
Web applications, similar to SOA but allowing for state (sessions), have solved this with pagination. Pagination requires (in most cases, especially with SQL) that the server holds the data and that the client requests the data in chunks.
If we were to consider pagination-like scenarios for web services, what patterns would these follow that would still allow the tenets of SOA to be adhered to (or as closely as possible)?
Some rules for the thinkers:
1) Backed by a SQL database (therefore there is no concept of a row number in a select set)
2) It is important to not skip a row or duplicate a row in a set during pagination
3) Data may be inserted and deleted at any time into the database by other clients
4) There is no need to consider the dataset a live (update-able) dataset
Personally, I think that 1 and 2 above already spell out the solution by constraining the solution space with the requirements.
My proposed solution would have the data (as much as is selected) be stored in a read-only store/cache where it can be assigned a row number within the result set, allowing pagination to occur on this data snapshot. I would have infrastructure to store snapshots (servers, external caches, memcached or ehcache; this must scale quite large). The result of such a query would be a snapshot ID, and clients could retrieve the data from the snapshot using a snapshot API (web services) and the snapshot ID. Results would be processed in a read-only, forward-only manner, x records at a time, where x is something reasonable.
Competing thoughts and ideas, criticisms or accolades would be greatly appreciated.
Paginated results in a Web Service are actually quite easy to achieve.
All you have to do is add two parameters to the web service call: Page Size, Page Number.
Page Size is the number of results to include in a page. Page Number is the number of the page of results you are looking for.
Your web service then goes back to the database (or cache), retrieves the results, figures out which results fit on the requested page, and returns only those results.
The client then has to make a single request per page of results they want from the service.
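The page size / page number scheme described above can be sketched as follows, using Python's `sqlite3` as a stand-in for the database. The `item` table and the `get_page` function are assumptions for illustration only.

```python
import sqlite3


def get_page(conn, page_size, page_number):
    """Return one page of rows; page_number starts at 1."""
    offset = (page_number - 1) * page_size
    # A fixed ORDER BY is required, otherwise the page boundaries are undefined.
    return conn.execute(
        "SELECT id, name FROM item ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{n}",) for n in range(1, 8)],
)

page_2 = get_page(conn, page_size=3, page_number=2)
```

Note that this queries the live table on every call, so rows inserted or deleted between two requests shift the offsets and can cause a row to be skipped or repeated, which the question's rule 2 forbids.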
What you propose with memcached will also work with a caching table. The first service call would (1) INSERT results INTO the caching table with a snapshot ID (2) return the first page from the caching table and the snapshot ID. Subsequent calls would return pages based on page size and page number by querying the caching table using the snapshot ID.
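A rough sketch of the caching-table variant, again with `sqlite3`. The table layout (`snapshot_row` with a `row_no` column) and both function names are hypothetical. For portability the rows are copied through Python rather than with a single INSERT ... SELECT statement.

```python
import sqlite3
import uuid


def create_snapshot(conn):
    """Copy the current result set into the cache table with row numbers."""
    snapshot_id = str(uuid.uuid4())
    rows = conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()
    with conn:  # commit the whole snapshot or nothing
        conn.executemany(
            "INSERT INTO snapshot_row (snapshot_id, row_no, item_id, name) "
            "VALUES (?, ?, ?, ?)",
            [(snapshot_id, n, item_id, name)
             for n, (item_id, name) in enumerate(rows, start=1)],
        )
    return snapshot_id


def get_snapshot_page(conn, snapshot_id, page_size, page_number):
    """Return one page from the frozen snapshot; page_number starts at 1."""
    first = (page_number - 1) * page_size + 1
    return conn.execute(
        "SELECT item_id, name FROM snapshot_row "
        "WHERE snapshot_id = ? AND row_no BETWEEN ? AND ? ORDER BY row_no",
        (snapshot_id, first, first + page_size - 1),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE snapshot_row (snapshot_id TEXT, row_no INTEGER, "
    "item_id INTEGER, name TEXT, PRIMARY KEY (snapshot_id, row_no))"
)
conn.executemany(
    "INSERT INTO item (name) VALUES (?)",
    [(f"item-{n}",) for n in range(1, 6)],
)

snapshot_id = create_snapshot(conn)
conn.execute("DELETE FROM item WHERE id = 2")  # a later change to the live table
page_1 = get_snapshot_page(conn, snapshot_id, page_size=3, page_number=1)
```

The deleted row still appears in the snapshot page, which illustrates the trade-off: pages are stable, but they reflect the data as of the first call, and old snapshots need some expiry policy that the sketch leaves out.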
I should think this could also be optimized by using an in-memory caching table, but that depends on whether your database supports INSERT-INTO from a disk table to an in-memory table. That might get complicated in a clustered environment though.
Such a cache is stateful by its very nature if you are retaining a client-specific copy between requests, whether storage is in a session object, database table or memcached data store. Given the requirements though, you have no choice but to cache results in some form or another, unless you accept the risk of returning deleted or no-longer-relevant records as legitimate results.
SOA is not meant for such low level functionality.
SOA is meant to glue together business areas, not frontends to backends. Just because your application talks to the back end using web services does not mean you have a "SOA" application. That is nonsense, since SOA is meaningless in the context of one isolated system.
From that point of view, it is clear that, in SOA, the caller should not know about the SQL table you are paginating; that's an implementation detail that SOA should hide. On the other hand, the server should not know about the client's state, because it should be agnostic to the details of the clients to be really open.
So, just understand that pagination is not SOA. Do as you wish, but understand that the web service you are using to paginate is an internal artifact of your application, not to be used by external clients on a SOA bus. Also remember that it cannot be transactionally consistent without state on the server. The problem is probably that you have only one service layer for both the application's UI and the SOA bus; you need to separate them.
Using this web service on a SOA bus would be bad. It cannot be consistent as the user paginates, and as other applications attach to it they become tied to the specific SQL.
... then you might as well have granted direct SQL access to the table for all that matters.
SOA is for business messages between systems, not to glue an application's frontend to the backend.
Same problem, resolved using the Navision approach.
$ws->getList($first_record_id, $limit)
This returns a page of $limit elements starting after the passed id:
select * from collection where collection.id > $first_record_id order by id ASC limit $limit
Navision uses a Key (each element has a key), but in MySQL an autoincrement id is better.
In this case pagination is intended for handling large result sets and not for frontend pagination...
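The same idea (often called keyset pagination) as a small runnable sketch, with Python's `sqlite3` standing in for MySQL. The `get_list` function mirrors the call above, but its Python signature and the sample data are assumptions.

```python
import sqlite3


def get_list(conn, last_seen_id, limit):
    """Return up to `limit` rows whose id is greater than last_seen_id."""
    return conn.execute(
        "SELECT id, name FROM collection WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_seen_id, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collection (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.executemany(
    "INSERT INTO collection (name) VALUES (?)",
    [(f"row-{n}",) for n in range(1, 6)],
)

page_1 = get_list(conn, 0, 2)
# Another client deletes a row that has already been delivered.
conn.execute("DELETE FROM collection WHERE id = 2")
# The next page continues after the last id the client saw.
page_2 = get_list(conn, page_1[-1][0], 2)
```

Because each page is anchored on the last delivered id rather than on an offset, the deletion does not cause a row to be skipped or repeated, and the server keeps no per-client state.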
I am not sure SOA is the concern here. The problem you have seems to be with paginating your APIs. I will point you to how Twitter handles pagination: dev.twitter.com/rest/public/timelines

RESTful API design question - how should one allow users to create new resource instances?

I'm working in a research group where we intend to publish implementations of some of the algorithms we develop on the web via a RESTful API. Most of these algorithms work on small to medium size datasets, and in many cases, a user of our services might want to run multiple queries (with different parameters) on the same dataset, so for me it seems reasonable to allow users to upload their datasets in advance and refer to them in their queries later. In this sense, a dataset could be a resource in my API, and an algorithm could be another.
My question is: how should I let the users upload their own datasets? I cannot simply let users upload their data to /dataset/dataset_id, as letting the users invent their own dataset_ids might result in ID collisions and users overwriting each other's datasets by accident. (I believe one of the most frequently used dataset IDs would be test.) I think an ideal way would be to have a dedicated URL (like /dataset/upload) where users can POST their datasets and the response would contain a unique ID under which the dataset was stored, but I'm not sure that it does not violate the basic principles of REST. What is the preferred way of dealing with such scenarios?
According to this you should not have a dedicated URI, and should rather handle POST to /dataset/ as creation.
Your idea is not violating the principles of REST :)
The preferred way is to use POST and return the path to the newly created resource in the Location header.
In your case, the client POSTs to /dataset. The server generates an identifier and returns a reference to the dataset in the Location header:
Location: /dataset/1234
The response status should be 201 (Created).
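A framework-free sketch of such a handler in Python follows. The in-memory `datasets` store, the `post_dataset` function and the use of a UUID as identifier are assumptions; any server-generated unique ID would do.

```python
import uuid

# Hypothetical in-memory store; a real service would persist uploads.
datasets: dict[str, bytes] = {}


def post_dataset(body: bytes) -> tuple[int, dict[str, str]]:
    """Handle POST /dataset: store the upload under a server-generated ID."""
    dataset_id = uuid.uuid4().hex
    datasets[dataset_id] = body
    # 201 Created plus a Location header pointing at the new resource.
    return 201, {"Location": f"/dataset/{dataset_id}"}


status_code, headers = post_dataset(b"x,y\n1,2\n")
```

Since the server picks the ID, two users who both think of their upload as "test" can never overwrite each other's data.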