API Versioning Best Practices - Should a v1 GET show v2 items?

We've shipped our service's public API with a version we'll support for at least 18 months, and we're now starting on some new features which will go into v2.
I'm reading up on this but haven't found the answer yet.
When designing a new API version for a public web service:
Our v2 entities include at least all of the properties of the v1 entities, but they often add some new properties as well. With this in mind...
When a customer does a v1 GET, should we show v2 items at all?
How about when they do a v2 GET?
v2 adds some properties that v1 doesn't have. With a v2 GET, should we return v1 items as well? In that case, should we just leave those properties empty?
What's the 'right way' to do this?

There are always edge cases in specific API implementations, but if we're talking about REST over HTTP, then every API version where said entity exists should be available, forward and backward.
First, let's consider a resource. Let's say you have /order/123. Remember that REST is Representational State Transfer. The API version should not be thought of as a version in binary code nor an API as a method that is invoked. HTTP is the API and GET is the method (think http.get(request)). That being said, the API version indicates how you want the entity to be represented. An API version is better thought of as a form of media type negotiation.
If I'm a customer or a client integrating with the Orders API, I'm not thinking specifically about API versions. I just know I have order 123. This order exists regardless of the API version. Client code only cares about an API version in the context of "I know/expect an order to look like...". This means that all orders should be available in all versions. This behavior likely requires some special handling or data transfer objects (DTOs) to make things work as expected over the wire. This could be hiding/removing new members or filling in missing members, if possible. This can lead to a lot of extra work, which is why it's important to have a sound versioning policy such as N-2.
You've elected to version by query string, so that puts you in a pretty good position. Making the query parameter required in all versions is an excellent policy because the client should always have to explicitly ask for what they want. We all know what happens when the server assumes. Versioning by URL segment, despite its popularity, is the worst way to version. It violates the Uniform Interface REST constraint. A resource is identified by its URL path - the whole path. A human being sees 123 as the identifier, but to HTTP it's order/123. v1/order/123 and v2/order/123 are not different orders (as the URL implies); rather, they are different representations. The query string never identifies a resource, so it is a reasonable, if not pragmatic, approach to versioning as opposed to true media type negotiation.
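To make that concrete, here is a minimal sketch of query-string versioning with per-version DTOs, assuming a hypothetical Flask service and invented field names (nothing here comes from the question itself):

# Hypothetical sketch: one resource, multiple representations chosen by a
# required api-version query parameter. Field names are illustrative.
from flask import Flask, abort, jsonify, request

app = Flask(__name__)

# The order exists independently of any API version.
ORDERS = {123: {"id": 123, "total": 9.99, "loyalty_points": 42}}

def to_v1(order):
    # v1 DTO: hide members that did not exist in v1.
    return {"id": order["id"], "total": order["total"]}

def to_v2(order):
    # v2 DTO: the full representation, including the new property.
    return dict(order)

REPRESENTATIONS = {"1.0": to_v1, "2.0": to_v2}

@app.route("/order/<int:order_id>")
def get_order(order_id):
    version = request.args.get("api-version")
    if version not in REPRESENTATIONS:
        abort(400, "An explicit, supported api-version is required.")
    order = ORDERS.get(order_id)
    if order is None:
        abort(404)
    # Same resource, different representation per requested version.
    return jsonify(REPRESENTATIONS[version](order))

Note that /order/123?api-version=1.0 and /order/123?api-version=2.0 identify the same order; only the representation returned differs.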

I think the APIs should be different.
You can't change API v1, because customers have implemented their software against what you return from your API.
So you should implement a different API, and in it you are free to change all you want.
Obviously, the API version must be clearly exposed in the endpoint, for example:
myapi.api.com/v2/get/articles
(And don't forget NOT to change the version 1 endpoints.)


How to change S3's V4 signature to V2?

I have a successfully working algorithm that computes the V4 signature for Amazon S3 web services, but I need to implement a V2 signer too.
Is there a list of all the differences between those two signature versions?
I found this, but it isn't everything that's needed: https://docs.aws.amazon.com/general/latest/gr/sigv4_changes.html
It's really best not to think of the "changes" between Signature V2 and Signature V4.
Here is the full list of what is different between the two algorithms:
Everything.
That is, the differences far outnumber the similarities.
Almost all parameter names are different, their values are calculated differently, and the values are in different formats. V4 was a complete redesign of the signing algorithm. Almost nothing meaningful from V2 is found in V4.
So, based on experience, my suggestion is that you don't try to adapt or modify your V4 code to also do V2. Keep them separate. Implementing V2 from scratch will be far easier. Don't even think about V4 while working on V2. Just implement V2.
S3 has its own section on V2, at https://docs.aws.amazon.com/AmazonS3/latest/dev/auth-request-sig-v2.html
The IAM docs also discuss V2, at
https://docs.aws.amazon.com/general/latest/gr/signature-version-2.html
The S3 implementation has some quirky attributes, so prefer the S3 documentation first.
Note that the Signature field in V2 absolutely requires that + be url-escaped (percent-encoded) after base64 encoding, but genuine S3 will quietly ignore the fact that you didn't encode = and possibly /. An implementation that overlooks the url-escaping requirement will therefore generate signatures that sometimes work and sometimes don't. If the 3rd-party service you're using doesn't provide helpful SignatureDoesNotMatch responses the way S3 does, you'll want to test your implementation against a bucket in an older S3 region, since it does provide useful diagnostics that can help you troubleshoot your implementation, if you know how to read them.
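For a sense of how small a from-scratch V2 signer is, here is a rough sketch based on the S3 documentation linked above (untested, with placeholder credentials and paths; note the percent-encoding of the finished signature for query-string use, per the caveat just described):

# Sketch of S3 Signature Version 2: HMAC-SHA1 over a newline-joined
# string-to-sign, then base64. All values below are placeholders.
import base64, hashlib, hmac
from urllib.parse import quote

def sign_v2(secret_key, verb, content_md5, content_type, date,
            canonicalized_amz_headers, canonicalized_resource):
    # Per the S3 docs, each x-amz-* header line (if any) already ends in \n.
    string_to_sign = (verb + "\n" + content_md5 + "\n" + content_type + "\n"
                      + date + "\n" + canonicalized_amz_headers
                      + canonicalized_resource)
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")

sig = sign_v2("SECRET_KEY", "GET", "", "", "Tue, 27 Mar 2007 19:36:42 +0000",
              "", "/examplebucket/photo.jpg")
# Header auth:       Authorization: AWS ACCESS_KEY_ID:<sig>
# Query-string auth: percent-encode the signature so +, =, and / never
# reach the wire raw -- the pitfall described above.
encoded_sig = quote(sig, safe="")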

RESTful API - handling large amounts of data

I have written my own Restful API and am wondering about the best way to deal with large amounts of records returned from the API.
For example, if I use the GET method on myapi.co.uk/messages/, this will bring back the XML for all message records, which in some cases could be thousands. This makes using the API very sluggish.
Can anyone suggest the best way of dealing with this? Is it standard to return results in batches and to specify batch size in the request?
You can change your API to include additional parameters to limit the scope of data returned by your application.
For instance, you could add limit and offset parameters to fetch just a small part. This is how pagination can be done in accordance with REST, and it lets you ask for a specific portion of a huge data set:
myapi.co.uk/messages?limit=10&offset=20
A request like this would fetch 10 resources from the messages collection, the 21st through the 30th.
Another way to decrease the payload would be to only ask for certain parts of your resources' representation. Here's how facebook does it:
/joe.smith/friends?fields=id,name,picture
Remember that while using either of these methods, you have to provide a way for the client to discover each of the resources. You can't assume they'll just look at the parameters and start changing them in search of data. That would be a violation of the REST paradigm. Provide them with the necessary hyperlinks to avoid it.
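Here is a minimal sketch pulling those pieces together - limit/offset pagination, field selection, and a "next" link for discoverability - assuming a hypothetical Flask app with in-memory data (all names are illustrative):

# Hypothetical sketch: pagination, field filtering, and a hypermedia
# "next" link so the client can discover the following page.
from flask import Flask, jsonify, request

app = Flask(__name__)
MESSAGES = [{"id": i, "subject": "Message %d" % i, "body": "..."}
            for i in range(1, 1001)]

@app.route("/messages")
def list_messages():
    limit = min(int(request.args.get("limit", 10)), 100)  # cap page size
    offset = int(request.args.get("offset", 0))
    fields = request.args.get("fields")  # e.g. fields=id,subject
    page = MESSAGES[offset:offset + limit]
    if fields:
        wanted = set(fields.split(","))
        page = [{k: v for k, v in m.items() if k in wanted} for m in page]
    links = {"self": "/messages?limit=%d&offset=%d" % (limit, offset)}
    if offset + limit < len(MESSAGES):
        links["next"] = "/messages?limit=%d&offset=%d" % (limit, offset + limit)
    return jsonify({"items": page, "links": links})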
I strongly recommend viewing this presentation on RESTful API design by apigee (the screencast is called "Teach a Dog to REST"). Good practices and neat ideas to approach everyday problems are discussed there.
EDIT: The video has been updated a number of times since I posted this answer, you can check out the 3rd edition from January 2013
There are several general ways to improve API performance, including for large response sizes. Each of these topics can be explored in depth:
Reduce Size With Pagination
Organizing Using Hypermedia
Exactly What a User Needs With Schema Filtering
Defining Specific Responses Using The Prefer Header
Using Caching To Make Responses More Efficient
More Efficiency Through Compression
Breaking Things Down With Chunked Responses
Switch To Providing More Streaming Responses
Moving Forward With HTTP/2
Source: https://apievangelist.com/2018/04/20/delivering-large-api-responses-as-efficiently-as-possible/
If you are using .NET Core, you should try this magic package:
Microsoft.AspNetCore.ResponseCompression
Then use this line in ConfigureServices in the Startup file:
services.AddResponseCompression();
Then, in the Configure method:
app.UseResponseCompression();

REST vs RPC for a C++ API

I am writing a C++ API which is to be used as a web service. The functions in the API take in images/path_to_images as input parameters, process them, and give a different set of images/paths_to_images as outputs. I was thinking of implementing a REST interface to enable developers to use this API for their projects (independent of whatever language they'd like to work in). But, I understand REST is good only when you have a collection of data that you want to query or manipulate, which is not exactly the case here.
[The collection I have is of different functions that manipulate the supplied data.]
So, is it better for me to implement an RPC interface for this, or can this be done using REST itself?
Like lcfseth, I would also go for REST. REST is indeed resource-based and, in your case, you might consider that there's no resource to deal with. However, that's not exactly true: the image converter in your system is the resource. You POST images to it and it returns new images. So I'd simply create a URL such as:
POST http://example.com/image-converter
You POST images to it and it returns some array with the path to the new images.
Potentially, you could also have:
GET http://example.com/image-converter
which could tell you about the status of the image conversion (assuming it is a time consuming process).
The advantage of doing it like that is that you are re-using HTTP verbs that developers are familiar with, the interface is almost self-documenting (though of course you still need to document the format accepted and returned by the POST call). With RPC, you would have to define new verbs and document them.
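As a rough illustration of that shape, a hypothetical Flask front end over the C++ library might look like this (convert_images is a stand-in for your actual binding; routes and fields are invented):

# Hypothetical sketch: the converter itself is the resource.
import uuid
from flask import Flask, jsonify, request

app = Flask(__name__)
JOBS = {}  # job id -> {"status": ..., "outputs": [...]}

def convert_images(paths):
    # Placeholder for the binding into the C++ processing library.
    return [p + ".processed" for p in paths]

@app.route("/image-converter", methods=["POST"])
def submit_job():
    job_id = str(uuid.uuid4())
    paths = (request.get_json(silent=True) or {}).get("image_paths", [])
    # A real service would queue this; here we convert synchronously.
    JOBS[job_id] = {"status": "done", "outputs": convert_images(paths)}
    return jsonify({"job": job_id,
                    "status_url": "/image-converter/" + job_id}), 202

@app.route("/image-converter/<job_id>")
def job_status(job_id):
    job = JOBS.get(job_id)
    if job is None:
        return jsonify({"error": "no such job"}), 404
    return jsonify(job)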
REST uses the common operations GET, POST, DELETE, HEAD, and PUT. As you can imagine, this is very data oriented. However, there is no restriction on the data type and no restriction on the size of the data (none I'm aware of, anyway).
So it's possible to use it in almost every context (including sending binary data). One of the advantages of REST is that web browsers understand it, so your users won't need a dedicated application to send requests.
RPC presents more possibilities and can also be used. You can define custom operations for example.
Not sure you need that much power given what you intend to do.
Personally I would go with REST.
Here's a link you might wanna read:
http://www.sitepen.com/blog/2008/03/25/rest-and-rpc-relationship/
Compared to RPC (SOAP/XML), which seems complex and heavy, REST (a JSON-style interface) is lightweight and easy for API users to use.
I guess that what you want is an HTTP+JSON based API, not the REST API as described by the REST author:
http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven

How to implement backend of api with multiple versions

I'm using Django to implement a private rest-like API and I'm unsure of how to handle different versions of the API on the backend.
Meaning, if I have 2 versions of the API, what does my code look like? Should I have different apps that handle different versions? Should different functions handle different versions? Or should I just use if statements for when one version differs from another?
I plan on stating the version in the Header.
Thanks
You do not need to version REST APIs. With REST, versioning happens at runtime either through what one might call 'must-ignore payload extension rules' or through content negotiation.
'must-ignore payload extension rules' refer to an aspect you build into the design of your messages. 'Must-ignore' means that a piece of software that processes a message of the given format must ignore any unknown syntactical constructs. This is what we all know from HTML and what makes it possible to insert all sorts of fancy tags into an HTML page without the parser choking.
'Must-ignore' allows you to evolve the capabilities of your service by adding stuff to what you send already without considering clients that only understand the older versions.
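As a toy illustration of must-ignore in Python (the payload and field names are invented):

import json

def parse_order(payload):
    # Must-ignore: read only the fields this client understands;
    # anything the server added later is silently skipped.
    doc = json.loads(payload)
    return {"id": doc["id"], "total": doc["total"]}

# The old client keeps working when the server starts sending more:
old = parse_order('{"id": 1, "total": 9.99}')
new = parse_order('{"id": 1, "total": 9.99, "loyalty_points": 42}')
assert old == new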
Content negotiation refers to the HTTP built-in mechanism of negotiating the actual representation the server sends to a given client at runtime. The typical scenario is this: clients send the Accept header in the request to advertise what they are capable of, and servers pick the representation to send back based on those capabilities. But there are also variations of this theme (see here for details: http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html).
Content negotiation allows for incompatible changes, meaning that I can evolve my service to being able to send incompatible old and new versions and based on the Accept header my service will send the appropriate one.
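Since you're on Django and plan to put the version in a header, a minimal sketch of that negotiation might look like this (the vendor media type and field names are made up, and real code should use a proper Accept-header parser rather than a substring check):

# Hypothetical Django view: pick the representation at runtime
# from the Accept header instead of versioning the URI.
from django.http import JsonResponse

def order_view(request, order_id):
    order = {"id": order_id, "total": 9.99, "loyalty_points": 42}
    accept = request.META.get("HTTP_ACCEPT", "")
    if "application/vnd.example.order.v2+json" in accept:
        return JsonResponse(order)  # new, incompatible representation
    # Everyone else gets the older representation.
    return JsonResponse({"id": order["id"], "total": order["total"]})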
Bottom line: with both approaches, your API remains as it is. No need to do any versioning at the API level - especially not the often suggested (but totally wrong) inclusion of version identifiers in the URIs (remember, you are doing REST here, not SOAP!)

Can you help clarify some points regarding RESTful services and Code Generation?

I've been struggling with understanding a few points I keep reading regarding RESTful services. I'm hoping someone can help clarify.
1a) There seems to be a general aversion to generated code when talking about RESTful services.
1b) The argument that if you use a WADL to generate a client for a RESTful service, then when the service changes, so does your client code.
Why I don't get it: Whether you are referencing a WADL and using generated code or you have manually extracted data from a RESTful response and mapped them to your UI (or whatever you're doing with them) if something changes in the underlying service it seems just as likely that the code will break in both cases. For instance, if the data returned changes from FirstName and LastName to FullName, in both instances you will have to update your code to grab the new field and perhaps handle it differently.
2) The argument that RESTful services don't need a WADL because the return types should be well-known MIME types and you should already know how to handle them.
Why I don't get it: Is the expectation that for every "type" of data a service returns there will be a unique MIME type in existence? If this is the case, does that mean the consumer of the RESTful services is expected to read the RFC to determine the structure of the returned data, how to use each field, etc.?
I've done a lot of reading to try to figure this out for myself so I hope someone can provide concrete examples and real-world scenarios.
REST can be very subtle. I've also done lots of reading on it, and every once in a while I went back and read Chapter 5 of Fielding's dissertation, each time finding more insight. It was as clear as mud the first time (although some things made sense) but only got better once I tried to apply the principles and used the building blocks.
So, based on my current understanding let's give it a go:
Why do RESTafarians not like code generation?
The short answer: if you make use of hypermedia (+ links), there is no need.
Context: explicitly defining a contract (WADL) between client and server does not reduce coupling enough: if you change the server, the client breaks and you need to regenerate the code. (IMHO even automating that is just a patch over the underlying coupling issue.)
REST helps you to decouple on different levels. Hypermedia discoverability is one of the good ones to start with. See also the related concept HATEOAS.
We let the client "discover" what can be done with the resource we are operating on instead of defining a contract up front. We load the resource, check for "named links", and then follow those links or fill in forms (or links to forms) to update the resource. The server acts as a guide to the client via the options it proposes based on state (think business process / workflow / behavior). If we use a contract, we need to know this "out of band" information and update the contract on change.
If we use hypermedia with links, there is no need for a "separate contract". Everything is included within the hypermedia - why design a separate document? Even URI templates are out-of-band information, but if kept simple they can work, as Amazon S3 shows.
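For instance, a hypermedia order representation (field and link names invented for illustration) carries its own next actions:

# The client follows "named links" discovered at runtime instead of
# relying on a contract agreed on beforehand.
order = {
    "id": 123,
    "status": "unpaid",
    "total": 9.99,
    "links": {
        "self":   {"href": "/order/123"},
        "pay":    {"href": "/order/123/payment", "method": "POST"},
        "cancel": {"href": "/order/123", "method": "DELETE"},
    },
}
# Once the order is paid, the server simply stops offering "pay" and
# "cancel"; the client's options change without any contract update.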
Yes, we still need a common ground to stand on when transferring representations (hypermedia), so we define our own media types or use widely accepted ones such as Atom or micro-formats. Thus, with the constraints of basic building blocks (links + forms + data - hypermedia) we reduce coupling by keeping out-of-band information to a minimum.
At first it seems that going for hypermedia does not change the impact of change :). But there are subtle differences. For one, if I have a WADL I need to update another document and deploy/distribute it. Using pure hypermedia there is no impact, since it's embedded. (Imagine changes rippling through a complex interweave of systems.) As per your example, having FirstName + LastName and adding FullName does not really impact the clients, but removing First + Last and replacing them with FullName does, even in hypermedia.
As a side note: The REST uniform interface (verb constraints - GET, PUT, POST, DELETE + other verbs) decouples implementation from services.
Maybe I'm totally wrong, but another possibility might be a "psychological kickback" against code generation: WADL makes one think of the WSDL (contract) part of "traditional web services" (WSDL+SOAP) / RPC, which goes against REST. In REST, state is transferred via hypermedia, not via RPC-style method calls that update state on the server.
Disclaimer: I've not read the referenced article in detail, but it does give some great points.
I have worked on API projects for quite a while.
To answer your first question:
Yes, if the service's return values change (e.g., First Name and Last Name become Full Name), your code might break. You will no longer get the first name and last name.
You have to understand that a WADL is an agreement. If it has to change, then the client needs to be notified. To avoid breaking client code, we release a new version of the API.
Version 1.0 will keep First Name and Last Name without breaking your code, and we will release version 1.1, which will have the change to Full Name.
So the answer, in short: the WADL is there to stay. As long as you use that version of the API, your code will not break. If you want the full name, then you have to move to the new version. With the many code generation plugins on the market, generating the code should not be an issue.
To answer your next question of why not use a WADL, and how you get to know the MIME types:
A WADL is for code generation and serves as a contract. With it, you can use JAXB or any mapping framework to convert the JSON string to generated bean objects.
Without a WADL, you don't need to inspect every element to determine its type. You can easily do this:
var obj = jQuery.parseJSON('{"name":"John"}');
alert(obj.name === "John");
Let me know if you have any questions.