What are your WebService Versioning best practices? - web-services

We have 2 separate products that need to communicate with each other via web services.
What is the best practice for supporting versioning of the API?
I have this article from 2004 claiming there is no actual standard, only best practices. Are there any better solutions? How do you solve WS versioning?
Problem Description
System A
Client
class SystemAClient {
    SystemBServiceStub systemB;
    public void consumeFromB() {
        SystemBObject bObject = systemB.getSomethingFromB(new SomethingFromBRequest("someKey"));
    }
}
Service
class SystemAService {
    public SystemAObject getSomethingFromA(SomethingFromARequest req) {
        return SystemAObjectFactory.getObject(req);
    }
}
Transferable Object
Version 1
class SystemAObject {
    Integer id;
    String name;
    ... // getters and setters etc.
}
Version 2
class SystemAObject {
    Long id;
    String name;
    String description;
    ... // getters and setters etc.
}
Request Object
Version 1
class SomethingFromARequest {
    Integer requestedId;
    ... // getters and setters etc.
}
Version 2
class SomethingFromARequest {
    Long requestedId;
    ... // getters and setters etc.
}
System B
Client
class SystemBClient {
    SystemAServiceStub systemA;
    public void consumeFromA() {
        SystemAObject aObject = systemA.getSomethingFromA(new SomethingFromARequest(1));
        aObject.getDescription(); // fail point
        // do something with it...
    }
}
Service
class SystemBService {
    public SystemBObject getSomethingFromB(SomethingFromBRequest req) {
        return SystemBObjectFactory.getObject(req);
    }
}
Transferable Object
Version 1
class SystemBObject {
    String key;
    Integer year;
    Integer month;
    Integer day;
    ... // getters and setters etc.
}
Version 2
class SystemBObject {
    String key;
    BDate date;
    ... // getters and setters etc.
}
class BDate {
    Integer year;
    Integer month;
    Integer day;
    ... // getters and setters etc.
}
Request Object
Version 1
class SomethingFromBRequest {
    String key;
    ... // getters and setters etc.
}
Version 2
class SomethingFromBRequest {
    String key;
    BDate afterDate;
    BDate beforeDate;
    ... // getters and setters etc.
}
Fail Scenarios
If a System A client of version 1 calls a System B service of version 2, it can fail on:
missing methods on SystemBObject (getYear(), getMonth(), getDay())
unknown type BDate
If a System A client of version 2 calls a System B service of version 1, it can fail on:
unknown type BDate on the SomethingFromBRequest (the A client uses a newer B request object that B version 1 doesn't recognize)
If the System A client is smart enough to use version 1 of the request object, it can still fail on missing methods on the SystemBObject (getDate())
If a System B client of version 1 calls a System A service of version 2, it can fail on:
type mismatch or overflow on SystemAObject (a Long is returned where an Integer is expected; see the sketch below this list)
If a System B client of version 2 calls a System A service of version 1, it can fail on:
type mismatch or overflow on SomethingFromARequest (a Long is requested instead of an Integer)
If the request somehow passes, casting issues (the stub expects a Long but the service returns an Integer, which is not necessarily compatible in all WS implementations)
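For instance, here is a minimal sketch of the narrowing behind those mismatch/overflow scenarios (the value is made up for illustration):

// Hypothetical illustration: a version-2 service hands back a Long id that no longer
// fits into the Integer field a version-1 stub still declares.
class OverflowSketch {
    public static void main(String[] args) {
        long newStyleId = 3_000_000_000L;   // valid for the v2 Long id
        int oldStyleId = (int) newStyleId;  // narrowing cast wraps around
        System.out.println(oldStyleId);     // prints -1294967296, silently corrupting the key
    }
}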
Possible solutions
Use numbers when advancing versions, e.g. SystemAObject1, SystemBRequest2, etc., but this lacks an API for matching source and target versions.
In the signature, pass XML and not objects (yuck: escaped XML inside XML, double serialization and deserialization, parsing and unparsing).
Other: e.g. do Document/literal or WS-I offer a remedy?

I prefer the Salesforce.com method of versioning. Each version of the Web Services gets a distinct URL in the format of:
http://api.salesforce.com/{version}/{serviceName}
So you'll have Web Service URLs that look like:
http://api.salesforce.com/14/Lead
http://api.salesforce.com/15/Lead
and so on...
With this method, you get the benefits of:
You always know which version you're talking to.
Backwards compatibility is maintained.
You don't have to worry about dependency issues. Each version has the complete set of services. You just have to make sure you don't mix versions between calls (but that's up to the consumer of the service, not you as the developer).
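As a rough sketch of what that can look like with plain JAX-WS (the LeadServiceV14/LeadServiceV15 classes and the localhost addresses are hypothetical, and each @WebService class would live in its own source file):

import javax.jws.WebService;
import javax.xml.ws.Endpoint;

// Hypothetical v14 contract: frozen once published.
@WebService(serviceName = "Lead")
public class LeadServiceV14 {
    public String getLead(int id) { return "lead-" + id; }
}

// Hypothetical v15 contract: published side by side and free to add operations.
@WebService(serviceName = "Lead")
public class LeadServiceV15 {
    public String getLead(long id) { return "lead-" + id; }
    public String getLeadDescription(long id) { return "description for lead-" + id; }
}

// Each version gets its own URL, mirroring http://api.salesforce.com/{version}/{serviceName}
public class VersionedPublisher {
    public static void main(String[] args) {
        Endpoint.publish("http://localhost:8080/14/Lead", new LeadServiceV14());
        Endpoint.publish("http://localhost:8080/15/Lead", new LeadServiceV15());
    }
}

Old clients keep calling the /14/ address unchanged while new clients move to /15/ at their own pace.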

The solution is to avoid incompatible changes to your types.
Take, for example, SystemBObject. You describe "version 1" and "version 2" of this type, but they are not the same type at all. A compatible change to this type involves only adding properties, never changing the type of any existing property. Your hypothetical version update violates both of those constraints.
By following that one guideline, you can avoid all of the problems you described.
Therefore, if this is your type definition in version 1
class SystemBObject { // version 1
    String key;
    Integer year;
    Integer month;
    Integer day;
    ... // getters and setters etc.
}
Then, this cannot be your type definition in v2:
// version 2 - NO NO NO
class SystemBObject {
    String key;
    BDate date;
    ... // getters and setters etc.
}
...because it has eliminated existing fields. If that is the change you need to make, it is not a new "version", it is a new type, and should be named as such, both in code and in the serialization format.
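For contrast, a purely additive evolution of the same type stays within the guideline; this is only a sketch, and the added description field is hypothetical:

// A compatible "version 2": every version-1 field keeps its name and type,
// so version-1 consumers can still deserialize the payload.
class SystemBObject {
    String key;
    Integer year;
    Integer month;
    Integer day;
    String description; // hypothetical new field; the change is purely additive
    // getters and setters omitted
}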
Another example: if this is your existing v1 type:
class SomethingFromARequest {
    Integer requestedId;
    ... // getters and setters etc.
}
... then this is not a valid "v2" of that type:
class SomethingFromARequest {
    Long requestedId;
    ... // getters and setters etc.
}
...because you have changed the type of the existing property.
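If the wider identifier is genuinely needed, one non-breaking sketch is to keep the old property and add a new one next to it (the requestedLongId name is made up for illustration); the alternative, as above, is to introduce a new, separately named request type:

class SomethingFromARequest {
    Integer requestedId;  // unchanged from version 1
    Long requestedLongId; // hypothetical additive field for newer consumers
    // getters and setters omitted
}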
These constraints are explained in much more detail, and in a mostly technology-neutral way, in Microsoft's Service Versioning article.
Aside from avoiding that source of incompatibility, you can and should include a version number in the type. This can be a simple serial number. If you are in the habit of logging or auditing messages, and bandwidth and storage space is not a problem, you may want to augment the simple integer with a UUID to identify an instance of each unique version of a type.
Also, you can design forward compatibility into your data transfer objects by using lax processing and mapping "extra" data into an "extra" field. If XML is your serialization format, you might use xsd:any with processContents="lax" to capture any unrecognized schema elements when a v1 service receives a v2 request. If your serialization format is JSON, with its more open content model, this comes for free.
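To illustrate the "extra field" idea on the Java side, here is a minimal JAXB-style sketch, assuming the DTOs are JAXB-annotated (the extras field name is hypothetical):

import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;
import org.w3c.dom.Element;

// A version-1 DTO that tolerates elements it does not know about.
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
class SystemBObjectV1 {
    String key;

    // Unrecognized elements sent by a newer peer are captured here instead of
    // being lost or causing a failure; they can be logged or round-tripped.
    @XmlAnyElement
    List<Element> extras = new ArrayList<>();
}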

I think something else to keep in mind is your client base: are you publishing this service publicly, or is it restricted to a set of known agents?
I'm involved in the latter situation, and we've found it's not that difficult to solve this through simple communication with stakeholders.
Though it's only indirectly related to your question, we've found that basing our versioning number on compatibility seems to work quite well. Using A.B.C as an example...
A: Changes which require recompilation (breaks backwards compatibility)
B: Changes which do not require recompilation, but have additional features not available without doing so (new operations etc.)
C: Changes to the underlying mechanics that do not alter the WSDL

I know this is late to the game, but I've been digging into this issue rather deeply. I really think the best answer involves another piece of the puzzle: a service intermediary. Microsoft's Managed Services Engine is an example of one - I'm sure others exist as well. Basically, by changing the XML namespace of your web service (to include a version number or date, as the linked article mentions), you allow the intermediary the capability to route the various client calls to the appropriate server implementations. An additional (and, IMHO, very cool) feature of MSE is the ability to perform policy-based transformation. You can define XSLT transforms that convert v1 requests into v2 requests, and v2 responses into v1 responses, allowing you to retire the v1 service implementations without breaking client implementations.

Related

Serialize C++ classes between processes and across the network

I'd like to understand how to transmit the contents of a C++ class between processes or across a network.
I'm reading the Google Protobuf tutorial:
https://developers.google.com/protocol-buffers/docs/cpptutorial
and it seems you must create an abstracted, non-C++ interface to represent your class:
syntax = "proto2";
package tutorial;
message Person {
optional string name = 1;
optional int32 id = 2;
optional string email = 3;
enum PhoneType {
MOBILE = 0;
HOME = 1;
WORK = 2;
}
}
However, I'd prefer to specify my class via C++ code (rather than the abstraction) and just add something like serialize() and deserialize() methods.
Is this possible with Google Protobuf? Or is this how Protobuf works and I'd need to use a different serialization technique?
UPDATE
The reason for this is I don't want to have to maintain two interfaces. I'd prefer to have one C++ class, update it and not have to worry about a second .proto interface/definition. Code maintainability.
That's how Protobuf works. You have to use something else if you want to serialize your manually-written C++ classes. However, I'm not sure you really want that, because you then will have to either restrict yourself to very simple fields with no invariants (just like in Protobuf) or write custom (de)serialization logic yourself.
You could make a simple protocol buffer to hold binary information, but that rather defeats the point of using protocol buffers.
You can sort of cheat the system by using SerializeToString() and ParseFromString() to simply serialize binary information into a string.
There is also SerializeToOstream() and ParseFromIstream().
The real value of protocol buffers is being able to use messages across programs, systems, and languages while using a single definition. If you aren't making messages using the protocol they've defined, this is more work than simply using native C++ capabilities.

How to chain planners from JDBC Adapter SchemaFactory?

I extended the JDBC adapter and used a model.json configuration with a custom schema factory exposing 1 original schema and 2 derived schemas in order to add rules, and that worked: the rules got executed on the original schema during planning, but their end result didn't get chosen as the best option by the Volcano planner because it is too expensive. The rules transform the RelNode to execute on the 2 derived schemas. More details below and in the code.
1) Can I tell Volcano planner to ignore 1 out of 3 schemas that I passed through custom JDBC SchemaFactory?
I want the parser to work on that 1 original schema, but for the planner to never suggest an optimal (cheapest) plan in that schema (only other 2 derived schemas). 1 original schema is always mapped 1-to-1 with other 2 derived schemas, so the RelNode that my rule returns is always semantically equivalent, just more expensive (security reasons).
2) If that can't work, how can I call HepPlanner instead of default Volcano planner from SchemaFactory that is set in model.json, since that's my starting point?
You can find my entire code on GitHub, I made it publicly available so that everyone can have a better starting point with Calcite than I did.
Here is the link: https://github.com/igrgurina/multicloud_rewriter
Calcite library is amazing, but it's really hard to get into because it lacks examples and tutorials for common tasks.
Ideally, I would have HepPlanner execute my rules, which transform expressions into semantically equivalent ones that use the 2 derived schemas instead of the 1 original schema (I have a rule that does that), and then have the Volcano planner optimize the result using only the 2 derived schemas, without knowing that the 1 original schema exists, for security reasons.
I haven't found any reasonable examples that demonstrate how to do that so any help would be appreciated (please don't post links to Druid example, or Apache Calcite docs website, I went through them a thousand times).
I've managed to make this work by using Hook.PROGRAM and prepending my custom program that executes my rules before all others.
Since Hook is marked as for testing and debugging only in Calcite library, I would say this is not how it's supposed to be done, but I have nothing better at the moment.
Here is a short summary with code sample:
public static class MultiCloudHookManager {
    private static final Program PROGRAM = new MultiCloudProgram();
    private static Hook.Closeable globalProgramClosable;

    public static void addHook() {
        if (globalProgramClosable == null) {
            globalProgramClosable = Hook.PROGRAM.add(program());
        }
    }

    private static Consumer<Holder<Program>> program() {
        return prepend(PROGRAM);
    }

    // this doesn't have to be in the separate program
    private static Consumer<Holder<Program>> prepend(Program program) {
        return (holder) -> {
            if (holder == null) {
                throw new IllegalStateException("No program holder");
            }
            Program chain = holder.get();
            if (chain == null) {
                chain = Programs.standard();
            }
            holder.set(Programs.sequence(program, chain));
        };
    }
}
The MultiCloudHookManager is then used in the SchemaFactory, where you simply call the MultiCloudHookManager.addHook() method. In this case, MultiCloudHookManager.PROGRAM is set to a MultiCloudProgram, which simply executes a set of rules in HepPlanner.
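For context, a hedged sketch of what that hook-up might look like (the MultiCloudSchemaFactory name and the empty schema body are placeholders; the real factory builds the actual JDBC-backed schemas):

import java.util.Map;
import org.apache.calcite.schema.Schema;
import org.apache.calcite.schema.SchemaFactory;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.schema.impl.AbstractSchema;

// Hypothetical factory referenced from model.json; it registers the hook once,
// then builds the schema as usual.
public class MultiCloudSchemaFactory implements SchemaFactory {
    @Override
    public Schema create(SchemaPlus parentSchema, String name, Map<String, Object> operand) {
        MultiCloudHookManager.addHook(); // prepend the MultiCloudProgram before planning
        return new AbstractSchema() { }; // placeholder; the real factory returns the derived schemas
    }
}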
For full details, refer to the source code in the GitHub repository.
This hack solution is inspired by another library.

Can we invoke self-defined callback function in the parser of google protocol buffer textformat?

In google protocol buffer, there exists a textual version of message. When parsing this textual message, can we define ourselves the callback functions in order that we could store the information parsed into our own data structure?
For example, if we have defined .proto:
message A {
  required string name = 1;
  optional string value = 2;
  repeated B bList = 3;
}

message B {
  required string name = 1;
  optional string value = 2;
}
And we have textformat message:
A {
  name: "x"
  value: "123"
  B {
    name: "y"
    value: "987"
  }
  B {
    name: "z"
    value: "965"
  }
}
The protobuf compiler generates the corresponding classes named "A" and "B", and the parser can parse this text format into an instance of A. However, suppose the user wants to define their own version of class "A", or a version of A already exists from before. Now that we would like to replace the old exchange format with Google protocol buffers, we want to parse the protocol buffer text format directly into the old data structure. Otherwise, we would first have to fill the generated data structure (class "A") and then adapt it to the legacy data structure, which occupies twice as much memory as necessary and can be much less efficient than we would like.
The traditional method for integrating a parser is to have it invoke user-defined callbacks (functors) so the parsed data can be stored directly in the target data structure.
So, does there exist a way to inject the self-defined callback function into the text format parser?
No, the protobuf TextFormat implementation does not support such extensions.
That said, TextFormat (in at least C++, Java, and Python) is implemented as a self-contained module that operates only on public interfaces (mainly, the reflection interface). You can easily clone it and then make your own modifications to the format, or even write a whole new module in the same style that implements any arbitrary format. For example, many people have written JSON parsers / encoders based on Protobuf reflection, using the TextFormat implementation as a guide.
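For what it's worth, since the answer notes that the Java and Python implementations also sit on the reflection interface, here is a small Java sketch that parses with the stock TextFormat and then copies fields into a legacy class through that same reflection interface; the LegacyA type is hypothetical, and note that this still materializes a protobuf message first rather than avoiding it:

import com.google.protobuf.Descriptors;
import com.google.protobuf.DynamicMessage;
import com.google.protobuf.TextFormat;

// Hypothetical legacy class that predates the .proto definition.
class LegacyA {
    String name;
    String value;
}

class TextFormatBridge {
    // Parse the text format generically, then read fields via descriptors.
    static LegacyA parseIntoLegacy(String text, Descriptors.Descriptor descriptorForA) throws TextFormat.ParseException {
        DynamicMessage.Builder builder = DynamicMessage.newBuilder(descriptorForA);
        TextFormat.merge(text, builder);
        DynamicMessage msg = builder.build();

        LegacyA legacy = new LegacyA();
        for (Descriptors.FieldDescriptor fd : descriptorForA.getFields()) {
            if (fd.getName().equals("name")) {
                legacy.name = (String) msg.getField(fd);
            } else if (fd.getName().equals("value")) {
                legacy.value = (String) msg.getField(fd);
            }
        }
        return legacy;
    }
}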

C++ DBI Class - best/developer friendliest style

In our current project, we need some high level DBI for different databases. It should provide the following features:
in-memory cache - the DBI should be able to cache all reads and update the cache on write calls (the application we are building is heavily threaded and needs fast access to the current data at all times). The memory cache will be based on boost::multi_index
automatic SQL building - we don't want to have to parse an SQL statement in order to do a lookup in the memory cache
As we need to provide functions for defining a table layout, doing selects, inserts, updates, joins, ..., the interface will get very complex.
We need a good way to invoke the interface function.
There are many styles around, but we could not find any useful for our usage.
Here a few examples:
SOCI
sql << "select name, salary from persons where id = " << id, into(name), into(salary);
We don't want SQL statements at all, so we would have to define the "what" and the "from" in a different way.
pqxx
Conn.prepare("select_salary",
"select name, salary from persons where id = $1")
((string)"integer",prepare::treat_direct);
The heavy usage of the overloaded operator() is just ugly, but it could work for us too.
Any suggestions how to design the interface?
How about using object-relational mapping? Here are some code fragment ideas off the top of my head - I've only done this in Python, never in C++, and only for fairly simple databases. There's a list of frameworks on Wikipedia that should save you from too much wheel-related R&D.
class people : public dbi_table
{
    // id column handled by dbi_table.
    string_column name;
    money_column salary;
};

class cost_center : public dbi_table
{
    string_column name;
    foreign_key<offices> office;
};

class people_cost_center_link : public link_table
{
    // Many-many relationships.
};
Then you can manipulate records as objects, all the relational stuff is handled by the framework. Querying is done by defining a query object and then getting an iterator to the results (see the ODB wikipedia page for a code example).
I would do it like this (it's good from a C++ point of view; I'm not sure whether it's correct database design):
struct Handle { int id; };

class DBI
{
public:
    virtual Handle select(int column_id) = 0;
    virtual Handle select(int column1, int column2) = 0;
    virtual Handle id(int id) = 0;
    virtual Handle join(Handle i1, Handle i2) = 0;
    virtual void execute_query(Handle i) = 0;
};
Usually these functions would be implemented like this:
Handle select(int column_id) {
    return new_handle(new SelectNode(column_id));
}
where the new_handle function would just insert the SelectNode into a std::vector or std::map and create a handle for it.

Validation Framework in C++

Both Java and .Net seem to have a wealth of object validation frameworks (e.g. Commons Validator, XWork, etc.), but I've been unable to find anything similar for C++. Has anyone come across something like this, or do people typically roll their own?
As of 2020, there is the cpp-validator library.
This is a C++14/C++17 header-only library that can be used to validate:
plain variables;
properties of objects, where a property can be accessed either as object's variable or object's getter method;
contents and properties of containers;
nested containers and objects.
Basic usage of the library includes two steps:
first, define a validator using almost declarative syntax;
then, apply the validator to data that must be validated and check the results.
See example below.
// define validator
auto string_validator = validator(
    value(gte, "sample string"),
    size(lt, 15)
);

// validate variable
std::string var = "sample";
error_report err;
validate(var, string_validator, err);
if (err)
{
    std::cerr << err.message() << std::endl;
    /* prints:
       must be greater than or equal to "sample string"
    */
}
Some GUI frameworks have validators.
Check out wxWidgets Validators