I am reading about challenges of concurrent and networked software in pattern oriented software architecure vol 2.
Service access often involves invoking remote operations on resuable
components like OMG event service, etc. Supporting the static and
dynamic evolution of services and applications is antoher key
challenge in networked software system.
Evoution can occur in following way
Interfaces to and connectivity between component service roles can
change, often at run-time, and new service roles can be implemented
and installed into and installed into existing components.
It is even more challenging to determine how to access services that
are configured into a system 'on-demand' and whose implementations are
unknown when the system was designed origanally. Here design challenge
are two-fold.
First, an applicatoin must export new services, even though it may not know their detailed interfaces.
Second, an applicaiton must integrate these services into its own control flow and processing sequence transparently and robustly, even
at run-time.
I need your help in understanding above text by answering following questions.
What does author mean by "Interfaces to and connectivity between component service roles can change, often at run-time" ? Request to explain with easy to undestand example.
What does author mean by two points mentioned on-demand challenges which mentioned above. Request elobartion on above two points.
Thanks for your time and help.
1.What does author mean by "Interfaces to and connectivity between component service roles can change, often at run-time" ?
I'm not sure exactly. Interfaces change overtime because:
New technology standards can be adopted - say moving from SOAP to REST, or form XML to JSON, but that would happen slowly overtime through deployment - where as for me "runtime" is a memory space in which things run, and I don't see interfaces changing themselves taht fast - otherwise how could anyone integrate with them?
The API or interface contract itself changes to fulfill business need.
2.What does author mean by two points mentioned on-demand challenges which mentioned above.
Hmmm, good design patterns tend to survive time well (they never change, because they are never broken - like SOLID). The book you are refering to was written in 2000 I think - a lot has changed since then, so whilst the pattern may survive maybe the way we'd now describe it has changed (i.e what he means by "export new services" is open to interpretation)...
1.First, an application must export new services, even though it may not know their detailed interfaces.
Separation Of Concerns (basic OO stuff), all parts of your app don't (shouldn't) inherently know what the other parts are doing internally; likewise, as long as someone (including an external system) is satisfying the interface then who cares how it does so internally.
2.Second, an application must integrate these services into its own control flow and processing sequence transparently and robustly, even
at run-time.
I take this to mean that the program should never break, it should always compile, and if the application is dynamically creating and executing code (say based on user input) then there needs to be checks in-place so that the dynamic code doesn't break the app either.
Related
By 'functionalities structuring', I mean how we organize and coordinate different API endpoints to offer desired functionalities to clients. The context here is web APIs for consumption by mobile phones with GPS tracking, and I assume either cellular or WiFi connectivity is required for most functionalities.
I personally prefer a more 'modular' approach where each endpoint does mostly one thing and a collection of them fulfill all the requirements. Of course, you may need to combine some subset or sequence of these endpoints to achieve certain functionalities. Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
On the other hand, I know some other people prefer client-side convenience (or simplicity) over modularity in the following ways:
If the client needs to achieve a functionality, then there should exist a single API endpoint which does exactly that, such that the client needs only a single request to fulfill the functionality with minimal caching/logic in between requests.
For GET endpoints, if there are multiple levels/kinds of data involved for some functionalities, they prefer as much data as possible (often all necessary data) returned by a single endpoint. Ironically, they may also want a dedicated endpoint for retrieving only the "lowest level" data using a corresponding "highest level" ID. For example, If A corresponds to a collection of Bs, and each B corresponds to a collection of Cs, then they will prefer a direct endpoint that retrieves all the relevant Cs given an A.
In some extreme cases, they will ask for a single endpoint with ambiguous naming (e.g. /api/data) that returns related data from different underlying DB tables (in other words, different resources) based on different combinations of query string parameters.
I understand that people preferring such conveniences above aim to: 1. reduce the number of API requests necessary to fulfill functionalities; 2. minimize data caching and data logic on the client side to reduce client complexity, which arguably leads to a 'simple' client with simplified interaction with the server.
However, I also wonder if the cost of doing so is unjustifiable in other aspects in the long run, especially in terms of the performance and the maintenance of the server-side API. Hence my questions:
What are the tried-and-true guidelines for structuring API functionalities?
How do we determine an optimal number of requests necessary for fulfilling a functionality in a mobile app? Of course, if all other things equal, a single request is the best, but achieving such a single-request implementation usually carries penalty in other aspects.
Given the contention between the number of client requests and the performance and maintainability of server-side API, what are the approaches for striking a balance in order to deliver a sensible design?
What you are asking about breaks into at least three main areas of API design:
Ontology Design (organization)
Request/Response Design (complexity/performance)
Maintenance Considerations
Based on my experience (which is largely from working with very large organizations both on the API producing and consuming side and talking with hundreds of developers on the topic), let's look at each area, addressing the specific points you bring up...
Ontology Design
There are a couple of things to take in to consideration in your design that are perhaps implied when you say:
Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
This approach makes the APIs easily discoverable. When you are in a situation where you are publishing APIs for consumption by other developers who you may or may not know (and may or may not have enough resources to truly support), this kind of modularity - making them easy to find and learn about - creates a different kind of "convenience" leading to easier adoption and reuse of your APIs.
I know some other people much prefer convenience over modularity: 1. if the client needs a functionality, then there should exist a single endpoint in the API which does exactly that...
The best public example that comes to mind for this approach is perhaps the Google Analytics Core Reporting API. They implement a series of querystring parameters to build a call that returns the data requested, ex:
https://www.googleapis.com/analytics/v3/data/ga
?ids=ga:12134
&dimensions=ga:browser
&metrics=ga:pageviews
&filters=ga:browser%3D~%5EFirefox
&start-date=2007-01-01
&end-date=2007-12-31
In that example we are querying Google Analytics Account 12134 for pageviews by browser where broswer is Firefox for the given date range.
Given the number of metrics, dimensions, filters, and segments their API exposes, they have a tool called the Dimensions & Metrics Explorer to help developers understand how to use the APIs.
One approach makes the APIs discoverable and more understandable from the outset. The other requires more supporting work to explain the intricacies of consuming the API. One thing that isn't immediately obvious with the Google API above is that certain segments and metrics are incompatible, so if you are making calls passing one key/value pair, you may not longer be able to pass certain other pairs.
Request/Response Design
The context here is APIs for mobile applications.
That is still very broad, and better defining (if possible) how you intend for your "mobile applications" to be used can help you design your APIs.
Do you intend for them to be used totally offline? If so, heavy/complete data caching may be desirable.
Do you intend for them to be used in low bandwidth and/or high latency/error-rate connectivity scenarios? If so, heavy/complete data caching may be desirable, but so might small/discrete data requests.
for GET endpoints, they often prefer as much data as possible returned by a single endpoint, especially when there are multiple levels/layers of data involved
This is safe if you know you'll only ever be in good mobile connectivity scenarios, or you can cache the data heavily when you are (and thus access it offline or when things are spotty).
I understand that people preferring convenience aim to reduce the number of API calls necessary to achieve functionalities...
One way to find a happy middle ground is to implement paging in your data-intensive calls. For example, a querystring can be passed in a GET specifying 'pagesize'. Thus 10,000 records could be returned 100 at a time over 100 successive calls, or 1,000 at a time over 10 calls.
With this approach, you can design and publish your API without necessarily knowing what your consuming developer will need. Even though the paging example above uses the Google API referenced earlier, it can still be used in a more semantically designed API. For example, say you have GET /customer/phonecalls you could still design it to accept a pagesize value and make successive calls to get all the phonecalls associated with customer.
Maintenance
I also wonder if the cost of doing so [reduce the number of API calls necessary to achieve functionalities and to minimize data caching] is not justifiable in the long run, especially for the performance and the maintenance of an API.
The key guiding principle here is separation of concerns if your collection of APIs is going to grow to any significant level of complexity and scale.
What happens when you have everything bundled together into one big service and a small part of it changes? You are now creating not only a maintenance headache on your side, but also for your API consumer.
Did that "breaking change" really affect the part of the API they were using? It will take time and energy for them to figure that out. Designing API functionality into discrete, semantic services will let you create a roadmap and version them in a more understandable way.
For further reading, I'd suggest checking out Martin Fowler's writings on Microservices Architecture:
In short, the microservice architectural style is an approach to
developing a single application as a suite of small services, each
running in its own process and communicating with lightweight
mechanisms
Although there is a lot of debate about how to design and build for "microservices" in practice, reading up on that should help further shape your thinking on the API design decisions you're facing and prepare you to engage in "current" discussions around the topic.
In my workplace (and a lot of other areas), there is a lot of emphasis on building architecture around services. (I am working in an e-commerce startup). However, I think services are implicitly considered as distributed. I am a believer of the first law of distribution - "don't distribute". So, I believe that we should not un-necessarily complicate architecture. It should be an architecture which can evolve. So, one of the ways to approach the problem would be to create well defined namespaces and build code around it, but keep the communication via java api. (this keeps monitoring requirement low, and reliability/availability problems low). This can easily be evolved into a distributed architecture by wrapping modules into web service, as and when, the scale requirements kick-in. So, the question is - what are the cons of writing code as a single application and evolving into distributed services, rather than straight jumping into implementing web services based architecture? Am I right in assuming that services should imply the basic principles of design (abstraction, encapsulation etc), rather than distribution over network?
Distribution requires modularity. However, it requires more than just modularity: it also requires coarse-grained interaction between the modules.
For example, in a single-process ecommerce system, you might have separate modules for managing the user's shopping cart and calculating prices. They might interact by the cart asking the calculator to price an item, then another item, etc. That would be perfectly fine.
However, in a distributed system, that would require a torrent of small method calls, which is inefficient; you might get away with it if you used CORBA for distribution, but with SOAP, you'd be in trouble. Rather, you would want to have the cart ask the calculator to price the whole order in one go. That might be worse from a separation of concerns point of view (why should the calculator have to know about the idea of carts?), but it would be required to make the system perform adequately.
Related to granularity, there's also the problem of modules interacting via interfaces or implementations. With a single process, you can define a set of interfaces through which modules will interact; modules can pass each other objects implementing those interfaces without having to tell each other about the implementations (eg a scheduler module could be passed anything implementing interface Job { void run(); }). Across a network, the requirement for coarse grain means that any objects passed must be passed by value (because passing by reference would entail fine-grained calls back to the passing module - unless you were using mobile code, which you aren't, because nobody is), which means that both modules must know about and agree on the implementations of the objects.
So, while building a single-process system in a modular way makes it easier to implement SOA later, it doesn't make it as simple as wrapping each module in a SOAP interface. At least, not unless you build your system in a coarse-grained manner from the start, which means throwing away a number of sound and helpful good software engineering practices.
I'm working on the initial architecture for a solution for which an SOA approach has been recommended by a previous consultant. From reading the Erl book(s) and applying to previous work with services (and good design patterns in general), I can see the benefits of such an approach. However, this particular group does not currently have any traditional needs for implementing web services -- there are no external consumers, and no integration with other applications.
What I'm wondering is, are there any advantages to going with web services strictly to stick to SOA, that we couldn't get from just implementing objects that are "service ready"?
To explain, an example. Let's say you implement the entity "Person" as a service. You have to implement:
1. Business object/logic
2. Translator to service data structure
3. Translator from service data structure
4. WSDL
5. Service data structure (XML/JSON/etc)
6. Assertions
Now, on the other hand, if you don't go with a service, you only have to implement #1, and make sure the other code accesses it through a loose reference (using dependency injection, or a wrapper, etc). Then, if it later becomes apparent that a service is needed, you can just have the reference instead point to #2/#3 logic above in a wrapper object (so all caller objects do not need updating), and implement the same amount of objects without a penalty to the amount of development you have to do -- no extra objects or code have to be created as opposed to doing it all up front.
So, if the amount of work that has to be done is the same whether the service is implemented initially or as-needed, and there is no current need for external access through a service, is there any reason to initially implement it as a service just to stick to SOA?
Generally speaking you'd be better to wait.
You could design and implement a web service which was simply a technical facade that exposes the underlying functionality - the question is would you just do a straight one for one 'reflection' of that underlying functionality? If yes - did you design that underlying thing in such a way that it's fit for external callers? Does the API make sense, does it expose members that should be private, etc.
Another factor to consider is do you really know what the callers of the service want or need? The risk you run with building a service is that (as you're basically only guessing) you might need to re-write it when the first customers / callers come along. This can could result in all sorts of work including test cases, backwards compatibility if it drives change down to the lower levels, and so on.
having said that the advantage of putting something out there is that it might help spark use of the service - get people thinking - a more agile principled approach.
If your application is an isolated Client type application (a UI that connects to a service just to get data out of the Database) implementing a SOA like architecture is usually overkill.
Nevertheless there could be security, maintainability, serviceability aspects where using web services is a must. e.g. some clients needs access to the data outside the firewall or you prefer to separate your business logic/data access from the UI and put it on 1 server so that you don’t need to re-deploy the app every time some bus. rules changes.
Entreprise applications require many components interacting with each other and many developers working on it. In this type of scénario using SOA type architecture is the way to go.
The main reason to adopt SOA is to reduce the dependencies.
Enterprise Applications usually depends on a lot of external components (logic or data) and you don’t want to integrate these components by sharing assemblies.
Imagine that you share a component that implements some specific calculation, would you deploy this component to all the dependent applications? What will happen if you want to change some calculation logic? Would you ask all teams to upgrade their references and recompile and redeploy their app?
I recently posted on my blog a story where the former Architect had also choosed not to use web services and thought that sharing assemblies was fine. The result was chaos. Read more here.
As I mentioned, it depends on your requirements. If it’s a monolithically application and you’re sure you’ll never integrate this app and that you’ll never reuse the bus. Logic/data access a 2 tier application (UI/DB) is good enough.
Nevertheless this is an Architectural decision and as most of the architectural decisions it’s costly to change. Of course you can still factor in a web service model later on but it’s not as easy as you could think. Refactoring an existing app to add a service layer is usually a difficult task to accomplish even when using a good design based on interfaces. Example of things that could go wrong: data structure that are not serializable, circular references in properties, constructor overloading, dependencies on some internal behaviors…
We are building three-tier architectures for over a decade now. Dividing presentation-, logic- and data-tier is supposed to allow us to exchange each layer individually, should the need ever arise, be it through changed requirements or new technologies.
I have never seen it working in practice...
Mostly because (at least) one of the following reasons:
The three tiers concept was only visible in the source code (e.g. package naming in Java) which was then deployed as one, tied together package.
The code representing each layer was nicely bundled in its own deployable format but then thrown into the same process (e.g. an "enterprise container").
Each layer was run in its own process, sometimes even on different machines but through the static nature they were connected to each other, replacing one of them meant breaking all of them.
Thus what you usually end up with, in is a monolithic, tightly coupled system that does not deliver what it's architecture promised.
I therefore think "three-tier architecture" is a total misnomer. The true benefit it brings is that the code is logically sound. But that's at "write time", not at "run time". A better name would be something like "layered by responsibility". In any case, the "architecture" word is misleading.
What are your thoughts on this? How could working three-tier architecture be achieved? By that I mean one which holds its promises: Allowing to plug out a layer without affecting the other ones. The system should survive that and be in a well defined state afterwards.
Thanks!
The true purpose of layered architectures (both logical and physical tiers) isn't to make it easy to replace a layer (which is quite rare), but to make it easy to make changes within a layer without affecting the others (and as Ben notes, to facilitate scalability, consistency, and security) - which works all the time all around us.
One example of a 3-tier architecture is a typical database-driven web application:
End-user's web browser
Server-side web application logic
Database engine
In every system, there is the nice, elegant architecture dreamed up at the beginning, then the hairy mess when its finally in production, full of hundreds of bug fixes and special case handlers, and other typical nasty changes made to address specific issues not realized during the design.
I don't think the problems you've described are specific to three-teir architecture at all.
If you haven't seen it working, you may just have bad luck. I've worked on projects that serve several UIs (presentation) from one web service (logic). In addition, we swapped data providers via configuration (data) so we could use a low-cost database while developing and Oracle in higher environments.
Sure, there's always some duplication - maybe you add validation in the UI for responsiveness and then validate again in the logic layer - but overall, a clean separation is possible and nice to work with.
Once you accept that n-tier's major benefits--namely scalability, logical consistency, security--could not easily be achieved through other means, the question of whether or not any of the tiers can be replaced outright without breaking the the others becomes more like asking whether there's any icing on the cake.
Any operating system will have a similar kind of architecture, or else it won't work. The presentation layer is independent of the hardware layer, which is abstracted into drivers that implement a certain interface. The data is handled using logic that changes depending on the type of data being read (think NTFS vs. FAT32 vs. EXT3 vs. CD-ROM). Linux can run on just about any hardware you can throw at it and it will still look and behave the same because the abstractions between the layers insulate each other from changes within a single layer.
One of the biggest practical benefits of the 3-tier approach is that it makes it easy to split up work. You can easily have a DBA and a business anylist or two building a data layer, a traditional programmer building the server side app code, and a graphic designer/ web designer building the UI. The three teams still need to communicate, of course, but this allows for much smoother development in most cases. In this regard, I see the 3-tier approach working reliably everyday, and this enough for me, even if I cannot count on "interchangeable parts", so to speak.
I am a total newbie to the world of SOA. As such, I am looking at some "SOA frameworks/technologies", and trying to understand how to utilize them to build a highly scalable (Facebook class) website.
There are several "pains" I am trying to solve here:
Composability (+ managing dependencies, Pub/Sub)
Language-independence of services
Scalability & Performance
High availability
I looked into some technologies that answer a subset of the above criteria:
Thrift - Facebook's cross-platform RPC platform
WCF - supports SOAP, JSON & REST, so it can be considered language-interoperable. Generates WSDL files that can be use to generate java proxies.
Microsoft DSS - just inclued it in my survey, but it doesn't seem relevant as it is highly state-driven and .NET specific.
Web Services
Now, I understand how I get some aspects of composability and language-independence out of the above. But, I haven't found much concrete information (not buzz) about how to use the above / other tools for scalability and high availability. So finally I get to my question:
How does one leverage SOA technologies to solve the pains I defined above? Where can I find technical guides for that? I am looking for more than just system diagrams, but rather actual libraries, code samples, APIS...
I think the question has more to do with the concepts involved than the tools. Answers to the items:
Understand and internalize bounded context. Keeping unrelated pieces separate its important to get real reuse on different services. Related technologies won't help you on this, its you that separate the app in appropriate contexts, which you can appropriately reuse for different services.
Clear endpoint for communication based on known protocols allows implementing different pieces with different technologies
Having the operations flow like independent actions based on different protocols, gives you a lot of places where you can add tiers. Is a particular sub-process of the overall process using lots of resources and the server can't take it anymore, just move to a separate server. Load kept growing, and that server isn't taking it anymore, add an additional server and load balance. You also have more opportunity to use caching and connection pooling.
Have a critical sub-process that needs to be available all the time, add a server so you can have a fail over. Have an overall process that needs to be "available" all the time, use queues for pieces that can be processed later.
Do you really need to support that type of load? Set appropriate performance/load/interoperability targets that relate to the specific scenario. If you really need to support that type of load, I recommend you get someone on board who has dealt with it.
If it is something for a load that might eventually be, identify the bounded contexts and design the interaction between those with a SOA mindset. Keeping the code clean is all you have to do for the rest, use TDD, loose coupling, focused integration tests, etc in your code base. With good code, if you later need to separate pieces of the system, it will be a lot easier.
There are interesting and relevant things said about services and architecture by Amazon's CTO - Werner Vogels:
An interview about the Amazon technology platform
A post on his blog - "Eventually Consistent - Revisited"
A 50 min presentation about Availability & Consistency
IDesign.net has a bunch of great downloads for WCF.
Worth a look at: http://www.manning.com/davis/ ?
Check out the Mule project which bundles the CXF services stack and also the Mule REST pack which provides RESTful alternatives. I think you'll see it addresses all of your pains and there are lots of examples in the documentation as well as the distribution.
May I recommend the book: Enterprise Service Oriented Architectures published by Springer Verlag.
All of the advice here is well and good, but don't worry about it until you really need to.
Focus on building a usable, functional application that people really like. When you start running into problems, then start handling bottlenecks.
You will never be able to foresee every way an application will fail, so how can you tell if [[insert tech here]] is your answer?