Context
Let's say we have an organization that has multiple businesses. In this example, Business A sells a gigabit internet service to college students. Business B sells a megabit internet service to seniors. The businesses sell related products with slight variations, each targeting a different demographic.
At first glance, it seems like one application could handle all the requests. However, it is natural for the businesses to diverge from each other given that each targets a specific demographic - by nature, each business will have its own business requirements. For example, Business A might expose a mobile application for customers to manage their accounts, while Business B might expose a phone number that customers have to call to manage their accounts. The list goes on.
What is the best way to utilize microservices given this context?
The problem is that there is both common and uncommon functionality across the different businesses.
We can remain somewhat DRY and have a set of base microservices (billing-api, order-api, etc.) that can be consumed by the different businesses. This works, but it forces the microservices toward more "general" abstractions - leading to more complexity. For a concrete example, let's say the billing-api service has a /charge endpoint shared by Businesses A and B. Business B's requirement is to always discount $5 off the order:
//billing-api
if (businessB) {
  orderCost -= 5; //Business B's discount, hard-coded into the shared service
}
In this DRY approach, we would have an API gateway for each business (BFF pattern) which would aggregate different microservices to fulfill their business needs. All "business-specific" logic would get moved from the base microservices into the respective businesses' API gateway. In this discount example, instead of having an if (businessB) check in the billing-api endpoint, we can invert this control to the consumer:
//billing-api
const { orderDiscountAmount } = req.body; //body parameters
if (orderDiscountAmount > 0) {
  orderCost -= orderDiscountAmount;
}
Then the endpoint in Business B's API gateway would pass in an orderDiscountAmount of 5 when calling the billing-api endpoint:
//Business B API Gateway
billingApi({ orderDiscountAmount: 5 });
This seems fine, but all we did was take Business B's logic out of the billing-api endpoint and create a generic (but forced) abstraction. This is "justified" by saying Business A might use it one day - but that may never actually happen. Overall, this feels like an unnatural exercise for both the developer and the consumer of the endpoint. Complexity and cognitive load increase on all sides.
We can scrap DRY and avoid sharing microservices between businesses for maximum flexibility and simplicity. However, if more businesses are added (10-20) then there's probably going to be a good chunk of duplicated functionality.
How should teams be structured given this context?
If we are okay with the DRY approach from above, how should teams be structured? We can have vertically-sliced feature teams, but does that mean that if we have 10 businesses, a team would need to own a feature (e.g. checkout) across all the businesses? The drawback with this approach is that the feature teams won't be experts in any business as a whole - each team would only be an expert in one feature of a given business. Not having the full context on a business could make it difficult to make the right decisions.
We can have a stream-aligned team for each business dedicated to the UI and the API gateway. We would then have platform teams creating microservices for the stream-aligned teams to consume. The drawback with this is that there is a handoff step between the stream-aligned team and the platform team, a.k.a. a dependency.
I'm not sure if I'm looking at all this from the wrong lens - any feedback would be appreciated!
Sorry to say, but this is not a good question for Stack Overflow, because any answer will be opinion-based, many approaches may work, and much depends on further details of your specific use case. So don't be disappointed if the question gets closed at some point.
That being said, I am not too shy to offer my opinion, or at least some thoughts, about the situation you describe.
I believe your questions about how to set up teams and how to set up the architecture are very tightly linked, because the architecture will no doubt effectively follow the organizational structure. So I will give some thoughts about the organizational setup first.
An estimation of the total manpower required for each of the businesses should give you an idea of how many teams you need. Trying to keep team size small (say 2-8 people) will help reduce the communication overhead. So if you think this is the size for a whole business, then there is no need to split responsibility further.
Responsibility is the most important keyword. You have to avoid any situation where a common service/library is used but has multiple or no owners. There should always be exactly one organizational owner. Thus, when organizations recognize overlapping functionality in separate areas, it is common practice to establish a team that will be responsible for this functionality and provide it to others. This could be in the form of shared libraries or actually deployed services. In both cases it is important that the communication is formalized by correctly versioning the work, leaving it to the consuming groups which versions to use, when to upgrade, when to put in requests for new features, etc. This approach decouples the teams that use the common functionality.
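As a concrete sketch of that formalization: if the common functionality ships as a versioned library, each consuming team pins the version it has tested against and upgrades on its own schedule (the package name below is invented for illustration):

//Business B's package.json, pinning a hypothetical shared billing client
{
  "dependencies": {
    "@org/billing-client": "^2.3.0"
  }
}

The owning team publishes new versions; consumers adopt them deliberately instead of being broken implicitly.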
In your problem description, the core of the problem is the business logic and its complexity/overlap. So I would argue that the most important role is product management. They have to be very good (and at least a bit technical) to sort this exact mess into reusable pieces and things that are specific to only a single business. If you have a whole team of product managers, they need to communicate very well and build this picture together. What is most important here is good communication about the vision for the future, not just immediate requirements (provide a great domain view). Only then can the architecture and teams be set up in the best possible way.
No matter how careful the initial setup, changes WILL happen. Whatever you think is the best solution in the beginning will change at some point in the future. To prepare for this, I always recommend going with the simplest approach - even if it means some code duplication or other imperfections. As software architects we tend to love the beauty of perfection, but that is rarely the most effective approach in the real world.
It is common sense to make a simple shared service/library fit multiple use cases by adding some configurability. Up to a certain degree of complexity that is a useful approach, but you have to be sensible to the consumers of that library/service, and it should be easy to reuse at any point. It is not black and white when a piece of functionality becomes too big/complex and has to be split into multiple pieces to stay maintainable, but looking at it with the eyes of the maintainer and the eyes of the consumer will make that determination easier. In the case of configurable services, you could also have separate deployments with different configurations - using commonly developed components, but deploying different endpoints for each use case. If you use technologies that produce only a small deployment overhead (for example golang containers that are only a few MBs), then the large number of deployed services is not a drawback but a strength, because they can be upgraded/versioned independently and it is even easy to run multiple versions in parallel.
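To illustrate the "common components, separate deployments" idea in the same JavaScript style as the question (all names here are invented), the shared billing code could read business-specific settings at startup instead of branching at runtime:

//billing-api, deployed once per business with its own config file
//e.g. config/businessB.json contains { "orderDiscountAmount": 5 }
const express = require('express');
const app = express();
app.use(express.json());
const config = require(`./config/${process.env.BUSINESS}.json`);
app.post('/charge', (req, res) => {
  const orderCost = req.body.orderCost - (config.orderDiscountAmount || 0);
  res.json({ charged: orderCost });
});
app.listen(3000);

Each deployment then exposes a business-specific endpoint while the code itself stays shared.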
Infrastructure and service deployment may or may not follow the architecture of the services. As a general rule I would recommend looking for the simplest approach, which often means providing common infrastructure shared among services, with deployment configurations being where the distinction between services starts. For example, all services share a common cluster / streaming / gateways / databases / etc. Exceptions could be very special needs of single services, like a hardware encryption key store or GPU servers for machine learning. This would be the approach for any reasonably sized system. (Of course, if you are going to scale to very large sizes, it is also very feasible to have complete stacks/clusters for specific services.)
Persistence design is the most crucial. Where it is relatively easy to evolve business logic, reorganize it, etc., it is rather difficult to evolve your historic data. Often you have a choice to do a design in one of two ways:
Smart algorithms, dumb data.
Smart data, dumb algorithms.
(Smart referring to more elaborate / reflecting more of the business requirements)
The second approach is usually harder initially, but in my experience will have better results when a certain complexity threshold is reached.
So these are just a few things that came to my head when reading your question. I apologize that they cannot answer your detailed question about how to slice the billing API, but maybe you have a few additional considerations at hand.
Related
By 'functionalities structuring', I mean how we organize and coordinate different API endpoints to offer desired functionalities to clients. The context here is web APIs for consumption by mobile phones with GPS tracking, and I assume either cellular or WiFi connectivity is required for most functionalities.
I personally prefer a more 'modular' approach where each endpoint does mostly one thing and a collection of them fulfills all the requirements. Of course, you may need to combine some subset or sequence of these endpoints to achieve certain functionalities. Overall, I try to minimize the overlap between endpoints in terms of both computation and functionality.
On the other hand, I know some other people prefer client-side convenience (or simplicity) over modularity in the following ways:
If the client needs to achieve a functionality, then there should exist a single API endpoint which does exactly that, such that the client needs only a single request to fulfill the functionality with minimal caching/logic in between requests.
For GET endpoints, if there are multiple levels/kinds of data involved for some functionality, they prefer as much data as possible (often all necessary data) returned by a single endpoint. Ironically, they may also want a dedicated endpoint for retrieving only the "lowest level" data using a corresponding "highest level" ID. For example, if A corresponds to a collection of Bs, and each B corresponds to a collection of Cs, then they will prefer a direct endpoint that retrieves all the relevant Cs given an A (see the sketch after this list).
In some extreme cases, they will ask for a single endpoint with ambiguous naming (e.g. /api/data) that returns related data from different underlying DB tables (in other words, different resources) based on different combinations of query string parameters.
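For illustration, here is how the two styles might look as hypothetical Express routes (handlers stubbed out; all route names are invented):

const express = require('express');
const app = express();
const stub = (req, res) => res.json([]); //placeholder handlers for illustration
//modular style: one endpoint per level of the A -> B -> C hierarchy
app.get('/as/:aId/bs', stub);
app.get('/bs/:bId/cs', stub);
//convenience style: one endpoint that spans the levels
app.get('/as/:aId/cs', stub);
app.listen(3000);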
I understand that people preferring such conveniences above aim to: 1. reduce the number of API requests necessary to fulfill functionalities; 2. minimize data caching and data logic on the client side to reduce client complexity, which arguably leads to a 'simple' client with simplified interaction with the server.
However, I also wonder if the cost of doing so is unjustifiable in other aspects in the long run, especially in terms of the performance and the maintenance of the server-side API. Hence my questions:
What are the tried-and-true guidelines for structuring API functionalities?
How do we determine the optimal number of requests necessary to fulfill a functionality in a mobile app? Of course, all other things being equal, a single request is best, but achieving such a single-request implementation usually carries a penalty in other aspects.
Given the contention between the number of client requests and the performance and maintainability of the server-side API, what are the approaches for striking a balance in order to deliver a sensible design?
What you are asking about breaks into at least three main areas of API design:
Ontology Design (organization)
Request/Response Design (complexity/performance)
Maintenance Considerations
Based on my experience (which is largely from working with very large organizations both on the API producing and consuming side and talking with hundreds of developers on the topic), let's look at each area, addressing the specific points you bring up...
Ontology Design
There are a couple of things to take into consideration in your design that are perhaps implied when you say:
Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
This approach makes the APIs easily discoverable. When you are in a situation where you are publishing APIs for consumption by other developers who you may or may not know (and may or may not have enough resources to truly support), this kind of modularity - making them easy to find and learn about - creates a different kind of "convenience" leading to easier adoption and reuse of your APIs.
I know some other people much prefer convenience over modularity: 1. if the client needs a functionality, then there should exist a single endpoint in the API which does exactly that...
The best public example that comes to mind for this approach is perhaps the Google Analytics Core Reporting API. They implement a series of querystring parameters to build a call that returns the data requested, ex:
https://www.googleapis.com/analytics/v3/data/ga
?ids=ga:12134
&dimensions=ga:browser
&metrics=ga:pageviews
&filters=ga:browser%3D~%5EFirefox
&start-date=2007-01-01
&end-date=2007-12-31
In that example we are querying Google Analytics account 12134 for pageviews by browser, where the browser is Firefox, for the given date range.
Given the number of metrics, dimensions, filters, and segments their API exposes, they have a tool called the Dimensions & Metrics Explorer to help developers understand how to use the APIs.
One approach makes the APIs discoverable and more understandable from the outset. The other requires more supporting work to explain the intricacies of consuming the API. One thing that isn't immediately obvious with the Google API above is that certain segments and metrics are incompatible, so if you are making calls passing one key/value pair, you may no longer be able to pass certain other pairs.
Request/Response Design
The context here is APIs for mobile applications.
That is still very broad, and better defining (if possible) how you intend for your "mobile applications" to be used can help you design your APIs.
Do you intend for them to be used totally offline? If so, heavy/complete data caching may be desirable.
Do you intend for them to be used in low bandwidth and/or high latency/error-rate connectivity scenarios? If so, heavy/complete data caching may be desirable, but so might small/discrete data requests.
for GET endpoints, they often prefer as much data as possible returned by a single endpoint, especially when there are multiple levels/layers of data involved
This is safe if you know you'll only ever be in good mobile connectivity scenarios, or you can cache the data heavily when you are (and thus access it offline or when things are spotty).
I understand that people preferring convenience aim to reduce the number of API calls necessary to achieve functionalities...
One way to find a happy middle ground is to implement paging in your data-intensive calls. For example, a querystring can be passed in a GET specifying 'pagesize'. Thus 10,000 records could be returned 100 at a time over 100 successive calls, or 1,000 at a time over 10 calls.
With this approach, you can design and publish your API without necessarily knowing what your consuming developer will need. Even though the paging example above uses the Google API referenced earlier, it can still be used in a more semantically designed API. For example, say you have GET /customer/phonecalls - you could still design it to accept a pagesize value and make successive calls to get all the phonecalls associated with a customer.
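A minimal sketch of that paging contract, in Express style (the route and data here are invented for illustration):

const express = require('express');
const app = express();
//toy in-memory data; a real service would page through a database query
const phonecalls = Array.from({ length: 10000 }, (_, i) => ({ id: i }));
app.get('/customer/:id/phonecalls', (req, res) => {
  const pagesize = Math.min(Number(req.query.pagesize) || 100, 1000);
  const page = Number(req.query.page) || 0;
  res.json(phonecalls.slice(page * pagesize, (page + 1) * pagesize));
});
app.listen(3000);

The consumer then chooses its own tradeoff: 100 calls of 100 records each, or 10 calls of 1,000.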
Maintenance
I also wonder if the cost of doing so [reduce the number of API calls necessary to achieve functionalities and to minimize data caching] is not justifiable in the long run, especially for the performance and the maintenance of an API.
The key guiding principle here is separation of concerns if your collection of APIs is going to grow to any significant level of complexity and scale.
What happens when you have everything bundled together into one big service and a small part of it changes? You are now creating not only a maintenance headache on your side, but also for your API consumer.
Did that "breaking change" really affect the part of the API they were using? It will take time and energy for them to figure that out. Designing API functionality into discrete, semantic services will let you create a roadmap and version them in a more understandable way.
For further reading, I'd suggest checking out Martin Fowler's writings on Microservices Architecture:
In short, the microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms
Although there is a lot of debate about how to design and build for "microservices" in practice, reading up on that should help further shape your thinking on the API design decisions you're facing and prepare you to engage in "current" discussions around the topic.
We are looking at the possibility of implementing a Common Information Model for data across several systems in a SOA architecture.
Many of these services will be consumed by a composite UI, we therefore see a benefit in having common data types.
What we are wondering is if this is a feasible approach, or if we should just map to common types in the client?
This question is framed pretty broadly, so my answer is going to remain pretty broad as well.
The key consideration here would seem to be location independence - though you're working with several applications, they're all going to share certain sorts of data (though not, as far as I can see from your question, actual data). An obvious use case for this is authentication and authorization data.
If you have determined that the common data is truly cooked enough to isolate in the fashion you're describing then I think it makes perfect sense to layer it off into a service. I think the perfect example of this is Windows Identity Framework. It takes something that we as architects have always treated as data and turns it into a service.
What you lose with the location independence is a little bit of efficiency that you would otherwise have from making batched calls to the same server, though SOA applications lose this efficiency early in their design, in my experience. But the efficiency you gain from "patternizing" a section of your apps generally outweighs that enormously.
Having a common information model doesn't imply common data types or common classes. Simply defining the relationships between, for instance, Customer, Order, OrderItem and Product goes a great distance toward common business logic and the ability to have different services and applications be able to interoperate in an SOA environment.
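As an illustrative sketch (the shapes here are invented), the shared model might pin down only the relationships, leaving concrete classes and storage to each service:

//conceptual relationships only: Customer 1..* Order, Order 1..* OrderItem, OrderItem -> Product
const order = {
  orderId: 'O-1001',
  customerId: 'C-42',
  items: [
    { productId: 'P-7', quantity: 2 },
    { productId: 'P-9', quantity: 1 },
  ],
};
console.log(order.items.length); //2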
You might consider having an actual common model in some modeling language. From this, concrete data types and classes could be generated for particular circumstances. One might use UML for this, but I personally prefer to use NORMA, an Object-Role Modeling tool. It works at the conceptual level, so creates models that are independent of the data store technology.
NORMA runs as an add-in to Visual Studio Standard edition or above, but out of the box generates artifacts for several databases, as well as LINQ to SQL classes and even PHP web services, all from the same model. It is extensible so that you can generate your own artifacts from the model. And of course, the model is represented as XML, so you can do whatever you like with it.
For a government contract we will be proposing to build a traffic monitoring architecture. We will have the following components:
Video cameras set up around the area of interest. The cameras will be aware of their location, orientation, and viewing parameters.
A GIS map server which can be queried for streets, building, etc.
An algorithm that takes in raw video and street location information and outputs car locations.
Another algorithm takes in car locations and very low level street information and provides information about which cars are driving anomalously.
Another database takes in information about car locations and anomaly reports over time and can be queried for this later.
A proxy (or perhaps more accurately, a facade) is set up over the archive database and the real-time algorithms in order to provide a unified interface to the information.
A client attaches to the proxy and to the street server and paints various representations of the traffic situation on the screen.
I'm just now learning what an SOA is. Is this an ideal candidate for a Service Oriented Architecture (SOA)? I had heard that SOA services should be stateless (or is that only RESTful services?). I had also heard that it is inadvisable to pipe one service to the next to the next because it increases hidden complexity, and that there is something you should do to make this situation better (an "orchestration"?). The services above do appear to be modular and reusable. For instance, there will be plenty of cameras, various types of vehicle detection and anomaly algorithms, distributed databases, and plenty of clients. I will also need the capability to handle events: for instance, I may want to register with a service and be notified whenever a big truck moves past a given point.
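That last requirement is essentially publish/subscribe; as a toy in-process sketch (a real SOA would route this through a message broker rather than Node's EventEmitter, and the event name is invented):

const { EventEmitter } = require('events');
const trafficEvents = new EventEmitter();
//a client registers interest in one kind of event
trafficEvents.on('bigTruckDetected', (location) => {
  console.log('Big truck at', location);
});
//a detection algorithm publishes events as it processes video
trafficEvents.emit('bigTruckDetected', { lat: 51.5, lon: -0.12 });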
If this isn't ideally implemented by a SOA, then where else should I be looking. If this is ideal for a SOA, then where should I start when designing this? (And I'm starting basically from having read Wikipedia's SOA page.) Are there any good case studies to look at here?
Yes, SOA is ideal in this case (complex, distributed system with a wide mix of technologies) but from the sound of it you need to do a whole lot more research to get your head around the concept. It is not a tough concept by any stretch, it's actually simple, but there is no one prescribed way to do it. I suggest going over SOA case studies for similarly-sized projects, successes and failures.
You mention a facade for one of your subsystems. Extend that same concept to the rest of your components. E.g. each service is a facade to a complex subsystem.
Also, I recommend implementing a couple of different web services in your choice of technologies, abstracting arbitrarily different subsystems (a database should be one of the components). Then write a client that makes use of them. Doing so will give you a lot of practical experience and insight into the concept.
Last thought: The one area where an SOA architecture might stumble is if you have to move video data between several different services. The stateless, transactional nature of SOAs might introduce performance issues when moving very large amounts of data or when performing bulk transactions on very large data sets. You either need to keep video localized or implement a back-end subsystem (cheat) to avoid potentially nasty bottlenecks.
Call me a troll if you want, but I'm serious: how exactly is the new SOA trend any different than the client-service architecture that I was building 15 years ago? I keep hearing SOA but I don't see how it's different than what we've always done.
Back 10 years ago, my company had multiple clients (in multiple languages) which talked to the same service. It wasn't XML (it was a binary protocol called Microsoft DCOM) and there wasn't auto-discovery through WSDL but that's OK since reading the docs was just as easy. Our system was even "open" in the sense we documented it enough to allow 3rd parties to talk to our services. We were not pioneers - every other company I knew 10 years ago was doing the same thing.
The ONLY difference I see between then and now is that now there's a single service available on the internet, whereas 10 years ago, each customer would host his own instance of the service. But that's not an architecture issue - where the service physically lives is transparent to anyone using the service.
So what exactly is SOA that's different than what we've been doing for years? Is SOA simply a marketing term representing a best practice that actually became common a long, long time ago? Or am I missing some subtlety to SOA that's different than what we've been doing all along?
Forget about XML. Forget about WSDL. SOA is not a technology you can buy, though it's often marketed that way.
The real point of SOA is all about IT organization. The point of SOA is to avoid having a huge bunch of "applications" that have isolated data pools and either don't talk to each other at all (and thus often duplicate data), or only in an inefficient, buggy way through adapter layers or EAI systems.
For large companies, this is a serious problem - they have literally hundreds of separate apps that are insufficiently integrated. There's duplicate and inconsistent data everywhere and the result is that customers get pissed off and real money is lost because the billing department keeps sending invoices for a cancelled order and the customer service rep can't even find the order because it's cancelled in the order tracking system, but not the billing system.
SOA is supposed to solve this by designing every app from the ground up to publish its services in a standardized, cross-platform manner so that other apps can access the data and don't have to duplicate it.
From a business perspective, this is highly desirable. The buzzword hype and the acronym soup is just IT companies' attempts to cash in on that desirability. Unfortunately, this has (mis)led many people, including CEOs into believing that SOA is a product you can buy and it will magically make your IT more efficient, without realizing that this will only happen if you also reorganize your entire IT (and quite possibly your business units as well) to be SOA-compatible.
Let me use the famous whipping boy of Integration Hell: Telco.
Way back in the 90's, cell phone companies were plethoric in my neighborhood, almost as plentiful as the long distance resellers made possible by the communications deregulation of the mid 90's. Well, time goes on, and Bell Atlantic becomes the powerhouse that is Verizon, and swallows up company after company (and at least one Baby Bell). Every single one of these companies has technologies in place, in towers, in switching equipment, in billing systems that are COMPLETELY incompatible with one another.
So the company goes off and says, okay, we have these models for how we do business, let's put a friendly, consistent face on ALL of our technology in the form of WSDL/SOAP/XSD - every language and system we have today can be interfaced to this! Slowly but surely, the company is making all of its systems capable of reporting on capabilities, being interrogated for load and billing purposes, and exposed for future visionaries to exploit in ways that haven't been accounted for yet.
Anyone can build a SOA client. Anyone with wget and a text editor. And anyone can parse the results (XML).
That is what's fundamentally different from past client/server architectures. I was just talking the other day to someone about interfacing Cobol and Smalltalk based systems to SOA architectures. That's an easy problem to solve. Tell me you can say the same for your DCOM systems.
SOA is nothing but a way of design, in which modules communicate with each other through "services". It is just that, and now the next question is: what exactly is a "service", and how does it differ from a regular "method"?
A service is an operation that performs a single, atomic business operation. This atomicity makes it highly reusable from many modules. A complex business operation is then just the orchestration of the invocation of many of these services in a specific order.
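A toy sketch of that idea (the service names are invented; each stub stands in for a call to a deployed service):

//hypothetical atomic services, stubbed for illustration
const reserveInventory = async (order) => console.log('reserved', order.id);
const chargeCustomer = async (order) => console.log('charged', order.id);
const scheduleShipping = async (order) => console.log('scheduled', order.id);
//a complex business operation orchestrates the atomic services in order
async function placeOrder(order) {
  await reserveInventory(order);
  await chargeCustomer(order);
  await scheduleShipping(order);
}
placeOrder({ id: 1 });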
SOA has nothing to do with specific technology; it is just a specific way of designing.
Professor Frank Leymann from the University of Stuttgart takes SOA as a key concept of his Service Oriented Computing (SOC) research work. In an interview he is asked about the definition of SOA, and the ensuing conversation is a good read.
Please note that our roadmap is about "service oriented computing (SoC)", i.e. the compute paradigm behind service-orientation. Service Oriented Architecture (SOA) is an architectural realization of this compute paradigm. You may compare this with "client/server computing" as paradigm and "browser/web server" or "DB-client/stored procedure" as two (of various other) architectural realizations of this paradigm.
...
SOA is not completely new. Some individual aspects of SOA have been used in practice for a long time. For example, take a look at "loose coupling": Enterprises have been using reliable messaging technology for decades to integrate applications, i.e. to loosely couple them. Don't get me wrong, there are new concepts in SOA, e.g. concepts resulting from the combination of concepts put together in SOA, i.e. they result from emergence.
Web Service specifications make the corresponding technologies available cross platform. I.e. the corresponding specifications do not invent fundamentally new concepts but define how these concepts and corresponding implementations work in heterogeneous environments. The resulting interoperability is groundbreaking, making SOA real.
In summary, SOA is a mixture of mature things and new emerging things.
There is also a SoC paper reference dated April 2006.
A Google search identifies Prof. Frank Leymann and his works.
Neal Ford has many strong opinions regarding SOA. You might find his viewpoint interesting.
Tactics vs. Strategy (SOA & The Tarpit of Irrelevancy)
Standards Based vs. Standardized (SOA & the Tarpit of Irrelevancy)
Tools & Anti-Behavior (SOA & the Tarpit of Irrelevancy)
Rubick's Cubicle (SOA & the Tarpit of Irrelevancy)
The Triumph of Hope over Reason (SOA & The Tarpit of Irrelevancy)
Guerrilla SOA (SOA & The Tarpit of Irrelevancy)
I think SOA is both a marketing term and an integration of existing solutions, built around the idea that instead of selling the whole software package or machine, we sell services.
For me, a Service Oriented Architecture comes about when an Enterprise wishes to integrate a selection of disparate applications which concern a common domain into a set of interoperable services which operate against a single data source.
In the case of a new startup company with an idea for a piece of software or a suite of software, I can't see how the company can begin with a Service Oriented Architecture from the off. At first, each solution (which may well evolve into a service such that it becomes interoperable) should seek to solve its problem space in isolation.
Perhaps it will be on the roadmap for an enterprise capability or suite that each solution becomes an interoperable service as the solutions are completed and enter service. For this, the development teams might take a modular/component-oriented approach to building the solution (the eventual service), so as to make it easier to include the solution as a service in a Service Oriented Architecture.
In the case where existing islands of software are to become interoperable services in a Service Oriented Architecture, the approach allows for the software items - which may be distributed and may be written in different languages - to communicate via an exposed API and/or common protocol (for example a flavor of Web Service) and generic data format (for example XML).
SOA is an approach or idea. It is not a framework or a tool. When WSDLs and EJBs get name-dropped, this is often forgotten... as is the fact that the idea of SOA is not new at all.
Most of the answers here seem to convey that SOA (Service Oriented Architecture) is about building applications in a standardized manner so that other applications can interact with them in a platform-independent manner.
I am not sure if the meaning has changed since, but I have had the opportunity to work with a company that offers an SOA suite, and the following are my thoughts on it.
Of course, when you design an application you cannot guarantee it will be cross-platform compliant. Take, for example, stock trading systems. They use the FIX protocol to transfer messages. Do you expect them now to return data in XML format so they can be so-called SOA compliant? Definitely not! SOA is an architectural approach that can help you decouple your applications/services and let them interact with each other. The backbone of SOA is an ESB (Enterprise Service Bus), which is used to transfer data from one service to another. The SOA architecture should take care of format conversions. For example -
FIX(Service 1) -> (XML ---ESB---> XML) -> JSON (Service 2)
These conversion modules are commonly called adapters and are generally part of an SOA suite. For a bit more information, refer to another answer -
Difference between SOA and ESB
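As a toy illustration of such an adapter (real FIX uses numeric tags and SOH delimiters; this sketch only mimics the tag=value shape):

//toy adapter: converts a FIX-style "tag=value" message into JSON
function fixToJson(fixMessage) {
  const fields = Object.fromEntries(
    fixMessage.split('|').map((pair) => pair.split('='))
  );
  return JSON.stringify(fields);
}
console.log(fixToJson('35=D|55=IBM|54=1|38=100'));
//prints {"35":"D","55":"IBM","54":"1","38":"100"}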
Sure, SOA as a word is hyped for marketing purposes. Technically speaking, it is as simple as de-serializing and serializing data so that services can be decoupled and platform independent, but the idea behind it is concrete.
Also refer to the Wikipedia page on the topic.
In reality, SOA is a collection of well-defined services. Basically, SOA uses loosely coupled services to get the desired result easily. Implementation details of a service are hidden from the client/consumer, so any change in the implementation doesn't affect the service until the contract between them changes. Service providers are components that execute some business logic based on predetermined inputs and outputs, and expose this functionality through an SOA implementation. This allows systems based on SOA to respond more quickly and cost-effectively for the business. The main difference between components and SOA is that SOA provides open-standards messaging that is not specific to any programming language or platform. As a result, you can achieve a high degree of loose coupling and interoperability across platforms and technologies. In a traditional client-server world, the provider is a server and the consumer is a client. You can read more about SOA here: Service Oriented Architecture (SOA)
A service-oriented architecture (SOA) is an architectural pattern in which software is designed as building blocks, i.e. modular development, which gives us the flexibility to assemble them any way we want. If you want to start a new project, instead of starting from scratch you can reuse existing services, and if you want a new service you can easily integrate it with existing services to build the new project, saving a lot of time and money. The basic principles of service-oriented architecture are independent of vendors, products, and technologies.
Analogy: toys built using Lego blocks.
In fact, SOA also utilizes client-server architecture. More importantly, SOA is a way to design your software. Suppose that your application can break into simple and independent tasks, like searching for a book, adding a new book, recommending a book according to user preference, and so forth. If you implement a service (an API) for each task, you are using SOA. The advantage of this architecture is that it doesn't matter whether you're building a web app or a mobile app - you only need the aforementioned services (APIs).
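A minimal Express sketch of that decomposition (the routes and in-memory store are invented for illustration):

const express = require('express');
const app = express();
app.use(express.json());
const books = []; //toy in-memory store
//each independent task becomes its own endpoint
app.post('/books', (req, res) => {
  books.push(req.body);
  res.status(201).end();
});
app.get('/books/search', (req, res) => {
  res.json(books.filter((b) => (b.title || '').includes(req.query.q || '')));
});
app.listen(3000);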
Service-oriented architecture (SOA) is a design approach where multiple services collaborate to provide some end set of capabilities. A service here typically means a completely separate operating system process. Communication between these services occurs via calls across a network rather than method calls within a process boundary.
SOA emerged as an approach to combat the challenges of the large monolithic applications. It is an approach that aims to promote the reusability of software; two or more end-user applications, for example, could both use the same services. It aims to make it easier to maintain or rewrite software, as theoretically we can replace one service with another without anyone knowing, as long as the semantics of the service don't change too much.
I am a total newbie to the world of SOA. As such, I am looking at some "SOA frameworks/technologies", and trying to understand how to utilize them to build a highly scalable (Facebook class) website.
There are several "pains" I am trying to solve here:
Composability (+ managing dependencies, Pub/Sub)
Language-independence of services
Scalability & Performance
High availability
I looked into some technologies that answer a subset of the above criteria:
Thrift - Facebook's cross-platform RPC platform
WCF - supports SOAP, JSON & REST, so it can be considered language-interoperable. Generates WSDL files that can be used to generate Java proxies.
Microsoft DSS - just included it in my survey, but it doesn't seem relevant, as it is highly state-driven and .NET specific.
Web Services
Now, I understand how I get some aspects of composability and language-independence out of the above. But, I haven't found much concrete information (not buzz) about how to use the above / other tools for scalability and high availability. So finally I get to my question:
How does one leverage SOA technologies to solve the pains I defined above? Where can I find technical guides for that? I am looking for more than just system diagrams - actual libraries, code samples, and APIs...
I think the question has more to do with the concepts involved than the tools. Answers to the items:
Understand and internalize bounded contexts. Keeping unrelated pieces separate is important to get real reuse on different services. Related technologies won't help you with this; it's you who separates the app into appropriate contexts, which you can then appropriately reuse for different services.
Clear endpoints for communication based on known protocols allow implementing different pieces with different technologies.
Having the operations flow like independent actions based on different protocols gives you a lot of places where you can add tiers. Is a particular sub-process of the overall process using lots of resources, so the server can't take it anymore? Just move it to a separate server. Load keeps growing and that server isn't taking it anymore? Add an additional server and load balance. You also have more opportunity to use caching and connection pooling.
Have a critical sub-process that needs to be available all the time? Add a server so you can fail over. Have an overall process that needs to be "available" all the time? Use queues for pieces that can be processed later.
Do you really need to support that type of load? Set appropriate performance/load/interoperability targets that relate to the specific scenario. If you really need to support that type of load, I recommend you get someone on board who has dealt with it.
If it is something for a load that might eventually be, identify the bounded contexts and design the interaction between those with a SOA mindset. Keeping the code clean is all you have to do for the rest, use TDD, loose coupling, focused integration tests, etc in your code base. With good code, if you later need to separate pieces of the system, it will be a lot easier.
There are interesting and relevant things said about services and architecture by Amazon's CTO - Werner Vogels:
An interview about the Amazon technology platform
A post on his blog - "Eventually Consistent - Revisited"
A 50 min presentation about Availability & Consistency
IDesign.net has a bunch of great downloads for WCF.
Worth a look at: http://www.manning.com/davis/ ?
Check out the Mule project which bundles the CXF services stack and also the Mule REST pack which provides RESTful alternatives. I think you'll see it addresses all of your pains and there are lots of examples in the documentation as well as the distribution.
May I recommend the book: Enterprise Service Oriented Architectures published by Springer Verlag.
All of the advice here is well and good, but don't worry about it until you really need to.
Focus on building a usable, functional application that people really like. When you start running into problems, then start handling bottlenecks.
You will never be able to foresee every way an application will fail, so how can you tell if [[insert tech here]] is your answer?