Common information model for SOA systems - web-services

We are looking at the possibility of implementing a Common Information Model for data across several systems in a SOA architecture.
Many of these services will be consumed by a composite UI, so we see a benefit in having common data types.
What we are wondering is if this is a feasible approach, or if we should just map to common types in the client?

This question is framed pretty broadly, so my answer is going to remain pretty broad as well.
The key consideration here would seem to be location independence - though you're working with several applications, they're all going to share certain sorts of data (though not, as far as I can see from your question, actual data). An obvious use case for this is authentication and authorization data.
If you have determined that the common data is truly cooked enough to isolate in the fashion you're describing then I think it makes perfect sense to layer it off into a service. I think the perfect example of this is Windows Identity Foundation. It takes something that we as architects have always treated as data and turns it into a service.
What you lose with the location independence is a little bit of efficiency that you would otherwise have in making batched calls to the same server, though SOA applications lose this efficiency early in their design, in my experience. But the efficiency you gain from "patternizing" a section of your apps generally outweighs that enormously.

Having a common information model doesn't imply common data types or common classes. Simply defining the relationships between, for instance, Customer, Order, OrderItem and Product goes a great distance toward common business logic and the ability to have different services and applications be able to interoperate in an SOA environment.
You might consider having an actual common model in some modeling language. From this, concrete data types and classes could be generated for particular circumstances. One might use UML for this, but I personally prefer to use NORMA, an Object-Role Modeling tool. It works at the conceptual level, so creates models that are independent of the data store technology.
NORMA runs as an add-in to Visual Studio Standard edition or above, but out of the box generates artifacts for several databases, as well as LINQ to SQL classes and even PHP web services, all from the same model. It is extensible so that you can generate your own artifacts from the model. And of course, the model is represented as XML, so you can do whatever you like with it.
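To make the idea of a model that is independent of concrete classes a little more tangible, here is a minimal sketch in which the relationships are kept as plain data and simple artifacts are derived from them. The `model` structure, the cardinality labels, and the `relationshipsOf`, `isConsistent` and `describe` helpers are all assumptions invented for this illustration; this is not NORMA's format or output.

```javascript
// Hypothetical conceptual model: entity names and relationships only,
// with no commitment to classes, tables, or wire formats.
const model = {
  entities: ["Customer", "Order", "OrderItem", "Product"],
  relationships: [
    { from: "Customer", to: "Order", kind: "places", cardinality: "one-to-many" },
    { from: "Order", to: "OrderItem", kind: "contains", cardinality: "one-to-many" },
    { from: "OrderItem", to: "Product", kind: "refers to", cardinality: "many-to-one" },
  ],
};

// List the relationships an entity takes part in, in either direction.
function relationshipsOf(entity) {
  return model.relationships.filter((r) => r.from === entity || r.to === entity);
}

// Check that every relationship refers only to declared entities.
function isConsistent(m) {
  return m.relationships.every(
    (r) => m.entities.includes(r.from) && m.entities.includes(r.to)
  );
}

// Derive one possible "artifact" from the model: readable fact sentences.
function describe(m) {
  return m.relationships.map((r) => `${r.from} ${r.kind} ${r.to} (${r.cardinality})`);
}

console.log(describe(model));
console.log(relationshipsOf("Order").length, isConsistent(model));
```

Each service could derive its own types, schemas or validation from such a description instead of sharing compiled classes.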

Related

Microservices - in an organization with multiple businesses

Context
Let's say we have an organization that has multiple businesses. In this example, Business A sells a gigabit internet service to college students. Business B sells a megabit internet service to seniors. The businesses sell related products with slight variations, each targeting a different demographic.
At first glance, this seems like we can just have one application handle all the requests. However, it is natural for the businesses to diverge from each other given that they each target a specific demographic - by nature, each business will have its own business requirements. For example, Business A might expose a mobile application for customers to manage their account. Business B might expose a phone number that has to be called for customers to manage their account. The list goes on.
What is the best way to utilize microservices given this context?
The problem is that there is both common and uncommon functionality across the different businesses.
We can remain somewhat DRY and have a set of base microservices (billing-api, order-api, etc.) that can be consumed by the different businesses. This works but this causes the microservices to have more "general" abstractions - leading to more complexity. For a concrete example, let's say the billing-api service has a /charge endpoint that is shared by Business A and B. Business B's requirement is to always discount $5 off the order:
//billing-api
if (businessB) {
  orderCost -= 5;
}
In this DRY approach, we would have an API gateway for each business (BFF pattern) which would aggregate different microservices to fulfill their business needs. All "business-specific" logic would get moved from the base microservices into the respective businesses' API gateway. In this discount example, instead of having an if (businessB) check in the billing-api endpoint, we can invert this control to the consumer:
//billing-api
const { orderDiscountAmount } = req.body; //body parameters
if (orderDiscountAmount > 0) {
  orderCost -= orderDiscountAmount;
}
Then the endpoint in Business B's API gateway would pass in an orderDiscountAmount of 5 when calling the billing-api endpoint:
//Business B API Gateway
billingApi({ orderDiscountAmount: 5 });
This seems fine, but all we did was take Business B's logic in the billing-api endpoint and turn it into a generic (but forced) abstraction. This is "justified" by saying that Business A may use it one day - but that may never actually happen. Overall, this feels like an unnatural exercise for the developer and the consumer of the endpoint. Complexity and cognitive load on all sides are increased.
We can scrap DRY and avoid sharing microservices between businesses for maximum flexibility and simplicity. However, if more businesses are added (10-20) then there's probably going to be a good chunk of duplicated functionality.
How should teams be structured given this context?
If we are okay with the DRY approach from above, how should teams be structured? We can have vertically-sliced feature teams, but does that mean if we have 10 businesses, a team would need to own a feature (i.e. checkout) on all the businesses? The drawback with this approach is that the feature teams won't be experts in any business as a whole - the teams would only be an expert in one feature in a given business. Not having the full context on a business could make it difficult to make the right decisions.
We can have a stream-aligned team for each business dedicated to the UI and the API gateway. We would then have platform teams creating microservices for the stream-aligned teams to consume. The drawback with this is that there is a handoff step between the stream-aligned team and the platform team, i.e. a dependency.
I'm not sure if I'm looking at all this from the wrong lens - any feedback would be appreciated!
Sorry to say, but this is not a good question for Stack Overflow, because any answer will be opinion-based, and many approaches may work depending on the details of your specific use case. So don't be disappointed if the question gets closed at some point.
That being said I am not too shy to offer my opinion or at least some thoughts about your described situation.
I believe your questions about how to set up teams and how to set up the architecture are very tightly linked, because the architecture will without doubt end up following the organizational structure. So I will give some thoughts about the organizational setup first.
An estimate of the total manpower required for each of the businesses should give you an idea of how many teams you need. Keeping team size small (say 2-8 people) will help reduce the communication overhead. So if you think this is the size for a whole business, then there is no need to split responsibility further.
Responsibility is the most important keyword. You have to avoid any situation where a common service/library is used but has multiple or no owners. There should always be exactly one organizational owner. Thus, when organizations recognize the overlap of functionality in separate areas it is a common practice to establish a team that will be responsible and provide this functionality to others. This could be in the form of shared libraries or actually deployed services. In both cases it is important that the communication is formalized by correctly versioning their work and leaving it to the consuming groups which versions to use, when to upgrade, putting in requests for new features, etc. This approach will decouple the teams that use this common functionality.
In your problem description, the core of the problem is the business logic and its complexity / overlap. So I would argue that the most important role is product management. They have to be very good (and at least a bit technical) and sort this exact mess into reusable pieces and things that are specific to only a single business. If you have a whole team of product managers, they need to communicate very well and build this picture together. What is most important here is good communication about the vision for the future and not just immediate requirements (provide a great domain view). Only then can the architecture and teams be set up in the best possible way.
No matter how careful the initial setup, changes WILL happen. Whatever you think in the beginning to be the best solution will change at some point in the future. To prepare for this, I always recommend going with the simplest approaches, even if it means some code duplication or other imperfections. As software architects we tend to love the beauty of perfection, but that is rarely the most effective approach in the real world.
It is common sense to make a simple shared service/library that can be made to fit multiple use cases by adding some configurability. Up to a certain degree of complexity that is a useful approach, but you have to be sensitive to the consumers of that library/service, and it should be easy to reuse at any point. It is not black and white when a piece of functionality becomes too big / complex and has to be split into multiple pieces to stay maintainable, but looking at it through the eyes of the maintainer and the eyes of the consumer will make that determination easier. In the case of configurable service libraries you could also have separate deployments with different configurations, i.e. using commonly developed components but deploying different endpoints for each use case. If you use technologies that produce only a small deployment overhead (for example Go containers that are only a few MB), then the large number of deployed services is not a drawback but a strength, because they can be upgraded / versioned independently and it is even easy to run multiple versions in parallel.
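As a rough sketch of "commonly developed components, separate deployments with different configurations", the billing example from the question could look something like the following. The `createBillingService` factory and the `flatDiscount` option are hypothetical names chosen for this illustration, and real deployments would read the configuration from their environment rather than from literals.

```javascript
// Hypothetical shared billing component: one code base, configured per deployment.
function createBillingService(config) {
  const { flatDiscount = 0 } = config;
  return {
    // Compute the charge for an order, never going below zero.
    charge(orderCost) {
      return Math.max(0, orderCost - flatDiscount);
    },
  };
}

// Two separate deployments of the same component, each with its own configuration.
const billingForBusinessA = createBillingService({});
const billingForBusinessB = createBillingService({ flatDiscount: 5 });

console.log(billingForBusinessA.charge(60)); // 60
console.log(billingForBusinessB.charge(60)); // 55
```

The business-specific rule lives in deployment configuration, so neither the shared code nor its callers need an `if (businessB)` branch.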
Infrastructure and service deployment may or may not follow the architecture of the services. As a general rule I would recommend looking for the simplest approach, which often means providing common infrastructure that is shared among services, with deployment configurations being where the distinction between services starts. For example, all services share a common cluster / streaming / gateways / databases / etc. Exceptions to that could be very special needs of single services, like a hardware encryption key store or GPU servers for machine learning. This would be the approach for any reasonably sized system. (Of course, if you are going to scale to very large sizes, it is also a very feasible approach to have complete stacks / clusters for specific services.)
Persistency design is most crucial. While it is relatively easy to evolve business logic, reorganize it, etc., it is rather difficult to evolve your historic data. Often you have a choice to do a design in one of two ways:
Smart algorithms, dumb data.
Smart data, dumb algorithms.
(Smart referring to more elaborate / reflecting more of the business requirements)
The second approach is usually harder initially, but in my experience will have better results when a certain complexity threshold is reached.
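One possible reading of the two options, sketched with the discount example from the question. The record shapes and the `adjustments` field are assumptions made up for this illustration, not something the approaches prescribe.

```javascript
// "Smart algorithm, dumb data": the rule lives in code, the record stores only the result.
function chargeWithSmartAlgorithm(order) {
  let cost = order.cost;
  if (order.business === "B") {
    cost -= 5; // business rule hard-coded in the algorithm
  }
  return { business: order.business, charged: cost };
}

// "Smart data, dumb algorithm": the rule is recorded as data on the order,
// and the algorithm only applies whatever adjustments it finds.
function chargeWithSmartData(order) {
  const charged = order.adjustments.reduce((cost, a) => cost + a.amount, order.cost);
  return { ...order, charged };
}

const dumbRecord = chargeWithSmartAlgorithm({ business: "B", cost: 60 });
const smartRecord = chargeWithSmartData({
  business: "B",
  cost: 60,
  adjustments: [{ reason: "business B flat discount", amount: -5 }],
});

console.log(dumbRecord.charged, smartRecord.charged); // 55 55
console.log(smartRecord.adjustments[0].reason); // the stored record explains itself
```

Both produce the same charge today, but only the second record still explains how the amount came about after the code has changed.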
So these are just a few things that came to my head when reading your question. I apologize that they cannot answer your detailed question about how to slice the billing API, but maybe you have a few additional considerations at hand.

How to display computer vision and AR in UML use case diagrams

My system uses both augmented reality and computer vision.
The first feature is: The user actor can scan a specific object and the computer vision should recognize it.
The second feature is: The user actor can view a specific place using the augmented reality.
Each feature is a use case connected to the user, but do I also connect them to some sort of AI actors? And if so, what is a suitable way to do it?
Do I just say "Computer vision system" and "Augmented reality system"?
Feature or use-case?
This is a good start. However, there is a key misconception here:
Features are characteristics or capabilities offered by the software that are valued by users because they help them achieve some purpose. Features are often identified with user stories.
Use-cases represent goals of the actors using the system; each corresponds to a set of behaviors and interactions with the user, without reference to the internal structure of the system.
These are two different concepts. There is of course some overlap: some higher-level capabilities can be described in terms of goals. For example, an ERP can be expected to have accounting, warehouse management and sales administration features.
But features are more general: they can also describe technical capabilities that are not directly observable by the user (e.g. backup), capabilities that are not directly related to a specific set of behaviors (e.g. multilingual user interface), or capabilities that are much more detailed (e.g. a date picking feature).
If you're on features, you may consider non-UML techniques, such as a feature tree, or user-story mapping (which is a kind of feature tree constructed with user-stories).
The big picture with use-cases
In your diagram, the bubbles seem to show what the system offers, and not what the user wants to do. If you want to show the big picture with use-cases, you need to relate the bubbles to user goals:
Does the user just want to scan objects? Or is this scanning only one step for a larger goal, such as making an inventory, recognizing and ordering spare parts, or populating a virtual world.
Does the user just want to view a place in AR? Or are the expectations more ambitious, like purchasing products that would look fine in a given place?
This might look like an unnecessary philosophical debate. But it is not, because the main benefit of use-cases is a goal-oriented approach. Framing the problem or the expectations correctly may allow you to think more creatively about alternatives instead of locking yourself early into a preconceived solution.
The right boundaries
The actors raise another question: are these actors autonomous and independent systems and do they matter to the user? Or are they just implementation details?
Formally, actors are external to the system, and moreover, the use-case should not depend on the internal structure of the system. So if the computer vision and the augmented reality system are in fact libraries, components, or sub-systems of your system, they should not appear in the diagram.
Secondly, use-cases should offer an observable result of value to actors. If the external system is dependent on your system and has no value on its own, then the use-case results cannot be of value to this system. For example, a DBMS is often viewed as a candidate actor, but does not pass this test: the DBMS without the main system would be useless. If the system is not independent and autonomous, just remove it from the diagram to keep things simple.
Lastly, does the system actor matter to the other actors? If it makes no difference to your human users whether an external system-actor intervenes, keep it simple and do not show the system-actor, although you could. Then again, relying on an external system is more an implementation choice than a requirement.
The way you denoted it is common practice. The so-called primary actor (the one who receives the added value from the system under consideration) is placed to the left, and the so-called secondary actors (which only take part in and/or support the use case) are placed to the right. Whether their appearance makes sense depends on who will read the UC diagram. If you present it to a customer, they are likely not interested in IT blurb. But for system designers it would be important information.

Modeling frontend and backend in a use case diagram

I am trying to make a use case diagram for my project. The backend is going to be built with Django REST framework and the front end with React. My question is how I can model this situation in the right way: should I model the frontend and represent the backend as an actor, or the opposite, given that I am thinking of making a mobile application as a second front end?
The right answer here is the Standard Answer of the Business Analyst no 1: It depends.
The question is - what do you want to model and why. Then - what is the correct tool (diagram) to do it.
The goal of the Use Case diagram is to show what functionalities a system is going to offer. The system can be treated as a whole, in which case you show the functionalities without depicting how the system is internally organised. This is the most common scenario and most probably the best way to use a Use Case diagram in your case, but it does not show the fact of having FE and BE. Note that this type of diagram isn't really suited to doing so, so keep reading.
You may also treat e.g. the BE as the system itself (this can make sense especially when you're preparing a headless API and really separate BE from FE; even more so when your BE and FE teams are totally separate). In such a case the FE becomes an actor (just like any other system that can interact with your BE). Obviously the FE can be treated in the same way (i.e. be considered the system, with the BE being an actor), however there's usually less reason to do so.
Having said that, if you want to depict the distinction between BE and FE, you should consider other types of diagrams. Keep in mind that a Use Case diagram is a behavioral (dynamic) diagram, whereas the internal structure of the system is static, so it should obviously be shown in one of the structural (static) diagrams instead. One that is dedicated to showing the internal structure of the system is the Component diagram, and it would most likely best serve the purpose of indicating the existence of FE and BE (potentially with a further level of detail, e.g. existing microservices).
If on the other hand you would like to show the specific technology in use, a Deployment diagram might be your best shot. It lets you show the actual runtime environments, artifacts and their technologies.
Keep in mind - trying to use one type of diagram, or even worse one diagram, to show everything is usually a bad idea and a mistake often made by newbies. Be smarter than that.
Use-cases are about a set of behaviors with an observable result that is of value to the actors. They should not care about the internals of a system:
UseCases define the offered Behaviors of the subject without reference to its internal structure.
Therefore, you should in principle not care about the distinction between front-end and back-end, but focus on actor goals with the system.
The only situation where you'd care about the back-end in a use-case diagram is the case where the front-end is an independent application that is of value on its own, but can interact with actors that represent external independent systems.

Functionalities structuring in API design

By 'functionalities structuring', I mean how we organize and coordinate different API endpoints to offer desired functionalities to clients. The context here is web APIs for consumption by mobile phones with GPS tracking, and I assume either cellular or WiFi connectivity is required for most functionalities.
I personally prefer a more 'modular' approach where each endpoint does mostly one thing and a collection of them fulfill all the requirements. Of course, you may need to combine some subset or sequence of these endpoints to achieve certain functionalities. Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
On the other hand, I know some other people prefer client-side convenience (or simplicity) over modularity in the following ways:
If the client needs to achieve a functionality, then there should exist a single API endpoint which does exactly that, such that the client needs only a single request to fulfill the functionality with minimal caching/logic in between requests.
For GET endpoints, if there are multiple levels/kinds of data involved for some functionalities, they prefer as much data as possible (often all necessary data) returned by a single endpoint. Ironically, they may also want a dedicated endpoint for retrieving only the "lowest level" data using a corresponding "highest level" ID. For example, if A corresponds to a collection of Bs, and each B corresponds to a collection of Cs, then they will prefer a direct endpoint that retrieves all the relevant Cs given an A.
In some extreme cases, they will ask for a single endpoint with ambiguous naming (e.g. /api/data) that returns related data from different underlying DB tables (in other words, different resources) based on different combinations of query string parameters.
I understand that people preferring such conveniences above aim to: 1. reduce the number of API requests necessary to fulfill functionalities; 2. minimize data caching and data logic on the client side to reduce client complexity, which arguably leads to a 'simple' client with simplified interaction with the server.
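The A/B/C case above can be sketched as follows, with in-memory stand-ins instead of real HTTP calls. All names (`getBs`, `getCs`, `getCsForA`, `composeCsForA`) and the sample data are hypothetical; the point is only to make the trade-off in request count visible.

```javascript
// Hypothetical in-memory data: each A has Bs, each B has Cs.
const bsByA = { a1: ["b1", "b2"] };
const csByB = { b1: ["c1", "c2"], b2: ["c3"] };
let requestCount = 0; // counts simulated API requests

// Modular endpoints: each one does one thing.
function getBs(aId) {
  requestCount += 1;
  return bsByA[aId] || [];
}
function getCs(bId) {
  requestCount += 1;
  return csByB[bId] || [];
}

// Convenience endpoint: a single request returns all Cs for an A.
function getCsForA(aId) {
  requestCount += 1;
  return (bsByA[aId] || []).flatMap((bId) => csByB[bId] || []);
}

// Client-side composition of the modular endpoints.
function composeCsForA(aId) {
  return getBs(aId).flatMap((bId) => getCs(bId));
}

requestCount = 0;
const composed = composeCsForA("a1");
const modularRequests = requestCount;

requestCount = 0;
const direct = getCsForA("a1");
const directRequests = requestCount;

console.log(composed, modularRequests); // [ 'c1', 'c2', 'c3' ] 3
console.log(direct, directRequests);    // [ 'c1', 'c2', 'c3' ] 1
```

The modular route costs one request for the Bs plus one per B, while the convenience endpoint moves that traversal (and its maintenance) to the server.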
However, I also wonder if the cost of doing so is unjustifiable in other aspects in the long run, especially in terms of the performance and the maintenance of the server-side API. Hence my questions:
What are the tried-and-true guidelines for structuring API functionalities?
How do we determine an optimal number of requests necessary for fulfilling a functionality in a mobile app? Of course, all other things being equal, a single request is best, but achieving such a single-request implementation usually carries a penalty in other aspects.
Given the contention between the number of client requests and the performance and maintainability of server-side API, what are the approaches for striking a balance in order to deliver a sensible design?
What you are asking about breaks into at least three main areas of API design:
Ontology Design (organization)
Request/Response Design (complexity/performance)
Maintenance Considerations
Based on my experience (which is largely from working with very large organizations both on the API producing and consuming side and talking with hundreds of developers on the topic), let's look at each area, addressing the specific points you bring up...
Ontology Design
There are a couple of things to take into consideration in your design that are perhaps implied when you say:
Overall, I try to minimize the overlapping between endpoints in terms of both computation and functionalities.
This approach makes the APIs easily discoverable. When you are in a situation where you are publishing APIs for consumption by other developers who you may or may not know (and may or may not have enough resources to truly support), this kind of modularity - making them easy to find and learn about - creates a different kind of "convenience" leading to easier adoption and reuse of your APIs.
I know some other people much prefer convenience over modularity: 1. if the client needs a functionality, then there should exist a single endpoint in the API which does exactly that...
The best public example that comes to mind for this approach is perhaps the Google Analytics Core Reporting API. They implement a series of querystring parameters to build a call that returns the data requested, ex:
https://www.googleapis.com/analytics/v3/data/ga
?ids=ga:12134
&dimensions=ga:browser
&metrics=ga:pageviews
&filters=ga:browser%3D~%5EFirefox
&start-date=2007-01-01
&end-date=2007-12-31
In that example we are querying Google Analytics account 12134 for pageviews by browser, where the browser is Firefox, for the given date range.
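For illustration, the same query can be assembled with the standard `URL` API rather than by hand. This is only a sketch of how a client might build such a parameter-driven call; the `params` object is a convenience invented here, and the serializer may percent-encode more characters than the hand-written URL above, although the values decode to the same thing.

```javascript
// Build the example query using the standard URL / URLSearchParams API.
const url = new URL("https://www.googleapis.com/analytics/v3/data/ga");

// Decoded parameter values, taken from the example above.
const params = {
  ids: "ga:12134",
  dimensions: "ga:browser",
  metrics: "ga:pageviews",
  filters: "ga:browser=~^Firefox", // decoded form of ga:browser%3D~%5EFirefox
  "start-date": "2007-01-01",
  "end-date": "2007-12-31",
};

for (const [key, value] of Object.entries(params)) {
  url.searchParams.set(key, value); // handles percent-encoding for us
}

console.log(url.toString());
```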
Given the number of metrics, dimensions, filters, and segments their API exposes, they have a tool called the Dimensions & Metrics Explorer to help developers understand how to use the APIs.
One approach makes the APIs discoverable and more understandable from the outset. The other requires more supporting work to explain the intricacies of consuming the API. One thing that isn't immediately obvious with the Google API above is that certain segments and metrics are incompatible, so if you are making calls passing one key/value pair, you may no longer be able to pass certain other pairs.
Request/Response Design
The context here is APIs for mobile applications.
That is still very broad, and better defining (if possible) how you intend for your "mobile applications" to be used can help you design your APIs.
Do you intend for them to be used totally offline? If so, heavy/complete data caching may be desirable.
Do you intend for them to be used in low bandwidth and/or high latency/error-rate connectivity scenarios? If so, heavy/complete data caching may be desirable, but so might small/discrete data requests.
for GET endpoints, they often prefer as much data as possible returned by a single endpoint, especially when there are multiple levels/layers of data involved
This is safe if you know you'll only ever be in good mobile connectivity scenarios, or you can cache the data heavily when you are (and thus access it offline or when things are spotty).
I understand that people preferring convenience aim to reduce the number of API calls necessary to achieve functionalities...
One way to find a happy middle ground is to implement paging in your data-intensive calls. For example, a querystring can be passed in a GET specifying 'pagesize'. Thus 10,000 records could be returned 100 at a time over 100 successive calls, or 1,000 at a time over 10 calls.
With this approach, you can design and publish your API without necessarily knowing what your consuming developer will need. Even though the paging example above uses the Google API referenced earlier, it can still be used in a more semantically designed API. For example, say you have GET /customer/phonecalls: you could still design it to accept a pagesize value and make successive calls to get all the phone calls associated with a customer.
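A minimal sketch of the paging arithmetic, using an in-memory stand-in for GET /customer/phonecalls. The `page` parameter and the `total` field in the response are assumptions added so the client knows when to stop; a real API might use offsets, cursors or link headers instead.

```javascript
// Hypothetical data source standing in for GET /customer/phonecalls.
const allRecords = Array.from({ length: 10000 }, (_, i) => ({ id: i + 1 }));

// Simulated endpoint: returns one page plus the total record count.
function getPhoneCalls({ page, pagesize }) {
  const start = page * pagesize;
  return {
    items: allRecords.slice(start, start + pagesize),
    total: allRecords.length,
  };
}

// Client loop: request successive pages until every record has been received.
// Assumes pagesize is a positive integer.
function fetchAll(pagesize) {
  const records = [];
  let calls = 0;
  let total = Infinity; // unknown until the first response arrives
  for (let page = 0; records.length < total; page++) {
    const response = getPhoneCalls({ page, pagesize });
    calls += 1;
    total = response.total;
    records.push(...response.items);
  }
  return { records, calls };
}

console.log(fetchAll(100).calls);  // 100
console.log(fetchAll(1000).calls); // 10
```

The consumer picks the page size that suits its connectivity, and the server-side endpoint stays the same.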
Maintenance
I also wonder if the cost of doing so [reduce the number of API calls necessary to achieve functionalities and to minimize data caching] is not justifiable in the long run, especially for the performance and the maintenance of an API.
The key guiding principle here is separation of concerns if your collection of APIs is going to grow to any significant level of complexity and scale.
What happens when you have everything bundled together into one big service and a small part of it changes? You are now creating not only a maintenance headache on your side, but also for your API consumer.
Did that "breaking change" really affect the part of the API they were using? It will take time and energy for them to figure that out. Designing API functionality into discrete, semantic services will let you create a roadmap and version them in a more understandable way.
For further reading, I'd suggest checking out Martin Fowler's writings on Microservices Architecture:
In short, the microservice architectural style is an approach to
developing a single application as a suite of small services, each
running in its own process and communicating with lightweight
mechanisms
Although there is a lot of debate about how to design and build for "microservices" in practice, reading up on that should help further shape your thinking on the API design decisions you're facing and prepare you to engage in "current" discussions around the topic.

services based architecture should not necessarily imply distribution?

In my workplace (and in a lot of other places), there is a lot of emphasis on building architecture around services. (I work at an e-commerce startup.) However, I think services are implicitly assumed to be distributed. I am a believer in the first law of distribution - "don't distribute" - so I believe that we should not unnecessarily complicate the architecture. It should be an architecture that can evolve. One way to approach the problem would be to create well-defined namespaces and build code around them, but keep the communication via a Java API. (This keeps monitoring requirements and reliability/availability problems low.) This can easily be evolved into a distributed architecture by wrapping modules in web services as and when scale requirements kick in. So, the question is: what are the cons of writing code as a single application and evolving it into distributed services, rather than jumping straight into a web-services-based architecture? Am I right in assuming that services should imply the basic principles of design (abstraction, encapsulation, etc.) rather than distribution over a network?
Distribution requires modularity. However, it requires more than just modularity: it also requires coarse-grained interaction between the modules.
For example, in a single-process ecommerce system, you might have separate modules for managing the user's shopping cart and calculating prices. They might interact by the cart asking the calculator to price an item, then another item, etc. That would be perfectly fine.
However, in a distributed system, that would require a torrent of small method calls, which is inefficient; you might get away with it if you used CORBA for distribution, but with SOAP, you'd be in trouble. Rather, you would want to have the cart ask the calculator to price the whole order in one go. That might be worse from a separation of concerns point of view (why should the calculator have to know about the idea of carts?), but it would be required to make the system perform adequately.
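The difference in call volume can be sketched like this. The price table, the cart contents and the `remoteCalls` counter are made up for the illustration; the counter simply stands in for network round trips.

```javascript
// Hypothetical price list owned by the calculator module.
const prices = { apple: 2, pear: 3, plum: 1 };
let remoteCalls = 0; // stands in for network round trips

// Fine-grained interface: one call per item.
function priceItem(item) {
  remoteCalls += 1;
  return prices[item.sku] * item.quantity;
}

// Coarse-grained interface: one call for the whole order.
function priceOrder(items) {
  remoteCalls += 1;
  return items.reduce((sum, item) => sum + prices[item.sku] * item.quantity, 0);
}

const cart = [
  { sku: "apple", quantity: 2 },
  { sku: "pear", quantity: 1 },
  { sku: "plum", quantity: 4 },
];

remoteCalls = 0;
const fineTotal = cart.reduce((sum, item) => sum + priceItem(item), 0);
const fineCalls = remoteCalls;

remoteCalls = 0;
const coarseTotal = priceOrder(cart);
const coarseCalls = remoteCalls;

console.log(fineTotal, fineCalls);     // 11 3
console.log(coarseTotal, coarseCalls); // 11 1
```

In a single process the three calls are negligible; across a network each one pays latency and serialization costs, which is what pushes the interface toward `priceOrder`.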
Related to granularity, there's also the problem of modules interacting via interfaces or implementations. With a single process, you can define a set of interfaces through which modules will interact; modules can pass each other objects implementing those interfaces without having to tell each other about the implementations (e.g. a scheduler module could be passed anything implementing interface Job { void run(); }). Across a network, the requirement for coarse grain means that any objects passed must be passed by value (because passing by reference would entail fine-grained calls back to the passing module - unless you were using mobile code, which you aren't, because nobody is), which means that both modules must know about and agree on the implementations of the objects.
So, while building a single-process system in a modular way makes it easier to implement SOA later, it doesn't make it as simple as wrapping each module in a SOAP interface. At least, not unless you build your system in a coarse-grained manner from the start, which means throwing away a number of sound and helpful software engineering practices.
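The pass-by-value point can be illustrated with a job object sent over a simulated wire. JSON stands in for whatever serialization the network layer would use, and the `job` object and `handlers` table are hypothetical.

```javascript
// In-process: the scheduler only needs something with a run() method.
const job = {
  name: "nightly-report",
  run() {
    return `ran ${this.name}`;
  },
};

// Across a network the object travels by value, here simulated with JSON.
const wire = JSON.stringify(job);
const received = JSON.parse(wire);

console.log(wire);                // {"name":"nightly-report"}
console.log(typeof received.run); // undefined: the behavior did not travel

// The receiver needs an agreed-upon data shape plus its own implementation.
const handlers = {
  "nightly-report": (data) => `ran ${data.name}`,
};
console.log(handlers[received.name](received)); // ran nightly-report
```

Only the data crosses the boundary, so sender and receiver have to agree on the concrete shape of `job` instead of just an interface.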