Is there a good C++ framework to implement XA distributed transactions?
With the term "good" I mean usable, simple (doesn't imply "easy"), well-structured.
For study purposes, I'm currently proceeding with a personal implementation, following the X/Open XA specification.
Thank you in advance.
I am not aware of an open-source or free transaction monitor that has any degree of maturity, although this link does have some fan-out. The incumbent commercial ones are BEA's Tuxedo, Tibco's Enterprise Message Service (really a transactional message queue manager like IBM's MQ) and Transarc's Encina (now owned by IBM). These systems are all very expensive.
If you want to make your own (and incidentally make a bit of a name for yourself by filling a void in the open-source software space) get a copy of Gray and Reuter, Transaction Processing: Concepts and Techniques.
This is the definitive work on transaction processing systems architecture, written by two of the foremost experts in the field.
Interestingly, they claim that one can implement a working TP monitor in around 10,000 lines of C. This actually sounds quite reasonable, as what it does is not all that complex. On occasion I have been tempted to try.
Essentially you need to make a distributed transaction coordinator that runs as a daemon process. You will need to get the resource manager protocol working from it, so getting that working as a prototype is probably a good place to begin. If you can get it to independently roll back or commit a transaction, you have the basis of the TM-RM interface.
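For concreteness, here is a minimal sketch of the control flow the TM-RM interface boils down to. It uses a simplified ResourceManager abstraction of my own rather than the spec's xa_switch_t function-pointer table, and it omits the durable decision log and recovery path a real coordinator needs:

```cpp
// Sketch of a two-phase commit driver over a simplified RM abstraction.
// The real XA spec works through xa_switch_t function pointers (xa_prepare,
// xa_commit, xa_rollback, ...) keyed by an XID; this only shows the control flow.
#include <vector>

struct Xid { long format_id; /* plus gtrid/bqual data in the real spec */ };

class ResourceManager {                     // illustrative interface, not the spec's
public:
    virtual ~ResourceManager() = default;
    virtual bool prepare(const Xid&) = 0;   // phase 1: vote commit or rollback
    virtual void commit(const Xid&) = 0;    // phase 2a
    virtual void rollback(const Xid&) = 0;  // phase 2b
};

// Returns true if the global transaction committed.
bool two_phase_commit(const Xid& xid, std::vector<ResourceManager*>& rms) {
    // Phase 1: ask every RM to prepare; any "no" vote aborts the transaction.
    for (auto* rm : rms) {
        if (!rm->prepare(xid)) {
            for (auto* r : rms) r->rollback(xid);
            return false;
        }
    }
    // Phase 2: all voted yes, so the decision is logged (omitted here) and
    // every RM is told to commit.
    for (auto* rm : rms) rm->commit(xid);
    return true;
}
```

A real TM must force the commit decision to a durable log between the two phases and retry resource managers that fail after voting yes; that recovery machinery is where most of the effort (and most of Gray and Reuter's 10,000 lines) goes.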
The XA API as defined in the spec is the API to control the transaction manager. Strictly speaking, you don't need to make a 3-tier architecture to use distributed transactions of this sort, but they are more or less pointless without a TP monitor. How you communicate from the front-end to the middle-tier can be left as an exercise for the reader. You are probably best off using an existing ORB, of which there are several good open-source implementations available.
Depending on whether you want to make the DTC and the app server separate processes (which is possibly desirable for stability but not strictly necessary) you could also use ACE as a basis for the DTC server.
If you want to make a high-performance middle-tier server, check out Douglas Schmidt's ACE framework. This comes with an ORB called TAO, and is flexible enough to allow you to use more or less any threading model that takes your fancy. Using this is a trade-off between learning it and the effort of writing your own and debugging all the synchronisation and concurrency issues.
Maybe quite late for your task, but it can be useful for other users: LIXA is not a "framework", but it provides an implementation for the TX Transaction Demarcation specification and supports most of the XA features.
The TX specification is for C and COBOL languages, but the integration of the C version inside a C++ project should be effortless.
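For flavour, a transaction demarcated through the TX verbs that LIXA implements looks roughly like this from C or C++. This is a minimal sketch: the resource managers are assumed to be configured in the LIXA profile beforehand, the actual work against the database is elided, and real code checks every return value against the TX_* codes:

```cpp
// Minimal X/Open TX demarcation sketch (the interface LIXA implements).
// The RMs (e.g. a database) are assumed to be configured in LIXA's profile;
// real code checks every return code against TX_OK / TX_* values.
extern "C" {
#include <tx.h>          // tx_open, tx_begin, tx_commit, tx_rollback, tx_close
}

int main() {
    if (tx_open() != TX_OK)     // attach to the configured resource managers
        return 1;

    tx_begin();                 // start a global (XA) transaction
    bool work_ok = true;        // placeholder for real work against the RMs
    // ... run SQL or other resource-manager operations here ...
    if (work_ok)
        tx_commit();            // two-phase commit across all enlisted RMs
    else
        tx_rollback();

    tx_close();
    return 0;
}
```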
Another option is the open-source Enduro/X distributed transaction processing framework, which allows you to write simple C/C++ services that operate with resource managers (e.g. databases) and gives you the ability to commit or abort work done by several different executables on the same or different physical servers, each working with different resources/databases.
Internally, XA two-phase commit (2PC) is used.
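A rough sketch of what this looks like in code, using the XATMI calls that Enduro/X exposes. The service names ("DEBIT", "CREDIT"), the STRING buffer payloads and the 60-second timeout are all made up for illustration, and error handling is minimal:

```cpp
// Sketch of tying two service calls into one global transaction via XATMI
// (the API Enduro/X implements). Service names and payloads are illustrative.
#include <atmi.h>
#include <cstring>

// Helper: invoke one service with a STRING buffer; returns false on failure.
// The const_cast is only because the classic XATMI prototypes take char*.
static bool call_service(const char* svc, char** buf, const char* payload) {
    long rsplen = 0;
    std::strcpy(*buf, payload);
    return tpcall(const_cast<char*>(svc), *buf, 0, buf, &rsplen, 0) != -1;
}

int transfer() {
    char btype[] = "STRING";
    char* buf = tpalloc(btype, nullptr, 128);
    if (buf == nullptr) return -1;

    if (tpbegin(60, 0) == -1) { tpfree(buf); return -1; }   // 60 s timeout

    bool ok = call_service("DEBIT",  &buf, "account=A;amount=100") &&
              call_service("CREDIT", &buf, "account=B;amount=100");

    if (ok)
        tpcommit(0);          // XA two-phase commit across both services' RMs
    else
        tpabort(0);

    tpfree(buf);
    return ok ? 0 : -1;
}
```

Each service lives in its own executable and can talk to its own database; the tpbegin/tpcommit pair is what stitches their work into one atomic unit.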
Background:
I want to create an automation framework in C++ where on the one hand "sensors" and "actors" and on the other "logic engines" can be connected to a "core".
The "sensors" and "actors" might be connected to the machine running the "core", but some might also be accessible via a field bus or via normal computer network. Some might work continuous or periodically (e.g. every 100 milliseconds a new value), others might work event driven (e.g. only when a switch is [de]activated a message will come with the new state).
The "logic engine" would be sort of pluggable into the core and e.g. consist out of embedded well known script languages (Perl, Python, Lua, ...). There will run different little scripts from the users that can subscribe to "sensors" and write to "actors".
The "core" would route the sensor/actor informations to the subscribed scripts and call them. Some just after the event occurred, others periodically as defined in a scheduler.
Additional requirements:
The systems ("server") running this automation application might also be quite
small (500MHz x86 and 256 MB RAM) or if possible even tiny (OpenWRT
based router) as power consumption is an issue
=> efficiency is important
=> multicore support not for the moment, but I'm sure it'll become important soon - so the design has to support it
Some sort of fail-safe mode has to be possible, e.g. two systems monitoring each other
The application/framework will be GPL => all libraries used have to be license-compatible
The server would run Linux, but cross-platform support would be nice
The big question:
What is the best architecture for such a kind of application / framework?
My reasoning:
To avoid reinventing the wheel, I was wondering whether to use MPI to do all the event handling.
This would allow me to focus on the relevant stuff and not on the message handling, especially when two or more "servers" work together (acting as a watchdog for each other, as well as each having a few sensors and actors connected). Each sensor and actor handler, as well as the logic engines themselves, would only be required to implement a predefined MPI-based interface and would thus be crash-safe. The core could restart any of them when it is no longer responsive.
The additional questions:
Would that even be possible with MPI? (It'd be used a bit out of context...)
Would the overhead of MPI be too big? Should I just write it myself using sockets and threads?
Are there other libraries that are better suited to this case?
You should be able to construct your system using MPI, but I think MPI is too focused on high-performance computing. Moreover, since it was designed for C, it does not really fit an object-oriented way of programming. IMO there are other approaches better suited to your needs:
Boost ASIO might be a good fit for designing your system. It includes network functionality and helps with event-driven programming (which could be a good way to design your system). You can have a look at the Think-Async webpage for some examples of using ASIO for event-driven programming.
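For example, the periodic "every 100 milliseconds a new value" sensor from your question maps naturally onto an Asio timer loop. A minimal sketch, where poll_sensor() is a stand-in for your real field-bus or network I/O (assuming a reasonably recent Boost; older releases spell these io_service and expires_from_now):

```cpp
// Minimal Boost.Asio sketch: poll a "sensor" every 100 ms from a single-threaded
// event loop. poll_sensor() is a placeholder for real field-bus/network I/O.
#include <boost/asio.hpp>
#include <chrono>
#include <iostream>

void poll_sensor() { std::cout << "sensor value read\n"; }   // stand-in

void schedule(boost::asio::steady_timer& timer) {
    timer.expires_after(std::chrono::milliseconds(100));     // expires_from_now in old Boost
    timer.async_wait([&timer](const boost::system::error_code& ec) {
        if (ec) return;          // timer cancelled or error
        poll_sensor();
        schedule(timer);         // re-arm for the next period
    });
}

int main() {
    boost::asio::io_context io;                 // io_service in older Boost
    boost::asio::steady_timer timer(io);
    schedule(timer);
    io.run();                                   // event loop; add sockets,
}                                               // signals, etc. as further handlers
```

Event-driven sensors (your switch example) would hang asynchronous reads on sockets in the same io_context, so one loop can serve both periodic and event-driven sources.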
You could also use plain threads and borrow the network capabilities from ASIO (without using the event-driven programming parts). If you can use C++11, then you can directly use std::thread and all the other functionality available (mutex, conditional variables, futures, etc.). If you cannot use C++11, you can always use Boost Thread.
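As a sketch of the plain-threads route, here is a sensor thread pushing readings into a queue that the core drains; the names and the int payload are illustrative:

```cpp
// Plain C++11 sketch: one "sensor" thread produces readings, the "core" thread
// consumes and dispatches them. Names and payload type are illustrative.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

std::queue<int> readings;
std::mutex m;
std::condition_variable cv;
bool done = false;

void sensor() {                          // producer
    for (int v = 0; v < 5; ++v) {
        { std::lock_guard<std::mutex> lk(m); readings.push(v); }
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
}

void core() {                            // consumer / dispatcher
    std::unique_lock<std::mutex> lk(m);
    while (true) {
        cv.wait(lk, [] { return !readings.empty() || done; });
        while (!readings.empty()) {
            std::cout << "dispatch reading " << readings.front() << "\n";
            readings.pop();
        }
        if (done) break;
    }
}

int main() {
    std::thread s(sensor), c(core);
    s.join();
    c.join();
}
```

The same queue shape works per subscription, with the core fanning events out to each script's queue.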
Finally, if you really want to go for MPI, you can have a look at Boost MPI. At least you will have a much more C++ friendly way of using MPI.
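For completeness, a minimal Boost.MPI sketch; it assumes an MPI implementation is installed and the program is launched with something like mpirun -n 2:

```cpp
// Minimal Boost.MPI sketch: rank 0 sends a string to rank 1.
// Build against an MPI implementation and run with e.g. `mpirun -n 2 ./a.out`.
#include <boost/mpi.hpp>
#include <boost/serialization/string.hpp>   // needed to send std::string
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    boost::mpi::environment env(argc, argv);
    boost::mpi::communicator world;

    if (world.rank() == 0) {
        world.send(1, /*tag=*/0, std::string("sensor event"));
    } else if (world.rank() == 1) {
        std::string msg;
        world.recv(0, /*tag=*/0, msg);
        std::cout << "rank 1 received: " << msg << "\n";
    }
}
```

Note that stock MPI typically aborts the whole job when one rank dies, which matters for your watchdog/restart requirement; that is worth prototyping early if you go this route.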
In my workplace (and in a lot of other places), there is a lot of emphasis on building architecture around services. (I am working in an e-commerce startup.) However, I think services are implicitly considered to be distributed. I am a believer in the first law of distribution: "don't distribute". So I believe that we should not unnecessarily complicate the architecture; it should be an architecture which can evolve.

One way to approach the problem would be to create well-defined namespaces and build code around them, but keep the communication via a Java API (this keeps the monitoring requirement low and the reliability/availability problems low). This can easily be evolved into a distributed architecture by wrapping modules as web services, as and when the scale requirements kick in. So the question is: what are the cons of writing code as a single application and evolving into distributed services, rather than jumping straight into implementing a web-services-based architecture? Am I right in assuming that services should imply the basic principles of design (abstraction, encapsulation etc.), rather than distribution over a network?
Distribution requires modularity. However, it requires more than just modularity: it also requires coarse-grained interaction between the modules.
For example, in a single-process ecommerce system, you might have separate modules for managing the user's shopping cart and calculating prices. They might interact by the cart asking the calculator to price an item, then another item, etc. That would be perfectly fine.
However, in a distributed system, that would require a torrent of small method calls, which is inefficient; you might get away with it if you used CORBA for distribution, but with SOAP, you'd be in trouble. Rather, you would want to have the cart ask the calculator to price the whole order in one go. That might be worse from a separation of concerns point of view (why should the calculator have to know about the idea of carts?), but it would be required to make the system perform adequately.
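In interface terms, the difference looks roughly like this (a sketch; the types and method names are made up for illustration):

```cpp
// Sketch of the granularity difference; all names are illustrative.
#include <vector>

struct LineItem { int product_id; int quantity; };
struct Money    { long cents; };

// Fine-grained: fine inside one process, a torrent of remote calls otherwise.
class ItemPriceCalculator {
public:
    virtual Money priceItem(const LineItem& item) = 0;   // called once per item
    virtual ~ItemPriceCalculator() = default;
};

// Coarse-grained: one round trip per order, which is what a distributed
// (SOAP/HTTP) boundary more or less forces on you.
class OrderPriceCalculator {
public:
    virtual Money priceOrder(const std::vector<LineItem>& order) = 0;
    virtual ~OrderPriceCalculator() = default;
};
```

The second shape is less pure from a separation-of-concerns standpoint, but it is the one that survives a network boundary.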
Related to granularity, there's also the problem of modules interacting via interfaces or implementations. Within a single process, you can define a set of interfaces through which modules will interact; modules can pass each other objects implementing those interfaces without having to tell each other about the implementations (e.g. a scheduler module could be passed anything implementing interface Job { void run(); }). Across a network, the requirement for coarse grain means that any objects passed must be passed by value (because passing by reference would entail fine-grained calls back to the passing module - unless you were using mobile code, which you aren't, because nobody is), which means that both modules must know about and agree on the implementations of the objects.
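Concretely, and again only as a sketch: in-process the scheduler can accept any behaviour behind an interface, while across a network it has to accept a self-contained value whose layout both sides agree on.

```cpp
// Sketch: in-process, callers pass behaviour behind an interface; across a
// network boundary they must pass agreed-upon data instead. Names illustrative.
#include <string>

// In-process: the scheduler neither knows nor cares what the Job really is.
class Job {
public:
    virtual void run() = 0;
    virtual ~Job() = default;
};
void schedule(Job& job);            // fine-grained calls back into the caller

// Distributed: the "job" must be serialisable data both modules understand,
// and the remote side needs its own code to interpret it.
struct JobDescription {
    std::string command;            // e.g. a command name plus parameters
    std::string parameters;
};
void schedule_remote(const JobDescription& job);   // passed by value on the wire
```

Both sides now have to know and agree on JobDescription's layout, which is exactly the coupling the in-process interface avoided.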
So, while building a single-process system in a modular way makes it easier to implement SOA later, it doesn't make it as simple as wrapping each module in a SOAP interface. At least, not unless you build your system in a coarse-grained manner from the start, which means throwing away a number of sound and helpful software engineering practices.
We have been building three-tier architectures for over a decade now. Dividing the presentation, logic and data tiers is supposed to allow us to exchange each layer individually, should the need ever arise, be it through changed requirements or new technologies.
I have never seen it working in practice...
Mostly because of (at least) one of the following reasons:
The three-tier concept was only visible in the source code (e.g. package naming in Java), which was then deployed as one tied-together package.
The code representing each layer was nicely bundled in its own deployable format but then thrown into the same process (e.g. an "enterprise container").
Each layer was run in its own process, sometimes even on different machines, but because of the static nature of their connections to each other, replacing one of them meant breaking all of them.
Thus what you usually end up with is a monolithic, tightly coupled system that does not deliver what its architecture promised.
I therefore think "three-tier architecture" is a total misnomer. The true benefit it brings is that the code is logically sound. But that's at "write time", not at "run time". A better name would be something like "layered by responsibility". In any case, the "architecture" word is misleading.
What are your thoughts on this? How could a working three-tier architecture be achieved? By that I mean one which keeps its promise: allowing you to swap out a layer without affecting the other ones. The system should survive that and be in a well-defined state afterwards.
Thanks!
The true purpose of layered architectures (both logical and physical tiers) isn't to make it easy to replace a layer (which is quite rare), but to make it easy to make changes within a layer without affecting the others (and as Ben notes, to facilitate scalability, consistency, and security) - which works all the time all around us.
One example of a 3-tier architecture is a typical database-driven web application:
End-user's web browser
Server-side web application logic
Database engine
In every system, there is the nice, elegant architecture dreamed up at the beginning, then the hairy mess when it's finally in production, full of hundreds of bug fixes and special-case handlers, and other typical nasty changes made to address specific issues not realized during the design.
I don't think the problems you've described are specific to three-tier architecture at all.
If you haven't seen it working, you may just have bad luck. I've worked on projects that serve several UIs (presentation) from one web service (logic). In addition, we swapped data providers via configuration (data) so we could use a low-cost database while developing and Oracle in higher environments.
Sure, there's always some duplication - maybe you add validation in the UI for responsiveness and then validate again in the logic layer - but overall, a clean separation is possible and nice to work with.
Once you accept that n-tier's major benefits--namely scalability, logical consistency, security--could not easily be achieved through other means, the question of whether or not any of the tiers can be replaced outright without breaking the others becomes more like asking whether there's any icing on the cake.
Any operating system will have a similar kind of architecture, or else it won't work. The presentation layer is independent of the hardware layer, which is abstracted into drivers that implement a certain interface. The data is handled using logic that changes depending on the type of data being read (think NTFS vs. FAT32 vs. EXT3 vs. CD-ROM). Linux can run on just about any hardware you can throw at it and it will still look and behave the same because the abstractions between the layers insulate each other from changes within a single layer.
One of the biggest practical benefits of the 3-tier approach is that it makes it easy to split up work. You can easily have a DBA and a business analyst or two building a data layer, a traditional programmer building the server-side app code, and a graphic designer/web designer building the UI. The three teams still need to communicate, of course, but this allows for much smoother development in most cases. In this regard, I see the 3-tier approach working reliably every day, and that is enough for me, even if I cannot count on "interchangeable parts", so to speak.
Call me a troll if you want, but I'm serious: how exactly is the new SOA trend any different than the client/server architecture that I was building 15 years ago? I keep hearing SOA but I don't see how it's different than what we've always done.
Back 10 years ago, my company had multiple clients (in multiple languages) which talked to the same service. It wasn't XML (it was a binary protocol called Microsoft DCOM) and there wasn't auto-discovery through WSDL but that's OK since reading the docs was just as easy. Our system was even "open" in the sense we documented it enough to allow 3rd parties to talk to our services. We were not pioneers - every other company I knew 10 years ago was doing the same thing.
The ONLY difference I see between then and now is that now there's a single service available on the internet, whereas 10 years ago, each customer would host his own instance of the service. But that's not an architecture issue - where the service physically lives is transparent to anyone using the service.
So what exactly is SOA that's different than what we've been doing for years? Is SOA simply a marketing term representing a best practice that actually became common a long, long time ago? Or am I missing some subtlety to SOA that's different than what we've been doing all along?
Forget about XML. Forget about WSDL. SOA is not a technology you can buy, though it's often marketed that way.
The real point of SOA is all about IT organization. The point of SOA is to avoid having a huge bunch of "applications" that have isolated data pools and either don't talk to each other at all (and thus often duplicate data), or only in an inefficient, buggy way through adapter layers or EAI systems.
For large companies, this is a serious problem - they have literally hundreds of separate apps that are insufficiently integrated. There's duplicate and inconsistent data everywhere and the result is that customers get pissed off and real money is lost because the billing department keeps sending invoices for a cancelled order and the customer service rep can't even find the order because it's cancelled in the order tracking system, but not the billing system.
SOA is supposed to solve this by designing every app from the ground up to publish its services in a standardized, cross-platform manner so that other apps can access the data and don't have to duplicate it.
From a business perspective, this is highly desirable. The buzzword hype and the acronym soup is just IT companies' attempts to cash in on that desirability. Unfortunately, this has (mis)led many people, including CEOs into believing that SOA is a product you can buy and it will magically make your IT more efficient, without realizing that this will only happen if you also reorganize your entire IT (and quite possibly your business units as well) to be SOA-compatible.
Let me use the famous whipping boy of Integration Hell: Telco.
Way back in the 90's, cell phone companies were plethoric in my neighborhood, almost as plentiful as the long distance resellers made possible by the communications deregulation of the mid 90's. Well, time goes on, and Bell Atlantic becomes the powerhouse that is Verizon, and swallows up company after company (and at least one Baby Bell). Every single one of these companies has technologies in place, in towers, in switching equipment, in billing systems that are COMPLETELY incompatible with one another.
So the company goes off and says, okay, we have these models for how we do business, let's put a friendly, consistent face on ALL of our technology in the form of WSDL/SOAP/XSD - every language and system we have today can be interfaced to this! Slowly but surely, the company is making all of its systems capable of reporting on capabilities, being interrogated for load and billing purposes, and being exposed for future visionaries to exploit in manners that haven't been accounted for yet.
Anyone can build a SOA client. Anyone with wget and a text editor. And anyone can parse the results (XML).
That is what's fundamentally different from past client/server architectures. I was just talking the other day to someone about interfacing Cobol and Smalltalk based systems to SOA architectures. That's an easy problem to solve. Tell me you can say the same for your DCOM systems.
SOA is nothing but a way of designing, in which the modules communicate with each other through "services". It is just that, and now the next question is: what exactly is a "service", and how does it differ from a regular "method"?
A service is an operation that performs a single, atomic business operation. This atomicity makes it highly reusable from many modules. A complex business operation is then just the orchestration of the invocation of many of these services in a specific order.
SOA has nothing to do with a specific technology; it is just a specific way of designing.
Professor Frank Leymann from the University of Stuttgart treats SOA as a key concept in his Service-Oriented Computing (SOC) research work. In one interview he is asked about the definition of SOA, and the ensuing conversation could be a good read:
Please note that our roadmap is about "service oriented computing (SoC)", i.e. the compute paradigm behind service-orientation. Service Oriented Architecture (SOA) is an architectural realization of this compute paradigm. You may compare this with "client/server computing" as paradigm and "browser/web server" or "DB-client/stored procedure" as two (of various other) architectural realizations of this paradigm.
...
SOA is not completely new. Some individual aspects of SOA have been used in practice for a long time. For example, take a look at "loose coupling": enterprises have been using reliable messaging technology for decades to integrate applications, i.e. to loosely couple them. Don't get me wrong, there are new concepts in SOA, e.g. concepts resulting from the combination of concepts put together in SOA, i.e. they result from emergence.
Web Service specifications make the corresponding technologies available cross platform. I.e. the corresponding specifications do not invent fundamentally new concepts but define how these concepts and corresponding implementations work in heterogeneous environments. The resulting interoperability is groundbreaking, making SOA real.
In summary, SOA is a mixture of mature things and new emerging things.
There is also a SoC paper reference dated April 2006.
A Google search turns up Prof. Frank Leymann and his works.
Neal Ford has many strong opinions regarding SOA. You might find his viewpoint interesting.
Tactics vs. Strategy (SOA & The Tarpit of Irrelevancy)
Standards Based vs. Standardized (SOA & the Tarpit of Irrelevancy)
Tools & Anti-Behavior (SOA & the Tarpit of Irrelevancy)
Rubick's Cubicle (SOA & the Tarpit of Irrelevancy)
The Triumph of Hope over Reason (SOA & The Tarpit of Irrelevancy)
Guerrilla SOA (SOA & The Tarpit of Irrelevancy)
I think SOA is both a marketing term and an integration of existing solutions, with the idea that instead of selling the whole piece of software or the machine, we sell the services.
For me, a Service Oriented Architecture comes about when an Enterprise wishes to integrate a selection of disparate applications which concern a common domain into a set of interoperable services which operate against a single data source.
In the case of a new startup company with an idea for an item of software/suite of softwares, I can't see how a company can kick off with a Service Oriented Architecture from the off. At first, each solution (which may well evolve into a service such that it may become interoperable) should seek to solve its problem space in isolation.
Perhaps it will be in the roadmap for an enterprise capability or suite for each solution to become an interoperable service as the solutions are completed and enter service. For this, perhaps the development teams will undertake a modular/component-oriented approach to building the solution (eventual service), so as to make it easier to include the solution as a service in a Service Oriented Architecture.
In the case where existing islands of software are to become interoperable services in a Service Oriented Architecture, the approach allows for the software items - which may be distributed and may be written in different languages - to communicate via an exposed API and/or common protocol (for example a flavor of Web Service) and generic data format (for example XML).
SOA is an approach or idea. It is not a framework or a tool. When WSDLs and EJBs get name-dropped, this is often forgotten... as is the fact that the idea of SOA is not new at all.
Most of the answers here seem to convey that SOA (Service-Oriented Architecture) is about building an application in a standardized manner so that other applications can interact with it in a platform-independent manner.
I am not sure if the meaning has changed since, but I have had an opportunity to work with a company that offers an SOA suite, and the following are my thoughts on it.
Of course, when you design an application you cannot guarantee it will be cross-platform compliant. Take, for example, stock trading systems. They use the FIX protocol to transfer messages. Do you expect them now to return data in XML format so that they can be so-called SOA compliant? Definitely not! SOA is an architectural approach that can help you decouple your applications/services and let them interact with each other. The backbone of SOA is an ESB (Enterprise Service Bus), which is used to transfer data from one service to another. The SOA architecture should take care of format conversions. For example:
FIX(Service 1) -> (XML ---ESB---> XML) -> JSON (Service 2)
These conversion modules are commonly called adapters and are generally part of the SOA suite (a toy sketch of such an adapter follows below). For a bit more information, refer to another answer:
Difference between SOA and ESB
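As a toy illustration of what such an adapter does (the tags, delimiter and output format are made up; real adapters work from full FIX dictionaries and schemas and handle errors):

```cpp
// Toy adapter sketch: turn a FIX-style "tag=value" message into a JSON string.
// The tags, delimiter and field handling here are made up for illustration.
#include <iostream>
#include <map>
#include <sstream>
#include <string>

std::string fix_to_json(const std::string& fix, char delim = '|') {
    std::map<std::string, std::string> fields;
    std::stringstream ss(fix);
    std::string pair;
    while (std::getline(ss, pair, delim)) {            // split "tag=value" pairs
        auto eq = pair.find('=');
        if (eq != std::string::npos)
            fields[pair.substr(0, eq)] = pair.substr(eq + 1);
    }
    std::string json = "{";
    for (auto it = fields.begin(); it != fields.end(); ++it) {
        if (it != fields.begin()) json += ",";
        json += "\"" + it->first + "\":\"" + it->second + "\"";
    }
    return json + "}";
}

int main() {
    // "35=D|55=IBM|38=100" becomes {"35":"D","38":"100","55":"IBM"}
    std::cout << fix_to_json("35=D|55=IBM|38=100") << "\n";
}
```

In a real suite this logic typically lives in the ESB's adapter configuration rather than in hand-written code.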
Sure, SOA as a word is hyped for marketing purposes. Technically speaking, it is as simple as serializing and de-serializing data so that services can be decoupled and platform-independent, but the idea behind it is concrete.
Also refer to the Wikipedia page on the subject.
In reality, SOA is a collection of well-defined services. Basically, SOA uses loosely coupled services to get the desired result easily. The implementation details of a service are hidden from the client/consumer, so any change in the implementation doesn't affect the service as long as the contract between them doesn't change. Service providers are components that execute some business logic based on predetermined inputs and outputs, and expose this functionality through an SOA implementation. This allows systems based on SOA to respond more quickly and cost-effectively to the business. The main difference between components and SOA is that SOA provides an open-standards message format which is not specific to any programming language or platform. As a result, you can achieve a high degree of loose coupling and interoperability across platforms and technologies. In a traditional client-server world, the provider will be a server and the consumer will be a client. You can read more about SOA here: Service Oriented Architecture (SOA)
A service-oriented architecture (SOA) is an architectural pattern in which software is designed as building blocks, i.e. modular development, which gives us the flexibility to assemble the pieces any way we want. If we want to start a new project, instead of starting from scratch we can reuse existing services, and if we want a new service we can easily integrate it with existing services to build the new project. So we can save a lot of time and money. The basic principles of service-oriented architecture are independent of vendors, products and technologies.
Analogy: toys built using Lego building blocks.
In fact, SOA also utilizes the client-server architecture. In addition, SOA is a way to design your software. Suppose that your application can be broken into simple and independent tasks like searching for a book, adding a new book, recommending a book according to user preferences, and so forth. If you provide a service (an API) for each task, you are effectively using SOA. The advantage of this architecture is that it doesn't matter whether you're building a web app or a mobile app; you only need the aforementioned services (APIs).
Service-oriented architecture (SOA) is a design approach where multiple services collaborate to provide some end set of capabilities. A service here typically means a completely separate operating system process. Communication between these services occurs via calls across a network rather than method calls within a process boundary.
SOA emerged as an approach to combat the challenges of the large monolithic applications. It is an approach that aims to promote the reusability of software; two or more end-user applications, for example, could both use the same services. It aims to make it easier to maintain or rewrite software, as theoretically we can replace one service with another without anyone knowing, as long as the semantics of the service don't change too much.
I am a total newbie to the world of SOA. As such, I am looking at some "SOA frameworks/technologies", and trying to understand how to utilize them to build a highly scalable (Facebook class) website.
There are several "pains" I am trying to solve here:
Composability (+ managing dependencies, Pub/Sub)
Language-independence of services
Scalability & Performance
High availability
I looked into some technologies that answer a subset of the above criteria:
Thrift - Facebook's cross-platform RPC platform
WCF - supports SOAP, JSON & REST, so it can be considered language-interoperable. Generates WSDL files that can be used to generate Java proxies.
Microsoft DSS - just included it in my survey, but it doesn't seem relevant as it is highly state-driven and .NET-specific.
Web Services
Now, I understand how I get some aspects of composability and language-independence out of the above. But, I haven't found much concrete information (not buzz) about how to use the above / other tools for scalability and high availability. So finally I get to my question:
How does one leverage SOA technologies to solve the pains I defined above? Where can I find technical guides for that? I am looking for more than just system diagrams, but rather actual libraries, code samples, APIs...
I think the question has more to do with the concepts involved than the tools. Answers to the items:
Understand and internalize bounded contexts. Keeping unrelated pieces separate is important to get real reuse across different services. Technologies won't help you with this; it's you who separates the app into appropriate contexts, which you can then reuse for different services.
Clear endpoints for communication based on known protocols allow you to implement different pieces with different technologies.
Having the operations flow like independent actions based on different protocols gives you a lot of places where you can add tiers. If a particular sub-process of the overall process is using lots of resources and the server can't take it anymore, just move it to a separate server. If load keeps growing and that server isn't keeping up either, add an additional server and load-balance. You also have more opportunities to use caching and connection pooling.
If you have a critical sub-process that needs to be available all the time, add a server so you can fail over. If you have an overall process that needs to be "available" all the time, use queues for pieces that can be processed later.
Do you really need to support that type of load? Set appropriate performance/load/interoperability targets that relate to the specific scenario. If you really need to support that type of load, I recommend you get someone on board who has dealt with it.
If it is something for a load that might eventually materialize, identify the bounded contexts and design the interaction between them with an SOA mindset. Keeping the code clean is all you have to do for the rest: use TDD, loose coupling, focused integration tests, etc. in your code base. With good code, if you later need to separate pieces of the system, it will be a lot easier.
There are interesting and relevant things said about services and architecture by Amazon's CTO - Werner Vogels:
An interview about the Amazon technology platform
A post on his blog - "Eventually Consistent - Revisited"
A 50 min presentation about Availability & Consistency
IDesign.net has a bunch of great downloads for WCF.
Worth a look at: http://www.manning.com/davis/ ?
Check out the Mule project which bundles the CXF services stack and also the Mule REST pack which provides RESTful alternatives. I think you'll see it addresses all of your pains and there are lots of examples in the documentation as well as the distribution.
May I recommend the book: Enterprise Service Oriented Architectures published by Springer Verlag.
All of the advice here is well and good, but don't worry about it until you really need to.
Focus on building a usable, functional application that people really like. When you start running into problems, then start handling bottlenecks.
You will never be able to foresee every way an application will fail, so how can you tell if [[insert tech here]] is your answer?