Background:
I want to create an automation framework in C++ where, on the one hand, "sensors" and "actors" and, on the other, "logic engines" can be connected to a "core".
The "sensors" and "actors" might be connected to the machine running the "core", but some might also be accessible via a field bus or via normal computer network. Some might work continuous or periodically (e.g. every 100 milliseconds a new value), others might work event driven (e.g. only when a switch is [de]activated a message will come with the new state).
The "logic engine" would be sort of pluggable into the core and e.g. consist out of embedded well known script languages (Perl, Python, Lua, ...). There will run different little scripts from the users that can subscribe to "sensors" and write to "actors".
The "core" would route the sensor/actor informations to the subscribed scripts and call them. Some just after the event occurred, others periodically as defined in a scheduler.
Additional requirements:
The systems ("server") running this automation application might also be quite
small (500MHz x86 and 256 MB RAM) or if possible even tiny (OpenWRT
based router) as power consumption is an issue
=> efficiency is important
=> multicore support not for the moment, but I'm sure it'll become important soon - so the design has to support it
Some sort of fail save mode has to be possible, e.g. two systems monitoring each other
application / framework will be GPL => all used libraries have to be compatible
the server would run Linux, but cross platform would be nice
The big question:
What is the best architecture for such a kind of application / framework?
My reasoning:
To avoid reinventing the wheel, I was considering using MPI to do all the event handling.
This would allow me to focus on the relevant stuff rather than on message handling, especially when two or more "servers" work together (acting as watchdogs for each other, with each having a few sensors and actors connected). Each sensor and actor handler, as well as the logic engines themselves, would only be required to implement a predefined MPI-based interface and would thus be crash-safe: the core could restart each one when it is no longer responsive.
The additional questions:
Would that even be possible with MPI? (It'd be used a bit out of context...)
Would the overhead of MPI be too big? Should I just write it myself using sockets and threads?
Are there other libraries that would be better suited in this case?
You should be able to construct your system using MPI, but I think MPI is too focused on high-performance computing. Moreover, since it was designed for C, it does not really fit an object-oriented style of programming. IMO there are other approaches better suited to your needs:
Boost.Asio might be a good fit for designing your system. It includes both networking functionality and support for event-driven programming (which could be a good way to design your system). You can have a look at the Think-Async webpage for some examples of using Asio for event-driven programming.
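To make the event-driven style concrete, here is a minimal sketch of a periodic sensor poll built on a repeating Asio timer (assuming Boost 1.66+ naming such as io_context and expires_after; the sensor read itself is a placeholder):

```cpp
#include <boost/asio.hpp>
#include <chrono>
#include <iostream>

// Hypothetical periodic sensor poll: re-arms a steady_timer every
// 100 ms, matching the "new value every 100 milliseconds" case.
void poll_sensor(boost::asio::steady_timer& timer)
{
    timer.async_wait([&timer](const boost::system::error_code& ec) {
        if (ec) return;                         // timer was cancelled
        std::cout << "read sensor, notify subscribed scripts\n";
        timer.expires_after(std::chrono::milliseconds(100));
        poll_sensor(timer);                     // re-arm for the next cycle
    });
}

int main()
{
    boost::asio::io_context io;
    boost::asio::steady_timer timer(io, std::chrono::milliseconds(100));
    poll_sensor(timer);
    io.run();   // single-threaded event loop drives all handlers
}
```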
You could also use plain threads and borrow the network capabilities from Asio (without using the event-driven programming parts). If you can use C++11, you can directly use std::thread and all the other functionality available (mutexes, condition variables, futures, etc.). If you cannot use C++11, you can always use Boost.Thread.
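A rough sketch of that plain-threads approach (C++11 only, with an invented fixed workload of ten events): a sensor-reader thread hands values to the core over a mutex-protected queue.

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical sensor-event queue shared between a reader thread and
// the core loop, built from C++11 primitives only.
std::queue<int> events;
std::mutex m;
std::condition_variable cv;

void sensor_reader()                    // producer: pushes sensor values
{
    for (int value = 0; value < 10; ++value) {
        {
            std::lock_guard<std::mutex> lock(m);
            events.push(value);
        }
        cv.notify_one();
    }
}

void core_loop()                        // consumer: dispatches to scripts
{
    for (int handled = 0; handled < 10; ++handled) {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return !events.empty(); });
        int value = events.front();
        events.pop();
        std::cout << "route value " << value << " to subscribers\n";
    }
}

int main()
{
    std::thread producer(sensor_reader);
    std::thread consumer(core_loop);
    producer.join();
    consumer.join();
}
```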
Finally, if you really want to go for MPI, have a look at Boost.MPI. At least you will have a much more C++-friendly way of using MPI.
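For comparison, a Boost.MPI point-to-point exchange looks roughly like this (the sensor/core role split is my own example; sending std::string relies on Boost.Serialization being linked in):

```cpp
#include <boost/mpi.hpp>
#include <string>

// Minimal Boost.MPI point-to-point exchange: rank 0 plays a sensor
// handler, rank 1 plays the core. Run under mpirun with >= 2 ranks.
int main(int argc, char* argv[])
{
    boost::mpi::environment env(argc, argv);
    boost::mpi::communicator world;

    if (world.rank() == 0) {
        world.send(1, 0, std::string("switch activated"));  // dest 1, tag 0
    } else if (world.rank() == 1) {
        std::string msg;
        world.recv(0, 0, msg);                              // src 0, tag 0
    }
}
```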
Related
We are going to write a concurrent program in Clojure that will extract keywords from a huge amount of incoming mail and cross-check them against a database.
One of my teammates has suggested to use Erlang to write this program.
I should note that I am new to functional programming, so I am in some doubt about whether Clojure is a good choice for writing this program, or whether Erlang is more suitable.
Do you really mean concurrent or distributed?
If you mean concurrent (multi-threaded, multi-core etc.), then I'd say Clojure is the natural solution.
Clojure's STM model is perfectly designed for multi-core concurrency, since it is very efficient at storing and managing shared state between threads. If you want to understand more, it is well worth looking at this excellent video.
Clojure STM allows safe mutation of data by concurrent threads. Erlang sidesteps this problem by making everything immutable, which is fine in itself but doesn't help when you genuinely need shared mutable state. If you want shared mutable state in Erlang, you have to implement it with a set of message interactions, which is neither efficient nor convenient (that's the price of a shared-nothing model...).
You will get inherently better performance with Clojure in a concurrent setting on a large machine, since Clojure doesn't rely on message passing and hence communication between threads can be much more efficient.
If you mean distributed (i.e. many different machines sharing work over a network which are effectively running as isolated processes) then I'd say Erlang is the more natural solution:
Erlang's immutable, shared-nothing, message-passing style forces you to write code in a way that can be distributed. So idiomatic Erlang can automatically be distributed across multiple machines and run in a distributed, fault-tolerant setting.
Erlang is therefore very well optimised for this use case, so would be the natural choice and would certainly be the quickest to get working.
Clojure could do it as well, but you will need to do much more work yourself (i.e. you'd either need to implement or choose some form of distributed computing framework) - Clojure does not currently come with such a framework by default.
In the long term, I hope that Clojure develops a distributed computing framework that matches Erlang - then you can have the best of both worlds!
The two languages and runtimes take different approaches to concurrency:
Erlang structures programs as many lightweight processes communicating with one another. In this case, you will probably have a master process sending jobs and data to many workers, and more processes to handle the resulting data.
Clojure favors a design where several threads share data and state using common data structures. It sounds particularly suitable for cases where many threads access the same data (read-only) and share little mutable state.
You need to analyze your application to determine which model suits you best. This may also depend on the external tools you use -- for example, the ability of the database to handle concurrent requests.
Another practical consideration is that Clojure runs on the JVM, where many open-source libraries are available.
Clojure is a Lisp running on the JVM. Erlang is designed from the ground up to be highly fault-tolerant and concurrent.
I believe the task is doable with either of these languages and many others as well. Your experience will depend on how well you understand the problem and how well you know the language. If you are new to both, I'd say the problem will be challenging no matter which one you choose.
Have you thought about something like Lucene/Solr? It's great software for indexing and searching documents. I don't know what "cross checking" means for your context, but this might be a good solution to consider.
My approach would be to write a simple test in each language and test the performance of each one. Both languages are somewhat different to C style languages and if you aren't used to them (and you don't have a team that is used to them) you may end up with a maintenance nightmare.
I'd also look at using something like Groovy 1.8. Groovy now includes GPars to enable parallel computing. String and file manipulation in Groovy is very easy indeed.
It depends on what you mean by huge.
Strings in Erlang are painful...
but:
If huge means tens of distributed machines, then go with Erlang and write the workers in text-friendly languages (Python? Perl?). You will have a distributed layer on top with highly concurrent local workers. Each worker would be represented by an Erlang process. If you need more performance, rewrite the workers in C. In Erlang it is super easy to talk to other languages.
If huge still means one strong machine, go with the JVM. That is not huge, then.
If huge means hundreds of machines, I think you will need something stronger, Google-like (Bigtable, map/reduce), probably on a C++ stack. Erlang is still OK, but you will need good devs to code it.
I am currently looking for a discrete event simulator written for C++. I did not find much on the web written specifically in OO style; there are some, but they are outdated. Others, such as Opnet, Omnet and ns3, are way too complicated for what I need to do. Besides, I need to simulate agent-based algorithms on systems of thousands of nodes.
Does anybody know anything suitable for my needs?
Others have good direct answers, but I'm going to suggest an alternative. If I understand you right, you want a system in C++ or such where you can post events that fire in the future, and code is run when those events fire.
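For concreteness, the core of such an event system is small: a priority queue of (time, callback) pairs popped in time order. A minimal sketch, with names of my own invention:

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

// An event is a callback scheduled for a point in simulated time.
struct Event {
    double time;
    std::function<void()> action;
    bool operator>(const Event& other) const { return time > other.time; }
};

class Simulator {
public:
    double now() const { return now_; }
    void post(double delay, std::function<void()> action) {
        queue_.push({now_ + delay, std::move(action)});
    }
    void run() {
        while (!queue_.empty()) {
            Event e = queue_.top();
            queue_.pop();
            now_ = e.time;      // advance simulated time, then fire
            e.action();
        }
    }
private:
    double now_ = 0.0;
    // min-heap ordered by event time
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> queue_;
};

int main() {
    Simulator sim;
    sim.post(2.0, [&] { std::printf("t=%.1f: second event\n", sim.now()); });
    sim.post(1.0, [&] { std::printf("t=%.1f: first event\n", sim.now()); });
    sim.run();
}
```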
I had a project to do like this, and I started out trying to write such an event system in C++ and then quickly realized I had a better solution.
Have you considered writing your program in behavioral Verilog? That may seem strange to write software in a hardware description language, but a Verilog simulator is an event-based system underneath, and behavioral Verilog is a very convenient way to express events, timing, triggers, etc. There is a free Verilog simulator (which is what I used) called Icarus Verilog. If you're not using Ubuntu or some Linux distro with Icarus already in a package, building from source is straightforward.
I would recommend having a second look at OMNeT++. At first sight it may look quite complex, but if you look into it in more detail you will find that most of the complexity is in the network add-on (the INET Framework). Unless you are going to do a detailed network simulation, you do not need INET.
Using the OMNeT++ core is not especially difficult, and it may be simpler than other similar tools.
You may want to have a look at an intro.
One of the things that makes OMNeT++ attractive to me is its scalability. It is possible to run large simulations on a desktop. Besides, it is possible to scale the same simulation up to a cluster without rewriting the code.
You should consider SystemC, although I'd also recommend taking a second look at OMNeT++.
We use SIMLIB at my school. It is a very fast, easy-to-understand, object-oriented, discrete and continuous simulator. It might look outdated, but it is still maintained.
There is CSIM from Mesquite Software, which supports developing models in C, C++ and Java. However, it is paid commercial software, AFAIK.
Take a look at GBL library. It's written in modern C++ and even supports C++0x features like move semantics and lambda functions. It offers several modeling mechanisms: synchronous and asynchronous event handlers, preemptive threads, and fibers. You can create purely behavioral, cycle accurate, and real-time models, or any mixture of those.
From some browsing on the net, I understood that a framework is a set of libraries, and that we can simply use those library functions to develop an application.
I would like to know more about:
What is a framework with respect to C++?
How are C++ frameworks designed?
How can we use them to develop applications?
Can someone provide me some links to understand the concept of a "framework" in C++?
A "framework" is something designed to provide the structure of a solution - much as the steel frame of a skyscraper gives it structure, but needs to be fleshed out with use-specific customisations. Both assume some particular problem space - whether it's multi-threaded client/server transactions, or a need for air-conditioned office space, and if your needs are substantively different - e.g. image manipulation or a government art gallery - then trying to use a poorly suited framework is often worse than using none. Indeed, if the evolving needs of your system pass beyond what the framework supports, you may find your options for customising the framework itself are insufficient, or the design you adopted to use it just doesn't suit the re-architected solution you later need. For example, a single-threaded framework encourages you to program in a non-threadsafe fashion, which may be a nightmare to make efficiently multi-threaded post-facto.
They're designed by observing that a large number of programs require a similar solution architecture, and abstracting that into a canned solution framework with facilities for those app-specific customisations.
How they're used depends on the problems they're trying to solve. A framework for transaction dispatch/handling will typically define a way to list IP ports to listen on, nominate functions to be called when connections are made and new data arrives, register timer events that call back to arbitrary functions. XML document, image manipulation, A.I. etc. frameworks would be totally different.... The whole idea is that they each provide a style of use that is simple and intuitive for the applications that might wish to use them.
A big hassle with many frameworks is that they assume ownership of the applications that use them, and relegate the application to a secondary role of filling in some callbacks. If the application needs to use several frameworks, or even one framework with some extra libraries doing e.g. asynchronous communications, then the frameworks may make that very difficult. A good framework is designed more like a set of libraries that the client can control, but need not be confined by. Good frameworks are rare.
More often than not, a framework (as opposed to "just" a library or set of libraries), in OOP languages (including C++), implies a software subsystem that, among other things, supplies classes you're supposed to inherit from, overriding certain methods to specialize the class's functionality for your application's needs, in your application code. If it was just some collection of functions and typedefs it should more properly be called a library, rather than a framework.
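A toy example of that pattern, with hypothetical names: the framework owns the control flow and the application only overrides the hooks.

```cpp
#include <iostream>
#include <string>

// Toy framework: the framework owns the control flow in run(),
// the application only fills in the hooks it cares about.
class Application {                       // supplied by the framework
public:
    virtual ~Application() = default;
    void run()                            // fixed skeleton, not overridable
    {
        on_start();
        on_event("example event");        // in reality: loop, dispatch, ...
        on_stop();
    }
protected:
    virtual void on_start() {}                          // optional hooks
    virtual void on_stop() {}
    virtual void on_event(const std::string&) = 0;      // mandatory hook
};

class MyApp : public Application {        // written by the application author
    void on_event(const std::string& e) override
    {
        std::cout << "handling " << e << '\n';
    }
};

int main()
{
    MyApp app;
    app.run();   // "don't call us, we'll call you"
}
```

The key point is the inversion of control: main() calls run(), but it is the framework that decides when the application's code executes.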
I hope this addresses your points 1 and 3. Regarding point 2, ideally, the designers of a framework have a lot of experience designing applications in a certain area, and they "distill" their experience and skill into a framework that lets (possibly less-experienced) developers build their own applications in that area more easily and expeditiously. In the real world, of course, such ideals are not always followed.
With a tool like CppDepend you can analyze any C++ framework and reverse-engineer its design in a minute, and also get an accurate idea of the overall code quality of the framework.
An application framework (regardless of language) is a library that attempts to provide a complete framework within which you plug in functionality for your specific application.
The idea is that things like web applications and GUI applications typically require quite a bit of boilerplate to get working at all. The application framework provides all that boilerplate code, and some sort of organization (typically some variation of model-view-controller) where you can plug in the logic specific to your particular application, and it handles most of the other stuff like automatically routing messages and such as needed.
A theoretical question: after reading Armstrong's 'Programming Erlang' book, I was wondering the following:
It will take some time to learn Erlang. Let alone master it. It really is fundamentally different in a lot of respects.
So my question: is it possible to write 'like Erlang', or with some 'Erlang-like framework', such that, given you take care not to create functions with side effects, you can create scalable, reliable apps just as in Erlang? Maybe with the same message-passing, loads-of-'mini-processes' paradigm.
The advantage would be to not throw all your accumulated C/C++ knowledge over the fence.
Any thoughts about this would be welcome
Yes, it is possible, but...
Probably the best answer to this question is given by Robert Virding's First Rule:
“Any sufficiently complicated concurrent program in another language contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Erlang.”
A very good rule is to use the right tool for the task. Erlang excels at concurrency and reliability; C/C++ was not designed with these properties in mind.
If you don't want to throw away your C/C++ knowledge and experience, and your project allows this kind of division, a good approach is to create a mixed solution: write the concurrency, communication and error-handling code in Erlang, then add the C/C++ parts that do the CPU- and IO-bound work.
You clearly can - the Erlang/OTP system is largely written in C (and Erlang). The question is 'why would you want to?'
In 'ye olde days' people used to write their own operating system - but why would you want to?
If you elect to use an operating system your unwritten software has certain properties - it can persist to hard disk, it can speak to a network, it can draw on screens, it can run from the command line, it can be invoked in batch mode, etc, etc...
The Erlang/OTP system is 1.5M lines of code and has been demonstrated to give 99.9999999% uptime in large systems (the UK phone system) - that's 31 ms of downtime a year.
With Erlang/OTP your unwritten software has high reliability, it can hot-swap itself, your unwritten application can failover when a physical computer dies.
Why would you want to rewrite that functionality?
I would break this into two questions:
Can you write concurrent, scalable C++ applications?
Yes. It's certainly possible to create the low level constructs needed in order to achieve this.
Would you want to write concurrent, scalable C++ applications?
Perhaps. But if I was going for a highly concurrent application, I would choose a language that was either designed to fill that void or easily lent itself to doing so (Erlang, F# and possibly C#).
C++ was not designed to build highly concurrent applications. But it can certainly be tweaked into doing so. The cost might be higher than you expect though once you factor in memory management.
Yes, but you will be doing some extra work.
Regarding side effects, consider how the .NET PLINQ team is approaching it. PLINQ can't enforce that you hand it code with no side effects, but it assumes you do so and play by its rules, so that we get a simpler API. Even if the language doesn't have built-in support for this, it will still simplify things, as you can break up the operations more easily.
What I can do in one Turing-complete language I can do in any other Turing-complete language.
So I interpret your question to read, is it as easy to write a reliable and scalable application in C++ as it is in Erlang?
The answer to that is highly subjective. For me it is easier to write it in C++ for the following reasons:
I have already done it in C++ (at least three times).
I don't know Erlang.
I have read a great deal about Stackless Python, which feels to me like a highly concurrent, message-based, cooperative multitasking system in Python; but of course Python is implemented on top of C.
Having said that. If you already know both languages, and you have the problem well defined, you can then make the best choice based on all the information you have at hand.
The main 'problem' with C (or C++) for writing reliable and easy-to-extend programs is that in C you can do anything. So the first step would be to write a simple framework that restricts things just a bit; most good programmers do that anyway.
In this case, the restrictions would mostly make it easy to define a 'process' within whatever level of isolation you want. fork() has a reputation for being slow, and threads also need significant time to spawn, so you might want cooperative multitasking, which can be far more efficient, and you could even make it preemptive (I think that's what Erlang does). To get multi-core efficiency, set up a pool of threads and make them all compete to run the tasks.
Another important part would be to create an appropriate library of immutable data structures, so that by using them (instead of the standard library) your functions would be (mostly) side-effect free.
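To sketch what such a library might provide, here is a hypothetical minimal persistent list: "modifying" it returns a new list that shares the unchanged tail, so existing readers are never disturbed.

```cpp
#include <memory>

// Hypothetical immutable (persistent) singly-linked list. Nodes are
// shared between versions via shared_ptr, never modified in place.
template <typename T>
class ImmutableList {
    struct Node {
        T head;
        std::shared_ptr<const Node> tail;
    };
    std::shared_ptr<const Node> node_;
    explicit ImmutableList(std::shared_ptr<const Node> n) : node_(std::move(n)) {}
public:
    ImmutableList() = default;                   // empty list
    bool empty() const { return !node_; }
    const T& head() const { return node_->head; }     // precondition: !empty()
    ImmutableList tail() const { return ImmutableList(node_->tail); }
    ImmutableList push_front(T value) const {    // the old list stays valid
        return ImmutableList(std::make_shared<const Node>(
            Node{std::move(value), node_}));
    }
};

int main()
{
    ImmutableList<int> a = ImmutableList<int>().push_front(1);
    ImmutableList<int> b = a.push_front(2);  // `a` is untouched, tail shared
    return b.head() - 2;                     // 0
}
```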
Then it's just a matter of designing a good API for message passing and futures... not easy, but at least it doesn't mean changing the language itself.
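Putting those pieces together, a 'process' could be little more than a mailbox plus a worker; a hypothetical sketch (one OS thread per process for simplicity, rather than the pooled cooperative scheduler described above):

```cpp
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Hypothetical "mini process": a mailbox plus a worker thread; other
// processes interact with it only through send(), nothing else is shared.
class Process {
public:
    explicit Process(std::function<void(const std::string&)> handler)
        : handler_(std::move(handler)), worker_([this] { loop(); }) {}

    ~Process()
    {
        send("quit");        // sentinel message stops the worker
        worker_.join();
    }

    void send(const std::string& msg)
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            mailbox_.push(msg);
        }
        cv_.notify_one();
    }

private:
    void loop()
    {
        for (;;) {
            std::unique_lock<std::mutex> lock(m_);
            cv_.wait(lock, [this] { return !mailbox_.empty(); });
            std::string msg = mailbox_.front();
            mailbox_.pop();
            lock.unlock();
            if (msg == "quit") return;
            handler_(msg);   // runs on this process's own thread
        }
    }

    std::function<void(const std::string&)> handler_;
    std::queue<std::string> mailbox_;
    std::mutex m_;
    std::condition_variable cv_;
    std::thread worker_;     // declared last: starts after the mailbox
};

int main()
{
    Process logger([](const std::string& m) { std::puts(m.c_str()); });
    logger.send("hello from another process");
}   // destructor delivers "quit" and joins the worker
```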
Is there a good C++ framework to implement XA distributed transactions?
With the term "good" I mean usable, simple (doesn't imply "easy"), well-structured.
For study purposes, I am at the moment proceeding with a personal implementation, following the X/Open XA specification.
Thank you in advance.
I am not aware of an open-source or free transaction monitor with any degree of maturity, although this link does have some fan-out. The incumbent commercial ones are BEA's Tuxedo, Tibco's Enterprise Message Service (really a transactional message queue manager, like IBM's MQ) and Transarc's Encina (now owned by IBM). These systems are all very expensive.
If you want to make your own (and incidentally make a bit of a name for yourself by filling a void in the open-source software space), get a copy of Gray and Reuter.
This is the definitive work on transaction processing systems architecture, written by two of the foremost experts in the field.
Interestingly, they claim that one can implement a working TP monitor in around 10,000 lines of C. This actually sounds quite reasonable, as what it does is not all that complex. On occasion I have been tempted to try.
Essentially you need to make a distributed transaction coordinator that runs as a daemon process. You will need to get the resource manager protocol working from it, so starting with that as a prototype is probably a good start. If you can get it to independently roll back or commit a transaction, you have the basis of the TM-RM interface.
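As a starting point, the commit path of such a coordinator is classic two-phase commit. A hypothetical sketch of the TM side (the real XA interface passes transaction IDs and returns detailed error codes, all omitted here):

```cpp
#include <vector>

// Hypothetical resource-manager interface, loosely modelled on the
// xa_prepare/xa_commit/xa_rollback entry points of the XA switch.
struct ResourceManager {
    virtual ~ResourceManager() = default;
    virtual bool prepare() = 0;     // phase 1: vote yes/no
    virtual void commit() = 0;      // phase 2, on unanimous yes
    virtual void rollback() = 0;    // phase 2, on any no vote
};

// Commit path of the coordinator: classic two-phase commit.
bool complete_transaction(const std::vector<ResourceManager*>& rms)
{
    for (ResourceManager* rm : rms)         // phase 1: prepare everyone
        if (!rm->prepare()) {
            for (ResourceManager* r : rms)  // any "no" vote aborts all
                r->rollback();
            return false;
        }
    for (ResourceManager* rm : rms)         // phase 2: all voted yes
        rm->commit();
    return true;
}
```

Note that a real coordinator must also write its commit decision to a durable log between the two phases, so that in-doubt branches can be resolved after a crash; that logging and recovery logic is where most of the real work lies.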
The XA API as defined in the spec is the API to control the transaction manager. Strictly speaking, you don't need to make a 3-tier architecture to use distributed transactions of this sort, but they are more or less pointless without a TP monitor. How you communicate from the front-end to the middle-tier can be left as an exercise for the reader. You are probably best off using an existing ORB, of which there are several good open-source implementations available.
Depending on whether you want to make the DTC and the app server separate processes (which is possibly desirable for stability but not strictly necessary) you could also use ACE as a basis for the DTC server.
If you want to make a high-performance middle-tier server, check out Douglas Schmidt's ACE framework. It comes with an ORB called TAO and is flexible enough to let you use more or less any threading model that takes your fancy. Using it is a trade-off between learning it and the effort of writing your own and debugging all the synchronisation and concurrency issues.
Maybe it's quite late for your task, but it can be useful for other users: LIXA is not a "framework", but it provides an implementation of the TX transaction demarcation specification and supports most of the XA features.
The TX specification is for the C and COBOL languages, but integrating the C version into a C++ project should be effortless.
Another option is the open-source Enduro/X distributed transaction processing framework, which allows you to write simple C/C++ services that can work with resource managers (e.g. databases) and gives you the capability to commit or abort work done by several different executables on the same or different physical servers, working with different resources/databases.
Internally it uses XA two-phase commit.