Intel Threading Building Blocks alternatives & licensing - C++

Before you mark this question as a duplicate, please take into consideration that most of the similar questions are 5+ years old!
I have two questions:
The dual-licensing. What does it mean?! Would I have to buy the commercial version to make a commercial closed-source project?
At this page, it says that
Intel® TBB is offered commercially for customers who want the additional support that comes with Intel® Premier Support. The commercial version is also available for developers who cannot follow the GPLv2 with the runtime exception license. (Source: https://www.threadingbuildingblocks.org/download)
But in this answer he says that the only advantage of buying it is that you get support. Please note that that answer is 7 years old, so I can't fully trust it, as things could have changed.
If I can't use TBB for a closed-source commercial project, what are some alternatives? I will most likely only need features like concurrent maps and queues.
Edit: Also, if the commercial version is required, could I wait to buy it until the release of my app?

Re: #2 (TBB alternatives), if you're on Windows, the PPL provides parallel containers and algorithms that are somewhat source compatible with TBB.
Also, Boost.Lockfree has lock-free queue and stack implementations.
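For the concurrent-queue use case specifically, a minimal Boost.Lockfree sketch (assuming Boost 1.53 or newer, which is when the library first shipped) might look like this:

    #include <boost/lockfree/queue.hpp>
    #include <thread>
    #include <iostream>

    int main() {
        // Fixed-capacity, lock-free, multi-producer/multi-consumer queue of ints.
        boost::lockfree::queue<int> queue(1024);

        std::thread producer([&] {
            for (int i = 0; i < 100; ++i)
                while (!queue.push(i)) { /* queue full, retry */ }
        });

        std::thread consumer([&] {
            int value = 0, received = 0;
            while (received < 100)
                if (queue.pop(value))
                    ++received;
            std::cout << "received " << received << " items\n";
        });

        producer.join();
        consumer.join();
        return 0;
    }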
If you need parallel algorithms and don't mind being on the bleeding edge, take a look at HPX as an alternative to TBB. It's under very active development, though, so it might be a bit of a moving target... In their latest 0.9.11 release they've implemented some aspects of the Parallelism TS, so there might be some API stability there that could make you (somewhat) well positioned to transition to standard algorithms if those ever materialize. Relevant docs are here.

Related

Intel TBB vs CilkPlus [closed]

I am developing time-demanding simulations in C++ targeting Intel x86_64 machines.
After researching a little, I found two interesting libraries to enable parallelization:
Intel Threading Building Blocks
Intel Cilk Plus
As stated in the docs, they both target parallelism on multicore processors, but I still haven't figured out which one is best. AFAIK CilkPlus simply implements three keywords for easier parallelism (which requires GCC to be recompiled to support those keywords), while TBB is just a library that promotes better parallel development.
Which one would you recommend?
Consider that I am having many, many problems installing CilkPlus (still trying and still screaming). So I was wondering: should I check out TBB first? Is CilkPlus better than TBB? What would you recommend?
Are they compatible?
Should I manage to install CilkPlus (still praying for this), would it be possible to use TBB together with it? Can they work together? Has anyone here experienced software development with both CilkPlus and TBB? Would you recommend working with them together?
Thank you.
Here is some FAQ-type information relevant to the question in the original post.
Cilk Plus vs. TBB vs. Intel OpenMP
In short, it depends on what type of parallelization you are trying to implement and how your application is coded.
I can answer this question in the context of TBB. The pros of using TBB are:
No compiler support needed to run the code.
TBB's generic C++ algorithms let users create their own objects and map them onto threads as tasks.
The user doesn't need to worry about thread management: the built-in task scheduler automatically detects the number of available hardware threads, though the user can choose to fix the number of threads for performance studies (a minimal sketch follows this list).
Flow graphs for creating tasks that respect dependencies let the user easily exploit functional as well as data parallelism.
TBB is naturally scalable, obviating the need for code modification when migrating to larger systems.
Active forum, and documentation that is continually updated.
With Intel compilers, the latest version of TBB performs really well.
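To illustrate the "no thread management" point, here is a minimal sketch using TBB's classic parallel_for interface (header names and namespaces are those of classic TBB; newer oneTBB releases moved them around slightly):

    #include <tbb/parallel_for.h>
    #include <tbb/blocked_range.h>
    #include <cstddef>
    #include <vector>

    int main() {
        std::vector<float> data(1000000, 1.0f);
        // The scheduler picks the number of worker threads and splits the
        // range into tasks of a suitable grain size; no threads are created
        // or joined by hand.
        tbb::parallel_for(tbb::blocked_range<std::size_t>(0, data.size()),
            [&](const tbb::blocked_range<std::size_t>& r) {
                for (std::size_t i = r.begin(); i != r.end(); ++i)
                    data[i] *= 2.0f;
            });
        return 0;
    }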
The cons can be
Low user base in the open-source community, making it difficult to find examples.
Examples in the documentation are very basic, and in older versions some are even wrong. However, the Intel forum is always ready to extend support and resolve issues.
The abstraction in the template classes is very high, making the learning curve steep.
The overhead of creating tasks is high. The user has to make sure that the problem size is large enough for the partitioner to create tasks of an optimal grain size.
I have not worked with Cilk either, but it's apparent that, of the users across the two, the majority use TBB. If Intel keeps pushing TBB with updated documentation and free support, the TBB user community is likely to keep growing.
They can be used to complement each other (Cilk and TBB). Usually, that's the best approach. But from my experience you will use TBB the most.
TBB and Cilk will scale automatically with the number of cores (by creating a tree of tasks and then using recursion at run time).
TBB is a runtime library for C++ that uses programmer-defined task patterns instead of threads. TBB decides, at run time, on the optimal number of threads, task granularity, and performance-oriented scheduling (automatic load balancing through task stealing, cache efficiency, and memory reuse). Tasks are created recursively (for a tree, this is logarithmic in the number of tasks).
Cilk (Plus) is a C/C++ language extension that requires compiler support.
Code may not be portable to different compilers and operating systems. It supports fork-join parallelism, and it is extremely easy to parallelize recursive algorithms. Lastly, it has a few keywords (cilk_spawn, cilk_sync) with which you can parallelize code very easily (not a lot of rewriting is needed!).
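As a rough sketch of how little rewriting is involved, here is the classic recursive example with the Cilk Plus keywords (assuming a Cilk Plus-enabled compiler, such as the cilkplus GCC branch or the Intel compiler):

    #include <cilk/cilk.h>
    #include <cstdio>

    // Plain recursive Fibonacci; the two keywords are the only change needed
    // to let the recursive calls run in parallel.
    static long fib(int n) {
        if (n < 2) return n;
        long x = cilk_spawn fib(n - 1);  // may run on another worker
        long y = fib(n - 2);             // runs on the current worker
        cilk_sync;                       // wait for the spawned call to finish
        return x + y;
    }

    int main() {
        std::printf("fib(30) = %ld\n", fib(30));
        return 0;
    }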
Other differences, that might be interesting:
a) Cilk uses a random work-stealing scheduler to counter "waiting" processes.
b) TBB steals from the most heavily loaded process.
Is there a reason you can't use the pre-built GCC binaries we make available at https://www.cilkplus.org/download#gcc-development-branch ? It's built from the cilkplus_4-8_branch, and should be reasonably current.
Which solution you choose is up to you. Cilk provides a very natural way to express recursive algorithms, and its child-stealing scheduler can be very cache friendly if you use cache-oblivious algorithms. If you have questions about Cilk Plus, you'll get the best response to them in the Intel Cilk Plus forum at http://software.intel.com/en-us/forums/intel-cilk-plus/.
Cilk Plus and TBB are aware of each other, so they should play well together if you mix them. Instead of getting a combinatorial explosion of threads you'll get at most the number of threads in the TBB thread pool plus the number of Cilk worker threads. Which usually means you'll get 2P threads (where P is the number of cores) unless you change the defaults with library calls or environment variables. You can use the vectorization features of Cilk Plus with either threading library.
- Barry Tannenbaum
Intel Cilk Plus developer
So, as a request from the OP:
I have used TBB before and I'm happy with it. It has good docs and the forum is active. It's not rare to see the library developers answering the questions. Give it a try. (I never used cilkplus so I can't talk about it).
I worked with it both in Ubuntu and Windows. You can download the packages via the package manager in Ubuntu or you can build the sources yourself. In that case, it shouldn't be a problem. In Windows I built TBB with MinGW under the cygwin environment.
As for the compatibility issues, there shouldn't be any. TBB works fine with Boost.Thread or OpenMP, for example; it was designed so it could be mixed with other threading solutions.

Recommendations for a Boost::ASIO based CORBA library

I'm looking for a CORBA kit. I need the IDL compiler plus libraries (or source) for the ORB. I don't really know a helluva lot more about CORBA, but we need to interface with a server whose functions are exposed via CORBA.
The requirements I've been given, in rough order of priority are:
1 - Low cost or license amenable to commercial (closed source) use.
2 - Performance performance performance - is there a Boost::ASIO based ORB?
3 - Simple to integrate for at least Windows and Linux development.
We measure our software's performance in microseconds, so I need to be sure that the underlying network latency has been kept to an absolute minimum, but also, personally, I don't want to wrestle with a half-finished or half-working project and I don't want integrating this stuff to become the whole project. Essentially I need to get this API built and be calling remote functions with as little fuss as possible. That might just be wishful thinking, but it's worth mentioning.
So, has anyone out there had RECENT experience integrating CORBA into a modern desktop application project? What would you recommend using, and what should I beware of?
I'm currently using omniorb for an embedded software in the telecommunication field.
As for your questions:
It is free even for commercial use. It comes with an LGPL license.
I haven't measured performance, but I've got good results in an embedded real-time project. (About your question on boost::asio: I'm pretty sure that an ORB based on boost::asio doesn't exist.)
It's been tested on many platforms, including linux and windows.
Maybe you could give omniORB a try. Otherwise you could try TAO: it's a real-time ORB, but I have never used it.
As far as I know there is no ORB that is built on top of boost::asio. I would recommend having a look at TAO or TAOX11, which is a modern CORBA implementation. There is a free CORBA Programmers Guide with some starter information from Remedy IT, as well as the OCI Developers Guide.

Boost advocacy - help needed

Possible duplicates
Is there a reason to not use Boost?
What are the advantages of using the C++ BOOST libraries?
OK, the high-level question is: "Please provide me with what you consider to be the most effective arguments for why the entire Boost distribution, or some specific parts of it, should be compiled on our company's system and endorsed in our software engineering standards".
Details of what I need:
Would gladly accept both positive arguments (why install), as well as proposed rebuttals of likely counter-arguments I might hear (see question context below).
Arguments should be made aimed at both technical Software Engineering team members and/or very technical senior managers - in other words, for the latter, the details of the argument may/should be technical, but the thrust of the argument should be "how would this make/save the company X money vs losing the company Y money as a cost of adding it to our toolset".
Context of the question:
I'm a developer in a company with several hundred developers, many dozens of whom do C++.
I had the (mis)fortune of being reassigned from my beloved Perl development spot to a team where I am also doing C++ development. So far I have found numerous things that I could easily have done in Perl that are very hard/cumbersome to do in C++ (a foreach loop, as an example), and any time I hit one of these, about half the time the answer ends up being "You can't do this in standard C++, but you can do it with Boost".
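To make that concrete, here is a minimal sketch of the kind of loop I mean, using BOOST_FOREACH (assuming a Boost version recent enough to ship it):

    #include <boost/foreach.hpp>
    #include <iostream>
    #include <map>
    #include <string>

    int main() {
        std::map<std::string, int> scores;
        scores["alice"] = 3;
        scores["bob"] = 5;

        // One line per element, no iterator boilerplate.
        // (The typedef works around the comma inside the macro argument.)
        typedef std::map<std::string, int>::value_type entry;
        BOOST_FOREACH(const entry& e, scores)
            std::cout << e.first << " -> " << e.second << '\n';
        return 0;
    }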
Our toolkit includes some legacy RogueWave libraries, and a VERY limited number of Boost libraries (e.g. no regex, no foreach), of a very old vintage.
Any development must use libraries compiled and vetted by the Software Engineering team. That is a hard and fast rule.
The SE team is somewhat resistant to adding new libraries, for a variety of reasons (e.g. the effort involved; functionality conflicts with RogueWave, for example for RegEx; the risk of installing and using any new software; the cost of educating developers, etc.). They will add libraries if presented with a sufficient business need or a majorly convincing cost/benefit argument, but they have a pretty tough threshold.
So, I'm looking for examples of which parts of Boost are so wonderful (with exact cost/benefit estimates) that installing them would be an Obviously Worth It Effort for Software Engineering.
Thanks in advance for any ideas/suggestions/examples.
Please don't mark this question as subjective as I am looking for measurable answers, not merely wonderful feelings :)
Everywhere I worked in the last decade, when they had their own smart pointer class, I found bugs in it - usually within a few weeks. And, no, I never went and looked at it hoping to find errors.
I got into the habit of posting the following quote from the TR1 smart pointer proposal:
The Boost developers found a shared-ownership smart pointer exceedingly difficult to implement correctly. Others have made the same observation. For example, Scott Meyers [Meyers01] says:
"The STL itself contains no reference-counting smart pointer, and writing a good one - one that works correctly all the time - is tricky enough that you don't want to do it unless you have to. I published the code for a reference-counting smart pointer in More Effective C++ in 1996, and despite basing it on established smart pointer implementations and submitting it to extensive pre- publication reviewing by experienced developers, a small parade of valid bug reports has trickled in for years. The number of subtle ways in which reference-counting smart pointers can fail is remarkable."
This plus a detailed analysis of the bug(s) I found usually got me the job of incorporating the boost libs into the code base. :)
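For reference, a minimal sketch of the shared-ownership behaviour you get from boost::shared_ptr out of the box, instead of maintaining a homegrown class (the Widget type here is just for illustration):

    #include <boost/shared_ptr.hpp>
    #include <iostream>

    struct Widget {
        ~Widget() { std::cout << "Widget destroyed exactly once\n"; }
    };

    int main() {
        boost::shared_ptr<Widget> a(new Widget);
        {
            boost::shared_ptr<Widget> b = a;            // reference count is now 2
            std::cout << a.use_count() << " owners\n";  // prints 2
        }                                               // b destroyed, count back to 1
        std::cout << a.use_count() << " owner\n";       // prints 1
        return 0;
    }   // last owner gone: Widget deleted exactly once, here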
It seems to me you're doing this the wrong way around. Since proposals to add new libraries are going to be met with a lot of resistance, don't even bother trying to argue for boost as a whole. Choose your battles instead.
Find the specific Boost libraries which you know (with your knowledge of the application it's to be used in) will be useful and save time and money. And then propose adding those.
I could easily list the Boost libs I've found useful, and why I think they're great, but I don't know if they'll be of any use in your application.
Push for individual boost libraries to be included, and then perhaps, over time, so many of them will be included that it'll be simpler for everyone to just include all of Boost.
It's an open standard not controlled by a specific company ( no licensing costs )
It's cross platform
It's expertly designed and written, very fast and efficient, and extensively tested.
There are open source implementations which your team can compile themselves.
Much of Boost is expected to become part of the upcoming C++ standard library (several Boost libraries have already been adopted into TR1).
Here is a slightly oldish 2005 article on Dr. Dobbs discussing the upcoming C++0x standard.
http://www.ddj.com/cpp/184401958
I had to maintain a component using that old vintage Tools.h++ from Roguewave, on a Solaris system.
On Solaris, if we want to use Boost, we need to use either gcc, or SunStudio with the STLport implementation of the standard library (instead of the Roguewave one). And as Tools.h++ requires the old Roguewave pre-standard implementation -- on Solaris --, I had to give up on Boost.
In the end I rewrote a simplified version of a few boost-like features I needed.
If you are in that same situation (*), you would not be able to move from the Roguewave library to Boost that easily. There is a non-negligible cost to this operation; for instance, the pointer containers from the two libraries have quite different interfaces.
(*) Where you can't slowly change bits of the old code to progressively use Boost. In that situation, the migration has to be radical, replacing every occurrence of Tools.h++ simultaneously with something more modern, or better.
NB: Most people are able to progressively use boost in old projects, and may miss a very important, and yes technical, point. Hence my negative answer.
Boost is a great tool and an invaluable part of our product development (we'd be lost without smart_ptr)... but because it is changing so fast, the stability of releases can be affected.
For example, we were happily introducing new versions of Boost as soon as they came out, without thinking twice. That is, until we were stung by a bug in the 1.35 threading library that produced occasional (i.e. difficult to debug) but critical errors. Fortunately we identified the issue before anything was released to the public and could move back to 1.34.
Ever since then we've taken a specific version, extensively tested it, and not updated without a compelling reason to do so.
Here are two suggestions for advocating boost:
Who's Using Boost? (http://www.boost.org/users/uses.html)
Lots of major projects use Boost (e.g., Adobe Photoshop, CERN).
Boost Project Cost (http://www.boost.org/development/index.html)
How much would it cost to hire a team to write Boost from scratch? There's a nifty (somewhat gimmicky) calculator there that helps put it in perspective.

Lightweight, portable C++ fibers, MIT license

I would like to get hold of a lightweight, portable fiber library with an MIT license (or looser). Boost.Coroutine does not qualify (not lightweight), and neither does the Portable Coroutine Library nor Kent C++CSP (both GPL).
Edit: could you help me find one? :)
Libtask: MIT License
Libconcurrency: LGPL (a little tighter than MIT, but it's a functional library!)
Both are written for C.
I actually blogged about this in the past. Have a look! I hope it answers your questions. In it, I cover a number of libraries, and I was particularly interested in ones that were useful for systems programming (asynchronous IO).
Conspicuously absent from that coverage is Boost.Coroutine, which I'll discuss here. Boost.Coroutine may be considered "heavyweight" conceptually (in terms of its family of types), but the implementation is quite efficient. The real problem is that Boost.Coroutine is unfinished and (last I checked) far from being completed. I had spent some time trying to work with the author through its non-starter issues, as I was really looking forward to using it in conjunction with Boost.Asio (this was one of Boost.Coroutine's primary objectives), but the author has not had the time to take his work to the Boost formal review stage.
list of implementations for C
For ultra-lightweight "threads", take a look at Protothreads, at the bottom of the Wikipedia article.
If Boost seems too heavy, helpful people have extracted the relevant parts of Boost (fcontext) as a standalone library, e.g. deboost.context.
Now you have two better options with Boost license:
Boost.Fiber
Boost.Coroutine2
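For example, a minimal Boost.Fiber sketch (assuming Boost 1.62 or later, which is roughly when the library was added):

    #include <boost/fiber/all.hpp>
    #include <iostream>

    int main() {
        // Fibers are cooperatively scheduled on the calling thread;
        // no extra OS thread is created here.
        boost::fibers::fiber f([] {
            for (int i = 0; i < 3; ++i) {
                std::cout << "fiber step " << i << '\n';
                boost::this_fiber::yield();  // hand control back to the scheduler
            }
        });
        f.join();  // run the fiber to completion
        return 0;
    }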
There is a blazing fast and lightweight C asymmetric coroutine library - libaco.
It is really small, very fast, and extremely memory efficient:
Along with the implementation of a production-ready C coroutine library, there is detailed documentation about how to implement a fast and correct coroutine library, including a strict mathematical proof;
It has no more than 700 LOC, but has all the functionality you may want from a coroutine library;
The benchmark section shows that one context switch between coroutines takes only about 10 ns (for the standalone-stack case) on an AWS c5d.large machine;
The user can choose to create a new coroutine with a standalone stack or with a shared stack (which may be shared with other coroutines);
It is extremely memory efficient: 10,000,000 coroutines running simultaneously cost only 2.8 GB of physical memory (running with tcmalloc, each coroutine with a 120 B copy-stack size configuration).
It also has a very detailed documentation.
PS:
It is under the Apache License, Version 2.0.

Is it worth learning AMD-specific APIs?

I'm currently learning the APIs related to Intel's parallelization libraries such as TBB, MKL and IPP. I was wondering, though, whether it's also worth looking at AMD's part of the puzzle. Or would that just be a waste of time? (I must confess, I have no clue about AMD's library support - at all - so would appreciate any advice you might have.)
Just to clarify, the reason I'm going the Intel way is because 1) the APIs are very nice; and 2) Intel seems to be taking tool support as seriously as API support. (Once again, I have no clue how AMD is doing in this department.)
The MKL and IPP libraries will perform (nearly) as well on AMD machines. My guess is that TBB will also run just fine on AMD boxes. If I had to suggest a technology that would be beneficial and useful to both, it would be to master the OpenMP libraries. The Intel compiler with the OpenMP extensions is stunningly fast and works with AMD chips also.
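To make the portability point concrete, here is a minimal OpenMP sketch (assuming a compiler built with OpenMP support, e.g. -fopenmp with GCC or /openmp with MSVC); the same code runs unchanged on Intel and AMD hardware:

    #include <omp.h>
    #include <vector>

    int main() {
        const long n = 1000000;
        std::vector<double> a(n, 1.0), b(n, 2.0), c(n);

        // OpenMP splits the iterations across all available cores,
        // Intel or AMD alike; no vendor-specific code is involved.
        #pragma omp parallel for
        for (long i = 0; i < n; ++i)
            c[i] = a[i] + b[i];

        return 0;
    }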
Only worth it if you are specifically interested in building something like video games, operating systems, database servers, or virtualization software. In other words: if you have a segment where you care enough about performance to take the time to do it (and do it right) in assembler. The same is true for Intel.
If your company sells packages of just Intel Servers with your software, then you shouldn't bother learning the AMD approach. But if you're going to have to offer software for both (or many) different platforms, then it might be worth looking into the different technologies. It will be very difficult to create the wrappers for the hardware-specific libraries. (Especially since threading is involved.)
And you definitely don't want to write completely separate implementation for each hardware configuration. In fact, if your software is to be consumed by a generic user, then you may want to abandon the Intel technology, and use standard threading techniques. I don't mean to be discouraging, but I believe that the Intel threading libraries are a bit ahead of their time for all intents and purposes.