Intel TBB (Threading Building Blocks) for Raspberry Pi 3 - build

So I am trying to compile Intel's TBB C++ library, which enables parallelism in programs. I particularly need it for C++React, a library that provides reactive programming facilities (e.g. asynchronous loops) for a project I am doing.
I have figured out how to compile it for the Raspberry Pi 2, but my problem is that the guides I have seen have only been updated for the ARMv7-A architecture.
Currently, when I try to make a build that uses TBB as a dependency, I get this error:
In file included from /home/pi/tbb43_20150611oss/include/tbb/tbb_machine.h:247:0,
from /home/pi/tbb43_20150611oss/include/tbb/task.h:25,
from /home/pi/tbb43_20150611oss/include/tbb/task_group.h:24,
from /home/pi/cpp.react-master/include/react/engine/PulsecountEngine.h:18,
from /home/pi/cpp.react-master/src/engine/PulsecountEngine.cpp:7:
/home/pi/tbb43_20150611oss/include/tbb/machine/gcc_armv7.h:31:2: error: #error compilation requires an ARMv7-a architecture.
#error compilation requires an ARMv7-a architecture.
I just want to know how I can port TBB to work on the ARMv8 Cortex-A53 in the new Raspberry Pi.
An easy solution such as adjusting the __ARM_ARCH_7A__ check in gcc_armv7.h would be nice, but how do people go about porting TBB to other architectures?
Thank you

If you want to contribute to TBB (e.g. to port it to some other architecture), you can go to the "Submit Contribution" page on the open-source site and send your patch.
To port TBB to ARMv8, you have several options:
If ARMv8 and ARMv7 are very similar, you can try to extend the check on line 30 of gcc_armv7.h to also accept ARMv8 (see the sketch after this list);
If ARMv8 and ARMv7 are quite different, you can create gcc_armv8.h (or a combined gcc_arm.h supporting both v7 and v8) and improve the logic in tbb_machine.h near lines 246-248;
Theoretically, if gcc on ARMv8 supports built-in atomics, you can use gcc_generic.h on ARMv8 (see tbb_machine.h:249)
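As an illustration of the first option, the guard at the top of gcc_armv7.h could be relaxed along these lines. This is only a sketch: the exact macro spellings (__ARM_ARCH_8A__, __aarch64__) and error text are my assumptions, not code taken from a TBB release.

// Sketch: relax the architecture guard in include/tbb/machine/gcc_armv7.h
// so it also accepts ARMv8 / AArch64 compilers. The original guard is roughly:
//   #if !(__ARM_ARCH_7A__)
//   #error compilation requires an ARMv7-a architecture.
//   #endif
#if !(__ARM_ARCH_7A__ || __ARM_ARCH_8A__ || defined(__aarch64__))
#error compilation requires an ARMv7-a (or newer ARM) architecture.
#endif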
It looks like you do not need to modify the makefiles, but I'd recommend running make test to be sure that the modified TBB works correctly on your system.
[UPDATE] TBB has been ported to ARMv8 since version 2018 U5.

Latest update, August 2018:
Check out my Git repository: https://github.com/abhiTronix/TBB_Raspberry_pi
It provides the latest TBB binary (2018 Update 4) for the Raspberry Pi as a .deb package, compiled for a Raspberry Pi 2/3 Model B/B+ running Raspbian Stretch.
Enjoy ;)

Related

How can we distribute compiled source code if it is specific to the hardware it was compiled on?

Suppose we take a compiled language, for example C++, and an example framework, say Qt. Qt has its source code publicly available and offers users the option to download binary files and use its API. My question is, however: when they compiled their code, it was compiled for their specific hardware, operating system, and so on. I understand that much software requires recompilation for different operating systems (including 32- vs 64-bit) and offers multiple downloads on its website, but why does this not go even further, making binaries hardware-specific and the redistribution of compiled executables extremely frustrating to produce?
Code gets compiled for a target base CPU (e.g. 32-bit x86, x86_64, or ARM), but not necessarily for a specific processor like the Core i9-10900K. By default, the compiler typically generates code that runs on the widest range of processors, and Intel and AMD guarantee forward compatibility for running that code on newer processors. Compilers often offer switches for optimizing for newer processors with new instruction sets, but you rarely use them, since not all your customers have that configuration. Or perhaps you build your code twice (once for older processors, and an optimized build for newer processors).
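As a small illustration of that last idea, GCC and Clang let a single binary target the common baseline and still pick a faster path at run time when a newer instruction set is present. The sketch below is my own example of that pattern; the function names and the AVX2 check are illustrative, not from any particular product.

#include <cstdio>

// Baseline implementation that runs on any x86_64 processor.
static void process_baseline() { std::puts("baseline code path"); }

// Implementation that would use newer instructions such as AVX2.
static void process_avx2() { std::puts("AVX2 code path"); }

int main() {
    __builtin_cpu_init();                // initialize CPU feature detection (GCC/Clang builtin)
    if (__builtin_cpu_supports("avx2"))  // check the actual processor at run time
        process_avx2();
    else
        process_baseline();
}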
There's also a concept called cross-compiling, where the compiler generates code for a completely different processor than the one it runs on. Such is the case when you build your iOS app on a Mac: the compiler itself is an x86_64 program, but it generates ARM instructions to run on the iPhone.
Code gets compiled and linked against a certain set of OS APIs and external runtime libraries (including the C/C++ runtime). If you want your code to run on Windows 7 or Mac OS X Mavericks, you wouldn't statically link against an API that only exists on Windows 10 or macOS Big Sur. The code would compile, but it wouldn't run on the older operating systems. Instead, you'd do a workaround or conditionally load the API if it is available. Microsoft and Apple provide forward compatibility by keeping those same runtime library APIs available on later OS releases.
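On Windows, for example, the usual way to conditionally use a newer API is to look it up at run time with GetProcAddress rather than linking to it. A rough sketch follows; SetThreadDescription is just an example of an API that only exists on newer Windows releases.

#include <windows.h>

// SetThreadDescription only exists from Windows 10 version 1607 onward.
typedef HRESULT (WINAPI *SetThreadDescriptionFn)(HANDLE, PCWSTR);

void NameCurrentThread(PCWSTR name) {
    // Resolve the export at run time so the program still loads and runs
    // on older Windows versions where it does not exist.
    HMODULE kernel32 = GetModuleHandleW(L"kernel32.dll");
    if (!kernel32) return;
    auto fn = reinterpret_cast<SetThreadDescriptionFn>(
        GetProcAddress(kernel32, "SetThreadDescription"));
    if (fn)
        fn(GetCurrentThread(), name);  // newer API is available, use it
    // else: fall back or simply skip the optional feature
}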
Additionally, Windows supports running 32-bit processes on 64-bit chips and OSes. Mac can even emulate x86_64 on their new ARM-based devices coming out later this year. But I digress.
As for Qt, they actually offer several pre-built configurations for their reference binary downloads, because, at least on Windows, the MSVCRT (the C-runtime APIs from Visual Studio) is closely tied to the compiler version of Visual Studio. So they offer various downloads to match the configuration you want to build your code for (32-bit, 64-bit, VS2017, VS2019, etc.). When you put together a complete application with 3rd-party dependencies, some of these build, linkage, and CPU/OS configurations have to be accounted for.

How to build and use Google tensorflow C++ API on ARM processor

This is a follow-on to "how-to-build-and-use-google-tensorflow-c-api": can anyone explain how to build a TensorFlow C++ program on an ARM processor? I'm thinking specifically of Nvidia's Jetson family of GPU devices. Nvidia has lots and lots of documentation for these, but it all seems to be for Python (like this), for toy examples, and nothing for anyone who wants to write a C++ program using the full TensorFlow API (if one even exists) for their own machine learning models. I'd like to be able to build programs like this one, which does deep learning inference and is exactly what the Jetson is supposedly made for.
I've found Web sites that offer links to installers too, but they all seem to be for the x86 architecture instead of ARM.
I have the same question about Bazel. I gather from all the unsatisfactory documentation I've been looking at that Bazel is mandatory for anyone who wants to build TensorFlow programs using a GPU, but all of the installation instructions I can find are either incomplete or for a different architecture such as x86 (for example https://www.osetc.com/en/how-to-install-bazel-on-ubuntu-14-04-16-04-18-04-linux.html).
I'll add that any link or github repository that dumps a load of code in my lap without making clear the prerequisites (since my little Jetson may not have the stuff installed that you assume) or the commands needed to actually build it (especially if it includes a project file for a compiler I never heard of) isn't very much help.
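For concreteness, the kind of program I'd like to be able to compile and run on the Jetson is roughly the classic TensorFlow 1.x C++ Session example below; the graph path and the tensor names are just placeholders, not from a real model.

#include <memory>
#include <vector>
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/platform/env.h"

int main() {
    using namespace tensorflow;

    // Load a frozen graph from disk ("frozen_graph.pb" is a placeholder path).
    GraphDef graph_def;
    Status s = ReadBinaryProto(Env::Default(), "frozen_graph.pb", &graph_def);
    if (!s.ok()) return 1;

    // Create a session and attach the graph to it.
    std::unique_ptr<Session> session(NewSession(SessionOptions()));
    if (!session->Create(graph_def).ok()) return 1;

    // Run inference: feed one input tensor, fetch one output tensor
    // ("input" and "output" are placeholder node names).
    Tensor input(DT_FLOAT, TensorShape({1, 224, 224, 3}));
    std::vector<Tensor> outputs;
    s = session->Run({{"input", input}}, {"output"}, {}, &outputs);
    return s.ok() ? 0 : 1;
}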

OpenCL development on Intel CPU/GPU under Linux

I have an Intel i7 Haswell CPU, and I would like to start exploring OpenCL development. In particular, I am interested in running OpenCL code on the integrated GPU.
Unfortunately, so far I have not been able to find any SDK on Intel's site.
Could you provide some links, together with a summary of the current status of OpenCL tools for the Linux platform and Intel hardware?
I think this would be useful to many other people.
Thanks a lot!
Intel does not provide free support for OpenCL on their iGPUs under Linux - you have to buy the Intel Media Server Studio, minimum $499. On Windows, you can download a free driver to get OpenCL capability for the iGPU: https://software.intel.com/en-us/articles/opencl-drivers#philinux.
Note that you can use any OpenCL SDK you want - it doesn't have to be Intel. The SDK is only useful for building your program. For running an OpenCL program, you need an appropriate runtime (driver) from the manufacturer. The AMD SDK will give you access to the CPU as an OpenCL device, but not the iGPU.
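A quick way to see which runtimes (and therefore which devices) are actually visible on your system is to enumerate the platforms and devices. Below is a minimal sketch using the plain OpenCL C API, with nothing vendor-specific assumed; build it with something like g++ list.cpp -lOpenCL.

#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    // Enumerate every installed OpenCL platform (i.e. every vendor runtime).
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, nullptr, &num_platforms);
    std::vector<cl_platform_id> platforms(num_platforms);
    clGetPlatformIDs(num_platforms, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        char platform_name[256] = {};
        clGetPlatformInfo(p, CL_PLATFORM_NAME, sizeof(platform_name), platform_name, nullptr);
        std::printf("Platform: %s\n", platform_name);

        // List every device (CPU, GPU, accelerator) exposed by this platform.
        cl_uint num_devices = 0;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, 0, nullptr, &num_devices) != CL_SUCCESS)
            continue;
        std::vector<cl_device_id> devices(num_devices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, num_devices, devices.data(), nullptr);

        for (cl_device_id d : devices) {
            char device_name[256] = {};
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(device_name), device_name, nullptr);
            std::printf("  Device: %s\n", device_name);
        }
    }
}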
There is an open-source OpenCL implementation for Intel GPUs on Linux called Beignet, maintained by a group of people from Intel.
Sadly, I couldn't personally try it and check whether your GPU is properly supported, but their wiki states:
Supported Targets
4th Generation Intel Core Processors "Haswell": currently needs a kernel patch; see the "Known Issues" section.
"Beignet: self-test failed" and almost all unit tests fail: Linux 3.15 and 3.16 (commits f0a346b to c9224fa) enable the register whitelist by default but miss some registers needed for Beignet.
This can be fixed by upgrading Linux, or by disabling the whitelist:
# echo 0 > /sys/module/i915/parameters/enable_cmd_parser
On Haswell hardware, Beignet 1.0.1 to 1.0.3 also required the above workaround on later Linux versions, but this should not be required in current (after 83f8739) git master.
So, it's worth a shot. By the way, it worked well on my 3rd-generation HD4000.
Also, the toolchain and driver in question include a bunch of GPU-support test cases.
For anyone who comes across this question as I did, the existing answers have some out-of-date information; Intel now offers free drivers for Linux on the site posted above: https://software.intel.com/en-us/articles/opencl-drivers#philinux
The drivers themselves are only supported on 5th, 6th and 7th gen Core processors (and a bunch of other Celerons and Xeons, see link), with earlier processors such as 4th gen still needing the Media Server Studio.
However, they now offer a Linux Community version of Media Server Studio which is free to download.
They also have a Driver Support Matrix for Intel Media SDK and OpenCL which has some useful information about compatibility: https://software.intel.com/en-us/articles/driver-support-matrix-for-media-sdk-and-opencl
You may check Intel's open-source Beignet OpenCL library: http://arrayfire.com/opencl-on-intel-hd-iris-graphics-on-linux/
For me (Ubuntu 15.10 + Intel i5 4th-generation GPU) it works quite well.
P.S.
Also, I must say that I managed to download the "media server" for Linux a couple of months ago (but haven't used it yet), so you may check that as well.

How to select a processor (MIPS R2000) in g++?

What is the command for selecting a processor (MIPS R2000) in g++? Thanks
You'll probably need a cross-compilation environment for your target platform. You might find an existing one or you may need to build your own cross-compiler using the gcc toolchain. There's no single way to do this - it will depend on the specifics of the target architecture. Specifically, is there already an operating system (e.g. Linux, BSD, etc.) running on your target system? What kind of userland does it use - your build chain will need the relevant C and C++ library as well as any other libraries you need to build and run your software. Or are you coding straight against the metal? In this case, you'll want to find existing bootstrap code for getting the system into a sensible state for running your code - rolling your own will not be easy.
Generally, you're probably best off finding an existing developer community centred around the platform in question and asking for advice there. They may have step-by-step instructions for getting started.
Note that the CPU alone is only part of the picture - for example, the ARM architecture is very popular, but compiling code for Android devices (Linux kernel with Android userland), iOS devices (xnu kernel with BSD- and OSX-derived iOS userland), a Nintendo DS or a Playstation Vita (probably no multitasking OS at all) will be extremely different, even though they all use ARM chips, in many cases even the same instruction set generation.

Building a library across platforms without running all of the platforms

I have a small piece of code that works as a plugin for a larger graphics application. The development platform is Qt with C++ code. I've managed to build a .so, .dylib and .dll for Linux, macOS and Windows respectively, but to do so I had to have a machine running each operating system (in my case, running Linux [Ubuntu] GCC natively, and Windows MinGW and macOS Xcode GCC in virtual machines).
Is there a way to build for all 3 platforms from one? I beat my head against this problem a while back, and research to date suggests that it's not easily (or feasibly) done. The code only needs to link against a single header that defines the plugin API and is built from a fairly basic Makefile (currently with small variations per platform).
You should have a look at cross-compiling.
You basically build a compiler that (on your current platform) will output binaries for your desired platforms.
Try this link about doing it on Linux, for Windows, with Qt.
Better late than never, I just came across IMCROSS
It looks quite promising!
For Linux it is fairly easy to set up, or even download, a virtual machine using VMware, for instance. Getting OS X to run on VMware is somewhat tricky but possible.
Running VMware and sharing a directory on a local drive, you can even compile for the different platforms using the exact same files.
There is a cross-compiler for OS X somewhere, but I wouldn't trust it to be of great quality.