How to set target cpu/hardware with armflang? - fortran

I have a few questions about specifying the target CPU with the Arm Fortran Compiler (armflang):
How to set target CPU?
Is there any way for armflang to autodetect the CPUs?
Which CPUs does it support?

The Arm Fortran Compiler (armflang) supports the -mcpu= option. When set to -mcpu=native, it tries to detect the host CPU. This lets armflang apply CPU-specific optimizations instead of only generic Arm optimizations.
The 18.4 release supports the following targets:
Cavium ThunderX2 (-mcpu=thunderx2t99)
Hisilicon Hi1616 (-mcpu=cortex-a72)
Qualcomm Falkor hardware (-mcpu=falkor)
Softiron (-mcpu=cortex-a57)
Cavium ThunderX (-mcpu=thunderx)

Related

llc/clang exact -march target for the current host cpu

I am trying to find a command line option to specify "generate code for the host you are compiling on; it should take advantage of all the CPU features available, and need not run on any other system" for llvm.
I have fairly recent llvm versions across all the platforms I use, they are all arm, arm64, x86 or x86-64. All code is either C or C++.
Is there a generic option for this for llvm?
The correct command line parameters are: -mtune=native and -mcpu=native

Simple console application compiled with Intel C++ Compiler 2019.4 does not run on Ryzen processor

The simple program
#include <stdio.h>
int main()
{
printf( "Hello, world!\n" );
}
when compiled with Intel C++ Compiler 2019.4 with the following switches:
/O3 /Qunroll /Qunroll-aggressive /QxSSE3 /QaxCORE-AVX2 refuses to run on Ryzen 3 1200 processor running Windows 10.
The error I get on the console is the list of processor features required to run the application. All of these features are available on the Ryzen processor (SSE3, AVX2, CMOV, FXSAVE, etc.), yet the application does not run.
The full run-time library error for this simple program reads as follows:
Please verify that both the operating system and the processor support Intel(R) X87, CMOV, MMX, FXSAVE, SSE, SSE2 and SSE3 instructions.
This is just a bare minimum example; I of course have a much more complex application in mind, but it does not run either.
The only workaround is to use the /O3 /Qunroll /Qunroll-aggressive /QxSSE2 switches, but that effectively disables AVX2 auto-dispatch and SSE3 instructions.
Is there a workaround for this issue?
Even though those instruction sets are available, the compiler-emitted code that verifies their availability may not necessarily recognize their presence on non-Intel CPUs. The documentation explicitly states that options like CORE-AVX2 may work only for Intel processors:
CORE-AVX2 May generate Intel® Advanced Vector Extensions 2 (Intel® AVX2), Intel® AVX, SSE4.2, SSE4.1, SSE3, SSE2, SSE, and SSSE3 instructions for Intel® processors. Optimizes for Intel® processors that support Intel® AVX2 instructions.
(remarks)
Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Intel and Ryzen processors are not the same target for compilation: the Intel compiler generates code for Intel's own architecture, and a compiler targeting Ryzen does the same for Ryzen. If you compile for Intel and execute on Ryzen, what do you expect to happen?

How to bypass runtime flag check in DPDK

ERROR: This system does not support SSE4_1
Please check that RTE_MACHINE is set correctly.
Is there any way to bypass this flag in DPDK?
DPDK version 17.08.1
OS : fedora 20
Is there any way to bypass this flag in DPDK?
Sure: DPDK needs to be compiled without SSE4.1, so that it does not require SSE4.1 to be present at runtime.
If we do not care about portability, the best way to deal with the issue is to compile DPDK with RTE_MACHINE="native", i.e. using x86_64-native-linuxapp-gcc config (or similar).
This will use the most CPU capabilities your local host supports, but might somewhat limit the portability to other CPUs.
To make it more portable, set RTE_MACHINE="snb" to compile DPDK for SandyBridge CPUs and newer.
The full list of supported machines is available here:
http://dpdk.org/browse/dpdk/tree/mk/machine
EDIT:
According to DPDK 17.08 Release Notes:
Starting with version 17.08, DPDK requires SSE4.2 to run on x86. Previous versions required SSE3.
That was due to the new vPMD functionality, as described in the patch discussion.
dpdk-stable-XX\mk\machine\native\rte.vars.mk
--ifeq ($(SSE42_SUPPORT),)
++ifneq ($(SSE42_SUPPORT),)

Detecting SIMD instruction sets to be used with C++ Macros in Visual Studio 2015

So, here is what I am trying to accomplish. In my C++ project, which has to be compiled with Microsoft Visual Studio 2015 or above, I need some code to have different versions depending on the newest SIMD instruction set available in the user's CPU, among: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX512.
Since what I am looking for at this point is compile-time CPU dispatching, my first guess was that it could be easily accomplished using compiler macros. However, to my astonishment, it has been quite hard to find information on how to achieve such CPU dispatching with macros in VS2015.
For instance, the earlier question "Detect the availability of SSE/SSE2 instruction set in Visual Studio" has information on how to detect SSE and SSE2 for x86 code, but not for x64 code. It does, however, reference this Microsoft document: http://msdn.microsoft.com/en-us/library/b0084kay.aspx
There, we only have information on how to detect whether SSE, SSE2, AVX and AVX2 are enabled in the compiler, not whether they are actually supported by the CPU. Also, there is nothing at all about the other instruction sets, like SSE3, SSSE3, SSE4.1, SSE4.2 and AVX512.
So, my question becomes: how can I detect whether the user's CPU supports those instruction sets via a macro, just like other compilers do, but with Microsoft Visual Studio 2015?
The problem you're facing is that Visual Studio historically is intended for software vendors. The idea that you compile your own software simply isn't in Microsoft's DNA.
The practical result is that Microsoft hardly cares about the processor of the build machine. That's unlikely to be the processor used to run the software.
On the upside, this also means that Microsoft doesn't suffer from the perennial Linux problem that the build system libraries are assumed to be present on the target machine. Building on Windows 10 for Windows 7 just works.
The compiler also doesn't allow you to enable up to SSE4.1, for example. You can only use /arch:avx or nothing. Also, that option only defines __AVX__, not the usual macros like __SSSE3__ that gcc/clang/icc define to indicate target support for previous instruction sets implied by AVX.

Intel TBB will work on AMD processors? [duplicate]

Possible Duplicate:
AMD multi-core programming
Is Intel TBB processor dependent? Will it work on AMD or on ARM (under MeeGo, for example)?
TBB is not completely processor-independent; there is a (rather small) layer that isolates the rest of TBB from the processor architecture (primarily to provide atomic read-modify-write operations such as compare-and-swap) and from certain OS peculiarities. Implementations of this layer use some compiler-specific features as well, such as inline assembler or built-in functions (intrinsics).
TBB will work out-of-the-box on x86 (32 and 64 bit) processors including those from AMD, except for rather old ones that do not have mfence instruction.
As for ARM, there is no direct support, but TBB 3.0 Update 7 added an implementation of TBB's platform isolation layer that uses GCC atomic built-ins. So it is definitely possible to get TBB running on ARM, probably with rather small additional effort. In fact, there was a report of some success with such a port on the TBB forum.
And, Intel(R) AppUp SDK for MeeGo also contains TBB, though it's only for Intel's Atom processor.
The answer is yes, for AMD anyhow.
For ARM, things are more complex, judging by feedback on the Intel forums. I don't see that anybody has gotten this working. For example, see http://software.intel.com/en-us/forums/showthread.php?t=74346
The release notes for the commercial version 3.0 list the following recommended hardware; I would expect other platforms to be less thoroughly supported.
Microsoft* Windows* Systems
Intel(R) Core(TM) 2 Duo processor or Intel(R) Xeon(R) processor or higher
Linux* Systems
Intel(R) Core(TM) 2 Duo processor or Intel(R) Xeon(R) processor or Intel(R) Itanium(R) processor or higher
Mac OS* X Systems
Intel(R) Core(TM) 2 Duo processor or higher
(Updated info Dec 2014)
ARM is supported on TBB as of 4.1 Update 3, with fixes in 4.2 Update 3. I have not used this myself so cannot attest to the robustness of this port.
No, it is not processor dependent. It is just a C++ library so as long as the compiler you are using is capable of compiling it you should be fine. From the FAQ of the website you linked to:
What compilers, operating systems and processors are supported?
The project is dedicated to supporting all compilers, all OSes and all processors as a cornerstone objective of the project. Up to date information on status is available on the web site.
Edit: Poking around a little more it looks like people are having problems getting it working on ARM processors, but nothing that should be insurmountable.