POSIX equivalent of boost::thread::hardware_concurrency [duplicate] - c++

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Programmatically find the number of cores on a machine
What is the POSIX or x86, x86-64 specific system call to determine the max number of threads the system can run without over-subscription? Thank you.

It uses C-compatible constructs, so why not just use the actual code? [libs/thread/src/*/thread.cpp]
using pthread library:
unsigned thread::hardware_concurrency()
{
#if defined(PTW32_VERSION) || defined(__hpux)
return pthread_num_processors_np();
#elif defined(__APPLE__) || defined(__FreeBSD__)
int count;
size_t size=sizeof(count);
return sysctlbyname("hw.ncpu",&count,&size,NULL,0)?0:count;
#elif defined(BOOST_HAS_UNISTD_H) && defined(_SC_NPROCESSORS_ONLN)
int const count=sysconf(_SC_NPROCESSORS_ONLN);
return (count>0)?count:0;
#elif defined(_GNU_SOURCE)
return get_nprocs();
#else
return 0;
#endif
}
in windows:
unsigned thread::hardware_concurrency()
{
SYSTEM_INFO info={{0}};
GetSystemInfo(&info);
return info.dwNumberOfProcessors;
}

Related

Is there a way to check if virtualisation is enabled on the BIOS using C++?

I'm trying to check if virtualisation (AMD-V or Intel VT) is enabled programmatically. I know the bash commands that gives you this information but I'm trying to achieve this in C++.
On that note, I'm trying to avoid using std::system to execute shell code because of how hacky and unoptimal that solution is. Thanks in advance!
To check if VMX or SVM (Intel and AMD virtualization technologies) are enabled you need to use the cpuid instruction.
This instruction comes up so often in similar tests that the mainstream compilers all have an intrinsic for it, you don't need inline assembly.
For Intel's CPUs you have to check CPUID.1.ecx[5], while for AMD's ones you have to check CPUID.1.ecx[2].
Here's an example. I only have gcc on an Intel CPU, you are required to properly test and fix this code for other compilers and AMD CPUs.
The principles I followed are:
Unsupported compilers and non-x86 architecture should fail to compile the code.
If run on an x86 compatible CPU that is neither Intel nor AMD, the function return false.
The code assumes cpuid exists, for Intel's CPUs this is true since the 486. The code is designed so that cpuid_ex can return false to denote that cpuid is not present, but I didn't implement a test for it because that would require inline assembly. GCC and clang have a built-in for this but MSVCC doesn't.
Note that if you only build 64-bit binaries you always have cpuid, you can remove the 32-bit architecture check so that if someday somebody tries to use a 32-bit build, the code fails to compile.
#if defined(_MSC_VER)
#include <intrin.h>
#elif defined(__clang__) || defined(__GNUC__)
#include <cpuid.h>
#endif
#include <string>
#include <iostream>
#include <cstdint>
//Execute cpuid with leaf "leaf" and given the temporary array x holding the
//values of eax, ebx, ecx and edx (after cpuid), copy x[start:end) into regs
//converting each item in a uint32_t value (so this interface is the same even
//for esoteric ILP64 compilers).
//If CPUID is not supported, returns false (NOTE: this check is not done, CPUID
//exists since the 486, it's up to you to implement a test. GCC & CLANG have a
//nice macro for this. MSVCC doesn't. ICC I don't know.)
bool cpuid_ex(unsigned int leaf, uint32_t* regs, size_t start = 0, size_t end = 4)
{
#if ( ! defined(__x86_64__) && ! defined(__i386__) && ! defined(_M_IX86) && ! defined(_M_X64))
//Not an x86
return false;
#elif defined(_MSC_VER) || defined(__INTEL_COMPILER)
//MS & Intel
int x[4]
__cpuid((int*)x, leaf);
#elif defined(__clang__) || defined(__GNUC__)
//GCC & clang
unsigned int x[4];
__cpuid(leaf, x[0], x[1], x[2], x[3]);
#else
//Unknown compiler
static_assert(false, "cpuid_ex: compiler is not supported");
#endif
//Conversion from x[i] to uint32_t is safe since GP registers are 32-bit at least
//if we are using cpuid
for (; start < end; start++)
*regs++ = static_cast<uint32_t>(x[start]);
return true;
}
//Check for Intel and AMD virtualization.
bool support_virtualization()
{
//Get the signature
uint32_t signature_reg[3] = {0};
if ( ! cpuid_ex(0, signature_reg, 1))
//cpuid is not supported, this returns false but you may want to throw
return false;
uint32_t features;
//Get the features
cpuid_ex(1, &features, 2, 3);
//Is intel? Check bit5
if (signature_reg[0] == 0x756e6547 && signature_reg[1] == 0x6c65746e && signature_reg[2] == 0x49656e69)
return features & 0x20;
//Is AMD? check bit2
if (signature_reg[0] == 0x68747541 && signature_reg[1] == 0x69746e65 && signature_reg[2] == 0x444d4163)
return features & 0x04;
//Not intel or AMD, this returns false but you may want to throw
return false;
}
int main()
{
std::cout << "Virtualization is " << (support_virtualization() ? "": "NOT ") << "supported\n";
return 0;
}
No, there is no way to detect virtualization support using only standard C++ facilities. The C++ standard library does not have anything related to hardware virtualization. (Its exposure of low level details like that is extremely limited.)
You would need to use OS-specific facilities to detect this.

C++14 High Precision of time Measurements? [duplicate]

I'm looking to implement a simple timer mechanism in C++. The code should work in Windows and Linux. The resolution should be as precise as possible (at least millisecond accuracy). This will be used to simply track the passage of time, not to implement any kind of event-driven design. What is the best tool to accomplish this?
Updated answer for an old question:
In C++11 you can portably get to the highest resolution timer with:
#include <iostream>
#include <chrono>
#include "chrono_io"
int main()
{
typedef std::chrono::high_resolution_clock Clock;
auto t1 = Clock::now();
auto t2 = Clock::now();
std::cout << t2-t1 << '\n';
}
Example output:
74 nanoseconds
"chrono_io" is an extension to ease I/O issues with these new types and is freely available here.
There is also an implementation of <chrono> available in boost (might still be on tip-of-trunk, not sure it has been released).
Update
This is in response to Ben's comment below that subsequent calls to std::chrono::high_resolution_clock take several milliseconds in VS11. Below is a <chrono>-compatible workaround. However it only works on Intel hardware, you need to dip into inline assembly (syntax to do that varies with compiler), and you have to hardwire the machine's clock speed into the clock:
#include <chrono>
struct clock
{
typedef unsigned long long rep;
typedef std::ratio<1, 2800000000> period; // My machine is 2.8 GHz
typedef std::chrono::duration<rep, period> duration;
typedef std::chrono::time_point<clock> time_point;
static const bool is_steady = true;
static time_point now() noexcept
{
unsigned lo, hi;
asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
return time_point(duration(static_cast<rep>(hi) << 32 | lo));
}
private:
static
unsigned
get_clock_speed()
{
int mib[] = {CTL_HW, HW_CPU_FREQ};
const std::size_t namelen = sizeof(mib)/sizeof(mib[0]);
unsigned freq;
size_t freq_len = sizeof(freq);
if (sysctl(mib, namelen, &freq, &freq_len, nullptr, 0) != 0)
return 0;
return freq;
}
static
bool
check_invariants()
{
static_assert(1 == period::num, "period must be 1/freq");
assert(get_clock_speed() == period::den);
static_assert(std::is_same<rep, duration::rep>::value,
"rep and duration::rep must be the same type");
static_assert(std::is_same<period, duration::period>::value,
"period and duration::period must be the same type");
static_assert(std::is_same<duration, time_point::duration>::value,
"duration and time_point::duration must be the same type");
return true;
}
static const bool invariants;
};
const bool clock::invariants = clock::check_invariants();
So it isn't portable. But if you want to experiment with a high resolution clock on your own intel hardware, it doesn't get finer than this. Though be forewarned, today's clock speeds can dynamically change (they aren't really a compile-time constant). And with a multiprocessor machine you can even get time stamps from different processors. But still, experiments on my hardware work fairly well. If you're stuck with millisecond resolution, this could be a workaround.
This clock has a duration in terms of your cpu's clock speed (as you reported it). I.e. for me this clock ticks once every 1/2,800,000,000 of a second. If you want to, you can convert this to nanoseconds (for example) with:
using std::chrono::nanoseconds;
using std::chrono::duration_cast;
auto t0 = clock::now();
auto t1 = clock::now();
nanoseconds ns = duration_cast<nanoseconds>(t1-t0);
The conversion will truncate fractions of a cpu cycle to form the nanosecond. Other rounding modes are possible, but that's a different topic.
For me this will return a duration as low as 18 clock ticks, which truncates to 6 nanoseconds.
I've added some "invariant checking" to the above clock, the most important of which is checking that the clock::period is correct for the machine. Again, this is not portable code, but if you're using this clock, you've already committed to that. The private get_clock_speed() function shown here gets the maximum cpu frequency on OS X, and that should be the same number as the constant denominator of clock::period.
Adding this will save you a little debugging time when you port this code to your new machine and forget to update the clock::period to the speed of your new machine. All of the checking is done either at compile-time or at program startup time. So it won't impact the performance of clock::now() in the least.
For C++03:
Boost.Timer might work, but it depends on the C function clock and so may not have good enough resolution for you.
Boost.Date_Time includes a ptime class that's been recommended on Stack Overflow before. See its docs on microsec_clock::local_time and microsec_clock::universal_time, but note its caveat that "Win32 systems often do not achieve microsecond resolution via this API."
STLsoft provides, among other things, thin cross-platform (Windows and Linux/Unix) C++ wrappers around OS-specific APIs. Its performance library has several classes that would do what you need. (To make it cross platform, pick a class like performance_counter that exists in both the winstl and unixstl namespaces, then use whichever namespace matches your platform.)
For C++11 and above:
The std::chrono library has this functionality built in. See this answer by #HowardHinnant for details.
Matthew Wilson's STLSoft libraries provide several timer types, with congruent interfaces so you can plug-and-play. Amongst the offerings are timers that are low-cost but low-resolution, and ones that are high-resolution but have high-cost. There are also ones for measuring pre-thread times and for measuring per-process times, as well as all that measure elapsed times.
There's an exhaustive article covering it in Dr. Dobb's from some years ago, although it only covers the Windows ones, those defined in the WinSTL sub-project. STLSoft also provides for UNIX timers in the UNIXSTL sub-project, and you can use the "PlatformSTL" one, which includes the UNIX or Windows one as appropriate, as in:
#include <platformstl/performance/performance_counter.hpp>
#include <iostream>
int main()
{
platformstl::performance_counter c;
c.start();
for(int i = 0; i < 1000000000; ++i);
c.stop();
std::cout << "time (s): " << c.get_seconds() << std::endl;
std::cout << "time (ms): " << c.get_milliseconds() << std::endl;
std::cout << "time (us): " << c.get_microseconds() << std::endl;
}
HTH
The StlSoft open source library provides a quite good timer on both windows and linux platforms. If you want it to implement on your own, just have a look at their sources.
The ACE library has portable high resolution timers also.
Doxygen for high res timer:
http://www.dre.vanderbilt.edu/Doxygen/5.7.2/html/ace/a00244.html
I have seen this implemented a few times as closed-source in-house solutions .... which all resorted to #ifdef solutions around native Windows hi-res timers on the one hand and Linux kernel timers using struct timeval (see man timeradd) on the other hand.
You can abstract this and a few Open Source projects have done it -- the last one I looked at was the CoinOR class CoinTimer but there are surely more of them.
I highly recommend boost::posix_time library for that. It supports timers in various resolutions down to microseconds I believe
SDL2 has an excellent cross-platform high-resolution timer. If however you need sub-millisecond accuracy, I wrote a very small cross-platform timer library here.
It is compatible with both C++03 and C++11/higher versions of C++.
I found this which looks promising, and is extremely straightforward, not sure if there are any drawbacks:
https://gist.github.com/ForeverZer0/0a4f80fc02b96e19380ebb7a3debbee5
/* ----------------------------------------------------------------------- */
/*
Easy embeddable cross-platform high resolution timer function. For each
platform we select the high resolution timer. You can call the 'ns()'
function in your file after embedding this.
*/
#include <stdint.h>
#if defined(__linux)
# define HAVE_POSIX_TIMER
# include <time.h>
# ifdef CLOCK_MONOTONIC
# define CLOCKID CLOCK_MONOTONIC
# else
# define CLOCKID CLOCK_REALTIME
# endif
#elif defined(__APPLE__)
# define HAVE_MACH_TIMER
# include <mach/mach_time.h>
#elif defined(_WIN32)
# define WIN32_LEAN_AND_MEAN
# include <windows.h>
#endif
static uint64_t ns() {
static uint64_t is_init = 0;
#if defined(__APPLE__)
static mach_timebase_info_data_t info;
if (0 == is_init) {
mach_timebase_info(&info);
is_init = 1;
}
uint64_t now;
now = mach_absolute_time();
now *= info.numer;
now /= info.denom;
return now;
#elif defined(__linux)
static struct timespec linux_rate;
if (0 == is_init) {
clock_getres(CLOCKID, &linux_rate);
is_init = 1;
}
uint64_t now;
struct timespec spec;
clock_gettime(CLOCKID, &spec);
now = spec.tv_sec * 1.0e9 + spec.tv_nsec;
return now;
#elif defined(_WIN32)
static LARGE_INTEGER win_frequency;
if (0 == is_init) {
QueryPerformanceFrequency(&win_frequency);
is_init = 1;
}
LARGE_INTEGER now;
QueryPerformanceCounter(&now);
return (uint64_t) ((1e9 * now.QuadPart) / win_frequency.QuadPart);
#endif
}
/* ----------------------------------------------------------------------- */-------------------------------- */
The first answer to C++ library questions is generally BOOST: http://www.boost.org/doc/libs/1_40_0/libs/timer/timer.htm. Does this do what you want? Probably not but it's a start.
The problem is you want portable and timer functions are not universal in OSes.
STLSoft have a Performance Library, which includes a set of timer classes, some that work for both UNIX and Windows.
I am not sure about your requirement, If you want to calculate time interval please see thread below
Calculating elapsed time in a C program in milliseconds
Late to the party here, but I'm working in a legacy codebase that can't be upgraded to c++11 yet. Nobody on our team is very skilled in c++, so adding a library like STL is proving difficult (on top of potential concerns others have raised about deployment issues). I really needed an extremely simple cross platform timer that could live by itself without anything beyond bare-bones standard system libraries. Here's what I found:
http://www.songho.ca/misc/timer/timer.html
Reposting the entire source here just so it doesn't get lost if the site ever dies:
//////////////////////////////////////////////////////////////////////////////
// Timer.cpp
// =========
// High Resolution Timer.
// This timer is able to measure the elapsed time with 1 micro-second accuracy
// in both Windows, Linux and Unix system
//
// AUTHOR: Song Ho Ahn (song.ahn#gmail.com) - http://www.songho.ca/misc/timer/timer.html
// CREATED: 2003-01-13
// UPDATED: 2017-03-30
//
// Copyright (c) 2003 Song Ho Ahn
//////////////////////////////////////////////////////////////////////////////
#include "Timer.h"
#include <stdlib.h>
///////////////////////////////////////////////////////////////////////////////
// constructor
///////////////////////////////////////////////////////////////////////////////
Timer::Timer()
{
#if defined(WIN32) || defined(_WIN32)
QueryPerformanceFrequency(&frequency);
startCount.QuadPart = 0;
endCount.QuadPart = 0;
#else
startCount.tv_sec = startCount.tv_usec = 0;
endCount.tv_sec = endCount.tv_usec = 0;
#endif
stopped = 0;
startTimeInMicroSec = 0;
endTimeInMicroSec = 0;
}
///////////////////////////////////////////////////////////////////////////////
// distructor
///////////////////////////////////////////////////////////////////////////////
Timer::~Timer()
{
}
///////////////////////////////////////////////////////////////////////////////
// start timer.
// startCount will be set at this point.
///////////////////////////////////////////////////////////////////////////////
void Timer::start()
{
stopped = 0; // reset stop flag
#if defined(WIN32) || defined(_WIN32)
QueryPerformanceCounter(&startCount);
#else
gettimeofday(&startCount, NULL);
#endif
}
///////////////////////////////////////////////////////////////////////////////
// stop the timer.
// endCount will be set at this point.
///////////////////////////////////////////////////////////////////////////////
void Timer::stop()
{
stopped = 1; // set timer stopped flag
#if defined(WIN32) || defined(_WIN32)
QueryPerformanceCounter(&endCount);
#else
gettimeofday(&endCount, NULL);
#endif
}
///////////////////////////////////////////////////////////////////////////////
// compute elapsed time in micro-second resolution.
// other getElapsedTime will call this first, then convert to correspond resolution.
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTimeInMicroSec()
{
#if defined(WIN32) || defined(_WIN32)
if(!stopped)
QueryPerformanceCounter(&endCount);
startTimeInMicroSec = startCount.QuadPart * (1000000.0 / frequency.QuadPart);
endTimeInMicroSec = endCount.QuadPart * (1000000.0 / frequency.QuadPart);
#else
if(!stopped)
gettimeofday(&endCount, NULL);
startTimeInMicroSec = (startCount.tv_sec * 1000000.0) + startCount.tv_usec;
endTimeInMicroSec = (endCount.tv_sec * 1000000.0) + endCount.tv_usec;
#endif
return endTimeInMicroSec - startTimeInMicroSec;
}
///////////////////////////////////////////////////////////////////////////////
// divide elapsedTimeInMicroSec by 1000
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTimeInMilliSec()
{
return this->getElapsedTimeInMicroSec() * 0.001;
}
///////////////////////////////////////////////////////////////////////////////
// divide elapsedTimeInMicroSec by 1000000
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTimeInSec()
{
return this->getElapsedTimeInMicroSec() * 0.000001;
}
///////////////////////////////////////////////////////////////////////////////
// same as getElapsedTimeInSec()
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTime()
{
return this->getElapsedTimeInSec();
}
and the header file:
//////////////////////////////////////////////////////////////////////////////
// Timer.h
// =======
// High Resolution Timer.
// This timer is able to measure the elapsed time with 1 micro-second accuracy
// in both Windows, Linux and Unix system
//
// AUTHOR: Song Ho Ahn (song.ahn#gmail.com) - http://www.songho.ca/misc/timer/timer.html
// CREATED: 2003-01-13
// UPDATED: 2017-03-30
//
// Copyright (c) 2003 Song Ho Ahn
//////////////////////////////////////////////////////////////////////////////
#ifndef TIMER_H_DEF
#define TIMER_H_DEF
#if defined(WIN32) || defined(_WIN32) // Windows system specific
#include <windows.h>
#else // Unix based system specific
#include <sys/time.h>
#endif
class Timer
{
public:
Timer(); // default constructor
~Timer(); // default destructor
void start(); // start timer
void stop(); // stop the timer
double getElapsedTime(); // get elapsed time in second
double getElapsedTimeInSec(); // get elapsed time in second (same as getElapsedTime)
double getElapsedTimeInMilliSec(); // get elapsed time in milli-second
double getElapsedTimeInMicroSec(); // get elapsed time in micro-second
protected:
private:
double startTimeInMicroSec; // starting time in micro-second
double endTimeInMicroSec; // ending time in micro-second
int stopped; // stop flag
#if defined(WIN32) || defined(_WIN32)
LARGE_INTEGER frequency; // ticks per second
LARGE_INTEGER startCount; //
LARGE_INTEGER endCount; //
#else
timeval startCount; //
timeval endCount; //
#endif
};
#endif // TIMER_H_DEF
If one is using the Qt framework in the project, the best solution is probably to use QElapsedTimer.

What does __MACH__ mean for an #ifdef?

I am just going through some code and have come across an #ifdef that is:
#ifdef __MACH__
#define CLOCK_REALTIME 0
int clock_gettime (int /*clk_id*/, struct timespec *t) {
struct timeval now;
int rv = gettimeofday(&now, NULL);
if (rv) return rv;
t->tv_sec = now.tv_sec;
t->tv_nsec = now.tv_usec * 1000;
return 0;
}
#endif
What does __MACH__ refer to here? Is it an arbitrary name for "machine" which references the current OS I am compiling on? That's really the only thing I can think of it being.
__MACH__ is a built in compiler macro that indicates if you are on Macintosh operating system. It is defined when compiling on a mac machine.

Detect the availability of SSE/SSE2 instruction set in Visual Studio

How can I check in code whether SSE/SSE2 is enabled or not by the Visual Studio compiler?
I have tried #ifdef __SSE__ but it didn't work.
Some additional information on _M_IX86_FP.
_M_IX86_FP is only defined for 32-bit code. 64-bit x86 code has at least SSE2. You can use _M_AMD64 or _M_X64 to determine if the code is 64-bit.
#ifdef __AVX2__
//AVX2
#elif defined ( __AVX__ )
//AVX
#elif (defined(_M_AMD64) || defined(_M_X64))
//SSE2 x64
#elif _M_IX86_FP == 2
//SSE2 x32
#elif _M_IX86_FP == 1
//SSE x32
#else
//nothing
#endif
From the documentation:
_M_IX86_FP
Expands to a value indicating which /arch compiler option was used:
0 if /arch:IA32 was used.
1 if /arch:SSE was used.
2 if /arch:SSE2 was used. This value is the default if /arch was not specified.
I don't see any mention of _SSE_.
The relevant preprocessor macros have two underscores at each end:
#ifdef __SSE__
#ifdef __SSE2__
#ifdef __SSE3__
#ifdef __SSE4_1__
#ifdef __AVX__
...etc...
UPDATE: apparently the above macros are not automatically pre-defined for you when using Visual Studio (even though they are in every other x86 compiler that I have ever used), so you may need to define them yourself if you want portability with gcc, clang, ICC, et al...
This is a late answer, but on MSDN you can find an article about __cpuid and __cpuidex. I redid the class into a function and it checks the support of MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1.
https://learn.microsoft.com/en-us/cpp/intrinsics/cpuid-cpuidex?view=vs-2019
[[nodiscard]] bool CheckSimdSupport() noexcept
{
std::array<int, 4> cpui;
int nIds_{};
std::bitset<32> f_1_ECX_{};
std::bitset<32> f_1_EDX_{};
std::vector<std::array<int, 4>> data_;
__cpuid(cpui.data(), 0);
nIds_ = cpui[0];
for (int i = 0; i <= 1; ++i)
{
__cpuid(cpui.data(), i);
data_.push_back(cpui);
}
if (nIds_ >= 1)
{
f_1_ECX_ = data_[1][2];
f_1_EDX_ = data_[1][3];
}
// f_1_ECX_[0] - SSE3
// f_1_ECX_[9] - SSSE3
// f_1_ECX_[19] - SSE4.1
// f_1_EDX_[23] - MMX
// f_1_EDX_[25] - SSE
// f_1_EDX_[26] - SSE2
return f_1_ECX_[0] && f_1_ECX_[9] && f_1_ECX_[19] && f_1_EDX_[23] && f_1_EDX_[25] && f_1_EDX_[26];
}

Detecting CPU architecture compile-time

What is the most reliable way to find out CPU architecture when compiling C or C++ code? As far as I can tell, different compilers have their own set of non-standard preprocessor definitions (_M_X86 in MSVS, __i386__, __arm__ in GCC, etc).
Is there a standard way to detect the architecture I'm building for? If not, is there a source for a comprehensive list of such definitions for various compilers, such as a header with all the boilerplate #ifdefs?
Enjoy, I was the original author of this.
extern "C" {
const char *getBuild() { //Get current architecture, detectx nearly every architecture. Coded by Freak
#if defined(__x86_64__) || defined(_M_X64)
return "x86_64";
#elif defined(i386) || defined(__i386__) || defined(__i386) || defined(_M_IX86)
return "x86_32";
#elif defined(__ARM_ARCH_2__)
return "ARM2";
#elif defined(__ARM_ARCH_3__) || defined(__ARM_ARCH_3M__)
return "ARM3";
#elif defined(__ARM_ARCH_4T__) || defined(__TARGET_ARM_4T)
return "ARM4T";
#elif defined(__ARM_ARCH_5_) || defined(__ARM_ARCH_5E_)
return "ARM5"
#elif defined(__ARM_ARCH_6T2_) || defined(__ARM_ARCH_6T2_)
return "ARM6T2";
#elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__)
return "ARM6";
#elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__)
return "ARM7";
#elif defined(__ARM_ARCH_7A__) || defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__)
return "ARM7A";
#elif defined(__ARM_ARCH_7R__) || defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7S__)
return "ARM7R";
#elif defined(__ARM_ARCH_7M__)
return "ARM7M";
#elif defined(__ARM_ARCH_7S__)
return "ARM7S";
#elif defined(__aarch64__) || defined(_M_ARM64)
return "ARM64";
#elif defined(mips) || defined(__mips__) || defined(__mips)
return "MIPS";
#elif defined(__sh__)
return "SUPERH";
#elif defined(__powerpc) || defined(__powerpc__) || defined(__powerpc64__) || defined(__POWERPC__) || defined(__ppc__) || defined(__PPC__) || defined(_ARCH_PPC)
return "POWERPC";
#elif defined(__PPC64__) || defined(__ppc64__) || defined(_ARCH_PPC64)
return "POWERPC64";
#elif defined(__sparc__) || defined(__sparc)
return "SPARC";
#elif defined(__m68k__)
return "M68K";
#else
return "UNKNOWN";
#endif
}
}
There's no inter-compiler standard, but each compiler tends to be quite consistent. You can build a header for yourself that's something like this:
#if MSVC
#ifdef _M_X86
#define ARCH_X86
#endif
#endif
#if GCC
#ifdef __i386__
#define ARCH_X86
#endif
#endif
There's not much point to a comprehensive list, because there are thousands of compilers but only 3-4 in widespread use (Microsoft C++, GCC, Intel CC, maybe TenDRA?). Just decide which compilers your application will support, list their #defines, and update your header as needed.
If you would like to dump all available features on a particular platform, you could run GCC like:
gcc -march=native -dM -E - </dev/null
It would dump macros like #define __SSE3__ 1, #define __AES__ 1, etc.
If you want a cross-compiler solution then just use Boost.Predef which contains
BOOST_ARCH_ for system/CPU architecture one is compiling for.
BOOST_COMP_ for the compiler one is using.
BOOST_LANG_ for language standards one is compiling against.
BOOST_LIB_C_ and BOOST_LIB_STD_ for the C and C++ standard library in use.
BOOST_OS_ for the operating system we are compiling to.
BOOST_PLAT_ for platforms on top of operating system or compilers.
BOOST_ENDIAN_ for endianness of the os and architecture combination.
BOOST_HW_ for hardware specific features.
BOOST_HW_SIMD for SIMD (Single Instruction Multiple Data) detection.
Note that although Boost is usually thought of as a C++ library, Boost.Predef is pure header-only and works for C
For example
#include <boost/predef.h>
// or just include the necessary headers
// #include <boost/predef/architecture.h>
// #include <boost/predef/other.h>
#if BOOST_ARCH_X86
#if BOOST_ARCH_X86_64
std::cout << "x86-64\n";
#elif BOOST_ARCH_X86_32
std::cout << "x86-32\n";
#else
std::cout << "x86-" << BOOST_ARCH_WORD_BITS << '\n'; // Probably x86-16
#endif
#elif BOOST_ARCH_ARM
#if BOOST_ARCH_ARM >= BOOST_VERSION_NUMBER(8, 0, 0)
#if BOOST_ARCH_WORD_BITS == 64
std::cout << "ARMv8+ Aarch64\n";
#elif BOOST_ARCH_WORD_BITS == 32
std::cout << "ARMv8+ Aarch32\n";
#else
std::cout << "Unexpected ARMv8+ " << BOOST_ARCH_WORD_BITS << "bit\n";
#endif
#elif BOOST_ARCH_ARM >= BOOST_VERSION_NUMBER(7, 0, 0)
std::cout << "ARMv7 (ARM32)\n";
#elif BOOST_ARCH_ARM >= BOOST_VERSION_NUMBER(6, 0, 0)
std::cout << "ARMv6 (ARM32)\n";
#else
std::cout << "ARMv5 or older\n";
#endif
#elif BOOST_ARCH_MIPS
#if BOOST_ARCH_WORD_BITS == 64
std::cout << "MIPS64\n";
#else
std::cout << "MIPS32\n";
#endif
#elif BOOST_ARCH_PPC_64
std::cout << "PPC64\n";
#elif BOOST_ARCH_PPC
std::cout << "PPC32\n";
#else
std::cout << "Unknown " << BOOST_ARCH_WORD_BITS << "-bit arch\n";
#endif
You can find out more on how to use it here
Demo on Godbolt
There's nothing standard. Brian Hook documented a bunch of these in his "Portable Open Source Harness", and even tries to make them into something coherent and usable (ymmv regarding that). See the posh.h header on this site:
http://hookatooka.com/poshlib/
Note, the link above may require you to enter some bogus userid/password due to a DOS attack some time ago.
There's a list of the #defines here. There was a previous highly voted answer that included this link but it was deleted by a mod presumably due to SO's "answers must have code" rule. So here's a random sample. Follow the link for the full list.
AMD64
Type
Macro
Description
Identification
__amd64__ __amd64 __x86_64__ __x86_64
Defined by GNU C and Sun Studio
Identification
_M_X64 _M_AMD64
Defined by Visual Studio
If you need a fine-grained detection of CPU features, the best approach is to ship also a CPUID program which outputs to stdout or some "cpu_config.h" file the set of features supported by the CPU. Then you integrate that program with your build process.