Visual Studio equivalent of GCC's __attribute__((target("sse"))) - c++

This is the problem I encountered when I tried to migrate an existing GCC project to Visual Studio.
For a given function void foo(), I hand-optimized it with SSE/AVX intrinsics, producing two versions, void foo_sse() and void foo_avx(), and I use cpuid to invoke the correct version at runtime. To tell GCC that void foo_sse() and void foo_avx() should be compiled with the -msse and -mavx options respectively, I add __attribute__((target("sse"))) and __attribute__((target("avx"))) to their declarations.
This works well with GCC, but I cannot find an equivalent for VS. For security reasons, I have to keep all the code in a single cpp file to mangle symbol names, so I cannot simply put the two functions in two different cpp files and give them different compiler options.
How can I specify compiler options on a per-function basis in VS? Thanks in advance.
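For illustration, the GCC-side setup described above might look roughly like the following sketch. The names FOO_TARGET, FOO_HAVE_X86_DISPATCH, add, add_sse, add_avx and add_scalar are hypothetical and not from my project, and __builtin_cpu_supports stands in for a hand-written cpuid check. The macro expanding to nothing on other compilers is only an assumption of the sketch, not a claim about what VS needs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper macro: applies GCC's per-function target attribute on
// x86 GCC/Clang and expands to nothing everywhere else (an assumption).
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#  include <immintrin.h>
#  define FOO_TARGET(isa) __attribute__((target(isa)))
#  define FOO_HAVE_X86_DISPATCH 1
#else
#  define FOO_TARGET(isa)
#  define FOO_HAVE_X86_DISPATCH 0
#endif

// Portable fallback used when no SIMD version can be selected.
static void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if FOO_HAVE_X86_DISPATCH
// Compiled as if -msse were given, for this function only.
FOO_TARGET("sse")
static void add_sse(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    for (; i < n; ++i) out[i] = a[i] + b[i];  // remaining elements
}

// Compiled as if -mavx were given, for this function only.
FOO_TARGET("avx")
static void add_avx(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)
        _mm256_storeu_ps(out + i,
                         _mm256_add_ps(_mm256_loadu_ps(a + i), _mm256_loadu_ps(b + i)));
    for (; i < n; ++i) out[i] = a[i] + b[i];  // remaining elements
}
#endif

// Runtime dispatcher: picks the widest version the running CPU reports.
void add(const float* a, const float* b, float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
#if FOO_HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx")) { add_avx(a, b, out, n); return; }
    if (__builtin_cpu_supports("sse")) { add_sse(a, b, out, n); return; }
#endif
    add_scalar(a, b, out, n);
}
```

All three versions live in one translation unit, which is the constraint I have to keep; only the attribute changes which instructions each function body is allowed to use.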

Related

Why a basic unreferenced c++ function does not get optimized away?

Consider this simple code:
#include <stdio.h>
extern "C"
{
void p4nenc256v32();
void p4ndec256v32();
}
void bigFunctionTest()
{
p4nenc256v32();
p4ndec256v32();
}
int main()
{
printf("hello\n");
}
The code size of those p4nenc256v32/p4ndec256v32 functions is significant, roughly 1.5MB. The binary, compiled with the latest VS2022 with optimizations enabled, is 1.5MB. If I comment out the unused bigFunctionTest function, the resulting binary is smaller by 1.4MB. Any ideas why this clearly unused function isn't eliminated by the compiler and/or linker in release builds? By default, VS2022 in release uses /Gy and /OPT:REF.
I also tried mingw64 (gcc 12.2) with -fdata-sections -ffunction-sections -Wl,--gc-sections and the results were much worse: compiled with that dummy function, the exe grew by 5.2MB. It seems that the MS and GCC compilers agree that for some reason these functions cannot be removed.
I created a working sample project that shows the issue: https://github.com/pps83/TestLinker.git (make sure to pull submodules as well) and filed an issue with the VS issue tracker: Linker doesn't eliminate correctly dead code. However, I think I might get a better explanation from SO users of what the reason for the problem might be.

Wrong 32-bit calling convention for InterlockedExchange for Clang++, but MSVC is fine

I am using clang power tools to compile a project which is usually compiled using visual studio.
In boost's lwm_win32.hpp header (yes, we are using an old version of boost and currently cannot update) I get an error reading:
function declared stdcall here was previously declared without calling convention
The line in question is:
extern "C" __declspec(dllimport) long __stdcall InterlockedExchange(long volatile *, long);
I don't get any errors or warnings for this line when compiling with visual studio. Interestingly I don't get any even if I manually change the calling convention from __stdcall to __cdecl.
Clang tells me which previous declaration it has seen. By manually inspecting this location, I would say clang is right. After deciphering all the preprocessor defines, I would also say __cdecl is what Visual Studio should see. However, neither the official documentation for InterlockedExchange nor the official documentation for the intrinsic mentions a specific calling convention.
So basically I am unsure what the root of the problem is. Visual studio accepting any calling convention in the declaration? Clang not seeing the correct declaration due to some preprocessor macros set to the wrong value? Boost declaring the wrong calling convention? I must admit I am confused.
Visual Studio version is 2015 Update 3.
Clang++ version is 6.0.0 called with parameter -fms-compatibility-version=19.
EDIT
As suggested in the comments I had a look at the preprocessor output of MSVC and Clang. They looked rather identical to me. For both the line from boost expands to
extern "C" __declspec(dllimport) long __stdcall _InterlockedExchange(long volatile *, long);
Both have
#pragma intrinsic(_InterlockedExchange)
and the declarations
long __cdecl _InterlockedExchange(long volatile * _Target, long _Value);
LONG __cdecl _InterlockedExchange(LONG volatile *Target, LONG Value);
as well as several inline implementations for different overloads.
In both compilers I target 32-bit (-m32 for clang).
Do the clang power tools offer you things that you really don't want to live without?
If not (and I imagine that is a big if) then you might consider experimenting with VS 2017's support for clang. I have no experience of it personally and it's all still a bit new but what I do know is that MS are putting a lot of work in and it may well pay off in the long run.
As it is, I think you might be out on a bit of a limb. And whatever should and should not be in the header files, I would say that what MS say goes, wouldn't you?
Why are you stuck with that old version of boost? That might be a blocking issue here.

-mimplicit-it compiler flag not recognized

I am attempting to compile a C++ library for a Tegra TK1. The library links to TBB, which I pulled using the package manager. During compilation I got the following error
/tmp/cc4iLbKz.s: Assembler messages:
/tmp/cc4iLbKz.s:9541: Error: thumb conditional instruction should be in IT block -- `strexeq r2,r3,[r4]'
A bit of googling and this question led me to try adding -mimplicit-it=thumb to CMAKE_CXX_FLAGS, but the compiler doesn't recognize it.
I am compiling on the Tegra with kernel 3.10.40-grinch-21.3.4, using the gcc 4.8.4 compiler (that's what comes back when I type c++ -v).
I'm not sure what the initial error message means, though I think it has something to do with the TBB linked library rather than the source I'm compiling. The problem with the fix is also mysterious. Can anyone shed some light on this?
-mimplicit-it is an option to the assembler, not to the compiler. Thus, in the absence of specific assembler flags in your makefile (which you probably don't have, given that you don't appear to be using a separate assembler step), you'll need to use the -Wa option to the compiler to pass it through, i.e. -Wa,-mimplicit-it=thumb.
The source of the issue is almost certainly some inline assembly (possibly from a static inline in a header file, if you're really only linking pre-built libraries) which contains conditionally-executed instructions; I'm going to guess it's something like a cmpxchg implementation. Your toolchain could well be configured to compile to the Thumb instruction set by default, and Thumb requires a preceding it (If-Then) instruction to set up conditional instructions. Another alternative might therefore be to just compile with -marm (and/or remove -mthumb if appropriate) and sidestep the issue by not using Thumb at all.
Adding the compiler option:
-Wa,-mimplicit-it=thumb
should solve the problem.

ImageMagick pthread.h multiple definition

When trying to compile more recent versions of ImageMagick (v6.8.7-2 or later, v6.8.7-1 is fine), I get a bunch of:
CCLD magick/libMagickCore-6.Q16.la
magick/.libs/magick_libMagickCore_6_Q16_la-animate.o: In function `__pthread_cleanup_routine':
/usr/include/pthread.h:581: multiple definition of `__pthread_cleanup_routine'
magick/.libs/magick_libMagickCore_6_Q16_la-accelerate.o:/usr/include/pthread.h:581: first defined here
magick/.libs/magick_libMagickCore_6_Q16_la-annotate.o: In function `__pthread_cleanup_routine':
/usr/include/pthread.h:581: multiple definition of `__pthread_cleanup_routine'
magick/.libs/magick_libMagickCore_6_Q16_la-accelerate.o:/usr/include/pthread.h:581: first defined here
magick/.libs/magick_libMagickCore_6_Q16_la-artifact.o: In function `__pthread_cleanup_routine':
/usr/include/pthread.h:581: multiple definition of `__pthread_cleanup_routine'
magick/.libs/magick_libMagickCore_6_Q16_la-accelerate.o:/usr/include/pthread.h:581: first defined here
magick/.libs/magick_libMagickCore_6_Q16_la-attribute.o: In function `__pthread_cleanup_routine':
/usr/include/pthread.h:581: multiple definition of `__pthread_cleanup_routine'
magick/.libs/magick_libMagickCore_6_Q16_la-accelerate.o:/usr/include/pthread.h:581: first defined here
... goes on for quite a bit longer, all the same.
The pertinent area of /usr/include/pthread.h (from glibc-headers 2.5-118.el5_10.2) is:
/* Function called to call the cleanup handler. As an extern inline
function the compiler is free to decide inlining the change when
needed or fall back on the copy which must exist somewhere else. */
extern __inline void
__pthread_cleanup_routine (struct __pthread_cleanup_frame *__frame)
{
if (__frame->__do_it) // <======= this is :581
__frame->__cancel_routine (__frame->__cancel_arg);
}
I've been posting on ImageMagick's forum without response.
Even if you can't say exactly what's happening, how do I start figuring out whether the issue is with ImageMagick or pthread.h? Where do I go from there?
grep pthread_cleanup_routine -r * only shows matches against the binary object files -- none of ImageMagick's source code has pthread_cleanup_routine in it. A few of the sources include "pthread.h" of course.
That's leading me to believe that this is a glibc issue, not an ImageMagick issue... but, again, previous versions of ImageMagick compile just fine. (I have diff'ed the svn sources between versions where it broke. Lots of configuration/makefile changes, but nothing sticks out to me as to why it would cause this.)
I'm on CentOS 5, kernel 2.6.18-308.24.1.el5, gcc v4.9.0, ld v2.24, glibc-headers 2.5-118.el5_10.2
I've seen a lot of people posting similar issues with other packages than ImageMagick. Hopefully others will find this useful.
Changing pthread.h, just before __pthread_cleanup_routine :
extern __inline void
to
#if __STDC_VERSION__ < 199901L
extern
#endif
__inline void
fixes the issue. Older versions of glibc had a problem when -fexceptions was used together with inline semantics that do not conform to C99 (see http://gcc.gnu.org/ml/gcc-patches/2006-11/msg01030.html). More recent glibc versions fix the issue too, but this should serve as a temporary fix for those who don't want to, or shouldn't, upgrade.
ImageMagick svn 13539 (which later became v6.8.7-2) began using -fexceptions.
I faced this error with a newer gcc compiler (4.9.3).
The ImageMagick (6.8.9_7) configure script checks whether the compiler supports the gnu99 standard. If it does, the configure script sets the standard to gnu99 and also enables OpenMP.
Inline semantics change under the gnu99 C standard, which causes the multiple definition of the extern inline function:
https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/Inline.html#Inline.
So, I added compiler flag -fgnu89-inline to use older semantics for inline and it fixed the issue.

dll Export/init problem (static vars init?) Visual Studio C++

I want to run an example plugin for Clang/LLVM, specifically llvm\tools\clang\examples\PrintFunctionNames. I managed to build it and I see a PrintFunctionNames.exports file, but I don't think Visual Studio supports it. The file contains simply _ZN4llvm8Registry*. I have no idea what that is, but I suspect it refers to namespace llvm, class Registry, which is defined as
template <typename T, typename U = RegistryTraits<T> >
class Registry {
I suspect the key line is at the end of the example file
static FrontendPluginRegistry::Add<PrintFunctionNamesAction> X("print-fns", "print function names");
print-fns is the name, while the second parameter is the description. When I try loading/running the DLL via
clang -cc1 -load printFunctionNames.dll -plugin print-fns a.c
I get an error about not finding print-fns. I suspect it's because the static variable is never initialized, so it never registers the plugin. A wrong DLL name would produce an error about loading the module instead.
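To make the suspicion concrete, here is a minimal sketch of the self-registration idiom as I understand it. PluginRegistry, entries and contains are hypothetical stand-ins that I made up for illustration; this is not LLVM's actual Registry implementation. The point is only that the plugin becomes known as a side effect of a static object's constructor running.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a plugin registry; NOT LLVM's Registry class.
class PluginRegistry {
public:
    // A function-local static avoids depending on the initialization order
    // of globals in other translation units.
    static std::map<std::string, std::string>& entries() {
        static std::map<std::string, std::string> table;
        return table;
    }

    // Constructing an Add object records a (name, description) pair.
    struct Add {
        Add(const std::string& name, const std::string& desc) {
            assert(!name.empty());
            entries()[name] = desc;
        }
    };

    // Lookup that a host program could perform for "-plugin <name>".
    static bool contains(const std::string& name) {
        return entries().count(name) != 0;
    }
};

// Same shape as the line at the end of the example file: registration only
// happens if this object's constructor runs during static initialization.
static PluginRegistry::Add X("print-fns", "print function names");
```

In this sketch the registry and the registering object share one program image. If the host executable and the DLL each ended up with their own copy of the table, the lookup in the host would fail even though the constructor ran, which would look the same as my symptom.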
I created a def file and added it to my project. It compiled but still no luck. Here is my def file
LIBRARY printFunctionNames
EXPORTS
X DATA
How do I register the plugin or get this example working?
Ok, becoming slightly more clear. To summarize: Visual Studio has nothing to do with it, really. This is a plugin for the clang executable. Therefore, there must be a method to communicate between them (the plugin interface). This appears to be an undocumented interface, so it's taking a bit of guesswork.
Troubleshooting DLL issues is done with "Dependency Walker" aka "Depends". It offers a profiling mode, in which all symbol lookups can be profiled. I.e. if you profile clang -cc1 -load printFunctionNames.dll -plugin print-fns a.c, you will see what symbols clang expects from your DLL, and in what order.
It looks like you're trying to mix C++ code built with two different, incompatible compilers. That's not supported, and the error you're seeing is a typical sign of that: C++ compilers usually use a "name mangling scheme", and if two compilers are incompatible then their name mangling schemes don't line up. One compiler may mangle llvm::Registry as _ZN4llvm8Registry* while another refers to it as llvm__Registry.
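As a rough illustration of what a mangled name encodes, the sketch below asks the C++ runtime to demangle a symbol. It assumes a GCC- or Clang-style toolchain that ships <cxxabi.h>; the helper name demangle and the macro HAVE_CXXABI are made up for this example, and on toolchains without that header the function simply returns its input unchanged. In the Itanium scheme, the prefix _ZN4llvm8Registry spells out a nested name: a 4-character component "llvm" followed by an 8-character component "Registry".

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Detect <cxxabi.h> where the preprocessor supports __has_include.
#if defined(__has_include)
#  if __has_include(<cxxabi.h>)
#    include <cxxabi.h>
#    define HAVE_CXXABI 1
#  endif
#endif
#ifndef HAVE_CXXABI
#  define HAVE_CXXABI 0
#endif

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if demangling is unavailable or fails.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
#if HAVE_CXXABI
    int status = 0;
    // With a null output buffer, __cxa_demangle returns a malloc'ed string.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status == 0 && text != nullptr) {
        std::string result(text);
        std::free(text);
        return result;
    }
    std::free(text);  // free(nullptr) is a no-op
#endif
    return mangled;
}
```

Calling demangle("_ZN4llvm8RegistryE") on such a toolchain is one way to check what a symbol pattern from an .exports file refers to. A compiler with a different scheme would not produce or look up names of this form at all.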