How to test different gcc target attributes - c++

I've started using gcc target attributes and Intel intrinsics to write hardware-specific implementations of certain functions, but I'm unsure how to test the different implementations easily. Given the example below, how can I, at either compile time or runtime, select and test the default version when running on a machine that supports SSE3?
#include <iostream>
__attribute__((target("default")))
void hello() {
std::cout << "Hello default world" << std::endl;
}
__attribute__((target("sse3")))
void hello() {
std::cout << "Hello SSE3 world" << std::endl;
}
int main() {
hello();
}

Related

Prevent usage of a third-party base class

I have implemented my own wrapper around std::chrono::steady_clock and would like to prevent any other developer from using the original:
#include <iostream>
#include <chrono>
namespace my_chrono {
class MyClock : public std::chrono::steady_clock {
// stuff here...
};
}
int main()
{
auto my_now = my_chrono::MyClock::now(); // this should compile
auto chrono_now = std::chrono::steady_clock::now(); // this should be prevented
std::cout << my_now.time_since_epoch().count() << ", " << chrono_now.time_since_epoch().count() << std::endl;
return 0;
}
Unfortunately, I cannot find a way to force usage of MyClock over steady_clock (for example via compiler errors or warnings treated as errors).
You cannot prevent other code from using something you have no control over, and the C++ standard library is something over which you have no control.

Possible clang compiler bug with thread_local in templates

In my project I need separate thread-local storage for each instance of a data member. Because I ran into problems while implementing this functionality, I extracted a simplified version of the code into the following C++14 program:
#include <iostream>
#include <unordered_map>
#include <vector>
template<class T> class ThreadLocalMember
{
public:
T& local() { return store.map[this]; }
private:
struct Store
{
Store() { std::cout << "construct" << std::endl; }
~Store() { std::cout << "destruct" << std::endl; }
std::unordered_map<ThreadLocalMember<T>*, T> map;
};
static thread_local Store store;
};
template <class T> thread_local typename ThreadLocalMember<T>::Store ThreadLocalMember<T>::store;
int main()
{
ThreadLocalMember<int> counter;
std::cout << "point 1" << std::endl;
int result = counter.local();
std::cout << "point 2; result: " << result << std::endl;
return result;
}
The expected output is
point 1
construct
point 2; result: 0
destruct
However, when compiled with clang (Apple LLVM version 9.1.0, clang-902.0.39.1) on macOS High Sierra 10.13.4 using
clang++ -std=c++14 -O3 ThreadLocalMember.cpp -o test
(or with -O1 or -O2) the output is:
point 1
Illegal instruction: 4
It seems that the constructor of the thread_local variable is never executed and the program crashes when the variable is first accessed.
The problem goes away when:
- the program is compiled without optimisation (which is not acceptable in production mode)
- the template class is replaced by a regular class (a possible workaround but very annoying)
- the thread_local keyword is removed in both places (but then the program no longer does what I need it to do when there are multiple threads)
The program also compiles and runs fine when using gcc 5.4.0 on Ubuntu 16, with or without optimisation flag.
Is there something wrong with my code, or am I looking at a clang compiler bug?

Is equal() included by default in the global namespace?

This is a question regarding the default global namespace in C++. I have the following code that compiles and runs properly using g++ clang-500.2.79.
#include <string>
#include <iostream>
using std::string;
using std::endl;
using std::cout;
bool is_palindrome(const string& str){
return equal(str.begin(), str.end(), str.rbegin());
}
int main(){
cout << "Hello is a palindrome: " << is_palindrome("Hello") << endl;
cout << "madam is a palindrome: " << is_palindrome("madam") << endl;
return 0;
}
My question is: why does this code compile? I forgot to put #include <algorithm> and using std::equal at the beginning of my file, so I expected the compiler to complain.
The example at http://en.cppreference.com/w/cpp/algorithm/equal confirms that I should be using std::equal.
To investigate this further, I tried to track down exactly which version of the equal() function was being called. Being a relative newbie to C++ I don't know exactly how to do this either. I tried,
cout << "The function is: " << equal << endl;
Which generated a compiler error with some interesting information:
/usr/include/c++/4.2.1/bits/stl_algobase.h:771:5:
note: 'std::equal' declared here
Try as I might, I can't find information about stl_algobase (or, more probably, I don't understand what I've found). Is stl_algobase a set of functions that are automatically included in the global namespace?
A further question: what is the proper way (in code or otherwise) to track down which function is being called when you are dealing with potentially overloaded or template functions in C++?
equal is in the std namespace. What you are seeing is argument-dependent lookup (ADL): because the arguments (the string iterators) have types associated with namespace std, name lookup for the unqualified call to equal also considers that namespace. The declaration is visible without #include <algorithm> only because, in this implementation, <string> or <iostream> happens to pull in bits/stl_algobase.h. That is not guaranteed, so you should still include <algorithm>.
Here's a simplified example:
namespace foo
{
struct Bar {};
}
namespace foo
{
void bar(const Bar&) {}
void bar(int) {}
}
int main()
{
foo::Bar b;
foo::bar(b); // OK
bar(b); // ADL, OK
foo::bar(42); // OK
bar(42); // No ADL: error: 'bar' was not declared in this scope
}

Ridiculously slow unique_ptr dtor call when debugger is attached (msvc)

struct test_struct
{
test_struct() {}
~test_struct() {}
};
#include <vector>
#include <memory>
#include <cstdio>
int main()
{
printf("ctor begin\n");
{
std::vector<std::unique_ptr<test_struct>> test_vec;
const int count = 100000;
for (auto i = 0; i < count; i++) {
test_vec.emplace_back(new test_struct);
}
printf("dtor begin\n");
}
printf("dtor end\n");
}
I'm using VS2010 and have found a ridiculous performance issue. The code above works well in both debug and release builds when run without the debugger (Ctrl+F5), but when the debugger is attached (F5), the destructor calls for unique_ptr are intolerably slow. The resulting machine code is fairly well optimized, so I suspect this is a debugger issue rather than a compiler issue, but I don't know how to deal with it. My questions are:
Can this problem be reproduced on your machine?
What is the reason for this behaviour?
Is there any workaround?
The slowdown is caused by memory checking that occurs whenever memory is freed. However, this is a special system-/debugger-level heap, and isn't anything you can control from within your program.
There's a great article on the issue. To summarize: you have to set an environment variable to disable it!
Luckily, you can set project-specific environment variables from the Debugging options in the Project Settings for your project, so that the environment variable is only applied to your program.
I used this simplified program to test:
#include <iostream>
#include <memory>
#include <vector>
int main()
{
std::cout << "ctor begin" << std::endl;
{
std::vector<std::unique_ptr<int>> test_vec;
for (unsigned i = 0; i < 100000; i++)
test_vec.emplace_back(new int);
std::cout << "dtor begin" << std::endl;
}
std::cout << "dtor end" << std::endl;
}
By setting _NO_DEBUG_HEAP=1 as an environment variable (either system-wide, which I wouldn't recommend, or through the Debugging options), the code runs in roughly the same amount of time whether or not the debugger is attached.

typeinfo / typeid output

I'm currently trying to debug a piece of simple code and wish to see how a specific variable type changes during the program.
I'm using the typeinfo header so I can use typeid(...).name(). I'm aware that the result of name() is compiler-specific, so the output might not be particularly helpful or standard.
I'm using GCC, but despite searching I cannot find a list of the possible output, assuming a list of typeid output symbols exists. I don't want to do any sort of casting based on the output or manipulate any kind of data, just follow the variable's type.
#include <iostream>
#include <typeinfo>
int main()
{
int a = 10;
std::cout << typeid(a).name() << std::endl;
}
Is there a symbol list anywhere?
I don't know if such a list exists, but you can make a small program to print them out:
#include <iostream>
#include <typeinfo>
#define PRINT_NAME(x) std::cout << #x << " - " << typeid(x).name() << '\n'
int main()
{
PRINT_NAME(char);
PRINT_NAME(signed char);
PRINT_NAME(unsigned char);
PRINT_NAME(short);
PRINT_NAME(unsigned short);
PRINT_NAME(int);
PRINT_NAME(unsigned int);
PRINT_NAME(long);
PRINT_NAME(unsigned long);
PRINT_NAME(float);
PRINT_NAME(double);
PRINT_NAME(long double);
PRINT_NAME(char*);
PRINT_NAME(const char*);
//...
}