Can't get warnings to work for header-only library - c++

I'm creating an header-only library, and I would like to get warnings for it displayed during compilation. However, it seems that only warnings for the "main" project including the library get displayed, but not for the library itself.
Is there a way I can force the compiler to check for warnings in the included library?
// main.cpp
#include "MyHeaderOnlyLib.hpp"
int main() { ... }
// Compile
g++ ./main.cpp -Wall -Wextra -pedantic ...
// Warnings get displayed for main.cpp, but not for MyHeaderOnlyLib.hpp
I'm finding MyHeaderOnlyLib.hpp via a CMake script, using find_package. I've checked the command executed by CMake, and it's using -I, not -isystem.
I've tried both including the library with <...> (when it's in the /usr/include/ directory), or locally with "...".

I suppose that you have a template library and you are complaining about the lack of warnings from its compilation. Don't look for bad #include path, that would end up as an error. Unfortunately, without specialization (unless the templates are used by the .cpp), the compiler has no way to interpret the templates reliably, let alone produce sensible warnings. Consider this:
#include <vector>
template <class C>
struct T {
bool pub_x(const std::vector<int> &v, int i)
{
return v.size() < i;
}
bool pub_y(const std::vector<int> &v, int i)
{
return v.size() < i;
}
};
typedef T<int> Tint; // will not help
bool pub_z(const std::vector<int> &v, unsigned int i) // if signed, produces warning
{
return v.size() < i;
}
class WarningMachine {
WarningMachine() // note that this is private
{
//T<int>().pub_y(std::vector<int>(), 10); // to produce warning for the template
}
};
int main()
{
//Tint().pub_y(std::vector<int>(), 10); // to produce warning for the template
return 0;
}
You can try it out in codepad. Note that the pub_z will immediately produce signed / unsigned comparison warning when compiled, despite never being called. It is a whole different story for the templates, though. Even if T::pub_y is called, T::pub_x still passes unnoticed without a warning. This depends on a compiler implementation, some compilers perform more aggressive checking once all the information is available, other tend to be lazy. Note that neither T::pub_x or T::pub_y depend on the template argument.
The only way to do it reliably is to specialize the templates and call the functions. Note that the code which does that does not need to be accessible for that (such as in WarningMachine), making it a candidate to be optimized away (but that depends), and also meaning that the values passed to the functions may not need to be valid values as the code will never run (that will save you allocating arrays or preparing whatever data the functions may need).
On the other hand, since you will have to write a lot of code to really check all the functions, you may as well pass valid data and check for result correctness and make it useful, instead of likely confusing the hell of anyone who reads the code after you (as is likely in the above case).

Related

For a function that takes a const struct, does the compiler not optimize the function body?

I have the following piece of code:
#include <stdio.h>
typedef struct {
bool some_var;
} model_t;
const model_t model = {
true
};
void bla(const model_t *m) {
if (m->some_var) {
printf("Some var is true!\n");
}
else {
printf("Some var is false!\n");
}
}
int main() {
bla(&model);
}
I'd imagine that the compiler has all the information required to eliminate the else clause in the bla() function. The only code path that calls the function comes from main, and it takes in const model_t, so it should be able to figure out that that code path is not being used. However:
With GCC 12.2 we see that the second part is linked in.
If I inline the function this goes away though:
What am I missing here? And is there some way I can make the compiler do some smarter work? This happens in both C and C++ with -O3 and -Os.
The compiler does eliminate the else path in the inlined function in main. You're confusing the global function that is not called anyway and will be discarded by the linker eventually.
If you use the -fwhole-program flag to let the compiler know that no other file is going to be linked, that unused segment is discarded:
[See online]
Additionally, you use static or inline keywords to achieve something similar.
The compiler cannot optimize the else path away as the object file might be linked against any other code. This would be different if the function would be static or you use whole program optimization.
The only code path that calls the function comes from main
GCC can't know that unless you tell it so with -fwhole-program or maybe -flto (link-time optimization). Otherwise it has to assume that some static constructor in another compilation unit could call it. (Including possibly in a shared library, but another .cpp that you link with could do it.) e.g.
// another .cpp
typedef struct { bool some_var; } model_t;
void bla(const model_t *m); // declare the things from the other .cpp
int foo() {
model_t model = {false};
bla(&model);
return 1;
}
int some_global = foo(); // C++ only: non-constant static initializer.
Example on Godbolt with these lines in the same compilation unit as main, showing that it outputs both Some var is false! and then Some var is true!, without having changed the code for main.
ISO C doesn't have easy ways to get init code executed, but GNU C (and GCC specifically) have ways to get code run at startup, not called by main. This works even for shared libraries.
With -fwhole-program, the appropriate optimization would be simply not emitting a definition for it at all, as it's already inlined into the call-site in main. Like with inline (In C++, a promise that any other caller in another compilation unit can see its own definition of the function) or static (private to this compilation unit).
Inside main, it has optimized away the branch after constant propagation. If you ran the program, no branch would actually execute; nothing calls the stand-alone definition of the function.
The stand-alone definition of the function doesn't know that the only possible value for m is &model. If you put that inside the function, then it could optimize like you're expecting.
Only -fPIC would force the compiler to consider the possibility of symbol-interposition so the definition of const model_t model isn't the one that is in effect after (dynamic) linking. But you're compiling code for an executable not a library. (You can disable symbol-interposition for a global variable by giving it "hidden" visibility, __attribute__((visibility("hidden"))), or use -fvisibility=hidden to make that the default).

Make clang's Memory Sanitizer report unitialised variable use without it deciding branching

I was experimenting with Clang 6.0's Memory Sanitizer(MSan).
Code is compiled with
clang++ memsans.cpp -std=c++14 -o memsans -g -fsanitize=memory -fno-omit-frame-pointer -Weverything
on Ubuntu 18.04. As per the MSan documentation
It will tolerate copying of uninitialized memory, and also simple
logic and arithmetic operations with it. In general, MemorySanitizer
silently tracks the spread of uninitialized data in memory, and
reports a warning when a code branch is taken (or not taken) depending
on an uninitialized value.
So the following code does not generate any error
#include <iostream>
class Test {
public:
int x;
};
int main() {
Test t;
std::cout << t.x;
std::cout << std::endl;
return 0;
}
But this will
#include <iostream>
class Test {
public:
int x;
};
int main() {
Test t;
if(t.x) {
std::cout << t.x;
}
std::cout << std::endl;
return 0;
}
Ideally one would like both of these code samples to generate some sort of error since both are "using" an uninitialised variable in the sense that the first one is printing it. This code is a small test code and hence the error in the first code is obvious, however if it were a large codebase with a similar error, MSan would totally miss this. Is there any hack to force MSan to report this type of error as well ?
It sounds like your C++ library wasn't built with MSan. Unlike ASan and UBSan, MSan requires that the whole program was built with msan enabled. Think of it like having a different ABI, you shouldn't link two programs built with different msan settings. The one exception is libc for which msan adds "interceptors" to make it work.
If you write your own code which you want to integrate with msan by reporting an error where msan normally wouldn't (say, in a function which makes a copy but you know the data needs to be initialized) then you can use __msan_check_mem_is_initialized from the msan_interface.h file: https://github.com/llvm-mirror/compiler-rt/blob/master/include/sanitizer/msan_interface.h

Is there a way to detect inline function ODR violations?

So I have this code in 2 separate translation units:
// a.cpp
#include <stdio.h>
inline int func() { return 5; }
int proxy();
int main() { printf("%d", func() + proxy()); }
// b.cpp
inline int func() { return 6; }
int proxy() { return func(); }
When compiled normally the result is 10. When compiled with -O3 (inlining on) I get 11.
I have clearly done an ODR violation for func().
It showed up when I started merging sources of different dll's into fewer dll's.
I have tried:
GCC 5.1 -Wodr (which requires -flto)
gold linker with -detect-odr-violations
setting ASAN_OPTIONS=detect_odr_violation=1 before running an instrumented binary with the address sanitizer.
Asan can supposedly catch other ODR violations (global vars with different types or something like that...)
This is a really nasty C++ issue and I am amazed there isn't reliable tooling for detecting it.
Pherhaps I have misused one of the tools I tried? Or is there a different tool for this?
EDIT:
The problem remains unnoticed even when I make the 2 implementations of func() drastically different so they don't get compiled to the same amount of instructions.
This also affects class methods defined inside the class body - they are implicitly inline.
// a.cpp
struct A { int data; A() : data(5){} };
// b.cpp
struct A { int data; A() : data(6){} };
Legacy code with lots of copy/paste + minor modifications after that is a joy.
The tools are imperfect.
I think Gold's check will only notice when the symbols have different types or different sizes, which isn't true here (both functions will compile to the same number of instructions, just using a different immediate value).
I'm not sure why -Wodr doesn't work here, but I think it only works for types, not functions, i.e. it will detect two conflicting definitions of a class type T but not your func().
I don't know anything about ASan's ODR checking.
The simplest way to detect such concerns is to copy all the functions into a single compilation unit (create one temporarily if needed). Any C++ compiler will then be able to detect and report duplicate definitions when compiling that file.

C++ template function compiles in header but not implementation

I'm trying to learn templates and I've run into this confounding error. I'm declaring some functions in a header file and I want to make a separate implementation file where the functions will be defined.
Here's the code that calls the header (dum.cpp):
#include <iostream>
#include <vector>
#include <string>
#include "dumper2.h"
int main() {
std::vector<int> v;
for (int i=0; i<10; i++) {
v.push_back(i);
}
test();
std::string s = ", ";
dumpVector(v,s);
}
Now, here's a working header file (dumper2.h):
#include <iostream>
#include <string>
#include <vector>
void test();
template <class T> void dumpVector(const std::vector<T>& v,std::string sep);
template <class T> void dumpVector(const std::vector<T>& v, std::string sep) {
typename std::vector<T>::iterator vi;
vi = v.cbegin();
std::cout << *vi;
vi++;
for (;vi<v.cend();vi++) {
std::cout << sep << *vi ;
}
std::cout << "\n";
return;
}
With implementation (dumper2.cpp):
#include <iostream>
#include "dumper2.h"
void test() {
std::cout << "!olleh dlrow\n";
}
The weird thing is that if I move the code that defines dumpVector from the .h to the .cpp file, I get the following error:
g++ -c dumper2.cpp -Wall -Wno-deprecated
g++ dum.cpp -o dum dumper2.o -Wall -Wno-deprecated
/tmp/ccKD2e3G.o: In function `main':
dum.cpp:(.text+0xce): undefined reference to `void dumpVector<int>(std::vector<int, std::allocator<int> >, std::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
collect2: ld returned 1 exit status
make: *** [dum] Error 1
So why does it work one way and not the other? Clearly the compiler can find test(), so why can't it find dumpVector?
The problem you're having is that the compiler doesn't know which versions of your template to instantiate. When you move the implementation of your function to x.cpp it is in a different translation unit from main.cpp, and main.cpp can't link to a particular instantiation because it doesn't exist in that context. This is a well-known issue with C++ templates. There are a few solutions:
1) Just put the definitions directly in the .h file, as you were doing before. This has pros & cons, including solving the problem (pro), possibly making the code less readable & on some compilers harder to debug (con) and maybe increasing code bloat (con).
2) Put the implementation in x.cpp, and #include "x.cpp" from within x.h. If this seems funky and wrong, just keep in mind that #include does nothing more than read the specified file and compile it as if that file were part of x.cpp In other words, this does exactly what solution #1 does above, but it keeps them in seperate physical files. When doing this kind of thing, it is critical that you not try to compile the #included file on it's own. For this reason, I usually give these kinds of files an hpp extension to distinguish them from h files and from cpp files.
File: dumper2.h
#include <iostream>
#include <string>
#include <vector>
void test();
template <class T> void dumpVector( std::vector<T> v,std::string sep);
#include "dumper2.hpp"
File: dumper2.hpp
template <class T> void dumpVector(std::vector<T> v, std::string sep) {
typename std::vector<T>::iterator vi;
vi = v.begin();
std::cout << *vi;
vi++;
for (;vi<v.end();vi++) {
std::cout << sep << *vi ;
}
std::cout << "\n";
return;
}
3) Since the problem is that a particular instantiation of dumpVector is not known to the translation unit that is trying to use it, you can force a specific instantiation of it in the same translation unit as where the template is defined. Simply by adding this: template void dumpVector<int>(std::vector<int> v, std::string sep); ... to the file where the template is defined. Doing this, you no longer have to #include the hpp file from within the h file:
File: dumper2.h
#include <iostream>
#include <string>
#include <vector>
void test();
template <class T> void dumpVector( std::vector<T> v,std::string sep);
File: dumper2.cpp
template <class T> void dumpVector(std::vector<T> v, std::string sep) {
typename std::vector<T>::iterator vi;
vi = v.begin();
std::cout << *vi;
vi++;
for (;vi<v.end();vi++) {
std::cout << sep << *vi ;
}
std::cout << "\n";
return;
}
template void dumpVector<int>(std::vector<int> v, std::string sep);
By the way, and as a total aside, your template function is taking a vector by-value. You may not want to do this, and pass it by reference or pointer or, better yet, pass iterators instead to avoid making a temporary & copying the whole vector.
This was what the export keyword was supposed to accomplish (i.e., by exporting the template, you'd be able to put it in a source file instead of a header. Unfortunately, only one compiler (Comeau) ever really implemented export completely.
As to why the other compilers (including gcc) didn't implement it, the reason is pretty simple: because export is extremely difficult to implement correctly. Code inside the template can change meaning (almost) completely, based on the type over which the template is instantiated, so you can't generate a conventional object file of the result of compiling the template. Just for example, x+y might compile to native code like mov eax, x/add eax, y when instantiated over an int, but compile to a function call if instantiated over something like std::string that overloads operator+.
To support separate compilation of templates, you have to do what's called two-phase name lookup (i.e., lookup the name both in the context of the template and in the context where the template is being instantiated). You typically also have the compiler compile the template to some sort of database format that can hold instantiations of the template over an arbitrary collection of types. You then add in a stage between compiling and linking (though it can be built into the linker, if desired) that checks the database and if it doesn't contain code for the template instantiated over all the necessary types, re-invokes the compiler to instantiate it over the necessary types.
Due to the extreme effort, lack of implementation, etc., the committee has voted to remove export from the next version of the C++ standard. Two other, rather different, proposals (modules and concepts) have been made that would each provide at least part of what export was intended to do, but in ways that are (at least hoped to be) more useful and reasonable to implement.
Template parameters are resolved as compile time.
The compiler finds the .h, finds a matching definition for dumpVector, and stores it. The compiling is finished for this .h. Then, it continues parsing files and compiling files. When it reads the dumpVector implementation in the .cpp, it's compiling a totally different unit. Nothing is trying to instantiate the template in dumper2.cpp, so the template code is simply skipped. The compiler won't try every possible type for the template, hoping there will be something useful later for the linker.
Then, at link time, no implementation of dumpVector for the type int has been compiled, so the linker won't find any. Hence why you're seeing this error.
The export keyword is designed to solve this problem, unfortunately few compilers support it. So keep your implementation with the same file as your definition.
A template function is not real function. The compiler turns a template function into a real function when it encounters a use of that function. So the entire template declaration has to be in scope it finds the call to DumpVector, otherwise it can't generate the real function.
Amazingly, a lot of C++ intro books get this wrong.
This is exactly how templates work in C++, you must put the implementation in the header.
When you declare/define a template function, the compiler can't magically know which specific types you may wish to use the template with, so it can't generate code to put into a .o file like it could with a normal function. Instead, it relies on generating a specific instantiation for a type when it sees the use of that instantiation.
So when the implementation is in the .C file, the compiler basically says "hey, there are no users of this template, don't generate any code". When the template is in the header, the compiler is able to see the use in main and actually generate the appropriate template code.
Most compilers don't allow you to put template function definitions in a separate source file, even though this is technically allowed by the standard.
See also:
http://www.parashift.com/c++-faq-lite/templates.html#faq-35.12
http://www.parashift.com/c++-faq-lite/templates.html#faq-35.14

Checking if a function has C-linkage at compile-time [unsolvable]

Is there any way to check if a given function is declared with C-linkage (that is, with extern "C") at compile-time?
I am developing a plugin system. Each plugin can supply factory functions to the plugin-loading code. However, this has to be done via name (and subsequent use of GetProcAddress or dlsym). This requires that the functions be declared with C-linkage so as to prevent name-mangling. It would be nice to be able to throw a compiler error if the referred-to function is declared with C++-linkage (as opposed to finding out at runtime when a function with that name does not exist).
Here's a simplified example of what I mean:
extern "C" void my_func()
{
}
void my_other_func()
{
}
// Replace this struct with one that actually works
template<typename T>
struct is_c_linkage
{
static const bool value = true;
};
template<typename T>
void assertCLinkage(T *func)
{
static_assert(is_c_linkage<T>::value, "Supplied function does not have C-linkage");
}
int main()
{
assertCLinkage(my_func); // Should compile
assertCLinkage(my_other_func); // Should NOT compile
}
Is there a possible implementation of is_c_linkage that would throw a compiler error for the second function, but not the first? I'm not sure that it's possible (though it may exist as a compiler extension, which I'd still like to know of). Thanks.
I agree with Jonathan Leffler that this probably is not possible in a standard way. Maybe it would be possible somewhat, depending on the compiler and even version of the compiler, but you would have to experiment to determine possible approaches and accept the fact that the compiler's behavior was likely unintentional and might be "fixed" in later versions.
With g++ version 4.4.4 on Debian Squeeze, for example, you might be able to raise a compiler error for functions that are not stdcall with this approach:
void my_func() __attribute__((stdcall));
void my_func() { }
void my_other_func() { }
template <typename ret_, typename... args_>
struct stdcall_fun_t
{
typedef ret_ (*type)(args_...) __attribute__((stdcall));
};
int main()
{
stdcall_fun_t<void>::type pFn(&my_func),
pFn2(&my_other_func);
}
g++ -std=c++0x fails to compile this code because:
SO2936360.cpp:17: error: invalid conversion from ‘void ()()’ to ‘void ()()’
Line 17 is the declaration of pFn2. If I get rid of this declaration, then compilation succeeds.
Unfortunately, this technique does not work with cdecl.
For Unix/Linux, how about analyzing the resulting binary with 'nm' and looking for symbol names? I suppose it's not what you meant, but still it's sort of compile time.