How to generate documentation when there is a `static if` - d

import std.traits : isFloatingPoint;
/** This is struct S. */
struct S(T) {
static if(isFloatingPoint!T)
{
/// This version works well with floating-point numbers.
void fun() { }
}
else
{
/// This version works well with everything else.
void fun() { }
/// We also provide extra functionality.
void du() { }
}
}
When I compile with dmd -D, documentation is generated for the first block only. How do I get it generated for the else block as well?

For version blocks, it's only the version which is used which ends up in the documentation (be it the first one or the last one or whichever in between). So, for instance, if you have a version block for Linux and one for Windows, only the one which matches the system that you compile on will end up in the docs.
static if blocks outside of templates seem to act the same way. If they're compiled in, then their ddoc comments end up in the docs, whereas if they're not compiled in, they don't.
However, static if blocks inside templates appear to always grab the documentation from the first static if block, even if it's always false. But considering that those static ifs can end up being both true and false (from different instantiations of the template) and that the compiler doesn't actually require that the template be instantiated for its ddoc comments to end up in the generated docs, that makes sense. It doesn't have one right answer like static if blocks outside of templates do.
Regardless, it's generally a bad idea to put documentation inside of a version block or static if, precisely because they're using conditional compilation and may or may not be compiled in. The solution is to use a version(D_Ddoc) block. So, you'd end up with something like this:
import std.traits : isFloatingPoint;
/// This is struct S
struct S(T)
{
version(D_Ddoc)
{
/// Function fun.
void fun();
/// Extra functionality. Exists only when T is not a floating point type.
void du();
}
else
{
static if(isFloatingPoint!T)
void fun() { }
else
{
void fun() { }
void du() { }
}
}
}
I would also note that even if what you were trying to do had worked, it would look very bizarre in the documentation, because you would have ended up with fun in there twice with the exact same signature but different comments. static if doesn't end up in the docs at all, so there'd be no way to know under what circumstances each fun existed. It would just look like you somehow declared fun twice.
The situation is similar with template constraints. The constraints don't end up in the docs, so it doesn't make sense to document each function overload when you're dealing with templated functions which are overloaded only by their constraints.
One place where you don't need version(D_Ddoc), however, is when you have the same function in a series of version blocks. e.g.
/// foo!
version(linux)
void foo() {}
else version(Windows)
void foo() {}
else
static assert(0, "Unsupported OS.");
The ddoc comment will end up in the generated documentation regardless of which version block is compiled in.
Note that once you use version(D_Ddoc) blocks, a build made with -D is only good for generating the documentation; the executable that you actually run should come from a separate build which doesn't use -D. You could put the full code in the version(D_Ddoc) blocks to avoid that, but that would mean duplicating code, and it wouldn't really work with static if. Phobos uses version(StdDdoc) (which it defines for itself) instead of version(D_Ddoc), so that as long as your own code doesn't use version(D_Ddoc) blocks, you can still compile with -D and have Phobos work. Once you start using version(D_Ddoc) yourself, though, you're going to have to generate your documentation separately from your normal build.

How do you perform cppcheck cross-translation unit (CTU) static analysis?

The Cppcheck documentation seems to imply that analysis can be done across multiple translation units, as evidenced by the --max-ctu-depth flag. This clearly isn't working on this toy example:
main.cpp:
int foo();
int main (void)
{
return 3 / foo();
}
foo.cpp:
int foo(void)
{
return 0;
}
Even with --enable=all and --inconclusive set, this problem does not appear in the report. It seems like cppcheck might not be designed to do cross-file analysis, but the --max-ctu-depth flag begs to differ. Am I missing something here? Any help is appreciated!
I am a cppcheck developer.
The whole program analysis in Cppcheck is quite limited. We have some such analysis but it is not very "deep" nor sophisticated. It only currently tracks values that you pass into functions.
Some example test cases (feel free to copy/paste these code examples into different files):
https://github.com/danmar/cppcheck/blob/main/test/testbufferoverrun.cpp#L4272
https://github.com/danmar/cppcheck/blob/main/test/testbufferoverrun.cpp#L4383
https://github.com/danmar/cppcheck/blob/main/test/testbufferoverrun.cpp#L4394
https://github.com/danmar/cppcheck/blob/main/test/testnullpointer.cpp#L3281
https://github.com/danmar/cppcheck/blob/main/test/testuninitvar.cpp#L4723
.. and then there is the whole unused functions checker.
If you are using threads then you will have to use --cppcheck-build-dir to make CTU possible.
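As a rough illustration of "values that you pass into functions", the sketch below shows the shape of pattern that the linked null-pointer test cases appear to exercise: one function reads through its pointer argument unconditionally, and a caller in another translation unit passes a pointer whose value is known at the call site. The names readThrough and safeCaller are hypothetical, both functions are shown in one listing for brevity, and whether a particular Cppcheck version reports the commented-out call is an assumption to verify against your own setup.

```cpp
#include <cassert>

// Imagine this definition lives in lib.cpp. It reads *p unconditionally,
// before writing anything, so a bad argument is always a bug.
int readThrough(const int *p) {
    return *p;
}

// Imagine this caller lives in main.cpp. Passing a valid pointer is fine.
int safeCaller() {
    int value = 7;
    return readThrough(&value);  // returns 7
}

// The problematic counterpart would pass a known-null pointer across files:
//     int *p = nullptr;
//     readThrough(p);   // null dereference; left as a comment so the sketch runs
```

By contrast, the 3 / foo() example in the question depends on a value flowing back out of foo() rather than on an argument flowing in, which is consistent with it not being reported.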
Based on the docs and the source code (as well as the associated header) of the CTU checker, it does not contain a cross-translation unit divide by zero check.
One of the few entry points to the CTU class (and checker) is CTU::getUnsafeUsage, which is described (in-code) as follows:
std::list<CTU::FileInfo::UnsafeUsage> CTU::getUnsafeUsage(...) {
std::list<CTU::FileInfo::UnsafeUsage> unsafeUsage;
// Parse all functions in TU
const SymbolDatabase *const symbolDatabase = tokenizer->getSymbolDatabase();
for (const Scope &scope : symbolDatabase->scopeList) {
// ...
// "Unsafe" functions unconditionally reads data before it is written..
for (int argnr = 0; argnr < function->argCount(); ++argnr) {
// ...
}
}
return unsafeUsage;
}
with emphasis on '"Unsafe" functions unconditionally reads data before it is written..'.
There is not a single mention of divide-by-zero analysis in the context of the CTU checker.
Regarding "It seems like cppcheck might not be designed to do cross-file analysis":
Based on the brevity of the public API of the CTU class, it does seem that Cppcheck's cross-file analysis is indeed currently somewhat limited.

c++, how do I create thread-restricted/protected variables and functions?

I have three threads in an application I'm building, all of which remain open for the lifetime of the application. Several variables and functions should only be accessed from specific threads. In my debug build, I'd like a check to be run and an error to be raised if one of these functions or variables is accessed from the wrong thread, but I don't want this overhead in my release build. I really just want this so that I, the programmer, don't make stupid mistakes, not to protect my executing program from making mistakes.
Originally, I had a 'thread protected' class template that would wrap around return types for functions, and run a check on construction before implicitly converting to the intended return type, but this didn't seem to work for void return types without disabling important warnings, and it didn't resolve my issue for protected variables.
Is there a method of doing this, or is it outside the scope of the language? 'If you need this solution, you're doing it wrong' comments not appreciated, I managed to near halve my program's execution time with this methodology, but it's just too likely I'm going to make a mistake that results in a silent race condition and ultimately undefined behavior.
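What you described is exactly what the assert macro is for.
assert(condition)
In a debug build the condition is checked. If it is false, the standard assert prints a diagnostic (file, line, and the failed expression) and calls std::abort; it does not throw an exception, although some codebases define their own assert-like macro that does. In a release build (that is, when NDEBUG is defined), the assert and whatever is inside the parentheses aren't compiled.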
Without being harsh, it would have been more helpful if you had explained the variables you are trying to protect. What type are they? Where do they come from? What's their lifetime? Are they global? Why do you need to protect a returned type if it's void? How did you end up in a situation where one thread might accidentally access something. I kinda have to guess but I'll throw out some ideas here:
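#include <cassert>
#include <string>
#include <thread>
extern std::thread g_thread1; // assumed: the one thread allowed to call these, defined elsewhere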
void protectedFunction()
{
assert(std::this_thread::get_id() == g_thread1.get_id());
}
// protect a global singleton (full program lifetime)
std::string& protectedGlobalString()
{
static std::string inst;
assert(std::this_thread::get_id() == g_thread1.get_id());
return inst;
}
// protect a class member
int SomeClass::protectedInt()
{
assert(std::this_thread::get_id() == g_thread1.get_id());
return this->m_theVar;
}
// thread protected wrapper
template <typename T>
class ThreadProtected
{
std::thread::id m_expected;
T m_val;
public:
ThreadProtected(T init, std::thread::id expected)
: m_expected(expected), m_val(init) // same order as the member declarations
{ }
T Get()
{
assert(std::this_thread::get_id() == m_expected);
return m_val;
}
};
// specialization for void
template <>
class ThreadProtected<void>
{
public:
ThreadProtected(std::thread::id expected)
{
assert(std::this_thread::get_id() == expected);
}
};
assert is oldschool. We were actually told to stop using it at work because it was causing resource leaks (the exception was being caught high up in the stack). It has the potential to cause debugging headaches because the debug behavior is different from the release behavior. A lot of the time if the asserted condition is false, there isn't really a good choice of what to do; you usually don't want to continue running the function but you also don't know what value to return. assert is still very useful when developing code. I personally use assert all the time.
static_assert will not help here because the condition you are checking for (e.g. "Which thread is running this code?") is a runtime condition.
Another note:
Don't put things that you want to be compiled inside an assert. It seems obvious now, but it's easy to do something dumb like
int* p;
assert(p = new (std::nothrow) int); // check that `new` returns a value -- BAD!!
It's good to check the allocation of new, but the allocation won't happen in a release build, and you won't even notice until you start release testing!
int* p;
p = new (std::nothrow) int;
assert(p); // check that `new` returns a value -- BETTER...
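The difference is easy to observe directly. The sketch below simulates both build modes in a single file by toggling NDEBUG and re-including <cassert>, which the standard permits; the function names releaseStyle and debugStyle are made up for the illustration.

```cpp
#ifndef NDEBUG
#define NDEBUG          // simulate a release build for the next function
#endif
#include <cassert>

int releaseStyle() {
    int n = 0;
    assert(++n > 0);    // compiled out: the increment never happens
    return n;           // returns 0
}

#undef NDEBUG           // back to a debug-style build
#include <cassert>      // re-including <cassert> redefines assert accordingly

int debugStyle() {
    int n = 0;
    assert(++n > 0);    // evaluated: n becomes 1 and the check passes
    return n;           // returns 1
}
```

The same source line therefore changes what the function returns depending on the build mode, which is exactly the kind of surprise that only shows up in release testing.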
Lastly, if you write the protected accessor functions in a class body or in a .h, you can goad the compiler into inlining them.
Update to address the question:
The real question though is where do I PUT an assert macro? Is a
requirement that I write setters and getters for all my thread
protected variables then declare them as inline and hope they get
optimised out in the final release?
You said there are variables that should be checked (in the debug build only) when accessed, to make sure the correct thread is accessing them. So, theoretically, you would want an assert macro before every such access. This is easy if there are only a few places (if this is the case, you can ignore everything I'm about to say). However, if there are so many places that it starts to violate the DRY principle, I suggest writing getters/setters and putting the assert inside (this is what I've casually given examples of above). While the assert won't add overhead in release mode (since it's conditionally compiled), using extra functions (probably) adds function call overhead. However, if you write them in the .h, there's a good chance they'll be inlined.
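For variables specifically, one way to keep the check in a single place is a small header-only wrapper with both a getter and a setter. This is only a sketch: the name ThreadBound is hypothetical, and binding the owner to the constructing thread by default is an assumption that may not match how your threads are started.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Wraps a value that only one thread is supposed to touch.
// All accessors are defined in-class, so they are implicitly inline.
template <typename T>
class ThreadBound {
public:
    explicit ThreadBound(T init,
                         std::thread::id owner = std::this_thread::get_id())
        : m_owner(owner), m_val(std::move(init)) {}

    const T& get() const { check(); return m_val; }
    void set(T v)        { check(); m_val = std::move(v); }

private:
    // With NDEBUG defined the assert vanishes and this body is empty.
    void check() const { assert(std::this_thread::get_id() == m_owner); }

    std::thread::id m_owner;
    T m_val;
};
```

A variable would then be declared as, say, ThreadBound<int> counter(0); and accessed through counter.get() and counter.set(5). In a release build the only thing left behind is the stored thread id.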
Your requirement was a way to do this without release overhead. Now that I've mentioned inlining, I'm obligated to say that the compiler knows best. There usually are compiler-specific ways to force inlining (since the compiler is allowed to ignore the inline keyword). You should profile the code before trying to inline things; see the answers to the question "Is it good practice to make getters and setters inline?". You can easily see whether the compiler is inlining the function by looking at the assembly. Don't worry, you don't have to be good at assembly. Just find the calling function and look for a call to the getter/setter. If the function was inlined, you won't see a call; you'll probably see a mov instead.

Defining a code section within which a different code path is executed

Is it possible to define a section or scope in my code within which a different code path is executed, without using a global or passed-down state variable?
For debugging purposes, I want to be able to surround a section of faulty code with a scope or #define to temporarily switch on pre-defined debugging behavior within this section, e.g. use debug data, a more precise data type, an already validated algorithm, … This needs to work in a multi-threaded application in which multiple threads will likely execute the same shared code concurrently, but only some of them have called this code from within the defined section.
For example, here is some pseudo-code that is not working, but might illustrate what I'd like to do. A static expensive function that is called from several places concurrently:
Result Algorithms::foo()
{
#ifdef DEBUG_SECTION
return Algorithms::algorithmPrecise(dataPrecise);
#else
return Algorithms::algorithmOptimized(dataOptimized);
#endif
}
Three classes of which instances need to be updated frequently:
Result A::update()
{
return Algorithms::foo();
}
Result B::update()
{
Result result;
#define DEBUG_SECTION
...
result = a.update() + 1337;
...
#undef DEBUG_SECTION
return result;
}
Result C::update()
{
return a.update();
}
As you can see, class A directly calls foo(), whereas in class B, foo() is called indirectly by calling a.update() and some other stuff. Let us assume B::update() returns a wrong result, so I want to be able to use the debug implementation of foo() only from this location. In C::update(), the optimized version should still be used.
My conceptual idea is to define a DEBUG_SECTION around the faulty code which would use the debug implementation at this location. This, however, does not work in practice, as Algorithms::foo() is compiled once with DEBUG_SECTION not being defined. In my application, Algorithms, A, B, and C are located in separate libraries.
I want that within a section defined in the code, a different code section within shared code is executed. However, outside of this section I still want execution of the original code, which at runtime will happen concurrently, so I cannot simply use a state variable. I could add a debugFlag parameter to each call within the DEBUG_SECTION that is passed down in each recursive call that is then provided to Algorithms::foo(), but this is extremely prone to errors (you must not miss any calls, but the section could be quite huge, spread over different files, …) and quite messy in a larger system. Is there any better way to do this?
I need a solution for C++11 and MSVC.
This might work by using a template:
template<bool pDebug>
Result Algorithms::foo()
{
if(pDebug)
return Algorithms::algorithmPrecise(dataPrecise);
else
return Algorithms::algorithmOptimized(dataOptimized);
}
On the other hand this means moving your function definition into a header (or forcing template instantiation, see these answers).
The downside is that changing the call to Algorithms::foo() from instance.foo<false> to instance.foo<true> every time you want to switch between debugging and release might require effort. If you have multiple affected calls you could use a compile time const variable to reduce the typing effort, but not knowing your code exactly I can't estimate if this is a feasible solution.
If the majority of your code uses the optimized version of the function you can also set the template parameter to default to false (template<bool pDebug = false>) to avoid changing existing code that will not call the debug-version.
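To make the consequences concrete, here is a minimal sketch of the template approach applied to the call chain from the question. Free functions stand in for the member functions, Result is aliased to int, and the two algorithm stubs return fixed values; all of these are assumptions made so that the chosen path is observable.

```cpp
#include <cassert>

using Result = int;  // hypothetical stand-in for the question's Result type

// Hypothetical stand-ins for the two implementations; they return
// different values so the chosen path is visible in the result.
inline Result algorithmPrecise()   { return 2; }
inline Result algorithmOptimized() { return 1; }

template <bool Debug = false>
Result foo() {
    return Debug ? algorithmPrecise() : algorithmOptimized();
}

// A::update has to forward the flag, so it becomes a template as well.
template <bool Debug = false>
Result updateA() {
    return foo<Debug>();
}

// B opts in to the debug path for its own calls only.
inline Result updateB() { return updateA<true>() + 1337; }  // 2 + 1337 = 1339

// C keeps the default, optimized path.
inline Result updateC() { return updateA<>(); }             // 1
```

Because the choice is part of each instantiation rather than shared state, concurrent callers cannot affect each other. The cost is that every function between the debug section and foo() has to forward the parameter, which is the same bookkeeping as passing a flag down, only checked at compile time.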

Define a new type of optimization

Is there a way to tell g++ more about a type, function, or specific variable (other than attributes) so that it can apply an optimization that I know is safe to perform?
Example:
TurnLedOn();
TurnLedOn();
Only the first call actually turns the LED on; the second call does not do anything. Would it be possible to tell g++ more about the function so that it gets rid of the second call if it knows that the LED is already on (because it knows that a corresponding TurnLedOff() function has not been called)?
The reason I do not want to use g++ attributes is because I want to arbitrarily define optimizations, which is really not possible with attributes (and I believe the optimization I am trying here is not actually possible to begin with using attributes)
These are optimisations you need to code. Such as:
class LedSwitch {
bool isOn{false};
public:
inline void turnLedOn(){
if (!isOn) {
isOn = true;
// ...
}
}
// ...
};
// ...
If the code inlines then the compiler might then notice the bool negated in the second hardcoded sequential call, but why do that in the first place?
Maybe you should revisit design if things like this are slowing down your code.
One possibility is to make it so that the second TurnLedOn call does nothing, and make it inline and declare it in a header file so the compiler can see the definition in any source file:
extern bool isLedOn; // defined somewhere else
inline void TurnLedOn()
{
if(!isLedOn)
{
ActuallyTurnLedOn();
isLedOn = true;
}
}
Then the compiler might be able to figure out by itself that calling TurnLedOn twice does nothing useful. Of course, as with any optimization, you have no guarantees.
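A self-contained variant of this idea, with a counter standing in for the hardware access so the effect can be checked, might look like the following. The helpers hardwareWrites and ledState are hypothetical; function-local statics are used instead of an extern variable only so that the sketch fits in one file.

```cpp
#include <cassert>

// Hypothetical stub: counts how often the "real" hardware access happens.
inline int& hardwareWrites() { static int n = 0; return n; }

// Tracks the LED state that the program itself has established.
inline bool& ledState() { static bool on = false; return on; }

inline void TurnLedOn() {
    if (!ledState()) {
        ++hardwareWrites();   // stands in for ActuallyTurnLedOn()
        ledState() = true;
    }
}

inline void TurnLedOff() {
    if (ledState()) {
        ++hardwareWrites();   // stands in for the real "off" access
        ledState() = false;
    }
}
```

Calling TurnLedOn() twice in a row performs one hardware access; a TurnLedOff() in between re-arms it. Whether the compiler also removes the redundant branch after inlining remains up to the optimizer.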
Contrary to your thinking, the answer by @immibis is what you were expecting.
This way to describe the complex behavior of the function TurnLedOn (i.e. needn't be called twice in a row unless unlocked by some other action) is indeed how you tell the compiler to perform this "optimization".
Could you imagine other annotations such as
#pragma call_once_toggle_pair(TurnLEDOn, TurnLEDOff)
with innumerable variants describing all your vagaries?
The C++ language has enough provisions to let you express arbitrarily complex situations, please don't add yet a layer of complexity on top of that.

Do repetitive calls to member functions hurt?

I have programmed in both Java and C, and now I am trying to get my hands dirty with C++.
Given this code:
class Booth {
private :
int tickets_sold;
public :
int get_tickets_sold();
void set_tickets_sold();
};
In Java, wherever I needed the value of tickets_sold, I would call the getter repeatedly.
For example:
if (obj.get_tickets_sold() > 50 && obj.get_tickets_sold() < 75){
//do something
}
In C I would just get the value of the particular variable in the structure:
if( obj_t->tickets_sold > 50 && obj_t->tickets_sold < 75){
//do something
}
So by using structures in C, I save the two calls (to the getters) that I would otherwise make in Java. I am not even sure whether those are actual calls or whether Java somehow inlines them.
My point is if I use the same technique that I used in Java in C++ as well, will those two calls to getter member functions cost me, or will the compiler somehow know to inline the code? (thus reducing the overhead of function call altogether?)
Alternatively, am I better off using:
int num_tickets = 0;
if ( (num_tickets = obj.get_tickets_sold()) > 50 && num_tickets < 75){
//do something
}
I want to write tight code and avoid unnecessary function calls, I would care about this in Java, because, well, we all know why. But, I want my code to be readable and to use the private and public keywords to correctly reflect what is to be done.
Unless your program is too slow, it doesn't really matter. In 99.9999% of code, the overhead of a function call is insignificant. Write the clearest, easiest to maintain, easiest to understand code that you can and only start tweaking for performance after you know where your performance hot spots are, if you have any at all.
That said, modern C++ compilers (and some linkers) can and will inline functions, especially simple functions like this one.
If you're just learning the language, you really shouldn't worry about this. Consider it fast enough until proven otherwise. That said, there are a lot of misleading or incomplete answers here, so for the record I'll flesh out a few of the subtler implications. Consider your class:
class Booth
{
public:
int get_tickets_sold();
void set_tickets_sold();
private:
int tickets_sold;
};
The implementation (known as a definition) of the get and set functions is not yet specified. If you'd specified function bodies inside the class definition then the compiler would consider you to have implicitly requested they be inlined (but may ignore that if they're excessively large). If you specify them later using the inline keyword, that has exactly the same effect. In summary...
class Booth
{
public:
int get_tickets_sold() { return tickets_sold; }
...
...and...
class Booth
{
public:
int get_tickets_sold();
...
};
inline int Booth::get_tickets_sold() { return tickets_sold; }
...are equivalent (at least in terms of what the Standard encourages us to expect, but individual compiler heuristics may vary - inlining is a request that the compiler's free to ignore).
If the function bodies are specified later without the inline keyword, then the compiler is under no obligation to inline them, but may still choose to do so. It's much more likely to do so if they appear in the same translation unit (i.e. in the .cc/.cpp/.c++/etc. "implementation" file you're compiling or some header directly or indirectly included by it). If the implementation is only available at link time then the functions may not be inlined at all, but it depends on the way your particular compiler and linker interact and cooperate. It is not simply a matter of enabling optimisation and expecting magic. To prove this, consider the following code:
// inline.h:
void f();
// inline.cc:
#include <cstdio>
void f() { printf("f()\n"); }
// inline_app.cc:
#include "inline.h"
int main() { f(); }
Building this:
g++ -O4 -c inline.cc
g++ -O4 -o inline_app inline_app.cc inline.o
Investigating the inlining:
$ gdb inline_app
...
(gdb) break main
Breakpoint 1 at 0x80483f3
(gdb) break f
Breakpoint 2 at 0x8048416
(gdb) run
Starting program: /home/delroton/dev/inline_app
Breakpoint 1, 0x080483f3 in main ()
(gdb) next
Single stepping until exit from function main,
which has no line number information.
Breakpoint 2, 0x08048416 in f ()
(gdb) step
Single stepping until exit from function _Z1fv,
which has no line number information.
f()
0x080483fb in main ()
(gdb)
Notice the execution went from 0x080483f3 in main() to 0x08048416 in f() then back to 0x080483fb in main()... clearly not inlined. This illustrates that inlining can't be expected just because a function's implementation is trivial.
Notice that this example is with static linking of object files. Clearly, if you use library files you may actually want to avoid inlining of the functions specifically so that you can update the library without having to recompile the client code. It's even more useful for shared libraries where the linking is done implicitly at load time anyway.
Very often, classes providing trivial functions use the two forms of expected-inlined function definitions (i.e. inside class or with inline keyword) if those functions can be expected to be called inside any performance-critical loops, but the countering consideration is that by inlining a function you force client code to be recompiled (relatively slow, possibly no automated trigger) and relinked (fast, for shared libraries happens on next execution), rather than just relinked, in order to pick up changes to the function implementation.
These kinds of considerations are annoying, but deliberate management of these tradeoffs is what allows enterprise use of C and C++ to scale to tens and hundreds of millions of lines and thousands of individual projects, all sharing various libraries over decades.
One other small detail: as a ballpark figure, an out-of-line get/set function is typically about an order of magnitude (10x) slower than the equivalent inlined code. That will obviously vary with CPU, compiler, optimisation level, variable type, cache hits/misses etc.
No, repetitive calls to member functions will not hurt.
If it's just a getter function, it will almost certainly be inlined by the C++ compiler (at least with release/optimized builds) and the Java Virtual Machine may "figure out" that a certain function is being called frequently and optimize for that. So there's pretty much no performance penalty for using functions in general.
You should always code for readability first. Of course, that's not to say that you should completely ignore performance outright, but if performance is unacceptable then you can always profile your code and see where the slowest parts are.
Also, by restricting access to the tickets_sold variable behind getter functions, you can pretty much guarantee that the only code that can modify the tickets_sold variable is in member functions of Booth. This allows you to enforce invariants in program behavior.
For example, tickets_sold is obviously not going to be a negative value. That is an invariant of the structure. You can enforce that invariant by making tickets_sold private and making sure your member functions do not violate that invariant. The Booth class makes tickets_sold available as a "read-only data member" via a getter function to everyone else and still preserves the invariant.
Making it a public variable means that anybody can go and trample over the data in tickets_sold, which basically completely destroys your ability to enforce any invariants on tickets_sold. Which makes it possible for someone to write a negative number into tickets_sold, which is of course nonsensical.
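A minimal sketch of that invariant being enforced is shown below. The question's setter takes no argument, so the int parameter here is an assumption, as is the choice to reject bad values by throwing std::invalid_argument.

```cpp
#include <cassert>
#include <stdexcept>

class Booth {
public:
    // Defined in-class, so implicitly inline; a trivial candidate for inlining.
    int get_tickets_sold() const { return tickets_sold; }

    // The only way to change tickets_sold, so the check cannot be bypassed.
    void set_tickets_sold(int n) {
        if (n < 0) {
            throw std::invalid_argument("tickets_sold must be non-negative");
        }
        tickets_sold = n;
    }

private:
    int tickets_sold = 0;
};
```

With this in place, a call such as set_tickets_sold(-1) fails loudly and leaves the previous value untouched, whereas a public data member would silently accept the nonsensical value.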
The compiler is very likely to inline function calls like this.
class Booth {
public:
int get_tickets_sold() const { return tickets_sold; }
private:
int tickets_sold;
};
Your compiler should inline get_tickets_sold, I would be very surprised if it didn't. If not, you either need to use a new compiler or turn on optimizations.
Any compiler worth its salt will easily optimize the getters into direct member access. The only times that won't happen are when you have optimization explicitly disabled (e.g. for a debug build) or if you're using a brain-dead compiler (in which case, you should seriously consider ditching it for a real compiler).
The compiler will very likely do the work for you, but in general, for things like this I would approach it more from the C perspective rather than the Java perspective unless you want to make the member access a const reference. However, when dealing with integers, there's usually little value in using a const reference over a copy (at least in 32 bit environments since both are 4 bytes), so your example isn't really a good one here... Perhaps this may illustrate why you would use a getter/setter in C++:
#include <string>
class StringHolder
{
public:
const std::string& get_string() const { return my_string; }
void set_string(const std::string& val) { if(!val.empty()) { my_string = val; } }
private:
std::string my_string;
};
That prevents modification except through the setter which would then allow you to perform extra logic. However, in a simple class such as this, the value of this model is nil, you've just made the coder who is calling it type more and haven't really added any value. For such a class, I wouldn't have a getter/setter model.