Stop goto optimization g - c++

When writing asm code, there is a trick to slow down the code by a cycle or two by telling the CPU to explicitly jump to the next instruction. I was thinking of doing something similar using C++ templates. Here's my code:
template <unsigned int c>
inline void adelay()
{
goto x;
x:
adelay<c-1>();
}
template <>
inline void adelay<0>()
{
}
Although the idea seems sound, the optimizer appears to be getting in the way by removing the jmp code. Any ideas how this could be implemented?
Background
The reason for wanting to do this is to slow down the code of a micro-controller such that it outputs a light beam pulse at a very specific frequency. This is a very specialized use and is not common except in low-level hardware access such as writing drivers or programming micro-controllers. Even then I try to avoid such things whenever possible. Unfortunately, this cannot always be avoided.

That's what an optimizer should do: optimize, including removing non-functional code.
Either disable optimizations completely in your compiler options or use another method to slow your program; there are plenty of APIs that allow you to sleep for a defined time.

You can add this attribute:
template <>
inline void __attribute__((optimize("O0"))) adelay<0>()
{
}
That should prevent the optimization. As others have mentioned, there are probably better ways, but if this is purely for learning purposes then all good. I usually use this to verify assembler output quickly or when I am not at a command line.

Thanks for the help all. Instead of using jmp instructions, I went with nop instructions:
template <unsigned int c>
inline void adelay()
{
asm("nop");
adelay<c-1>();
}
template <>
inline void adelay<0>()
{
}
At one point I read a volatile variable instead, which worked at a slightly coarser granularity:
static volatile int _adelay = 0;
template <unsigned int c>
inline void adelay()
{
_adelay;
adelay<c-1>();
}
template <>
inline void adelay<0>()
{
}
That may be useful when I'm running low on memory.
Thanks again! :)
Adrian
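For anyone landing here later, the nop approach above can also be written without the explicit <0> specialization. This is only a sketch, assuming a C++17 compiler with GCC/Clang-style inline asm and a target whose assembler accepts a "nop" mnemonic; the name adelay_nops is made up for illustration. The exact delay per nop depends on the target, so measure on real hardware.

```cpp
// Sketch only: GCC/Clang inline-asm syntax, C++17 for `if constexpr`.
// adelay_nops is a hypothetical name, not from the original post.
template <unsigned int N>
inline void adelay_nops()
{
    if constexpr (N > 0) {
        // `volatile` tells the compiler the statement has side effects,
        // so it may not be deleted even though it has no outputs.
        asm volatile("nop");
        // Recurse at compile time; the N == 0 case instantiates nothing.
        adelay_nops<N - 1>();
    }
}
```

Calling adelay_nops<8>() should then expand to eight nop instructions once the recursion is inlined, though inlining is still the compiler's decision.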

Related

Auto cast to void unused variable C++

I am trying to solve huge number of warnings in a C++ project that is generated by a lot of unused variables. For example, consider this function:
void functionOne(int a, int b)
{
// other stuff and implementations :D
doSomethingElse();
runProcedureA();
}
In my project, in order to suppress the warnings I am simply casting the unused variables to void because I cannot change the method signatures.
void functionOne(int a, int b)
{
(void)a;
(void)b;
// other stuff and implementations :D
doSomethingElse();
runProcedureA();
}
This technique works all right but I have a really huge quantity of functions that I need to do this in order to solve the warnings issue. Is there any way to auto refactor all these functions by casting all unused parameters to void?
Currently, I am working with CLion IDE and VSCODE.
A simple alternative to the cast is to leave the parameters unnamed. That way the compiler treats the non-use as intentional:
void functionOne(int, int)
Another way to achieve the same is:
void functionOne([[maybe_unused]] int a, [[maybe_unused]] int b)
Is there any way to auto refactor all these functions
Potential XY-problem: If you don't want to be warned about unused parameters, how about disabling the warning?
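To put the three source-level options side by side, here is a small sketch. The function names and the g_calls counter are hypothetical stand-ins for the real bodies (doSomethingElse(), runProcedureA()). None of these should trigger an unused-parameter warning under -Wall -Wextra. If you would rather disable the warning as suggested above, GCC and Clang accept -Wno-unused-parameter.

```cpp
// Hypothetical counter standing in for the real function bodies.
static int g_calls = 0;

// Option 1: leave the parameters unnamed; the signature is unchanged.
void unnamedParams(int, int)
{
    ++g_calls;
}

// Option 2: C++17 attribute; the names stay available for later use.
void attributedParams([[maybe_unused]] int a, [[maybe_unused]] int b)
{
    ++g_calls;
}

// Option 3: the cast-to-void idiom from the question.
void castParams(int a, int b)
{
    (void)a;
    (void)b;
    ++g_calls;
}
```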

Does C++ have a way to do Cuda style kernel templates, where parameters produce separate compilations?

In CUDA you can specify template parameters that are used to automatically create completely different versions of kernels. The catch is that you can only pass const values to the functions so that the compiler knows ahead of time exactly which versions of the kernel need to be created. For instance, you can have a template parameter int X, then use an if(X==4){this}else{that} and you'll get two separate functions created, neither of which has the overhead of the 'if' statement.
I've found this to be invaluable in allowing great flexibility and code re-usability without sacrificing performance.
Bonus points if you can point out that branches don't have that much overhead, I never knew that! ;)
Something like this?
#include <iostream>
template <int x>
void function() {
if constexpr (x == 1) {
std::cout << "hello\n";
} else {
std::cout << "world\n";
}
}
int main() {
function<3>();
}

Does the (gcc) compiler optimize away empty-body functions?

Using policy based design, an EncapsulatedAlgorithm:
template< typename Policy>
class EncapsulatedAlgorithm : public Policy
{
double x = 0;
public:
using Policy::subCalculate;
void calculate()
{
Policy::subCalculate(x);
}
protected:
~EncapsulatedAlgorithm() = default;
};
may have a policy Policy that performs a sub-calculation. The sub-calculation is not necessary for the algorithm: it can be used in some cases to speed up algorithm convergence. So, to model that, let's say there are three policies.
One that just "logs" something:
struct loggedCalculation
{
static void subCalculate(double& x)
{
std::cout << "Doing the calculation" << std::endl;
}
};
one that calculates:
struct calculate
{
static void subCalculate(double& x)
{
x = x * x;
}
};
and one to bring them all and in the darkness bind them :D - that does absolutely nothing:
struct doNothing
{
static void subCalculate(double& x)
{
// Do nothing.
}
};
Here is the example program:
typedef EncapsulatedAlgorithm<doNothing> nothingDone;
typedef EncapsulatedAlgorithm<calculate> calculationDone;
typedef EncapsulatedAlgorithm<loggedCalculation> calculationLogged;
int main(int argc, const char *argv[])
{
nothingDone n;
n.calculate();
calculationDone c;
c.calculate();
calculationLogged l;
l.calculate();
return 0;
}
And here is the live example. I tried examining the assembly code produced by gcc with the optimization turned on:
g++ -S -O3 -std=c++11 main.cpp
but I do not know enough about Assembly to interpret the result with certainty - the resulting file was tiny and I was unable to recognize the function calls, because the code of the static functions of all policies was inlined.
What I could see is that when no optimization is set, within the main function there is a call and a subsequent leave related to 'doNothing::subCalculate':
call _ZN9doNothing12subCalculateERd
leave
Here are my questions:
Where do I start to learn in order to be able to read what g++ -S spews out?
Is the empty function optimized away or not and where in main.s are those lines?
Is this design O.K.? Usually, implementing a function that does nothing is a bad thing, as the interface is saying something completely different (subCalculate instead of doNothing), but in the case of policies, the policy name clearly states that the function will not do anything. Otherwise I need to do type traits stuff like enable_if, etc, just to exclude a single function call.
I went to http://assembly.ynh.io/, which shows assembly output. I entered this code:
template< typename Policy>
struct EncapsulatedAlgorithm : public Policy
{
void calculate(double& x)
{
Policy::subCalculate(x);
}
};
struct doNothing
{
static void subCalculate(double& x)
{
}
};
void func(double& x) {
EncapsulatedAlgorithm<doNothing> a;
a.calculate(x);
}
and got these results:
.Ltext0:
.globl _Z4funcRd
_Z4funcRd:
.LFB2:
.cfi_startproc #void func(double& x) {
.LVL0:
0000 F3 rep #not sure what this is
0001 C3 ret #}
.cfi_endproc
.LFE2:
.Letext0:
Well, I only see two opcodes in the assembly there. rep (no idea what that is) and end function. It appears that the G++ compiler can easily optimize out the function bodies.
Where do I start to learn in order to be able to read what g++ -S spews out?
This site's not for recommending reading material. Google "x86 assembly language".
Is the empty function optimized away or not and where in main.s are those lines?
It will have been when the optimiser was enabled, so there won't be any lines in the generated .S. You've already found the call in the unoptimised output....
In fact, even the policy that's meant to do a multiplication may be removed, as the compiler should be able to work out that you're not using the resultant value. Add code to print the value of x, and seed x from some value that can't be known at compile time (it's often convenient to use argc in a little experimental program like this); then you'll be forcing the compiler to at least leave in the functionally significant code.
Is this design O.K.?
That depends on a lot of things (like whether you want to use templates given the implementation needs to be exposed in the header file, whether you want to deal with having distinct types for every instantiation...), but you're implementing the design correctly.
Usually, implementing a function that does nothing is a bad thing, as the interface is saying something completely different (subCalculate instead of doNothing), but in the case of policies, the policy name clearly states that the function will not do anything. Otherwise I need to do type traits stuff like enable_if, etc, just to exclude a single function call.
You may want to carefully consider your function names... do_any_necessary_calculations(), ensure_exclusivity() instead of lock_mutex(), after_each_value() instead of print_breaks etc..
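Following the suggestion above about seeding x and using the result, a cut-down sketch of the policy design that is easier to experiment with might look like this. The names Algorithm, squarePolicy and noopPolicy are hypothetical, and the destructor is left public so that local objects can be created. Pass in a value the compiler cannot know (argc, for example) and print the returned value, then compare the -O0 and -O3 output of g++ -S.

```cpp
// Policy that does real work on x.
struct squarePolicy
{
    static void subCalculate(double& x) { x = x * x; }
};

// Policy that intentionally does nothing; the parameter is left unnamed.
struct noopPolicy
{
    static void subCalculate(double&) {}
};

template <typename Policy>
class Algorithm : public Policy
{
    double x_;
public:
    // Seed x from the caller so it is not a compile-time constant.
    explicit Algorithm(double seed) : x_(seed) {}

    // Return x so the result is observable and cannot be dropped.
    double calculate()
    {
        Policy::subCalculate(x_);
        return x_;
    }
};
```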

goto Optimization Refactor

I have a "MyFunction" I keep obsessing over if I should or shouldn't use goto on it and in similar (hopefully rare) circumstances. So I'm trying to establish a hard-and-fast habit for this situation. To-do or not-to-do.
int MyFunction()
{ if (likely_condition)
{
condition_met:
// ...
return result;
}
else /*unlikely failure*/
{ // meet condition
goto condition_met;
}
}
I was intending to net the benefits of the failed conditional jump instruction for the likely case. However I don't see how the compiler could know which to streamline for case probability without something like this.
1. it works right?
2. are the benefits worth the confusion?
3. are there better (less verbose, more structured, more expressive) ways to enable this optimization?
It appears to me that the optimization you're trying to do is mostly obsolete. Most modern processors have branch prediction built in, so (assuming it's used enough to notice) they track how often a branch is taken or not and predict whether the branch is likely to be taken or not based on its past pattern of being taken or not. In this case, speed depends primarily on how accurate that prediction is, not whether the prediction is for taken vs. not taken.
As such, you're probably best off with rather simpler code:
int MyFunction() {
if (!likely_condition) {
meet_condition();
}
// ...
return result;
}
A modern CPU will take that branch either way with equal performance if it makes the correct branch prediction. So if that is in an inner loop, the performance of if (unlikely) { meet condition } common code; will match what you have written.
Also, if you spell out the common code in both branches the compiler will generate code that is identical to what you have written: The common case will be emitted for the if clause and the else clause will jmp to the common code. You see this all the time with simpler terminal cases like *out = whatever; return result;. When debugging it can be hard to tell which return you're looking at because they've all been merged.
1. It looks like the code should work as you expect as long as condition_met: doesn't skip variable initializations.
2. No, and you don't even know that the obfuscated version compiles into more optimal code. Compiler optimizations (and processor branch prediction) are getting very smart in recent times.
3.
int MyFunction()
{
if (!likely_condition)
{
// meet condition
}
condition_met:
// ...
return result;
}
or, if it helps your compiler (check the assembly)
int MyFunction()
{
if (likely_condition); else
{
// meet condition
}
condition_met:
// ...
return result;
}
I would highly recommend using the __builtin_expect() built-in (GCC), or the equivalent for your particular C++ compiler (see Portable branch prediction hints), instead of using goto. Note that it takes two arguments, the expression and its expected value:
int MyFunction()
{ if (__builtin_expect(!!(likely_condition), 1))
{
// ...
return result;
}
else /*unlikely failure*/
{ // meet condition
}
}
As others also mentioned goto is error prone and evil from the bones.
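For reference, a sketch of how the hint is commonly wrapped, assuming GCC or Clang. LIKELY, UNLIKELY and clampNonNegative are hypothetical names for illustration. Whether the hint changes the generated code at all depends on the compiler and flags, so check the assembly before relying on it.

```cpp
// __builtin_expect(exp, c) returns exp and tells the compiler that
// exp == c is the expected case. The !! normalizes any truthy value to 1.
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

// Hypothetical example: the repair path is marked as the rare one,
// and the common code follows without any goto.
int clampNonNegative(int v)
{
    if (UNLIKELY(v < 0)) {
        v = 0;  // "meet condition"
    }
    return v;   // common path
}
```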

BOOST_STATIC_WARNING

I've recently had some trouble with C++'s implicit casting, so I'm looking for a way to warn people if somebody attempts to assign an int32_t to a uint64_t or whatever. BOOST_STATIC_ASSERT would work wonders for this, except that the code base I'm working with is quite large and relies on a lot of implicit casting, so immediately breaking everything with assertions is unrealistic.
It looks like BOOST_STATIC_WARNING would be ideal for me, however, I cannot get it to actually emit a warning. Something like this won't do anything:
typedef boost::is_same<int64_t, int32_t> same_type;
BOOST_STATIC_WARNING(same_type::value);
My compiler is g++ 4.4.3 with --std=c++0x -Wall -Wextra. My Boost is 1.46.1.
The problem I'm trying to solve here is that we have a buffer type which has methods like uint8_t GetUInt8(size_type index), void SetUInt32(size_type index, uint32_t value), etc. So, you see usage like this:
x = buffer.GetUInt16(96);
The problem is that, while you are reading a 16-bit unsigned integer, there is no guarantee that x is actually 16 bits wide. While the person who originally wrote that line did it properly (hopefully), if the type of x changes, this line will break silently.
My solution is to create a safe_convertable<T> type like so:
template <typename T>
struct safe_convertable
{
public:
template <typename TSource>
safe_convertable(const TSource& val)
{
typedef boost::is_same<T, TSource> same_type;
BOOST_STATIC_WARNING(same_type::value);
_val = val;
}
template <typename TDestination>
operator TDestination ()
{
typedef boost::is_same<T, TDestination> same_type;
BOOST_STATIC_WARNING(same_type::value);
return _val;
}
private:
T _val;
};
and change the methods to return and accept these safe references: safe_reference<uint8_t> GetUInt8(size_type index), void SetUInt32(size_type index, safe_reference<uint32_t> value) (that's the short version, there are other operators and whatnot you can do to references).
Anyway, this works great with BOOST_STATIC_ASSERT, save for the fact that I want warnings and not errors.
For the curious, I've implemented the warning thing myself, which works fine, but I'd prefer the Boost variety so that I get all the other Boost features (this only works inside a function).
namespace detail
{
template <typename TIntegralContant>
inline void test_warning(const TIntegralContant&)
{
static_cast<void>(1 / TIntegralContant::value);
}
}
#define MY_STATIC_WARNING(value_) \
::detail::test_warning(::boost::integral_constant<bool, value_ >())
What version of Boost are you using? This comment may be the reason why your own warning works, but the boost version does not:
// 6. replaced implementation with one which depends solely on
// mpl::print<>. The previous one was found to fail for functions
// under recent versions of gcc and intel compilers - Robert Ramey
I'm guessing if you upgraded to a recent version of Boost (e.g. 1.46.1), you'd be good to go. crosses fingers
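If upgrading does not help, one alternative that avoids the division-by-zero trick is to route the condition through an overload marked [[deprecated]] (C++14). To be clear, this is a different technique from BOOST_STATIC_WARNING and not how Boost implements it; the names below are made up, and exactly when the warning fires inside templates can vary between compilers.

```cpp
#include <type_traits>

namespace detail
{
// Chosen when the condition is true: compiles silently.
inline void static_warning(std::true_type) {}

// Chosen when the condition is false: calling it emits a
// deprecation warning carrying the message below.
[[deprecated("static warning: condition is false")]]
inline void static_warning(std::false_type) {}
}

// Variadic so that conditions containing commas work unparenthesized.
#define MY_STATIC_WARNING(...) \
    ::detail::static_warning(std::integral_constant<bool, (__VA_ARGS__)>{})
```

Like the hand-rolled version in the question, this only works inside a function body.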