c++11: short notation for pow? - c++

I need to paste a huge formula into my C++11 code.
This formula contains many terms like a double raised to an integer power,
and it is tedious to write
pow(sin(B), 6) + ... + pow(sin(C), 4) + ...
twenty times over.
The best option would be to overload operator^ for double and int,
but as I understand it, that is not possible in C++11. And in a formula like
z0^2 * (something)
operator precedence would make it parse as
z0 ^ (2 * (something))
which is not what I want.
So is it possible, using some trick, to approximate the mathematical notation for x to the power of y in C++ code?
The available tools are C++11 and Boost.
Update:
This is about maintaining code written in such mathematical notation instead of plain C++.
Ideal solution as I see would be like:
const double result = (latex_mode
(sin(B_0))^6(a + b)
latex_mode_end)
(B_0, a, b);
where latex_mode supports a small, unambiguous subset of LaTeX.
1) All programmers who touch this code have a mathematical background, so they read LaTeX without problems.
2) A formula can be copied and pasted from an article without any modification,
which reduces typo errors.

No, you can't do it. At least, you shouldn't. But the code can be simplified in another way. If your formula can be described as a sum of terms a^b,
then you can make a table of pairs (a, b) and write code which does the summation itself. For example:
std::vector<std::pair<unsigned, unsigned>> table = {{1, 2}, {2, 3}, {3, 4}};
unsigned sum = 0;
for (const auto& x : table)
    sum += std::pow(std::get<0>(x), std::get<1>(x));
Inspired by 5gon12eder's comment, I wrote a function:
template <typename Input, typename Output = unsigned>
Output powers(std::initializer_list<Input> args) {
    Output result = 0;
    for (const auto& x : args)
        result += std::pow(std::get<0>(x), std::get<1>(x));
    return result;
}
You need these additional standard headers:
#include <tuple>
#include <initializer_list>
(plus <utility> for std::pair and <cmath> for std::pow).
Example of use:
std::pair<unsigned, unsigned> a{1,2}, b{2,3};
std::cout << powers({a, b, {3, 4}, {4,5}}) << '\n';
That prints 1114, which is correct: 1^2 + 2^3 + 3^4 + 4^5 = 1 + 8 + 81 + 1024 = 1114.
Referring to the edited part of the question: I can suggest writing a function that receives a string and parses it, but that will be much slower than the method above.
Finally, you can write to the authors of your compiler.
Edit:
With C++14, new possibilities appeared: you can write constexpr functions with loops, local variables, etc., so it is easier to create a compile-time parser. I still recommend the solution from the original part of this post, because a parser will be a little messy, but it will do what you want at compile time.
String to int example:
#include <cstddef>
#include <cstdint>
#include <iostream>

template <size_t N>
constexpr uint32_t to_int(const char (&input)[N]) { // pass the array by reference
    uint32_t result = 0;
    for (uint32_t i = 0; i < N && input[i] != '\0'; ++i) {
        result *= 10;
        result += input[i] - '0';
    }
    return result;
}
constexpr uint32_t value = to_int("123427");
enum { force_compile_time_computing = value }; // forces the computation at compile time (the original double-underscore name is reserved)
int main() {
    std::cout << value << std::endl;
}
Prints:
~ $ g++ -std=c++14 -Wall -Wextra -pedantic-errors example.cpp
~ $ ./a.out
123427
~ $
Obviously, a full parser will be harder to write. Probably the best way is to create a constexpr class Operation with two constructors, Operation(operation, operation) and Operation(value), and to build the calculation tree at compile time (if you have variables in your string).
If you don't want to do this whole job, and you can accept the semantics of another program/language, then you can implement an easy run-time solution: create a new thread which calls R/Mathematica/{something else} and send the input string to it. After the calculation, send the value back to the main program.
If you want a hint: std::future will probably be convenient.

Just for the record, this is what Bjarne Stroustrup says about it:
Can I define my own operators?
Sorry, no. The possibility has been considered several times, but each time I/we decided that the likely problems outweighed the likely benefits.
It's not a language-technical problem. Even when I first considered it in 1983, I knew how it could be implemented. However, my experience has been that when we go beyond the most trivial examples people seem to have subtly different opinions of “the obvious” meaning of uses of an operator. A classical example is a**b**c. Assume that ** has been made to mean exponentiation. Now should a**b**c mean (a**b)**c or a**(b**c)? I thought the answer was obvious and my friends agreed – and then we found that we didn't agree on which resolution was the obvious one. My conjecture is that such problems would lead to subtle bugs.
Interestingly, it is exactly the operator you are missing. This is not much of a coincidence, since many people miss a built-in exponentiation operator, especially those who know Fortran or Python (two languages otherwise rarely mentioned together).

Related

Logical negation and assignment operator in C/C++?

Is there a more concise way to write the second line of this function without repeating a redundantly?
void myfunc()
{
bool a = false; // Assume that this won't always be hardcoded to false.
a = !a;
}
Is there a more concise way to write the second line of this function without repeating a redundantly?
! is a unary operator; you cannot combine it with = if that is what you expected.
But you can do something like a ^= true;. This is not more concise, but it avoids repeating a. Anyway, it is not a good approach compared to a = !a;, which is much more readable.
Well, I don't really see the value of doing so, but you could use xor or a simple decrement. Both of these work in C:
a ^= 1;
a--;
Then you won't have to repeat a. (Note that C++ does not define -- for bool, so the decrement trick only works in C, where _Bool arithmetic wraps 0 - 1 back to true.)
But if you want it to be very clear, consider using a function:
void flip(bool *b)
{
    (*b)--;
}

bool b = true;
flip(&b);
In C++, you can use references. Note, however, that -- is not valid on bool in C++, so use negation instead:
void flip(bool &b)
{
    b = !b;
}

bool b = true;
flip(b);
Or write a macro. Actually, macros are pretty handy for solving duplication problems, although I almost never use them for that, since it's rarely worth the effort. I wrote one macro to avoid duplication in malloc calls. Such a call typically looks like this:
int *x = malloc(12 * sizeof *x);
You can avoid the duplication with this:
#define ALLOC(p, n) \
((p) = malloc((n) * sizeof *(p)))
But even this is something I hesitate to use.
To be honest, it's not really a problem worth solving. :)
The only possible answer is: unfortunately, no.
The unary operator ! is actually concise enough, and any other trick would lead to unreadable code.
The shorthand forms for binary operators such as + or *:
a += 5;
b *= 7;
maintain the reference to the original meaning of the operator: addition and multiplication respectively.
In your sentence:
without repeating a redundantly
there is the wrong assumption that a is redundant, as if the compiler could be instructed to negate a without repeating the variable name. No: it tells people reading the code (even yourself, a year later!) that the variable being negated is a. And since the C grammar doesn't define any syntactic sugar for the logical negation operator, a is not redundant at all.

Is a boolean expression as onerous as branching with if or switch?

Often I convert some if statements into boolean expressions for code compactness. For instance, if I have something like
int foo(int x)
{
    if (x > 5) return 100 + 5;
    return 100;
}
I'll write it like
int foo(int x)
{
    return 100 + (x > 5) * 5;
}
This is very simple so no problem, the thing is when I have multiple tests, I can greatly simplify them (at the expense of readability but that's a different issue).
So the question is whether that (x > 5) evaluation is as onerous as explicitly branching on it.
In both cases the expression (x > 5) has to be evaluated. And as demonstrated already, both versions compile to the same assembly even without any optimization enabled.
However, the Philosophy section of C++ Core Guidelines has these two rules you would do well to pay heed to:
P.1: Express ideas directly in code
P.3: Express intent
Though these rules cannot be enforced in any way, adhering to them will make you adopt the version with the if statement.
Doing so will make life less onerous for those who have to maintain the code after you, and even for yourself a few months later.
You seem to be conflating C++ language constructs with patterns in the assembly. It may have been viable to reason about code on this level given the compilers of the late eighties or early nineties. At this point, however, compilers apply a lot of optimizations and transformations whose correctness or utility is not even obvious to the average programmer. A very simple example is the common beginner's mistake of assuming the following equivalences:
std::uint16_t a = ...;
a *= 2; // a multiplication in assembly
a *= 17; // ditto
a /= 3; // a division in assembly
They may then be surprised to find out that their compiler of choice translates these into the assembly equivalent of e.g.:
a <<= 1u;
a = (a << 4u) + a; // or even (a << 4u) | a if a < 16
a *= 43691u;
Note that the last transformation is only allowed if a is known to be a multiple of the divisor, so you may not see this kind of optimization all too often. How does it even work? In mathematical terms, uint16_t can be thought of as the residue class ring Z/(2^16)Z, and in this ring, there exists a multiplicative inverse for any element that is coprime to 2^16 (i.e. not divisible by 2). If d (e.g. 3) is coprime to 2, it has such an inverse, and then dividing by d is simply equivalent to multiplying by the inverse of d if the remainder is known to be zero. (I won't go into how this inverse can be calculated here.)
Here is another surprising optimization:
long arithsum(long n)
{
    long result = 0;
    for (long i = 0; i <= n; ++i)
        result += i;
    return result;
}
GCC with -O3 rather mundanely translates this into an unrolled loop of additions. My version (9.0.0svn-something) of Clang, however, will pull a Gauss on you if you do this, and translate this into something like:
long arithsum(long n)
{
    return (n * (n + 1)) >> 1;
}
Anyway, the same caveats apply to if/switch etc. – while these are control flow structures, and so you'd think they correspond to branching, this may not be so. Likewise, what appears to be a non-branching operation might be translated to a branching operation if the compiler has an optimization rule under which this seems beneficial, or even if it is just unable to translate its own AST or intermediate representation into machine code without use of branching (on the given architecture).
TL;DR: Before you try to outsmart your compiler, figure out which assembly the compiler produces for the straightforward / readable code in this first place. If this assembly is good, there is no point in making the code more subtle / less readable.
Assuming that by onerous you mean relying on the implicit 1/0 conversion: sure, it might work in C/C++ due to implicit typecasting, but it might not in other languages. If that's what you want to achieve, why not use the ternary operator (? :), which also makes the code more readable:
int foo(int x) {
    return (x > 5) ? (100 + 5) : 100;
}
Also read this stackoverflow article -- bool to int conversion

How to improve compilation time for a gigantic boolean expression?

I need to perform a rather complex check over a vector, and I have to repeat it thousands and millions of times. To make it more efficient, I translate the given formula into C++ source code, compile it into a heavily-optimized binary, and call that from my code. The formula is always purely Boolean: only &&, || and ! are used. Typical source code looks like this:
#include <assert.h>
#include <cstddef>
#include <vector>

using DataType = std::vector<bool>;

static const char T = 1;
static const char F = 0;
const std::size_t maxidx = 300;

extern "C" bool check(const DataType& l);

bool check(const DataType& l) {
    assert(l.size() == maxidx);
    return (l[0] && l[1] && l[2]) || (l[3] && l[4] && l[5]); // etc., a very long line with && and || everywhere
}
I compile it as follows:
g++ -std=c++11 -Ofast -march=native -fpic -c check.cpp
Performance of the resulting binary is crucial.
It worked perfectly until a recent test case with a large number of variables (300, as you can see above). With this test case, g++ consumes more than 100 GB of memory and freezes forever.
My question is pretty straightforward: how can I simplify that code for the compiler? Should I use some additional variables, get rid of the vector, or something else?
EDIT1: OK, here is what the top utility shows: cc1plus is busy with my code.
The check function depends on 584 variables (sorry for the imprecise number in the example above) and it contains 450,000 expressions.
I would agree with akakatak's comment below: it seems that g++ performs something O(N^2).
The obvious optimization here is to toss out the vector and use a bit-field, based on the fastest possible integer type:
uint_fast8_t item [n];
You could write this as
#define ITEM_BYTES(items) ((items) / sizeof(uint_fast8_t))
#define ITEM_SIZE(items) ( ITEM_BYTES(items) / CHAR_BIT + (ITEM_BYTES(items)%CHAR_BIT!=0) )
...
uint_fast8_t item [ITEM_SIZE(n)];
Now you have a chunk of memory with n segments, where each segment is the ideal size for your CPU. In each segment, set bits to 1 = true or 0 = false, using bitwise operators.
Depending on how you want to optimize, you would group the bits in different ways. I would suggest storing 3 bits of data in every segment, since you always wish to check 3 adjacent boolean values. This means that "n" in the above example will be the total number of booleans divided by 3.
You can then simply iterate through the array like:
bool items_ok ()
{
    for (size_t i = 0; i < n; i++)
    {
        if ((item[i] & 0x7u) == 0x7u)
        {
            return true;
        }
    }
    return false;
}
With the above method you optimize:
The data size in which comparisons are made, and with it possible alignment issues.
The overall memory use.
The number of branches needed for the comparisons.
This also rules out any risk of inefficiency caused by the usual C++ metaprogramming. I would never trust std::vector, std::array or std::bitset to produce optimal code.
Once you have the above working, you can always test whether std::bitset and similar containers yield the very same, effective machine code. If you find that they spawn any form of unrelated madness in your machine code, then don't use them.
A bit of necro-posting, but I should still share my results.
The solution proposed by Thilo in the comments above is the best: just split your expression into chunks of the same size. It's very simple and it provides a measurable compile-time improvement. But, in my experience, you have to choose an appropriate subexpression length carefully: with a large number of subexpressions you can see a significant drop in execution performance, because the compiler will not be able to optimize the whole expression perfectly.

Is it good practice to use the comma operator?

I've recently (only on SO, actually) run into uses of the C/C++ comma operator. From what I can tell, it creates a sequence point between the left-hand and right-hand operands, so that you have a predictable (defined) order of evaluation.
I'm a little confused about why this would be provided in the language as it seems like a patch that can be applied to code that shouldn't work in the first place. I find it hard to imagine a place it could be used that wasn't overly complex (and in need of refactoring).
Can someone explain the purpose of this language feature and where it may be used in real code (within reason), if ever?
It can be useful in the condition of while() loops:
while (update_thing(&foo), foo != 0) {
/* ... */
}
This avoids having to duplicate the update_thing() line while still maintaining the exit condition within the while() controlling expression, where you expect to find it. It also plays nicely with continue;.
It's also useful in writing complex macros that evaluate to a value.
The comma operator just separates expressions, so you can do multiple things instead of just one where only a single expression is allowed. It lets you do things like
for (int i = 0, j = 0; ...; ++i, ++j)
//           (x)            (y)
Note that the comma at (x) is not the comma operator, but the comma at (y) is.
You really don't have to think about it. It has some more arcane uses, but I don't believe they're ever absolutely necessary, so they're just curiosities.
Within for loop constructs it can make sense. Though I generally find them harder to read in this instance.
It's also really handy for angering your coworkers and people on SO.
bool guess() {
    return true, false;
}
Playing Devil's Advocate, it might be reasonable to reverse the question:
Is it good practice to always use the semi-colon terminator?
Some points:
Replacing most semi-colons with commas would immediately make the structure of most C and C++ code clearer, and would eliminate some common errors.
This is more in the flavor of functional programming as opposed to imperative.
Javascript's 'automatic semicolon insertion' is one of its controversial syntactic features.
Whether this practice would increase 'common errors' is unknown, because nobody does this.
But of course if you did do this, you would likely annoy your fellow programmers, and become a pariah on SO.
Edit: See AndreyT's excellent 2009 answer to Uses of C comma operator. And Joel 2008 also talks a bit about the two parallel syntactic categories in C#/C/C++.
As a simple example, the structure of while (foo) a, b, c; is clear, but while (foo) a; b; c; is misleading in the absence of indentation or braces, or both.
Edit #2: As AndreyT states:
[The] C language (as well as C++) is historically a mix of two completely different programming styles, which one can refer to as "statement programming" and "expression programming".
But his assertion that "in practice statement programming produces much more readable code" [emphasis added] is patently false. Using his example, in your opinion, which of the following two lines is more readable?
a = rand(), ++a, b = rand(), c = a + b / 2, d = a < c - 5 ? a : b;
a = rand(); ++a; b = rand(); c = a + b / 2; if (a < c - 5) d = a; else d = b;
Answer: they are both unreadable. It is the white space that gives the readability (hurray for Python!). The first is shorter, but the semicolon version does have more pixels of black space (or green space if you have a Hazeltine terminal), which may be the real issue here.
Everyone says that it is often used in a for loop, and that's true. However, I find it more useful in the condition expression of the for loop. For example:
for (int x; x = get_x(), x != sentinel; )
{
    // use x
}
Rewriting this without the comma operator would require doing at least one of a few things that I'm not entirely comfortable with, such as declaring x outside the scope where it's used, or special-casing the first call to get_x().
I'm also plotting ways to utilize it with C++11 constexpr functions, since I gather they can only consist of a single statement.
I think the only common example is the for loop:
for (int i = 0, j = 3; i < 10 ; ++i, ++j)
As mentioned in the c-faq:
Once in a while, you find yourself in a situation in which C expects a single expression, but you have two things you want to say. The most common (and in fact the only common) example is in a for loop, specifically the first and third controlling expressions.
The only reasonable use I can think of is in the for construct
for (int count=0, bit=1; count<10; count=count+1, bit=bit<<1)
{
...
}
as it allows incrementing multiple variables at the same time, still keeping the for construct's structure (easy to read and understand for a trained eye).
In other cases I agree it's sort of a bad hack...
I also use the comma operator to glue together related operations:
void superclass::insert(item i) {
    add(i), numInQ++, numLeft--;
}
The comma operator is useful for putting a sequence of operations in places where you can't insert a block of code. As pointed out, this is handy for writing compact, readable loops. Additionally, it is useful in macro definitions. The following macro increments the number of warnings and, if a boolean variable is set, will also show the warning.
#define WARN if (++nwarnings, show_warnings) std::cerr
So that you may write, for example:
if (warning_condition)
    WARN << "some warning message.\n";
The comma operator is effectively a poor man's lambda function.
Though this question was posted a few months after C++11 was ratified, I don't see any answers here pertaining to constexpr functions. This answer to a not-entirely-related question references a discussion on the comma operator and its usefulness in constant expressions, where the new constexpr keyword was mentioned specifically.
While C++14 did relax some of the restrictions on constexpr functions, it's still useful to note that the comma operator can grant you predictably ordered operations within a constexpr function, such as (from the aforementioned discussion):
template <typename T>
constexpr T my_array<T>::at(size_type n)
{
    return (n < size() || throw "n too large"), (*this)[n];
}
Or even something like:
constexpr MyConstexprObject& operator+=(int value)
{
    return (m_value += value), *this;
}
Whether this is useful is entirely up to the implementation, but these are just two quick examples of how the comma operator might be applied in a constexpr function.

Which one of these two ways of writing this code would be more suited?

I've got some code that, as you can see, I can write in either of the two following ways. The only difference is that in the second function the parameter is declared non-const, so I can reuse it instead of declaring a new variable (num1 in the first function). I'm curious which one would be more suitable, and whether there is any difference in the assembly code the compiler generates for each:
void Test(const long double input){
    long double num = 6.0 * input;
    long double num1 = 9.0;
    for (int i = 0; i < 5; i++)
        num1 *= num;
    cout << num1 << '\n';
}

void Test(long double input){
    long double num = 6.0 * input;
    input = 9.0;
    for (int i = 0; i < 5; i++)
        input *= num;
    cout << input << '\n';
}
A good optimizing compiler could theoretically make them equivalent (i.e., generate equivalent code for both) by keeping the numbers in floating-point registers, although this may not result in the fastest code. Whether such a compiler exists is a good question.
For stylistic (i.e., readability) reasons, though, I prefer the first, so that the same variable does not get used for two different things:
void Test(const long double input){
    long double num = 6.0 * input;
    long double num1 = 9.0;
    for (int i = 0; i < 5; i++)
        num1 *= num;
    cout << num1 << '\n';
}
Like this:
void Test(long double input)
{
    long double factor = 6.0 * input;
    long double result = 9.0;
    for (int i = 0; i < 5; ++i)
        result *= factor;
    cout << result << '\n';
}
Note we put spaces between things for the same reason weputspacesbetweenwords and give things meaningful names, so it's actually readable...
Like this:
void Test(long double input)
{
    long double const factor = 6.0 * input;
    long double result = 9.0 * pow(factor, 5);
    cout << result << '\n';
}
If you must use the loop then I would follow GMan's example.
One variable for one use. Trying to re-use variables has no meaning: the compiler does not even have the concept of variable names. It re-uses slots when appropriate (note that multiple variables can use the same slot).
The compiler is just so much better at optimization than a human that it is counterproductive to try to beat it. (Use better algorithms; this is where the human factor comes in, because the compiler does not understand algorithms.)
The biggest part of code is not writing it but maintaining it. So your code MUST be written to be easy to maintain for the next person (a company will spend much more on maintenance than on developing new code). The old adage is: write your code knowing that the maintainer is an axe murderer who knows where you live.
What the compiler generates is entirely dependent on your compiler flags and platform. It would be an interesting exercise to generate the assembler output for each of the above using full optimization (just give them different function names) and post it here for definitive comment, or as a separate question.
My guess is that you are most likely concerned about performance - if so, I would just write a small wrapper app to call each function N times and then output the relative timings, possibly excluding the cout part to avoid console I/O skewing the results.
Well, in the second function you're reusing the stack space of the argument, while in the first function the compiler has to reserve space for num1. The assembly instructions should be the same, save for the addresses/offsets used.