round to inferior odd or even integer in C [closed] - c++

Working in C, I would like to round a float to the nearest odd integer below it and to the nearest even integer below it.
The speed of the solution is very important (it is computed 2M*20 times per second).
I propose this solution:
x_even = (int)floor(x_f) & ~1;
x_odd = ((int)ceil(x_f) & ~1) -1;
I presume that the weak point is the floor and ceil operations, but I'm not even sure of that.
Does anyone have a comment on this solution? I'm interested in its speed of execution, but if you have another solution to share, I'll be very happy to test it :-).

You don't explain what you mean by 'inferior', but assuming you mean 'greatest even/odd integer less than the given number', and assuming you have a 2s-complement machine, you want:
x_i = (int)floor(x_f);
x_even = x_i & ~1;
x_odd = x_i - (~x_i & 1);
If you want to avoid the implementation dependency of doing bitwise ops on possibly negative signed numbers, you could instead do it entirely in float:
x_even = 2.0 * floor(x_f * 0.5);
x_odd = x_even + 1.0 > x_f ? x_even - 1.0 : x_even + 1.0;
This also has the advantage of not overflowing for large numbers, though it does give you x_odd == x_even for large numbers (those too big for the floating point representation to represent an odd number).
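As a minimal sketch of the all-floating-point variant above, wrapped in two helper functions (the names floor_even and floor_odd are made up for illustration, and "inferior" is assumed to mean "greatest even/odd integer not exceeding x"):

```cpp
#include <cmath>

// Greatest even integer <= x, computed entirely in floating point.
double floor_even(double x) {
    return 2.0 * std::floor(x * 0.5);
}

// Greatest odd integer <= x: step up from the even floor when that
// still fits under x, otherwise step down.
double floor_odd(double x) {
    const double even = floor_even(x);
    return (even + 1.0 > x) ? even - 1.0 : even + 1.0;
}
```

Because no integer conversion is involved, negative inputs need no special handling: floor_even(-3.2) gives -4 and floor_odd(-3.2) gives -5.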

Perhaps the ceil and floor functions are not necessary, since casting a double to an int is equivalent to the floor function for positive values.
Try something like this for POSITIVE values:
double k = 68.8 ; // Because we need something to seed with.
int even = ((int) k & ~1) ; // What you did
int test = ((int) (k+1) & ~1) ; // Little trick
int odd = (test>k) ? even+1 : even - 1 ; // even+1 if it does not exceed k, else even-1
I tested it on codepad for C++ (http://codepad.org/y3t0KgwW) and it works well; I think it will work in C too. If you test this solution, I'd be glad to know how fast it is...
Notice that:
This is not a good answer, as it ignores negative numbers.
The range is limited to that of int.
I had swapped odd and even numbers; I corrected it thanks to Chris' comment.
I'm just adding my humble stone :)


Overflow When Calculating Average?

Given 2 integer numbers we can calculate their average like this:
return (a+b)/2;
which isn't safe since (a+b) can cause overflow (Side Note: can someone tell me the correct term for this case maybe memory overflow?)
So we write:
return a+(b-a)/2;
can the same trick be implemented over n numbers and how?
Note that there are several different averages. I assume that you're asking about the arithmetic mean.
overflow (Side Note: can someone tell me the correct term for this case maybe memory overflow?)
The correct term is arithmetic overflow, or just overflow. Not memory overflow.
a+(b-a)/2;
b-a can also overflow. This isn't quite as easy to solve as it may seem.
Since C++20, the standard library has a function template that does this correctly without overflow: std::midpoint.
I checked an implementation of std::midpoint, and they do what you suggested for integers, except the operands are first converted to the corresponding unsigned type. Then the result is converted back. A mathematician may explain how that works, but I guess that it has something to do with the magic of modular arithmetic.
For floats, they do a / 2 + b / 2 (if the inputs are normal).
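To make the unsigned trick concrete, here is a hedged sketch of how such an integer midpoint could be written by hand. The function name midpoint_int is hypothetical, and this is an illustration of the idea, not a copy of any particular standard library implementation:

```cpp
#include <climits>

// Midpoint of two ints without signed overflow, rounding toward a.
// The difference is taken in unsigned arithmetic, where wraparound is
// well defined, and half of that difference always fits in an int.
int midpoint_int(int a, int b) {
    const unsigned ua = static_cast<unsigned>(a);
    const unsigned ub = static_cast<unsigned>(b);
    if (a <= b) {
        return a + static_cast<int>((ub - ua) / 2);
    }
    return a - static_cast<int>((ua - ub) / 2);
}
```

The key point is that ub - ua is the true distance between the two values even when b - a would overflow as a signed subtraction, so midpoint_int(INT_MIN, INT_MAX) is well defined.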
can the same trick be implemented over n numbers and how?
Simplest solution that works with all inputs without overflow and without imprecision is probably to use arbitrary precision arithmetic.
One way of getting the average of multiple numbers is to compute the Cumulative Moving Average, or CMA:
CMA_{n+1} = CMA_n + (x_{n+1} - CMA_n) / (n + 1)
Your code a + (b - a) / 2 can also be derived from this equation for n + 1 == 2.
Translating above equation to code, you would get something similar to:
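std::vector<int> vec{10, 5, 8, 3, 2, 8}; // average is 6
double average = 0.0;
// Unsigned index to match vec.size() and avoid a signed/unsigned comparison.
for (std::size_t n = 0; n < vec.size(); ++n)
{
    // Fold element n into the running average of the first n elements.
    average += (vec[n] - average) / (n + 1);
}
std::cout << average; // prints 6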
Alternatively, you can also use the std::accumulate:
std::cout << std::accumulate(vec.begin(), vec.end(), 0.0,
[n = 0](auto cma, auto i) mutable {
return cma + (i - cma) / ++n;
});
Note that floating-point division can produce imprecise results, especially when it is repeated many times. For more on this imprecision, see: Is floating point math broken?
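If the rounding of the floating-point approach is a concern, one integer-only alternative (a different technique from the CMA above, shown only as a sketch) is to split every element into quotient and remainder by n and add those up separately. The function name mean_exact is made up, and the code assumes a non-empty vector whose size squared fits in a long long:

```cpp
#include <vector>

// Integer mean of ints without ever summing the raw values.
// Each x is split as x = q * n + r, so mean = sum(q) + sum(r) / n.
// For non-negative inputs the result is the floor of the true mean;
// with mixed signs it can differ from the true mean by less than 1.
long long mean_exact(const std::vector<int>& values) {
    const long long n = static_cast<long long>(values.size());
    long long quotients = 0;
    long long remainders = 0;
    for (int x : values) {
        quotients += x / n;   // x is converted to long long first
        remainders += x % n;  // magnitude is always below n
    }
    return quotients + remainders / n;
}
```

The quotient sum stays within the range of the inputs, and the remainder sum is bounded by n * (n - 1), which is why the intermediate values stay small even when every element is close to the maximum int.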

How do I safely convert a double into an integer in C++? [closed]

Is there a way to convert a double into an integer without risking any undesired errors in the process? I read in Programming - Principles and Practice Using C++ (a book written by the creator of c++) that doubles cannot be turned into integers, but I've put it to the test, and it converts properly about 80% of the time. What's the best way to do this with no risk at all, if it's even possible?
So for example, this converts properly.
double bruh = 10.0;
int a = bruh;
cout << a << "\n"; // prints 10
But this doesn't.
double bruh = 10.9;
int a = bruh;
cout << a << "\n"; // prints 10, not 11: the fraction is dropped
In short, it doesn't round automatically so I think that's what constitutes it as "unsafe".
It is not possible to convert all doubles to integers with no risk of losing data.
First, if the double contains a fractional part (42.9), that fractional part will be lost.
Second, doubles can hold a much larger range of values than most integer types, up to roughly 1.8e308, so larger values simply cannot be stored in an integer.
way to convert a double into an integer without risking any undesired errors
in short, it doesn't round automatically so I think that's what constitutes it as "unsafe"
To convert to an integer value:
x = round(x);
To convert to an integer type:
Start with a round function like long lround(double x);. It "Returns the integer value that is nearest in value to x, with halfway cases rounded away from zero."
If the round result is outside the long range, problems occur and code may want to test for that first.
// Carefully form a double that is 1 more than LONG_MAX
#define LONG_MAXP1 ((LONG_MAX/2 + 1)*2.0)
long val = 0;
if (x - LONG_MAXP1 < -0.5 && x - LONG_MIN > -0.5) {
val = lround(x);
} else {
Handle_error();
}
Detail: in order to test if a double is in range to round to a long, it is important to test the endpoints carefully. The mathematical valid range is (LONG_MIN-0.5 ... LONG_MAX + 0.5), yet those endpoints may not be exactly representable as a double. Instead code uses nearby LONG_MIN and LONG_MAXP1 whose magnitudes are powers of 2 and easy to represent exactly as a double.
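The range test above could be packaged as a small helper along these lines. This is only a sketch: the name round_to_long and the bool-plus-output-parameter interface are assumptions made for the example, not part of the answer.

```cpp
#include <climits>
#include <cmath>

// Rounds x to the nearest long (halfway cases away from zero) and
// stores it in out. Returns false, leaving out untouched, when the
// rounded value would not fit in a long or when x is NaN.
bool round_to_long(double x, long& out) {
    // LONG_MAX + 1 as a double; exact because it is a power of two.
    const double long_max_plus_1 = (LONG_MAX / 2 + 1) * 2.0;
    if (x - long_max_plus_1 < -0.5 &&
        x - static_cast<double>(LONG_MIN) > -0.5) {
        out = std::lround(x);
        return true;
    }
    return false;  // comparisons with NaN are false, so NaN lands here
}
```

A caller would check the return value before using the result, for example `long v; if (round_to_long(10.9, v)) { /* v is 11 */ }`.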

Compute arithmetic-geometric mean without using an epsilon in C++ [closed]

Is it possible to compute arithmetic-geometric mean without using an epsilon in C++?
Here is my code:
double agm(double a, double b)
{
if(a > b)
{
a = a + b;
b = a - b;
a = a - b;
}
double aCurrent(a), bCurrent(b),
aNext(a), bNext(b);
while(aCurrent - bCurrent != 0)
{
aNext = sqrt(aCurrent*bCurrent);
bNext = (aCurrent+bCurrent)*0.5;
aCurrent = aNext;
bCurrent = bNext;
}
return aCurrent;
}
double sqrt(double x)
{
double res(x * 0.5);
do
{
res = (res + x/res) * 0.5;
} while(abs(res*res - x) > 1.0e-9);
return res;
}
And it runs forever.
Actually it is very clear what I was asking. It is just that you never met the problem and maybe lazy to think about it and are saying at once that there is nothing to talk about.
So, here is the solution I was looking for:
Instead of eps we can just add the following condition
if(aCurrent <= aPrev || bPrev <= bCurrent || bCurrent <= aCurrent )
And if the condition is true, then it means that we have computed the arithmetic-geometric mean with the most precision possible on our machine. As you can see there is no eps.
By "using an eps" in the question and answer, I mean treating two double numbers as equal when the difference between them is less than eps.
Please, reconsider opening the question.
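For illustration, the stopping condition described above could be put into a complete function roughly as follows. This is a sketch under assumptions: it uses std::sqrt instead of the hand-written square root from the question, expects non-negative inputs whose product neither overflows nor underflows, and writes the test in negated form so that NaN inputs also end the loop.

```cpp
#include <cmath>
#include <utility>

// Arithmetic-geometric mean of two non-negative doubles, no epsilon.
// Mathematically the geometric sequence rises and the arithmetic one
// falls; the loop stops as soon as rounding breaks that pattern.
double agm(double a, double b) {
    if (a > b) std::swap(a, b);  // keep a <= b
    for (;;) {
        const double geo = std::sqrt(a * b);
        const double ari = 0.5 * (a + b);
        // Continue only while both sequences still make strict progress.
        if (!(geo > a && ari < b && ari > geo)) {
            return geo;  // no further progress is representable
        }
        a = geo;
        b = ari;
    }
}
```

Termination follows from the fact that a strictly increases and b strictly decreases on every pass that continues, and there are only finitely many doubles between them.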
Of course you can. It suffices to limit the number of iterations to the maximum required for convergence in any case, which should be close to the logarithm of the number of significant bits in the floating-point representation.
The same reasoning holds for the square root. (With a good starting approximation based on the floating-point exponent, i.e. at most a factor 2 away from the exact root, 5 iterations always suffice for doubles).
As a side note, avoid using absolute tolerances. Floating-point values can vary in a very wide range. They can be so large that the tolerance is 0 in comparison, or so tiny that they are below the tolerance itself. Prefer relative tolerances, with the extra difficulty that there is no relative tolerance to 0.
No, it's not possible without using an epsilon. Floating point arithmetic is an approximation of real arithmetic, and usually generates roundoff errors. As a result, it's unlikely the two calculation sequences used to compute the AGM will ever converge to exactly the same floating point numbers. So rather than test whether two floating point numbers are equal, you need to test whether they're close enough to each other to consider them effectively equal. And that's done by calculating the difference and testing whether it's really small.
You can either use a hard-coded epsilon value, or calculate it relative to the size of the numbers. The latter tends to be better, because it allows you to work with different number scales. E.g. you shouldn't use the same epsilon to try to calculate the square root of 12345 and 0.000012345; 0.01 might be adequate for the large number, but you'd need something like 0.000001 for the small number.
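A relative comparison of the kind described here might look like the sketch below. The helper name nearly_equal and the choice to scale by the larger magnitude are assumptions; other scalings are possible.

```cpp
#include <algorithm>
#include <cmath>

// True when a and b differ by at most rel_tol times the larger of
// their magnitudes.
bool nearly_equal(double a, double b, double rel_tol) {
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

With rel_tol = 1e-6, the same call works for values near 12345 and near 0.000012345. A purely relative test has one known weakness: when one operand is exactly 0, only another exact 0 compares equal.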
See What every programmer should know about floating point

Please provide me the logic to print the result using PRINTF [closed]

I need logic to print a value with 4 decimal places.
Example: scalar ranges from 0 to -5.
value = 10006, scalar = -3, then print result = 10.0060 (4 decimals)
value = 123, scalar = -5, then print result = 0.0012 (4 decimals)
Required: value/divisor = 10, value%divisor = 0060 (this is the logic needed for the digits after the decimal point)
I tried this:
divisor = std::pow(10,std::abs(scalar));
Result = snprintf(X,Y,"%d.%0*d",value/scalar,4,value%scalar);
I'm not allowed to use float or setprecision().
The result does not need to be the actual numeric value; it only has to be formatted for printing, using plain arithmetic (add, subtract, pow, etc.).
std::int32_t divisor = static_cast<std::int32_t>(std::pow(10.0F, std::abs(Scalar)));
But in the above result, the zeros in the modulus value are not taken into account.
Please provide the logic to print the above result under the scalar condition.
In order to print decimals (easily), you need floating point:
printf("%10.6f", static_cast<double>(1) / 3);
If one of the arguments to division is floating point, the compiler will promote the expression to floating point.
Integral or scalar division will lose decimals.
You are always welcome to write your own division function.
Edit 1: Shifting
You don't need to use the pow function (especially since it's floating point).
Use a loop, in the loop multiply your divisor by 10 each time.
double result = (double) scalar / value;
int divisor = 10;
int i;
for (i = 0; i < NUMBER_OF_DECIMALS; ++i)
{
    // Isolate a digit using math.
    // Print the digit.
    divisor *= 10; // move to the next decimal place
}
The math part is left as an exercise for the OP.
In this homework exercise you are expected to perform your own number formatting, rather than using that of a library.
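For instance, the loop below walks through the digits of 100 one at a time. Note that it emits them least significant first, so as written it prints "001"; the digits have to be reversed to get "100".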
int accum = 100;
int position = 0;
while (accum > 0)
{
printf("%d", accum % 10);
accum /= 10;
position += 1;
}
For your homework assignment, you need to modify the above loop so that it puts a printf(".") in the correct place in the output number. Your answer is likely to involve multiplying accum before the loop and testing position relative to scalar
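For comparison, here is one possible integer-only sketch of the whole formatting task. It takes a different route from the digit loop above: it splits the number into a whole part and a four-digit fraction and lets snprintf do the zero padding. The function name format_scaled is hypothetical, and the code assumes value is non-negative and scalar lies between -5 and 0, as in the question.

```cpp
#include <cstdio>
#include <string>

// Formats value * 10^scalar with exactly four decimals, truncating
// any extra digits, using integer arithmetic only.
std::string format_scaled(long long value, int scalar) {
    long long divisor = 1;
    for (int i = 0; i < -scalar; ++i) divisor *= 10;  // 10^(-scalar)

    const long long whole = value / divisor;
    long long frac = value % divisor;              // has -scalar digits
    for (int i = -scalar; i < 4; ++i) frac *= 10;  // pad up to 4 digits
    for (int i = 4; i < -scalar; ++i) frac /= 10;  // or trim down to 4

    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%lld.%04lld", whole, frac);
    return buffer;
}
```

The "%04lld" conversion is what keeps the leading zeros of the fraction, which is the part the original snprintf attempt was missing.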

Getting "carry" in x + y [duplicate]

If I have an expression x + y (in C or C++) where x and y are both of type uint64_t which causes an integer overflow, how do I detect how much it overflowed by (the carry), place that in another variable, then compute the remainder?
The remainder will already be stored in the sum of x + y, assuming you are using unsigned integers. Unsigned integer overflow causes a wrap around ( signed integer overflow is undefined ). See standards reference from Pascal in the comments.
The overflow can only be 1 bit. If you add 2 64 bit numbers, there cannot be more than 1 carry bit, so you just have to detect the overflow condition.
For how to detect overflow, there was a previous question on that topic: best way to detect integer overflow.
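A minimal sketch of this for uint64_t, with a made-up helper name add_with_carry, could look like this:

```cpp
#include <cstdint>

// Adds two uint64_t values. The return value is the wrapped sum (the
// low 64 bits) and carry receives the single overflow bit.
std::uint64_t add_with_carry(std::uint64_t x, std::uint64_t y,
                             std::uint64_t& carry) {
    const std::uint64_t sum = x + y;  // wraps modulo 2^64
    carry = (sum < x) ? 1 : 0;        // wrapped exactly when sum < x
    return sum;
}
```

The comparison works because a wrapped sum equals x + y - 2^64, which is always smaller than x, while a sum that did not wrap is never smaller than x.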
For z = x + y, z stores the remainder. The overflow can only be 1 bit and it's easy to detect. If you were dealing with signed integers, there's an overflow if x and y have the same sign but z has the opposite sign. You cannot overflow if x and y have different signs. For unsigned integers, the sum wrapped around exactly when z is smaller than x (or, equivalently, smaller than y).
The approach in C and C++ can be quite different, because in C++ you can have operator overloading work for you, and wrap the integer you want to protect in some kind of class (for which you would overload the necessary operators). In C, you would have to wrap the integer you want to protect in a structure (to carry the remainder as well as the result) and call some function to do the heavy lifting.
Other than that, the approach in the two languages is the same: depending on the operation you want to perform (adding, in your example) you have to figure out the worst that could happen and handle it.
In the case of adding, it's quite simple: if the sum of the two is going to be greater than some maximum value (which will be the case if the difference of that maximum value M and one of the operands is less than the other operand) you can calculate the remainder - the part that's too big: if ((M - O1) < O2) R = O2 - (M - O1) (e.g. if M is 100, O1 is 80 and O2 is 30, 30 - (100 - 80) = 10, which is the remainder).
The case of subtraction is equally simple: if your first operand is smaller than the second, the remainder is the second minus the first (if (O1 < O2) { Rem = O2 - O1; Result = 0; } else { Rem = 0; Result = O1 - O2; }).
It's multiplication that's a bit more difficult: your safest bet is to do a binary multiplication of the values and check that your resulting value doesn't exceed the number of bits you have. Binary multiplication is a long multiplication, just like you would do if you were doing a decimal multiplication by hand on paper, so, for example, 6 * 5 is:
   0110
   0101
   ====
   0110
  0000
 0110
0000
+++++++
 011110 = 30
If you had a four-bit integer, you'd have an overflow of one bit here (i.e. bit 4 is 1 and bit 5 is 0, so only bit 4 counts as an overflow).
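The long multiplication described above can be sketched for 64-bit unsigned values as a shift-and-add loop that records when bits are lost. The function name multiply_checked and its interface are assumptions made for this example:

```cpp
#include <cstdint>

// Long multiplication in binary: add a shifted copy of a for every
// set bit of b. Returns false if the product does not fit in 64 bits;
// result then holds only the low 64 bits of the product.
bool multiply_checked(std::uint64_t a, std::uint64_t b,
                      std::uint64_t& result) {
    bool fits = true;
    result = 0;
    while (b != 0) {
        if (b & 1) {
            const std::uint64_t sum = result + a;
            if (sum < result) fits = false;  // carry out of the addition
            result = sum;
        }
        b >>= 1;
        // The top bit of a is about to be shifted out while b still
        // has a set bit left that would need it.
        if (b != 0 && (a >> 63) != 0) fits = false;
        a <<= 1;
    }
    return fits;
}
```

This is slower than a hardware multiply and is shown only to mirror the pencil-and-paper method; it detects overflow either when a partial product loses its top bit or when adding a partial product carries out of 64 bits.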
For division you only really need to care about division by 0, most of the time - the rest will be handled by your CPU.
HTH
rlc