Calculating square root for implementing a fixed point function

I am trying to find the square root of a fixed-point number, and I used the following calculation to find an approximation of the square root using an integer algorithm. The algorithm is described on Wikipedia:
http://en.wikipedia.org/wiki/Methods_of_computing_square_roots
uint32_t SquareRoot(uint32_t a_nInput)
{
    uint32_t op = a_nInput;
    uint32_t res = 0;
    uint32_t one = 1uL << 30; // The second-to-top bit is set: use 1u << 14 for uint16_t type; use 1uL<<30 for uint32_t type
    // "one" starts at the highest power of four <= than the argument.
    while (one > op)
    {
        one >>= 2;
    }
    while (one != 0)
    {
        if (op >= res + one)
        {
            op = op - (res + one);
            res = res + 2 * one;
        }
        res >>= 1;
        one >>= 2;
    }
    return res;
}
But I am unable to follow what is happening in the code. What exactly does the comment // "one" starts at the highest power of four <= than the argument. mean? Can someone please give me a hint about how the code calculates the square root of the argument a_nInput?
Thanks much

Note how one is initialized.
uint32_t one = 1uL << 30;
That's 2^30, or 1073741824, which is also 4^15.
This line:
one >>= 2;
Is equivalent to
one = one / 4;
So the pseudocode for what's happening is:
one = 4^15
if one is more than a_nInput
    one = 4^14
if one is still more than a_nInput
    one = 4^13
(and so on...)
Eventually, one will not be more than a_nInput.
// "one" starts at the highest power of four less than or equal to a_nInput
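As a minimal sketch, the first loop of SquareRoot() can be pulled out on its own to see which value it picks. The helper name highest_power_of_four_le is made up for illustration and is not part of the Wikipedia code. The step is a power of four because the second loop consumes the input two bits at a time, producing one bit of the root per pass.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the first loop of SquareRoot():
// returns the largest power of four that is <= a (or 0 when a == 0).
std::uint32_t highest_power_of_four_le(std::uint32_t a)
{
    std::uint32_t one = std::uint32_t{1} << 30; // 4^15, the largest power of four in 32 bits
    while (one > a)
        one >>= 2;                              // divide by 4: 4^15 -> 4^14 -> 4^13 ...
    return one;
}
```

For an input of 17 this settles on 16, so the second loop of SquareRoot() would start from 16 rather than from 4^15.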

Related

Why is there a loop in this division as multiplication code?

I got the JS code below from an archive of Hacker's Delight (view the source).
The code takes in a value (such as 7) and spits out a magic number to multiply with. Then you bit-shift to get the result. I don't remember assembly or much math, so I'm sure I'm wrong, but I can't find the reason why I'm wrong.
From my understanding, you could get a magic number by writing ceil(1/divide * 1<<32) (or <<64 for 64-bit values, but you'd need bigger ints). If you multiply an integer with imul, you'd get the result in one register and the remainder in another. The result register is magically the correct result of a division with this magic number from my formula.
I wrote some C++ code to show what I mean. However, I only tested with the values below. It seems correct. The JS code has a loop and more, and I was wondering: why? Am I missing something? What values can I use to get an incorrect result that the JS code would get correctly? I'm not very good at math, so I didn't understand any of the comments.
#include <cstdio>
#include <cassert>
int main(int argc, char *argv[])
{
auto test_divisor = 7;
auto test_value = 43;
auto a = test_value*test_divisor;
auto b = a-1; //One less test
auto magic = (1ULL<<32)/test_divisor;
if (((1ULL<<32)%test_divisor) != 0) {
magic++; //Round up
}
auto answer1 = (a*magic) >> 32;
auto answer2 = (b*magic) >> 32;
assert(answer1 == test_value);
assert(answer2 == test_value-1);
printf("%lld %lld\n", answer1, answer2);
}
JS code from hackers delight
var two31 = 0x80000000
var two32 = 0x100000000
function magic_signed(d) { with(Math) {
if (d >= two31) d = d - two32// Treat large positive as short for negative.
var ad = abs(d)
var t = two31 + (d >>> 31)
var anc = t - 1 - t%ad // Absolute value of nc.
var p = 31 // Init p.
var q1 = floor(two31/anc) // Init q1 = 2**p/|nc|.
var r1 = two31 - q1*anc // Init r1 = rem(2**p, |nc|).
var q2 = floor(two31/ad) // Init q2 = 2**p/|d|.
var r2 = two31 - q2*ad // Init r2 = rem(2**p, |d|).
do {
p = p + 1;
q1 = 2*q1; // Update q1 = 2**p/|nc|.
r1 = 2*r1; // Update r1 = rem(2**p, |nc|.
if (r1 >= anc) { // (Must be an unsigned
q1 = q1 + 1; // comparison here).
r1 = r1 - anc;}
q2 = 2*q2; // Update q2 = 2**p/|d|.
r2 = 2*r2; // Update r2 = rem(2**p, |d|.
if (r2 >= ad) { // (Must be an unsigned
q2 = q2 + 1; // comparison here).
r2 = r2 - ad;}
var delta = ad - r2;
} while (q1 < delta || (q1 == delta && r1 == 0))
var mag = q2 + 1
if (d < 0) mag = two32 - mag // Magic number and
shift = p - 32 // shift amount to return.
return mag
}}

In the C++ code:
auto magic = (1ULL<<32)/test_divisor;
magic is an integer because both (1ULL<<32) and test_divisor are integers, so the division truncates.
The next statement increments magic when the division left a remainder, which rounds the quotient up.
The multiplications are integer operations as well:
auto answer1 = (a*magic) >> 32;
auto answer2 = (b*magic) >> 32;
That is all the C++ code does.
In the JS code:
All variables are declared with var, so there are no explicit integer types.
Every value is a Number, so / is not integer division and the code has to call floor explicitly.
The bitwise operators convert their operands to 32-bit integers, which makes them unsuitable for the values of 2^32 and above that this algorithm needs.
Number and BigInt do not behave like a C int or a C unsigned long long.
Hence the algorithm uses a loop: on each pass it increments p and doubles the quotients q1, q2 and the remainders r1, r2 (see the "Update" comments), and it stops at the first p for which the while condition no longer holds.
Both versions aim to produce a magic number for division by multiplication, but they are not identical: the C++ code always shifts by 32, while the JS code searches for p and derives the shift from it. The JS version also has some rough edges.
While there are several issues with the JS version, I will highlight only 3:
(1) In the loop, while trying to get the best power of 2, we have these two statements:
p = p + 1;
q1 = 2*q1; // Update q1 = 2**p/|nc|.
This increments a counter and multiplies a number by 2, which would be a left shift in C++.
The C++ version does not need this rigmarole.
(2) The while condition has 2 equality comparisons on the RHS of || :
while (q1 < delta || (q1 == delta && r1 == 0))
Equality comparisons on floating-point values are often unreliable [[ e.g. check "Math.sqrt(2)*Math.sqrt(0.5) == 1": even though this should be true mathematically, it can evaluate to false ]]. Here, as far as I can tell, every operand is an integer well below 2^53, which a Number represents exactly, so these comparisons should behave as they would with C integers.
(3) The JS version returns only one variable, mag, but the user is also supposed to get (and use) the variable shift, which is assigned without var and therefore ends up as a global. Inconsistent and bad!
Comparing the two, the C++ version is more conventional, but it would be better to use a fixed-width type such as int64_t with a known number of bits instead of auto.

First, I think ceil(1/divide * 1<<32) can, depending on the divisor, have cases where the result is off by one. So you don't need a loop, but sometimes you need a corrective factor.
Secondly, the JS code seems to allow for shifts other than 32: shift = p - 32 // shift amount to return. But then it never returns that, so I'm not sure what is going on there.
Why not implement the JS code in C++ as well and then run a loop over all int32_t values to see if they give the same result? That shouldn't take too long.
And when you find a d where they differ, you can then test a / d for all int32_t a using both magic numbers and compare a / d, a * m_ceil and a * m_js.
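To make the off-by-one case concrete, here is a minimal sketch of the kind of check suggested above. It uses the unsigned variant of the question's formula, ceil(2^32 / d), rather than the signed algorithm from the JS code, and the helper names simple_magic_divide and simple_magic_matches are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: divide n by d using the rounded-up magic number
// ceil(2^32 / d) and a fixed shift of 32, as in the question's C++ code.
// Assumes d > 0.
std::uint32_t simple_magic_divide(std::uint32_t n, std::uint32_t d)
{
    const std::uint64_t two32 = std::uint64_t{1} << 32;
    const std::uint64_t magic = (two32 + d - 1) / d;      // ceil(2^32 / d)
    return static_cast<std::uint32_t>((n * magic) >> 32); // 64-bit product, keep the high half
}

// Hypothetical helper: true when the shortcut agrees with real division.
bool simple_magic_matches(std::uint32_t n, std::uint32_t d)
{
    return simple_magic_divide(n, d) == n / d;
}
```

For d = 7 the magic number is 613566757, and 7 * 613566757 exceeds 2^32 by 3. That small excess is harmless for the question's test values (300 and 301), but it accumulates for large dividends: simple_magic_matches(1431655770, 7) is false, because the shortcut yields 204522253 while the true quotient is 204522252. Inputs like this are presumably why the Hacker's Delight code searches for a larger shift instead of always using 32.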

Add a bit in the middle of any two bits

In a binary representation of a number is there a simpler way than this
long half(long patten, bool exclusive) {
if (patten == 0) {
return 1;
}
long newPatten = 0;
for (int p = 0; p < MAX_STEPS; p++) {
long check = 1 << p;
if ((check & patten) > 0) {
int end = (p + MAX_STEPS) - 1;
for (int to = p+1; to <= end; to++) {
long checkTo = 1 << (to % MAX_STEPS);
if ( (checkTo & patten) > 0 || to == end ) {
int distance = to - p;
long fullShift = (int)round( ((float)p) + ((float)distance)/2 );
long shift = fullShift % MAX_STEPS;
long toAdd = 1 << shift;
newPatten = newPatten | toAdd;
break;
}
}
}
}
return exclusive ? patten ^ newPatten : patten | newPatten;
}
To add a bit in the middle of any two other bits with wrapping around and rounding to one side when there is an even number of positions between?
E.g.
010001 to
110101
Or
1001 to
1101
Update: These bits aren't coming from anywhere else, such as another number, just want a new True bit to be in the middle of any other two True bits, so if there was a 101, then the middle would be the 0 and the result would be 111. Or if we started with 10001, then the middle would be the center 0 and the result would be 10101.
When I say wrapping, I also mean adding bits as if the bit array were a circle. So if we had 00100010, representing the positions with letters:
00100010
hgfedcba
So we have True bits at b and f. Going left to right, we would put a middle bit between b and f at d; going right to left and wrapping around, we would also put one at h, resulting in:
10101010
hgfedcba
I understand this isn't a usual problem.
Are there known tricks to do things like this without loops?

Problem with Reversing Large Integers On Leetcode?

I was working on this problem from Leetcode, which requires reversing numbers while staying within the +/-2^31 range. I checked out other solutions to this problem and from there created my own. It worked for numbers from 10 up to 99,999,999. Anything larger (when trying to submit the code to move to the next question) would throw an error saying:
"Line 17: Char 23: runtime error: signed integer overflow: 445600005 * 10 cannot be represented in type 'int' (solution.cpp)"
This was the input given when trying to submit the code: 1534236469
My code
class Solution {
public:
int reverse(int x) {
int flag = 0;
int rev = 0;
if (x >= pow(2, 31)) {
return 0;
} else {
if (x < 0) {
flag = 1;
x = abs(x);
}
while(x > 0) {
rev = rev * 10 + x % 10;
x /= 10;
}
if (flag == 1) {
rev = rev*(-1);
}
return rev;
}
}
};
As you can see from my code, I added an if statement that would basically return 0 if the number was greater than 2^31. Unfortunately, this was wrong.
Can anyone explain how this can be fixed? Thank you in advance.

The problem statement asks you to return 0 if the reversed number does not fit in the integer range:
If reversing x causes the value to go outside the signed 32-bit integer range [-2^31, 2^31 - 1], then return 0.
Your code checks whether the input fits in the integer range, but a corner case arises when the input has 10 digits and its last digit is greater than 2 (and in some cases when it is exactly 2).
Consider the input 1534236469. It is less than 2^31 - 1, so the program runs as expected at first.
Now trace the last few steps of the execution: rev = 964632435 and x = 1. The problem arises when the following statement is executed:
rev = rev * 10 + x % 10;
Even though the input can be represented as an int, rev * 10, i.e. 9646324350, is outside the integer range, so the correct value to return is 0.
How to fix it?
1. Handle the 10-digit case separately
This can be done, but it leads to unnecessary complications when the last digit is 2.
2. Make rev a long integer
This works and is accepted, but it is not what the problem intends, because the statement explicitly says not to use 64-bit integers:
Assume the environment does not allow you to store 64-bit integers (signed or unsigned).
3. Check before multiplying by 10
This works as expected. Before multiplying rev by 10, check whether it is >= (pow(2,31)/10):
while(x > 0) {
if (rev >= pow(2, 31)/10 )
return 0;
rev = rev * 10 + x % 10;
x /= 10;
}
I hope this clears up your doubt. Comment if you find something wrong, as this is my first answer.
Note: The following if statement is unnecessary, because the input is always a 32-bit integer:
Given a signed 32-bit integer x
if (x >= pow(2, 31)) {
    return 0;
}
Edit: As most of the comments pointed out, use the INT_MAX macro instead of pow(2,31), as it suffices here.
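Putting option 3 and the INT_MAX suggestion together, a minimal sketch could look like the following. The function name reverse_digits is made up for illustration, and this is one possible arrangement rather than the accepted solution. It works on the signed value directly instead of calling abs, which also avoids the overflow that abs would cause for the most negative input.

```cpp
#include <climits>

// Hypothetical helper: reverses the decimal digits of x, returning 0 when the
// reversed value would not fit in an int. No 64-bit integers are used.
int reverse_digits(int x)
{
    int rev = 0;
    while (x != 0) {
        const int digit = x % 10; // keeps the sign of x (C++11 and later)
        x /= 10;
        // Check whether rev * 10 + digit would overflow before computing it.
        if (rev > INT_MAX / 10 || (rev == INT_MAX / 10 && digit > INT_MAX % 10))
            return 0;
        if (rev < INT_MIN / 10 || (rev == INT_MIN / 10 && digit < INT_MIN % 10))
            return 0;
        rev = rev * 10 + digit;
    }
    return rev;
}
```

With the failing input from the question, reverse_digits(1534236469) stops at rev = 964632435 and returns 0 instead of multiplying by 10.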

public static int reverse(int x) {
boolean isNegative = false;
if (x < 0) {
isNegative = true;
x = -x;
}
long reverse = 0;
while (x > 0) {
reverse = reverse * 10 + x % 10;
x=x/10;
}
if (reverse > Integer.MAX_VALUE) {
return 0;
}
return (int) (isNegative ? -reverse : reverse);
}

Bit-wise shift for Matrix iteration?

OK, some background.
I have been working on this project, which I started back in college (I'm no longer in school but want to expand on it to improve my understanding of C++). I digress... The problem is to find the best path through a matrix. I generate a matrix filled with a set integer value, let's say 9. I then create a path along the outer edge (Row 0, Col length-1) so that all values along it are 1.
The goal is that my program will run through all the possible paths and determine the best path. To simplify the problem, I decided to just calculate the path SUM and then compare that to the SUM computed by the application.
(The title is misleading: S = single-thread, P = multi-threads)
OK, so to my question.
In one section the algorithm does some simple bit-wise shifts to come up with the bounds for iteration. My question is: how exactly do these shifts work so that the entire matrix (or MxN array) is completely traversed?
void AltitudeMapPath::bestPath(unsigned int threadCount, unsigned int threadIndex) {
unsigned int tempPathCode;
unsigned int toPathSum, toRow, toCol;
unsigned int fromPathSum, fromRow, fromCol;
Coordinates startCoord, endCoord, toCoord, fromCoord;
// To and From split matrix in half along the diagonal
unsigned int currentPathCode = threadIndex;
unsigned int maxPathCode = ((unsigned int)1 << (numRows - 1));
while (currentPathCode < maxPathCode) {
tempPathCode = currentPathCode;
// Setup to path iteration
startCoord = pathedMap(0, 0);
toPathSum = startCoord.z;
toRow = 0;
toCol = 0;
// Setup from path iteration
endCoord = pathedMap(numRows - 1, numCols - 1);
fromPathSum = endCoord.z;
fromRow = numRows - 1;
fromCol = numCols - 1;
for (unsigned int index = 0; index < numRows - 1; index++) {
if (tempPathCode % 2 == 0) {
toCol++;
fromCol--;
}
else {
toRow++;
fromRow--;
}
toCoord = pathedMap(toRow, toCol);
toPathSum += toCoord.z;
fromCoord = pathedMap(fromRow, fromCol);
fromPathSum += fromCoord.z;
tempPathCode = tempPathCode >> 1;
}
if (toPathSum < bestToPathSum[threadIndex][toRow]) {
bestToPathSum[threadIndex][toRow] = toPathSum;
bestToPathCode[threadIndex][toRow] = currentPathCode;
}
if (fromPathSum < bestFromPathSum[threadIndex][fromRow]) {
bestFromPathSum[threadIndex][fromRow] = fromPathSum;
bestFromPathCode[threadIndex][fromRow] = currentPathCode;
}
currentPathCode += threadCount;
}
}
I simplified the code, since all the extra stuff just detracts from the question. Also, if people are wondering, I wrote most of the application, but the idea of using the bit-wise operators was given to me by my past instructor.
Edit:
I added the entire algorithm that each thread executes. The entire project is still a work in progress, but here is the source code for the whole thing if anyone is interested [GITHUB]

A right bit shift of an unsigned integer is equivalent to integer division by 2 to the power of the number of bits shifted, discarding the remainder. E.g. 8 >> 2 = 8 / (2 ^ 2) = 8 / 4 = 2
A left bit shift is equivalent to multiplying by 2 to the power of the number of bits shifted, as long as the result fits in the type. E.g. 1 << 2 = 1 * 2 ^ 2 = 1 * 4 = 4
I'm not entirely sure what that algorithm does and why it needs to multiply by 2 ^ (numRows - 1) and then progressively divide by 2.
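One way to read the posted bestPath() code: 1 << (numRows - 1) is the number of distinct bit patterns with numRows - 1 bits, the outer loop visits each pattern (each thread taking every threadCount-th one), and the inner loop consumes one bit per step with % 2 and >> 1 to choose a column step or a row step. The sketch below only illustrates that decoding; decode_path and the letters 'C' and 'R' are invented for the example.

```cpp
#include <string>

// Hypothetical helper: expands a path code the way the inner loop of
// bestPath() reads it, lowest bit first.
// 'C' = column step (even bit), 'R' = row step (odd bit).
std::string decode_path(unsigned int pathCode, unsigned int steps)
{
    std::string moves;
    for (unsigned int i = 0; i < steps; ++i) {
        moves += (pathCode % 2 == 0) ? 'C' : 'R'; // test the lowest bit
        pathCode >>= 1;                           // drop it, exposing the next bit
    }
    return moves;
}
```

For example, decode_path(5, 3) reads the bits 1, 0, 1 and yields "RCR", while codes 0 through 7 together cover every 3-step combination of row and column moves.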

Fastest way of computing the power that a "power of 2" number used?

What would be the quickest way to find the power of 2, that a certain number (that is a power of two) used?
I'm not very skilled at mathematics, so I'm not sure how best to describe it. But the function would look similar to x = 2^y where y is the output, and x is the input. Here's a truth table of what it'd look like if that helps explain it.
0 = f(1)
1 = f(2)
2 = f(4)
3 = f(8)
...
8 = f(256)
9 = f(512)
I've made a function that does this, but I fear it's not very efficient (or elegant for that matter). Would there be a simpler and more efficient way of doing this? I'm using this to compute what area of a texture is used to buffer how drawing is done, so it's called at least once for every drawn object. Here's the function I've made so far:
uint32 getThePowerOfTwo(uint32 value){
    for(uint32 n = 0; n < 32; ++n){
        if(value <= (uint32(1) << n)){ // unsigned shift: 1 << 31 overflows a signed int
            return n;
        }
    }
    return 32; // should never be reached
}

Building on woolstar's answer - I wonder if a binary search of a lookup table would be slightly faster? (and much nicer looking)...
#include <algorithm>
#include <iterator>
int getThePowerOfTwo(unsigned int value) {
    // Unsigned entries: 1 << 31 does not fit in a signed int, and a negative
    // last entry would leave the table unsorted.
    static constexpr unsigned int twos[] = {
        1u<<0,  1u<<1,  1u<<2,  1u<<3,  1u<<4,  1u<<5,  1u<<6,  1u<<7,
        1u<<8,  1u<<9,  1u<<10, 1u<<11, 1u<<12, 1u<<13, 1u<<14, 1u<<15,
        1u<<16, 1u<<17, 1u<<18, 1u<<19, 1u<<20, 1u<<21, 1u<<22, 1u<<23,
        1u<<24, 1u<<25, 1u<<26, 1u<<27, 1u<<28, 1u<<29, 1u<<30, 1u<<31
    };
    return static_cast<int>(std::lower_bound(std::begin(twos), std::end(twos), value) - std::begin(twos));
}

This operation is popular enough that processor vendors provide hardware support for it. Check out find first set. Compiler vendors offer specific functions for this; unfortunately, before C++20 there was no standard name for it. So if you need maximum performance you have to write compiler-dependent code:
#ifdef __GNUC__
return __builtin_ffs( x ) - 1; // GCC
#endif
#ifdef _MSC_VER
return CHAR_BIT * sizeof(x) - 1 - __lzcnt( x ); // Visual Studio: index of the highest set bit
#endif
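The fragments above can be wrapped into a single function with a portable fallback. This is a minimal sketch: power_of_two_exponent is a made-up name, and the input is assumed to be a non-zero power of two, since __builtin_ctz is undefined for 0. With a C++20 compiler, std::countr_zero from <bit> should cover the same job without any compiler-specific code.

```cpp
#include <cstdint>

// Hypothetical helper: exponent n of an input that is exactly 2^n.
// Precondition (assumed, not checked): value is a non-zero power of two.
inline unsigned power_of_two_exponent(std::uint32_t value)
{
#if defined(__GNUC__) || defined(__clang__)
    // Counts the zero bits below the lowest set bit.
    return static_cast<unsigned>(__builtin_ctz(value));
#else
    // Portable fallback: shift right until the set bit reaches position 0.
    unsigned n = 0;
    while ((value & 1u) == 0u) {
        value >>= 1;
        ++n;
    }
    return n;
#endif
}
```

For the values in the question's table, power_of_two_exponent(256) gives 8 and power_of_two_exponent(512) gives 9.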

If the input value is always 2^n for an integer n, one way to find n is a lookup table indexed by a perfect hash function. For 32-bit unsigned integers the hash function can be value % 37, because the 32 powers of two all leave different remainders modulo 37.
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
template <std::size_t Div>
std::array<std::uint8_t, Div> build_hash()
{
    std::array<std::uint8_t, Div> hash;
    // Unused slots hold 255 so they are easy to tell apart from valid exponents.
    hash.fill(std::numeric_limits<std::uint8_t>::max());
    for (std::size_t index = 0; index < 32; ++index)
        hash[(std::uint32_t{1} << index) % Div] = static_cast<std::uint8_t>(index); // unsigned shift avoids overflow at index 31
    return hash;
}
std::uint8_t hash_log2(std::uint32_t value)
{
    static const std::array<std::uint8_t, 37> hash = build_hash<37>();
    return hash[value % 37];
}
Check
int main()
{
    for (std::size_t index = 0; index < 32; ++index)
        assert(hash_log2(std::uint32_t{1} << index) == index);
}

Your version is just fine, but as you surmised, it's O(n), which means it takes one step through the loop for every bit. You can do better. To take it to the next step, try doing the equivalent of a divide and conquer:
unsigned int log2(unsigned int value)
{
    unsigned int val = 0 ;
    unsigned int step= 16 ;
    while ( step )
    {
        // Any bits set above the low "step" bits? Then keep only that upper part.
        if ( value >> step ) { val += step ; value >>= step ; }
        step /= 2 ;
    }
    return val ;
}
Since we're just hunting for the highest bit, we start out asking if any bits are on in the upper half of the word. If there are, we can throw away all the lower bits, else we just narrow the search down.
Since the question was marked C++, here's a version using templates that tries to figure out the initial mask & step:
template <typename T>
T log2(T val)
{
T result = 0 ;
T step= ( 4 * sizeof( T ) ) ; // half the number of bits
T mask= ~ 0L - ( ( 1L << ( 4 * sizeof( T )) ) -1 ) ;
while ( val && step )
{
if ( val & mask ) { result += step ; val >>= step ; }
mask >>= ( step + 1) / 2 ;
step /= 2 ;
}
return result ;
}
While performance of either version is going to be a blip on a modern x86 architecture, this has come up for me in embedded solutions, and in the last case where I was solving a bit search very similar to this, even the O(log N) was too slow for the interrupt and we had to use a combo of divide and conquer plus table lookup to squeeze the last few cycles out.
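The combination mentioned above, divide and conquer plus a table lookup, could look roughly like this. It is a sketch under my own assumptions (a 256-entry table and two narrowing steps), not the code from that embedded project, and the names make_log2_table and log2_table_lookup are invented.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: table[i] = floor(log2(i)) for i in 1..255.
// Entry 0 is unused and left as 0.
inline std::array<std::uint8_t, 256> make_log2_table()
{
    std::array<std::uint8_t, 256> table{};
    for (unsigned i = 2; i < 256; ++i)
        table[i] = static_cast<std::uint8_t>(table[i / 2] + 1);
    return table;
}

// Hypothetical helper: index of the highest set bit of a non-zero value.
inline unsigned log2_table_lookup(std::uint32_t value)
{
    static const std::array<std::uint8_t, 256> table = make_log2_table();
    unsigned base = 0;
    if (value >> 16) { base += 16; value >>= 16; } // narrow 32 bits down to 16
    if (value >> 8)  { base += 8;  value >>= 8;  } // narrow 16 bits down to 8
    return base + table[value];                    // finish with one lookup
}
```

Two comparisons and one table read replace the five halving steps of the loop versions, at the cost of 256 bytes of table.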

If you KNOW that it is indeed a power of two (which is easy enough to verify),
Try the variant below.
Full description here: http://sree.kotay.com/2007/04/shift-registers-and-de-bruijn-sequences_10.html
//table
static const int8 xs_KotayBits[32] = {
0, 1, 2, 16, 3, 6, 17, 21,
14, 4, 7, 9, 18, 11, 22, 26,
31, 15, 5, 20, 13, 8, 10, 25,
30, 19, 12, 24, 29, 23, 28, 27
};
//only works for powers of 2 inputs
static inline int32 xs_ILogPow2 (int32 v){
assert (v && ((v & (v-1)) == 0)); // parentheses needed: == binds tighter than &
//constant is binary 10 01010 11010 00110 01110 11111
return xs_KotayBits[(uint32(v)*uint32( 0x04ad19df ))>>27];
}
