C++ Radix sort algorithm

I'm trying to understand radix sort for my data structures class. My teacher showed us a sample of radix sort in C++. I don't understand what the for loop for the digits does; she said something about maximum digits. Also, when I try this in VS, it says log10 is an ambiguous call to an overloaded function.
void RadixSort(int A[], int size)
{
int d = 1;
for(int i = 0; i < size; ++i)
{
int digits_temp;
digits_temp=(int)log10(abs(A[i]!=0 ? abs(A[i]) : 1)) +1;
if(digits_temp > d)
d = digits_temp;
}
d += 1;
*rest of the implementation*
}
Can anyone explain what this for loop does and why I get that ambiguous call error? Thanks

That piece of code is just a search for the number of digits needed for the "longest" integer; that's probably needed to allocate some buffer later.
log10 gives you the power of ten that corresponds to its argument; truncating that to an integer with the (int) cast and then adding 1 gives you the number of digits required for the number.
The argument of log10 is a bit of a mess, since abs is called twice when just once would suffice. Still, the idea is to pass to log10 the absolute value of the number being examined if it's not zero, or 1 if it is zero - this because, if the argument were zero, the logarithm would diverge to minus infinity (which is not desirable in this case, I think that the conversion to int would lead to strange results).
The rest of the loop is just the search for the maximum: at each iteration it calculates the digits needed for the current int being examined, checks if it's bigger than the "current maximum" (d) and, if it is, it replaces the "current maximum".
The d+=1 may be for cautionary purposes (?) or for the null-terminator of the string being allocated, it depends on how d is used afterward.
As for the "ambiguous call" error: you get it because you are calling log10 with an int argument, which can be converted equally to float, double and long double (all types for which log10 is overloaded), so the overload to choose is not clear to the compiler. Just stick a (double) cast before the whole log10 argument.
By the way, that code could have been simplified/optimized by just looking for the maximum int (in absolute value) and then taking the base-10 logarithm to discover the number of digits needed.

Log base 10 + 1 gives you the total number of digits present in a number.
Essentially here, you are checking every element in the array A[] and if the element is == 0 you store 1 in the digits_temp variable.
You initialize d = 1 because a number has at least 1 digit, and if a number has more than 1 digit you replace d with the number of digits calculated.
Hope that helps.

log10 has three overloads, taking float, double and long double.
log10( static_cast<double> (abs(A[i]!=0 ? abs(A[i]) : 1)) );
So you need to static_cast the argument to double to avoid the error.
(int)log10(x)+1 gives the number of digits in x.
The rest is a straightforward implementation of radix sort.

You see the error because log10 is overloaded for float, double and long double but not for int, and it is being called with an int. The compiler can convert the int into any of those types, so the call is ambiguous.
The for loop is doing a linear search for the maximum number of digits in any of the numbers in the array. It is unnecessarily complicated and slow, because you could simply search for the largest absolute value in A and then take the log10 of that.
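#include <cmath>   // log10
#include <cstdlib> // abs
void RadixSort(int A[], int size)
{
int max_abs = 1; // start at 1 so log10 never receives 0
for(int i = 0; i < size; ++i)
{
if(abs(A[i]) > max_abs)
max_abs = abs(A[i]);
}
// number of decimal digits in the largest absolute value
int d = (int)log10(double(max_abs)) + 1;
/* rest of the implementation */
}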

The rest of the code is missing, so I can't determine exactly how d is used.
Basically, radix sort goes over all the integers and sorts them digit by digit, starting from the least significant digit and working upwards.
The first part of the code only determines the maximum digit count + 1 of the integers in the array; this could be used to normalize all numbers to the same length for easier handling,
i.e. (1,239,2134) to (0001,0239,2134)
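
Since the rest of the implementation is not shown, here is a minimal sketch of what a least-significant-digit radix sort could look like. It is not the teacher's code: the function name radix_sort_lsd, the use of std::vector, and the restriction to non-negative values are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// LSD radix sort in base 10.
// Assumption: every element is non-negative.
void radix_sort_lsd(std::vector<int>& a)
{
    if (a.empty()) return;

    // The largest value decides how many digit passes are needed.
    int max_value = a[0];
    for (int v : a)
        if (v > max_value) max_value = v;

    std::vector<int> output(a.size());

    // exp is 1, 10, 100, ...: the place value handled in this pass.
    // long long keeps exp from overflowing when max_value is near INT_MAX.
    for (long long exp = 1; max_value / exp > 0; exp *= 10) {
        int count[10] = {0};

        // Histogram of the current digit.
        for (int v : a)
            ++count[(v / exp) % 10];

        // Prefix sums: count[d] becomes the end position of bucket d.
        for (int d = 1; d < 10; ++d)
            count[d] += count[d - 1];

        // Walk backwards so elements with equal digits keep their order.
        for (std::size_t i = a.size(); i-- > 0; ) {
            int digit = static_cast<int>((a[i] / exp) % 10);
            output[--count[digit]] = a[i];
        }
        a = output;
    }
}
```

Here the number of passes falls out of the loop condition max_value / exp > 0, so no separate digit count is needed.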

Related

How to convert an arbitrary length unsigned int array to a base 10 string representation?

I am currently working on an arbitrary size integer library for learning purposes.
Each number is represented as uint32_t *number_segments.
I have functional arithmetic operations, and the ability to print the raw bits of my number.
However, I have struggled to find any information on how I could convert my arbitrarily long array of uint32 into the correct, and also arbitrarily long base 10 representation as a string.
Essentially I need a function along the lines of:
std::string uint32_array_to_string(uint32_t *n, size_t n_length);
Any pointers in the right direction would be greatly appreciated, thank you.

You do it the same way as you do with a single uint64_t except on a larger scale (bringing this into modern c++ is left for the reader):
char * to_str(uint64_t x) {
static char buf[23] = {0}; // leave space for a minus sign added by the caller
char *p = &buf[22];
do {
*--p = '0' + (x % 10);
x /= 10;
} while(x > 0);
return p;
}
The function fills a buffer from the end with the lowest digits and divides the number by 10 in each step and then returns a pointer to the first digit.
Now with big nums you can't use a static buffer but have to adjust the buffer size to the size of your number. You probably want to return a std::string and creating the number in reverse and then copying it into a result string is the way to go. You also have to deal with negative numbers.
Since a long division of a big number is expensive, you probably don't want to divide by 10 in the loop. Rather, divide by 1'000'000'000 and convert the remainder into 9 digits. That should be the largest power of 10 for which the long division divides by a single integer rather than bignum / bignum. You might be limited to 10'000 if you don't use uint64_t in the division.
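
A minimal sketch of the divide-by-10^9 approach described above. It assumes that n[0] is the least significant 32-bit segment (the question does not say which order is used) and that the value is unsigned; the const qualifier on the pointer is also an addition.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: n[0] is the least significant 32-bit segment.
std::string uint32_array_to_string(const std::uint32_t* n, std::size_t n_length)
{
    // Working copy, because the repeated division destroys the value.
    std::vector<std::uint32_t> limbs(n, n + n_length);
    while (!limbs.empty() && limbs.back() == 0) limbs.pop_back();
    if (limbs.empty()) return "0";

    const std::uint32_t chunk = 1000000000u; // 10^9, largest power of 10 below 2^32
    std::string reversed;                    // decimal digits, least significant first

    while (!limbs.empty()) {
        // Long division of the whole number by 10^9, most significant limb first.
        std::uint64_t remainder = 0;
        for (std::size_t i = limbs.size(); i-- > 0; ) {
            std::uint64_t cur = (remainder << 32) | limbs[i];
            limbs[i] = static_cast<std::uint32_t>(cur / chunk);
            remainder = cur % chunk;
        }
        while (!limbs.empty() && limbs.back() == 0) limbs.pop_back();

        // Emit the remainder as 9 digits; the topmost chunk is not zero-padded.
        for (int d = 0; d < 9; ++d) {
            if (limbs.empty() && remainder == 0 && d > 0) break;
            reversed.push_back(static_cast<char>('0' + remainder % 10));
            remainder /= 10;
        }
    }
    std::reverse(reversed.begin(), reversed.end());
    return reversed;
}
```

Each pass of the outer loop costs one sweep over the segments and yields nine decimal digits, instead of one digit per sweep when dividing by 10.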

finding number of trailing zeroes in a number

I wanted to find the number of trailing zeroes in a number, so I wrote the following code. It works fine for certain numbers, but for bigger numbers it starts showing anomalies. For example, when I input "12345678" it reports zero 0s, which is correct, but when I input "123456789" it reports one zero. What could be the mistake in my code?
#include<iostream>
#include<math.h>
using namespace std;
int main(){
int n = 0;
float s;
cin>>s; //the number to be given as input
for(int j = 0;j <100;j++){
s = s/10;
if(s == floor(s)){
n++;
}else{
break;
}
}
cout<<n<<endl;
return 0;
}

Floating point numbers have a limited precision. Usually, float is a 32-bit number, double is a 64-bit one. Float can store integer numbers precisely if the number is less than or equal to 16777216 (it is 2^24).
So, when 123456789 is read into a float variable, it cannot be stored exactly and becomes 123456792. At that point there is no sense in counting trailing zeros, because the digits are no longer the ones you typed.
Double can store integer numbers precisely if it is less than or equal to 9007199254740992 (2^53).
An unsigned long long int can store integer numbers less than 2^64. If you choose this way, use this condition for checking trailing zero: if (number%10==0)
If you only want to count trailing zeros, and that's all, then use std::string instead. This way you can handle as big numbers as you like.
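
The two suggestions above could be sketched roughly as follows. The function names are made up for this example, and treating the input 0 as having one trailing zero is an assumption, not something the question specifies.

```cpp
#include <cstddef>
#include <string>

// Counts trailing '0' characters in a decimal string such as "123000".
// Assumption: the string holds only digits; "0" counts as one zero.
std::size_t count_trailing_zeros_str(const std::string& digits)
{
    std::size_t n = 0;
    for (std::size_t i = digits.size(); i-- > 0 && digits[i] == '0'; )
        ++n;
    return n;
}

// Same idea with integer arithmetic, valid for values below 2^64.
unsigned count_trailing_zeros_int(unsigned long long number)
{
    if (number == 0) return 1; // assumption: treat 0 as having one zero
    unsigned n = 0;
    while (number % 10 == 0) {
        number /= 10;
        ++n;
    }
    return n;
}
```

To use the string version in the posted program, read the input with cin into a std::string instead of a float.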

C++ Modulus returning wrong answer

Here is my code :
#include <iostream>
#include <cmath>
using namespace std;
int main()
{
int n, i, num, m, k = 0;
cout << "Enter a number :\n";
cin >> num;
n = log10(num);
while (n > 0) {
i = pow(10, n);
m = num / i;
k = k + pow(m, 3);
num = num % i;
--n;
cout << m << endl;
cout << num << endl;
}
k = k + pow(num, 3);
return 0;
}
When I input 111 it gives me this
1
12
1
2
I am using codeblocks. I don't know what is wrong.

Whenever I use pow expecting an integer result, I add .5, so I use (int)(pow(10,n)+.5) instead of letting the compiler automatically convert pow(10,n) to an int.
I have read many places telling me others have done exhaustive tests of some of the situations in which I add that .5 and found zero cases where it makes a difference. But accurately identifying the conditions in which it isn't needed can be quite hard. Using it when it isn't needed does no real harm.
If it makes a difference, it is a difference you want. If it doesn't make a difference, it had a tiny cost.
In the posted code, I would adjust every call to pow that way, not just the one I used as an example.
There is no equally easy fix for your use of log10, but it may be subject to the same problem. Since you expect a non-integer answer and want that answer truncated down to an integer, adding .5 would be very wrong. So you may need to find some more complicated workaround for the fundamental problem of working with floating point. I'm not certain, but assuming 32-bit integers, I think adding 1e-10 to the result of log10 before converting to int is never enough to change log10(10^n-1) into log10(10^n), but always enough to correct the error that might have done the reverse.
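
One way to sidestep floating point for the log10 step entirely is to count digits with integer division. This is a sketch of that alternative, not the fix proposed above; count_digits is a hypothetical helper and it assumes a non-negative argument.

```cpp
// Number of decimal digits in a non-negative int, using no floating point.
// Assumption: value >= 0.
int count_digits(int value)
{
    int digits = 1;
    while (value >= 10) {
        value /= 10;
        ++digits;
    }
    return digits;
}
```

In the posted program, count_digits(num) - 1 would play the role of n = log10(num) for any positive num.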

pow does floating-point exponentiation.
Floating point functions and operations are inexact, you cannot ever rely on them to give you the exact value that they would appear to compute, unless you are an expert on the fine details of IEEE floating point representations and the guarantees given by your library functions.
(and furthermore, floating-point numbers might even be incapable of representing the integers you want exactly)
This is particularly problematic when you convert the result to an integer, because the result is truncated toward zero: int x = 0.999999; sets x == 0, not x == 1. Even the tiniest error in the wrong direction completely spoils the result.
You could round to the nearest integer, but that has problems too; e.g. with sufficiently large numbers, your floating point numbers might not have enough precision to be near the result you want. Or if you do enough operations (or unstable operations) with the floating point numbers, the errors can accumulate to the point you get the wrong nearest integer.
If you want to do exact, integer arithmetic, then you should use functions that do so. e.g. write your own ipow function that computes integer exponentiation without any floating-point operations at all.
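
A minimal sketch of such an ipow function, using repeated squaring. The signature is an assumption, and the caller is responsible for making sure the result fits in long long.

```cpp
// Integer power by repeated squaring, with no floating-point operations.
// Assumption: the mathematical result fits in long long.
long long ipow(long long base, unsigned exp)
{
    long long result = 1;
    while (exp > 0) {
        if (exp & 1u) result *= base; // this bit of the exponent is set
        exp >>= 1;
        if (exp > 0) base *= base;    // skip the last squaring, it is never used
    }
    return result;
}
```

In the posted program, i = ipow(10, n) and k = k + ipow(m, 3) would replace the two pow calls.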

Calculating Probability C++ Bernoulli Trials

The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
Flawlessly, the program creates a random number between 0 and 1. 0's are considered heads and success.
Then, the program is supposed to output the expected values of getting x amount of heads. For example if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the numbers come out to weird values. It happens if 15 or 20 is given as the input. I have been getting 0's and negative values for the value that should be xxx.
Debugging, I have noticed that nCk comes out negative and incorrect towards the upper values, and I believe this is the issue. I use this formula for my combination:
double combo = fact(n)/fact(r)/fact(n-r);
Here is the pseudocode for my fact function:
long fact(int x)
{
int e; // local counter
factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.

You haven't mentioned how factor is declared. I think you are getting integer overflows. I suggest you use double: since you are calculating expected values and probabilities, a small loss of precision should not matter much.
Try changing your fact function to.
double fact(double x)
{
int e; // local counter
double factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
EDIT:
Also, to calculate nCk you need not calculate factorials 3 times. You can simply calculate the value in the following way:
if k > n/2, set k = n-k, then
nCk = n(n-1)(n-2)...(n-k+1) / factorial(k)
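
The formula above can be sketched as follows, interleaving the multiplications and divisions so that intermediate values stay small. The names binomial and binomial_pmf are made up for this example, and double is used on the assumption that approximate probabilities are acceptable.

```cpp
#include <cmath>

// nCk via the multiplicative formula, in double to avoid integer overflow.
double binomial(int n, int k)
{
    if (k < 0 || k > n) return 0.0;
    if (k > n - k) k = n - k;              // symmetry: nCk == nC(n-k)
    double result = 1.0;
    for (int i = 1; i <= k; ++i)
        result = result * (n - k + i) / i; // after step i: C(n-k+i, i)
    return result;
}

// Probability of exactly k successes in n trials with success chance p:
// nCk * p^k * (1-p)^(n-k)
double binomial_pmf(int n, int k, double p)
{
    return binomial(n, k) * std::pow(p, k) * std::pow(1.0 - p, n - k);
}
```

For a fair coin flipped 4 times, binomial_pmf(4, 2, 0.5) evaluates 6 * 0.25 * 0.25.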

You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number--what type that is will depend on what values you need.
long is a signed integer, and where it is 32 bits wide (as on Windows), the value will typically wrap to negative as soon as you pass 2^31 - 1 (it's using 2's complement math).
Using an unsigned long will buy you a little time (one more bit), but for factorial, it's probably not worth it. If your compiler supports long long, then try an "unsigned long long". That will (usually, depends on compiler and CPU) double the number of bits you're using.
You can also try switching to use double. The problem you'll face there is that you'll lose accuracy as the numbers increase. A double is a floating point number, so you'll have a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't work.
If none of these solutions will work for you, you may need to resort to using an "infinite precision" math package, which you should be able to search for. You didn't say if you were using C or C++; this is going to be a lot more pleasant with C++ as it will provide a class that acts like a number and that would use standard arithmetic operators.

Is there a way to find sum of digits of 100!?

I know there is a way of finding the sum of digits of 100! (or any other big number's factorial) using Python. But I find it really tough when it comes to C++, as the size of even long long is not enough.
I just want to know if there is some other way.
I get it that it is not possible as our processor is generally 32 bits. What I am referring is some other kind of tricky technique or algorithm which can accomplish the same using the same resources.

Use a digit array with the standard, on-paper method of multiplication. For example, in C :
#include <stdio.h>
#define DIGIT_COUNT 256
void multiply(int* digits, int factor) {
int carry = 0;
for (int i = 0; i < DIGIT_COUNT; i++) {
int digit = digits[i];
digit *= factor;
digit += carry;
digits[i] = digit % 10;
carry = digit / 10;
}
}
int main(int argc, char** argv) {
int n = 100;
int digits[DIGIT_COUNT];
digits[0] = 1;
for (int i = 1; i < DIGIT_COUNT; i++) { digits[i] = 0; }
for (int i = 2; i <= n; i++) { multiply(digits, i); } /* include n itself so this is n!, not (n-1)! */
int digitSum = 0;
for (int i = 0; i < DIGIT_COUNT; i++) { digitSum += digits[i]; }
printf("Sum of digits in %d! is %d.\n", n, digitSum);
return 0;
}

How are you going to find the sum of digits of 100!. If you calculate 100! first, and then find the sum, then what is the point. You will have to use some intelligent logic to find it without actually calculating 100!. Remove all the factors of five because they are only going to add zeros. Think in this direction rather than thinking about the big number. Also I am sure the final answer i.e. the sum of the digits will be within LONG LONG.
There are C++ big int libraries, but I think the emphasis here is on algorithm rather than library.

long long was not part of standard C++ before C++11; g++ provided it as an extension.
Arbitrary Precision Arithmetic is something that you are looking for. Check out the pseudocode given in the wiki page.
Furthermore long long cannot store such large values. So you can either create your BigInteger Class or you can use some 3rd party libraries like GMP or C++ BigInteger.

If you're referring to the Project Euler problem, my reading of that is that it wants you to write your own arbitrary-precision integer library or class that can multiply numbers.
My suggestion is to store the base-10 digits of a number, in reverse order to the way you'd normally write them, because you'll need to convert the number to base 10 in the end, anyway. Storing the digits in reverse order makes writing the addition and multiplication routines slightly easier, in my opinion. Then write addition and multiplication routines that emulate how you would add or multiply numbers manually.

Observe that multiplying any number by 10 or 100 does not change the sum of the digits.
Once you recognize that, see that multiplying by 2 and 5, or by 20 and 50, also does not change the sum, since 2x5 = 10 and 20x50 = 1000.
Then notice that anytime your current computation ends in a 0, you can simply divide by 10, and keep calculating your factorial.
Make a few more observations about shortcuts to eliminate numbers from 1 to 100, and I think you might be able to fit the answer into standard ints.

There are a number of BigInteger libraries available in C++. Just Google "C++ BigInteger". But if this is a programming contest problem then you should better try to implement your own BigInteger library.

Nothing in project Euler requires more than __int64.
I would suggest trying to do it using base 10000.
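
A rough sketch of the base-10000 idea: each array element holds four decimal digits, so the multiplication loop is about four times shorter than with one digit per element. The function name and the use of std::vector are assumptions, and the 32-bit intermediate is only safe for modest n (10000 * n must stay well below 2^32).

```cpp
#include <cstdint>
#include <vector>

// Sum of the decimal digits of n!, using base-10000 limbs (least significant first).
// Assumption: n is small enough that 10000 * n fits in 32 bits.
int factorial_digit_sum(int n)
{
    const std::uint32_t base = 10000;
    std::vector<std::uint32_t> limbs{1};

    for (int factor = 2; factor <= n; ++factor) {
        std::uint32_t carry = 0;
        for (std::uint32_t& limb : limbs) {
            std::uint32_t cur = limb * static_cast<std::uint32_t>(factor) + carry;
            limb = cur % base;
            carry = cur / base;
        }
        // Whatever is left over extends the number with new limbs.
        while (carry > 0) {
            limbs.push_back(carry % base);
            carry /= base;
        }
    }

    // Each limb holds up to four decimal digits; add them individually.
    int sum = 0;
    for (std::uint32_t limb : limbs)
        for (; limb > 0; limb /= 10)
            sum += static_cast<int>(limb % 10);
    return sum;
}
```

Leading zeros inside a limb contribute nothing to the sum, so no padding is needed when adding up the digits.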

You could take the easy road and use perl/python/lisp/scheme/erlang/etc to calculate 100! using one of their built-in bignum libraries or the fact some languages use exact integer arithmetic. Then take that number, store it into a string, and find the sum of the characters (accounting for '0' = 48 etc).
Or, you could consider that in 100!, you will get a really large number with many many zeros. If you calculate 100! iteratively, consider dividing by 10 every time the current factorial is divisible by 10. I believe this will yield a result within the range of long long or something.
Or, probably a better exercise is to write your own big int library. You will need it for some later problems if you do not determine the clever tricks.
