Downscale array for decimal factor - c++

Is there an efficient way to downscale the number of elements in an array by a decimal factor?
I want to downsize elements from one array by certain factor.
Example:
If I have 10 elements and need to scale down by factor 2.
1 2 3 4 5 6 7 8 9 10
scaled to
1.5 3.5 5.5 7.5 9.5
Group the elements 2 by 2 and take the arithmetic mean.
My problem is: what if I need to downsize an array with 10 elements to 6 elements? In theory I should group about 1.67 elements (10/6) and find their arithmetic mean, but how do I do that?

Before suggesting a solution, let's define "downsize" in a more formal way. I would suggest this definition:
Downsizing starts with an array a[N] and produces an array b[M] such that the following is true:
M <= N - otherwise it would be upsizing, not downsizing
SUM(b) = (M/N) * SUM(a) - The sum is reduced proportionally to the number of elements
Elements of a participate in computation of b in the order of their occurrence in a
Let's consider your example of downsizing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 to six elements. The total for your array is 55, so the total for the new array would be (6/10)*55 = 33. We can achieve this total with the following steps:
Walk the array a totaling its elements until we've reached the integer part of N/M fraction (it must be an improper fraction by rule 1 above)
Let's say that a[i] was the last element of a that we could take as a whole in the current iteration. Take the fraction of a[i+1] equal to the fractional part of N/M
Continue to the next number starting with the remaining fraction of a[i+1]
Once you are done, your array b would contain M numbers totaling to SUM(a). Walk the array once more, and scale the result by M/N (that is, divide each element by N/M).
Here is how it works with your example:
b[0] = a[0] + (2/3)*a[1] = 2.33333
b[1] = (1/3)*a[1] + a[2] + (1/3)*a[3] = 5
b[2] = (2/3)*a[3] + a[4] = 7.66666
b[3] = a[5] + (2/3)*a[6] = 10.6666
b[4] = (1/3)*a[6] + a[7] + (1/3)*a[8] = 13.3333
b[5] = (2/3)*a[8] + a[9] = 16
--------
Total = 55
Scaling down by 6/10 produces the final result:
1.4 3 4.6 6.4 8 9.6 (Total = 33)
Here is a simple implementation in C++:
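// Assumes a and b are std::vector<double>, 0 < b.size() <= a.size(),
// and every element of b starts at zero.
double need = static_cast<double>(a.size()) / b.size(); // input elements per output element
double have = 0; // how much of "need" the current element of b has already received
size_t pos = 0;
for (size_t i = 0; i != a.size(); i++) {
    if (need >= have + 1) {
        b[pos] += a[i];
        have++;
    } else {
        double frac = need - have; // frac is less than 1 because of the "if" condition
        b[pos++] += frac * a[i];   // frac of a[i] goes to the current element of b
        have = 1 - frac;
        if (pos < b.size()) {      // rounding can send the very last a[i] into this branch
            b[pos] += have * a[i]; // (1-frac) of a[i] goes to the next element of b
        }
    }
}
for (size_t i = 0; i != b.size(); i++) {
    b[i] /= need;
}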

You will need to resort to some form of interpolation, as the number of elements to average isn't an integer.
You can consider computing the prefix sum of the array, starting from 0 so that entry i holds the sum of the first i elements, i.e.
1 2 3 4 5 6 7 8 9 10
yields by summation
0 1 2 3 4 5 6 7 8 9 10
0 1 3 6 10 15 21 28 36 45 55
Then perform linear interpolation to get the intermediate values that you are lacking, at 0*, 10/6, 20/6, 30/6*, 40/6, 50/6, 60/6*. (Those with an asterisk are readily available.)
0 10/6 20/6 30/6 40/6 50/6 60/6
0 7/3 22/3 15 77/3 39 55
Now you get fractional sums by subtracting values in pairs, and averages by dividing each difference by the group width 10/6. The first average is
(7/3 - 0)/(10/6) = 7/5
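As a rough sketch of the prefix-sum idea in C++: the function name `downscaleByPrefixSum` and its signature are made up for illustration, and it assumes the prefix sum starts at 0 (entry i is the sum of the first i elements) and that 0 < m <= a.size().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: shrink `a` to `m` averaged values using an
// interpolated prefix sum. Assumes 0 < m <= a.size().
std::vector<double> downscaleByPrefixSum(const std::vector<double>& a, std::size_t m) {
    const std::size_t n = a.size();
    assert(m > 0 && m <= n);

    // prefix[i] = a[0] + ... + a[i-1], so prefix[0] = 0 and prefix[n] = total.
    std::vector<double> prefix(n + 1, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        prefix[i + 1] = prefix[i] + a[i];
    }

    // Prefix sum at a fractional position x in [0, n], linearly interpolated.
    auto sumUpTo = [&](double x) -> double {
        std::size_t lo = static_cast<std::size_t>(x);
        if (lo >= n) return prefix[n];
        return prefix[lo] + (x - static_cast<double>(lo)) * a[lo];
    };

    const double width = static_cast<double>(n) / static_cast<double>(m);
    std::vector<double> b(m);
    for (std::size_t k = 0; k < m; ++k) {
        double upper = sumUpTo(static_cast<double>(k + 1) * width);
        double lower = sumUpTo(static_cast<double>(k) * width);
        b[k] = (upper - lower) / width; // fractional sum divided by group width
    }
    return b;
}
```

Called with the ten values 1 to 10 and m = 6, this should give approximately 1.4, 3, 4.6, 6.4, 8, 9.6. The output vector is sized once up front, so nothing grows during the loop.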

I can't think of anything in the C++ library that will crank out something like this, all fully cooked and ready to go.
So you'll have to, pretty much, roll up your sleeves and go to work. At this point, the question of what's the "efficient" way of doing it boils down to its very basics. Which means:
1) Calculate how big the output array should be. Based on the description of the issue, you should be able to make that calculation even before looking at the values in the input array. You know the input array's size(), you can calculate the size() of the destination array.
2) So, you resize() the destination array up front. Now, you no longer need to worry about the time wasted in growing the size of the dynamic output array, incrementally, as you go through the input array, making your calculations.
3) So what's left is the actual work: iterating over the input array, and calculating the downsized values.
auto b=input_array.begin();
auto e=input_array.end();
auto p=output_array.begin();
Don't see many other options here, besides brute force iteration and calculations. Iterate from b to e, getting your samples, calculating each downsized value, and saving the resulting value into *p++.

What is the maximum number of comparisons to heapify an array?

Is there a general formula to calculate the maximum number of comparisons to heapify n elements?
If not, is 13 the max number of comparisons to heapify an array of 8 elements?
My reasoning is as such:
at h = 0, 1 node, 0 comparisons, 1* 0 = 0 comparisons
at h = 1, 2 nodes, 1 comparison each, 2*1 = 2 comparisons
at h = 2, 4 nodes, 2 comparisons each, 4*2 = 8 comparisons
at h = 3, 1 node, 3 comparisons each, 1*3 = 3 comparisons
Total = 0 + 2 + 8 + 3 =13
Accepted theory is that build-heap requires at most (2N - 2) comparisons, so for 8 elements the number of comparisons should be at most 14. We can confirm that easily enough by examining a heap of 8 elements:
          7
        /   \
       3     1
      / \   / \
     5   4 8   2
    /
   6
Here, the 4 leaf nodes will never move down. The nodes 5 and 1 can move down 1 level. 3 could move down two levels. And 7 could move down 3 levels. So the maximum number of level moves is:
(0*4)+(1*2)+(2*1)+(3*1) = 7
Every level move requires at most 2 comparisons, so the number of comparisons is at most 14. The bound is not tight here: a position with a single child, such as the one holding 5, needs only one comparison.
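For small N the bound can be checked by brute force. The sketch below is only an illustration: the names `siftDown`, `buildHeapComparisons` and `maxBuildHeapComparisons` are made up, it assumes a bottom-up build of a min-heap, and it counts element comparisons only (one to pick the smaller child when there are two, one against the parent).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Sift a[i] down in a min-heap and return the number of element comparisons.
int siftDown(std::vector<int>& a, std::size_t i) {
    const std::size_t n = a.size();
    int comparisons = 0;
    while (2 * i + 1 < n) {
        std::size_t child = 2 * i + 1;
        if (child + 1 < n) {              // two children: pick the smaller one
            ++comparisons;
            if (a[child + 1] < a[child]) ++child;
        }
        ++comparisons;                    // compare parent with the chosen child
        if (!(a[child] < a[i])) break;
        std::swap(a[i], a[child]);
        i = child;
    }
    return comparisons;
}

// Bottom-up heap construction on a copy of `a`; returns total comparisons.
int buildHeapComparisons(std::vector<int> a) {
    int total = 0;
    for (std::size_t i = a.size() / 2; i-- > 0;) {
        total += siftDown(a, i);
    }
    return total;
}

// Worst case over all permutations of n distinct keys (only sensible for small n).
int maxBuildHeapComparisons(int n) {
    assert(n >= 0 && n <= 10);
    std::vector<int> perm(static_cast<std::size_t>(n));
    std::iota(perm.begin(), perm.end(), 0);
    int worst = 0;
    do {
        worst = std::max(worst, buildHeapComparisons(perm));
    } while (std::next_permutation(perm.begin(), perm.end()));
    return worst;
}
```

With this particular sift-down, `maxBuildHeapComparisons(8)` should come out at 11, which stays under the 2N - 2 = 14 bound. A different comparison strategy may give a different exact count.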

Using bit wise operators

I'm working on a C++ app on the Windows platform. There's an unsigned char buffer that gets bytes, shown below in decimal format.
unsigned char array[160];
This will have values like this,
array[0] = 0
array[1] = 0
array[2] = 176
array[3] = 52
array[4] = 0
array[5] = 0
array[6] = 223
array[7] = 78
array[8] = 0
array[9] = 0
array[10] = 123
array[11] = 39
array[12] = 0
array[13] = 0
array[14] = 172
array[15] = 51
.......
........
.........
and so forth...
I need to take each block of 4 bytes and then calculate its decimal value.
So, for example, for the first 4 bytes the combined hex value is B034. Now I need to convert this to decimal and divide by 1000.
As you can see, in each 4-byte block the first 2 bytes are always 0, so I can ignore those and take the last 2 bytes of that block. In the example above, that's 176 and 52.
There are many ways of doing this, but I want to do it using bitwise operators.
Below is what I tried, but it's not working. Basically, I'm ignoring the first 2 bytes of every 4-byte block.
int index = 0;
for (int i = 0 ; i <= 160; i++) {
index++;
index++;
float Val = ((Array[index]<<8)+Array[index+1])/1000.0f;
index++;
}
Since you're processing the array four bytes at a time, I recommend that you increment i by 4 in the for loop. That also lets you drop the unnecessary index variable: you have i in the loop and can use it directly.
Another thing: prefer bitwise OR over arithmetic addition when you're trying to "concatenate" numbers, although the outcome is identical here. Also note that the condition must be i < 160; with i <= 160 the last iteration reads past the end of the array.
for (int i = 0; i < 160; i += 4) {
    float val = ((array[i + 2] << 8) | array[i + 3]) / 1000.0f;
}
First of all, i <= 160 is one iteration too many.
Second, your incrementation is wrong; for index, you have
Iteration 1:
1, 2, 3
And you're combining 2 and 3 - this is correct.
Iteration 2:
4, 5, 6
And you're combining 5 and 6 - should be 6 and 7.
Iteration 3:
7, 8, 9
And you're combining 8 and 9 - should be 10 and 11.
You need to increment four times per iteration, not three.
But I think it's simpler to start looping at the first index you're interested in - 2 - and increment by 4 (the "stride") directly:
for (int i = 2; i < 160; i += 4) {
    float val = ((array[i] << 8) + array[i + 1]) / 1000.0f;
}
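Putting the pieces together, a small helper could decode the whole buffer in one pass. This is only a sketch: the name `decodeBlocks` is made up, and it assumes the layout described in the question (4-byte blocks whose last two bytes hold a big-endian 16-bit value).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for every 4-byte block, combine bytes 2 and 3
// (high byte first) and divide the 16-bit result by 1000.
std::vector<float> decodeBlocks(const unsigned char* bytes, std::size_t length) {
    assert(length % 4 == 0);
    std::vector<float> values;
    values.reserve(length / 4);
    for (std::size_t i = 0; i + 3 < length; i += 4) {
        // Shift the high byte left by 8 bits and OR in the low byte.
        unsigned int raw = (static_cast<unsigned int>(bytes[i + 2]) << 8) | bytes[i + 3];
        values.push_back(static_cast<float>(raw) / 1000.0f);
    }
    return values;
}
```

For the first block in the question (0, 0, 176, 52) the combined value is 0xB034 = 45108, so the result is roughly 45.108.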

how can we find the nth 3 word combination from a word corpus of 3000 words

I have a word corpus of say 3000 words such as [hello, who, this ..].
I want to find the nth 3 word combination from this corpus. I am fine with any order as long as the algorithm gives consistent output.
What would be the time complexity of the algorithm?
I have seen this answer but was looking for something simple.
(Note that I will be using 1-based indexes and ranks throughout this answer.)
To generate all combinations of 3 elements from a list of n elements, we'd take all elements from 1 to n-2 as the first element, then for each of these we'd take all elements after the first element up to n-1 as the second element, then for each of these we'd take all elements after the second element up to n as the third element. This gives us a fixed order, and a direct relation between the rank and a specific combination.
If we take element i as the first element, there are (n-i choose 2) possibilities for the second and third element, and thus (n-i choose 2) combinations with i as the first element. If we then take element j as the second element, there are (n-j choose 1) = n-j possibilities for the third element, and thus n-j combinations with i and j as the first two elements.
Linear search in tables of binomial coefficients
With tables of these binomial coefficients, we can quickly find a specific combination, given its rank. Let's look at a simplified example with a list of 10 elements; these are the number of combinations with element i as the first element:
i
1 C(9,2) = 36
2 C(8,2) = 28
3 C(7,2) = 21
4 C(6,2) = 15
5 C(5,2) = 10
6 C(4,2) = 6
7 C(3,2) = 3
8 C(2,2) = 1
---
120 = C(10,3)
And these are the number of combinations with element j as the second element:
j
2 C(8,1) = 8
3 C(7,1) = 7
4 C(6,1) = 6
5 C(5,1) = 5
6 C(4,1) = 4
7 C(3,1) = 3
8 C(2,1) = 2
9 C(1,1) = 1
So if we're looking for the combination with e.g. rank 96, we look at the number of combinations for each choice of first element i, until we find which group of combinations the combination ranked 96 is in:
i
1 36 96 > 36 96 - 36 = 60
2 28 60 > 28 60 - 28 = 32
3 21 32 > 21 32 - 21 = 11
4 15 11 <= 15
So we know that the first element i is 4, and that within the 15 combinations with i=4, we're looking for the eleventh combination. Now we look at the number of combinations for each choice of second element j, starting after 4:
j
5 5 11 > 5 11 - 5 = 6
6 4 6 > 4 6 - 4 = 2
7 3 2 <= 3
So we know that the second element j is 7, and that the third element is the second combination with j=7, which is k=9. So the combination with rank 96 contains the elements 4, 7 and 9.
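The linear search can be sketched without storing any table, since each count is cheap to compute on the fly. This is an illustration only; the function name and signature are assumptions, and ranks and elements are 1-based as in the text.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical sketch: 1-based rank -> 1-based combination {i, j, k}
// with i < j < k, chosen from n elements, by linear search.
std::array<int, 3> rankToCombination(std::uint64_t rank, int n) {
    assert(n >= 3);
    const std::uint64_t total =
        static_cast<std::uint64_t>(n) * static_cast<std::uint64_t>(n - 1) *
        static_cast<std::uint64_t>(n - 2) / 6;
    assert(rank >= 1 && rank <= total);

    int i = 1;
    for (;; ++i) {
        const std::uint64_t m = static_cast<std::uint64_t>(n - i);
        const std::uint64_t count = m * (m - 1) / 2; // C(n-i, 2) combinations start with i
        if (rank <= count) break;
        rank -= count;
    }
    int j = i + 1;
    for (;; ++j) {
        const std::uint64_t count = static_cast<std::uint64_t>(n - j); // C(n-j, 1)
        if (rank <= count) break;
        rank -= count;
    }
    const int k = j + static_cast<int>(rank);
    return {i, j, k};
}
```

For the example above, `rankToCombination(96, 10)` should return {4, 7, 9}.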
Binary search in tables of running total of binomial coefficients
Instead of creating a table of the binomial coefficients and then performing a linear search, it is of course more efficient to create a table of the running total of the binomial coefficients, and then perform a binary search on it. This will improve the time complexity from O(N) to O(logN); in the case of N=3000, each of the two look-ups can be done in about log2(3000) ≈ 12 steps.
So we'd store:
i
1 36
2 64
3 85
4 100
5 110
6 116
7 119
8 120
and:
j
2 8
3 15
4 21
5 26
6 30
7 33
8 35
9 36
Note that when finding j in the second table, you have to subtract the sum corresponding with i from the sums. Let's walk through the example of rank 96 and combination [4,7,9] again; we find the first value that is greater than or equal to the rank:
3 85 96 > 85
4 100 96 <= 100
So we know that i=4; we then subtract the previous sum next to i-1, to get:
96 - 85 = 11
Now we look at the table for j, but we start after j=4, and subtract the sum corresponding to 4, which is 21, from the sums. Then again, we find the first value that is greater than or equal to the rank we're looking for (which is now 11):
6 30 - 21 = 9 11 > 9
7 33 - 21 = 12 11 <= 12
So we know that j=7; we subtract the previous sum corresponding to j-1, to get:
11 - 9 = 2
So we know that the second element j is 7, and that the third element is the second combination with j=7, which is k=9. So the combination with rank 96 contains the elements 4, 7 and 9.
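A possible C++ rendering of the binary-search version uses std::lower_bound on the running totals. The struct and function names here are invented for the sketch, and the tables are built at run time rather than hard-coded.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tables of running totals, indexed by the 1-based element.
struct RankTables {
    std::vector<std::uint64_t> iTotals; // iTotals[i] = combinations whose first element is <= i
    std::vector<std::uint64_t> jTotals; // jTotals[j] = (n-2) + (n-3) + ... + (n-j)
};

RankTables buildTables(int n) {
    assert(n >= 3);
    RankTables t;
    t.iTotals.assign(static_cast<std::size_t>(n - 1), 0); // indices 0 .. n-2
    t.jTotals.assign(static_cast<std::size_t>(n), 0);     // indices 0 .. n-1
    for (int i = 1; i <= n - 2; ++i) {
        const std::uint64_t m = static_cast<std::uint64_t>(n - i);
        t.iTotals[i] = t.iTotals[i - 1] + m * (m - 1) / 2;
    }
    for (int j = 2; j <= n - 1; ++j) {
        t.jTotals[j] = t.jTotals[j - 1] + static_cast<std::uint64_t>(n - j);
    }
    return t;
}

// 1-based rank -> {i, j, k}; assumes 1 <= rank <= C(n, 3).
std::array<int, 3> unrank(const RankTables& t, std::uint64_t rank) {
    // First i whose running total reaches the rank.
    auto itI = std::lower_bound(t.iTotals.begin() + 1, t.iTotals.end(), rank);
    assert(itI != t.iTotals.end());
    const int i = static_cast<int>(itI - t.iTotals.begin());
    rank -= t.iTotals[i - 1];
    rank += t.jTotals[i]; // shift so the j table can be searched without copying
    auto itJ = std::lower_bound(t.jTotals.begin() + i + 1, t.jTotals.end(), rank);
    assert(itJ != t.jTotals.end());
    const int j = static_cast<int>(itJ - t.jTotals.begin());
    rank -= t.jTotals[j - 1];
    return {i, j, j + static_cast<int>(rank)};
}
```

Adding `jTotals[i]` to the rank, instead of subtracting it from every table entry, has the same effect as the subtraction described above. `unrank(buildTables(10), 96)` should give {4, 7, 9}.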
Hard-coding the look-up tables
It is of course unnecessary to generate these look-up tables again every time we want to perform a look-up. We only need to generate them once, and then hard-code them into the rank-to-combination algorithm; this should take only 2998 * 64-bit + 2998 * 32-bit = 35kB of space, and make the algorithm incredibly fast.
Inverse algorithm
The inverse algorithm, to find the rank given a combination of elements [i,j,k] then means:
Finding the index of the elements in the list; if the list is sorted (e.g. words sorted alphabetically) this can be done with a binary search in O(logN).
Find the sum in the table for i that corresponds with i-1.
Add to that the sum in the table for j that corresponds with j-1, minus the sum that corresponds with i.
Add to that k-j.
Let's look again at the same example with the combination of elements [4,7,9]:
i=4 -> table_i[3] = 85
j=7 -> table_j[6] - table_j[4] = 30 - 21 = 9
k=9 -> k-j = 2
rank = 85 + 9 + 2 = 96
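The inverse direction can be sketched the same way. Here the sums are computed with plain loops instead of table look-ups, purely to keep the example short; the function name and signature are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: 1-based combination (i < j < k) out of n elements
// -> 1-based rank, in the same ordering as described above.
std::uint64_t combinationToRank(int i, int j, int k, int n) {
    assert(1 <= i && i < j && j < k && k <= n);
    std::uint64_t rank = 0;
    // All combinations whose first element is smaller than i.
    for (int first = 1; first < i; ++first) {
        const std::uint64_t m = static_cast<std::uint64_t>(n - first);
        rank += m * (m - 1) / 2; // C(n-first, 2)
    }
    // Combinations starting with i whose second element is smaller than j.
    for (int second = i + 1; second < j; ++second) {
        rank += static_cast<std::uint64_t>(n - second);
    }
    // Position of k among the elements after j.
    return rank + static_cast<std::uint64_t>(k - j);
}
```

`combinationToRank(4, 7, 9, 10)` should return 96, matching the worked example.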
Look-up tables for N=3000
This snippet generates the look-up table with the running total of the binomial coefficients for i = 1 to 2998:
function C(n, k) { // binomial coefficient (Pascal's triangle)
    if (k < 0 || k > n) return 0;
    if (k > n - k) k = n - k;
    if (! C.t) C.t = [[1]];
    while (C.t.length <= n) {
        C.t.push([1]);
        var l = C.t.length - 1;
        for (var i = 1; i < l / 2; i++)
            C.t[l].push(C.t[l - 1][i - 1] + C.t[l - 1][i]);
        if (l % 2 == 0)
            C.t[l].push(2 * C.t[l - 1][(l - 2) / 2]);
    }
    return C.t[n][k];
}
for (var total = 0, x = 2999; x > 1; x--) {
    total += C(x, 2);
    document.write(total + ", ");
}
This snippet generates the look-up table with the running total of the binomial coefficients for j = 2 to 2999:
for (var total = 0, x = 2998; x > 0; x--) {
    total += x;
    document.write(total + ", ");
}
Code example
Here's a quick code example, unfortunately without the full hardcoded look-up tables, because of the size restriction on answers on SO. Run the snippets above and paste the results into the arrays iTable and jTable (after the leading zeros) to get the faster version with hard-coded look-up tables.
function combinationToRank(i, j, k) {
    return iTable[i - 1] + jTable[j - 1] - jTable[i] + k - j;
}
function rankToCombination(rank) {
    var i = binarySearch(iTable, rank, 1);
    rank -= iTable[i - 1];
    rank += jTable[i];
    var j = binarySearch(jTable, rank, i + 1);
    rank -= jTable[j - 1];
    var k = j + rank;
    return [i, j, k];

    function binarySearch(array, value, first) {
        var last = array.length - 1;
        while (first < last - 1) {
            var middle = Math.floor((last + first) / 2);
            if (value > array[middle]) first = middle;
            else last = middle;
        }
        return (value <= array[first]) ? first : last;
    }
}
var iTable = [0]; // append look-up table values here
var jTable = [0, 0]; // and here
// remove this part when using hard-coded look-up tables
function C(n,k){if(k<0||k>n)return 0;if(k>n-k)k=n-k;if(!C.t)C.t=[[1]];while(C.t.length<=n){C.t.push([1]);var l=C.t.length-1;for(var i=1;i<l/2;i++)C.t[l].push(C.t[l-1][i-1]+C.t[l-1][i]);if(l%2==0)C.t[l].push(2*C.t[l-1][(l-2)/2])}return C.t[n][k]}
for (var iTotal = 0, jTotal = 0, x = 2999; x > 1; x--) {
    iTable.push(iTotal += C(x, 2));
    jTable.push(jTotal += x - 1);
}
document.write(combinationToRank(500, 1500, 2500) + "<br>");
document.write(rankToCombination(1893333750) + "<br>");

Counting ways of breaking up a string of digits into numbers under 26

Given a string of digits, I wish to find the number of ways of breaking up the string into individual numbers so that each number is under 26.
For example, "8888888" can only be broken up as "8 8 8 8 8 8 8". Whereas "1234567" can be broken up as "1 2 3 4 5 6 7", "12 3 4 5 6 7" and "1 23 4 5 6 7".
I'd like both a recurrence relation for the solution, and some code that uses dynamic programming.
This is what I've got so far. It only covers the base cases: an empty string should return 1, a string of one digit should return 1, and a string in which every digit is larger than 2 should return 1.
int countPerms(vector<int> number, int currentPermCount)
{
    vector< vector<int> > permsOfNumber;
    vector<int> working;
    int totalPerms=0, size=number.size();
    bool areAllOverTwo=true, forLoop = true;
    if (number.size() <=1)
    {
        //TODO: print out permutations
        return 1;
    }
    for (int i = 0; i < number.size()-1; i++) //minus one here because we don't care what the last digit is: if all of them before it are over 2 then there is only one way to decode them
    {
        if (number.at(i) <= 2)
        {
            areAllOverTwo = false;
        }
    }
    if (areAllOverTwo) //if all the numbers are over 2 then there is only one possible combination, e.g. 3456676546 has only one combination.
    {
        permsOfNumber.push_back(number);
        //TODO: write function to print out the permutations
        return 1;
    }
    do
    {
        //TODO find all the permutations here
    } while (forLoop);
    return totalPerms;
}
Assuming you either don't have zeros, or you disallow numbers with leading zeros, the recurrence relations are:
N(1aS) = N(S) + N(aS)
N(2aS) = N(S) + N(aS) if a < 6.
N(a) = 1
N(aS) = N(S) otherwise
Here, a refers to a single digit, and S to a (possibly empty) string of digits, with N of the empty string taken to be 1. The first line of the recurrence relation says that if your string starts with a 1, then you can either have it on its own, or join it with the next digit. The second line says that if you start with a 2 you can either have it on its own, or join it with the next digit, assuming that gives a number less than 26. The third line is the termination condition: when you're down to 1 digit, the result is 1. The final line says that if you haven't been able to match one of the previous rules, then the first digit can't be joined to the second, so it must stand on its own. For example, N(12) = N("") + N(2) = 1 + 1 = 2.
The recurrence relations can be implemented fairly directly as an iterative dynamic programming solution. Here's code in Python, but it's easy to translate into other languages.
def N(S):
    a1, a2 = 1, 1
    for i in range(len(S) - 2, -1, -1):
        if S[i] == '1' or S[i] == '2' and S[i+1] < '6':
            a1, a2 = a1 + a2, a1
        else:
            a1, a2 = a1, a1
    return a1
print(N('88888888'))
print(N('12345678'))
Output:
1
3
An interesting observation is that N('1' * n) is the (n+1)st Fibonacci number:
for i in range(1, 10):
    print(i, N('1' * i))
Output:
1 1
2 2
3 3
4 5
5 8
6 13
7 21
8 34
9 55
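Since the question asks for C++, here is a hedged translation of the same iterative recurrence. The function name `countSplits` is made up, and like the Python version it assumes the input contains only the digits 1 to 9 (no zeros).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sketch: number of ways to split `s` into numbers under 26.
// Assumes s contains only the characters '1'..'9'.
std::uint64_t countSplits(const std::string& s) {
    if (s.size() < 2) return 1;  // empty string or a single digit
    std::uint64_t cur = 1;       // ways for the suffix starting at i + 1
    std::uint64_t after = 1;     // ways for the suffix starting at i + 2
    for (std::size_t i = s.size() - 1; i-- > 0;) {
        // s[i] can pair with s[i+1] when the two digits form 11..19 or 21..25.
        const bool canPair = s[i] == '1' || (s[i] == '2' && s[i + 1] < '6');
        const std::uint64_t now = canPair ? cur + after : cur;
        after = cur;
        cur = now;
    }
    return cur;
}
```

`countSplits("88888888")` should give 1 and `countSplits("12345678")` should give 3, as in the Python output. Only two running values are kept, so memory use is constant.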
If I understand correctly, there are only 25 possibilities. My first crack at this would be to initialize an array of 25 ints all to zero and when I find a number less than 25, set that index to 1. Then I would count up all the 1's in the array when I was finished looking at the string.
What do you mean by recurrence? If you're looking for a recursive function, you would need to find a good way to break the string of numbers down recursively. I'm not sure that's the best approach here. I would just go through digit by digit and as you said if the digit is 2 or less, then store it and test appending the next digit... i.e. 10*digit + next. I hope that helped! Good luck.
Another way to think about it is that, after the initial single digit possibility, for every sequence of contiguous possible pairs of digits (e.g., 111 or 12223) of length n we multiply the result by:
1 + sum, i=1 to floor (n/2), of (n-i) choose i
For example, with a sequence of 11111, we can have
i=1, 1 1 1 11 => 5 - 1 = 4 choose 1 (possibilities with one pair)
i=2, 1 11 11 => 5 - 2 = 3 choose 2 (possibilities with two pairs)
This seems directly related to Wikipedia's description of Fibonacci numbers' "Use in Mathematics," for example, in counting "the number of compositions of 1s and 2s that sum to a given total n" (http://en.wikipedia.org/wiki/Fibonacci_number).
Using the combinatorial method (or other fast Fibonacci's) could be suitable for strings with very long sequences.

Calculating Hamming Sequence in C++ (a sequence of numbers that has only 2, 3, and 5 as dividers) [duplicate]

I've been brainstorming over this forever, and I just can't figure this out. I need to solve the following problem:
Generate the following sequence and display the nth term in the sequence: 2,3,4,5,6,8,9,10,12,15, etc. The sequence only has the prime factors 2, 3, 5.
I need to use basic C++, such as while, for, if, etc. Nothing fancy. I can't use arrays simply because I don't know much about them yet, and I want to understand the code for the solution.
I'm not asking for a complete solution, but I am asking for guidance to get through this... please.
My problem is that I can't figure out how to check whether a number in the sequence is divisible by any prime numbers other than 2, 3, and 5.
Also let's say I'm checking the number like this:
for(int i=2; i<n; i++){
    if(i%2==0){
        cout<<i<<", ";
    }else if(i%3==0){
        cout<<i<<", ";
    }else if(i%5==0){
        cout<<i<<", ";
    }
}
It doesn't work, simply because it'll produce numbers such as 14, which can be divided by the prime number 7. So I need to figure out how to ensure that the numbers in the sequence are only divisible by 2, 3, and 5. I've found lots of material online with solutions for the problem, but the solutions they have are far too advanced, and I can't use them (also most of them are in other languages, not C++). I'm sure there's a simpler way.
The problem with your code is that you just check one of the prime factors, not all of them.
Take your example of 14. Your code only checks if 2, 3 or 5 is a factor of 14, which is not exactly what you need. Indeed, you find that 2 is a factor of 14, but the other factor is 7, as you said. What you are missing is a further check of whether that remaining 7 has only 2, 3 and 5 as factors (which is not the case). What you need to do is eliminate all the factors 2, 3 and 5 and see what remains.
Let's take two examples: 60 and 42
For 60
Start with factors 2
60 % 2 = 0, so now check 60 / 2 = 30.
30 % 2 = 0, so now check 30 / 2 = 15.
15 % 2 = 1, so no more factors of 2.
Go on with factors 3
15 % 3 = 0, so now check 15 / 3 = 5.
5 % 3 = 2, so no more factors of 3.
Finish with factors 5
5 % 5 = 0, so now check 5 / 5 = 1
1 % 5 = 1, so no more factors of 5.
We end up with 1, so this number is part of the sequence.
For 42
Again, start with factors 2
42 % 2 = 0, so now check 42 / 2 = 21.
21 % 2 = 1, so no more factors of 2.
Go on with factors 3
21 % 3 = 0, so now check 21 / 3 = 7.
7 % 3 = 1, so no more factors of 3.
Finish with factors 5
7 % 5 = 2, so no more factors of 5.
We end up with 7 (something different from 1), so this number is NOT part of the sequence.
So in your implementation, you should probably nest 3 while loops in your for loop to reflect this reasoning.
Store the next i value in a temporary variable and then divide it by 2 as long as you can (i.e. as long as the value % 2 == 0). Then divide by 3 as long as you can. Then by 5. And then check what is left.
What about this?
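bool try_hamming(int n)
{
    if (n < 1) return false; // guard: n == 0 would loop forever below
    while(n%2 == 0)
    {
        n = n/2;
    }
    while(n%3 == 0)
    {
        n = n/3;
    }
    while(n%5 == 0)
    {
        n = n/5;
    }
    return n==1; // only factors 2, 3 and 5 were present
}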
This should return true when n is a Hamming number and false otherwise. So the main function could look something like this:
#include <iostream>
using namespace std;
int main()
{
    for(int i=2;i<100;++i)
    {
        if(try_hamming(i) )
            cout<< i <<",";
    }
    cout<<endl;
}
This should print out all Hamming numbers less than 100.
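Since the original task asks for the nth term, the divisibility test can be wrapped in a counting loop, still without arrays. This is a minimal sketch; the names `hasOnlyFactors235` and `nthTerm` are made up, and the sequence is assumed to start at 2 as in the question.

```cpp
#include <cassert>

// True when n > 0 and n has no prime factors other than 2, 3 and 5.
bool hasOnlyFactors235(int n) {
    if (n < 1) return false;
    while (n % 2 == 0) n /= 2;
    while (n % 3 == 0) n /= 3;
    while (n % 5 == 0) n /= 5;
    return n == 1;
}

// Hypothetical helper: the nth term (1-based) of 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...
int nthTerm(int n) {
    assert(n >= 1);
    int candidate = 1; // the question's sequence starts at 2, so 1 itself is skipped
    int found = 0;
    while (found < n) {
        ++candidate;
        if (hasOnlyFactors235(candidate)) ++found;
    }
    return candidate;
}
```

`nthTerm(10)` should return 15, the tenth value listed in the question. This brute-force scan is simple but slows down for large n, because such numbers become sparse.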