Clojure reducers fold not calling combine function as expected

I'm experimenting with the clojure reducers library, and I'm a little confused as to when the combining function is called as part of the reducers/fold function. To see what was being called when, I created the below example:
(def input (range 1 100))

(defn combine-f
  ([]
   (println "identity-combine")
   0)
  ([left right]
   (println (str "combine " left " " right))
   (max (reduce max 0 left)
        (reduce max 0 right))))

(defn reduce-f
  ([]
   (println "identity-reduce")
   0)
  ([result input]
   (println (str "reduce " result " " input))
   (max result input)))

(clojure.core.reducers/fold 10 combine-f reduce-f input)
;prints
identity-combine
reduce 0 1
reduce 1 2
reduce 2 3
reduce 3 4
.
.
.
reduce 98 99
I was expecting that when fold executes, the input would be partitioned into groups of approximately size 10, with each group reduced using reduce-f, and the results then combined using combine-f. However, running the above code, it seems that the combine function is only called once, as an identity, and the entire input is reduced using reduce-f. Can anyone explain why I'm seeing this behaviour?
Thanks,
Matt.

Unfortunately, a range cannot at the moment be realized in parallel. It seems there are foldable implementations around as an enhancement ticket, but I can't tell right now why they haven't been accepted. As is, folds over a range will always proceed like a straight reduce, except for the identity call to the combine operator. For comparison, a vector provides random access and so is foldable:
(def input (vec (range 1 50)))

(defn combine-f
  ([]
   (println "identity-combine")
   Long/MIN_VALUE)
  ([left right]
   (println (str "combine " left " " right))
   (max left right)))

(defn reduce-f
  ([]
   (println "identity-reduce")
   Long/MIN_VALUE)
  ([result input]
   (println (str "reduce " result " " input))
   (max result input)))

(clojure.core.reducers/fold 10 combine-f reduce-f input)
with output:
identity-combineidentity-combineidentity-combine
reduce -9223372036854775808 1
reduce -9223372036854775808 25reduce -9223372036854775808 19
reduce -9223372036854775808 13reduce 25 26
reduce 26 27
reduce 1 2
reduce 27 28
reduce 28 29
reduce 29 30
reduce 2 3
reduce 19 20
reduce 3 4
identity-combinereduce 4 5
reduce 5 6reduce 13 14
reduce 14 15
reduce 20 21identity-combine
reduce 21 22
reduce 15 16
reduce -9223372036854775808 31
reduce 22 23reduce 16 17reduce -9223372036854775808 7
reduce 7 8
reduce 8 9
reduce 23 24
reduce 31 32
reduce 17 18
reduce 9 10
reduce 10 11
reduce 11 12
identity-combine
reduce 32 33
combine 18 24
combine 6 12identity-combine
reduce -9223372036854775808 37
reduce 33 34
reduce 37 38reduce -9223372036854775808 43
combine 12 24
reduce 43 44reduce 34 35reduce 38 39
reduce 44 45
reduce 35 36
reduce 45 46
reduce 39 40
reduce 46 47
combine 30 36
reduce 47 48
reduce 48 49
reduce 40 41
reduce 41 42
combine 42 49
combine 36 49
combine 24 49
which you may notice is a lot more jumbled because of non-serialized accessing of *out*.
(I needed to alter combine-f a little because it was trying and failing to reduce over a single long. Switching to Long/MIN_VALUE doesn't affect this example much but is the identity element of max over longs, so I figured why not?).
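(So for the original example, the fix is simply to pour the range into a vector first, as the (vec (range 1 50)) above does; fold can then partition the work, and the two-argument combine arity actually gets exercised.)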

Related

Evaluating polynomials to 5 significant figures, but only 1 sig fig returned

For example, a polynomial is defined as follows:
f := (x, y) -> 333.75*y^6 + x^2*(11*x^2*y^2 - y^6 - 12*y^4 - 2) + 5.5*y^8 + 1/2*x/y
In Maple, I evaluate this to 5 significant figures like so:
evalf[5](f(77617,33096))
And obtain a value that is: 1*10^32.
Why is this not correct to 5 sig figs? Why is it not close to the value of 7.878 * 10^29 that you approach as you increase the number of sig figs requested?
Thanks!
Don't reduce the working precision that low, especially if you are trying to compute an accurate answer (and then round it for convenience).
More importantly, for compound expressions the floating-point working precision (Digits, or the index of an evalf call) is just that: a specification of working precision and not an accuracy request.
By lowering the working precision so much you are seeing greater roundoff error in the floating-point computation.
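To see the scale of the problem: the largest terms here (x^2*y^6 and 5.5*y^8 at x = 77617, y = 33096) are of order 10^37, and they cancel down to a result of order 10^29. With d digits of working precision the roundoff error is therefore roughly 10^(37-d), so d = 5 yields pure noise (hence the 1*10^32), and d needs to reach about 13 before 5 significant figures survive the cancellation, as the loop below shows.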
restart;
f := (x, y) -> 333.75*y^6
               + x^2*(11*x^2*y^2 - y^6 - 12*y^4 - 2)
               + 5.5*y^8 + 1/2*x/y:
for d from 5 to 15 do
  evalf[5](evalf[d](f(77617, 33096)));
end do;
1*10^32
-3*10^31
1*10^30
8*10^29
7.9*10^29
7.88*10^29
7.878*10^29
7.8784*10^29
7.8785*10^29
7.8785*10^29
7.8785*10^29

Density of fractions between 2 given numbers

I'm trying to do some analysis over a simple Fraction class and I want some data to compare that type with doubles.
The problem
Right now I'm looking for a good way to get the density of fractions between 2 given numbers. A fraction is basically 2 integers (e.g. pair<long, long>), and the density between s and t is the number of representable values in that range. It needs to be exact, or a very good approximation, computed in O(1) or at least very fast.
To make it a bit simpler, let's say I want to count all the distinct values a/b (not the fraction representations) between s and t, where 0 <= s <= a/b < t <= M, and 0 <= a, b <= M (b > 0; a and b are integers).
Example
If my fractions were of a data type which can only count to 6 (M = 6), and I wanted the density between 0 and 1, the answer would be 12. Those numbers are:
0, 1/6, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 5/6.
What I thought already
A very naive approach would be to cycle trough all the possible fractions, and count those which can't be simplified. Something like:
long fractionsIn(double s, double t){
long density = 0;
long M = LONG_MAX;
for(int d = 1; d < floor(M/t); d++){
for(int n = ceil(d*s); n < M; n++){
if( gcd(n,d) == 1 )
density++;
}
}
return density;
}
But gcd() is very slow, so this doesn't work. I also tried doing some math, but I couldn't get to anything good.
Solution
Thanks to m69's answer, I made this code for Fraction = pair<Long, Long>:
// This should give the density of fractions between first and last, or less.
double fractionsIn(unsigned long long first, unsigned long long last) {
    double pi = 3.141592653589793238462643383279502884;
    double max = LONG_MAX; // I can't use LONG_MAX directly
    double zeroToOne = max / pi * max / pi * 3; // = approx. amount of numbers in the Farey sequence of order LONG_MAX
    double res = 0;
    if (first == 0) {
        res = zeroToOne;
        first++;
    }
    for (double i = first; i < last; i++) {
        res += zeroToOne / (i * (i + 1)); // note the parentheses: i * (i + 1), not i * i + 1
        if (i == i + 1)
            i = nextafter(i + 1, last); // if this happens, I might not count some fractions, but I have no other choice
    }
    return floor(res);
}
The main change is nextafter, which becomes important with big numbers (around 1e17).
The result
As I explained at the beginning, I was trying to compare Fractions with doubles. Here is the result for Fraction = pair<Long, Long> (and here is how I got the density of doubles):
Density between: 0,1                 | 1,2              | 1e6,1e6+1   | 1e14,1e14+1 | 1e15-1,1e15 | 1e17-10,1e17 | 1e19-10000,1e19 | 1e19-1000,1e19
Doubles:         4607182418800017408 | 4503599627370496 | 8589934592  | 64          | 8           | 1            | 5               | 0
Fraction:        2.58584e+37         | 1.29292e+37      | 2.58584e+25 | 2.58584e+09 | 2.58584e+07 | 2585         | 1               | 0
Density between 0 and 1
If the integers with which you express the fractions are in the range 0~M, then the density of fractions between the values 0 (inclusive) and 1 (exclusive) is:
M: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0~(1): 1 2 4 6 10 12 18 22 28 32 42 46 58 64 72 80 96 102 120 128 140 150 172 180 200 212 230 242 270 278 308 ...
This is sequence A002088 on OEIS. If you scroll down to the formula section, you'll find information about how to approximate it, e.g.:
Φ(n) = (3 ÷ π²) × n² + O[n × (ln n)^(2/3) × (ln ln n)^(4/3)]
(Unfortunately, no more detail is given about the constants involved in the O[x] part. See discussion about the quality of the approximation below.)
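As a quick sanity check of the leading term: for M = 1023 (the 10-bit example used below), (3 ÷ π²) × 1023² ≈ 318107, which is about 0.1% below the exact value of 318452 quoted further down.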
Distribution across range
The interval from 0 to 1 contains half of the total number of unique fractions that can be expressed with numbers up to M; e.g. this is the distribution when M = 15 (i.e. 4-bit integers):
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
72 36 12 6 4 2 2 2 1 1 1 1 1 1 1 1
for a total of 144 unique fractions. If you look at the sequence for different values of M, you'll see that the steps in this sequence converge:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1: 1 1
2: 2 1 1
3: 4 2 1 1
4: 6 3 1 1 1
5: 10 5 2 1 1 1
6: 12 6 2 1 1 1 1
7: 18 9 3 2 1 1 1 1
8: 22 11 4 2 1 1 1 1 1
9: 28 14 5 2 2 1 1 1 1 1
10: 32 16 5 3 2 1 1 1 1 1 1
11: 42 21 7 4 2 2 1 1 1 1 1 1
12: 46 23 8 4 2 2 1 1 1 1 1 1 1
13: 58 29 10 5 3 2 2 1 1 1 1 1 1 1
14: 64 32 11 5 4 2 2 1 1 1 1 1 1 1 1
15: 72 36 12 6 4 2 2 2 1 1 1 1 1 1 1 1
Not only is the density between 0 and 1 half of the total number of fractions, but the density between 1 and 2 is a quarter, and the density between 2 and 3 is close to a twelfth, and so on.
As the value of M increases, the distribution of fractions across the ranges 0-1, 1-2, 2-3 ... converges to:
1/2, 1/4, 1/12, 1/24, 1/40, 1/60, 1/84, 1/112, 1/144, 1/180, 1/220, 1/264 ...
This sequence can be calculated by starting with 1/2 and then:
0-1: 1/2 x 1/1 = 1/2
1-2: 1/2 x 1/2 = 1/4
2-3: 1/4 x 1/3 = 1/12
3-4: 1/12 x 2/4 = 1/24
4-5: 1/24 x 3/5 = 1/40
5-6: 1/40 x 4/6 = 1/60
6-7: 1/60 x 5/7 = 1/84
7-8: 1/84 x 6/8 = 1/112
8-9: 1/112 x 7/9 = 1/144 ...
You can of course calculate any of these values directly, without needing the steps in between:
0-1: 1/2
6-7: 1/2 x 1/6 x 1/7 = 1/84
(Also note that the second half of the distribution sequence consists of 1's; these are all the integers divided by 1.)
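To make the closed form concrete, here is a minimal sketch (C++; intervalShare is a made-up name for illustration) that reproduces the sequence above:
#include <cstdio>

// Limit share of all unique fractions lying in the interval i ~ i+1:
// 1/2 for the interval 0 ~ 1, and 1/2 * 1/i * 1/(i+1) for i >= 1.
double intervalShare(long i) {
    if (i == 0) return 0.5;
    return 0.5 / ((double)i * (i + 1));
}

int main() {
    // prints 1/2, 1/4, 1/12, 1/24, 1/40, 1/60, 1/84, 1/112, 1/144
    for (long i = 0; i <= 8; i++)
        printf("%ld-%ld: 1/%.0f\n", i, i + 1, 1.0 / intervalShare(i));
}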
Approximating the density in given interval
Using the formulas provided on the OEIS page, you can calculate or approximate the density in the interval 0-1; multiplied by 2, this is the total number of unique values that can be expressed as fractions.
Given two values s and t, you can then calculate and sum the densities in the intervals s ~ s+1, s+1 ~ s+2, ... t-1 ~ t, or use an interpolation to get a faster but less precise approximate value.
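Note that the interval sum telescopes, since 1/(2 × i × (i+1)) = (1/2) × (1/i − 1/(i+1)): for 1 ≤ s < t the summed share is simply (1/2) × (1/s − 1/t). Multiplying the total number of fractions by that factor gives an O(1) estimate of the density between s and t, before any per-interval rounding.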
Example
Let's assume that we're using 10-bit integers, capable of expressing values from 0 to 1023. Using this table linked from the OEIS page, we find that the density between 0~1 is 318452, and the total number of fractions is 636904.
If we wanted to find the density in the interval s~t = 100~105:
100~101: 1/2 x 1/100 x 1/101 = 1/20200 ; 636904/20200 = 31.53
101~102: 1/2 x 1/101 x 1/102 = 1/20604 ; 636904/20604 = 30.91
102~103: 1/2 x 1/102 x 1/103 = 1/21012 ; 636904/21012 = 30.31
103~104: 1/2 x 1/103 x 1/104 = 1/21424 ; 636904/21424 = 29.73
104~105: 1/2 x 1/104 x 1/105 = 1/21840 ; 636904/21840 = 29.16
Rounding these values gives the sum:
32 + 31 + 30 + 30 + 29 = 152
A brute force algorithm gives this result:
32 + 32 + 30 + 28 + 28 = 150
So we're off by 1.33% for this low value of M and small interval with just 5 values. If we had used linear interpolation between the first and last value:
100~101: 31.53
104~105: 29.16
average: 30.345
total: 151.725 -> 152
we'd have arrived at the same value. For larger intervals, the sum of all the densities will probably be closer to the real value, because rounding errors will cancel each other out, but the results of linear interpolation will probably become less accurate. For ever larger values of M, the calculated densities should converge with the actual values.
Quality of approximation of Φ(n)
Using this simplified formula:
Φ(n) = (3 ÷ π²) × n²
the results are almost always smaller than the actual values, but they are within 1% for n ≥ 182, within 0.1% for n ≥ 1880 and within 0.01% for n ≥ 19494. I would suggest hard-coding the lower range (the first 50,000 values can be found here), and then using the simplified formula from the point where the approximation is good enough.
Here's a simple code example with the first 182 values of Φ(n) hard-coded. The approximation of the distribution sequence seems to add an error of a similar magnitude as the approximation of Φ(n), so it should be possible to get a decent approximation. The code simply iterates over every integer in the interval s~t and sums the fractions. To speed up the code and still get a good result, you should probably calculate the fractions at several points in the interval, and then use some sort of non-linear interpolation.
function fractions01(M) {
    var phi = [0,1,2,4,6,10,12,18,22,28,32,42,46,58,64,72,80,96,102,120,128,140,150,172,180,200,212,230,242,270,278,308,
               324,344,360,384,396,432,450,474,490,530,542,584,604,628,650,696,712,754,774,806,830,882,900,940,964,1000,
               1028,1086,1102,1162,1192,1228,1260,1308,1328,1394,1426,1470,1494,1564,1588,1660,1696,1736,1772,1832,1856,
               1934,1966,2020,2060,2142,2166,2230,2272,2328,2368,2456,2480,2552,2596,2656,2702,2774,2806,2902,2944,3004,
               3044,3144,3176,3278,3326,3374,3426,3532,3568,3676,3716,3788,3836,3948,3984,4072,4128,4200,4258,4354,4386,
               4496,4556,4636,4696,4796,4832,4958,5022,5106,5154,5284,5324,5432,5498,5570,5634,5770,5814,5952,6000,6092,
               6162,6282,6330,6442,6514,6598,6670,6818,6858,7008,7080,7176,7236,7356,7404,7560,7638,7742,7806,7938,7992,
               8154,8234,8314,8396,8562,8610,8766,8830,8938,9022,9194,9250,9370,9450,9566,9654,9832,9880,10060];
    if (M < 182) return phi[M];
    return Math.round(M * M * 0.30396355092701331433 + M / 4); // experimental; see below
}

function fractions(M, s, t) {
    var half = fractions01(M);
    var frac = (s == 0) ? half : 0;
    for (var i = (s == 0) ? 1 : s; i < t && i <= M; i++) {
        if (2 * i < M) {
            var f = Math.round(half / (i * (i + 1)));
            frac += (f < 2) ? 2 : f;
        }
        else ++frac;
    }
    return frac;
}

var M = 1023, s = 100, t = 105;
document.write(fractions(M, s, t));
Comparing the approximation of Φ(n) with the list of the 50,000 first values suggests that adding M÷4 is a workable substitute for the second part of the formula; I have not tested this for larger values of n, so use with caution.
[Graph comparing the approximations. Blue: simplified formula. Red: improved simplified formula.]
Quality of approximation of distribution
Comparing the results for M=1023 with those of a brute-force algorithm, the errors are small in real terms, never more than -7 or +6, and above the interval 205~206 they are limited to -1 ~ +1. However, a large part of the range (57~1024) has fewer than 100 fractions per integer, and in the interval 171~1024 there are only 10 fractions or fewer per integer. This means that small errors and rounding errors of -1 or +1 can have a large impact on the result, e.g.:
interval: 241 ~ 250
fractions/integer: 6
approximation: 5
total: 50 (instead of 60)
To improve the results for intervals with few fractions per integer, I would suggest combining the method described above with a separate approach for the last part of the range:
Alternative method for last part of range
As already mentioned, and implemented in the code example, the second half of the range, M÷2 ~ M, has 1 fraction per integer. Also, the interval M÷3 ~ M÷2 has 2; the interval M÷4 ~ M÷3 has 4. This is of course the Φ(n) sequence again:
M/2 ~ M : 1
M/3 ~ M/2: 2
M/4 ~ M/3: 4
M/5 ~ M/4: 6
M/6 ~ M/5: 10
M/7 ~ M/6: 12
M/8 ~ M/7: 18
M/9 ~ M/8: 22
M/10 ~ M/9: 28
M/11 ~ M/10: 32
M/12 ~ M/11: 42
M/13 ~ M/12: 46
M/14 ~ M/13: 58
M/15 ~ M/14: 64
M/16 ~ M/15: 72
M/17 ~ M/16: 80
M/18 ~ M/17: 96
M/19 ~ M/18: 102 ...
Between these intervals, one integer can have a different number of fractions, depending on the exact value of M, e.g.:
interval fractions
202 ~ 203 10
203 ~ 204 10
204 ~ 205 9
205 ~ 206 6
206 ~ 207 6
The interval 204 ~ 205 lies on the edge between intervals, because M ÷ 5 = 204.6; it has 6 + 3 = 9 fractions because M modulo 5 is 3. If M had been 1022 or 1024 instead of 1023, it would have 8 or 10 fractions. (This example is straightforward because 5 is a prime; see below.)
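These edge intervals can also be checked exactly with a cheap brute force, because in this part of the range only denominators up to M ÷ i can contribute. A minimal sketch (C++17 for std::gcd; fractionsAt is a made-up name):
#include <algorithm>
#include <numeric>

// Number of unique fractions a/b with i <= a/b < i+1 and 0 <= a, b <= M.
// Counting only reduced fractions (gcd(a, b) == 1) avoids duplicates.
long fractionsAt(long M, long i) {
    long count = 0;
    for (long b = 1; b * i <= M; b++) {         // b can be at most M / i
        long hi = std::min(M, b * (i + 1) - 1); // largest a with a/b < i+1
        for (long a = b * i; a <= hi; a++)      // smallest a with a/b >= i
            if (std::gcd(a, b) == 1)
                count++;
    }
    return count;
}
// With M = 1023: fractionsAt(1023, 204) == 9 and fractionsAt(1023, 205) == 6,
// matching the table above.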
Again, I would suggest using the hard-coded values for Φ(n) to calculate the number of fractions for the last part of the range. If you use the first 17 values as listed above, this covers the part of the range with fewer than 100 fractions per integer, so that would reduce the impact of rounding errors below 1%. The first 56 values would give you 0.1%, the first 182 values 0.01%.
Together with the values of Φ(n), you could hard-code the number of fractions of the edge intervals for each modulo value, e.g.:
modulo: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
M/ 2 1 2
M/ 3 2 3 4
M/ 4 4 5 5 6
M/ 5 6 7 8 9 10
M/ 6 10 11 11 11 11 12
M/ 7 12 13 14 15 16 17 18
M/ 8 18 19 19 20 20 21 21 22
M/ 9 22 23 24 24 25 26 26 27 28
M/10 28 29 29 30 30 30 30 31 31 32
M/11 32 33 34 35 36 37 38 39 40 41 42
M/12 42 43 43 43 43 44 44 45 45 45 45 46
M/13 46 47 48 49 50 51 52 53 54 55 56 57 58
M/14 58 59 59 60 60 61 61 61 61 62 62 63 63 64
M/15 64 65 66 66 67 67 67 68 69 69 69 70 70 71 72
M/16 72 73 73 74 74 75 75 76 76 77 77 78 78 79 79 80
M/17 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96
M/18 96 97 97 97 97 98 98 99 99 99 99 100 100 101 101 101 101 102
This is exactly the same as Σ φ(k) for 1 <= k <= M, where φ(k) is the Euler totient function (and with φ(0) = 1, as defined by the problem, to account for the value 0). There is no known closed form for this sum, but many optimizations are known, as mentioned in the wiki link. It is known as the Totient Summatory Function on Wolfram MathWorld, which also links to the series A002088 and provides a few asymptotic approximations.
The reasoning is this: consider the values of the form {1/M, 2/M, ..., (M-1)/M, M/M}. All those fractions that can be reduced to a smaller denominator are not counted in φ(M), because they are not relatively prime; they appear in the summation of another totient instead.
For example, Φ(6) = φ(1) + ... + φ(6) = 12, which matches the question's example of 12 values: the interval [0, 1) swaps the fraction 1/1 (counted by φ(1)) for the value 0, leaving the total unchanged.
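As a concrete illustration of this summation, here is a minimal sketch (C++; totientSum is a made-up name) that computes Φ(M) = φ(1) + ... + φ(M) exactly with a standard totient sieve, practical whenever M is small enough to iterate over:
#include <cstdio>
#include <vector>

unsigned long long totientSum(int M) {
    std::vector<int> phi(M + 1);
    for (int i = 0; i <= M; i++) phi[i] = i;
    for (int p = 2; p <= M; p++)
        if (phi[p] == p)                  // still untouched, so p is prime
            for (int k = p; k <= M; k += p)
                phi[k] -= phi[k] / p;     // apply the factor (1 - 1/p)
    unsigned long long sum = 0;
    for (int k = 1; k <= M; k++) sum += phi[k];
    return sum;
}

int main() {
    printf("%llu\n", totientSum(6));    // 12, matching the question's example
    printf("%llu\n", totientSum(1023)); // 318452, the 0~1 density quoted earlier
}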

How to make a cumulative sequence?

Say I have a lazy sequence like the following:
(def s (iterate inc 1))
(take 10 s)
=> (1 2 3 4 5 6 7 8 9 10)
Now, I want to generate a sequence of cumulative sum of s like the following:
=> (1 3 6 10 15 ...)
How can I do this?
What I tried is to use an atom and accumulate the sum into it (mutating). Is this the only way to generate a cumulative sequence, or is there a better way to do this?
NOTE: the above cumulative sum is only an example. The source can be any sequence, so I can't use the formula s(n) = n(n+1)/2.
(take 10 (reductions + s))
=> (1 3 6 10 15 21 28 36 45 55)
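(reductions works like reduce, but returns a lazy seq of every intermediate result instead of just the final one, so it works for any source sequence and stays lazy.)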

ant colony optimisation for 01 MKP

I'm trying to implement an ACO for the 01 MKP. My input values are from the OR-Library file mknap1.txt. According to my algorithm, first I choose an item randomly. Then I calculate the probabilities for all other items on the construction graph. The probability equation depends on the pheromone level and the heuristic information:
p[i] = (tau[i] * n[i]) / Σ(tau[j] * n[j])
My pheromone matrix's cells all start at a constant value (0.2). Because of this, when I try to find the next item to go to, the pheromone matrix is ineffective (everything is 0.2), so my probability function picks the next item based on the heuristic information alone. As you know, the heuristic information equation is
n[i] = profit[i] / Ravg
(Ravg is the average of the resource constraints). For this reason my probability function always chooses the item with the biggest profit value. (Let's say at the first iteration my algorithm randomly selected the item with 600 profit; at the second iteration it then chooses the one with 2400 profit. But in the OR-Library instance, the item with the 2400 profit value causes a resource violation. Whatever I do, the second choice is always the item with 2400 profit.)
Is there anything wrong with my algorithm? I hope people who know something about ACO can help me. Thanks in advance.
Input values:
6 10 3800 // no of items (n) / no of resources (m) / the optimal value
100 600 1200 2400 500 2000 // profits of items (6)
8 12 13 64 22 41 // resource constraints matrix (m*n), 10 rows
8 12 13 75 22 41
3 6 4 18 6 4
5 10 8 32 6 12
5 13 8 42 6 20
5 13 8 48 6 20
0 0 0 0 8 0
3 0 4 0 8 0
3 2 4 0 8 4
3 2 4 8 8 4
80 96 20 36 44 48 10 18 22 24 // resource capacities
My algorithm:
for i=0 to max_ant
for j=0; to item_number
if j==0
{
item=rand()%n
ant[i].value+=profit[item]
ant[i].visited[j]=item
}
else
{
calculate probabilities for all the other items in P[0..n]
find the biggest P value.
item=biggest P's item.
check if it is in visited list
check if it causes resource constraint.
if everthing is ok:
ant[i].value+=profit[item]
ant[i].visited[j]=item
}//end of else
}//next j
update pheremon matrix => tau[a][b]=rou*tau[a][b]+deltaTou
}//next i
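For what it's worth, a common ACO remedy for exactly this symptom is to sample the next item from the probabilities instead of always taking the biggest p[i]; greedy selection makes every ant deterministic after the first step. Below is a minimal sketch of that selection step (C++; all names are made up, and the visited/feasibility checks are assumed to be folded into allowed[]):
#include <cstdlib>
#include <vector>

// Roulette-wheel selection: choose item i with probability
// p[i] = tau[i] * n[i] / sum_j(tau[j] * n[j]) over the still-allowed items.
int pickNextItem(const std::vector<double>& tau,  // pheromone levels
                 const std::vector<double>& n,    // heuristic information
                 const std::vector<bool>& allowed) {
    double total = 0.0;
    for (size_t i = 0; i < tau.size(); i++)
        if (allowed[i]) total += tau[i] * n[i];
    if (total <= 0.0) return -1; // no feasible item left

    double r = (double)rand() / RAND_MAX * total;
    int last = -1;
    for (size_t i = 0; i < tau.size(); i++) {
        if (!allowed[i]) continue;
        last = (int)i;
        r -= tau[i] * n[i];
        if (r <= 0.0) break;
    }
    return last; // falls back to the last allowed item on rounding error
}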

Scheme: Making a list of sublist maxes

Given a list, each item of which is an (r g b) color, return a list consisting of the
maximum component in each color.
Example: given ((123 200 6) (10 30 20) (212 255 10) (0 0 39) (37 34 34)),
code should return (200 30 255 39 37)
(define (sublist-max list-of-lists)
  (map (lambda (sublist) (apply max sublist)) list-of-lists))
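Called on the example input, this returns (200 30 255 39 37): apply spreads each (r g b) sublist out as the arguments of max, and map collects one maximum per sublist.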