Running into an unresolved access error using Block distribution - Chapel

I've been running into an error when trying to compile with Block distribution.
This is the error:
error: unresolved access of '[BlockDom(3,int(64),false,unmanaged DefaultDist)] real(64)' by '[int(64), int(64)]'
use Random, BlockDist;
config const size = 10;
const Space = {1..size, 1..size};
const gridSpace: domain(2) dmapped Block(boundingBox=Space);
var grid: [gridSpace] real;
var grid2: [gridSpace] real;
var grid3: [gridSpace] real;
fillRandom(grid);
fillRandom(grid2);
forall i in gridSpace do {
  forall j in gridSpace do {
    forall k in gridSpace do {
      grid3[i,j] += grid[i,k] * grid2[k,j]; // error here
    }
  }
}

When iterating over a multi-dimensional domain in Chapel with a single index, the index will have the index type of the domain. In your example above, the distributed domain gridSpace is a 2-dimensional domain, therefore iterating over it with a single index will yield tuples of 2 integers.
For example,
var dom = {1..2, 1..2};
for idx in dom {
  writeln(idx); // index type is (int, int)
}
will print:
(1, 1)
(1, 2)
(2, 1)
(2, 2)
The error I got when compiling your example with Chapel 1.19.0 is:
error: unresolved access of '[BlockDom(2,int(64),false,unmanaged DefaultDist)] real(64)' by '[2*int(64), 2*int(64)]'
This is telling us that we are trying to index into a block-distributed 2D array ([BlockDom(2,int(64),false,unmanaged DefaultDist)]) of reals (real(64)) with 2 tuples of 2 integers ([2*int(64), 2*int(64)]).
One way you could correct the above example is by iterating over each dimension explicitly:
forall i in gridSpace.dim(1) {
  forall j in gridSpace.dim(2) {
    forall k in gridSpace.dim(1) {
      grid3[i,j] += grid[i,k] * grid2[k,j];
    }
  }
}
However, note that there will be multiple iterations of the inner-most loop trying to add to the same index of grid3 in parallel, creating a data race.
You can remove this data race by making the inner loop serial:
forall (i,j) in gridSpace {
  for k in gridSpace.dim(2) {
    grid3[i,j] += grid[i,k] * grid2[k,j];
  }
}
Alternatively, you can use a + reduction to handle the inner loop summation:
forall (i,j) in gridSpace {
  grid3[i,j] = + reduce (grid[i,..]*grid2[..,j]);
}
There are two other issues with the code above that I noticed:
The gridSpace is only defined with a type and no value, so it is actually an empty distributed domain. You can fix this by initializing it with the value of Space:
const gridSpace: domain(2) dmapped Block(boundingBox=Space) = Space;
See the Distributions primer for more examples.
The do is not needed in the forall loops above. do is only required when omitting the curly braces for a single-statement loop body, e.g.
for i in dom do writeln(i);
See the for-loops guide for more information.
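For readers less familiar with Chapel, the reduce-based formulation above is just ordinary matrix multiplication where each output entry is a single sum over one index. Here is a minimal Python sketch of that equivalence (my own illustration, not part of the original answer):

```python
import random

def matmul_reduce(a, b):
    """Each (i, j) entry is a sum-reduction over the elementwise
    product of row i of a and column j of b, mirroring
    `+ reduce (grid[i,..]*grid2[..,j])` in the Chapel version."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]

random.seed(0)
n = 4
grid  = [[random.random() for _ in range(n)] for _ in range(n)]
grid2 = [[random.random() for _ in range(n)] for _ in range(n)]
grid3 = matmul_reduce(grid, grid2)
```

Because the k-sum is a single reduction rather than repeated `+=` updates from concurrent tasks, there is no shared accumulator to race on.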


What's the time-complexity function [ T(n) ] for these loops?

j = n;
while (j >= 1) {
    i = j;
    while (i <= n) { cout << "Printed"; i *= 2; }
    j /= 2;
}
My goal is to find T(n) (the function that gives the exact number of executions), whose order is expected to be n·log(n), but I need the exact function, one that works at least for n = 1 to n = 10.
I have tried to predict the function, and ended up with T(n) = floor((n-1)·log(n)) + n,
which is correct only for n = 1 and n = 2.
I should mention that I found that inaccurate function by converting the original code to the equivalent for-loops below:
for (j = 1; j <= n; j *= 2) {
    for (i = j; i <= n; i *= 2) {
        cout << "Printed";
    }
}
Thanks in advance for any help finding the exact T(n). 🙏
Using log(x) to mean the floor of the base-2 logarithm:
1.)
The inner loop executes 1 + log(N) - log(j) times, and the outer loop runs with j = 1, 2, 4, ..., N, i.e. 1 + log(N) times. Summing the inner-loop counts over all values of j gives the overall cost:
T(N) = (1 + log(N))² - (log(1) + log(2) + log(4) + ... + log(N))
     = log(N)² + 2·log(N) + 1 - (0 + 1 + 2 + ... + log(N))
     = log(N)² + 2·log(N) + 1 - log(N)·(log(N) + 1)/2
     = log(N)²/2 + 3·log(N)/2 + 1
2.) The second version of the code is the same, just in reverse order.
I know it is not a proof, but maybe easier to follow than the math: godbolt (play with n; it always returns 0).
Outer loop and inner loop are both O(log₂ N).
So total time is
O(log₂ N * log₂ N)
which reduces to O(lg² N).
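The closed form log(N)²/2 + 3·log(N)/2 + 1 derived above is easy to check empirically. Here is a short Python sketch (my own check, using log = floor of the base-2 logarithm, as stated):

```python
def iterations(n):
    """Count how many times "Printed" is emitted by the nested while loops."""
    count = 0
    j = n
    while j >= 1:
        i = j
        while i <= n:
            count += 1
            i *= 2
        j //= 2
    return count

def closed_form(n):
    """log(N)^2/2 + 3*log(N)/2 + 1, i.e. (L+1)(L+2)/2 with L = floor(log2(n))."""
    L = n.bit_length() - 1  # floor(log2(n)) for n >= 1
    return (L * L + 3 * L) // 2 + 1

table = [(n, iterations(n)) for n in range(1, 9)]
```

The two agree exactly for every n, not just asymptotically, which matches the "it always returns 0" godbolt remark above.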

Running into an error when trying to sum up all the elements in a matrix using a forall loop

While trying to add up all the elements in a matrix, I keep getting an error when I use a forall loop; it works with a for loop, and I'm not sure why.
Here is the error:
error: illegal lvalue in assignment
code:
config const size = 10;
var grid : [1..size, 1..size] real;
var sum : real = 0;
// for user input
for i in 1..size do
  for j in 1..size do
    grid[i,j] = read(uint(8));

forall i in 1..size {
  forall j in 1..size {
    sum += grid[i,j]; // error here
  }
}
The compiler is preventing you from data-racing on sum. If your code were allowed, multiple iterations of the outer and inner forall-loops would be updating the same variable concurrently without synchronization. So instead the compiler forces sum in the loop body to be a read-only snapshot of the outer sum. The mechanism for this is called "forall intents". It is discussed in the online documentation.
If your intention is to add up all the elements in a matrix, the chpl-erific way to do it is:
const sum = + reduce grid;
Other variations on your code are also possible, depending on what you would like to accomplish.
Aside: it is more efficient to have a single forall over the 2-dimensional space, for example:
forall (i,j) in {1..size,1..size} // {1..size,1..size} is a "domain"
or, better yet:
forall (i,j) in grid.domain
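If it helps to see the same idea outside Chapel, here is a small Python sketch (my own illustration) of what the reduce form does: instead of many tasks performing `sum += ...` on one shared variable, the total is computed as a single reduction over all elements.

```python
from functools import reduce
import operator

def reduce_sum(grid):
    """Analogue of Chapel's `+ reduce grid` for a 2-D list of reals:
    one reduction over every element, with no shared accumulator
    being updated concurrently."""
    return reduce(operator.add, (x for row in grid for x in row), 0.0)

# a 3x3 grid whose entry at (i, j) is 10*i + j, for i, j in 1..3
grid = [[float(10 * i + j) for j in range(1, 4)] for i in range(1, 4)]
total = reduce_sum(grid)
```

The reduction owns its accumulator, so a parallel implementation (like Chapel's) can split the work and combine partial sums without a data race.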

OCaml implementation of code challenge

I'm doing a daily code challenge that allows for any language to be used. I've recently been working through Real-World OCaml. I'm really interested to see how this particular challenge would be solved in idiomatic OCaml. Below are two of my JavaScript implementations. The challenge is to recognize the pattern in a mathematical pyramid:
1
11
21
1211
111221
(The pattern: 1 is read as "one 1", giving 11; 11 as "two 1s", giving 21; 21 as "one 2, one 1", giving 1211; 1211 as "one 1, one 2, two 1s", giving 111221 — the numbers read off form the next "level" or line.)
function createNextLevel(previousLevel, currentLevel = 0, finalLevel = 40) {
    let sub = '';
    let level = '';
    // iterate on string
    for (let i = 0; i < previousLevel.length; i++) {
        // if we don't have an element to the left or it's equal to the current element
        if (!previousLevel[i - 1] || previousLevel[i] === previousLevel[i - 1]) {
            sub += previousLevel[i];
        } else {
            level += String(sub.length) + sub[0];
            sub = previousLevel[i];
        }
        // if we're at the end
        if (i === previousLevel.length - 1) {
            level += String(sub.length) + sub[0];
        }
    }
    console.log(level);
    if (currentLevel < finalLevel) {
        createNextLevel(level, currentLevel + 1);
    }
}
var firstLevel = '1';
createNextLevel(firstLevel);
// A bit simpler approach
function createNextLevelPlus(str, currentLevel = 0, finalLevel = 10) {
    var delimitedStr = '';
    var level = '';
    for (let i = 0; i < str.length; i++) {
        if (!str[i + 1] || str[i] === str[i + 1]) {
            delimitedStr += str[i];
        } else {
            delimitedStr += str[i] + '|';
        }
    }
    delimitedStr.split('|').forEach((group) => {
        level += `${String(group.length)}${group[0]}`;
    });
    console.log(level);
    if (currentLevel < finalLevel) {
        createNextLevelPlus(level, currentLevel + 1);
    }
}
var firstLevel = '1';
createNextLevelPlus(firstLevel);
I've mused around a bit on how one might solve this in OCaml, but I'm certain I'd just be re-inventing a C-based way. I've considered recursively walking the string and matching on a head and tail... seeing if they're equal and storing the result in some sort of tuple or something... I am kinda having a hard time warping my mind into the proper thinking.
Here's a high level decomposition.
You want something that iterates a function. There's a looping + displaying part, and an iteration part (how to go from one level to the next level).
There are two steps in each iteration:
count consecutive numbers (turn 1211 into "one one, one two, two ones")
turn a list of counts into a list (turn "one one, one two, two ones" into 111221)
Now, let's think about types. We never use these numbers as numbers (there are no additions, etc), so they can be seen as lists of integers, or in OCaml int list.
The counts, on the other hand, are also a list, but each element of the list is a pair (count, value). For example "one one, one two, two ones" can be represented as [(1, 1); (1, 2); (2, 1)]. The type for such things in OCaml is (int * int) list.
In other words, the important parts of your algorithm will be:
a function of type int list -> (int * int) list that counts successive elements.
a function of type (int * int) list -> int list that turns these counts into the new list.
Once you have these, you should be able to put the pieces together. Have fun!
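To make the decomposition concrete without spoiling the OCaml version, here is the same two-function structure sketched in Python (`encode` plays the role of the int list -> (int * int) list counter, `decode` the reverse; the function names are mine):

```python
def encode(xs):
    """Count consecutive equal elements: [1,2,1,1] -> [(1,1),(1,2),(2,1)]."""
    groups = []
    for x in xs:
        if groups and groups[-1][1] == x:
            groups[-1] = (groups[-1][0] + 1, x)
        else:
            groups.append((1, x))
    return groups

def decode(groups):
    """Flatten counts back into a sequence: [(1,2),(1,1)] -> [1,2,1,1]."""
    return [d for count, value in groups for d in (count, value)]

def next_level(xs):
    return decode(encode(xs))

level = [1]
levels = []
for _ in range(5):
    levels.append(level)
    level = next_level(level)
```

An OCaml version would replace `encode` with a recursive function pattern-matching on the head and tail of an int list, exactly as the answer suggests.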

n-th or Arbitrary Combination of a Large Set

Say I have a set of numbers from [0, ....., 499]. Combinations are currently being generated sequentially using the C++ std::next_permutation. For reference, the size of each tuple I am pulling out is 3, so I am returning sequential results such as [0,1,2], [0,1,3], [0,1,4], ... [497,498,499].
Now, I want to parallelize the code that this is sitting in, so a sequential generation of these combinations will no longer work. Are there any existing algorithms for computing the ith combination of 3 from 500 numbers?
I want to make sure that each thread, regardless of the iterations of the loop it gets, can compute a standalone combination based on the i it is iterating with. So if I want the combination for i=38 in thread 1, I can compute [1,2,5] while simultaneously computing i=0 in thread 2 as [0,1,2].
EDIT Below statement is irrelevant, I mixed myself up
I've looked at algorithms that utilize factorials to narrow down each individual element from left to right, but I can't use these as 500! sure won't fit into memory. Any suggestions?
Here is my shot:
int k = 527; // the k-th combination is calculated
int N = 500; // number of elements you have
int a = 0, b = 1, c = 2; // a, b, c are the numbers you get out

while (k >= (N-a-1)*(N-a-2)/2) {
    k -= (N-a-1)*(N-a-2)/2;
    a++;
}
b = a + 1;
while (k >= N-1-b) {
    k -= N-1-b;
    b++;
}
c = b + 1 + k;
cout << "[" << a << "," << b << "," << c << "]" << endl; // the result
I got this by thinking about how many combinations there are until the next number is increased. However, it only works for three elements. I can't guarantee that it is correct, so it would be cool if you compared it to your results and gave some feedback.
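Since the answer above asks for a cross-check, here is a Python port of the same unranking logic (my own verification harness), compared against brute-force enumeration with itertools.combinations for small N:

```python
from itertools import combinations

def unrank(k, n):
    """Return the k-th (0-based) lexicographic 3-combination of range(n),
    using the same subtraction scheme as the C++ snippet above."""
    a = 0
    while k >= (n - a - 1) * (n - a - 2) // 2:
        k -= (n - a - 1) * (n - a - 2) // 2
        a += 1
    b = a + 1
    while k >= n - 1 - b:
        k -= n - 1 - b
        b += 1
    c = b + 1 + k
    return (a, b, c)

# Brute-force check: the rank-th combination produced by itertools
# must match unrank(rank, n) for every rank.
for n in (7, 12):
    for rank, combo in enumerate(combinations(range(n), 3)):
        assert unrank(rank, n) == combo
```

So the subtraction scheme does produce lexicographic 3-combinations; for example, unrank(527, 500) gives (0, 2, 32).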
If you are looking for a way to obtain the lexicographic index or rank of a unique combination instead of a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles problems of choosing unique combinations in groups of K with a total of N items.
I have written a class in C# to handle common functions for working with the binomial coefficient. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the K-indexes to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the set.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
The following tested code will iterate through each unique combinations:
public void Test10Choose5()
{
    String S;
    int Loop;
    int N = 500; // Total number of elements in the set.
    int K = 3;   // Total number of elements in each group.
    // Create the bin coeff object required to get all
    // the combos for this N choose K combination.
    BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
    int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
    // The KIndexes array specifies the indexes for a lexicographic element.
    int[] KIndexes = new int[K];
    StringBuilder SB = new StringBuilder();
    // Loop through all the combinations for this N choose K case.
    for (int Combo = 0; Combo < NumCombos; Combo++)
    {
        // Get the k-indexes for this combination.
        BC.GetKIndexes(Combo, KIndexes);
        // Verify that the KIndexes returned can be used to retrieve the
        // rank or lexicographic order of the KIndexes in the table.
        int Val = BC.GetIndex(true, KIndexes);
        if (Val != Combo)
        {
            S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
            Console.WriteLine(S);
        }
        SB.Remove(0, SB.Length);
        for (Loop = 0; Loop < K; Loop++)
        {
            SB.Append(KIndexes[Loop].ToString());
            if (Loop < K - 1)
                SB.Append(" ");
        }
        S = "KIndexes = " + SB.ToString();
        Console.WriteLine(S);
    }
}
You should be able to port this class over fairly easily to C++. You probably will not have to port over the generic part of the class to accomplish your goals. Your test case of 500 choose 3 yields 20,708,500 unique combinations, which will fit in a 4 byte int. If 500 choose 3 is simply an example case and you need to choose combinations greater than 3, then you will have to use longs or perhaps fixed point int.
You can describe a particular selection of 3 out of 500 objects as a triple (i, j, k), where i is a number from 0 to 499 (the index of the first number), j ranges from 0 to 498 (the index of the second, skipping over whichever number was first), and k ranges from 0 to 497 (index of the last, skipping both previously-selected numbers). Given that, it's actually pretty easy to enumerate all the possible selections: starting with (0,0,0), increment k until it reaches its maximum value, then increment j and reset k to 0, and so on, until j reaches its own maximum value; then increment i, reset both j and k, and continue.
If this description sounds familiar, it's because it's exactly the same way that incrementing a base-10 number works, except that the base is much funkier, and in fact the base varies from digit to digit. You can use this insight to implement a very compact version of the idea: for any integer n from 0 to 500*499*498 - 1, you can get:
struct triple {
    int i, j, k;
};

triple AsTriple(int n) {
    triple result;
    result.k = n % 498;
    n = n / 498;
    result.j = n % 499;
    n = n / 499;
    result.i = n % 500; // unnecessary, any legal n will already be between 0 and 499
    return result;
}
void PrintSelections(triple t) {
    int i = t.i;
    int j = t.j + (t.j >= i ? 1 : 0);          // skip over i
    int lo = i < j ? i : j, hi = i < j ? j : i;
    int k = t.k;
    if (k >= lo) k++;                          // skip over the smaller chosen value
    if (k >= hi) k++;                          // then over the larger one
    std::cout << "[" << i << "," << j << "," << k << "]" << std::endl;
}
void PrintRange(int start, int end) {
    for (int i = start; i < end; ++i) {
        PrintSelections(AsTriple(i));
    }
}
Now to shard, you can just take the numbers from 0 to 500*499*498, divide them into subranges in any way you'd like, and have each shard compute the permutation for each value in its subrange.
This trick is very handy for any problem in which you need to enumerate subsets.
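The same mixed-radix idea can be sketched in Python and checked exhaustively for a small N (my own sketch; note that the skip adjustments must be applied against the already-chosen values in sorted order):

```python
def as_triple(n, N=500):
    """Decode n (0 <= n < N*(N-1)*(N-2)) into mixed-radix digits (i, j, k)."""
    k = n % (N - 2); n //= N - 2
    j = n % (N - 1); n //= N - 1
    i = n % N
    return i, j, k

def selection(t, N=500):
    """Map digits (i, j, k) to three distinct values in range(N):
    j skips over i, and k skips over both chosen values, smaller first."""
    i, j, k = t
    if j >= i:
        j += 1                      # skip over i
    lo, hi = (i, j) if i < j else (j, i)
    if k >= lo:
        k += 1                      # skip over the smaller chosen value
    if k >= hi:
        k += 1                      # then over the larger one
    return i, j, k
```

Mapping every n for N = 8 yields 8·7·6 distinct ordered selections, which is exactly what the sharding scheme needs; note this enumerates ordered selections, so each unordered 3-subset appears 3! = 6 times.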

Subset sum (Coin change)

My problem is: I need to count how many combinations of an array of integers sum to a value W.
let say:
int array[] = {1,2,3,4,5};
My algorithm just finds all combinations of lengths from 1 to W / minimum(array), which is equal to W because the minimum is 1,
and checks each combination: if its sum equals W, a counter N is incremented.
Is there any other algorithm to solve this? It should be faster :)
Update:
OK, the subset problem and the Knapsack Problem are good, but my problem is that the combinations of the array repeat elements, like this:
1,1,1 -> the 1st combination
1,1,2
1,1,3
1,1,4
1,1,5
1,2,2 -> this combination is 1,2,2, not 1,2,1 because we already have 1,1,2.
1,2,3
1,2,4
1,2,5
1,3,3 -> this combination is 1,3,3, not 1,3,1 because we already have 1,1,3.
1,3,4
.
.
1,5,5
2,2,2 -> this combination is 2,2,2, not 2,1,1 because we already have 1,1,2.
2,2,3
2,2,4
2,2,5
2,3,3 -> this combination is 2,3,3, not 2,3,1 because we already have 1,2,3.
.
.
5,5,5 -> Last combination
these are all combinations of {1,2,3,4,5} of length 3. the subset-sum problem gives another kind of combinations that I'm not interested in.
so the combinations that sum to W, let's say W = 7, are:
2,5
3,4
1,1,5
1,2,4
1,3,3
2,2,3
1,1,1,4
1,1,2,3
1,2,2,2
1,1,1,1,3
1,1,1,2,2
1,1,1,1,1,2
1,1,1,1,1,1,1
Update:
The real problem is that repeated elements are needed (1,1,1 is valid) and the order of a generated combination is not important, so 1,2,1 is the same as 1,1,2 and 2,1,1.
No efficient algorithm exists as of now, and possibly never will (it is an NP-complete problem).
This is (a variation of) the subset-sum problem.
This is the coin change problem. It can be solved by dynamic programming with reasonable restrictions on W and the set size.
Here is code, in Go, that solves this problem. I believe it runs in O(W / min(A)) time. The comments should be sufficient to see how it works. The important detail is that it can use an element in A multiple times, but once it stops using that element it won't ever use it again. This avoids double-counting things like [1,2,1] and [1,1,2].
package main

import (
    "fmt"
    "sort"
)

// This is just to keep track of how many times we hit ninjaHelper.
var hits int = 0

// This is our way of indexing into our memo, so that we don't redo any
// calculations.
type memoPos struct {
    pos, sum int
}

func ninjaHelper(a []int, pos, sum, w int, memo map[memoPos]int64) int64 {
    // Count how many times we call this function.
    hits++
    // Check to see if we've already done this computation.
    if r, ok := memo[memoPos{pos, sum}]; ok {
        return r
    }
    // We got it, and we can't get more than one match this way, so return now.
    if sum == w {
        return 1
    }
    // Once we're over w we can't possibly succeed, so just bail out now.
    if sum > w {
        return 0
    }
    var ret int64 = 0
    // By only checking values at this position or later in the array we make
    // sure that we don't repeat ourselves.
    for i := pos; i < len(a); i++ {
        ret += ninjaHelper(a, i, sum+a[i], w, memo)
    }
    // Write down our answer in the memo so we don't have to do it later.
    memo[memoPos{pos, sum}] = ret
    return ret
}

func ninja(a []int, w int) int64 {
    // We reverse-sort the array. This doesn't change the complexity of
    // the algorithm, but by counting the larger numbers first we can hit our
    // target faster in a lot of cases, avoiding a bit of work.
    sort.Ints(a)
    for i := 0; i < len(a)/2; i++ {
        a[i], a[len(a)-i-1] = a[len(a)-i-1], a[i]
    }
    return ninjaHelper(a, 0, 0, w, make(map[memoPos]int64))
}

func main() {
    a := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
    w := 1000
    fmt.Printf("%v, w=%d: %d\n", a, w, ninja(a, w))
    fmt.Printf("Hits: %v\n", hits)
}
Just to put this to bed, here are recursive and (very simple) dynamic programming solutions to this problem. You can reduce the running time (but not the time complexity) of the recursive solution by using more sophisticated termination conditions, but the main point of it is to show the logic.
Many of the dynamic programming solutions I've seen keep the entire N x |c| array of results, but that's not necessary, since row i can be generated from just row i-1, and furthermore it can be generated in order left to right so no copy needs to be made.
I hope the comments help explain the logic. The dp solution is fast enough that I couldn't find a test case that avoided overflowing a long long yet took more than a few milliseconds; for example:
$ time ./coins dp 1000000 1 2 3 4 5 6 7
3563762607322787603
real 0m0.024s
user 0m0.012s
sys 0m0.012s
// Return the number of ways of generating the sum n from the
// elements of a container of positive integers.
// Note: This function will overflow the stack if an element
// of the container is <= 0.
template<typename ITER>
long long count(int n, ITER begin, ITER end) {
    if (n == 0) return 1;
    else if (begin == end || n < 0) return 0;
    else return
        // combinations which don't use *begin
        count(n, begin + 1, end) +
        // use one (more) *begin.
        count(n - *begin, begin, end);
}
#include <vector>

// Same thing, but uses O(n) storage and runs in O(n*|c|) time,
// where |c| is the length of the container. This implementation falls
// directly out of the recursive one above, but processes the items
// in the reverse order; each time through the outer loop computes
// the combinations (for all possible sums <= n) for some prefix of
// the container.
template<typename ITER>
long long count1(int n, ITER begin, ITER end) {
    std::vector<long long> v(n + 1, 0);
    v[0] = 1;
    // Initial state of v: v[0] is 1; v[i] is 0 for 1 <= i <= n.
    // Corresponds to the termination condition of the recursion.
    auto vbegin = v.begin();
    auto vend = v.end();
    for (auto here = begin; here != end; ++here) {
        int a = *here;
        if (a > 0 && a <= n) {
            auto in = vbegin;
            auto out = vbegin + a;
            // *in is count(n - a, begin, here).
            // *out is count(n, begin, here - 1).
            do *out++ += *in++; while (out != vend);
        }
    }
    return v[n];
}