Matrix multiplication using multiple locales - chapel

Hi, I wanted to see if anyone can spot any glaring problems with my code. I am trying to run it across two Nvidia Jetson boards so that it can use 8 cores for a speedup, and I want to compare the run time on one board against two boards. I have the Chapel environment set up for multi-locale execution.
Here is my implementation:
use LinearAlgebra, Norm, Random, Time;
var t : Timer;
writeln("Size of your matrix?");
var size = read(int);
var grid : [1..size, 1..size] uint(8);
var grid2 : [1..size, 1..size] uint(8);
var grid3 : [1..size, 1..size] int;
fillRandom(grid);
fillRandom(grid2);
t.start();
forall loc in Locales do
  on loc do
    forall i in 1..size do
      forall j in 1..size do
        forall k in 1..size do
          grid3[i,j] += grid[i,k] * grid2[k,j];
t.stop();
writeln("Done!:");
writeln(t.elapsed(),"seconds");
t.clear();
I keep getting:
error: Only 1 locale may be used for CHPL_COMM layer 'none'
To use multiple locales, see $CHPL_HOME/doc/rst/usingchapel/multilocale.rst
When I run the cores.chpl file which has:
coforall loc in Locales do
  on loc do
    writeln("locale ", here.id, " named ", here.name, " has ", here.numPUs(), " cores.");
This is the output:
locale 0 named JetsonNano has 4 cores.
locale 1 named JetsonNano2 has 4 cores.
So I know the environment is set up right.
I'm just not sure whether I have set up my matrix multiplication code correctly for it to run across multiple locales.

The message:
error: Only 1 locale may be used for CHPL_COMM layer 'none'
To use multiple locales, see $CHPL_HOME/doc/rst/usingchapel/multilocale.rst
indicates that when you compiled your Chapel program, you either had CHPL_COMM unset, or set to none in the session where the compilation took place. Try setting CHPL_COMM=gasnet in your current session (or, equivalently, compiling with --comm=gasnet), recompiling, and then running with -nl 2.
Within a given session, you can run $CHPL_HOME/util/printchplenv to see what the current set and/or inferred environment variables are. For a given Chapel program you can run ./myChapelProgram --about to get information about the settings at the time it was compiled.
If you plan to work with CHPL_COMM=gasnet most of the time, you can use a Chapel configuration file to avoid re-setting things over and over again.
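Separately from the CHPL_COMM error, one thing stands out in the posted loop nest: `forall loc in Locales do on loc do` wraps the complete i/j/k loops, so each locale would repeat the whole multiplication rather than share it, and the parallel k loop lets several tasks update the same grid3[i,j] at once. A common remedy is to give each locale its own block of rows. The sketch below shows only that partitioning arithmetic, in Python rather than Chapel; `row_blocks` and its arguments are hypothetical names, not part of any Chapel API.

```python
def row_blocks(size, num_locales):
    """Split rows 1..size into num_locales contiguous, inclusive (first, last) ranges.

    Hypothetical helper for illustration only. If size < num_locales, the
    trailing ranges are empty (first > last).
    """
    base, extra = divmod(size, num_locales)
    blocks = []
    start = 1
    for loc in range(num_locales):
        # The first `extra` locales take one additional row each.
        count = base + (1 if loc < extra else 0)
        blocks.append((start, start + count - 1))
        start += count
    return blocks


if __name__ == "__main__":
    for loc, (first, last) in enumerate(row_blocks(10, 2)):
        print("locale", loc, "rows", first, "to", last)
```

With size = 10 on two locales, locale 0 would handle rows 1..5 and locale 1 rows 6..10, so each row of the result is computed exactly once.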


How to define tolerance level in CBC solver using pyomo's solverfactory?

I am trying to solve a MIP with CBC using pyomo's SolverFactory, but I am running into some infeasibility issues. I wanted to first try configuring the tolerance level and see if that works before diving deep into the data points that could cause the infeasibility.
However, when I use the options below, the CBC solver outputs an error:
options = {
'tol': 0.0001
}
solver = SolverFactory(solver_type)
solver.options.update(options)
Can anyone help me understand how to define the tolerance level in CBC? Thanks!
So, it depends a bit. Assuming you are looking for the tolerance for a mixed integer program, the keyword for CBC is 'ratio'.
Here is a setup that runs 6 threads, max 20 seconds, ratio of 0.02 (2% gap)
### SOLVE
solver = pyo.SolverFactory('cbc')
solver.options = {'sec': 20, 'threads': 6, 'ratio': 0.02}
results = solver.solve(mdl)
print(results)
There are several acceptable ways to write this. You can pass the options in to the SolverFactory, etc., but this form works fine for me.
As another trick (I always get hung up on the correct keywords for these solvers): if you have CBC installed properly, you can go to the terminal and start CBC with the command cbc, which should give you the 'Coin:' prompt. Type ? to see the list of commands, then type a command followed by ?? to get details on it. This also works for glpk, which is super helpful.
For instance:
% cbc
Welcome to the CBC MILP Solver
Version: 2.10.5
Build Date: Dec 5 2021
CoinSolver takes input from arguments ( - switches to stdin)
Enter ? for list of commands or help
Coin:?
In argument list keywords have leading - , -stdin or just - switches to stdin
One command per line (and no -)
abcd? gives list of possibilities, if only one + explanation
abcd?? adds explanation, if only one fuller help
abcd without value (where expected) gives current value
abcd value sets value
Commands are:
Double parameters:
dualB(ound) dualT(olerance) primalT(olerance) primalW(eight) psi
zeroT(olerance)
Branch and Cut double parameters:
allow(ableGap) cuto(ff) inc(rement) integerT(olerance) preT(olerance)
pumpC(utoff) ratio(Gap) sec(onds)
Integer parameters:
force(Solution) idiot(Crash) maxF(actor) maxIt(erations) output(Format)
randomS(eed) slog(Level) sprint(Crash)
Branch and Cut integer parameters:
cutD(epth) cutL(ength) depth(MiniBab) hot(StartMaxIts) log(Level) maxN(odes)
maxSaved(Solutions) maxSo(lutions) passC(uts) passF(easibilityPump)
passT(reeCuts) pumpT(une) randomC(bcSeed) slow(cutpasses) strat(egy)
strong(Branching) trust(PseudoCosts)
Keyword parameters:
allC(ommands) chol(esky) crash cross(over) direction error(sAllowed)
fact(orization) keepN(ames) mess(ages) perturb(ation) presolve
printi(ngOptions) scal(ing) timeM(ode)
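The listing above writes each keyword as a required prefix with an optional remainder in parentheses, e.g. ratio(Gap). The sketch below checks an option key against a few entries copied from that listing, which can help catch a key like 'tol' before it is handed to the solver. It encodes my reading of the notation (a key must cover the required prefix and be a prefix of the full name) and is not CBC's actual matching code; `split_name`, `matches` and `lookup` are hypothetical helper names.

```python
import re

# Entries copied from the "Double parameters" and "Branch and Cut double
# parameters" groups of the listing above.
CBC_PARAMS = [
    "dualB(ound)", "dualT(olerance)", "primalT(olerance)", "primalW(eight)",
    "psi", "zeroT(olerance)", "allow(ableGap)", "cuto(ff)", "inc(rement)",
    "integerT(olerance)", "preT(olerance)", "pumpC(utoff)", "ratio(Gap)",
    "sec(onds)",
]


def split_name(entry):
    """Return (required_prefix, full_name) for an entry such as 'ratio(Gap)'."""
    m = re.fullmatch(r"(\w+)(?:\((\w+)\))?", entry)
    required = m.group(1)
    optional = m.group(2) or ""
    return required, required + optional


def matches(keyword, entry):
    """Assumed rule: keyword covers the required prefix and is a prefix of the full name."""
    required, full = split_name(entry)
    k = keyword.lower()
    return len(k) >= len(required) and full.lower().startswith(k)


def lookup(keyword):
    """Full names of all listed parameters that the keyword matches."""
    return [split_name(e)[1] for e in CBC_PARAMS if matches(keyword, e)]


if __name__ == "__main__":
    for key in ("ratio", "sec", "integerT", "tol"):
        print(key, "->", lookup(key))
```

Under this reading, 'ratio' and 'sec' from the answer resolve to ratioGap and seconds, while 'tol' from the question matches nothing in the listed groups; the tolerance-related entries there are primalT, dualT, integerT, preT and zeroT.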

How to call function from external library in C/C++

I want to find the symmetry group of an integer linear program. I think there is a function in SCIP called SCIPgetGeneratorsSymmetry. How can I use this function?
You are right, to access symmetry information in SCIP, you have to call the function SCIPgetGeneratorsSymmetry() via C/C++. Note that you need to link SCIP against the external software bliss, because otherwise, SCIP is not able to compute symmetries of your (mixed-integer) linear program.
If you set up your (mixed-integer) linear program using a C/C++ project, you have several options for computing symmetries.
If you set the "recompute" parameter to FALSE, SCIP will return the currently available symmetry information; if symmetries have not been computed yet, SCIP will compute them first so that you can access this information.
If you set "recompute" to TRUE, SCIP will discard the available symmetry information and recompute it, so you get access to the generators of the current symmetry group. Moreover, you can control the kind of symmetries that are computed via the parameters "symspecrequire" and "symspecrequirefixed", e.g., to only compute symmetries of binary variables that fix continuous variables.
Edit:
If you have no experience with coding in C/C++ and you are only interested in printing the generators of the symmetry group, the easiest way is probably to modify SCIP's source code in presol_symmetry.c as follows:
Declare two integer variables, int i and int p, at the very beginning of determineSymmetry().
Search within determineSymmetry() for the line in which computeSymmetryGroup() is called.
Add the following code snippet right after this function call:
for (p = 0; p < presoldata->nperms; ++p)
{
   printf("permutation %d\n", p);
   for (i = 0; i < presoldata->npermvars; ++i)
   {
      if ( TRUE )
         printf("%d ", presoldata->perms[p][i]);
      else
         printf("%s ", SCIPvarGetName(presoldata->permvars[presoldata->perms[p][i]]));
   }
   printf("\n");
}
This code prints the generators of the symmetry group as a list of variable indices, e.g., 1 2 0 is the permutation that maps 0 -> 1, 1 -> 2, and 2 -> 0. If you change TRUE to FALSE, you get the same list but variable indices are replaced by their names.
Do not forget to recompile SCIP.
If you solve an instance with SCIP and symmetry handling is enabled, SCIP will print the generators in the above format whenever it computes the symmetry group. If you are interested in the symmetry group of the original problem, you should use the parameter setting presolving/symbreak/addconsstiming = 0 and propagating/orbitalfixing/symcomptiming = 0. If you are fine with symmetries of the presolved problem, change the zeros to ones.
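The one-line format described above (the entry at position i is the image of variable index i) can be hard to read for larger permutations. As a small illustration, the following Python sketch parses one printed line and rewrites it in cycle notation. It is a post-processing helper only and not part of SCIP; `parse_permutation` and `to_cycles` are hypothetical names.

```python
def parse_permutation(line):
    """Parse a line such as '1 2 0' into a list where images[i] is the image of i."""
    images = [int(tok) for tok in line.split()]
    if sorted(images) != list(range(len(images))):
        raise ValueError("not a permutation of 0..n-1: %r" % line)
    return images


def to_cycles(images):
    """Return the non-trivial cycles of the permutation as tuples of indices."""
    seen = set()
    cycles = []
    for start in range(len(images)):
        if start in seen or images[start] == start:
            continue  # already covered, or a fixed point
        cycle = []
        j = start
        while j not in seen:
            seen.add(j)
            cycle.append(j)
            j = images[j]
        cycles.append(tuple(cycle))
    return cycles


if __name__ == "__main__":
    print(to_cycles(parse_permutation("1 2 0")))
```

For the example from the text, "1 2 0" becomes the single cycle (0, 1, 2), i.e. 0 -> 1, 1 -> 2, 2 -> 0.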

Declaring variables in Python 2.7x to avoid issues later

I am new to Python, coming from MATLAB, and long ago from C. I have written a script in MATLAB which simulates sediment transport in rivers as a Markov process. The code randomly places circles of a random diameter within a rectangular area of a specified dimension. The circles are non-uniform in size, drawn randomly from a specified range of sizes. I do not know how many times I will step through the circle placement operation, so I use a while loop to complete the process. In an attempt to be more community oriented, I am translating the MATLAB script to Python. I used the online tool OMPC to get started, and have been working through it manually from the auto-translated version (which was not that helpful, which is not surprising). To debug the code as I go, I use the MATLAB-generated results to compare against the results in Python. It seems clear to me that I have declared variables in a way that introduces problems as calculations proceed in the script. Here are two examples of problems that occur consistently across different runs of the code. First, the code generated what I think are arrays within arrays, because the script is returning results which look like:
array([[ True]
[False]], dtype=bool)
This result was generated for the following code snippet at the overlap_logix operation:
CenterCoord_Array = np.asarray(CenterCoordinates)
Diameter_Array = np.asarray(Diameter)
dist_check = ((CenterCoord_Array[:,0] - x_Center) ** 2 + (CenterCoord_Array[:,1] - y_Center) ** 2) ** 0.5
radius_check = (Diameter_Array / 2) + radius
radius_check_update = np.reshape(radius_check,(len(radius_check),1))
radius_overlap = (radius_check_update >= dist_check)
# Now actually check the overlap condition.
if np.sum([radius_overlap]) == 0:
    # The new circle does not overlap so proceed.
    newCircle_Found = 1
    debug_value = 2
elif np.sum([radius_overlap]) == 1:
    # The new circle overlaps with one other circle
    overlap = np.arange(0,len(radius_overlap), dtype=int)
    overlap_update = np.reshape(overlap,(len(overlap),1))
    overlap_logix = (radius_overlap == 1)
    idx_true = overlap_update[overlap_logix]
    radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)
A similar result for the same run was produced for variables:
radius_check_update
radius_overlap
overlap_update
Here is the same code snippet for the working MATLAB version (as requested):
distcheck = ((Circles.CenterCoordinates(1,:)-x_Center).^2 + (Circles.CenterCoordinates(2,:)-y_Center).^2).^0.5;
radius_check = (Circles.Diameter ./ 2) + radius;
radius_overlap = (radius_check >= distcheck);
% Now actually check the overlap condition.
if sum(radius_overlap) == 0
    % The new circle does not overlap so proceed.
    newCircle_Found = 1;
    debug_value = 2;
elseif sum(radius_overlap) == 1
    % The new circle overlaps with one other circle
    temp = 1:size(radius_overlap,2);
    idx_true = temp(radius_overlap == 1);
    radius = distcheck(1,idx_true) - (Circles.Diameter(1,idx_true)/2);
In the Python version I have created arrays from lists to more easily operate on the contents (the first two lines of the code snippet). The array within array result and creating arrays to access data suggests to me that I have incorrectly declared variable types, but I am not sure. Furthermore, some variables have a size, for example, (2L,) (the numerical dimension will change as circles are placed) where there is no second dimension. This produces obvious problems when I try to use the array in an operation with another array with a size (2L,1L). Because of these problems I started reshaping arrays, and then I stopped because I decided these were hacks because I had declared one, or more than one variable incorrectly. Second, for the same run I encountered the following error:
TypeError: 'numpy.ndarray' object is not callable
for the operation:
radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)
which occurs at the bottom of the above code snippet. I have posted the entire script at the following link because it is probably more useful to execute the script for oneself:
https://github.com/smchartrand/MarkovProcess_Bedload
I have set-up the code to run with some initial parameter values so decisions do not need to be made; these parameter values produce the expected results in the MATLAB-based script, which look something like this when plotted:
So, I seem to specifically be having issues with operations on lines 151-165, depending on the test value np.sum([radius_overlap]) and I think it is because I incorrectly declared variable types, but I am really not sure. I can say with confidence that the Python version and the MATLAB version are consistent in output through the first step of the while loop, and code line 127 which is entering the second step of the while loop. Below this point in the code the above documented issues eventually cause the script to crash. Sometimes the script executes to 15% complete, and sometimes it does not make it to 5% - this is due to the random nature of circle placement. I am preparing the code in the Spyder (Python 2.7) IDE and will share the working code publicly as a part of my research. I would greatly appreciate any help that can be offered to identify my mistakes and misapplications of python coding practice.
I believe I have answered my own question, and maybe it will be of use for someone down the road. The main sources of instruction for me can be found at the following three web pages:
Stackoverflow Question 176011
SciPy FAQ
SciPy NumPy for Matlab users
The third web page was very helpful for me coming from MATLAB. Here is the modified and working python code snippet which relates to the original snippet provided above:
dist_check = ((CenterCoordinates[0,:] - x_Center) ** 2 + (CenterCoordinates[1,:] - y_Center) ** 2) ** 0.5
radius_check = (Diameter / 2) + radius
radius_overlap = (radius_check >= dist_check)
# Now actually check the overlap condition.
if np.sum([radius_overlap]) == 0:
    # The new circle does not overlap so proceed.
    newCircle_Found = 1
    debug_value = 2
elif np.sum([radius_overlap]) == 1:
    # The new circle overlaps with one other circle
    overlap = np.arange(0,len(radius_overlap[0]), dtype=int).reshape(1, len(radius_overlap[0]))
    overlap_logix = (radius_overlap == 1)
    idx_true = overlap[overlap_logix]
    radius = dist_check[idx_true] - (Diameter[0,idx_true] / 2)
In the end it was clear to me that it was more straightforward for this example to use numpy arrays vs. lists to store results for each iteration of filling the rectangular area. For the corrected code snippet this means I initialized the variables:
CenterCoordinates, and
Diameter
as numpy arrays whereas I initialized them as lists in the posted question. This made a few mathematical operations more straightforward. I was also incorrectly indexing into variables with parentheses () as opposed to the correct method using brackets []. Here is an example of a correction I made which helped the code execute as envisioned:
Incorrect: radius = dist_check(idx_true,1) - (Diameter(idx_true,1) / 2)
Correct: radius = dist_check[idx_true] - (Diameter[0,idx_true] / 2)
This example also shows that I had issues with array dimensions which I corrected variable by variable. I am still not sure if my working code is the most pythonic or most efficient way to fill a rectangular area in a random fashion, but I have tested it about 100 times with success. The revised and working code can be downloaded here:
Working Python Script to Randomly Fill Rectangular Area with Circles
Here is an image of the final result for a successful run of the working code:
The main lessons for me were (1) numpy arrays are more efficient for repetitive numerical calculations, and (2) the dimensionality of the arrays I created was not always what I expected, so care is needed when creating arrays. Thanks to those who looked at my question and asked for clarification.
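Both issues can be reproduced in a few lines. The sketch below uses made-up values for two existing circles (the numbers are illustrative, not taken from the script) and assumes NumPy is installed. It shows how comparing an (n, 1) array with an (n,) array broadcasts to an (n, n) result, and how indexing with parentheses raises the TypeError from the question.

```python
import numpy as np

# Illustrative values for two existing circles.
dist_check = np.array([3.0, 1.0])      # shape (2,)
radius_check = np.array([2.0, 2.0])    # shape (2,)

# Same shapes compare element by element.
same_shape = radius_check >= dist_check     # shape (2,)

# Reshaping one operand to a column makes NumPy broadcast the comparison
# to every pair of elements, which is rarely what was intended here.
column = radius_check.reshape(-1, 1)        # shape (2, 1)
broadcast = column >= dist_check            # shape (2, 2)

# Integer positions of the True entries, usable directly as an index.
idx_true = np.flatnonzero(same_shape)
overlapping_dist = dist_check[idx_true]     # square brackets index

# Parentheses try to call the array, as MATLAB-style indexing would.
try:
    dist_check(idx_true, 1)
except TypeError as exc:
    message = str(exc)

if __name__ == "__main__":
    print(same_shape.shape, broadcast.shape)
    print(idx_true, overlapping_dist)
    print(message)
```

Printing `.shape` after each step, as done here, is a cheap way to catch an unexpected dimension before it propagates through later calculations.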

How to get FLOPS in RISC-V using SW or HW method?

I am a newbie to RISC-V. I wonder how I could get FLOPS using a SW or HW method. I tried to use CSRs to get FLOPS, but there are some problems.
As far as I know, if I redesign an hpmcounter so that it counts every floating-point operation event, I could get FLOPS by reading it with a CSR read instruction. I know there is a similar design in the manual for SiFive's rocket-chip-based U54 core. In the manual I can see that the SiFive core has sophisticated feature-counting capabilities, controlled by the mhpmevent CSRs. If I set the lower eight bits of mhpmevent to 0 and enable bits [19-25], I can read the count from mhpmcounter. I want to design this field like the SiFive core does.
I tried to imitate it for FLOPS, but I ran into some problems.
I can't access mhpmcounter; I get an illegal instruction error like the one at the following link.
illegal instruction error message!!
I wrote a simple test program and it compiles successfully, but it raises an illegal instruction error when I run it on spike and on a cycle-accurate emulator. Both use the proxy kernel.
// simple test code
unsigned long instret1 = 0;
unsigned long instret2 = 0;
float a,b,c;
a = 5.0;
b = 4.0;
asm volatile ("csrrs %0, mhpmcounter3, x0 " : "=r"(instret1));
c = a + b;
asm volatile ("csrrs %0, mhpmcounter3, x0 " : "=r"(instret2));
printf("instruction count : %lu \n", instret2-instret1);
It is hard to switch from user mode to M-mode to access mhpmevent and mhpmcounter. In the RISC-V privileged spec 1.10, I found that the xRET instructions can change the mode. The following text is about xRET in the spec.
The MRET, SRET, or URET instructions are used to return from traps in M-mode, S-mode, or
U-mode respectively. When executing an xRET instruction, supposing xPP holds the value y, xIE
is set to xPIE; the privilege mode is changed to y; xPIE is set to 1; and xPP is set to U (or M if
user-mode is not supported).
If someone knows how to do this, I would appreciate seeing the detailed assembly code.
I tried to modify rocket-chip/src/main/scala/rocket/CSR.scala to redesign the CSR. Is that the only way? First, I want to use spike to test the counter value. How should I change the code?
If anybody has other ideas or has accomplished this, please point me in the right direction. Thanks!
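To make the event selection described in the question concrete, the following Python sketch builds the selector value with the low eight bits (event class) set to 0 and bits 19 through 25 enabled, and then turns two counter readings into a FLOPS figure. The bit layout is taken from the question's own description and has not been checked against the SiFive manual here; `event_selector` and `flops` are hypothetical helper names, and the counter readings in the usage example are made up.

```python
EVENT_CLASS_BITS = 8  # per the question: the low eight bits select the event class


def event_selector(event_class, first_bit, last_bit):
    """Event class in bits 0-7, plus an event mask with bits first_bit..last_bit set."""
    if not 0 <= event_class < (1 << EVENT_CLASS_BITS):
        raise ValueError("event class must fit in the low eight bits")
    if not EVENT_CLASS_BITS <= first_bit <= last_bit:
        raise ValueError("mask bits must lie above the event class field")
    mask = 0
    for bit in range(first_bit, last_bit + 1):
        mask |= 1 << bit
    return mask | event_class


def flops(count_before, count_after, elapsed_seconds):
    """Floating-point events per second between two counter readings."""
    return (count_after - count_before) / elapsed_seconds


if __name__ == "__main__":
    selector = event_selector(0, 19, 25)
    print(hex(selector))             # 0x3f80000
    print(flops(100, 1100, 0.5))     # 2000.0
```

Note that a counter configured this way counts retired floating-point events, so a fused multiply-add would count as one event even though it is often reported as two floating-point operations.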

Rainflow algorithm - Fortran conversion to Matlab

I am trying to convert a Rainflow cycle counting algorithm from Fortran, a language I am not familiar with, into Matlab.
There is a ready-made Rainflow implementation for Matlab that I've downloaded, but it does not fit the requirements of my project, so I'm trying to build one from scratch.
Here is the Fortran code:
      INTEGER BUFFER (4096), INDEX, VALUE, RANGE, MEAN, X, Y
      INDEX = 0
 10   CONTINUE
      call 'get next peak/valley', VALUE
      INDEX = INDEX + 1
      BUFFER (INDEX) = VALUE
 20   CONTINUE
      IF (INDEX.LT.3) THEN
c -- not enough points to form a cycle
        GOTO 10
      ELSE
        X = ABS (BUFFER(INDEX) - BUFFER(INDEX - 1))
        Y = ABS (BUFFER(INDEX - 1) - BUFFER(INDEX - 2))
        IF (X.GE.Y) THEN
c -- cycle has been closed
          RANGE = Y
          MEAN = (BUFFER(INDEX-1) + BUFFER(INDEX-2))/2
c -- remove the cycle
          INDEX = INDEX - 2
          BUFFER(INDEX) = BUFFER(INDEX+2)
c -- see if this value closes any more cycles
          GOTO 20
        ELSE
          GOTO 10
        END IF
      END IF
I had downloaded f2matlab (a Fortran to Matlab converter) but it requires a Fortran compiler which I do not have.
The parts I don't understand how to convert are:
The call 'get next… line (is this an input()?)
The BUFFER(4096) etc (is this a bit large to be a matrix in matlab?)
The GOTO/CONTINUE structure.
What do they mean, in English (or Matlab)?
I have seen
How to translate fortran goto state to matlab
and
translating loop from Fortran to MATLAB
but they do not help me very much.
This
call 'get next peak/valley', VALUE
isn't (currently) syntactically valid Fortran and I'm not sure whether any compiler of yore would have understood it either. I guess that it means get a VALUE for use in the following bits of code.
INTEGER BUFFER (4096)
is a simple declaration that BUFFER is a vector of 4096 integers, nothing to scare Matlab in that volume of data.
Finally, GOTO is an unconditional jump and the number following it is the label of the line to jump to, so GOTO 10 means execute the line with label 10 next. It was fairly common in FORTRAN of the vintage you are showing us to jump to a CONTINUE statement, which in this context is a no-operation: execution simply continues to the next line.
In another context, with DO loops CONTINUE would have marked the end of the block of code inside the scope of the loop and would have a subtly different effect.
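To show how the two labels map onto structured loops, here is a minimal Python sketch of the posted listing (Python rather than Matlab, but the loop structure carries over directly). It assumes the peaks and valleys are already available as a sequence, which stands in for the 'get next peak/valley' line. GOTO 10 becomes the outer loop over incoming values, and GOTO 20 becomes the inner while loop that keeps checking whether the newest value closes another cycle. The function name `rainflow` and the returned leftover buffer are choices made for this sketch, and the mean uses ordinary division where the Fortran used INTEGER arithmetic.

```python
def rainflow(peaks):
    """Structured rewrite of the posted listing.

    Returns (cycles, leftover), where each cycle is a (range, mean) pair and
    leftover holds the points that never closed a cycle.
    """
    buffer = []
    cycles = []
    for value in peaks:               # label 10: take the next peak/valley
        buffer.append(value)
        while len(buffer) >= 3:       # label 20: enough points to form a cycle
            x = abs(buffer[-1] - buffer[-2])
            y = abs(buffer[-2] - buffer[-3])
            if x < y:
                break                 # GOTO 10: nothing closed, fetch more data
            # The cycle has been closed: record its range and mean.
            cycles.append((y, (buffer[-2] + buffer[-3]) / 2))
            # Remove the cycle: drop its two points, keep the newest value.
            newest = buffer.pop()
            buffer.pop()
            buffer.pop()
            buffer.append(newest)     # then loop again (GOTO 20)
    return cycles, buffer


if __name__ == "__main__":
    print(rainflow([0, 5, 2, 4, 1, 6]))
```

For the sequence 0, 5, 2, 4, 1, 6 this finds the cycle 2-4 (range 2, mean 3.0) when 1 arrives and the cycle 5-1 (range 4, mean 3.0) when 6 arrives, leaving 0 and 6 in the buffer. A Python list grows as needed, so the fixed 4096-element BUFFER has no counterpart here.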