Output precision of `writeln()` for floating-point numbers - chapel

Using writef(), I can control the output precision of a floating-point number, for example:
writef( "%20.15dr\n", 1.0 / 3.0 ); // 0.333333333333333
but if I use writeln() for convenience, the number is output with 6 digits:
writeln( 1.0 / 3.0 ); // 0.333333
Is there possibly a way to control the default output precision of floating-point numbers for writeln() (e.g., via some environment variable)?
For comparison, some languages output 15 digits and some 6 digits by default, so the result seems to vary from language to language (or compiler to compiler).
# python2
print 1.0 / 3.0 # 0.333333333333
# python3
print( 1.0 / 3.0 ) # 0.3333333333333333
# julia
println( 1.0 / 3.0 ) # 0.3333333333333333
# gfortran
print *, 1.0d0 / 3.0d0 ! 0.33333333333333331
# swift
print( 1.0 / 3.0 ) // 0.333333333333333
# nim
echo( 1.0 / 3.0 ) # 0.3333333333333333
# g++
cout << 1.0 / 3.0 << endl; // 0.333333
# d (dmd)
writeln( 1.0 / 3.0 ); // 0.333333

Use iostyle and _set_style():
writeln(100.0/3.0); // 33.3333
stdout.lock();
stdout._set_style(new iostyle(precision=10));
stdout.unlock();
writeln(100.0/3.0); // 33.33333333
You can also pass other things to new iostyle(), for example:
precision=10, realfmt=0 // like %.10g in C: 33.33333333 (default)
precision=10, realfmt=1 // like %.10f in C: 33.3333333333
precision=10, realfmt=2 // like %.10e in C: 3.3333333333e+01
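For a quick cross-check of those three styles, the equivalent C-style conversions can be exercised from Python, whose printf-style %-formatting follows the same g/f/e rules (this is just an illustration, not Chapel code):
x = 100.0 / 3.0
print("%.10g" % x)   # 33.33333333      (like realfmt=0)
print("%.10f" % x)   # 33.3333333333    (like realfmt=1)
print("%.10e" % x)   # 3.3333333333e+01 (like realfmt=2)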

Yes, there is. In Chapel, I/O is performed on channels. Each channel has an I/O style (represented by a record of type iostyle) which specifies how values are printed to that channel if a more specific style is not provided within the read/write call itself. A call to writeln() is essentially a call to stdout.writeln() where stdout is a channel whose output shows up in the console.
The following example shows how to change the I/O style of stdout (Try it Online):
// print to stdout using its default style
writeln( 1.0 / 3.0 );
// create a new IO style with a precision of 15
var style = new iostyle(precision=15);
// change stdout to use this new style
stdout._set_style(style);
// print using the new style
writeln( 1.0 / 3.0 );
// restore the default style and print once more
stdout._set_style(defaultIOStyle());
writeln( 1.0 / 3.0 );
where the output is:
0.333333
0.333333333333333
0.333333
Note that it isn't safe to change the style of a channel in parallel code without locking it first. Since the example above is completely serial, it's OK, but in the context of a larger, potentially parallel, program, the better approach would be to lock the channel before setting its style, as follows (Try it Online):
// print to stdout using its default style
writeln( 1.0 / 3.0 );
// create a new IO style with a precision of 15
var style = new iostyle(precision=15);
// change stdout to use this new style
stdout.lock();
stdout._set_style(style);
stdout.unlock();
// print using the new style
writeln( 1.0 / 3.0 );
// restore the default style and print once more
stdout.lock();
stdout._set_style(defaultIOStyle());
stdout.unlock();
writeln( 1.0 / 3.0 );
Chapel's online documentation has more information about I/O styles, the fields of the iostyle record, and locking of channels.

Related

Why does fmt round 0.5 to 0 for zero decimals?

I fully implemented the fmt library in my project. But the last bug I found was that values previously were rounded to the higher bound (0.5 -> 1), while they are currently rounded to their lower bound (0.5 -> 0).
When trying to find the cause, I found:
double halfd = 0.5;
std::string str = fmt::format("{:.0f}", halfd ); // result: "0"
float halff = 0.5f;
std::string str = fmt::format("{:.0f}", halff ); // result: "0"
I also checked whether the compiler might store 0.5 as 0.49999999999..., but I could not find the cause there.
I work with Microsoft Visual Studio 2017 with the v140 compiler.
Does anybody know why fmt rounds 0.5 to 0 instead of 1? I previously used the standard printf library.
As commented on in this GitHub issue, {fmt} uses round-to-nearest-even mode, which is the default rounding mode in IEEE 754. glibc's printf does the same by default: https://godbolt.org/z/jfor1cedo.
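For comparison (not part of the original answer), the same round-half-to-even behavior is easy to observe in Python's float formatting, which also rounds correctly per IEEE 754:
# Round-half-to-even ("banker's rounding"): exact ties go to the nearest even digit.
for v in (0.5, 1.5, 2.5, 3.5):
    print("{:.0f}".format(v))   # prints 0, 2, 2, 4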

Slowdown of pi calculation when Timer is used

The following code is my code for calculating pi = 3.1415... approximately, using the Leibniz formula pi / 4 = sum_{k=0}^{inf} (-1)^k / (2 k + 1):
use Time;
var timer = new Timer();
config const n = 10**9;
var x = 0.0, s = 0.0;
// timer.start(); // [1]_____
for k in 0 .. n {
  s = ( if k % 2 == 0 then 1.0 else -1.0 ); // (-1)^k
  x += s / ( 2.0 * k + 1.0 );
}
// timer.stop(); // [2]_____
// writeln( "time = ", timer.elapsed() ); // [3]_____
writef( "pi (approx) = %30.20dr\n", x * 4 );
// writef( "pi (exact) = %30.20dr\n", pi ); // [4]_____
When the above code is compiled with chpl --fast test.chpl and executed via time ./a.out, it runs in ~4 seconds:
pi (approx) = 3.14159265458805059268
real 0m4.334s
user 0m4.333s
sys 0m0.006s
On the other hand, if I uncomment Lines [1]-[3] (to use Timer), the program runs much slower, taking ~10 seconds:
time = 10.2284
pi (approx) = 3.14159265458805059268
real 0m10.238s
user 0m10.219s
sys 0m0.018s
The same slow-down occurs when I uncomment only Line [4] (to print the built-in value of pi, with Lines [1]-[3] kept commented out):
pi (approx) = 3.14159265458805059268
pi (exact) = 3.14159265358979311600
real 0m10.144s
user 0m10.141s
sys 0m0.009s
So I'm wondering why this slow-down occurs...
Am I missing something in the above code (e.g., wrong usage of Timer)?
My environment is OSX10.11 + chapel-1.16 installed via homebrew.
More details are below:
$ printchplenv --anonymize
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: clang
CHPL_TARGET_ARCH: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_MAKE: make
CHPL_ATOMICS: intrinsics
CHPL_GMP: gmp
CHPL_HWLOC: hwloc
CHPL_REGEXP: re2
CHPL_WIDE_POINTERS: struct
CHPL_AUX_FILESYS: none
$ clang --version
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Update
Following the suggestions, I installed Chapel from source by following this page and this page, adding CHPL_TARGET_COMPILER=gnu to ~/.chplconfig (before running make). Then all three cases above ran in ~4 seconds. So the problem may be related to clang on OSX 10.11. According to the comments, newer OSX (>= 10.12) does not have this problem, so it may simply be sufficient to upgrade to a newer OSX/clang (>= 9.0). FYI, the updated environment info (with GNU) is as follows:
$ printchplenv --anonymize
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: gnu +
CHPL_TARGET_ARCH: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_MAKE: make
CHPL_ATOMICS: intrinsics
CHPL_GMP: none
CHPL_HWLOC: hwloc
CHPL_REGEXP: none
CHPL_WIDE_POINTERS: struct
CHPL_AUX_FILESYS: none
Am I missing something in the above code (e.g., wrong usage of Timer)?
No, you're not missing anything and are using Timer (and Chapel) in a completely reasonable way. From my own experimentation (which confirms yours and is noted in the comments under your question), this looks to be a back-end compiler issue rather than a fundamental problem in Chapel or your use of it.
[--fast] reduces run-time checks, yet that is not the issue here.
Kindly also note how large the setup/operation add-on overheads become once the code is rewritten, purely for educational purposes (to experiment with concurrent processing), as a forall loop using the atomic .add() method: those overheads far exceed what the concurrency can gain back, because there is so little computation inside the [PAR]-enabled fraction of the process (cf. the re-formulated Amdahl's Law on such thin [PAR] gains versus the high add-on costs to the [SEQ] part).
An illustrative example:
use Time;
var timer = new Timer();
config const n = 10**9;
var s = 0.0, x = 0.0;
var AtomiX: atomic real; // [AtomiX]______
AtomiX.write( 0.0 ); // [AtomiX]______
timer.start(); // [1]_____
for k in 0 .. n {
  s = ( if k % 2 == 0 then 1.0 else -1.0 ); // (-1)^k
  x += s / ( 2.0 * k + 1.0 );
}
/* forall k in 0..n { AtomiX.add( ( if k % 2 == 0 then 1.0 else -1.0 )
/ ( 2.0 * k + 1.0 )
); } */ // [AtomiX]______
timer.stop(); // [2]_____
writeln( "time = ", timer.elapsed() ); // [3]_____
writef( "pi (approx) = %30.20dr\n", 4 * x );
// writef( "pi (approx) = %30.20dr\n", 4 * AtimiX.read() ); // [AtomiX]______
// writef( "pi (exact) = %30.20dr\n", pi ); // [4]_____
/*
--------------------------------------------------- [--fast] // AN EMPTY RUN
time = 1e-06
Real time: 9.582 s
User time: 8.479 s
Sys. time: 0.591 s
CPU share: 94.65 %
Exit code: 0
--------------------------------------------------- [--fast] // all commented
pi (approx) = 3.14159265458805059268
Real time: 15.553 s
User time: 13.484 s ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~> Timer ~ +/- 1 second ( O/S noise )
Sys. time: 0.985 s
CPU share: 93.03 %
Exit code: 0
-------------------------------------------------- [--fast ] // Timer-un-commented
time = 5.30128
time = 5.3329
pi (approx) = 3.14159265458805059268
Real time: 14.356 s
User time: 13.047 s ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~< Timer ~ +/- 1 second ( O/S noise )
Sys. time: 0.585 s
CPU share: 94.95 %
Exit code: 0
Real time: 16.804 s
User time: 14.853 s
Sys. time: 0.925 s
CPU share: 93.89 %
Exit code: 0
-------------------------------------------------- [--fast] // Timer-un-commented + forall + Atomics
time = 14.7406
pi (approx) = 3.14159265458805680993
Real time: 28.099 s
User time: 26.246 s
Sys. time: 0.914 s
CPU share: 96.65 %
Exit code: 0
*/

Normalizing complex values in NumPy / Python

I am currently trying to normalize complex values.
Since I don't have a good way of doing this, I decided to split my dataset into two: one containing only the real parts and one containing only the imaginary parts.
def split_real_img(x):
    real_array = x.real
    img_array = x.imag
    return real_array, img_array
And then normalize each separately with
def numpy_minmax(X):
    xmin = X.min()
    print X.min()
    print X.max()
    return (2*(X - xmin) / (X.max() - xmin)-1)*0.9
After the normalization, are the two datasets supposed to be merged back into one dataset with complex values? If so, how do I do that?
The normalization is done so that I can use tanh as the activation function, which operates in the range -0.9 to 0.9, which is why I need the dataset normalized into that range.
Basically, two steps would be involved :
Offset all numbers by the minimum along real and imaginary axes.
Divide each by the max. magnitude. To get the magnitude of a complex number, simply use np.abs().
Thus, the implementation would be -
def normalize_complex_arr(a):
a_oo = a - a.real.min() - 1j*a.imag.min() # origin offsetted
return a_oo/np.abs(a_oo).max()
Sample runs for verification
Let's start with an array whose minimum element is [0+0j] and which has two more elements - [x1+y1*j] & [y1+x1*j]. Thus, their magnitudes after normalizing should be 1 each.
In [358]: a = np.array([0+0j, 1+17j, 17+1j])
In [359]: normalize_complex_arr(a)
Out[359]:
array([ 0.00000000+0.j , 0.05872202+0.99827437j,
0.99827437+0.05872202j])
In [360]: np.abs(normalize_complex_arr(a))
Out[360]: array([ 0., 1., 1.])
Next up, let's shift the whole array by adding an offset to every element. This shouldn't change the magnitudes after normalization -
In [361]: a = np.array([0+0j, 1+17j, 17+1j]) + np.array([2+3j])
In [362]: a
Out[362]: array([ 2. +3.j, 3.+20.j, 19. +4.j])
In [363]: normalize_complex_arr(a)
Out[363]:
array([ 0.00000000+0.j , 0.05872202+0.99827437j,
0.99827437+0.05872202j])
In [364]: np.abs(normalize_complex_arr(a))
Out[364]: array([ 0., 1., 1.])
Finally, let's add another element at twice the distance from the offset origin to make sure this new one has a magnitude of 1 and the others are reduced to 0.5 -
In [365]: a = np.array([0+0j, 1+17j, 17+1j, 34+2j]) + np.array([2+3j])
In [366]: a
Out[366]: array([ 2. +3.j, 3.+20.j, 19. +4.j, 36. +5.j])
In [367]: normalize_complex_arr(a)
Out[367]:
array([ 0.00000000+0.j , 0.02936101+0.49913719j,
0.49913719+0.02936101j, 0.99827437+0.05872202j])
In [368]: np.abs(normalize_complex_arr(a))
Out[368]: array([ 0. , 0.5, 0.5, 1. ])
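If you do want to keep the question's approach of normalizing the real and imaginary parts separately, the two scaled arrays can be merged back into a single complex array with real + 1j*imag. A minimal sketch (the helper names here are illustrative, and the scaling mirrors the question's numpy_minmax()):
import numpy as np

def minmax_scale(X):
    # Min-max scale into [-0.9, 0.9], as in the question's numpy_minmax().
    xmin, xmax = X.min(), X.max()
    return (2 * (X - xmin) / (xmax - xmin) - 1) * 0.9

def normalize_parts_separately(x):
    # Scale real and imaginary parts independently, then recombine
    # into one complex-valued array.
    return minmax_scale(x.real) + 1j * minmax_scale(x.imag)

a = np.array([2 + 3j, 3 + 20j, 19 + 4j])
print(normalize_parts_separately(a))   # both parts now lie in [-0.9, 0.9]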

Accuracy warnings in scipy.special

I am running an MCMC sampler which requires the calculation of the hypergeometric function at each step using scipy.special.hyp2f1().
At certain points on my grid (which I do not care about) the solutions to the hypergeometric function are quite unstable and SciPy prints the warning:
Warning! You should check the accuracy
This is rather annoying, and over 1000s of samples it may well slow down my routine.
I have tried using special.errprint(0) with no luck, as well as disabling all warnings in Python using both the warnings module and the -W ignore flag.
The offending function (called from another file) is below
from numpy import pi, hypot, real, imag
import scipy.special as special

def deflection_angle(p, (x1, x2)):
    # Find the normalisation constant
    norm = (p.f * p.m * (p.r0 ** (t - 2.0)) / pi) ** (1.0 / t)
    # Define the complex plane
    z = x1 + 1j * x2
    # Define the radial coordinates
    r = hypot(x1, x2)
    # Truncate the radial coordinates
    r_ = r * (r < p.r0).astype('float') + p.r0 * (r >= p.r0).astype('float')
    # Calculate the radial part
    radial = (norm ** 2 / (p.f * z)) * ((norm / r_) ** (t - 2))
    # Calculate the angular part
    h1, h2, h3 = 0.5, 1.0 - t / 2.0, 2.0 - t / 2.0
    h4 = ((1 - p.f ** 2) / p.f ** 2) * (r_ / z) ** 2
    special.errprint(0)
    angular = special.hyp2f1(h1, h2, h3, h4)
    # Assemble the deflection angle
    alpha = (- radial * angular).conjugate()
    # Separate real and imaginary parts
    return real(alpha), imag(alpha)
Unfortunately, hyp2f1 is notoriously hard to compute over some non-trivial regions of the parameter space. Many implementations silently produce inaccurate or wildly wrong results there; scipy.special at least tries hard to monitor convergence. An alternative could be to use an arbitrary-precision implementation, e.g. mpmath, but that would certainly be quite a bit slower, so MCMC users beware.
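As a rough illustration of the mpmath route (not code from the original answer; mpmath.hyp2f1 evaluates the function to whatever precision you request, at the cost of speed):
import mpmath

mpmath.mp.dps = 30   # work with 30 significant digits
# Evaluate 2F1 at a sample complex argument, analogous to the
# scipy.special.hyp2f1 call in the question.
val = mpmath.hyp2f1(0.5, 2.0 / 3.0, 1.5, 0.75 + 0.09j)
print(val)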
EDIT: OK, this seems to be scipy version dependent. I tried @wrwrwr's example on scipy 0.13.3, and it reproduces what you see: "Warning! You should check the accuracy" is printed regardless of the errprint status. However, doing the same with the dev version, I get
In [12]: errprint(True)
Out[12]: 0
In [13]: hyp2f1(0.5, 2/3., 1.5, 0.09j+0.75j)
/home/br/virtualenvs/scipy_py27/bin/ipython:1: SpecialFunctionWarning: scipy.special/chyp2f1: loss of precision
#!/home/br/virtualenvs/scipy_py27/bin/python
Out[13]: (0.93934867949609357+0.15593972567482395j)
In [14]: errprint(False)
Out[14]: 1
In [15]: hyp2f1(0.5, 2/3., 1.5, 0.09j+0.75j)
Out[15]: (0.93934867949609357+0.15593972567482395j)
So, apparently it got fixed at some point between 2013 and now. You might want to upgrade your scipy version.
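Since the dev-version output above shows the message being raised as a regular Python warning (SpecialFunctionWarning), another option in recent SciPy versions is to silence just that warning with the standard warnings machinery; a sketch of that approach:
import warnings
import scipy.special as special

# Ignore only SciPy's loss-of-precision warnings; other warnings stay enabled.
warnings.filterwarnings("ignore", category=special.SpecialFunctionWarning)

val = special.hyp2f1(0.5, 2.0 / 3.0, 1.5, 0.75 + 0.09j)
print(val)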

Python and Floats - Print only the Whole Number

Is there a way in Python to print only the whole number portion of a float when no additional precision is required to express the number? For example, the float 1.0. Some other languages do this by default. Here are some examples:
In C++, this code prints 1, not 1.0:
#include <iostream>

int main()
{
    float f = 1.0;
    std::cout << f << "\n";
    return 0;
}
./a.out
1
However, in Python, this code prints 1.0:
f = 1.0
print type(f)
<type 'float'>
print f
1.0
I'd like for the Python code to only print 1, not 1.0, when that's all that is required to fully represent the number.
Use the g formatting option:
f = 1.0
print(f"{f:g}") # Python 3.6 and above
or
print "{:g}".format(f)
or
print "%g" % f
This behaves much like std::cout in its default configuration: it prints only a limited number of significant digits and drops an unneeded trailing .0.
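To make the behavior concrete, here is what %g produces for a few values (illustration only):
for v in (1.0, 1.5, 100.0, 1.0 / 3.0):
    print("%g" % v)
# 1
# 1.5
# 100
# 0.333333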
The modulo operator should work across all Python versions:
>>> f = 1.0
>>> if f % 1 == 0:
... print int(f)
... else:
... print f
...
1
>>>
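A small variant of the same idea (not from the original answer): in Python 2.6+ a float knows whether it is a whole number via float.is_integer(), so the check can be written without the modulo; the show() helper here is just for illustration:
def show(f):
    # Print only the whole-number part when the float has no fractional part.
    print(int(f) if f.is_integer() else f)

show(1.0)    # 1
show(1.25)   # 1.25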