undefined reference to `gemmkernel_' -- C++ routine called from Fortran

I've been working on a Fortran routine that calls a C++ function. I get the following error when I try to make it:
make -f makefile_gcc
Error:
gfortran -O3 -o tgemm tgemm.o mytimer.o dgemmf.o -lblas -dgemmkernel.o
dgemmf.o: In function `dgemmf_':
dgemmf.f:(.text+0x135): undefined reference to `gemmkernel_'
collect2: ld returned 1 exit status
make: *** [tgemm] Error 1
This is my makefile:
FC=gfortran
CC=gcc
FFLAGS = -O3
CFLAGS = -O5
BLASF=dgemmf.o
BLASFSRC=dgemmf.f
TIMER=mytimer.o
TGEMM=tgemm
ALL= $(TGEMM)
LIBS = -lblas -dgemmkernel.o

all: $(ALL)

$(TGEMM): dgemmkernel.o tgemm.o $(TIMER) $(BLASF)
	$(FC) $(FFLAGS) -o $(TGEMM) tgemm.o $(TIMER) $(BLASF) $(LIBS)

dgemmkernel.o: dgemmkernel.cpp
	$(CC) $(CFLAGS) -c dgemmkernel.cpp

tgemm.o: tgemm.f $(INCLUDE)
	$(FC) $(FFLAGS) -c tgemm.f

clean:
	rm -rf *.o $(ALL)
Here is my Fortran code:
      SUBROUTINE DGEMMF( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB,
     $                   BETA, C, LDC )
*     .. Scalar Arguments ..
      CHARACTER*1 TRANSA, TRANSB
      INTEGER M, N, K, LDA, LDB, LDC
      DOUBLE PRECISION ALPHA, BETA
*     .. Array Arguments ..
      DOUBLE PRECISION A( LDA, * ), B( LDB, * ), C( LDC, * )
*     .. External Functions ..
      LOGICAL LSAME
      EXTERNAL LSAME
*     .. Local Scalars ..
      LOGICAL NOTA, NOTB
      INTEGER I, J, L
*     .. Parameters ..
      DOUBLE PRECISION ONE, ZERO
      PARAMETER ( ONE = 1.0D+0, ZERO = 0.0D+0 )
*     ..
*     .. Executable Statements ..
*
*     Set NOTA and NOTB as true if A and B respectively are not
*     transposed
*
      NOTA = LSAME( TRANSA, 'N' )
      NOTB = LSAME( TRANSB, 'N' )
*
*     We only want C = A*B
*
      IF ((ALPHA.NE.ONE).OR.( BETA.NE.ZERO).OR.
     $    (.NOT.NOTA).OR.(.NOT.NOTB)) STOP
*
*     Start the operations.
*
      CALL gemmkernel( M, N, K, A, LDA, B, LDB, C, LDC )
      RETURN
*
*     End of DGEMMF.
*
      END
And here is the C++ bit that I'm trying to call
void gemmkernel_(int * M, int * N, int * K,
double * a, int * LDA,
double * b, int * LDB,
double * c, int * LDC)
All of the .o files are created, but the executable never gets built. I suspect the error is in my makefile, because every source I've found so far suggests that my Fortran/C++ code is correct.

Your make fails at link time. dgemmkernel.o should be in the list of object files. I assume you want this line:
$(FC) $(FFLAGS) -o $(TGEMM) tgemm.o $(TIMER) $(BLASF) $(LIBS)
to be:
$(FC) $(FFLAGS) -o $(TGEMM) tgemm.o dgemmkernel.o $(TIMER) $(BLASF) $(LIBS)
and
LIBS = -lblas -dgemmkernel.o
to be:
LIBS = -lblas
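As an aside: the trailing underscore in gemmkernel_ exists only to match gfortran's default name-mangling convention. A more robust way to bind the two languages is the standard ISO_C_BINDING module, which pins down both the symbol name and the argument types. A minimal free-form sketch, assuming the C++ definition is declared extern "C" void gemmkernel(int *M, ...) so that its symbol is not C++-mangled:

! Explicit, C-interoperable interface for the C++ kernel.
! With BIND(C) and no VALUE attribute, every argument is passed by
! reference, matching the pointer parameters on the C++ side.
interface
   subroutine gemmkernel(M, N, K, A, LDA, B, LDB, C, LDC) &
        bind(c, name='gemmkernel')
      use, intrinsic :: iso_c_binding, only: c_int, c_double
      integer(c_int) :: M, N, K, LDA, LDB, LDC
      real(c_double) :: A(LDA,*), B(LDB,*), C(LDC,*)
   end subroutine gemmkernel
end interface

With this interface in scope, CALL gemmkernel(...) links against the unmangled symbol gemmkernel regardless of either compiler's naming conventions.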

Subroutine Segmentation Fault

PROGRAM olaf
  IMPLICIT NONE
  INTEGER :: i, j, nc, nd,ok,iter
  REAL :: alph, bet, chi, ninf1, C1, ninf2, C2
  REAL, DIMENSION(:), ALLOCATABLE :: u,up2
  REAL :: E, k, Lc, hc, eps, h, Ld, Cai
  INTEGER, DIMENSION(7) :: valnc = (/ 10, 50, 100, 500, 1000, 5000, 10000/)
  Do iter=1,7
    nc = valnc(iter)
    Ld=0.2*Lc ; nd=((Lc+Ld)/h)-nc;
    E=25. ; k=125. ; hc=0.01 ; eps=0.01 ; Lc=1 ;
    h = Lc/nc ; chi=sqrt((E*hc)/k) ; alph= -(1/h**2) ; bet=(2/h**2)+(k/(E*hc)) ;
    CALL resolution(0,nc,bet,2*alph,alph,2*alph,-2*eps/h,2*eps/h,u)
    Cai=u(nc)
    CALL resolution(nc+1,(nc+nd)+1,-2.0,2.0,1.0,2.0,2.0*eps*h,-2*eps*h,up2,Cai)
    DEALLOCATE(u,up2)
  END DO
CONTAINS
  SUBROUTINE resolution (n1,n2,a,b1,b,b2,c1,c2,u1,u2)
    INTEGER, INTENT(IN) :: n1,n2
    REAL, INTENT(IN):: a,b1,b,b2,c1,c2
    REAL, INTENT(IN),OPTIONAL :: u2
    REAL, DIMENSION (:),ALLOCATABLE, INTENT(OUT) :: u1
    REAL, DIMENSION(:), ALLOCATABLE :: Ap, Ae, Aw, bh, Lw, Lp, Ue, y
    INTEGER :: i
    Logical :: Exist
    ALLOCATE(Ap(n1+1:n2), Ae(n1:n2), Aw(n1:n2),bh(n1:n2))
    ALLOCATE(Lw(n1:n2), Lp(n1:n2), Ue(n1:n2), y(n1:n2),u1(n1:n2))
    Aw=0; Ap=0; Ae=0; bh= 0 ; Lw = 0 ; Lp = 0 ; Ue = 0 ; y=0; Lc=0;
    Exist=Present(u2)
    IF(Exist .eqv. .true.)THEN
      u1(n1)=u2
    END IF
    DO i = n1,n2
      Ae(i) = b
      Ap(i) = a
      Aw(i) = b
    END DO
    Ae(n1)=b1
    Aw(n2)=b2
    bh(n1)=c1
    bh(n2)=c2
    Lp(n1) = Ap(n1+2)
    Ue(n1) = Ae(n1)/Lp(n1)
    DO i = n1+1, n2-1
      Lw(i) = Aw(i)
      Lp(i) = Ap(i) - Lw(i)*Ue(i-1)
      Ue(i) = Ae(i)/Lp(i)
    END DO
    Lw(n2) = Aw(n2)
    Lp(n2) = Ap(n2) - Lw(n2)*Ue(n2-1)
    y(n1) = bh(n1)/Lp(n1)
    DO i = n1, n2
      y(i) = (bh(i) - Lw(i)*y(i-1)) / Lp(i)
    END DO
    u1(n2) = y(n2)
    DO i = n2-1, n1, -1
      u1(i) = y(i) - Ue(i)*u1(i+1)
    END DO
    DEALLOCATE(Ap, Ae, Aw,bh,Lw, Lp, Ue, y)
  END SUBROUTINE
END PROGRAM olaf
I'm trying to do an Ah=b decomposition for u and up2 in my program. However, up2 depends on u for its first value. To avoid repeating the decomposition, I made the solve into a subroutine, but whenever I call it in the same loop for both u and up2 I get a segmentation fault that I can't pin down.
Please learn about compiler warnings and run-time error checking. Using gfortran with the appropriate flags on your program I get:
ijb@ianbushdesktop ~/work/stack $ gfortran -std=f2003 -Wall -Wextra -fcheck=all -g -O olaf.f90
olaf.f90:13:16:
Ld=0.2*Lc ; nd=((Lc+Ld)/h)-nc;
1
Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1) [-Wconversion]
olaf.f90:4:69:
REAL :: alph, bet, chi, ninf1, C1, ninf2, C2
1
Warning: Unused variable ‘c1’ declared at (1) [-Wunused-variable]
olaf.f90:4:80:
REAL :: alph, bet, chi, ninf1, C1, ninf2, C2
1
Warning: Unused variable ‘c2’ declared at (1) [-Wunused-variable]
olaf.f90:3:45:
INTEGER :: i, j, nc, nd,ok,iter
1
Warning: Unused variable ‘i’ declared at (1) [-Wunused-variable]
olaf.f90:3:48:
INTEGER :: i, j, nc, nd,ok,iter
1
Warning: Unused variable ‘j’ declared at (1) [-Wunused-variable]
olaf.f90:4:65:
REAL :: alph, bet, chi, ninf1, C1, ninf2, C2
1
Warning: Unused variable ‘ninf1’ declared at (1) [-Wunused-variable]
olaf.f90:4:76:
REAL :: alph, bet, chi, ninf1, C1, ninf2, C2
1
Warning: Unused variable ‘ninf2’ declared at (1) [-Wunused-variable]
olaf.f90:3:59:
INTEGER :: i, j, nc, nd,ok,iter
1
Warning: Unused variable ‘ok’ declared at (1) [-Wunused-variable]
olaf.f90:13:0:
Ld=0.2*Lc ; nd=((Lc+Ld)/h)-nc;
Warning: ‘h’ may be used uninitialized in this function [-Wmaybe-uninitialized]
olaf.f90:6:0:
REAL :: E, k, Lc, hc, eps, h, Ld, Cai
note: ‘h’ was declared here
ijb@ianbushdesktop ~/work/stack $ ./a.out
At line 51 of file olaf.f90
Fortran runtime error: Index '0' of dimension 1 of array 'ap' below lower bound of 1
Error termination. Backtrace:
#0 0x400f8d in resolution
at /home/ijb/work/stack/olaf.f90:51
#1 0x401991 in olaf
at /home/ijb/work/stack/olaf.f90:20
#2 0x401991 in main
at /home/ijb/work/stack/olaf.f90:11
ijb@ianbushdesktop ~/work/stack $
Note two things
You are using h uninitialized. The compiler tells you this in the second-to-last warning. I don't know the right fix for this; you will need to decide what value h should have before it is first used.
You are accessing the array ap out of bounds. This occurs at the line
Ap(i) = a
and the problem is that i is zero. Looking at the code, i runs from n1 to n2, so the ultimate problem is that at the line
CALL resolution(0,nc,bet,2*alph,alph,2*alph,-2*eps/h,2*eps/h,u)
you are passing the value 0 to n1, but you allocate the array ap as
ALLOCATE(Ap(n1+1:n2), Ae(n1:n2), Aw(n1:n2),bh(n1:n2))
which is inconsistent with how Ap is used above. My best guess is that the do loop should read
DO i = n1+1,n2
but that is a guess - the problem as presented is caused by the out-of-bounds accesses in this loop.
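Putting the two observations together, a possible repair is sketched below. The reordering uses only statements from the original program, and the loop bound is still the guess made above; only the author can say what h is really meant to be:

! In the main program: set Lc and h before their first use.
E=25. ; k=125. ; hc=0.01 ; eps=0.01 ; Lc=1
h = Lc/nc
Ld=0.2*Lc ; nd=((Lc+Ld)/h)-nc

! In resolution: make the loop agree with ALLOCATE(Ap(n1+1:n2), ...)
! (alternatively, allocate Ap(n1:n2) and keep the loop as it was).
DO i = n1+1, n2
  Ae(i) = b
  Ap(i) = a
  Aw(i) = b
END DO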

Borwein’s algorithm for the calculation of Pi in Fortran is converging too fast

The following implementation of Borwein's algorithm with quartic convergence in Fortran does calculate Pi, but it appears to converge too fast. In theory, a converges quartically to 1/π, so the number of correct digits should quadruple on each iteration.
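For reference, the quartic iteration the code implements is

\[
y_{k+1} = \frac{1-(1-y_k^4)^{1/4}}{1+(1-y_k^4)^{1/4}}, \qquad
a_{k+1} = a_k\,(1+y_{k+1})^4 - 2^{2k+3}\, y_{k+1}\,(1+y_{k+1}+y_{k+1}^2),
\]

with $y_0 = \sqrt{2}-1$, $a_0 = 2(\sqrt{2}-1)^2 = 6-4\sqrt{2}$, and $a_k \to 1/\pi$ quartically.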
! pi.f90
program main
  use, intrinsic :: iso_fortran_env, only: real128
  implicit none
  real(kind=real128), parameter :: CONST_PI = acos(-1._real128)
  real(kind=real128) :: pi
  integer :: i
  do i = 1, 10
    pi = borwein(i)
    print '("Pi (n = ", i3, "): ", f0.100)', i, pi
  end do
  print '("Pi:", 11x, f0.100)', CONST_PI
contains
  function borwein(n) result(pi)
    integer, intent(in) :: n
    real(kind=real128) :: pi
    real(kind=real128) :: a, y
    integer :: i
    y = sqrt(2._real128) - 1
    a = 2 * (sqrt(2._real128) - 1)**2
    do i = 1, n
      y = (1 - (1 - y**4)**.25_real128) / (1 + (1 - y**4)**.25_real128)
      a = a * (1 + y)**4 - 2**(2 * (i - 1) + 3) * y * (1 + y + y**2)
    end do
    pi = 1 / a
  end function borwein
end program main
But after the second iteration, the value of Pi does not change anymore, as one can see for the first 100 digits:
$ gfortran -o pi pi.f90
$ ./pi
Pi (n = 1): 3.1415926462135422821493444319826910539597974491424025615960346875056357837663334464650688460096716881
Pi (n = 2): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 3): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 4): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 5): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 6): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 7): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 8): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 9): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi (n = 10): 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
Pi: 3.1415926535897932384626433832795027974790680981372955730045043318742967186629755360627314075827598572
(The last output is the correct value of Pi for comparison.)
Is there an error in the implementation? I'm not sure if real128 precision is maintained throughout.
At the second iteration you have converged to as many digits as real128 can support:
ian@eris:~/work/stack$ cat pi2.f90
Program pi2
  Use, intrinsic :: iso_fortran_env, only: real128
  Implicit None
  Real( real128 ) :: pi
  pi = 3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439_real128
  Write( *, '( f0.100 )' ) pi
  Write( *, '( f0.100 )' ) Nearest( pi, +1.0_real128 )
  Write( *, '( f0.100 )' ) Abs( acos(-1._real128) - Nearest( pi, +1.0_real128 ) )
End Program pi2
ian@eris:~/work/stack$ gfortran -std=f2008 -Wall -Wextra -fcheck=all pi2.f90
ian@eris:~/work/stack$ ./a.out
3.1415926535897932384626433832795024122930792206901249618089158148888330110426458929850923595950007439
3.1415926535897932384626433832795027974790680981372955730045043318742967186629755360627314075827598572
.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
ian@eris:~/work/stack$
Thus the further iterations show no change, as the value is already as accurate as real128 allows.
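A quick way to see the limit is to ask the compiler itself. This minimal sketch prints the number of reliable decimal digits and the spacing of real128 values near 1; with gfortran, where real128 is IEEE binary128, it reports 33 digits and an epsilon of about 1.9e-34:

program real128_limits
  use, intrinsic :: iso_fortran_env, only: real128
  implicit none
  print *, precision(1.0_real128)  ! decimal precision of the kind: 33
  print *, epsilon(1.0_real128)    ! spacing near 1.0: about 1.93e-34
end program real128_limits

With quartic convergence, n = 1 already yields about 8 correct digits (as the output above shows) and n = 2 more than 30, so from the third iteration on there is nothing left to improve.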

lapack stemr segmentation fault with a particular matrix

I am trying to find the first (smallest) k eigenvalues of a real symmetric tridiagonal matrix using the appropriate lapack routine.
I am new to both Fortran and the lapack libraries, but (d)stemr seemed to me a good choice, so I tried calling it, but I keep getting segmentation faults.
After some trials I noticed the problem was my input matrix, which has:
diagonal = 2 * (1 + order 1e-5 to 1e-3 small variable correction)
subdiagonal all equal -1 (if I use e.g. 0.95 instead everything works)
I reduced the code to a single M(not)WE program shown below.
So the questions are:
why is stemr failing with such a matrix, while e.g. stev works?
why a segmentation fault?
program mwe
  implicit none
  integer, parameter :: n = 10
  integer, parameter :: iu = 3
  integer :: k
  double precision :: d(n), e(n)
  double precision :: vals(n), vecs(n,iu)
  integer :: m, ldz, nzc, lwk, liwk, info
  integer, allocatable :: isuppz(:), iwk(:)
  double precision, allocatable :: wk(:)
  do k = 1, n
    d(k) = +2d0 + ((k-5.5d0)*1d-2)**2
    e(k) = -1d0 ! e(n) not really needed
  end do
  ldz = n
  nzc = iu
  allocate(wk(1), iwk(1), isuppz(2*iu))
  ! ifort -mkl gives SIGSEGV at this call <----------------
  call dstemr( &
       'V', 'I', n, d, e, 0d0, 0d0, 1, iu, &
       m, vals, vecs, ldz, -1, isuppz, .true., &
       wk, -1, iwk, -1, info)
  lwk = ceiling(wk(1)); deallocate(wk); allocate(wk(lwk))
  liwk = iwk(1); deallocate(iwk); allocate(iwk(liwk))
  print *, info, lwk, liwk ! ok with gfortran
  ! gfortran -llapack gives SIGSEGV at this call <---------
  call dstemr( &
       'V', 'I', n, d, e, 0d0, 0d0, 1, iu, &
       m, vals, vecs, ldz, nzc, isuppz, .true., &
       wk, lwk, iwk, liwk, info)
end program
Compilers are invoked via:
gfortran [(GCC) 9.2.0]: gfortran -llapack -o o.x mwe.f90
ifort [(IFORT) 19.0.5.281 20190815]: ifort -mkl -o o.x mwe.f90
According to the manual, one issue seems to be that the argument TRYRAC needs to be a variable (rather than a constant), because it can be overwritten by dstemr():
[in,out] TRYRAC : ... On exit, a .TRUE. TRYRAC will be set to .FALSE. if the matrix
does not define its eigenvalues to high relative accuracy.
So, for example, a modified code may look like:
logical :: tryrac
...
tryrac = .true.
call dstemr( &
     'V', 'I', n, d, e, 0d0, 0d0, 1, iu, &
     m, vals, vecs, ldz, -1, isuppz, tryrac, & !<--
     wk, -1, iwk, -1, info)
...
tryrac = .true.
call dstemr( &
     'V', 'I', n, d, e, 0d0, 0d0, 1, iu, &
     m, vals, vecs, ldz, nzc, isuppz, tryrac, & !<--
     wk, lwk, iwk, liwk, info)
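The segmentation fault itself is plausible because a compiler may place the literal .true. in read-only storage, so when dstemr() assigns to TRYRAC it writes to memory the process must not modify. Because dstemr is called through an implicit interface, the compiler cannot diagnose the mismatch. A minimal sketch of the same hazard, with a hypothetical routine flip; whether it actually crashes depends on whether the compiler passes the literal itself or a temporary copy, which would also explain why ifort and gfortran fail at different calls above:

program literal_hazard
  implicit none
  call flip(.true.)   ! implicit interface: compiles, but flip writes to a literal
end program literal_hazard

subroutine flip(flag)
  implicit none
  logical, intent(inout) :: flag
  flag = .false.      ! may write into read-only storage: undefined behavior
end subroutine flip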

BLAS function returns zero in Fortran90

I am learning to use BLAS in Fortran90, and wrote a simple program using the subroutine SAXPY and the function SNRM2. The program computes the distance between two points by subtracting one vector from the other, then taking the euclidean norm of the result.
I declare the return value of SNRM2 as external, following the answer to a similar question, "Calling BLAS functions".
My full program:
program test
  implicit none
  real :: dist
  real, dimension(3) :: a, b
  real, external :: SNRM2
  a = (/ 3.0, 0.0, 0.0 /)
  b = (/ 0.0, 4.0, 0.0 /)
  call SAXPY(3, -1.0, a,1, b,1)
  print *, 'difference vector: ', b
  dist = 6.66 ! to show that SNRM2 is doing something
  dist = SNRM2(3, b, 1)
  print *, 'length of diff vector: ', dist
end program test
The result of the program is:
difference vector: -3.00000000 4.00000000 0.00000000
length of diff vector: 0.00000000
The difference vector is correct, but the length ought to be 5. So why is SNRM2 returning a value of zero?
I know the variable dist is modified by SNRM2, so I don't suspect that my OpenBLAS installation is broken. I'm running macOS 10.13 and installed everything with Homebrew.
I am compiling with gfortran with many flags enabled, and I get no warnings:
gfortran test.f90 -lblas -g -fimplicit-none -fcheck=all -fwhole-file -fcheck=all -fbacktrace -Wall -Wextra -Wline-truncation -Wcharacter-truncation -Wsurprising -Waliasing -Wconversion -Wno-unused-parameter -pedantic -o test
I tried looking at the code for snrm2.f, but I don't see any potential problems.
I also tried declaring my variables with real(4) or real(selected_real_kind(6)) with no change in behavior.
Thanks!
According to this page, there seems to be some issue with single precision routines in the BLAS shipped with Apple's Accelerate Framework.
On my Mac (OSX10.11), gfortran-8.1 (installed via Homebrew) + default BLAS (in the system) gives a wrong result:
$ gfortran-8 test.f90 -lblas
or
$ gfortran-8 test.f90 -L/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/ -lBLAS
$ ./a.out
difference vector: -3.00000000 4.00000000 0.00000000
length of diff vector: 0.00000000
while explicitly linking with OpenBLAS (installed via Homebrew) gives the correct result:
$ gfortran-8 test.f90 -L/usr/local/Cellar/openblas/0.2.20_2/lib -lblas
$ ./a.out
difference vector: -3.00000000 4.00000000 0.00000000
length of diff vector: 5.00000000
The above page suggests that the problem occurs when linking with the system BLAS in a way that is not compliant with the old g77 calling convention. Indeed, adding the -ff2c option gives the correct result:
$ gfortran-8 -ff2c test.f90 -lblas
$ ./a.out
difference vector: -3.00000000 4.00000000 0.00000000
length of diff vector: 5.00000000
But I guess it may be better to use the latest OpenBLAS rather than to rely on the -ff2c option ...
The following is a separate test in C (to check that the problem is not specific to gfortran).
// test.c
#include <stdio.h>
float snrm2_( int*, float*, int* );
int main()
{
    float b[3] = { -3.0f, 4.0f, 0.0f };
    int n = 3, inc = 1;
    float dist = snrm2_( &n, b, &inc );
    printf( "b = %10.7f %10.7f %10.7f\n", b[0], b[1], b[2] );
    printf( "dist = %10.7f\n", dist );
    return 0;
}
$ gcc-8 test.c -lblas
$ ./a.out
b = -3.0000000 4.0000000 0.0000000
dist = 0.0000000
$ gcc-8 test.c -lblas -L/usr/local/Cellar/openblas/0.2.20_2/lib
$ ./a.out
b = -3.0000000 4.0000000 0.0000000
dist = 5.0000000
As far as I've tried, the double-precision version (DNRM2) works even with the system BLAS, so the problem seems to affect only the single-precision version (as suggested in the page above).
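If replacing the system BLAS is not convenient, that last observation suggests a pragmatic workaround: do the calculation in double precision, since DNRM2 (and the subroutine DAXPY, which has no return-value ABI to get wrong) behave correctly even with the system BLAS. A minimal sketch of the original test program rewritten that way:

program test_d
  implicit none
  double precision :: dist
  double precision, dimension(3) :: a, b
  double precision, external :: DNRM2
  a = (/ 3.0d0, 0.0d0, 0.0d0 /)
  b = (/ 0.0d0, 4.0d0, 0.0d0 /)
  call DAXPY(3, -1.0d0, a, 1, b, 1)   ! b := b - a
  print *, 'difference vector: ', b
  dist = DNRM2(3, b, 1)               ! expected: 5.0
  print *, 'length of diff vector: ', dist
end program test_d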

Bug in the C++ standard library in std::poisson_distribution?

I think I have encountered incorrect behaviour of std::poisson_distribution from the C++ standard library.
Questions:
Could you confirm it is indeed a bug and not my error?
What exactly is wrong in the standard library's implementation of poisson_distribution, assuming that it is indeed a bug?
Details:
The following C++ code (file poisson_test.cc) is used to generate Poisson-distributed numbers:
#include <array>
#include <cmath>
#include <iostream>
#include <random>
int main() {
    // The problem turned out to be independent of the engine
    std::mt19937_64 engine;
    // Set a fixed seed for easy reproducibility
    // The problem turned out to be independent of the seed
    engine.seed(1);
    std::poisson_distribution<int> distribution(157.17);
    for (int i = 0; i < 1E8; i++) {
        const int number = distribution(engine);
        std::cout << number << std::endl;
    }
}
I compile this code as follows:
clang++ -o poisson_test -std=c++11 poisson_test.cc
./poisson_test > mypoisson.txt
The following Python script was used to analyze the sequence of random numbers in the file mypoisson.txt:
import numpy as np
import matplotlib.pyplot as plt

def expectation(x, m):
    """ Poisson pdf """
    # Use Ramanujan's formula to get ln x!
    lnx = x * np.log(x) - x + 1./6. * np.log(x * (1 + 4*x*(1+2*x))) + 1./2. * np.log(np.pi)
    return np.exp(x*np.log(m) - m - lnx)

expected_mean = 157.17  # the mean passed to std::poisson_distribution above

data = np.loadtxt('mypoisson.txt', dtype = 'int')
unique, counts = np.unique(data, return_counts = True)
hist = counts.astype(float) / counts.sum()
stat_err = np.sqrt(counts) / counts.sum()
plt.errorbar(unique, hist, yerr = stat_err, fmt = '.',
             label = 'Poisson generated \n by std::poisson_distribution')
plt.plot(unique, expectation(unique, expected_mean),
         label = 'expected probability \n density function')
plt.legend()
plt.show()
# Determine bins with a statistically significant deviation larger than 3 sigma
deviation_in_sigma = (hist - expectation(unique, expected_mean)) / stat_err
d = dict((k, v) for k, v in zip(unique, deviation_in_sigma) if np.abs(v) > 3.0)
print(d)
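For reference, the approximation used in expectation() is Ramanujan's formula for the log-factorial,

\[
\ln x! \approx x\ln x - x + \tfrac{1}{6}\ln\bigl(x(1+4x(1+2x))\bigr) + \tfrac{1}{2}\ln\pi ,
\]

whose error is negligible compared with the statistical errors at x ≈ 157.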
The script produces the following plot:
You can see the problem with the naked eye. The deviation at n = 158 is statistically significant; it is in fact a 22σ deviation!
Close-up of the previous plot.
My system is set up as follows (Debian testing):
libstdc++-7-dev:
  Installed: 7.2.0-16
libc++-dev:
  Installed: 3.5-2
clang:
  Installed: 1:3.8-37
g++:
  Installed: 4:7.2.0-1d1
I can confirm the bug when using libstdc++:
g++ -o pois_gcc -std=c++11 pois.cpp
clang++ -o pois_clang -std=c++11 -stdlib=libstdc++ pois.cpp
clang++ -o pois_clang_libc -std=c++11 -stdlib=libc++ pois.cpp
Result: