I am trying to get the ZGEEV routine in Lapack to work for a test problem and having some difficulties. I just started coding in FORTRAN a week ago, so I think it is probably something very trivial that I am missing.
I have to diagonalize rather large complex symmetric matrices. To get started, using Matlab I created a 200 by 200 matrix, which I have verified is diagonalizable. When I run the code, it brings up no errors and the INFO = 0, suggesting a success. However, all the eigenvalues are (0,0) which I know is wrong.
Attached is my code.
PROGRAM speed_zgeev
IMPLICIT NONE
INTEGER(8) :: N
COMPLEX*16, DIMENSION(:,:), ALLOCATABLE :: MAT
INTEGER(8) :: INFO, I, J
COMPLEX*16, DIMENSION(:), ALLOCATABLE :: RWORK
COMPLEX*16, DIMENSION(:), ALLOCATABLE :: D
COMPLEX*16, DIMENSION(1,1) :: VR, VL
INTEGER(8) :: LWORK = -1
COMPLEX*16, DIMENSION(:), ALLOCATABLE :: WORK
DOUBLE PRECISION :: RPART, IPART
EXTERNAL ZGEEV
N = 200
ALLOCATE(D(N))
ALLOCATE(RWORK(2*N))
ALLOCATE(WORK(N))
ALLOCATE(MAT(N,N))
OPEN(UNIT = 31, FILE = "newmat.txt")
OPEN(UNIT = 32, FILE = "newmati.txt")
DO J = 1,N
DO I = 1,N
READ(31,*) RPART
READ(32,*) IPART
MAT(I,J) = CMPLX(RPART, IPART)
END DO
END DO
CLOSE(31)
CLOSE(32)
CALL ZGEEV('N','N', N, MAT, N, D, VL, 1, VR, 1, WORK, LWORK, RWORK, INFO)
INFO = WORK(1)
DEALLOCATE(WORK)
ALLOCATE(WORK(INFO))
CALL ZGEEV('N','N', N, MAT, N, D, VL, 1, VR, 1, WORK, LWORK, RWORK, INFO)
IF (INFO .EQ. 0) THEN
PRINT*, D(1:10)
ELSE
PRINT*, INFO
END IF
DEALLOCATE(MAT)
DEALLOCATE(D)
DEALLOCATE(RWORK)
DEALLOCATE(WORK)
END PROGRAM speed_zgeev
I have tried the same code on smaller matrices, of size 30 by 30 and they work fine. Any help would be appreciated! Thanks.
I forgot to mention that I am loading the matrices from a test file which I have verified to be working right.
Maybe LWORK = WORK (1) instead of INFO = WORK(1)? Also change ALLOCATE(WORK(INFO)).
Related
I am compiling some code trying to use the Intel MKL library eigensolvers. I am already using the VSL RNG's and the DFT libraries without issue. I am compiling and running everything in Visual Studio with Intel Parallel Studio XE installed. I have ensured that the mkl flag is enabled in project properties.
include 'lapack.f90'
program heev_test
use lapack95
implicit none
integer , parameter :: dp = kind(0.0d0)
complex(dp) :: matrix(4,4)
real(dp) :: eigs(4)
matrix = (1.0_dp,0.0_dp)
call zheev(matrix, eigs)
print*, eigs
read(*,*)
stop
end program
Running this code yields a segfault on two machines I have tested it on so far. I believe the issue is that the F77 routines are being called which require more arguments (documentation here). I would like to use the far simpler F95 routines. According to the documentation, I should replace zheev with heev. So I tried that, but then I get the error
fatal error LNK1120: 1 unresolved externals
error LNK2019: unresolved external symbol _ZHEEV_F95 referenced in function _MAIN__
ZHEEV_F95 has an interface defined in the lapack.f90 file.
The only other thing I have right now is that the documentation says I should also include mkl.fi, but doing so I get the following compilation errors
Error error #6218: This statement is positioned incorrectly and/or has syntax errors. C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\\mkl\include\lapack.f90 21
Error error #6790: This is an invalid statement; an END [PROGRAM] statement is required. C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\\mkl\include\lapack.f90 24
Error error #6785: This name does not match the unit name. [F95_PRECISION] C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\\mkl\include\lapack.f90 24
Error Compilation Aborted (code 1)
Warning warning #5427: Program may contain only one main entry routine
referring to these lines from the lapack.f90 file:
21 MODULE F95_PRECISION
22 INTEGER, PARAMETER :: SP = KIND(1.0E0)
23 INTEGER, PARAMETER :: DP = KIND(1.0D0)
24 END MODULE F95_PRECISION
The solution was to link the Lapack95 library explicitly, which is not linked with simply the -mkl flag. For example, on Windows 64-bit with 4 byte integers compiled with the Intel Fortran compiler, use
ifort /Qmkl heev_test.f90 mkl_lapack95_lp64.lib
As for the seg fault, I was improperly calling the F77 routines. Since I did not provide an interface for these routines, the compiler did not complain, and when I did call them, it caused a segmentation fault since the arguments were not properly input.
Here is a code which properly calls both routines, and can be verified to produce the same results.
include 'lapack.f90'
program heev_test
use lapack95
implicit none
integer , parameter :: dp = kind(0.0d0)
complex(dp) :: matrix(4,4)
real(dp) :: eigs(4)
matrix = (1.0_dp,0.0_dp)
print*, "checking eigenvalues using zheevr"
call eigenvalues(matrix,eigs)
print*, eigs
matrix = (1.0_dp,0.0_dp)
print*, "checking eigenvalues using heevr"
call heevr(matrix,eigs)
print*, eigs
read(*,*)
stop
contains
subroutine eigenvalues(a,w)
complex(dp) :: a(:,:)
real(dp) :: w(:)
character*1 :: jobz, range, uplo
integer :: n, m, lda, il, iu, ldz
real(dp) :: vl, vu, abstol
integer :: info
integer :: lwork, liwork, lrwork
complex(dp) , allocatable :: z(:,:)
complex(dp) , allocatable :: work(:)
real(dp) , allocatable :: rwork(:)
integer , allocatable :: iwork(:)
integer , allocatable :: isuppz(:)
jobz = 'N'
range = 'A'
uplo = 'U'
n = size(a,dim=1)
lda = max(1,n)
vl = -huge(vl)
vu = +huge(vl)
il = 1
iu = n
abstol = 0.0_dp
ldz = max(1,n)
lwork = max(1,2*n)
lrwork = max(1,24*n)
liwork = max(1,10*n)
allocate(work(lwork))
allocate(rwork(lrwork))
allocate(iwork(liwork))
allocate(z(ldz,max(1,n)))
allocate(isuppz(2*max(1,n)))
lwork = -1
liwork = -1
lrwork = -1
call zheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, &
& z, ldz, isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
lwork = work(1)
liwork = iwork(1)
lrwork = rwork(1)
deallocate(work,iwork,rwork)
allocate(work(lwork))
allocate(iwork(liwork))
allocate(rwork(lrwork))
call zheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, &
& z, ldz, isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
deallocate(work,iwork,rwork)
if (info /= 0) then
print*, "diagonalization failed, info = ", info
read(*,*)
stop
end if
end subroutine
end program
In my case, the f95 libraries were not installed on the target machine (a remote cluster), so I had to fall back on the f77 library, for which the above subroutine is essentially equivalent to the f95 interface for a square matrix and can be easily modified. See the documentation for definitions of variables.
Note that according to the documentation, heevr/zheevr is a more efficient routine than heev/zheev, so I switched them.
I am trying to find inverse and eigenfunctions of nxn Hermitian matrices using Fortran with lapack.
How do I choose the optimal values for parameters like lda, lwork, liwork and lrwork. I browse through some example and find these choices
integer,parameter::lda=nh
integer,parameter::lwork=2*nh+nh*nh
integer,parameter::liwork=3+5*nh
integer,parameter::lrwork=1 + 5*nh + 2*nh*nh
where nh is the dimension of the matrix. I also find another example with lwork=16*nh. How can I determine the best choice? At this point, I am dealing with 500x500 Hermitian matrices (maximum).
I found this documentation which suggests
WORK
(workspace) REAL array, dimension (LWORK)
On exit, if INFO = 0, then WORK(1) returns the optimal LWORK.
LWORK
(input) INTEGER
The dimension of the array WORK. LWORK  max(1,N).
For optimal performance LWORK  N*NB, where NB is the optimal block size returned by ILAENV.
Is it possible to find out the optimal block size using WORK or ILAENV for a given matrix dimension?
I am using both gfortran and ifort with mkl.
EDIT
Based on the comment by #percusse and #kvantour's answer here is a sample code
character,parameter::jobz="v",uplo="u"
integer, parameter::nh=15
complex*16::m(nh,nh),m1(nh,nh)
integer,parameter::lda=nh
integer::ipiv(nh),info
complex*16::work(1)
real*8::rwork(1), w(nh)
integer::iwork(1)
real*8::x1(nh,nh),x2(nh,nh)
call random_seed()
call random_number(x1)
call random_number(x2)
m=cmplx(x1,x2)
m1=conjg(m)
m1=transpose(m1)
m=(m+m1)/2.0
call zheevd(jobz,uplo,nh,m,lda,w,work,-1,rwork,-1,iwork, -1,info)
print*,"info : ", info
print*,"lwork: ", int(work(1)) , 2*nh+nh*nh
print*,"lrwork:", int(rwork(1)) , 1 + 5*nh + 2*nh*nh
print*,"liwork:", int(iwork(1)) , 3+5*nh
end
info : 0
lwork: 255 255
lrwork: 526 526
liwork: 78 78
I'm not sure what you are implying with "Is it possible to find out the optimal block size using WORK or ILAENV for a particular machine architecture?". You can however find the optimal values for a particular problem.
Eg. If you want to find the eigenvalues of a complex Hermitian matrix, using cheev, you can ask the routine to return you the value :
subroutine CHEEV( JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, RWORK, INFO )
character , intent(in) :: JOBZ
character , intent(in) :: UPLO
integer , intent(in) :: N
complex, dimension(lda,*), intent(inout) :: A
integer , intent(in) :: LDA
real , dimension(*) , intent(out) :: W
complex, dimension(*) , intent(out) :: WORK
integer , intent(in) :: LWORK
real , dimension(*) , intent(out) :: RWORK
integer , intent(out) :: INFO
Then the documentation clearly states (be advised, in the past this was easier to read):
WORK is COMPLEX array, dimension (MAX(1,LWORK))
On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
LWORK is INTEGER
The length of the array WORK. LWORK >= max(1,2*N-1).
For optimal efficiency, LWORK >= (NB+1)*N,
where NB is the blocksize for CHETRD returned by ILAENV. If LWORK = -1, then a workspace query is assumed; the routine
only calculates the optimal size of the WORK array, returns
this value as the first entry of the WORK array, and no error
message related to LWORK is issued by XERBLA.
So all you need to do is
call cheev(jobz, uplo, n, a, lda, w, work, -1, rwork, info)
lwork=int(work(1))
dallocate(work)
allocate(work(lwork))
call cheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
I have to write a Fortran routine returning the inverse matrix. If I run the code below in a Fortran program the inverse matrix is correct, but when I run the subroutine from C++ code my first value is a wrong value. It seems like a problem with the data types or the memory.
What am I doing wrong?
Here is the subroutine:
subroutine get_inverse_matrix( matrix, rows_matrix, cols_matrix, tmpMatrix, rows_tmpMatrix, cols_tmpMatrix) bind(c)
use iso_c_binding
integer :: m, n, lda, lwork, info, size_m
integer(c_int) :: rows_matrix, cols_matrix, rows_tmpMatrix, cols_tmpMatrix
real(c_double) :: matrix(rows_matrix, cols_matrix), tmpMatrix(rows_tmpMatrix, cols_tmpMatrix)
integer, dimension( rows_matrix ) :: ipiv
real, dimension( rows_matrix ) :: work
size_m = rows_matrix
m = size_m
n = size_m
lda = size_m
lwork = size_m
write(*,*) "Matrix: ", matrix
!tmpMatrix = matrix
write(*,*) "Temp matrix: ", tmpMatrix
! LU-Faktorisierung (Dreieckszerlegung) der Matrix
call sgetrf( m, n, tmpMatrix, lda, ipiv, info )
write(*,*) info
! Inverse der LU-faktorisierten Matrix
call sgetri( n, tmpMatrix, lda, ipiv, work, lwork, info )
write(*,*) info
select case(info)
case(0)
write(*,*) "SUCCESS"
case(:-1)
write(*,*) "ILLEGAL VALUE"
case(1:)
write(*,*) "SINGULAR MATRIX"
end select
end subroutine get_inverse_matrix
Here is the declaration in the C++ code:
extern "C"
{
void get_inverse_matrix( double *matrix, int *rows_matrix, int *cols_matrix, double *tmpMatrix, int *rows_tmpMatrix, int *cols_tmpMatrix);}
Here is the call from my C++ program:
get_inverse_matrix(&lhs[0], &sz, &sz, &res[0], &sz, &sz);
My program only uses a 3x3 matrix. If I pass the identity matrix the result looks like:
5.29981e-315 0 0
0 1 0
0 0 1
You are declaring your arrays as type real with kind c_double but you are using lapack routines that are expecting single precision inputs (e.g. c_float). To fix this you should replace the calls to sgetrf and sgetri with dgetrf and dgetri.
As noted by Vladimir F in the comments these issues can be more easily caught if you provide interfaces.
I'm struggling with LAPACK's dgetrf and dgetri routines. Below is a subroutine I've created (the variable fit_coeffs is defined externally and is allocatable, it's not the problem). When I run I get memory allocation errors, that appear when I assign fit_coeffs, due to the matmul(ATA,AT) line. I know this from inserting a bunch of print statements. Also, both error checking statements after calls to LAPACK subroutines are printed, suggesting an error.
Does anyone understand where this comes from? I'm compiling using the command:
gfortran -Wall -cpp -std=f2003 -ffree-form -L/home/binningtont/lapack-3.4.0/ read_grib.f -llapack -lrefblas.
Thanks in advance!
subroutine polynomial_fit(x_array, y_array, D)
integer, intent(in) :: D
real, intent(in), dimension(:) :: x_array, y_array
real, allocatable, dimension(:,:) :: A, AT, ATA
real, allocatable, dimension(:) :: work
integer, dimension(:), allocatable :: pivot
integer :: l, m, n, lda, lwork, ok
l = D + 1
lda = l
lwork = l
allocate(fit_coeffs(l))
allocate(pivot(l))
allocate(work(l))
allocate(A(size(x_array),l))
allocate(AT(l,size(x_array)))
allocate(ATA(l,l))
do m = 1,size(x_array),1
do n = 1,l,1
A(m,n) = x_array(m)**(n-1)
end do
end do
AT = transpose(A)
ATA = matmul(AT,A)
call dgetrf(l, l, ATA, lda, pivot, ok)
! ATA is now represented as PLU (permutation, lower, upper)
if (ok /= 0) then
write(6,*) "HERE"
end if
call dgetri(l, ATA, lda, pivot, work, lwork, ok)
! ATA now contains the inverse of the matrix ATA
if (ok /= 0) then
write(6,*) "HERE"
end if
fit_coeffs = matmul(matmul(ATA,AT),y_array)
deallocate(pivot)
deallocate(fit_coeffs)
deallocate(work)
deallocate(A)
deallocate(AT)
deallocate(ATA)
end subroutine polynomial_fit
1) Where is fit_coeffs declared? I can't see how the above can even compile
1b) Implicit None is your friend!
2) You do have an interface in scope at the calling point, don't you?
3) dgertf and dgetri want "double precision" while you have single. So you need sgetrf and sgetri
"Fixing" all these and completeing the program I get
Program testit
Implicit None
Real, Dimension( 1:100 ) :: x, y
Integer :: D
Interface
subroutine polynomial_fit(x_array, y_array, D)
Implicit None ! Always use this!!
integer, intent(in) :: D
real, intent(in), dimension(:) :: x_array, y_array
End subroutine polynomial_fit
End Interface
Call Random_number( x )
Call Random_number( y )
D = 6
Call polynomial_fit( x, y, D )
End Program testit
subroutine polynomial_fit(x_array, y_array, D)
Implicit None ! Always use this!!
integer, intent(in) :: D
real, intent(in), dimension(:) :: x_array, y_array
real, allocatable, dimension(:,:) :: A, AT, ATA
real, allocatable, dimension(:) :: work, fit_coeffs
integer, dimension(:), allocatable :: pivot
integer :: l, m, n, lda, lwork, ok
l = D + 1
lda = l
lwork = l
allocate(fit_coeffs(l))
allocate(pivot(l))
allocate(work(l))
allocate(A(size(x_array),l))
allocate(AT(l,size(x_array)))
allocate(ATA(l,l))
do m = 1,size(x_array),1
do n = 1,l,1
A(m,n) = x_array(m)**(n-1)
end do
end do
AT = transpose(A)
ATA = matmul(AT,A)
call sgetrf(l, l, ATA, lda, pivot, ok)
! ATA is now represented as PLU (permutation, lower, upper)
if (ok /= 0) then
write(6,*) "HERE"
end if
call sgetri(l, ATA, lda, pivot, work, lwork, ok)
! ATA now contains the inverse of the matrix ATA
if (ok /= 0) then
write(6,*) "HERE"
end if
fit_coeffs = matmul(matmul(ATA,AT),y_array)
deallocate(pivot)
deallocate(fit_coeffs)
deallocate(work)
deallocate(A)
deallocate(AT)
deallocate(ATA)
end subroutine polynomial_fit
This runs to completion. If I omit the interface I get "HERE" printed twice. If I use the d versions I get seg faults.
Does this answer your question?
I'm struggling with LAPACK's dgetrf and dgetri routines. Below is a subroutine I've created (the variable fit_coeffs is defined externally and is allocatable, it's not the problem). When I run I get memory allocation errors, that appear when I assign fit_coeffs, due to the matmul(ATA,AT) line. I know this from inserting a bunch of print statements. Also, both error checking statements after calls to LAPACK subroutines are printed, suggesting an error.
Does anyone understand where this comes from? I'm compiling using the command:
gfortran -Wall -cpp -std=f2003 -ffree-form -L/home/binningtont/lapack-3.4.0/ read_grib.f -llapack -lrefblas.
Thanks in advance!
subroutine polynomial_fit(x_array, y_array, D)
integer, intent(in) :: D
real, intent(in), dimension(:) :: x_array, y_array
real, allocatable, dimension(:,:) :: A, AT, ATA
real, allocatable, dimension(:) :: work
integer, dimension(:), allocatable :: pivot
integer :: l, m, n, lda, lwork, ok
l = D + 1
lda = l
lwork = l
allocate(fit_coeffs(l))
allocate(pivot(l))
allocate(work(l))
allocate(A(size(x_array),l))
allocate(AT(l,size(x_array)))
allocate(ATA(l,l))
do m = 1,size(x_array),1
do n = 1,l,1
A(m,n) = x_array(m)**(n-1)
end do
end do
AT = transpose(A)
ATA = matmul(AT,A)
call dgetrf(l, l, ATA, lda, pivot, ok)
! ATA is now represented as PLU (permutation, lower, upper)
if (ok /= 0) then
write(6,*) "HERE"
end if
call dgetri(l, ATA, lda, pivot, work, lwork, ok)
! ATA now contains the inverse of the matrix ATA
if (ok /= 0) then
write(6,*) "HERE"
end if
fit_coeffs = matmul(matmul(ATA,AT),y_array)
deallocate(pivot)
deallocate(fit_coeffs)
deallocate(work)
deallocate(A)
deallocate(AT)
deallocate(ATA)
end subroutine polynomial_fit
1) Where is fit_coeffs declared? I can't see how the above can even compile
1b) Implicit None is your friend!
2) You do have an interface in scope at the calling point, don't you?
3) dgertf and dgetri want "double precision" while you have single. So you need sgetrf and sgetri
"Fixing" all these and completeing the program I get
Program testit
Implicit None
Real, Dimension( 1:100 ) :: x, y
Integer :: D
Interface
subroutine polynomial_fit(x_array, y_array, D)
Implicit None ! Always use this!!
integer, intent(in) :: D
real, intent(in), dimension(:) :: x_array, y_array
End subroutine polynomial_fit
End Interface
Call Random_number( x )
Call Random_number( y )
D = 6
Call polynomial_fit( x, y, D )
End Program testit
subroutine polynomial_fit(x_array, y_array, D)
Implicit None ! Always use this!!
integer, intent(in) :: D
real, intent(in), dimension(:) :: x_array, y_array
real, allocatable, dimension(:,:) :: A, AT, ATA
real, allocatable, dimension(:) :: work, fit_coeffs
integer, dimension(:), allocatable :: pivot
integer :: l, m, n, lda, lwork, ok
l = D + 1
lda = l
lwork = l
allocate(fit_coeffs(l))
allocate(pivot(l))
allocate(work(l))
allocate(A(size(x_array),l))
allocate(AT(l,size(x_array)))
allocate(ATA(l,l))
do m = 1,size(x_array),1
do n = 1,l,1
A(m,n) = x_array(m)**(n-1)
end do
end do
AT = transpose(A)
ATA = matmul(AT,A)
call sgetrf(l, l, ATA, lda, pivot, ok)
! ATA is now represented as PLU (permutation, lower, upper)
if (ok /= 0) then
write(6,*) "HERE"
end if
call sgetri(l, ATA, lda, pivot, work, lwork, ok)
! ATA now contains the inverse of the matrix ATA
if (ok /= 0) then
write(6,*) "HERE"
end if
fit_coeffs = matmul(matmul(ATA,AT),y_array)
deallocate(pivot)
deallocate(fit_coeffs)
deallocate(work)
deallocate(A)
deallocate(AT)
deallocate(ATA)
end subroutine polynomial_fit
This runs to completion. If I omit the interface I get "HERE" printed twice. If I use the d versions I get seg faults.
Does this answer your question?