I am writing a Fortran code which requires an FFT. I am using the double precision version of four1 from Numerical Recipes in Fortran 77 (page 501). Below is a program to test the FFT. Below that is the output I get from the FFT along with the output I expect to get. The real part of the transform is correct within rounding errors, but the imaginary part is not. However, for my purposes, I do not even need the imaginary part, so should I be able to just grab the real part and continue with that? It still bothers me that the output is not correct though and makes me think I don't understand something about the implimentation of this subroutine.
The way I understand it, the values given to the array "data" when it is constructed are all real (1,1,1,1,0,0,0,0). And the imaginary values are all zero. Is that correct? I am concerned that I am not giving the input array to the FFT in the form that it needs. When I use this subroutine in my actual program, what comes out of the FFT is somewhat nonsensical and the whole thing goes to NaNs after several timesteps.
program fftTest
implicit none
complex(kind=8), dimension(8) :: data = (/1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0/)
integer :: i
do i=1,8
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i)
end do
print *, '-----'
call dfour1(data,8,1)
do i=1,8
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i)
end do
print *, '-----'
call dfour1(data,8,-1)
data = data/8
do i=1,8
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i)
end do
end program fftTest
SUBROUTINE dfour1(data,nn,isign)
INTEGER isign,nn
DOUBLE PRECISION data(2*nn)
INTEGER i,istep,j,m,mmax,n
DOUBLE PRECISION tempi,tempr
DOUBLE PRECISION theta,wi,wpi,wpr,wr,wtemp
n=2*nn
j=1
do 11 i=1,n,2 !This is the bit reversal section of the routine.
if(j.gt.i)then
tempr=data(j) !Exchange the two complex numbers.
tempi=data(j+1)
data(j)=data(i)
data(j+1)=data(i+1)
data(i)=tempr
data(i+1)=tempi
endif
m=n/2
1 if ((m.ge.2).and.(j.gt.m)) then
j=j-m
m=m/2
goto 1
endif
j=j+m
11 continue
mmax=2 !Here begins the Danielson-Lanczos section of the routine.
2 if (n.gt.mmax) then
istep=2*mmax
theta=6.28318530717959d0/(isign*mmax)
wpr=-2.d0*sin(0.5d0*theta)**2
wpi=sin(theta)
wr=1.d0
wi=0.d0
do 13 m=1,mmax,2 !Here are the two nested inner loops.
do 12 i=m,n,istep
j=i+mmax !This is the Danielson-Lanczos formula:
tempr=wr*data(j)-wi*data(j+1)
tempi=wr*data(j+1)+wi*data(j)
data(j)=data(i)-tempr
data(j+1)=data(i+1)-tempi
data(i)=data(i)+tempr
data(i+1)=data(i+1)+tempi
12 continue
wtemp=wr !Trigonometric recurrence
wr=wr*wpr-wi*wpi+wr
wi=wi*wpr+wtemp*wpi+wi
13 continue
mmax=istep
goto 2 !Not yet done.
endif !All done.
return
END
Expected (correct) output:
( 4.000000000000000, 0.000000000000000i )
( 1.000000000000000, -2.414213562373095i )
( 0.000000000000000, 0.000000000000000i )
( 1.000000000000000, -0.414213562373095i )
( 0.000000000000000, 0.000000000000000i )
( 1.000000000000000, 0.414213562373095i )
( 0.000000000000000, 0.000000000000000i )
( 1.000000000000000, 2.414213562373095i )
Actual output from test program:
( 1.000000000000000, 0.000000000000000i )
( 1.000000000000000, 0.000000000000000i )
( 1.000000000000000, 0.000000000000000i )
( 1.000000000000000, 0.000000000000000i )
( 0.000000000000000, 0.000000000000000i )
( 0.000000000000000, 0.000000000000000i )
( 0.000000000000000, 0.000000000000000i )
( 0.000000000000000, 0.000000000000000i )
-----
( 4.000000000000000, 0.000000000000000i )
( 0.999999999999998, 2.414213562373094i )
( 0.000000000000000, 0.000000000000000i )
( 0.999999999999999, 0.414213562373096i )
( 0.000000000000000, 0.000000000000000i )
( 1.000000000000000, -0.414213562373094i )
( 0.000000000000000, 0.000000000000000i )
( 1.000000000000003, -2.414213562373096i )
-----
( 1.000000000000000, 0.000000000000000i )
( 1.000000000000000, -0.000000000000000i )
( 1.000000000000000, 0.000000000000000i )
( 1.000000000000000, 0.000000000000000i )
( 0.000000000000000, 0.000000000000000i )
( 0.000000000000000, 0.000000000000000i )
( 0.000000000000000, -0.000000000000000i )
( 0.000000000000000, -0.000000000000000i )
program fftTest
use iso_fortran_env
implicit none
integer, parameter :: nn = 8
real(real64), dimension(2*nn) :: data = [ 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, &
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]
integer :: i
do i=1,2*nn,2
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i),data(i+1)
end do
print *, '-----'
call dfour1(data,nn,-1)
do i=1,2*nn,2
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i),data(i+1)
end do
print *, '-----'
call dfour1(data,nn,+1)
data = data/8
do i=1,2*nn,2
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i),data(i+1)
end do
end program fftTest
and a demo of using the GNU Scientific Library, which is Open Source, via Forran's ISO C Binding:
program fftTest
use iso_c_binding
implicit none
integer (c_size_t), parameter :: nn = 8
real(C_DOUBLE), dimension(2*nn) :: data = [ 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, &
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]
integer :: i
integer :: result
interface
! Adapted from Fortran GSL Library, FGSL,
! http://www.lrz.de/services/software/mathematik/gsl/fortran/
function gsl_fft_complex_radix2_backward(data, stride, n) bind(c)
import :: c_size_t, c_double, c_int
integer(c_size_t), value :: stride, n
real (c_double), dimension(*), intent (inout) :: data
integer(c_int) :: gsl_fft_complex_radix2_backward
end function gsl_fft_complex_radix2_backward
function gsl_fft_complex_radix2_forward(data, stride, n) bind(c)
import :: c_size_t, c_double, c_int
integer(c_size_t), value :: stride, n
real (c_double), dimension(*), intent (inout) :: data
integer(c_int) :: gsl_fft_complex_radix2_forward
end function gsl_fft_complex_radix2_forward
end interface
do i=1,2*nn,2
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i),data(i+1)
end do
print *, '-----'
result = gsl_fft_complex_radix2_forward (data, 1_c_size_t, nn)
write (*, '("GSL return code: ", I0)' ) result
do i=1,2*nn,2
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i),data(i+1)
end do
print *, '-----'
result = gsl_fft_complex_radix2_backward (data, 1_c_size_t, nn)
write (*, '("GSL return code: ", I0)' ) result
data = data/8
do i=1,2*nn,2
write(*,'("(", F20.15, ",", F20.15, "i )")') data(i),data(i+1)
end do
end program fftTest
Related
Related question Fortran: Which method is faster to change the rank of arrays? (Reshape vs. Pointer)
If I have a tensor contraction
A[a,b] * B[b,c,d] = C[a,c,d]
If I use BLAS, I think I need DGEMM (assume real values), then I can
first reshape tensor B[b,c,d] as D[b,e] where e = c*d,
DGEMM, A[a,b] * D[b,e] = E[a,e]
reshape E[a,e] into C[a,c,d]
The problem is, reshape is not that fast :( I saw discussions in Fortran: Which method is faster to change the rank of arrays? (Reshape vs. Pointer)
, in the above link, the author met some error messages, except reshape itself.
Thus, I am asking if there is a convenient solution.
[I have prefaced the size of dimensions with the letter n to avoid confusion in the below between the tensor and the size of the tensor]
As discussed in the comments there is no need to reshape. Dgemm has no concept of tensors, it only knows about arrays. All it cares about is that those arrays are laid out in the correct order in memory. As Fortran is column major if you use a 3 dimensional array to represent the 3 dimensional tensor B in the question it will be laid out exactly the same in memory as a 2 dimensional array used to represent the 2 dimensional tensor D. As far as the matrix mult is concerned all you need to do now is get the dot products which form the result to be the right length. This leads you to the conclusion that if you tell dgemm that B has a leading dim of nb, and a second dim of nc*nd you will get the right result. This leads us to
ian#eris:~/work/stack$ gfortran --version
GNU Fortran (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ian#eris:~/work/stack$ cat reshape.f90
Program reshape_for_blas
Use, Intrinsic :: iso_fortran_env, Only : wp => real64, li => int64
Implicit None
Real( wp ), Dimension( :, : ), Allocatable :: a
Real( wp ), Dimension( :, :, : ), Allocatable :: b
Real( wp ), Dimension( :, :, : ), Allocatable :: c1, c2
Real( wp ), Dimension( :, : ), Allocatable :: d
Real( wp ), Dimension( :, : ), Allocatable :: e
Integer :: na, nb, nc, nd, ne
Integer( li ) :: start, finish, rate
Write( *, * ) 'na, nb, nc, nd ?'
Read( *, * ) na, nb, nc, nd
ne = nc * nd
Allocate( a ( 1:na, 1:nb ) )
Allocate( b ( 1:nb, 1:nc, 1:nd ) )
Allocate( c1( 1:na, 1:nc, 1:nd ) )
Allocate( c2( 1:na, 1:nc, 1:nd ) )
Allocate( d ( 1:nb, 1:ne ) )
Allocate( e ( 1:na, 1:ne ) )
! Set up some data
Call Random_number( a )
Call Random_number( b )
! With reshapes
Call System_clock( start, rate )
d = Reshape( b, Shape( d ) )
Call dgemm( 'N', 'N', na, ne, nb, 1.0_wp, a, Size( a, Dim = 1 ), &
d, Size( d, Dim = 1 ), &
0.0_wp, e, Size( e, Dim = 1 ) )
c1 = Reshape( e, Shape( c1 ) )
Call System_clock( finish, rate )
Write( *, * ) 'Time for reshaping method ', Real( finish - start, wp ) / rate
! Direct
Call System_clock( start, rate )
Call dgemm( 'N', 'N', na, ne, nb, 1.0_wp, a , Size( a , Dim = 1 ), &
b , Size( b , Dim = 1 ), &
0.0_wp, c2, Size( c2, Dim = 1 ) )
Call System_clock( finish, rate )
Write( *, * ) 'Time for straight method ', Real( finish - start, wp ) / rate
Write( *, * ) 'Difference between result matrices ', Maxval( Abs( c1 - c2 ) )
End Program reshape_for_blas
ian#eris:~/work/stack$ cat in
40 50 60 70
ian#eris:~/work/stack$ gfortran -std=f2008 -Wall -Wextra -fcheck=all reshape.f90 -lblas
ian#eris:~/work/stack$ ./a.out < in
na, nb, nc, nd ?
Time for reshaping method 1.0515256000000001E-002
Time for straight method 5.8608790000000003E-003
Difference between result matrices 0.0000000000000000
ian#eris:~/work/stack$ gfortran -std=f2008 -Wall -Wextra reshape.f90 -lblas
ian#eris:~/work/stack$ ./a.out < in
na, nb, nc, nd ?
Time for reshaping method 1.3585931000000001E-002
Time for straight method 1.6730429999999999E-003
Difference between result matrices 0.0000000000000000
That said I think it worth noting though that the overhead for reshaping is O(N^2) while the time for the matrix multiply is O(N^3). Thus for large matrices the percentage overhead due to the reshape will tend to zero. Now code performance is not the only consideration, code readability and maintainability is also very important. So, if you find the reshape method much more readable and the matrices you use are sufficiently large that the overhead is not of import, you may well use the reshapes as in this case code readability might be more important than the performance. Your call.
I want to calculate D[a,d] = A[a,b,c] * B[b,c,d].
Method I: reshape A[a,b,c] => C1[a,e], B[b,c,d] => C2[e,d], e = b*c
Method II: directly call dgemm. This is a run-time error.
" na, nb, nc, nd ?
2 3 5 7
Time for reshaping method 2.447600000000000E-002
Intel MKL ERROR: Parameter 10 was incorrect on entry to DGEMM .
Time for straight method 1.838800000000000E-002
Difference between result matrices 5.46978468774136 "
Question: Can we contract two indexes together by BLAS?
The following only works for one index.
How to speed up reshape in higher rank tensor contraction by BLAS in Fortran?
Program reshape_for_blas
Use, Intrinsic :: iso_fortran_env, Only : wp => real64, li => int64
Implicit None
Real( wp ), Dimension( :, :, : ), Allocatable :: a
Real( wp ), Dimension( :, :, : ), Allocatable :: b
Real( wp ), Dimension( :, : ), Allocatable :: c1, c2
Real( wp ), Dimension( :, : ), Allocatable :: d
Real( wp ), Dimension( :, : ), Allocatable :: e
Integer :: na, nb, nc, nd, ne
Integer( li ) :: start, finish, rate
Write( *, * ) 'na, nb, nc, nd ?'
Read( *, * ) na, nb, nc, nd
ne = nb * nc
Allocate( a ( 1:na, 1:nb, 1:nc ) )
Allocate( b ( 1:nb, 1:nc, 1:nd ) )
Allocate( c1( 1:na, 1:ne ) )
Allocate( c2( 1:ne, 1:nd ) )
Allocate( d ( 1:na, 1:nd ) )
Allocate( e ( 1:na, 1:nd ) )
! Set up some data
Call Random_number( a )
Call Random_number( b )
! With reshapes
Call System_clock( start, rate )
c1 = Reshape( a, Shape( c1 ) )
c2 = Reshape( b, Shape( c2 ) )
Call dgemm( 'N', 'N', na, nd, ne, 1.0_wp, c1, Size( c1, Dim = 1 ), &
c2, Size( c2, Dim = 1 ), &
0.0_wp, e, Size( e, Dim = 1 ) )
Call System_clock( finish, rate )
Write( *, * ) 'Time for reshaping method ', Real( finish - start, wp ) / rate
! Direct
Call System_clock( start, rate )
Call dgemm( 'N', 'N', na, nd, ne, 1.0_wp, a , Size( a , Dim = 1 ), &
b , Size( b , Dim = 1 ), &
0.0_wp, d, Size( d, Dim = 1 ) )
Call System_clock( finish, rate )
Write( *, * ) 'Time for straight method ', Real( finish - start, wp ) / rate
Write( *, * ) 'Difference between result matrices ', Maxval( Abs( d - e ) )
End Program reshape_for_blas
I have the following Fortran code (modified on top of many answers from stack overflow..)
Program blas
integer, parameter :: dp = selected_real_kind(15, 307)
Real( dp ), Dimension( :, : ), Allocatable :: a
Real( dp ), Dimension( :, :, : ), Allocatable :: b
Real( dp ), Dimension( :, :, : ), Allocatable :: c1, c2
Integer :: na, nb, nc, nd, ne
Integer :: la, lb, lc, ld
Write( *, * ) 'na, nb, nc, nd ?'
Read( *, * ) na, nb, nc, nd
ne = nc * nd
Allocate( a ( 1:na, 1:nb ) )
Allocate( b ( 1:nb, 1:nc, 1:nd ) )
Allocate( c1( 1:na, 1:nc, 1:nd ) )
Allocate( c2( 1:na, 1:nc, 1:nd ) )
Call Random_number( a )
Call Random_number( b )
c1 = 0.0_dp
c2 = 0.0_dp
do ld = 1, nd
do lc = 1, nc
do lb = 1, nb
do la = 1, na
c1(la,lc,ld) = c1(la,lc,ld) + a(la,lb) * b(lb, lc, ld)
end do
end do
end do
end do
Call dgemm( 'N', 'N', na, ne, nb, 1.0_dp, a , Size( a , Dim = 1 ), &
b , Size( b , Dim = 1 ), &
0.0_dp, c2, Size( c2, Dim = 1 ) )
do la = 1, na
do lc = 1, nc
do ld = 1, nd
if ( dabs(c2(la,lc,ld) - c1(la,lc,ld)) > 1.e-6 ) then
write (*,*) '!!! c2', la,lc,ld, c2(la,lc,ld) - c1(la,lc,ld)
endif
enddo
enddo
enddo
End
(call it test.f90).
It works by gfortran -O3 test.f90 -L/opt/OpenBLAS/lib -lopenblas. Then, I tried to link gfortran to mkl, suggested by https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html
gfortran -O3 test.f90 -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -lmkl_gf_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm -ld. And I got
Intel MKL ERROR: Parameter 10 was incorrect on entry to DGEMM .
My question is, what's wrong with the parameter 10? and how to fix it? It seems if I use ifort with -mkl, the above problem does not appear.
You selected the ilp64 version of MKL. That means that integers, longs and pointers are 64-bit. But you are not using gfortran with 64-bit integers, the default in all compilers I know is 32-bit integers. Either you want a different version of MKL, like lp64, or you want to set up your gfortran to use 64-bit default integers. For the former, select the 32bit-integer interface layer in the Link Advisor.
See also https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models
I am new in Fortran programming so I need a help about allocatable arrays.
This is my simple code:
PROGRAM MY_SIMPLE_CODE
IMPLICIT NONE
INTEGER :: N_TMP, ALLOC_ERR, DEALLOC_ERR
REAL, ALLOCATABLE, DIMENSION(:) :: P_POT
WRITE( *,* ) "ENTER THE VALUE FOR N_TMP:"
READ( *,* ) N_TMP
IF ( .NOT. ALLOCATED( P_POT) ) ALLOCATE( P_POT( N_TMP), STATUS = ALLOC_ERR )
IF ( ALLOC_ERR .NE. 0 ) STOP( "ERROR - ALLOCATION P_POT !!!")
IF ( ALLOCATED( P_POT) ) DEALLOCATE( P_POT, STATUS = DEALLOC_ERR )
IF ( DEALLOC_ERR .NE. 0 ) STOP( "ERROR - DEALLOCATION P_POT !!!")
END PROGRAM MY_SIMPLE_CODE
When I cobuild this code I got this error message:
Allocate-object is neither a data pointer nor an allocatable variable
What is wrong with this code?
What kind of tricky stuff can be masked in this simple code?
IDE: Code::Blocks TDM_GCC_5 1 0
OS: Win 10 X64
Just like #Steve said in the comment, the keyword for the status of allocation/deallocation is STAT, not STATUS. The error comes because the compiler doesn't recognize the name and thinks it is a variable.
Moreover, there is a syntax error because there must be at least a space between the STOP statement and the opening brace (or no braces at all).
IF ( .NOT. ALLOCATED( P_POT) ) ALLOCATE( P_POT( N_TMP), STAT = ALLOC_ERR )
IF ( ALLOC_ERR .NE. 0 ) STOP "ERROR - ALLOCATION P_POT !!!"
!(...)
IF ( ALLOCATED( P_POT) ) DEALLOCATE( P_POT, STAT = DEALLOC_ERR )
IF ( DEALLOC_ERR .NE. 0 ) STOP "ERROR - DEALLOCATION P_POT !!!"
I am new to MPI programming with Fortran. I want to plot a 2D graph. I am trying to let each processor calculate one point of graph and send it to root to write it on file. Can somebody tell me how to send two variables viz: x and f(x) with mpi_gather. Thanks for any help.
Just as an example of both what Hristo said and " Is there anything wrong with passing an unallocated array to a routine without an explicit interface? " here's how you might do it
Program gather
Use mpi
Implicit None
Integer, Dimension( :, : ), Allocatable :: result
Integer, Dimension( 1:2 ) :: buffer
Integer :: me, nprocs, error
Integer :: x, fx
Call mpi_init( error )
Call mpi_comm_rank( mpi_comm_world, me , error )
Call mpi_comm_size( mpi_comm_world, nprocs, error )
If( me == 0 ) Then
Allocate( result( 1:2, 1:nprocs ) ) !Naughty - should check stat
Else
Allocate( result( 1:0, 1:0 ) ) !Naughty - should check stat
End If
x = me
fx = x * x
buffer( 1 ) = x
buffer( 2 ) = fx
Call mpi_gather( buffer, 2, mpi_integer, &
result, 2, mpi_integer, &
0, mpi_comm_world, error )
If( me == 0 ) Then
Write( *, '( 99999( i3, 1x ) )' ) result( 1, : )
Write( *, '( 99999( i3, 1x ) )' ) result( 2, : )
End If
Call mpi_finalize( error )
End Program gather
Wot now? mpif90 gather.f90
Wot now? mpirun -np 7 ./a.out
0 1 2 3 4 5 6
0 1 4 9 16 25 36