I am working with a very large Fortran code called sapick6.f90. If I compile it:
gfortran -O3 -o sapick6 sapick6.f90
and run it:
./sapick6
I always get exactly the same result (as expected). The code has random number subroutines, but the seed at the beginning is always the same, as the code reads its input parameters from a file.
Results are:
Mean window energy: 1.84019
Final window energy: 19.98531
Qe: SNR (Ursin): 18.53402
Rc: c_mean/c_mean_noise: -53.57054
e_tmean: 0.00000
Mean weighted az. (r>0.5): 19.71334
always.
Now I change the first line of this code, from
PROGRAM sapick6
USE nrtype
IMPLICIT NONE
INTEGER :: i,ii,iii,j,jj,ijk,k,n
INTEGER, PARAMETER :: mp = 5, np = 10
INTEGER, PARAMETER :: imax = 5001,jmax = 8,kmax = 251
INTEGER :: NTRACES, NSAMP
...code....
...code...
end PROGRAM
to simply:
SUBROUTINE sapick6
USE nrtype
IMPLICIT NONE
INTEGER :: i,ii,iii,j,jj,ijk,k,n
INTEGER, PARAMETER :: mp = 5, np = 10
INTEGER, PARAMETER :: imax = 5001,jmax = 8,kmax = 251
INTEGER :: NTRACES, NSAMP
...code....
...code...
end SUBROUTINE
Note that I only changed PROGRAM to SUBROUTINE. When I compile it as a shared library:
gfortran -shared -fPIC sapick6.f90 -o sapick6.so
and call it from Julia (first time):
julia> function callsapick()
t=ccall((:sapick6_,"./sapick6.so"),Void,(),)
return
end
callsapick (generic function with 1 method)
julia> callsapick()
Mean window energy: 1.84019
Final window energy: 19.98531
Qe: SNR (Ursin): 18.53402
Rc: c_mean/c_mean_noise: -53.57054
e_tmean: 0.00000
Mean weighted az. (r>0.5): 19.71334
and then call it from Julia again (second time):
julia> callsapick()
signal (11): Violación de segmento
while loading no file, in expression starting on line 0
sapick6_ at ./sapick6.so (unknown line)
callsapick at ./REPL[1]:2
unknown function (ip: 0x7f8ec525e4af)
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
do_call at /home/centos/buildbot/slave/package_tarball64/build/src/interpreter.c:75
eval at /home/centos/buildbot/slave/package_tarball64/build/src/interpreter.c:242
jl_interpret_toplevel_expr at /home/centos/buildbot/slave/package_tarball64/build/src/interpreter.c:34
jl_toplevel_eval_flex at /home/centos/buildbot/slave/package_tarball64/build/src/toplevel.c:577
jl_toplevel_eval_in at /home/centos/buildbot/slave/package_tarball64/build/src/builtins.c:496
eval at ./boot.jl:235
unknown function (ip: 0x7f8ed988439f)
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
eval_user_input at ./REPL.jl:66
unknown function (ip: 0x7f8ed98f21cf)
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
macro expansion at ./REPL.jl:97 [inlined]
#1 at ./event.jl:73
unknown function (ip: 0x7f8ec52572af)
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1933
jl_apply at /home/centos/buildbot/slave/package_tarball64/build/src/julia.h:1424 [inlined]
start_task at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:267
unknown function (ip: 0xffffffffffffffff)
Allocations: 1178414 (Pool: 1177210; Big: 1204); GC: 0
Mean window energy: 1.20471
Final window energy: 20.11686
Qe: SNR (Ursin): 2.15156
Rc: c_mean/c_mean_noise: -32.36376
e_tmean: 0.00000
Mean weighted az. (r>0.5): 6.55283
So I get a different result and, even worse, a segmentation fault!
What is going on?
Related
I have to fill a large array of size (150000,35) with random numbers drawn from the normal distribution. I am using r8mat_normal_ab from the r8lib library (https://people.math.sc.edu/Burkardt/f_src/r8lib/r8lib.html and https://pastebin.com/0pEkZfYp). For some reason, a segmentation fault occurs if the size of the first dimension of the array is greater than about 27,000. I am using the Intel compiler. This is the failing code:
program test2
integer, parameter :: dp=kind(0.d0)
integer, parameter :: n_sim = 30000
integer, parameter :: nt = 35
real(dp) :: shockARWmat(n_sim,nt+1)
real(dp) :: sigmaARW=sqrt(.0344705)
integer :: seed_sim=234567, i
call r8mat_normal_ab (n_sim, nt+1, 0., sigmaARW, seed_sim, shockARWmat)
print*, 'success !!'
end program
If I set n_sim = 27000 or lower, it works fine. Is there a compiler option I am missing for dealing with large matrices? Setting -mcmodel=large did not help.
I am trying to call a subroutine in a loop. This subroutine has a local coarray. Following is the code that I am using:
! Test local coarray in procedure called in a loop.
!
program main
use, intrinsic :: iso_fortran_env, only : input_unit, output_unit, error_unit
implicit none
! Variable declaration.
integer :: me, ti
integer :: GHOST_WIDTH, TSTART, TSTEPS
sync all
! Initialize.
GHOST_WIDTH = 1
TSTART = 0
TSTEPS = 100000
me = this_image()
! Iterate.
do ti = TSTART + 1, TSTART + TSTEPS
call Aldeal( GHOST_WIDTH )
if ( me == 1 ) write( output_unit, * ) ti
end do
if ( me == 1 ) write( output_unit, * ) "All done!"
contains
subroutine Aldeal( width )
integer, intent(in) :: width
integer, allocatable, codimension[:] :: shell1_Co, shell2_Co, shell3_Co
allocate( shell1_Co[*], shell2_Co[*], shell3_Co[*] )
deallocate( shell1_Co, shell2_Co, shell3_Co )
return
end subroutine Aldeal
end program main
Right now the subroutine does nothing other than allocate the local coarrays and deallocate them. But even so, the program throws the following error after some iterations:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
In coarray image 1
Image PC Routine Line Source
coarray_main 0000000000406063 Unknown Unknown Unknown
libpthread-2.17.s 00007F21D8B845F0 Unknown Unknown Unknown
libicaf.so 00007F21D90970D5 for_rtl_ICAF_CO_D Unknown Unknown
coarray_main 0000000000405054 main_IP_aldeal_ 37 coarray_main.f90
coarray_main 0000000000404AEC MAIN__ 23 coarray_main.f90
coarray_main 0000000000404A22 Unknown Unknown Unknown
libc-2.17.so 00007F21D85C5505 __libc_start_main Unknown Unknown
coarray_main 0000000000404929 Unknown Unknown Unknown
Abort(0) on node 0 (rank 0 in comm 496): application called MPI_Abort(comm=0x84000003, 0) - process 0
And the same error is repeated for other images as well.
Line 23 is call Aldeal( GHOST_WIDTH ) inside the do loop of the main program. And line 37 corresponds to deallocate( shell1_Co, shell2_Co, shell3_Co ) statement in the subroutine.
Additionally, if I remove the deallocate statement from the subroutine, it throws the same error, but the line numbers in the error message are then 23 and 39. Line 39 corresponds to the end subroutine Aldeal statement.
I am not able to understand what exactly I am doing wrong. Please help.
P.S. I am using Centos 7 with Intel(R) Parallel Studio XE 2019 Update 4 for Linux.
Observations:
If I modify the code to have a derived-type with an allocatable component and use that to create the coarray in the subroutine, the code runs a little longer but eventually aborts with an error. Following is the modification:
module mod_coarray_error
implicit none
type :: int_t
integer, allocatable, dimension(:) :: var
end type int_t
contains
subroutine Aldeal_type( width )
integer, intent(in) :: width
type(int_t), allocatable, codimension[:] :: int_t_Co
allocate( int_t_Co[*] )
allocate( int_t_Co%var(width) )
sync all
! deallocate( int_t_Co%var )
deallocate( int_t_Co )
return
end subroutine Aldeal_type
end module mod_coarray_error
program main
use, intrinsic :: iso_fortran_env, only : input_unit, output_unit, error_unit
use :: mod_coarray_error
implicit none
! Variable declaration.
integer :: me, ti
integer :: GHOST_WIDTH, TSTART, TSTEPS, SAVET
sync all
! Initialize.
GHOST_WIDTH = 3
TSTART = 0
TSTEPS = 100000
SAVET = 1000
me = this_image()
! Iterate.
do ti = TSTART + 1, TSTART + TSTEPS
sync all
call Aldeal_type( GHOST_WIDTH )
if ( mod( ti, SAVET ) == 0 ) then
if ( me == 1 ) write( output_unit, * ) ti
end if
end do
sync all
if ( me == 1 ) write( output_unit, * ) "All done!"
end program main
Additionally, this code runs fine to the end when compiled on Windows.
Now if I add the compiler option -heap-arrays 0, the code seems to run to the end even on Linux.
I tried increasing the number of loop iterations, i.e., TSTEPS in the code, to 1e7. Even then, it runs successfully to the end. But I observe the following effects:
The code gets slower as the loop count increases, i.e., it takes more time to run from ti = 1e6 to ti = 2e6 than from ti = 1 to ti = 1e6.
Memory used by the program keeps increasing, i.e., each image, which consumes 2GB at the start of the run, consumes 3.5GB at ti = 2e6, 4.7GB at ti = 4e6, and 6GB at ti = 6e6.
Memory used by the program is lower when run on Windows, but it still keeps increasing with the loop count. E.g., each image, which consumes 100MB at the start, consumes 1.5GB at ti = 2e6, 2.5GB at ti = 4e6, and 3.5GB at ti = 6e6.
Using the compiler option /heap-arrays0 on Windows has no effect either on the run (as it was already running successfully without it) or on the amount of memory consumed while running.
The original code posted in the question still throws an error even when compiled with the above compiler option. It does not seem to run on Windows either.
Ultimately, I am still confused as to what is happening.
P.S. I posted the question in Intel forum but have not received any response yet.
I am trying to write a function that extracts a specified line from a given file. My function to do so takes two arguments:
fUnit: this is the numerical identifier of the given file.
fLine: this is the line number that I'd like to extract. If the value of this input is -1, then the function will return the last line of the file (in my work, this is the functionality I need the most).
I have wrapped this function inside a module (routines.f95), as shown:
module routines
contains
function getLine(fUnit, fLine)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Get the nth line of a file. It is assumed that the file is !
! numerical only. The first argument is the unit number of the !
! file, and the second number is the line number. If -1 is !
! passed to the second argument, then the program returns the !
! final line of the file. It is further assumed that each    !
! line of the file contains two elements. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
implicit none
integer, intent(in) :: fUnit, fLine
integer :: i
real, dimension(2) :: tmp, getLine
if (fline .eq. -1) then
do
read(fUnit, *, end=10) tmp
end do
else
do i = 1, fLine
read(fUnit, *, end=10) tmp
end do
end if
10 getLine = tmp
end function getLine
end module routines
To test this function, I set up the following main program (test.f95):
program test
use routines
implicit none
integer :: i
real, dimension(2) :: line
open(21, file = 'data.dat')
do i = 1, 5
line = getLine(21, i)
write(*, *) i, line
end do
close(21)
end program test
The file data.dat contains the following information:
1.0 1.00
2.0 0.50
3.0 0.33
4.0 0.25
5.0 0.20
This code is a simplified version of the one I've written, but it reflects all the errors I obtain in my primary code. When I compile the above code with the commands
gfortran -c routines.f95
gfortran -c test.f95
gfortran -o test test.o routines.o
I do not obtain any syntax errors. The output of the program gives the following:
1 1.00000000 1.00000000
2 3.00000000 0.330000013
3 5.00000000 0.200000003
At line 28 of file routines.f95 (unit = 21, file = 'data.dat')
Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE
Error termination. Backtrace:
#0 0x7f2425ea15cd in ???
#1 0x7f2425ea2115 in ???
#2 0x7f2425ea287a in ???
#3 0x7f242601294b in ???
#4 0x400ccb in ???
#5 0x4009f0 in ???
#6 0x400b32 in ???
#7 0x7f2425347f49 in ???
#8 0x400869 in ???
at ../sysdeps/x86_64/start.S:120
#9 0xffffffffffffffff in ???
I understand that the error is thrown because the program tries to read a line past the EOF marker. This happens because the program is skipping every other line, and thus skips over the last line of the file.
Could someone please help me to understand why my program is skipping every other line of the input file? I am unable to find the issue in my code.
The position of a connected external file is global state. In this case, the function getLine changes the position of the file as it searches. The next time the function is called, searching commences from the position where the file was left.
What you see, then, is not so much "skipping" of lines, but:
in the first iteration, the first line is read;
in the second iteration, a line (the second) is skipped, then a line (the third) is read;
in the third iteration, two lines are skipped and a third is attempted to be read.
However, the third line in the third iteration (the sixth of the file) is after an end-of-file condition. You see the result of reading the fifth line.
To enable seeking as you desire it, ensure that you position the file at its initial point before skipping lines. The rewind statement places a connected file at its initial position.
Instead of rewinding, you may close the file and re-open with position='rewind' to ensure it is positioned at its initial point, but the rewind statement is a better way to reposition. If you re-open without a position= specifier you see an effect similar to position='asis'. This leaves the position in the file unspecified by the Fortran standard.
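A minimal sketch of the repositioning described above, assuming a scratch file stands in for data.dat (the program name rewind_sketch and the variables u, n and ios are illustrative and not part of the original code). Every lookup rewinds first, so the n-th read always counts from the start of the file:

```fortran
program rewind_sketch
  implicit none
  integer :: u, i, n, ios
  real :: tmp(2)

  ! A scratch file stands in for data.dat (assumption for illustration).
  open(newunit=u, status='scratch', action='readwrite')
  do i = 1, 5
    write(u, *) real(i), 1.0 / real(i)
  end do

  ! Look up lines 1, 3 and 5; rewind before every search so that
  ! counting always starts at the initial point of the file.
  do n = 1, 5, 2
    rewind(u)
    do i = 1, n
      read(u, *, iostat=ios) tmp
      if (ios /= 0) exit      ! end of file or read error: stop early
    end do
    if (ios == 0) print *, n, tmp
  end do

  close(u)
end program rewind_sketch
```

Using iostat= here instead of an end= label is only a stylistic choice in this sketch; the essential step is the rewind before each search.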
After the help from @francescalus, I can answer my own question. The issue with my code was that each time my main program called the function, reading resumed from the position where the previous call had left the file. Because of this, my program skipped lines. Here is my updated code:
test.f95
program test
use routines
implicit none
integer :: i
real, dimension(2) :: line
open(21, file = 'data.dat')
do i = 1, 5
line = getLine(21, i)
write(*, *) i, line
end do
line = getLine(21, -1)
write(*, *) -1, line
close(21)
end program test
routines.f95
module routines
contains
function getLine(fUnit, fLine)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! Get the nth line of a file. It is assumed that the file is !
! numerical only. The first argument is the unit number of the !
! file, and the second number is the line number. If -1 is !
! passed to the second argument, then the program returns the !
! final line of the file. It is further assumed that each    !
! line of the file contains two elements. !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
implicit none
integer, intent(in) :: fUnit, fLine
integer :: i
real, dimension(2) :: tmp, getLine
rewind(fUnit)
if (fline .eq. -1) then
do
read(fUnit, *, end=10) tmp
end do
else
do i = 1, fLine
read(fUnit, *, end=10) tmp
end do
end if
10 getLine = tmp
end function getLine
end module routines
data.dat
1.0 1.00
2.0 0.50
3.0 0.33
4.0 0.25
5.0 0.20
Compile with
gfortran -c routines.f95
gfortran -c test.f95
gfortran -o test test.o routines.o
The output of this program is
1 1.00000000 1.00000000
2 2.00000000 0.500000000
3 3.00000000 0.330000013
4 4.00000000 0.250000000
5 5.00000000 0.200000003
-1 5.00000000 0.200000003
Hey I am trying to get my LAPACK libraries to work and I have searched and searched but I can't seem to figure out what I am doing wrong.
I try running my code, and I get the following error
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7FFB23D405F7
#1 0x7FFB23D40C3E
#2 0x7FFB23692EAF
#3 0x401ED1 in sgesv_
#4 0x401D0B in MAIN__ at CFDtest.f03:?
Segmentation fault (core dumped)
I will paste my main code here, hopefully someone can help me with this problem.
****************************************************
PROGRAM CFD_TEST
USE MY_LIB
IMPLICIT DOUBLE PRECISION (A-H,O-Z)
DIMENSION ET(0:10), VN(0:10), WT(0:10)
DIMENSION SO(0:10), FU(0:10), DMA(0:10,0:10)
DIMENSION DMA2(0:10,0:10), QN(0:10), WKSPCE(0:10)
INTEGER*8 :: pivot(10), inf
INTEGER*8 :: N
EXTERNAL SGESV
!SET THE PARAMETERS
SIGMA1 = 0.D0
SIGMA2 = 0.D0
TAU = 1.D0
EF = 1.D0
EXP = 2.71828182845904509D0
COST = EXP/(1.D0+EXP*EXP)
DO 1 N=2, 10
!COMPUTATION OF THE NODES, WEIGHTS AND DERIVATIVE MATRIX
CALL ZELEGL(N,ET,VN)
CALL WELEGL(N,ET,VN,WT)
CALL DMLEGL(N,10,ET,VN,DMA)
!CONSTRUCTION OF THE MATRIX CORRESPONDING TO THE
!DIFFERENTIAL OPERATOR
DO 2 I=0, N
DO 2 J=0, N
SUM = 0.D0
DO 3 K=0, N
SUM = SUM + DMA(I,K)*DMA(K,J)
3 CONTINUE
OPER = -SUM
IF(I .EQ. J) OPER = -SUM + TAU
DMA2(I,J) = OPER
2 CONTINUE
!CHANGE OF THE ENTRIES OF THE MATRIX ACCORDING TO THE
!BOUNDARY CONDITIONS
DO 4 J=0, N
DMA2(0,J) = 0.D0
DMA2(N,J) = 0.D0
4 CONTINUE
DMA2(0,0) = 1.D0
DMA2(N,N) = 1.D0
!CONSTRUCTION OF THE RIGHT-HAND SIDE VECTOR
DO 5 I=1, N-1
FU(I) = EF
5 CONTINUE
FU(0) = SIGMA1
FU(N) = SIGMA2
!SOLUTION OF THE LINEAR SYSTEM
N1 = N + 1
CALL SGESV(N,N,DMA2,pivot,FU,N,inf)
DO 6 I = 0, N
FU(I) = SO(I)
6 CONTINUE
PRINT *, pivot
1 CONTINUE
RETURN
END PROGRAM CFD_TEST
*****************************************************
The commands I run to compile are
gfortran -c MY_LIB.f03
gfortran -c CFDtest.f03
gfortran MY_LIB.o CFDtest.o -o CFDtest -L/usr/local/lib -llapack -lblas
I ran the command
-fbacktrace -g -Wall -Wextra CFDtest
CFDtest: In function `_fini':
(.fini+0x0): multiple definition of `_fini'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o:/build/buildd/glibc-2.19/csu/../sysdeps/x86_64/crti.S:80: first defined here
CFDtest: In function `data_start':
(.data+0x0): multiple definition of `data_start'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o:(.data+0x0): first defined here
CFDtest: In function `data_start':
(.data+0x8): multiple definition of `__dso_handle'
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtbegin.o:(.data+0x0): first defined here
CFDtest:(.rodata+0x0): multiple definition of `_IO_stdin_used'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0): first defined here
CFDtest: In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crt1.o:(.text+0x0): first defined here
CFDtest: In function `_init':
(.init+0x0): multiple definition of `_init'
/usr/lib/gcc/x86_64-linux-gnu/4.9/../../../x86_64-linux-gnu/crti.o:/build/buildd/glibc-2.19/csu/../sysdeps/x86_64/crti.S:64: first defined here
/usr/lib/gcc/x86_64-linux-gnu/4.9/crtend.o:(.tm_clone_table+0x0): multiple definition of `__TMC_END'
CFDtest:(.data+0x10): first defined here
/usr/bin/ld: error in CFDtest(.eh_frame); no .eh_frame_hdr table will be created.
collect2: error: ld returned 1 exit status
You haven't posted your code for MY_LIB.f03 so we cannot compile CFDtest.f03 exactly as you have supplied it.
(As an aside, the usual naming convention is that the f90 in a .f90 file name is not supposed to imply the language version being targeted. Rather, .f90 denotes free format while .f is used for fixed format. By extension, your .f03 files would be better (i.e., more portable) if named .f90.)
I commented out the USE MY_LIB line and ran your code through nagfor -u -c cfd_test.f90. The output, broken down, is
Extension: cfd_test.f90, line 13: Byte count on numeric data type
detected at *#8
Extension: cfd_test.f90, line 15: Byte count on numeric data type
detected at *#8
Byte counts are not portable. The kind value for an 8-byte integer is selected_int_kind(18). (Similarly you might like to use a kind(0.0d0) kind value for your double precision data.)
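As a small sketch of the portable kind selection mentioned above (the program name kind_sketch and the parameter name i8 are illustrative; wp follows the name used later in this answer), the kind values can be defined once as named constants and reused in every declaration:

```fortran
program kind_sketch
  implicit none
  ! Integer kind able to hold at least 18 decimal digits
  ! (an 8-byte integer on common platforms).
  integer, parameter :: i8 = selected_int_kind(18)
  ! Kind of a double precision literal.
  integer, parameter :: wp = kind(0.0d0)
  integer(kind=i8) :: big
  real(kind=wp)    :: x

  big = huge(big)
  x = 1.0_wp / 3.0_wp
  print *, 'integer range (decimal digits):  ', range(big)
  print *, 'real precision (decimal digits): ', precision(x)
  print *, 'values:', big, x
end program kind_sketch
```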
Error: cfd_test.f90, line 48: Implicit type for I
detected at 2#I
Error: cfd_test.f90, line 50: Implicit type for J
detected at 2#J
Error: cfd_test.f90, line 54: Implicit type for K
detected at 3#K
Error: cfd_test.f90, line 100: Implicit type for N1
detected at N1#=
You have these implicitly typed, which implies they are 4-byte (default) integers. You should probably declare these explicitly as 8-byte integers (using the 8-byte integer kind value above) if that's what you intend.
Questionable: cfd_test.f90, line 116: Variable COST set but never referenced
Questionable: cfd_test.f90, line 116: Variable N1 set but never referenced
Warning: cfd_test.f90, line 116: Unused local variable QN
Warning: cfd_test.f90, line 116: Unused local variable WKSPCE
You need to decide what you intend to do with these, or whether they are just deletable cruft.
With the implicit integers declared explicitly, there is further output
Warning: cfd_test.f90, line 116: Variable SO referenced but never set
This looks bad.
Obsolescent: cfd_test.f90, line 66: 2 is a shared DO termination label
Your DO loops would probably be better using the modern END DO terminators (not shared!)
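For illustration only (the array a and the program name enddo_sketch are made up, not taken from the posted program), a nested loop written with one END DO per loop rather than a shared labelled CONTINUE looks like this:

```fortran
program enddo_sketch
  implicit none
  integer :: i, j
  integer :: a(0:2, 0:2)

  ! Each loop is closed by its own END DO instead of both loops
  ! terminating on one shared statement label.
  do i = 0, 2
    do j = 0, 2
      a(i, j) = 10*i + j
    end do
  end do

  print *, a
end program enddo_sketch
```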
Error: cfd_test.f90, line 114: RETURN is only allowed in SUBROUTINEs and FUNCTIONs
This is obviously easy to fix.
For the LAPACK call, one source of explicit interfaces for these routines is the NAG Fortran Library (through the nag_library module). Since your real data is not single precision, you should be using dgesv instead of sgesv. Adding USE nag_library, ONLY: dgesv and switching to call dgesv instead of sgesv, then recompiling as above, reveals
Incorrect data type INTEGER(KIND=4) (expected INTEGER) for argument N (no. 1) of DGESV
so you should indeed be using default (4-byte) integers, at least for the LAPACK build on your system, which will almost certainly be using 4-byte integers. Thus you might want to forget all about kinding your integers and just use the default integer type for all. Correcting this gives
Array supplied for scalar argument LDA (no. 4) of DGESV
so you do need to add this argument. Maybe pass size(DMA2,1)?
With this argument added to the call the code compiles successfully, but without the definitions of your *LEGL routines I couldn't do any run-time testing.
Here is my modified (and pretty-printed) version of your program:
Program cfd_test
! Use my_lib
! Use nag_library, Only: dgesv
Implicit None
Integer, Parameter :: wp = kind(0.0D0)
Real (Kind=wp) :: ef, oper, sigma1, sigma2, sum, tau
Integer :: i, inf, j, k, n
Real (Kind=wp) :: dma(0:10, 0:10), dma2(0:10, 0:10), et(0:10), fu(0:10), &
so(0:10), vn(0:10), wt(0:10)
Integer :: pivot(10)
External :: dgesv, dmlegl, welegl, zelegl
Intrinsic :: kind, size
! SET THE PARAMETERS
sigma1 = 0._wp
sigma2 = 0._wp
tau = 1._wp
ef = 1._wp
Do n = 2, 10
! COMPUTATION OF THE NODES, WEIGHTS AND DERIVATIVE MATRIX
Call zelegl(n, et, vn)
Call welegl(n, et, vn, wt)
Call dmlegl(n, 10, et, vn, dma)
! CONSTRUCTION OF THE MATRIX CORRESPONDING TO THE
! DIFFERENTIAL OPERATOR
Do i = 0, n
Do j = 0, n
sum = 0._wp
Do k = 0, n
sum = sum + dma(i, k)*dma(k, j)
End Do
oper = -sum
If (i==j) oper = -sum + tau
dma2(i, j) = oper
End Do
End Do
! CHANGE OF THE ENTRIES OF THE MATRIX ACCORDING TO THE
! BOUNDARY CONDITIONS
Do j = 0, n
dma2(0, j) = 0._wp
dma2(n, j) = 0._wp
End Do
dma2(0, 0) = 1._wp
dma2(n, n) = 1._wp
! CONSTRUCTION OF THE RIGHT-HAND SIDE VECTOR
Do i = 1, n - 1
fu(i) = ef
End Do
fu(0) = sigma1
fu(n) = sigma2
! SOLUTION OF THE LINEAR SYSTEM
Call dgesv(n, n, dma2, size(dma2,1), pivot, fu, n, inf)
Do i = 0, n
fu(i) = so(i)
End Do
Print *, pivot
End Do
End Program
In general your development experience will be the most pleasant if you use as good a checking compiler as you can get your hands on and if you make sure you ask it to diagnose as much as it can for you.
As far as I can tell, there could be a number of problems:
Your INTEGER*8 integers might be too long; INTEGER*4 or simply INTEGER would probably be better.
You call SGESV on double precision arguments instead of DGESV.
Your LDA argument is missing. DGESV expects the arguments (N, NRHS, A, LDA, IPIV, B, LDB, INFO), so your code should perhaps look like CALL DGESV(N,N,DMA2,N,pivot,FU,N,inf), but you need to check whether these values are what you want.
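To make the argument order concrete, here is a minimal sketch of a DGESV call with a single right-hand side. It is not the poster's program: the 3x3 diagonal system, the program name dgesv_sketch and the variable names are assumptions for illustration, and it has to be linked against LAPACK (for example with -llapack -lblas, as in the question's link line). The point it shows is that LDA and LDB are the declared leading dimensions of the arrays, not the order of the system being solved:

```fortran
program dgesv_sketch
  implicit none
  integer, parameter :: wp = kind(0.0d0)
  integer, parameter :: lda = 11   ! declared leading dimension, as for DMA2(0:10,0:10)
  integer, parameter :: n = 3      ! order of the system actually solved (illustrative)
  real(wp) :: a(lda, lda), b(lda)
  integer  :: ipiv(lda), info      ! default integers, matching a typical LAPACK build
  external :: dgesv

  ! Diagonal test system: 2*x1 = 2, 4*x2 = 8, 8*x3 = 24.
  a = 0.0_wp
  a(1, 1) = 2.0_wp
  a(2, 2) = 4.0_wp
  a(3, 3) = 8.0_wp
  b = 0.0_wp
  b(1:n) = [2.0_wp, 8.0_wp, 24.0_wp]

  ! Arguments: N, NRHS, A, LDA, IPIV, B, LDB, INFO.
  call dgesv(n, 1, a, lda, ipiv, b, lda, info)

  if (info == 0) then
    print *, 'solution:', b(1:n)   ! the solution overwrites b
  else
    print *, 'dgesv failed, info =', info
  end if
end program dgesv_sketch
```

In the posted program the matrix spans indices 0 to N, so the order to pass would presumably be N + 1 (the otherwise unused N1) with one right-hand side, but that is for the author to confirm.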
I would like to do this:
program main
implicit none
integer l
integer, allocatable, dimension(:) :: array
allocate(array(10))
array = 0
!$omp parallel do private(array)
do l = 1, 10
array(l) = l
enddo
!$omp end parallel do
print *, array
deallocate(array)
end
But I am running into error messages:
*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00007fff25d05a40 ***
This seems to be a bug in ifort according to some discussions on the Intel forums, but it should be resolved in the version I am using (11.1.073 - Linux). This is a MASSIVELY downscaled version of my code! Unfortunately, I cannot use static arrays as a workaround.
If I put the print into the loop, I get other errors:
*** glibc detected *** ./a.out: double free or corruption (out): 0x00002b22a0c016f0 ***
I didn't get the errors you're getting, but you have an issue with privatizing array in your OpenMP call.
[mjswartz@666-lgn testfiles]$ vi array.f90
[mjswartz@666-lgn testfiles]$ ifort -o array array.f90 -openmp
[mjswartz@666-lgn testfiles]$ ./array
0 0 0 0 0 0
0 0 0 0
[mjswartz@666-lgn testfiles]$ vi array.f90
[mjswartz@666-lgn testfiles]$ ifort -o array array.f90 -openmp
[mjswartz@666-lgn testfiles]$ ./array
1 2 3 4 5 6
7 8 9 10
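The first run is with private(array), the second without. With private(array), each thread works on its own copy of the array, so the writes inside the loop do not reach the original array that is printed afterwards.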
program main
implicit none
integer l
integer, allocatable, dimension(:) :: array
allocate(array(10))
!$omp parallel do
do l = 1, 10
array(l) = l
enddo
print*, array
deallocate(array)
end program main
I just ran your code with ifort and openmp and it spewed 0d0's. I had to manually quit the execution. What is your expected output? I'm not a big fan of unnecessarily dynamically allocating arrays. You know what you're going to allocate your matrices as, so just make parameters and statically do it. I'll mess with some stuff and edit this response in a few.
Ok, so here's my edits:
program main
implicit none
integer :: l, j
integer, parameter :: lmax = 15e3
integer, parameter :: jmax = 25
integer, parameter :: nk = 300
complex*16, dimension(9*nk) :: x0, xin, xout
complex*16, dimension(lmax) :: e_pump, e_probe
complex*16 :: e_pumphlp, e_probehlp
character*25 :: problemtype
real*8 :: m
! OpenMP variables
integer :: myid, nthreads, omp_get_num_threads, omp_get_thread_num
x0 = 0.0d0
problemtype = 'type1'
if (problemtype .ne. 'type1') then
write(*,*) 'Problem type not specified. Quitting'
stop
else
! Spawn a parallel region; myid must be private to each thread
!$omp parallel private(myid)
myid = omp_get_thread_num()
if (myid .eq. 0) then
nthreads = omp_get_num_threads()
write(*,*) 'Starting program with', nthreads, 'threads'
endif
!$omp do private(j,l,m,e_pumphlp,e_probehlp,e_pump,e_probe)
do j = 1, jmax - 1
do l = 1, lmax
call electricfield(0.0d0, 0.0d0, e_pumphlp, &
e_probehlp, 0.0d0)
! print *, e_pumphlp, e_probehlp
e_pump(l) = e_pumphlp
e_probe(l) = e_probehlp
print *, e_pump(l), e_probe(l)
end do
end do
!$omp end parallel
end if
end program main
Notice I removed your use of a module since it was unnecessary. You have an external module containing a subroutine, so just make it an external subroutine. Also, I changed your matrices to be statically allocated. Case statements are a fancy and expensive version of if statements. You were casing 15e3*25 times rather than once (expensive), so I moved those outside. I changed the OpenMP calls, but only semantically. I gave you some output so that you know what OpenMP is actually doing.
Here is the new subroutine:
subroutine electricfield(t, tdelay, e_pump, e_probe, phase)
implicit none
real*8, intent(in) :: t, tdelay
complex*16, intent(out) :: e_pump, e_probe
real*8, optional, intent (in) :: phase
e_pump = 0.0d0
e_probe = 0.0d0
return
end subroutine electricfield
I just removed the module shell around it and changed some of your variable names. Fortran is not case sensitive, so don't torture yourself by doing caps and having to repeat it throughout.
I compiled this with
ifort -o diffeq diffeq.f90 electricfield.f90 -openmp
and ran with
./diffeq > output
to catch the program vomiting 0's and to see how many threads I was using:
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
Starting program with 32 threads
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
(0.000000000000000E+000,0.000000000000000E+000)
Hope this helps!
It would appear that you are running into a compiler bug associated with the implementation of OpenMP 3.0.
If you can't update your compiler, then you will need to change your approach. There are a few options. For example, you could make the allocatable arrays shared, increase their rank by one, and have one thread allocate them such that the extent of the additional dimension is the number of workers in the team. All subsequent references to those arrays then need the subscript for that additional rank to be the OpenMP thread number (+ 1, depending on what you've used for the lower bound).
Explicit allocation of the private allocatable arrays inside the parallel construct (only) may also be an option.
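A minimal sketch of the first option (a shared array with one extra dimension indexed by thread number), under the assumption that the work space is a small integer array as in the question. The names scratch, me and nworkers are made up for this example. It needs OpenMP enabled at compile time (for example -openmp with the ifort version used above, or -fopenmp with gfortran):

```fortran
program shared_scratch_sketch
  use omp_lib, only: omp_get_max_threads, omp_get_thread_num
  implicit none
  integer, allocatable :: scratch(:, :)
  integer :: l, me, nworkers

  ! One thread (here the initial thread, before the parallel region)
  ! allocates one column of work space per potential worker.
  nworkers = omp_get_max_threads()
  allocate(scratch(10, nworkers))
  scratch = 0

  !$omp parallel do private(me) shared(scratch)
  do l = 1, 10
    me = omp_get_thread_num() + 1   ! thread numbers start at 0, columns at 1
    scratch(l, me) = l              ! each thread touches only its own column
  end do
  !$omp end parallel do

  ! Each row was written by exactly one thread, so summing over the
  ! thread dimension recovers the per-row values.
  print *, sum(scratch, dim=2)
  deallocate(scratch)
end program shared_scratch_sketch
```

Because the array is allocated and deallocated outside the parallel region, no thread ever allocates or frees a privatized allocatable, which is the operation the error messages in the question point at.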