This is my code:
Program Dynamic_Array
Use Variables
Use Allocation_Module
Use Dealloaction_Module
Implicit none
Call Subroutine_0
Call Subroutine_1
End Program Dynamic_Array
Module Variables
Implicit none
Integer :: i , k
Integer , parameter :: Br_sn_cvo = 10
Integer , parameter :: Br_nn_mre = 7
Integer , parameter , dimension ( Br_nn_mre) :: Br_nn_cvo = [ 7 , 6 , 5 , 4 , 3 , 2 , 1 ]
Integer , dimension ( Br_nn_mre ) :: i_nn_dm_1 , i_nn_dm_2
type :: my_type
integer, allocatable :: my_size(:)
end type my_type
type(my_type), allocatable :: dS_sn(:)
End Module Variables
Module Allocation_Module
Use Variables
Implicit none
Contains
Subroutine Subroutine_0
Allocate(dS_sn(Br_nn_mre))
Loop_1: Do k = 1, Br_nn_mre
i_nn_dm_1(k) = Br_sn_cvo + Br_nn_mre + 1 + Br_nn_cvo(k) * ( k - 1 )
i_nn_dm_2(k) = Br_sn_cvo + Br_nn_mre + k * Br_nn_cvo(k)
Allocate( dS_sn(k)%my_size( i_nn_dm_1(k) : i_nn_dm_2(k)) )
Loop_2: Do i = i_nn_dm_1(k) , i_nn_dm_2(k)
dS_sn(k)%my_size(i) = i + k
End Do Loop_2
End do loop_1
End subroutine Subroutine_0
End Module Allocation_Module
Module Dealloaction_Module
Use Variables
Implicit none
Contains
Subroutine Subroutine_1
Do k = 1 , Br_nn_mre
Deallocate(dS_sn(k)%my_size)
End do
Deallocate(dS_sn)
Return
End Subroutine Subroutine_1
End Module Dealloaction_Module
I am not experienced programer in Fortran so I need to ask a few questions about process of memory allocation and deallocation for a dynamic arrays. Is there any problem with memory leak in this code?
Is this correct way for a memory allocation in separate module?
Is this correct way for a memory deallocation in separate module?
Here is an example...
...
IF(ALLOCATED(TheArray)) THEN
IF(SIZE(TheArray) /= The_Size_I_need) DEALLOCATE(TheArray)
ENDIF
IF(.NOT. ALLOCATED(TheArray)) ALLOCATE(TheArray(The_Size_I_need))
This is useful if the array gets used repeatedly for different processing sizes.
If it is "always" fixed in terms of the current execution, then there is no real need to do anything.
There is no memory leak in the code. It is impossible to make a memory leak with allocatable entities in Fortran. Only pointer can cause a memory leak.
With allocatable if something is going out of scope, it is deallocated automatically.
Your main array is a module variable so it is never going out of scope (it is save implicitly by Fortran 2008 rules). So if you don't deallocate it yourself, it will remain allocated and then deleted by the operating system on the program termination. But that is not normally considered to be a memory leak. It is not really harmful, because there is no way to make some forgotten copies of the array in memory.
The individual components my_size could go out of scope, when deallocating the large array dS_sn. In that case they are deallocated automatically by Fortran rules. You don't have to deallocate them one by one.
So you do not really have to do
Do k = 1 , Br_nn_mre
Deallocate(dS_sn(k)%my_size)
End do
Doing just
Deallocate(dS_sn)
is perfectly correct.
Related
I want to add elements to a 1d matrix mat, subject to a condition as in the test program below. In Fortran 2003 you can add an element
mat=[mat,i]
as mentioned in the related question Fortran array automatically growing when adding a value. Unfortunately, this is very slow for large matrices. So I tried to overcome this, by writing the matrix elements in an unformatted file and reading them afterwards. This turned out to be way faster than using mat=[mat,i]. For example for n=2000000_ilong the run time is 5.1078133666666661 minutes, whereas if you store the matrix elements in the file the run time drops to 3.5234166666666665E-003 minutes.
The problem is that for large matrix sizes the file storage.dat can be hundreds of GB...
Any ideas?
program test
implicit none
integer, parameter :: ndig=8
integer, parameter :: ilong=selected_int_kind(ndig)
integer (ilong), allocatable :: mat(:)
integer (ilong), parameter :: n=2000000_ilong
integer (ilong) :: i, cn
logical, parameter :: store=.false.
real(8) :: z, START_CLOCK, STOP_CLOCK
open(1, file='storage.dat',form='unformatted')
call cpu_time(START_CLOCK)
if(store) then
cn=0
do i=1,n
call random_number(z)
if (z<0.5d0) then
write(1) i
cn=cn+1
end if
end do
rewind(1); allocate(mat(cn)); mat=0
do i=1,cn
read(1) mat(i)
end do
else
allocate(mat(1)); mat=0
do i=1,n
call random_number(z)
if (z<0.5d0) then
mat=[mat,i]
end if
end do
end if
call cpu_time(STOP_CLOCK)
print *, 'run took:', (STOP_CLOCK - START_CLOCK)/60.0d0, 'minutes.'
end program test
If the data file has hundreds of gigabytes, than there can may be no solution available at all, because you need so much RAM memory anyway for your array. Maybe you made the mistake of storing the data as text and then the memory size will be somewhat lower, but still tens of GB.
What is often done, when you need to add elements one-by-one and you do not know the final size, is growing the array geometrically in steps. That means pre-allocate an array to size N. When the array is full, you allocate a new array of size 2*N. When the array is full again, you allocate it to 4*N. And so on. Either you are finished or you exhausted all your memory.
Of course, it is often best to know the size of the array beforehand, but in some algorithms you simply do not have the information.
Maybe you need a dynamic container such as C++'s std::vector, with a push_back() function.
The following is a simplified version. You probably ought to check the allocation to make sure that you don't run out of addressable memory.
Note the need for random_seed.
module container
use iso_fortran_env
implicit none
type array
integer(int64), allocatable :: A(:)
integer(int64) num
contains
procedure push_back
procedure print
end type array
interface array ! additional constructors
procedure array_constructor
end interface array
contains
!----------------------------------------------
function array_constructor() result( this ) ! performs initial allocation
type(array) this
allocate( this%A(1) )
this%num = 0
end function array_constructor
!----------------------------------------------
subroutine push_back( this, i )
class(array), intent(inout) :: this
integer(int64) i
integer(int64), allocatable :: temp(:)
if ( size(this%A) == this%num ) then ! Need to resize
allocate( temp( 2 * this%num ) ) ! <==== for example
temp(1:this%num ) = this%A
call move_alloc( temp, this%A )
! print *, "Resized to ", size( this%A ) ! debugging only!!!
end if
this%num = this%num + 1
this%A(this%num) = i
end subroutine push_back
!----------------------------------------------
subroutine print( this )
class(array), intent(in) :: this
write( *, "( *( i0, 1x ) )" ) ( this%A(1:this%num) )
end subroutine print
end module container
!=======================================================================
program test
use iso_fortran_env
use container
implicit none
type(array) mat
integer(int64) :: n = 2000000_int64
integer(int64) i
real(real64) z, START_CLOCK, STOP_CLOCK
mat = array() ! initial trivial allocation
call random_seed ! you probably need this
call cpu_time(START_CLOCK)
do i = 1, n
call random_number( z )
if ( z < 0.5_real64 ) call mat%push_back( i )
end do
call cpu_time(STOP_CLOCK)
print *, 'Run took ', ( STOP_CLOCK - START_CLOCK ) / 60.0_real64, ' minutes.'
! call mat%print ! debugging only!!!
end program test
I am just trying to write in a collective way in MPI Fortran from a CFD code. In each process, data are divided in blocks, with a general number of cells, and a structure var(b) is created which hosts the two variables r and p of the block b. Then a double MPI structure derived type is created to collect all data in a process, the first type collecting all variables in a block, and the second one all the structures in a block. So, each process has to write one of this double derived datatype, where the offset is evaluated all the data amount in the previous processes (0 for rank 0, all data in the rank 0 for rank 1, and so one). The code is the following
module var_mod
type vt
sequence
double precision,dimension(:,:,:),allocatable :: r,p
end type vt
type(vt),target,dimension(:),allocatable :: var
end module var_mod
PROGRAM main
USE MPI_F08
USE var_mod
IMPLICIT NONE
! FILES
INTEGER,PARAMETER :: NB = 4
!----------------------------------------------------------------
INTEGER :: b,i,j,k,me,np
TYPE(MPI_File) :: mpifh
INTEGER(KIND=MPI_OFFSET_KIND) :: mpidisp,sum_dim
integer,dimension(:),allocatable :: ni,nj,nk,mpiblock,mpistride
integer :: cont,mpierr
INTEGER,dimension(nb) :: Blocks
INTEGER(KIND=MPI_ADDRESS_KIND),dimension(:),allocatable :: Offsets,Pos
INTEGER(KIND=MPI_COUNT_KIND) :: lb, ext8
TYPE(MPI_Datatype),dimension(:),allocatable :: Elem_Type,Types
TYPE(MPI_Datatype) :: All_Type,mpiparflowtype
TYPE(MPI_Status) :: status
CHARACTER(LEN=MPI_MAX_ERROR_STRING) :: string
INTEGER :: resultlen
!----------------------------------------------------------------
call mpi_init
call mpi_comm_size(mpi_comm_world,np)
call mpi_comm_rank(mpi_comm_world,me)
allocate (ni(nb))
allocate (nj(nb))
allocate (nk(nb))
allocate (var(nb))
do b = 1,NB
ni(b) = b/3+1
nj(b) = b
nk(b) = b/5+1
allocate (var(b)%r(ni(b),nj(b),nk(b)))
allocate (var(b)%p(ni(b),nj(b),nk(b)))
END DO
!
! Initialize the data
!
do b = 1,nb
DO k = 1,nk(b)
DO j = 1,nj(b)
DO i = 1,ni(b)
var(b)%r(i,j,k) = 10000*me+1000*b+i*100+j*10+k
var(b)%p(i,j,k) = -var(b)%r(i,j,k)
END DO
END DO
END DO
end do
! (1) Create a separate structure datatype for each record
allocate (Offsets(2),Pos(2),Types(2),Elem_Type(nb))
DO b = 1,nb
CALL MPI_GET_ADDRESS(var(b)%r,POS(1))
CALL MPI_GET_ADDRESS(var(b)%p,POS(2))
Offsets = POS-POS(1)
Types = MPI_REAL8
Blocks = ni(b)*nj(b)*nk(b)
CALL MPI_TYPE_CREATE_STRUCT(2,Blocks,Offsets,Types,Elem_Type(b),mpierr)
END DO
deallocate (Offsets,Pos,Types)
! Create a structure of structures that describes the whole array
allocate (Offsets(nb),Pos(nb))
Blocks = 1
DO b = 1,nb
CALL MPI_GET_ADDRESS(var(b)%r,POS(b))
END DO
Offsets = POS-POS(1)
CALL MPI_TYPE_CREATE_STRUCT(nb,Blocks,Offsets,Elem_Type,All_Type)
CALL MPI_TYPE_COMMIT(All_Type,mpierr)
! Free the intermediate datatypes
DO b = 1,nb
CALL MPI_TYPE_FREE(Elem_Type(b))
END DO
deallocate(Offsets,Pos,Elem_Type)
! Set index
cont = 1
allocate(mpiblock(cont))
allocate(mpistride(cont))
mpiblock=1
mpistride=0
call MPI_TYPE_INDEXED(cont,mpiblock,mpistride,All_Type,mpiparflowtype)
call MPI_TYPE_COMMIT(mpiparflowtype)
deallocate(mpiblock,mpistride)
! Position where to write
CALL MPI_Type_get_extent(MPI_REAL8, lb, ext8)
mpidisp = 0
do b = 1,nb
mpidisp = mpidisp + (ni(b)*nj(b)*nk(b)) ! number of cell in the block b
end do
mpidisp = mpidisp*2*ext8*me !multiply for number of variables and byte of each variable and shif to the process rank
! Open file
call MPI_FILE_OPEN(MPI_COMM_WORLD,'MPIDATA',IOR(MPI_MODE_CREATE,MPI_MODE_WRONLY),MPI_INFO_NULL,mpifh)
! setting file view
call MPI_FILE_SET_VIEW(mpifh,mpidisp,All_Type,mpiparflowtype,'native',MPI_INFO_NULL,mpierr)
write(*,*) me,'error on file set view:',mpierr
call MPI_Error_string(mpierr, string, resultlen)
write(*,*) 'string:',trim(string),resultlen
! MPI Write file
call MPI_FILE_WRITE_ALL(mpifh,var(1)%r,1,All_Type,status)
! Close file
call MPI_FILE_CLOSE(mpifh)
! deallocations and free
CALL MPI_TYPE_FREE(All_Type)
CALL MPI_TYPE_FREE(mpiparflowtype)
do b = 1,nb
deallocate (var(b)%r,var(b)%p)
END DO
deallocate (var)
deallocate (ni,nj,nk)
! end
call mpi_finalize
END PROGRAM main
When the code is launched, for instance, on two processes (both Intel and gnu compilers no problem in compilation phase), the run concludes but an error MPI_TYPE_ERR in MPI_FILE_SET_VIEW is issued and the data file contains only rank 0 data.
I would expect a file with data from all ranks, but I can not understand what the problem is.
This is more of a best practice on Fortran code writing other than solving an error.
I have this following code sample with some large array that needs to be passed around to some subroutine for some calculation
program name
implicit none
integer, parameter:: n = 10**8
complex(kind=8) :: x(n)
integer :: i, nVal
nVal = 30
do i =1,1000
call test(x,nVal)
!-----other calculations-----!
! after every step nVal chnages, and after few step nVal converges
! e.g. `nVal` starts from 30 and converges at 14, after 10-15 steps, and stays there for rest of the loop
! once `nVal` converges the `workarray` requires much less memory than it requires at the starts
enddo
contains
subroutine test(arr,m)
integer , intent(inout) :: m
complex(kind=8), intent(inout) :: arr(n)
complex(kind=8) :: workarray(n,m) ! <-- large workspace
!----- do calculation-----------!
!--- check convergence of `m`----!
end
end program name
The internal workarray depends on a value that decreases gradually and reaches a convergence, and stays there for rest of the code. If I check the memory usage with top it shows at 27% from starts to finish. But after few steps the memory requirement should decrease too.
So, I modified the code to use allocatable workarray like this,
program name
implicit none
integer, parameter:: n = 10**8
complex(kind=8) :: x(n)
integer :: i, nVal, oldVal
complex(kind=8), allocatable :: workarray(:,:)
nVal = 30
oldVal = nVal
allocate(workarray(n,nVal))
do i =1,1000
! all calculation of the subroutine `test` brought to this main code
!--- check convergence of `nVal`----!
if(nVal /= oldVal) then
deallocate(workarray)
allocate(workarray(n,nVal))
oldVal = nVal
endif
enddo
end program name
Now, If I use top the memory usage starts at about 28% and then decreases and reaches a converged value of 19%.
Now, my question is how should I code situations like this. The allocatable option do decreases memory requirement but it also hampers the code readability a little bit and introduces code duplication in several places. On the other hand, the prior option keeps larger memory for the whole time where much less memory would suffice. So, what is preferred way of coding in this situation?
I can't help you decide which of the two methods is better; it will depend on how you (or the users of your code) value the potential tradeoff between memory use and cpu use. However, I can suggest a better version of your second method.
Rather than passing workarray in and out of test, you can keep it local to test and use the save attribute to make it persistent between procedure calls.
This would look something like
program name
implicit none
integer, parameter :: dp = selected_real_kind(15,300)
integer, parameter:: n = 10**8
complex(dp) :: x(n)
integer :: i, nVal
nVal = 30
do i =1,1000
call test(x,nVal)
enddo
contains
subroutine test(arr,m)
complex(dp), intent(inout) :: arr(:)
integer, intent(inout) :: m
! Initialise workarray to an empty array
! Avoids having to check if it is allocated each time
complex(dp), allocatable, save :: workarray(:,:) = reshape([complex(dp)::], [0, 0])
! Reallocate workarray if necessary.
if (size(workarray, 2)<m) then
deallocate(workarray)
allocate(workarray(size(arr), m))
endif
end subroutine
end program
If m is likely to increase slowly, you may also want to consider replacing allocate(workarray(size(arr), m)) with allocate(workarray(size(arr), 2*m)), such that you get c++ std::vector-style memory management.
The main downside of this approach (besides not reducing the memory use) is that you need to be more careful if you want to run parallel code which uses procedures with saved variables.
Hermite Interpolation woes
I am trying to find the Newton Dividing Differences for the function and derivative values of a given set of x's. I'm running into serious problems with my code working for tiny examples, but failing on bigger one's. As is clearly visible, my answers are very much larger than they original function values.
Does anybody have any idea what I'm doing wrong?
program inter
implicit none
integer ::n,m
integer ::i
real(kind=8),allocatable ::xVals(:),fxVals(:),newtonDivDiff(:),dxVals(:),zxVals(:),zdxVals(:),zfxVals(:)
real(kind=8) ::Px
real(kind=8) ::x
Open(Unit=8,File="data/xVals")
Open(Unit=9,File="data/fxVals")
Open(Unit=10,File="data/dxVals")
n = 4 ! literal number of data pts
m = n*2+1
!after we get the data points allocate the space
allocate(xVals(0:n))
allocate(fxVals(0:n))
allocate(dxVals(0:n))
allocate(newtonDivDiff(0:n))
!allocate the zvalue arrays
allocate(zxVals(0:m))
allocate(zdxVals(0:m))
allocate(zfxVals(0:m))
!since the size is the same we can read in one loop
do i=0,n
Read(8,*) xVals(i)
Read(9,*) fxVals(i)
Read(10,*) dxVals(i)
end do
! contstruct the z illusion
do i=0,m,2
zxVals(i) = xVals(i/2)
zxVals(i+1) = xVals(i/2)
zdxVals(i) = dxVals(i/2)
zdxVals(i+1) = dxVals(i/2)
zfxVals(i) = fxVals(i/2)
zfxVals(i+1) = fxVals(i/2)
end do
!slightly modified business as usual
call getNewtonDivDiff(zxVals,zdxVals,zfxVals,newtonDivDiff,m)
do i=0,n
call evaluatePolynomial(m,newtonDivDiff,xVals(i),Px,zxVals)
print*, xVals(i) ,Px
end do
close(8)
close(9)
close(10)
stop
deallocate(xVals,fxVals,dxVals,newtonDivDiff,zxVals,zdxVals,zfxVals)
end program inter
subroutine getNewtonDivDiff(xVals,dxVals,fxVals,newtonDivDiff,n)
implicit none
integer ::i,k
integer, intent(in) ::n
real(kind=8), allocatable,dimension(:,:) ::table
real(kind=8),intent(in) ::xVals(0:n),dxVals(0:n),fxVals(0:n)
real(kind=8), intent(inout) ::newtonDivDiff(0:n)
allocate(table(0:n,0:n))
table = 0.0d0
do i=0,n
table(i,0) = fxVals(i)
end do
do k=1,n
do i = k,n
if( k .eq. 1 .and. mod(i,2) .eq. 1) then
table(i,k) = dxVals(i)
else
table(i,k) = (table(i,k-1) - table(i-1,k-1))/(xVals(i) - xVals(i-k))
end if
end do
end do
do i=0,n
newtonDivDiff(i) = table(i,i)
!print*, newtonDivDiff(i)
end do
deallocate(table)
end subroutine getNewtonDivDiff
subroutine evaluatePolynomial(n,newtonDivDiff,x,Px,xVals)
implicit none
integer,intent(in) ::n
real(kind=8),intent(in) ::newtonDivDiff(0:n),xVals(0:n)
real(kind=8),intent(in) ::x
real(kind=8), intent(out) ::Px
integer ::i
Px = newtonDivDiff(n)
do i=n,1,-1
Px = Px * (x- xVals(i-1)) + newtonDivDiff(i-1)
end do
end subroutine evaluatePolynomial
Values
x f(x) f'(x)
1.16, 1.2337, 2.6643
1.32, 1.6879, 2.9989
1.48, 2.1814, 3.1464
1.64, 2.6832, 3.0862
1.8, 3.1553, 2.7697
Output
1.1599999999999999 62.040113431002474
1.3200000000000001 180.40121445431600
1.4800000000000000 212.36319446149312
1.6399999999999999 228.61845650513027
1.8000000000000000 245.11610836104515
You are accessing array newtonDivDiff out of bounds.
You are first allocating it as 0:n (main program's n) then you are passing to subroutine getNewtonDivDiff as 0:n (the subroutine's n) but you pass m (m=n*2+1) to the argument n. That means you tell the subroutine that the array has bounds 0:m which is 0:9, but it has only bounds 0:4.
It is quite difficult to debug the program as it stands, I had to use valgrind. If you move your subroutines to a module and change the dummy arguments to assumed shape arrays (:,:) then the bound checking in gfortran (-fcheck=all) will catch the error.
Other notes:
kind=8 is ugly, 8 can mean different things for different compilers. If you want 64bit variables, you can use kind=real64 (real64 comes from module iso_fortran_env in Fortran 2008) or use selected_real_kind() (Fortran 90 kind parameter)
You do not have to deallocate your local arrays in the subroutines, they are deallocated automatically.
Your deallocate statement in the main program is after the stop statement, it will never be executed. I would just delete the stop, there is no reason to have it.
I need to find how to use the dimension attribute in this program. The problem in here that I can't figure out is how user can specify the number of rows? (another word, the number of students):
PROGRAM
implicit none
integer::k,sn
real,dimension(**?**,4)::A
character(len=10),dimension(**?**)::B
open(10,file='students.txt',status='new')
write(*,*)'how many student are in the classroom?'
read(*,*)sn
k=1
do
write(*,*)k,'.','student name=';read(*,*)B(k)
write(*,*)'1.Quiz';read(*,*)A(k,1)
write(*,*)'2.Quiz';read(*,*)A(k,2)
write(*,*)'Final Quiz';read(*,*)A(k,3)
A(k,4)=(A(k,1)*30/100)+(A(k,2)*30/100)+(A(k,3)*40/100)
write(10,9)B(k),' ',A(k,1),' ',A(k,2),' ',A(k,3),' ',A(k,4)
k=k+1
if(k>sn)exit
end do
9 format(1x,A10,A5,F5.1,A3,F5.1,A3,F5.1,A3,F5.1)
end program
Well basically you have fixed (static) arrays which are defined e.g. using dimension:
real,dimension(4) :: X
X is an array of length 4 (1-4). This is equivalent to:
real :: X(4)
Static arrays have a fixed length throughout their scope (e.g. throughout the program for global variables or throughout functions/subroutines).
What you need are allocatable arrays which are allocated at runtime:
program test
implicit none
real, allocatable :: B(:) ! The shape is given by ":" - 1 dimension
integer :: stat
! allocate memory, four elements:
allocate( B(4), stat=stat )
! *Always* check the return value
if ( stat /= 0 ) stop 'Cannot allocate memory'
! ... Do stuff
! Clean up
deallocate( B )
! Allocate again using a different length:
allocate( B(3), stat=stat )
! *Always* check the return value
if ( stat /= 0 ) stop 'Cannot allocate memory'
! No need to deallocate at the end of the program!
end program
real,dimension(:,:),allocatable ::A
character(len=10),dimension(:),allocatable::B
.
.
.
DEALLOCATE(A)
DEALLOCATE(B)
This works! Thank you guys.