I have a 3D matrix (dimensions nx,nz,ny) that corresponds to a physical domain. The matrix contains a continuous field ranging from -1 (phase 1) to +1 (phase 2); the interface between the two phases is the zero level of this field.
I want to compute the signed distance function from the interface efficiently for every point in the domain.
I tried two approaches (sgn is the sign of my field, with values +1, 0, -1; xyz holds the coordinate triplet of each grid point; dist is the signed distance function I want to compute).
double precision, dimension(nx,nz,ny) :: dist,sgn,eudist
integer :: i,j,k
double precision :: seed,posit,tmp(nx)
do j=1,ny
do k=1,nz
do i=1,nx
seed=sgn(i,k,j)
! look for interface
eudist=(xyz(:,:,:,1)-x(i))**2+(xyz(:,:,:,2)-z(k))**2+(xyz(:,:,:,3)-y(j))**2
! find min within mask
posit=minval(eudist,seed*sgn.le.0)
! tmp fits in cache, small speed-up
tmp(i)=-seed*dsqrt(posit)
enddo
dist(:,k,j)=tmp
enddo
enddo
I also tried a second version, which is quite similar to the above one but it calculates the Euclidean distance only in a subset of the whole matrix. With this second version there is some speed up, but it is still too slow. I would like to know whether there is a more efficient way to calculate the distance function.
Second version:
double precision, dimension(nx,nz,ny) :: dist,sgn
double precision, allocatable, dimension(:,:,:) :: eudist
integer :: i,j,k , ii,jj,kk
integer :: il,iu,jl,ju,kl,ku
double precision :: seed,tmp(nx)
! box half-widths are used as array bounds and index offsets, so they must be integers
integer :: deltax,deltay,deltaz
deltax=max(int(nx/4),1)
deltay=max(int(ny/4),1)
deltaz=max(int(nz/2),1)
allocate(eudist(2*deltax+1,2*deltaz+1,2*deltay+1))
do j=1,ny
do k=1,nz
do i=1,nx
! look for closest point in box 2*deltax+1,2*deltaz+1,2*deltay+1
il=max(1,i-deltax)
iu=min(nx,i+deltax)
jl=max(1,j-deltay)
ju=min(ny,j+deltay)
kl=max(1,k-deltaz)
ku=min(nz,k+deltaz)
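! the box is clipped at the domain boundaries, so use its actual extent in all three directions
eudist(1:iu-il+1,1:ku-kl+1,1:ju-jl+1)=(xyz(il:iu,kl:ku,jl:ju,1)-x(i))**2 &
& +(xyz(il:iu,kl:ku,jl:ju,2)-z(k))**2 &
& +(xyz(il:iu,kl:ku,jl:ju,3)-y(j))**2
seed=sgn(i,k,j)
tmp(i)=minval(eudist(1:iu-il+1,1:ku-kl+1,1:ju-jl+1),seed*sgn(il:iu,kl:ku,jl:ju).le.0)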
tmp(i)=-seed*dsqrt(tmp(i))
enddo
dist(:,k,j)=tmp
enddo
enddo
eudist: squared Euclidean distance between the point i,k,j and every other point in a box of size 2*deltax+1,2*deltaz+1,2*deltay+1 centered on i,k,j. This reduces the computational cost, because the distance is calculated only in a subset of the whole grid (here I am assuming that the subset is large enough to contain an interfacial point).
Following Vladimir's suggestion (x,y,z are the axes that define the grid positions, xyz(i,k,j)=(x(i),z(k),y(j)) ):
double precision, dimension(nx,nz,ny) :: dist,sgn
double precision :: x(nx), y(ny), z(nz)
double precision, allocatable, dimension(:,:,:) :: eudist
double precision, allocatable, dimension(:) :: xd,yd,zd
integer :: i,j,k , ii,jj,kk
integer :: il,iu,jl,ju,kl,ku
double precision :: seed,tmp(nx)
! box half-widths are used as array bounds and index offsets, so they must be integers
integer :: deltax,deltay,deltaz
deltax=max(int(nx/4),1)
deltay=max(int(ny/4),1)
deltaz=max(int(nz/2),1)
allocate(eudist(2*deltax+1,2*deltaz+1,2*deltay+1))
allocate(xd(2*deltax+1))
allocate(yd(2*deltay+1))
allocate(zd(2*deltaz+1))
do j=1,ny
do k=1,nz
do i=1,nx
! look for closest point in box 2*deltax+1,2*deltaz+1,2*deltay+1
il=max(1,i-deltax)
iu=min(nx,i+deltax)
jl=max(1,j-deltay)
ju=min(ny,j+deltay)
kl=max(1,k-deltaz)
ku=min(nz,k+deltaz)
do ii=1,iu-il+1
xd(ii)=(x(il+ii-1)-x(i))**2
end do
do jj=1,ju-jl+1
yd(jj)=(y(jj+jl-1)-y(j))**2
end do
do kk=1,ku-kl+1
zd(kk)=(z(kk+kl-1)-z(k))**2
end do
do jj=1,ju-jl+1
do kk=1,ku-kl+1
do ii=1,iu-il+1
eudist(ii,kk,jj)=xd(ii)+yd(jj)+zd(kk)
enddo
enddo
enddo
seed=sgn(i,k,j)
tmp(i)=minval(eudist(1:iu-il+1,1:ku-kl+1,1:ju-jl+1),seed*sgn(il:iu,kl:ku,jl:ju).le.0)
tmp(i)=-seed*dsqrt(tmp(i))
enddo
dist(:,k,j)=tmp
enddo
enddo
EDIT: more information on the problem at hand.
The grid is an orthogonal grid mapped to a matrix. The number of points of this grid is of the order of 1000 in each direction (in total about 1 billion points).
My goal is switching from a sign function (+1,0,-1) to a signed distance function in the entire grid in an efficient way.
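I would still do what I suggested, whether you do it on a subset or across the whole domain. Take advantage of the orthogonal grid; it is a great thing to have. The squared distance splits into one term per axis, so each term only has to be computed along a 1D line: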
do j=1,ny
do k=1,nz
do i=1,nx
! look for closest point in box 2*deltax+1,2*deltaz+1,2*deltay+1
il=max(1,i-deltax)
iu=min(nx,i+deltax)
jl=max(1,j-deltay)
ju=min(ny,j+deltay)
kl=max(1,k-deltaz)
ku=min(nz,k+deltaz)
! on an orthogonal grid each coordinate depends on one index only
! (component 1 = x, 2 = z, 3 = y, as in the question)
do ii = il,iu
xd(ii-il+1) = (xyz(ii,kl,jl,1)-x(i))**2
end do
do jj = jl,ju
yd(jj-jl+1) = (xyz(il,kl,jj,3)-y(j))**2
end do
do kk = kl,ku
zd(kk-kl+1) = (xyz(il,kk,jl,2)-z(k))**2
end do
do jj = jl,ju
do kk = kl,ku
do ii = il,iu
eudist(ii-il+1,kk-kl+1,jj-jl+1) = xd(ii-il+1) + yd(jj-jl+1) + zd(kk-kl+1)
end do
end do
end do
....
enddo
dist(:,k,j)=tmp
enddo
enddo
Consider moving everything inside the outer triple loop into a subroutine or a function. It would not be faster, but it would be much more readable. For us here in particular, it would be enough to deal with that function alone; the outer loop is just a confusing extra layer.
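As a rough sketch of that suggestion, the body of the triple loop could live in a function like the one below. The module name box_distance, the function name signed_dist_at and the half-width arguments di, dk, dj are made up for illustration; the array layout (nx,nz,ny) and the sign convention -seed*sqrt(...) follow the question. Like the original code, it measures the distance to the nearest grid point of opposite (or zero) sign, not to a sub-grid interface position.

```fortran
module box_distance
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Signed distance from grid point (i,k,j) to the nearest point of opposite
  ! (or zero) sign inside a clipped search box. Arrays are laid out (nx,nz,ny).
  function signed_dist_at(i, k, j, sgn, x, z, y, di, dk, dj) result(d)
    integer,  intent(in) :: i, k, j        ! point of interest
    integer,  intent(in) :: di, dk, dj     ! box half-widths (hypothetical names)
    real(dp), intent(in) :: sgn(:,:,:), x(:), z(:), y(:)
    real(dp) :: d
    integer  :: ii, kk, jj, il, iu, kl, ku, jl, ju
    real(dp) :: seed, best, dyz
    real(dp) :: xd(size(x))

    seed = sgn(i,k,j)
    il = max(1, i-di); iu = min(size(x), i+di)
    kl = max(1, k-dk); ku = min(size(z), k+dk)
    jl = max(1, j-dj); ju = min(size(y), j+dj)

    ! the x contribution is the same for every (kk,jj): compute it once
    do ii = il, iu
      xd(ii) = (x(ii)-x(i))**2
    end do

    best = huge(1.0_dp)
    do jj = jl, ju
      do kk = kl, ku
        dyz = (y(jj)-y(j))**2 + (z(kk)-z(k))**2
        if (dyz >= best) cycle             ! this whole x-line cannot improve
        do ii = il, iu
          if (seed*sgn(ii,kk,jj) <= 0.0_dp) best = min(best, xd(ii)+dyz)
        end do
      end do
    end do
    d = -seed*sqrt(best)
  end function signed_dist_at
end module box_distance
```

The outer loop then reduces to dist(i,k,j) = signed_dist_at(i,k,j,sgn,x,z,y,deltax,deltaz,deltay). No temporary eudist array is needed, because the minimum is tracked while the separable terms are summed.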
I am trying to diagonalize a matrix using zgeev. It gives the correct eigenvalues, but the eigenvectors are not orthogonal.
program complex_diagonalization
implicit none
integer,parameter :: N=3
integer::i,j
integer,parameter :: LDA=N,LDVL=N,LDVR=N
real(kind=8),parameter::q=dsqrt(2.0d0),q1=1.0d0/q
integer,parameter :: LWMAX=1000
integer :: INFO,LWORK
real(kind=8) :: RWORK(2*N)
complex(kind=8) :: B(LDA,N),VL(LDVL,N),VR(LDVR,N),W(N),WORK(LWMAX)
external::zgeev
!matrix defining
B(1,1)=0.0d0;B(1,2)=-q1;B(1,3)=-q1
B(2,1)=-q1;B(2,2)=0.50d0;B(2,3)=-0.50d0
B(3,1)=-q1;B(3,2)=-0.5d0;B(3,3)=0.50d0
LWORK=-1
CALL ZGEEV('Vectors','Vectors',N,B,LDA,W,VL,LDVL,VR,LDVR,WORK,LWORK,RWORK,INFO)
LWORK=MIN(LWMAX,INT(WORK(1)))
CALL ZGEEV('Vectors','Vectors',N,B,LDA,W,VL,LDVL,VR,LDVR,WORK,LWORK,RWORK,INFO)
IF( INFO.GT.0 ) THEN
WRITE(*,*)'The algorithm failed to compute eigenvalues.'
STOP
END IF
!eigenvalues
do i=1,N
WRITE(*,*)W(i)
enddo
!eigenvectors
do i=1,N
WRITE(*,*)(VR(i,j),j=1,N)
ENDDO
end
The results I get are:
eigenvalues:
( 0.99999999999999978,0.0000000000000000)
(-0.99999999999999978,0.0000000000000000)
( 0.99999999999999978,0.0000000000000000)
eigenvectors
(0.70710678118654746,0.0000000000000000)
(-0.50000000000000000,0.0000000000000000)
(-0.50000000000000000,0.0000000000000000)
(0.70710678118654746,0.0000000000000000)
(0.50000000000000000,0.0000000000000000)
(0.50000000000000000,0.0000000000000000)
(-0.11982367636731203,0.0000000000000000)
( 0.78160853028734012,0.0000000000000000)
(-0.61215226207528295,0.0000000000000000)
You can see that the third eigenvector is not orthogonal to one of the other two. I expect the first entry of the third eigenvector to be zero and its second entry to be minus the third; since it is a unit vector, their magnitude should be 0.707.
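For a real symmetric matrix, only eigenvectors corresponding to distinct eigenvalues have to be orthogonal. Here the eigenvalue 1 appears twice, so any two independent vectors in its eigenspace are valid eigenvectors, and a general-purpose routine such as ZGEEV is free to return a pair that is not orthogonal. https://math.stackexchange.com/a/1368948/134138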
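To illustrate the point, the two vectors returned for the repeated eigenvalue can be orthogonalised after the fact, since any linear combination of them is still an eigenvector. The following is a minimal sketch using modified Gram-Schmidt; the module and subroutine names are made up for this example, and it assumes the columns passed in are linearly independent.

```fortran
module gram_schmidt_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Orthonormalise the columns of v in place (modified Gram-Schmidt).
  ! Assumes the columns are linearly independent.
  subroutine orthonormalize(v)
    real(dp), intent(inout) :: v(:,:)
    integer :: j, m
    do j = 1, size(v,2)
      ! remove the components along the columns already processed
      do m = 1, j-1
        v(:,j) = v(:,j) - dot_product(v(:,m), v(:,j))*v(:,m)
      end do
      ! rescale to unit length
      v(:,j) = v(:,j)/sqrt(dot_product(v(:,j), v(:,j)))
    end do
  end subroutine orthonormalize
end module gram_schmidt_mod
```

Applied to the two eigenvalue-1 vectors from the question, (0.7071, -0.5, -0.5) and (-0.1198, 0.7816, -0.6122), this leaves the first unchanged and turns the second into approximately (0, 0.7071, -0.7071), which is the vector the question expected.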
The specialized Hermitian routine ZHEEV should guarantee orthogonal eigenvectors, as suggested by Ian Bush. In your case you could also use DSYEV, because your matrix is real and symmetric.
The situation is well described in this post from LAPACK Forum http://icl.cs.utk.edu/lapack-forum/archives/lapack/msg01352.html
From the documentation:
DSYEV:
* On exit, if JOBZ = 'V', then if INFO = 0, A contains the
* orthonormal eigenvectors of the matrix A.
ZHEEV:
* On exit, if JOBZ = 'V', then if INFO = 0, A contains the
* orthonormal eigenvectors of the matrix A.
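A minimal sketch of the DSYEV route follows, assuming LAPACK is available at link time. The wrapper name sym_eig and its module are hypothetical; the DSYEV argument list (JOBZ, UPLO, N, A, LDA, W, WORK, LWORK, INFO) is the standard one, with the usual workspace query via LWORK = -1.

```fortran
module sym_eig_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! On entry a holds a real symmetric matrix (upper triangle is read).
  ! On exit, if info = 0, the columns of a are orthonormal eigenvectors
  ! and w holds the eigenvalues in ascending order.
  subroutine sym_eig(a, w, info)
    real(dp), intent(inout) :: a(:,:)
    real(dp), intent(out)   :: w(:)
    integer,  intent(out)   :: info
    real(dp), allocatable :: work(:)
    real(dp) :: wquery(1)
    integer  :: n, lwork
    external :: dsyev

    n = size(a,1)
    ! workspace query: the optimal size is returned in wquery(1)
    call dsyev('V', 'U', n, a, n, w, wquery, -1, info)
    if (info /= 0) return
    lwork = int(wquery(1))
    allocate(work(lwork))
    call dsyev('V', 'U', n, a, n, w, work, lwork, info)
  end subroutine sym_eig
end module sym_eig_mod
```

For the matrix B in the question, filling a real(dp) array with the same entries and calling sym_eig should give the eigenvalues -1, 1, 1 with mutually orthogonal eigenvector columns, including the two that share the eigenvalue 1.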
I have strongly oscillating 1D velocity data that I want to smooth and clean of outliers. I searched the internet for ways to do this and, based on what I found, wrote the following code for my data.
program smoothing
parameter (ni=775)
real ulb(ni), copu(ni), ri(ni), temp(ni),wei(ni)
real med
open(131,file='copy.txt' )
open(130,file='u.txt' )
read(130,'(10000f10.4)') (ulb(i), i=1,ni)
copu=0.
ri=0.
do i=1,ni
copu(i)=ulb(i)
enddo
print*, copu(200)
write(131,'(10000f10.4)') ( copu(i),i=2,ni)
! first smoothing
do i=4,ni-4
copu(i)=(copu(i-3)+copu(i-2)+copu(i-1)+copu(i)+copu(i+1)
& +copu(i+2)+copu(i+3))/7.
if(i.eq.1.or.i.eq.ni) copu(i)=copu(i)
if(i.eq.2.or.i.eq.ni-1) copu(i)=(copu(i+1)+copu(i)+copu(i-1))/3.
if(i.eq.3.or.i.eq.ni-2) copu(i)=(copu(i+2)+copu(i+1)
& +copu(i)+copu(i-1)+copu(i-2))/5.
enddo
write(131,'(10000f10.4)') ( copu(i),i=2,ni)
do k=1,4 ! iteration
! calculating residual
do i=1,ni
ri(i)=ulb(i)-copu(i)
enddo
print*, ri(200)
! finding median of the residuals
do i=1,ni
temp(i)=copu(i)
enddo
call sort(ri,ni)
if(mod(ni,2).eq.0) then
med=(ri(ni/2)+ri(ni/2+1))/2.
else
med=ri(ni/2+1)
endif
print*, k, med
! calculating robust weights
do i=1,ni
if(abs(ri(i)).ge.6.*med) then
wei(i)=0.
else if(abs(ri(i)).lt.6.*med) then
wei(i)=(1.-(ri(i)/(6.*med))**2)**2
endif
copu(i)=copu(i)+wei(i)*copu(i)
enddo
enddo ! iteration
write(131,'(10000f10.4)') ( copu(i),i=2,ni)
close (131)
end program
! ---------------------------------------------
subroutine sort(ri,ni)
real ri(ni)
do i=1,ni-1
do j=1,ni-1
if(ri(j).gt.ri(j+1)) then
tempu=ri(j)
ri(j)=ri(j+1)
ri(j+1)=tempu
end if
end do
end do
return
end subroutine
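For comparison with the robust step in the code above, here is a hedged sketch of the bisquare weights as they are described for MATLAB's robust lowess: the weight is (1-(r/(6*MAD))**2)**2 where |r| < 6*MAD and zero otherwise, with MAD the median of the absolute residuals. The module and function names are made up for this example. Note that the median is taken on a sorted copy, so the residuals keep their original order, and that the weights are meant to enter the next weighted local regression rather than being added to the smoothed values.

```fortran
module robust_weights_mod
  implicit none
contains
  ! Median of a non-empty real array. Works on a sorted copy, so the
  ! caller's array keeps its original order.
  function median(v) result(med)
    real, intent(in) :: v(:)
    real :: med
    real :: s(size(v)), t
    integer :: i, j, n
    n = size(v)
    s = v
    do i = 2, n                       ! insertion sort of the copy
      t = s(i)
      j = i - 1
      do while (j >= 1)
        if (s(j) <= t) exit
        s(j+1) = s(j)
        j = j - 1
      end do
      s(j+1) = t
    end do
    if (mod(n,2) == 0) then
      med = 0.5*(s(n/2) + s(n/2+1))
    else
      med = s(n/2+1)
    end if
  end function median

  ! Bisquare robust weights from residuals r, with MAD = median(|r|).
  function bisquare_weights(r) result(w)
    real, intent(in) :: r(:)
    real :: w(size(r))
    real :: cutoff
    cutoff = 6.0*median(abs(r))
    where (abs(r) < cutoff)
      w = (1.0 - (r/cutoff)**2)**2
    elsewhere
      w = 0.0                         ! outliers get no influence
    end where
  end function bisquare_weights
end module robust_weights_mod
```

With residuals ri(i)=ulb(i)-copu(i) left unsorted, wei = bisquare_weights(ri) gives one weight per data point, and points with |ri| at or beyond six times the MAD are excluded from the next pass.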
I smooth four times, using a simple moving-average smoothing as the "first smoothing". Because of the span length, some outliers are not fitted properly, so I decided to apply robust local regression in my code, following the MATLAB documentation. It does not work so far. Any corrections or suggestions would be very much appreciated.
I am currently writing a Monte Carlo code in which the volume fluctuates. I divide my simulation cell, a cube, into many smaller cubes for algorithm speed reasons. However, as the volume fluctuates, the number of these smaller cubes also fluctuates. Therefore, I have been reallocating the arrays that contain information on these smaller cubes as necessary:
!.....................................................................................!
! want to assign each atom to appropriate cell
! generate list of atoms in each cell for GONET algorithm
! arrays are allocated and deallocated in this routine.
!......................................................................................!
subroutine indiv_cell_lists_alloc(r)
implicit none
double precision :: r(3,param%np)
integer :: i,icell,size_cl
integer :: temp_num(0:listvar%ncellT-1) !...listvar%ncellT is total number of cells.
integer :: temp_cell(param%np)
temp_num(:) = 0
!--- param%np is total number of particles in simulation
do i = 1 ,param%np
!--- based off coordinates, finds what cell particle i is in
icell = cell(r(1,i),r(2,i),r(3,i))
temp_cell(i) = icell
temp_num(icell) = temp_num(icell) + 1
!--- keep particles current cell
atom(i)%cell = icell
atom(i)%loc = temp_num(icell)
enddo
!--- listvar%cl is an array. listvar%cl(i) contains info for cell i
!--- listvar%cl(i)%cmem(:) is an array containing the particles that are in the ith cell
!--- deallocate cell member lists
!--- subtract one since array is 0:ncellT-1, where ncellT is total number of cells
size_cl = size(listvar%cl)-1
do i = 0, size_cl
deallocate(listvar%cl(i)%cmem)
enddo
!--- deallocate cell list
deallocate(listvar%cl)
!--- allocate new celllist
allocate(listvar%cl(0:listvar%ncellT-1))
!--- allocate new arrays
do i = 0, listvar%ncellT-1
!--- allocate new array based off new size
allocate(listvar%cl(i)%cmem(temp_num(i)))
!--- number of molecules in cell
listvar%cl(i)%num = temp_num(i)
enddo
do i= 1,param%np
!--- get cell for particle
icell = temp_cell(i)
!--- place particles in cell
listvar%cl(icell)%cmem((atom(i)%loc)) = i
enddo
end subroutine indiv_cell_lists_alloc
Now the reason I am making sure that these arrays are only as big as they need to be, is that eventually these arrays will be exported to a Xeon Phi coprocessor. Due to the reduced memory there, I think that just allocating a large amount of memory and forgetting about it would lead to bad performance. However in serial execution, this subroutine is taking up ~70% of my run time.
Do you have any suggestions on how I could accomplish this reallocation more efficiently, or any suggestions about other methods I could use?
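One possible alternative, sketched below under assumptions, is to replace the per-cell allocatable arrays with two flat integer arrays built by a counting sort: an offset array start(0:ncell) and a member array of length np. Storage stays at np + ncell + 1 integers, and nothing is allocated per cell. The module name, the subroutine name and the cell_of input array (the 0-based cell index of each particle, as produced by the question's cell function) are made up for this example.

```fortran
module flat_cell_list
  implicit none
contains
  ! cell_of(i) is the 0-based cell index of particle i (hypothetical input).
  ! On exit the particles in cell c are members(start(c) : start(c+1)-1).
  subroutine build_cell_list(cell_of, ncell, start, members)
    integer, intent(in)  :: cell_of(:)
    integer, intent(in)  :: ncell
    integer, intent(out) :: start(0:ncell)
    integer, intent(out) :: members(size(cell_of))
    integer :: next(0:ncell-1)
    integer :: i, c

    ! pass 1: count particles per cell, stored one slot to the right
    start = 0
    do i = 1, size(cell_of)
      c = cell_of(i)
      start(c+1) = start(c+1) + 1
    end do

    ! prefix sum turns the counts into 1-based starting offsets
    start(0) = 1
    do c = 1, ncell
      start(c) = start(c) + start(c-1)
    end do

    ! pass 2: drop each particle into the next free slot of its cell
    next = start(0:ncell-1)
    do i = 1, size(cell_of)
      c = cell_of(i)
      members(next(c)) = i
      next(c) = next(c) + 1
    end do
  end subroutine build_cell_list
end module flat_cell_list
```

The number of particles in cell c is start(c+1)-start(c). The caller's start array only needs to be reallocated when the number of cells outgrows it, so it could be sized once for the largest expected cell count, which costs one integer per cell.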
I have never programmed before, and this is my very first code, written for a university assignment. I get no errors at the compile stage, but my program does not run and reports the error in the title. I guess the problem is where I call the subroutine. Can anyone help me? It is my first code and it is really frustrating. Thank you.
!NUMERICAL COMPUTATION OF INCOMPRESSIBLE COUETTE FLOW USING FINITE DIFFERENCE METHOD
!IMPLICIT APPROACH
!MODEL EQUATION
!PARTIAL(U)/PARTIAL(T)=1/RE*(PARTIAL(U) SQUARE/PARTIAL(Y) SQUARE)
!DEFINE VARIABLES
IMPLICIT NONE
!VELOCITY U AT TIME T, VELOCITY UNEW AT TIME T+1, TIME T
!MAXIMUM 1000 POINTS
REAL V(1000)
REAL VNEW(1000)
REAL T
!GRID SPACING DY, GRID POINTS N+1
REAL DY
INTEGER N
!TIME STEP
REAL DT
!FLOW REYNOLDS NUMBER IN THE MODEL EQUATION
REAL ALPHA
!TOTAL SIMULATION TIME - LOOP NUMBER
INTEGER REP, I, J
!COEFFICIENTS IN LINEAR EQUATION MATRIX, SOURCE TERM K, DIAGONAL B, NON-DIAGONAL A
REAL S(1000), B, A
!INITIALIZATION OF DATA
DATA ALPHA/5000.0/
DATA N/100/
DATA REP/3000/
!CALCULATION OF GRID SPACING
DY=1.0/N
!CALCULATION OF TIME STEP DELTA T, CAN BE LARGER THAN THAT IN AN EXPLICIT METHOD
DT=0.5*RE*DY*DY
DT=ALPHA*DY*DY
!INITIAL CONDITIONS OF VELOCITY PROFILE
!BOTTOM AND INNER POINTS
DO I=1,N
V(I)=0.0
ENDDO
!POINT AT MOVING PLATE
V(N+1)=1.0
!BOUNDARY CONDITIONS AT LOWER AND UPPER POINTS ON PLATE
V(1)=0.0
V(N+1)=1.0
!CALCULATION OF DIAGONAL B AND NON-DIAGONAL A IN LINEAR EQUATION MATRIX
B=1.0+DT/DY/DY/ALPHA
A=-(DT)/2.0/DY/DY/ALPHA
!INITIAL COMPUTATION TIME
T=0.0
!ENTER MAIN LOOP TO MARCH IN TIME DIRECTION
DO I=1,REP
!SIMULATION TIME INCREASE BY DELTA T EACH STEP
T=T+DT
!USE IMPLICIT METHOD TO UPDATE GRID POINT VALUES FOR ALL INTERNAL GRIDS ONLY
!TWO BOUNDARY GRID POINTS VALUES ARE CONSTANT WITHIN THE WHOLE SIMULATION
!CALCULATION OF SOURCE TERM IN LINEAR EQUATION
DO J=2,N
S(J)=(1.0-DT/DY/DY/ALPHA)*V(J)+DT/2.0/DY/DY/ALPHA*V(J+1)+V(J-1)
ENDDO
!INCLUDE BOUNDARY CONDITIONS FOR TWO POINTS NEAR BOUNDARY
S(2)=S(2)-A*V(1)
S(N)=S(N)-A*V(N+1)
!USE SOURCE TERM K, DIAGONAL B, NON-DIAGONAL A, ORDER OF MATRIX N, TO SOLVE LINEAR EQUATION TO GET UPDATED VELOCITY
!CHECK ON INTERNET HOW TO SOLVE THIS BECAUSE THIS COMPILER
!DOES NOT SOLVE IT, SOLVE LINEAR EQUATIONS BY A LINEAR SOLVER, FIND AND DOWNLOAD THE MATH LIBRARY FOR THIS COMPILER
CALL SR1(A,B,N,S,VNEW)
!REPLACE OLD VELOCITY VALUES WITH NEW VALUES.
!SINCE UNEW IS FROM UNEW(1), UNEW(2)......., UNEW(N-1), WE SHOULD RE-ARRANGE NUMBERS AS FOLLOWS
DO J=1,N-1
V(J+1)=VNEW(J)
ENDDO
!RETURN TO MAIN LOOP HERE
ENDDO
PRINT*,'HERE'
!OUTPUT VELOCITY PROFILES AT THE END OF COMPUTATION
!CREATE OUTPUT FILE NAME
OPEN(15,FILE='PLEASEWORK')
!WRITE GRID POINTS AND VELOCITY VALUES
DO I=1,N+1
WRITE(15,10) V(I),(I-1)*DY
10 FORMAT(2F12.3)
ENDDO
CLOSE(15)
!DISPLAY INFORMATION ON SCREEN
!WRITE(*,*) 'THE OUTPUT VELOCITY IS AFTER', ITER, ' TIME STEPS'
!TERMINATION OF COMPUTER PROGRAM
STOP
END
!!!!!!!!
!!!!!!!!!!!!
!!!!!!!!!
SUBROUTINE SR1(A,B,N,S,VNEW)
REAL DIAGM(N), DIAGU(N), DIAGL(N)
REAL SS(N)
DO J=1,N-1
SS(J)=S(J+1)
ENDDO
DO I=1,N
DIAGM(i)=B
!Sets main diagonal as B for every value of i
IF (I==0) then
DIAGU(I)=A
DIAGL(I)=0
! No lower diagonal coefficient when i = 0
ELSE IF (I==N) THEN
DIAGU(I)=0
! No upper diagonal coefficient when i = Num
DIAGL(I)=A
ELSE
DIAGU(I)=A
! For all other points there is an upper diagonal coefficient
DIAGL(I)=A
! For all other points there is a lower diagonal coefficient
ENDIF
ENDDO
!CALL STANDARD FORTRAN MATH LIBRARY TO SOLVE LINEAR EQUATION AND GET SOLUTION VECTOR X(N-1)
CALL SR2 (DIAGL,DIAGM,DIAGU,SS,VNEW,N-2)
!RETURN TO MAIN PROGRAM AND X(N-1) IS FEEDED INTO UNEW(N-1)
RETURN
END SUBROUTINE
!!!!!!!!!!!!!!!
!!!!!!!!!!!
!!!!!!!!!!!
SUBROUTINE SR2 (A,B,C,D,Z,N)
!a - sub-diagonal (means it is the diagonal below the main diagonal)
!b - the main diagonal
!c - sup-diagonal (means it is the diagonal above the main diagonal)
!K - right part
!UNEW - the answer
!E - number of equations
INTEGER N
REAL A(N), B(N), C(N), D(N)
REAL CP(N), DP(N), Z(N)
REAL M
INTEGER I
DATA M/1/
!initialize c-prime and d-prime
CP(1) = C(1)/B(1)
DP(1) = D(1)/B(1)
!solve for vectors c-prime and d-prime
DO I=2,N
M=b(i)-CP(I-1)*(A(I))
CP(I)=C(I)/M
DP(I)=(D(I)-DP(I-1)*A(I))/M
ENDDO
!initialize UNEW
Z(N)=DP(N)
!solve for x from the vectors c-prime and d-prime
DO I=N-1, 1, -1
Z(I)=DP(I)-CP(I)*Z(I+1)
ENDDO
END SUBROUTINE
As george says in a comment, the problem is with the subroutine SR1. So that this isn't just a CW-stealing-a-comment answer I'll also expand a bit.
The way things are structured, SR1 is a different scope from the main program. The IMPLICIT NONE in the main program doesn't apply to the subroutine, so A, B, N, S and VNEW are all implicitly typed. Apart from N, which is an integer, they are (scalar) reals.
The reference to S(J+1), as george says, means that S is not only a scalar real, but also a function. Remember that SR1 is a different scope, and no information about types, shapes, etc. is passed from the caller to the callee. Further, the fact that the dummy argument called A in SR1 happens to have the same name as the actual argument in the call doesn't mean that the callee "knows" anything about it. Your call to SR2 with VNEW is also a problem for the same reason.
The question is tagged "fortran77", so there isn't much in the language itself to ensure a lot of checking goes on, but there may well be compiler options that help. And since you can already use IMPLICIT NONE (which is not Fortran 77), putting it in each subroutine as well would detect your problems.
But, the question is also tagged "fortran" and "fortran95" so I'll point out that there are far better ways to detect the issues, using more modern features. Look at interfaces, modules and internal procedures.
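As a sketch of what that can look like, here is a tridiagonal solver modelled on the question's SR2, placed in a module with IMPLICIT NONE and fully declared dummy arguments. The module and subroutine names are made up for this example. Because a caller that uses the module sees the interface, the compiler can reject a call that passes a scalar where an array is expected, and an undeclared name inside the subroutine becomes a compile-time error.

```fortran
module tridiag_mod
  implicit none          ! applies to every procedure in the module
contains
  ! Solve a tridiagonal system with the Thomas algorithm.
  ! a: sub-diagonal (a(1) unused), b: main diagonal,
  ! c: super-diagonal (c(n) unused), d: right-hand side, z: solution.
  subroutine thomas(a, b, c, d, z)
    real, intent(in)  :: a(:), b(:), c(:), d(:)
    real, intent(out) :: z(:)
    real :: cp(size(b)), dp(size(b)), m
    integer :: i, n

    n = size(b)
    ! forward sweep: eliminate the sub-diagonal
    cp(1) = c(1)/b(1)
    dp(1) = d(1)/b(1)
    do i = 2, n
      m = b(i) - cp(i-1)*a(i)
      cp(i) = c(i)/m
      dp(i) = (d(i) - dp(i-1)*a(i))/m
    end do
    ! back substitution
    z(n) = dp(n)
    do i = n-1, 1, -1
      z(i) = dp(i) - cp(i)*z(i+1)
    end do
  end subroutine thomas
end module tridiag_mod
```

A caller adds use tridiag_mod and calls thomas(a, b, c, d, z) with arrays of equal length; the array sizes travel with the assumed-shape arguments, so no separate N has to be passed or kept consistent by hand.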