OpenMP code not giving any result in Intel Fortran - fortran

I wrote a serial code for the conjugate gradient method and tried to parallelize it with OpenMP (my platform is an Intel cluster).
With the serial code I get the following output:
Test data (matrix and right hand side) :
5.00000 0.00000 -1.00000 -1.00000 1.00000
0.00000 5.00000 -1.00000 -1.00000 1.00000
-1.00000 -1.00000 5.00000 -1.00000 1.00000
-1.00000 -1.00000 -1.00000 5.00000 1.00000
Solution for the linear system:
0.37500 0.37500 0.43750 0.43750
But when I use OpenMP for one particular do loop, I do not get the solution. The output is only this:
Test data (matrix and right hand side) :
5.00000 0.00000 -1.00000 -1.00000 1.00000
0.00000 5.00000 -1.00000 -1.00000 1.00000
-1.00000 -1.00000 5.00000 -1.00000 1.00000
-1.00000 -1.00000 -1.00000 5.00000 1.00000
There are no errors during compilation.
Can anybody help?
I have copy-pasted the OpenMP code below.
module cg
contains
subroutine cg_method (n,a,y,x)
use omp_lib
implicit none
integer :: n, k , i , j
integer,parameter :: sze = 24 , nt = 4
double precision :: tol = 2.d-16
double precision :: a(0:sze,0:sze),x(0:sze),y(0:sze)
double precision :: d(0:n-1),g(0:n-1),auxD(0:n-1),alpha,beta
double precision :: num,den,aux1,aux2,dist,xnorm
!$call omp_set_num_threads(nt)
! start with x at origin
do i = 0,n-1
x(i) = 0.d0
enddo
! initialize d and g
! d = -g = -(a*x-y) = y as x = 0
do i = 0,n-1
aux1 = y(i)
d(i) = aux1
g(i) = -aux1
enddo
! perform at most n steps of CG algo
do k = 0,n
! compute new alpha
! alpha = -(d(transp)*g)/(d(transp)*(a*d))
num = 0.d0
den = 0.d0
do i = 0,n-1
num = num + d(i)*g(i)
aux1 = 0.d0
do j = 0 , i-1
aux1 = aux1 + a(j,i)*d(j)
enddo
do j = i,n-1
aux1 = aux1 + a(i,j)*d(j)
enddo
auxD(i) = aux1
den = den + d(i)*aux1
enddo
alpha = -num/den
! compute the norm of x and alpha*d and find a new x
!x = x + alpha*d , then check if x is close in order to stop the process
!before n complete steps
xnorm = 0.d0
dist = 0.d0
do i = 0,n-1
aux1 = x(i)
xnorm = xnorm + aux1*aux1
aux2 = alpha*d(i)
dist = dist + aux2*aux2
x(i) = aux1 + aux2
enddo
!compute new g : g + alpha*(a*d)
do i = 0,n-1
g(i) = g(i) + alpha*auxD(i)
enddo
! compute new beta :
! beta = (g(transp)*(a*d))/(d(transp)*(a*d))
num = 0.d0
do i = 0,n-1
num = num + g(i)*auxD(i)
enddo
beta = num/den
!compute new d : d = -g + beta*d
!$omp parallel default(none) shared(beta,d,g) private(i,n)
!$omp do
do i = 0,n-1
d(i) = -g(i) + beta*d(i)
enddo
!$omp enddo
!$omp end parallel
enddo !k loop
end subroutine cg_method
end module cg
program test
use cg
use omp_lib
integer,parameter :: sze = 24
double precision :: a(0:sze,0:sze), y(0:sze),x(0:sze)
integer :: n , i , j
n = 4
!define a matrix
a(0,0)=5.d0;a(0,2)=-1.d0;a(0,3)=-1.d0
a(1,1)=5.d0;a(1,2)=-1.d0;a(1,3)=-1.d0
a(2,2)=5.d0;a(2,3)=-1.d0;a(3,3)=5.d0
!define b vector
y(0)=1.d0;y(1)=1.d0;y(2)=1.d0;y(3)=1.d0
print*,'Test data (matrix and right hand side) :'
do i = 0,n-1
write(*,100) (a(j,i),j=0,i-1),(a(i,j),j=i,n-1),y(i)
enddo
call cg_method(n,a,y,x) !perform CG method
print *,' Solution for the linear system:'
write(*,100) (x(i),i=0,n-1)
100 format(10F9.5)
end program test

The solution is quite simple: private variables are undefined on entry to an OpenMP parallel region, and again after leaving it. Here n is declared private, so the loop bound n-1 is undefined inside the region. Simply change the corresponding line to
!$omp parallel default(none) shared(beta,d,g) private(i) firstprivate(n)
and everything works as expected. With firstprivate, each thread's private copy is initialized with the value the original variable had on entry to the parallel region.
But why do you need to declare n private at all? It is only read! Your code also works with a shared n:
!$omp parallel default(none) shared(beta,d,g,n) private(i)
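The shared-n variant can be tried in isolation with a minimal sketch like the one below. The program name, the variable m (standing in for the dummy argument n) and the test values are made up for illustration; only the loop body comes from the question.

```fortran
program shared_bound_demo
implicit none
integer, parameter :: n = 4
integer :: i, m
double precision :: d(0:n-1), g(0:n-1), beta

m = n            ! loop bound, only read inside the parallel region
beta = 0.5d0     ! hypothetical test values
d = 1.d0
g = 2.d0

! m is only read, so it can be shared. firstprivate(m) would also work,
! whereas private(m) would leave m undefined inside the region.
!$omp parallel do default(none) shared(beta, d, g, m) private(i)
do i = 0, m-1
d(i) = -g(i) + beta*d(i)
enddo
!$omp end parallel do

print *, d
end program shared_bound_demo
```

With these values every element of d becomes -2 + 0.5*1 = -1.5, regardless of the number of threads.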

Related

Vector-Sparse Matrix multiplication using MKL

I have a simple Fortran code which converts the matrix A=[1,1;1,1] to CSR sparse format and then multiplies it by the vector x=(100,200), i.e. y=A*x.
Unfortunately the result is y=(200,200), while it should be y=(300,300). Thanks.
program main
implicit none
include 'mkl_spblas.fi'
integer :: nzmax, nnz, job( 8 ), m, n, lda, info, irow, k
double precision :: A(2,2)
double precision, allocatable :: Asparse(:)
integer, allocatable :: ia(:), ja(:)
double precision:: x(2)
double precision:: y(2)
A(1,1) = 1.d0
A(1,2) = 1.d0
A(2,1) = 1.d0
A(2,2) = 1.d0
x(1) = 100.d0
x(2) = 200.d0
!! Give an estimate of the number of non-zeros.
nzmax = 4
print *, "nzmax = ", nzmax
m = size( A, 1 ) !! number of rows
n = size( A, 2 ) !! number of columns
lda = m !! leading dimension of A
allocate( Asparse( nzmax ) )
allocate( ja( nzmax ) ) !! <-> columns(:)
allocate( ia( m + 1 ) ) !! <-> rowIndex(:)
job( 1 ) = 0 !! convert dense to sparse A
job( 2:3 ) = 1 !! use 1-based indices
job( 4 ) = 2 !! use the whole A as input
job( 5 ) = nzmax !! maximum allowed number of non-zeros
job( 6 ) = 1 !! generate Asparse, ia, and ja as output
call mkl_ddnscsr( job, m, n, A, lda, Asparse, ja, ia, info )
if ( info /= 0 ) then
print *, "insufficient nzmax (stopped at ", info, "row)"; stop
endif
nnz = ia(m+1) - 1 !! ia is one-based, so its last entry is nnz + 1
print *, "number of non-zero elements = ", nnz
do irow = 1, m
!! This loop runs only for rows having nonzero elements.
do k = ia( irow ), ia( irow + 1 ) - 1
print "(2i5, f15.8)", irow, ja( k ), Asparse( k )
enddo
enddo
call mkl_cspblas_dcsrgemv('n', m, Asparse, ia, ja, x, y)
print*, y
end program
That's because you are calling an MKL subroutine that expects C-style indexing:
mkl_cspblas_dcsrgemv uses zero-based indexing.
Your arrays are one-based (job(2:3) = 1), so you should use mkl_dcsrgemv, which uses one-based indexing.
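To see what the one-based CSR arrays mean, here is a small hand-written matrix-vector product that does not need MKL. It is only a sketch: the arrays vals, ja and ia are typed in by hand as the one-based CSR form of A=[1,1;1,1], not taken from an actual mkl_ddnscsr run.

```fortran
program csr_one_based_demo
implicit none
integer, parameter :: m = 2
! One-based CSR form of A = [1 1; 1 1] (assumed, written out by hand)
double precision :: vals(4) = [1.d0, 1.d0, 1.d0, 1.d0]
integer :: ja(4) = [1, 2, 1, 2]   ! column index of each stored value
integer :: ia(3) = [1, 3, 5]      ! start of each row in vals, plus nnz + 1
double precision :: x(2) = [100.d0, 200.d0], y(2)
integer :: irow, k

do irow = 1, m
y(irow) = 0.d0
! row irow owns entries ia(irow) .. ia(irow+1)-1
do k = ia(irow), ia(irow+1) - 1
y(irow) = y(irow) + vals(k) * x(ja(k))
enddo
enddo

print *, y   ! both components are 300
end program csr_one_based_demo
```

In the original program the corresponding change is to replace the last call by call mkl_dcsrgemv('n', m, Asparse, ia, ja, x, y), which reads ia and ja with the same one-based convention.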

values are being assigned correctly but doesn't work on the calculation [duplicate]

This question already has answers here:
integer, do loop, fortran, error
(1 answer)
Why are the elements of an array formatted as zeros when they are multiplied by 1/2 or 1/3?
(1 answer)
Closed 5 years ago.
I am using a do loop to calculate the present value. When j = 1, I get the correct product and sum, but for j from 2 to 20 the product is just 0. What did I miss?
program q5
implicit none
integer :: j, k, lx, d, ld, row, col
real :: pv, v, qd, sum, product
real, dimension(110,3) :: arr
sum = 0.0
product = 0.0
lx = 0
ld = 0
qd = 0.0
!connect file
open( unit = 1, file = 'Females_Australian_Life_Tables_2010-12.csv' )
do row = 1, 110
read( 1, *) arr(row,:)
write(*,*) arr(row,:)
enddo
do j = 1, 20
d = 66 + j - 1
v = 1 / ( 1 + 0.05 )
lx = arr(66,2)
ld = arr(d,2)
qd = arr(d,3)
product = ( ld/lx ) * qd * (v**j)
sum = sum + product
print*, '*************', lx, ld, qd, '**********'
print*, 'j=', j
print*, ' product ', product
print*, ' sum ', sum
enddo
pv = 30000 * sum
!disconnect file
close( 1 )
print*, 'The present value is', pv
end program q5
this is the error I have got
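The duplicates linked above both concern integer division, which would explain the symptom: lx and ld are declared integer, so ld/lx is evaluated as an integer. For j = 1, d is 66 and ld equals lx, giving 1; as soon as ld is smaller than lx the quotient is truncated to 0. A minimal sketch, with made-up values standing in for the life-table entries:

```fortran
program int_division_demo
implicit none
integer :: lx, ld
real :: ratio_int, ratio_real

lx = 100000   ! hypothetical values, not taken from the CSV file
ld = 90000

ratio_int  = ld / lx               ! integer division, truncated to 0
ratio_real = real(ld) / real(lx)   ! real division, 0.9

print *, ratio_int, ratio_real
end program int_division_demo
```

In the question's code, declaring lx and ld as real (or writing real(ld)/real(lx)) should therefore be enough, assuming nothing else is wrong with the input data.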

Is there a command or subroutine for LU factorization? [closed]

In MatLab, the command lu(A) gives as output the two matrices L and U, that is, the LU factorization of A. I was wondering whether there is some command in Fortran doing exactly the same. I have not been able to find it anywhere. I found a lot of LAPACK subroutines which solve linear systems by first performing the LU factorization, but for my purposes I need to specifically perform the LU factorization and store the L and U matrices.
Is there a command or subroutine which has as input a square matrix A and as outputs the matrices L and U of the LU factorization?
There is no built-in command that corresponds to lu in Matlab, but we can write a simple wrapper to dgetrf in LAPACK such that the interface is similar to lu, e.g.,
module linalg
implicit none
integer, parameter :: dp = kind(0.0d0)
contains
subroutine lufact( A, L, U, P )
!... P * A = L * U
!... http://www.netlib.org/lapack/explore-3.1.1-html/dgetrf.f.html
!... (note that the definition of P is opposite to that of the above page)
real(dp), intent(in) :: A(:,:)
real(dp), allocatable, dimension(:,:) :: L, U, P
integer, allocatable :: ipiv(:)
real(dp), allocatable :: row(:)
integer :: i, n, info
n = size( A, 1 )
allocate( L( n, n ), U( n, n ), P( n, n ), ipiv( n ), row( n ) )
L = A
call DGETRF( n, n, L, n, ipiv, info )
if ( info /= 0 ) stop "lufact: info /= 0"
U = 0.0d0
P = 0.0d0
do i = 1, n
U( i, i:n ) = L( i, i:n )
L( i, i:n ) = 0.0d0
L( i, i ) = 1.0d0
P( i, i ) = 1.0d0
enddo
!... Assuming that P = P[ipiv(n),n] * ... * P[ipiv(1),1]
!... where P[i,j] is a permutation matrix for i- and j-th rows.
do i = 1, n
row = P( i, : )
P( i, : ) = P( ipiv(i), : )
P( ipiv(i), : ) = row
enddo
endsubroutine
end module
Then, we can test the routine with a 3x3 matrix shown in the Matlab page for lu():
program test_lu
use linalg
implicit none
real(dp), allocatable, dimension(:,:) :: A, L, U, P
allocate( A( 3, 3 ) )
A( 1, : ) = [ 1, 2, 3 ]
A( 2, : ) = [ 4, 5, 6 ]
A( 3, : ) = [ 7, 8, 0 ]
call lufact( A, L, U, P ) !<--> [L,U,P] = lu( A ) in Matlab
call show( "A =", A )
call show( "L =", L )
call show( "U =", U )
call show( "P =", P )
call show( "P * A =", matmul( P, A ) )
call show( "L * U =", matmul( L, U ) )
call show( "P' * L * U =", matmul( transpose(P), matmul( L, U ) ) )
contains
subroutine show( msg, X )
character(*) :: msg
real(dp) :: X(:,:)
integer i
print "(/,a)", trim( msg )
do i = 1, size(X,1)
print "(*(f8.4))", X( i, : )
enddo
endsubroutine
end program
which gives the expected result:
A =
1.0000 2.0000 3.0000
4.0000 5.0000 6.0000
7.0000 8.0000 0.0000
L =
1.0000 0.0000 0.0000
0.1429 1.0000 0.0000
0.5714 0.5000 1.0000
U =
7.0000 8.0000 0.0000
0.0000 0.8571 3.0000
0.0000 0.0000 4.5000
P =
0.0000 0.0000 1.0000
1.0000 0.0000 0.0000
0.0000 1.0000 0.0000
P * A =
7.0000 8.0000 0.0000
1.0000 2.0000 3.0000
4.0000 5.0000 6.0000
L * U =
7.0000 8.0000 0.0000
1.0000 2.0000 3.0000
4.0000 5.0000 6.0000
P' * L * U =
1.0000 2.0000 3.0000
4.0000 5.0000 6.0000
7.0000 8.0000 0.0000
Here please note that the inverse of P is given by its transpose (i.e., inv(P) = P' = transpose(P)) because P is the product of (elementary) permutation matrices.
I have added a method that computes the LU factorization with the Doolittle method. Note that, unlike the dgetrf-based answer above, it does no pivoting, so it breaks down if a zero pivot U(k,k) appears. To run it you have to provide an input file in the format given below. Since the factorization is done in a subroutine, you can add it to your code and call it whenever required. The code, input file, and output file are as follows.
PROGRAM DOLITTLE
IMPLICIT NONE
INTEGER :: n
!**********************************************************
! READS THE NUMBER OF EQUATIONS TO BE SOLVED.
OPEN(UNIT=1,FILE='input.dat',ACTION='READ',STATUS='OLD')
READ(1,*) n
CLOSE(1)
!**********************************************************
CALL LU(n)
END PROGRAM
!==========================================================
! SUBROUTINES TO MAIN PROGRAM
SUBROUTINE LU(n)
IMPLICIT NONE
INTEGER :: i,j,k,p,n,z,ii,itr = 500000
REAL(KIND=8) :: temporary,s1,s2
REAL(KIND=8),DIMENSION(1:n) :: x,b,y
REAL(KIND=8),DIMENSION(1:n,1:n) :: A,U,L,TEMP
REAL(KIND=8),DIMENSION(1:n,1:n+1) :: AB
! READING THE SYSTEM OF EQUATIONS
OPEN(UNIT=2,FILE='input.dat',ACTION='READ',STATUS='OLD')
READ(2,*)
DO I=1,N
READ(2,*) A(I,:)
END DO
DO I=1,N
READ(2,*) B(I)
END DO
CLOSE(2)
! The z loop only repeats the factorization (e.g. for timing);
! the result is identical on every pass.
DO z=1,itr
U(:,:) = 0
L(:,:) = 0
DO j = 1,n
L(j,j) = 1.0d0
END DO
DO j = 1,n
U(1,j) = A(1,j)
END DO
DO i=2,n
! Row i of L: L(i,k) = (A(i,k) - sum_{p<k} L(i,p)*U(p,k)) / U(k,k)
DO k=1,i-1
s1=0
DO p=1,k-1
s1=s1+L(i,p)*U(p,k)
END DO
L(i,k)=(A(i,k)-s1)/U(k,k)
END DO
! Row i of U: U(i,k) = A(i,k) - sum_{p<i} L(i,p)*U(p,k)
DO k=i,n
s2=0
DO p=1,i-1
s2=s2+L(i,p)*U(p,k)
END DO
U(i,k)=A(i,k)-s2
END DO
END DO
! Write the factors once, on the first pass only.
IF(z.eq.1)THEN
OPEN(UNIT=3,FILE='output.dat',ACTION='write')
WRITE(3,*) ''
WRITE(3,*) '********** SOLUTIONS *********************'
WRITE(3,*) ''
WRITE(3,*) ''
WRITE(3,*) 'UPPER TRIANGULAR MATRIX'
DO I=1,N
WRITE(3,*) U(I,:)
END DO
WRITE(3,*) ''
WRITE(3,*) ''
WRITE(3,*) 'LOWER TRIANGULAR MATRIX'
DO I=1,N
WRITE(3,*) L(I,:)
END DO
CLOSE(3)
END IF
END DO
END SUBROUTINE
Here is the input file format for the system Ax=B. The first line is the number of equations, the next three lines are the rows of the matrix A, and the last three lines are the vector B:
3
10 8 3
3 20 1
4 5 15
18
23
9
And the output is generated as,
********** SOLUTIONS *********************
UPPER TRIANGULAR MATRIX
10.000000000000000 8.0000000000000000 3.0000000000000000
0.0000000000000000 17.600000000000001 0.1000000000000009
0.0000000000000000 0.0000000000000000 13.789772727272727
LOWER TRIANGULAR MATRIX
1.0000000000000000 0.0000000000000000 0.0000000000000000
0.2999999999999999 1.0000000000000000 0.0000000000000000
0.4000000000000002 0.1022727272727273 1.0000000000000000
You can try "Numerical Recipes in Fortran 77";
it contains an LU decomposition subroutine.
It also has a lot of other useful subroutines: linear algebra, statistics, etc.

Write Newton binomial in Fortran90

I have to write a script in Fortran that returns the result of the Newton binomial:
(a+b)**n = sum over k = 0..n of C(n,k) * a**(n-k) * b**k
for given a, b and n.
The problem is that I can't use functions or subroutines.
So far I have written the code for the combinations:
if (n==0) then
print*, "Cnk=",Cnk
else if ((n>=0).and.(k==0)) then
print*, "Cnk=",Cnk
else
do i=1,n,1
aux=aux*i
if (k==i) then
factK=aux
end if
if ((n-k)==i) then
factnk=aux
end if
factn=aux
end do
Cnk=factn/(factk*factnk)
print*, "Cnk=",Cnk
end if
In the case of the binomial, k varies from 0 to n.
Probably not the fastest solution, but quite short:
program binom
implicit none
integer,parameter :: N=5
integer,parameter :: a=3
integer,parameter :: b=5
integer :: k, i
integer :: coeff, eval, total
total = 0
do i=0,N
coeff = product((/ (k,k=1,n) /)) / product((/ (k,k=1,i),(k,k=1,n-i) /))
eval = coeff * a**(n-i) * b**i
total = total + eval
write(*,*) 'i=',i,'coeff=',coeff, 'eval=',eval
enddo !i
write(*,*) '(a+b)**n=',(a+b)**N,'Total=',total
end program binom
Output:
i= 0 coeff= 1 eval= 243
i= 1 coeff= 5 eval= 2025
i= 2 coeff= 10 eval= 6750
i= 3 coeff= 10 eval= 11250
i= 4 coeff= 5 eval= 9375
i= 5 coeff= 1 eval= 3125
(a+b)**n= 32768 Total= 32768
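If even intrinsic functions such as product are off limits, the coefficients can be built up with the recurrence C(n,k+1) = C(n,k)*(n-k)/(k+1), which also avoids the large intermediate factorials. A minimal sketch with the same a, b and n as above (the program name is made up):

```fortran
program binom_recurrence
implicit none
integer, parameter :: n = 5, a = 3, b = 5
integer :: k, coeff, total

coeff = 1   ! C(n,0)
total = 0
do k = 0, n
total = total + coeff * a**(n-k) * b**k
! C(n,k+1) = C(n,k)*(n-k)/(k+1); the product is always divisible by k+1,
! so the integer division is exact.
coeff = coeff * (n - k) / (k + 1)
enddo

print *, 'Total=', total, '(a+b)**n=', (a+b)**n
end program binom_recurrence
```

Both printed values are 32768 for these parameters. Integer overflow is still possible for large a, b or n, since everything is kept in default integers.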

How to call each 4 values out of 40 values in fortran

I have a column matrix with 40 values. Say,
1
4
5
2
4
1
9
.
.
.
2
How can I take the values four at a time and average each group, up to the 40th value? I managed to do it in the following way, but is there a better way?
i = 1, 4
avg1 = avg + avg(i)
i = 5,8
avg2 = avg + avg(i)
i = 9,12
avg3 = avg + avg(i)
.......
i = 37,40
avg10 = avg + avg(i)
It took me a couple of iterations to get the syntax right, but how about this?
integer, parameter, dimension(*) :: a = [ 1, 4, 5, ..., 2 ]
integer :: i
real, dimension(10) :: avg
avg = [ (sum(a(i * 4 + 1 : (i + 1) * 4)) / 4., i = 0, 9) ]
print *, avg
end
How about that?
program testing
implicit none
integer, dimension(40) :: array
real, dimension(10) :: averages
integer :: i, j, k, aux
array(:) = (/(i, i=1,40)/) ! just values 1 to 40
averages(:) = 0.0
k = 1 ! to keep track of where to store the next average
do i=1,40,4 ! iterate over the entire array in steps of 4
aux = 0 ! just a little helper variable, not really required, but neater I think
do j=i,i+3 ! iterating over 4 consecutive values
aux = aux + array(j)
end do
averages(k) = aux / 4.0
k = k + 1
end do
print *, averages
end program testing
This is the output:
2.500000 6.500000 10.50000 14.50000 18.50000
22.50000 26.50000 30.50000 34.50000 38.50000
Is this what you were looking for?
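A loop-free variant is also possible, as a sketch: reshape the 40 values into a 4x10 array, so that each column holds one group of four consecutive values, and sum along the first dimension. The test data 1 to 40 are the same as above; the program name is made up.

```fortran
program block_averages
implicit none
integer, dimension(40) :: array
real, dimension(10) :: averages
integer :: i

array = (/ (i, i=1,40) /)   ! just values 1 to 40

! reshape fills column by column, so column j holds array(4*j-3 : 4*j);
! sum(..., dim=1) adds up each column, and dividing by 4.0 gives a real mean.
averages = sum(reshape(array, (/ 4, 10 /)), dim=1) / 4.0

print *, averages
end program block_averages
```

For the values 1 to 40 this gives the same ten averages as the loop version, 2.5 through 38.5.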