Declaring parameters through input arguments in FORTRAN - fortran

I am trying to write a fortran routine where I declare arrays whose length comes from operations made upon the input parameters.
subroutine my_program(N, A,B,m)
implicit none
integer, intent(in) :: N
integer, parameter :: minA = 1, maxA = N
integer, parameter :: minB = 0, maxB = N-1
double precision, intent(out) :: max,A(minA:maxA),B(minB:maxB)
A = 0.d0
B = 1.d0
m = maxA*maxB-minA*minB
end subroutine my_program
Right now, I have an error coming from the the 5th line Parameter 'N' at (1) has not been declared or is a variable, which does not reduce to a constant expression

N is not known at compile time, so you cannot use it to initialize a parameter. Instead, use N directly to declare A and B:
subroutine my_program(N, A, B, m)
implicit none
integer, intent(in) :: N
double precision, intent(out) :: m, A(1:N), B(0:N-1)
integer :: minA, maxA
integer :: minB, maxB
minA = 1 ; maxA = N
minB = 0 ; maxB = N-1
A = 0.d0
B = 1.d0
m = maxA*maxB - minA*minB
end subroutine my_program

Related

Fortran Subroutines/Functions: Returned Value Changes If Subroutines/Functions Is Called More Often?

I am currently implementing integrals in Fortran as subroutines. The subroutines on their own return the correct values. If i now call the e.g. same subroutine twice after each other, with the same input values, their returned value differs significantly?
The main program only calls the function like this:
program main
use types
use constants
use integrals
use basis
real(dp), dimension(2,3) :: molecule_coords
real(dp), dimension(2) :: z
type(primitive_gaussian), allocatable :: molecule(:,:)
molecule_coords(1,:) = (/0.,0.,0./)
molecule_coords(2,:) = (/0.,0.,1.6/)
molecule = def_molecule(molecule_coords)
z = (/1.0, 1.0/)
call overlap(molecule) ! Correct Value returned
call overlap(molecule) ! Wrong Value returned
end program main
My function for the overlap looks like this:
module integrals
use types
use constants
use basis
use stdlib_specialfunctions_gamma!, only: lig => lower_incomplete_gamma
contains
subroutine overlap(molecule)
implicit none
type(primitive_gaussian), intent(in) :: molecule(:,:)
integer :: nbasis, i, j, k, l
real(dp) :: norm, p, q, coeff, Kab
real(dp), dimension(3) :: Q_xyz
real(dp), dimension(INT(size(molecule,1)),INT(size(molecule,1))) :: S
nbasis = size(molecule,1)
do i = 1, nbasis
do j = 1, nbasis
! Iterate over l and m primitives in basis
do k = 1, size(molecule(i,:))
do l = 1, size(molecule(j,:))
norm = molecule(i, k)%norm() * molecule(j, l)%norm()
! Eq. 63
Q_xyz = (molecule(i, k)%coords - molecule(j, l)%coords)
! Eq. 64, 65
p = (molecule(i, k)%alpha + molecule(j, l)%alpha)
q = (molecule(i, k)%alpha * molecule(j, l)%alpha) / p
! Eq. 66
Kab = exp(-q * dot_product(Q_xyz,Q_xyz))
coeff = molecule(i, k)%coeff * molecule(j, l)%coeff
S(i,j) = S(i,j) + norm * coeff * Kab * (pi / p) ** (1.5)
end do
end do
end do
end do
print *, S
end subroutine overlap
end module integrals
I am a bit lost, why this would be the case, but I am also rather new to Fortran.
Any help is appreciated! Thanks!

Why does a manually programmed matrix multiplication combined with matrix addition give better performance than the intrinsic functions?

I have some legacy code which performs the matrix operation of B = B + A*E as
DO I = 1,N
DO L = 1,N
DO K = 1,N
B(I,K) = B(I,K) + A(I,L)*E(L,K,J-1)
end do
end do
end do
To improve readability as well as take advantage of modern fortran intrinsic functions, I would like to write the above code as
B = B + matmul( A, E(:, 1:N, J-1) )
I noticed that the improved readability comes at the cost of performance. I determined that the problem is not with the intrinsic function matmul - the left figure shows that matmul performs just as well as the manually written operation for all values of N.
When matrix multiplication is combined with matrix addition, then for small values of N the manually written operation performs better than the intrinsic functions. For my uses, usually N < 10; I would like to improve the readability without losing the performance. Might there be a suggestion for that?
The code I am using is below. I am using Mac OS 10.14.6 with gfortran 8.2.0 and compiling with the -O3 optimization option.
program test
implicit none
integer :: loop_max = 1000
integer :: j ! loop index
integer :: i ! loop index
real :: t1, t2 ! start and end times
real :: t_manual, t_intrinsic, t_man_add, t_intrn_add
integer :: N ! matrix dimension
integer, parameter :: NJ = 12
real, dimension(:, :), allocatable :: A, B ! matrices
real, dimension(:, :), allocatable :: D
real, dimension(:), allocatable :: G
real, dimension(:, :, :), allocatable :: E
open(1, file = 'Delete.txt', status = 'unknown')
do N = 1, 40
allocate(A(N,N), B(N,N), G(N), D(N, 2*N+1), E(N, N+1, NJ))
! ##########################################################################
! manual matrix multiplication vs matmul
call rand_fill
call CPU_time(t1)
do i = 1, loop_max
do j = 2, 12
call matmul_manual(j, N, NJ, A, B, D, G, E)
end do
end do
call CPU_time(t2)
t_manual = t2 - t1
write(1, *) A, B, D, G, E
call rand_fill
call CPU_time(t1)
do i = 1, loop_max
do j = 2, 12
B = matmul( A, E(:, 1:N, j-1) )
end do
end do
call CPU_time(t2)
t_intrinsic = t2 - t1
write(1, *) A, B, D, G, E
! --------------------------------------------------------------------------
! ##########################################################################
! manual matrix multiplication with matrix addition
call rand_fill
call CPU_time(t1)
do i = 1, loop_max
do j = 2, 12
call manual_matmul_add(j, N, NJ, A, B, D, G, E)
end do
end do
call CPU_time(t2)
t_man_add = t2 - t1
write(1, *) A, B, D, G, E
! --------------------------------------------------------------------------
! ##########################################################################
! intrinsic matrix multiplication (matmul) with matrix addition
call rand_fill
call CPU_time(t1)
do i = 1, loop_max
do j = 2, 12
call intrinsic_matmul_add(j, N, NJ, A, B, D, G, E)
end do
end do
call CPU_time(t2)
t_intrn_add = t2 - t1
write(1, *) A, B, D, G, E
! --------------------------------------------------------------------------
deallocate(A, B, D, G, E)
print*, N, t_manual, t_intrinsic, t_man_add, t_intrn_add
end do
contains
subroutine rand_fill
! fill the matrices with random numbers
call random_number(A)
call random_number(B)
call random_number(D)
call random_number(G)
call random_number(E)
end subroutine
end program test
subroutine matmul_manual(j, N, NJ, A, B, D, G, E)
implicit none
integer, intent(in) :: j
integer, intent(in) :: N, NJ
real, dimension(N, N), intent(in out) :: A, B
real, dimension(N, 2*N+1), intent(in out) :: D
real, dimension(N), intent(in out) :: G
real, dimension(N, N+1, NJ), intent(in out) :: E
integer :: I, L, K ! loop indices
B = 0.0
DO I = 1,N
DO L = 1,N
DO K = 1,N
B(I,K) = B(I,K) + A(I,L)*E(L,K,J-1)
end do
end do
end do
end subroutine matmul_manual
subroutine manual_matmul_add(j, N, NJ, A, B, D, G, E)
implicit none
integer, intent(in) :: j
integer, intent(in) :: N, NJ
real, dimension(N, N), intent(in out) :: A, B
real, dimension(N, 2*N+1), intent(in out) :: D
real, dimension(N), intent(in out) :: G
real, dimension(N, N+1, NJ), intent(in out) :: E
integer :: I, L, K ! loop indices
DO I = 1,N
D(I,N+1) = -G(I)
DO L = 1,N
D(I,N+1) = D(I,N+1)+A(I,L)*E(L,N+1,J-1)
DO K = 1,N
B(I,K) = B(I,K) + A(I,L)*E(L,K,J-1)
end do
end do
end do
end subroutine manual_matmul_add
subroutine intrinsic_matmul_add(j, N, NJ, A, B, D, G, E)
implicit none
integer, intent(in) :: j
integer, intent(in) :: N, NJ
real, dimension(N, N), intent(in out) :: A, B
real, dimension(N, 2*N+1), intent(in out) :: D
real, dimension(N), intent(in out) :: G
real, dimension(N, N+1, NJ), intent(in out) :: E
real, dimension(N, N+1) :: temp1
real, dimension(N, N) :: temp2
D(:, N+1) = -G + matmul( A, E(:, N+1, j-1) )
B = B + matmul( A, E(:, 1:N, j-1) )
end subroutine intrinsic_matmul_add
subroutine mat_sub_new(j, N, NJ, A, B, D, G, E)
implicit none
integer, intent(in) :: j
integer, intent(in) :: N, NJ
real, dimension(N, N), intent(in out) :: A, B
real, dimension(N, 2*N+1), intent(in out) :: D
real, dimension(N), intent(in out) :: G
real, dimension(N, N+1, NJ), intent(in out) :: E
if (N == 1) then ! matmul seems to be inefficient when N = 1
D(N,N+1) = -G(N) + A(N,N)*E(N, N+1, J-1)
B(N,N) = B(N,N) + A(N,N)*E(N, N, J-1)
else
D(:, N+1) = -G + matmul( A, E(:, N+1, j-1) )
B = B + matmul( A, E(:, 1:N, j-1) )
end if
end subroutine mat_sub_new
I suspect this has to do with two issues:
How the compiler resolves the MATMUL generic interface to a proper call to the correct routine (REAL vs. COMPLEX vs. INTEGER, or matrix times vector vs. matrix times matrix): I have no idea whether this is done systematically at compilation or at runtime, or whether this choice is made based on the optimization level; (this would justify the additional overhead especially for the low-size cases)
The internal "general purpose" algorithm may not be in general the best suited for your problem, as it looks like in some cases, brute-force compiler optimization does a better job. Is gfortran's MATMUL intrinsics based on BLAS, for example? If so, that may not be the fastest.
I've done a similar test with NORM2 on my PC (Windows, i7, gfortran 9.2.0, compiled with -O3 -march=core-avx2): Turns out that, for NORM2:
The BLAS implementation is always slowest, despite having been slightly refactored to feed the compiler a PURE version
Usage of the intrinsics (both NORM2 or SQRT(SUM(X**2))) is always slow, regardless of the array size
The fastest cases are when either using a simple loop, or the intrinsics with a fixed-size array:
ARRAY SIZE assumed-shape fixed-size intrinsic NORM2 LOOP BLAS
N [ms/N] [ms/N] [ms/N] [ms/N] [ms/N]
2 5.93750E-09 4.06250E-09 8.43750E-09 4.06250E-09 1.03125E-08
12 1.03125E-08 7.81250E-09 3.12500E-08 7.81250E-09 5.09375E-08
22 1.65625E-08 1.40625E-08 5.50000E-08 1.43750E-08 9.15625E-08
32 2.25000E-08 2.00000E-08 7.81250E-08 2.00000E-08 1.29375E-07
BTW The code is pasted here below (beware of the large memory footprint!):
program test_norm
use iso_fortran_env
implicit none
integer :: xsize,i,iunit,icase
integer, parameter :: testSize = 50000000
real(real64), allocatable :: set(:,:),setNorm(:)
real(real64) :: t0,t1,setSum(5),timeTable(5,35)
intrinsic :: norm2
open(newunit=iunit,file='garbage.txt',action='write')
print '(6(1x,a15))' ,'ARRAY SIZE','assumed-shape','fixed-size','intrinsic NORM2','LOOP','BLAS'
print '(6(1x,a15))' ,'N',('[ms/N]',i=1,5)
icase = 0
do xsize = 2,32,10
! Initialize test set
icase = icase+1
allocate(set(xsize,testSize),setNorm(testSize))
call random_number(set)
! Test 1: intrinsic SQRT/SUM, assumed-shape array
call cpu_time(t0); forall(i=1:testSize) setNorm(i) = norm_v1(set(:,i)); call cpu_time(t1)
setSum(1) = sum(setNorm); timeTable(1,icase) = t1-t0
! Test 2: intrinsic SQRT/SUM, fixed-size array
call cpu_time(t0); forall(i=1:testSize) setNorm(i) = norm_v2(xsize,set(:,i)); call cpu_time(t1)
setSum(2) = sum(setNorm); timeTable(2,icase) = t1-t0
! Test 3: intrinsic NORM2
call cpu_time(t0); forall(i=1:testSize) setNorm(i) = norm2(set(:,i)); call cpu_time(t1)
setSum(3) = sum(setNorm); timeTable(3,icase) = t1-t0
! Test 4: LOOP
call cpu_time(t0); forall(i=1:testSize) setNorm(i) = norm_v3(xsize,set(:,i)); call cpu_time(t1)
setSum(4) = sum(setNorm); timeTable(4,icase) = t1-t0
! Test 5: BLAS
call cpu_time(t0); forall(i=1:testSize) setNorm(i) = DNRM2(xsize,set(:,i),1); call cpu_time(t1)
setSum(5) = sum(setNorm); timeTable(5,icase) = t1-t0
! Print output
print '(7x,i2,9x,5(2x,1pe13.5e2,1x))', xsize,timeTable(:,icase)/testSize
write (iunit,*) 'setSum = ',setSum
deallocate(set,setNorm)
end do
close(iunit)
contains
pure real(real64) function norm_v1(x) result(L2)
real(real64), intent(in) :: x(:)
L2 = sqrt(sum(x**2))
end function norm_v1
pure real(real64) function norm_v2(n,x) result(L2)
integer, intent(in) :: n
real(real64), intent(in) :: x(n)
L2 = sqrt(sum(x**2))
end function norm_v2
pure real(real64) function norm_v3(n,x) result(L2)
integer, intent(in) :: n
real(real64), intent(in) :: x(n)
integer :: i
L2 = 0.0_real64
do i=1,n
L2 = L2 + x(i)**2
end do
L2 = sqrt(L2)
end function norm_v3
PURE REAL(REAL64) FUNCTION DNRM2 ( N, X, INCX )
INTEGER, INTENT(IN) :: N,INCX
REAL(REAL64), INTENT(IN) :: X( * )
REAL(REAL64), PARAMETER :: ONE = 1.0_REAL64
REAL(REAL64), PARAMETER :: ZERO = 0.0_REAL64
INTEGER :: IX
REAL(REAL64) :: ABSXI, NORM, SCALE, SSQ
INTRINSIC :: ABS, SQRT
IF( N<1 .OR. INCX<1 )THEN
NORM = ZERO
ELSE IF( N==1 )THEN
NORM = ABS( X( 1 ) )
ELSE
SCALE = ZERO
SSQ = ONE
DO IX = 1, 1 + ( N - 1 )*INCX, INCX
IF( X( IX )/=ZERO )THEN
ABSXI = ABS( X( IX ) )
IF( SCALE<ABSXI )THEN
SSQ = ONE + SSQ*( SCALE/ABSXI )**2
SCALE = ABSXI
ELSE
SSQ = SSQ + ( ABSXI/SCALE )**2
END IF
END IF
END DO
NORM = SCALE * SQRT( SSQ )
END IF
DNRM2 = NORM
RETURN
END FUNCTION DNRM2
end program test_norm

Using Minpack to solve S-curve

I'd like to use Minpack (fortran) to estimate the D parameter in the following generalized form of the S-curve: y = (A - D) / (1 + (x**B/C)) + D
The idea is that in this application, the user provides A [which is always 0 to force passage through (0,0)], B, and C, and from there Minpack will find a value for D that forces passage through (1,y), where y is also supplied by the user but must be <= 1. I was able to accomplish this task with the code below, however, minpack is claiming it hasn't converged when in fact it appears that it has. For example, when running this code and entering the values 1 (at the first prompt) and 0 4 0.1 (at the second prompting), minpack returns info = 2, which according to the comments in lmdif means:
relative error between two consecutive iterates is at most xtol.
I'm tempted to comment out line 63, but am worried that's playing with fire...are there any seasoned minpack users out there who could comment on this? Line 63 is the one that reads:
if (info /= 1) stop "failed to converge"
Am I mis-using Minpack even though it appears to converge (based on my verifying the value in pars)?
module types
implicit none
private
public dp
integer, parameter :: dp=kind(0d0)
end module
module f_vals
DOUBLE PRECISION, SAVE, DIMENSION(:), POINTER:: fixed_vals
end module
module find_fit_module
! This module contains a general function find_fit() for a nonlinear least
! squares fitting. The function can fit any nonlinear expression to any data.
use minpack, only: lmdif1
use types, only: dp
implicit none
private
public find_fit
contains
subroutine find_fit(data_x, data_y, expr, pars)
! Fits the (data_x, data_y) arrays with the function expr(x, pars).
! The user can provide any nonlinear function 'expr' depending on any number of
! parameters 'pars' and it must return the evaluated expression on the
! array 'x'. The arrays 'data_x' and 'data_y' must have the same
! length.
real(dp), intent(in) :: data_x(:), data_y(:)
interface
function expr(x, pars) result(y)
use types, only: dp
implicit none
real(dp), intent(in) :: x(:), pars(:)
real(dp) :: y(size(x))
end function
end interface
real(dp), intent(inout) :: pars(:)
real(dp) :: tol, fvec(size(data_x))
integer :: iwa(size(pars)), info, m, n
real(dp), allocatable :: wa(:)
tol = sqrt(epsilon(1._dp))
!tol = 0.001
m = size(fvec)
n = size(pars)
allocate(wa(m*n + 5*n + m))
call lmdif1(fcn, m, n, pars, fvec, tol, info, iwa, wa, size(wa))
open(222, FILE='D_Value.txt')
write(222,4) pars(1)
4 format(E20.12)
close(222)
if (info /= 1) stop "failed to converge"
contains
subroutine fcn(m, n, x, fvec, iflag)
integer, intent(in) :: m, n, iflag
real(dp), intent(in) :: x(n)
real(dp), intent(out) :: fvec(m)
! Suppress compiler warning:
fvec(1) = iflag
fvec = data_y - expr(data_x, x)
end subroutine
end subroutine
end module
program snwdeplcrv
! Find a nonlinear fit of the form y = (A - D) / (1 + (x**B/C)) + D.
use find_fit_module, only: find_fit
use types, only: dp
use f_vals
implicit none
real(dp) :: pars(1), y_int_at_1
real(dp) :: y(1) = 1.0 ! Initialization of value to be reset by user (y: value of S-curve # x=1)
real(dp) :: A, B, C
integer :: i
allocate(fixed_vals(3)) ! A, B, C parameters
pars = [1._dp] ! D parameter in S-curve function
! Read PEST-specified parameters
write(*,*) ' Enter value that S-curve should equal when SWE=1 (must be <= 1)'
read(*,*) y_int_at_1
if(y_int_at_1 > 1.0) y_int_at_1 = 1
y = y_int_at_1
! Read PEST-specified parameters
write(*,*) ' Enter S-curve parameters: A, B, & C. D parameter to be estimated '
read(*,*) A, B, C
fixed_vals(1) = A
fixed_vals(2) = B
fixed_vals(3) = C
call find_fit([(real(i, dp), i=1,size(y))], y, expression, pars)
print *, pars
contains
function expression(x, pars) result(y)
use f_vals
real(dp), intent(in) :: x(:), pars(:)
real(dp) :: y(size(x))
real(dp) :: A, B, C, D
A = fixed_vals(1)
B = fixed_vals(2)
C = fixed_vals(3)
D = pars(1)
y = (A - D) / (1 + (x**B / C)) + D
end function
end program

Storing a Variable with a Multi-Dimensional Index in Fortran

Question
Consider the following code:
program example
implicit none
integer, parameter :: n_coeffs = 1000
integer, parameter :: n_indices = 5
integer :: i
real(8), dimension(n_coeffs) :: coeff
integer, dimension(n_coeffs,n_indices) :: index
do i = 1, n_coeffs
coeff(i) = real(i*3,8)
index(i,:) = [2,4,8,16,32]*i
end do
end
For any 5 dimensional index I need to obtain the associated coefficient, without knowing or calculating i. For instance, given [2,4,8,16,32] I need to obtain 3.0 without computing i.
Is there a reasonable solution, perhaps using sparse matrices, that would work for n_indices in the order of 100 (though n_coeffs still in the order of 1000)?
A Bad Solution
One solution would be to define a 5 dimensional array as in
real(8), dimension(2000,4000,8000,16000,32000) :: coeff2
do i = 1, ncoeffs
coeff2(index(i,1),index(i,2),index(i,3),index(i,4),index(i,5)) = coeff(i)
end do
then, to get the coefficient associated with [2,4,8,16,32], call
coeff2(2,4,8,16,32)
However, besides being very wasteful of memory, this solution would not allow n_indices to be set to a number higher than 7 given the limit of 7 dimensions to an array.
OBS: This question is a spin-off of this one. I have tried to ask the question more precisely having failed in the first attempt, an effort that greatly benefited from the answer of #Rodrigo_Rodrigues.
Actual Code
In case it helps here is the code for the actual problem I am trying to solve. It is an adaptive sparse grid method for approximating a function. The main goal is to make the interpolation at the and as fast as possible:
MODULE MOD_PARAMETERS
IMPLICIT NONE
SAVE
INTEGER, PARAMETER :: d = 2 ! number of dimensions
INTEGER, PARAMETER :: L_0 = 4 ! after this adaptive grid kicks in, for L <= L_0 usual sparse grid
INTEGER, PARAMETER :: L_max = 9 ! maximum level
INTEGER, PARAMETER :: bound = 0 ! 0 -> for f = 0 at boundary
! 1 -> adding grid points at boundary
! 2 -> extrapolating close to boundary
INTEGER, PARAMETER :: max_error = 1
INTEGER, PARAMETER :: L2_error = 1
INTEGER, PARAMETER :: testing_sample = 1000000
REAL(8), PARAMETER :: eps = 0.01D0 ! epsilon for adaptive grid
END MODULE MOD_PARAMETERS
PROGRAM MAIN
USE MOD_PARAMETERS
IMPLICIT NONE
INTEGER, DIMENSION(d,d) :: ident
REAL(8), DIMENSION(d) :: xd
INTEGER, DIMENSION(2*d) :: temp
INTEGER, DIMENSION(:,:), ALLOCATABLE :: grid_index, temp_grid_index, grid_index_new, J_index
REAL(8), DIMENSION(:), ALLOCATABLE :: coeff, temp_coeff, J_coeff
REAL(8) :: temp_min, temp_max, V, T, B, F, x1
INTEGER :: k, k_1, k_2, h, i, j, L, n, dd, L1, L2, dsize, count, first, repeated, add, ind
INTEGER :: time1, time2, clock_rate, clock_max
REAL(8), DIMENSION(L_max,L_max,2**(L_max),2**(L_max)) :: coeff_grid
INTEGER, DIMENSION(d) :: level, LL, ii
REAL(8), DIMENSION(testing_sample,d) :: x_rand
REAL(8), DIMENSION(testing_sample) :: interp1, interp2
! ============================================================================
! EXECUTABLE
! ============================================================================
ident = 0
DO i = 1,d
ident(i,i) = 1
ENDDO
! Initial grid point
dsize = 1
ALLOCATE(grid_index(dsize,2*d),grid_index_new(dsize,2*d))
grid_index(1,:) = 1
grid_index_new = grid_index
ALLOCATE(coeff(dsize))
xd = (/ 0.5D0, 0.5D0 /)
CALL FF(xd,coeff(1))
CALL FF(xd,coeff_grid(1,1,1,1))
L = 1
n = SIZE(grid_index_new,1)
ALLOCATE(J_index(n*2*d,2*d))
ALLOCATE(J_coeff(n*2*d))
CALL SYSTEM_CLOCK (time1,clock_rate,clock_max)
DO WHILE (L .LT. L_max)
L = L+1
n = SIZE(grid_index_new,1)
count = 0
first = 1
DEALLOCATE(J_index,J_coeff)
ALLOCATE(J_index(n*2*d,2*d))
ALLOCATE(J_coeff(n*2*d))
J_index = 0
J_coeff = 0.0D0
DO k = 1,n
DO i = 1,d
DO j = 1,2
IF ((bound .EQ. 0) .OR. (bound .EQ. 2)) THEN
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(grid_index_new(k,d+i)-(-1)**j)/)
ELSEIF (bound .EQ. 1) THEN
IF (grid_index_new(k,i) .EQ. 1) THEN
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(-(-1)**j)/)
ELSE
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(grid_index_new(k,d+i)-(-1)**j)/)
ENDIF
ENDIF
CALL XX(d,temp(1:d),temp(d+1:2*d),xd)
temp_min = MINVAL(xd)
temp_max = MAXVAL(xd)
IF ((temp_min .GE. 0.0D0) .AND. (temp_max .LE. 1.0D0)) THEN
IF (first .EQ. 1) THEN
first = 0
count = count+1
J_index(count,:) = temp
V = 0.0D0
DO k_1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k_2 = 1,d
CALL XX(1,temp(k_2),temp(d+k_2),x1)
CALL BASE(x1,grid_index(k_1,k_2),grid_index(k_1,k_2+d),B)
T = T*B
ENDDO
V = V+coeff(k_1)*T
ENDDO
CALL FF(xd,F)
J_coeff(count) = F-V
ELSE
repeated = 0
DO h = 1,count
IF (SUM(ABS(J_index(h,:)-temp)) .EQ. 0) THEN
repeated = 1
ENDIF
ENDDO
IF (repeated .EQ. 0) THEN
count = count+1
J_index(count,:) = temp
V = 0.0D0
DO k_1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k_2 = 1,d
CALL XX(1,temp(k_2),temp(d+k_2),x1)
CALL BASE(x1,grid_index(k_1,k_2),grid_index(k_1,k_2+d),B)
T = T*B
ENDDO
V = V+coeff(k_1)*T
ENDDO
CALL FF(xd,F)
J_coeff(count) = F-V
ENDIF
ENDIF
ENDIF
ENDDO
ENDDO
ENDDO
ALLOCATE(temp_grid_index(dsize,2*d))
ALLOCATE(temp_coeff(dsize))
temp_grid_index = grid_index
temp_coeff = coeff
DEALLOCATE(grid_index,coeff)
ALLOCATE(grid_index(dsize+count,2*d))
ALLOCATE(coeff(dsize+count))
grid_index(1:dsize,:) = temp_grid_index
coeff(1:dsize) = temp_coeff
DEALLOCATE(temp_grid_index,temp_coeff)
grid_index(dsize+1:dsize+count,:) = J_index(1:count,:)
coeff(dsize+1:dsize+count) = J_coeff(1:count)
dsize = dsize + count
DO i = 1,count
coeff_grid(J_index(i,1),J_index(i,2),J_index(i,3),J_index(i,4)) = J_coeff(i)
ENDDO
IF (L .LE. L_0) THEN
DEALLOCATE(grid_index_new)
ALLOCATE(grid_index_new(count,2*d))
grid_index_new = J_index(1:count,:)
ELSE
add = 0
DO h = 1,count
IF (ABS(J_coeff(h)) .GT. eps) THEN
add = add + 1
J_index(add,:) = J_index(h,:)
ENDIF
ENDDO
DEALLOCATE(grid_index_new)
ALLOCATE(grid_index_new(add,2*d))
grid_index_new = J_index(1:add,:)
ENDIF
ENDDO
CALL SYSTEM_CLOCK (time2,clock_rate,clock_max)
PRINT *, 'Elapsed real time1 = ', DBLE(time2-time1)/DBLE(clock_rate)
PRINT *, 'Grid Points = ', SIZE(grid_index,1)
! ============================================================================
! Compute interpolated values:
! ============================================================================
CALL RANDOM_NUMBER(x_rand)
CALL SYSTEM_CLOCK (time1,clock_rate,clock_max)
DO i = 1,testing_sample
V = 0.0D0
DO L1=1,L_max
DO L2=1,L_max
IF (L1+L2 .LE. L_max+1) THEN
level = (/L1,L2/)
T = 1.0D0
DO dd = 1,d
T = T*(1.0D0-ABS(x_rand(i,dd)/2.0D0**(-DBLE(level(dd)))-DBLE(2*FLOOR(x_rand(i,dd)*2.0D0**DBLE(level(dd)-1))+1)))
ENDDO
V = V + coeff_grid(L1,L2,2*FLOOR(x_rand(i,1)*2.0D0**DBLE(L1-1))+1,2*FLOOR(x_rand(i,2)*2.0D0**DBLE(L2-1))+1)*T
ENDIF
ENDDO
ENDDO
interp2(i) = V
ENDDO
CALL SYSTEM_CLOCK (time2,clock_rate,clock_max)
PRINT *, 'Elapsed real time2 = ', DBLE(time2-time1)/DBLE(clock_rate)
END PROGRAM
For any 5 dimensional index I need to obtain the associated
coefficient, without knowing or calculating i. For instance, given
[2,4,8,16,32] I need to obtain 3.0 without computing i.
function findloc_vector(matrix, vector) result(out)
integer, intent(in) :: matrix(:, :)
integer, intent(in) :: vector(size(matrix, dim=2))
integer :: out, i
do i = 1, size(matrix, dim=1)
if (all(matrix(i, :) == vector)) then
out = i
return
end if
end do
stop "No match for this vector"
end
And that's how you use it:
print*, coeff(findloc_vector(index, [2,4,8,16,32])) ! outputs 3.0
I must confess I was reluctant to post this code because, even though this answers your question, I honestly think this is not what you really want/need, but you dind't provide enough information for me to know what you really do want/need.
Edit (After actual code from OP):
If I decrypted your code correctly (and considering what you said in your previous question), you are declaring:
REAL(8), DIMENSION(L_max,L_max,2**(L_max),2**(L_max)) :: coeff_grid
(where L_max = 9, so size(coeff_grid) = 21233664 =~160MB) and then populating it with:
DO i = 1,count
coeff_grid(J_index(i,1),J_index(i,2),J_index(i,3),J_index(i,4)) = J_coeff(i)
ENDDO
(where count is of the order of 1000, i.e. 0.005% of its elements), so this way you can fetch the values by its 4 indices with the array notation.
Please, don't do that. You don't need a sparse matrix in this case either. The new approach you proposed is much better: storing the indices in each row of an smaller array, and fetching on the array of coefficients by the corresponding location of those indices in its own array. This is way faster (avoiding the large allocation) and much more memory-efficient.
PS: Is it mandatory for you to stick to Fortran 90? Its a very old version of the standard and chances are that the compiler you're using implements a more recent version. You could improve the quality of your code a lot with the intrinsic move_alloc (for less array copies), the kind constants from the intrinsic module iso_fortran_env (for portability), the [], >, <, <=,... notation (for readability)...

Unexpected Behavior With Basic Gaussian Elimination in Fortran

I am trying to solve a system of linear equations in matrix form using one of the many widely-available routines, specifically this function gauss (along with sample data with a known result), but regardless of which elimination strategy I use I consistently get incorrect results and don't know where to turn at this point. My code:
MODULE gaussMod
CONTAINS
function gauss(a,b) result(x)
implicit none
real*8 :: b(:), a(size(b), size(b))
real*8 :: x(size(b))
real*8 :: r(size(b))
integer i,j, neq
neq = size(b)
do i =1, neq
r = a(:,i)/a(i,i)
do j = i+1, neq
a(j,:) = a(j,:) - r(j)*a(i,:)
b(j) = b(j) - r(j)*b(j)
enddo
enddo
do i= neq, 1, -1
x(i) = (b(i) - sum(a(i, i+1:) * x(i+1:))) / a(i,i)
enddo
END function
END MODULE
SUBROUTINE outFile(n,x)
IMPLICIT none
INTEGER n
INTEGER i
REAL*8, DIMENSION(n) :: x
OPEN(UNIT=20,FILE="solution.csv",action="write",status="replace")
DO i=1, n
WRITE (20,"(1(f0.30,',',:))") x(i)
END DO
CLOSE(20)
END SUBROUTINE
PROGRAM elimtest
USE gaussMod
IMPLICIT none
INTEGER, PARAMETER :: coeff_kind = selected_real_kind(p=30, r=99)
INTEGER n
REAL*8, DIMENSION(:,:), ALLOCATABLE :: a
REAL*8, DIMENSION(:), ALLOCATABLE :: b
REAL*8, DIMENSION(:), ALLOCATABLE :: x
n = 4
ALLOCATE(a(n,n))
ALLOCATE(b(n))
ALLOCATE(x(n))
a(1,1) = 18.
a(1,2) = -6.
a(1,3) = -6.
a(1,4) = 0.
a(2,1) = -6.
a(2,2) = 12.
a(2,3) = 0.
a(2,4) = -6.
a(3,1) = -6.
a(3,2) = 0.
a(3,3) = 12.
a(3,4) = -6.
a(4,1) = 0.
a(4,2) = -6.
a(4,3) = -6.
a(4,4) = 18.
b(1) = 60.
b(2) = 0.
b(3) = 20.
b(4) = 0.
x = gauss(a,b)
CALL outFile(n,x)
END PROGRAM
Any help would be greatly appreciated!
It looks like the following line
b(j) = b(j) - r(j)*b(j)
is a typo of
b(j) = b(j) - r(j)*b(i)
With this modification your code gives the correct result (probably!):
x(:) = 8.3333333333333339 6.6666666666666670 8.3333333333333339 5.0000000000000000