Fortran array rank for matmul intrinsic - fortran

The following link https://gcc.gnu.org/onlinedocs/gcc-5.4.0/gfortran/MATMUL.html clearly states that gfortran expects matrices input to matmul to be of rank 1 OR 2. However the following snippet wont compile:
Program scratch
real(kind=8) :: A(10)=(/0,1,2,3,4,5,6,7,8,9/)
real(kind=8) :: B(10)=(/0,1,2,3,4,5,6,7,8,9/)
real(kind=8) :: C(10,10)
print *,rank(A),rank(B)
C=matmul(A,B)
End Program scratch
gfortran gives the error:
$gfortran scratch.f90
scratch.f90:6:13:
C=matmul(A,B)
1
Error: ‘matrix_b’ argument of ‘matmul’ intrinsic at (1) must be of rank 2
My gfortran is 5.4.0 (compatible with the link above). Am I doing something really stupid?

You can use RESHAPE to get them into a form MATMUL will like:
Program scratch
real(kind=8) :: A(10)=(/0,1,2,3,4,5,6,7,8,9/)
real(kind=8) :: B(10)=(/0,1,2,3,4,5,6,7,8,9/)
real(kind=8) :: C(10,10)
print *,rank(A),rank(B)
C = matmul( RESHAPE(A,(/10,1/)), RESHAPE(B,(/1,10/)) )
WRITE(*,"(10F7.2)") C
End Program scratch

You must do this for a tensor product of two vectors
Program scratch
integer, parameter :: dp = kind(1.d0)
real(dp) :: A(10,1)=reshape((/0,1,2,3,4,5,6,7,8,9/), (/ 10, 1 /))
real(dp) :: B(1,10)=reshape((/0,1,2,3,4,5,6,7,8,9/), (/ 1, 10 /))
real(dp) :: C(10,10)
print *,rank(A),rank(B)
C=matmul(A,B)
print *, C
End Program scratch
If you do
A(1,10)
B(10,1)
you will get a scalar product. With just two 1D arrays it is not clear which of the two products you want (although for a dot product there is a special function available).
A or B can be a 1D array when you are multiplying a matrix by a vector.

Related

How do I define integer array for given variable nBoxes [duplicate]

I'm writing a code for LU decomposition and I don't know how to fix the "unexpected data declaration statement" pointed at line 8 (where I'm declaring an array. See the code fragment). Why is it unexpected?
!Decomposição LU
!-----------------------------------------------------------
PROGRAM LUdecomp
IMPLICIT INTEGER (I-K,N), REAL (A-H, L-M,O-Z)
INTEGER, PARAMETER :: N=3
REAL, DIMENSION (N,N) :: A,L,U
A = reshape((/3.,1.,4.,4.,2.,0.,3.,2.,3./),(/3,3/)) !exemplo do Bortoli*******
REAL, DIMENSION(3) :: B=(/9.,3.,-2./),Z,X
OPEN(1,file = 'LUFACTOR.out')
!
! FORALL (I = 1:N, J = 1:N) A(I,J) = 1.0/REAL(I+J-1)
!-------Fazendo a fatoração A = LU-----------------------------
CALL LU(N, A, L, U)
DO I=1,N
WRITE(*,10)(L(I,J), J=1,N), (U(I,J), J=1,N)
END DO
10 FORMAT(3(F8.4), 7x, 3(F8.4))
!
This statement
REAL, DIMENSION(3) :: B=(/9.,3.,-2./),Z,X
is in the wrong place. In a Fortran program-unit (program, subroutine, function) -- certainly one without the new ASSOCIATE and BLOCK constructs -- all declarations have to precede all executable statements.
Move the misplaced statement ahead of the first executable statement.

Fortran character format string as subroutine argument

I am struggling with reading a text string in. Am using gfortran 4.9.2.
Below I have written a little subroutine in which I would like to submit the write format as argument.
Ideally I'd like to be able to call it with
call printarray(mat1, "F8.3")
to print out a matrix mat1 in that format for example. The numbers of columns should be determined automatically inside the subroutine.
subroutine printarray(x, udf_temp)
implicit none
real, dimension(:,:), intent(in) :: x ! array to be printed
integer, dimension(2) :: dims ! array for shape of x
integer :: i, j
character(len=10) :: udf_temp ! user defined format, eg "F8.3, ...
character(len = :), allocatable :: udf ! trimmed udf_temp
character(len = 10) :: udf2
character(len = 10) :: txt1, txt2
integer :: ncols ! no. of columns of array
integer :: udf_temp_length
udf_temp_length = len_trim(udf_temp)
allocate(character(len=udf_temp_length) :: udf)
dims = shape(x)
ncols = dims(2)
write (txt1, '(I5)') ncols
udf2 = trim(txt1)//adjustl(udf)
txt2 = "("//trim(udf2)//")"
do i = 1, dims(1)
write (*, txt2) (x(i, j), j = 1, dims(2)) ! this is line 38
end do
end suroutine printarray
when I set len = 10:
character(len=10) :: udf_temp
I get compile error:
call printarray(mat1, "F8.3")
1
Warning: Character length of actual argument shorter than of dummy argument 'udf_temp' (4/10) at (1)
When I set len = *
character(len=*) :: udf_temp
it compiles but at runtime:
At line 38 of file where2.f95 (unit = 6, file = 'stdout')
Fortran runtime error: Unexpected element '( 8
What am I doing wrong?
Is there a neater way to do this?
Here's a summary of your question that I will try to address: You want to have a subroutine that will print a specified two-dimensional array with a specified format, such that each row is printed on a single line. For example, assume we have the real array:
real, dimension(2,8) :: x
x = reshape([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16], shape=[2,8], order=[2,1])
! Then the array is:
! 1.000 2.000 3.000 4.000 5.000 6.000 7.000 8.000
! 9.000 10.000 11.000 12.000 13.000 14.000 15.000 16.000
We want to use the format "F8.3", which prints floating point values (reals) with a field width of 8 and 3 decimal places.
Now, you are making a couple of mistakes when creating the format within your subroutine. First, you try to use udf to create the udf2 string. This is a problem because although you have allocated the size of udf, nothing has been assigned to it (pointed out in a comment by #francescalus). Thus, you see the error message you reported: Fortran runtime error: Unexpected element '( 8.
In the following, I make a couple of simplifying changes and demonstrate a few (slightly) different techniques. As shown, I suggest the use of * to indicate that the format can be applied an unlimited number of times, until all elements of the output list have been visited. Of course, explicitly stating the number of times to apply the format (ie, "(8F8.3)" instead of "(*(F8.3))") is fine, but the latter is slightly less work.
program main
implicit none
real, dimension(2,8) :: x
character(len=:), allocatable :: udf_in
x = reshape([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16], shape=[2,8], order=[2,1])
udf_in = "F8.3"
call printarray(x, udf_in)
contains
subroutine printarray(x, udf_in)
implicit none
real, dimension(:,:), intent(in) :: x
character(len=*), intent(in) :: udf_in
integer :: ncols ! size(x,dim=2)
character(len=10) :: ncols_str ! ncols, stringified
integer, dimension(2) :: dims ! shape of x
character(len=:), allocatable :: udf0, udf1 ! format codes
integer :: i, j ! index counters
dims = shape(x) ! or just use: ncols = size(x, dim=2)
ncols = dims(2)
write (ncols_str, '(i0)') ncols ! use 'i0' for min. size
udf0 = "(" // ncols_str // udf_in // ")" ! create string: "(8F8.3)"
udf1 = "(*(" // udf_in // "))" ! create string: "(*(F8.3))"
print *, "Version 1:"
do i = 1, dims(1)
write (*, udf0) (x(i, j), j = 1,ncols) ! implied do-loop over j.
end do
print *, "Version 2:"
do i = 1, dims(1)
! udf1: "(*(F8.3))"
write (*, udf1) (x(i, j), j = 1,ncols) ! implied do-loop over j
end do
print *, "Version 3:"
do i = 1, size(x,dim=1) ! no need to create nrows/ncols vars.
write(*, udf1) x(i,:) ! let the compiler handle the extents.
enddo
end subroutine printarray
end program main
Observe: the final do-loop ("Version 3") is very simple. It does not need an explicit count of ncols because the * takes care of it automatically. Due to its simplicity, there is really no need for a subroutine at all.
besides the actual error (not using the input argument), this whole thing can be done much more simply:
subroutine printarray(m,f)
implicit none
character(len=*)f
real m(:,:)
character*10 n
write(n,'(i0)')size(m(1,:))
write(*,'('//n//f//')')transpose(m)
end subroutine
end
note no need for the loop constructs as fortran will automatically write the whole array , line wrapping as you reach the length of data specified by your format.
alternately you can use a loop construct, then you can use a '*' repeat count in the format and obviate the need for the internal write to construct the format string.
subroutine printarray(m,f)
implicit none
character(len=*)f
real m(:,:)
integer :: i
do i=1,size(m(:,1))
write(*,'(*('//f//'))')m(i,:)
enddo
end subroutine
end

Fortran strange segmentation fault

I have some a problem with my main code, so I tried to isolate the problem.
Therefore, I have this small code :
MODULE Param
IMPLICIT NONE
integer, parameter :: dr = SELECTED_REAL_KIND(15, 307)
integer :: D =3
integer :: Q=10
integer :: mmo=16
integer :: n=2
integer :: x=80
integer :: y=70
integer :: z=20
integer :: tMax=8
END MODULE Param
module m
contains
subroutine compute(f, r)
USE Param, ONLY: dr, mmo, x, y, z, n
IMPLICIT NONE
real (kind=dr), intent(in) :: f(x,y,z, 0:mmo, n)
real (kind=dr), intent(out) :: r(x, y, z, n)
real (kind=dr) :: fGlob(x,y,z, 0:mmo)
!-------------------------------------------------------------------------
print*, 'We are in compute subroutine'
r= 00.0
fGlob=sum(f,dim=5)
r=sum(f, dim=4)
print*, 'fGlob=', fGlob(1,1,1, 1)
print*, 'f=', f(1,1,1, 0,1)
print*, 'r=', r(1,1,1, 1)
end subroutine compute
end module m
PROGRAM test_prog
USE Param
USE m
Implicit None
integer :: tStep
real (kind=dr), dimension(:,:,:, :,:), allocatable :: f
real (kind=dr), dimension(:,:,:,:), allocatable :: r
!----------------------------------------------------------------------------
! Initialise the parameters.
print*, 'beginning of the test'
! Allocate
allocate(f(x,y,z, 0:mmo,n))
allocate(r(x,y,z, n))
f=1.0_dr
! ---------------------------------------------------------
! Iteration over time
! ---------------------------------------------------------
do tStep = 1, tMax
print *, tStep
call compute(f,r)
f=f+1
print *, 'tStep', tStep
enddo
print*, 'f=', f(1,1,1, 0,1)
print*, 'r=', r(1,1,1, 1)
! Deallacation
deallocate(f)
deallocate(r)
print*, 'End of the test program'
END PROGRAM test_prog
For now, I am not able to understand why when I compile with ifort, I have a segmentation fault, and it works when I compile with gfortran. And worst, when I compile with both ifort and gfortran with their fast options, I get again a segmentation fault (core dumped) error. And more confusing, when I also tried with both compilers to compile with traceback options, everything works fine.
I know that segmentation fault (core dumped) error usually means that I try to read or write in a wrong location (matrix indices etc...); but here with this small code, I see no mistake like this.
Does anyone can help me to understand why theses errors occur?
The problem comes from the size of the stack used by some compilers by default (ifort) or by some others when they optimise the compilation (gfortran -Ofast). Here, our writings exceed the size of the stack.
To solve this, I use the options -heap-arrays for ifort compiler and -fno-stack-arrays for gfortran compiler.

Using MKL to solve a non-linear system of equations with an objective function stored in another module

I'm trying to use the MKL trust region algorithm to solve a nonlinear system of equations in a Fortran program. I started from the example provided online (ex_nlsqp_f90_x.f90 https://software.intel.com/en-us/node/501498) and everything works correctly. Now, because I have to use this in a much bigger program, I need the user defined objective function to be loaded from a separate module. Hence, I split the example into 2 separate files, but I'm not able to make it compile correctly.
So here is the code for module which contains user defined data structure and the objective function
module modFun
implicit none
private
public my_data, extended_powell
type :: my_data
integer a
integer sum
end type my_data
contains
subroutine extended_powell (m, n, x, f, user_data)
implicit none
integer, intent(in) :: m, n
real*8 , intent(in) :: x(n)
real*8, intent(out) :: f(m)
type(my_data) :: user_data
integer i
user_data%sum = user_data%sum + user_data%a
do i = 1, n/4
f(4*(i-1)+1) = x(4*(i-1)+1) + 10.0 * x(4*(i-1)+2)
f(4*(i-1)+2) = 2.2360679774998 * (x(4*(i-1)+3) - x(4*(i-1)+4))
f(4*(i-1)+3) = ( x(4*(i-1)+2) - 2.0 * x(4*(i-1)+3) )**2
f(4*(i-1)+4) = 3.1622776601684 * (x(4*(i-1)+1) - x(4*(i-1)+4))**2
end do
end subroutine extended_powell
end module modFun
and here the portion of the main program calling it
include 'mkl_rci.f90'
program EXAMPLE_EX_NLSQP_F90_X
use MKL_RCI
use MKL_RCI_type
use modFun
! user's objective function
! n - number of function variables
! m - dimension of function value
integer n, m
parameter (n = 4)
parameter (m = 4)
! precisions for stop-criteria (see manual for more details)
real*8 eps(6)
real*8 x(n)
real*8 fjac(m*n)
! number of iterations
integer fun
! Additional users data
type(my_data) :: m_data
m_data%a = 1
m_data%sum = 0
rs = 0.0
fun = djacobix(extended_powell,n,m,fjac,x,eps(1),%val(loc(m_data)))
end program EXAMPLE_EX_NLSQP_F90_X
Also djacobix code
INTERFACE
INTEGER FUNCTION DJACOBIX(fcn, n, m, fjac, x, eps, user_data)
USE, INTRINSIC :: ISO_C_BINDING
INTEGER, INTENT(IN) :: n
INTEGER, INTENT(IN) :: m
DOUBLE PRECISION, INTENT(IN) :: eps
DOUBLE PRECISION, INTENT(IN), DIMENSION(*) :: x
DOUBLE PRECISION, INTENT(OUT), DIMENSION(m, *) :: fjac
INTEGER(C_INTPTR_T) :: user_data
INTERFACE
SUBROUTINE fcn(m, n, x, f, user_data)
USE, INTRINSIC :: ISO_C_BINDING
INTEGER, INTENT(IN) :: n
INTEGER, INTENT(IN) :: m
DOUBLE PRECISION, INTENT(IN), DIMENSION(*) :: x
DOUBLE PRECISION, INTENT(OUT), DIMENSION(*) :: f
INTEGER(C_INTPTR_T), INTENT(IN) :: user_data
END SUBROUTINE
END INTERFACE
END FUNCTION
END INTERFACE
When i compile the following errors are generated:
mpiifort -g -t -mkl -I/apps/rhel6/intel/composer_xe_2015.3.187/mkl/include/intel64/lp64 -c modFun.f90
mpiifort -g -t -mkl -I/apps/rhel6/intel/composer_xe_2015.3.187/mkl/include/intel64/lp64 -c main.f90
main.f90(30): error #7065: The characteristics of dummy argument 5 of the associated actual procedure differ from the characteristics of dummy argument 5 of the dummy procedure. [EXTENDED_POWELL]
fun = djacobix(extended_powell,n,m,fjac,x,eps(1),%val(loc(m_data)))
-------------------^
I have the feeling I have to create an interface to override the check on the m_data, but I can't figure out where and how. Can anyone help me with this problem providing a working example?
I guess the reason is that the function djacobix passes the pointer instead of the true value of variable user_data.
You can check the manual at https://software.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/nonlinear-optimization-problem-solvers/jacobian-matrix-calculation-routines/jacobix.html where a sentence shows that "You need to declare fcn as extern in the calling program."

OpenBLAS slower than intrinsic function dot_product

I need make a dot product in Fortran. I can do with the intrinsic function dot_product from Fortran or use ddot from OpenBLAS. The problem is the ddot is slower. This is my code:
With BLAS:
program VectorBLAS
! time VectorBlas.e = 0.30s
implicit none
double precision, dimension(3) :: b
double precision :: result
double precision, external :: ddot
integer, parameter :: LargeInt_K = selected_int_kind (18)
integer (kind=LargeInt_K) :: I
DO I = 1, 10000000
b(:) = 3
result = ddot(3, b, 1, b, 1)
END DO
end program VectorBLAS
With dot_product
program VectorModule
! time VectorModule.e = 0.19s
implicit none
double precision, dimension (3) :: b
double precision :: result
integer, parameter :: LargeInt_K = selected_int_kind (18)
integer (kind=LargeInt_K) :: I
DO I = 1, 10000000
b(:) = 3
result = dot_product(b, b)
END DO
end program VectorModule
The two codes are compiled using:
gfortran file_name.f90 -lblas -o file_name.e
What am I doing wrong? BLAS not have to be faster?
While BLAS, and especially the optimized versions, are generally faster for larger arrays, the built-in functions are faster for smaller sizes.
This is especially visible from the linked source code of ddot, where additional work is spent on further functionality (e.g., different increments). For small array lengths, the work done here outweighs the performance gain of the optimizations.
If you make your vectors (much) larger, the optimized version should be faster.
Here is an example to illustrate this:
program test
use, intrinsic :: ISO_Fortran_env, only: REAL64
implicit none
integer :: t1, t2, rate, ttot1, ttot2, i
real(REAL64), allocatable :: a(:),b(:),c(:)
real(REAL64), external :: ddot
allocate( a(100000), b(100000), c(100000) )
call system_clock(count_rate=rate)
ttot1 = 0 ; ttot2 = 0
do i=1,1000
call random_number(a)
call random_number(b)
call system_clock(t1)
c = dot_product(a,b)
call system_clock(t2)
ttot1 = ttot1 + t2 - t1
call system_clock(t1)
c = ddot(100000,a,1,b,1)
call system_clock(t2)
ttot2 = ttot2 + t2 - t1
enddo
print *,'dot_product: ', real(ttot1)/real(rate)
print *,'BLAS, ddot: ', real(ttot2)/real(rate)
end program
The BLAS routines are quite a bit faster here:
OMP_NUM_THREADS=1 ./a.out
dot_product: 0.145999998
BLAS, ddot: 0.100000001