matmul with non-conforming matrices - fortran

I expected the intrinsic matmul to fail when multiplying non-conforming matrices. In this simple example (see a simple code below), I am multiplying a 4x3 matrix by 4x4 matrix using matmul. Interestingly the intel compiler does not issue any warning or fatal error message at either the run-time or compile time. I tried '-check all' flag and it did not catch this error, either. Does anyone have any thoughts on this?
P.S. gfortran does complain about this operation
program main
implicit none
interface
subroutine shouldFail(arrayInput, scalarOutput)
implicit none
real (8), intent(in) :: arrayInput(:,:)
real (8), intent(out) :: scalarOutput
end
end interface
real (8) :: scalarOutput, arrayInput(4, 3)
arrayInput(:,:) = 1.0
call shouldFail(arrayInput, scalarOutput)
write(*,*) scalarOutput
end program main
!#############################################
subroutine shouldFail(arrayInput, scalarOutput)
implicit none
real (8), intent(in) :: arrayInput(:,:)
real (8), intent(out) :: scalarOutput
real (8) :: jacobian(3, 4), derivative(4, 4)
derivative(:,:) = 1.0
jacobian = matmul(arrayInput, derivative)
scalarOutput = jacobian(1, 2)
end subroutine shouldFail

With the Intel compiler REAL(8) is consistently an 8-byte real, so moving on to the question...
The answer is here: software.intel.com/en-us/node/693211
If arrayInput has shape (n, m) and derivative has shape (m, k), the result is a rank-two array (jacobian) with shape (n, k)... Here arrayInput has shape (4, 3) and derivative has shape (4, 4), so the inner extents (3 and 4) do not match.
Whether you need or want to add USE ISO_C_BINDING and then use REAL(KIND=C_DOUBLE) is a separate matter; it can be worthwhile if the code may need to be portable at some point in the future. (Usually that comes after you have convinced yourself that the MATMUL works.)
It could look like this:
subroutine shouldFail(arrayInput, scalarOutput)
USE ISO_C_BINDING
implicit none
real(KIND=C_DOUBLE), DIMENSION(:,:), intent(in ) :: arrayInput
real(KIND=C_DOUBLE) , intent( out) :: scalarOutput
real(KIND=C_DOUBLE), DIMENSION(3,4) :: Jacobian
real(KIND=C_DOUBLE), DIMENSION(4,4) :: derivative
(Here: You may want to consider checking the rank/shape before the MATMUL call if you are concerned.)
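The shape check suggested above could be sketched as follows. This is only a minimal illustration, assuming the same shapes as in the question; the program name and the variables a, b and c are made up for the example, and the only intrinsics used are size, shape and matmul.

```fortran
program check_matmul_shapes
  implicit none
  integer, parameter :: dp = kind(0.d0)
  real(dp) :: a(4, 3), b(4, 4)        ! same shapes as arrayInput and derivative
  real(dp), allocatable :: c(:, :)

  a = 1.0_dp
  b = 1.0_dp

  ! matmul(a, b) needs size(a, 2) == size(b, 1); test that explicitly
  if (size(a, 2) /= size(b, 1)) then
    print *, 'non-conforming shapes:', shape(a), 'and', shape(b)
  else
    allocate(c(size(a, 1), size(b, 2)))
    c = matmul(a, b)
    print *, c(1, 2)
  end if
end program check_matmul_shapes
```

With the 4x3 and 4x4 arrays from the question the first branch is taken, so the product is never attempted.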

Related

array operation in fortran

I am writing a code with a lot of 2D arrays and manipulations of them. I would like the code to be as concise as possible, so I would like to use as many 'implicit' array operations as possible, but I don't really know how to write them for 2D arrays.
For example:
DO J=1,N
DO I=1,M
A(I,J)=B(J)*A(I,J)
ENDDO
ENDDO
easily becomes:
DO J=1,N
A(:,J)=B(J)*A(:,J)
ENDDO
Is there a way to also eliminate the loop over J?
Thanks
For brevity and clarity, you could wrap these operations in a derived type. I wrote a minimal example which is not so concise because I need to initialise the objects, but once this initialisation is done, manipulating your arrays becomes very concise and elegant.
I stored in arrays_module.f90 a derived type arrays2d_T which can hold the array coefficients, plus useful information (number of rows and columns). This type contains procedures for initialisation, and the operation you are trying to perform.
module arrays_module
implicit none
integer, parameter :: dp = kind(0.d0) !double precision definition
type :: arrays2d_T
real(kind=dp), allocatable :: dat(:,:)
integer :: nRow, nCol
contains
procedure :: kindOfMultiply => array_kindOfMuliply_vec
procedure :: init => initialize_with_an_allocatable
end type
contains
subroutine initialize_with_an_allocatable(self, source_dat, nRow, nCol)
class(arrays2d_t), intent(inOut) :: self
real(kind=dp), allocatable, intent(in) :: source_dat(:,:)
integer, intent(in) :: nRow, nCol
allocate (self%dat(nRow, nCol), source=source_dat)
self%nRow = nRow
self%nCol = nCol
end subroutine
subroutine array_kindOfMuliply_vec(self, vec)
class(arrays2d_t), intent(inOut) :: self
real(kind=dp), allocatable, intent(in) :: vec(:)
integer :: iRow, jCol
do jCol = 1, self%nCol
do iRow = 1, self%nRow
self%dat(iRow, jCol) = vec(jCol)*self%dat(iRow, jCol)
end do
end do
end subroutine
end module arrays_module
Then, in main.f90, I check the behaviour of this multiplication on a simple example:
program main
use arrays_module
implicit none
type(arrays2d_T) :: A
real(kind=dp), allocatable :: B(:)
! auxiliary variables that are only useful for initialization
real(kind=dp), allocatable :: Aux_array(:,:)
integer :: M = 3
integer :: N = 2
! initialise the 2d array
allocate(Aux_array(M,N))
Aux_array(:,1) = [2._dp, -1.4_dp, 0.3_dp]
Aux_array(:,2) = [4._dp, -3.4_dp, 2.3_dp]
call A%init(aux_array, M, N)
! initialise vector
allocate (B(N))
B = [0.3_dp, -2._dp]
! compute the product
call A%kindOfMultiply(B)
print *, A%dat(:,1)
print *, A%dat(:,2)
end program main
Compilation can be as simple as gfortran -c arrays_module.f90 && gfortran -c main.f90 && gfortran -o main.out main.o arrays_module.o
Once this object-oriented machinery exists, call A%kindOfMultiply(B) is much clearer than a FORALL approach (and much less error prone).
No one has mentioned the do concurrent construct here, which gives the compiler the opportunity to parallelize the loop automatically and speed up your code:
do concurrent(j=1:n); A(:,j)=B(j)*A(:,j); end do
A one-line solution can be achieved by using FORALL:
FORALL(J=1:N) A(:,J) = B(J)*A(:,J)
Note that FORALL has been declared obsolescent in recent versions of the standard (Fortran 2018), but it does let you perform that operation in a single line of code.
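Another loop-free possibility, not mentioned in the answers above, is the SPREAD intrinsic, which replicates B along a new first dimension so that it conforms with A. The sketch below uses made-up sizes and values and compares the result against the explicit column loop from the question; the program and variable names are only for illustration.

```fortran
program scale_columns
  implicit none
  integer, parameter :: m = 3, n = 2
  real :: a(m, n), b(n), expected(m, n)
  integer :: j

  a = reshape([1., 2., 3., 4., 5., 6.], [m, n])
  b = [10., 100.]

  ! reference result: the column loop from the question
  do j = 1, n
    expected(:, j) = b(j) * a(:, j)
  end do

  ! spread(b, dim=1, ncopies=m) has shape (m, n) with element (i, j) = b(j)
  a = a * spread(b, dim=1, ncopies=m)

  if (any(a /= expected)) error stop 'spread version differs from loop version'
  print *, a
end program scale_columns
```

Whether this is faster than the loop depends on the compiler, since SPREAD may create a temporary of the same size as A.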

Why does a subroutine with an array from a "use module" statement give faster performance than the same subroutine with a locally sized array?

Related to this question, but I believe the issue is more clearly identified with this example.
I have some legacy code that looks like this:
subroutine ID_OG(N, DETERM)
use variables, only: ID
implicit real (A-H,O-Z)
implicit integer(I-N)
DETERM = 1.0
DO 1 I=1,N
1 ID(I)=0
DETERM = sum(ID)
end subroutine ID_OG
Replacing use variables, only: ID with real, dimension(N) :: ID or real, dimension(:), allocatable :: ID causes a noticeable performance loss. Why is this? Is this expected behavior? I am wondering if it has something to do with the program needing to repeatedly allocate memory for the local array ID, while the use statement allows the program to skip the memory allocation step.
In the legacy code ID is in module variables but it is only used within the subroutine ID_OG. It is not used anywhere else in the code - it is not an input or an output. To me, it seems like good programming practice for ID to be removed from module variables and defined locally in the subroutine. But perhaps that isn't the case.
Minimum working example (MWE):
compiling as gfortran -O3 test.f95 using gfortran 8.2.0
MODULE variables
implicit none
real, dimension(:), allocatable :: ID
END MODULE variables
program test
use variables
implicit none
integer :: N
integer :: loop_max = 1e6
integer :: ii ! loop index
real :: DETERM
real :: t1, t2
real :: t_ID_OG, t_ID_header, t_ID_no_ID, t_OG_no_ID, t_allocate
character(*), parameter :: format_header = '((A5, 1X), 20(A12,1X))'
character(*), parameter :: format_data = '((I5, 1X), 20(ES12.5, 1X))'
open(1, file = 'TimingSubroutines_ID.txt', status = 'unknown')
write(1,format_header) 'N', 't_Legacy', 't_header', 't_head_No_ID', 't_Leg_no_ID', &
& 't_allocate'
do N = 1, 100
allocate(ID(N))
call CPU_time(t1)
do ii = 1, loop_max
CALL ID_OG(N, DETERM)
end do
call CPU_time(t2)
t_ID_OG = t2 - t1
print*, N, DETERM
call CPU_time(t1)
do ii = 1, loop_max
CALL ID_header(N, DETERM)
end do
call CPU_time(t2)
t_ID_header = t2 - t1
print*, N, DETERM
call CPU_time(t1)
do ii = 1, loop_max
CALL ID_header_no_ID(N, DETERM)
end do
call CPU_time(t2)
t_ID_no_ID = t2 - t1
print*, N, DETERM
call CPU_time(t1)
do ii = 1, loop_max
CALL ID_OG_no_ID(N, DETERM)
end do
call CPU_time(t2)
t_OG_no_ID = t2 - t1
print*, N, DETERM
call CPU_time(t1)
do ii = 1, loop_max
CALL ID_OG_allocate(N, DETERM)
end do
call CPU_time(t2)
t_allocate = t2 - t1
print*, N, DETERM
deallocate(ID)
write(1,format_data) N, t_ID_OG, t_ID_header, t_ID_no_ID, t_OG_no_ID, t_allocate
end do
end program test
subroutine ID_OG(N, DETERM)
use variables, only: ID
implicit real (A-H,O-Z)
implicit integer(I-N)
DETERM = 1.0
DO 1 I=1,N
1 ID(I)=0
DETERM = sum(ID)
end subroutine ID_OG
subroutine ID_header(N, DETERM)
use variables, only: ID
implicit none
integer, intent(in) :: N
real, intent(out) :: DETERM
integer :: I
DETERM = 1.0
DO 1 I=1,N
1 ID(I)=0
DETERM = sum(ID)
end subroutine ID_header
subroutine ID_header_no_ID(N, DETERM)
implicit none
integer, intent(in) :: N
real, intent(out) :: DETERM
integer :: I
real, dimension(N) :: ID
DETERM = 1.0
DO 1 I=1,N
1 ID(I)=0
DETERM = sum(ID)
end subroutine ID_header_no_ID
subroutine ID_OG_no_ID(N, DETERM)
implicit real (A-H,O-Z)
implicit integer(I-N)
real, dimension(N) :: ID
DETERM = 1.0
DO 1 I=1,N
1 ID(I)=0
DETERM = sum(ID)
end subroutine ID_OG_no_ID
subroutine ID_OG_allocate(N, DETERM)
implicit real (A-H,O-Z)
implicit integer(I-N)
real, dimension(:), allocatable :: ID
allocate(ID(N))
DETERM = 1.0
DO 1 I=1,N
1 ID(I)=0
DETERM = sum(ID)
end subroutine ID_OG_allocate
Allocating the arrays takes time. The compiler is free to allocate local arrays wherever it wants, but this can typically be adjusted with compiler-specific flags. Use -fstack-arrays with gfortran to force local arrays onto the stack.
Allocating on the stack just changes the stack pointer, so it is virtually free. Allocating on the heap, however, is more involved and requires some bookkeeping.
There are situations where local variables are in order and there are situations where global (module) variables are in order. One can also use local saved variables or variables that are components of some objects. One cannot say which one is better without seeing the complete design of the code in question.
FWIW, with -fstack-arrays I do not see much difference except when allocating explicitly using allocate():
Explicit allocate will always use the heap.
Without -fstack-arrays I do see some:
The graphs are quite noisy because my notebook is running many processes at the same time.
This is not to say that one should always use -fstack-arrays; I used it only to demonstrate the difference. The option is useful, but care must be taken to avoid a stack overflow. -fmax-stack-var-size may help with that.
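One of the alternatives mentioned above, a local saved variable, might look like the sketch below. It keeps ID private to the routine while avoiding an allocation on every call. The module and routine names are hypothetical, and the routine is not thread-safe because the saved array is shared between calls.

```fortran
module id_saved_mod
  implicit none
contains
  subroutine id_saved(n, determ)
    integer, intent(in)  :: n
    real,    intent(out) :: determ
    real, allocatable, save :: id(:)   ! keeps its storage between calls

    ! Release the old storage only when the requested size changes
    if (allocated(id)) then
      if (size(id) /= n) deallocate(id)
    end if
    if (.not. allocated(id)) allocate(id(n))

    id = 0.0
    determ = sum(id)
  end subroutine id_saved
end module id_saved_mod

program demo_saved
  use id_saved_mod
  implicit none
  real :: determ

  call id_saved(10, determ)   ! first call allocates
  call id_saved(10, determ)   ! same size: storage is reused
  call id_saved(20, determ)   ! size changed: reallocated once
  if (determ /= 0.0) error stop 'unexpected sum'
  print *, determ
end program demo_saved
```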
As your tests show, the additional overhead of all methods that do not use the module variable comes from the language's philosophy of not bothering the user too much with memory handling.
The compiler will decide where memory should be allocated, unless you start tinkering with compiler flags. You see allocation/freeing time as a drawback, but your analysis also shows:
The stack vs. heap memory handling overhead quickly gets smaller and smaller: for N>=100, it is already <50%.
A dimension(100) array is a ridiculously small memory chunk on a modern computer.
Declaring a variable in a module just to speed up storage is a Fortran 90 way of making it a global, and as such, it is a deprecated coding style.
I think the best strategy to make the code well-coded and fast is:
Is N going to be constant through the whole runtime? Then, it would be a good idea to encapsulate it into a class:
module myCalculation
implicit none
type, public :: fancyMethod
integer :: N = 0
real, allocatable :: ID(:)
contains
procedure :: init
procedure :: compute
procedure :: is_init
end type fancyMethod
contains
elemental subroutine init(self,n)
class(fancyMethod), intent(inout) :: self
integer, intent(in) :: n
real, allocatable :: tmp(:)
self%N = n
allocate(tmp(N)); tmp(:) = 0
call move_alloc(from=tmp,to=self%ID)
end subroutine init
elemental logical function is_init(self)
class(fancyMethod), intent(in) :: self
is_init = allocated(self%ID) .and. size(self%ID)>0
end function is_init
real function compute(self,n,...) result(DETERM)
class(fancyMethod), intent(inout) :: self
integer, intent(in) :: n
....
if (.not.is_init(self)) call init(self,N)
DETERM = sum(self%ID(1:N))
end function compute
end module myCalculation
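Is N going to be constant and small? Why not just use a PARAMETER to define its maximum size? With a constant size, ID is an ordinary fixed-size local array rather than an automatic array, so no run-time allocation is needed and the compiler will typically place it on the stack or in static storage: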
real function computeWithMaxSize(N) result(DETERM)
integer, intent(in) :: N
integer, parameter :: MAX_SIZE = 1024
real :: ID(MAX_SIZE)
[...]
if (N>MAX_SIZE) stop ' N is too large! '
DETERM = sum(ID(1:N))
end function computeWithMaxSize
Is N going to be variable and large? Then the in-routine memory handling is fine, and its overhead is likely negligible, because the CPU time will be dominated by the calculation. Use an allocatable version if you are not sure whether the size could become large enough to cause stack issues:
real function computeWithAllocatable(N) result(DETERM)
integer, intent(in) :: N
real, allocatable :: ID(:)
allocate(ID(N))
[...]
DETERM = sum(ID(1:N))
end function computeWithAllocatable

ERROR: Parameter 8 was incorrect on entry to ZHEEV

module RandMat
implicit none
double complex, dimension(:,:), allocatable :: A, L, U
integer :: n
contains
function Diag() result(Eigenvalues)
implicit none
double complex, dimension(n) :: Eigenvalues
double complex, dimension(2*n-1) :: work
double precision, dimension(n,n) :: eigenmatrix
integer :: Lwork
integer :: info, ii
double precision :: rwork(3*n-2)
double complex :: check(n,n), alpha, beta
check=A
info=0
lwork =2*n-1
!call routine to diagonalize A
call zheev('V','U',n, check, n, Eigenvalues, work, lwork, rwork, info )
end function
end module
This function is declared within a module. A is an n x n Hermitian matrix that I defined within the module, so I can use it in here. n is defined in the module as well and is the dimension of A.
The problem is that I always get a run time error:
Intel MKL ERROR: Parameter 8 was incorrect on entry to ZHEEV
when calling zheev.
EDIT: I added the declaration of the variables in the module. The Matrix A is allocated and initialized with the following Routine:
subroutine Initialize
implicit none
integer, dimension(2) :: clock
integer , dimension(:), allocatable :: seed
integer :: ii,jj
double precision :: c,d
!create random complex matrix of dimension nxn
!-----------------------------------------------------------------------
!allocate A
if (allocated(A)) then
if (size(A,1) /= N) deallocate(A) !size changed: release the old storage first
end if
if (.not.allocated(A)) allocate(A(N,N))
!get execution time--> will be used as seed
call System_clock(count=clock(1))
call random_seed(put=clock)
!initialize matrix with random values
do ii=1,n
!since we are only interested in hermitian matrices, it is enough to
!initialize only the upper triangular part
do jj=ii,n
call random_number(c)
Call random_number(d)
A(ii,jj)=cmplx(c, d)
end do
end do
!make matrix hermitian
!---------------------------------------------------------------------------
do ii=2, n
do jj=1, ii-1
A(ii,jj)=conjg(A(jj,ii))
end do
end do
end subroutine
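For comparison, here is a minimal sketch of a ZHEEV call that follows the LAPACK interface: the eigenvalue array W is a real (double precision) array, and the workspace size is obtained with a workspace query (LWORK = -1). It is not claimed to identify the cause of the error in the question. The 2x2 test matrix and the names are made up, and the program assumes a LAPACK library (for example MKL or the reference LAPACK) is linked.

```fortran
program zheev_sketch
  implicit none
  integer, parameter :: dp = kind(0.d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n, n)
  complex(dp) :: wquery(1)
  complex(dp), allocatable :: work(:)
  real(dp) :: w(n)                      ! eigenvalues of a Hermitian matrix are real
  real(dp) :: rwork(max(1, 3*n - 2))
  integer :: lwork, info
  external :: zheev

  ! Hermitian test matrix [[2, i], [-i, 2]]; its eigenvalues are 1 and 3
  a(1, 1) = (2.0_dp, 0.0_dp);  a(1, 2) = (0.0_dp, 1.0_dp)
  a(2, 1) = (0.0_dp, -1.0_dp); a(2, 2) = (2.0_dp, 0.0_dp)

  ! Workspace query: with lwork = -1 the optimal size is returned in wquery(1)
  call zheev('V', 'U', n, a, n, w, wquery, -1, rwork, info)
  lwork = max(1, 2*n - 1, int(real(wquery(1))))
  allocate(work(lwork))

  ! Actual diagonalization; on exit a holds the eigenvectors
  call zheev('V', 'U', n, a, n, w, work, lwork, rwork, info)
  if (info /= 0) error stop 'zheev reported an error'
  print *, w
end program zheev_sketch
```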

What does "array cannot have a deferred shape" mean in fortran?

I have a simple fortran function that computes the Kronecker product:
function kron(A, B)
implicit none
real, intent(in) :: A(:, :), B(:, :)
integer :: i, j, ma, na, mb, nb
real, dimension(:, :) :: kron
ma = ubound(A, 1)
na = ubound(A, 2)
mb = ubound(b, 1)
nb = ubound(b, 2)
forall(i=1:ma, j=1:na)
kron(mb*(i-1)+1:mb*i, nb*(j-1)+1:nb*j) = A(i,j)*B
end forall
end function kron
It's inside a module, but when I compile it with gfortran -static -ffree-form -std=f2003 -Wall, I get these errors:
function kron(A, B)
1
Error: Array 'kron' at (1) cannot have a deferred shape
Is this error occurring because you're supposed to know the size of the array to be returned beforehand?
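That is exactly what the error is telling you: kron must have an explicit shape. If you do not want to pass the array sizes beforehand, you can declare kron with bounds computed from the dummy arguments. For a Kronecker product the result has size(A,1)*size(B,1) rows and size(A,2)*size(B,2) columns, so you'd have to define kron as
real, dimension(size(A,dim=1)*size(B,dim=1),&
size(A,dim=2)*size(B,dim=2)) :: kron
An explicit declaration of this kind, with bounds taken from the arguments, does compile for me on gfortran 4.6.3.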
A deferred-shape array that has the ALLOCATABLE attribute is referred to as an allocatable array. Its bounds and shape are determined when storage is allocated for it by an ALLOCATE statement.
try this
real, intent(in), allocatable, dimension(:, :) :: A, B
You just need to declare the function result as allocatable, i.e. replace the kron declaration with:
real, allocatable, dimension(:,:) :: kron
This also compiles fine in 4.6.3 and is defined at:
https://docs.roguewave.com/codedynamics/2017.0/html/index.html#page/TotalViewLH/totalviewlhug-examining-data.09.10.html
Hopefully this should save you some effort, especially considering there is no need to define a lower bound here!
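Note that an allocatable result has to be allocated inside the function before it is assigned section by section. A minimal sketch of the whole function with an allocatable result could look like this; the module name and the small test in the main program are made up for illustration, and plain DO loops are used in place of FORALL.

```fortran
module kron_mod
  implicit none
contains
  function kron(A, B)
    real, intent(in) :: A(:, :), B(:, :)
    real, allocatable :: kron(:, :)
    integer :: i, j, ma, na, mb, nb

    ma = size(A, 1); na = size(A, 2)
    mb = size(B, 1); nb = size(B, 2)

    ! The result holds one mb x nb block per element of A
    allocate(kron(ma*mb, na*nb))

    do j = 1, na
      do i = 1, ma
        kron(mb*(i-1)+1:mb*i, nb*(j-1)+1:nb*j) = A(i, j) * B
      end do
    end do
  end function kron
end module kron_mod

program demo_kron
  use kron_mod
  implicit none
  real :: A(2, 2), B(1, 2), K(2, 4)

  A = reshape([1., 2., 3., 4.], [2, 2])   ! rows are (1, 3) and (2, 4)
  B = reshape([1., 10.], [1, 2])          ! single row (1, 10)
  K = kron(A, B)

  if (any(K(1, :) /= [1., 10., 3., 30.])) error stop 'row 1 is wrong'
  if (any(K(2, :) /= [2., 20., 4., 40.])) error stop 'row 2 is wrong'
  print *, K(1, :)
  print *, K(2, :)
end program demo_kron
```

Because the function lives in a module, callers get its explicit interface automatically, which is required for a function with an allocatable or assumed-shape-dependent result.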

Reading function from a file in Fortran 90

I have an optimization solver in Fortran 90. If I want to change the objective function,
I have to modify the main file and write the objective function in this way:
subroutine fobj(n,x,f)
implicit none
integer :: n
real(8) :: f
real(8) :: x(n)
intent(in ) :: n,x
intent(out) :: f
!OBJECTIVE FUNCTION
f = x(1)**2-x(2)+2*x(3)
end subroutine fobj
I have a big objective function, so I want to take this line "f = x(1)**2-x(2)+2*x(3)", or at least the subroutine, from an external file.
Is that possible? (I'm new to Fortran.)
I know that I can modify the file with Python, but I want to keep it in a separate file.
Thanks a lot!
Sure. Use:
include 'file.inc'
to include source code from an external file.
I'm not sure if this is what you're looking for, but:
Fortran also allows you to pass subroutine/function names around as actual arguments to subroutine/function calls. The corresponding dummy arguments must have the "external" attribute.
subroutine fobj(n,x,f,func)
implicit none
integer :: n
real(8),external :: func
real(8) :: f
real(8) :: x(n)
intent(in ) :: n,x
intent(out) :: f
!OBJECTIVE FUNCTION
f=func(x,n)
end subroutine fobj
function func1(x,n)
implicit none
real(8) func1
integer n
real(8) :: x(n)
func1 = x(1)**2-x(2)+2*x(3) !assign to the function name to set the result
end function func1
function func2(x,n)
implicit none
real(8) func2
integer n
real(8) :: x(n)
func2 = x(1)**2+x(2)+2*x(3) !assign to the function name to set the result
end function func2
program main
real(8),external :: func1,func2
real(8),allocatable :: x(:)
real(8) :: f
integer n
n=50
allocate(x(n))
x=10. !Set X to a known value
call fobj(n,x,f,func1) !Call func1
print*,f !10**2-10+2*10 = 110
x=10. !Reset X ... just to make sure there is no funny business in func1,func2
call fobj(n,x,f,func2) !Call func2
print*,f !10**2+10+2*10 = 130
deallocate(x)
end program main
Of course, this program does nothing useful other than call func1 and func2 in obscure ways, but hopefully it illustrates the point. If you're looking to switch out the function at compile time, then I think an include "myfile" is probably cleaner (just switching which file you're including at the time, as suggested by @AlejandroLL).
You might also try using modules in your program. Sometimes, when you pass special arguments to your subroutines/functions, you need to write interfaces for them. Using modules will improve your program structure, and all interfaces will be generated automatically.
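A module-based variant of the example above might look like the following sketch. The module name, the abstract interface name objective and the kind parameter dp are made up for illustration. Because everything lives in a module, the interfaces are explicit and the compiler can check that the function passed in matches what fobj expects.

```fortran
module objective_mod
  implicit none
  integer, parameter :: dp = kind(0.d0)

  ! Shape that every objective function must have
  abstract interface
    function objective(x) result(f)
      import :: dp
      real(dp), intent(in) :: x(:)
      real(dp) :: f
    end function objective
  end interface

contains

  function func1(x) result(f)
    real(dp), intent(in) :: x(:)
    real(dp) :: f
    f = x(1)**2 - x(2) + 2*x(3)
  end function func1

  function func2(x) result(f)
    real(dp), intent(in) :: x(:)
    real(dp) :: f
    f = x(1)**2 + x(2) + 2*x(3)
  end function func2

  subroutine fobj(x, f, func)
    real(dp), intent(in)  :: x(:)
    real(dp), intent(out) :: f
    procedure(objective)  :: func   ! checked against the abstract interface
    f = func(x)
  end subroutine fobj

end module objective_mod

program main
  use objective_mod
  implicit none
  real(dp) :: x(3), f

  x = 10.0_dp
  call fobj(x, f, func1)
  if (abs(f - 110.0_dp) > 1.0e-12_dp) error stop 'func1 gave an unexpected value'
  print *, f
  call fobj(x, f, func2)
  if (abs(f - 130.0_dp) > 1.0e-12_dp) error stop 'func2 gave an unexpected value'
  print *, f
end program main
```

To change the objective, only the module file needs editing (or a different function name passed to fobj); the solver itself stays untouched.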