This question is based on an answer to the post "Fortran intent(inout) versus omitting intent", namely the one by user Vladimyr.
He says that "<...> Fortran copies that data into a contiguous section of memory, and passes the new address to the routine. Upon returning, the data is copied back into its original location. By specifying INTENT, the compiler can know to skip one of the copying operations."
I did not know this at all; I thought Fortran passed arguments by reference, exactly as C does.
My first question is: why would Fortran do so? What is the rationale behind this choice?
As a second point, I put this behaviour to the test. If I understood correctly, using INTENT(IN) would save the time spent copying the data back to the original location, as the compiler is sure the data has not been changed.
I tried this little piece of code:
function funco(inp) result(j)
!! integer, dimension(:), intent (in) :: inp
integer, dimension(:):: inp
integer, dimension(SIZE(inp)) :: j ! output
integer :: i, N ! loop index and array length
j = 0 !! clear whole vector
N = size(inp)
DO i = 1, N
j(i) = inp(i)
END DO
end function
program main
implicit none
interface
function funco(inp) result(j)
!! integer, dimension(:), intent (in) :: inp
integer, dimension(:) :: inp
integer, dimension(SIZE(inp)) :: j ! output
end function
end interface
integer, dimension(3000) :: inp , j
!! integer, dimension(3000) :: funco
integer :: cr, cm , c1, c2, m
real :: rate, t1, t2
! Initialize the system_clock
CALL system_clock(count_rate=cr)
CALL system_clock(count_max=cm)
CALL SYSTEM_CLOCK(c1)
CALL CPU_TIME(t1)
rate = REAL(cr)
WRITE(*,*) "system_clock rate ",rate
inp = 2
DO m = 1,1000000
j = funco(inp) + 1
END DO
CALL SYSTEM_CLOCK(c2)
CALL CPU_TIME(t2)
WRITE(*,*) "system_clock : ",(c2 - c1)/rate
WRITE(*,*) "cpu_time : ",(t2-t1)
end program
The function copies an array, and in the main body this is repeated many times.
According to the claim above, the time spent in copying back the array should somehow show up.
system_clock rate 1000.00000
system_clock : 2068.07910
cpu_time : 9.70935345
but the results are pretty much the same regardless of whether INTENT is used or not.
Could anybody shed some light on these two points: why does Fortran perform an additional copy (which seems inefficient at first sight) instead of passing by reference, and does INTENT really save the time of a copying operation?
The answer you are referring to speaks about passing a specific kind of array subsection, not the whole array. In that case a temporary copy might be necessary, depending on the function. Your function uses an assumed-shape array, and a temporary array will not be necessary even if you try quite hard.
An example of what the author (it wasn't me) might have had in mind is
module functions
implicit none
contains
function fun(a, n) result(res)
real :: res
! note the explicit shape !!!
integer, intent(in) :: n
real, intent(in) :: a(n, n)
integer :: i, j
do j = 1, n
do i = 1, n
res = res + a(i,j) *i + j
end do
end do
end function
end module
program main
use functions
implicit none
real, allocatable :: array(:,:)
real :: x, t1, t2
integer :: fulln
fulln = 400
allocate(array(1:fulln,1:fulln))
call random_number(array)
call cpu_time(t1)
x = fun(array(::2,::2),(fulln/2))
call cpu_time(t2)
print *,x
print *, t2-t1
end program
This program is somewhat faster with intent(in) than with intent(inout) in Gfortran (not so much in Intel). However, it is much faster still with an assumed-shape array a(:,:), because then no copy is performed.
I am also getting some strange uninitialized accesses in gfortran when running without runtime checks; a likely culprit is that res is accumulated in the loop without first being set to zero.
Of course this is a contrived example and there are real cases in production programs where array copies are made and then intent(in) can make a difference.
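One way to see the difference between the two kinds of dummy argument without timing anything is to ask each one whether the data it received is contiguous. The following is only a sketch: the module contig_demo and its two functions are hypothetical names, and it assumes a compiler that supports the Fortran 2008 intrinsic is_contiguous. An explicit-shape dummy always sees contiguous storage (which is why a strided actual argument typically has to be packed into a temporary), whereas an assumed-shape dummy can describe the strided section directly.

```fortran
module contig_demo
  implicit none
contains
  ! Explicit-shape dummy: the callee is entitled to contiguous storage,
  ! so a strided actual argument is typically packed into a temporary.
  logical function explicit_is_contig(a, n)
    integer, intent(in) :: n
    real, intent(in) :: a(n, n)
    explicit_is_contig = is_contiguous(a)
  end function explicit_is_contig

  ! Assumed-shape dummy: strides travel with the argument, so no copy is needed.
  logical function assumed_is_contig(a)
    real, intent(in) :: a(:, :)
    assumed_is_contig = is_contiguous(a)
  end function assumed_is_contig
end module contig_demo

program demo
  use contig_demo
  implicit none
  real :: array(4, 4)

  call random_number(array)
  ! Pass the same strided section to both kinds of dummy argument
  print *, "explicit shape sees contiguous data:", explicit_is_contig(array(::2, ::2), 2)
  print *, "assumed shape sees contiguous data: ", assumed_is_contig(array(::2, ::2))
end program demo
```

With gfortran, compiling with -Warray-temporaries should additionally report where such a temporary is created.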
I am writing a code that manipulates a lot of 2D arrays. I would like the code to be as concise as possible, so I would like to use whole-array ('implicit') operations wherever I can, but I don't really know how to write them for 2D arrays.
For example:
DO J=1,N
DO I=1,M
A(I,J)=B(J)*A(I,J)
ENDDO
ENDDO
easily becomes:
DO J=1,N
A(:,J)=B(J)*A(:,J)
ENDDO
Is there a way to eliminate the loop over J as well?
Thanks
For brevity and clarity, you could wrap these operations in a derived type. I wrote a minimal example which is not so concise because I need to initialise the objects, but once this initialisation is done, manipulating your arrays becomes very concise and elegant.
I stored in arrays_module.f90 a derived type arrays2d_T which can hold the array coefficients, plus useful information (number of rows and columns). This type contains procedures for initialisation, and the operation you are trying to perform.
module arrays_module
implicit none
integer, parameter :: dp = kind(0.d0) !double precision definition
type :: arrays2d_T
real(kind=dp), allocatable :: dat(:,:)
integer :: nRow, nCol
contains
procedure :: kindOfMultiply => array_kindOfMuliply_vec
procedure :: init => initialize_with_an_allocatable
end type
contains
subroutine initialize_with_an_allocatable(self, source_dat, nRow, nCol)
class(arrays2d_t), intent(inOut) :: self
real(kind=dp), allocatable, intent(in) :: source_dat(:,:)
integer, intent(in) :: nRow, nCol
allocate (self%dat(nRow, nCol), source=source_dat)
self%nRow = nRow
self%nCol = nCol
end subroutine
subroutine array_kindOfMuliply_vec(self, vec)
class(arrays2d_t), intent(inOut) :: self
real(kind=dp), allocatable, intent(in) :: vec(:)
integer :: iRow, jCol
do jCol = 1, self%nCol
do iRow = 1, self%nRow
self%dat(iRow, jCol) = vec(jCol)*self%dat(iRow, jCol)
end do
end do
end subroutine
end module arrays_module
Then, in main.f90, I check the behaviour of this multiplication on a simple example:
program main
use arrays_module
implicit none
type(arrays2d_T) :: A
real(kind=dp), allocatable :: B(:)
! auxiliary variables that are only useful for initialization
real(kind=dp), allocatable :: Aux_array(:,:)
integer :: M = 3
integer :: N = 2
! initialise the 2d array
allocate(Aux_array(M,N))
Aux_array(:,1) = [2._dp, -1.4_dp, 0.3_dp]
Aux_array(:,2) = [4._dp, -3.4_dp, 2.3_dp]
call A%init(aux_array, M, N)
! initialise vector
allocate (B(N))
B = [0.3_dp, -2._dp]
! compute the product
call A%kindOfMultiply(B)
print *, A%dat(:,1)
print *, A%dat(:,2)
end program main
Compilation can be as simple as gfortran -c arrays_module.f90 && gfortran -c main.f90 && gfortran -o main.out main.o arrays_module.o
Once this object-oriented machinery exists, call A%kindOfMultiply(B) is much clearer than a FORALL approach (and much less error prone).
No one has mentioned the do concurrent construct here, which gives the compiler the opportunity to parallelize the loop automatically and speed up your code:
do concurrent(j=1:n); A(:,j)=B(j)*A(:,j); end do
A one-line solution can be achieved by using FORALL:
FORALL(J=1:N) A(:,J) = B(J)*A(:,J)
Note that FORALL has been declared obsolescent in the most recent versions of the standard (since Fortran 2018), but as far as I know it is the only way to perform that operation in a single line of code.
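For comparison, a whole-array expression can also cover both loops by replicating B along the first dimension with the SPREAD intrinsic. This is a minimal sketch, not a performance recommendation: the module column_scaling and the subroutine scale_columns are hypothetical names, and the compiler may well create a temporary M-by-N array for the SPREAD result.

```fortran
module column_scaling
  implicit none
  integer, parameter :: dp = kind(0.d0)
contains
  ! Multiply column j of A by B(j) for every j, in one array expression
  subroutine scale_columns(A, B)
    real(dp), intent(inout) :: A(:, :)
    real(dp), intent(in) :: B(:)
    ! spread(B, dim=1, ncopies=M) is an M-by-N array whose (i,j) entry is B(j)
    A = A * spread(B, dim=1, ncopies=size(A, 1))
  end subroutine scale_columns
end module column_scaling

program demo
  use column_scaling
  implicit none
  real(dp) :: A(3, 2), ref(3, 2), B(2)
  integer :: j

  A(:, 1) = [2._dp, -1.4_dp, 0.3_dp]
  A(:, 2) = [4._dp, -3.4_dp, 2.3_dp]
  B = [0.3_dp, -2._dp]

  ! Reference result from the explicit loop over columns
  do j = 1, size(B)
    ref(:, j) = B(j) * A(:, j)
  end do

  call scale_columns(A, B)
  if (any(abs(A - ref) > 1.0e-12_dp)) error stop "spread form disagrees with the loop"
  print *, A
end program demo
```

The loop over J is usually just as readable and avoids the possible temporary, so this form is mainly of interest when conciseness matters more than memory traffic.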
I am trying to adapt a Fortran code (Gfortran) to make use of OpenMP. It is a particle based code where the index of arrays can correspond to particles or pairs. The code uses a derived type to store a number of matrices for each particle. It is very common to come across loops which require the use of a matrix stored in this derived type. This matrix may be accessed by multiple threads. The loop also requires a reduction over an element in this derived type. I currently have to write a temporary array in order to do this reduction and then I set the element of the derived type equal to this temporary reduction array. If not using OpenMP no temporary array is needed.
Question: Is it possible to do a reduction over an element of a derived type? I don't think I can do a reduction over the entire derived type as I need to access some of the elements in the derived type to do work, which means it needs to be SHARED. (From reading the specification I understand that when using REDUCTION a private copy of each list item is created.)
A complete minimal working example is below. It could be more minimal, but I feared that removing more components might oversimplify the problem.
PROGRAM TEST_OPEN_MP
USE, INTRINSIC :: iso_fortran_env
USE omp_lib
IMPLICIT NONE
INTEGER, PARAMETER :: dp = REAL64
INTEGER, PARAMETER :: ndim=3
INTEGER, PARAMETER :: no_partic=100000
INTEGER, PARAMETER :: len_array=1000000
INTEGER :: k, i, ii, j, jj
INTEGER, DIMENSION(1:len_array) :: pair_i, pair_j
REAL(KIND=dp), DIMENSION(1:len_array) :: pair_i_r, pair_j_r
REAL(KIND=dp), DIMENSION(1:no_partic) :: V_0
REAL(KIND=dp), DIMENSION(1:ndim,1:no_partic) :: disp, foovec
REAL(KIND=dp), DIMENSION(1:ndim,1:len_array) :: dvx
REAL(KIND=dp), DIMENSION(1:2*ndim,1:len_array):: vec
REAL(KIND=dp), DIMENSION(1:ndim) :: disp_ij,temp_vec1,temp_vec2
REAL(KIND=dp), DIMENSION(1:ndim,1:ndim) :: temp_ten1,temp_ten2
REAL(KIND=dp), DIMENSION(1:no_partic,1:ndim,1:ndim):: reduc_ten1
REAL(KIND=dp) :: sum_check1,sum_check2,cstart,cend
TYPE :: matrix_holder !<-- The derived type
REAL(KIND=dp), DIMENSION(1:ndim,1:ndim) :: mat1 !<-- The first element
REAL(KIND=dp), DIMENSION(1:ndim,1:ndim) :: mat2 !<-- The second element, etc.
END TYPE matrix_holder
TYPE(matrix_holder), DIMENSION(1:no_partic) :: matrix
! Setting "random" values to the arrays
DO k = 1, no_partic
CALL random_number(matrix(k)%mat1(1:ndim,1:ndim))
CALL random_number(matrix(k)%mat2(1:ndim,1:ndim))
END DO
CALL random_number(pair_i_r)
CALL random_number(pair_j_r)
CALL random_number(disp)
CALL random_number(vec)
CALL random_number(dvx)
CALL random_number(V_0)
disp = disp*10.d0
vec = vec*100.d0
dvx = dvx*200.d0
V_0 = V_0*10d0
pair_i = FLOOR(no_partic*pair_i_r)+1
pair_j = FLOOR(no_partic*pair_j_r)+1
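! Zero the reduction targets so the accumulated sums start from defined values
foovec = 0.d0
reduc_ten1 = 0.d0
! Doing the work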
cstart = omp_get_wtime()
!$OMP PARALLEL DO DEFAULT(SHARED) &
!$OMP& PRIVATE(i,j,k,disp_ij,temp_ten1,temp_ten2,temp_vec1,temp_vec2,ii,jj), &
!$OMP& REDUCTION(+:foovec,reduc_ten1), SCHEDULE(static)
DO k= 1, len_array
i = pair_i(k)
j = pair_j(k)
disp_ij(1:ndim) = disp(1:ndim,i)-disp(1:ndim,j)
temp_vec1 = MATMUL(matrix(i)%mat2(1:ndim,1:ndim),&
vec(1:ndim,k))
temp_vec2 = MATMUL(matrix(j)%mat2(1:ndim,1:ndim),&
vec(1:ndim,k))
DO jj=1,ndim
DO ii = 1,ndim
temp_ten1(ii,jj) = -disp_ij(ii) * vec(jj,k)
temp_ten2(ii,jj) = disp_ij(ii) * vec(ndim+jj,k)
END DO
END DO
reduc_ten1(i,1:ndim,1:ndim)=reduc_ten1(i,1:ndim,1:ndim)+temp_ten1*V_0(j) !<--The temporary reduction array
reduc_ten1(j,1:ndim,1:ndim)=reduc_ten1(j,1:ndim,1:ndim)+temp_ten2*V_0(i)
foovec(1:ndim,i) = foovec(1:ndim,i) - temp_vec1(1:ndim)*V_0(j) !<--A generic reduction vector
foovec(1:ndim,j) = foovec(1:ndim,j) + temp_vec1(1:ndim)*V_0(i)
END DO
!$OMP END PARALLEL DO
cend = omp_get_wtime()
! Checking the results
sum_check1 = 0.d0
sum_check2 = 0.d0
DO i = 1,no_partic
matrix(i)%mat2(1:ndim,1:ndim)=reduc_ten1(i,1:ndim,1:ndim) !<--Writing the reduction back to the derived type element
sum_check1 = sum_check1+SUM(foovec(1:ndim,i))
sum_check2 = sum_check2+SUM(matrix(i)%mat2(1:ndim,1:ndim))
END DO
WRITE(*,*) sum_check1, sum_check2, cend-cstart
END PROGRAM TEST_OPEN_MP
The only other alternative I can think of would be to remove all the derived types and replace these with large arrays similar to reduc_ten1 in the example.
Unfortunately, what you want is not possible. At least if I understood your (very complicated for me!) code correctly.
The problem is that you have an array of derived types, each of which has an array component. You cannot reference all of those components at once.
Consider this toy example:
type t
real :: mat(3)
end type
integer, parameter :: n = 100, nk = 1000
type(t) :: parts(n)
integer :: i
real :: array(3,n,nk)
do k = 1, nk
array(:,:,k) = k
end do
do i = 1, n
parts(i)%mat = 0
end do
!$omp parallel do reduction(+:parts%mat)
do k = 1, nk
do i = 1, n
parts(i)%mat = parts(i)%mat + array(:,i,k)
end do
end do
!$omp end parallel do
end
Intel Fortran gives a more concrete error:
reduction6.f90(23): error #6159: A component cannot be an array if the encompassing structure is an array. [MAT]
!$omp parallel do reduction(+:parts%mat)
--------------------------------------^
reduction6.f90(23): error #7656: Subobjects are not allowed in this OpenMP* clause; a named variable must be specified. [PARTS]
!$omp parallel do reduction(+:parts%mat)
--------------------------------^
Remember that even without any OpenMP this is not allowed:
parts%mat = 0
Intel:
reduction6.f90(21): error #6159: A component cannot be an array if the encompassing structure is an array. [MAT]
gfortran:
Error: Two or more part references with nonzero rank must not be specified at (1)
You must do this:
do i = 1, n
parts(i)%mat = 0
end do
The reason for the error reported by Intel above is very similar.
Actually no derived type components are allowed in the reduction clause, only variable names can be used. That is the reason for the syntax error reported by gfortran. It does not expect any % there. Intel again gives a clearer error message.
But one could work around that, for example by passing the component to a subroutine and doing the reduction there.
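A minimal sketch of that workaround, based on the toy example rather than on the code in the question: the module reduction_workaround and the subroutine accumulate are hypothetical names. Inside the subroutine the dummy argument mat is a plain named variable, so it may appear in the REDUCTION clause even though the actual argument is a derived type component. Note that this parallelizes the loop over k for one particle at a time, which is an assumption made for illustration and not the loop structure of the original program.

```fortran
module reduction_workaround
  implicit none
  type :: t
    real :: mat(3)
  end type t
contains
  ! mat is a named dummy variable here, so REDUCTION(+:mat) is accepted
  subroutine accumulate(mat, contrib)
    real, intent(inout) :: mat(3)
    real, intent(in) :: contrib(:, :)   ! shape (3, nk)
    integer :: k
    !$omp parallel do reduction(+:mat)
    do k = 1, size(contrib, 2)
      mat = mat + contrib(:, k)
    end do
    !$omp end parallel do
  end subroutine accumulate
end module reduction_workaround

program demo
  use reduction_workaround
  implicit none
  integer, parameter :: n = 4, nk = 10
  type(t) :: parts(n)
  real :: array(3, n, nk)
  integer :: i, k

  do k = 1, nk
    array(:, :, k) = k
  end do

  do i = 1, n
    parts(i)%mat = 0
    ! The component of a single element is a valid actual argument
    call accumulate(parts(i)%mat, array(:, i, :))
  end do

  ! Every entry should equal 1 + 2 + ... + nk = 55
  if (any(abs(parts(1)%mat - 55.0) > 1.0e-4)) error stop "unexpected reduction result"
  print *, parts(1)%mat
end program demo
```

Compiled without OpenMP the directives are ignored as comments and the program gives the same sums serially.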
I want to implement an efficient library for bitwise operations on big integers. I've written the following function that overrides BTEST:
FUNCTION testb_i2b(n,i)
INTEGER(I8B), DIMENSION(0:), INTENT(IN) :: n
INTEGER(I2B), INTENT(IN) :: i
INTEGER(I2B) :: j
LOGICAL :: testb_i2b
j = ISHFT(i,-6)
IF ( j .LE. UBOUND(n,1) ) THEN
testb_i2b = BTEST(n(j),i-ISHFT(j,6))
ELSE
testb_i2b = .FALSE.
END IF
END FUNCTION testb_i2b
The array n contains the 64*(SIZE(n)-1) bits of my big integer. Is there a more efficient way to obtain the same functionality?
I don't know whether this is faster than your version; I'll leave you to test that. It involves fewer operations and no explicit if statement, so it might be. It gives the same results as your code for the few tests I've run. I've hard-wired the size of the integers in the bignum at 64 bits; you could make that a parameter if you wanted to.
LOGICAL FUNCTION btest_bignum(bn,ix)
USE, INTRINSIC :: iso_fortran_env, ONLY: int16, int64
IMPLICIT NONE
INTEGER(int64), DIMENSION(0:), INTENT(in) :: bn
INTEGER(int16), INTENT(in) :: ix
INTEGER :: array_ix
array_ix = ix/64
btest_bignum = BTEST(bn(array_ix), ix-(array_ix*64))
END FUNCTION btest_bignum
Note that I've used the now-standard kind constants int64 and int16 from the intrinsic module iso_fortran_env.
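A further variant worth timing, offered only as a sketch: keep the shift from the question for the word index, take the bit position with a mask (IAND with 63) instead of a subtraction, and retain the upper-bound check so that out-of-range positions still return .FALSE. The names bignum_bits and btest_mask are hypothetical, and the code assumes ix is non-negative.

```fortran
module bignum_bits
  use, intrinsic :: iso_fortran_env, only: int16, int64
  implicit none
contains
  logical function btest_mask(bn, ix)
    integer(int64), dimension(0:), intent(in) :: bn
    integer(int16), intent(in) :: ix   ! assumed non-negative
    integer :: word

    word = ishft(int(ix), -6)          ! index of the 64-bit word: ix / 64
    if (word > ubound(bn, 1)) then
      btest_mask = .false.             ! position lies beyond the stored words
    else
      btest_mask = btest(bn(word), iand(int(ix), 63))   ! bit within the word: mod(ix, 64)
    end if
  end function btest_mask
end module bignum_bits

program demo
  use bignum_bits
  implicit none
  integer(int64) :: bn(0:1)

  bn = 0_int64
  bn(1) = ibset(bn(1), 6)              ! sets overall bit 64 + 6 = 70

  if (.not. btest_mask(bn, 70_int16)) error stop "bit 70 should be set"
  if (btest_mask(bn, 6_int16)) error stop "bit 6 should be clear"
  if (btest_mask(bn, 200_int16)) error stop "out-of-range bit should read as clear"
  print *, "all checks passed"
end program demo
```

Whether the mask is measurably faster than the subtraction will depend on the compiler and optimization level, so it needs to be benchmarked rather than assumed.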
I have an optimization solver in Fortran 90. If I want to change the objective function,
I have to modify the main file and write the objective function in this way:
subroutine fobj(n,x,f)
implicit none
integer :: n
real(8) :: f
real(8) :: x(n)
intent(in ) :: n,x
intent(out) :: f
!OBJECTIVE FUNCTION
f = x(1)**2-x(2)+2*x(3)
end subroutine fobj
I have a big objective function, so I want to take this line "f = x(1)**2-x(2)+2*x(3)", or at least the subroutine, from an external file.
Is that possible? (I'm new to Fortran.)
I know that I could modify the file with Python, but I want to keep the function in another file.
Thanks a lot!
Sure. Use:
include 'file.inc'
to include source code from an external file.
I'm not sure if this is what you're looking for, but:
Fortran also allows you to pass subroutine/function names around as actual arguments to subroutine/function calls. The corresponding dummy arguments must have the "external" attribute.
subroutine fobj(n,x,f,func)
implicit none
integer :: n
real(8),external :: func
real(8) :: f
real(8) :: x(n)
intent(in ) :: n,x
intent(out) :: f
!OBJECTIVE FUNCTION
f=func(x,n)
end subroutine fobj
function func1(x,n)
implicit none
real(8) func1
integer n
real(8) :: x(n)
func1 = x(1)**2-x(2)+2*x(3) ! assign to the function name so a result is returned
end function func1
function func2(x,n)
implicit none
real(8) func2
integer n
real(8) :: x(n)
func2 = x(1)**2+x(2)+2*x(3) ! assign to the function name so a result is returned
end function func2
program main
real(8),external :: func1,func2
real(8),allocatable :: x(:)
real(8) :: f
integer n
n=50
allocate(x(n))
x=10. !Set X to a known value
call fobj(n,x,f,func1) !Call func1
print*,f !10**2-10+2*10 = 110
x=10. !Reset X ... just to make sure there is no funny business in func1,func2
call fobj(n,x,f,func2) !Call func2
print*,f !10**2+10+2*10 = 130
deallocate(x)
end program main
Of course, this program does nothing useful other than call func1 and func2 in obscure ways, but hopefully it illustrates the point. If you're looking to switch out the function at compile time, then I think an include "myfile" is probably cleaner (just switching which file you include, as suggested by @AlejandroLL).
You might also try to use modules in your program. When you pass certain kinds of arguments to your subroutines/functions (assumed-shape arrays, for example), you need explicit interfaces for them. Putting the procedures in modules improves the structure of your program, and all the interfaces are generated automatically.
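To make the module suggestion concrete, here is a minimal sketch that combines it with the procedure-argument idea from the previous answer. The module name objectives and the abstract interface objective are hypothetical; func1, func2 and fobj mirror the earlier example but take assumed-shape arrays, so n no longer has to be passed. Because everything lives in a module, the compiler checks each call against the interface.

```fortran
module objectives
  implicit none
  integer, parameter :: dp = kind(0.d0)

  ! Signature that every objective function must match
  abstract interface
    function objective(x) result(f)
      import :: dp
      real(dp), intent(in) :: x(:)
      real(dp) :: f
    end function objective
  end interface
contains
  function func1(x) result(f)
    real(dp), intent(in) :: x(:)
    real(dp) :: f
    f = x(1)**2 - x(2) + 2*x(3)
  end function func1

  function func2(x) result(f)
    real(dp), intent(in) :: x(:)
    real(dp) :: f
    f = x(1)**2 + x(2) + 2*x(3)
  end function func2

  ! The solver-side routine only knows the interface, not the concrete function
  subroutine fobj(x, f, func)
    real(dp), intent(in) :: x(:)
    real(dp), intent(out) :: f
    procedure(objective) :: func
    f = func(x)
  end subroutine fobj
end module objectives

program demo
  use objectives
  implicit none
  real(dp) :: x(3), f

  x = 10._dp
  call fobj(x, f, func1)
  if (abs(f - 110._dp) > 1.0e-12_dp) error stop "func1: expected 10**2 - 10 + 2*10 = 110"
  call fobj(x, f, func2)
  if (abs(f - 130._dp) > 1.0e-12_dp) error stop "func2: expected 10**2 + 10 + 2*10 = 130"
  print *, "both objective functions evaluated as expected"
end program demo
```

The module can live in its own file, so changing the objective function means editing or swapping that file while the solver source stays untouched.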