I have the following implementation of the MergeSort algorithm in Fortran.
My question is about call merge(work(1 : half), A(half + 1:), A).
Obviously I have overlapping memory, but from looking at the code in merge, this should be no problem, as long as the input arrays are sorted. (Which they are assumed to be anyway.)
On the other hand Fortran compilers may assume non aliased memory,
so I always think "don't do this".
I have two questions now:
When and how can I run into problems with my merge subroutine.
If I cannot implement MergeSort like this, how do I do it without creating a temporary array.
!> Merge sorted arrays A and B into C while preversing order.
subroutine merge(A, B, C)
implicit none
integer, intent(in) :: A(:), B(:)
integer, intent(inout) :: C(:)
integer :: i, j, k
if (size(A) + size(B) > size(C)) abort
i = 1; j = 1
do k = 1, size(C)
if (i <= size(A) .and. j <= size(B)) then
if (A(i) <= B(j)) then
C(k) = A(i)
i = i + 1
else
C(k) = B(j)
j = j + 1
end if
else if (i <= size(A)) then
C(k) = A(i)
i = i + 1
else if (j <= size(B)) then
C(k) = B(j)
j = j + 1
end if
end do
end subroutine merge
recursive subroutine MergeSort(A, work)
implicit none
integer, intent(inout) :: A(:)
integer, intent(inout) :: work(:)
integer :: half
half = (size(A) + 1) / 2
if (size(A) < 2) then
continue
else if (size(A) == 2) then
call naive_sort(A)
else
call MergeSort(A( : half), work)
call MergeSort(A(half + 1 :), work)
if (A(half) > A(half + 1)) then
work(1 : half) = A(1 : half)
! TODO: Non aliasing rule.
call merge(work(1 : half), A(half + 1:), A)
endif
end if
end subroutine MergeSort
PS: As you perhaps notice, the array C in the merge subroutine is declared as an inout parameter, because it is later used with overlapping memory.
This use of aliasing in calling merge is erroneous.
With
call merge(work(1 : half), A(half + 1:), A)
the dummy argument B is associated with A(half+1:) and the dummy argument C with A which is the understood overlap.
This aliasing means that the elements of B may not be defined (which is additionally required by the intent) and that the last few elements of C may not be defined.
However, if we look at the main loop in merge we see that in general every element of C appears in a statement looking like C(k)=...: we expect at least one of those conditions inside to be true. This is therefore invalid.
To be clear: a statement like C(k)=B(j) would be an illegal definition even if the value of C(k) doesn't change as a result.
Fortunately, perhaps, there is an easy way to create a temporary array to avoid aliasing: give the dummy argument B the value attribute. You could even do the same to A and remove the workspace array.
Related
Im trying to print prime numbers till 10000. (display the first five for testing)
This is my program
program primes
implicit none
integer :: array(1229)
integer :: i, ind
logical :: is_prime
ind = 1
do i = 2, 10000, 1
if (is_prime(i) .eqv. .true.) then
array(ind) = i
ind = ind + 1
end if
end do
print *, array(1)
print *, array(2)
print *, array(3)
print *, array(4)
print *, array(5)
end program primes
function is_prime(n) result(ispr)
implicit none
integer :: c, i
integer, intent(in) :: n
logical :: ispr
c = 0
do i = 2, n
if (mod(i,2) == 0) then
c = c + 1
end if
end do
ispr = (c == 0)
end function is_prime
I don't know why but this is the output
9175178
6417360
5374044
6750309
7536745
Why does this happen and how to correct?
is_prime should be(n is the only divider of n besides 1 <=> c == 1)
function is_prime(n) result(ispr)
implicit none
integer :: c, i
integer, intent(in) :: n
logical :: ispr
c = 0
do i = 2, n
if (mod(n,i) == 0) then
c = c + 1
end if
end do
ispr = (c == 1)
end function is_prime
Could be optimezed by leaving the loop when c == 1 and i < n(after adding 1 to c)...
See on online fortran compiler
version with exit loop
While I am not familiar with modern Fortran, it looks to me as if function is_prime(n) result(ispr) is not working.
In the do loop in that function, you want a loop that tests thus:
is n divisible by 2?
is n divisible by 3?
is n divisible by 4?
is n divisible by 5?
and so on.
But, what it is actually doing is asking these:
is 2 divisible by 2?
is 3 divisible by 2?
is 4 divisible by 2?
is 5 divisible by 2?
and so on.
As a result, your counter will always have a non-zero value, and your function will always return false.
But, that's not the only problem. From your results, it looks like your Fortran implementation does not automatically initialize variables. Suppose I have statements like the following:
integer :: b
print *,b
What will be the result?
Remember, the names of variables represent locations in the computer's memory. If a variable is not initialized, it's value will be what was in the memory location before your program started to run. This value will not be related to your program.
I have 2 suggestions to fix the 2nd problem:
Prior to do i = 2, 10000, 1, have another loop that sets each value in array.
Set a values of each array (i) inside your do i = 2, 10000, 1 loop. One way to do this is to set one value when (is_prime(i) .eqv. .true.) is true and a different value when it is false.
Here's my factorial function in Fortran.
module facmod
implicit none
contains
function factorial (n) result (fac)
use FMZM
integer, intent(in) :: n
integer :: i
type(IM) :: fac
fac = 1
if(n==0) then
fac = 1
elseif(n==1) then
fac = 1
elseif(n==2) then
fac = 2
elseif(n < 0) then
write(*,*) 'Error in factorial N=', n
stop 1
else
do i = 1, n
fac = fac * i
enddo
endif
end function factorial
end module facmod
program main
use FMZM
use facmod, only: factorial
implicit none
type(IM) :: res
integer :: n, lenr
character (len=:), allocatable :: str
character(len=1024) :: fmat
print*,'enter the value of n'
read*, n
res = factorial(n)
lenr = log10(TO_FM(res))+2
allocate(character(len=lenr) :: str)
write (fmat, "(A5,I0)") "i", lenr
call im_form(fmat, res, str)
print*, trim( adjustl(str))
end program main
I compile using FMZM:
gfortran -std=f2008 fac.F90 fmlib.a -o fac
echo -e "1000" | .fac computes easy. However, if I give this echo -e "3600" | .fac, I already get an error on my machine:
Error in FM. More than 200000 type (FM), (ZM), (IM) numbers
have been defined. Variable SIZE_OF_START in file
FMSAVE.f95 defines this value.
Possible causes of this error and remedies:
(1) Make sure all subroutines (also functions that do not
return type FM, ZM, or IM function values) have
CALL FM_ENTER_USER_ROUTINE
at the start and
CALL FM_EXIT_USER_ROUTINE
at the end and before any other return, and all
functions returning an FM, ZM, or IM function value have
CALL FM_ENTER_USER_FUNCTION(F)
at the start and
CALL FM_EXIT_USER_FUNCTION(F)
at the end and before any other return, where the actual
function name replaces F above.
Otherwise that routine could be leaking memory, and
worse, could get wrong results because of deleting some
FM, ZM, or IM temporary variables too soon.
(2) Make sure all subroutines and functions declare any
local type FM, ZM, or IM variables as saved. Otherwise
some compilers create new instances of those variables
with each call, leaking memory.
For example:
SUBROUTINE SUB(A,B,C,X,Y,RESULT)
TYPE (FM) :: A,B,C,X,Y,RESULT,ERR,TOL,H
Here A,B,C,X,Y,RESULT are the input variables and
ERR,TOL,H are local variables. The fix is:
SUBROUTINE SUB(A,B,C,X,Y,RESULT)
TYPE (FM) :: A,B,C,X,Y,RESULT
TYPE (FM), SAVE :: ERR,TOL,H
(3) Since = assignments for multiple precision variables are
the trigger for cleaning up temporary multiple precision
variables, a loop with subroutine calls that has no =
assignments can run out of space to store temporaries.
For example:
DO J = 1, N
CALL SUB(A,B,C,TO_FM(0),TO_FM(1),RESULT)
ENDDO
Most compilers will create two temporary variables with
each call, to hold the TO_FM values.
One fix is to put an assignment into the loop:
DO J = 1, N
ZERO = TO_FM(0)
CALL SUB(A,B,C,ZERO,TO_FM(1),RESULT)
ENDDO
(4) If a routine uses allocatable type FM, ZM, or IM arrays
and allocates and deallocates with each call, then after
many calls this limit on number of variables could be
exceeded, since new FM variable index numbers are
generated for each call to the routine.
A fix for this is to call FM_DEALLOCATE before actually
deallocating each array, so those index numbers can be
re-used. For example:
DEALLOCATE(T)
becomes:
CALL FM_DEALLOCATE(T)
DEALLOCATE(T)
(5) If none of this helps, try running this program again
after increasing the value of SIZE_OF_START and
re-compiling.
What optimizations or Fortran idioms am I missing that is hurting my performance so much?
For example, in python, I can factorial numbers much larger than 3500:
>>> import math
>>> math.factorial(100000)
Or in Haskell:
Prelude> product [1..100000]
Both these compute, not exactly quickly, but without error.
How can I improve my algorithm or better use existing libraries to improve performance of large integer factorials in Fortran? Is there a more appropriate big integer library than FMZM?
Try this. Apart from minor cosmetic changes, I just followed the recommendations of the error message in your question:
added calls to FM_ENTER_USER_FUNCTION and FM_EXIT_USER_FUNCTION,
added an assignment inside the loop (without this ii = to_im(i), it still fails, but I'm not sure why, as there is already an assignment with fac = fac * i, and accordind to the doc the assignment triggers cleaning up temporaries),
renamed factorial in main program as there is already a function with this name in FMZM.
Tested with ifort and n=100000.
module fac_mod
implicit none
contains
function factorial(n) result(fac)
use FMZM
integer, intent(in) :: n
integer :: i
type(IM) :: fac
type(IM), save :: ii
call FM_ENTER_USER_FUNCTION(fac)
fac = to_im(1)
if (n < 0) then
write (*, *) "Error in factorial N=", n
stop 1
else if (n > 1) then
do i = 1, n
ii = to_im(i)
fac = fac * ii
end do
end if
call FM_EXIT_USER_FUNCTION(fac)
end function factorial
end module fac_mod
program main
use FMZM
use fac_mod, only: f=>factorial
implicit none
type(IM) :: res
integer :: n, lenr
character(:), allocatable :: str
character(1024) :: fmat
print *, "enter the value of n"
read *, n
res = f(n)
lenr = 2 + log10(TO_FM(res))
allocate (character(lenr) :: str)
write (fmat, "(A5,I0)") "i", lenr
call im_form(fmat, res, str)
print *, trim(adjustl(str))
end program main
I have this simple Fortran 90 program:
subroutine apc_wrapper(i, j, k)
implicit none
integer*8, intent(in) :: i, j
integer*8, intent(out) :: k
double precision t
k = i + js
end subroutine
compiled as a shared library
gfortran -O2 -shared -fPIC apc_wrapper.f90 -o apc_wrapper.so
Now, I want to call this subroutine from Julia, with all integer arguments, like this
i = 2
j = 3
k = 0
ccall( (:apc_wrapper_, "./apc_wrapper.so"), Void, (Ptr{Int64}, Ptr{Int64}, Ptr{Int64}), &i, &j, &k)
But it won't work. k won't change its value and keep evaluating to 0.
But, if I do this
i = 2
j = 3
kk = [0]
ccall( (:apc_wrapper_, "./apc_wrapper.so"), Void, (Ptr{Int64}, Ptr{Int64}, Ptr{Int64}), &i, &j, kk)
That is, use an array to store the output, it works! After calling the subroutine, kk evaluates to
1-element Array{Int64,1}:
5
And I didn't change the Fortran code at all, it didn't even know it was dealing with an array, just a block of memory.
So, if Fortran is able to read blocks of memory (i and j were properly red) why isn't able to write into them?
I don't have any problem with this. Actually, I want to use an array as output but still, this behavior surprised me.
Well, Julia is a fast-paced developing language and it turns out that the &variable syntax is deprecated.
This would be the proper way to do this:
i = 2
j = 3
k = 0
i_ref = Ref{Int64}(i)
j_ref = Ref{Int64}(j)
k_ref = Ref{Int64}(k)
ccall( (:apc_wrapper_, "./apc_wrapper.so"), Void,
(Ref{Int64}, Ref{Int64}, Ref{Int64}),
i_ref, j_ref, k_ref)
and then k_ref.x will evaluate to 5.
If I just use my function once, it works properly. If I make it do a loop like down below, the four lines of commented code, my code malfunctions. I can't really figure out why it will always return T or F for every other number after the initial value.
Asterisks are in the parenthesis of WRITE and READ but it doesn't show up here for some reason.
PROGRAM PRIME
INTEGER :: N = 0, i = 1,x = 0
LOGICAL :: IP
WRITE (*,*) "Enter a number:"
READ (*,*) N
!DO WHILE ( N < 1000)
IP = IsPrime(N)
WRITE (*,*) IP, N
!N = N + 1
!END DO
read(*,*) x
CONTAINS
FUNCTION IsPrime(N)
LOGICAL :: IsPrime
INTEGER, INTENT(IN) :: N
IsPrime = .TRUE.
IF (N == 2) THEN
WRITE (*,*) N
ELSE
DO WHILE (i <= (N/2))
i = i + 2
IF (mod(N,i) == 0) THEN
IsPrime = .FALSE.
END IF
END DO
END IF
RETURN
END FUNCTION IsPrime
END PROGRAM PRIME
You're forgetting to reset i to 1 during each call to IsPrime.
The first time IsPrime is called, i=1 from the top of program main. However, i is incremented during the first IsPrime call to something other than 1 so the second call starts with i/=0.
Note that because IsPrime is contained within program main, IsPrime inherits i from program main.
I'm also bound to remind you to use implicit none everywhere to avoid other errors, although it's not a problem in this case.
Hermite Interpolation woes
I am trying to find the Newton Dividing Differences for the function and derivative values of a given set of x's. I'm running into serious problems with my code working for tiny examples, but failing on bigger one's. As is clearly visible, my answers are very much larger than they original function values.
Does anybody have any idea what I'm doing wrong?
program inter
implicit none
integer ::n,m
integer ::i
real(kind=8),allocatable ::xVals(:),fxVals(:),newtonDivDiff(:),dxVals(:),zxVals(:),zdxVals(:),zfxVals(:)
real(kind=8) ::Px
real(kind=8) ::x
Open(Unit=8,File="data/xVals")
Open(Unit=9,File="data/fxVals")
Open(Unit=10,File="data/dxVals")
n = 4 ! literal number of data pts
m = n*2+1
!after we get the data points allocate the space
allocate(xVals(0:n))
allocate(fxVals(0:n))
allocate(dxVals(0:n))
allocate(newtonDivDiff(0:n))
!allocate the zvalue arrays
allocate(zxVals(0:m))
allocate(zdxVals(0:m))
allocate(zfxVals(0:m))
!since the size is the same we can read in one loop
do i=0,n
Read(8,*) xVals(i)
Read(9,*) fxVals(i)
Read(10,*) dxVals(i)
end do
! contstruct the z illusion
do i=0,m,2
zxVals(i) = xVals(i/2)
zxVals(i+1) = xVals(i/2)
zdxVals(i) = dxVals(i/2)
zdxVals(i+1) = dxVals(i/2)
zfxVals(i) = fxVals(i/2)
zfxVals(i+1) = fxVals(i/2)
end do
!slightly modified business as usual
call getNewtonDivDiff(zxVals,zdxVals,zfxVals,newtonDivDiff,m)
do i=0,n
call evaluatePolynomial(m,newtonDivDiff,xVals(i),Px,zxVals)
print*, xVals(i) ,Px
end do
close(8)
close(9)
close(10)
stop
deallocate(xVals,fxVals,dxVals,newtonDivDiff,zxVals,zdxVals,zfxVals)
end program inter
subroutine getNewtonDivDiff(xVals,dxVals,fxVals,newtonDivDiff,n)
implicit none
integer ::i,k
integer, intent(in) ::n
real(kind=8), allocatable,dimension(:,:) ::table
real(kind=8),intent(in) ::xVals(0:n),dxVals(0:n),fxVals(0:n)
real(kind=8), intent(inout) ::newtonDivDiff(0:n)
allocate(table(0:n,0:n))
table = 0.0d0
do i=0,n
table(i,0) = fxVals(i)
end do
do k=1,n
do i = k,n
if( k .eq. 1 .and. mod(i,2) .eq. 1) then
table(i,k) = dxVals(i)
else
table(i,k) = (table(i,k-1) - table(i-1,k-1))/(xVals(i) - xVals(i-k))
end if
end do
end do
do i=0,n
newtonDivDiff(i) = table(i,i)
!print*, newtonDivDiff(i)
end do
deallocate(table)
end subroutine getNewtonDivDiff
subroutine evaluatePolynomial(n,newtonDivDiff,x,Px,xVals)
implicit none
integer,intent(in) ::n
real(kind=8),intent(in) ::newtonDivDiff(0:n),xVals(0:n)
real(kind=8),intent(in) ::x
real(kind=8), intent(out) ::Px
integer ::i
Px = newtonDivDiff(n)
do i=n,1,-1
Px = Px * (x- xVals(i-1)) + newtonDivDiff(i-1)
end do
end subroutine evaluatePolynomial
Values
x f(x) f'(x)
1.16, 1.2337, 2.6643
1.32, 1.6879, 2.9989
1.48, 2.1814, 3.1464
1.64, 2.6832, 3.0862
1.8, 3.1553, 2.7697
Output
1.1599999999999999 62.040113431002474
1.3200000000000001 180.40121445431600
1.4800000000000000 212.36319446149312
1.6399999999999999 228.61845650513027
1.8000000000000000 245.11610836104515
You are accessing array newtonDivDiff out of bounds.
You are first allocating it as 0:n (main program's n) then you are passing to subroutine getNewtonDivDiff as 0:n (the subroutine's n) but you pass m (m=n*2+1) to the argument n. That means you tell the subroutine that the array has bounds 0:m which is 0:9, but it has only bounds 0:4.
It is quite difficult to debug the program as it stands, I had to use valgrind. If you move your subroutines to a module and change the dummy arguments to assumed shape arrays (:,:) then the bound checking in gfortran (-fcheck=all) will catch the error.
Other notes:
kind=8 is ugly, 8 can mean different things for different compilers. If you want 64bit variables, you can use kind=real64 (real64 comes from module iso_fortran_env in Fortran 2008) or use selected_real_kind() (Fortran 90 kind parameter)
You do not have to deallocate your local arrays in the subroutines, they are deallocated automatically.
Your deallocate statement in the main program is after the stop statement, it will never be executed. I would just delete the stop, there is no reason to have it.