I have the following Fortran 95 code:
MODULE ISSUE
IMPLICIT NONE
CONTAINS
SUBROUTINE PROBLEM(A)
IMPLICIT NONE
DOUBLE PRECISION, DIMENSION(:,:), INTENT(INOUT) :: A
INTEGER :: i, n
n = SIZE(A, 2)
DO i = 1, n
PRINT *, A(i, 1:n)
ENDDO
END SUBROUTINE PROBLEM
END MODULE ISSUE
PROGRAM TEST
USE ISSUE
IMPLICIT NONE
DOUBLE PRECISION, DIMENSION(5, 5) :: A
A = TRANSPOSE(RESHAPE((/ 1, 2, 3, 4, 5, &
6, 7, 8, 9, 0, &
1, 2, 3, 4, 5, &
6, 7, 8, 9, 0, &
1, 2, 3, 4, 5/), SHAPE(A)))
CALL PROBLEM(A)
END PROGRAM TEST
And now I compile it using
gfortran -g -O0 problem.f95 -o problem
Then I run the program using GDB 7.7.1 and I set a breakpoint to line 11, (DO i = 1, n). After that, I print the first element of the matrix A. The output that I get is:
(gdb) break 11
Breakpoint 1 at 0x4008c5: file problem.f95, line 11.
(gdb) run
Starting program: /path_to_problem_folder/problem
Breakpoint 1, issue::problem (a=...) at problem.f95:11
warning: Source file is more recent than executable.
11 DO i = 1, n
(gdb) p A(1, 1)
$1 = 6.9533558074105031e-310
I expect the output 1, but 6.9533558074105031e-310 is printed by gdb. Note that the subroutine PROBLEM prints all values correctly.
Why this happens and how can I print the value A(1, 1) correctly using GDB?
I updated my GDB version from 7.7.1 to 7.11.1 and the problem disappeared.
Related
Accelerating a Fortran code with OpenACC using the PGI compiler, I got problems with a matmul call in an accelerated loop.
In the simplified example, I apply the identity matrix on two vectors, so the input and the output values should be the same:
program test
implicit none
integer :: a(3, 3)
integer :: v1(3, 2), v2(3, 2)
integer :: i
a = reshape([1, 0, 0, 0, 1, 0, 0, 0, 1], [3, 3])
v1 = reshape([1, 2, 3, 4, 5, 6], [3, 2])
print *, v1
!$acc kernels copyin(a, v1) copyout(v2)
!$acc loop independent
do i = 1, 2
v2(:, i) = matmul(a, v1(:, i))
enddo
!$acc end kernels
print *, v2
endprogram
When compiling with the PGI compiler version 20.9, I got these information:
test:
12, Generating copyin(a(:,:),v1(:,:)) [if not already present]
Generating implicit copyout(z_a_0(:)) [if not already present]
Generating copyout(v2(:,:)) [if not already present]
14, Loop is parallelizable
Generating Tesla code
14, !$acc loop gang ! blockidx%x
15, !$acc loop vector(32) ! threadidx%x
15, Loop is parallelizable
Running the code gives the following values:
1 2 3 4 5 6
4 5 6 4 5 6
the second line should be like the first one, which is the case on sequential execution. What is wrong in the code?
Looks to be a compiler issue. The problem line being:
Generating implicit copyout(z_a_0(:))
"z_a_0" is compiler temp array being created to hold the intermediary result from the call to matmul. It's declaration is being hoisted out of the loop and then copied back in as shared array. Since it's shared, it then causes a race condition.
I've submitted a problem report (TPR #29482) and sent it to our engineers for further investigation.
#Mat Colgrove explained the reason of the incorrect behavior. The workaround I found was to write the matrix vector multiplication explicitly:
program test
implicit none
integer :: a(3, 3)
integer :: v1(3, 2), v2(3, 2)
integer :: i, j, k
a = reshape([1, 0, 0, 0, 1, 0, 0, 0, 1], [3, 3])
v1 = reshape([1, 2, 3, 4, 5, 6], [3, 2])
print *, v1
!$acc kernels copyin(a, v1) copyout(v2)
!$acc loop independent
do i = 1, 2
!$acc loop seq
do k = 1, 3
v2(k, i) = 0
!$acc loop seq
do j = 1, 3
v2(k, i) = v2(k, i) + a(j, k) * v1(j, i)
enddo
enddo
enddo
!$acc end kernels
print *, v2
endprogram
What s wrong with the following code? It works fine when I run it with 1,2 and 5 cpus,
while with 3 cpus, it breaks at (1), and with 4 cpus, it breaks at (2). In both the cases the error reads:
An error occurred in MPI_Scatterv
reported by process [3248685057,3]
on communicator MPI_COMM_WORLD
MPI_ERR_TRUNCATE: message truncated.
Probably the problem is correlated with memory allocation, but I cannot understand it deeply...(I am not totally confident about how I allocated the distributed variables (3)) Some suggestion is warmly accepted
program import_and_divide
implicit none
include 'mpif.h'
integer :: i, k, io, n ,nnz
integer,allocatable, dimension(:) :: Ai, Aj
real*8, allocatable, dimension(:) :: Aa, Ab, x
integer*4 :: rank, mpi_stat, size
integer, allocatable, dimension(:) :: sendcounts, displ, sendcounts1, displ1
integer :: n_distro
integer, allocatable, dimension(:) :: ia, ja
real*8, allocatable, dimension(:) :: a, b
n = 5
nnz = 13
!.. Initialize MPI.
call MPI_INIT(mpi_stat)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, mpi_stat)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, mpi_stat)
if (rank.eq.0) then
allocate(Ai(n+1), Aj(nnz), Aa(nnz), Ab(n), x(n))
Ai = (/0, 3, 5, 8, 11, 13/)
Aj = (/0, 1, 3, 0, 1, 2, 3, 4, 0, 2, 3, 1, 4/)
Aa = (/1, -1, -3, -2 , 5, 4, 6, 4, -4, 2, 7, 8, -5/)
Ab(:) = ([1, 2, 3, 4, 5])
x(:) = 0.d0
print*, 'Number of cpus: ', size
print*, 'n, nnz: ', n, nnz
endif
call MPI_BARRIER(MPI_COMM_WORLD, mpi_stat)
allocate(sendcounts(size), displ(size), sendcounts1(size), displ1(size))
n_distro=(n+1)/size
k = 0
do i=1,size
if (i<size) then
sendcounts(i) = n_distro
else
sendcounts(i) = (n+0)-(size-1)*n_distro
endif
displ(i) = k
k = k + sendcounts(i)
end do
if (rank.eq.0) then
displ1 = Ai(displ+1)
do i=1,size-1
sendcounts1(i) = displ1(i+1)-displ1(i)
end do
sendcounts1(size) = nnz-displ1(size)
endif
call MPI_BARRIER(MPI_COMM_WORLD,mpi_stat)
if (rank.eq.0) then
print*, 'Sendcounts: ', sendcounts
print*, 'Displ: ', displ
endif
call MPI_BCAST(displ1,shape(displ1), MPI_INT,0,MPI_COMM_WORLD,mpi_stat)
call MPI_BCAST(sendcounts1, shape(sendcounts1), MPI_INT,0,MPI_COMM_WORLD,mpi_stat)
do i=0,size-1
if (i.eq.rank) then
allocate(b(sendcounts(i+1)), ia(sendcounts(i+1)+1) ,&
a(sendcounts1(i+1)), ja(sendcounts1(i+1))) !(3)
ia(:) = rank
ja(:) = 0
b(:) = 0.d0
a(:) = 0.d0
end if
call MPI_BARRIER(MPI_COMM_WORLD,mpi_stat)
end do
call MPI_scatterv(Ab, sendcounts, displ, MPI_DOUBLE,&
b, sendcounts, MPI_DOUBLE,0, MPI_COMM_WORLD,mpi_stat) ! (2) breaks here with mpirun -np 4
call MPI_scatterv(Ai, sendcounts+1, displ, MPI_INT,&
ia, sendcounts+1, MPI_INT,0, MPI_COMM_WORLD,mpi_stat)
call MPI_scatterv(Aa, sendcounts1, displ1, MPI_DOUBLE,&
a, sendcounts1, MPI_DOUBLE,0, MPI_COMM_WORLD,mpi_stat)
call MPI_scatterv(Aj, sendcounts1, displ1, MPI_INT,&
ja, sendcounts1, MPI_INT,0, MPI_COMM_WORLD,mpi_stat) ! (1) breaks here with mpirun -np 3
call MPI_FINALIZE(mpi_stat)
end
When I replace include "mpif.h" with use mpi and I got the following errors:
There is no specific subroutine for the generic ‘mpi_bcast’
There is no specific subroutine for the generic ‘mpi_scatterv’.
I also add -fcheck=all and this option provides this additional info (when compiled with mpif.h):
Allocatable actual argument 'ab' is not allocated
but this do not clarify my idea.
If the arrays Ai, Aj, Ab, Aa are allocated on all the processors, I got the same behaviour (works with 1,2 and 5 cpus and breaks with 3 and 4 cpus) and disappear the message Allocatable actual argument 'ab' is not allocated
If I replace sendcounts and sendcounts1 in the second half of MPI_scatterv with sendcounts(rank+1) and sendcounts1(rank+1), the code seems to work fine only if I remove the compiler flag -fcheck=all/-fsanitize=address(if the arrays Ai,Aj,Aa,Ab are allocated on all the processes as well as if allocated only on one process (rank=0)). With one of these two options I got the following error:
==25041==ERROR: LeakSanitizer: detected memory leaks
I am looking to output some numbers to the screen in Intel Fortran, By default Fortran writes a new line everytime I output something to the screen. Is there anyway to avoid this?
Here is a sample of my code:
PROGRAM sample
integer :: ind
do ind = 1,20
write(*,*) ind
!Some other stuff here
end do
END PROGRAM
I tried to use the no-advance option but it did not work:
PROGRAM sample
integer :: ind
do ind = 1,20
write(*,*,advance='no') ind
!Some other stuff here
end do
END PROGRAM
Ideally what I want is to output the following to the screen:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
All in the same line. Is there ant way of doing this in fortran?
This does not work with the format "*" (list directed). But it is OK with an explicit format :
program test
implicit none
integer :: ind
do ind = 1,20
write(*,'(1x,i0)',advance='no') ind
!Some other stuff here
end do
end program
Result :
[coul#localhost ~]$ gfortran test.f90
[coul#localhost ~]$ ./a.out
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20[coul#localhost ~]$
This question already has answers here:
Why does my original list change? [duplicate]
(2 answers)
Closed 6 years ago.
The answer to my question seems pretty straightforward : it can't so it doesn't. Yet I believe it is happening to me and it is driving me fairly crazy. Therefore I would very much appreciate your opinion.
Here's the situation. I'm writing this script which has the following function in it :
def ReduceReferenceCode():
Code = ReferenceCode
if E == 7:
CritLimit = 2
elif E == 4:
CritLimit = 1
if D < CritLimit:
for i in [4, 5, 6]:
if Code[i] >= CritLimit:
print ReferenceCode
Code[i] = Code[i] - CritLimit
print ReferenceCode
break
else:
Code[7] = Code[7] - CritLimit
Code[9] = 1
return Code
The value of my ReferenceCode variable - which is passed as an argument to the program with sys.argv - is changed between the two print commands. My main function prints both ReferenceCode and my reduced code for comparison purposes, which is the value stored in Code
Here's the program's output:
[1, 4, 3, 4, 9, 7, 2, 0, 6, 7, 9, 2]
[1, 4, 3, 4, 7, 7, 2, 0, 6, 7, 9, 2]
The reference code is [1, 4, 3, 4, 7, 7, 2, 0, 6, 1, 9, 2] and the reduced reference code is [1, 4, 3, 4, 7, 7, 2, 0, 6, 1, 9, 2]
Both variables should not have the same value and I really don't see why the operation on Code[i] is affecting ReferenceCode's value.
Any insight would be highly appreciated :-)
The value of your ReferenceCode is a list. That list has another reference called Code. Updating the list from either of these references changes the one-any-only list object. The fix is to copy the list
Code = ReferenceCode[:]
You did that :
Code = ReferenceCode
So both references are pointing to the same object.
I would like to do something like this in Fortran:
program where
real :: a(6) = (/ 4, 5, 6, 7, 8, 9 /)
print *, a(a>7)
end program
In Python I would typically do this with NumPy like this:
import numpy
a = numpy.array([ 4, 5, 6, 7, 8, 9])
print a[numpy.where(a>7)]
#or
print a[a>7]
I've played around, but nothing has worked thus far, but I'm guessing it is fairly simple.
I'll extend slightly the answer by #VladimirF as I suspect you don't want to limit yourself to the exact print example.
a>7 returns a logical array corresponding to a with .true. at index where the condition is met, .false. otherwise. The pack intrinsic takes such a mask and returns an array with those elements with .true. in the mask.
However, you can do other things with the mask which may fit under your numpy.where desire. For example, there is the where construct (and where statement) and the merge intrinsic. Further you can use pack again with the mask to get the indices and do more involved manipulations.
program where
real :: a(6) = (/ 4, 5, 6, 7, 8, 9 /)
print *, pack(a,a>7)
end program
You can find a related topic here: Better way to mask a Fortran array?
I think both where and merge can do the task.
In python, where has the ability to assign different value according to the mask, for example
a = np.array([4, 5, 6, 7, 8, 9])
b = np.where(a>7, 1, -1)
b will be array([-1, -1, -1, -1, 1, 1])
In Fortran, the equivalent of this is merge
real :: a(6) = (/ 4, 5, 6, 7, 8, 9 /)
real, allocatable :: b(:)
b = merge(1,-1,a>7)
print*, b
end
The MERGE function chooses alternative values based on the value of a mask. http://www.lahey.com/docs/lfpro78help/F95ARMERGEFn.htm
where can also do this, but it is slightly more complicated.
real :: a(6) = (/ 4, 5, 6, 7, 8, 9 /)
real, allocatable :: b(:)
b = a
where (a>7)
b = 1
else where
b = -1
end where
print*, b
end
a short version is this
b = a
b = -1
where (a>7) b = 1
You can find more information of where here: http://www.personal.psu.edu/jhm/f90/statements/where.html