Stalling at deallocate - fortran

My 2D hydro code stalls during the following subroutine (which computes the y-direction flux):
ALLOCATE(W1d(1:my,nFields),q1d(nFields),&
Wl(1:my,nFields),Wr(1:my,nFields))
PRINT *,"Main loop"
DO i=1,mx
DO j=1,my
q1d(1) = qVar(i,j,1,iRho)
q1d(2) = qVar(i,j,1, iE)
q1d(3) = qVar(i,j,1, ivy)
q1d(4) = qVar(i,j,1, ivx)
CALL Cons2Prim(q1d(:), W1d(j,:))
ENDDO
CALL lr_states(grid, W1d, dt, dy, Wl, Wr, dir)
DO j=1,my
Flux(i,j,:) = hllc_flux(wl(j,:), wr(j,:))
ENDDO
DO j=1,my
CALL Prim2Cons(Wl(j,:),Ul(i,j,:))
CALL Prim2Cons(Wr(j,:),Ur(i,j,:))
ENDDO
ENDDO
PRINT *,"Deallocating"
DEALLOCATE(W1d,q1d,Wl,Wr)
PRINT *,"Returning"
I separated the DEALLOCATE statement into 4 separate statements and found that whichever 2D array would come first, W1d, wl, or wr, was the cause of the stall. Ignoring the DEALLOCATE statement (which should produce an automatic deallocate when going back to the main) also causes a stall. The subroutine for the x-direction flux has the same arrays, is called before this subroutine, and has no problems deallocating them.
Any suggestions?
EDIT This is run on Fedora 18 and compiled with Intel Fortran 2013.3. It is a parallelized code, but I am running it on a single processor for testing/debugging purposes.

I did three different things and it suddenly started working again. Two of them I do not believe could have done it, while it is possible the third did it. The changes I made:
I did have the bounds of i and j loops defined slightly differently, so I made it uniform between the two directional sweeps
I ran make clean and make
I added -check bounds -check pointers -check uninit flags to the Makefile
I think the first two did not really do anything. The variable grid in the code above is a 2x2 array that contains the bounds of qVar; in the x-sweep I had defined mx = grid(1,2) - grid(1,1) + 1, similarly for my, but grid(1,1) is 1, so it really does not do much different. The second item above I had done at least 3 times.
But the last one I tried once and it started working again. I do not know how that could have fixed it, so if someone does know, please tell me!

Related

How to loop over an array only with one index?

I have the code kind of like this in F90:
real(8), dimension(10,10,10) :: A
do i = 1, 1000
print*,A(i,1,1)
enddo
I'm very surprised this worked and it's faster than simply looping over 3 dimensions by i,j,k.
Can someone please explain why this works?
Your code is illegal. But under the hood the memory layout of the array happens to be in the column major order as in the triple loop k,j,i, so the code appears to work. But it is illegal.
If you enable runtime error checks in your compiler (see the manual), it will find the error and report it.
It may be slightly faster, if you do not enable compiler optimizations, because there is some overhead in nested loops, but an optimizing compiler will optimize the code to one loop.
If you actually did (you should always show your code!!!)
do i=1,10
do j=1,10
do k=1,10
something with A(i,j,k)
then please note that that is the wrong order and you should loop in the k,j,i order.
Also please note that measuring the speed of printing to the screen is not useful and can be very tricky. Some mathematical operations are more useful.

Segmentation fault when enable $OMP DO loop

I am trying to modifying legacy code to initialize array with openmp. However, I encounter Segmentation fault when enabling $OMP DO derivatives in the following code sections. Would you please point out what might be wrong?
I am using fortran and compile with gfortran and variables are declared as common variables
common/quant/keosc,vosc,rosc,frt,grt,dipole,v_solv
common/quant_avg/frt_avg,grt_avg,d_coup,rv_avg,b_avg
!$OMP PARALLEL
!$OMP DO private(m,j,l,mp) firstprivate(nstates,natoms) lastprivate(rv_avg,b_avg,grt_avg,frt_avg,d_coup)
do m = 0, nstates - 1
rv_avg(m) = 0d0
b_avg(m) = 0d0
do j = 1, 3
grt_avg(m,j) = 0d0
do l = 1, natoms
frt_avg(m,l,j) = 0d0
do mp = 0, nstates - 1
d_coup(m,mp,l,j) = 0d0
enddo
enddo
enddo
enddo
!$OMP END DO
!$OMP END PARALLEL
Have you measured where the CPU consumption is in your program? It is a waste of effort to speed up portions that don't consume much CPU time. I'd be surprised if array initializations were a high fraction of the CPU usage. The code would be more readable if instead you used array notation, e.g., rv_avg (0:nstates - 1) = 0d0.
You haven't shown your declaration of the dimensions of any of the arrays so I speculate that the lines
do m = 0, nstates - 1
rv_avg(m) = 0d0
write to a non-existent element of rv_avg, that is the element at index 0. Since Fortran programs don't, by default, check that array element accesses are within bounds, this write outside the bounds won't be caught by the run-time. If the write stays within the address space of the program when it executes it won't cause a segmentation fault. Given the common block declarations the 0-th element of rv_avg may well be part of d_coup.
Shake up the mapping of variables to address space by introducing OpenMP and it's easy to believe that the0-th element of rv_avg now lies outside the address space for a thread and causes the segmentation fault.
Since the program makes other references to array elements at 0 any one of them might be at the root of the segmentation fault.
Of course, if you follow #M.S.B.'s advice and use array syntax notation you can avoid out-of-bounds array accesses.
The problem is probably that you do not have enough stack space in the OpenMP threads to hold the private copies of all these arrays. Especially d_coup looks like a really big one having 3 x natoms x nstates^2 elements. Most Fortran compilers nowadays automatically resort to using heap allocation for such big arrays but when it comes to (first|last)private variables, some OpenMP compilers, including GCC and Intel Fortran Compiler, always place them on the stack. See my answer here for more information.
Edit: Now I see that M. S. B. has actually linked to that same question in his comment.

Replace do loop with array notation in fortran

I would like to replace the following do loop with FORTRAN's intrinsic functions and array notations.
do i=2, n
do j=2, n
a=b(j)-b(j-1)
c(i,j)=a*c(i-1,j)+d(i,j)
end do
end do
However, as c(i,j) depends on c(i-1,j) none of following trials worked. Because they do not update c(i,j)
!FORALL(i = 2:n , j = 2:n ) c(i,j)=c(i-1,j)*(b(j)-b(j-1))+d(i,j)
!FORALL(i = 2:n) c(i,2:n)=c(i-1,2:n)*(b(2:n)-b(1:n-1))+d(i,2:n)
!c(2:n,2:n)=RESHAPE( (/(c(i-1,2:n)*(b(2:n)-b(1:n-1))+d(i,2:n),i=2,n)/), (/n-1, n-1/))
!c(2:n,2:n)=RESHAPE((/(((b(j)-b(j-1)) *c(i-1,j)+d(i,j) ,j=2,n),i=2,n)/), (/n-1, n-1/))
!c(2:n,2:n)=spread(b(2:n)-b(1:n-1),ncopies = n-1,dim=1) * c(1:n-1,2:n) +d(2:n,2:n)
This is the best I can get. But it still has a do loop
do i=2, n
c(i,2:n)=c(i-1,2:n)*(b(2:n)-b(1:n-1))+d(i,2:n)
end do
Could all do loops be replaced by intrinsic functions and array notation. Or could this one be replaced somehow ?
In my experience, nothing beats the traditional do-loop. All the extension intrinsics create memory and CPU overhead by copying stuff to temporary space (on the stack usually), reshaping and the sort. If you're manipulating large arrays, you may encounter out-of-memory issues with the intrinsic functions.
Your best option is to stick to a 2-d loop that has the indexes correctly laid out:
do i=2, n
e = c(1:n,i-1)
do j=2, n
a=b(j)-b(j-1)
c(j,i)=a*e(j)+d(j,i)
end do
end do
By replacing the indexes (and making sure your dimension declarations follow), you are saving on the memory paging. c(j,i) and d(j,i) references travel column-wise inside memory while c(j,i-1) would have cut across columns (and produced paging overhead). So we copy it to a temporary e array.
I think this will be the fastest....
Starting with
do i=2, n
do j=2, n
a=b(j)-b(j-1)
c(i,j)=a*c(i-1,j)+d(i,j)
end do
end do
we can quickly eliminate the do loops with some nifty use of SIZE, SPREAD and EOSHIFT:
res = SPREAD(b - EOSHIFT(b,-1),2,SIZE(c,2))*EOSHIFT(c,-1) + d
Aha, turns out that the error I was receiving (V1) was due to my using RESHAPE rather than SPREAD. I fixed this in the current version (V2) and it compiles & works with both ifort and gfortran.

Stack overflow in Fortran 90

I have written a fairly large program in Fortran 90. It has been working beautifully for quite a while, but today I tried to step it up a notch and increase the problem size (it is a research non-standard FE-solver, if that helps anyone...) Now I get the "stack overflow" error message and naturally the program terminates without giving me anything useful to work with.
The program starts with setting up all relevant arrays and matrices, and after that is done it prints a few lines of stats regarding this to a log-file. Even with my new, larger problem, this works fine (albeit a little slow), but then it fails as the "number crunching" gets going.
What confuses me is that everything at that point is already allocated (and that worked without errors). I'm not entirely sure what the stack is (Wikipedia and several treads here didn't do much since I have only a quite basic knowledge of the "behind the scenes" workings of a computer).
Assume that I for instance have some arrays initialized as:
INTEGER,DIMENSION(64) :: IA
REAL(8),DIMENSION(:,:),ALLOCATABLE :: AA, BB
which after some initialization routines (i.e. read input from file and such) are allocated as (I store some size-integers for easier passing to subroutines in IA of fixed size):
ALLOCATE( AA(N1,N2) , BB(N1,N2) )
IA(1) = N1
IA(2) = N2
This is basically what happens in the initial portion, and so far so good. But when I then call a subroutine
CALL ROUTINE_ONE(AA,BB,IA)
And the routine looks like (nothing fancy):
SUBROUTINE ROUTINE_ONE(AA,BB,IA)
IMPLICIT NONE
INTEGER,DIMENSION(64) :: IA
REAL(8),DIMENSION(IA(1),IA(2)) :: AA, BB
...
do lots of other stuff
...
END SUBROUTINE ROUTINE_ONE
Now I get an error! The output to the screen says:
forrtl: severe (170): Program Exception - stack overflow
However, when I run the program with the debugger it breaks at line 419 in a file called winsig.c (not my file, but probably part of the compiler?). It seems to be part of a routine called sigreterror: and it is the default case that has been invoked, returning the text Invalid signal or error. There is a comment line attached to this which strangely says /* should never happen, but compiler can't tell */ ...?
So I guess my question is, why does this happen and what is actually happening? I thought that as long as I can allocate all the relevant memory I should be fine? Does the call to the subroutine make copies of the arguments, or just pointers to them? If the answer is copies then I can see where the problem might be, and if so: any ideas on how to get around it?
The problem I try to solve is big, but not insane in any way. Standard FE-solvers can handle bigger problems than my current one. I run the program on a Dell PowerEdge 1850 and the OS is Microsoft Server 2008 R2 Enterprise. According to systeminfo at the cmd prompt I have 8GB of physical memory and almost 16GB virtual. As far as I understand the total of all my arrays and matrices should not add up to more than maybe 100MB - about 5.5M integer(4) and 2.5M real(8) (which according to me should be only about 44MB, but let's be fair and add another 50MB for overhead).
I use the Intel Fortran compiler integrated with Microsoft Visual Studio 2008.
Adding some actual source code to clarify a bit
! Update continuum state
CALL UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,&
bmtrx,detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
effstress,aa,fi,errmsg)
is the actual call to the routine. Big arrays are posc, bmtrx and aa - all other are at least an order of magnitude smaller (if not more). posc is INTEGER(4) and bmtrx and aa is REAL(8)
SUBROUTINE UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,bmtrx,&
detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
effstress,aa,fi,errmsg)
IMPLICIT NONE
!I/O
INTEGER(4) :: iTask, errmsg
INTEGER(4) :: iArray(64)
INTEGER(4),DIMENSION(iArray(15),iArray(15),iArray(5)) :: posc
INTEGER(4),DIMENSION(iArray(22),iArray(21)+1) :: nodedof
INTEGER(4),DIMENSION(iArray(29),iArray(3)+2) :: elm
REAL(8),DIMENSION(iArray(14)) :: dof, dof_k
REAL(8),DIMENSION(iArray(12)*iArray(17),iArray(15)*iArray(5)) :: bmtrx
REAL(8),DIMENSION(iArray(5)*iArray(17)) :: detjac
REAL(8),DIMENSION(iArray(17)) :: w
REAL(8),DIMENSION(iArray(23),iArray(19)) :: mtrlprops
REAL(8),DIMENSION(iArray(8),iArray(8),iArray(23)) :: demtrx
REAL(8) :: dt
REAL(8),DIMENSION(2,iArray(12)*iArray(17)*iArray(5)) :: stress
REAL(8),DIMENSION(iArray(12)*iArray(17)*iArray(5)) :: strain
REAL(8),DIMENSION(2,iArray(17)*iArray(5)) :: effstrain, effstress
REAL(8),DIMENSION(iArray(25)) :: aa
REAL(8),DIMENSION(iArray(14)) :: fi
!Locals
INTEGER(4) :: i, e, mtrl, i1, i2, j1, j2, k1, k2, dim, planetype, elmnodes, &
Nec, elmpnodes, Ndisp, Nstr, Ncomp, Ngpt, Ndofelm
INTEGER(4),DIMENSION(iArray(15)) :: doflist
REAL(8),DIMENSION(iArray(12)*iArray(17),iArray(15)) :: belm
REAL(8),DIMENSION(iArray(17)) :: jelm
REAL(8),DIMENSION(iArray(12)*iArray(17)*iArray(5)) :: dstrain
REAL(8),DIMENSION(iArray(12)*iArray(17)) :: s
REAL(8),DIMENSION(iArray(17)) :: ep, es, dep
REAL(8),DIMENSION(iArray(15),iArray(15)) :: kelm
REAL(8),DIMENSION(iArray(15)) :: felm
dim = iArray(1)
...
And it fails before the last line above.
As per steabert's request, I'll just summarize the conversation in the comments here where it's a bit more visible, even though M.S.B.'s answer already gets right to the nub of the problem.
In technical programming, where procedures often have large local arrays for intermediate computation, this happens a lot. Local variables are generally stored on the stack, which typically (and quite reasonably) a small fraction of overall system memory -- usually of order 10MB or so. When the local variable sizes exceed the stack size, you see exactly the symptoms described here -- a stack overflow occuring after a call to the relevant subroutine but before its first executable statement.
So when this problem happens, the best thing to do is to find the relevant large local variables, and decide what to do. In this case, at least the variables belm and dstrain were getting quite sizable.
Once the variables are located, and you've confirmed that's the problem, there's a few options. As MSB points out, if you can make your arrays smaller, that's one option. Alternatively, you can make the stack size larger; under linux, that's done with ulimit -s [newsize]. That really just postpones the problem, though, and you have to do something different on windows machines.
The other class of ways to avoid this problem is not to put the large data on the stack, but in the rest of memory (the "heap"). You can do that by giving the arrays the save attribute (in C, static); this puts the variable on the heap and thus makes the values persistent between calls. The downside there is that this potentially changes the behavior of the subroutine, and means the subroutine can't be used recursively, and similarly is non-threadsafe (if you're ever in a position where multiple threads will enter the routine simulatneously, they'll each see the same copy of the local varaiable and potentially overwrite each other's results). The upside is that it's easy and very portable -- it should work everywhere. However, this will only work with fixed-size local variables; if the temporary arrays have sizes that depend on the inputs, you can't do this (since there'd no longer be a single variable to save; it could be different size every time the procedure is called).
There are compiler-specific options which put all arrays (or all arrays of larger than some given size) on the heap rather than on the stack; every Fortran compiler I know has an option for this. For ifort, used in the OPs post, it's -heap-arrays in linux, or /heap-arrays for windows. For gfortran, this may actually be the default. This is good for making sure you know what's going on, but it means you have to have different incantations for every compiler to make sure your code works.
Finally, you can make the offending arrays allocatable. Allocated memory goes on the heap; but the variable which points to them is on the stack, so you get the benefits of both approaches. Also, this is completely standard fortran and so totally portable. The downside is that it requires code changes. Also, the allocation process can take nontrivial amounts of time; so if you're going to be calling the routine zillions of times, you may notice this slows things down slightly. (This possible performance regression is easy to fix, though; if you'll be calling it zillions of times with the same size arrays, you can have an optional argument to pass in a pre-allocated local array and use that instead, so that you only allocate/deallocate once).
Allocating/deallocating each time would look like:
SUBROUTINE UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,bmtrx,&
detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
effstress,aa,fi,errmsg)
IMPLICIT NONE
!...arguments....
!Locals
!...
REAL(8),DIMENSION(:,:), allocatable :: belm
REAL(8),DIMENSION(:), allocatable :: dstrain
allocate(belm(iArray(12)*iArray(17),iArray(15))
allocate(dstrain(iArray(12)*iArray(17)*iArray(5))
!... work
deallocate(belm)
deallocate(dstrain)
Note that if the subroutine does a lot of work (eg, takes seconds to execute), the overhead from a couple allocate/deallocates should be negligable. If not, and you want to avoid the overhead, using the optional arguments for preallocated worskpace would look something like:
SUBROUTINE UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,bmtrx,&
detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
effstress,aa,fi,errmsg,workbelm,workdstrain)
IMPLICIT NONE
!...arguments....
real(8),dimension(:,:), optional, target :: workbelm
real(8),dimension(:), optional, target :: workdstrain
!Locals
!...
REAL(8),DIMENSION(:,:), pointer :: belm
REAL(8),DIMENSION(:), pointer :: dstrain
if (present(workbelm)) then
belm => workbelm
else
allocate(belm(iArray(12)*iArray(17),iArray(15))
endif
if (present(workdstrain)) then
dstrain => workdstrain
else
allocate(dstrain(iArray(12)*iArray(17)*iArray(5))
endif
!... work
if (.not.(present(workbelm))) deallocate(belm)
if (.not.(present(workdstrain))) deallocate(dstrain)
Not all of the memory is created when the program starts. When you call the subroutine the executable is creating the memory that the subroutine needs for local variables. Typically arrays with simple declarations that are local to that subroutine -- neither allocatable, nor pointer -- are allocated on the stack. You could have simply run of of stack space when you reached these declarations. You might have reached a 2GB limit on a 32-bit OS with some array. Sometimes executable statements implicitly create a temporary array on the stack.
Possible solutions: 1) make your arrays smaller (not attractive), 2) make the stack larger), 3) some compilers have options to switch from placing arrays on the stack to dynamically allocating them, similar to the method used for "allocate", 4) identify large arrays and make them allocatable.
The stack is the memory area where the information needed to return from a function, and the information locally defined in a function is stored. So a stack overflow may indicate you have a function that calls another function which in its turn calls another function, etc.
I am not familiar with Fortran (anymore) but another cause might be that those functions declare tons of local variables, or at least variables that need a lot of place.
A last one: the stack is typically rather small, so it's not a priori relevant how much memory the machine has. It should be quite simple to instruct the linker to increase the stack size, at least if you are certain it's just a lack of space, and not a bug in your application.
Edit: do you use recursion in your program? Recursive calls can eat through the stack very quickly.
Edit: have a look at this: (emphasis mine)
On Windows, the stack space to
reserved for the program is set using
the /Fn compiler option, where n is
the number of bytes. Additionally,
the stack reserve size can be
specified through the Visual Studio
IDE which adds the Microsoft Linker
option /STACK: to the linker command
line. To set this, go to Property
Pages>Configuration
Properties>Linker>System>Stack Reserve
Size. There you can specify the stack
size in bytes in either decimal or
C-language notation. If not specified,
the default stack size is 1MB.
The only problem I ran into with a similar test code, is the 2Gb allocation limit for 32-bit compilation. When I exceed it I get an error message on line 419 in winsig.c
Here is the test code
program FortranCon
implicit none
! Variables
INTEGER :: IA(64), S1
REAL(8), DIMENSION(:,:), ALLOCATABLE :: AA, BB
REAL(4) :: S2
INTEGER, PARAMETER :: N = 10960
IA(1)=N
IA(2)=N
ALLOCATE( AA(N,N), BB(N,N) )
AA(1:N,1:N) = 1D0
BB(1:N,1:N) = 2D0
CALL TEST(AA,BB,IA)
S1 = SIZEOF(AA) !Size of each array
S2 = 2*DBLE(S1)/1024/1024 !Total size for 2 arrays in Mb
WRITE (*,100) S2, ' Mb' ! When allocation reached 2Gb then
100 FORMAT (F8.1,A) ! exception occurs in Win32
DEALLOCATE( AA, BB )
end program FortranCon
SUBROUTINE TEST(AA,BB,IA)
IMPLICIT NONE
INTEGER, DIMENSION(64),INTENT(IN) :: IA
REAL(8), DIMENSION(IA(1),IA(2)),INTENT(INOUT) :: AA,BB
... !Do stuff with AA,BB
END SUBROUTINE
When N=10960 it runs ok showing 1832.9 Mb. With N=11960 it crashes. Of course when I compile with x64 it works ok. Each array has 8*N^2 bytes storage. I don't know if it helps but I recommend using the INTENT() keywords for the dummy variables.
Are you using some parallelization? This can be a problem with statically declared arrays. Try all bigger arrays make ALLOCATABLE, otherwise, they will be placed on the stack in autoparallel or OpenMP threads.
For me the issue was the stack reserve size. I went and changed the stack reserved size from 0 to 100000000 and recompiled the code. The code now runs smoothly.

Fortran 77 handling C++ memory allocations

I'm trying to write a C++ program that utilizes a few tens of thousands of lines of Fortran 77 code, but running into some strange errors. I'm passing three coordinates (x,y,z) and the address of three vectors from C++ into fortran, then having fortran run some computations on the initial points and return results in the three vectors.
I do this a few hundred times in a C++ function, leave that function, and then come back to do it again. It works perfectly the first time through, but the second time through it stops returning useful results (returns nan) for points with a positive x component.
Initially it seems like an algorithm problem, except for three things:
It works perfectly the first 200 times I run it
It works if I call it from fortran and eliminate C++ altogether (not viable for the final program)
I've tried adding print statements to fortran to debug where it goes wrong, but turns out if I add print statments to a specific subroutine (even something as simple as PRINT *,'Here'), the program starts returning NaNs even on the first run.
This is why I think it's something to do with how memory is being allocated and deallocated between C and fortran function/subroutine calls. The basic setup looks like this:
C++:
void GetPoints(void);
extern"C"
{
void getfield_(float*,float*,float*,float[],float[],float[],int*,int*);
}
int main(void)
{
GetPoints(); //Works
GetPoints(); //Doesn't
}
void GetPoints(void)
{
float x,y,z;
int i,n,l;
l=50;
n=1;
x=y=z=0.0;
float xx[l],yy[l],zz[l]
for(i=0;i<l;i++)
getfield_(&x,&y,&z,xx,yy,zz,&n,&l);
//Store current xx,yy,zz in large global array
}
Fortran:
SUBROUTINE GETFIELD(XI,YI,ZI,XX,YY,ZZ,IIN,NP)
DIMENSION XX(NP),YY(NP),ZZ(NP)
EXTERNAL T89c
T89c(XI,YI,ZI,XX,YY,ZZ)
RETURN
END
!In T89c.f
SUBROUTINE T89c(XI,YI,ZI,XX,YY,ZZ)
COMMON /STUFF/ ARRAY(100)
!Lots of calculations
!Calling ~20 other subroutines
RETURN
END
Do any of you see any glaring memory issues that I'm creating? Perhaps common blocks that fortran thinks exist but are really deallocated by C++? Without the ability to debug using print statements, nor the time to try to understand the few thousand lines of someone else's Fortran 77 code, I'm open to trying just about anything you all can suggest or think of.
I'm using g++ 4.5.1 for compiling the C++ code and final linking, and gfortran 4.5.1 for compiling the fortran code.
Thanks
**Edit:**
I've tracked the error down to some obscure piece of the code that was written before I was even born. It appears it's looking for some common variable that got removed in the updates over the years. I don't know why it only affected one dimension, nor why the bug was replicatable by adding a print statement, but I've eliminated it nonetheless. Thank you all for the help.
You may be running into the "off-by-one" error. Fortran arrays are 1-based, while C arrays are 0-based. Make sure the array sizes you pass into Fortran are not 1 less than they should be.
Edit:
I guess it looks right... Still, I would try allocating 51 elements in the C++ function, just to see what happens.
By the way float xx[l]; is not standard. This is a gcc feature. Normally you should be allocating memory with new here, or you should be using std::vector.
Also, I am confused by the call to getfield_ in the loop. Shouldn't you be passing i to getfield_?
You should declare XX, YY and ZZ as arrays also in the subroutine T89c as follows:
REAL*4 XX(*)
REAL*4 YY(*)
REAL*4 ZZ(*)
C/C++ should in general never deallocate any Fortran common blocks. These are like structs in C (i.e. memory is reserved at compile time, not at runtime).
For some reason, gfortran seems to accept the following in T89c even without the above declarations:
print *,XX(1)
during compilation but when executing it I get a segmentation fault.