How can I access a function on device via a function pointer?
In below code I am trying to access init0 or init1 using function pointer init. The code does work as intended if OpenACC is not enabled during compilation. However, it fails when compiled with OpenACC. Below code is saved as stackOverflow2.f95:
module modTest2
use openacc
implicit none
type :: Container
sequence
integer :: n
integer, allocatable :: arr(:)
end type Container
interface Container
procedure :: new_Container
end interface
abstract interface
integer function function_template (i)
integer, intent (in) :: i
end function function_template
end interface
contains
type(Container) function new_Container(n)
integer, intent(in) :: n
allocate(new_Container%arr(n))
end function new_Container
end module modTest2
program test2
use modTest2
implicit none
integer :: n, x, i
type(Container) :: c
procedure(function_template), pointer :: init
print *, "Enter array size: "
read *, n
print *, "Allocating..."
c = Container(n)
print *, "Allocation complete!"
print *, "Enter initialization type (x): "
read *, x
print *, "Initializing..."
select case (x)
case (0)
init => init0
case default
init => init1
end select
!$acc data copyin(c) copyout(c%arr)
!$acc parallel loop present(c)
do i = 1, n
c%arr(i) = init(i)
end do
!$acc end data
print *, "Initialization complete..."
do i = 1, n
print *, i, c%arr(i)
end do
contains
integer function init0(i)
!$acc routine
integer, intent(in) :: i
init0 = 10*i
end function init0
integer function init1(i)
!$acc routine
integer, intent(in) :: i
init1 = 20*i
end function init1
end program test2
Correct output is seen without OpenACC:
$ gfortran -c stackOverflow2.f95
$ gfortran stackOverflow2.o -o a.out
$ ./a.out
Enter array size:
3
Allocating...
Allocation complete!
Enter initialization type (x):
0
Initializing...
Initialization complete...
1 10
2 20
3 30
Incorrect output is seen below with OpenACC (Note that NVIDIA compiler is used here):
$ /opt/nvidia/hpc_sdk/Linux_x86_64/22.1/compilers/bin/nvfortran stackOverflow2.f95 -acc; ./a.out
Enter array size:
3
Allocating...
Allocation complete!
Enter initialization type (x):
0
Initializing...
Initialization complete...
1 0
2 0
3 0
Sorry but function pointers (along with C++ virtual functions) are not yet supported on the device. Adding the compiler feedback flag (-Minfo=accel), you'll see the following message:
% nvfortran -acc -Minfo=accel test.f90
test2:
62, Generating copyout(c%arr(:)) [if not already present]
Generating copyin(c) [if not already present]
65, Accelerator restriction: Indirect function/procedure calls are not supported
The problem being that indirect functions require a device jump table and runtime dynamic linking which is currently unavailable. While I don't have a timeline, we are exploring options on how to offer this support in the future.
Using gfortran-11 with the below did the trick:
module modTest2
use openacc
implicit none
type :: Container
sequence
integer :: n
integer, allocatable :: arr(:)
end type Container
interface Container
procedure :: new_Container
end interface
abstract interface
integer function function_template (i)
integer, intent (in) :: i
end function function_template
end interface
contains
type(Container) function new_Container(n)
integer, intent(in) :: n
allocate(new_Container%arr(n))
end function new_Container
end module modTest2
program test2
use modTest2
implicit none
integer :: n, x, i
type(Container) :: c
procedure(function_template), pointer :: init
print *, "Enter array size: "
read *, n
print *, "Allocating..."
c = Container(n)
print *, "Allocation complete!"
print *, "Enter initialization type (x): "
read *, x
print *, "Initializing..."
select case (x)
case (0)
init => init0
case default
init => init1
end select
!$acc enter data copyin(c)
!$acc enter data create(c%arr)
!$acc parallel loop present(c)
do i = 1, n
c%arr(i) = init(i)
end do
!$acc exit data copyout(c%arr)
!$acc exit data delete(c)
print *, "Initialization complete..."
do i = 1, n
print *, i, c%arr(i)
end do
contains
integer function init0(i)
!$acc routine
integer, intent(in) :: i
init0 = 10*i
end function init0
integer function init1(i)
!$acc routine
integer, intent(in) :: i
init1 = 20*i
end function init1
end program test2
Here's the output:
$ gfortran-11 -fopenacc stackOverflow2.f95
$ gfortran-11 -fopenacc stackOverflow2.o -o stackOverflow2
$ ./stackOverflow2
Enter array size:
4
Allocating...
Allocation complete!
Enter initialization type (x):
0
Initializing...
Initialization complete...
1 10
2 20
3 30
4 40
$ ./stackOverflow2
Enter array size:
4
Allocating...
Allocation complete!
Enter initialization type (x):
9
Initializing...
Initialization complete...
1 20
2 40
3 60
4 80
Related
We know we can optimise our code from outside.
We know in fortran programming we define variables first in the program. Then we can take inputs from outside (via read statements) but can we make a small code that take kind of variable from outside.
I.e. if we put in the terminal 4 kind 4(i.e.real(kind =4) ) variable is introduced ie if we put 8 kind 8(i.e real(kind=8) variable is introduced . Is there any way.
I know we can do three separate if loops to define the variables (namely kind 4 , kind 8 ,kind 16 and repeat the program the program three times ).
The code i wrote was for findinding value of y using eulers method.
I want to generalise to any kind and calculate the time taken. I hope this can be done in lesser cumbersome way.
The code I wrote:
program euler
implicit none
real(kind=4)::t,h,y,s,e,r
real(kind=8)::t8,h8,y8,s8,e8,r8
real(kind=16)::t16,h16,y16,s16,e16,r16
integer::k,i
t=0
t8=0
t16=0
y=10
y8=10
y16=10
print *,"enter the kind you want to work with"
read(*,*) k
!so if user writes 4 kind 4 variables would do the work
if(k==4) then
print *,"enter the grid step"
read(*,*) h
r=10*exp(-5.0)
call cpu_time(s)
do i=1,999999999
if(0.le.t.and.t.le.25) then
y=y-h*y/5.0
t=t+h
end if
end do
call cpu_time(e)
print *,"solution is",y
print *,"the error is",(r-y)/r
print *,"time taken in seconds =",e-s
else if(k==8) then
print *,"enter the grid step"
read(*,*) h8
r8=10*exp(-5.0D0)
call cpu_time(s8)
do i=1,999999999
if(0.le.t8.and.t8.le.25) then
y8=y8-h8*y8/5.0
t8=t8+h8
end if
end do
call cpu_time(e8)
print *,"solution is",y8
print *,"the error is",(r8-y8)/r8
print *,"time taken in seconds for kind 8 =",e8-s8
else if(k==16) then
print *,"enter the grid step"
read(*,*) h16
r16=10*exp(-5.0D0)
call cpu_time(s16)
do i=1,999999999
if(0.le.t16.and.t16.le.25) then
y16=y16-h16*y16/5.0
t16=t16+h16
end if
end do
call cpu_time(e16)
print *,"solution is",y16
print *,"the error is",(r16-y16)/r16
print *,"time taken in seconds for kind 16 =",e16-s16
end if
end program euler
But im looking something more smart and less cumbersome.
I don't think this is 100% what you want because there are still three separate subroutines each for a different kind, but they are called using the interface euler which determines which one to use based on the arguments
program SO_Euler
use iso_fortran_env, only : sp=>real32, dp=>real64, qp=>real128, i4=>int32, i8=>int64
implicit none
interface euler
procedure euler_driver_sp, euler_driver_dp, euler_driver_qp
end interface
real(sp), parameter :: r4 = 10*exp(-5.0)
real(dp), parameter :: r8 = 10*exp(-5d0)
real(qp), parameter :: r16 = 10*exp(-5q0)
real(sp) :: h
print *,"enter the grid step"
read(*,*) h
print *, ""
call euler(r4, h)
call euler(r8, h)
call euler(r16, h)
contains
subroutine euler_driver_sp(r,h_in)
real(sp), intent(in) :: r
real(sp), intent(in) :: h_in
real(sp) :: h, y, t
integer(i8) :: s, e, rate
integer :: i
print '(a15,1x,g0)', "kind is ", kind(r)
h = h_in
t = 0
y = 10
call SYSTEM_CLOCK(s,rate)
do i=1,999999999
if(0<=t .and. t<=25) then
y=y-h*y/5
t=t+h
else
exit
end if
end do
call SYSTEM_CLOCK(e,rate)
print '(a15,1x,g0.15)',"solution is", y
print '(a15,1x,g0.15)',"the error is", (r-y)/r
print '(a15,1x,g0.4,1x,a)',"time taken is", real(e-s)/rate,"seconds"
print *, ""
end subroutine
subroutine euler_driver_dp(r, h_in)
real(dp), intent(in) :: r
real(sp), intent(in) :: h_in
real(dp) :: h, y, t
integer(i8) :: s, e, rate
integer :: i
print '(a15,1x,g0)', "kind is ", kind(r)
h = h_in !! convert sp=>dp
t = 0
y = 10
call SYSTEM_CLOCK(s,rate)
do i=1,999999999
if(0<=t .and. t<=25) then
y=y-h*y/5
t=t+h
else
exit
end if
end do
call SYSTEM_CLOCK(e,rate)
print '(a15,1x,g0.15)',"solution is", y
print '(a15,1x,g0.15)',"the error is", (r-y)/r
print '(a15,1x,g0.4,1x,a)',"time taken is", real(e-s)/rate,"seconds"
print *, ""
end subroutine
subroutine euler_driver_qp(r, h_in)
real(qp), intent(in) :: r
real(sp), intent(in) :: h_in
real(qp) :: h, y, t
integer(i8) :: s, e, rate
integer :: i
print '(a15,1x,g0)', "kind is ", kind(r)
h = h_in ! convert sp=>qp
t = 0
y = 10
call SYSTEM_CLOCK(s,rate)
do i=1,999999999
if(0<=t .and. t<=25) then
y=y-h*y/5
t=t+h
else
exit
end if
end do
call SYSTEM_CLOCK(e,rate)
print '(a15,1x,g0.15)',"solution is", y
print '(a15,1x,g0.15)',"the error is", (r-y)/r
print '(a15,1x,g0.4,1x,a)',"time taken is", real(e-s)/rate,"seconds"
print *, ""
end subroutine
end program
here is some sample output of the procedure
enter the grid step
0.000002
kind is 4
solution is .547848604619503E-01
the error is .186920538544655
time taken is .1020 seconds
kind is 8
solution is .673793765102040E-01
the error is .138737586862949E-05
time taken is .7200E-01 seconds
kind is 16
solution is .673793765102226E-01
the error is .138737559174033E-05
time taken is 1.535 seconds
Note that I am compiling in 64bit release mode, and have floating-point model not fast, but strict as well as the option to extend the precision of real constants.
Let's say I want to perform an operation in my main program (in Fortran). And lets say that operation is finding minimum number in a 1D array. I wish to do so by passing the array into the call subroutine and the subroutine will print the minimum value on the screen. There are different ways or algorithms to find minimum value in an array. Lets say I have 100 different methods: Method1, Method2..... Method100. Now I want to try using each one of these methods separately (I don't want to try all of them at once, but one method in each run). I don't want to create 100 different subroutines and change the code every time to decide which one to call, rather I want to mention in the input file which one I want to choose. So basically, the computer has to read the input file (to know which method to use) and perform the task using the specified method amongst different methods available.
I can write a Subroutine dump all the methods into that subroutine and put an IF condition to choose among various methods. But IF conditions are in efficient particularly on GPUs, I want to know the most efficient way of doing this.
MAIN PROGRAM
INTEGER Method !will be read from input file
Array = [12,5,3,4,1,7,4,3]
call print_Minimum(Array)
END PROGRAM
SUBROUTINE print_Minimum(Array)
IF (METHOD == 1)
<method 1 code>
ELSE IF (METHOD == 2)
<method 2 code>
:
:
:
:
ELSE IF (METHOD == 100)
<method100 code>
END IF
END SUBROUTINE
Thanks in advance.
This is probably best done using a function pointer and/or functions as arguments.
You can set a function pointer to a certain function and do this in your nested ifs and you can pass functions as arguments.
Both methods are implemented in the following example.
module minimum_mod
implicit none
private
public :: get_min_t, naive_min, time_min_function
abstract interface
integer pure function get_min_t(X)
integer, intent(in) :: X(:)
end function
end interface
contains
subroutine time_min_function(f, X)
procedure(get_min_t) :: f
integer, intent(in) :: X(:)
integer :: res
res = f(X)
write(*, *) res
end subroutine
integer pure function naive_min(X)
integer, intent(in) :: X(:)
integer :: i
naive_min = huge(naive_min)
do i = 1, size(X)
naive_min = min(naive_min, X(i))
end do
end function
end module
program time_min_finders
use minimum_mod, only: get_min_t, naive_min, time_min_function
implicit none
integer, parameter :: test_set(5) = [1, 10, 3, 5, 7]
procedure(get_min_t), pointer :: f
f => naive_min
call time_min_function(f, test_set)
end program
PS: Note that you can now do all the timinig logic inside time_min_function.
You can create an array of a derived type that contains a function pointer, effectively an array of function pointers. Then in principle you could initialize the function pointers to point at all of your test functions so that you could refer to each function by its index without having to test with a SELECT CASE of IF block: this is the typical Fortran way. However, either I've got the syntax for initialization wrong or my old version of gfortran just isn't capable, so I had to initialize one at a time. Sigh.
module minfuncs
implicit none
abstract interface
function func(array)
integer, intent(in) :: array(:)
integer func
end function func
end interface
type func_node
procedure(func), NOPASS, pointer :: f
end type func_node
! type(func_node) :: method(5) = [func_node(min_1),func_node(min_2), &
! func_node(min_3),func_node(min_4),func_node(min_5)]
contains
function min_1(array)
integer, intent(in) :: array(:)
integer min_1
integer i
min_1 = array(1)
do i = 2, size(array)
min_1 = min(min_1,array(i))
end do
end function min_1
function min_2(array)
integer, intent(in) :: array(:)
integer min_2
integer i
min_2 = array(1)
do i = 8, size(array), 7
min_2 = min(min_2,array(i-6),array(i-5),array(i-4), &
array(i-3),array(i-2),array(i-1),array(i))
end do
do i = i-6, size(array)
min_2 = min(min_2, array(i))
end do
end function min_2
function min_3(array)
integer, intent(in) :: array(:)
integer min_3
integer i
min_3 = array(1)
do i = 2, size(array)
min_3 = min_3-dim(min_3,array(i))
end do
end function min_3
function min_4(array)
integer, intent(in) :: array(:)
integer min_4
integer ymm(8)
integer i
if(size(array) >= 8) then
ymm = array(1:8)
do i = 16, size(array), 8
ymm = min(ymm,array(i-7:i))
end do
min_4 = minval([ymm,array(i-7:size(array))])
else
min_4 = minval(array)
end if
end function min_4
function min_5(array)
integer, intent(in) :: array(:)
integer min_5
min_5 = minval(array)
end function min_5
end module minfuncs
program test
use minfuncs
implicit none
integer, parameter :: N = 75
integer i
integer :: A(N) = modulo(5*[(i,i=1,N)]**2,163)
! type(func_node) :: method(5) = [func_node(min_1),func_node(min_2), &
! func_node(min_3),func_node(min_4),func_node(min_5)]
type(func_node) method(5)
method(1)%f => min_1
method(2)%f => min_2
method(3)%f => min_3
method(4)%f => min_4
method(5)%f => min_5
do i = 1, size(method)
write(*,*) method(i)%f(A)
end do
end program test
Output:
2
2
2
2
2
I am currently accelerating a Fortran code where I have a main accelerated loop in subroutine sub. In the loop, I want to call subroutine subsub on the device with acc routine. The subroutine has an intent(out) argument val, which is private in the loop. As subsub has a loop itself, I want to use the vector clause:
module calc
implicit none
public :: sub
private
contains
subroutine sub()
integer :: i
integer :: array(10)
integer :: val
!$acc kernels loop independent private(val)
do i = 1, 10
call subsub(val)
array(i) = val
enddo
print "(10(i0, x))", array
endsubroutine
subroutine subsub(val)
!$acc routine vector
integer, intent(out) :: val
integer :: i
val = 0
!$acc loop independent reduction(+:val)
do i = 1, 10
val = val + 1
enddo
endsubroutine
endmodule
program test
use calc, only: sub
implicit none
call sub()
endprogram
When compiling with the PGI compiler version 20.9-0 and running the program, I get gibberish values in variable array. When I simply use acc routine for subsub, I get the correct behavior (10 in all values of array). What is wrong in my approach to parallelize this subroutine?
It does look like a compiler code generation issue on how val is getting handled in the main loop. Luckily the workaround is easy, just add the installation of val in the main loop.
% cat test.f90
module calc
implicit none
public :: sub
private
contains
subroutine sub()
integer :: i
integer :: array(10)
integer :: val
!$acc kernels loop independent private(val)
do i = 1, 10
val = 0
call subsub(val)
array(i) = val
enddo
print "(10(i0, x))", array
endsubroutine
subroutine subsub(val)
!$acc routine vector
integer, intent(out) :: val
integer :: i
val = 0
!$acc loop independent reduction(+:val)
do i = 1, 10
val = val + 1
enddo
endsubroutine
endmodule
program test
use calc, only: sub
implicit none
call sub()
endprogram
% nvfortran -acc -Minfo=accel test.f90 -V20.9 ; a.out
sub:
10, Generating implicit copyout(array(:)) [if not already present]
11, Loop is parallelizable
Generating Tesla code
11, !$acc loop gang ! blockidx%x
subsub:
18, Generating Tesla code
24, !$acc loop vector ! threadidx%x
Generating reduction(+:val)
Vector barrier inserted for vector loop reduction
24, Loop is parallelizable
10 10 10 10 10 10 10 10 10 10
Consider the following Fortran program:
program test_prg
implicit none
integer :: i
integer, allocatable :: arr(:, :)
arr = reshape([(i, i = 1, 100)], [10, 10])
do i = 1, 3
print *, 'Column', i
call write_array(arr(1:2, i))
end do
contains
subroutine write_array(array)
class(*), intent(in) :: array(:)
integer :: i
do i = 1, size(array)
select type (elem => array(i))
type is (integer)
print '(I0)', elem
end select
end do
end subroutine
subroutine write_array_alt(array)
class(*), intent(in) :: array(:)
integer :: i
select type (array)
type is (integer)
print '(I0)', array
end select
end subroutine
end program
Compiled with gfortran 9.3.0, the program prints:
Column 1
1
2
Column 2
21
22
Column 3
41
42
As you can see, it skips every other column. If I replace the call to write_array with the call to write_array_alt, where I select type of the whole array and not individual elements, then every column is printed, as expected. Is this a gfortran bug or is this code illegal?
I have written a module that contains a interface called 'push' that pushes values onto allocatable arrays. I want it to have generic behavior so that I can add a new function for a given type to the 'push' interface as needed. The problem is that as the amount of functions for a given interface grows, so does the strange behavior of the push interface.
Code for the module (push_array.f90):
module push_array
implicit none
! usage:
! array = push(array,val)
interface push
module procedure push_scalar_int_onto_rank1_int
module procedure push_scalar_int2_onto_rank1_int2
module procedure push_rank1_int_onto_rank2_int
module procedure push_rank1_real8_onto_rank2_real8
end interface push
contains
function push_scalar_int_onto_rank1_int (array,val) result (new_array)
integer,intent(in),allocatable :: array(:)
integer,intent(in) :: val
integer,allocatable :: new_array(:)
integer :: length
if (allocated(array)) then
length = size(array) + 1
else
length = 1
end if
allocate(new_array(size(array) + 1))
if (allocated(array)) new_array(:) = array(:)
new_array(length) = val
return
end function push_scalar_int_onto_rank1_int
function push_scalar_int2_onto_rank1_int2 (array,val) result (new_array)
integer(2),intent(in),allocatable :: array(:)
integer(2),intent(in) :: val
integer(2),allocatable :: new_array(:)
integer :: length
if (allocated(array)) then
length = size(array) + 1
else
length = 1
end if
allocate(new_array(size(array) + 1))
if (allocated(array)) new_array(:) = array(:)
new_array(length) = val
return
end function push_scalar_int2_onto_rank1_int2
function push_rank1_int_onto_rank2_int (array,val) result (new_array)
integer,intent(in),allocatable :: array(:,:)
integer,intent(in) :: val(:)
integer,allocatable :: new_array(:,:)
integer :: length
if (allocated(array)) then
length = size(array,2) + 1
else
length = 1
end if
allocate(new_array(1:size(val),length))
if (allocated(array)) new_array(1:size(val),:) = array(1:size(val),:)
new_array(1:size(val),length) = val
return
end function push_rank1_int_onto_rank2_int
function push_rank1_real8_onto_rank2_real8 (array,val) result (new_array)
real(8),intent(in),allocatable :: array(:,:)
real(8),intent(in) :: val(:)
real(8),allocatable :: new_array(:,:)
integer :: length
if (allocated(array)) then
length = size(array,2) + 1
else
length = 1
end if
allocate(new_array(1:size(val),length))
if (allocated(array)) new_array(1:size(val),:) = array(1:size(val),:)
new_array(1:size(val),length) = val
return
end function push_rank1_real8_onto_rank2_real8
end module push_array
Test code (test_push_array.f90):
program main
use push_array, only: push
implicit none
integer,allocatable :: a(:)
integer(2),allocatable :: b(:)
integer,allocatable :: c(:,:)
real(8),allocatable :: d(:,:)
integer :: xp(3)
real(8) :: xp8(3)
integer :: i
integer(2) :: j
! test that a scalar integer can be pushed onto a rank1 integer array
do i=1,100
a = push(a,i)
end do
print *, a(1),a(100)
! test that a scalar integer(2) can be pushed onto a rank1 integer(2) array
do j=1,100
b = push(b,j)
end do
print *, b(1),b(100)
! test that a rank1 integer can be pushed onto a rank2 integer
do i=1,100
xp = [i,i+1,i+2]
c = push(c,xp)
end do
print *, c(1:3,1),c(1:3,100)
! test that a rank1 real(8) can be pushed onto a rank2 real(8)
do i=1,100
xp8 = [i + 0.001,i + 0.002, i + 0.003]
d = push(d,xp8)
end do
print *, d(:,1),d(:,100)
end program main
make output to show compiler flags:
$ make
gfortran -g -O2 -c push_array.f90
gfortran -g -O2 -o main test_push_array.f90 push_array.o
My compiler version:
$ gfortran --version
GNU Fortran (GCC) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
My system:
$ uname -a
Darwin darthan 12.5.0 Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64
If I run the test code with as given, it goes into an infinite loop and my system memory is completely exhausted. I tried to trace the test case in gdb by setting breakpoint where I push i onto a in the first loop,but gdb is unable to step into the module function.
If I comment just the first test loop where i is pushed onto a, here are the results:
$ ./main
1 100
1 2 3 100 101 102
1.0010000467300415 1.0019999742507935 1.0030000209808350 100.00099945068359 100.00199890136719 100.00299835205078
These would be expected.
If I comment out just the second loop where I push j onto b, here are the results:
$ ./main
1 100
1 2 3 100 101 102
1.0010000467300415 1.0019999742507935 1.0030000209808350 100.00099945068359 100.00199890136719 100.00299835205078
Once again, as expected.
Things start getting strange when I comment out just the third loop where I push xp onto c:
$ ./main
1 0
1 0
1.0010000467300415 1.0019999742507935 1.0030000209808350 100.00099945068359 100.00199890136719 100.00299835205078
The pattern continues when I comment out just the fourth loop where I push xp8 onto d:
$ ./main
1 0
1 0
1 2 3 100 101 102
My questions:
Why does the main test program go into a infinite loop when I try to use all four functions defined in the push interface in the same program?
In the case where I comment out the third and fourth loops , why do the results for a(100) and b(100) both equal to 0?
Any feedback would be appreciated...thanks!
Edit:
The two functions that needed to be changed are given below
function push_scalar_int_onto_rank1_int (array,val) result (new_array)
integer,intent(in),allocatable :: array(:)
integer,intent(in) :: val
integer,allocatable :: new_array(:)
integer :: length
if (allocated(array)) then
length = size(array) + 1
else
length = 1
end if
allocate(new_array(length)) ! changed
if (allocated(array)) new_array(:) = array(:)
new_array(length) = val
return
end function push_scalar_int_onto_rank1_int
function push_scalar_int2_onto_rank1_int2 (array,val) result (new_array)
integer(2),intent(in),allocatable :: array(:)
integer(2),intent(in) :: val
integer(2),allocatable :: new_array(:)
integer :: length
if (allocated(array)) then
length = size(array) + 1
else
length = 1
end if
allocate(new_array(length)) ! changed
if (allocated(array)) new_array(:) = array(:)
new_array(length) = val
return
end function push_scalar_int2_onto_rank1_int2
You allocate statement in some of the function bodies references the size of the array argument. If the array argument is not allocated, that reference is invalid.
Earlier in the procedure you test for allocation status and set a variable named length - perhaps you meant to use that.
(For clarity - perhaps look at the allocate statement in the push_scalar_int_onto_rank1_int function.)
IanH explains the problem. The compiler can help you find it. I get different responses from gfortran 4.8 depending on the compilation options. With -O2 -fimplicit-none -Wall -Wline-truncation -Wcharacter-truncation -Wsurprising -Waliasing -Wimplicit-interface -Wunused-parameter -fcheck=all -std=f2008 -pedantic -fbacktrace it compiles, then gives runtime error:
At line 25 of file push.f90
Fortran runtime error: Index '1' of dimension 1 of array 'new_array' above upper bound of -265221874
line 25 is the line before return in push_scalar_int_onto_rank1_int.