Fortran error: size of variable is too large - fortran

I have a long program whose goal is to solve the matrix system ax=b. When I run it, I get "error: size of variable is too large".
program ddm
integer :: i,j,k
integer, parameter :: FN=1,FML=80,FMH=80
integer, parameter :: NBE=1*80*80 !NBE=FN*FML*FMH
double precision, dimension(1:3*NBE,1:3*NBE) :: AA
double precision, dimension(1:3*NBE) :: BB
double precision :: XX(3*NBE)
double precision, dimension(1:NBE) :: DSL,DSH,DNN
double precision, dimension(1:FML,1:FMH) :: DSL1,DSH1,DNN1
! Construct a block matrix
AA(1:NBE,1:NBE) = SLSL
AA(1:NBE,NBE+1:2*NBE) = SLSH
AA(1:NBE,2*NBE+1:3*NBE) = SLNN
AA(NBE+1:2*NBE,1:NBE) = SHSL
AA(NBE+1:2*NBE,NBE+1:2*NBE) = SHSH
AA(NBE+1:2*NBE,2*NBE+1:3*NBE) = SHNN
AA(2*NBE+1:3*NBE,1:NBE) = NNSL
AA(2*NBE+1:3*NBE,NBE+1:2*NBE) = NNSH
AA(2*NBE+1:3*NBE,2*NBE+1:3*NBE) = NNNN
! Construct a block matrix for boundary condition
BB(1:NBE) = SLBC
BB(NBE+1:2*NBE) = SHBC
BB(2*NBE+1:3*NBE) = NNBC
call GE(AA,BB,XX,3*NBE)
DSL = XX(1:NBE)
DSH = XX(NBE+1:2*NBE)
DNN = XX(2*NBE+1:3*NBE)
DSL1 = reshape(DSL,(/FML,FMH/))
DSH1 = reshape(DSH,(/FML,FMH/))
DNN1 = reshape(DNN,(/FML,FMH/))
open(unit=2, file='DNN2.txt', ACTION="write", STATUS="replace")
do i=1,80
write(2,'(*(F14.7))') real(DNN1(i,:))
end do
end program ddm
Note: GE(AA,BB,XX,3*NBE) is the subroutine that solves the matrix system. The GE subroutine is given below.
subroutine GE(a,b,x,n)
!===========================================================
! Solutions to a system of linear equations A*x=b
! Method: Gauss elimination (with scaling and pivoting)
!-----------------------------------------------------------
! input ...
! a(n,n) - array of coefficients for matrix A
! b(n) - array of the right hand coefficients b
! n - number of equations (size of matrix A)
! output ...
! x(n) - solutions
! comments ...
! the original arrays a(n,n) and b(n) will be destroyed
! during the calculation
!===========================================================
implicit none
integer n
double precision a(n,n),b(n),x(n)
double precision s(n)
double precision c, pivot, store
integer i, j, k, l
! step 1: begin forward elimination
do k=1, n-1
! step 2: "scaling"
! s(i) will have the largest element from row i
do i=k,n ! loop over rows
s(i) = 0.0
do j=k,n ! loop over elements of row i
s(i) = max(s(i),abs(a(i,j)))
end do
end do
! step 3: "pivoting 1"
! find a row with the largest pivoting element
pivot = abs(a(k,k)/s(k))
l = k
do j=k+1,n
if(abs(a(j,k)/s(j)) > pivot) then
pivot = abs(a(j,k)/s(j))
l = j
end if
end do
! Check if the system has a singular matrix
if(pivot == 0.0) then
write(*,*) "The matrix is singular"
return
end if
! step 4: "pivoting 2" interchange rows k and l (if needed)
if (l /= k) then
do j=k,n
store = a(k,j)
a(k,j) = a(l,j)
a(l,j) = store
end do
store = b(k)
b(k) = b(l)
b(l) = store
end if
! step 5: the elimination (after scaling and pivoting)
do i=k+1,n
c=a(i,k)/a(k,k)
a(i,k) = 0.0
b(i)=b(i)- c*b(k)
do j=k+1,n
a(i,j) = a(i,j)-c*a(k,j)
end do
end do
end do
! step 6: back substitution
x(n) = b(n)/a(n,n)
do i=n-1,1,-1
c=0.0
do j=i+1,n
c= c + a(i,j)*x(j)
end do
x(i) = (b(i)- c)/a(i,i)
end do
end subroutine GE

Turn your arrays (at least AA, BB, XX) into allocatable arrays and allocate them yourself in the code. You are hitting the size limit for statically allocated arrays. If I remember correctly, that limit is 2 GB on some systems (experts can confirm or give the right numbers).
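As a rough sketch of that suggestion: with NBE = 6400 the system size is 3*NBE = 19200, so AA alone holds 19200*19200 = 368,640,000 double precision values, about 2.95 GB. The module below allocates the three arrays at run time and reports failure through stat= instead of aborting. The names ddm_alloc_sketch and allocate_system are made up for illustration and are not part of the original program.

```fortran
module ddm_alloc_sketch
  implicit none
contains
  ! Allocate the system matrix and vectors for n = 3*nbe unknowns.
  ! ierr is 0 on success and non-zero if the allocation failed.
  subroutine allocate_system(nbe, AA, BB, XX, ierr)
    integer, intent(in) :: nbe
    double precision, allocatable, intent(out) :: AA(:,:), BB(:), XX(:)
    integer, intent(out) :: ierr
    integer :: n

    n = 3*nbe
    allocate(AA(n,n), BB(n), XX(n), stat=ierr)
    if (ierr /= 0) return          ! not enough memory: let the caller decide

    AA = 0.0d0
    BB = 0.0d0
    XX = 0.0d0
  end subroutine allocate_system
end module ddm_alloc_sketch
```

In the main program, AA, BB and XX would then be declared with the allocatable attribute and deferred shape, for example double precision, allocatable :: AA(:,:), and the call to GE stays as it is. The machine still needs enough memory to hold the matrix.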

Related

Solving a linear system with 0s on the main diagonal in Fortran

As per title, what's the best algorithm to numerically solve a linear system in Fortran, if this system has 0s along the main diagonal?
Up to now, I had been fine using simple Gaussian elimination:
SUBROUTINE solve_lin_sys(A, c, x, n)
! =====================================================
! Uses gauss elimination and backwards substitution
! to reduce a linear system and solve it.
! Problem (0-div) can arise if there are 0s on main diag.
! =====================================================
IMPLICIT NONE
INTEGER:: i, j, k
REAL*8::fakt, summ
INTEGER, INTENT(in):: n
REAL*8, INTENT(inout):: A(n,n)
REAL*8, INTENT(inout):: c(n)
REAL*8, INTENT(out):: x(n)
DO i = 1, n-1 ! pick the var to eliminate
DO j = i+1, n ! pick the row where to eliminate
fakt = A(j,i) / A(i,i) ! elimination factor
DO k = 1, n ! eliminate
A(j,k) = A(j,k) - A(i,k)*fakt
END DO
c(j)=c(j)-c(i)*fakt ! iterate on known terms
END DO
END DO
! Actual solving:
x(n) = c(n) / A(n,n) ! last variable being solved
DO i = n-1, 1, -1
summ = 0.d0
DO j = i+1, n
summ = summ + A(i,j)*x(j)
END DO
x(i) = (c(i) - summ) / A(i,i)
END DO
END SUBROUTINE solve_lin_sys
As you can see, I'm dividing by A(i,i) in the calculations. The same problem arises with Gauss-Jordan elimination or Gauss-Seidel iteration.
What's the best solution? I know I'm probably missing some really basic step, but I'm a beginner programmer and apparently my linear algebra is getting rusty.
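One common remedy for zeros on the diagonal is partial pivoting: before eliminating column i, swap row i with the row below it that has the largest absolute entry in that column. The following is only a minimal sketch of that idea, keeping the argument names A, c, x, n from the routine above. The names pivot_solve_sketch and solve_pivot and the ok flag are assumptions added for illustration. For real work, a library routine such as LAPACK's dgesv is usually the better choice.

```fortran
module pivot_solve_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Gaussian elimination with partial (row) pivoting.
  ! A and c are overwritten. ok is .false. if a zero pivot column is found.
  subroutine solve_pivot(A, c, x, n, ok)
    integer, intent(in) :: n
    real(dp), intent(inout) :: A(n,n), c(n)
    real(dp), intent(out) :: x(n)
    logical, intent(out) :: ok
    integer :: i, j, p
    real(dp) :: fakt, tmp, row(n)

    ok = .false.
    x = 0.0_dp
    do i = 1, n-1
      ! Row (among i..n) with the largest |A(row,i)| becomes the pivot row
      p = i - 1 + maxloc(abs(A(i:n,i)), dim=1)
      if (A(p,i) == 0.0_dp) return       ! whole column is zero: singular
      if (p /= i) then                   ! swap rows i and p, also in c
        row = A(i,:); A(i,:) = A(p,:); A(p,:) = row
        tmp = c(i);   c(i) = c(p);     c(p) = tmp
      end if
      do j = i+1, n                      ! eliminate below the pivot
        fakt = A(j,i) / A(i,i)
        A(j,i:n) = A(j,i:n) - fakt*A(i,i:n)
        c(j) = c(j) - fakt*c(i)
      end do
    end do
    if (A(n,n) == 0.0_dp) return

    ! Back substitution (the dot product is over an empty range for i = n)
    do i = n, 1, -1
      x(i) = (c(i) - dot_product(A(i,i+1:n), x(i+1:n))) / A(i,i)
    end do
    ok = .true.
  end subroutine solve_pivot
end module pivot_solve_sketch
```

For the 2x2 system with A = [[0,1],[1,0]] and c = [2,3], the routine swaps the two rows first and then returns x = [3,2], whereas the version without pivoting divides by A(1,1) = 0.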

Infinite do loop issue

Just to preface this question, I am a couple weeks new to Fortran and I have run into an infinite do loop error when running the actual program.
I am 99% sure that the infinite sequence is coming from the do loop, but I thought I should ask just to be sure. I am not sure what part of the do loop is causing the infinite do loop but any help would be greatly appreciated!
Here is the code:
program Decode
implicit none
! Declare variables - Add variables as necessary (integer only!)
! M will store the encoding matrix, MInv will store its inverse
! Decoded_message will store the decoded message
integer :: M(2,2), MInv(2,2), Determinant, a, b, c, d, detM, i, v(:,:), ascIIcode(:,:)
allocatable :: v, ascIIcode
character*32 :: Decoded_Message
! open data file and read in the encoding matrix
open(42,file='Data3.txt')
read(42,*) M(1,1), M(1,2)
read(42,*) M(2,1), M(2,2)
! Invert the encoding matrix and store it in MInv
detM = determinant(M)
MInv(1,1) = +detM *M(2,2)
MInv(1,2) = -detM *M(2,1)
MInv(2,1) = -detM *M(1,2)
MInv(2,2) = +detM *M(1,1)
! Processing steps required:
! Read from the file in 2 numbers at a time and store in a vector array
do i = 2, 31
allocate (v(2,1), ascIIcode(2,1))
read(42,*) v(1,1)
read(42,*) v(2,1)
! decode the 2 numbers read in (1) by multiplying Minv by the vector array from (1)
ascIIcode(1,1) = ((MInv(1,1)*v(1,1))+(MInv(1,2)*v(2,1)))
ascIIcode(2,1) = ((MInv(2,1)*v(1,1))+(MInv(2,2)*v(2,1)))
! Insert the result from (2) into the character string Decoded_Message. To concatenate
Decoded_Message = char(ascIIcode(1,1))//char(ascIIcode(2,1))
! Use a loop that advances in steps of 2 and goes to 31
deallocate (v)
deallocate (ascIIcode)
end do
! print results.
print*, Decoded_Message
! close files
close(42)
end program Decode
integer function Determinant(M)
! This function computes the determinant of matrices of size 2 or 3
! M is the matrix for which the determinant is calculated (square matrix only)
! n is the number of rows or columns in M
implicit none
integer :: M(2,2), a, b, c, d, e, f, g, h, i, Det
do
a = M(1,1)
b = M(1,2)
c = M(2,1)
d = M(2,2)
Det = (a*d)-(b*c)
end do
end function Determinant
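Looking at the code above, the do ... end do inside Determinant has no exit condition, so the function never returns. The result is also stored in Det rather than in the function name, so the caller would not receive it even if the loop ended. A minimal sketch of a 2x2 determinant without the loop might look like this (the names det_sketch and determinant2 are made up for illustration):

```fortran
module det_sketch
  implicit none
contains
  ! Determinant of a 2x2 integer matrix: a*d - b*c.
  ! The value is assigned to the function name so it reaches the caller.
  integer function determinant2(M)
    integer, intent(in) :: M(2,2)
    determinant2 = M(1,1)*M(2,2) - M(1,2)*M(2,1)
  end function determinant2
end module det_sketch
```

Note also that the main loop above assigns to Decoded_Message on every pass, so only the last two characters would survive. That is a separate issue from the infinite loop.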

Using two conditions to write a conditional loop in Fortran

Hoping someone could help me. I was just introduced to Fortran and can't seem to figure out why my code is producing an infinite loop.
I want to write code that finds the root (c) of the function f(x) = x^3 - 3x - 4 in the interval [2,3]:
I want the steps to be: initialize a and b.
Then calculate c = (a+b)/2.
Then if f(c) < 0, set b=c and repeat the previous step. If f(c) > 0, then set a=c and repeat the previous step.
The point is to repeat these steps until we get 1e-4 close to the actual root.
This is what I have written so far, and it is producing an infinite loop.
I am also confused about whether it is a good idea to use the two condition loop (as in the function has to be greater/less than 0 .AND. absolute value of the function has to be less than 1e-4).
Any help/tips would be greatly appreciated!
MY CODE:
PROGRAM proj
IMPLICIT NONE
REAL :: a=2.0, b=3.0, c, f
INTEGER :: count1
c = (a + b)/2
f = c**3 - 3c - 4
DO
IF (( f .GT. 0.0) .AND. ( ABS(f) .LT. 1e-4)) EXIT
c = (a+c)/2
f = c**3 - 3c - 4
count1 = count1 + 1
PRINT*, f, c,count1
END DO
PRINT*, c, f
END PROGRAM proj
I want to be able to show the iterations and print each step (getting closer to the actual root).
What you have described is the bisection method for locating a zero of a function in the interval [a,b]. There are three possibilities to consider: the interval does not contain a zero; an endpoint of the interval is a zero; or there is more than one zero in the interval.
This program implements bisection by inspecting a number of subintervals. There are other, and better, methods, but this one should be understandable for you.
!
! use bisection to locate the zeros of a function f(x) in the interval
! [a,b]. There are three possibilities to consider: (1) The interval
! contains no zeros; (2) One (or both) endpoints is a zero; or (3)
! more than one point is a zero.
!
program proj
implicit none
real dx, fl, fr, xl, xr
real, allocatable :: x(:)
integer i
integer, parameter :: n = 1000
xl = 2 ! Left endpoint
xr = 3 ! Right endpoint
dx = (xr - xl) / (n - 1) ! Coarse increment
allocate(x(n))
x = xl + dx * [(i, i=0, n-1)] ! Precompute n x-values
x(n) = xr ! Make sure last point is xr
!
! Check end points for zeros. Comparison of a floating point variable
! against zero is exact.
!
fl = f(xl)
if (fl == 0) then
call prn(xl, fl)
x(1) = x(1) + dx / 10 ! Nudge past xl
end if
fr = f(xr)
if (fr == 0) then
call prn(xr, fr)
x(n) = x(n) - dx / 10 ! Reduce upper xr
end if
!
! Now do bisection. Assumes at most one zero in a subinterval.
! Make n above larger for smaller intervals.
!
do i = 1, n - 1
call bisect(x(i), x(i+1))
end do
contains
!
! The zero satisfies xl < zero < xr
!
subroutine bisect(xl, xr)
real, intent(in) :: xl, xr
real a, b, c, fa, fb, fc
real, parameter :: eps = 1e-5
a = xl
b = xr
do
c = (a + b) / 2
fa = f(a)
fb = f(b)
fc = f(c)
if (fa * fc <= 0) then ! In left interval
if (fa == 0) then ! Endpoint is a zero.
call prn(a, fa)
return
end if
if (fc == 0) then ! Endpoint is a zero.
call prn(c, fc)
return
end if
!
! Check for convergence. The zero satisfies a < zero < c.
!
if (abs(c - a) < eps) then
c = (a + c) / 2
call prn(c, f(c))
return
end if
!
! Contract interval and try again.
!
b = c
else if (fc * fb <= 0) then ! In right interval
if (fc == 0) then ! Endpoint is a zero.
call prn(c, fc)
return
end if
if (fb == 0) then ! Endpoint is a zero.
call prn(b, fb)
return
end if
!
! Check for convergence. The zero satisfies c < zero < b.
!
if (abs(b - c) < eps) then
c = (b + c) / 2
call prn(c, f(c))
return
end if
!
! Contract interval and try again.
!
a = c
else
return ! No zero in this interval.
end if
end do
end subroutine bisect
elemental function f(x)
real f
real, intent(in) :: x
f = x**3 - 3 * x - 4
end function f
subroutine prn(x, f)
real, intent(in) :: x, f
write(*,*) x, f
end subroutine prn
end program proj
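For the single root in [2,3] that the question asks about, a much shorter sketch is possible. In the question's loop, c = (a+c)/2 never updates a or b, so c drifts towards a = 2, where f = -2, and the exit test can never be satisfied. Also, 3c must be written 3*c and count1 is never initialized. The version below assumes f(a0) and f(b0) have opposite signs, keeps the half-interval where the sign changes, and stops when the interval is narrower than tol. The names bisect_sketch and bisect_root are made up for illustration.

```fortran
module bisect_sketch
  implicit none
contains
  real function f(x)
    real, intent(in) :: x
    f = x**3 - 3.0*x - 4.0
  end function f

  ! Plain bisection on [a0,b0]; assumes f(a0) and f(b0) differ in sign.
  ! Prints every iteration and returns the midpoint of the final interval.
  subroutine bisect_root(a0, b0, tol, root, iters)
    real, intent(in) :: a0, b0, tol
    real, intent(out) :: root
    integer, intent(out) :: iters
    real :: a, b, c, fc

    a = a0
    b = b0
    iters = 0
    do while (b - a > tol)
      c = (a + b) / 2
      fc = f(c)
      iters = iters + 1
      print *, iters, c, fc
      if (fc == 0.0) then              ! hit the root exactly
        a = c
        b = c
      else if (f(a) * fc < 0.0) then   ! sign change in [a,c]
        b = c
      else                             ! sign change in [c,b]
        a = c
      end if
    end do
    root = (a + b) / 2
  end subroutine bisect_root
end module bisect_sketch
```

Calling bisect_root(2.0, 3.0, 1e-4, root, iters) should print the successive midpoints and end with a root close to 2.1958. Since f(2) = -2 and f(3) = 14, a negative f(c) means the root lies to the right of c, which is the opposite of the rule stated in the question.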

Storing a Variable with a Multi-Dimensional Index in Fortran

Question
Consider the following code:
program example
implicit none
integer, parameter :: n_coeffs = 1000
integer, parameter :: n_indices = 5
integer :: i
real(8), dimension(n_coeffs) :: coeff
integer, dimension(n_coeffs,n_indices) :: index
do i = 1, n_coeffs
coeff(i) = real(i*3,8)
index(i,:) = [2,4,8,16,32]*i
end do
end
For any 5 dimensional index I need to obtain the associated coefficient, without knowing or calculating i. For instance, given [2,4,8,16,32] I need to obtain 3.0 without computing i.
Is there a reasonable solution, perhaps using sparse matrices, that would work for n_indices in the order of 100 (though n_coeffs still in the order of 1000)?
A Bad Solution
One solution would be to define a 5 dimensional array as in
real(8), dimension(2000,4000,8000,16000,32000) :: coeff2
do i = 1, n_coeffs
coeff2(index(i,1),index(i,2),index(i,3),index(i,4),index(i,5)) = coeff(i)
end do
then, to get the coefficient associated with [2,4,8,16,32], call
coeff2(2,4,8,16,32)
However, besides being very wasteful of memory, this solution would not allow n_indices to be set to a number higher than 7 given the limit of 7 dimensions to an array.
OBS: This question is a spin-off of this one. I have tried to ask the question more precisely having failed in the first attempt, an effort that greatly benefited from the answer of #Rodrigo_Rodrigues.
Actual Code
In case it helps, here is the code for the actual problem I am trying to solve. It is an adaptive sparse grid method for approximating a function. The main goal is to make the interpolation at the end as fast as possible:
MODULE MOD_PARAMETERS
IMPLICIT NONE
SAVE
INTEGER, PARAMETER :: d = 2 ! number of dimensions
INTEGER, PARAMETER :: L_0 = 4 ! after this adaptive grid kicks in, for L <= L_0 usual sparse grid
INTEGER, PARAMETER :: L_max = 9 ! maximum level
INTEGER, PARAMETER :: bound = 0 ! 0 -> for f = 0 at boundary
! 1 -> adding grid points at boundary
! 2 -> extrapolating close to boundary
INTEGER, PARAMETER :: max_error = 1
INTEGER, PARAMETER :: L2_error = 1
INTEGER, PARAMETER :: testing_sample = 1000000
REAL(8), PARAMETER :: eps = 0.01D0 ! epsilon for adaptive grid
END MODULE MOD_PARAMETERS
PROGRAM MAIN
USE MOD_PARAMETERS
IMPLICIT NONE
INTEGER, DIMENSION(d,d) :: ident
REAL(8), DIMENSION(d) :: xd
INTEGER, DIMENSION(2*d) :: temp
INTEGER, DIMENSION(:,:), ALLOCATABLE :: grid_index, temp_grid_index, grid_index_new, J_index
REAL(8), DIMENSION(:), ALLOCATABLE :: coeff, temp_coeff, J_coeff
REAL(8) :: temp_min, temp_max, V, T, B, F, x1
INTEGER :: k, k_1, k_2, h, i, j, L, n, dd, L1, L2, dsize, count, first, repeated, add, ind
INTEGER :: time1, time2, clock_rate, clock_max
REAL(8), DIMENSION(L_max,L_max,2**(L_max),2**(L_max)) :: coeff_grid
INTEGER, DIMENSION(d) :: level, LL, ii
REAL(8), DIMENSION(testing_sample,d) :: x_rand
REAL(8), DIMENSION(testing_sample) :: interp1, interp2
! ============================================================================
! EXECUTABLE
! ============================================================================
ident = 0
DO i = 1,d
ident(i,i) = 1
ENDDO
! Initial grid point
dsize = 1
ALLOCATE(grid_index(dsize,2*d),grid_index_new(dsize,2*d))
grid_index(1,:) = 1
grid_index_new = grid_index
ALLOCATE(coeff(dsize))
xd = (/ 0.5D0, 0.5D0 /)
CALL FF(xd,coeff(1))
CALL FF(xd,coeff_grid(1,1,1,1))
L = 1
n = SIZE(grid_index_new,1)
ALLOCATE(J_index(n*2*d,2*d))
ALLOCATE(J_coeff(n*2*d))
CALL SYSTEM_CLOCK (time1,clock_rate,clock_max)
DO WHILE (L .LT. L_max)
L = L+1
n = SIZE(grid_index_new,1)
count = 0
first = 1
DEALLOCATE(J_index,J_coeff)
ALLOCATE(J_index(n*2*d,2*d))
ALLOCATE(J_coeff(n*2*d))
J_index = 0
J_coeff = 0.0D0
DO k = 1,n
DO i = 1,d
DO j = 1,2
IF ((bound .EQ. 0) .OR. (bound .EQ. 2)) THEN
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(grid_index_new(k,d+i)-(-1)**j)/)
ELSEIF (bound .EQ. 1) THEN
IF (grid_index_new(k,i) .EQ. 1) THEN
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(-(-1)**j)/)
ELSE
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(grid_index_new(k,d+i)-(-1)**j)/)
ENDIF
ENDIF
CALL XX(d,temp(1:d),temp(d+1:2*d),xd)
temp_min = MINVAL(xd)
temp_max = MAXVAL(xd)
IF ((temp_min .GE. 0.0D0) .AND. (temp_max .LE. 1.0D0)) THEN
IF (first .EQ. 1) THEN
first = 0
count = count+1
J_index(count,:) = temp
V = 0.0D0
DO k_1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k_2 = 1,d
CALL XX(1,temp(k_2),temp(d+k_2),x1)
CALL BASE(x1,grid_index(k_1,k_2),grid_index(k_1,k_2+d),B)
T = T*B
ENDDO
V = V+coeff(k_1)*T
ENDDO
CALL FF(xd,F)
J_coeff(count) = F-V
ELSE
repeated = 0
DO h = 1,count
IF (SUM(ABS(J_index(h,:)-temp)) .EQ. 0) THEN
repeated = 1
ENDIF
ENDDO
IF (repeated .EQ. 0) THEN
count = count+1
J_index(count,:) = temp
V = 0.0D0
DO k_1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k_2 = 1,d
CALL XX(1,temp(k_2),temp(d+k_2),x1)
CALL BASE(x1,grid_index(k_1,k_2),grid_index(k_1,k_2+d),B)
T = T*B
ENDDO
V = V+coeff(k_1)*T
ENDDO
CALL FF(xd,F)
J_coeff(count) = F-V
ENDIF
ENDIF
ENDIF
ENDDO
ENDDO
ENDDO
ALLOCATE(temp_grid_index(dsize,2*d))
ALLOCATE(temp_coeff(dsize))
temp_grid_index = grid_index
temp_coeff = coeff
DEALLOCATE(grid_index,coeff)
ALLOCATE(grid_index(dsize+count,2*d))
ALLOCATE(coeff(dsize+count))
grid_index(1:dsize,:) = temp_grid_index
coeff(1:dsize) = temp_coeff
DEALLOCATE(temp_grid_index,temp_coeff)
grid_index(dsize+1:dsize+count,:) = J_index(1:count,:)
coeff(dsize+1:dsize+count) = J_coeff(1:count)
dsize = dsize + count
DO i = 1,count
coeff_grid(J_index(i,1),J_index(i,2),J_index(i,3),J_index(i,4)) = J_coeff(i)
ENDDO
IF (L .LE. L_0) THEN
DEALLOCATE(grid_index_new)
ALLOCATE(grid_index_new(count,2*d))
grid_index_new = J_index(1:count,:)
ELSE
add = 0
DO h = 1,count
IF (ABS(J_coeff(h)) .GT. eps) THEN
add = add + 1
J_index(add,:) = J_index(h,:)
ENDIF
ENDDO
DEALLOCATE(grid_index_new)
ALLOCATE(grid_index_new(add,2*d))
grid_index_new = J_index(1:add,:)
ENDIF
ENDDO
CALL SYSTEM_CLOCK (time2,clock_rate,clock_max)
PRINT *, 'Elapsed real time1 = ', DBLE(time2-time1)/DBLE(clock_rate)
PRINT *, 'Grid Points = ', SIZE(grid_index,1)
! ============================================================================
! Compute interpolated values:
! ============================================================================
CALL RANDOM_NUMBER(x_rand)
CALL SYSTEM_CLOCK (time1,clock_rate,clock_max)
DO i = 1,testing_sample
V = 0.0D0
DO L1=1,L_max
DO L2=1,L_max
IF (L1+L2 .LE. L_max+1) THEN
level = (/L1,L2/)
T = 1.0D0
DO dd = 1,d
T = T*(1.0D0-ABS(x_rand(i,dd)/2.0D0**(-DBLE(level(dd)))-DBLE(2*FLOOR(x_rand(i,dd)*2.0D0**DBLE(level(dd)-1))+1)))
ENDDO
V = V + coeff_grid(L1,L2,2*FLOOR(x_rand(i,1)*2.0D0**DBLE(L1-1))+1,2*FLOOR(x_rand(i,2)*2.0D0**DBLE(L2-1))+1)*T
ENDIF
ENDDO
ENDDO
interp2(i) = V
ENDDO
CALL SYSTEM_CLOCK (time2,clock_rate,clock_max)
PRINT *, 'Elapsed real time2 = ', DBLE(time2-time1)/DBLE(clock_rate)
END PROGRAM
"For any 5 dimensional index I need to obtain the associated coefficient, without knowing or calculating i. For instance, given [2,4,8,16,32] I need to obtain 3.0 without computing i."
function findloc_vector(matrix, vector) result(out)
integer, intent(in) :: matrix(:, :)
integer, intent(in) :: vector(size(matrix, dim=2))
integer :: out, i
do i = 1, size(matrix, dim=1)
if (all(matrix(i, :) == vector)) then
out = i
return
end if
end do
stop "No match for this vector"
end
And that's how you use it:
print*, coeff(findloc_vector(index, [2,4,8,16,32])) ! outputs 3.0
I must confess I was reluctant to post this code because, even though it answers your question, I honestly think this is not what you really want/need, but you didn't provide enough information for me to know what you really do want/need.
Edit (After actual code from OP):
If I decrypted your code correctly (and considering what you said in your previous question), you are declaring:
REAL(8), DIMENSION(L_max,L_max,2**(L_max),2**(L_max)) :: coeff_grid
(where L_max = 9, so size(coeff_grid) = 21233664 elements, about 160 MB) and then populating it with:
DO i = 1,count
coeff_grid(J_index(i,1),J_index(i,2),J_index(i,3),J_index(i,4)) = J_coeff(i)
ENDDO
(where count is of the order of 1000, i.e. about 0.005% of its elements), so that you can fetch the values by their 4 indices using array notation.
Please, don't do that. You don't need a sparse matrix in this case either. The new approach you proposed is much better: store the indices in the rows of a smaller array, and fetch from the array of coefficients using the location of those indices in that array. This is way faster (it avoids the large allocation) and much more memory-efficient.
PS: Is it mandatory for you to stick to Fortran 90? It's a very old version of the standard, and chances are that the compiler you're using implements a more recent one. You could improve the quality of your code a lot with the intrinsic move_alloc (for fewer array copies), the kind constants from the intrinsic module iso_fortran_env (for portability), the [], >, <, <=,... notation (for readability)...
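To illustrate the move_alloc and iso_fortran_env suggestions, here is a rough sketch of how the grow-and-copy step for grid_index and coeff could be written with one copy instead of two. It assumes both arrays are already allocated on entry. The names grow_sketch and append_points are made up for illustration and do not appear in the original code.

```fortran
module grow_sketch
  use iso_fortran_env, only: real64
  implicit none
contains
  ! Append new rows to grid_index and new values to coeff.
  ! The old data is copied once into a larger temporary, whose allocation
  ! is then handed over with move_alloc (no second copy).
  subroutine append_points(grid_index, coeff, new_index, new_coeff)
    integer, allocatable, intent(inout) :: grid_index(:,:)
    real(real64), allocatable, intent(inout) :: coeff(:)
    integer, intent(in) :: new_index(:,:)
    real(real64), intent(in) :: new_coeff(:)
    integer, allocatable :: tmp_index(:,:)
    real(real64), allocatable :: tmp_coeff(:)
    integer :: n_old, n_new

    n_old = size(coeff)
    n_new = size(new_coeff)
    allocate(tmp_index(n_old + n_new, size(grid_index, 2)))
    allocate(tmp_coeff(n_old + n_new))

    tmp_index(1:n_old, :)  = grid_index
    tmp_index(n_old+1:, :) = new_index
    tmp_coeff(1:n_old)     = coeff
    tmp_coeff(n_old+1:)    = new_coeff

    ! After these calls the temporaries are deallocated and the
    ! dummy arguments own the enlarged storage.
    call move_alloc(tmp_index, grid_index)
    call move_alloc(tmp_coeff, coeff)
  end subroutine append_points
end module grow_sketch
```

A call such as call append_points(grid_index, coeff, J_index(1:count,:), J_coeff(1:count)) would then stand in for the temp_grid_index/temp_coeff block in the main loop, provided coeff and J_coeff are declared with the same real64 kind.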

Sum duplicate values when converting from COO to CSR sparse matrix format

How would one sum up duplicate values efficiently when converting from COO format to CSR? Does something similar to the scipy implementation (http://docs.scipy.org/doc/scipy-0.9.0/reference/sparse.html) exist as a subroutine for Fortran? I am using Intel's MKL auxiliary routines for converting from COO to CSR, but it seems that they don't work for duplicate values.
In my codes I am using this subroutine that I wrote:
subroutine csr_sum_duplicates(Ap, Aj, Ax)
! Sum together duplicate column entries in each row of CSR matrix A
! The column indices within each row must be in sorted order.
! Explicit zeros are retained.
! Ap, Aj, and Ax will be modified *inplace*
integer, intent(inout) :: Ap(:), Aj(:)
real(dp), intent(inout) :: Ax(:)
integer :: nnz, r1, r2, i, j, jj
real(dp) :: x
nnz = 1
r2 = 1
do i = 1, size(Ap) - 1
r1 = r2
r2 = Ap(i+1)
jj = r1
do while (jj < r2)
j = Aj(jj)
x = Ax(jj)
jj = jj + 1
do while (jj < r2)
if (Aj(jj) == j) then
x = x + Ax(jj)
jj = jj + 1
else
exit
end if
end do
Aj(nnz) = j
Ax(nnz) = x
nnz = nnz + 1
end do
Ap(i+1) = nnz
end do
end subroutine
and you can use this subroutine to sort the indices:
subroutine csr_sort_indices(Ap, Aj, Ax)
! Sort CSR column indices inplace
integer, intent(inout) :: Ap(:), Aj(:)
real(dp), intent(inout) :: Ax(:)
integer :: i, r1, r2, l, idx(size(Aj))
do i = 1, size(Ap)-1
r1 = Ap(i)
r2 = Ap(i+1)-1
l = r2-r1+1
idx(:l) = argsort(Aj(r1:r2))
Aj(r1:r2) = Aj(r1+idx(:l)-1)
Ax(r1:r2) = Ax(r1+idx(:l)-1)
end do
end subroutine
where argsort (here the integer version, iargsort) is
function iargsort(a) result(b)
! Returns the indices that would sort an array.
!
! Arguments
! ---------
!
integer, intent(in):: a(:) ! array of numbers
integer :: b(size(a)) ! indices into the array 'a' that sort it
!
! Example
! -------
!
! iargsort([10, 9, 8, 7, 6]) ! Returns [5, 4, 3, 2, 1]
integer :: N ! number of numbers/vectors
integer :: i,imin,relimin(1) ! indices: i, i of smallest, relative imin
integer :: temp ! temporary
integer :: a2(size(a))
a2 = a
N=size(a)
do i = 1, N
b(i) = i
end do
do i = 1, N-1
! find ith smallest in 'a'
relimin = minloc(a2(i:))
imin = relimin(1) + i - 1
! swap to position i in 'a' and 'b', if not already there
if (imin /= i) then
temp = a2(i); a2(i) = a2(imin); a2(imin) = temp
temp = b(i); b(i) = b(imin); b(imin) = temp
end if
end do
end function
That should do what you wanted.
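The routines above operate on a matrix that is already in CSR form. If the COO-to-CSR step itself has to cope with duplicates, one option is a simple counting pass that keeps every entry, followed by sorting each row and summing duplicates as shown above. The following is only a sketch of such a conversion with 1-based indices. The module and routine names, the output names Bp, Bj, Bx, and the definition of dp are assumptions made for illustration.

```fortran
module coo_to_csr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Convert COO triplets (Ai, Aj, Ax) into CSR arrays (Bp, Bj, Bx).
  ! Duplicate (row, column) entries are kept, not summed, and entries
  ! within a row keep their input order.
  subroutine coo_to_csr(nrow, Ai, Aj, Ax, Bp, Bj, Bx)
    integer, intent(in) :: nrow, Ai(:), Aj(:)
    real(dp), intent(in) :: Ax(:)
    integer, intent(out) :: Bp(nrow+1), Bj(size(Ai))
    real(dp), intent(out) :: Bx(size(Ai))
    integer :: k, r, dest, next(nrow)

    ! Count the entries of row r in Bp(r+1)
    Bp = 0
    do k = 1, size(Ai)
      Bp(Ai(k)+1) = Bp(Ai(k)+1) + 1
    end do

    ! Running sum turns the counts into row start pointers
    Bp(1) = 1
    do r = 1, nrow
      Bp(r+1) = Bp(r) + Bp(r+1)
    end do

    ! Scatter each triplet into the next free slot of its row
    next = Bp(1:nrow)
    do k = 1, size(Ai)
      r = Ai(k)
      dest = next(r)
      Bj(dest) = Aj(k)
      Bx(dest) = Ax(k)
      next(r) = dest + 1
    end do
  end subroutine coo_to_csr
end module coo_to_csr_sketch
```

After this step, csr_sort_indices followed by csr_sum_duplicates would merge the repeated entries. The first Bp(nrow+1)-1 elements of Aj and Ax then hold the compacted matrix.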