Compressed sparse column to row conversion - Fortran

I was trying to convert a matrix from compressed sparse column (CSC) format to compressed sparse row (CSR) format using Fortran. Here's what I have so far:
program test
  implicit none
  real*4,dimension(19)::csc_data=(/10.,3.,3.,9.,7.,8.,4.,8.,8.,7.,7.,9.,-2.,5.,9.,2.,3.,13.,-1./)
  integer*4,dimension(19)::csc_index=(/1,2,4,2,3,5,6,3,4,3,4,5,1,4,5,6,2,5,6/)
  integer*4,dimension(7)::csc_pointer=(/1,4,8,10,13,17,20/)
  integer*4,dimension(7)::csr_pointer
  integer*4,dimension(19)::csr_index
  real*4,dimension(19)::csr_data
  integer*4::global_counter,counter,i
  integer*4::num_nonzero,num_cols,num_rows
  integer*4::s1,s2,c,r
  num_nonzero=19
  num_rows=6
  num_cols=6
  csr_pointer(1)=1
  global_counter=1
  do i=1,num_rows
    counter=0
    do c=1,num_cols
      s1=csc_pointer(c)
      s2=csc_pointer(c+1)-1
      do r=s1,s2
        if(csc_index(r).eq.i) then
          counter=counter+1
          csr_data(global_counter)=csc_data(r)
          csr_index(global_counter)=c
          global_counter=global_counter+1
        end if
      end do
    end do
    csr_pointer(i+1)=csr_pointer(i)+counter
  end do
end program test
Could anyone show a more efficient approach? I would really appreciate it if you could also show it with OpenMP parallelization. Thanks.

While I cannot comment on the correctness of the algorithm, I can make a few points on the code itself:
csc_data is defined as real*4 while csr_data is defined as integer*4
csc_pointer is defined as integer*4 while csr_pointer is defined as real*4
If you output the data, aside from the integer/real difference, the data is the same, something I would not expect when converting the matrix
csr_index has an index of global_i which should be global_counter
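On the efficiency question: the loop in the question rescans every stored entry once per row, so its cost grows like num_rows times the number of non-zeros. A common alternative, sketched below, is a counting pass, a running sum, and a scatter pass, which touches each entry only twice. This is a minimal sketch and not the asker's method. It uses the arrays from the question and assumes 1-based indices; the names cursor and dest are introduced here for illustration.

```fortran
program csc_to_csr_demo
  implicit none
  integer, parameter :: nnz = 19, num_rows = 6, num_cols = 6
  real :: csc_data(nnz) = (/10.,3.,3.,9.,7.,8.,4.,8.,8.,7.,7.,9.,-2.,5.,9.,2.,3.,13.,-1./)
  integer :: csc_index(nnz) = (/1,2,4,2,3,5,6,3,4,3,4,5,1,4,5,6,2,5,6/)
  integer :: csc_pointer(num_cols+1) = (/1,4,8,10,13,17,20/)
  integer :: csr_pointer(num_rows+1), csr_index(nnz)
  integer :: cursor(num_rows)   ! hypothetical helper: next free slot per row
  real :: csr_data(nnz)
  integer :: c, k, r, dest

  ! Pass 1: count the stored entries in each row (kept in slot r+1).
  csr_pointer = 0
  do k = 1, nnz
     r = csc_index(k)
     csr_pointer(r+1) = csr_pointer(r+1) + 1
  end do

  ! Running sum turns the counts into 1-based row start positions.
  csr_pointer(1) = 1
  do r = 1, num_rows
     csr_pointer(r+1) = csr_pointer(r) + csr_pointer(r+1)
  end do

  ! Pass 2: walk the columns in order and drop each entry into its row.
  ! Visiting columns in increasing order keeps each row sorted by column.
  cursor = csr_pointer(1:num_rows)
  do c = 1, num_cols
     do k = csc_pointer(c), csc_pointer(c+1) - 1
        r = csc_index(k)
        dest = cursor(r)
        csr_data(dest) = csc_data(k)
        csr_index(dest) = c
        cursor(r) = dest + 1
     end do
  end do

  print '(a,7i4)', 'csr_pointer:', csr_pointer
  print '(a,19i3)', 'csr_index:  ', csr_index
  print '(a,19f6.1)', 'csr_data:   ', csr_data
end program csc_to_csr_demo
```

For the matrix in the question this should give csr_pointer = 1 3 6 9 13 17 20. As for OpenMP, the counting pass is the easier part to parallelize (for example with an atomic update on the counter). The scatter pass advances a per-row cursor, so its iterations depend on each other and a parallel version would need more care; that is not shown here.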

Related

Equivalent of -ffree-line-length-none in mpiifort? [duplicate]

I've spent hours scouring the internet for a solution to this problem and can't find anything. I have been trying to write unformatted output to a CSV output file with multiple very long lines of varying length and multiple data types. I first write a long header that names the variables, separated by commas, and then on the lines below it I write the values for those variables. However, with sequential access, the long output lines are broken into multiple shorter lines, which is not what I was hoping for. I tried controlling the line length using recl in the open statement, but that only added a bunch of garbled text and symbols after the output, and the same problem still occurred. I also tried using direct access, but the lines are not all the same length, so that would not work either. I've read about using stream I/O in Fortran 2003, but I'm using Fortran 90, so that won't work either. I am using Fortran 90 with the Plato IDE, which uses the FTN95 compiler. I included an example program similar to what I want to do below, using an array and some dummy text, and I've included the output below that, illustrating the problem. Does anyone know how I can get just one line per write statement? Any help would be greatly appreciated.
module types
integer, parameter :: dp=selected_real_kind(15)
end module types
program blah
use types
use inputoutput
implicit none
integer :: i
character(50)::fileNm
integer :: unitout2=20
real(dp), dimension(100) :: bigArray
fileNm='predictout2.csv'
open(unit=unitout2,file=fileNm,status="replace")
do i=1,100
bigArray(i)=i
end do
write(unitout2,*)"word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,&
&word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,&
&word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word"
write(unitout2,*)bigArray
close(unitout2)
end program
Here's the output for the program above (without recl):
word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word
,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,word,wo
rd,word,word,word,word,word
1.00000000000 2.00000000000 3.00000000000 4.00000000000
5.00000000000 6.00000000000 7.00000000000 8.00000000000
9.00000000000 10.0000000000 11.0000000000 12.0000000000
13.0000000000 14.0000000000 15.0000000000 16.0000000000
17.0000000000 18.0000000000 19.0000000000 20.0000000000
21.0000000000 22.0000000000 23.0000000000 24.0000000000
25.0000000000 26.0000000000 27.0000000000 28.0000000000
29.0000000000 30.0000000000 31.0000000000 32.0000000000
33.0000000000 34.0000000000 35.0000000000 36.0000000000
37.0000000000 38.0000000000 39.0000000000 40.0000000000
41.0000000000 42.0000000000 43.0000000000 44.0000000000
45.0000000000 46.0000000000 47.0000000000 48.0000000000
49.0000000000 50.0000000000 51.0000000000 52.0000000000
53.0000000000 54.0000000000 55.0000000000 56.0000000000
57.0000000000 58.0000000000 59.0000000000 60.0000000000
61.0000000000 62.0000000000 63.0000000000 64.0000000000
65.0000000000 66.0000000000 67.0000000000 68.0000000000
69.0000000000 70.0000000000 71.0000000000 72.0000000000
73.0000000000 74.0000000000 75.0000000000 76.0000000000
77.0000000000 78.0000000000 79.0000000000 80.0000000000
81.0000000000 82.0000000000 83.0000000000 84.0000000000
85.0000000000 86.0000000000 87.0000000000 88.0000000000
89.0000000000 90.0000000000 91.0000000000 92.0000000000
93.0000000000 94.0000000000 95.0000000000 96.0000000000
97.0000000000 98.0000000000 99.0000000000 100.000000000
This isn't a problem with the ACCESS used for the file (stream, sequential or direct) - it is a consequence of the format specification that you are using.
Note that you are not doing unformatted output. Formatted versus unformatted is a question of whether the output is intended to be human readable.
The star in the second specifier of the WRITE statement is a specification of list directed formatting. This means that the format used for the output is based on the list of things to be output. Beyond that and a small set of rules in the language for list directed output, you are pretty much leaving the appearance of things up to the Fortran processor (the compiler).
With list directed formatted output the processor is specifically allowed to insert as many records as it sees fit between items. It does that here, quite reasonably, in order to make it easier for people to read the file.
If you want more control over the appearance of your output, then use an explicit format. For example, something like:
write(unitout2,"(9999(G12.5,:,','))") bigArray
might be more appropriate.
(Technically when a sequential file is opened there is a processor defined maximum record length (in the absence of a programmer specified maximum length) that should not be exceeded. Practically, given the way sequential formatted files are stored on disk by nearly all current Fortran compilers, that technicality doesn't cause any problems.)
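A minimal sketch of that suggestion as a complete program, assuming a compiler whose default record length is long enough for these lines. The file name demo_out.csv and the read-back loop are additions for illustration only; the header is produced with the same style of format and an implied-do list instead of one long string literal.

```fortran
program csv_one_line_demo
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  integer, parameter :: n = 100, unitout = 20
  real(dp) :: bigArray(n)
  character(len=2000) :: line
  integer :: i, ios, nrec

  do i = 1, n
     bigArray(i) = i
  end do

  ! demo_out.csv is a hypothetical file name used only for this sketch.
  open(unit=unitout, file='demo_out.csv', status='replace')

  ! Explicit formats: the repeat count 9999 is deliberately larger than the
  ! item count, and the colon stops output before a trailing comma is written.
  write(unitout, '(9999(a,:,","))') ('word', i = 1, n)
  write(unitout, '(9999(g12.5,:,","))') bigArray
  close(unitout)

  ! Read the file back and count records to see one record per write.
  open(unit=unitout, file='demo_out.csv', status='old')
  nrec = 0
  do
     read(unitout, '(a)', iostat=ios) line
     if (ios /= 0) exit
     nrec = nrec + 1
  end do
  close(unitout)

  print *, 'records in file:', nrec
end program csv_one_line_demo
```

Because the format contains no slash and the repeat count is never exhausted, each write statement should end up as a single record. The G12.5 fields are padded with blanks; a narrower or different edit descriptor may suit a CSV consumer better.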

How to write columnwise in FORMAT in FORTRAN 77

I am using FORTRAN 77 as a third-party language in the ANSYS computation software. Here we can write entire rows and columns to files during I/O operations. However, I am not able to move the cursor back to the first row and then write column-wise for every column of the 2D array. Unfortunately, it writes all the data in a single column. I need to know what I can use in the place marked XXX:
*CFOPEN, ACT_STR, CSV,,APPEND
*DO,INF,1,2*S,1
*VWRITE, S0(1,INF),
(XXX,F10.2,',')
*CFCLOS
You can try transposing the matrix and then printing it row-wise. You can write a small subroutine that does the transpose for S0.

Convert CSV to Gridded Binary

I am trying to convert a CSV text file with three columns and 572 rows to a gridded binary file (.bin) using gfortran.
I have two Fortran programs that I have written to achieve this.
The issue is that my binary file size is ending up way too large (9.6GB) by the end, which is not correct.
I have a sneaking suspicion that my nx and ny values in ascii2grd.f90 are not correct and that this is leading to the bad .bin file being created. With such a small list (only 572 rows), I am expecting the final .bin to be in the KB range, not GBs.
temp.f90
!PROGRAM TO CONVERT ASCII TO GRD
program gridded
real lon(572),lat(572),temp(572)
open(2,file='/home/weather/data/file')
open(3,file='/home/weather/out.dat')
do 20 i=1,572
read(2,*)lat(i),lon(i),temp(i)
write(3,*)temp(i)
20 continue
stop
end
ascii2grd.f90
!PROGRAM TO CONVERT ASCII TO GRD
program ascii2grd
parameter(nx=26,ny=22,np=1)
real u(nx,ny,np),temp1(nx,ny)
integer :: reclen
inquire(iolength=reclen)a
open(12,file='/home/weather/test.bin',&
form='unformatted',access='direct',recl=nx*ny*reclen)
open(11,file='/home/weather/out.dat')
do k=1,np
read(11,*)((u(j,i,k),j=1,nx),i=1,ny)
10 continue
enddo
rec=1
do kk=1,np
write(12,rec=irec)((u(j,i,kk),j=1,nx),i=1,ny)
write(*,*)'Processing...'
irec=irec+1
enddo
write(*,*)'Finished'
stop
end
Sample from out.dat
6.90000010
15.1999998
21.2999992
999.000000
6.50000000
10.1000004
999.000000
18.0000000
999.000000
20.1000004
15.6000004
8.30000019
9.89999962
999.000000
Sample from file
-69.93500 43.90028 6.9
-69.79722 44.32056 15.2
-69.71076 43.96401 21.3
-69.68333 44.53333 999.00000
-69.55380 45.46462 6.5
-69.53333 46.61667 10.1
-69.1 44.06667 999.00000
-68.81861 44.79722 18.0
-68.69194 45.64778 999.00000
-68.36667 44.45 20.1
-68.30722 47.28500 15.6
-68.05 46.68333 8.3
-68.01333 46.86722 9.9
-67.79194 46.12306 999.00000
I would suggest a general strategy like the following:
1. Read the CSV with Python/pandas. It could be many other tools, although using Python will be convenient for step 2, as you'll see. The important thing is that many other languages are more convenient than Fortran for reading a CSV, and that will allow you to check that step 1 is working before moving on.
2. Output to binary with numpy's tofile(). Also note that numpy defaults to 'C' order for arrays, so you may need to specify 'F' (Fortran) order.
I have a utility at github called dataset2binary that automates this and may be of interest to you, or you could refer to the code at this answer. That is probably overkill though, because you seem to just be reading one big array of the same datatype. Nevertheless, the code you'd want will be similar, just simpler.
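If you would rather stay in Fortran, one detail in ascii2grd.f90 may be worth checking: it assigns rec=1 but then writes with rec=irec, so irec appears never to be given a value. With direct access, a stray large record number can by itself produce a very large file, and nx*ny = 26*22 = 572 already matches the number of rows. Below is a minimal sketch of the write step with the counter initialized. It uses synthetic values instead of reading out.dat, and demo_grid.bin is a hypothetical file name; the size inquiry assumes a compiler that supports the Fortran 2003 size= specifier.

```fortran
program grid_write_demo
  implicit none
  integer, parameter :: nx = 26, ny = 22, np = 1
  real :: u(nx, ny, np), check(nx, ny)
  integer :: reclen, irec, i, j, k, fsize

  ! Synthetic stand-in data (assumption): the real program would read out.dat.
  do k = 1, np
     do i = 1, ny
        do j = 1, nx
           u(j, i, k) = real(j + nx*(i - 1))
        end do
     end do
  end do

  ! Record length for one nx-by-ny slab, in the processor's own units.
  inquire(iolength=reclen) u(:, :, 1)

  ! demo_grid.bin is a hypothetical file name used only for this sketch.
  open(12, file='demo_grid.bin', form='unformatted', access='direct', &
       recl=reclen, status='replace')

  irec = 1   ! the record counter is set before its first use
  do k = 1, np
     write(12, rec=irec) u(:, :, k)
     irec = irec + 1
  end do

  ! Read the first record back as a round-trip check.
  read(12, rec=1) check
  close(12)

  inquire(file='demo_grid.bin', size=fsize)
  print *, 'file size:', fsize
  print *, 'round trip ok:', all(check == u(:, :, 1))
end program grid_write_demo
```

With 4-byte default reals, one record should come to about 572*4 = 2288 bytes, which is in line with the KB-sized file the question expects.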

Fortran "write" column limit? [duplicate]

Fortran 90 how to write very long output lines of different length
