How to find statistical mode in Fortran

I'm trying to write a program to find the mean, median, mode of an integer array but am having some complications in finding the mode. The following is the code that I've written so far.
First, the program will prompt user to enter a value for the number of integers that will be entered followed by request to enter that number of integers. The integers are then sorted in ascending order and the mean and median are found.
The problem I am having is with the mode. I can count the number of repeated values, and finding the value with the highest number of occurrences would give the mode, but I am unsure how to do this. Is there any intrinsic function in Fortran that counts the occurrences of the input values and returns the value with the highest count?
PROGRAM STATISTICS
!Created by : Rethnaraj Rambabu
IMPLICIT NONE
REAL, DIMENSION(:), ALLOCATABLE:: VAL
REAL TEMP, MEDIAN
REAL EVEN, MEAN, SUM, FMODE
INTEGER N, I,J
WRITE(*,*)' WHAT IS THE VALUE FOR N? '
READ(*,*) N
ALLOCATE(VAL(N))
WRITE(*,*) 'ENTER THE NUMBERS'
OPEN(1,FILE='FILE.TXT')
READ(1,*)(VAL(I),I=1,N)
CLOSE(1)
WRITE(*,*) VAL
!/---FOR SORTING----/!
DO I=1,N-1
DO J=1,N-1
IF(VAL(J) > VAL(J+1)) THEN
TEMP=VAL(J)
VAL(J)=VAL(J+1)
VAL(J+1)=TEMP
END IF
END DO
END DO
WRITE(*,*) VAL
!/-----MEDIAN----/!
IF ((N/2*2) /= N) THEN
MEDIAN=VAL((N+1)/2)
ELSE IF ((N/2*2) == N) THEN
EVEN= (VAL(N/2)+VAL((N+2)/2))
MEDIAN=EVEN/2
END IF
WRITE(*,*)'MEDIAN=', MEDIAN
!/----MEAN----/
SUM=0
DO I=1,N
SUM=SUM+VAL(I)
END DO
MEAN=SUM/N
WRITE(*,*)'MEAN=', MEAN
!/------MODE----/
FMODE=1
DO I=1,N-1
IF (VAL(I) == VAL(I+1)) THEN
FMODE=FMODE+1
END IF
END DO
WRITE(*,*)FMODE
END PROGRAM
The FILE.TXT contains
10 8 1 9 8 9 9 7 5 9 3 5 6

But how to do that? Or is there any intrinsic function in Fortran to calculate the number of occurrences of input values and the value with the highest occurrence?
No, there is not. You'll have to calculate the mode by hand.
The following code should work (on a sorted array). COUNT and CURRENTCOUNT must be declared as INTEGER alongside the other variables:
FMODE = VAL(1)
COUNT = 1
CURRENTCOUNT = 1
DO I = 2, N
! We are going through the loop looking for values == VAL(I-1)...
IF (VAL(I) == VAL(I-1)) THEN
! We spotted another VAL(I-1), so increment the count.
CURRENTCOUNT = CURRENTCOUNT + 1
ELSE
! There are no more VAL(I-1)
IF (CURRENTCOUNT > COUNT) THEN
! There were more elements of value VAL(I-1) than of value FMODE
COUNT = CURRENTCOUNT
FMODE = VAL(I-1)
END IF
! Next we are looking for values == VAL(I), so far we have spotted one...
CURRENTCOUNT = 1
END IF
END DO
IF (CURRENTCOUNT > COUNT) THEN
! This means there are more elements of value VAL(N) than of value FMODE.
FMODE = VAL(N)
END IF
Explanation:
We keep the best-so-far mode in the FMODE variable, and the count of the FMODE in the COUNT variable. As we step through the array we count the number of hits that are equal to what we are looking at now, in the CURRENTCOUNT variable.
If the next item we look at is equal to the previous, we simply increment the CURRENTCOUNT. If it's different, then we need to reset the CURRENTCOUNT, because we will now count the number of duplications of the next element.
Before we reset the CURRENTCOUNT we check if it's bigger than the previous best result, and if it is, we overwrite the previous best result (the FMODE and COUNT variables) with the new best results (VAL(I-1) and CURRENTCOUNT), before we continue.
This check doesn't happen for the final run of equal values, because the loop ends first, so I inserted another check after the loop in case the most frequent element happens to be the final element of the array. In that case we overwrite FMODE, like we would have done in the loop.
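As a self-contained illustration of the run-counting idea described above, here is a minimal sketch. It is a variant rather than the exact code from the answer: it compares the current run length against the best one after every element, so no extra check is needed after the loop. The array holds the thirteen values shown in FILE.TXT, assumed to be sorted already, and uses integers instead of the REAL array of the question; the names best and current are illustrative.

```fortran
program mode_sorted_demo
  implicit none
  integer, parameter :: n = 13
  ! Values from FILE.TXT, assumed already sorted in ascending order.
  integer :: val(n) = [1, 3, 5, 5, 6, 7, 8, 8, 9, 9, 9, 9, 10]
  integer :: fmode, best, current, i

  fmode = val(1)    ! best-so-far mode
  best = 1          ! length of the longest run seen so far
  current = 1       ! length of the run that ends at val(i)
  do i = 2, n
    if (val(i) == val(i-1)) then
      current = current + 1   ! same value: the run grows
    else
      current = 1             ! new value: start a new run
    end if
    if (current > best) then  ! strictly longer run: new mode
      best = current
      fmode = val(i)
    end if
  end do
  print *, 'MODE =', fmode, ' OCCURRENCES =', best
end program mode_sorted_demo
```

For this data the program reports 9 as the mode with 4 occurrences. Because the comparison is strict, the smallest value wins when several values are tied.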

It is a bit lengthy, you could probably get rid of the optional argument, but there is an example provided here. They use the quick sort algorithm as implemented here.
Alternatively, you could use
integer function mode(arr) result(m)
implicit none
integer, dimension(:), intent(in) :: arr
! Local variables
integer, dimension(:), allocatable :: counts
integer :: i, astat
character(len=128) :: error_str
! Initialise array to count occurrences of each value.
allocate(counts(minval(arr):maxval(arr)), stat=astat, errmsg=error_str)
if (astat/=0) then
print'("Allocation of counts array failed.")'
print*, error_str
end if
counts = 0
! Loop over inputted array, counting occurrence of each value.
do i=1,size(arr)
counts(arr(i)) = counts(arr(i)) + 1
end do
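! Finally, find the mode. MAXLOC returns a position counted from 1,
! so shift it by the lower bound of counts to recover the actual value.
m = maxloc(counts, 1) + minval(arr) - 1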
end function mode
This doesn't require any sorting.

Related

How to obtain the smallest figure out of five figures using OO Fortran

How can I get the smallest digit of a given five-digit figure, e.g. 23764? How do I get 2 as the smallest?
Taking a figure such as 456879, whose smallest digit is 4, I implemented the following
program findsmallestFigure
implicit none
integer:: figure
integer:: c,smallest,largest
smallest = 9
largest = 0
figure = 23456
do while(figure .GT. 0 )
c = MOD(figure,10)
largest = max(c,largest)
smallest = min(c,smallest)
figure = figure/10
end do
print *,'The smallest digit is',smallest
end
How do I achieve the same result using Object Oriented approach in Fortran ?
Create a module with a user-defined type that contains all the results, and the subroutines to fill in the values
module numstat
! Holds the statistics of a figure
type stat
integer :: smallest, largest, count
end type
! Constructor from a figure. Invoke by 'stat(1234)'
interface stat
module procedure :: calc_stat
end interface
contains
! Fill statistics type from a figure
function calc_stat(fig) result(s)
integer, intent(in) :: fig
type(stat) :: s
integer :: digit, f
! make a copy of the figure because intent(in) arguments
! are immutable (cannot change).
f = fig
s%smallest = 9
s%largest = 0
s%count = 0
do while(f > 0 )
s%count = s%count + 1
digit = mod(f, 10)
s%largest = max(s%largest, digit)
s%smallest = min(s%smallest, digit)
f = f/10
end do
end function
end module
Then use the module in the main program
program SONumstat
use numstat
implicit none
type(stat) :: s
integer :: figure
figure = 23456
s = stat(figure)
print *,'The number of digits is ', s%count
print *,'The smallest digit is ',s%smallest
print *,'The largest digit is ',s%largest
end program SONumstat
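If "object oriented" is taken to mean a method call on an object, the same computation can also be written as a type-bound procedure (Fortran 2003). The following is only a sketch under that assumption; the names digit_stats, figure_t and figure_smallest are made up for this illustration and are not part of the answer above.

```fortran
module digit_stats
  implicit none
  private
  public :: figure_t

  ! A figure that can report its own smallest digit.
  type :: figure_t
    integer :: value = 0
  contains
    procedure :: smallest => figure_smallest
  end type figure_t

contains

  ! Smallest decimal digit of the stored value (its sign is ignored).
  pure function figure_smallest(self) result(s)
    class(figure_t), intent(in) :: self
    integer :: s, f
    f = abs(self%value)
    s = mod(f, 10)          ! start with the last digit
    f = f / 10
    do while (f > 0)
      s = min(s, mod(f, 10))
      f = f / 10
    end do
  end function figure_smallest

end module digit_stats

program oo_demo
  use digit_stats
  implicit none
  type(figure_t) :: fig

  fig = figure_t(23764)     ! default structure constructor
  print *, 'The smallest digit is', fig%smallest()
end program oo_demo
```

For 23764 this prints 2 as the smallest digit. A largest() binding could be added in the same way.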

A fortran equivalent to unique

I have found many questions that turn around this issue, but none that directly answer the question:
-in fortran, what are (a) the fastest (wall clock) and (b) the most elegant (concise and clear) way to eliminate duplicates from a list of integers
There has to be a better way than my feeble attempt:
Program unique
implicit none
! find "indices", the list of unique numbers in "list"
integer( kind = 4 ) :: kx, list(10)
integer( kind = 4 ),allocatable :: indices(:)
logical :: mask(10)
!!$ list=(/3,2,5,7,3,1,4,7,3,3/)
list=(/1,(kx,kx=1,9)/)
mask(1)=.true.
do kx=10,2,-1
mask(kx)= .not.(any(list(:kx-1)==list(kx)))
end do
indices=pack([(kx,kx=1,10)],mask)
print *,indices
End Program unique
My attempt expects the list to be ordered, but it would be better if that requirement were lifted
I just couldn't help myself, so I wrote up an answer you may enjoy. The following code will return an array of unique values in ascending order for an input array of unsorted integers. Note that the output results are the actual values, not just the indices.
program unique_sort
implicit none
integer :: i = 0, min_val, max_val
integer, dimension(10) :: val, unique
integer, dimension(:), allocatable :: final
val = [ 3,2,5,7,3,1,4,7,3,3 ]
min_val = minval(val)-1
max_val = maxval(val)
do while (min_val<max_val)
i = i+1
min_val = minval(val, mask=val>min_val)
unique(i) = min_val
enddo
allocate(final(i), source=unique(1:i)) !<-- Or, just use unique(1:i)
print "(10i5:)", final
end program unique_sort
! output: 1 2 3 4 5 7
See this gist for timing comparisons between (unique_sort) above, your example (unique_indices), and the example at Rosetta Code (remove_dups) as well as a couple of variations. I'd like to test @High Performance Mark's code but haven't yet.
Run program 1,000,000 times, 100 integers 0<=N<=50
- unique_sort t~2.1 sec input: unsorted, w/duplicates output: sorted unique values
- remove_dup t~1.4 input: unsorted, w/duplicates output: unsorted unique values
- unique_indices t~1.0 input: sorted, w/duplicates output: unsorted indices for unique values
- BONUS!(Python) t~4.1 input: unsorted, w/duplicates output: sorted unique values
Bottom line: on my machine (i7 8GB laptop) unique_indices is slightly faster than remove_dups. However, remove_dups does not require the input array to be pre-sorted, and actually returns the values rather than the indices (see the gist for a modified version of unique_indices that returns the values instead, which doesn't seem to slow it down much at all).
On the other hand, unique_sort takes around twice as long, but is designed to handle unsorted input, and also returns the values in sorted order, in 8 LOC (minus the var declarations). So that seems a fair trade-off. Anywho, I'm sure unique_sort can be optimized for greater speed using some sort of masking statement, but that's for another day.
Update
The timings shown above were obtained from a test program where each subroutine was placed in a module and executed via a procedure call. However, I found a surprisingly large improvement in performance when unique_sort was placed directly in the main program, completing in only ~0.08 sec for 1 million runs. A speedup of ~25x simply by not using a procedure seems strange to me - ordinarily, I assume that the compiler optimizes the cost of procedure calls away. For example, I found no difference in performance for remove_dup or unique_indices whether they were executed via a procedure or placed directly in the main program.
After @VladimirF pointed out that I was overcomparing, I found I could vectorize my original code (remove the do loop do kx....). I have coupled the "unique" function with a mergesort algorithm loosely based on Wikipedia. The guts are contained in module SortUnique
Module SortUnique
contains
Recursive Subroutine MergeSort(temp, Begin, Finish, list)
! 1st 3 arguments are input, 4th is output sorted list
implicit none
integer(kind=4),intent(inout) :: Begin,list(:),temp(:)
integer(kind=4),intent(in) :: Finish
integer(kind=4) :: Middle
if (Finish-Begin<2) then !if run size =1
return !it is sorted
else
! split longer runs into halves
Middle = (Finish+Begin)/2
! recursively sort both halves from list into temp
call MergeSort(list, Begin, Middle, temp)
call MergeSort(list, Middle, Finish, temp)
! merge sorted runs from temp into list
call Merge(temp, Begin, Middle, Finish, list)
endif
End Subroutine MergeSort
Subroutine Merge(list, Begin, Middle, Finish, temp)
implicit none
integer(kind=4),intent(inout) :: list(:),temp(:)
integer(kind=4),intent(in) ::Begin,Middle,Finish
integer(kind=4) :: kx,ky,kz
ky=Begin
kz=Middle
!! While there are elements in the left or right runs...
do kx=Begin,Finish-1
!! If left run head exists and is <= existing right run head.
if (ky.lt.Middle.and.(kz.ge.Finish.or.list(ky).le.list(kz))) then
temp(kx)=list(ky)
ky=ky+1
else
temp(kx)=list(kz)
kz = kz + 1
end if
end do
End Subroutine Merge
Function Unique(list)
!! usage sortedlist=Unique(list)
implicit none
integer(kind=4) :: strt,fin,N
integer(kind=4), intent(inout) :: list(:)
integer(kind=4), allocatable :: unique(:),work(:)
logical,allocatable :: mask(:)
! sort
work=list;strt=1;N=size(list);fin=N+1
call MergeSort(work,strt,fin,list)
! cull duplicate indices
allocate(mask(N));
mask=.false.
mask(1:N-1)=list(1:N-1)==list(2:N)
unique=pack(list,.not.mask)
End Function Unique
End Module SortUnique
Program TestUnique
use SortUnique
implicit none
! find "indices", the list of unique numbers in "list"
integer (kind=4),allocatable :: list(:),newlist(:)
integer (kind=4) :: kx,N=100000 !N even
real (kind=4) :: start,finish,myrandom
allocate(list(N))
do kx=1,N
call random_number(myrandom)
list(kx)=ifix(float(N)/2.*myrandom)
end do
call cpu_time(start)
newlist=unique(list)
call cpu_time(finish)
print *,"cull duplicates: ",finish-start
print *,"size(newlist) ",size(newlist)
End Program TestUnique
At @HighPerformanceMark's suggestion, the function is simply invoked as newlist=unique(list). The above is certainly not concise, but it seems clear, and it is about 200 times faster than either my original or the other solutions proposed.

Do-loop ignores if-statement

I'm trying to use an if statement in a do loop which is supposed to generate prime numbers. For that I used modulo to sort out the numbers. After it found a prime number I want it to go a step further and add 1 so that the next prime number can be found and added to the array pzahl. My problem is that the loop seems to ignore that it should go a step further with plauf after it found a prime number so that it just keeps going till infinity... I tried to rearrange the contents of the loop and if statement but it's just not working. Here is the code:
PROGRAM Primzahlen
IMPLICIT NONE
INTEGER :: start, plauf, n, a
INTEGER, ALLOCATABLE, DIMENSION(:) :: pzahlen !array into which the prime numbers should be added
INTEGER :: input
INTEGER, DIMENSION(:), ALLOCATABLE :: alle
PRINT *, "How many prime numbers should be listed"
READ (*,*) input
ALLOCATE (pzahlen(input))
pzahlen(1) = 1
start = 2
plauf = 1
loop1: DO
ALLOCATE(alle(start))
loop2: DO n = 1,start
alle(n)= MODULO(start,n)
END DO loop2
IF (minval(alle) /= 0) THEN ! This is what it seems to ignore.
plauf= plauf + 1
pzahlen(plauf) = start
PRINT *, plauf
END IF
start = start + 1
IF (plauf == input) then
EXIT
END IF
PRINT *, alle
DEALLOCATE(alle)
END DO loop1
PRINT *, "prime numbers:" , pzahlen(1:input)
END PROGRAM Primzahlen
I use the gfortran compiler and write it in Emacs if that helps to know.
It's not ignoring it, it executes correctly:
loop2: DO n = 1,start
alle(n)= MODULO(start,n)
END DO loop2
It doesn't matter what start is, alle(1) will always be zero, as every integer is evenly divisible by 1. That means that minval(alle) will also always be zero, which means that the condition minval(alle) /= 0 is never true, and the statement will never execute.
Added: The last value, alle(start), will also be zero, as every number is evenly divisible by itself.
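One way to act on this is to test only the divisors between 2 and start-1, so that 1 and the number itself never take part in the check. The sketch below is an illustration, not the asker's program: it uses a fixed count (wanted) instead of reading it from the keyboard, starts the list at 2, and all names are chosen for this example.

```fortran
program primes_sketch
  implicit none
  integer, parameter :: wanted = 10   ! illustrative fixed count instead of READ
  integer :: primes(wanted)
  integer :: candidate, found, n
  logical :: is_prime

  found = 0
  candidate = 2
  do while (found < wanted)
    is_prime = .true.
    ! Skip the divisors 1 and candidate: they always give remainder 0.
    do n = 2, candidate - 1
      if (modulo(candidate, n) == 0) then
        is_prime = .false.
        exit
      end if
    end do
    if (is_prime) then
      found = found + 1
      primes(found) = candidate
    end if
    candidate = candidate + 1
  end do
  print *, 'prime numbers:', primes
end program primes_sketch
```

This lists 2, 3, 5, 7, 11, 13, 17, 19, 23 and 29. For larger counts the inner loop could stop at the square root of candidate, but that is an optimisation rather than part of the fix.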

Count the numbers of equal rows in a file

Suppose you have a file.dat of the form:
1
1
1
2
2
3
3
3
3
...
I want to count how many equal numbers there are and save them iteratively in a string. For instance:
m = 3 (times 1),
m = 2 (times 2),
m = 4 (times 3).
I put here my code:
program sele
implicit none
integer::j,k,s,n,l,r,m
real*8,allocatable::ID(:)
real*8:: j_r8,i_r8
open(10,file='data.dat')
n=0
DO
READ(10,*,END=100)
n=n+1
END DO
100 continue
rewind(10)
allocate(ID(n))
s=0
do s=1, n
read(10,*) ID(s)
end do
do r=1,n-1
if (ID(r)-ID(r+1) .EQ. 0) then
m = m + 1
print*, m
end if
end do
end program
The last do is the condition I'd like to expand, with something like:
if (condition is true) then
save an index of the number of equal digits
use this to do some operations:
do i = 1, number of equal digits
if (condition is not true) then
restart with the other digits.
If the values you want to read are integer values in a given limited range (for instance from 1 to 100), then the simplest way is the following :
program sele
implicit none
integer, parameter :: vmin=1
integer, parameter :: vmax=100
integer :: list(vmin:vmax)
integer :: value,i
open(10,file='data.dat')
list=0
do
read(10,*,end=10) value
if(value < vmin .OR. value > vmax) then
write(*,*) 'invalid value ',value
stop
endif
list(value)=list(value)+1
enddo
10 continue
do i=vmin,vmax
if(list(i) > 0) then
write(*,*) list(i),' times ',i
endif
enddo
end program
Which gives on your example :
3 times 1
2 times 2
4 times 3
That program can easily be improved to handle variable vmin and vmax (the array list must then be declared allocatable and allocated at the right size).
If the range is too large, a simple array is no longer appropriate and the right algorithm becomes more complicated: it must avoid storing unused values.
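When equal values are always adjacent, as in the example file, the counting array can be avoided altogether by counting runs. The following is a minimal sketch under that assumption; the data is hard-coded instead of being read from data.dat, and the variable names are illustrative.

```fortran
program count_runs
  implicit none
  integer, parameter :: n = 9
  ! Stand-in for the file contents; equal values are assumed to be adjacent.
  integer :: id(n) = [1, 1, 1, 2, 2, 3, 3, 3, 3]
  integer :: r, m

  m = 1                        ! length of the current run
  do r = 2, n
    if (id(r) == id(r-1)) then
      m = m + 1                ! same value: the run continues
    else
      write(*,*) m, ' times ', id(r-1)   ! run finished: report it
      m = 1                    ! start counting the next run
    end if
  end do
  write(*,*) m, ' times ', id(n)         ! report the final run
end program count_runs
```

It reports 3 times 1, 2 times 2 and 4 times 3 for the example data. At the point where a run is reported, m holds the number of equal values, so any per-group work could be done there.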

Fortran 90 - segmentation fault

I am making a program that converts a decimal integer into its binary representation. Here is my code:
program test
implicit none
integer, dimension(:), allocatable :: binary
integer :: decimalnum, i, initvalue
print*, "Enter decimal number to convert: "
read*,initvalue
decimalnum = initvalue
i = 0
do while (decimalnum > 0)
if (MOD(decimalnum,2)==0) then
binary(i) = 0 ! this is as far as the program executes up to
decimalnum = decimalnum / 2
i = i + 1
else if (MOD(decimalnum,2)==1) then
binary(i) = 1
decimalnum = (decimalnum -1) / 2
i = i + 1
end if
end do
end program test
At the marked point, it returns the error Segmentation fault and exits with code 139.
Why does this happen?
Thanks in advance.
Here's a simple way to convert an integer i to its binary representation:
write(*,'(b16)') i
As written, this won't write any leading 0s. If you want the leading 0s, try
write(*,'(b16.16)') i
Of course, the preceding code writes the binary representation to the default output unit but using Fortran's internal write capabilities I could just as easily write the bits to a character variable. For example:
character(len=16) :: bits
...
write(bits,'(b16.16)') i
writes the binary digits of i into the character variable bits.
Now, if what you really want is to create an array of integers each representing one bit of the binary representation, then something like this
integer, dimension(16) :: bitarray
...
bitarray = 0
...
do ix = 1,len(bits)
if (bits(ix:ix)=='1') bitarray(ix) = 1
end do
would probably work.
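Put together, the fragments above could look like the following sketch. It assumes 0 <= i < 2**16 so that the value fits the b16.16 edit descriptor (larger values would print asterisks); the value 10 is only an example.

```fortran
program bits_demo
  implicit none
  integer :: i, ix
  character(len=16) :: bits
  integer, dimension(16) :: bitarray

  i = 10                          ! example value, assumed to fit in 16 bits
  write(bits, '(b16.16)') i       ! internal write: binary digits with leading zeros

  bitarray = 0
  do ix = 1, len(bits)
    if (bits(ix:ix) == '1') bitarray(ix) = 1
  end do

  print '(a)', bits
  print '(16i1)', bitarray        ! bitarray(1) is the most significant bit
end program bits_demo
```

Both lines print 0000000000001010 for the value 10.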
1) Your crash occurs because the array binary(:) is never allocated before the loop assigns to binary(i) (and i starts at 0, whereas Fortran arrays start at index 1 by default), so the very first assignment already accesses memory that does not belong to the array (crash).
2) Fortran has a number of intrinsic functions that deal with bits directly. For example,
a) Bit_Size(var) returns the number of bits in "var", so if you must use an allocatable, now you know the array size required in advance.
b) BTest(iVar, pos) returns .True. if the bit at pos in iVar is 1
For example, using the other declarations above:
Integer :: n
n = Bit_Size(decimalnum)
If( Allocated(Binary) ) DeAllocate(Binary) ! good practice
Allocate( Binary(1:n) ) ! in general, should get Stat, just in case
Binary(:) = 0
ForAll(i=1:n, BTest( decimalnum, i-1 ) ) ! remember, bit intrinsics index from 0; the mask is given without a keyword
Binary(i) = 1
End ForAll
... this is a little more efficient compared to Do and While, and may help (a little) towards smp. The Where/EndWhere construct can be used also, but I find ForAll's a little more efficient.
c) IBits(var, pos, len) extracts the bits from var starting at pos for len number of bits, for example if you wanted to create an "explicit" binary representation, that might be one way to go.
etc etc
3) If you "really mean" convert DecimalNum to Bin, then you have (substantial) additional problems, if Dec also includes floating point Dec (i.e. Reals), since the bit representation of Reals is in terms of exponents. I will assume that is not the case, as the code/explanation for that is much much more involved.
Finally, in Fortran, Nums are usually "signed" Nums and the leading bit is used to determine +ve (0) or -ve (1). So if you were going in the "other" direction (Bin2Dec), then you would prefer an additional arg (perhaps optional) that controls whether the result is signed or unsigned. If unsigned, then the output var will need to be "bigger" compared to the input var (e.g. if converting an unsigned 1-byte int to a Fortran int, then you must use at least a 2-byte int, i.e. input Integer(1) must be output to an Integer(2)) etc.
As per the comments, you need to have executed an allocate statement (or something which does an allocation for you under the covers) before you can define the binary array. The simplest form of allocation statement would look something like ALLOCATE(binary(10)), which would give the binary array 10 elements, using the default (it can be changed for that array using the allocate statement) starting array index of 1.
Where the size of the allocation is not easily known before working with an array there are two basic approaches:
Do two passes, the first pass of which simply counts how many elements are required, then the array is allocated, then the second pass actually does the assignment to the relevant elements.
Allocate the array to an initial size (which may be zero), then progressively grow the array as required.
There are trade-offs associated with the decision around the approach to use associated with the relative overheads of things like allocation and the evaluation of each test when counting.
In Fortran 90 (time to move on to at least Fortran 95!), growing an allocatable array is somewhat convoluted (allocate a temporary, copy data from original to temporary, deallocate original, allocate original to new size, copy data from temporary back to resized original, deallocate temporary). In Fortran 2003 this operation becomes trivial.
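To illustrate the Fortran 2003 route, here is a minimal sketch that grows the array one element at a time with the MOVE_ALLOC intrinsic, which hands the allocation of the temporary over to the original array without a second copy. The helper name append and the fixed input value 10 are assumptions made for this example.

```fortran
program grow_demo
  implicit none
  integer, allocatable :: binary(:)
  integer :: decimalnum

  decimalnum = 10                ! example input instead of READ
  allocate(binary(0))            ! start with a zero-size array
  do while (decimalnum > 0)
    call append(binary, mod(decimalnum, 2))
    decimalnum = decimalnum / 2
  end do
  print *, binary                ! least significant bit first

contains

  ! Grow arr by one element and store val in the new last position.
  subroutine append(arr, val)
    integer, allocatable, intent(inout) :: arr(:)
    integer, intent(in) :: val
    integer, allocatable :: tmp(:)

    allocate(tmp(size(arr) + 1))
    tmp(1:size(arr)) = arr       ! copy the existing elements once
    tmp(size(arr) + 1) = val
    call move_alloc(tmp, arr)    ! arr takes over tmp's storage; tmp becomes deallocated
  end subroutine append

end program grow_demo
```

For the value 10 this prints 0 1 0 1. With automatic reallocation on assignment, also a Fortran 2003 feature, the body of append could be reduced to arr = [arr, val], at the cost of leaving the copying strategy to the compiler.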
So this is probably awful form, and certainly bad runtime (it copies the array twice every time it grows), but here's what I came up with. It seems to work.
program test
implicit none
integer, dimension(:), allocatable :: binary
integer :: decimalnum, i, initvalue, curSize, curBit
print*, "Enter decimal number to convert: "
read*,initvalue
decimalnum = initvalue
i = 1
ALLOCATE ( binary(1) )
curSize = 1
DO WHILE (decimalnum > 0)
IF (i > curSize ) THEN
curSize = curSize * 2
CALL expandArray( curSize, i-1 )
END IF
IF (MOD(decimalnum,2)==0) then
binary(i) = 0 ! this is as far as the program executes up to
decimalnum = decimalnum / 2
i = i + 1
ELSE IF (MOD(decimalnum,2)==1) then
binary(i) = 1
decimalnum = (decimalnum -1) / 2
i = i + 1
END IF
end do
PRINT*, binary
CONTAINS
SUBROUTINE expandArray( newSize, oldSize )
IMPLICIT NONE
INTEGER, DIMENSION(:), ALLOCATABLE :: temp
INTEGER :: j, newSize, oldSize
ALLOCATE( temp(newSize) )
DO j=1,oldSize
temp(j) = binary(j)
END DO
DEALLOCATE (binary)
ALLOCATE( binary(newSize) )
DO j=1,oldSize
binary(j) = temp(j)
END DO
DO j=oldSize+1,newSize
binary(j) = 0
END DO
DEALLOCATE (temp)
END SUBROUTINE
END PROGRAM test