How to efficiently read a line of integers in OCaml
I'd like to efficiently read a large line (~4000 characters) from stdin. I'll have to read ~4000 lines as well.
The line is formatted as follows:
INTEGERwhitespaceINTEGERwhitespace....
For example, 100 33 22 19 485 3601...
Afterwards, the data needs to be processed, so the initial solution I used with read_line() |> String.split_on_char ' ' |> ... was too slow: it makes roughly three passes over the input (O(3n)).
I want to use something like Scanf:
bscanf ic "%d" int_of_string
But I'm not sure how to account for the whitespaces, or if it's fast enough. Are there any solutions for this?
I created a file with 10000 lines of 4000 random integers.
I then wrote these 4 main functions (read_int is an auxiliary one) that have the same output:
let read_int ic =
  let rec aux acc =
    match input_char ic with
    | ' ' | '\n' -> acc
    | c -> aux ((10 * acc) + (Char.code c - 48))
  in
  aux 0

let read_test_int () =
  let ic = open_in "test" in
  let max = ref 0 in
  try
    while true do
      read_int ic |> fun e -> if e > !max then max := e
    done
  with End_of_file ->
    close_in ic;
    Format.eprintf "%d@." !max

let read_test_line () =
  let ic = open_in "test" in
  let max = ref 0 in
  try
    while true do
      input_line ic |> String.split_on_char ' '
      |> List.iter (fun e ->
             let e = int_of_string e in
             if e > !max then max := e)
    done
  with End_of_file ->
    close_in ic;
    Format.eprintf "%d@." !max

let read_test_line_map () =
  let ic = open_in "test" in
  let max = ref 0 in
  try
    while true do
      input_line ic |> String.split_on_char ' ' |> List.map int_of_string
      |> List.iter (fun e -> if e > !max then max := e)
    done
  with End_of_file ->
    close_in ic;
    Format.eprintf "%d@." !max

let read_test_scanf () =
  let ic = Scanf.Scanning.open_in "test" in
  let max = ref 0 in
  try
    while true do
      Scanf.bscanf ic "%d " (fun i -> i) |> fun e -> if e > !max then max := e
    done
  with End_of_file ->
    Scanf.Scanning.close_in ic;
    Format.eprintf "%d@." !max
read_test_int creates an integer by reading characters one by one
read_test_line is your initial solution
read_test_line_map is your initial solution with a mapping from string to int
read_test_scanf is the solution you'd like to test
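On the whitespace part of the question: in a Scanf format string, a space matches any amount of whitespace in the input (spaces, tabs, newlines), so " %d" or "%d " is enough to step over the separators. Here is a minimal sketch of that behaviour. It scans from a string instead of the "test" file so that it is self-contained, and max_of_string is a hypothetical name used only for this illustration.

```ocaml
(* Hypothetical helper: largest non-negative integer in a
   whitespace-separated string, or 0 if there is none. *)
let max_of_string s =
  let ib = Scanf.Scanning.from_string s in
  let max = ref 0 in
  (try
     while true do
       (* The leading space in the format skips any run of whitespace,
          newlines included, before the next integer. *)
       Scanf.bscanf ib " %d" (fun i -> if i > !max then max := i)
     done
   with End_of_file -> ());
  !max
```

This only shows how the format handles separators; it says nothing about speed.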
I then tested the four of them with hyperfine and here are the outputs:
hyperfine --warmup 3 -P arg 1 4 'dune exec program -- {arg}'
read_int
Benchmark #1: dune exec program -- 1
Time (mean ± σ): 1.509 s ± 0.072 s [User: 1.460 s, System: 0.049 s]
Range (min … max): 1.436 s … 1.618 s 10 runs
read_line
Benchmark #2: dune exec program -- 2
Time (mean ± σ): 1.818 s ± 0.016 s [User: 1.717 s, System: 0.100 s]
Range (min … max): 1.794 s … 1.853 s 10 runs
read_line_map
Benchmark #4: dune exec program -- 4
Time (mean ± σ): 2.158 s ± 0.127 s [User: 2.108 s, System: 0.050 s]
Range (min … max): 2.054 s … 2.482 s 10 runs
read_scanf
Benchmark #3: dune exec program -- 3
Time (mean ± σ): 5.017 s ± 0.103 s [User: 4.957 s, System: 0.060 s]
Range (min … max): 4.893 s … 5.199 s 10 runs
It looks like my own implementation, read_int, is the fastest. input_line is only slightly slower, even though it first creates a string, then goes through it once to split it, then goes through the resulting list to convert the integers. scanf is sadly always the slowest. The difference only becomes visible with inputs of this size (10000 lines of 4000 integers); for 4000 lines of 4000 characters I couldn't find any real difference.
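If you want to keep input_line but avoid the split and the intermediate list, one option is to walk over the string once and build the integers as you go. The following is only a sketch under the same assumptions as read_int (non-negative decimal integers, anything else treated as a separator). fold_ints_in_line is a hypothetical name, and I have not benchmarked it.

```ocaml
(* Fold [f] over the non-negative decimal integers found in [line],
   scanning the string once and building no intermediate list. *)
let fold_ints_in_line f init line =
  let acc = ref init in
  let cur = ref 0 in
  let in_num = ref false in
  String.iter
    (fun c ->
      match c with
      | '0' .. '9' ->
        cur := (10 * !cur) + (Char.code c - Char.code '0');
        in_num := true
      | _ ->
        (* Any other character ends the current integer, if there is one. *)
        if !in_num then begin
          acc := f !acc !cur;
          cur := 0;
          in_num := false
        end)
    line;
  (* Flush the last integer when the line does not end with a separator. *)
  if !in_num then acc := f !acc !cur;
  !acc
```

For the max computation used in the benchmarks, the loop body would become something like max := fold_ints_in_line Stdlib.max !max (input_line ic).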
Hyperfine gives the following summary:
Summary
'dune exec program -- 1' ran
1.20 ± 0.06 times faster than 'dune exec program -- 2'
1.43 ± 0.11 times faster than 'dune exec program -- 4'
3.33 ± 0.17 times faster than 'dune exec program -- 3'
[EDIT]
I created two new benchs using OCamllex:
lexer.mll
let digit = ['0'-'9']
rule integers = parse
| ' ' | '\n' { integers lexbuf }
| digit+ as inum { int_of_string inum }
| _ { failwith "not a digit or a space" }
| eof { raise End_of_file }
and
lexer_list.mll
{ let l = ref [] }
let digit = ['0'-'9']
rule integers = parse
| ' ' | '\n' { integers lexbuf }
| digit+ as inum { l := int_of_string inum :: !l; integers lexbuf }
| _ { failwith "not a digit or a space" }
| eof { !l }
Rerunning the benchmarks gives the following results:
❯ hyperfine --warmup 3 -P arg 1 6 'dune exec program -- {arg}'
Benchmark #1: dune exec program -- 1
Time (mean ± σ): 1.394 s ± 0.044 s [User: 1.358 s, System: 0.036 s]
Range (min … max): 1.360 s … 1.483 s 10 runs
Benchmark #2: dune exec program -- 2
Time (mean ± σ): 1.674 s ± 0.011 s [User: 1.590 s, System: 0.084 s]
Range (min … max): 1.657 s … 1.692 s 10 runs
Benchmark #3: dune exec program -- 3
Time (mean ± σ): 4.886 s ± 0.304 s [User: 4.847 s, System: 0.037 s]
Range (min … max): 4.627 s … 5.460 s 10 runs
Benchmark #4: dune exec program -- 4
Time (mean ± σ): 1.949 s ± 0.023 s [User: 1.908 s, System: 0.041 s]
Range (min … max): 1.925 s … 1.984 s 10 runs
Benchmark #5: dune exec program -- 5
Time (mean ± σ): 2.824 s ± 0.013 s [User: 2.784 s, System: 0.039 s]
Range (min … max): 2.798 s … 2.843 s 10 runs
Benchmark #6: dune exec program -- 6
Time (mean ± σ): 5.832 s ± 0.074 s [User: 5.493 s, System: 0.333 s]
Range (min … max): 5.742 s … 5.981 s 10 runs
Summary
'dune exec program -- 1' ran
1.20 ± 0.04 times faster than 'dune exec program -- 2'
1.40 ± 0.05 times faster than 'dune exec program -- 4'
2.03 ± 0.07 times faster than 'dune exec program -- 5'
3.51 ± 0.24 times faster than 'dune exec program -- 3'
4.18 ± 0.14 times faster than 'dune exec program -- 6'
Creating a list before iterating over it is the worst possible solution (even worse than scanf, imagine!), while lexing is not that bad (but not that good either).
So, to summarise, the solutions from best to worst are:
custom read int
read line
read line with a mapping
lexing int by int
scanf
lexing the whole file to a list of int
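One limitation of read_int as written is that an integer at the very end of the file is dropped if no space or newline follows it, because End_of_file is raised before acc is returned. A possible variant, sketched below and not part of the benchmarks above, reads the channel in fixed-size chunks with input and flushes the pending integer at end of file. The name fold_ints and the 64 KiB buffer size are assumptions made for the example.

```ocaml
(* Fold [f] over every non-negative decimal integer read from [ic].
   The channel is consumed in chunks; any non-digit is a separator. *)
let fold_ints f init ic =
  let buf = Bytes.create 65536 in
  let acc = ref init in
  let cur = ref 0 in
  let in_num = ref false in
  let rec loop () =
    (* [input] returns 0 only at end of file. *)
    let n = input ic buf 0 (Bytes.length buf) in
    if n > 0 then begin
      for i = 0 to n - 1 do
        match Bytes.get buf i with
        | '0' .. '9' as c ->
          cur := (10 * !cur) + (Char.code c - Char.code '0');
          in_num := true
        | _ ->
          if !in_num then begin
            acc := f !acc !cur;
            cur := 0;
            in_num := false
          end
      done;
      loop ()
    end
  in
  loop ();
  (* Flush an integer that ends exactly at end of file. *)
  if !in_num then acc := f !acc !cur;
  !acc
```

With this, the max would be computed as fold_ints Stdlib.max 0 ic. Whether it beats read_int would have to be measured the same way as the other candidates.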
[Benching with memtrace]
This made me realise something, by the way, in case you ever read this:
If you're trying to benchmark your solutions, never leave memtrace in your code. I was trying something and had Memtrace.trace_if_requested (); at the start of my entry point. Well, it just messes with everything and the benchmarks were completely wrong:
❯ hyperfine --warmup 3 -P arg 1 6 'dune exec program -- {arg}'
Benchmark #1: dune exec program -- 1
Time (mean ± σ): 7.003 s ± 0.201 s [User: 6.959 s, System: 0.043 s]
Range (min … max): 6.833 s … 7.420 s 10 runs
Benchmark #2: dune exec program -- 2
Time (mean ± σ): 1.801 s ± 0.060 s [User: 1.697 s, System: 0.104 s]
Range (min … max): 1.729 s … 1.883 s 10 runs
Benchmark #3: dune exec program -- 3
Time (mean ± σ): 4.817 s ± 0.120 s [User: 4.757 s, System: 0.058 s]
Range (min … max): 4.679 s … 5.068 s 10 runs
Benchmark #4: dune exec program -- 4
Time (mean ± σ): 2.028 s ± 0.023 s [User: 1.994 s, System: 0.032 s]
Range (min … max): 1.993 s … 2.071 s 10 runs
Benchmark #5: dune exec program -- 5
Time (mean ± σ): 2.997 s ± 0.108 s [User: 2.948 s, System: 0.046 s]
Range (min … max): 2.889 s … 3.191 s 10 runs
Benchmark #6: dune exec program -- 6
Time (mean ± σ): 6.109 s ± 0.161 s [User: 5.753 s, System: 0.349 s]
Range (min … max): 5.859 s … 6.322 s 10 runs
Summary
'dune exec program -- 2' ran
1.13 ± 0.04 times faster than 'dune exec program -- 4'
1.66 ± 0.08 times faster than 'dune exec program -- 5'
2.67 ± 0.11 times faster than 'dune exec program -- 3'
3.39 ± 0.14 times faster than 'dune exec program -- 6'
3.89 ± 0.17 times faster than 'dune exec program -- 1'
My understanding is that memtrace is able to do a lot of work on my custom solution since the whole code is directly available, whereas for the rest it can only scratch the surface (I may be completely wrong, but it took me some time to figure out that memtrace was spoiling my benchmarks).
[Following @ivg's comment]
lexer_parser.mll
{
open Parser
}
let digit = ['0'-'9']
rule integers = parse
| ' ' | '\n' { integers lexbuf }
| digit+ as inum { INT (int_of_string inum) }
| _ { failwith "not a digit or a space" }
| eof { raise End_of_file }
and parser.mly
%token <int> INT
%start main /* the entry point */
%type <int> main
%%
main:
| INT { $1 }
;
and in main.ml
let read_test_lexer_parser () =
  let ic = open_in "test" in
  let lexbuf = Lexing.from_channel ic in
  let max = ref 0 in
  try
    while true do
      let result = Parser.main Lexer_parser.integers lexbuf in
      if result > !max then max := result
    done
  with End_of_file ->
    close_in ic;
    Format.eprintf "%d@." !max
(I cut some benchs)
❯ hyperfine --warmup 3 -P arg 1 7 'dune exec program -- {arg}'
Benchmark #1: dune exec program -- 1
Time (mean ± σ): 1.357 s ± 0.030 s [User: 1.316 s, System: 0.041 s]
Range (min … max): 1.333 s … 1.431 s 10 runs
Benchmark #6: dune exec program -- 6
Time (mean ± σ): 5.745 s ± 0.289 s [User: 5.230 s, System: 0.513 s]
Range (min … max): 5.549 s … 6.374 s 10 runs
Benchmark #7: dune exec program -- 7
Time (mean ± σ): 7.195 s ± 0.049 s [User: 7.161 s, System: 0.034 s]
Range (min … max): 7.148 s … 7.300 s 10 runs
Summary
'dune exec program -- 1' ran
4.23 ± 0.23 times faster than 'dune exec program -- 6'
5.30 ± 0.12 times faster than 'dune exec program -- 7'
I may not have done it properly, hence the poor result, but this doesn't seem promising. I want to handle each value as soon as it is read; otherwise I'd have to build a list of values, which is even worse (believe me, I tried: it took 30 seconds to find the max value).
My dune file, in case you're wondering, looks like this (I have an empty program.opam file to please dune):
(executable
 (name main)
 (public_name program))
(ocamllex lexer)
(ocamllex lexer_list)
(ocamllex lexer_parser)
(ocamlyacc parser)
Related
Unhandled Exception with OCaml 5.0.0~beta1
I'm inconsistently getting this error in a first experiment with OCaml 5.0.0~beta1:
Fatal error: exception Stdlib.Effect.Unhandled(Domainslib__Task.Wait(_, _))
My setup:
Processor: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Debian 10 (buster)
opam version 2.1.3 installed as binary from this script
opam switch: "→ 5.0.0~beta1 ocaml-base-compiler.5.0.0~beta1 5.0.0~beta1"
After a quick read of this tutorial, I copied the parallel_matrix_multiply function and added some code at the end just to use it:
open Domainslib

let parallel_matrix_multiply pool a b =
  let i_n = Array.length a in
  let j_n = Array.length b.(0) in
  let k_n = Array.length b in
  let res = Array.make_matrix i_n j_n 0 in
  Task.parallel_for pool ~start:0 ~finish:(i_n - 1) ~body:(fun i ->
    for j = 0 to j_n - 1 do
      for k = 0 to k_n - 1 do
        res.(i).(j) <- res.(i).(j) + a.(i).(k) * b.(k).(j)
      done
    done);
  res
;;

let pool = Task.setup_pool ~num_domains:3 () in
let a = Array.make_matrix 2 2 1 in
let b = Array.make_matrix 2 2 2 in
let c = parallel_matrix_multiply pool a b in
for i = 0 to 1 do
  for j = 0 to 1 do
    Printf.printf "%d " c.(i).(j)
  done;
  print_char '\n'
done;;
I then compile it with no errors with
ocamlfind ocamlopt -linkpkg -package domainslib parallel_for.ml
and then comes the problem: executing the generated a.out file sometimes (rarely) prints the expected output
4 4
4 4
but usually ends with the error mentioned earlier:
Fatal error: exception Stdlib.Effect.Unhandled(Domainslib__Task.Wait(_, _))
Sorry if I am making some trivial mistake, but I can't understand what is going on, especially given that the error happens inconsistently.
The parallel_matrix_multiply computation is running outside of the Domainslib scheduler, thus whenever a task yields to the scheduler, the Wait effect is unhandled and transformed into an Effect.Unhandled exception. The solution is to run the parallel computation within Task.run:
...
let c = Task.run pool (fun () -> parallel_matrix_multiply pool a b) in
...
Fortran runtime error for an input.dat file
Similar question was asked before but my problem is different: I've been trying an old fortran code to execute with gfortran on my mac. The input file is not working for some reason - I don't know whether it's a shortcoming of the code or the input file. The source code and the input file are on the same directory. Here's the code: C***************** M.R.T.M ********************************************* C IMPLICIT REAL*8(A-H,O-Z) CHARACTER*64 FNAMEI,FNAMEO COMMON/L1/ C(101),DC(101),DU(101),DL(101),E(101),S1(101),S2(101) COMMON/L2/ SIR(101),CX(101),S1X(101),S2X(101) COMMON/L3/ X(101),S3(101),S3X(101) COMMON/L4/ TH,ROU,COL,WFLX,CI,CS,D,K1,K2,W,K3,K4,U,KS,K5,K6,KD COMMON/L5/ NEQ,IT,N,NM1,NP1 COMMON/L6/ TPULSE,TTOTAL,TPRINT,DT,DX,GAMMA,BETA CHARACTER*64 USER,SOIL,SOLUTE,DATE REAL*8 K1,K2,K3,K4,K5,K6,KS,KD,NEQ C C C------ READ INPUT PARAMETERS---------------- C WRITE(*,*) 'PLEASE ENTER USER NAME (OPTIONAL):' READ(*,800) USER WRITE(*,*) ' PLEASE ENTER NAME OF SOIL (OPTIONAL):' READ(*,800) SOIL WRITE(*,*) ' PLEASE ENTER NAME OF SOLUTE (OPTIONAL):' READ(*,800) SOLUTE WRITE(*,*) ' ENTER DATE OR OTHER IDENTIFICATION (OPTIONAL):' READ(*,800) DATE WRITE(*,*) ' ' WRITE(*,*) $'--------- INPUT PARAMETERS SECTION -------------' WRITE(*,*) ' ' WRITE(*,*) ' INPUT PARAMETERS CAN BE PROVIDED IN TWO WAYS; ' WRITE(*,*) ' ENTER 1 if you wish to enter the input data using' WRITE(*,*) ' the keyboard (i.e. interactively) ' WRITE(*,*) ' ' WRITE(*,*) ' OR ' WRITE(*,*) ' ' WRITE(*,*) ' ENTER 2 if an input data file is to be provided ' WRITE(*,*) $' PLEASE ENTER EITHER 1 OR 2' READ(*,950) IFLAG IF(IFLAG.NE.1) THEN WRITE(*,'(A)') ' PLEASE ENTER NAME OF INPUT FILE?' 
WRITE(*,*) '(for example A:XX.DAT or C:UU.DAT for hard disk)' READ(*,'(A)') FNAMEI OPEN(5,FILE=FNAMEI) C C READ(5,700) TH,ROU,COL,WFLX READ(5,700) CI,CS,D READ(5,700) KD,NEQ READ(5,700) K1,K2,W READ(5,700) K3,K4,U READ(5,700) KS READ(5,700) K5,K6 READ(5,750) IT READ(5,700) TPULSE,TTOTAL,TPRINT,DT,DX ELSE C WRITE(*,*) $'PLEASE ENTER THE FOLLOWING INPUT PARAMETERS :' WRITE(*,*) ' ' WRITE(*,*) $' (1) MOISTURE CONTENT, CM3/CM3 (TH) =' WRITE(*,*) $' (Values usually less than 0.65 cm3/cm3). Enter your value NOW' READ(*,900) TH WRITE(*,*) $' (2) BULK DENSITY, G/CM3 (ROU) =' WRITE(*,*) $' (Range of values 1.1 - 1.7 g/cm3). Enter your value NOW' READ(*,900) ROU WRITE(*,*) $' (3) PROFILE OR SOIL COLUMN LENGTH, CM (COL) =' READ(*,900) COL WRITE(*,*) $' (4) WATER FLUX, CM/HOUR (WFLX) =' WRITE(*,*) $'(Range of values 0.01 - 5 cm/hr). Enter your value NOW' READ(*,900) WFLX WRITE(*,*) $' (5) INITIAL CONCENTRATION, MG/L (CI)=' READ(*,900) CI WRITE(*,*) $' (6) APPLIED CONCENTRATION, MG/L (CS)=' READ(*,900) CS WRITE(*,*) $' (7) DISPERSION COEFFICIENT,D, CM2/HOUR (D) =' WRITE(*,*) $'(Range of values 0.1 - 1.5 cm2/hour). Enter your value NOW' READ(*,900) D WRITE(*,*) $' (8) DISTRIBUTION COEFFICIENT, KD (KD) =' WRITE(*,*) $' (Range of values 0 - 300 cm3/g) Enter your value NOW' READ(*,900) KD WRITE(*,*) $' (9) NONLINEAR FREUNDLICH PARAMETER, N (NEQ)=' WRITE(*,*) '(Range of values 0.3 - 0.9). Enter your value NOW' READ(*,900) NEQ WRITE(*,*) $' (10) FORWARD RATE REACTION, K1, HR-1 (K1) =' WRITE(*,*) '(Range of values 0.01 - 2 hr-1). Enter your value NOW' READ(*,900) K1 WRITE(*,*) $' (11) BACKWARD RATE REACTION, K2, HR-1 (K2) =' WRITE(*,*) '(Range of values 0.01 - 5 hr-1). Enter your value NOW' READ(*,900) K2 WRITE(*,*) $' (12) NONLINEAR KINETIC PARAMETER, W, (W)=' WRITE(*,*) '(Range of values 0.3 - 0.9). Enter your value NOW' READ(*,900) W WRITE(*,*) $' (13) FORWARD RATE REACTION, K3, HR-1 (K3)=' WRITE(*,*) '(Ranges from 0.0001 - 0.1 hr-1). 
Enter your value NOW' READ(*,900) K3 WRITE(*,*) $' (14) BACKWARD RATE REACTION, K4, HR-1 (K4)=' WRITE(*,*) '(Ranges from 0.01 - 0.1 hr-1). Enter your value NOW' READ(*,900) K4 WRITE(*,*) $' (15) NONLINEAR KINETIC PARAMETER, U, (U) =' WRITE(*,*) '(Range of values 0.3 - 0.9). Enter your value NOW' READ (*,900) U WRITE(*,*) $' (16) IRREVERSIBLE REACTION PATE,KS,HR-1 (KS) =' WRITE(*,*) '(Range is 0.0001 - 0.01 hr-1). Enter your value NOW' READ(*,900) KS WRITE(*,*) $' (17) FORWARD RATE REACTION, K5,HR-1 (K5) =' WRITE(*, *) '(Range is 0.0001 - 0.01 hr-1). Enter your value NOW' READ(*,900) K5 WRITE(*,*) $' (18) BACKWARD RATE REACTION, K6, HR-1 (K6) =' WRITE(*,*) '(Range is 0.001 - 0.1 hr-1). Enter your value NOW' READ(*,900) K6 WRITE(*,*) $' (19) NUMBER OF ITERATIONS (IT) AN INTEGER (FROM 0 TO 9)' READ(*,950) IT WRITE(*,*) $' (20) INPUT PULSE DURATION, HOURS (TPULSE) =' READ(*,900) TPULSE WRITE(*,*) $' (21) TOTAL SIMULATION TIME, HOURS (TTOTAL) =' READ(*,900) TTOTAL WRITE(*,*) $' (22) PRINTOUT TIME DESIRED, HOURS (TPRINT) =' READ(*, 900) TPRINT WRITE(*,*) $' (23) INCREMENTAL TIME STEP, HOURS (DT) =' WRITE(*,*) $' A default value of DT=0.02 is given' READ(*,900) DDT WRITE(*,*) $' (24) INCREMENTAL DEPTH, CM (DX)=' WRITE(*,*) $' A default value of DX=1.00 is given ' READ(*,900) DDX ENDIF C XIN=1.00 IF(DDX.NE.0.0) THEN DX=DDX ELSE DX=XIN ENDIF C PIN=0.02 IF(DDT.NE.0.0) THEN DT=DDT ELSE DT=PIN ENDIF WRITE(*,'(A)') 'PLEASE ENTER NAME OF THE OUTPUT FILE (FOR EXAMPLE * B:ZZ.DAT)' READ(*,'(A)') FNAMEO OPEN (6,FILE=FNAMEO,STATUS='UNKNOWN') PV=WFLX/TH RS=NEQ*ROU*KD/TH C0=CS C TIME=0.0D0 EF=0.0D0 5 CONTINUE GAMMA=DT/(2.D0*DX*DX) BETA=DT/DX IF((BETA*PV).GT.0.50D0) GO TO 7 IF((GAMMA*D/(BETA*PV)).LT.0.5D0) GO TO 6 GO TO 8 6 DX=DX/2 GO TO 5 7 DT=DT/2 GO TO 5 8 CONTINUE N=INT(COL/DX) NM1=N-1 NM2=N-2 NP1=N+1 GAMMA=DT/(2*DX*DX) BETA=DT/DX C IF(N.LT.500) GO TO 9 WRITE(*,*) 'W A R N I N G' WRITE(*,*) &'Dimension of variables exceeds 500. 
Did you increase array sizes' WRITE(*,*) &' If not, the program will terminate abruptly (see text).' 9 CONTINUE C C--- WRITE TITLE HEADING --------------- WRITE(6,800) USER WRITE(6,800) SOIL WRITE(6,800) SOLUTE WRITE(6,800) DATE WRITE(6,300) TH,ROU,COL,WFLX,CI,CS,D,K1,K2,B,K3,K4,W,KS WRITE(6,310) K5,K6,IT,KD,NEQ &,TPULSE,TTOTAL,TPRINT WRITE(6,400) DX,DT C DO 10 I=1,NP1 S1(I)=0.0D0 S2(I)=0.0D0 S3(I)=0.0D0 SIR(I)=0.0D0 S1X(I)=0.0D0 S2X(I)=0.0D0 S3X(I)=0.0D0 CX(I)=CI 10 C(I)=CI WRITE(*,*) '------INITIAL CONDITIONS COMPLETED --------' C WRITE(*,*) '------Execution Begins Please Wait---------------' WRITE(*,*) '------Please Wait -------------' IT=IT+1 FF=2*DX NKK=INT(TPRINT/DT+0.50D0) KLM=INT(TTOTAL/DT+0.50D0) KK=INT(KLM/NKK+0.5D0) C L=0 SINT=TPULSE*CS*WFLX DO 50 JJ=1,KK DO 20 LL=1,NKK TT=LL*DT+(JJ-1)*TPRINT IF(DABS(TT-TPULSE).LT.0.01D0) CS=0.0D0 L=L+1 CALL SMRTM EF=C(N)+EF 20 CONTINUE TIME=JJ*TPRINT C WRITE(6,500) TIME VV0=WFLX*TIME/(COL*TH) CC0=C(N)/C0 WRITE(6,525) VV0,CC0 WRITE(*, 650) TIME,VV0,CC0 WRITE(*,*) '--------Execution Continues--------' WRITE(*,*) '--------Please Wait---------' WRITE (6, 550) DO 30 I=1, NP1 DEPTH=DX*(I-1) SEQ=KD*C(I)**NEQ TOTAL=SEQ+S1(I)+S2(I)+S3(I)+SIR(I) 30 WRITE(6,600) DEPTH, C(I),SEQ,S1(I),S2(I),S3(I),SIR(I),TOTAL CALL INTEG(DX,C,X,NP1) TSWATR=TH*X(NP1) C DO 40 I=1,NP1 40 E(I)=C(I)**NEQ CALL INTEG(DX, E, X, NP1) TSEQ=ROU*KD*X(NP1) SINP=TIME*CS*WFLX IF(SINP.GT.SINT) SINP=SINT IF(CS.EQ.0.D0) SINP=SINT C CALL INTEG(DX,S1,X,NP1) TSKIN1=ROU*X(NP1) C CALL INTEG(DX,S2,X,NP1) TSKIN2=ROU*X(NP1) C CALL INTEG(DX,S3,X,NP1) TSKIN3=ROU*X(NP1) C TEFFL=DT*WFLX*EF C CALL INTEG(DX,SIR,X,NP1) TSIR=ROU*X(NP1) BAL=(TEFFL+TSKIN1+TSKIN2+TSKIN3+TSIR+TSEQ+TSWATR)*100.0D0/SINP 50 WRITE(6,200) SINP,TSWATR,TSEQ,TSKIN1,TSKIN2,TSKIN3,TSIR,TEFFL,BAL CONTINUE C 200 FORMAT(//,2X,'S A L T B A L A N C E:',// &7X, 'TOTAL INPUT SOLUTE FROM PULSE (MG) = ',F10.4,/ &7X, 'TOTAL SOLUTE SOIL SOLUTION PHASE (MG) = ',F10.4,/, &7X, 'TOTAL SORBED IN (EQUILIB) PHASE SE (MG) = 
',F10.4,/, &7X, 'TOTAL SORBED IN (KINETIC) PHASE S1 (MG) = ',F10.4,/, &7X, 'TOTAL SORBED IN (KINETIC) PHASE S2 (MG) = ',F10.4,/, &7X, 'TOTAL SORBED IN (KINETIC) PHASE S3 (MG) = ',F10.4,/, &7X, 'TOTAL SORBED IN IRREVERSIBLE PHASE (MG) = ',F10.4,/, &7X, 'TOTAL SORBED IN THE EFFLUENT (MG) = ',F10.4,/, &7X, 'MASS BALANCE (CALC.OUTPUT/INPUT) (%) = ',F10.4,/) 300 FORMAT(//, $2X, 'INPUT PARAMETERS :',// $5X,'1. MOISTURE CONTENT, CM3/CM3 (TH) = ',F10.5,/ $5X,'2. BULK DENSITY, G/CM3 (ROU) = ',F10.5,/ $5X,'3. COLUMN LENGTH, CM (COL) = ',F10.5,/ $5X,'4. WATER FLUX, CM/HOUR (WFLX) = ',F10.5,/ $5X,'5. INITIAL CONCENTRATION, MG/L (CI) = ',F10.5,/ $5X,'6. CONCEN.IN INPUT PULSE, MG/L (CS) = ',F10.5,/ $5X,'7. DISPERSION COEFFICIENT, CM2/HR (D) = ',F10.5,/ $5X,'8. FOWARD RATE REACTION, K1,HR-1 (K1) = ',F10.5,/ $5X,'9. BACKWARD RATE REACTION, K2,HR-1 (K2) = ',F10.5,/ $4X,'10. NONLINEAR KINETIC PARAMETER, W, (W) = ',F10.5,/ $4X,'11. FORWARD RATE REACTION, K3/HR-1 (K3) = ',F10.5,/ $4X,'12. BACKWARD RATE REACTION, K4/HR-1 (K4) = ',F10.5,/ $4X,'13. NONLINEAR KINETIC PARAMETER, U, (U) = ',F10.5,/ $4X,'14. IRREVERSIBLE REACTION RATE, KS/HR-1 (KS) = ',F10.5,/) 310 FORMAT( $4X,'15. FORWARD RATE REACTION, K5,HR-1 (K5) = ',F10.5,/ $4X,'16. BACKWARD RATE REACTION, K6,HR-1 (K6) = ',F10.5,/ $4X,'17. NUMBER OF ITERATIONS (IT) = ',I10.5,/ $4X,'18. DISTRIBUTION COEFFICIENT FOR EQUILIBRIUM',/ $4X,' SORPTION, KD, CM3/G (KD) = ',F10.5,/ $4X,'19. NONLINEAR PARAMETER FOR EQUILIBRIRUM',/ $4X,' Mechanism, NEQ (NEQ) = ',F10.5,/ $4X,'20. INPUT PULSE DURATION, HR (TPULSE) = ',F10.5,/ $4X,'21. TOTAL SIMULATION TIME, HR (TTOTAL) = ',F10.5,/ $4X,'22. PRINTOUT TIME DESIRED,HR (TPRINT) = ',F10.5,////) 400 FORMAT(2X, 'THE INCREMENTS USED WERE : ',// $5X,'1. SIMULATION DEPTH INTERVAL, CM (DX)=',F10.5,/ $5X,'2. INCREMENTAL TIME STEP,HR (DT)=',F10.5,///) 500 FORMAT(/////////, $2X'S I M U L A T I O N T I M E (HOUR) = ',F8.2/) 525 FORMAT( $2X'PORE VOLUMES (V/V0) = ',F10.2,8X,'REL. 
CONCENTRATION (C/C0) =', &F8.4) 550 FORMAT(///1H, 72(1H*)//1H, 20X, 'CONCENTRATION DISTRIBUTION', *//1H , 172(1H*)//1H, 2X, *'DEPTH SOLUT EQUIL KINETIC KINETIC KINETIC IRREV. *TOTAL'/, 9X, 'CONC.', 4X, *'PHASE PHASE 1 PHASE 2 PHASE 3 SINK SORBED'/, *' X C SE S1 S2 S3 SIR * S'//,1X 1,' CM ',2X,'--MG/L--',2X, 1'--------------------- MG/KG ---------------------'/) 600 FORMAT(1X,F6.2,1X,F9.4,1X,F8.4,1X,F8.3, *1X,3(F9.3,1X),F7.3) 650 FORMAT(/////,2X,'SIMULATIONS ARE NOW COMPLETE UP TO',///,5X, $'S I M U L A T I O N T I M E (HOUR) = ',F8.2,//2X, $'PORE VOLUMES (V/V0) = ',F10.2,8X,'REL CONCENTRATION (C/C0)=', &F8.4//) 700 FORMAT(50X, E10.6) 750 FORMAT(50X,I3) 800 FORMAT(A64) 900 FORMAT(F12.0) 950 FORMAT(I1) WRITE(*,*) WRITE(*,*) '------ Requested Simulations Completed ------' WRITE(*,*) WRITE(*,*) '------- MRTM TERMINATED SUCCESSFULLY -------' WRITE(*,*) WRITE(*,*) '------- THANK YOU FOR USING MRTM --------' END C C C ************************************************************** C SUBROUNTINE SMRTM GIVES A SOLUTION OF THE FINITE DIFFERENCE EQ. 
C OF THE CONVECTIVE-DISPERSION AND MULTIREACTION SYSTEM C *************************************************************** C SUBROUTINE SMRTM IMPLICIT REAL*8 (A-H,O-Z) COMMON/LI/ C(101),DC(101),DU(101),DL(101),E(101),S1(101),S2(101) COMMON/L2/ SIR(101),CX(101),S1X(101),S2X(101) COMMON/L3/ X(101) ,S3(101) ,S3X(101) COMMON/L4/ TH,ROU,COL,WFLX,CI,CS,D,K1,K2,W,K3,K4,U,KS,K5,K6,KD COMMON/L5/ NEQ,IT,N,NM1,NP1 COMMON/L6/ TPULSE,TTOTAL,TPRINT,DT,DX,GAMMA,BETA REAL*8 K1,K2,K3,K4,K5,K6,KS,KD,NEQ C C FF=2*DX PV=WFLX/TH RS=NEQ*ROU*KD/TH C(1)=(WFLX*FF*CS+D*TH*C(3))/(WFLX*FF+D*TH) DO 35 IJ=1,IT M=2 DO 10 I=1,NM1 DC(I) =1.0D0+2.D0*GAMMA*D-BETA*PV DU(I)=BETA*PV-GAMMA*D E(I)=C(M)+GAMMA*D*(C(M+1) -2.0D0*C(M) + C(M-1)) DL(I)=-GAMMA*D M=I+2 10 CONTINUE M=N DC(NM1)=1.D0+GAMMA*D E(1)=E(1)+GAMMA*D*C(1) C C INCORPORATION OF NONLINEAR KINETIC AND EQUILIBRIUM PROCESSES C (REVERSIBLE) IN MAIN DIAGONAL ELEMENTS AND RHS VECTOR C DO 20 I=1,NM1 DC(I)=DC(I)+DT*KS/2 R=0.0D0 H1=0.0D0 H2=0.0D0 IF((C(I+1).LT.1.0D-4) .OR. (CX(I+1).LE.1.0D-4)) GO TO 15 R =RS*(0.50D0*(C(I+1)+CX(I+1)))**(NEQ-1.0D0) H1=(0.50D0*(C(I+1)+CX(I+1)))**W H2=(0.50D0*(C(I+1)+CX(I+1)))**U 15 DC(I)=DC(I)+R E(I)=E(I)-DT*(K1*H1-K2*(ROU/TH)*(S1(I+1)+S1X(I+1))/2) &-DT*(K3*H2-K4*(ROU/TH)*(S2(I+1)+S2X(I+1))/2) 20 E(I)=E(I)+C(I+1)*R-DT*(KS/2)*((C(I+1)+CX(I+1))/2) C CALL TRIDM(DC,DU,DL,E,NM1) DO 25 I=2,N 25 CX(I)=E(I-1) CX(NP1)=CX(N) CX(1)=C(1) DO 30 I=1,NP1 H1=0.0D0 H2=0.0D0 IF(C(I).GT.1.0D-4) H1=((C(I)+CX(I))/2)**W IF(C(I).GT.1.0D-4) H2=((C(I)+CX(I))/2)**U S1X(I) =S1(I)+ DT*(K1*(TH/ROU)*H1-K2*(S1(I)+S1X(I))/2) S2X(I) =S2(I) + DT*K3*(TH/ROU)*H2-(K4+K5)*DT*(S2(I)+S2X(I))/2 $+DT*K6*S3(I) 30 CONTINUE 35 CONTINUE C C DO 50 I=1, NP1 C(I)=CX(I) S1(I)=S1X(I) S2(I)=S2X(I) S3(I)=S3(I)+DT*K5*S2(I) $-DT*K6*S3(I) 50 SIR(I)=SIR(I) + DT*KS*(TH/ROU)*C(I) RETURN END C C ***************************************************************** C SUBROUNTINE TRIDM GIVES A SOLUTION OF A TRIDIAGONAL MATRIX-VECTOR C EQUATION USING THOMAS ALGORITHM C 
*************************************************************** C SUBROUTINE TRIDM(A,B,C,D,N) IMPLICIT REAL*8(A-H,O-Z) DIMENSION A(N),B(N),C(N),D(N) DO 1 I=2,N C(I)=C(I)/A(I-1) A(I)=A(I)-(C(I)*B(I-1)) 1 CONTINUE DO 2 I=2, N D(I)=D(I)-(C(I)*D(I-1)) 2 CONTINUE D(N)=D(N)/A(N) DO 3 I=2, N D(N+1-I)=(D(N+1-I)-(B(N+1-I)*D(N+2-I)))/A(N+1-I) 3 CONTINUE RETURN END C C ***************************************************************** C SUBROUNTINE INTEG PERFORMS INTEGRATION OF A TABULAR FUNCTION Y C GIVEN AT EQUAL DISTANCES H USING TRAPEZOIDAL RULE C *************************************************************** C SUBROUTINE INTEG(H,Y,Z,N) IMPLICIT REAL*8(A-H,O-Z) DIMENSION Y(N),Z(N) S2=0.0D0 IF(N-1) 40,30,10 10 HH=H/2.0D0 DO 20 I=2,N S1=S2 S2=S2+HH*(Y(I)+Y(I-1)) 20 Z(I-1)=S1 30 Z(N)=S2 40 RETURN END here is the input file: 1. MOISTURE CONTENT,CM3/CM3 (TH) = 0.400E00 2. BULK DENSITY,G/CM3 (ROU) = 1.250E00 3. COLUMN LENGTH,CM (COL) = 10.000E00 4. WATER FLUX,CM/HR (WFLX) = 1.000E00 5. INITIAL CONCENTRATION,MG/L (CI) = 0.000E00 6. CONCEN.IN INPUT PULSE, MG/L (CS) = 10.000E00 7. DISPERSION COEFFICIENT,D,CM2/HR (D) = 1.000E00 8. DISTRIB. COEFF.FOR EQL. SORP,CM3/G (KD) = 1.000E00 9. NONLINEAR PARAM.FOR EQUL. MECH. (NEQ) = 1.000E00 10. FORWARD RATE REACTION, K1,HR-1 (K1) = 0.100E00 11. BACKWARD RATE REACTION, K2,HR-1 (K2) = 0.100E00 12. NONLINEAR KINETIC PARAMETER, W, (W) = 0.500E00 13. FORWARD RATE REACTION, K3,HR-1 (K3) = 0.000E00 14. BACKWARD RATE REACTION, K4,HR-1 (K4) = 0.000E00 15. NONLINEAR KINETIC PARAMETER, U, (U) = 0.000E00 16. IRREVERSIBLE REACTION RATE,KS,HR-1 (KS) = 0.000E00 17. FORWARD RATE REACTION, K5,HR-1 (K5) = 0.000E00 18. BACKWARD RATE REACTION, K6,HR-1 (K6) = 0.000E00 19. NUMBER OF ITERATIONS (M) (IT) = 000 20. INPUT PULSE DURATION,HR (TPULSE) = 12.000E00 21. TOTAL SIMULATION TIME,HR (TTOTAL) = 16.000E00 22. PRINTOUT TIME DESIRED,HR (TPRINT) = 4.000E00 23. INCREMENTAL TIME STEP,HR (DT) = 0.200E00 24. 
INCREMENTAL DEPTH, CM (DX) = 1.000E00 And the error I'm receiving: Ms-MacBook-Pro-2:~ Tonoy$ gfortran mrtm.f Ms-MacBook-Pro-2:~ Tonoy$ ./a.out PLEASE ENTER USER NAME (OPTIONAL): rm PLEASE ENTER NAME OF SOIL (OPTIONAL): bd PLEASE ENTER NAME OF SOLUTE (OPTIONAL): cr ENTER DATE OR OTHER IDENTIFICATION (OPTIONAL): 2015 --------- INPUT PARAMETERS SECTION ------------- INPUT PARAMETERS CAN BE PROVIDED IN TWO WAYS; ENTER 1 if you wish to enter the input data using the keyboard (i.e. interactively) OR ENTER 2 if an input data file is to be provided PLEASE ENTER EITHER 1 OR 2 2 PLEASE ENTER NAME OF INPUT FILE? (for example A:XX.DAT or C:UU.DAT for hard disk) input.DAT PLEASE ENTER NAME OF THE OUTPUT FILE (FOR EXAMPLE B:ZZ.DAT) At line 173 of file mrtm.f (unit = 5, file = 'input.DAT') Fortran runtime error: End of file Error termination. Backtrace: #0 0x10c688729 #1 0x10c6893f5 #2 0x10c689b59 #3 0x10c751f8b #4 0x10c752527 #5 0x10c74f5c3 #6 0x10c7545b4 #7 0x10c679590 #8 0x10c67b2a0 Ms-MacBook-Pro-2:~ Tonoy$ The expected outcome given the input should give (first 8 hr out of 16 hr):
In FORTRAN convention on Unix machines, Unit 5 is connected to Standard Input when the program starts. The statement on line 45
OPEN(5,FILE=FNAMEI)
happens to disconnect standard input and attaches an input file to Unit 5, so that subsequent READ(*,FORMAT) statements try to read from this file and encounter its end. This causes the end-of-file error you report.
If you follow the suggestion given by francescalus in the comments, and replace Unit 5 on lines 45 and 48-56 with Unit 15, this error will be gone.
If you compile with the options suggested by Vladimir, i.e.
gfortran -Wall -g -fbacktrace -fcheck=all mrtm.for -o mrtm
the program runs to completion and produces some output.
:~> ./mrtm
PLEASE ENTER USER NAME (OPTIONAL):
rm
PLEASE ENTER NAME OF SOIL (OPTIONAL):
bd
PLEASE ENTER NAME OF SOLUTE (OPTIONAL):
cr
ENTER DATE OR OTHER IDENTIFICATION (OPTIONAL):
2015

--------- INPUT PARAMETERS SECTION -------------

INPUT PARAMETERS CAN BE PROVIDED IN TWO WAYS;
ENTER 1 if you wish to enter the input data using
the keyboard (i.e. interactively)

OR

ENTER 2 if an input data file is to be provided
PLEASE ENTER EITHER 1 OR 2
2
PLEASE ENTER NAME OF INPUT FILE?
(for example A:XX.DAT or C:UU.DAT for hard disk)
input.dat
PLEASE ENTER NAME OF THE OUTPUT FILE (FOR EXAMPLE B:ZZ.DAT)
zz.dat
Without -fcheck=all the program fails with Segmentation fault - invalid memory reference on line 253.
How to count frequencies of certain character in a string?
If I have a run of characters such as "AABBABBBAAAABBAAAABBBAABBBBABABB". Is there a way to get R to count the runs of A and state how many of each length ? So I'd like to know how many instances of 3 A's in a row, how many instances of a single A, how many instances of 2 A's in a row, etc.
table(rle(strsplit("AABBABBBAAAABBAAAABBBAABBBBABABB","")[[1]]))
gives
       values
lengths A B
      1 3 1
      2 2 3
      3 0 2
      4 2 1
which (reading down the A column) means there were 3 A runs of length 1, 2 A runs of length 2 and 2 A runs of length 4.
Try
v1 <- scan(text=gsub('[^A]+', ',', str1), sep=',', what='', quiet=TRUE)
table(v1[nzchar(v1)])
#   A   AA AAAA
#   3    2    2
Or
library(stringi)
table(stri_extract_all_regex(str1, '[A]+')[[1]])
#   A   AA AAAA
#   3    2    2
Benchmarks
set.seed(42)
x1 <- stri_rand_strings(1,1e7, pattern='[A-G]')
system.time(table(stri_split_regex(x1, "[^A]+", omit_empty = TRUE)))
#   user  system elapsed
#  0.829   0.002   0.831
system.time(table(stri_extract_all_regex(x1, '[A]+')[[1]]))
#   user  system elapsed
#  0.790   0.002   0.791
system.time(table(rle(strsplit(x1,"")[[1]])) )
#   user  system elapsed
# 30.230   1.243  31.523
system.time(table(strsplit(x1, "[^A]+")))
#   user  system elapsed
#  4.253   0.006   4.258
system.time(table(attr(gregexpr("A+",x1)[[1]], 'match.length')))
#   user  system elapsed
#  1.994   0.004   1.999
library(microbenchmark)
microbenchmark(david=table(stri_split_regex(x1, "[^A]+", omit_empty = TRUE)),
               akrun= table(stri_extract_all_regex(x1, '[A]+')[[1]]),
               david2 = table(strsplit(x1, "[^A]+")),
               glen = table(rle(strsplit(x1,"")[[1]])),
               plannapus = table(attr(gregexpr("A+",x1)[[1]], 'match.length')),
               times=20L, unit='relative')
#Unit: relative
#      expr        min        lq      mean    median         uq       max neval cld
#     david  1.0000000  1.000000  1.000000  1.000000  1.0000000  1.000000    20   a
#     akrun  0.7908313  1.023388  1.054670  1.336510  0.9903384  1.004711    20   a
#    david2  4.9325256  5.461389  5.613516  6.207990  5.6647301  5.374668    20   c
#      glen 14.9064240 15.975846 16.672339 20.570874 15.8710402 15.465140    20   d
# plannapus  2.5077719  3.123360  2.836338  3.557242  2.5689176  2.452964    20   b
data
str1 <- 'AABBABBBAAAABBAAAABBBAABBBBABABB'
Here's an additional way using strsplit
x <- "AABBABBBAAAABBAAAABBBAABBBBABABB"
table(strsplit(x, "[^A]+"))
#   A   AA AAAA
#   3    2    2
Or similarly with the stringi package
library(stringi)
table(stri_split_regex(x, "[^A]+", omit_empty = TRUE))
For completeness, here is another way, using the regmatches and gregexpr combo, to extract regexes:
x <- "AABBABBBAAAABBAAAABBBAABBBBABABB"
table(regmatches(x,gregexpr("A+",x))[[1]])
#   A   AA AAAA
#   3    2    2
Or in fact, since gregexpr keeps the length of the captured substring as attribute, one could even do, directly:
table(attr(gregexpr("A+",x)[[1]],'match.length'))
# 1 2 4
# 3 2 2
How do I print out lines recursively from a text file along with the average value of total elements from per line?
I know this question sounds a bit challenging, but hey I find nothing like this relates with F sharp here. Okay so, since in my previous question I mentioned I'm new to F#. Thanks to several of fellow programmers' helps here to solve my sum function in my previous question.
So, I have a text file that contained more than 20 lines and I want to print out lines with year and average of total elements from each year and its elements.
Sample text lines
2009 1.3 3.51 6.76 5.80 4.48 5.47 2.06 4.3 0.54 7.69 1.27 2.9
2008 3.53 3.71 1.88 2.46 4.63 4.88 4.53 1.51 10.83 2.7 1.28 6.51
2007 2.88 2.19 3.55 3.95 2 3.1 4.18 8.76 1.91 2.01 1.67 3.54
2006 3.48 1.33 3.16 3.87 3.19 3.87 4.24 7.12 4.32 6.63 2.97 3.37
2005 5.32 2.41 1.76 1.63 1.78 1.07 2.07 1.44 2.68 1.14 2.15 1.38
2004 1.09 0.75 3.93 1.9 5.57 2.94 4.46 5.01 0.86 2.42 5.02 1.75
....
Now, I have a couple functions to show you. Unfortunately, my print function only prints out the first line.
let rec print year values =
    if values = [] then
        ()
    else
        printfn ""
        printfn "%A: %A" year values
and the 2nd function which does the sum of elements perfectly, but I cannot manage to get it to divide it by 12 elements properly.
let sum (values: double list) =
    let rec sum values accum =
        match values with
        | [] -> accum
        | head :: tail -> sum tail (accum + head) / 12.0 // would 12.0 work right?
    sum values 0.0
in the main
let (year, values) = ParseLine file.Head
printfn "%A: %A" print (year (sum values)) // year gets the error, according to visual studio
Thanks to John Palmer, who posted a quick solution. The fixed code is:
let sum (values: double list) =
    let rec sum values accum =
        match values with
        | [] -> accum
        | head :: tail -> sum tail (accum + head/12.0)
    sum values 0.0
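To see why moving the division inside the parentheses matters, here is a small Python sketch that mirrors both recursions. It is only an illustration: `sum_buggy`, `sum_fixed` and `values_2009` are hypothetical names, and the sketch assumes, as the F# code does, that every line holds exactly 12 values.

```python
def sum_buggy(values, accum=0.0):
    # Mirrors `sum tail (accum + head) / 12.0`: function application binds
    # tighter than `/`, so the result of every recursive call is divided by 12.
    if not values:
        return accum
    head, *tail = values
    return sum_buggy(tail, accum + head) / 12.0


def sum_fixed(values, accum=0.0):
    # Mirrors `sum tail (accum + head/12.0)`: each element is scaled exactly once.
    if not values:
        return accum
    head, *tail = values
    return sum_fixed(tail, accum + head / 12.0)


# The 2009 row from the sample data in the question.
values_2009 = [1.3, 3.51, 6.76, 5.80, 4.48, 5.47, 2.06, 4.3, 0.54, 7.69, 1.27, 2.9]

print(sum_fixed(values_2009))  # close to the true mean, total / 12
print(sum_buggy(values_2009))  # total / 12**12, which is practically zero
```

With 12 elements the original version divides the total by 12 twelve times over, which is why the result looked wrong. Dividing by the actual list length instead of the constant 12.0 would also cover lines of other lengths, but that goes beyond what the posted fix does.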
Can Haskell optimize function calls the same way Clang / GCC does?
I want to ask whether Haskell and C++ compilers can optimize function calls the same way. Please look at the following code. In this example Haskell is significantly faster than C++.
I have heard that Haskell can compile to LLVM and be optimized by the LLVM passes. Additionally, I have heard that Haskell has some heavy optimizations under the hood. But the following examples should be able to run with the same performance.
I want to ask:
Why is my sample benchmark in C++ slower than the one in Haskell?
Is it possible to further optimize the code?
(I'm using LLVM-3.2 and GHC-7.6).
C++ code:
#include <cstdio>
#include <cstdlib>
int b(const int x){
    return x+5;
}
int c(const int x){
    return b(x)+1;
}
int d(const int x){
    return b(x)-1;
}
int a(const int x){
    return c(x) + d(x);
}
int main(int argc, char* argv[]){
    printf("Starting...\n");
    long int iternum = atol(argv[1]);
    long long int out = 0;
    for(long int i=1; i<=iternum;i++){
        out += a(iternum-i);
    }
    printf("%lld\n",out);
    printf("Done.\n");
}
compiled with clang++ -O3 main.cpp
Haskell code:
module Main where
import qualified Data.Vector as V
import System.Environment
b :: Int -> Int
b x = x + 5
c x = b x + 1
d x = b x - 1
a x = c x + d x
main = do
    putStrLn "Starting..."
    args <- getArgs
    let iternum = read (head args) :: Int in do
        putStrLn $ show $ V.foldl' (+) 0 $ V.map (\i -> a (iternum-i)) $ V.enumFromTo 1 iternum
        putStrLn "Done."
compiled with ghc -O3 --make -fforce-recomp -fllvm ghc-test.hs
Speed results:
Running testcase for program 'cpp/a.out'
-------------------
cpp/a.out 100000000      0.0%    avg time: 105.05 ms
cpp/a.out 200000000      11.11%  avg time: 207.49 ms
cpp/a.out 300000000      22.22%  avg time: 309.22 ms
cpp/a.out 400000000      33.33%  avg time: 411.7 ms
cpp/a.out 500000000      44.44%  avg time: 514.07 ms
cpp/a.out 600000000      55.56%  avg time: 616.7 ms
cpp/a.out 700000000      66.67%  avg time: 718.69 ms
cpp/a.out 800000000      77.78%  avg time: 821.32 ms
cpp/a.out 900000000      88.89%  avg time: 923.18 ms
cpp/a.out 1000000000     100.0%  avg time: 1025.43 ms
Running testcase for program 'hs/main'
-------------------
hs/main 100000000        0.0%    avg time: 70.97 ms (diff: 34.08)
hs/main 200000000        11.11%  avg time: 138.95 ms (diff: 68.54)
hs/main 300000000        22.22%  avg time: 206.3 ms (diff: 102.92)
hs/main 400000000        33.33%  avg time: 274.31 ms (diff: 137.39)
hs/main 500000000        44.44%  avg time: 342.34 ms (diff: 171.73)
hs/main 600000000        55.56%  avg time: 410.65 ms (diff: 206.05)
hs/main 700000000        66.67%  avg time: 478.25 ms (diff: 240.44)
hs/main 800000000        77.78%  avg time: 546.39 ms (diff: 274.93)
hs/main 900000000        88.89%  avg time: 614.12 ms (diff: 309.06)
hs/main 1000000000       100.0%  avg time: 682.32 ms (diff: 343.11)
EDIT
Of course we cannot compare the speed of languages, only the speed of implementations. But I'm curious whether GHC and C++ compilers can optimize function calls the same way.
I've edited the question with a new benchmark and code based on your help :)
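To put the table in per-iteration terms, here is a rough Python sketch that uses only the two largest-run timings reported above. The variable names are made up for illustration, and the figures are just as noisy as the measurements they come from.

```python
# Timings copied from the results above (milliseconds), largest run only.
iterations = 1_000_000_000
cpp_ms = 1025.43
hs_ms = 682.32

# Convert milliseconds to nanoseconds, then divide by the iteration count.
cpp_ns_per_iter = cpp_ms * 1e6 / iterations
hs_ns_per_iter = hs_ms * 1e6 / iterations
ratio = cpp_ms / hs_ms

print(f"C++:     {cpp_ns_per_iter:.2f} ns/iteration")
print(f"Haskell: {hs_ns_per_iter:.2f} ns/iteration")
print(f"ratio:   {ratio:.2f}x")
```

Both programs spend on the order of a nanosecond per iteration, and the times grow linearly with the iteration count, so the gap is a constant factor of roughly 1.5 rather than a difference in how the cost scales.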
If your goal is to get this running as quickly as the output of your C++ compiler, then you would want to use a data structure that the compiler can have its way with.
module Main where
import qualified Data.Vector as V
b :: Int -> Int
b x = x + 5
c x = b x + 1
d x = b x - 1
a x = c x + d x
main = do
    putStrLn "Starting..."
    putStrLn $ show $ V.foldl' (+) 0 $ V.map a $ V.enumFromTo 1 100000000
    putStrLn "Done."
GHC is able to completely eliminate the loop and just inserts a constant into the resulting assembly. On my computer, this now has a runtime of < 0.002s, when using the same optimization flags as you originally specified.
As a follow-up based on the comments by @Yuras: the Core produced by the vector-based solution and by the stream-fusion solution is functionally identical.
Vector
main_$s$wfoldlM'_loop [Occ=LoopBreaker] :: Int# -> Int# -> Int#
main_$s$wfoldlM'_loop =
  \ (sc_s2hW :: Int#) (sc1_s2hX :: Int#) ->
    case <=# sc1_s2hX 100000000 of _ {
      False -> sc_s2hW;
      True ->
        main_$s$wfoldlM'_loop
          (+# sc_s2hW (+# (+# (+# sc1_s2hX 5) 1) (-# (+# sc1_s2hX 5) 1)))
          (+# sc1_s2hX 1)
    }
stream-fusion
$wloop_foldl [Occ=LoopBreaker] :: Int# -> Int# -> Int#
$wloop_foldl =
  \ (ww_s1Rm :: Int#) (ww1_s1Rs :: Int#) ->
    case ># ww1_s1Rs 100000000 of _ {
      False ->
        $wloop_foldl
          (+# ww_s1Rm (+# (+# (+# ww1_s1Rs 5) 1) (-# (+# ww1_s1Rs 5) 1)))
          (+# ww1_s1Rs 1);
      True -> ww_s1Rm
    }
The only real difference is the choice of comparison operation for the termination condition. Both versions compile to tight tail-recursive loops that can be easily optimized by LLVM.
GHC doesn't fuse lists (avoiding success at all costs?)
Here is a version that uses the stream-fusion package:
module Main where
import Prelude hiding (map, foldl)
import Data.List.Stream
import Data.Stream (enumFromToInt, unstream)
import Text.Printf
import Control.Exception
import System.CPUTime
b :: Int -> Int
b x = x + 5
c x = b x + 1
d x = b x - 1
a x = c x + d x
main = do
    putStrLn "Starting..."
    putStrLn $ show $ foldl (+) 0 $ map (\z -> a z) $ unstream $ enumFromToInt 1 100000000
    putStrLn "Done."
I don't have LLVM installed to compare with your results, but it is 10x faster than your version (compiled without LLVM).
I think vector fusion should perform even faster.
As others have pointed out, you're not comparing equivalent algorithms. As Yuras pointed out, GHC doesn't fuse lists. Your Haskell version will actually allocate that entire list; it will be done lazily, one cell at a time, but it will be done. Below is a version that's algorithmically closer to your C version. On my system it runs in the same time as the C version.
{-# LANGUAGE BangPatterns #-}
module Main where
import Text.Printf
import Control.Exception
import System.CPUTime
import Data.List
a, b, c, d :: Int -> Int
b x = x + 5
c x = b x + 1
d x = b x - 1
a !x = c x + d x
-- Don't allocate a list, iterate and increment as the C version does.
applyTo !acc !n
    | n > 100000000 = acc
    | otherwise = applyTo (acc + a n) (n + 1)
main = do
    putStrLn "Starting..."
    print $ applyTo 0 1
    putStrLn "Done."
Comparing it with time:
ghc -O3 bench.hs -fllvm -fforce-recomp -o bench-hs && time ./bench-hs
[1 of 1] Compiling Main ( bench.hs, bench.o )
Linking bench-hs ...
Starting...
10000001100000000
Done.
./bench-hs 0.00s user 0.00s system 0% cpu 0.003 total
Compared to C:
clang++ -O3 bench.cpp -o bench && time ./bench
Starting...
10000001100000000
Done.
./bench 0.00s user 0.00s system 0% cpu 0.004 total
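The near-zero runtimes make sense once you notice that the whole loop has a closed form. The Python sketch below is only an illustration of that arithmetic, not a claim about which passes GHC or LLVM actually apply; `apply_to` and `closed_form` are hypothetical names that mirror `applyTo` above.

```python
def a(x):
    b = x + 5
    return (b + 1) + (b - 1)  # c x + d x, which simplifies to 2*x + 10


def apply_to(limit):
    # Accumulator loop with the same shape as applyTo: start at 1, stop past limit.
    acc, n = 0, 1
    while n <= limit:
        acc += a(n)
        n += 1
    return acc


def closed_form(limit):
    # Sum of (2x + 10) for x = 1..limit is limit*(limit + 1) + 10*limit.
    return limit * (limit + 1) + 10 * limit


# The loop and the formula agree on a small input...
assert apply_to(1000) == closed_form(1000)
# ...and the formula reproduces the value both programs printed above.
print(closed_form(100_000_000))  # 10000001100000000
```

Because the loop bound is a compile-time constant in these versions, an optimizer that can derive such a formula has nothing left to do at run time except print the number.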