I have the following SML source file with a trivial function in it:
(* fact.sml *)
fun fact_unguarded 0 = 1
  | fact_unguarded n = n * fact_unguarded(n-1)
fun fact 0 = SOME(1)
  | fact n = if n > 0 then SOME(n * fact_unguarded(n-1)) else NONE
I'm trying to compile it with MLton using the C backend and look at the generated C code.
% mlton -codegen c fact.sml
However, none of the intermediate files are dumped in the current working directory, and there appears to be nothing relevant in /tmp either. Is there a way to direct MLton to either a) produce just the C source file and stop, or b) keep intermediate files around even after the final artifact is produced?
% pwd
~/tmp/sml
% ls
fact* fact.sml
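mlton -stop g -codegen c fact.sml should do what you want: it stops after code generation and leaves the generated C files in place (-keep g retains them while still building the executable). However, because MLton is a whole-program compiler and nothing in the file calls your functions, there will not be anything left of them in the output.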
As part of a Compiler Principles course I'm taking in my university, we're writing a compiler that's implemented in OCaml, which compiles Scheme code into CISC-like assembly (which is just C macros).
The basic operation of the compiler is as follows:
1. Read a *.scm file and convert it to an OCaml string.
2. Parse the string and perform various analyses.
3. Run a code generator on the AST output from the semantic analyzer, which outputs text into a *.c file.
4. Compile that file with GCC and run it in the terminal.
All is well, except for this: I'm trying to read an input file that's around 4000 lines long and is basically one huge expression mixing Scheme if and and.
I'm executing the compiler via utop. When I try to read the input file, I immediately get a stack overflow error message. My initial guess is that the file is just too large for OCaml to handle, but I wasn't able to find any documentation that would support this theory.
Any suggestions?
The maximum string length is given by Sys.max_string_length. For a 32-bit system, it's quite short: 16777211. For a 64-bit system, it's 144115188075855863.
Unless you're using a 32-bit system, and your 4000-line file is over 16MB, I don't think you're hitting the string length limit.
A stack overflow is not what you'd expect to see when a string is too long.
It's more likely that you have infinite recursion, or possibly just a very deeply nested computation.
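To rule out the string length limit directly, a quick check along these lines could be run in utop. This is only a sketch: fits_in_string is a hypothetical helper name, not something from the question, and it relies only on standard library functions.

```ocaml
(* Hypothetical helper: true if the file's byte length does not exceed
   the longest string this OCaml runtime can represent. *)
let fits_in_string path =
  let ic = open_in_bin path in
  let len = in_channel_length ic in
  close_in ic;
  len <= Sys.max_string_length
```

If this returns true for the input file, the overflow comes from recursion depth rather than from the size of the string.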
Well, it turns out that the limit was the maximum stack size the OCaml runtime is configured to use (the l parameter of OCAMLRUNPARAM).
I ran the following command in the terminal in order to increase the quota:
export OCAMLRUNPARAM="l=5555555555"
This worked like a charm - I managed to read and compile the input file almost instantaneously.
For reference purposes, this is the code that reads the file:
let file_to_string input_file =
  let in_channel = open_in input_file in
  let rec run () =
    try
      let ch = input_char in_channel in ch :: (run ())
    with End_of_file ->
      ( close_in in_channel;
        [] )
  in list_to_string (run ());;
where list_to_string is:
let list_to_string s =
  let rec loop s n =
    match s with
    | [] -> String.make n '?'
    | car :: cdr ->
      let result = loop cdr (n + 1) in
      String.set result n car;
      result
  in
  loop s 0;;
Funny thing is, I also wrote file_to_string with tail recursion. That prevented the stack overflow, but for some reason it went into an infinite loop. Oh, well...
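For comparison, here is a minimal sketch of reading a file into a string with no recursion at all, so the stack limit no longer matters. It is not the tail-recursive attempt mentioned above (that code is not shown); it is an alternative that assumes only the standard library's Buffer module.

```ocaml
(* Read a whole file into a string using a loop and a growable buffer,
   so stack depth stays constant regardless of file size. *)
let file_to_string input_file =
  let in_channel = open_in_bin input_file in
  let buf = Buffer.create 4096 in
  (try
     while true do
       Buffer.add_char buf (input_char in_channel)
     done
   with End_of_file -> ());
  close_in in_channel;
  Buffer.contents buf
```

The loop ends when input_char raises End_of_file, and the channel is closed once afterwards.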
I would like to change my stack size to allow a project with many non-tail-recursive functions to run on larger data. To do so, I tried to set OCAMLRUNPARAM="l=xxx" for varying values of xxx (in the range 0 through 10G), but it did not have any effect. Is setting OCAMLRUNPARAM even the right approach?
In case it is relevant: The project I am interested in is built using OCamlMakefile, target native-code.
Here is a minimal example where simply a large list is created without tail recursion. To quickly check whether the setting of OCAMLRUNPARAM has an effect, I compiled the program stacktest.ml:
let rec create l =
  match l with
  | 0 -> []
  | _ -> "00"::(create (l-1))
let l = create (int_of_string (Sys.argv.(1)))
let _ = print_endline("List of size " ^ string_of_int (List.length l) ^ " created.")
using the command
ocamlbuild stacktest.native
and found out roughly at which length of the list a stack overflow occurs by (more or less) binary search with the following bash script foo.sh:
#!/bin/bash
export OCAMLRUNPARAM="l=$1"
increment=1000000
length=1
while [[ $increment > 0 ]] ; do
    while [[ $(./stacktest.native $length) ]]; do
        length=$(($length+$increment))
    done
    length=$(($length-$increment))
    increment=$(($increment/2))
    length=$(($length+$increment))
done
length=$(($length-$increment))
echo "Largest list without overflow: $length"
echo $OCAMLRUNPARAM
The results vary between runs of this script (and the intermediate results are not even consistent within one run, but let's ignore that for now), but they are similar no matter whether I call
bash foo.sh 1
or
bash foo.sh 1G
i.e. whether the stack size is set to 1 or 2^30 words.
Changing the stack limit via OCAMLRUNPARAM works only for bytecode executables, which are run by the OCaml interpreter. A native program is handled by the operating system and executed directly on the CPU. Thus, to change the stack limit, you need to use the facilities provided by your operating system.
For example, on Linux the ulimit command controls many process parameters, including the stack limit (ulimit -s takes the size in kilobytes). Add the following to your script:
ulimit -s $1
You will then see the result change.
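Independently of raising the limit, individual functions can sometimes be rewritten to use constant stack space. As an illustration only, here is a sketch of the test program's list builder using an accumulator; create_tail is a hypothetical name and is not part of the original stacktest.ml.

```ocaml
(* Accumulator-based variant of create: the recursive call is in tail
   position, so stack usage does not grow with the list length. *)
let create_tail n =
  let rec go acc k =
    if k <= 0 then acc else go ("00" :: acc) (k - 1)
  in
  go [] n
```

All elements are identical here, so the reversed construction order does not matter; in general the accumulated list would need a final List.rev.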
I'm working on some scientific code that is mostly F77 but also some F95. In places, I need to include F77 code into my F95 code. Is there a way to get this code to play nicely within my code by using a particular compiler flag or something? I'm using gfortran and occasionally ifort. It is possible for me to modify the legacy code but I would need to do it in a sensible way to maintain backwards compatibility with other F77 code while also being forwards compatible with F95 code.
I get errors like:
cstruc:16.12:
Included at mod_op.f90:6:
REAL*8
1
Error: Invalid character in name at (1)
cstruc:17.6:
Included at mod_op.f90:6:
& RH, RH1, ! ln rho
1
Error: Invalid character in name at (1)
cstruc:18.6:
Included at mod_op.f90:6:
& RHP, RHP1, ! d ln rho / d ln p
1
Error: Invalid character in name at (1)
cstruc:19.6:
Included at mod_op.f90:6:
& RHT, RHT1, ! d ln rho / d ln T
1
Error: Invalid character in name at (1)
cstruc looks like this:
REAL*8
& RH, RH1, ! ln rho
& RHP, RHP1, ! d ln rho / d ln p
& RHT, RHT1, ! d ln rho / d ln T
& PSI, ! ln Lambda (for degenerate gas)
& RHPSI, ! d ln rho / d PSI
& RHPSIP, ! d2 ln rho / d PSI d ln P
& RHPSIT, ! d2 ln rho / d PSI d ln T
& PL, ! P at J1
& TONI ! T at J1
Any help is much appreciated. Thanks!
With some exceptions, Fortran 77 code is Fortran 95 code. I guess your errors come from attempting to include fixed-form source code (your F77 code in cstruc) in a free-form source file, mod_op.f90. This is unlikely to end well.
Most compilers will assume a file ending in ".f90" is free-form, so if you really are using fixed-form then you will need a compiler flag to override the assumption.
It is possible to combine free- and fixed-form code into a final object (each compiled separately), but a good suggestion as to how to resolve the problems you are seeing can come only with more detail.
However, if you are attempting with your include to create a module to replace a common block, then there is no reason why you can't use the F95 feature with fixed-form. Just do that selectively.
Alternatively, you can see the answer by Vladimir F which explains how to write source code that is valid as both free-form and fixed-form source. You can use this to modify the Fortran 77 fixed-form code to be include-able by the Fortran 90 free-form code while still being compilable as fixed-form (but not valid Fortran 77).
I suggest trying the "intersection" form from http://fortranwiki.org/fortran/show/Continuation+lines
It is legal as both free and fixed source form.
I am trying to load an external .sml file, let's say a.sml, and execute a function (add: int -> int -> int) defined in that file.
I know perfectly well how to do this in the interactive shell: use "a.sml";
But how can I achieve this in a .sml file? I tried the following:
val doTest =
  let
    val _ = print ("Loading..." ^ "\n")
    val _ = use "a.sml"
    val _ = print ("1 + 2 = " ^ Int.toString (add 1 2) ^ "\n")
  in
    1
  end
But the compiler's reaction is:
test.sml:7.49-7.52 Error: unbound variable or constructor: add
BTW: I know that using the CM is the more appropriate way. But in my case I do not know the file a.sml prior to the compilation.
You can't do this. The compiler must know the types of the functions you are calling at compile time. What you are asking is for SML to load a file at run time (use ...) and subsequently run the code therein. This isn't possible due to the phase distinction; type checking occurs during compilation, after which all type information can be forgotten.
If you're generating code and know the file name, you can still use the CM and compile in two steps using your build system. Then you'd get the type errors from the generated code in the second compilation step. Please describe your situation if such an approach doesn't work for you.
I am starting to learn Standard ML, and I am trying to use the Standard ML of New Jersey compiler.
I can use the interactive loop, but how can I compile a source file to a standalone executable?
In C, for example, one can just write
$ gcc hello_world.c -o helloworld
and then run helloworld binary.
I read the documentation for the SML/NJ Compilation Manager, but it doesn't have any clear examples.
Also, is there another SML compiler available that can create standalone binaries?
Both MosML and MLton can also create standalone binary files: MosML through the mosmlc command and MLton through the mlton command.
Note that MLton doesn't have an interactive loop; it is a whole-program optimising compiler. Basically, this means that it takes quite some time to compile, but in return it generates incredibly fast SML programs.
For SML/NJ you can use the CM.mk_standalone function, but the CM User Manual (page 45) advises against this. Instead it recommends the ml-build command, which generates an SML/NJ heap image. The heap image must be run with the @SMLload parameter, or you can use the heap2exec program, provided that you have a supported system. If you don't, I would suggest that you use MLton instead.
The following can be used to generate a valid SML/NJ heap image:
test.cm:
Group is
test.sml
$/basis.cm
test.sml:
structure Test =
struct
  fun main (prog_name, args) =
    let
      val _ = print ("Program name: " ^ prog_name ^ "\n")
      val _ = print "Arguments:\n"
      val _ = map (fn s => print ("\t" ^ s ^ "\n")) args
    in
      1
    end
end
To generate the heap image you can use: ml-build test.cm Test.main test-image and then run it with sml @SMLload test-image.XXXXX arg1 arg2 "this is one argument" where XXXXX is your architecture.
If you decide to use MLton at some point, you don't need a main function. It evaluates everything at top level, so you can create a main function and have it called by something like this:
fun main () = print "this is the main function\n"
val foo = 4
val _ = print ((Int.toString 4) ^ "\n")
val _ = main ()
Then you can compile it by mlton foo.sml which will produce an executable named "foo". When you run it, it will produce this as result:
./foo
4
this is the main function
Note that this is only one file. When you have multiple files, you will either need to use MLB (ML Basis) files, which are MLton's project files, and compile with mlton project.mlb, or you can use cm files.