SML-NJ, how to compile standalone executable - sml

I have started learning Standard ML and am trying out the Standard ML of New Jersey compiler.
I can use the interactive loop, but how can I compile a source file to a standalone executable?
In C, for example, one can just write
$ gcc hello_world.c -o helloworld
and then run the helloworld binary.
I read the documentation for the SML/NJ Compilation Manager, but it doesn't have any clear examples.
Also, is there another SML compiler available that can create standalone binaries?

Both MosML and MLton can also create standalone binaries: MosML through the mosmlc command and MLton through the mlton command.
Note that MLton has no interactive loop; it is a whole-program optimising compiler. In practice this means that compilation takes quite some time, but the generated programs are very fast.
For SML/NJ you can use the CM.mk_standalone function, but the CM User Manual (page 45) advises against it. Instead it recommends the ml-build command, which generates an SML/NJ heap image. The heap image must be run with the @SMLload parameter, or you can turn it into an executable with the heap2exec program, provided that your system is supported. If it isn't, I would suggest that you use MLton instead.
The following can be used to generate a valid SML/NJ heap image:
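test.cm:
    Group is
        test.sml
        $/basis.cm
test.sml:
    structure Test =
    struct
      fun main (prog_name, args) =
        let
          val _ = print ("Program name: " ^ prog_name ^ "\n")
          val _ = print "Arguments:\n"
          (* List.app runs print on each argument for its side effect only *)
          val _ = List.app (fn s => print ("\t" ^ s ^ "\n")) args
        in
          (* exit status reported to the shell; a non-zero value would signal failure *)
          OS.Process.success
        end
    end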
To generate the heap image, run
$ ml-build test.cm Test.main test-image
and then run it with
$ sml @SMLload test-image.XXXXX arg1 arg2 "this is one argument"
where XXXXX is the architecture and OS suffix of your platform.
If you decide to use MLton at some point, you don't need a main function at all. MLton evaluates everything at top level, so you can define a main function and call it yourself, like this:
    fun main () = print "this is the main function\n"
    val foo = 4
    val _ = print ((Int.toString foo) ^ "\n")
    val _ = main ()
Then you can compile it with mlton foo.sml, which produces an executable named "foo". Running it gives:
    $ ./foo
    4
    this is the main function
Note that this is a single file. When you have multiple files, you will need either MLB (ML Basis) files, which are MLton's project files, or CM files; an MLB project is compiled with mlton project.mlb

Related

How to get results of a OCaml program in OCaml? (i.e OCaml version of ProcessBuilder in JAVA)

I'm very new to OCaml and have recently been studying program verification.
For my implementation, I need an OCaml library module that obtains the result of another OCaml program. Ideally it would offer the same functionality as ProcessBuilder in Java.
Is there such a library for OCaml?
Sure, OCaml provides facilities to create processes. Here is an example showing how to use the Unix.open_process_in function:
# let input = Unix.open_process_in "echo 'hello, world'";;
val input : in_channel = <abstr>
# input_line input;;
- : string = "hello, world"
# input_line input;;
Exception: End_of_file.
You can spawn a process that runs any program, no matter which language it is written in. If you want your processes to exchange OCaml data structures, you can use the Marshal module to translate OCaml values to and from strings. Note that unmarshalling is not type-checked, so both processes must agree on the type of the data being exchanged.

OCaml string length limitation when reading from stdin\file

As part of a Compiler Principles course I'm taking in my university, we're writing a compiler that's implemented in OCaml, which compiles Scheme code into CISC-like assembly (which is just C macros).
The basic operation of the compiler is as follows:
Read a *.scm file and convert it to an OCaml string.
Parse the string and perform various analyses.
Run a code generator on the AST output from the semantic analyzer, which outputs text into a *.c file.
Compile that file with GCC and run it in the terminal.
Well, all is good and well, except for this: I'm trying to read an input file that's around 4000 lines long and is basically one huge expression that's a mix of Scheme if and and.
I'm running the compiler via utop. When I try to read the input file, I immediately get a stack overflow error message. My initial guess is that the file is just too large for OCaml to handle, but I wasn't able to find any documentation that would support this theory.
Any suggestions?
The maximum string length is given by Sys.max_string_length. For a 32-bit system, it's quite short: 16777211. For a 64-bit system, it's 144115188075855863.
Unless you're using a 32-bit system, and your 4000-line file is over 16MB, I don't think you're hitting the string length limit.
A stack overflow is not what you'd expect to see when a string is too long.
It's more likely that you have infinite recursion, or possibly just a very deeply nested computation.
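The two limits quoted above can be reproduced by hand. Assuming the classic runtime block layout, in which 10 bits of each header word are reserved (8 for the tag, 2 for the GC colour), the remaining bits give the largest block size in words, and a string can use all but the last byte of its block. The sketch below is plain arithmetic, written in Python rather than OCaml, and the function name is made up for illustration:

```python
def max_string_length(word_bits):
    """Largest OCaml string length for a given word size, ASSUMING that
    10 bits of the block header are reserved (8 tag bits + 2 colour bits)."""
    bytes_per_word = word_bits // 8
    max_words = 2 ** (word_bits - 10) - 1   # largest block size, in words
    # The final byte of a string block records padding, so it is not usable.
    return max_words * bytes_per_word - 1


print(max_string_length(32))  # 16777211
print(max_string_length(64))  # 144115188075855863
```

Either way, a 4000-line source file is several orders of magnitude below both limits.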
Well, it turns out that the limitation was the maximum stack size that the OCaml runtime is configured to use.
I ran the following command in the terminal to raise the limit (the l parameter of OCAMLRUNPARAM sets the stack limit, in words):
export OCAMLRUNPARAM="l=5555555555"
This worked like a charm - I managed to read and compile the input file almost instantaneously.
For reference purposes, this is the code that reads the file:
    let file_to_string input_file =
      let in_channel = open_in input_file in
      let rec run () =
        try
          let ch = input_char in_channel in ch :: (run ())
        with End_of_file ->
          ( close_in in_channel;
            [] )
      in list_to_string (run ());;
where list_to_string is:
    let list_to_string s =
      let rec loop s n =
        match s with
        | [] -> String.make n '?'
        | car :: cdr ->
          let result = loop cdr (n + 1) in
          String.set result n car;
          result
      in
      loop s 0;;
Funny thing is, I also wrote file_to_string with tail recursion. That prevented the stack overflow, but for some reason it went into an infinite loop. Oh, well...

Programmatically load code in sml/nj

I am trying to load an external .sml file, say a.sml, and execute a function (add: int -> int -> int) defined in that file.
I know perfectly well how to do this in the interactive shell: use "a.sml";
But how can I achieve this from within a .sml file? I tried the following:
    val doTest =
      let
        val _ = print ("Loading..." ^ "\n")
        val _ = use "a.sml"
        val _ = print ("1 + 2 = " ^ Int.toString (add 1 2) ^ "\n")
      in
        1
      end
But the compiler's reaction is:
test.sml:7.49-7.52 Error: unbound variable or constructor: add
BTW: I know that using the CM is the more appropriate way. But in my case I do not know the file a.sml prior to the compilation.
You can't do this. The compiler must know the types of the functions you are calling at compile time. What you are asking is for SML to load a file at run time (use ...) and subsequently run the code therein. This isn't possible due to the phase distinction; type checking occurs during compilation, after which all type information can be forgotten.
If you're generating code and know the file name, you can still use the CM and compile in two steps using your build system. Then you'd get the type errors from the generated code in the second compilation step. Please describe your situation if such an approach doesn't work for you.

Differences when writing to / reading from the console between gfortran- and g77-compiled code

This one's going to take a bit of explaining. Please bear with me.
What I Have
I have in my possession some Fortran source code and some binaries that have been compiled from that code. I did not do the compilation, but there is a build script that suggests G77 was used to do it.
As well as the Fortran stuff, there is also some Java code that provides users with a GUI "wrapper" around the binaries. It passes information between itself and the binaries via their input/output/error pipes. The Java code is very messy, and this way of doing things adds a lot of boilerplate and redundancy, but it does the job and I know it works.
What I Need
Unfortunately, I'd like to make some changes:
I want to create a new Python wrapper for the binaries (or, more precisely, extend an existing Python program to become the new wrapper).
I want to be able to compile the Fortran code as part of this existing program's build process. I would like to use gfortran for this, since MinGW is used elsewhere in the build and so it will be readily available.
The Problem
When I compile the Fortran code myself using gfortran, I cannot get the resulting binaries to "talk" to either the current Java wrapper or my new Python wrapper.
Here are the various ways of printing to the console that I have tried in the Fortran code:
      subroutine printA(message)
      write(6,*) message
      end

      subroutine printB(message)
      write(*,*) message
      end

      subroutine printC(message)
      use iso_fortran_env
      write(output_unit,*) message
      end
There are read commands as well, but the code doesn't even get a chance to execute that part, so I'm not worrying about it yet.
Extra Info
I have to call gfortran with the -ffixed-line-length-132 flag so that the code compiles, but apart from that I don't use anything else. I have tried using the -ff2c flag in the vague hope that it will make a difference. It doesn't.
This stackoverflow post is informative, but doesn't offer me anything that works.
The relevant manual page suggests that printA should work just fine.
I'm working on Windows, but will need this to be multi-platform.
Just in case you're interested, the Java code uses Runtime.getRuntime().exec("prog.exe") to call the binaries and then the various "stream" methods of the resulting Process object to communicate with them. The Python code uses equivalents of this provided by the Popen object of the subprocess module.
I should also say that I am aware there are alternatives. Rewriting the code in Python (or something else like C++), or amending it so that it can be called via F2Py, have been ruled out as options. Using g77 is also a no-go; we have enough dependencies as it is. I'd like to be able to write to / read from the console properly with gfortran, or know that it's just not possible.
Hard to say without seeing more details from your Fortran and Python codes. The following pair of code works for me (at least under Linux):
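Fortran program repeating its input line by line, prefixed with the line number:
    program test_communication
      use iso_fortran_env, stdout => output_unit, stdin => input_unit
      implicit none
      character(100) :: buffer
      integer :: ii, ios
      ii = 1
      do
        ! "(A)" reads the whole line; list-directed input (*) stops at the first blank or comma
        read(stdin, "(A)", iostat=ios) buffer
        if (ios /= 0) exit   ! end of input or read error
        write(stdout, "(I0,A,A)") ii, "|", trim(buffer)
        flush(stdout)        ! push the line through the pipe immediately
        ii = ii + 1
      end do
    end program test_communication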
Python program invoking the Fortran binary. You can feed it with arbitrary strings from the console.
    import subprocess as sub
    print "Starting child"
    proc = sub.Popen("./a.out", stdin=sub.PIPE, stdout=sub.PIPE)
    while True:
        send = raw_input("Enter a string: ")
        if not send:
            print "Exiting loop"
            break
        proc.stdin.write(send)
        proc.stdin.write("\n")
        proc.stdin.flush()
        print "Sent:", send
        recv = proc.stdout.readline()
        print "Received:", recv.rstrip()
    print "Killing child"
    proc.kill()
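The wrapper above uses Python 2 syntax. A minimal Python 3 sketch of the same line-by-line exchange might look like the following. To keep it self-contained, it starts a stand-in child written in Python that mimics the Fortran program; the names CHILD_SOURCE and exchange are hypothetical and not part of the original code.

```python
import subprocess
import sys

# Stand-in child (hypothetical): echoes each input line prefixed with its
# line number and flushes after every line, like the Fortran program above.
CHILD_SOURCE = (
    "import sys\n"
    "for n, line in enumerate(sys.stdin, 1):\n"
    "    sys.stdout.write('%d|%s' % (n, line))\n"
    "    sys.stdout.flush()\n"
)


def exchange(lines, command=None):
    """Send each line to the child and collect one reply line per input line."""
    if command is None:
        # Assumption for the sketch: replace with ["./a.out"] for the real binary.
        command = [sys.executable, "-c", CHILD_SOURCE]
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,    # exchange str instead of bytes
        bufsize=1,    # line-buffer the parent's side of the pipes
    )
    replies = []
    try:
        for line in lines:
            proc.stdin.write(line + "\n")
            proc.stdin.flush()
            replies.append(proc.stdout.readline().rstrip("\n"))
    finally:
        proc.stdin.close()   # the child sees end-of-file and exits
        proc.wait()
        proc.stdout.close()
    return replies


if __name__ == "__main__":
    for reply in exchange(["hello", "two words"]):
        print("Received:", reply)
```

Against the compiled Fortran program, pass command=["./a.out"] (or the path to the .exe on Windows). If readline() hangs, the child is most likely still holding its output in a buffer, which is what the flush(stdout) call in the Fortran code is there to avoid.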

Can frama-c be used for header file analysis?

I was looking at frama-c as a way to process C header files in OCaml (e.g. for generating language bindings). It's attractive because it seems like a very well-documented and maintained project. However, after a lot of googling and searching through the documentation, I can't find anything suitable for the purpose. Am I just missing the right way to do this, or is it outside the scope of frama-c? It seems like a fairly trivial thing for it to do, compared to some of the other plugins.
As Pascal said, I don't think that it is possible from the command line, but because you will have to write some code anyway, you can set the flag Rmtmps.keepUnused. This is a script that you can use to see the declarations:
    let main () =
      Rmtmps.keepUnused := true;
      let file = File.from_filename "t.h" in
      let () = File.init_from_c_files [ file ] in
      let _ast = Ast.get () in
      let show_function f =
        let name = Kernel_function.get_name f in
        if not (Cil.Builtin_functions.mem name) then
          Format.printf "Function @[<2>%a:@ @[@[type: %a@]@ @[%s at %a@]@]@]@."
            Kernel_function.pretty f
            Cil_datatype.Typ.pretty (Kernel_function.get_type f)
            (if Kernel_function.is_definition f then "defined" else "declared")
            Cil.d_loc (Kernel_function.get_location f)
      in Globals.Functions.iter show_function
    let () = Db.Main.extend main
To run it, use the -load-script option like this:
$ frama-c -load-script script.ml
Developing a plug-in is more appropriate for more complex processing (see the Developer Manual for that), but a script makes it easy to test.
In the current state, I would say that it is unfortunately impossible to use Frama-C to parse declarations of functions that are neither defined nor used.
t.h:
int mybinding (int x, int y);
This gives you a view of the normalized AST. Normalized means that everything that could be simplified was:
$ frama-c -print t.h
[kernel] preprocessing with "gcc -C -E -I. t.h"
/* Generated by Frama-C */
And unfortunately, since mybinding was neither used nor defined, it was erased.
There is an option to keep declarations with specifications, but what you want is an option to keep all declarations. I have never noticed such an option:
$ frama-c -kernel-help
...
-keep-unused-specified-functions  keep specified-but-unused functions (set by
                                  default, opposite option is
                                  -remove-unused-specified-functions)
And the option to keep functions with specifications does not do what you want:
$ frama-c -keep-unused-specified-functions -print t.h
[kernel] preprocessing with "gcc -C -E -I. t.h"
/* Generated by Frama-C */