Convert TextIO stream into BinIO stream in Standard ML - sml

In Standard ML, I have a function that writes to BinIO.outstream and I would like it to write to the standard output.
While the TextIO structure has stdOut of type TextIO.outstream, BinIO has no such value, and TextIO.outstream is not compatible with BinIO.outstream:
- f;
val it = fn : BinIO.outstream -> unit
- TextIO.stdOut;
val it = - : TextIO.outstream
- f TextIO.stdOut;
stdIn:6.1-6.16 Error: operator and operand don't agree [tycon mismatch]
operator domain: BinIO.outstream
operand: TextIO.outstream
in expression:
f TextIO.stdOut
Now, what is the easiest way to convert a TextIO.outstream into a BinIO.outstream? I.e., how do I implement ??? below?
- f (??? TextIO.stdOut);
Update:
For those who are interested, here is an implementation in accordance with Andreas' answer:
fun textWriterToBinWriter (TextPrimIO.WR { name,
chunkSize,
writeVec,
writeArr,
writeVecNB,
writeArrNB,
block,
canOutput,
getPos,
setPos,
endPos,
verifyPos,
close,
ioDesc }) =
let
fun convertWriteVec textWriteVec =
textWriteVec o CharVectorSlice.full o Byte.unpackStringVec
fun convertWriteArr textWriteArr =
textWriteArr o CharArraySlice.full o CharArray.fromList o explode o Byte.unpackString
in
BinPrimIO.WR {
name = name,
chunkSize = chunkSize,
writeVec = Option.map convertWriteVec writeVec,
writeArr = Option.map convertWriteArr writeArr,
writeVecNB = Option.map convertWriteVec writeVecNB,
writeArrNB = Option.map convertWriteArr writeArrNB,
block = block,
canOutput = canOutput,
getPos = getPos,
setPos = setPos,
endPos = endPos,
verifyPos = verifyPos,
close = close,
ioDesc = ioDesc }
end
fun textStreamToBinStream' textStream =
let
val (textWriter, bufferMode) = TextIO.StreamIO.getWriter textStream
in
BinIO.StreamIO.mkOutstream (textWriterToBinWriter textWriter, bufferMode)
end
fun textStreamToBinStream textStream =
let
val textStream' = TextIO.getOutstream textStream
in
BinIO.mkOutstream (textStreamToBinStream' textStream')
end

In principle, it should be possible to write a TextIO.outstream to BinIO.outstream conversion function (or vice versa), but while relatively mechanical, it requires a bit of work. You need to implement:
using the Byte structure, a conversion function TextPrimIO.writer -> BinPrimIO.writer
using that, a conversion function TextIO.StreamIO.outstream -> BinIO.StreamIO.outstream
using that, a conversion function TextIO.outstream -> BinIO.outstream
However, I doubt that doing a conversion like that is advisable. In particular, OS interfaces and tools typically assume that stdout and friends are, in fact, text streams.
If all you need is to write a Word8 vector, then it should be enough to convert it to string beforehand, e.g. using the Byte structure.

Related

Tokenize string with parameterised delimiter

I need to tokenize a string into a list of words in Standard ML, based on a delimiter that is passed as a function parameter. This is the code I have so far:
val splitter = String.token(fn (c:string,x:char) => c=x);
I tried this, but I know it's wrong. Please help me fix it.
The type of c is string while the type of x is char, so they are not comparable. You can convert x to a string with Char.toString (or with String.str, which does not escape special characters).
splitter = String.token(fn (c:string,x:char) => c=Char.toString x);
There is no standard library function called String.token, but maybe you mean String.tokens:
- String.tokens;
> val it = fn : (char -> bool) -> string -> string list
You're not saying if your separator is a string or a char, but assuming it's a char,
fun splitter sep s = String.tokens (fn c => c = sep) s
You could also define it as such,
fun curry f a b = f (a, b)
val splitter = String.tokens o curry op=

F# UnitTesting function with side effect

I am a C# dev who has just started learning F#, and I have a few questions about unit testing. Let's say I want to test the following code:
let input () = Console.In.ReadLine()
type MyType = { Name: string; Coordinate: Coordinate }
let readMyType =
    input().Split(';')
    |> fun x -> { Name = x.[1]
                  Coordinate = { Longitude = float(x.[4].Replace(",","."))
                                 Latitude = float(x.[5].Replace(",",".")) } }
As you can notice, there are a few points to take in consideration:
readMyType calls input(), which has a side effect.
readMyType assumes many things about the string it reads (it contains ';', it has at least 6 columns, and some columns are floats written with ',').
I think the way of doing this would be to:
inject the input() func as parameter
try to test what we are getting (pattern matching?)
Using NUnit as explained here
To be honest, I'm struggling to find an example that shows this, so that I can learn the syntax and other best practices in F#. If you could point me in the right direction, that would be great.
Thanks in advance.
First, your function is not really a function. It's a value. The distinction between functions and values is syntactic: if you have any parameters, you're a function; otherwise - you're a value. The consequence of this distinction is very important in presence of side effects: values are computed only once, during initialization, and then never change, while functions are executed every time you call them.
For your specific example, this means that the following program:
let main _ =
    readMyType
    readMyType
    readMyType
    0
will ask the user for only one input, not three. Because readMyType is a value, it gets initialized once, at program start, and any subsequent reference to it just gets the pre-computed value, but doesn't execute the code over again.
Second, - yes, you're right: in order to test this function, you'd need to inject the input function as a parameter:
let readMyType (input: unit -> string) =
    input().Split(';')
    |> fun x -> { Name = x.[1]
                  Coordinate = { Longitude = float(x.[4].Replace(",","."))
                                 Latitude = float(x.[5].Replace(",",".")) } }
and then have the tests supply different inputs and check different outcomes:
let [<Test>] ``Successfully parses correctly formatted string``() =
    let input() = "foo;the_name;bar;baz;1,23;4,56"
    let result = readMyType input
    result |> should equal { Name = "the_name"; Coordinate = { Longitude = 1.23; Latitude = 4.56 } }
let [<Test>] ``Fails when the string does not have enough parts``() =
    let input() = "foo"
    (fun () -> readMyType input) |> shouldFail
// etc.
Put these tests in a separate project, add reference to your main project, then add test runner to your build script.
UPDATE
From your comments, I got the impression that you were seeking not only to test the function as it is (which follows from your original question), but also asking for advice on improving the function itself, so as to make it more safe and usable.
Yes, it is definitely better to check error conditions within the function and return an appropriate result. Unlike in C#, however, it is usually better to avoid exceptions as a control flow mechanism. Exceptions are for exceptional situations, ones you would never have expected; that is why they are called exceptions. But since the whole point of your function is parsing input, it stands to reason that invalid input is one of the normal conditions for it.
In F#, instead of throwing exceptions, you would usually return a result that indicates whether the operation was successful. For your function, the following type seems appropriate:
type ErrorMessage = string
type ParseResult = Success of MyType | Error of ErrorMessage
And then modify the function accordingly:
let parseMyType (input: string) =
    let parts = input.Split [|';'|]
    if parts.Length < 6
    then
        Error "Not enough parts"
    else
        Success
            { Name = parts.[1]
              Coordinate = { Longitude = float(parts.[4].Replace(',','.'))
                             Latitude = float(parts.[5].Replace(',','.')) }
            }
This function will return us either MyType wrapped in Success or an error message wrapped in Error, and we can check this in tests:
let [<Test>] ``Successfully parses correctly formatted string``() =
    let input = "foo;the_name;bar;baz;1,23;4,56"
    let result = parseMyType input
    result |> should equal (Success { Name = "the_name"; Coordinate = { Longitude = 1.23; Latitude = 4.56 } })
let [<Test>] ``Fails when the string does not have enough parts``() =
    let input = "foo"
    let result = parseMyType input
    result |> should equal (Error "Not enough parts")
Note that, even though the code now checks for enough parts in the string, there are still other possible error conditions: for example, parts.[4] may not be a valid number.
I am not going to expand on this further, as that will make the answer way too long. I will only stop to mention two points:
Unlike C#, verifying all error conditions does not have to end up as a pyramid of doom. Validations can be nicely combined in a linear-looking way (see example below).
The F# 4.1 standard library already provides a type similar to ParseResult above, named Result<'t, 'e>.
For more on this approach, check out this wonderful post (and don't forget to explore all links from it, especially the video).
And here, I will leave you with an example of what your function could look like with full validation of everything (keep in mind though that this is not the cleanest version still):
let parseFloat (s: string) =
    match System.Double.TryParse (s.Replace(',','.')) with
    | true, x -> Ok x
    | false, _ -> Error ("Not a number: " + s)
let split n (s:string) =
    let parts = s.Split [|';'|]
    if parts.Length < n then Error "Not enough parts"
    else Ok parts
let parseMyType input =
    input |> split 6 |> Result.bind (fun parts ->
        parseFloat parts.[4] |> Result.bind (fun lgt ->
            parseFloat parts.[5] |> Result.bind (fun lat ->
                Ok { Name = parts.[1]; Coordinate = { Longitude = lgt; Latitude = lat } } )))
Usage:
> parseMyType "foo;name;bar;baz;1,23;4,56"
val it : Result<MyType,string> = Ok {Name = "name";
Coordinate = {Longitude = 1.23;
Latitude = 4.56;};}
> parseMyType "foo"
val it : Result<MyType,string> = Error "Not enough parts"
> parseMyType "foo;name;bar;baz;badnumber;4,56"
val it : Result<MyType,string> = Error "Not a number: badnumber"
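For comparison, the same bind-chaining pattern can be sketched in OCaml, whose standard Result module offers a similar bind. This is only an illustrative sketch: the names coordinate, my_type, parse_float, split and parse_my_type are made up here, and float_of_string_opt accepts a few forms (such as "nan" or hexadecimal floats) that a stricter parser might reject.

```ocaml
type coordinate = { longitude : float; latitude : float }
type my_type = { name : string; coordinate : coordinate }

(* Accept a decimal comma by normalising it to a dot before parsing. *)
let parse_float (s : string) : (float, string) result =
  let normalized = String.map (fun c -> if c = ',' then '.' else c) s in
  match float_of_string_opt normalized with
  | Some x -> Ok x
  | None -> Error ("Not a number: " ^ s)

(* Split on ';' and require at least [n] fields. *)
let split (n : int) (s : string) : (string array, string) result =
  let parts = Array.of_list (String.split_on_char ';' s) in
  if Array.length parts < n then Error "Not enough parts" else Ok parts

(* Each step runs only if the previous one returned [Ok]; the first
   [Error] short-circuits the rest. Binding operators need OCaml 4.08+. *)
let parse_my_type (input : string) : (my_type, string) result =
  let ( let* ) = Result.bind in
  let* parts = split 6 input in
  let* lgt = parse_float parts.(4) in
  let* lat = parse_float parts.(5) in
  Ok { name = parts.(1); coordinate = { longitude = lgt; latitude = lat } }
```

The let* form plays the role of the nested Result.bind calls above, without the rightward drift.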
This is a little follow-up to the excellent answer of @FyodorSoikin, trying to explore the suggestion
keep in mind though that this is not the cleanest version still
Making the ParseResult generic
type ParseResult<'a> = Success of 'a | Error of ErrorMessage
type ResultType = ParseResult<Defibrillator> // see the Test Cases
we can define a builder
type Builder() =
    member x.Bind(r :ParseResult<'a>, func : ('a -> ParseResult<'b>)) =
        match r with
        | Success m -> func m
        | Error w -> Error w
    member x.Return(value) = Success value
let builder = Builder()
so we get a concise notation:
let parse input =
    builder {
        let! parts = input |> split 6
        let! lgt = parts.[4] |> parseFloat
        let! lat = parts.[5] |> parseFloat
        return { Name = parts.[1]; Coordinate = { Longitude = lgt; Latitude = lat } }
    }
Test Cases
Tests are always fundamental
let [<Test>] ``3. Successfully parses correctly formatted string``() =
    let input = "foo;the_name;bar;baz;1,23;4,56"
    let result = parse input
    result |> should equal (ResultType.Success { Name = "the_name"; Coordinate = { Longitude = 1.23; Latitude = 4.56 } })
let [<Test>] ``3. Fails when the string does not have enough parts``() =
    let input = "foo"
    let result = parse input
    result |> should equal (ResultType.Error "Not enough parts")
let [<Test>] ``3. Fails when the string does not contain a number``() =
    let input = "foo;name;bar;baz;badnumber;4,56"
    let result = parse input
    result |> should equal (ResultType.Error "Not a number: badnumber")
Notice the usage of a specific ParseResult from the generic one.
minor note
Double.TryParse alone is enough in the following:
let parseFloat (s: string) =
    match Double.TryParse s with
    | true, x -> Success x
    | false, _ -> Error ("Not a number: " + s)

Write command-line arguments to file in SML

I am trying to write the command line arguments from my SML program into a file, each on a separate line. If I were to run sml main.sml a b c easy as 1 2 3 on the command line, the desired output would be to have a file with the contents:
a
b
c
easy
as
1
2
3
However, I am getting the following output from SML:
$ sml main.sml a b c easy as 1 2 3
val filePath = "/Users/Josue/Desktop/espi9890.txt" : string
val args = ["a","b","c","easy","as","1","2","3"] : string list
main.sml:4.21 Error: syntax error: inserting EQUALOP
/usr/local/smlnj/bin/sml: Fatal error -- Uncaught exception Compile with "syntax error" raised at
../compiler/Parse/main/smlfile.sml:15.24-15.46
With this code:
val filePath = "/Users/Josue/Desktop/espi9890.txt";
val args = CommandLine.arguments();
fun writeListToFile x =
val str = hd x ^ "\n";
val fd = TextIO.openAppend filePath;
TextIO.output (fd, str);
TextIO.closeOut fd;
writeListToFile (tl x);
| fun writeListToFile [] =
null;
writeListToFile args;
Am I missing something?
The correct syntax for nested value declarations is:
fun writeListToFile (s::ss) =
let val fd = TextIO.openAppend filePath
val _ = TextIO.output (fd, s ^ "\n")
val _ = TextIO.closeOut fd
in writeListToFile ss end
| writeListToFile [] = ()
That is,
(Error) You're forgetting the let ... in ... end.
(Error) Your second pattern, [], will never match because the first one, x, is more general and matches all input lists (including the empty one). So even if your syntax error was fixed, this function would loop until it crashes because you are trying to take the hd/tl of an empty list.
(Error) When a function has multiple match cases, only the first one must be prepended with fun and the rest must have a | instead. (You can decide freely how to indent this.)
(Error) There are two kinds of semicolons in SML: One is for separating declarations, and one is an operator that discards the value (but not the effect) of its first operand. The first kind that separates declarations can always be avoided. The second kind is the one you are trying to employ in order to chain multiple expressions that each have a desired (file I/O) effect (and is equivalent to having a let-expressions with multiple effectful declarations in a row, like above).
But... at the top-level (e.g. in a function body), SML is unable to tell the difference between the two kinds of semicolons, since they could both occur there. After all, the first kind that we want to avoid marks the ending of the function body while the second kind just marks the end of a sub-expression in the function body.
The way to avoid this ambiguity is to wrap the ; operator where no declarations are allowed, e.g. between in and end, or inside a parenthesis.
(Error) There is no point in having this function return null. You were probably thinking nil (the empty list, aka []), but val null : 'a list -> bool is a function! Really, it is nonsensical to have a return value for this function. If anything, it could be a bool indicating if the lines were written successfully (in which case you probably need to handle IO exceptions). The closest you get to a function that does not return anything is a function that returns the type unit (with the value ()).
(Suggestion) You can use hd/tl to split the list, but you can also use pattern matching. Use pattern matching, like the examples I've given.
(Suggestion) You can also use semicolons instead of the val _ = ... declarations; it's just a matter of taste. E.g.:
fun writeListToFile (s::ss) =
let val fd = TextIO.openAppend filePath
in TextIO.output (fd, s ^ "\n")
; TextIO.closeOut fd
; writeListToFile ss
end
| writeListToFile [] = ()
(Suggestion) It is rather silly that every time the function calls itself, it opens the file, appends, and closes the file. Ideally you only open and close the file once:
fun writeListToFile lines =
let val fd = TextIO.openAppend filePath
fun go [] = TextIO.closeOut fd
| go (s::ss) = ( TextIO.output (fd, s ^ "\n") ; go ss )
in go lines end
(Suggestion) Since you are doing the same thing to each element in a list, you may also consider using a higher-order function that generalizes the iteration. Normally, that would be a val map : ('a -> 'b) -> 'a list -> 'b list, but since TextIO.output returns a unit, the very similar val app : ('a -> unit) -> 'a list -> unit is even better:
fun writeListToFile lines =
let val fd = TextIO.openAppend filePath
in List.app (fn s => TextIO.output (fd, s ^ "\n")) lines
; TextIO.closeOut fd
end
(Suggestion) Lastly, you may want to call this function appendListToFile, or simply appendLines, and take filePath as an argument to the function, since filePath implies that it is to a file, and the function does add linebreaks to each s. Names matter.
fun appendLines filePath lines =
let val fd = TextIO.openAppend filePath
in List.app (fn s => TextIO.output (fd, s ^ "\n")) lines
; TextIO.closeOut fd
end
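For readers comparing with OCaml, the "open once, write every line, close once" shape of appendLines could be sketched roughly as follows. The name append_lines is hypothetical, the 0o644 permission is an arbitrary choice, and Fun.protect (OCaml 4.08+) is used here so the channel is closed even if a write raises, which the SML versions above do not attempt.

```ocaml
(* Append each string in [lines] to [file_path], one per line.
   The file is created if it does not exist yet. *)
let append_lines (file_path : string) (lines : string list) : unit =
  let oc =
    open_out_gen [ Open_append; Open_creat; Open_text ] 0o644 file_path
  in
  Fun.protect
    ~finally:(fun () -> close_out oc)
    (fun () -> List.iter (fun s -> output_string oc (s ^ "\n")) lines)
```

List.iter plays the same role here as List.app does in the SML version.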

Empty character in OCaml

I am trying to do something fairly simple. I want to take a string such as "1,000" and return the string "1000".
Here was my attempt:
String.map (function x -> if x = ',' then '' else x) "1,000";;
However, I get a compiler error saying there is a syntax error at ''.
Thanks for the insight!
Unfortunately, there's no character like the one you're looking for. There is a string that's 0 characters long (""), but there's no character that's not there at all. All characters (so to speak) are 1 character.
To solve your problem you need a more general operation than String.map. The essence of a map is that its input and output have the same shape but different contents. For strings this means that the input and output are strings of the same length.
Unless you really want to avoid imperative coding (which is actually a great thing to avoid, especially when starting out with OCaml), you would probably do best using String.iter and a buffer (from the Buffer module).
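For the specific case of deleting one character, a short alternative is to split the string on that character and concatenate the pieces. This is a minimal sketch, assuming OCaml 4.04 or later for String.split_on_char; the name remove_char is made up for illustration.

```ocaml
(* Split on the unwanted character, then glue the pieces back together.
   The result can be shorter than the input, which String.map cannot do. *)
let remove_char (unwanted : char) (s : string) : string =
  String.concat "" (String.split_on_char unwanted s)

let () = print_endline (remove_char ',' "1,000")
```

This only covers deletion; anything that transforms characters as well as dropping them still needs a more general function.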
Update
The string_map_partial function given by Andreas Rossberg is pretty nice. Here's another implementation that uses String.iter and a buffer:
let string_map_partial f s =
let b = Buffer.create (String.length s) in
let addperhaps c =
match f c with
| None -> ()
| Some c' -> Buffer.add_char b c'
in
String.iter addperhaps s;
Buffer.contents b
Just an alternate implementation with different stylistic tradeoffs. Not faster, probably not slower either. It's still written imperatively (for the same reason).
What you'd need here is a function like the following, which unfortunately is not in the standard library:
(* string_map_partial : (char -> char option) -> string -> string *)
let string_map_partial f s =
  (* Strings are immutable in current OCaml, so the scratch buffer is a Bytes value. *)
  let buf = Bytes.create (String.length s) in
  let j = ref 0 in
  for i = 0 to String.length s - 1 do
    match f s.[i] with
    | None -> ()
    | Some c -> Bytes.set buf !j c; incr j
  done;
  Bytes.sub_string buf 0 !j
You can then write:
string_map_partial (fun c -> if c = ',' then None else Some c) "1,000"
(Note: I chose an imperative implementation for string_map_partial, because a purely functional one would require repeated string concatenation, which is fairly expensive in OCaml.)
A purely functional version could be this one:
let string_map_partial f s =
let n = String.length s in
let rec map_str i acc =
if i < n then
map_str (i + 1) (acc ^ (f (String.make 1 s.[i])))
else acc
in map_str 0 ""
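This is tail-recursive, but less performant than the imperative version. Note that here f takes and returns a string rather than a char option, so a character is dropped by returning "".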
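A functional formulation that keeps the char -> char option signature and avoids concatenating by hand can be sketched with the standard Seq module. This assumes OCaml 4.07 or later (for String.to_seq, Seq.filter_map and String.of_seq); the name string_filter_map is hypothetical.

```ocaml
(* View the string as a sequence of chars, keep only the [Some] results
   of [f], and rebuild a string from what is left. *)
let string_filter_map (f : char -> char option) (s : string) : string =
  String.to_seq s |> Seq.filter_map f |> String.of_seq

let without_commas =
  string_filter_map (fun c -> if c = ',' then None else Some c) "1,000"
```

Whether this is faster or slower than the loop-based versions is not something this sketch claims; it mainly trades explicit indices for a pipeline.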

Linear types in OCaml

Rust has a linear type system. Is there any (good) way to simulate this in OCaml? E.g., when using ocaml-lua, I want to make sure some functions are called only when Lua is in a specific state (table on top of stack, etc).
Edit: Here's a recent paper about resource polymorphism relevant to the question: https://arxiv.org/abs/1803.02796
Edit 2: There are also a number of articles about session types in OCaml available, including syntax extensions to provide some syntactic sugar.
As suggested by John Rivers, you can use a monadic style to represent
"effectful" computation in a way that hides the linear constraint in
the effect API. Below is one example where a type ('a, 'st) t
represents a computation that uses a file handle (whose identity is
implicit/unspoken to guarantee that it cannot be duplicated), produces
a result of type 'a, and leaves the file handle in the state 'st (a
phantom type being either "open" or "close"). You have to use the run
of the monad¹ to actually do anything, and its type ensures that the
file handles are correctly closed after use.
module File : sig
type ('a, 'st) t
type open_st = Open
type close_st = Close
val bind : ('a, 's1) t -> ('a -> ('b, 's2) t) -> ('b, 's2) t
val open_ : string -> (unit, open_st) t
val read : (string, open_st) t
val close : (unit, close_st) t
val run : ('a, close_st) t -> 'a
end = struct
type ('a, 'st) t = unit -> 'a
type open_st = Open
type close_st = Close
let run m = m ()
let bind m f = fun () ->
let x = run m in
run (f x)
let close = fun () ->
print_endline "[lib] close"
let read = fun () ->
let result = "toto" in
print_endline ("[lib] read " ^ result);
result
let open_ path = fun () ->
print_endline ("[lib] open " ^ path)
end
let test =
let open File in
let (>>=) = bind in
run begin
open_ "/tmp/foo" >>= fun () ->
read >>= fun content ->
print_endline ("[user] read " ^ content);
close
end
(* starting with OCaml 4.08, you can use binding operators:
( let* ) instead of ( >>= ) *)
let test =
let open File in
let ( let* ) = bind in
run begin
let* () = open_ "/tmp/foo" in
let* content = read in
print_endline ("[user] read " ^ content);
close
end
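One possible variation, not part of the answer above, is to track both the state before and the state after each step, so that bind only accepts steps whose states line up. The sketch below is hypothetical throughout: the module name Handle, the type names closed and opened, and the stubbed bodies (no real file is touched) are all assumptions made for illustration.

```ocaml
module Handle : sig
  type closed
  type opened
  (* A computation yielding 'a that moves the handle from 'pre to 'post. *)
  type ('a, 'pre, 'post) t
  val return : 'a -> ('a, 's, 's) t
  val bind : ('a, 'p, 'q) t -> ('a -> ('b, 'q, 'r) t) -> ('b, 'p, 'r) t
  val open_ : string -> (unit, closed, opened) t
  val read : (string, opened, opened) t
  val close : (unit, opened, closed) t
  (* Only computations that start and end with a closed handle can run. *)
  val run : ('a, closed, closed) t -> 'a
end = struct
  type closed
  type opened
  type ('a, 'pre, 'post) t = unit -> 'a
  let return x = fun () -> x
  let bind m f = fun () -> f (m ()) ()
  let open_ _path = fun () -> ()   (* stub: a real version would open a file *)
  let read = fun () -> "toto"      (* stub: fixed content *)
  let close = fun () -> ()         (* stub *)
  let run m = m ()
end

let content =
  let open Handle in
  let ( let* ) = bind in
  run
    (let* () = open_ "/tmp/foo" in
     let* s = read in
     let* () = close in
     return s)
```

With these signatures, leaving out the close step or reading before open_ should be rejected by the type checker, because the intermediate states no longer match.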
Of course, this is only meant to give you a taste of the style of
API. For more serious uses, see Oleg's monadic
regions examples.
You may also be interested in the research programming language
Mezzo, which aims to
be a variant of ML with finer-grained control of state (and related
effectful patterns) through a linear typing discipline with separated
resources. Note that it is only a research experiment for now, not
actually aimed at users. ATS is also relevant,
though ultimately less ML-like. Rust may actually be a reasonable
"practical" counterpart to these experiments.
¹: it is actually not a monad because it has no return/unit combinator, but the point is to force type-controlled sequencing as the monadic bind operator does. It could have a map, though.
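As a rough illustration of that last remark, a map combinator for the same unit -> 'a representation might look like the sketch below. The module name File_with_map is hypothetical, and the operations are stubs that do no real I/O.

```ocaml
module File_with_map : sig
  type ('a, 'st) t
  type open_st = Open
  type close_st = Close
  val bind : ('a, 's1) t -> ('a -> ('b, 's2) t) -> ('b, 's2) t
  (* Transform the result without changing the handle state. *)
  val map : ('a -> 'b) -> ('a, 'st) t -> ('b, 'st) t
  val open_ : string -> (unit, open_st) t
  val read : (string, open_st) t
  val close : (unit, close_st) t
  val run : ('a, close_st) t -> 'a
end = struct
  type ('a, 'st) t = unit -> 'a
  type open_st = Open
  type close_st = Close
  let run m = m ()
  let bind m f = fun () -> run (f (run m))
  let map f m = fun () -> f (run m)
  let open_ _path = fun () -> ()   (* stub *)
  let read = fun () -> "toto"      (* stub: fixed content *)
  let close = fun () -> ()         (* stub *)
end

(* [map] lets the final, closed-state step carry a value out of [run]. *)
let content_length =
  let open File_with_map in
  let ( >>= ) = bind in
  run
    (open_ "/tmp/foo" >>= fun () ->
     read >>= fun content ->
     map (fun () -> String.length content) close)
```

Without a return combinator, mapping over close is one way to get a non-unit result out of a computation that must end in the closed state.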