Printing only print output with SML/NJ - sml

I'm trying to use SML/NJ, and I use sml < source.sml to run the code, but it prints out too much information.
For example, this is the source.sml:
fun fac 0 = 1
  | fac n = n * fac (n - 1)
val r = fac 10 ;
print(Int.toString(r));
This is the output:
Standard ML of New Jersey v110.77 [built: Tue Mar 10 07:03:24 2015]
- val fac = fn : int -> int
val r = 3628800 : int
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[autoloading done]
3628800val it = () : unit
From Suppress "val it" output in Standard ML, How to disable SMLNJ warnings?, and SMLNJ want to remove "val it = () : unit" from every print statement execution, I got some hints on how to suppress these messages.
I ran CM_VERBOSE=false sml < $filename and added the line Control.Print.out := {say = fn _ => (), flush = fn () => ()}; to the code, but I still get some output:
Standard ML of New Jersey v110.77 [built: Tue Mar 10 07:03:24 2015]
- 3628800
How can I print out only the output?

The sml command is intended to be used interactively. It sounds to me like you would be better off building a standalone executable from your program instead.
There are a few options:
If you are relying on SML/NJ extensions, or if you simply cannot use another ML implementation, you can follow the instructions in this post to build an SML/NJ heap image that can be turned into a standalone executable using heap2exec.
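For illustration, a minimal sketch of that route; the entry-point name main, the image name fac, and the heap-image suffix are choices or platform details, not anything fixed by SML/NJ:
(* fac.sml: define an entry point, then dump a heap image *)
fun fac 0 = 1
  | fac n = n * fac (n - 1)

(* exportFn expects string * string list -> OS.Process.status *)
fun main (_, _) = (print (Int.toString (fac 10) ^ "\n"); OS.Process.success)

val _ = SMLofNJ.exportFn ("fac", main)
Running sml fac.sml writes a heap image such as fac.amd64-linux (the suffix depends on your platform), which heap2exec then turns into a standalone executable:
$ heap2exec fac.amd64-linux fac
$ ./fac
3628800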
A better option might be to use the MLton compiler, another implementation of Standard ML. It lacks a REPL, but unlike SML/NJ it requires no boilerplate to generate a standalone executable. Building is as simple as issuing:
$ mlton your-program.sml
$ ./your-program
3628800

Related

R: need to replace invisible/accented characters with regex

I'm working with a file generated on several different machines with different locale settings, so I ended up with a data-frame column containing different renderings of the same word:
CÓRDOBA
CÓRDOBA
CÒRDOBA
I'd like to convert all those to CORDOBA. I've tried doing
t<-gsub("Ó|Ó|Ã’|°|°|Ò","O",t,ignore.case = T) # t is the vector of names
Which works until it finds some "invisible" characters:
As you can see, I'm not able to see, in R, the additional character that lies between Ã and \ (if I copy-paste into MS Word, it shows as an empty rectangle). I've tried to dput the vector, but it prints exactly as on screen (without the "invisible" character).
I ran Encoding(t), and it returns unknown for all values.
My system configuration follows:
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Spanish_Colombia.1252 LC_CTYPE=Spanish_Colombia.1252 LC_MONETARY=Spanish_Colombia.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Colombia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zoo_1.7-12 dplyr_0.4.2 data.table_1.9.4
loaded via a namespace (and not attached):
[1] R6_2.1.0 assertthat_0.1 magrittr_1.5 plyr_1.8.3 parallel_3.2.1 DBI_0.3.1 tools_3.2.1 reshape2_1.4.1 Rcpp_0.11.6 stringi_0.5-5
[11] grid_3.2.1 stringr_1.0.0 chron_2.3-47 lattice_0.20-31
I've used saveRDS to save a data frame of actual and expected toy values, which can be read back with readRDS from here. I'm not absolutely sure it will load with the same problems I have (it depends on your locale), but I hope it does, so you can provide some help.
In the end, I'd like to convert all those special characters to unaccented ones (Ó to O, etc.), hopefully without having to enter each special one into a regex by hand; in other words, I'd like, if possible, some sort of gsub("[:weird:]","[:equivalentToWeird:]",t). If that's not possible, I'd at least like to be able to find (and replace) those "invisible" characters.
Thanks,
############## EDIT TO ADD ###################
If I run the following code:
d <- readRDS("c:/path/to/downloaded/Dropbox/file/inv_char.Rdata")
library(stringi)
stri_escape_unicode(d$actual)
This is what I get:
[1] "\\u00c3\\u201cN N\\u00c2\\u00b0 08 \\\"CACIQUE CALARC\\u00c3\\u0081\\\" - ARMENIA"
[2] "\\u00d3N N\\u00b0 08 \\\"CACIQUE CALARC\\u00c1\\\" - ARMENIA"
[3] "\\u00d3N N\\u00b0 08 \\\"CACIQUE CALARC\\u00c1\\\" - ARMENIA(ALTERNO)"
Normal output is:
> d$actual
[1] ÓN N° 08 "CACIQUE CALARCÃ" - ARMENIA
[2] ÓN N° 08 "CACIQUE CALARCÁ" - ARMENIA
[3] ÓN N° 08 "CACIQUE CALARCÁ" - ARMENIA(ALTERNO)
With the help of @hadley, who pointed me towards stringi, I ended up discovering the offending characters and replacing them. This was my initial attempt:
# requires the stringi package for the escape/unescape helpers
unweird <- function(t) {
  t <- stri_escape_unicode(t)
  # replace the escaped byte sequences of each accented variant
  t <- gsub("\\\\u00c3\\\\u0081|\\\\u00c1", "A", t)
  t <- gsub("\\\\u00c3\\\\u02c6|\\\\u00c3\\\\u2030|\\\\u00c9|\\\\u00c8", "E", t)
  t <- gsub("\\\\u00c3\\\\u0152|\\\\u00c3\\\\u008d|\\\\u00cd|\\\\u00cc", "I", t)
  t <- gsub("\\\\u00c3\\\\u2019|\\\\u00c3\\\\u201c|\\\\u00c2\\\\u00b0|\\\\u00d3|\\\\u00b0|\\\\u00d2|\\\\u00ba|\\\\u00c2\\\\u00ba", "O", t)
  t <- gsub("\\\\u00c3\\\\u2018|\\\\u00d1", "N", t)
  t <- gsub("\\u00a0|\\u00c2\\u00a0", "", t)
  t <- gsub("\\\\u00f3", "o", t)
  stri_unescape_unicode(t)  # return the unescaped vector explicitly
}
which produced the expected result. I was a little curious about other stringi functions, so I wondered whether its substitution function could be faster on my 3.3 million rows. I then tried stri_replace_all_regex like this:
stri_unweird <- function(t) {
  stri_unescape_unicode(stri_replace_all_regex(
    stri_escape_unicode(t),
    c("\\\\u00c3\\\\u0081|\\\\u00c1",
      "\\\\u00c3\\\\u02c6|\\\\u00c3\\\\u2030|\\\\u00c9|\\\\u00c8",
      "\\\\u00c3\\\\u0152|\\\\u00c3\\\\u008d|\\\\u00cd|\\\\u00cc",
      "\\\\u00c3\\\\u2019|\\\\u00c3\\\\u201c|\\\\u00c2\\\\u00b0|\\\\u00d3|\\\\u00b0|\\\\u00d2|\\\\u00ba|\\\\u00c2\\\\u00ba",
      "\\\\u00c3\\\\u2018|\\\\u00d1",
      "\\u00a0|\\u00c2\\u00a0",
      "\\\\u00f3"),
    c("A", "E", "I", "O", "N", "", "o"),
    vectorize_all = FALSE))
}
As a side note, I ran microbenchmark on both methods; these are the results:
g <- microbenchmark(unweird(t), stri_unweird(t), times = 100L)
summary(g)
             expr      min       lq     mean   median       uq      max neval cld
1      unweird(t) 423.0083 425.6400 431.9609 428.1031 432.6295 490.7658   100   b
2 stri_unweird(t) 118.5831 119.5057 121.2378 120.3550 121.8602 138.3111   100   a
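A follow-up note: once strings are genuinely accented (rather than double-encoded mojibake like the pairs above), stringi can transliterate to plain ASCII in one call, which is close to the gsub("[:weird:]",...) idea from the question. A sketch:
library(stringi)
# Strip accents via ICU transliteration, e.g. "Ó" -> "O", "Ò" -> "O".
stri_trans_general(c("CÓRDOBA", "CÒRDOBA"), "Latin-ASCII")
#> [1] "CORDOBA" "CORDOBA"
This does not repair double-encoded bytes, though, which is why the escape/replace/unescape round-trip above was still needed.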

OCaml compile error with ocamlfind

Here is the code:
class parser =
  let test1 = function
    | 1 -> print_int 1
    | 2 -> print_int 2
    | _ -> print_int 3 in
  let test = function
    | 1 -> print_int 1
    | 2 -> print_int 2
    | _ -> print_int 3 in
  object (self)
  end
Here is the _tags
true: syntax(camlp4o)
true: package(deriving,deriving.syntax)
true: thread,debug,annot
true: bin_annot
Here is the compile command:
ocamlbuild -use-ocamlfind test.native
Here is the compile error:
Warning: tag "package" does not expect a parameter, but is used with parameter "deriving,deriving.syntax"
Warning: tag "syntax" does not expect a parameter, but is used with parameter "camlp4o"
+ /usr/local/bin/ocamldep.opt -modules test.ml > test.ml.depends
File "test.ml", line 8, characters 0-3:
Error: Syntax error
Command exited with code 2.
Compilation unsuccessful after building 1 target (0 cached) in 00:00:00.
However, when I use this:
ocamlbuild test.native
the code compiles successfully...
This is because ocamlbuild -use-ocamlfind test.native directs the compiler to use the camlp4 parser, which differs slightly from the standard OCaml parser. In particular, parser is a keyword in camlp4, so you can't use it as a class name. Just rename it.
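For example, under ocamlbuild -use-ocamlfind the same structure compiles once the class has a non-keyword name; my_parser below is an arbitrary replacement:
class my_parser =
  (* any identifier that is not a camlp4 keyword works here *)
  let test = function
    | 1 -> print_int 1
    | 2 -> print_int 2
    | _ -> print_int 3 in
  object (self)
  end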

Rocket UNIVERSE, ODBC, queries to datafile won't work without a call to common file handle

I am trying to query a Rocket UniVerse database. For the most part it works, until I hit certain fields of type I (not all, but some). The vendor documentation (EPICOR ECLIPSE) includes the following note: "Any dictionary that contains a reference to a common file handle will not work without a call to 'OPEN.STANDARD.FILES' so you may need to wrap standard dictionaries."
So my question is how to do that?
When I query the database directly from TCL (cd c:/u2/eclipse and type "uv" to get to the TCL environment), I get the following:
LIST PSUB TSN.COMMENT 07:37:39am 22 Mar 2014 PAGE 1
#ID..................................... TSN..........
Program "DICT.GET.LEDGER.DET.VALUE": Line 9, Improper data type.
When I run the same query within the vendor's application environment, it works. Their environment is a DOS-like menu system that also allows one to drop down to a TCL environment. But obviously something in their managed environment satisfies the dependencies needed for a successful query.
The first few lines of the subroutine are as follows:
>ED OC OPEN.STANDARD.FILES
429 lines long.
----: P
0001: SUBROUTINE
0002: $INCLUDE AD.DIR CC~COMMON
0003: *
0004: *
0005: *
0006: *
0007: *
0008: *
0009: *
0010: *
0011: *
0012: *
0013: *
0014: *
0015: *
0016: *
0017: *
0018: *
0019: *
0020: IF FILES.ARE.OPEN$ THEN RETURN
0021: *
0022: OPEN 'ABC.CODES' TO ABCCFILE ELSE
----:
0023: FLNM = 'ABC.CODES'
----:
0024: GOSUB EXIT.OPN
.
.
.
To create a wrapper for this routine, do the following:
>ED BP OPEN.STANDARD.FILES.TCL
001 * OPEN.STANDARD.FILES.TCL
002 CALL OPEN.STANDARD.FILES
003 STOP
004 END
>BASIC BP OPEN.STANDARD.FILES.TCL
>CATALOG BP OPEN.STANDARD.FILES.TCL
Then you can execute OPEN.STANDARD.FILES.TCL before your LIST statement, as in the example below. I just noticed that you have this tagged "u2netsdk". Are you accessing Epicor through the .NET API, or via "cd c:/u2/eclipse" and "uv"?
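For illustration, from TCL, using the query from the question:
>OPEN.STANDARD.FILES.TCL
>LIST PSUB TSN.COMMENT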
If you are using the .NET API, then you can call OPEN.STANDARD.FILES directly from the .NET API before you EXECUTE the LIST statement, as sketched below.
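A rough sketch with UniObjects for .NET; the namespace, class names, and connection arguments here are assumptions that may differ with your U2 client version, so treat this as the shape of the call sequence rather than drop-in code:
// Hedged sketch: IBMU2.UODOTNET and the member names below are assumptions
// based on the UniObjects for .NET client; verify against your version.
using IBMU2.UODOTNET;

class OpenStdFilesDemo {
    static void Main() {
        // Placeholder connection details for your Eclipse account.
        UniSession sess = UniObjects.OpenSession(
            "host", "user", "password", "ECLIPSE.ACCT", "uvcs");
        try {
            // Open the vendor's common file handles first...
            UniSubroutine sub = sess.CreateUniSubroutine("OPEN.STANDARD.FILES", 0);
            sub.Call();
            // ...so the I-type dictionaries can resolve their handles.
            UniCommand cmd = sess.CreateUniCommand();
            cmd.Command = "LIST PSUB TSN.COMMENT";
            cmd.Execute();
            System.Console.WriteLine(cmd.Response);
        } finally {
            UniObjects.CloseSession(sess);
        }
    }
}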
-Nathan

Fortran95 -- Reading from a formatted text file

I need to read some values from a table. These are the first five rows, to give you some idea of what it should look like:
1 + 3 98 96 1
2 + 337 2799 2463 1
3 + 2801 3733 933 1
4 + 3734 5020 1287 1
5 + 5234 5530 297 1
My interest is in the first four columns of each row. I need to read these into arrays. I used the following code:
program ----
  implicit none
  integer, parameter :: totbases = 4639675, totgenes = 4395
  integer :: codtot, ks
  integer, dimension(totgenes) :: ngene, lend, rend
  character :: genome*4639675, sign*4
  open(1, file='e_coli_g_info')
  open(2, file='e_coli_g_str')
  do ks = 1, totgenes
     read(1,100) ngene(ks), sign(ks:ks), lend(ks), rend(ks)
  end do
100 format(1x,i4,8x,a1, 2(5x,i7), 22x)
  do ks = 1, 100
     write(*,*) ngene(ks), sign(ks:ks), lend(ks), rend(ks)
  end do
end program
The loop at the end of the program prints the first hundred entries to test that they are being read correctly. The problem is that I am getting this garbage (the fourth column is the problem):
1 + 3 757934891
2 + 337 724249387
3 + 2801 757803819
4 + 3734 757803819
5 + 5234 757935405
Clearly, the fourth column is way off. In fact, I cannot find these values anywhere in the file I am reading from. I am using the gfortran compiler on Ubuntu 12.04. I would greatly appreciate it if somebody would point me in the right direction. It's likely I'm missing something very obvious, because I'm new to Fortran.
Fortran formats are (traditionally; there is some newer stuff that I won't go into here) fixed format, that is, they are best suited to files with fixed columns: column N always starts at character position M, no ifs or buts. If your file format is more free-form, with columns separated by whitespace, it's often easier and more robust to read the data using list-directed formatting. That is, try writing your read loop as
do ks = 1, totgenes
   read(1, *) ngene(ks), sign(ks:ks), lend(ks), rend(ks)
end do
Also, as a general advice, when opening your own files, start from unit 10 and go upwards from there. Fortran implementations typically use some of the low-numbered units for standard input, output, and error (a common choice is units 1, 5, and 6). You probably don't want to redirect those.
P.S. I haven't tried your code, but it seems you also have a bounds overflow in the sign variable: it's declared with length 4, but you assign to index ks, which goes all the way up to totgenes. Since you're using gfortran on Ubuntu 12.04 (that is, gfortran 4.6), compile with the options "-O1 -Wall -g -fcheck=all" while developing.
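Putting the two fixes together (list-directed reads plus a sign long enough for one character per gene), here is a minimal sketch; the program name and the five-row check are arbitrary:
program read_genes
  implicit none
  integer, parameter :: totgenes = 4395
  integer :: ks
  integer, dimension(totgenes) :: ngene, lend, rend
  character(len=totgenes) :: sign
  ! Unit 10 avoids the low-numbered units reserved by some implementations.
  open(10, file='e_coli_g_info')
  do ks = 1, totgenes
     ! List-directed input: columns only need whitespace between them,
     ! and the trailing two columns of each row are simply not read.
     read(10, *) ngene(ks), sign(ks:ks), lend(ks), rend(ks)
  end do
  close(10)
  ! Print the first five rows as a sanity check.
  do ks = 1, 5
     write(*,*) ngene(ks), ' ', sign(ks:ks), lend(ks), rend(ks)
  end do
end program read_genes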

formatting the binary representation of data in gdb

These days I use the examine-memory feature of gdb a lot. However, I find the binary representation of data not very readable, as all the bits are crammed together. I'd like to add some spacing to make it more readable; for example, instead of 01101100011011000110010101001000 I'd have 0110-1100-0110-1100-0110-0101-0100-1000 or something similar.
Is this possible? The closest I got was x/4bt s, which is close, but there are still two problems: the data is grouped in bytes (8 bits, not 4), and the bytes are laid out in reverse (so it's 01001000 01100101 01101100 01101100).
Thanks
What you need is something similar to pretty-printers for GDB; they are most often used to print STL containers, linked lists, etc. There might be an existing solution for your need, so google a bit for pretty-printers.
The script below is just an example of how you can write a custom command with the Python extensions; this works with GDB version 7.3 and greater, though I tested it with version 7.5.
import gdb

class ppbin(gdb.Command):
    """Dump memory as binary, regrouped into nibbles: ppbin <address> <count>"""
    def __init__(self):
        super(ppbin, self).__init__("ppbin", gdb.COMMAND_USER)

    def invoke(self, arg, tty):
        print arg
        arg_list = gdb.string_to_argv(arg)
        if len(arg_list) < 2:
            print "usage: <address>, <byte-count>"
            return
        # Let the x command produce the raw binary dump, then regroup it.
        res = gdb.execute("x/%sxt %s" % (arg_list[1], arg_list[0]), False, True)
        res = res.split("\t")
        ii = 0
        for bv in res:
            # Every fourth token holds the line's address (with the previous
            # value fused to it by a newline), so those tokens are skipped.
            if ii % 4:
                print "%s-%s-%s-%s-%s-%s-%s-%s" % (bv[0:4], bv[4:8],
                                                   bv[8:12], bv[12:16],
                                                   bv[16:20], bv[20:24],
                                                   bv[24:28], bv[28:32])
            ii += 1

# Register the command with GDB.
ppbin()
Invoking the new ppbin command
(gdb) source pp-bin.py
(gdb) ppbin 0x601040 10
0x601040 10
0000-0000-1010-1010-0011-0011-0101-0101
0000-0000-0000-0000-0000-0000-0000-0000
0000-0000-0000-0000-0000-0000-0000-0000
0000-0000-0000-0000-0000-0000-0000-0000
0000-0000-0000-0000-0000-0000-0000-0000
0000-0000-0000-0000-0000-0000-0000-0000
0000-0000-0000-0000-0000-0000-0000-0000
0000-0000-0000-0000-0000-0000-0000-0000
(gdb)
The above code is shared at https://skamath@bitbucket.org/skamath/ppbin.git
P.S. - I usually find debugging memory in hex (the x command) easier than binary, so I wouldn't use my own solution much.
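As a lighter-weight alternative for a single 32-bit value, GDB's inline python command can do the grouping without defining a custom command. A sketch, assuming s is the variable from the question:
(gdb) python v = int(gdb.parse_and_eval("s"))
(gdb) python print("-".join(format(v, "032b")[i:i+4] for i in range(0, 32, 4)))
0110-1100-0110-1100-0110-0101-0100-1000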