Lookup tables in OCaml - ocaml

I would like to create a lookup table in OCaml. The table will have 7000+ entries that, upon lookup (by int), return a string. What is an appropriate data structure to use for this task? Should the table be externalized from the base code and if so, how does one go about "including" the lookup table to be accessible from his/her program?
Thanks.

If the strings are addressed using consecutive integers you could use an array.
Otherwise you can use a hash table (non-functional) or a Map (functional). To get started with the Map try:
module Int =
struct
  type t = int
  let compare = compare
end ;;
module IntMap = Map.Make(Int) ;;
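As a rough illustration of how the Map approach could be used for lookups, here is a minimal sketch. The sample entries and the names `entries`, `table`, and `lookup` are made up for the example; only `Map.Make`, `IntMap.add`, and `IntMap.find_opt` (OCaml 4.05+) come from the standard library.

```ocaml
(* Key module: Map.Make needs a type t and a total ordering on it. *)
module Int = struct
  type t = int
  let compare = compare
end

module IntMap = Map.Make (Int)

(* Hypothetical sample entries; the real table would have 7000+ of them. *)
let entries = [ (10, "ten"); (42, "forty-two"); (7000, "seven thousand") ]

(* Build the map by folding the entries into an empty map. *)
let table =
  List.fold_left (fun m (k, v) -> IntMap.add k v m) IntMap.empty entries

(* find_opt returns None instead of raising Not_found for a missing key. *)
let lookup key = IntMap.find_opt key table

let () =
  match lookup 42 with
  | Some s -> print_endline s
  | None -> print_endline "not found"
```

Lookups in a Map take logarithmic time in the number of entries, which is negligible at this table size.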
If the table is too large to store in memory, you could store it in an external database and use bindings to dbm, bdb, sqlite,...

let table : (int,string) Hashtbl.t = Hashtbl.create 8192
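A possible way to fill and query such a hash table is sketched below. The sample entries and the `lookup` helper are assumptions for illustration; `Hashtbl.replace` and `Hashtbl.find_opt` (OCaml 4.05+) are standard library functions.

```ocaml
(* Same declaration as above: an imperative table from int keys to strings. *)
let table : (int, string) Hashtbl.t = Hashtbl.create 8192

(* Hypothetical sample entries standing in for the real data. *)
let () =
  List.iter
    (fun (k, v) -> Hashtbl.replace table k v)
    [ (3, "three"); (17, "seventeen"); (9001, "over nine thousand") ]

(* find_opt avoids the Not_found exception raised by Hashtbl.find. *)
let lookup key = Hashtbl.find_opt table key

let () =
  match lookup 17 with
  | Some s -> print_endline s
  | None -> print_endline "not found"
```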

To store the table in a separate file (e.g. as an array), simply create a file strings.ml with the content:
let tbl = [|
"String 0";
"String 1";
"String 2";
...7000 more...
|]
Compile this with:
ocamlc -c strings.ml
As explained in the manual, this defines a module Strings that other OCaml modules can reference. For example, you can start a toplevel:
ocaml strings.cmo
And look up a string by accessing a particular position in the array:
Strings.tbl.(1234) ;;
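Direct array indexing raises Invalid_argument when the index is out of range. If that matters, a small wrapper along these lines might help. This is only a sketch: `tbl` here is a three-element stand-in for `Strings.tbl`, and `lookup` is a hypothetical helper name.

```ocaml
(* Stand-in for Strings.tbl so the sketch runs on its own. *)
let tbl = [| "String 0"; "String 1"; "String 2" |]

(* Return None for an out-of-range index instead of raising Invalid_argument. *)
let lookup i =
  if i >= 0 && i < Array.length tbl then Some tbl.(i) else None

let () =
  match lookup 1 with
  | Some s -> print_endline s
  | None -> print_endline "index out of range"
```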

Related

Merge 2 object lists in java

I have two lists, listA and listB, of type Object.
ListA[name=abc, age=34, weight=0, height=0] data collected from excel sheet
ListB[name=null, age=0, weight=70, height=6] data collected from database
Now I want to combine both lists into a single list:
MergedList[name=abc, age=34, weight=70, height=6]
Note: my object class has more than 15 properties, so adding each property one by one using getProperty() will be time-consuming. Is there a better way?
Convert them to a Map where the key is the name of the object (your denoting the elements as name=abc suggests they are name/value pairs).
Map<String,MyMysteriousObject> converted = list.stream().collect( Collectors.toMap(MyMysteriousObject::getName, Function.identity() ) );
(Replace getName with whatever function you use to get the name of your object.)
And then just merge the maps. How to merge maps is described here for example.
While at it, consider replacing the List with a Map in your entire code. It will surely save a lot of work elsewhere too.
But if you need a list again, just use List<MyMysteriousObject> resultList = new ArrayList<>(resultMap.values());

Lua functions use "self" in source but no metamethod allows to use them

I've been digging into Lua's source code, both the C source from their website and the lua files from Lua on Windows. I found something odd that I can't find any information about, as to why they chose to do this.
There are some methods in the string library that allow OOP-style calling, by attaching the method to the string like this:
string.format(s, e1, e2, ...)
s:format(e1, e2, ...)
So I dug into the source code for the table module, and found that functions like table.remove() also allow the same thing.
Here's the source code from UnorderedArray.lua:
function add(self, value)
  self[#self + 1] = value
end
function remove(self, index)
  local size = #self
  if index == size then
    self[size] = nil
  elseif (index > 0) and (index < size) then
    self[index], self[size] = self[size], nil
  end
end
This indicates that the functions should support the colon method syntax. Lo and behold, when I copy table into my new list, the methods carry over. Here's an example using table.insert as a method:
function copy(obj, seen) -- Recursive function to copy a table with tables
  if type(obj) ~= 'table' then return obj end
  if seen and seen[obj] then return seen[obj] end
  local s = seen or {}
  local res = setmetatable({}, getmetatable(obj))
  s[obj] = res
  for k, v in pairs(obj) do res[copy(k, s)] = copy(v, s) end
  return res
end
function count(list) -- Count a list because #table doesn't work on keyindexed tables
  local sum = 0; for i,v in pairs(list) do sum = sum + 1 end; print("Length: " .. sum)
end
function pts(s) print(tostring(s)) end -- Macro function
local list = {1, 2, 3}
pts(list.insert) --> nil
pts(table["insert"]) --> function: 0xA682A8
pts(list["insert"]) --> nil
list = copy(_G.table)
pts(table["insert"]) --> function: 0xA682A8
pts(list["insert"]) --> function: 0xA682A8
count(list) --> Length: 9
list:insert(-1, "test")
count(list) --> Length: 10
Was Lua 5.1 and newer supposed to support table methods like the string library but they decided to not implement the meta method?
EDIT:
I'll explain it a little further so people understand.
Strings have metamethods attached that you can use on the strings OOP style.
s = "test"
s:sub(1,1)
But tables don't, even though the functions in the table library's source code allow for it by taking a "self" argument. So the following code doesn't work:
t = {1,2,3}
t:remove(#t)
The function has a self member defined in the argument (UnorderedArray.lua:25: function remove(self,index)).
You can find the metamethods of strings by using:
for i,v in pairs(getmetatable('').__index) do
print(i, tostring(v))
end
which prints the list of all methods available for strings:
sub function: 0xB4ABC8
upper function: 0xB4AB08
len function: 0xB4A110
gfind function: 0xB4A410
rep function: 0xB4AD88
find function: 0xB4A370
match function: 0xB4AE08
char function: 0xB4A430
dump function: 0xB4A310
gmatch function: 0xB4A410
reverse function: 0xB4AE48
byte function: 0xB4A170
format function: 0xB4A0F0
gsub function: 0xB4A130
lower function: 0xB4AC28
If you attach the module/library table to a table like Oka showed in the example, you can use the methods that table has just the same way the string metamethods work.
The question is: Why would Lua developers allow metamethods of strings by default but not of tables, even though the table library and its methods allow it in the source code?
The question was answered: It would allow a developer of a module or program to alter the metatables of all tables in the program, leading to the result where a table would behave differently from vanilla Lua when used in a program. It's different if you implement a class of a data type (say: vectors) and change the metamethods of that specific class and table, instead of changing all of Lua's standard table metamethods. This also slightly overlaps with operator overloading.
If I'm understanding your question correctly, you're asking why it is not possible to do the following:
local tab = {}
tab:insert('value')
Having tables spawn with a default metatable and __index breaks some assumptions that one would have about tables.
Mainly, empty tables should be empty. If tables were to spawn with an __index metamethod lookup for the insert, sort, etc., methods, it would break the assumption that an empty table should not respond to any members.
This becomes an issue if you're using a table as a cache or memo, and you need to check if the 'insert', or 'sort' strings exist or not (think arbitrary user input). You'd need to use rawget to solve a problem that didn't need to be there in the first place.
Empty tables should also be orphans, meaning that they should have no relations unless the programmer explicitly gives them some. Tables are the only complex data structure available in Lua, and are the foundation for a lot of programs. They need to be free and flexible. Pairing them with the table table as a default metatable creates some inconsistencies. For example, not all tables can make use of the generic sort function, which would be weird cruft for dictionary-like tables.
Additionally, consider that you're utilizing a library, and that library's author has told you that a certain function returns a densely packed table (i.e., an array), so you figure that you can call :sort(...) on the returned table. What if the library author has changed the metatable of that return table? Now your code no longer works, and any generic functions built on top of a _:sort(...) paradigm can't accept these tables.
Basically put, strings and tables are two very different beasts. Strings are immutable, static, and their contents are predictable. Tables are mutable, transient, and very unpredictable.
It's much, much easier to add this in when you need it, instead of baking it into the language. A very simple function:
local meta = { __index = table }
_G.T = function (tab)
  if tab ~= nil then
    local tab_t = type(tab)
    if tab_t ~= 'table' then
      error(("`table' expected, got: `%s'"):format(tab_t), 0)
    end
  end
  return setmetatable(tab or {}, meta)
end
Now any time you want a table that responds to functions found in the table table, just prefix it with a T.
local foo = T {}
foo:insert('bar')
print(#foo) --> 1

casting dictionary in kdb

I want to cast dictionary and log it.
dict:(enlist`code)!(enlist`B10005)
when I do
type value dict / 11h
but the key looks like ,code
when I do
type string value dict / 0h
I am not sure why.
I want to concatenate with strings and log it. So it will be something like:
"The constraint is ",string key dict
But it did not work: the output shows each letter on a separate line. How can I cast the dictionary so I can concatenate and log it?
Have a look at http://code.kx.com/q/ref/dotq/#qs-plain-text for logging arbitrary kdb+ datatypes.
q)show dict:`key1`key2!`value1`value2
key1| value1
key2| value2
q).Q.s dict
"key1| value1\nkey2| value2\n"
There are several things going on here.
dict has one key/value pair only but this fact doesn't affect how key and value behave: they return all keys and values. This is why type value dict is 11h which is a list of symbols. For exact same reason key dict is ,`code where comma means enlist: key dict is a list of symbols which (in your particular example) happens to have just one symbol `code.
string applied to a list of symbols converts every element of that list to a string and returns a list of strings.
A string in q is a simple list of characters (see http://code.kx.com/wiki/Tutorials/Lists for more on simple and mixed lists).
When you join a simple list of characters "The constraint is " with a list of strings, i.e. a list of lists of characters, the result cannot be represented as a simple list anymore and becomes a generic list. This is why q converts "The constraint is " (simple list) to ("T";"h";"e",...) (generic list) before joining, and why q starts displaying each character on a separate line.
I hope you understand now what's happening. Depending on your needs you can fix your code like this:
"The constraint is ",first string key dict / displays the first key
"The constraint is ",", "sv string key dict / displays all keys separated by commas
Hope this helps.
If you are looking for something for nicer logging, something like this should help (and is generic):
iterate through values, and convert to strings
s:{$[all 10h=type each x;","sv x;0h~t:type x;.z.s each x;10h~t;x;"," sv $[t<0;enlist#;(::)]string x]}/
string manipulation
fs:{((,)string[count x]," keys were passed")," " sv/:("Key:";"and the values for it were:"),'/:flip (string key#;value)#\:s each x}
examples
d:((,)`a)!(,)`a
d2:`a`b!("he";"lo")
d3:`a`b!((10 20);("he";"sss";"ssss"))
results and execution
fs each (d;d2;d3)
You can obviously tailor this to your exact needs; it has not been tested for complex dict values.

ocaml hashtbl remove function

How come Hashtbl.remove restores the previous binding?
Hashtbl.add t key1 value
Hashtbl.remove t key1
Hashtbl.remove t key1 => This can do anything, but it should not restore key1!
Also, how can I remove something while making sure that, if it was already deleted, the proper flow is followed?
val remove : ('a, 'b) t -> 'a -> unit
Hashtbl.remove tbl x removes the current binding of x in tbl, restoring the previous binding if it exists. It does nothing if x is not bound in tbl.
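The documented behaviour can be observed with a short experiment. This is a sketch only: the table `t`, the key "k", and the values are arbitrary choices for the example.

```ocaml
let t : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.add t "k" 1;   (* first binding *)
  Hashtbl.add t "k" 2;   (* shadows the first binding; both are stored *)
  assert (Hashtbl.find t "k" = 2);
  Hashtbl.remove t "k";  (* removes 2, restoring the earlier binding *)
  assert (Hashtbl.find t "k" = 1);
  Hashtbl.remove t "k";  (* removes 1; the key is now unbound *)
  assert (not (Hashtbl.mem t "k"));
  Hashtbl.remove t "k"   (* no binding left: does nothing *)
```

A previous binding is only "restored" when the same key was added more than once with Hashtbl.add.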
There are two legitimate modes of use of Hashtbl: always using Hashtbl.replace, which ensures that each key only has one binding in the table, or using the table as a multi-mapping (each key pointing to a list of values) with Hashtbl.add, Hashtbl.find and Hashtbl.find_all.
Please make sure that you understand which mode of use you're interested in. There is no point in adding several bindings to the same key if you don't want to keep old bindings (this can result in performance issues, memory leaks and stack overflows); in that case you should use Hashtbl.replace instead of Hashtbl.add, and Hashtbl.remove will do exactly what you expect.
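To make the two modes concrete, here is a small sketch of each. The table names `single` and `multi`, the key, and the values are assumptions for the example.

```ocaml
(* Mode 1: one binding per key, maintained with Hashtbl.replace. *)
let single : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.replace single "k" 1;
  Hashtbl.replace single "k" 2;  (* overwrites 1; no hidden older binding *)
  Hashtbl.remove single "k"      (* the key is gone after a single remove *)

(* Mode 2: multi-mapping, built with Hashtbl.add and read with find_all. *)
let multi : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.add multi "k" 1;
  Hashtbl.add multi "k" 2

(* find_all lists the bindings of a key, most recently added first. *)
let all_k = Hashtbl.find_all multi "k"
```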
If you are using the hashtable as a multi-mapping, and want a function that removes all bindings for a key, you can implement it yourself (code untested):
let rec remove_all tbl key =
  if Hashtbl.mem tbl key then begin
    Hashtbl.remove tbl key;
    remove_all tbl key
  end
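Since the snippet above is marked as untested, a quick check along these lines may be useful. The function is repeated so the sketch is self-contained; the table contents are made up.

```ocaml
(* The function from above, repeated so this sketch is self-contained. *)
let rec remove_all tbl key =
  if Hashtbl.mem tbl key then begin
    Hashtbl.remove tbl key;
    remove_all tbl key
  end

let t : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.add t "k" 1;
  Hashtbl.add t "k" 2;
  Hashtbl.add t "other" 3;
  remove_all t "k"  (* drops both bindings of "k", leaves "other" alone *)
```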
Edit: I just understood that another way to read your (hard to understand) question is "how can I make sure that there is a key to remove in the table, instead of silently doing nothing when remove is called?". cago provides a code snippet for that, in essence you can use Hashtbl.mem to check that the binding exists when you assume it should exist.
If you use Hashtbl.replace instead of Hashtbl.add you'll replace the current binding of the key in t. So the function Hashtbl.remove will not restore anything.
You can also write your own remove function:
exception Nothing_to_remove_in_the_hashtbl
let remove tbl key =
  if Hashtbl.mem tbl key then Hashtbl.remove tbl key
  else raise Nothing_to_remove_in_the_hashtbl ;;
Hashtbl.replace t key1 value;;
remove t key1;;
remove t key1;; (* raises Nothing_to_remove_in_the_hashtbl *)
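If raising an exception is heavier than needed, one alternative is to report whether anything was removed. This is a sketch of that variation, not part of the answer above; `remove_checked` is a hypothetical name.

```ocaml
(* Hypothetical helper: true if a binding was removed, false otherwise. *)
let remove_checked tbl key =
  if Hashtbl.mem tbl key then begin
    Hashtbl.remove tbl key;
    true
  end else false

let t : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.replace t "key1" 1;
  let first = remove_checked t "key1" in   (* true: the key was present *)
  let second = remove_checked t "key1" in  (* false: nothing left to remove *)
  Printf.printf "%b %b\n" first second
```

The caller can then branch on the result to follow whichever flow applies when the key was already gone.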

Yesod: Is it possible to iterate a Haskell list in Julius?

I have a list of coordinates that I need to put on a map. Is it possible to iterate over the list in julius? Right now I am creating a hidden table in hamlet and accessing that table in julius, which does not seem to be an ideal solution.
Could someone point me to a better solution? Thanks.
edit: Passing a JSON string for the list (which can be read by julius) seems to solve my problem.
As far as I know, you can't directly iterate over a list in julius. However, you can use the Monoid instance for the Javascript type to accomplish a similar effect. For example:
import Text.Julius
import Data.Monoid
rows :: [Int] -> t -> Javascript
rows xs = mconcat $ map row xs
  where
    row x = [julius|v[#{show x}] = #{show x};
|]
Then you can use rows xs wherever you'd normally put a julius block. For example, in ghci:
> renderJavascript $ rows [1..5] ()
"v[1] = 1;\nv[2] = 2;\nv[3] = 3;\nv[4] = 4;\nv[5] = 5;\n"