marshalling a hash table in ocaml - ocaml

I have a hash table in ocaml and I want to store this entire hash table as value field in a Berkeley DB. So I am trying to Marshal the hash table using Marshal.to_string. This returns a string but when I try to unmarshal the same string using Marshal.from_string, an Exception is thrown.
Any ideas on what the issue here is?

You have to annotate the type of the value you're unmarshaling. Like so (in top-level):
type t = (string, string) Hashtbl.t;;
let key = "key" in
let t_original : t = Hashtbl.create 1 in
Hashtbl.add t_original key "value";
let t_marshalled = Marshal.to_string t_original [] in
let t_unmarshalled : t = Marshal.from_string t_marshalled 0 in
assert ((Hashtbl.find t_original key) = (Hashtbl.find t_unmarshalled key));;

Related

Dynamically referring to JSON node in Power Query M

I have a function that extracts a node from JSON document as follows:
...
Json = GetJson(Url),
Value = Json[#"values"]
values correspond to the actual node within the JSON document.
I would like to generalize this piece of code and provide the name of the node as a variable like:
let myFunc = (parentNodeName as text) =>
...
Json = GetJson(Url),
Value = Json[parentNodeName]
However getting this error:
An error occurred in the ‘myFunc’ query. Expression.Error: The field 'parentNodeName' of the record wasn't found.
How can I refer to the JSON node dynamically?
Try
(Json, parentNodeName ) =>
let
...
Value = Record.Field(Json,parentNodeName)
in Value
sample code:
let Json = Json.Document(Web.Contents("http://soundcloud.com/oembed?url=http%3A//soundcloud.com/forss/flickermood&format=json")),
Value=myFunc(Json,"title")
in Value
and myFunc:
(Json, parentNodeName ) =>
let
Value = Record.Field(Json,parentNodeName)
in Value

Query condition missed key schema element: source_transaction_id in dynamoDB Scala

I am trying to execute a query with secondary Index as following
val valMap = new ValueMap()
.withString(":v_source_transaction_id", "843f45ad-cb1d-4f41-9ede-366c9304e447")
//.withString(":source_transaction_trace_id","843f45ad-cb1d-4f41-9ede-366c9304e443")
println(valMap)
//search is defined then extract dates from search.. else continue with simple logic.
val keyConditionExpression = """source_transaction_id =
| :v_source_transaction_id""".stripMargin
val spec = new QuerySpec()
.withProjectionExpression("source_transaction_id, transaction_date")
.withKeyConditionExpression(keyConditionExpression)
.withValueMap(valMap)
.withMaxResultSize(2)
case class DataItems(transaction_date: String)
val itemList = new ListBuffer[DataItems]
val items = table.getIndex("gsi-settlement").query(spec)
println(table.getIndex("gsi-settlement"))
val iterator = items.iterator()
while (iterator.hasNext) {
val next = iterator.next()
itemList += DataItems(next.getString("transaction_date"))
}
itemList.foreach(println)
here gsi settlement is the secondary index and source transaction id is primary key and I am getting the following error:
[AmazonDynamoDBException: Query condition missed key schema element: source_transaction_id
Try this :
val keyConditionExpression = """source_transaction_id = :v_source_transaction_id""".stripMargin
Currently your keyConditionExpression coming up as
source_transaction_id =
:v_source_transaction_id
Error : AmazonDynamoDBException: Query condition missed key schema element: source_transaction_id is suggesting that you are missing the GSI key in your query

Handling null value in a dictionary object

I am using a web service to get information. When said info is returned, I convert the data received from jason to a dict.
When Dumping dict object, some of the items arrive like this:
▿ (2 elements)
- key: "street1"
- value: <null> #4
How would i go about reading this data and knowing that the value is NULL
I have tried the following:
let street1:String = dict?["street1"] as! String
This fails with: Could not cast value of type 'NSNull' (0x10fbf7918) to 'NSString' (0x10f202c60).
The data could have a String value. So I tried:
let street1:Any = dict?["street1"] as Any
When I print street1 thus
street1: Optional()
I get the following
street1: Optional()
So my question is:
How would i go about reading this data and knowing that the value is null.
You can use if let for this type of nil check.
Try this instead:
if let street1 = dict?["street1"] as? String {
// If this succeeds then you can use street1 in here
print(street1)
}
Update:
var t_street1 = ""
if let street1 = dict?["street1"] as? String {
t_street1 = street1
}
No need for the else t_street1 is automatically empty since you assign it empty.

Spark - remove special characters from rows Dataframe with different column types

Assuming I've a Dataframe with many columns, some are type string others type int and others type map.
e.g.
field/columns types: stringType|intType|mapType<string,int>|...
|--------------------------------------------------------------------------
| myString1 |myInt1| myMap1 |...
|--------------------------------------------------------------------------
|"this_is_#string"| 123 |{"str11_in#map":1,"str21_in#map":2, "str31_in#map": 31}|...
|"this_is_#string"| 456 |{"str12_in#map":1,"str22_in#map":2, "str32_in#map": 32}|...
|"this_is_#string"| 789 |{"str13_in#map":1,"str23_in#map":2, "str33_in#map": 33}|...
|--------------------------------------------------------------------------
I want to remove some characters like '_' and '#' from all columns of String and Map type
so the result Dataframe/RDD will be:
|------------------------------------------------------------------------
|myString1 |myInt1| myMap1|... |
|------------------------------------------------------------------------
|"thisisstring"| 123 |{"str11inmap":1,"str21inmap":2, "str31inmap": 31}|...
|"thisisstring"| 456 |{"str12inmap":1,"str22inmap":2, "str32inmap": 32}|...
|"thisisstring"| 789 |{"str13inmap":1,"str23inmap":2, "str33inmap": 33}|...
|-------------------------------------------------------------------------
I am not sure if it's better to convert the Dataframe into an RDD and work with it or perform the work in the Dataframe.
Also, not sure how to handle the regexp with different column types in the best way (I am sing scala).
And I would like to perform this action for all column of these two types (string and map), trying to avoid using the column names like:
def cleanRows(mytabledata: DataFrame): RDD[String] = {
//this will do the work for a specific column (myString1) of type string
val oneColumn_clean = mytabledata.withColumn("myString1", regexp_replace(col("myString1"),"[_#]",""))
...
//return type can be RDD or Dataframe...
}
Is there any simple solution to perform this?
Thanks
One option is to define two udfs to handle string type column and Map type column separately:
import org.apache.spark.sql.functions.udf
val df = Seq(("this_is#string", 3, Map("str1_in#map" -> 3))).toDF("myString", "myInt", "myMap")
df.show
+--------------+-----+--------------------+
| myString|myInt| myMap|
+--------------+-----+--------------------+
|this_is#string| 3|Map(str1_in#map -...|
+--------------+-----+--------------------+
1) Udf to handle string type columns:
def remove_string: String => String = _.replaceAll("[_#]", "")
def remove_string_udf = udf(remove_string)
2) Udf to handle Map type columns:
def remove_map: Map[String, Int] => Map[String, Int] = _.map{ case (k, v) => k.replaceAll("[_#]", "") -> v }
def remove_map_udf = udf(remove_map)
3) Apply udfs to corresponding columns to clean it up:
df.withColumn("myString", remove_string_udf($"myString")).
withColumn("myMap", remove_map_udf($"myMap")).show
+------------+-----+-------------------+
| myString|myInt| myMap|
+------------+-----+-------------------+
|thisisstring| 3|Map(str1inmap -> 3)|
+------------+-----+-------------------+

Extract String value from Node

I'm trying to query a database using something like this:
let db = drop.database?.driver as? MySQLDriver
let query = "select \(fields) from \(table) where \(condition)"
let result = try db.raw(query)
I get the following Node object as result:
array([Node.Node.object(["field_name": Node.Node.string("value_info")])])
How can I get the value_info into a String variable?
You can use PathIndexable to step into the result object, then Polymorphic to cast it as a string.
Should look something like this:
let valueInfo = result[0, "field_name"]?.string