Clojure casting string to double from user input - clojure

Over at clojuredocs.com they discussed why to not use read or read-string from untrusted sources. One way to use it might be as such:
=> (double (read-string "1.99"))
=> 1.99
=> (.Double "1.22")
=> IllegalArgumentException No matching field found: Double for class...
Which is useful if you need a double to store in the database. But what if the data is from user input? Suppose users want to enter their height in feet or something. How could we take the input string from a webpage and convert it to a double or other numeric value safely?

put the . on the other side of Double
user> (Double. "1.22")
1.22
This calls the Double class's constructor which takes a string and produces a new double.
It's syntactic sugar for (new Double "1.22")

Related

How to read and output numeric values properly in BigQuery?

I'm trying to read the following rows out of a CSV file stored in GCS
headers: "A","B","C","D"
row1:"4000,0000000000000","15400000,000","12311918,400000","3088081,600"
row2:"5000,0000000000000","19250000,000","15389898,000000","3860102,000"
The issue here is how BigQuery is actually interpreting and thus outputting these numbers:
Results query number 1
It's interpreting A as FLOAT64, and B, C and D as INT64, which is okay since I decided to use autodetect schema. But when I try to convert it to a different type it's still outputting the numbers unproperly.
This is the query:
SELECT
CAST(quantity AS INT64) AS A,
CAST(expenses_2 AS FLOAT64) AS B,
CAST(cexpenses_3AS FLOAT64) AS C,
CAST(expenses_4 AS FLOAT64) AS D
FROM
`wide-gecko-289100.bqtest.expenses`
These are the results of query above:
Result query number 2
Either way, it's misinterpreting how to read the numbers, it should be as follows:
row1: [4000] [15400000] [12311918,4] [3088081,6]
row2: [5000] [19250000] [15389898] [3860102]
Is there a way to solve this?
This is due to BigQuery not understanding the localized format you're using for the numeric values. It expects the period (.) character for the decimal separator.
If you can't deal with this early in the process that produces the CSV files in BigQuery, another strategy is to instead use a string type for the columns, and then do some manipulation.
Here's a simple conversion example that shows some string manipulation and casting to get to the desired type. If you're using both commas and periods as part of the localized format, you'll need a more complex string manipulation.
WITH
sample_row AS (
SELECT "4000,0000000000000" as A, "15400000,000" as B,"12311918,400000" as C,"3088081,600" as D
)
SELECT
A,
CAST(REPLACE(A,",",".") AS FLOAT64) as A_as_float64,
CAST(CAST(REPLACE(A,",",".") AS FLOAT64) AS INT64) as A_as_int64
FROM
sample_row
You could also generalize this as a user defined function (temporary or persisted) to make it easier to reuse:
CREATE TEMPORARY FUNCTION parseAsFloat(instr STRING) AS (CAST(REPLACE(instr,",",".") AS FLOAT64));
WITH
sample_row AS (
SELECT "4000,0000000000000" as A, "15400000,000" as B,"12311918,400000" as C,"3088081,600" as D
)
SELECT
CAST(parseAsFloat(A) AS INT64) as A,
parseAsFloat(B) as B,
parseAsFloat(C) as C,
parseAsFloat(D) as D,
FROM
sample_row
I think this is an issue with how BigQuery interprets a comma. It seems to detect it as a thousands separator rather than a decimal.
https://issuetracker.google.com/issues/129992574
Is it possible to replace with a "." instead?

Meaning of # in clojure

In clojure you can create anonymous functions using #
eg
#(+ % 1)
is a function that takes in a parameter and adds 1 to it.
But we also have to use # for regex
eg
(clojure.string/split "hi, buddy" #",")
Are these two # related?
There are also sets #{}, fully qualified class name constructors #my.klass_or_type_or_record[:a :b :c], instants #inst "yyyy-mm-ddThh:mm:ss.fff+hh:mm" and some others.
They are related in a sence that in these cases # starts a sequence recognisible by clojure reader, which dispatches every such instance to an appropriate reader.There's a guide that expands on this.
I think this convention exists to reduce the number of different syntaxes to just one and thus simplify the reader.
The two uses have no (direct) relationship.
In Clojure, when you see the # symbol, it is a giant clue that you are "talking" to the Clojure Reader, not to the Clojure Compiler. See the full docs on the Reader here: https://clojure.org/reference/reader.
The Reader is responsible for converting plain text from a source file into a collection of data structures. For example, comparing Clojure to Java we have
; Clojure ; Java
"Hello" => new String( "Hello" )
and
[ "Goodbye" "cruel" "world!" ] ; Clojure vector of 3 strings
; Java ArrayList of 3 strings
var msg = new ArrayList<String>();
msg.add( "Goodbye" );
msg.add( "cruel" );
msg.add( "world!" );
Similarly, there are shortcuts that the Reader recognizes even within Clojure source code (before the compiler converts it to Java bytecode), just to save you some typing. These "Reader Macros" get converted from your "short form" source code into "standard Clojure" even before the Clojure compiler gets started. For example:
#my-atom => (deref my-atom) ; not using `#`
#'map => (var map)
#{ 1 2 3 } => (hash-set 1 2 3)
#_(launch-missiles 12.3 45.6) => `` ; i.e. "nothing"
#(+ 1 %) => (fn [x] (+ 1 x))
and so on. As the # or deref operator shows, not all Reader Macros use the # (hash/pound/octothorpe) symbol. Note that, even in the case of a vector literal:
[ "Goodbye" "cruel" "world!" ]
the Reader creates a result as if you had typed:
(vector "Goodbye" "cruel" "world!" )
Are these two # related?
No, they aren't. The # literal is used in different ways. Some of them you've already mentioned: these are an anonymous function and a regex pattern. Here are some more cases:
Prepending an expression with #_ just wipes it from the compiler as it has never been written. For example: #_(/ 0 0) will be ignored on reader level so none of the exception will appear.
Tagging primitives to coerce them to complex types, for example #inst "2019-03-09" will produce an instance of java.util.Date class. There are also #uuid and other built-in tags. You may register your own ones.
Tagging ordinary maps to coerce them to types maps, e.g. #project.models/User {:name "John" :age 42} will produce a map declared as (defrecord User ...).
Other Lisps have proper programmable readers, and consequently read macros. Clojure doesn't really have a programmable reader - users cannot easily add new read macros - but the Clojure system does internally use read macros. The # read macro is the dispatch macro, the character following the # being a key into a further read macro table.
So yes, the # does mean something; but it's so deep and geeky that you do not really need to know this.

How this code translates a number to English?

I'm new to Clojure and found there's a piece of code like following
user=> (def to-english (partial clojure.pprint/cl-format nil
"~#(~#[~R~]~^ ~A.~)"))
#'user/to-english
user=> (to-english 1234567890)
"One billion, two hundred thirty-four million, five hundred sixty-seven
thousand, eight hundred ninety"
at https://clojuredocs.org/clojure.core/partial#example-542692cdc026201cdc326ceb. I know what partial does and I checked clojure.pprint/cl-format doc but still don't understand how it translates an integer to English words. Guess secret is hidden behind "~#(~#[~R~]~^ ~A.~)" but I didn't find a clue to read it.
Any help will be appreciated!
The doc mentions it, but one good resource is A Few FORMAT Recipes from Seibel's Practical Common Lisp.
Also, check §22.3 Formatted Output from the HyperSpec.
In Common Lisp:
CL-USER> (format t "~R" 10)
ten
~#(...~^...) is case conversion, where the # prefix means to capitalize (upcase only the first word). It contains an escape upward operation ~^, which in this context marks the end of what is case-converted. It also exits the current context when there are no more argument available.
~#[...] is conditional format: the inner format is applied on a value only if it is non nil.
The final ~A means that the function should be able to accept one more argument and print it.
In fact, your example looks like the one in §22.3.9.2:
If ~^ appears within a ~[ or ~( construct, then all the commands up to
the ~^ are properly selected or case-converted, the ~[ or ~(
processing is terminated, and the outward search continues for a ~{ or
~< construct to be terminated. For example:
(setq tellstr "~#(~#[~R~]~^ ~A!~)")
=> "~#(~#[~R~]~^ ~A!~)"
(format nil tellstr 23) => "Twenty-three!"
(format nil tellstr nil "losers") => " Losers!"
(format nil tellstr 23 "losers") => "Twenty-three losers!"

Using Clojure do-template to set prepared-statement columns

I have defined the following code to allow me to set column values in a java.sql.PreparedStatement. Is this code reasonable/idiomatic? How could it be improved?
(use '(clojure.template :only [do-template]))
; (import all java types not in java.lang)
(defprotocol SetPreparedStatement
(set-prepared-statement [this prepared-statement index]))
(do-template [type-name set-name]
(extend-type type-name
SetPreparedStatement
(set-prepared-statement [this prepared-statement index]
(set-name prepared-statement index this)))
BigDecimal .setBigDecimal
Boolean .setBoolean
Byte .setByte
Date .setDate
Double .setDouble
Float .setFloat
Integer .setInt
Long .setLong
Object .setObject
Short .setShort
Time .setTime
Timestamp .setTimestamp)
; Sample use
(set-prepared-statement 42 some-prepared-statement 1)
Your example looks as close to idiomatic Clojure as I can tell :)
It could perhaps benefit from abstracting the type mapping out if you have situations where you will be creating more than one template though if your creating just this one then this looks like excellent clojure to me.

How to write list to file?

I am trying:
import System.IO
saveArr = do
outh <- openFile "test.txt" WriteMode
hPutStrLn outh [1,2,3]
hClose outh
but it doesn't works... output:
No instance for (Num Char) arising from the literal `1'
EDIT
OK hPrint works with ints but what about float number in array? [1.0, 2.0, 3.0]?
hPutStrLn can only print strings. Perhaps you want hPrint?
hPrint outh [1,2,3]
Arrays, Lists and Strings exists only in imagination of programmer and as a term in some languages.
File is a sequence of bytes, so when you want to write something to it you should encode that imaginary String/List/Array into sequence of bytes (by show or by something from Storable etc).
As well terminal is a sequence of bytes which is encoded representation of actions needed to show something to user.
You have many ways to encode. You can make CSV representation of array by foldr (\a b -> a (',' : b)) "\n" (map shows [1,2,3]) or you may want to print it show [1,2,3]
derive Binary for your type, then write the data in binnary form using 'encodeFile' from the Data.Binary package. This is similar to writing the data out as a bytestring.