Google App Engine Query matches case [duplicate] - django

Using the google appengine datastore, is there a way to perform a gql query that specifies a WHERE clause on a StringProperty datatype that is case insensitive? I am not always sure what case the value will be in. The docs specify that the where is case sensitive for my values, is there a way to make this insensitive?
for instance the db Model would be this:
from google.appengine.ext import db
class Product(db.Model):
id = db.IntegerProperty()
category = db.StringProperty()
and the data looks like this:
id category
===================
1 cat1
2 cat2
3 Cat1
4 CAT1
5 CAT3
6 Cat4
7 CaT1
8 CAT5
i would like to say
gqlstring = "WHERE category = '{0}'".format('cat1')
returnvalue = Product.gql(gqlstring)
and have returnvalue contain
id category
===================
1 cat1
3 Cat1
4 CAT1
7 CaT1

I don't think there is an operator like that in the datastore.
Do you control the input of the category data? If so, you should choose a canonical form to store it in (all lowercase or all uppercase). If you need to store the original case for some reason, then you could just store two columns - one with the original, one with the standardized one. That way you can do a normal WHERE clause.

The datastore doesn't support case insensitive comparisons, because you can't index queries that use them (barring an index that transforms values). The solution is to store a normalized version of your string in addition to the standard one, as Peter suggests. The property classes in the AETycoon library may prove helpful, in particular, DerivedProperty.

This thread was helpful and makes me want to contribute with similar approach to make partial search match possible. I add one more field on datastore kind and save each word on normalized phrase as a set and then use IN filter to collide. This is an example with a Clojure. Normalize part should easy translate to java at least (thanks to #raek on #clojure), while database interaction should be convertable to any language:
(use '[clojure.contrib.string :only [split lower-case]])
(use '[appengine-magic.services.datastore :as ds])
; initialize datastore kind entity
(ds/defentity AnswerTextfield [value, nvalue, avalue])
; normalize and lowercase a string
(defn normalize [string-to-normalize]
(lower-case
(apply str
(remove #(= (Character/getType %) Character/NON_SPACING_MARK)
(java.text.Normalizer/normalize string-to-normalize java.text.Normalizer$Form/NFKD)))))
; save original value, normalized value and splitted normalized value
(defn textfield-save! [value]
(ds/save!
(let [nvalue (normalize value)]
(ds/new* AnswerTextfield [value nvalue (split #" " nvalue)]))))
; normalized search
(defn search-normalized [value]
(ds/query :kind AnswerTextfield
:filter [(= :nvalue (normalize value))]))
; partial normalized word search
(defn search-partial [value]
(flatten
(let [coll []]
(for [splitted-value (split #" " (normalize value))]
(merge coll
(ds/query :kind AnswerTextfield
:filter [(in :avalue [splitted-value])]))))))

Related

Reasons not to use use a wildcard pull?

Are there any reasons not to use a wildcard pull?
(defn pull-wild
"Pulls all attributes of a single entity."
[db ent-id]
(d/pull db '[*] ent-id))
It's much more convenient than explicitly stating the attributes.
It depends on which attributes you need to have in your application and if it's data intensive or whether you want to pull lots of entities.
In case you use the client-library, you might want to minimize the data that needs to be send over the wire.
I guess there are lots of other thoughts about that.
But as long as it's fast enough I would pull the wildcard.
fricke
You may also be interested in the entity-map function from Tupelo Datomic. Given an EID (or a Lookup Ref) it will return the full record as a regular Clojure map:
(let [
; Retrieve James' attr-val pairs as a map. An entity can be referenced either by EID or by a
; LookupRef, which is a unique attribute-value pair expressed as a vector.
james-map (td/entity-map (live-db) james-eid) ; lookup by EID
james-map2 (td/entity-map (live-db) [:person/name "James Bond"] ) ; lookup by LookupRef
]
(is (= james-map james-map2
{:person/name "James Bond" :location "London" :weapon/type #{:weapon/wit :weapon/gun} } ))

Polymorphic Schemas in Clojure

I want to create polymorphic schemas/types, and I'm curious of best practices. The following 2 examples allow me to create a Frequency schema which can repeat an event monthly by day of month, or monthly by day of week (eg, every 15th, or every first monday, respectively).
The first one uses the experimental abstract map to accomplish this, and its syntax of it is awkward (IMO). Plus being in the experimental package concerns me a bit.
The second one uses s/conditional, and this suffers from not being able to easily coerce the value of type from a string to keyword, which is useful when dealing with a REST API, or JSON. (whereas s/eq is great for this).
In the general case, is one of these, or some third option, the best practice for conveying: Type A is one of Types #{B C D ...}?
Two options:
;;OPTION 1
​
(s/defschema Frequency (field (abstract-map/abstract-map-schema
:type {})
{}))
(abstract-map/extend-schema MonthlyByDOM Frequency
[:monthly-by-dom]
{:days #{MonthDay}})
(abstract-map/extend-schema MonthlyByDOW Frequency
[:monthly-by-dow]
{:days #{WeekDay}
:weeks #{(s/enum 1 2 3 4 5)}})
;;OPTION 2
​
(s/defschema MonthlyByDOM "monthly by day of month, eg every 13th and 21st day" {:type (s/eq :monthly-by-dom)
:days #{MonthDay}})
(s/defschema MonthlyByDOW "monthly by day of week, eg first, and third friday" {:type (s/eq :monthly-by-dow)
:days #{WeekDay}
:weeks #{(s/enum 1 2 3 4 5)}})
(s/defschema Frequency (field (s/conditional #(= (s/eq :monthly-by-dom) (do (prn %) (:type %))) MonthlyByDOM
#(= :monthly-by-dow (:type %)) MonthlyByDOW)
{:default {:type :monthly-by-dom
:days #{1 11 21}}}))
Similar questions that don't quite help:
https://groups.google.com/forum/#!topic/prismatic-plumbing/lMvazYXRAQQ
Polymorphic schema validation in Clojure
Validating multiple polymorphic values using Prismatic Schema

Different values for MD 5 hash depending on technique

I'm trying to find a good way to hash a string. This method is working fine, but the results are not consistent with this website:
(defn hash-string
"Use java interop to flexibly hash strings"
[string algo base]
(let [hashed
(doto (java.security.MessageDigest/getInstance algo)
(.reset)
(.update (.getBytes string)))]
(.toString (new java.math.BigInteger 1 (.digest hashed)) base))
)
(defn hash-md5
"Generate a md5 checksum for the given string"
[string]
(hash-string string "MD5" 16)
)
When I use this, I do indeed get hashes. The problem is I'm trying a programming exercise at advent of code and it has its own examples of string hashes which offer a 3rd result different from the above 2!
How can one do an md5 in the "standard" way that is always expected?
Your MD5 operations are correct; you're just not displaying them properly.
Since an MD5 is 32 hexadecimal characters long, you need to format the string to pad it out correctly.
In other words, simply change this expression:
(.toString (new java.math.BigInteger 1 (.digest hashed)) base))
to one that uses format:
(format "%032x" (new java.math.BigInteger 1 (.digest hashed)))))

Cypher QL doesn't execute atomically query

I'm starting to work with Neo4j and I noticed a really bad behaviour when updating a property on a node I am reading at the same moment. The clojure code I wrote use Neocons library to communicate with Neo4j:
(ns ams.utils.t.test-cypher-error
(:require [clojurewerkz.neocons.rest :as rest]
[clojurewerkz.neocons.rest.nodes :as nodes]
[clojurewerkz.neocons.rest.cypher :as cypher]))
(rest/connect! "http://192.168.0.101:7474/db/data")
(def counter-id (:id (nodes/create {:counter 0})))
(defn update-counter [] (cypher/query "START c = node({id}) SET c.counter = c.counter + 1 RETURN c.counter as counter" {"id" counter-id}))
(doall (apply pcalls (repeat 10 update-counter)))
(println "Counter:" ((comp :counter :data) (nodes/get counter-id)))
(nodes/destroy counter-id)
Guess the result:
Counter: 4
Sometimes is 5, sometimes 4, but you got the problem here: between the START and the SET clause the value of the counter is changed but cypher doesn't catch it!
Two questions here:
Am I doing something wrong?
Is there any viable unique counter rest generation algorithm in Neo4j?
Neo4j version is 1.9RC1, thanks in advance!
The problem you're encountering is that neo4j does not have implicit read-locking. So here's what sometimes happens:
Query 1 starts
Query 1 reads counter value as 3
Query 1 sets counter values to 3+1 = 4
Query 2 starts
Query 2 reads counter value as 4
Query 2 sets counter values to 4+1 = 5
And here's what sometimes happens:
Query 1 starts
Query 2 starts
Query 1 reads counter value as 3
Query 2 reads counter value as 3
Query 1 sets counter values to 3+1 = 4
Query 2 sets counter values to 3+1 = 4
In read-locking databases (like most SQL servers) situation #2 could not happen. Query 2 would start and then would block until query 1 either committed or rolled back.
There may be a way to explicitly set read locks, but not through the REST API. The transaction API looks promising, though I'm not entirely sure it can give you what you want, and again, it is not supported via REST.

Lisp, add new list to db in "for" loop, why returning NIL?

I wonder, how can I print in LISP each new value from loop "for" in new list, which each time creates by calling the function.
I have created the func:
(defun make (id name surname) (list :id id :name name :surname surname) )
Here I created the global variable:
(defvar *db* nil)
And here I defined the func for adding each new value to store it in db:
(defun add (cd) (push cd *db*))
So, I'm able to add each new data to db, like that:
(add (make 0 "Oleg" "Orlov" ) )
To look the content of my db , I can use:
*db*
So, I wonder how to put each new record-list to db using "for" loop, I print values in "for" loop in lisp like this:
(loop for i from 1 to 10 do ( ... ))
If , I use:
(loop for i from 0 to 10 do (add (make i "Oleg" "Orlov") ) )
If you read db using *db* you will see, that all evelen records were added, but after calling the last line, you will get NIL result in return.
Why do I catch NIL result, not T and what does it mean?
Thanks, best regards!
Every form in Lisp evaluates to something.
If a form you type in doesn't return a value, it will evaluate to NIL by default (otherwise, it evaluates to the value(s) it returns). Your loop doesn't actually return a value itself; it just performs 10 assignments (each of the intermediate expressions do return a value, but you don't collect and return them). Therefore, that code will return NIL.
If you haven't done so already, check out chapter 3 of Practical Common Lisp, in which Peter Seibel goes step-by-step through creating a simple database. It might give you some insights into the basics of how Lisp works. The specific question you ask (why forms return NIL by default, and what it means specifically in the context of Common Lisp) is answered in chapter 2 of the same book
As for how you would explicitly cause your loop to emit the list of items it added to *db*, try the following
(loop for i from 1 to 10
for elem = (make i "Oleg" "Orlov")
do (add elem)
collect elem)