`with` being ignored by Korma (Clojure)

I have the following code:
(defentity users
  (database korma-db)
  (has-many tags))

(defentity tags
  (database korma-db)
  (belongs-to users))

(-> (select* users)
    (with tags)
    (fields :address)
    (where {:id 1})
    (as-sql))
and it generates the following sql:
SELECT "users"."address" FROM "users" WHERE ("users"."id" = ?)
I would expect it to include a join to the tags table, by merit of having the with macro applied. Clearly this isn't the case, yet executing the query does produce an (empty) :tags key in the single returned record.
Am I missing something here?

Did you create the actual referential constraint on the database?
I think I had the same issue once, and I fixed it by creating a foreign key when defining the field, i.e. in PostgreSQL:
CREATE TABLE tags (
  ...
  users_id INTEGER REFERENCES users(id)
);
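For what it's worth, Korma can also be told the foreign-key column explicitly on the relationship itself; a hedged sketch (assuming the users_id column from the DDL above, and that I'm remembering the :fk option of the relationship DSL correctly):
;; naming the FK column on the relationship; no database-level
;; constraint is needed for Korma to build the tags query
(defentity users
  (database korma-db)
  (has-many tags {:fk :users_id}))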


How can GORM's FirstOrCreate() method (or Django's get_or_create) ensure that just one row is created?

I'm considering using GORM for an application and was looking into how FirstOrCreate works, and it seems that it uses two database operations. Consider this example script:
package main

import (
    "github.com/jinzhu/gorm"
    _ "github.com/jinzhu/gorm/dialects/sqlite"
    "github.com/sirupsen/logrus"
)

type User struct {
    gorm.Model
    Name string
    Age  uint
}

func main() {
    db, err := gorm.Open("sqlite3", "examplegorm.db")
    if err != nil {
        logrus.Fatalf("open db: %v", err)
    }
    defer db.Close()
    db.LogMode(true)

    db.AutoMigrate(&User{})

    var user User
    db.Where(User{Name: "non_existing"}).Attrs(User{Age: 20}).FirstOrCreate(&user)
}
Upon running this and inspecting the logs, I see that (aside from the auto-migration) it uses two queries, one SELECT and one INSERT:
kurt@Kurts-MacBook-Pro-13 ~/D/Scratch> go run gorm_example.go
(/Users/kurt/Documents/Scratch/gorm_example.go:23)
[2020-01-05 09:09:10] [1.03ms] CREATE TABLE "users" ("id" integer primary key autoincrement,"created_at" datetime,"updated_at" datetime,"deleted_at" datetime,"name" varchar(255),"age" integer )
[0 rows affected or returned ]
(/Users/kurt/Documents/Scratch/gorm_example.go:23)
[2020-01-05 09:09:10] [0.86ms] CREATE INDEX idx_users_deleted_at ON "users"(deleted_at)
[0 rows affected or returned ]
(/Users/kurt/Documents/Scratch/gorm_example.go:26)
[2020-01-05 09:09:10] [0.28ms] SELECT * FROM "users" WHERE "users"."deleted_at" IS NULL AND (("users"."name" = 'non_existing')) ORDER BY "users"."id" ASC LIMIT 1
[0 rows affected or returned ]
(/Users/kurt/Documents/Scratch/gorm_example.go:26)
[2020-01-05 09:09:10] [0.31ms] INSERT INTO "users" ("created_at","updated_at","deleted_at","name","age") VALUES ('2020-01-05 09:09:10','2020-01-05 09:09:10',NULL,'non_existing',20)
[1 rows affected or returned ]
As I understand from https://stackoverflow.com/a/16128088/995862, however,
In a SQL DBMS, the select-test-insert approach is a mistake: nothing prevents another process from inserting the "missing" row between your select and insert statements.
It seems that Django's get_or_create() works in a similar fashion. Given this model,
from django.db import models

class User(models.Model):
    name = models.CharField(max_length=255)
    age = models.PositiveIntegerField()
if I enable database logging and run a get_or_create() query I see
In [1]: from djangoapp.models import *
In [2]: User.objects.get_or_create(name="jinzhu", age=20)
(0.000) SELECT "djangoapp_user"."id", "djangoapp_user"."name", "djangoapp_user"."age" FROM "djangoapp_user" WHERE ("djangoapp_user"."age" = 20 AND "djangoapp_user"."name" = 'jinzhu') LIMIT 21; args=(20, 'jinzhu')
(0.000) BEGIN; args=None
(0.000) INSERT INTO "djangoapp_user" ("name", "age") VALUES ('jinzhu', 20); args=['jinzhu', 20]
Out[2]: (<User: User object (1)>, True)
In short, if I want to be sure that only one record gets created, it seems that I should refrain from using an ORM such as GORM or the Django ORM and write my own query?
A second question I have is how to get the equivalent of Django's created Boolean in GORM. Should I determine whether the RowsAffected of the resulting gorm.DB is 1 to determine whether a row was actually created or not?
You should just add a UNIQUE constraint on the fields you query by; that is enough to keep things consistent in the db.
For Django, that means adding a Meta class to the model:
class Meta:
    unique_together = ['name', 'age']
For GORM:
Name string `gorm:"unique_index:idx_name_age"`
Age  uint   `gorm:"unique_index:idx_name_age"`
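With the unique index in place, two racing FirstOrCreate calls can still both miss on the SELECT, and one of them will then fail the INSERT with a uniqueness violation. A minimal sketch of handling that, continuing the script from the question (hedged: the recovery path is illustrative, and whether RowsAffected distinguishes found from created varies by GORM version, so verify against LogMode output as above):
// Sketch: retry the lookup when a concurrent insert wins the race.
tx := db.Where(User{Name: "jinzhu", Age: 20}).FirstOrCreate(&user)
if tx.Error != nil {
    // Likely the unique constraint fired because another process inserted
    // the row between our SELECT and INSERT; the row exists now, so re-read it.
    if err := db.Where(User{Name: "jinzhu", Age: 20}).First(&user).Error; err != nil {
        logrus.Fatalf("lookup after conflict: %v", err)
    }
}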

DynamoDB: Global Secondary Index utilisation in queries

I am coming from an RDBMS background and I started using DynamoDB recently.
I have the following DynamoDB table with three Global Secondary Indexes (GSIs):
Id (primary key), user_id (GSI), event_type (GSI), product_id (GSI), rate, create_date
I have the following four query patterns:
a) WHERE event_type=?
b) WHERE event_type=? AND product_id=?
c) WHERE product_id=?
d) WHERE product_id=? AND user_id=?
I know that in MySQL I would need to create the following indexes to optimize these queries:
composite index (event_type, product_id): for queries "a" and "b"
composite index (product_id, user_id): for queries "c" and "d"
My question is: if I create three GSIs for the 'event_type', 'product_id' and 'user_id' fields in DynamoDB, do the query patterns "b" and "d" utilize these three independent GSIs?
Firstly, unlike an RDBMS, DynamoDB does not choose a GSI based on the fields used in a filter expression (that is, there is no query optimizer to pick the appropriate index based on the fields used in the query).
You have to query the GSI directly to get the data. You can refer to the GSI query page to understand more about this.
You can create two GSIs:
1) event_type
2) product_id
Make sure to include the other required fields, especially product_id, user_id and anything else you need, in each GSI's projection. That way, when you query the GSI, you get all the fields required to fulfill the use case. As long as you query on the GSI's key, you can put the other fields in a filter expression to narrow the data, as sketched below. This ensures that you don't create unnecessary GSIs, which require additional space and cost.
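To make this concrete, here is a hedged boto3 sketch; the table name "events" and index name "event_type-index" are hypothetical, and it assumes the GSI's partition key is event_type with product_id projected into the index:
import boto3
from boto3.dynamodb.conditions import Attr, Key

table = boto3.resource("dynamodb").Table("events")  # hypothetical table name

# Pattern (b): WHERE event_type=? AND product_id=?
# The key condition must target the GSI's own key; product_id is narrowed
# with a filter expression, which runs after items are read from the index.
resp = table.query(
    IndexName="event_type-index",  # hypothetical GSI name
    KeyConditionExpression=Key("event_type").eq("purchase"),
    FilterExpression=Attr("product_id").eq("p-123"),
)
items = resp["Items"]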

Recommended way to declare Datomic schema in Clojure application

I'm starting to develop a Datomic-backed Clojure app, and I'm wondering what's the best way to declare the schema, in order to address the following concerns:
Having a concise, readable representation for the schema
Ensuring the schema is installed and up-to-date prior to running a new version of my app.
Intuitively, my approach would be the following:
Declaring some helper functions to make schema declarations less verbose than with the raw maps
Automatically installing the schema as part of the initialization of the app (I'm not yet knowledgeable enough to know if that always works).
Is this the best way to go? How do people usually do it?
I use Conformity for this (see the Conformity repository). There is also a very useful blog post from Yeller that walks through how to use Conformity.
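A hedged sketch of typical Conformity usage (the norm name and resource path are hypothetical; read-resource and ensure-conforms are the API from the io.rkn/conformity README):
(require '[io.rkn.conformity :as c]
         '[datomic.api :as d])

(def conn (d/connect "datomic:mem://my-db")) ; hypothetical URI

;; resources/schema.edn holds norms keyed by name, e.g.
;; {:my-project/schema {:txes [[{:db/ident :user/email ...}]]}}
(def norms-map (c/read-resource "schema.edn"))

;; Idempotent: norms that have already been transacted are skipped,
;; so this is safe to run on every app start.
(c/ensure-conforms conn norms-map [:my-project/schema])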
Raw maps are verbose, but have some great advantages over a high-level API:
Schema is defined in transaction form; what you specify is transactable (assuming the word exists).
Your schema is not tied to a particular library or spec version; it will always work.
Your schema is serializable (edn) without calling a spec API.
So you can store and deploy your schema more easily in a distributed environment, since it's in data-form and not in code-form.
For those reasons I use raw maps.
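For reference, a minimal attribute in raw map form (a sketch in the classic explicit-install style used elsewhere in this thread; the attribute itself is made up):
;; users.schema.edn -- one attribute, in plain transactable edn
[{:db/id                 #db/id [:db.part/db]
  :db/ident              :user/email
  :db/valueType          :db.type/string
  :db/cardinality        :db.cardinality/one
  :db/unique             :db.unique/identity
  :db/doc                "A user's email address"
  :db.install/_attribute :db.part/db}]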
As for automatically installing the schema: this I don't do.
Usually when you make a change to your schema, many things may be happening:
Add a new attribute
Change an existing attribute's type
Create a full-text index for an attribute
Create a new attribute from other values
Others
These may require you to change your existing data in some non-obvious, non-generic way, in a process that can take some time.
I do use some automation for applying a list of schemas and schema changes, but always in a controlled "deployment" stage, when more things regarding data updating may occur.
Assuming you have users.schema.edn and roles.schema.edn files:
(require '[datomic-manage.core :as manager])

(manager/create uri)
(manager/migrate uri [:users.schema
                      :roles.schema])
For #1, datomic-schema might be of help. I haven't used it, but the example looks promising.
My preference (and I'm biased, as the author of the library) lies with datomic-schema - it focuses on only doing the transformation to normal Datomic schema - from there, you transact the schema as you would normally.
I am looking to use the same data to calculate schema migrations between the live Datomic instance and the definitions - so that the enums, types and cardinality get changed to conform to your definition.
The important part (for me) of datomic-schema is that the exit path is very clean - If you find it doesn't support something (that I can't implement for whatever reason) down the line, you can dump your schema as plain edn, save it off and remove the dependency.
Conformity will be useful beyond that if you want to do some kind of data migration, or more specific migrations (cleaning up the data, or renaming to something else first).
Proposal: use transaction functions to make declaring schema attributes less verbose in edn, thus preserving the benefits of declaring your schema in edn as demonstrated by @Guillermo Winkler's answer.
Example:
;; defining the helper function
[{:db/id #db/id[:db.part/user]
  :db/doc "Helper function for defining entity field schema attributes in a concise way."
  :db/ident :utils/field
  :db/fn #db/fn {:lang :clojure
                 :requires [[datomic.api :as d]]
                 :params [_ ident type doc opts]
                 :code [(cond-> {:db/cardinality :db.cardinality/one
                                 :db/fulltext true
                                 :db/index true
                                 :db.install/_attribute :db.part/db
                                 :db/id (d/tempid :db.part/db)
                                 :db/ident ident
                                 :db/valueType (condp get type
                                                 #{:db.type/string :string}   :db.type/string
                                                 #{:db.type/boolean :boolean} :db.type/boolean
                                                 #{:db.type/long :long}       :db.type/long
                                                 #{:db.type/bigint :bigint}   :db.type/bigint
                                                 #{:db.type/float :float}     :db.type/float
                                                 #{:db.type/double :double}   :db.type/double
                                                 #{:db.type/bigdec :bigdec}   :db.type/bigdec
                                                 #{:db.type/ref :ref}         :db.type/ref
                                                 #{:db.type/instant :instant} :db.type/instant
                                                 #{:db.type/uuid :uuid}       :db.type/uuid
                                                 #{:db.type/uri :uri}         :db.type/uri
                                                 #{:db.type/bytes :bytes}     :db.type/bytes
                                                 type)}
                          doc (assoc :db/doc doc)
                          opts (merge opts))]}}]
;; ... then (in a later transaction) using it to define application model attributes
[[:utils/field :person/name :string "A person's name" {:db/index true}]
 [:utils/field :person/age  :long   "A person's age"  nil]]
I would suggest using Tupelo Datomic to get started. I wrote this library to simplify Datomic schema creation and ease understanding, much as you allude to in your question.
As an example, suppose we're trying to keep track of information for the world's premier spy agency. Let's create a few attributes that will apply to our heroes & villains (see the executable code in the unit test).
(:require [tupelo.datomic :as td]
          [tupelo.schema  :as ts])
; Create some new attributes. Required args are the attribute name (an optionally namespaced
; keyword) and the attribute type (full listing at http://docs.datomic.com/schema.html). We wrap
; the new attribute definitions in a transaction and immediately commit them into the DB.
(td/transact *conn*  ;  required          required          zero-or-more
  ;                     <attr name>       <attr value type> <optional specs ...>
  (td/new-attribute :person/name      :db.type/string  :db.unique/value)     ; each name is unique
  (td/new-attribute :person/secret-id :db.type/long    :db.unique/value)     ; each secret-id is unique
  (td/new-attribute :weapon/type      :db.type/ref     :db.cardinality/many) ; one may have many weapons
  (td/new-attribute :location         :db.type/string)                       ; all default values
  (td/new-attribute :favorite-weapon  :db.type/keyword))                     ; all default values
For the :weapon/type attribute, we want to use an enumerated type since there are only a limited number of choices available to our antagonists:
; Create some "enum" values. These are degenerate entities that serve the same purpose as an
; enumerated value in Java (these entities will never have any attributes). Again, we
; wrap our new enum values in a transaction and commit them into the DB.
(td/transact *conn*
  (td/new-enum :weapon/gun)
  (td/new-enum :weapon/knife)
  (td/new-enum :weapon/guile)
  (td/new-enum :weapon/wit))
Let’s create a few antagonists and load them into the DB. Note that we are just using plain Clojure values and literals here, and we don’t have to worry about any Datomic specific conversions.
; Create some antagonists and load them into the db. We can specify some of the attribute-value
; pairs at the time of creation, and add others later. Note that whenever we are adding multiple
; values for an attribute in a single step (e.g. :weapon/type), we must wrap all of the values
; in a set. Note that the set implies there can never be duplicate weapons for any one person.
; As before, we immediately commit the new entities into the DB.
(td/transact *conn*
  (td/new-entity {:person/name "James Bond" :location "London"    :weapon/type #{:weapon/gun :weapon/wit}})
  (td/new-entity {:person/name "M"          :location "London"    :weapon/type #{:weapon/gun :weapon/guile}})
  (td/new-entity {:person/name "Dr No"      :location "Caribbean" :weapon/type :weapon/gun}))
Enjoy!
Alan

enumerated types not overwriting even though cardinality/one

In writing a rating system, I want people to be able to rate posts, but I only want there to be one rating per user.
So in my schema I have something like
{:db/id #db/id[:db.part/db -1]
 :db/ident :rating/value
 :db/valueType :db.type/ref
 :db/cardinality :db.cardinality/one   ; <- thinking this serves a purpose
 :db/doc "rating applied to this particular post"
 :db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/user -2]
 :db/ident :rating.value/verypositive}
{:db/id #db/id[:db.part/user -3]
 :db/ident :rating.value/positive}
{:db/id #db/id[:db.part/user -4]
 :db/ident :rating.value/needswork}
I want only one rating per email to be in effect at any time, but I am a little stumped.
When I submit several ratings to a post
>(add-rating-to-post 1759 "so@gm.co" "verypositive")
>(add-rating-to-post 1759 "so@gm.co" "needswork")
>(add-rating-to-post 1759 "so@gm.co" "positive")
>(add-rating-to-post 1759 "so@gm.co" "verypositive")
The transaction works fine, but when I query for the ratings attached to a particular post-eid I get something like
({:bid 1759,
  :rating :rating.value/verypositive,
  :email "sova@web"}
 {:bid 1759,
  :rating :rating.value/positive,
  :email "sova@web"}
 {:bid 1759,
  :rating :rating.value/needswork,
  :email "sova@web"})
Really, all I want is the latest one, so a returned list of all the ratings a user submitted where I can take (last x) would be great.
...but instead it populates until there is one of each of the enumerated types, and then disregards additions.
Any suggestions on how I can achieve the behavior I'm striving for?
Many thanks in advance
What your schema is essentially saying at the moment is that any one entity may have at most one :rating/value.
I would suggest that 'a single rating per user' should not be a schema constraint but rather a domain-level concern. The appropriate way to implement it is to allow multiple ratings per post per user, and then write a transaction function that checks whether a user has rated a post before, and either denies rating again or retracts the old rating (depending on the behaviour you want).
You would also want to treat the rating itself as an entity, if you're not doing that already, so that you have :rating/post (ref), :rating/value (ref) and :rating/email attributes, and create a new entity for every rating.
Your schema is correct, and your use of idents for enumeration is correct.
Instead of creating a new rating entity every time, use a query to check whether one already exists with the rated :bid and the rating user's :email. If so, transact the new :rating/value with a :db/add assertion (don't worry about retractions; they are created implicitly for :cardinality/one attributes). If not, create one as you are already doing.
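A minimal sketch of that query-then-assert approach (hedged: the attribute names follow the entity shape suggested in the previous answer, and the classic peer API is assumed):
(require '[datomic.api :as d])

(defn upsert-rating! [conn post-eid email value]
  (let [db       (d/db conn)
        existing (ffirst (d/q '[:find ?r
                                :in $ ?post ?email
                                :where
                                [?r :rating/post ?post]
                                [?r :rating/email ?email]]
                              db post-eid email))]
    (if existing
      ;; cardinality/one: the old :rating/value is retracted implicitly
      @(d/transact conn [[:db/add existing :rating/value value]])
      @(d/transact conn [{:db/id        (d/tempid :db.part/user)
                          :rating/post  post-eid
                          :rating/email email
                          :rating/value value}]))))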
If you need to do it atomically, e.g. to avoid the creation of two rating entities by the same user on one post in a race condition, you then need all of that to happen inside the transactor, by writing and using a database function.
If you need to see all the ratings a user has given over time, use the history database to query how an entity's :rating/value has changed.
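A sketch of that history query (assuming the classic peer API and a known rating entity id rating-eid):
;; every value :rating/value has held on this entity, with the transaction
;; and whether the datom was asserted (true) or retracted (false)
(d/q '[:find ?v ?tx ?added
       :in $ ?r
       :where [?r :rating/value ?v ?tx ?added]]
     (d/history (d/db conn))
     rating-eid)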

How do I use django's Q with django taggit?

I have a Result object that is tagged with "one" and "two". When I try to query for objects tagged "one" and "two", I get nothing back:
q = Result.objects.filter(Q(tags__name="one") & Q(tags__name="two"))
print len(q)
# prints zero, was expecting 1
Why does it not work with Q? How can I make it work?
The way django-taggit implements tagging is essentially through a many-to-many relationship. In such cases there is a separate table in the database that holds these relations. It is usually called a "through" or intermediate model, as it connects the two models. In the case of django-taggit it is called TaggedItem. So you have the Result model, which is your model, and you have two models, Tag and TaggedItem, provided by django-taggit.
When you make a query such as Result.objects.filter(Q(tags__name="one")) it translates to looking up rows in the Result table that have a corresponding row in the TaggedItem table that has a corresponding row in the Tag table that has the name="one".
Trying to match two tag names would translate to looking up rows in the Result table that have a corresponding row in the TaggedItem table that has a corresponding row in the Tag table that has both name="one" AND name="two". You obviously never have that, as a row holds only one value; it's either "one" or "two".
These details are hidden away from you in the django-taggit implementation, but this is what happens whenever you have a many-to-many relationship between objects.
To resolve this you can:
Option 1
Query tag after tag, evaluating the results each time, as suggested in the other answers. This might be okay for two tags, but will not be good when you need to look for objects that have 10 tags set on them. Here is one way to do this that results in two queries and gets you the result:
# get the IDs of the Result objects tagged with "one"
query_1 = Result.objects.filter(tags__name="one").values('id')
# use this in a second query to filter the ID and look for the second tag.
results = Result.objects.filter(pk__in=query_1, tags__name="two")
You could achieve this with a single query, so you only have one trip from the app to the database, which would look like this:
from django.db.models import Exists, OuterRef

# create a django subquery - this is not evaluated, but used to construct the final query
subquery = Result.objects.filter(pk=OuterRef('pk'), tags__name="one").values('id')
# perform a combined query using the subquery against the database
results = Result.objects.filter(Exists(subquery), tags__name="two")
This only makes one trip to the database. (Note: filtering on subquery expressions requires Django 3.0.)
But you are still limited to two tags. If you need to check for 10 tags or more, the above is not really workable...
Option 2
Query the relationship table directly instead, and aggregate the results in a way that gives you the object IDs.
from django.contrib.contenttypes.models import ContentType
from django.db.models import Count
from taggit.models import TaggedItem

# django-taggit uses ContentTypes, so we need to pick up the content type from cache
result_content_type = ContentType.objects.get_for_model(Result)
tag_names = ["one", "two"]
tagged_results = (
    TaggedItem.objects.filter(tag__name__in=tag_names, content_type=result_content_type)
    .values('object_id')
    .annotate(occurrence=Count('object_id'))
    .filter(occurrence=len(tag_names))
    .values_list('object_id', flat=True)
)
TaggedItem is the hidden table in the django-taggit implementation that contains the relationships. The above queries that table, aggregates all the rows referring to either the "one" or the "two" tag, groups the results by object ID, and then picks those where the object ID had as many tags as you are looking for.
This is a single query, and at the end it gets you the IDs of all the objects that have been tagged with both tags. It is also the exact same query whether you need 2 tags or 200.
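From there, the aggregated IDs can be fed straight back into a queryset; a one-line usage sketch continuing the snippet above:
# fetch the actual Result rows for the IDs collected above
results = Result.objects.filter(pk__in=list(tagged_results))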
Please review this and let me know if anything needs clarification.
First of all, these three are the same:
Result.objects.filter(tags__name="one", tags__name="two")
Result.objects.filter(Q(tags__name="one") & Q(tags__name="two"))
Result.objects.filter(tags__name__in=["one"]).filter(tags__name__in=["two"])
I think the name field is a CharField, and no record can be equal to "one" and "two" at the same time.
In Python terms the query looks like this (always false, which is why you are getting no results):
from random import choice

name = choice(["abtin", "shino"])
# a single value can never equal two different strings at once
if name == "abtin" and name == "shino":
    print("unreachable")
We use Q objects to implement OR or other complex queries.
In the example that works, you do an AND on two Python objects (querysets). That gets applied across records, not necessarily to the same tag record having both "one" and "two".
P.S. Why do you use the __in filter?
q = Result.objects.filter(tags__name__in=["one"]).filter(tags__name__in=["two"])
Add .distinct() to remove duplicates if you are expecting more than one unique object.