Doctrine returning strange field names from query - doctrine-orm

I am using "doctrine/doctrine-orm-module": "^2.1" (it is a module for zend framework 3). I want to create a query which will return rows with field names (trivial, right?). But instead of exact names of fields I am getting this query result:
SELECT
u0_.id AS id_0, u0_.username AS username_1, u0_.email AS email_2,
u0_.first_name AS first_name_3, u0_.last_name AS last_name_4,
u0_.password AS password_5, u0_.status AS status_6, u0_.created AS created_7,
u0_.modified AS modified_8
FROM
user_item u0_
ORDER BY
u0_.id DESC
This query is generated by this code:
$entityManager = $this->getEntityManager();
$queryBuilder = $entityManager->createQueryBuilder();
$queryBuilder->select('u')
->from(UserItem::class, 'u')
->orderBy('u.id', 'DESC')
;
$query = $queryBuilder->getQuery();
echo $query->getSql();
print_r($query->getParameters());
die('|||');
What is the "0_" appended to the table alias? What is appending "_x" to the field names?
How can I get plain field and table names without the appended "_x"?

Just names, I'm assuming both the first_name and last_name as shown in that generated SQL, right?
I changed the order below, makes it easier to read / understand.
What you want to do is (pseudo code): Select from UserItem all the first & last names
So, write the code that way :)
$queryBuilder
->from(UserItem::class, 'u')
->select(['u.first_name', 'u.last_name'])
->orderBy('u.id', 'DESC'); // Might want to sort by either u.first_name or u.last_name
What's in the QueryBuilder?
->from(UserItem::class, 'u') - The first parameter is the FQCN (Fully Qualified Class Name) of the entity you wish to use with the QueryBuilder. The second parameter is the alias by which the rest of the query refers to that entity. (In Doctrine ORM 2.x the alias is a required argument of from(); it is not derived from the class name.)
->select(['u.first_name', 'u.last_name']) - Function takes a "mixed" param. Click through to its definition and you'll see the following in the function:
$selects = is_array($select) ? $select : func_get_args();
Which indicates that it will always pass the "$selects" on the next bit as an array. (Another hint is that $selects is plural)
->orderBy('u.id', 'DESC') - Creates a rule to order results by. If you click through to this function, you'll see that this one ends like so:
return $this->add('orderBy', $orderBy);
Meaning: the ordering is stored as one part of the query. To order by more than one expression, chain addOrderBy() after it.
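When it comes to the generated SQL (getSql() prints SQL, not DQL):
u0_ is the table alias Doctrine generated for user_item, as seen in FROM user_item u0_ in your question. It appears to be built from the first letter of the table name, a running counter and an underscore.
The _N suffix on the column names is likewise a running counter over the selected columns, which keeps every column alias in the result unique; it does not reflect the order in which the columns were created in the database. Doctrine uses these aliases internally to map the result back onto entity properties, so you normally never have to deal with them.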
Lastly, the fact that you were receiving entire entities and not just the names (first_name & last_name) is due to ->select('u'). Because no property (or properties, as shown above) is defined, Doctrine assumes you wish to receive the whole enchilada. Doing ->select('u.first_name') would then get you just the first names, and using an array as above would get you more than one property.
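To make the alias naming less mysterious, here is a small Python sketch that reproduces the pattern visible in the generated SQL from the question. It is not Doctrine's code: build_select is a hypothetical helper, and the rule it applies (first letter of the table plus a counter for the table alias, column name plus a running counter for each column alias) is an assumption inferred from that output.

```python
from itertools import count


def build_select(table, columns):
    """Mimic the alias pattern seen in the generated SQL (illustrative only)."""
    # Assumed rule: first letter of the table name, a counter, then "_".
    table_alias = f"{table[0].lower()}0_"
    # Assumed rule: column name plus a running counter, so each alias is unique.
    counter = count()
    select_list = ", ".join(
        f"{table_alias}.{col} AS {col}_{next(counter)}" for col in columns
    )
    return f"SELECT {select_list} FROM {table} {table_alias}"


sql = build_select("user_item", ["id", "username", "email"])
print(sql)
```

The point of the sketch is only that the suffixes come from counting the selected columns, which is why they run 0, 1, 2 in select order.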
Hope that helped you out :)

Related

Return object when aggregating grouped fields in Django

Assuming the following example model:
# models.py
class event(models.Model):
    location = models.CharField(max_length=10)
    type = models.CharField(max_length=10)
    date = models.DateTimeField()
    attendance = models.IntegerField()
I want to get the attendance number for the latest date of each event location and type combination, using Django ORM. According to the Django Aggregation documentation, we can achieve something close to this, using values preceding the annotation.
... the original results are grouped according to the unique combinations of the fields specified in the values() clause. An annotation is then provided for each unique group; the annotation is computed over all members of the group.
So using the example model, we can write:
event.objects.values('location', 'type').annotate(latest_date=Max('date'))
which does indeed group events by location and type, but does not return the attendance field, which is what I actually need.
Another approach I tried was to use distinct i.e.:
event.objects.distinct('location', 'type').annotate(latest_date=Max('date'))
but I get an error
NotImplementedError: annotate() + distinct(fields) is not implemented.
I found some answers which rely on database specific features of Django, but I would like to find a solution which is agnostic to the underlying relational database.
Alright, I think this one might actually work for you. It is based upon an assumption, which I think is correct.
When you create your model objects, they should all be unique. It seems highly unlikely that you would have two events on the same date, in the same location and of the same type. So with that assumption, let's begin. (As a formatting note, class names tend to start with a capital letter to differentiate classes from variables or instances.)
# First you get your desired events with your criteria.
results = Event.objects.values('location', 'type').annotate(latest_date=Max('date'))
# Make an empty 'list' to store the values you want.
results_list = []
# Then iterate through your 'results' looking up objects
# you want and populating the list.
for r in results:
    result = Event.objects.get(location=r['location'], type=r['type'], date=r['latest_date'])
    results_list.append(result)
# Now you have a list of objects that you can do whatever you want with.
You might have to look up the exact output of Max('date'), but this should get you on the right path.
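The same two-step idea (find the latest date per group, then look up the full row for that date) can also be written as one portable SQL statement. The sketch below uses Python's built-in sqlite3 module rather than the Django ORM, with made-up sample rows and ISO date strings; it assumes, like the answer above, that each location/type/date combination is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event (location TEXT, type TEXT, date TEXT, attendance INTEGER)"
)
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?, ?)",
    [
        ("hall", "talk", "2020-01-01", 10),
        ("hall", "talk", "2020-02-01", 25),
        ("park", "gig", "2020-01-15", 300),
    ],
)

# Inner query: latest date per (location, type) group.
# Join: fetch the full row that carries that date, including attendance.
latest_rows = conn.execute(
    """
    SELECT e.location, e.type, e.date, e.attendance
    FROM event AS e
    JOIN (
        SELECT location, type, MAX(date) AS latest_date
        FROM event
        GROUP BY location, type
    ) AS g
      ON e.location = g.location
     AND e.type = g.type
     AND e.date = g.latest_date
    ORDER BY e.location
    """
).fetchall()
print(latest_rows)
```

This does in one round trip what the loop above does with one query per group, at the cost of stepping outside the ORM.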

Calcite LogicalAggregate

What is a proper way to associate an AggregateCall that is part of a HAVING expression with a corresponding field in a RelRecordType for the LogicalAggregate? If the AggregateCall is not part of the SELECT clause, the LogicalAggregate's RelRecordType still has it, but the AggregateCall's name attribute is set to NULL and RelRecordType.getField(AggregateCall.getName()) can't be used in this case. If the AggregateCall is a part of the final output, its name is set and RelRecordType.getField(AggregateCall.getName()) returns the right field.
Use field ordinals rather than names.
In the world of Calcite RelNodes and RexNodes, field names are not that important; they exist mainly to help you understand the purpose of fields when debugging. Names of AggregateCalls are even less important; they exist so that Aggregate can give reasonable names to its fields, and if they don't exist, that's fine.
If your SELECT has N fields (numbered 0 .. N-1) and a HAVING clause, you will likely add the HAVING predicate as field N, apply a Filter relational operator, then apply a Project so that only fields 0 .. N-1 are returned. I'm pretty sure that this is what SqlToRelConverter does already.
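As a language-neutral illustration of working by ordinal, here is a Python sketch of that Aggregate, Filter, Project pipeline over plain tuples. It does not use Calcite; the query, the sample rows and the name HAVING_ORDINAL are all made up for the example, and it carries the HAVING-only aggregate (rather than the whole predicate) as the extra field.

```python
# Hypothetical query:
#   SELECT dept, COUNT(*) FROM emp GROUP BY dept HAVING SUM(salary) > 100
rows = [("a", 60), ("a", 70), ("b", 40)]

# Aggregate: fields 0..1 are the SELECT list, field 2 is the HAVING-only call.
groups = {}
for dept, salary in rows:
    count, total = groups.get(dept, (0, 0))
    groups[dept] = (count + 1, total + salary)
aggregated = [(dept, c, t) for dept, (c, t) in sorted(groups.items())]

HAVING_ORDINAL = 2  # position of SUM(salary) in the aggregate's row type
filtered = [r for r in aggregated if r[HAVING_ORDINAL] > 100]  # Filter
projected = [r[:HAVING_ORDINAL] for r in filtered]  # Project fields 0..N-1
print(projected)
```

Nothing here needs a field name: the unnamed aggregate is addressed purely by its position and then dropped by the final projection.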

Updating derived values in SQLAlchemy

Usual sqlalchemy usage:
my_prop = Column("my_prop", Text)
I would like different semantics. Let's say an object has a set of fields (propA, propB, propC). I would like to maintain a database column which is derived from these fields (let's say, propA + propB + propC). I would like the column to be updated whenever any one of these set of fields is updated. Thank you.
Hybrid properties provide the functionality you are looking for. They allow you to write python properties that are usable in queries.
Here's how you might start if you wanted to have a name column and provide access to first and last name properties.
from sqlalchemy.ext.hybrid import hybrid_property

# These methods go inside the mapped class that owns the name column.
@hybrid_property
def first_name(self):
    # get the first name from the name column
    ...

@first_name.setter
def first_name(self, value):
    # update the name column with the first name replaced
    ...

@first_name.expression
def first_name(cls):
    # return a sql expression that extracts the first name from the name column
    # this is appropriate to be used in queries
    ...
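To show what the getter and setter bodies might contain, here is a plain-Python analogue that uses the built-in property instead of hybrid_property, so it runs without SQLAlchemy. Person is a hypothetical class, its name attribute stands in for the mapped column, and the splitting rule (first word is the first name) is an assumption; the query-side expression part is not covered.

```python
class Person:
    """Plain-Python stand-in for a mapped class with a single name column."""

    def __init__(self, name):
        self.name = name  # stands in for the stored "name" column

    @property
    def first_name(self):
        # Derive the first name from the stored full name.
        return self.name.split(" ", 1)[0]

    @first_name.setter
    def first_name(self, value):
        # Replace only the first word, keeping the rest of the name.
        _, _, rest = self.name.partition(" ")
        self.name = f"{value} {rest}".strip()


p = Person("Ada Lovelace")
p.first_name = "Augusta"
print(p.name)
```

Because the derived value is computed from the stored attribute on every access, it cannot drift out of sync with it.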

Filtering on the concatenation of two model fields in django

With the following Django model:
class Item(models.Model):
    name = CharField(max_length=256)
    description = TextField()
I need to formulate a filter method that takes a list of n words (word_list) and returns the queryset of Items where each word in word_list can be found, either in the name or the description.
To do this with a single field is straightforward enough. Using the reduce technique described here (this could also be done with a for loop), this looks like:
q = reduce(operator.and_, (Q(description__contains=word) for word in word_list))
Item.objects.filter(q)
I want to do the same thing but take into account that each word can appear either in the name or the description. I basically want to query the concatenation of the two fields, for each word. Can this be done?
I have read that there is a concatenation operator in Postgresql, || but I am not sure if this can be utilized somehow in django to achieve this end.
As a last resort, I can create a third column that contains the combination of the two fields and maintain it via post_save signal handlers and/or save method overrides, but I'm wondering whether I can do this on the fly without maintaining this type of "search index" type of column.
The most straightforward way would be to use Q to do an OR:
lookups = [Q(name__contains=word) | Q(description__contains=word)
           for word in word_list]
Item.objects.filter(*lookups) # the same as and'ing them together
I can't speak to the performance of this solution as compared to your other two options (raw SQL concatenation or denormalization), but it's definitely simpler.
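For a feel of the SQL shape this produces, here is a sketch using Python's built-in sqlite3 module instead of Django: one OR group per word, with the groups joined by AND. The table contents are made up, and LIKE in SQLite is case-insensitive for ASCII, so it is only an approximation of the contains lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [
        ("red chair", "wooden, four legs"),
        ("blue table", "wooden top"),
        ("red lamp", "metal"),
    ],
)

word_list = ["red", "wooden"]

# One "(name LIKE ? OR description LIKE ?)" group per word, joined with AND.
clause = " AND ".join("(name LIKE ? OR description LIKE ?)" for _ in word_list)
# Each word is bound twice: once for name, once for description.
params = [f"%{word}%" for word in word_list for _ in range(2)]

matches = conn.execute(
    f"SELECT name FROM item WHERE {clause} ORDER BY name", params
).fetchall()
print(matches)
```

Only the row where every word appears in at least one of the two columns survives, which is the behaviour asked for in the question.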

How do I use django's Q with django taggit?

I have a Result object that is tagged with "one" and "two". When I try to query for objects tagged "one" and "two", I get nothing back:
q = Result.objects.filter(Q(tags__name="one") & Q(tags__name="two"))
print len(q)
# prints zero, was expecting 1
Why does it not work with Q? How can I make it work?
The way django-taggit implements tagging is essentially through a ManyToMany relationship. In such cases there is a separate table in the database that holds these relations. It is usually called a "through" or intermediate model, as it connects the two models. In the case of django-taggit this is called TaggedItem. So you have the Result model, which is your model, and you have two models, Tag and TaggedItem, provided by django-taggit.
When you make a query such as Result.objects.filter(Q(tags__name="one")) it translates to looking up rows in the Result table that have a corresponding row in the TaggedItem table that has a corresponding row in the Tag table that has the name="one".
Trying to match two tag names would translate to looking up rows in the Result table that have a corresponding row in the TaggedItem table that has a corresponding row in the Tag table that has both name="one" AND name="two". You obviously never have that, as there is only one value in a row: it's either "one" or "two".
These details are hidden away from you in the django-taggit implementation, but this is what happens whenever you have a ManyToMany relationship between objects.
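The effect is easy to reproduce outside Django. The sketch below uses Python's built-in sqlite3 module with three simplified, hypothetical tables standing in for Result, TaggedItem and Tag (the real django-taggit schema has more columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE result (id INTEGER PRIMARY KEY);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tagged_item (object_id INTEGER, tag_id INTEGER);
    INSERT INTO result VALUES (1);
    INSERT INTO tag VALUES (1, 'one'), (2, 'two');
    INSERT INTO tagged_item VALUES (1, 1), (1, 2);
    """
)

join = """
    SELECT r.id FROM result AS r
    JOIN tagged_item AS ti ON ti.object_id = r.id
    JOIN tag AS t ON t.id = ti.tag_id
"""
# Each joined row carries exactly one tag name, so this can never match.
both = conn.execute(join + " WHERE t.name = 'one' AND t.name = 'two'").fetchall()
# A single condition matches the joined row for that tag.
one = conn.execute(join + " WHERE t.name = 'one'").fetchall()
print(both, one)
```

Result 1 really does carry both tags, yet the combined condition finds nothing because it is tested against one joined row at a time.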
To resolve this you can:
Option 1
Query tag after tag evaluating the results each time, as it is suggested in the answers from others. This might be okay for two tags, but will not be good when you need to look for objects that have 10 tags set on them. Here would be one way to do this that would result in two queries and get you the result:
# get the IDs of the Result objects tagged with "one"
query_1 = Result.objects.filter(tags__name="one").values('id')
# use this in a second query to filter the ID and look for the second tag.
results = Result.objects.filter(pk__in=query_1, tags__name="two")
You could achieve this with a single query so you only have one trip from the app to the database, which would look like this:
# create django subquery - this is not evaluated, but used to construct the final query
subquery = Result.objects.filter(pk=OuterRef('pk'), tags__name="one").values('id')
# perform a combined query using a subquery against the database
results = Result.objects.filter(Exists(subquery), tags__name="two")
This would only make one trip to the database. (Note: filtering on sub-queries requires django 3.0).
But you are still limited to two tags. If you need to check for 10 tags or more, the above is not really workable...
Option 2
Query the relationship table directly instead and aggregate the results in a way that gives you the object IDs.
# django-taggit uses Content Types so we need to pick up the content type from cache
result_content_type = ContentType.objects.get_for_model(Result)
tag_names = ["one", "two"]
tagged_results = (
    TaggedItem.objects.filter(tag__name__in=tag_names, content_type=result_content_type)
    .values('object_id')
    .annotate(occurrence=Count('object_id'))
    .filter(occurrence=len(tag_names))
    .values_list('object_id', flat=True)
)
TaggedItem is the hidden table in the django-taggit implementation that contains the relationships. The above will query that table and aggregate all the rows that refer either to the "one" or "two" tags, group the results by the ID of the objects and then pick those where the object ID had the number of tags you are looking for.
This is a single query and at the end gets you the IDs of all the objects that have been tagged with both tags. It is also the exact same query regardless if you need 2 tags or 200.
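Stripped of the ORM, the technique is a GROUP BY with a HAVING COUNT equal to the number of requested tags. Here is a sketch with Python's built-in sqlite3 module; the tables are simplified stand-ins, the content type filter is left out, and it assumes each object/tag pair is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tagged_item (object_id INTEGER, tag_id INTEGER);
    INSERT INTO tag VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- object 1 has both tags, object 2 only "one", object 3 "two" and "three"
    INSERT INTO tagged_item VALUES (1, 1), (1, 2), (2, 1), (3, 2), (3, 3);
    """
)

tag_names = ["one", "two"]
placeholders = ", ".join("?" for _ in tag_names)

# Keep only objects whose count of matching tags equals the number requested.
object_ids = [
    row[0]
    for row in conn.execute(
        f"""
        SELECT ti.object_id
        FROM tagged_item AS ti
        JOIN tag AS t ON t.id = ti.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY ti.object_id
        HAVING COUNT(*) = ?
        """,
        [*tag_names, len(tag_names)],
    )
]
print(object_ids)
```

Growing tag_names changes only the bound parameters, not the shape of the query.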
Please review this and let me know if anything needs clarification.
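First of all, the first two of these express the same condition (the first is shown for illustration only, since Python rejects a repeated keyword argument), while the third, chained form is handled differently by Django for multi-valued relationships, because each filter() call can match a different tag:
Result.objects.filter(tags__name="one", tags__name="two")
Result.objects.filter(Q(tags__name="one") & Q(tags__name="two"))
Result.objects.filter(tags__name__in=["one"]).filter(tags__name__in=["two"])
I think the name field is a CharField, and no single record can be equal to "one" and "two" at the same time.
In Python terms, the first two queries look like this (always false, which is why you are getting no results):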
from random import choice
name = choice(["abtin", "shino"])
if name == "abtin" and name == "shino":
    ...  # never reached
We use Q objects to implement OR or more complex queries.
In the example that works, you AND two querysets. The condition then applies to any related tag record, not necessarily to one single record that is both "one" and "two".
PS: Why do you use the in filter?
q = Result.objects.filter(tags__name__in=["one"]).filter(tags__name__in=["two"])
Add .distinct() to remove duplicates if you expect more than one unique object.