Get several unique fields values in django - django

I have a model:
class MyModel(models.Model):
    a = models.CharField(...)
    b = models.CharField(...)
And some substring "substring".
To get all unique values from fields a and b that contain that substring, I can do this:
a_values = MyModel.objects.filter(a__icontains=substring).values_list("a", flat=True).distinct()
b_values = MyModel.objects.filter(b__icontains=substring).values_list("b", flat=True).distinct()
unique_values = list(set([*a_values, *b_values]))
Is it possible to rewrite it with one database request?
P.S. To clarify, given these objects:
MyModel(a="x", b="y")
MyModel(a="xx", b="xxx")
MyModel(a="z", b="xxxx")
for the substring "x" I expect to get:
unique_values = ["x", "xx", "xxx", "xxxx"]

You can make a union of two querysets that are never evaluated individually, like:
qa = MyModel.objects.filter(a__icontains=query).values_list('a', flat=True)
qb = MyModel.objects.filter(b__icontains=query).values_list('b', flat=True)
result = list(qa.union(qb))
Since qa and qb are lazy querysets, neither is evaluated on its own; only the union hits the database. The query that will be performed looks roughly like:
(
    SELECT mymodel.a
    FROM mymodel
    WHERE mymodel.a LIKE '%query%'
)
UNION
(
    SELECT mymodel.b
    FROM mymodel
    WHERE mymodel.b LIKE '%query%'
)
Furthermore, the union only selects distinct values. Even if a value occurs both in a and in b, it is retrieved only once.
For example:
>>> MyModel(a="x", b="y").save()
>>> MyModel(a="xx", b="xxx").save()
>>> MyModel(a="z", b="xxxx").save()
>>> query = 'x'
>>> qa = MyModel.objects.filter(a__icontains=query).values_list('a', flat=True)
>>> qb = MyModel.objects.filter(b__icontains=query).values_list('b', flat=True)
>>> list(qa.union(qb))
['x', 'xx', 'xxx', 'xxxx']
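The deduplicating behaviour of UNION can also be checked outside Django. The following is a minimal sketch using the standard-library sqlite3 module; the table name mymodel and the extra fourth row are assumptions made for illustration, and the SQL is hand-written rather than the exact statement Django generates (SQLite does not accept the parenthesised form shown above).

```python
import sqlite3

# In-memory table standing in for the MyModel table (the name is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO mymodel (a, b) VALUES (?, ?)",
    # The last row is added so that "x" and "xxx" occur in both columns.
    [("x", "y"), ("xx", "xxx"), ("z", "xxxx"), ("xxx", "x")],
)

pattern = "%x%"
rows = conn.execute(
    "SELECT a FROM mymodel WHERE a LIKE ? "
    "UNION "
    "SELECT b FROM mymodel WHERE b LIKE ?",
    (pattern, pattern),
).fetchall()

# UNION (unlike UNION ALL) drops duplicates across both SELECTs.
unique_values = sorted(value for (value,) in rows)
print(unique_values)
```

Sorting is done in Python only to make the output order predictable; a UNION without ORDER BY does not guarantee one.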

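Why not get both fields with values_list and then flatten the result?
from django.db.models import Q
rows = MyModel.objects.filter(
    Q(a__icontains=substring) | Q(b__icontains=substring)
).values_list("a", "b").distinct()
unique_values = sum(rows, ())
Note that sum needs a tuple as its start value here, because values_list returns tuples rather than lists; with [] it raises a TypeError. Alternatively, use chain.from_iterable from itertools (there is no flatten function there). Each matching row contributes both of its values, so the flattened result can contain duplicates and values that do not contain the substring (such as "y" in the question's example); those still have to be filtered out in Python. Sorry, I couldn't test this right now.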
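The flattening step can be sketched without a database. In this sketch the rows list is a hypothetical stand-in for what values_list("a", "b") would return for the question's three objects; the filtering and deduplication in Python are an assumption about what is needed to match the expected output.

```python
from itertools import chain

# Hypothetical rows, shaped like the tuples values_list("a", "b") returns.
rows = [("x", "y"), ("xx", "xxx"), ("z", "xxxx")]
substring = "x"

# Flatten the (a, b) tuples into one stream of values.
flattened = chain.from_iterable(rows)

# Keep only values containing the substring (case-insensitively, like
# icontains) and deduplicate them with a set.
unique_values = sorted(
    {value for value in flattened if substring.lower() in value.lower()}
)
print(unique_values)
```

Without the filter, "y" and "z" would end up in the result, because the OR condition matches whole rows rather than individual fields.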

Related

Django, get objects by multiple values

There is a model
class Fabric(models.Model):
    vendor_code = models.CharField(max_length=50)
    color = models.CharField(max_length=50)
    lot = models.CharField(max_length=50)
I have a list of dicts:
values = [
    {'vendor_code': '123', 'color': 'aodfe', 'lot': 'some lot 1'},
    {'vendor_code': '456', 'color': 'adfae', 'lot': 'some lot 2'},
    {'vendor_code': '789', 'color': 'dvade', 'lot': 'some lot 3'},
]
There are no ids in the dicts. How can I get the objects by checking a list of field values (all 3 values per object at the same time)?
I know that I can query one by one in a loop:
for item in values:
    fabric = Fabric.objects.filter(vendor_code=item['vendor_code'], color=item['color'], lot=item['lot'])
but the number of objects in the list can be large. Is there a proper way to get the objects at once, if they exist? Or at least to get them with a minimal number of database hits?
Thanks in advance!
You can use the in (__in) lookup like so:
fabrics = Fabric.objects.filter(
    vendor_code__in=[value['vendor_code'] for value in values],
    color__in=[value['color'] for value in values],
    lot__in=[value['lot'] for value in values],
)
This will, however, iterate over the values list 3 times. To iterate over it only once, use something like this:
vendor_codes = []
colors = []
lots = []
for value in values:
    vendor_codes.append(value['vendor_code'])
    colors.append(value['color'])
    lots.append(value['lot'])
fabrics = Fabric.objects.filter(
    vendor_code__in=vendor_codes,
    color__in=colors,
    lot__in=lots,
)
To filter according to all three values at the same time you will have to use Q objects like this:
from django.db.models import Q
q_objects = []
for value in values:
    q_objects.append(Q(
        vendor_code=value['vendor_code'],
        color=value['color'],
        lot=value['lot']
    ))
final_q_object = Q()
for q_object in q_objects:
    final_q_object.add(q_object, Q.OR)
fabrics = Fabric.objects.filter(final_q_object)
The gist of it is to build this query:
Q(a=i, b=j, c=k) | Q(a=l, b=m, c=n) | ...
Final answer after a bit of optimization:
final_query = Q()
for item in values:
    final_query.add(
        Q(
            vendor_code=item['vendor_code'],
            color=item['color'],
            lot=item['lot']
        ),
        Q.OR
    )
fabrics = Fabric.objects.filter(final_query)
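The difference between three independent __in filters and the OR of per-dict conditions can be illustrated in plain Python. This is only an in-memory sketch: the table list is made-up data standing in for the Fabric rows, and the list comprehensions mimic what the two kinds of filter would select.

```python
# Hypothetical table contents as (vendor_code, color, lot) tuples.
table = [
    ("123", "aodfe", "some lot 1"),  # matches the first dict exactly
    ("123", "adfae", "some lot 3"),  # mixes values taken from three different dicts
    ("456", "adfae", "some lot 2"),  # matches the second dict exactly
]
values = [
    {"vendor_code": "123", "color": "aodfe", "lot": "some lot 1"},
    {"vendor_code": "456", "color": "adfae", "lot": "some lot 2"},
    {"vendor_code": "789", "color": "dvade", "lot": "some lot 3"},
]

codes = {v["vendor_code"] for v in values}
colors = {v["color"] for v in values}
lots = {v["lot"] for v in values}

# Mimics vendor_code__in=..., color__in=..., lot__in=...: each field is
# checked on its own, so the mixed row also passes.
independent = [r for r in table if r[0] in codes and r[1] in colors and r[2] in lots]

# Mimics OR-ing one Q(vendor_code=..., color=..., lot=...) per dict: the
# three values have to match together.
wanted = {(v["vendor_code"], v["color"], v["lot"]) for v in values}
combined = [r for r in table if r in wanted]

print(len(independent), len(combined))
```

The mixed row is why the __in version can return more objects than the list actually asks for.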
You could use the field lookup "in".
# get all the vendor codes in a list
vendor_code_list = []
for v in values:
    vendor_code_list.append(v['vendor_code'])
# query all the fabrics
fabrics = Fabric.objects.filter(vendor_code__in=vendor_code_list)
If you want to match exact values and you are sure that your item keys are valid field names, you can simply do:
for item in values:
    fabric = Fabric.objects.filter(**item)
Otherwise, if you want to check whether your items are contained in existing items, you can do:
for item in values:
    item_in = {'{}__in'.format(key): val for key, val in item.items()}
    fabric = Fabric.objects.filter(**item_in)

How do I filter by occurrences count in another table in Django?

Here are my models:
class Zoo(TimeStampedModel):
    id = models.AutoField(primary_key=True)
class Animal(models.Model):
    id = models.AutoField(primary_key=True)
    zoo = models.ForeignKey(Zoo, on_delete=models.PROTECT, related_name='diffbot_results')
I would like to run a query like this:
Zoo.objects.filter("WHERE zoo.id IN (select zoo_id from animal_table group by zoo_id having count(*) > 10)")
One way is to use a raw queryset:
>>> from testapp.models import Zoo, Animal
>>> z1, z2 = Zoo(), Zoo()
>>> z1.save(), z2.save()
(None, None)
>>> z1_animals = [Animal(zoo=z1) for ii in range(5)]
>>> z2_animals = [Animal(zoo=z2) for ii in range(15)]
>>> x = [a.save() for a in z1_animals+z2_animals]
>>> qs = Zoo.objects.raw("select * from testapp_zoo zoo WHERE zoo.id IN (select zoo_id from testapp_animal group by zoo_id having count(1) > 10)")
>>> list(qs)
[<Zoo: Zoo object (2)>]
In theory, per the Django docs, it should be possible to pass a queryset to a regular .filter(id__in=<queryset>), but the queryset must select only one column. I can't find a way of adding the HAVING clause without also causing the queryset to select a num_animals column, which prevents it from being used in an __in filter expression.
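The SQL used in the raw queryset can be tried in isolation with the standard-library sqlite3 module. This is a minimal sketch: the table names zoo and animal are simplified assumptions (the answer's tables are testapp_zoo and testapp_animal), and the data mirrors the 5 and 15 animals created above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zoo (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE animal ("
    "id INTEGER PRIMARY KEY, zoo_id INTEGER REFERENCES zoo(id))"
)
conn.executemany("INSERT INTO zoo (id) VALUES (?)", [(1,), (2,)])
# Zoo 1 gets 5 animals, zoo 2 gets 15.
conn.executemany(
    "INSERT INTO animal (zoo_id) VALUES (?)", [(1,)] * 5 + [(2,)] * 15
)

# GROUP BY comes before HAVING; the subquery selects a single column,
# so it can be used on the right-hand side of IN.
zoo_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM zoo WHERE id IN ("
        "SELECT zoo_id FROM animal GROUP BY zoo_id HAVING COUNT(*) > 10"
        ") ORDER BY id"
    )
]
print(zoo_ids)
```

Only the zoo with more than 10 animals comes back, which matches the raw queryset result shown above.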

Aggregation on 'one to many' for matrix view in Django

I have two tables like the ones below.
They are in a one-to-many relationship (one History.testinfoid to many Result.testinfoid).
(The Result table is in an external database.)
class History(models.Model):  # default database
    idx = models.AutoField(primary_key=True)
    scenario_id = models.ForeignKey(Scenario)
    executor = models.CharField(max_length=255)
    createdate = models.DateTimeField()
    testinfoid = models.IntegerField(unique=True)
class Result(models.Model):  # external (Result.objects.using('external'))
    idx = models.AutoField(primary_key=True)
    testinfoid = models.ForeignKey(History, to_field='testinfoid', related_name='result')
    testresult = models.CharField(max_length=10)
    class Meta:
        unique_together = (('idx', 'testinfoid'),)
I want to count the rows in the Result table by their 'testresult' field.
It holds values such as 'Pass' or 'Fail'.
I want a queryset with a count for each value, like this:
[{'idx': 1, 'pass_count': 10, 'fail_count': 5, 'executor': 'someone', ...} ...
...
{'idx': 10, 'pass_count': 1, 'fail_count': 10, 'executor': 'someone', ...}]
Is it possible?
It is a two level aggregation where the second level should be displayed as table columns - "matrix view".
A) Solution with a Python loop that creates the count columns from the second-level annotation ("testresult").
from collections import OrderedDict
from django.db.models import Count
qs = (History.objects
      .values('pk', 'executor', 'testinfoid',... 'result__testresult')
      .annotate(result_count=Count('pk'))
      )
qs = qs.filter(...).order_by(...)
data = OrderedDict()
count_columns = ('pass_count', 'fail_count', 'error_count',
                 'expected_failure_count', 'unexpected_success_count')
for row in qs:
    # rows produced by .values() are dicts, so they are indexed by key
    data.setdefault(row['pk'], dict.fromkeys(count_columns, 0)).update(
        {(k if k != 'result_count' else row['result__testresult'] + '_count'): v
         for k, v in row.items()
         if k != 'result__testresult'
         }
    )
out = list(data.values())
The class OrderedDict is used to preserve the order given by order_by().
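The pivot performed by that loop can be followed on plain dicts. In this sketch the rows list is hypothetical data shaped like the output of the values().annotate() queryset, with lowercase testresult values assumed so that the generated keys line up with count_columns.

```python
from collections import OrderedDict

# Hypothetical rows: one per (History row, testresult) combination.
rows = [
    {"pk": 1, "executor": "someone", "result__testresult": "pass", "result_count": 10},
    {"pk": 1, "executor": "someone", "result__testresult": "fail", "result_count": 5},
    {"pk": 2, "executor": "other", "result__testresult": "fail", "result_count": 3},
]
count_columns = ("pass_count", "fail_count")

data = OrderedDict()
for row in rows:
    # One output entry per pk, with every count column preset to 0.
    entry = data.setdefault(row["pk"], dict.fromkeys(count_columns, 0))
    for key, value in row.items():
        if key == "result__testresult":
            continue  # only used to name the count column
        if key == "result_count":
            key = row["result__testresult"] + "_count"
        entry[key] = value

out = list(data.values())
print(out)
```

Each History row ends up as a single dict, and a testresult value with no rows keeps its preset count of 0.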
B) Solution with Subquery in Django 1.11+ (useful if the result must be a queryset, e.g. to be sorted or filtered in an Admin view by clicking, and if a more complicated query is acceptable and the number of *_count columns is very low). I can write a solution with a subquery, but I'm not sure whether the query will be fast enough on different database backends. Maybe someone else will answer.

Django check if querysets are equals

I have this django code
q1 = MyModel.objects.all()
q2 = MyModel.objects.all()
When I try:
print(q1 == q2)
I get as a result:
False
So how can I check if two querysets results in django are equal?
You can convert the querysets to lists and check whether they are equal:
list(q1) == list(q2)
You can convert them to sets to check whether the two querysets have the same elements, regardless of ordering:
set(q1) == set(q2)
it will return:
True
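How the list and set comparisons differ can be seen on plain Python lists. This is a small sketch in which the integer lists are hypothetical stand-ins for two evaluated querysets; collections.Counter is brought in here as an extra option for the case where duplicates matter, and is not part of the answers above.

```python
from collections import Counter

# Hypothetical primary keys standing in for evaluated querysets.
first = [1, 2, 3]
second = [3, 1, 2]
with_duplicate = [1, 2, 3, 3]

# List equality is sensitive to order.
same_order = first == second

# Set equality ignores order, but it also ignores duplicates.
same_members = set(first) == set(second)
set_hides_duplicates = set(first) == set(with_duplicate)

# Counter equality ignores order while still counting duplicates.
same_multiset = Counter(first) == Counter(with_duplicate)

print(same_order, same_members, set_hides_duplicates, same_multiset)
```

For querysets of model instances without joins, rows are normally unique by primary key, so the set comparison is usually enough.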
Try this:
q1.intersection(q2).count() == q1.count() and q1.count() == q2.count()
Throwing in my two cents for a function that compares two QuerySets for equality while ignoring sort order. Note I'm not actually checking whether the QuerySets are empty or not; I'll leave that up to you.
def querysets_are_same(qs1, qs2, exclude_fields=()):
    '''
    Check whether two querysets have the same content; the sort order of the querysets is ignored.
    Params:
    -------
    qs1 (QuerySet) - first queryset to compare
    qs2 (QuerySet) - second queryset to compare
    exclude_fields (iterable) - fields to exclude from comparison; the primary key field is always excluded.
    Returns:
    --------
    True if both querysets contain the same data while ignoring sort order; False otherwise
    '''
    # look up the primary key field names (raises IndexError on an empty queryset)
    pk_qs1 = qs1[0]._meta.pk.name
    pk_qs2 = qs2[0]._meta.pk.name
    # add the primary keys to the excluded fields
    exclude_fields_qs1 = set(exclude_fields) | {pk_qs1}
    exclude_fields_qs2 = set(exclude_fields) | {pk_qs2}
    # convert each queryset to a list of dicts without the excluded fields
    list_qs1 = [{k: v for k, v in d.items() if k not in exclude_fields_qs1} for d in qs1.values()]
    list_qs2 = [{k: v for k, v in d.items() if k not in exclude_fields_qs2} for d in qs2.values()]
    # sort the items within each row, then sort the rows
    list_qs1_sorted = sorted(sorted(d.items()) for d in list_qs1)
    list_qs2_sorted = sorted(sorted(d.items()) for d in list_qs2)
    return list_qs1_sorted == list_qs2_sorted
You can compare the results of .count(), or compare element by element:
q1 = Model.objects.all()
q2 = Model.objects.all()
equal = True
for idx, q in enumerate(q1):
    if q != q2[idx]:
        equal = False
        break
print(equal)

How to delete the first element of a row so that the whole row deleted from a list?

My list looks as follow:
items = []
a = "apple", 1.23
items.append(a)
b = "google", 2.33
items.append(b)
c = "ibm", 4.35
items.append(c)
Now I want to remove the "apple" row just by giving the name "apple".
How can I do that?
You can convert items into a dictionary, delete the entry with key apple and return the dictionary items:
>>> items
[('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]
>>> d = dict(items)
>>> del d['apple']
>>> items = d.items()
>>> items
[('ibm', 4.35), ('google', 2.33)]
In Python 3, you should wrap d.items() in list(), since .items() returns a dict_items view, which is iterable but not subscriptable:
>>> items = list(d.items())
I suggest that you use a proper data structure. In your case, a dict will do the trick.
items = {"apple": 1.23, "google": 2.33, "ibm": 4.35}
To delete, use:
items.pop("apple", None)
Since I can only accept one answer and, to be honest, I am not 100% satisfied with either, I haven't accepted any. I hope that's OK with you all.
I do the following, a combination of both of yours:
d = dict(items)
d.pop("apple", None)
myitem = d.items()
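Put together as a runnable snippet, the combined approach might look like this. It is a sketch that assumes Python 3.7 or later, where dicts keep insertion order, and the "missing" key is only there to show the effect of the None default.

```python
items = [("apple", 1.23), ("google", 2.33), ("ibm", 4.35)]

d = dict(items)
d.pop("apple", None)    # removes the "apple" row
d.pop("missing", None)  # the None default avoids a KeyError for absent names

# list() turns the dict_items view back into a list of tuples.
items = list(d.items())
print(items)
```

Because dicts preserve insertion order, the remaining rows keep their original relative order, unlike in the Python 2 session shown earlier.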
I think the best approach is to use a dictionary, as suggested by @Sricharan Madasi and @Moses Koledoye. However, since the OP seems to prefer keeping the data as a list of tuples, they may find this function useful:
def my_func(lst, key):
    return [(name, number) for (name, number) in lst if name != key]
The following interactive session demonstrates its usage:
>>> items = [('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]
>>> my_func(items, 'apple')
[('google', 2.33), ('ibm', 4.35)]
>>> my_func(items, 'ibm')
[('apple', 1.23), ('google', 2.33)]
>>> my_func(items, 'foo')
[('apple', 1.23), ('google', 2.33), ('ibm', 4.35)]