There is a model
class Fabric(models.Model):
    vendor_code = models.CharField(max_length=50)
    color = models.CharField(max_length=50)
    lot = models.CharField(max_length=50)
I've a list of objects
values = [
{'vendor_code': '123', 'color': 'aodfe', 'lot': 'some lot 1'},
{'vendor_code': '456', 'color': 'adfae', 'lot': 'some lot 2'},
{'vendor_code': '789', 'color': 'dvade', 'lot': 'some lot 3'},
]
There are no ids in the dicts. How can I get the objects whose field values match an entry in the list (all 3 values of an entry at the same time)?
I know that I can query one by one in loop as:
for item in values:
    fabric = Fabric.objects.filter(vendor_code=item['vendor_code'], color=item['color'], lot=item['lot'])
but the number of objects in the list can be large. Is there a proper way to get all the objects at once, if they exist? Or at least to get them with a minimal number of database hits?
Thanks in advance!
You can use the in (__in) filter like so:
fabrics = Fabric.objects.filter(
    vendor_code__in=[value['vendor_code'] for value in values],
    color__in=[value['color'] for value in values],
    lot__in=[value['lot'] for value in values],
)
This will, however, iterate the values list 3 times. To iterate it only once, use something like this:
vendor_codes = []
colors = []
lots = []
for value in values:
    vendor_codes.append(value['vendor_code'])
    colors.append(value['color'])
    lots.append(value['lot'])
fabrics = Fabric.objects.filter(
    vendor_code__in=vendor_codes,
    color__in=colors,
    lot__in=lots,
)
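One caveat, sketched here in plain Python rather than through the ORM: the three __in lookups are ANDed together but each one is checked independently, so a row that mixes values from different dicts also matches. The rows list and the two helper functions are hypothetical stand-ins for the table and the two filtering strategies, not Django APIs.

```python
# Hypothetical stand-in data: the requested triples and some stored rows.
values = [
    {'vendor_code': '123', 'color': 'aodfe', 'lot': 'some lot 1'},
    {'vendor_code': '456', 'color': 'adfae', 'lot': 'some lot 2'},
]
rows = [
    ('123', 'aodfe', 'some lot 1'),  # exactly the first requested triple
    ('123', 'adfae', 'some lot 2'),  # mixes values from both dicts
]

vendor_codes = {v['vendor_code'] for v in values}
colors = {v['color'] for v in values}
lots = {v['lot'] for v in values}
wanted = {(v['vendor_code'], v['color'], v['lot']) for v in values}


def matches_independent_in(row):
    """Mimics vendor_code__in AND color__in AND lot__in."""
    return row[0] in vendor_codes and row[1] in colors and row[2] in lots


def matches_exact_triple(row):
    """Mimics matching all three fields of one dict at the same time."""
    return row in wanted


loose = [r for r in rows if matches_independent_in(r)]
strict = [r for r in rows if matches_exact_triple(r)]
print(loose)   # both rows pass the independent checks
print(strict)  # only the exact triple passes
```

If only exact triples are wanted, the independent lookups can still serve as a cheap pre-filter, with the exact comparison done afterwards in Python.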
To filter according to all three values at the same time you will have to use Q objects like this:
from django.db.models import Q

q_objects = []
for value in values:
    q_objects.append(Q(
        vendor_code=value['vendor_code'],
        color=value['color'],
        lot=value['lot']
    ))
final_q_object = Q()
for q_object in q_objects:
    final_q_object.add(q_object, Q.OR)
fabrics = Fabric.objects.filter(final_q_object)
The gist of it is to get this query:
Q(a=i, b=j, c=k) | Q(a=l, b=m, c=n) | ...
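To make the shape of that query concrete, here is a minimal sketch using the standard-library sqlite3 module instead of Django. The fabric table and the find_fabrics helper are hypothetical, and the SQL is only an approximation of what an OR of Q objects would compile to.

```python
import sqlite3


def find_fabrics(conn, values):
    """Fetch all rows matching any (vendor_code, color, lot) triple in one query."""
    if not values:
        return []  # an empty OR would otherwise mean "no condition at all"
    clause = " OR ".join(["(vendor_code = ? AND color = ? AND lot = ?)"] * len(values))
    params = [v[key] for v in values for key in ("vendor_code", "color", "lot")]
    sql = ("SELECT vendor_code, color, lot FROM fabric WHERE "
           + clause + " ORDER BY vendor_code")
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fabric (vendor_code TEXT, color TEXT, lot TEXT)")
conn.executemany("INSERT INTO fabric VALUES (?, ?, ?)", [
    ("123", "aodfe", "some lot 1"),
    ("123", "adfae", "some lot 2"),   # mixed values: must not match
    ("789", "dvade", "some lot 3"),
])
values = [
    {"vendor_code": "123", "color": "aodfe", "lot": "some lot 1"},
    {"vendor_code": "456", "color": "adfae", "lot": "some lot 2"},
    {"vendor_code": "789", "color": "dvade", "lot": "some lot 3"},
]
found = find_fabrics(conn, values)
print(found)
```

For very long lists, databases may cap the number of bound parameters per statement, so splitting the list into batches is a reasonable precaution.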
Final answer after a bit of optimization:
final_query = Q()
for item in values:
    final_query.add(
        Q(
            vendor_code=item['vendor_code'],
            color=item['color'],
            lot=item['lot']
        ),
        Q.OR
    )
fabrics = Fabric.objects.filter(final_query)
You could use the field lookup "in".
# get all the vendor codes in a list
vendor_code_list = []
for v in values:
    vendor_code_list.append(v['vendor_code'])
# query all the fabrics
fabrics = Fabric.objects.filter(vendor_code__in=vendor_code_list)
If you want to match exact values and you are sure that your item keys are valid field names, you can simply do:
for item in values:
    fabric = Fabric.objects.filter(**item)
Otherwise, if you want to check whether your items are contained in existing items, you can do:
for item in values:
    item_in = {'{}__in'.format(key): val for key, val in item.items()}
    fabric = Fabric.objects.filter(**item_in)
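One thing to watch with the second snippet: the in lookup expects an iterable of candidate values, and a bare string is itself an iterable of characters. Below is a minimal sketch of building the keyword arguments with each value wrapped in a list. The build_in_lookups helper is hypothetical and only constructs the dict; it does not touch the database.

```python
def build_in_lookups(item):
    """Turn {'field': value} into {'field__in': [value]} keyword arguments."""
    return {'{}__in'.format(key): [val] for key, val in item.items()}


item = {'vendor_code': '123', 'color': 'aodfe'}
lookups = build_in_lookups(item)
print(lookups)

# Without the list, the string itself would be the iterable of candidates:
print(list('123'))
```

The resulting dict could then be passed along as Fabric.objects.filter(**lookups).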
Related
I am trying to generate a result that satisfies with the filter query below:
indicators = request.GET.getlist('indicators[]')
fmrprofiles = FMRPriority.objects.all()
q_objects = Q()
obj_filters = []
for indicator in indicators:
    split_i = indicator.split('_')
    if len(split_i) == 5:
        if not any(d['indicator'] == split_i[1] for d in obj_filters):
            obj_filters.append({
                'indicator': split_i[1],
                'scores': []
            })
        for o in obj_filters:
            if split_i[1] == o['indicator']:
                o['scores'].append(int(split_i[4]))
for obj in obj_filters:
    print (obj['scores'])
    q_objects.add(Q(pcindicator__id = int(obj['indicator'])) & Q(score__in=obj['scores']), Q.AND)
print (q_objects)
fmrprofiles = fmrprofiles.values('fmr__id','fmr__road_name').filter(q_objects).order_by('-fmr__date_validated')
print (fmrprofiles.query)
Basically, indicators is a list, e.g. ['indicator_1_scoring_1_5', 'indicator_1_scoring_1_4', 'indicator_2_scoring_2_5'].
I want to filter FMRPriority on the following fields:
pcindicator
score
For example, pcindicator equals 1 and the selected scores are 5 and 4; in another selection, pcindicator equals 2 and the selected score is 3.
The query q_objects.add(Q(pcindicator__id = int(obj['indicator'])) & Q(score__in=obj['scores']), Q.AND) returns an empty set. I have also tried raw SQL, with the same result.
Model:
class FMRPriority(models.Model):
    fmr = models.ForeignKey(FMRProfile, verbose_name=_("FMR Project"), on_delete=models.CASCADE)
    pcindicator = models.ForeignKey(PCIndicator, verbose_name=_("Priority Indicator"), on_delete=models.PROTECT)
    score = models.FloatField(_("Score"))
I solved this by using OR, counting the occurrences of each id, and then excluding the ids whose count is not equal to the number of filters:
for obj in obj_filters:
    print (obj['scores'])
    q_objects.add(
        (Q(fmrpriority__pcindicator__id = int(obj['indicator'])) & Q(fmrpriority__score__in=obj['scores'])), Q.OR
    )
fmrprofiles = fmrprofiles.values(*vals_to_display).filter(q_objects).annotate(
num_ins=Count('id'),
...
)).exclude(
~Q(num_ins = len(obj_filters))
).order_by('rank','road_name')
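Why the AND version returns nothing while the OR-plus-count version works can be sketched in plain Python. Each FMRPriority row holds a single (pcindicator, score) pair, so no single row can satisfy two different indicator conditions at once. The rows data and the fmrs_matching_all helper are made up for illustration.

```python
from collections import Counter

# Hypothetical FMRPriority rows: (fmr_id, pcindicator_id, score)
rows = [
    (1, 1, 5), (1, 2, 5),   # fmr 1 satisfies both filters
    (2, 1, 4), (2, 2, 1),   # fmr 2 satisfies only the first filter
]
obj_filters = [
    {'indicator': 1, 'scores': [5, 4]},
    {'indicator': 2, 'scores': [5]},
]


def row_matches(row, flt):
    """True when one row satisfies one indicator/scores filter."""
    return row[1] == flt['indicator'] and row[2] in flt['scores']


# AND on a single row: one row cannot have pcindicator 1 and 2 at once.
and_rows = [r for r in rows if all(row_matches(r, f) for f in obj_filters)]


def fmrs_matching_all(rows, obj_filters):
    """OR the filters per row, count hits per fmr, keep fmrs hit by every filter."""
    hits = Counter(r[0] for r in rows if any(row_matches(r, f) for f in obj_filters))
    return sorted(fmr for fmr, n in hits.items() if n == len(obj_filters))


print(and_rows)
print(fmrs_matching_all(rows, obj_filters))
```

The counting step assumes each fmr has at most one row per indicator; with duplicates, the count alone would no longer prove that every filter was hit.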
I am querying a set of objects with a condition like this:
filters_for_dates_report = ['birthday', 'job_anniversary', 'wedding_anniversary']
values_list_for_dates_report = ['id', 'name', 'date']
for filter_for_dates_report in filters_for_dates_report:
    filter_dict.update({filter_for_dates_report: {
        filter_for_dates_report + "__range" : [start_date, end_date]
    }})
list_of_Q = [Q(**{key: val}) for key, val in filter_dict.items()]
if list_of_Q:
    model_details = Model.objects.filter(
        reduce(operator.or_, list_of_Q)
    ).values(*values_list_for_dates_report)
Now I want to exclude the objects which have null values for filter_for_dates_report attributes.
A direct query would be
Model.objects.filter(
    Q(birthday__range=[start_date, end_date]) & Q(birthday__isnull=False)
).values(*values_list_for_dates_report)
But how can I do this for multiple fields, so that I get only the values that are within the range and not null for each of the filter_for_dates_report attributes?
Something like:
Model.objects.filter(
    (Q(birthday__range=[start_date, end_date]) & Q(birthday__isnull=False)) |
    (Q(marriage_anniversary__range=[start_date, end_date]) & Q(marriage_anniversary__isnull=False)) |
    (Q(job_anniversary__range=[start_date, end_date]) & Q(job_anniversary__isnull=False))
).values(*values_list_for_dates_report)
Loop over the fields and reduce the Q objects with the OR operator:
import operator
from functools import reduce  # reduce is not a builtin in Python 3
from django.db.models import Q

filter_dict = []
for filter_for_dates_report in filters_for_dates_report:
    filter_dict.append(Q(**{
        filter_for_dates_report + "__range": [start_date, end_date]
    }))
queryset = Model.objects.filter(
    reduce(operator.or_, filter_dict)
).values(*values_list_for_dates_report)
This creates a queryset in which the filters built in the loop are combined with OR.
You don't need to add __isnull if you add __range: in SQL, a NULL value never satisfies a range (BETWEEN) comparison, so rows with a null date are already excluded.
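The claim about null values can be checked with a small sketch using the standard-library sqlite3 module. The person table and its rows are invented for the demonstration, and the BETWEEN query is only an approximation of what a __range lookup translates to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, birthday TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [
    ("in_range", "2020-05-10"),
    ("out_of_range", "2021-01-01"),
    ("no_birthday", None),   # stored as NULL
])

# A plain range comparison, with no explicit IS NOT NULL condition.
names = [row[0] for row in conn.execute(
    "SELECT name FROM person WHERE birthday BETWEEN ? AND ? ORDER BY name",
    ("2020-01-01", "2020-12-31"),
)]
print(names)
```

The row with the NULL birthday is left out without any extra condition, because comparing NULL with anything does not evaluate to true.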
I have a model:
class MyModel(models.Model):
    a = models.CharField(...)
    b = models.CharField(...)
And some substring "substring"
To get all unique values of fields a and b that contain this substring, I can do this:
a_values = MyModel.objects.filter(a__icontains=substring).values_list("a", flat=True).distinct()
b_values = MyModel.objects.filter(b__icontains=substring).values_list("b", flat=True).distinct()
unique_values = list(set([*a_values, *b_values]))
Is it possible to rewrite it with one database request?
PS to clarify
from objects:
MyModel(a="x", b="y")
MyModel(a="xx", b="xxx")
MyModel(a="z", b="xxxx")
by substring "x" I expect to get:
unique_values = ["x", "xx", "xxx", "xxxx"]
Here you can make a union of two queries that are never evaluated individually, like:
qa = MyModel.objects.filter(a__icontains=query).values_list('a', flat=True)
qb = MyModel.objects.filter(b__icontains=query).values_list('b', flat=True)
result = list(qa.union(qb))
Since qa and qb are lazy querysets, we never evaluate these. The query that will be performed is:
(
    SELECT mymodel.a
    FROM mymodel
    WHERE mymodel.a LIKE %query%
)
UNION
(
    SELECT mymodel.b
    FROM mymodel
    WHERE mymodel.b LIKE %query%
)
The union furthermore will only select distinct values. Even if a value occurs both in a and in b, it will only be retrieved once.
For example:
>>> MyModel(a="x", b="y").save()
>>> MyModel(a="xx", b="xxx").save()
>>> MyModel(a="z", b="xxxx").save()
>>> query = 'x'
>>> qa = MyModel.objects.filter(a__icontains=query).values_list('a', flat=True)
>>> qb = MyModel.objects.filter(b__icontains=query).values_list('b', flat=True)
>>> list(qa.union(qb))
['x', 'xx', 'xxx', 'xxxx']
Why not get both of them in the values and then flatten the list?
unique_values = MyModel.objects.filter(
    Q(a__icontains=substring) | Q(b__icontains=substring)
).values_list("a", "b").distinct()
unique_values = sum(unique_values, [])
If sum doesn't work here, you can probably use chain.from_iterable from itertools. sum probably won't work as written because we get a sequence of tuples instead of a list of lists. Sorry, I couldn't test this right now.
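A caveat with this approach, sketched in plain Python: a row selected because a matches also returns its b value (and vice versa), so the flattened result has to be filtered again. The pairs list stands in for what values_list("a", "b") would return for the example data; it is not produced by a real query here.

```python
from itertools import chain

substring = "x"
# Stand-in for the (a, b) tuples of rows where a or b contains the substring.
pairs = [("x", "y"), ("xx", "xxx"), ("z", "xxxx")]

# chain.from_iterable flattens the tuples; the set removes duplicates.
flat = set(chain.from_iterable(pairs))

# Re-apply the case-insensitive containment test to drop values like "y" and "z".
unique_values = sorted(v for v in flat if substring.lower() in v.lower())
print(unique_values)
```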
I have the next model:
class Departments(Document):
    _id = fields.ObjectIdField()
    name = fields.StringField(blank=True, null=True)
    department_id = fields.StringField(blank=True, null=True)  # Added
    list_of_users = fields.ListField(blank=True, null=True)
    list_of_workstations = fields.ListField(blank=True, null=True)
As you can see list_of_users and list_of_workstations are lists of items.
I wrote some Python code that takes all the data from the DB, puts it into a dict and then sorts it as I need, but it is too slow.
How can I sort Departments right in the DB by the length of list_of_users or list_of_workstations or by ratio of list_of_users/list_of_workstations, something like:
departments = DepartmentStats.objects.order_by(len(list_of_users)).dsc
or
departments = DepartmentStats.objects.order_by(len(list_of_users)/len(list_of_workstations)).dsc
?
For your first request, use annotation, as Umut Gunebakan suggested in his comment. But I'm not sure about Count() on a ListField:
departments = DepartmentStats.objects.all().annotate(num_list_users=Count('list_of_users')).order_by('-num_list_users')
For descending order, you just need to prefix the field name with '-' (minus).
https://docs.djangoproject.com/en/1.10/ref/models/querysets/#order-by
The second request will be:
departments = DepartmentStats.objects.all().annotate(user_per_workstation=Count('list_of_users') / Count('list_of_workstations')).order_by('-user_per_workstation')
UPDATE: (Mongoengine used)
With mongoengine you need to get the item frequencies and sort the result:
Check this part of the documentation - further aggregation
list_user_freqs = DepartmentStats.objects.item_frequencies('list_of_users', normalize=True)
from operator import itemgetter
list_user_freqs_sorted = sorted(list_user_freqs.items(), key=itemgetter(1), reverse=True)
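If someone needs a raw query:
departments = DepartmentStats._get_collection().aggregate([
    {"$project": {
        "department_id": 1,
        "name": 1,
        "list_of_users": 1,
        # sort key: the length of the list, not the list itself
        "len_list_of_users": {"$size": "$list_of_users"},
    }},
    {"$sort": {"len_list_of_users": -1}},
])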
And for the case when the result must be sorted by the ratio list_of_users/list_of_workstations:
departments = DepartmentStats._get_collection().aggregate([
    {"$project": {
        "department_id": 1,
        "name": 1,
        "list_of_users": 1,
        "len_list_of_items": {"$divide": [{"$size": "$list_of_users"},
                                          {"$size": "$list_of_workstations"}]}
    }},
    {"$sort": {"len_list_of_items": -1}},
])
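For reference, what the ratio pipeline computes can be sketched in plain Python. The docs list is made-up sample data and ratio is a hypothetical helper. The zero check is an added assumption about how departments without workstations should be handled, since a division by zero would otherwise fail.

```python
# Made-up documents shaped like the Departments collection.
docs = [
    {"name": "A", "list_of_users": ["u1", "u2"], "list_of_workstations": ["w1"]},
    {"name": "B", "list_of_users": ["u1"], "list_of_workstations": ["w1", "w2"]},
    {"name": "C", "list_of_users": ["u1", "u2", "u3"], "list_of_workstations": []},
]


def ratio(doc):
    """Users per workstation; departments without workstations sort last."""
    workstations = len(doc["list_of_workstations"])
    if workstations == 0:
        return float("-inf")  # assumption: treat "no workstations" as lowest
    return len(doc["list_of_users"]) / workstations


by_ratio = sorted(docs, key=ratio, reverse=True)
print([d["name"] for d in by_ratio])
```

This runs in the application rather than in the database, so it only illustrates the ordering logic; it does not replace the aggregation for large collections.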
I am trying to create a dictionary from a list and tuple of tuples as illustrated below. I have to reverse map the tuples to the list and create a set of non-None column names.
Any suggestions on a Pythonic way to achieve the solution (the desired dictionary) are much appreciated.
MySQL table 'StateLog':
Name  NY    TX    NJ
Amy   1     None  1
Kat   None  1     1
Leo   None  None  1
Python code :
## Fetching data from MySQL table
#cursor.execute("select * from statelog")
#mydataset = cursor.fetchall()
## Fetching column names for mapping
#state_cols = [fieldname[0] for fieldname in cursor.description]
state_cols = ['Name', 'NY', 'TX', 'NJ']
mydataset = (('Amy', '1', None, '1'), ('Kat', None, '1', '1'), ('Leo', None, None, '1'))
temp = [zip(state_cols, each) for each in mydataset]
# Looks like I can't do a tuple comprehension for the following snippet : finallist = ((eachone[1], eachone[0]) for each in temp for eachone in each if eachone[1] if eachone[0] == 'Name')
for each in temp:
    for eachone in each:
        if eachone[1]:
            if eachone[0] == 'Name':
                k = eachone[1]
            print k, eachone[0]
print '''How do I get a dictionary in this format'''
print '''name_state = {"Amy": set(["NY", "NJ"]),
"Kat": set(["TX", "NJ"]),
"Leo": set(["NJ"])}'''
Output so far :
Amy Name
Amy NY
Amy NJ
Kat Name
Kat TX
Kat NJ
Leo Name
Leo NJ
Desired dictionary :
name_state = {"Amy": set(["NY", "NJ"]),
"Kat": set(["TX", "NJ"]),
"Leo": set(["NJ"])}
To be really honest, I would say your problem is that your code is becoming too cumbersome. Resist the temptation of "one-lining" it and create a function. Everything will become way easier!
mydataset = (
('Amy', '1', None, '1'),
('Kat', None, '1', '1'),
('Leo', None, None, '1')
)
def states(cols, data):
    """
    This function receives one of the tuples with data and returns a pair
    where the first element is the name from the tuple, and the second
    element is a set with all matched states. Well, at least *I* think
    it is more readable :)
    """
    name = data[0]
    states = set(state for state, value in zip(cols, data) if value == '1')
    return name, states
pairs = (states(state_cols, data) for data in mydataset)
# Since dict accepts an iterable of pairs, where the first element of each pair
# becomes a key and the second one becomes the value, I just pass
# a generator of all pairs to the dict constructor.
print dict(pairs)
The result is:
{'Amy': set(['NY', 'NJ']), 'Leo': set(['NJ']), 'Kat': set(['NJ', 'TX'])}
Looks like another job for defaultdict!
So let's create our defaultdict:
import collections
name_state = collections.defaultdict(set)
We now have a dictionary whose default value for any missing key is an empty set, so we can do something like this:
name_state['Amy'].add('NY')
Moving on, you just need to iterate over your data and add the right states to each name. Enjoy!
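The remaining iteration could look roughly like this. It is a sketch that reuses the state_cols and mydataset values from the question and assumes that any non-None value marks a state as set.

```python
from collections import defaultdict

state_cols = ['Name', 'NY', 'TX', 'NJ']
mydataset = (('Amy', '1', None, '1'), ('Kat', None, '1', '1'), ('Leo', None, None, '1'))

name_state = defaultdict(set)
for row in mydataset:
    name = row[0]
    # Pair each state column with its value, skipping the Name column.
    for state, value in zip(state_cols[1:], row[1:]):
        if value:  # None means the state is not set for this name
            name_state[name].add(state)

print(dict(name_state))
```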
You can do this as a dictionary comprehension (Python 2.7+):
from itertools import compress
name_state = {data[0]: set(compress(state_cols[1:], data[1:])) for data in mydataset}
or as a generator expression:
name_state = dict((data[0], set(compress(state_cols[1:], data[1:]))) for data in mydataset)
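For readers unfamiliar with compress, here is a small sketch of what it does for a single row, using the data from the question: it keeps an item from the first iterable whenever the selector at the same position is truthy.

```python
from itertools import compress

state_cols = ['Name', 'NY', 'TX', 'NJ']
row = ('Amy', '1', None, '1')

# Skip the Name column; None selectors drop the corresponding state.
kept = list(compress(state_cols[1:], row[1:]))
print(kept)
```

Note that any truthy selector counts, so a non-empty string such as '0' would also select its column.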