I have three models:
class Box(models.Model):
    name = models.TextField(blank=True, null=True)

class Toy(models.Model):
    box = models.ForeignKey(Box, related_name='toys')

class ToyAttributes(models.Model):
    toy = models.ForeignKey(Toy)
    color = models.ForeignKey(Color, related_name='colors')
And a list:
pairs = [[10, 3], [4, 5], [1, 2]]
where every item is a pair of box and color IDs.
I need to filter this data and return the Box objects that have toys of the required color.
Currently I do this:
for n in pairs:
    box = Box.objects.filter(id=n[0], toys__colors=n[1])
    if box.exists():
        ...
But this takes a long time for long lists, presumably because it issues one SQL query per pair. Can I make it faster? Is it possible to get only the needed boxes with a single query, and if so, how? Thanks!
You should look at Django's Q objects and build the query in a loop, OR-ing in one condition per pair, like so:
from django.db.models import Q

query = Q()
for box_id, toy_color in [[10, 3], [4, 5], [1, 2]]:
    # Each pair contributes (id = box_id AND color = toy_color), OR-ed with the rest
    query |= Q(id=box_id) & Q(toys__colors=toy_color)
Box.objects.filter(query)
This should work for you.
from django.db.models import Q
pairs = [[10, 3], [4, 5], [1, 2]]
conditions = [Q(id=box) & Q(toys__colors=color) for box, color in pairs]
query = Q()
for c in conditions:
    query |= c
Box.objects.filter(query)
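If the OR-ed query itself becomes unwieldy for very long lists, one possible alternative is to fetch candidate (box id, color id) rows in a single query and match them against a set in Python. The following is a minimal sketch of the matching step only, assuming the rows have already been fetched; `rows` and `matching_box_ids` are hypothetical names, and the literal rows stand in for database results.

```python
def matching_box_ids(rows, pairs):
    """Return the box ids whose (box_id, color_id) row appears in pairs."""
    wanted = {tuple(pair) for pair in pairs}  # set membership is cheap per row
    matched = []
    seen = set()
    for box_id, color_id in rows:
        if (box_id, color_id) in wanted and box_id not in seen:
            seen.add(box_id)
            matched.append(box_id)
    return matched


pairs = [[10, 3], [4, 5], [1, 2]]
# Hypothetical rows, standing in for (box id, color id) tuples from one query.
rows = [(10, 3), (10, 7), (4, 6), (1, 2), (1, 2)]
print(matching_box_ids(rows, pairs))  # [10, 1]
```

The resulting ids could then be passed to a single `id__in` lookup to load the Box objects.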
I have two tables, Visit and disease. The visit table has a column for the disease. I am trying to get the top 5 diseases from the visit table.
dis = disease.objects.all()
for d in dis:
    v = visits.objects.filter(disease=d.disease_name).count()
    data = {
        d.disease_name: v
    }
    print(data)
This prints every disease with its count, as below:
{'Headache': 2}
{'Cold': 1}
{'Cough': 4}
{'Dog Bite': 0}
{'Fever': 2}
{'Piles': 3}
{'Thyroid': 4}
{'Others': 9}
I want to get the top 5 from this list based on count. How can I do it?
Thank you all for your replies. I found another, simpler solution:
from django.db.models import Count
x = visits.objects.values('disease').annotate(disease_count=Count('disease')).order_by('-disease_count')[:5]
print(x)
It returns the following:
<QuerySet [{'disease': 'Others', 'disease_count': 9}, {'disease': 'Thyroid', 'disease_count': 4}, {'disease': 'Cough', 'disease_count': 4}, {'disease': 'Piles', 'disease_count': 3}, {'disease': 'Headache', 'disease_count': 2}]>
I think this is the simplest solution. It works for me.
Add the data to a list and sort the list by count:
dis = disease.objects.all()
l = list()
for d in dis:
    v = visits.objects.filter(disease=d.disease_name).count()
    data = {
        d.disease_name: v
    }
    l.append(data)
# Sort the one-item dicts by their single value, highest count first
l.sort(reverse=True, key=lambda x: list(x.values())[0])
for i in range(min(len(l), 5)):
    print(l[i])
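If the counts are kept in a single dict rather than a list of one-item dicts, the top five can be picked without sorting everything. Here is a minimal sketch using the counts printed in the question; `counts` and `top_n` are illustrative names, not part of the original code.

```python
import heapq

# Counts copied from the output shown in the question.
counts = {
    'Headache': 2, 'Cold': 1, 'Cough': 4, 'Dog Bite': 0,
    'Fever': 2, 'Piles': 3, 'Thyroid': 4, 'Others': 9,
}


def top_n(counts, n=5):
    """Return the n (name, count) pairs with the highest counts."""
    return heapq.nlargest(n, counts.items(), key=lambda item: item[1])


for name, count in top_n(counts):
    print(name, count)
```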
You can sort these values with code like this:
import operator

diseases = list(Disease.objects.values_list('disease_name', flat=True))
visits = list(
    Visits.objects.filter(disease__in=diseases).values_list('disease', flat=True))
data = {}
for name in diseases:
    # list.count() runs in Python on the in-memory list, not in the database
    count = visits.count(name)
    data[name] = count
sorted_data = sorted(data.items(), key=operator.itemgetter(1), reverse=True)
new_data = {}
for idx in range(min(len(sorted_data), 5)):
    item = sorted_data[idx]
    new_data[item[0]] = item[1]
print(new_data)
It's a little messy, but it does the job.
I also reduced the number of queries, so the code should run a bit faster. Wrapping .values_list(...) in list() loads the values into memory once, and calling native Python functions such as list.count() on that list avoids hitting the database again for every disease.
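Calling list.count() once per disease rescans the whole visits list each time. A possible alternative, sketched here with collections.Counter, tallies every name in one pass. The `visit_diseases` list is made-up sample data standing in for the values_list(...) result.

```python
from collections import Counter

# Made-up sample standing in for the list of disease names from the visits table.
visit_diseases = ['Cough', 'Others', 'Cough', 'Fever', 'Others', 'Others', 'Cold']

tally = Counter(visit_diseases)   # one pass over the data
top_two = tally.most_common(2)    # highest counts first
print(top_two)  # [('Others', 3), ('Cough', 2)]
```

Diseases with no visits never enter the tally, but looking one up (for example `tally['Dog Bite']`) still returns 0.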
I am having some trouble filtering a pandas DataFrame on a column (let's call it column_1) whose values are lists. Specifically, I want to return only the rows where the intersection of column_1 and another predetermined list is not empty. However, when I try to put the logic inside the arguments of the .where function, I always get errors. Below are my attempts, with the errors returned.
Attempting to test whether a single element is inside the list:
table[element in table['column_1']]
returns the error ...
KeyError: False
Trying to compare a list to all of the lists in the rows of the DataFrame:
table[[349569] == table.column_1]
returns the error
Arrays were different lengths: 23041 vs 1
I'm trying to get these two intermediate steps working before I test the intersection of the two lists.
Thanks for taking the time to read over my problem!
Consider the pd.Series s:
s = pd.Series([[1, 2, 3], list('abcd'), [9, 8, 3], ['a', 4]])
print(s)
0 [1, 2, 3]
1 [a, b, c, d]
2 [9, 8, 3]
3 [a, 4]
dtype: object
And a test list, test:
test = ['b', 3, 4]
Apply a lambda function that converts each element of s to a set and intersects it with test:
print(s.apply(lambda x: list(set(x).intersection(test))))
0 [3]
1 [b]
2 [3]
3 [4]
dtype: object
To use it as a mask, use bool instead of list:
s.apply(lambda x: bool(set(x).intersection(test)))
0 True
1 True
2 True
3 True
dtype: bool
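The two intermediate checks from the question (one element in a row's list, and any overlap with a second list) can also be written as plain functions and handed to Series.apply. This is a minimal sketch on ordinary lists; `contains` and `overlaps` are hypothetical helper names.

```python
def contains(row_values, element):
    """True if a single element is in this row's list."""
    return element in row_values


def overlaps(row_values, wanted):
    """True if the row's list shares at least one item with wanted."""
    return not set(wanted).isdisjoint(row_values)


rows = [[1, 2, 3], list('abcd'), [9, 8, 3], ['a', 4]]
test = ['b', 3, 4]

print([contains(r, 3) for r in rows])     # [True, False, True, False]
print([overlaps(r, test) for r in rows])  # [True, True, True, True]
print(overlaps([7, 8], test))             # False
```

With pandas, the same predicate would be used as `table[table['column_1'].apply(lambda x: overlaps(x, test))]`.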
For long-term use, you can wrap the whole workflow in functions and apply them wherever you need. Since you did not include an example dataset, I will use one of my own: a set of text posts. First I extract the hashtags from each post into a list, then I check them against the tags I am interested in and filter the data.
import re

import pandas as pd

# find all the tags in the message
def find_hashtags(post_msg):
    combo = r'#\w+'
    rx = re.compile(combo)
    hash_tags = rx.findall(post_msg)
    return hash_tags

# check for the required match against a tag list and return True or False
def match_tags(tag_list, htag_list):
    matched_items = bool(set(tag_list).intersection(htag_list))
    return matched_items
test_data = [{'text': 'Head nipid mõnusateks sõitudeks kitsastel tänavatel. #TipStop'},
{'text': 'Homses Rooli Võimus uus #Peugeot208!\nVaata kindlasti.'},
{'text': 'Soovitame ennast tulevikuks ette valmistada, electric car sest uus #PeugeotE208 on peagi kohal! ⚡️⚡️\n#UnboringTheFuture'},
{'text': "Aeg on täiesti uueks roadtrip'i kogemuseks! \nLase ennast üllatada - #Peugeot5008!"},
{'text': 'Tõeline ikoon, mille stiil avaldab muljet läbi eco car, electric cars generatsioonide #Peugeot504!'}
]
test_df = pd.DataFrame(test_data)
# find all the hashtags
test_df["hashtags"] = test_df["text"].apply(lambda x: find_hashtags(x))
# the only hashtags we are interested in
tag_search = ["#TipStop", "#Peugeot208"]
# match the tags in our list
test_df["tag_exist"] = test_df["hashtags"].apply(lambda x: match_tags(x, tag_search))
# filter the data
main_df = test_df[test_df.tag_exist]
There is a model with a ManyToMany field:
class Number(models.Model):
    current_number = models.IntegerField()

class MyModel(models.Model):
    numbers_set = models.ManyToManyField(Number)
For example, we have this dataset:
my_model_1.numbers_set = [1, 2, 3, 4]
my_model_2.numbers_set = [2, 3, 4, 5]
my_model_3.numbers_set = [3, 4, 5, 6]
my_model_4.numbers_set = [4, 5, 6, 7]
my_model_5.numbers_set = [4, 5, 6, 7]
I'm looking for a way to group MyModel objects by how many numbers they have in common,
e.g. MyModel objects that share at least 3 numbers in their numbers_set:
[
[my_model_1, my_model_2],
[my_model_2, my_model_3],
[my_model_3, my_model_4, my_model_5],
]
If you are using Postgres 9.4 or later and Django 1.9 or later, it may be better to use a JSONField() rather than a ManyToManyField(). For indexing, use a jsonb index on Postgres, which gives you efficient queries for fetching the data.
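Whichever storage is used, the "at least 3 numbers in common" test itself can be sketched in plain Python. This covers only the pairwise step, assuming each object's numbers have already been loaded into a dict of sets; `number_sets` and `pairs_sharing` are illustrative names, and merging the pairs into the larger groups shown in the question is left out.

```python
from itertools import combinations

# The dataset from the question, keyed by object name.
number_sets = {
    'my_model_1': {1, 2, 3, 4},
    'my_model_2': {2, 3, 4, 5},
    'my_model_3': {3, 4, 5, 6},
    'my_model_4': {4, 5, 6, 7},
    'my_model_5': {4, 5, 6, 7},
}


def pairs_sharing(number_sets, minimum=3):
    """Return pairs of keys whose sets have at least `minimum` common items."""
    return [
        (a, b)
        for a, b in combinations(sorted(number_sets), 2)
        if len(number_sets[a] & number_sets[b]) >= minimum
    ]


for pair in pairs_sharing(number_sets):
    print(pair)
```

Comparing every pair is quadratic in the number of objects, so this is only practical for modest dataset sizes.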
Say I have a dict of country -> [cities] (potentially an ordered dict):
{'UK': ['Bristol', 'Manchester', 'London', 'Glasgow'],
'France': ['Paris', 'Calais', 'Nice', 'Cannes'],
'Germany': ['Munich', 'Berlin', 'Cologne']
}
The number of keys (countries) is variable, and so is the number of cities in each list. The result set comes from a search on city name, so, for example, a search on "San%" could potentially return 50k results (on a worldwide search).
The data is to be used to populate a select2 widget, and I'd like to use its paging functionality.
Is there a smart way to slice this such that [3:8] would yield:
{'UK': ['Glasgow'],
'France': ['Paris', 'Calais', 'Nice', 'Cannes'],
'Germany': ['Munich']
}
(Apologies for the way this question was posed earlier; I wasn't sure that the real usage would clarify the issue.)
If I understand your problem correctly, as discussed in the comments, this should do it:
from pprint import pprint

# Python 2 code: on Python 3, use d.items() instead of d.iteritems()
def slice_dict(d, a, b):
    big_list = []
    ret_dict = {}
    # Make one big list of all numbers, tagging each number with the key
    # of the dict they came from.
    for k, v in d.iteritems():
        for n in v:
            big_list.append({k: n})
    # Slice it
    sliced = big_list[a:b]
    # Put everything back in order
    for k, v in d.iteritems():
        for subd in sliced:
            for subk, subv in subd.iteritems():
                if k == subk:
                    if k in ret_dict:
                        ret_dict[k].append(subv)
                    else:
                        ret_dict[k] = [subv]
    return ret_dict
d = {
'a': [1, 2, 3, 4],
'b': [5, 6, 7, 8, 9],
'c': [10, 11, 12, 13, 14]
}
x = slice_dict(d, 3, 11)
pprint(x)
$ python slice.py
{'a': [4], 'b': [5, 6], 'c': [10, 11, 12, 13, 14]}
The output is a little different from your example output, but that's because the dict was not ordered when it was passed to the function. It was iterated as a-c-b, which is why b is cut off at 6 and c is not cut off.
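The function above uses dict.iteritems(), which only exists on Python 2. On Python 3.7 and later, plain dicts keep insertion order, so the ordering issue goes away. Below is a sketch of the same idea for Python 3, using itertools.islice so the flattened list is never built in full; it is a variation under those assumptions, not the answer's original code.

```python
from itertools import islice


def slice_dict(d, start, stop):
    """Slice the flattened values of d, keeping them grouped by key."""
    # Lazily flatten to (key, value) pairs in the dict's own order.
    flat = ((key, value) for key, values in d.items() for value in values)
    result = {}
    for key, value in islice(flat, start, stop):
        result.setdefault(key, []).append(value)
    return result


d = {
    'a': [1, 2, 3, 4],
    'b': [5, 6, 7, 8, 9],
    'c': [10, 11, 12, 13, 14],
}
print(slice_dict(d, 3, 11))
# {'a': [4], 'b': [5, 6, 7, 8, 9], 'c': [10, 11]}
```

Because islice stops at `stop`, a page near the start of a 50k-item result only touches the first few values.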
Here's a quick one for you:
I have a list of IDs which I want to use to return a QuerySet (or an array if need be), but I want to maintain the order of the list.
Thanks
Since Django 1.8, you can do:
from django.db.models import Case, When
pk_list = [10, 2, 1]
preserved = Case(*[When(pk=pk, then=pos) for pos, pk in enumerate(pk_list)])
queryset = MyModel.objects.filter(pk__in=pk_list).order_by(preserved)
I don't think you can enforce that particular order on the database level, so you need to do it in python instead.
id_list = [1, 5, 7]
objects = Foo.objects.filter(id__in=id_list)
objects = dict([(obj.id, obj) for obj in objects])
sorted_objects = [objects[id] for id in id_list]
This builds up a dictionary of the objects with their id as key, so they can be retrieved easily when building up the sorted list.
If you want to do this using in_bulk, you actually need to merge the two answers above:
id_list = [1, 5, 7]
objects = Foo.objects.in_bulk(id_list)
sorted_objects = [objects[id] for id in id_list]
Otherwise the result will be a dictionary rather than a specifically ordered list.
Here's a way to do it at the database level, copied from blog.mathieu-leplatre.info:
MySQL:
SELECT *
FROM theme
ORDER BY FIELD(`id`, 10, 2, 1);
Same with Django:
pk_list = [10, 2, 1]
ordering = 'FIELD(`id`, %s)' % ','.join(str(id) for id in pk_list)
queryset = Theme.objects.filter(pk__in=pk_list).extra(
    select={'ordering': ordering}, order_by=('ordering',))
PostgreSQL:
SELECT *
FROM theme
ORDER BY
CASE
WHEN id=10 THEN 0
WHEN id=2 THEN 1
WHEN id=1 THEN 2
END;
Same with Django:
pk_list = [10, 2, 1]
clauses = ' '.join(['WHEN id=%s THEN %s' % (pk, i) for i, pk in enumerate(pk_list)])
ordering = 'CASE %s END' % clauses
queryset = Theme.objects.filter(pk__in=pk_list).extra(
select={'ordering': ordering}, order_by=('ordering',))
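Because these snippets interpolate values straight into SQL, it is worth making sure every id really is an integer before building the string. Here is a minimal sketch of the clause construction on its own; `build_case_ordering` is a hypothetical helper, not part of Django, and `column` is assumed to be a trusted, hard-coded name.

```python
def build_case_ordering(pk_list, column='id'):
    """Build a CASE expression that ranks rows in pk_list order."""
    # int() raises for values that cannot be converted to an integer,
    # so arbitrary strings never reach the SQL text.
    clauses = ' '.join(
        'WHEN %s=%d THEN %d' % (column, int(pk), position)
        for position, pk in enumerate(pk_list)
    )
    return 'CASE %s END' % clauses


print(build_case_ordering([10, 2, 1]))
# CASE WHEN id=10 THEN 0 WHEN id=2 THEN 1 WHEN id=1 THEN 2 END
```

An empty pk_list would produce an invalid expression, so callers should handle that case before building the clause.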
id_list = [1, 5, 7]
objects = Foo.objects.filter(id__in=id_list)
sorted(objects, key=lambda i: id_list.index(i.pk))
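id_list.index() rescans the list for every object, which gets slow for long lists. A possible variation precomputes each id's position once. It is sketched here with a namedtuple standing in for model instances; `Foo` below is a stand-in, not a Django model.

```python
from collections import namedtuple

# Stand-in for model instances; only the pk attribute matters here.
Foo = namedtuple('Foo', ['pk', 'name'])

id_list = [7, 1, 5]
# Pretend this came back from the database in arbitrary order.
objects = [Foo(1, 'one'), Foo(5, 'five'), Foo(7, 'seven')]

position = {pk: index for index, pk in enumerate(id_list)}  # built once
sorted_objects = sorted(objects, key=lambda obj: position[obj.pk])

print([obj.pk for obj in sorted_objects])  # [7, 1, 5]
```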
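Another, cleaner approach can be:
pk_list = [10, 2, 1]
key_object_pair = MyModel.objects.in_bulk(pk_list)
# in_bulk returns a dict keyed by pk and does not promise pk_list order,
# so index it in pk_list order rather than relying on .values()
sorted_objects = [key_object_pair[pk] for pk in pk_list]
Simple, clean, less code.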