Django race condition aggregate(Max) in F() expression - django

Imagine the following model:
class Item(models.Model):
    # natural PK
    increment = models.PositiveIntegerField(_('Increment'), null=True, blank=True, default=None)
    # other fields
When an item is created, I want the increment field to automatically take the maximum value it has across the whole table, plus 1. For example:
|_item____________________________|
|_id_|_increment__________________|
| 1 | 1 |
| 2 | 2 |
| 4 | 3 | -> id 3 was deleted at some stage..
| 5 | 4 |
| 6 | 5 |
.. etc
When a new Item() comes in and is saved, how can I make sure, in one pass and in a way that avoids race conditions, that it gets increment 6 (and not 7), even if another process does exactly the same thing at the same time?
I have tried:
with transaction.atomic():
    i = Item()
    highest_increment = Item.objects.all().aggregate(Max('increment'))
    # Max() yields None on an empty table, so fall back to 0 before adding 1
    i.increment = (highest_increment['increment__max'] or 0) + 1
    i.save()
I would like to be able to create it in a way similar to the following, but that obviously does not work (I have checked places like https://docs.djangoproject.com/en/3.2/ref/models/expressions/#avoiding-race-conditions-using-f):
from django.db.models import Max, F
i = Item(
    increment=F(Max(increment))
)
Many thanks
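One possible direction, sketched here with Python's standard sqlite3 module rather than the Django ORM so that it runs on its own: let the database compute MAX(increment) + 1 inside the INSERT statement itself, and back it with a UNIQUE constraint so that a duplicate value can never be stored. The table and column names mirror the model above; create_item is a hypothetical helper. Whether a single statement is enough depends on the database and its isolation level, so the constraint (plus a retry when it fails) should be treated as the actual safeguard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# UNIQUE makes the database reject a duplicate increment outright.
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, increment INTEGER UNIQUE)"
)


def create_item(conn):
    """Insert a row whose increment is MAX(increment) + 1 (hypothetical helper)."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO item (increment) "
            "SELECT COALESCE(MAX(increment), 0) + 1 FROM item"
        )
    return cur.lastrowid


for _ in range(3):
    create_item(conn)

increments = [row[0] for row in conn.execute("SELECT increment FROM item ORDER BY id")]
print(increments)
```

The read and the write happen in one statement, so there is no window between "fetch the maximum" and "save the row" in application code.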

Related

Optimize code of a function for a search filter in Django with a variable number of keywords - too much code, I'm a beginner

Hello great community,
I'm learning Django/Python development and training myself by building a web app for asset inventory.
I've made a search filter that returns, for example, the assets belonging to a specific user, department, brand, model or category (computers, desks, etc.). Many of the fields are foreign keys to other tables; the main table is "Cespiti", which means "assets" in Italian.
After a lot of work I now have multiple-keyword search: somebody types, say, a department and a category in the search box and gets the matching results (for example all desks in a specific department, or all computers of a specific model in a specific department).
I did it with a chain of "if" checks: the view splits the query into single words, counts them, and applies progressive filtering, with each keyword filtering the results of the previous ones.
But I'm not satisfied with my code. I think it's too hardcoded: instead of writing an IF branch for each number of keywords (from 1 to 3), I would like something that does not depend on the number of keywords.
Here's the code of the view; I hope someone can point me in the right direction.
def SearchResults(request):
    query = request.GET.get('q')
    chiave = query.split()
    lunghezza = len(chiave)
    if lunghezza == 1:
        object_list = Cespiti.objects.filter(
            Q(proprietario__cognome__icontains=chiave[0]) |
            Q(proprietario__nome__icontains=chiave[0]) |
            Q(categoria__nome__icontains=chiave[0]) |
            Q(marca__nome__icontains=chiave[0]) |
            Q(modello__nome__icontains=chiave[0]) |
            Q(reparto__nome__icontains=chiave[0]) |
            Q(matricola__icontains=chiave[0])
        ).distinct()
    elif lunghezza == 2:
        object_list = Cespiti.objects.filter(
            Q(proprietario__cognome__icontains=chiave[0]) |
            Q(proprietario__nome__icontains=chiave[0]) |
            Q(categoria__nome__icontains=chiave[0]) |
            Q(marca__nome__icontains=chiave[0]) |
            Q(modello__nome__icontains=chiave[0]) |
            Q(reparto__nome__icontains=chiave[0]) |
            Q(matricola__icontains=chiave[0])
        ).filter(
            Q(proprietario__cognome__icontains=chiave[1]) |
            Q(proprietario__nome__icontains=chiave[1]) |
            Q(categoria__nome__icontains=chiave[1]) |
            Q(marca__nome__icontains=chiave[1]) |
            Q(modello__nome__icontains=chiave[1]) |
            Q(reparto__nome__icontains=chiave[1]) |
            Q(matricola__icontains=chiave[1])
        ).distinct()
    elif lunghezza == 3:
        object_list = Cespiti.objects.filter(
            Q(proprietario__cognome__icontains=chiave[0]) |
            Q(proprietario__nome__icontains=chiave[0]) |
            Q(categoria__nome__icontains=chiave[0]) |
            Q(marca__nome__icontains=chiave[0]) |
            Q(modello__nome__icontains=chiave[0]) |
            Q(reparto__nome__icontains=chiave[0]) |
            Q(matricola__icontains=chiave[0])
        ).filter(
            Q(proprietario__cognome__icontains=chiave[1]) |
            Q(proprietario__nome__icontains=chiave[1]) |
            Q(categoria__nome__icontains=chiave[1]) |
            Q(marca__nome__icontains=chiave[1]) |
            Q(modello__nome__icontains=chiave[1]) |
            Q(reparto__nome__icontains=chiave[1]) |
            Q(matricola__icontains=chiave[1])
        ).filter(
            Q(proprietario__cognome__icontains=chiave[2]) |
            Q(proprietario__nome__icontains=chiave[2]) |
            Q(categoria__nome__icontains=chiave[2]) |
            Q(marca__nome__icontains=chiave[2]) |
            Q(modello__nome__icontains=chiave[2]) |
            Q(reparto__nome__icontains=chiave[2]) |
            Q(matricola__icontains=chiave[2])
        ).distinct()
    context = {
        'object_list': object_list, 'query': query,
    }
    return render(request, 'search_results.html', context=context)
One way you could do it would be to separate the step of building the Q objects from the view method. That way it could be performed in a loop:
def generate_search_query_params(word):
    # One keyword may match any of the searchable fields (OR)
    return (
        Q(proprietario__cognome__icontains=word) |
        Q(proprietario__nome__icontains=word) |
        Q(categoria__nome__icontains=word) |
        Q(marca__nome__icontains=word) |
        Q(modello__nome__icontains=word) |
        Q(reparto__nome__icontains=word) |
        Q(matricola__icontains=word)
    )
def SearchResults(request):
    # Default to '' so a missing "q" parameter does not break .split()
    query = request.GET.get('q', '')
    queryset = Cespiti.objects.all()
    # Every keyword narrows the previous result (AND)
    for word in query.split():
        queryset = queryset.filter(
            generate_search_query_params(word)
        )
    object_list = queryset.distinct()
    context = {
        'object_list': object_list, 'query': query,
    }
    return render(request, 'search_results.html', context=context)
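To see what the chained filters compute independently of the ORM: each keyword must match at least one field (OR across fields), and every keyword must match (AND across keywords). The following is a plain-Python sketch of that logic over dictionaries; the field names and sample assets are hypothetical stand-ins for the Cespiti relations, not the real schema.

```python
# Hypothetical field names standing in for the lookups used in the view.
FIELDS = ("owner_surname", "owner_name", "category", "brand",
          "model", "department", "serial")


def matches(asset, word):
    """OR across fields: the word appears, case-insensitively, in any field."""
    needle = word.lower()
    return any(needle in str(asset.get(field, "")).lower() for field in FIELDS)


def search(assets, query):
    """AND across keywords: keep assets that match every word of the query."""
    words = (query or "").split()
    return [asset for asset in assets if all(matches(asset, w) for w in words)]


assets = [
    {"owner_surname": "Rossi", "category": "desk", "department": "Sales"},
    {"owner_surname": "Bianchi", "category": "computer", "department": "Sales"},
    {"owner_surname": "Rossi", "category": "computer", "department": "IT"},
]

result = search(assets, "sales computer")
print(result)
```

An empty query yields every asset, which mirrors a queryset that has had no filter applied.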
Thanks, I really appreciate your suggestion.
I managed to put the filtering in a loop, so the number of keywords is now unlimited and no longer hardcoded (thanks a lot). I've thought about Abdul's idea and Damon's solution. I wanted to avoid the initial .objects.all(), so I arranged it this way: the first "level" is fixed, so I can avoid the .all(), and all further levels of filtering are looped. What do you think?
def SearchResults(request):
    query = request.GET.get('q')
    chiave = query.split()
    lunghezza = len(chiave)
    object_list = Cespiti.objects.filter(
        Q(proprietario__cognome__icontains=chiave[0]) |
        Q(proprietario__nome__icontains=chiave[0]) |
        Q(categoria__nome__icontains=chiave[0]) |
        Q(marca__nome__icontains=chiave[0]) |
        Q(modello__nome__icontains=chiave[0]) |
        Q(reparto__nome__icontains=chiave[0]) |
        Q(matricola__icontains=chiave[0])
    )
    for i in range(1, lunghezza):
        object_list = object_list.filter(
            Q(proprietario__cognome__icontains=chiave[i]) |
            Q(proprietario__nome__icontains=chiave[i]) |
            Q(categoria__nome__icontains=chiave[i]) |
            Q(marca__nome__icontains=chiave[i]) |
            Q(modello__nome__icontains=chiave[i]) |
            Q(reparto__nome__icontains=chiave[i]) |
            Q(matricola__icontains=chiave[i])
        )
    context = {
        'object_list': object_list, 'query': query,
    }
    return render(request, 'search_results.html', context=context)

Django ORM. Select only duplicated fields from DB

I have a table in the DB like this:
MyTableWithValues
id | user(fk to Users) | value(fk to Values) | text | something1 | something2 ...
1 | userobject1 | valueobject1 |asdasdasdasd| 123 | 12321
2 | userobject2 | valueobject50 |QWQWQWQWQWQW| 515 | 5555455
3 | userobject1 | valueobject1 |asdasdasdasd| 12345 | 123213
I need to delete all objects in which the fields user, value and text are repeated, but keep one of them. In this example the 3rd record would be deleted.
How can I do this using the Django ORM?
PS:
I tried this:
recs = (
    MyTableWithValues.objects
    .order_by()
    .annotate(max_id=Max('id'), count_id=Count('user__id'))
    #.filter(count_id__gt=1)
    .annotate(count_values=Count('values'))
    #.filter(count_icd__gt=1)
)
...
...
for r in recs:
    print(r.id, r.count_id, r.count_values)
it prints something like this:
1 1 1
2 1 1
3 1 1
...
This happens despite the fact that there are duplicated values in the database. I can't understand why the Count function does not work.
Can anybody help me?
You should first be aware of how Count works here.
Count is computed per group of rows, and annotate() on a plain queryset produces one group per model row.
That is because the grouping includes the primary key id, which is unique.
So in the current situation count_id and count_values are 1 for every row, even though user, value and text repeat.
To count rows that share the same values, the grouping must use only the user, values and text fields.
To do that, call values() with those fields before annotate().
Query:
recs = (
    MyTableWithValues.objects
    .values('user', 'values', 'text')
    .annotate(max_id=Max('id'), count_id=Count('user__id'))
    .annotate(count_values=Count('values'))
)
It returns a queryset of dictionaries:
print(recs)
Output:
<QuerySet [{'user': 1, 'values': 1, 'text': 'asdasdasdasd', 'max_id': 3, 'count_id': 2, 'count_values': 2}, {'user': 2, 'values': 2, 'text': 'QWQWQWQWQWQW', 'max_id': 2, 'count_id': 1, 'count_values': 1}]>
Using this queryset you can check how many rows share the same user, values and text.
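The grouping described above corresponds to a SQL GROUP BY over the three fields. As a rough, self-contained illustration (standard sqlite3 module, not the Django ORM; the table name mytable and the column names user_id and value_id are assumptions), the duplicates can be found with HAVING COUNT(*) > 1 and removed by keeping only the lowest id of each group:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, value_id INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO mytable (id, user_id, value_id, text) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "asdasdasdasd"),
        (2, 2, 50, "QWQWQWQWQWQW"),
        (3, 1, 1, "asdasdasdasd"),
    ],
)

# Groups that occur more than once, with the id that will be kept.
duplicates = conn.execute(
    "SELECT user_id, value_id, text, MIN(id), COUNT(*) "
    "FROM mytable GROUP BY user_id, value_id, text HAVING COUNT(*) > 1"
).fetchall()

# Delete every row that is not the first (lowest id) of its group.
with conn:
    conn.execute(
        "DELETE FROM mytable WHERE id NOT IN ("
        "SELECT MIN(id) FROM mytable GROUP BY user_id, value_id, text)"
    )

remaining = [row[0] for row in conn.execute("SELECT id FROM mytable ORDER BY id")]
print(duplicates, remaining)
```

With the sample rows from the question, only the 3rd record is removed.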
Would a Python loop work for you?
import collections
d = collections.defaultdict(list)
# group all objects by the key
for e in MyTableWithValues.objects.all():
    k = (e.user_id, e.value_id, e.text)
    d[k].append(e)
for k, obj_list in d.items():
    if len(obj_list) > 1:
        for e in obj_list[1:]:
            # except the first one, delete all objects
            e.delete()

Compare fields within relationship on Django ORM

I have two models, Route and Stop.
A route can have several stops; each stop has a name and a number. Within the same route, stop numbers are unique.
The problem:
I need to find the routes that contain two given stops where the number of one stop is less than the number of the other.
Consider the following models:
class Route(models.Model):
    name = models.CharField(max_length=20)
class Stop(models.Model):
    route = models.ForeignKey(Route)
    number = models.PositiveSmallIntegerField()
    location = models.CharField(max_length=45)
And the following data:
Stop table
| id | route_id | number | location |
|----|----------|--------|----------|
| 1 | 1 | 1 | 'A' |
| 2 | 1 | 2 | 'B' |
| 3 | 1 | 3 | 'C' |
| 4 | 2 | 1 | 'C' |
| 5 | 2 | 2 | 'B' |
| 6 | 2 | 3 | 'A' |
For example:
Given two locations 'A' and 'B', find the routes that have both locations and where A's number is less than B's number.
With the data above, it should match route id 1 and not route id 2.
In raw SQL, this works with a single query:
SELECT
`route`.id
FROM
`route`
LEFT JOIN `stop` stop_from ON stop_from.`route_id` = `route`.`id`
LEFT JOIN `stop` stop_to ON stop_to.`route_id` = `route`.`id`
WHERE
stop_from.`stop_location_id` = 'A'
AND stop_to.`stop_location_id` = 'B'
AND stop_from.stop_number < stop_to.stop_number
Is it possible to do this with a single query in the Django ORM as well?
Generally, ORM frameworks such as the Django ORM, SQLAlchemy and even Hibernate are not designed to generate the most efficient query automatically. There is a way to write this query using only model objects; however, since I had a similar issue, I would suggest using a raw query for more complex queries. Here is the link to the Django documentation on raw queries:
https://docs.djangoproject.com/en/1.11/topics/db/sql/
You can write your query in many ways, but something like the following could help.
from django.db import connection
def my_custom_sql(self):
    with connection.cursor() as cursor:
        # Triple quotes let the SQL string span several lines
        cursor.execute("""
            SELECT
                `route`.id
            FROM
                `route`
            LEFT JOIN `stop` stop_from ON stop_from.`route_id` = `route`.`id`
            LEFT JOIN `stop` stop_to ON stop_to.`route_id` = `route`.`id`
            WHERE
                stop_from.`stop_location_id` = %s
                AND stop_to.`stop_location_id` = %s
                AND stop_from.stop_number < stop_to.stop_number
        """, ['A', 'B'])
        # fetchall() returns one tuple per matching route
        rows = cursor.fetchall()
    return rows
hope this helps.
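As a quick check of the self-join itself, here is a self-contained sketch that loads the sample data from the question into an in-memory SQLite database using the standard sqlite3 module. It uses the column names from the models (number, location) rather than those in the SQL above, and the route names and the helper routes_between are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE route (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stop (
        id INTEGER PRIMARY KEY,
        route_id INTEGER REFERENCES route(id),
        number INTEGER,
        location TEXT
    );
    INSERT INTO route (id, name) VALUES (1, 'r1'), (2, 'r2');
    INSERT INTO stop (id, route_id, number, location) VALUES
        (1, 1, 1, 'A'), (2, 1, 2, 'B'), (3, 1, 3, 'C'),
        (4, 2, 1, 'C'), (5, 2, 2, 'B'), (6, 2, 3, 'A');
""")


def routes_between(conn, origin, destination):
    """Ids of routes where `origin` comes before `destination` (hypothetical helper)."""
    rows = conn.execute(
        """
        SELECT route.id
        FROM route
        JOIN stop AS stop_from ON stop_from.route_id = route.id
        JOIN stop AS stop_to ON stop_to.route_id = route.id
        WHERE stop_from.location = ?
          AND stop_to.location = ?
          AND stop_from.number < stop_to.number
        ORDER BY route.id
        """,
        (origin, destination),
    ).fetchall()
    return [row[0] for row in rows]


a_to_b = routes_between(conn, "A", "B")
b_to_a = routes_between(conn, "B", "A")
print(a_to_b, b_to_a)
```

The stop table is joined twice under different aliases, so the two stops of one route can be compared within a single row.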

Django REST API return fields from ForeignKey in related model

With the following models:
class Tabs(models.Model):
    name = models.CharField(max_length=64)
    def __str__(self):
        return self.name
class DataLink(models.Model):
    data_id = models.ForeignKey(...)
    tabs_id = models.ForeignKey(Tabs, ...)
    def __str__(self):
        return "{} {}".format(self.data_id, self.tabs_id)
DataLink: Tabs:
id | data_id | tabs_id | id | name
------+-----------+----------- | ------+--------
1 | 1 | 1 | 1 | tab1
2 | 1 | 2 | 2 | tab2
3 | 1 | 3 | 3 | tab3
4 | 2 | 1 | 4 | tab4
5 | 2 | 4 | 5 | tab5
I need to link data between two models/tables such that for a given data_id I can return a list of corresponding tabs, using the Tabs table and the tabs_id.
For example:
data_id = 1 would return ['tab1', 'tab2', 'tab3']
data_id = 2 would return ['tab1', 'tab4']
Is this possible? How? Is it a bad idea?
If you just want a flattened list like that for a given data id, use values_list with the field you want and the flat=True keyword argument. Since you want the tab names rather than their ids, follow the foreign key with tabs_id__name.
It would look something like this; try it in your shell.
https://docs.djangoproject.com/en/1.9/ref/models/querysets/#values-list
DataLink.objects.filter(data_id=1).values_list('tabs_id__name', flat=True)
Also, you tagged the question with Django REST, but it has no REST context. This appears to be a plain Django question.
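Underneath, following the foreign key to the tab name amounts to a join between the two tables. Here is a small self-contained sketch with the standard sqlite3 module and the sample rows from the question; the lowercase table names and the helper tab_names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tabs (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE datalink (
        id INTEGER PRIMARY KEY,
        data_id INTEGER,
        tabs_id INTEGER REFERENCES tabs(id)
    );
    INSERT INTO tabs (id, name) VALUES
        (1, 'tab1'), (2, 'tab2'), (3, 'tab3'), (4, 'tab4'), (5, 'tab5');
    INSERT INTO datalink (id, data_id, tabs_id) VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1), (5, 2, 4);
""")


def tab_names(conn, data_id):
    """Flat list of tab names linked to one data_id (hypothetical helper)."""
    rows = conn.execute(
        "SELECT tabs.name FROM datalink "
        "JOIN tabs ON tabs.id = datalink.tabs_id "
        "WHERE datalink.data_id = ? ORDER BY datalink.id",
        (data_id,),
    )
    return [name for (name,) in rows]


print(tab_names(conn, 1), tab_names(conn, 2))
```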

Using Django ListView with custom query

I have a set of django models that are set out as follows:
class Foo(models.Model):
    ...
class FooVersion(models.Model):
    name = models.CharField(max_length=100)
    parent = models.ForeignKey(Foo)
    version = models.FloatField()
    ...
I'm trying to create a Django ListView that displays all Foos, in alphabetical order by the name of their highest version. For example, if I have a data set that looks like:
version_id | id | version_name | version
-----------+----+-----------------------------------+---------
1 | 1 | Test 1 | 1.0
2 | 1 | Test 2 | 2.0
3 | 1 | Test 2 | 3.0
4 | 2 | Test 1 | 1.0
5 | 1 | Test 3 | 2.5
6 | 3 | Test 3 | 1.0
I want the query to return:
version_id | id | version_name | version
-----------+----+-----------------------------------+---------
4 | 2 | Test 1 | 1.0
3 | 1 | Test 2 | 3.0
6 | 3 | Test 3 | 1.0
The raw sql I would use to generate this is:
SELECT version_class.id as version_id, someapp_foo.id, version_class.name as version_name, version_class.version
FROM someapp_foo
INNER JOIN(
SELECT someapp_fooversion.name, someapp_fooversion.version, someapp_fooversion.parent_id, someapp_fooversion.id
FROM someapp_fooversion
INNER JOIN(
SELECT parent_id, max(version) AS version
FROM someapp_fooversion GROUP BY parent_id)
AS current_version ON current_version.parent_id = someapp_fooversion.parent_id
AND current_version.version = someapp_fooversion.version)
AS version_class ON version_class.parent_id = someapp_foo.id
ORDER BY version_name;
But I'm having trouble using a raw query because the RawQuerySet object doesn't have a 'count' method, which is called by ListView for pagination. I've looked into the 'extra' feature of Django querysets, but I'm having trouble formulating a query that will work with that.
How would I formulate a query for 'extra' that would get me what I'm looking for? Or is there a way to convert a RawQuerySet into a regular QuerySet? Any other possible solutions to get the results I'm looking for?
There may be a better way to do this, but for now I'm trying a custom solution that seems to work:
from django.db import models
from django.db.models.query import RawQuerySet
class CountableRawQuerySet(RawQuerySet):
    def count(self):
        return sum([1 for obj in self])
class FooManager(models.Manager):
    def raw(self, raw_query, params=None, *args, **kwargs):
        return CountableRawQuerySet(raw_query=raw_query, model=self.model, params=params, using=self._db, *args, **kwargs)
class Foo(models.Model):
    objects = FooManager()
Then my queryset is:
Foo.objects.raw(sql)
Suggestions on how to improve this?
First of all, your solution is wrong and very inefficient with a large amount of data.
I believe you just need something like:
from django.db.models import Max
Foo.objects.annotate(max_version=Max('fooversion__version'))
You can now refer to the max_version attribute on each result like a normal attribute.
Please see https://docs.djangoproject.com/en/dev/topics/db/aggregation/ for details.
One other point to add is that RawQuerySet works fine with a ListView as long as you don't use pagination, i.e. you can just leave out the paginate_by = NN attribute from your ListView subclass.
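If a raw query is kept, the number of rows can also be obtained from the database instead of iterating over every object in Python, by wrapping the query in SELECT COUNT(*). The sketch below shows this with the standard sqlite3 module and the sample data from the question. It is a simplified form of the query above (the join to someapp_foo is dropped because parent_id already carries the Foo id), and LATEST_SQL is a name chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE someapp_fooversion (
        id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER, version REAL
    );
    INSERT INTO someapp_fooversion (id, parent_id, name, version) VALUES
        (1, 1, 'Test 1', 1.0), (2, 1, 'Test 2', 2.0), (3, 1, 'Test 2', 3.0),
        (4, 2, 'Test 1', 1.0), (5, 1, 'Test 3', 2.5), (6, 3, 'Test 3', 1.0);
""")

# Highest version per parent: join each row against its group's maximum.
LATEST_SQL = """
    SELECT v.id AS version_id, v.parent_id AS id,
           v.name AS version_name, v.version
    FROM someapp_fooversion AS v
    INNER JOIN (
        SELECT parent_id, MAX(version) AS version
        FROM someapp_fooversion
        GROUP BY parent_id
    ) AS latest
      ON latest.parent_id = v.parent_id AND latest.version = v.version
"""

rows = conn.execute(LATEST_SQL + " ORDER BY version_name").fetchall()

# Let the database count the rows instead of looping over them in Python.
total = conn.execute(
    "SELECT COUNT(*) FROM (" + LATEST_SQL + ") AS sub"
).fetchone()[0]

print(rows, total)
```

A count() method on a custom RawQuerySet subclass could run a wrapped query of this kind, though that wiring is not shown here.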