I have a Django query where I want to group test attempts by test_id and get the average across tests.
The test_attempts table logs each attempt a user makes on a given test. I want to find the average number of attempts per test.
Here is my query:
average = TestAttempts.objects.values('test_id').annotate(Avg(Count('test_id'))).filter(user_id=id)
I am getting the following error:
'Count' object has no attribute 'split'
Is there a way to handle this without having to write raw SQL?
UPDATE:
Here is the TestAttempts model:
class TestAttempts(models.Model):
    id = models.IntegerField(primary_key=True)
    user_id = models.IntegerField()
    test_id = models.IntegerField()
    test_grade = models.DecimalField(max_digits=6, decimal_places=1)
    grade_date_time = models.DateTimeField()
    start_time = models.DateTimeField()
    seconds_taken = models.IntegerField()
    taking_for_ce_credit = models.IntegerField()
    ip_address = models.CharField(max_length=25)
    grade_points = models.DecimalField(null=True, max_digits=4, decimal_places=1, blank=True)
    passing_percentage = models.IntegerField(null=True, blank=True)
    passed = models.IntegerField()

    class Meta:
        db_table = 'test_attempts'
Do you want a single number, the average number of attempts over all tests? And do you have a Test model?
If so, this should work:
average = (Test.objects.filter(testattempt__user_id=id)
           .annotate(c=Count('testattempt'))
           .aggregate(a=Avg('c'))['a'])
If you don't have a TestAttempts → Test relationship, but only a test_id field, then this should work:
average = (TestAttempts.objects.filter(user_id=id)
           .values('test_id')
           .annotate(c=Count('pk'))
           .aggregate(a=Avg('c'))['a'])
However, this second version does not work for me on SQLite, and I don't have another database at hand to test it.
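As a cross-check on what the second query is meant to compute, here is a minimal sketch that uses Python's standard-library sqlite3 module instead of the ORM. The table keeps only three of the test_attempts columns from the question, and the sample rows and user ids are made up for illustration. The inner query counts attempts per test_id for one user and the outer query averages those counts, which is the nesting that annotate() followed by aggregate() is intended to express.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_attempts ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, test_id INTEGER)"
)
# Hypothetical sample data: (user_id, test_id)
rows = [(1, 10), (1, 10), (1, 10), (1, 20), (2, 10)]
conn.executemany(
    "INSERT INTO test_attempts (user_id, test_id) VALUES (?, ?)", rows
)


def average_attempts(conn, user_id):
    """Average number of attempts per test for one user (None if no attempts)."""
    sql = """
        SELECT AVG(c) FROM (
            SELECT COUNT(*) AS c
            FROM test_attempts
            WHERE user_id = ?
            GROUP BY test_id
        )
    """
    return conn.execute(sql, (user_id,)).fetchone()[0]


# User 1 made 3 attempts on test 10 and 1 attempt on test 20.
avg_user_1 = average_attempts(conn, 1)
print(avg_user_1)
```

If the ORM version returns a different number for the same data, printing str(queryset.query) for the annotated queryset is one way to compare the generated SQL against this shape.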
I am studying the Django ORM. I couldn't find an answer by searching, so I'd appreciate it if someone could point me to a relevant resource.
My models are as follows. user1 has 2 accounts, and 500,000 transactions belong to one of the accounts.
class Account(models.Model):
    class Meta:
        db_table = 'account'
        ordering = ['created_at']

    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    account = models.CharField(max_length=20, null=False, blank=False, primary_key=True)
    balance = models.PositiveBigIntegerField(default=0)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

class AccountTransaction(models.Model):
    class Meta:
        db_table = 'account_transaction'
        ordering = ['tran_time']
        indexes = [
            models.Index(fields=['tran_type', 'tran_time', ]),
        ]

    account = models.ForeignKey(Account, on_delete=models.CASCADE)
    tran_amt = models.PositiveBigIntegerField()
    balance = models.PositiveBigIntegerField()
    tran_type = models.CharField(max_length=10, null=False, blank=False)
    tran_detail = models.CharField(max_length=100, null=True, default="")
    tran_time = models.DateTimeField(auto_now_add=True)
The query time for the above model is as follows.
start = time.time()
rs = request.user.account_set.all().get(account="0000000010").accounttransaction_set.all()
count = rs.count()
print('>>all')
print(time.time() - start) # 0.028000831604003906
start = time.time()
q = Q(tran_time__date__range = ("2000-01-01", "2000-01-03"))
rs = request.user.account_set.all().get(account="0000000010").accounttransaction_set.filter(q)
print('>>filter')
print(time.time() - start) # 0.0019981861114501953
start = time.time()
result = list(rs)
print('>>offset')
print(time.time() - start) # 5.4373579025268555
The filtered queryset contains about 3,500 rows in total (3,500 of the 500,000 records were selected).
I've tried a number of things, such as applying an offset to the queryset (rs), but it still takes a long time to get the actual values out of it.
I know that the data is only loaded when the actual values are accessed, for example through count(), but what did I do wrong?
From https://docs.djangoproject.com/en/4.1/topics/db/queries/#querysets-are-lazy:
QuerySets are lazy – the act of creating a QuerySet doesn’t involve
any database activity. You can stack filters together all day long,
and Django won’t actually run the query until the QuerySet is
evaluated. Take a look at this example:
q = Entry.objects.filter(headline__startswith="What")
q = q.filter(pub_date__lte=datetime.date.today())
q = q.exclude(body_text__icontains="food")
print(q)
Though this looks like three database hits, in fact it hits the
database only once, at the last line (print(q)). In general, the
results of a QuerySet aren’t fetched from the database until you “ask”
for them. When you do, the QuerySet is evaluated by accessing the
database. For more details on exactly when evaluation takes place, see
When QuerySets are evaluated.
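In your example, the .get() and .count() calls each run a small query right away, but the filtered QuerySet is not executed until you call list(rs). Building the filter takes almost no time because no rows are fetched at that point; the time is spent when the query actually runs and the results are loaded, which is why that step takes so long.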
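The deferred-evaluation idea can be sketched with a toy class on top of the standard-library sqlite3 module. This is not Django's implementation: LazyQuery, the entry table, and the sample rows are all hypothetical and exist only to show that chaining filters merely records conditions, while the database is touched once, at iteration time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [("What now", 5), ("What next", 2), ("Other news", 4)],
)

executed = []  # every SQL string this toy class actually sends to the database


class LazyQuery:
    """Toy stand-in for a lazy queryset; not how Django implements it."""

    def __init__(self, clauses=(), params=()):
        self.clauses = tuple(clauses)
        self.params = tuple(params)

    def filter(self, clause, *params):
        # Only records the condition and returns a new object; no SQL runs here.
        return LazyQuery(self.clauses + (clause,), self.params + params)

    def __iter__(self):
        # Evaluation point: the SQL is built and executed only now.
        sql = "SELECT headline FROM entry"
        if self.clauses:
            sql += " WHERE " + " AND ".join(self.clauses)
        sql += " ORDER BY headline"
        executed.append(sql)
        return iter(conn.execute(sql, self.params).fetchall())


q = LazyQuery().filter("headline LIKE ?", "What%").filter("rating >= ?", 3)
queries_before = len(executed)      # nothing has been sent to the database yet
headlines = [row[0] for row in q]   # iterating triggers the single query
queries_after = len(executed)
print(queries_before, queries_after, headlines)
```

Timing the two phases separately, as the question does, therefore measures query construction in one case and query execution plus row loading in the other.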
I have a model Allotment
class Kit(models.Model):
    kit_types = (('FLC', 'FLC'), ('FSC', 'FSC'), ('Crate', 'Crate'), ('PP Box', 'PP Box'))
    kit_name = models.CharField(max_length=500, default=0)
    kit_type = models.CharField(max_length=50, default=0, choices=kit_types, blank=True, null=True)

class AllotmentFlow(models.Model):
    flow = models.ForeignKey(Flow, on_delete=models.CASCADE)
    kit = models.ForeignKey(Kit, on_delete=models.CASCADE)
    asked_quantity = models.IntegerField(default=0)
    alloted_quantity = models.IntegerField(default=0)

class Allotment(models.Model):
    transaction_no = models.IntegerField(default=0)
    dispatch_date = models.DateTimeField(default=datetime.now)
    send_from_warehouse = models.ForeignKey(Warehouse, on_delete=models.CASCADE)
    flows = models.ManyToManyField(AllotmentFlow)
For a stacked graph, I am trying to get the count of each kit_type allotted in each month.
I have tried annotate for that, but it does not give the desired result:
dataset = Allotment.objects.all().annotate(
    month=TruncMonth('dispatch_date')).values(
    'month').annotate(dcount=Count('flows__kit__kit_type')).values('month', 'dcount')
Expected Output:
[{'month': xyz, 'kit_type': foo, 'count': 123}, ...]
I am getting the month and count of kit type from above but how do I segregate it by kit_type?
Getting a single field that carries your choice names out of this query is difficult.
Instead, how about using the filter argument of Count together with annotate to get what you want:
from django.db.models import Count, Q
from django.db.models.functions import TruncMonth

dataset = Allotment.objects.all().annotate(month=TruncMonth('dispatch_date')).values('month').annotate(
    FLC_count=Count('flows__kit__kit_type', filter=Q(flows__kit__kit_type="FLC")),
    FSC_count=Count('flows__kit__kit_type', filter=Q(flows__kit__kit_type="FSC")),
    Crate_count=Count('flows__kit__kit_type', filter=Q(flows__kit__kit_type="Crate")),
    # The stored choice value is 'PP Box' (with a space), so the filter must match it exactly.
    PP_Box_count=Count('flows__kit__kit_type', filter=Q(flows__kit__kit_type="PP Box")),
).values('month', 'FLC_count', 'FSC_count', 'Crate_count', 'PP_Box_count')
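To make the two possible result shapes concrete, here is a small sketch with the standard-library sqlite3 module. It flattens the Allotment, AllotmentFlow, and Kit models into one hypothetical table, allotted_kit, with made-up rows. The first query gives one column per kit type, like the filtered counts above; the second gives one row per month and kit type, which is the shape of the expected output in the question. In the ORM the second shape would presumably come from adding the kit type to values() before annotate(), but that is an untested assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per allotted kit.
conn.execute("CREATE TABLE allotted_kit (dispatch_date TEXT, kit_type TEXT)")
conn.executemany(
    "INSERT INTO allotted_kit VALUES (?, ?)",
    [
        ("2021-01-05", "FLC"),
        ("2021-01-20", "FLC"),
        ("2021-01-21", "PP Box"),
        ("2021-02-02", "Crate"),
        ("2021-02-03", "PP Box"),
        ("2021-02-09", "PP Box"),
    ],
)

# Wide shape: one row per month, one conditional count per kit type.
wide_sql = """
    SELECT strftime('%Y-%m', dispatch_date) AS month,
           COUNT(CASE WHEN kit_type = 'FLC' THEN 1 END) AS flc_count,
           COUNT(CASE WHEN kit_type = 'FSC' THEN 1 END) AS fsc_count,
           COUNT(CASE WHEN kit_type = 'Crate' THEN 1 END) AS crate_count,
           COUNT(CASE WHEN kit_type = 'PP Box' THEN 1 END) AS pp_box_count
    FROM allotted_kit
    GROUP BY month
    ORDER BY month
"""
wide = conn.execute(wide_sql).fetchall()

# Long shape: one row per (month, kit type), as in the expected output.
long_sql = """
    SELECT strftime('%Y-%m', dispatch_date) AS month, kit_type, COUNT(*) AS count
    FROM allotted_kit
    GROUP BY month, kit_type
    ORDER BY month, kit_type
"""
long_rows = conn.execute(long_sql).fetchall()

print(wide)
print(long_rows)
```

The long shape omits combinations with a zero count, so a stacked chart built from it has to fill those gaps itself, while the wide shape always reports every kit type for every month.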
I've got a Stock table and a StockArchive table.
My Stock table consists of roughly 10,000 stocks that I update daily. The reason I have a StockArchive table is that I still want to keep some historic data rather than just updating the existing records. My question is: is this a proper way of doing it?
First, my models:
class Stock(models.Model):
    objects = BulkUpdateOrCreateQuerySet.as_manager()
    stock = models.CharField(max_length=200)
    ticker = models.CharField(max_length=200)
    exchange = models.ForeignKey(Exchange, on_delete=models.DO_NOTHING)
    eod_price = models.DecimalField(max_digits=12, decimal_places=4)
    currency = models.CharField(max_length=20, blank=True, null=True)
    last_modified = models.DateTimeField(blank=True, null=True)

    class Meta:
        db_table = "stock"

class StockArchive(models.Model):
    objects = BulkUpdateOrCreateQuerySet.as_manager()
    stock = models.ForeignKey(Stock, on_delete=models.DO_NOTHING)
    eod_price = models.DecimalField(max_digits=12, decimal_places=4)
    archive_date = models.DateField()

    class Meta:
        db_table = "stock_archive"
I then do the following:
@transaction.atomic
def my_func():
    archive_stocks = []
    batch_size = 100
    old_stocks = Stock.objects.all()
    for stock in old_stocks:
        archive_stocks.append(
            StockArchive(
                stock=stock,  # pass the Stock instance: StockArchive.stock is a ForeignKey
                eod_price=stock.eod_price,
                archive_date=date.today(),
            )
        )
    # insert into stock archive table
    StockArchive.objects.bulk_create(archive_stocks, batch_size)
    # delete stock table
    Stock.objects.all().delete()
    # proceed to bulk_insert new stocks
I also wrapped the function with @transaction.atomic to make sure that either everything is committed or nothing is, rather than only some of the operations.
Is my thought process correct, or should I do something differently? Perhaps more efficient?
I'd like to ask how I could shrink this to one command. I understand that annotate is the proper way to do this, but I don't understand how.
Here is my code, which is too slow:
sum = 0
for contact in self.contacts.all():
    sum += (contact.orders.aggregate(models.Sum('total'))['total__sum'])
return sum
I'd like to get, for each contact, the sum of the total column over all related orders.
The code above produces the sum but is very slow. It is my understanding that this can be done with annotate, but I'm not sure how to use it.
Here is Contact:
class Contact(models.Model):
    company = models.ForeignKey(
        Company, related_name="contacts", on_delete=models.PROTECT)
    first_name = models.CharField(max_length=80)
    last_name = models.CharField(max_length=80, blank=True)
    email = models.EmailField()
And here is Order:
class Order(models.Model):
    order_number = models.CharField(max_length=80)
    company = models.ForeignKey(Company, related_name="orders")
    contact = models.ForeignKey(Contact, related_name="orders")
    total = models.DecimalField(max_digits=12, decimal_places=6)
    order_date = models.DateTimeField(null=True, blank=True)
Help please
You can annotate your queryset on the Contact model with:
from django.db.models import Sum
Contact.objects.annotate(
    total_orders=Sum('orders__total')
)
The Contact objects that arise from this queryset will have an extra attribute .total_orders that contains the sum of the total field of the related Order objects.
This will create a query that looks like:
SELECT contact.*, SUM(order.total)
FROM contact
LEFT OUTER JOIN order ON order.contact_id = contact.id
GROUP BY contact.id
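The effect of that join can be tried out with the standard-library sqlite3 module. The sketch below uses simplified, hypothetical tables (the order table is named orders here because ORDER is a reserved word in SQL, and totals are stored as integers) with made-up rows. It runs the per-contact sum and then a single query for the grand total that the loop in the question accumulates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, contact_id INTEGER, total INTEGER);
    INSERT INTO contact VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders (contact_id, total) VALUES (1, 100), (1, 50), (2, 25);
""")

# One row per contact; the LEFT OUTER JOIN keeps contacts that have no orders.
per_contact = conn.execute("""
    SELECT contact.id, SUM(orders.total) AS total_orders
    FROM contact
    LEFT OUTER JOIN orders ON orders.contact_id = contact.id
    GROUP BY contact.id
    ORDER BY contact.id
""").fetchall()

# A single aggregate over the same join gives the overall sum in one query.
grand_total = conn.execute("""
    SELECT SUM(orders.total)
    FROM contact
    JOIN orders ON orders.contact_id = contact.id
""").fetchone()[0]

print(per_contact)
print(grand_total)
```

Note that the contact without orders gets NULL (None in Python) rather than 0. The same holds for Sum in the ORM, so code that adds the per-contact values together needs to handle None.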
I'm trying to figure out how to execute the following SQL join statement in Django without resorting to raw SQL. Is there a way to do it?
Select * from playertable, seasontable where player.id = season.player_id
Here are my models. Just to clarify, I used abbreviated table names in the above query for clarity.
class Player(models.Model):
    name = models.CharField(max_length=200)
    team = models.CharField(max_length=3)
    position = models.CharField(max_length=3)

class PlayerSeason(models.Model):
    player = models.ForeignKey(Player)
    year = models.IntegerField()
    games_played = models.IntegerField()
    goals = models.IntegerField()
    assists = models.IntegerField()
    points = models.IntegerField()
    plusminus = models.CharField(max_length=200)
    pim = models.IntegerField()
    ppg = models.IntegerField()
    shg = models.IntegerField()
    gwg = models.IntegerField()
    otg = models.IntegerField()
    shots = models.IntegerField()
    shooting_percentage = models.DecimalField(max_digits=5, decimal_places=2)
    toi = models.CharField(max_length=200)
    sftg = models.DecimalField(max_digits=5, decimal_places=2)
    face_off = models.DecimalField(max_digits=5, decimal_places=2)
How should I do this with a Django QuerySet?
If all you want is to get all the seasons associated with a given player, you can make use of Django's backwards relationships.
When you use a ForeignKey to a model, in this case Player, instances of that model get an attribute which allows you to get a queryset of all the related objects.
In your example you could use player.playerseason_set.all().
You can pass an optional parameter related_name to the ForeignKey that allows you to change the name of that attribute on Player.
Is there a way to do it?
No. Django's ORM deals with one model at a time, and you are getting columns from two tables. Perform a query on either of the models and then access the appropriate field to get the related model.