This code is hitting the DB a lot. Is there a way to reduce the number of DB hits by grouping them together? If not in Django, is it possible with SQL? My development machine uses SQLite and production uses PostgreSQL. If it is possible with SQL, please give me a few hints on where to get started.
from django.db import models

class Sensor(models.Model):
    Name = models.CharField(max_length=200)
    Value = models.FloatField()

class DataPoint(BaseModel):
    Taken_datetime = models.DateTimeField(blank=True, null=True)
    # null has no effect on a ManyToManyField, so only blank is kept
    Sensors = models.ManyToManyField(Sensor, blank=True)
for row in rows:
    dp = DataPoint.objects.get(Taken_datetime=row['date'])
    sensorToAdd = []
    for sensor in sensors:
        s = Sensor.objects.get(Name=sensor.name, Value=sensor.value)
        sensorToAdd.append(s)
    # add() takes the objects as separate arguments, so unpack the list
    dp.Sensors.add(*sensorToAdd)
All the data is stored in a CSV file, so I know all of it at the start.
For each row, the code hits the DB to load the DataPoint, load the Sensors, and attach the sensors to the DataPoint. I'm looking for something like bulk_create, but for the m2m field. All the solutions I've found use the same method I'm using above. The problem I'm running into is that there are a lot of DataPoints, and I'm hitting the DB once for each of them. I'd like to group all of these together and make only a few DB calls.
If there is a better way to model the data without making the DB larger, I'd be open to that.
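One possible direction, sketched with the standard-library sqlite3 module instead of the ORM so that it runs on its own: load all DataPoint and Sensor ids with two queries, build the link rows in memory, and insert them with one batched call. The table and column names (datapoint, sensor, datapoint_sensors) and the sample rows are hypothetical. In Django, the analogous step would presumably be a bulk_create on the m2m through model (DataPoint.Sensors.through).

```python
import sqlite3


def link_sensors(conn, rows):
    """Attach sensors to data points with two SELECTs and one batched INSERT.

    rows: list of (taken_datetime, [(sensor_name, sensor_value), ...]).
    """
    # Query 1: map each timestamp to its data point id.
    dp_ids = dict(conn.execute("SELECT taken_datetime, id FROM datapoint"))
    # Query 2: map each (name, value) pair to its sensor id.
    sensor_ids = {
        (name, value): sid
        for sid, name, value in conn.execute("SELECT id, name, value FROM sensor")
    }
    # Build every link row in memory, without touching the DB.
    links = [
        (dp_ids[taken], sensor_ids[key])
        for taken, sensors in rows
        for key in sensors
    ]
    # Insert all link rows in one batched call inside one DB transaction.
    with conn:
        conn.executemany(
            "INSERT INTO datapoint_sensors (datapoint_id, sensor_id) VALUES (?, ?)",
            links,
        )
    return len(links)


# Hypothetical schema and data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE datapoint (id INTEGER PRIMARY KEY, taken_datetime TEXT);
    CREATE TABLE sensor (id INTEGER PRIMARY KEY, name TEXT, value REAL);
    CREATE TABLE datapoint_sensors (datapoint_id INTEGER, sensor_id INTEGER);
    INSERT INTO datapoint (taken_datetime)
        VALUES ('2016-01-01 00:00'), ('2016-01-01 00:10');
    INSERT INTO sensor (name, value)
        VALUES ('temp', 20.5), ('temp', 21.0), ('hum', 40.0);
""")
rows = [
    ("2016-01-01 00:00", [("temp", 20.5), ("hum", 40.0)]),
    ("2016-01-01 00:10", [("temp", 21.0), ("hum", 40.0)]),
]
inserted = link_sensors(conn, rows)
```

The number of queries stays at three no matter how many rows the CSV file has, at the cost of holding the two id lookups in memory.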
Related
I am working with Django to create a dashboard which presents many kinds of data. My problem is that the page loads slowly even though I hit the database (PostgreSQL) only once. The tables are loaded with new data every 10 minutes, so they currently consist of millions of records. When I make a query with the Django ORM, I get the data slowly (1.4 seconds according to the Django toolbar). I know that this is not too much, but it is about half of the total loading time (3.1 seconds), so if I could decrease the query time, the page would load faster and the user experience would be better. When the query runs I fetch ~2800 rows. Is there any way to speed up this query? I do not know whether I am doing something wrong or whether this time is normal with this amount of data. I attach my query and model. Thank you in advance for your help.
My query (here I fetch a 6-hour time interval):
my_query = MyTable.objects.filter(time_stamp__range=(before_now, now)).values('time_stamp', 'value1', 'value2')
Here I tried to use .iterator() but the query wasn't faster.
My model:
class MyTable(models.Model):
    time_stamp = models.DateTimeField()
    value1 = models.FloatField(blank=True, null=True)
    value2 = models.FloatField(blank=True, null=True)
Add an index:
class MyTable(models.Model):
    time_stamp = models.DateTimeField()
    value1 = models.FloatField(blank=True, null=True)
    value2 = models.FloatField(blank=True, null=True)

    class Meta:
        indexes = [
            models.Index(fields=['time_stamp']),
        ]
Don't forget to run manage.py makemigrations and manage.py migrate after this.
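To check whether such an index is actually picked up, you can compare the query plan before and after creating it. The following is a minimal sketch using SQLite's EXPLAIN QUERY PLAN through the standard-library sqlite3 module; the table name my_table and the index name are hypothetical, and on PostgreSQL you would look at the output of EXPLAIN ANALYZE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, time_stamp TEXT, value1 REAL, value2 REAL)"
)

query = (
    "SELECT time_stamp, value1, value2 FROM my_table "
    "WHERE time_stamp BETWEEN ? AND ?"
)
params = ("2020-01-01 00:00", "2020-01-01 06:00")


def plan(conn):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, params)
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no index yet: a full table scan is expected
conn.execute("CREATE INDEX my_table_time_stamp_idx ON my_table (time_stamp)")
plan_after = plan(conn)   # the range filter can now use the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between SQLite versions, so look for the index name rather than a fixed string.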
I have a Django project with two database models: Device and DeviceTest.
Every device in the system goes through several test stages from manufacturing to sale. Therefore many DeviceTest objects are connected to each Device object through a foreign key relationship:
class Device(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    name = models.CharField(max_length=255)

class DeviceTest(models.Model):
    device = models.ForeignKey(Device)
    created_at = models.DateTimeField(auto_now_add=True)
    status = models.CharField(max_length=255)
    tester = models.CharField(max_length=255)
    action = models.CharField(max_length=255)
In my project there are 2 kinds of pages:
1) a page with all tests for an individual device
2) a page with all devices with their latest status and action
Now I'm trying to optimize page 2). To get the latest test data I use this code:
from django.core.exceptions import ObjectDoesNotExist

status_list = []
last_update_list = []
last_action_list = []
for dev in device_list:
    try:
        # one query per device
        latest_test = DeviceTest.objects.filter(device_id=dev.pk).latest('created_at')
        status_list.append(latest_test.status)
        last_update_list.append(latest_test.created_at)
        last_action_list.append(latest_test.action)
    except ObjectDoesNotExist:
        status_list.append("Not checked")
        last_update_list.append("Not checked")
        last_action_list.append("Not checked")
For now my database has ~600 devices and ~4000 tests, and this loop is the main bottleneck in page loading.
What are the ways to speed up this calculation?
I came up with the idea of adding an extra field to the Device model: a foreign key to its last DeviceTest. In this scenario there wouldn't be any complicated requests to the database at all.
And now I have a few questions:
Is it good practice to add a redundant field to a model?
Is it possible to write a migration rule to fill this redundant field for all current Devices?
And most importantly, what are the other options to speed up my calculations?
id_list = [dev.id for dev in device_list]
devtests = DeviceTest.objects.filter(
    device_id__in=id_list).order_by('device', '-created_at').distinct('device')
That should give you, in one database call, only the latest entry for each device (by created_at value) in devtests. Note that distinct() with field names only works on PostgreSQL, and the order_by() has to start with the same fields as the distinct().
Then do your loop and take the values from the list, instead of calling the database on each iteration.
However, it could also be a good idea to denormalize the database, like you suggested. Using "redundant fields" can definitely be good practice. You can automate the denormalization in the save() method or by listening to a post_save() signal from the related model.
Edit
First a correction: should be .distinct('device') (not created_at)
A list comprehension to fetch only the id values from the device_list. Equivalent to Device.objects.filter(...).values_list('id', flat=True)
id_list = [dev.id for dev in device_list]
Using the list of ids, we fetch all related DeviceTest objects
devtests = DeviceTest.objects.filter(device_id__in=id_list)
and order them by device and then by created_at, newest first (-created_at). That means that, for every Device, the newest related DeviceTest comes first.
.order_by('device', '-created_at')
Finally, for every device we only select the first related value we find (that would be the newest, because we sorted the values that way).
.distinct('device')
Additionally, you could also combine the device id and DeviceTest lookups
devtests = DeviceTest.objects.filter(device__in=Device.objects.filter(...))
then Django would generate the SQL so that the device lookup happens inside the database (as a subquery), so you don't need to load and loop over the id list in Python.
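Once all tests are fetched in one call, the per-device loop only needs an in-memory lookup. Here is a minimal sketch of that step in plain Python; the DeviceTest namedtuple is a hypothetical stand-in for the model rows, and the sample data is made up. It also works on databases where distinct('device') is not available, because the "newest per device" selection is done in Python.

```python
from collections import namedtuple

# Hypothetical stand-in for rows returned by a single DeviceTest query.
DeviceTest = namedtuple("DeviceTest", "device_id created_at status action")


def latest_by_device(tests):
    """Return {device_id: newest test} from an unsorted list of tests."""
    latest = {}
    for test in tests:
        current = latest.get(test.device_id)
        if current is None or test.created_at > current.created_at:
            latest[test.device_id] = test
    return latest


tests = [
    DeviceTest(1, "2016-03-01", "ok", "assembled"),
    DeviceTest(1, "2016-03-05", "failed", "burn-in"),
    DeviceTest(2, "2016-03-02", "ok", "packed"),
]
latest = latest_by_device(tests)

device_ids = [1, 2, 3]  # device 3 has no tests yet
status_list = [
    latest[d].status if d in latest else "Not checked" for d in device_ids
]
last_action_list = [
    latest[d].action if d in latest else "Not checked" for d in device_ids
]
```

With ~600 devices and ~4000 tests, this replaces ~600 queries with one query plus a single pass over the results.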
I'm writing a customer management system for my business and got stuck on the payments entry system. This will run on a local dedicated server and should have only one user so code performance is not really an issue.
Every adult customer who enters the store is given a numbered card (Card, for the rest of this question) and his/her ID ( from Customer model ) is attached to it by a foreign key relation. There is an "entrance fee subtotal", which is the result of a choice field on Card model (there's only two choices and those won't change for a long time) plus kids 'fees'.
This, along with two other kinds of models (Product and Service), will compose the customer's bill. I have it working just fine, except for the payments registration.
As many Customers may be part of a family, and they may split their total bill quite often, I believe Payment should be a model with a ManyToManyField related to Card, so it could cover multiple payment methods (treated as another choice field, since it will be either cash, credit or debit card), but I can't figure out how to model it or how to handle it in my view/template.
I'm using django 1.9 & postgres 9.5 & python 2.7.
Bootstrap 3 along with some JS for styling (probably irrelevant).
Enough said, here's some code:
models.py
class Customer(models.Model):
    id = models.AutoField(primary_key=True) #unnecessary but I had already written it
    name = models.CharField(max_length=40)
    last = models.CharField(max_length=80)

class Card(models.Model):
    entrance_type1 = 1
    entrance_type2 = 2
    entrance_choices = (
        (entrance_type1, 'Fun'),
        (entrance_type2, 'Really Fun, kinda expensive'),
    )
    entrance_types = {
        1: "Fun",
        2: "Really Fun",
    }
    entrance_fee = {
        'kid': 5.0,
        entrance_type1: 15.0,
        entrance_type2: 35.0,
    }
    id = models.AutoField(primary_key=True) #Yeah, I do that
    date = models.DateTimeField(auto_now=True, auto_now_add=False)
    card_number = models.IntegerField()
    entrance_type = models.PositiveIntegerField(choices=entrance_choices)
    kids_number = models.PositiveIntegerField()
    id_costumer = models.ForeignKey(Customer)
    entrances_value = models.DecimalField(max_digits=6, decimal_places=2)
    #will be entrance_fee[entrance_type] + entrance_fee['kid'] * kids_number
    status = models.BooleanField(default=True) #should be False after payment(s)
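For reference, the entrance subtotal described in the comment above can be sketched as a standalone function. This is only an illustration: it uses Decimal instead of float because the result goes into a DecimalField, and it assumes that the 35.0 fee belongs to the second entrance type. The names ENTRANCE_FEE and entrances_value are hypothetical.

```python
from decimal import Decimal

# Fees as Decimal so they add up exactly; the type 2 fee is an assumption.
ENTRANCE_FEE = {
    "kid": Decimal("5.00"),
    1: Decimal("15.00"),  # 'Fun'
    2: Decimal("35.00"),  # 'Really Fun'
}


def entrances_value(entrance_type, kids_number):
    """Adult fee for the chosen entrance type plus the fee for each kid."""
    return ENTRANCE_FEE[entrance_type] + ENTRANCE_FEE["kid"] * kids_number


# One adult on the 'Really Fun' entrance with three kids.
total = entrances_value(2, 3)
```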
Anyway, I really need help modelling payments for those. A payment should contain the payment method, the date, and the Cards it's related to.
I'm already getting ideas for the views/template step, so I won't be strict about those in answers.
I know my question is kind of fuzzy and confusing, but I can't figure out how to make it better (and maybe this is why I can't solve it by myself), so please comment with your doubts and I'll edit it (including removing this part once it improves) after lunch.
Thanks in advance
I have thought about my problem for days and I need a fresh view on this.
I am building a small application for a client for his deliveries.
# models.py - Clients app
class ClientPR(models.Model):
    title = models.CharField(max_length=5,
                             choices=TITLE_LIST,
                             default='mr')
    last_name = models.CharField(max_length=65)
    first_name = models.CharField(max_length=65, verbose_name='Prénom')
    frequency = WeekdayField(default=[]) # Return a CommaSeparatedIntegerField from 0 for Monday to 6 for Sunday...
    [...]

# models.py - Delivery app
class Truck(models.Model):
    name = models.CharField(max_length=40, verbose_name='Nom')
    description = models.CharField(max_length=250, blank=True)
    color = models.CharField(max_length=10,
                             choices=COLORS,
                             default='green',
                             unique=True,
                             verbose_name='Couleur Associée')

class Order(models.Model):
    # OrderDelivery is defined further down, so it is referenced by name
    delivery = models.ForeignKey('OrderDelivery', verbose_name='Delivery')
    client = models.ForeignKey(ClientPR)
    order = models.PositiveSmallIntegerField()

class OrderDelivery(models.Model):
    # pass the callable so the default is evaluated for each new row, not once at import
    date = models.DateField(default=d.today)
    truck = models.ForeignKey(Truck, verbose_name='Camion', unique_for_date="date")
So I was trying to build a query and I got this one:
(ClientPR.objects.today()
    .filter(order__delivery__date=date.today())
    .order_by('order__delivery__truck', 'order__order'))
But it does not do what I really want.
I want a list of client objects (querysets) grouped by truck and ordered by today's delivery order!
The thing is, I want to have EVERY client for the day, even those who are not in the delivery list, and with filter that cannot work.
I can make a query with the OrderDelivery model, but then I will only get the clients for the delivery, not all of them for the day...
Maybe I need to do it with a Q object? Or even raw SQL?
Maybe I have built my model relationships the wrong way? Or I need to scale back what I want to do... Well, for now, I need your help to see the problem with new eyes!
Thanks to those who will take some time to help me.
After some tests, I decided to go with 2 queries for one table.
One from the OrderDelivery queryset, to get a list of clients grouped by truck, and another one from the ClientPR queryset, for all the clients without a delivery set for them.
That way, no problem!
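Roughly, combining the two result sets could look like the sketch below. It is plain Python over hypothetical tuples and names (the real code would work on the two querysets), so treat it as an illustration of the idea rather than the actual implementation.

```python
def build_day_listing(deliveries, all_clients_today):
    """Combine the results of the two queries.

    deliveries: list of (truck, order, client) for today's deliveries.
    all_clients_today: every client expected today.
    Returns (clients grouped by truck in delivery order,
             clients with no delivery set).
    """
    by_truck = {}
    # Sorting the tuples orders them by truck first, then by delivery order.
    for truck, order, client in sorted(deliveries):
        by_truck.setdefault(truck, []).append(client)
    scheduled = {client for _, _, client in deliveries}
    unscheduled = [c for c in all_clients_today if c not in scheduled]
    return by_truck, unscheduled


# Hypothetical data standing in for the two query results.
deliveries = [
    ("green", 2, "Durand"),
    ("blue", 1, "Martin"),
    ("green", 1, "Petit"),
]
all_clients_today = ["Durand", "Martin", "Petit", "Bernard"]

by_truck, unscheduled = build_day_listing(deliveries, all_clients_today)
```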
What's the best way to ensure that transactions are always balanced in double-entry accounting?
I'm creating a double-entry accounting app in Django. I have these models:
class Account(models.Model):
    TYPE_CHOICES = (
        ('asset', 'Asset'),
        ('liability', 'Liability'),
        ('equity', 'Equity'),
        ('revenue', 'Revenue'),
        ('expense', 'Expense'),
    )
    num = models.IntegerField()
    type = models.CharField(max_length=20, choices=TYPE_CHOICES, blank=False)
    description = models.CharField(max_length=1000)

class Transaction(models.Model):
    date = models.DateField()
    description = models.CharField(max_length=1000)
    notes = models.CharField(max_length=1000, blank=True)

class Entry(models.Model):
    TYPE_CHOICES = (
        ('debit', 'Debit'),
        ('credit', 'Credit'),
    )
    transaction = models.ForeignKey(Transaction, related_name='entries')
    type = models.CharField(max_length=10, choices=TYPE_CHOICES, blank=False)
    account = models.ForeignKey(Account, related_name='entries')
    amount = models.DecimalField(max_digits=11, decimal_places=2)
I'd like to enforce balanced transactions at the model level, but there don't seem to be hooks in the right place. For example, Transaction.clean won't work because a transaction gets saved first and its entries are added afterwards, due to the Entry.transaction ForeignKey.
I'd like balance checking to work within admin also. Currently, I use an EntryInlineFormSet with a clean method that checks balance in admin but this doesn't help when adding transactions from a script. I'm open to changing my models to make this easier.
(Hi Ryan! -- Steve Traugott)
It's been a while since you posted this, so I'm sure you're way past this puzzle. For others and posterity, I have to say yes, you need to be able to split transactions, and no, you don't want to take the naive approach and assume that transaction legs will always be in pairs, because they won't. You need to be able to do N-way splits, where N is any positive integer greater than 1. Ryan has the right structure here.
What Ryan calls Entry I usually call Leg, as in transaction leg, and I'm usually working with bare Python on top of some SQL database. I haven't used Django yet, but I'd be surprised (shocked) if Django doesn't support something like the following: Rather than use the native db row ID for transaction ID, I instead usually generate a unique transaction ID from some other source, store that in both the Transaction and Leg objects, do my final check to ensure debits and credits balance, and then commit both Transaction and Legs to the db in one SQL transaction.
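As a rough sketch of that approach in bare Python on top of SQLite: the schema, the function name post_transaction and the use of integer cents are all assumptions made for illustration. The balance check runs before anything is written, and the transaction row and its legs are committed together. In Django, transaction.atomic() would presumably play the role of the `with conn:` block.

```python
import sqlite3
import uuid


class UnbalancedTransaction(ValueError):
    pass


def post_transaction(conn, description, legs):
    """legs: list of (account_num, 'debit' or 'credit', amount_in_cents)."""
    if len(legs) < 2:
        raise UnbalancedTransaction("a transaction needs at least two legs")
    debits = sum(amount for _, kind, amount in legs if kind == "debit")
    credits = sum(amount for _, kind, amount in legs if kind == "credit")
    if debits != credits:
        raise UnbalancedTransaction(
            "debits %d != credits %d" % (debits, credits)
        )
    txn_id = uuid.uuid4().hex  # ID generated outside the DB, shared by all legs
    with conn:  # commits on success, rolls back if any insert fails
        conn.execute(
            "INSERT INTO txn (id, description) VALUES (?, ?)",
            (txn_id, description),
        )
        conn.executemany(
            "INSERT INTO leg (txn_id, account_num, kind, amount_cents) "
            "VALUES (?, ?, ?, ?)",
            [(txn_id, acct, kind, amount) for acct, kind, amount in legs],
        )
    return txn_id


# Hypothetical schema, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE txn (id TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE leg (txn_id TEXT, account_num INTEGER,
                      kind TEXT, amount_cents INTEGER);
""")

# A 3-way split: two debits balanced by one credit.
txn_id = post_transaction(conn, "Office supplies", [
    (5000, "debit", 7000),
    (5100, "debit", 3000),
    (1000, "credit", 10000),
])

# An unbalanced transaction is rejected before anything is written.
try:
    post_transaction(conn, "Broken", [(5000, "debit", 100), (1000, "credit", 90)])
except UnbalancedTransaction:
    rejected = True
else:
    rejected = False
```

Because the check lives in the one function that writes to the database, it applies to scripts as well as to any admin form that goes through it.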
Ryan, is that more or less what you wound up doing?
This may sound terribly naive, but why not just record each transaction in a single record containing "to account" and "from account" foreign keys that link to an accounts table instead of trying to create two records for each transaction? From my point of view, it seems that the essence of "double-entry" is that transactions always move money from one account to another. There is no advantage using two records to store such transactions and many disadvantages.