Efficient Design of DB with several relations - Django

I want to know the most efficient way to structure and design a database with several relations. I will explain my problem with a toy example, which is a scaled-down version of my current situation.
Here are the models in the Django database:
1.) Employee Master (biggest table with several columns and rows)
class Emp_Mast(models.Model):
    emp_mast_id = models.AutoField(primary_key=True)
    first_name = models.CharField(max_length=50)
    middle_name = models.CharField(max_length=50, blank=True)
    last_name = models.CharField(max_length=50, blank=True)
    desgn_mast = models.ForeignKey("hr.Desgn_Mast", on_delete=models.SET_NULL, null=True)
    qual_mast = models.ForeignKey("hr.Qualification_Mast", on_delete=models.SET_NULL, null=True)
    # Any further arguments to this field were cut off in the original post.
    office_mast = models.ManyToManyField("company_setup.Office_Mast")
    ref_mast = models.ForeignKey("hr.Reference_Mast", on_delete=models.SET_NULL, null=True)
This is how the data is displayed in frontend
2.) All the relational fields in the Employee Master have their corresponding models
3.) Crw_Movement_Transaction
Now I need to create a table for transaction data that stores each and every movement of the employees. We have several offshore sites that the employees need to travel to, and about 50 rows a day would be added to this transaction table, called Crw_Movement_Transaction.
The Crw_Movement table will have a few additional calculated columns of its own. The rest of the columns will be static (their data would not be changed from here) and will come from Emp_Mast, such as desgn_mast and souring_mast (so not all of the Emp_Mast fields either).
One way to do this is to define a nested relation for Emp_Mast in the serializer for Crw_Movement and optimize it using select_related and prefetch_related to reduce the number of queries to the database. However, that is still very slow, since any queries to Emp_Mast are unnecessary here. Would it be a better design to store the fields from Emp_Mast in Crw_Movement directly and update them whenever Emp_Mast is updated? If yes, what is a good way of doing that? Or should I stick to using a nested serializer?
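To make the second option concrete, here is a minimal plain-Python sketch of copying selected employee fields into each movement row and pushing later changes to those copies. It is not Django code and not a recommendation: the class names, the choice of copied fields, and the sample values are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Fields copied from the employee row into each movement row (assumed subset).
COPIED_FIELDS = ("first_name", "desgn_mast")


@dataclass
class Employee:
    emp_mast_id: int
    first_name: str
    desgn_mast: str


@dataclass
class CrewMovement:
    emp_mast_id: int
    site: str
    # Denormalized copies, so reads never have to touch the employee table.
    first_name: str = ""
    desgn_mast: str = ""


def record_movement(employee, site):
    """Create a movement row carrying copies of the selected employee fields."""
    copies = {name: getattr(employee, name) for name in COPIED_FIELDS}
    return CrewMovement(emp_mast_id=employee.emp_mast_id, site=site, **copies)


def update_employee(employee, movements, **changes):
    """Apply changes to the employee and push copied fields to its movement rows."""
    for name, value in changes.items():
        setattr(employee, name, value)
    for movement in movements:
        if movement.emp_mast_id != employee.emp_mast_id:
            continue
        for name in COPIED_FIELDS:
            if name in changes:
                setattr(movement, name, changes[name])


# Hypothetical sample data.
emp = Employee(emp_mast_id=1, first_name="Sam", desgn_mast="Technician")
other = Employee(emp_mast_id=2, first_name="Lee", desgn_mast="Engineer")
rows = [record_movement(emp, "Site A"), record_movement(other, "Site B")]

update_employee(emp, rows, desgn_mast="Supervisor")
print(rows[0].desgn_mast, rows[1].desgn_mast)
```

The trade-off this shows: reads of movement rows need no join, but every employee update fans out into writes on that employee's movement rows. In a Django project the propagation step would probably live in an overridden save() or a post_save signal handler, which is left out here.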

Related

Relating two models with same field value?

I'm new to Django, so I apologize ahead of time if my verbiage is off. But I'll try my best!
I have two models:
PlayerProfile - this is updated once a day.
PlayerListing - this is updated every 5 minutes.
Here are simplified versions of those models.
class PlayerProfile(models.Model):
    listings_id = models.CharField(max_length=120)
    card_id = models.CharField(max_length=120)
    first_name = models.CharField(max_length=120)
    last_name = models.CharField(max_length=120)
    overall = models.IntegerField()

class PlayerListing(models.Model):
    listings_id = models.CharField(max_length=120, unique=True)
    buy = models.IntegerField()
    sell = models.IntegerField()
Currently, we just make queries based on the matching listings_id - but I'd like to have a more traditional relationship setup if possible.
How do you relate two models that have the same value for a specific field (in this case, the listings_id)?
Some potentially relevant information:
Data for both models is brought in from an external API, processed and then saved to the database.
Each PlayerListing relates to a single PlayerProfile. But not every PlayerProfile will have a PlayerListing.
When we create PlayerListings (every 5 minutes), we don't necessarily have access to the correct PlayerProfile model. listings_id's are generated last (as we have to do some extra logic to make sure they're correct).

Querying an object with pk vs custom object_id

I have a model as shown below,
class Person(models.Model):
    first_name = models.CharField(max_length=30)
    middle_name = models.CharField(max_length=30, blank=True)
    last_name = models.CharField(max_length=30)
    person_id = models.CharField(max_length=32, blank=True)
where person_id is populated on save with a random hex string generated by uuid, which looks something like 'E4DC6C20BECA49E6817DB2365924B1EF'.
So my question is: in a database with a very large number of objects, is there a difference between a lookup by pk,
Person.objects.get(pk=10024)
and a lookup by person_id,
Person.objects.get(person_id='E4DC6C20BECA49E6817DB2365924B1EF')
Does either method have a performance advantage at a large scale of data?
I don't know much about database internals.
My database is PostgreSQL.
To get good performance from querying a column in a database, it needs to be indexed. The primary key column is indexed automatically (by definition), but your person_id one won't be; you should add db_index=True to the declaration, then make and run migrations.
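The effect of the missing index can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example runs anywhere; the exact plan text differs between databases and SQLite versions). The table layout and the index name are hypothetical and only mirror the Person model above.

```python
import sqlite3
import uuid


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, person_id TEXT)"
)
conn.executemany(
    "INSERT INTO person (first_name, person_id) VALUES (?, ?)",
    [("name%d" % i, uuid.uuid4().hex.upper()) for i in range(1000)],
)

by_pk = "SELECT first_name FROM person WHERE id = ?"
by_person_id = "SELECT first_name FROM person WHERE person_id = ?"
target = "E4DC6C20BECA49E6817DB2365924B1EF"

plan_pk = query_plan(conn, by_pk, (10024,))
plan_before = query_plan(conn, by_person_id, (target,))

# Rough equivalent of adding db_index=True and running the migration.
conn.execute("CREATE INDEX person_person_id_idx ON person (person_id)")
plan_after = query_plan(conn, by_person_id, (target,))

print("pk lookup:       ", plan_pk)
print("person_id before:", plan_before)
print("person_id after: ", plan_after)
```

Before the index exists, the person_id lookup is reported as a scan of the whole table; afterwards the planner reports a search through the new index, which is the same kind of access path the primary key lookup gets.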

Django - Best way to merge two identical apps

I recently came onto a project in which we have two applications that are virtually identical. We are using Django 1.4 and PostgreSQL 8.4. Both apps have the following models:
class Author(models.Model):
    person = models.ForeignKey(Person)
    book = models.ForeignKey(Book)
    order = models.PositiveIntegerField(blank=True, null=True)
    institute = models.ForeignKey(Institution, blank=True, null=True)
    rank = models.ForeignKey(Rank, blank=True, null=True)

class Institution(models.Model):
    name = models.CharField(max_length=200)
    parent_institution = models.ForeignKey('self', blank=True, null=True)
    location = models.ForeignKey(Location, blank=False, null=False)
    type = models.ForeignKey(InstitutionType, blank=False, null=False)

class Person(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    middle_name = models.CharField(max_length=50, blank=True, null=True)
    gender = models.CharField(max_length=1, choices=GENDER_TYPE, blank=True, null=True)

class InstitutionType(models.Model):
    type = models.CharField(max_length=255)
Is there a way to easily merge the two, either through SQL or through Django? I'm not quite sure what the best approach would be. My only issue is that there are a lot of foreign key references. Is there a good way to change the primary keys of one application's tables to be higher than the other application's (essentially reassigning primary keys in the second table starting where the first table ends), have the changes trickle down through the foreign keys, and then eventually merge the two tables? Any sort of feedback would be much appreciated.
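The key-offset idea in this question can be sketched in plain Python, independent of Django and SQL. Tables are modelled as dictionaries keyed by primary key; the function name, the two-table layout, and the sample rows are hypothetical. The point it illustrates is that a foreign key must be shifted by the offset of the table it points to, not by the offset of the table it lives in.

```python
def merge_with_offset(base_people, base_authors, other_people, other_authors):
    """Merge the second app's rows into the first by shifting its primary keys.

    *_people maps person pk -> name.
    *_authors maps author pk -> person pk (the foreign key).
    """
    # New keys start where the first app's keys end.
    person_offset = max(base_people, default=0)
    author_offset = max(base_authors, default=0)

    merged_people = dict(base_people)
    merged_authors = dict(base_authors)

    for pk, name in other_people.items():
        merged_people[pk + person_offset] = name

    for pk, person_pk in other_authors.items():
        # The row's own key uses the author offset; its foreign key uses
        # the offset of the referenced person table.
        merged_authors[pk + author_offset] = person_pk + person_offset

    return merged_people, merged_authors


# Hypothetical sample rows from the two apps.
app_one_people = {1: "Ada", 2: "Grace"}
app_one_authors = {1: 2}
app_two_people = {1: "Linus", 2: "Guido"}
app_two_authors = {1: 1, 2: 2}

people, authors = merge_with_offset(
    app_one_people, app_one_authors, app_two_people, app_two_authors
)
print(people)
print(authors)
```

On a real database the same shifts would have to be applied to every table in dependency order, and the sequences behind the auto-increment keys would need to be advanced afterwards; neither step is shown here.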
If what Alexander is suggesting is not to your liking, you can use Django's multi-db support to:
Keep both apps running on the same database and models
Use a migration/sync script while you slowly decouple both models so that only one is being used
Write tests :)
Probably the most difficult part of merging is the models.
If you use Django you should use South too. If you don't, try it.
Take first application as a base. Add fields from second application and create schemamigration.
Then move your data with datamigration from second application.
Merge your code from the second app.

What is the best way to model a heterogenous many-to-many relationship in Django?

I've searched around for a while, but can't seem to find an existing question for this (although it could be an issue of not knowing terminology).
I'm new to Django, and have been attempting to take a design which should be very expandable over time, and make it work with Django's ORM. Essentially, it's a series of many-to-many relationships using a shared junction table.
The design is a generic game crafting system, which says "if you meet [require], you can create [reward] using [cost] as materials." This allows items to be sold from any number of shops using the same system, and is generic enough to support a wide range of mechanics - I've seen it used successfully in the past.
Django doesn't support multiple M2M relationships sharing the same junction table (apparently since it has no way to work out the reverse relationship), so I seem to have these options:
Let it create its own junction tables, which ends up being six or more, or
Use foreign keys to the junction table in place of a built-in MTM relationship.
The first option is a bit of a mess, since I know I'll eventually have to add additional fields into the junction tables. The second option works pretty well. Unfortunately, because there is no foreign key from the junction table BACK to each of the other tables, I'm constantly fighting the admin system to get it to do what I want.
Here are the affected models:
class Craft(models.Model):
    name = models.CharField(max_length=30)
    description = models.CharField(max_length=300, blank=True)
    cost = models.ForeignKey('Container', related_name="craft_cost")
    reward = models.ForeignKey('Container', related_name="craft_reward")
    require = models.ForeignKey('Container', related_name="craft_require")

class ShopContent(models.Model):
    shopId = models.ForeignKey(Shop)
    cost = models.ForeignKey('Container', related_name="shop_cost")
    reward = models.ForeignKey('Container', related_name="shop_reward")
    require = models.ForeignKey('Container', related_name="shop_require")
    description = models.CharField(max_length=300)

class Container(models.Model):
    name = models.CharField(max_length=30)

class ContainerContent(models.Model):
    containerId = models.ForeignKey(Container, verbose_name="Container")
    itemId = models.ForeignKey(Item, verbose_name="Item")
    itemMin = models.PositiveSmallIntegerField(verbose_name=u"min amount")
    itemMax = models.PositiveSmallIntegerField(verbose_name=u"max amount")
    weight = models.PositiveSmallIntegerField(null=True, blank=True)
    optionGroup = models.PositiveSmallIntegerField(null=True, blank=True,
                                                   verbose_name=u"option group")
Is there a simpler, likely obvious way to get this working? I'm attempting to allow inline editing of ContainerContent information from each related column on the Craft edit interface.
It sounds like you have a sort of "Transaction" that has a name, description, and type, and defines a cost, reward, and requirement. You should define that as a single model, not multiple ones (ShopContent, Craft, etc.).
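class Transaction(models.Model):
    # Each choice is a (stored value, human-readable label) pair.
    TYPE_CHOICES = ((0, 'Craft'),
                    (1, 'Purchase'),
                    )
    name = models.CharField(max_length=30)
    description = models.CharField(max_length=300, blank=True)
    # Distinct related_name values keep the three reverse accessors on
    # Container from clashing.
    cost = models.ForeignKey('Container', related_name="transaction_cost")
    reward = models.ForeignKey('Container', related_name="transaction_reward")
    require = models.ForeignKey('Container', related_name="transaction_require")
    type = models.IntegerField(choices=TYPE_CHOICES)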
Now Shop etc. can have a single ManyToManyField to Transaction.
Whether or not you use this particular model, the cost, reward and require relationships should all be in one place -- as above, or in OneToOne relationships with Craft, ShopContent etc. As you guessed, you shouldn't have a whole bunch of complex Many-To-Many through tables that are all really the same.
You mention at the bottom of your post that you're
attempting to allow inline editing of ContainerContent information from each related column on the Craft edit interface.
If you're modeling several levels of relationship, and using the admin app, you'll need to either apply some sort of nested inline patch, or use some sort of linking scheme like the one I use in my recent question, How do I add a link from the Django admin page of one object to the admin page of a related object?
Something smells too complicated here, but I might be wrong. As a start,
is this any better? (ContainerContent can be figured out later.)
class Cost(models.Model):
    name = models.CharField(max_length=30)

class Reward(models.Model):
    name = models.CharField(max_length=30)

class Require(models.Model):
    name = models.CharField(max_length=30)

class Craft(models.Model):
    name = models.CharField(max_length=30)
    description = models.CharField(max_length=300, blank=True)
    cost = models.ForeignKey(Cost)
    reward = models.ForeignKey(Reward)
    require = models.ForeignKey(Require)

class Shop(models.Model):
    name = models.CharField(max_length=30)
    crafts = models.ManyToManyField(Craft, blank=True)

Is it ok to use a dictionary on the database for an app with lots of records?

I am working on an app which will handle lots of information, and I am looking for the best way to create my models. Since I have never worked with apps that deal with so many records, database optimization is not a topic I know much about, but it seems to me that a good design is a good place to start.
Right now, I have a table for customers, a table for products and a table for product-customer (since we assign a code for each product a customer buys). Since I want to track the balances, there is also a balance table. My models look like this at the moment:
class Customer(models.Model):
    first_name = models.CharField(max_length=35)
    last_name = models.CharField(max_length=35)
    customer_ID = models.IntegerField(primary_key=True)
    phone = models.CharField(max_length=10, blank=True, null=True)

class Product(models.Model):
    product_ID = models.IntegerField(primary_key=True)
    product_code = models.CharField(max_length=25)
    invoice_date = models.DateField()
    employee = models.ForeignKey(Employee, null=True, blank=True)
    product_active = models.BooleanField()

class ProductCustomer(models.Model):
    prod = models.ForeignKey(Product, db_index=True)
    cust = models.ForeignKey(Customer, db_index=True)
    product_customer_ID = models.IntegerField(primary_key=True)
    [...]

class Balance(models.Model):
    product_customer = models.ForeignKey(ProductCustomer, db_index=True)
    balance = models.DecimalField(max_digits=10, decimal_places=2)
    batch = models.ForeignKey(Batch)
    [...]
The app will return the 'history' of the customer: for example, whether the customer was overdue at some point, then paid, then was due a refund, and so on.
I was wondering whether I should add a CharField to the Customer table that would hold a dictionary of date:status pairs (the status could be calculated and added to the dictionary when I upload the information), whether it is more efficient to query the Balance table, or whether there is a better solution.
Since there are thousands of products and even more customers, we are talking about around 400K balance records on a weekly basis. I am concerned about what can be done to ensure the app runs smoothly.
If I understand your question, you seem to be asking whether the join conditions will impose an unreasonable burden on your lookup query. To some extent this depends on your RDBMS. My recommendation is to go with PostgreSQL over MySQL, because MySQL's InnoDB tables are heavily optimized for primary key lookups, which means two B-trees have to be traversed to find the records on a join. PostgreSQL, on the other hand, allows physical scans of tables, meaning foreign key lookups are usually a bit faster.
In general, yes, the dictionary approach is fine for an app with lots of records. The real questions are how you are querying and how many records you pull in a given query. That is a much larger factor than how many records are stored, at least for a database like PostgreSQL.
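The two options being compared can be put side by side in a short plain-Python sketch: deriving the date:status history from balance rows on demand, or serializing that same dictionary once so it can be stored in a text column. Everything specific here is an assumption for illustration: the status rule (positive balance means overdue, negative means a refund is due), the function names, and the sample rows.

```python
import json
from decimal import Decimal


def status_for(balance):
    """Hypothetical rule mapping a balance to a status label."""
    if balance > 0:
        return "overdue"
    if balance < 0:
        return "refund due"
    return "paid"


def history_from_rows(rows):
    """Build {date: status} from (date, balance) rows, oldest first."""
    return {day: status_for(balance) for day, balance in sorted(rows)}


# Made-up balance rows for one customer, as (ISO date, balance) pairs.
balance_rows = [
    ("2020-01-20", Decimal("-15.50")),
    ("2020-01-06", Decimal("120.00")),
    ("2020-01-13", Decimal("0.00")),
]

# Option 1: compute the history from the balance rows whenever it is needed.
history = history_from_rows(balance_rows)

# Option 2: compute it once at upload time and store the serialized
# dictionary in a text column, then decode it on read.
stored_text = json.dumps(history)
restored = json.loads(stored_text)

print(history)
print(restored == history)
```

The stored copy saves a query per page view but has to be rewritten whenever a balance changes, while the computed version always reflects the Balance table. Which one wins depends on how many rows a single history query touches, as the answer above points out.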