I want to perform a simple join operation like this.
raw SQL : select * from risks r join sku_details s on r.sku_id = s.sku_id;
model Details:
class SkuDetails(models.Model):
    sku_id = models.DecimalField(primary_key=True, max_digits=65535, decimal_places=65535)
    sku_desc = models.TextField(blank=True, null=True)
    category = models.TextField(blank=True, null=True)

class Risks(models.Model):
    risk_id = models.DecimalField(primary_key=True, max_digits=65535, decimal_places=65535)
    risk_group_short_desc = models.TextField(blank=True, null=True)
    risk_group_desc = models.TextField(blank=True, null=True)
    var = models.DecimalField(max_digits=65535, decimal_places=65535, blank=True, null=True)
    sku = models.ForeignKey(SkuDetails, models.DO_NOTHING, blank=True, null=True)
After joining, I want all the columns of both tables in a flat structure through the Django ORM.
In raw SQL I get all the columns, but I am not getting them from the ORM.
Please help!
Getting all values in a list of dictionaries is quite easy with values():
Risks.objects.values(
    'risk_id',
    'risk_group_short_desc',
    # ... fields you need from Risks
    'sku__sku_id',
    # ... fields you need from SkuDetails
)
You can check out values_list() as well.
You can try this with select_related, since the two models are linked by a foreign-key relation.
Related
I have two models as shown below with just a few fields:
class Query(models.Model):
    query_text = models.TextField(blank=True, null=True)
    variable = models.CharField(max_length=250, blank=True, null=True)

class Statistic(models.Model):
    query = models.ForeignKey(Query, models.DO_NOTHING, blank=True, null=True)
    processing_time = models.DateTimeField(blank=True, null=True)
    module = models.CharField(max_length=500, blank=True, null=True)
My target is to perform a JOIN using the id of the two models. The SQL query equivalent to it would be :
SELECT * FROM statistic S JOIN query Q ON S.query_id = Q.id
I understand select_related or prefetch_related could do the trick? I don't know which one to use to perform the join.
I'd appreciate some help on that. Thanks. :)
You can use select_related for this. You can confirm that a JOIN is used by printing the queryset's query attribute, which shows the generated SQL statement:
print(Statistic.objects.select_related("query").query)
You have to use select_related here because you have a ForeignKey relationship. (prefetch_related is meant for many-to-many and reverse foreign-key relations, and it runs a separate query instead of a JOIN.)
So,
some_id_value = 12
stats_queryset = Statistic.objects.select_related("query").filter(id=some_id_value)
I'm trying to get the django orm to replicate a call on a database for my current table structure:
Tables:
ServiceItems {Id, name, user, date_created}
ServiceItemsProps {fk_to_ServiceItems Item_id, Id, key, value}
I'm trying to select items from the ServiceItem table with multiple keys from the ServiceItemsProps table as columns.
I can accomplish this with a query like the following:
> select tbl1.value as bouncebacks, tbl2.value as assignees from
> service_items join service_item_props as tbl1 on tbl1.item_id =
> service_items.id join service_item_props as tbl2 on tbl2.item_id =
> service_items.id where service_items.item_type='CARD' and
> tbl1.key='bouncebacks' and tbl2.key='assignees'
But I'm not able to figure out how to reproduce this in Django's ORM. I would like to not inject raw SQL into the statements here, because codebase portability is important.
Section of models.py
class ServiceItems(models.Model):
    class Meta:
        db_table = 'service_items'
        unique_together = ('service', 'item_type', 'item_id')

    service = models.ForeignKey(Service, blank=False, db_column='service_id', on_delete=models.CASCADE)
    item_type = models.CharField(max_length=255, blank=False)
    url = models.TextField(blank=True, null=True)
    item_id = models.TextField(blank=True, null=True)
    item_creation_user = models.TextField(blank=True, null=True)
    item_creation_date = models.DateTimeField(blank=True, null=True)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

class ServiceItemProps(models.Model):
    class Meta:
        db_table = 'service_item_props'

    item = models.ForeignKey(ServiceItems, blank=False, db_column='item_id', on_delete=models.CASCADE)
    prop_id = models.TextField(blank=True, null=True)
    key = models.CharField(max_length=255, blank=False)
    value = models.TextField(blank=True, null=True)
# change one line to make it easier to query
item = models.ForeignKey(ServiceItems, blank=False, db_column='item_id', on_delete=models.CASCADE, related_name='item_props')
Query should become:
ServiceItems.objects.filter(Q(item_type='CARD') & (Q(item_props__key='bouncebacks') | Q(item_props__key='assignees')))
==============================================================
I think I misunderstood your query.
I believe this is a good case to use .raw() .
Try this one instead:
qs = ServiceItemProps.objects.raw('''
    SELECT sip1.*, sip2.value as other_value
    FROM {item_table} as service_items
    INNER JOIN {props_table} as sip1 on sip1.item_id = service_items.id
    INNER JOIN {props_table} as sip2 on sip2.item_id = service_items.id
    WHERE service_items.item_type='CARD' and sip1.key='bouncebacks' and sip2.key='assignees'
'''.format(item_table=ServiceItems._meta.db_table, props_table=ServiceItemProps._meta.db_table))
for itemprop in qs:
    # Each row is a ServiceItemProps instance with the extra column attached.
    print(itemprop.value, itemprop.other_value)
Background
I'm storing data about researchers. eg, researcher profiles, metrics for each researcher, journals they published in, papers they have, etc.
The Problem
My current database design is this:
Each Researcher has many journals (that they published in). Each journal has its own information.
Likewise for Subject Areas
But currently, this leads to massive data duplication. E.g., the same journal can appear many times in the Journal table, each time linked to a different researcher.
Is there a better way to tackle this problem? Right now, I have over 5000 rows in the Journal table but only about 1000 distinct journals.
Thank you!
EDIT: This is likely due to the way I'm saving the models for new data (shown below). Could anyone provide the proper way to loop over the hashes and save them to models?
Model - Researcher
class Researcher(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    scopus_id = models.BigIntegerField(db_index=True) # Index to make searches quicker
    academic_rank = models.CharField(max_length=100)
    title = models.CharField(max_length=200,default=None, blank=True, null=True)
    salutation = models.CharField(max_length=200,default=None, blank=True, null=True)
    scopus_first_name = models.CharField(max_length=100)
    scopus_last_name = models.CharField(max_length=100)
    affiliation = models.CharField(default=None, blank=True, null=True,max_length = 255)
    department = models.CharField(default=None, blank=True, null=True,max_length = 255)
    email = models.EmailField(default=None, blank=True, null=True)
    properties = JSONField(default=dict)

    def __str__(self):
        return "{} {}, Scopus ID {}".format(self.scopus_first_name,self.scopus_last_name,self.scopus_id)
Model - Journal
class Journal(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    researchers = models.ManyToManyField(Researcher)
    title = models.TextField()
    journal_type = models.CharField(max_length=40,default=None,blank=True, null=True)
    abbreviation = models.TextField(default=None, blank=True, null=True)
    issn = models.CharField(max_length=50, default=None, blank=True, null=True)
    journal_rank = models.IntegerField(default=None, blank=True, null=True)
    properties = JSONField(default=dict)

    def __str__(self):
        return self.title
How I'm currently saving them:
db_model_fields = {'abbreviation': 'Front. Artif. Intell. Appl.',
'issn': '09226389',
'journal_type': 'k',
'researchers': <Researcher: x, Scopus ID f>,
'title': 'Frontiers in Artificial Intelligence and Applications'}
# remove researchers or else create will fail (some id need to exist error)
researcher = db_model_fields["researchers"]
del db_model_fields["researchers"]
model_obj = Journal(**db_model_fields)
model_obj.save()
model_obj.researchers.add(researcher)
model_obj.save()
Here is how it works :
class Journal(models.Model):
    # some fields
    ...

class Researcher(models.Model):
    # some fields
    journal = models.ManyToManyField(Journal)

Django will create a relation table:
Behind the scenes, Django creates an intermediary join table to represent the many-to-many relationship
So you'll have many rows in that join table, which is expected, but each journal and each researcher should appear only once in its own table.
Your duplication probably comes from how you save. Instead of:
model_obj = Journal(**db_model_fields)
model_obj.save()
Try this:
model_obj, created = Journal.objects.get_or_create(**db_model_fields)
This way you get the existing journal back if it already exists. Because none of your fields is unique, your current code inserts a new journal row every time, and Django raises no error because it generates a new unique ID for each row.
Django newbie, so if this is super straightforward I apologize.
I am attempting to get a listing of distinct "Name" values from a listing of "Activity"s for a given "Person".
Models setup as below
class Activity(models.Model):
    Visit = models.ForeignKey(Visit)
    Person = models.ForeignKey(Person)
    Provider = models.ForeignKey(Provider)
    ActivityType = models.ForeignKey(ActivityType)
    Time_Spent = models.IntegerField(blank=True, null=True)
    Repetitions = models.CharField(max_length=20, blank=True, null=True)
    Weight_Resistance = models.CharField(max_length=50, blank=True, null=True)
    Notes = models.CharField(max_length=500, blank=True, null=True)

class ActivityType(models.Model):
    Name = models.CharField(max_length=100)
    Activity_Category = models.CharField(max_length=40, choices=Activity_Category_Choices)
    Location_Category = models.CharField(max_length=30, blank=True, null=True, choices=Location_Category_Choices)
I can get a listing of all activities done with a given Person
person = Person.objects.get(id=person_id)
activity_list = person.activity_set.all()
I get a list of all activities for that person, no problem.
What I can't sort out is how to generate a list of distinct/unique Activity_Types found in person.activity_set.all()
person.activity_set.values('ActivityType').distinct()
only returns a dictionary with
{'ActivityType':<activitytype.id>}
I can't sort out how to get straight to the name attribute on ActivityType
This is pretty straightforward in plain ol' SQL, so I know my lack of grokking the ORM is to blame.
Thanks.
Update: I have this working, sort of, but this CAN'T be the right way(tm) to do this..
distinct_activities = person.activity_set.values('ActivityType').distinct()
uniquelist = []
for x in distinct_activities:
    valuetofind = x['ActivityType']
    activitytype = ActivityType.objects.get(id=valuetofind)
    name = activitytype.Name
    uniquelist.append((valuetofind, name))
And then iterate over that uniquelist...
This has to be wrong...
unique_names = ActivityType.objects.filter(
    id__in=Activity.objects.filter(Person=your_person).values_list('ActivityType__id', flat=True).distinct()
).values_list('Name', flat=True).distinct()
This should do the trick, and it runs as a single query with a subquery, so there are few database hits.
Writing that down from my phone, so care for typos.
I have the following code:
class Invoice(models.Model):
    customer = models.ForeignKey(User, blank=True, null=True)
    customer_name = models.CharField(max_length=50, blank=True, null=True)
    email = models.CharField(max_length=100, blank=True, null=True)
#----------------------------------
invoices = Invoice.objects.raw("""
SELECT
`invoices`.`id`,
`invoices`.`customer_id`,
`invoices`.`customer_name`,
`invoices`.`email` AS `inv_email`,
`auth_user`.`username`,
`auth_user`.`email` AS `auth_email`,
COUNT('customer_id') AS `buy_count`
FROM `invoices`
LEFT JOIN `auth_user` ON `auth_user`.id = `invoices`.customer_id
GROUP BY `customer_id`, `invoices`.`email`
""", translations={'inv_email': 'email', 'auth_email': 'customer.email'})
But when I write invoices[i].customer, Django makes a separate SQL request for each customer. Is there a way to map the JOIN to the Django model in the raw request? Or can this SQL request be expressed using the pure Django ORM?